The GNU C library (glibc) is one of oldest, mature, and widely used open source software (OSS) projects. It's also a key part of the GNU Project, which is a project that is very sensitive to licensing considerations. To many, the answer to the question "What is the license of the GNU C library?" is easy: the Lesser General Public license. The Wikipedia page for the C library unequivocally states that glibc is under the LGPL:
The project page is similarly clear and unambiguous. From an OSS and IP management perspective, however, the answer becomes less clear once we take a few minutes to actually inspect and audit the source code. Let's have a look at what a closer inspection reveals.
First, let's download a copy of the latest source code tarball straight from GNU website. I've gone ahead and scanned this project with the Palamida software, but you can visually inspect the source code with an archive manager like WinRAR or 7Zip. Once we've burst open the archive we'll can drill down into in the top-level
/glibc-2.21/ directory. Lo and behold, there's a COPYING.LIB file right there for all to see:
The curious thing is that there's also a file called COPYING!:
So now we see two different licenses indicated for this library. The difference between the GPL and LGPL for many commercial software companies is huge. Companies typically need their applications to link against the C Library on their platforms. They're not interested in making modifications to the library, they just need to use it. Under the terms of the LGPL, linking to an LGPL-licensed library does not require a company to release the proprietary code that uses the library. The same is not true of GPL-licensed libraries. So we have an interest in knowing what exactly the licensing situation is here. Okay, maybe the README can help us sort this out:
Well, unfortunately this note shines no light on why a copy of the GPL is included. Even the referenced file LICENSES only contains the third party licenses of various subcomponents of glibc (nearly all of which are BSD-style attribution licenses).
The likely explanation for the inclusion of the copy of the GPL is found on the GNU website: "[S]ince the LGPL is a set of additional permissions on top of the GPL, it's important to include both licenses so users have all the materials they need to understand their rights." To be absolutely sure about the licensing situation, though, are we able to say there's no GPL-licensed code across the nearly 15,000 files in this library?
Thankfully Palamida provides us with some help answering this question by finding license text matches and other clues.
First of all, we do see LGPL license text in files claiming to be from the C library:
Upon seeing a file or two with LGPL text, we might be satisfied. However, using Palamida's license detection, we find a file with GPL license text:
Well, maybe it's just shell scripts and other build paraphernalia under the GPL? That's far from certain; here's another file (from the
Interesting ... this file contains a special exception! Notice that we were not warned about such files in the README or in the top-level license files! Here's another example:
That file came from the
/manual/examples/ directory, so perhaps it and others like it could be easily stripped away without losing glibc functionality. But can we be confident that stripping a file promising C string table handling won't break anything?:
Finally, can we at least be assured that the GPL in this library is limited to GPL v2.0? Apparently not:
Note that a full copy of the GPL v3.0 doesn't even appear in the glibc distribution.
What are the lessons here?
- We cannot automatically trust unambiguous assertions of a project's license on the web.
- Even a mature open source project won't necessarily provide a roadmap (let alone an accurate one) to the various licenses and sublicenses found within.