One of the most important tasks of the distribution packager is to ensure that the software shipped to our users is free of security vulnerabilities. While finding and fixing the vulnerable code is usually considered upstream’s responsibility, the packager needs to ensure that all these fixes reach the end users ASAP. With the aid of central package management and dynamic linking, the Linux distributions have pretty much perfected the deployment of security fixes. Ideally, fixing a vulnerable dependency is as simple as patching a single shared library via the distribution’s automated update system.
Of course, this works only if the package in question is actually following good security practices. Over the years, many Linux distributions (at the very least, Debian, Fedora and Gentoo) have been fighting these bad practices with some success. However, today the times have changed. Today, for every 10 packages fixed, a completely new ecosystem emerges with the bad security practices at its central point. Go, Rust and to some extent Python are just a few examples of programming languages that have integrated the bad security practices into the very fabric of their existence, and recreated the same old problems in entirely new ways.
The root issue of bundling dependencies has been discussed many times before. The Gentoo Wiki explains why you should not bundle dependencies, and links to more material about it. I would like to take a bit wider approach, and discuss not only bundling (or vendoring) dependencies but also two closely relevant problems: static linking and pinning dependencies.
In the simplest words, static linking means embedding your program’s dependencies directly into the program image. The term is generally used in contrast to dynamic linking (or dynamic loading) that keep the dependent libraries in separate files that are loaded at program’s startup (or runtime).
Why is static linking bad? The primary problem is that since they become an integral part of the program, they can not be easily replaced by another version. If it turns out that one of the libraries is vulnerable, you have to relink the whole program against the new version. This also implies that you need to have a system that keeps track of what library versions are used in individual programs.
While you might think that rebuilding a lot of packages is only a problem for source distributions, you are wrong. While indeed the users of source distributions could be impacted a lot, as their systems remain vulnerable for a long time needed to rebuild a lot of packages, a similar problem affects binary distributions. After all, the distributions need to rebuild all affected programs in order to fully ship the fix to their end users which also involves some delay.
Comparatively, shipping a new version of a shared library takes much less time and fixes all affected programs almost instantly (modulo the necessity of restarting them).
The extreme case of static linking is to distribute proprietary software that is statically linked to its dependencies. This is primarily done to ensure that the software can be run easily on a variety of systems without requiring the user to install its dependencies manually. However, this scenario is really a form of bundling dependencies, so it will be discussed in the respective section.
However, static linking has also been historically used for system programs that were meant to keep working even if their dependent libraries became broken.
In modern packages, static linking is used for another reason entirely — because they do not require the modern programming languages to have a stable ABI. The Go compiler does not need to be concerned about emitting code that would be binary compatible with the code coming from a previous version. It works around the problem by requiring you to rebuild everything every time the compiler is upgraded.
To follow the best practices, we strongly discourage static linking in C and its derivatives. However, we can’t do much about languages such as Go or Rust that put static linking at the core of their design and have time and again stated publicly that they will not switch to dynamic linking of dependencies.
While static linking is bad, at least it provides a reasonably clear way for automatic updates (and therefore the propagation of vulnerability fixes) to happen, pinning dependencies means requiring a specific version of your program’s dependencies to be installed. While the exact results depend on the ecosystem and the exact way of pinning the dependency, generally it means that at least some users of your package will not be able to automatically update the dependencies to newer versions.
That might not seem that bad at first. However, it means that if a bug fix or — even more importantly — a vulnerability fix is released for the dependency, the users will not get it unless you update the pin and make a new release. And then, if somebody else pins your package, then that pin will also need to be updated and released. And the chain goes on. Not to mention what happens if some package just happens to indirectly pin to two different versions of the same dependency!
Why do people pin dependencies? The primary reason is that they don’t want dependency updates to suddenly break their packages for end users, or to have their CI results suddenly broken by third-party changes. However, all that has another underlying problem — the combination of not being concerned with API stability on upstream part, and not wishing to unnecessarily update working code (that uses deprecated API) on downstream part. Truth is, pinning makes this worse because it sweeps the problem under the carpet, and actively encourages people to develop their code against specific versions of their dependencies rather than against a stable public API. Hyrum’s Law in practice.
Dependency pinning can have really extreme consequences. Unless you make sure to update your pins often, you may one day find yourself having to take a sudden leap — because you have relied on a very old version of a dependency that is now known to be vulnerable, and in order to update it you suddenly have to rewrite a lot of code to follow the API changes. Long term, this approach simply does not scale anymore, the effort needed to keep things working grows exponentially.
We try hard to unpin the dependencies and test packages with the newest versions of them. However, often we end up discovering that the newer versions of dependencies simply are not compatible with the packages in question. Sadly, upstreams often either ignore reports of these incompatibilities or even are actively hostile to us for not following their pins.
Now, for the worst of all — one that combines all the aforementioned issues, and adds even more. Bundling (often called vendoring in newspeak) means including the dependencies of your program along with it. The exact consequences of bundling vary depending on the method used.
In open source software, bundling usually means either including the sources of your dependencies along with your program or making the build system fetch them automatically, and then building them along with the program. In closed source software, it usually means linking the program to its dependencies statically or including the dependency libraries along with the program.
The baseline problem is the same as with pinned dependencies — if one of them turns out to be buggy or vulnerable, the users need to wait for a new release to update the bundled dependency. In open source software or closed source software using dynamic libraries, the packager has at least a reasonable chance of replacing the problematic dependency or unbundling it entirely (i.e. forcing the system library). In statically linked closed source software, it is often impossible to even reliably determine what libraries were actually used, not to mention their exact versions. Your distribution can no longer reliably monitor security vulnerabilities; the trust is shifted to software vendors.
However, modern software sometimes takes a step further — and vendor modified dependencies. The horror of it! Now not only the packager needs to work to replace the library but often has to actually figure out what was changed compared to the original version, and rebase the changes. In worst cases, the code becomes disconnected from upstream to the point that the program author is no longer capable of updating the vendored dependency properly.
Sadly, this kind of vendoring is becoming more common with the rapid development happening these days. The cause is twofold. On one hand, downstream consumers find it easier to fork and patch a dependency than to work with upstreams. On the other hand, many upstreams are not really concerned with fixing bugs and feature requests that do not affect their own projects. Even if the fork is considered only as a stop-gap measure, it often takes a real lot of effort to push the changes upstream afterwards and re-synchronize the codebases.
We are strongly opposed to bundling dependencies. Whenever possible, we try to unbundle them — sometimes having to actually patch the build systems to reuse system libraries. However, this is a lot of work, and often it is not even possible because of custom patching, including the kind of patching that has been explicitly rejected upstream. To list a few examples — Mozilla products rely on SQLite 3 patches that collide with regular usage of this library, Rust bundles a fork of LLVM.
Static linking, dependency pinning and bundling are three bad practices that have serious impact on the time and effort needed to eliminate vulnerabilities from production systems. They can make the difference between being able to replace a vulnerable library within a few minutes and having to spend a lot of effort and time in locating multiple copies of the vulnerable library, patching and rebuilding all the software including them.
The major Linux distributions had policies against these practices for a very long time, and have been putting a lot of effort into eliminating them. Nevertheless, it feels more and more like Sisyphean task. While we have been able to successfully resolve these problems in many packages, whole new ecosystems were built on top of these bad practices — and it does not seem that upstreams care about fixing them at all.
New programming languages such as Go and Rust rely entirely on static linking, and there’s nothing we can do about it. Instead of packaging the dependencies and having programs use the newest versions, we just fetch the versions pinned by upstream and make big blobs out of it. And while upstreams brag how they magically resolved all security issues you could ever think of (entirely ignoring other classes of security issues than memory-related), we just hope that we won’t suddenly be caught with our pants down when a common pinned dependency of many packages turns out to be vulnerable.