The modern packager’s security nightmare

One of the most important tasks of the distribution packager is to ensure that the software shipped to our users is free of security vulnerabilities. While finding and fixing the vulnerable code is usually considered upstream’s responsibility, the packager needs to ensure that all these fixes reach the end users ASAP. With the aid of central package management and dynamic linking, the Linux distributions have pretty much perfected the deployment of security fixes. Ideally, fixing a vulnerable dependency is as simple as patching a single shared library via the distribution’s automated update system.

Of course, this works only if the package in question actually follows good security practices. Over the years, many Linux distributions (at the very least Debian, Fedora and Gentoo) have been fighting these bad practices with some success. However, times have changed. Today, for every 10 packages fixed, a completely new ecosystem emerges with bad security practices at its very core. Go, Rust and, to some extent, Python are just a few examples of programming languages that have integrated these bad practices into the very fabric of their ecosystems, and recreated the same old problems in entirely new ways.

The root issue of bundling dependencies has been discussed many times before. The Gentoo Wiki explains why you should not bundle dependencies, and links to more material about it. I would like to take a somewhat wider approach, and discuss not only bundling (or vendoring) dependencies but also two closely related problems: static linking and pinning dependencies.

Static linking

In the simplest words, static linking means embedding your program’s dependencies directly into the program image. The term is generally used in contrast to dynamic linking (or dynamic loading), which keeps the dependent libraries in separate files that are loaded at program startup (or at runtime).

Why is static linking bad? The primary problem is that, since the dependencies become an integral part of the program, they cannot easily be replaced with another version. If it turns out that one of the libraries is vulnerable, you have to relink the whole program against its new version. This also implies that you need a system that keeps track of which library versions are used in individual programs.

You might think that rebuilding a lot of packages is only a problem for source distributions, but that is not the case. The users of source distributions are certainly hit hard, as their systems remain vulnerable for the whole time needed to rebuild the affected packages, but a similar problem affects binary distributions: they also need to rebuild all affected programs in order to fully ship the fix to their end users, which likewise involves a delay.

Comparatively, shipping a new version of a shared library takes much less time and fixes all affected programs almost instantly (modulo the necessity of restarting them).

The extreme case of static linking is to distribute proprietary software that is statically linked to its dependencies. This is primarily done to ensure that the software can be run easily on a variety of systems without requiring the user to install its dependencies manually. However, this scenario is really a form of bundling dependencies, so it will be discussed in the respective section.

However, static linking has also been historically used for system programs that were meant to keep working even if their dependent libraries became broken.

In modern packages, static linking is used for another reason entirely: it spares modern programming languages from having to provide a stable ABI. The Go compiler does not need to be concerned with emitting code that is binary compatible with the code produced by a previous version. It works around the problem by requiring you to rebuild everything every time the compiler is upgraded.
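
Rust runs into the same issue through monomorphization: generic code from a dependency is compiled into every consumer, so there is no single artifact that could live in a replaceable shared object. A minimal sketch, using only the standard library (the generic function merely stands in for library code):

    fn largest<T: PartialOrd + Copy>(items: &[T]) -> T {
        // Stand-in for generic "library" code: there is no single compiled
        // artifact for this function that a shared library could provide.
        let mut max = items[0];
        for &item in &items[1..] {
            if item > max {
                max = item;
            }
        }
        max
    }

    fn main() {
        // Two instantiations: largest::<i32> and largest::<f64> are emitted
        // as separate machine code inside this very binary. Fixing a bug in
        // the generic code means recompiling every program that uses it.
        println!("{}", largest(&[3, 1, 5]));
        println!("{}", largest(&[0.5, 2.5]));
    }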

In line with best practices, we strongly discourage static linking in C and its derivatives. However, we can do little about languages such as Go or Rust that put static linking at the core of their design and whose upstreams have publicly stated, time and again, that they will not switch to dynamic linking of dependencies.

Pinning dependencies

While static linking is bad, it at least leaves a reasonably clear path for automatic updates (and therefore for the propagation of vulnerability fixes). Pinning dependencies, on the other hand, means requiring a specific version of your program’s dependencies to be installed. While the exact results depend on the ecosystem and the exact way of pinning the dependency, it generally means that at least some users of your package will not be able to automatically update the dependencies to newer versions.

That might not seem so bad at first. However, it means that if a bug fix or — even more importantly — a vulnerability fix is released for the dependency, the users will not get it unless you update the pin and make a new release. And then, if somebody else pins your package, that pin will also need to be updated and released. And so the chain goes on. Not to mention what happens if some package happens to indirectly pin two different versions of the same dependency!

Why do people pin dependencies? The primary reason is that they don’t want dependency updates to suddenly break their packages for end users, or to have their CI results suddenly broken by third-party changes. However, all of that points to another underlying problem: the combination of upstreams not being concerned with API stability, and downstreams not wishing to unnecessarily update working code (that uses a deprecated API). The truth is, pinning makes this worse because it sweeps the problem under the carpet and actively encourages people to develop their code against specific versions of their dependencies rather than against a stable public API. Hyrum’s Law in practice.
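
A tiny, contrived sketch of Hyrum’s Law at work, in Rust (the Point type below stands in for a type coming from a pinned dependency; everything here is hypothetical):

    // Stand-in for a type from a pinned dependency. Its documented API is
    // the x and y fields; the exact Debug formatting is an incidental
    // detail that may legitimately change in any release.
    #[derive(Debug)]
    struct Point {
        x: f64,
        y: f64,
    }

    fn main() {
        let p = Point { x: 1.0, y: 2.0 };

        // Coding against the stable, documented interface survives updates.
        println!("sum = {}", p.x + p.y);

        // Coding against incidental behaviour only works with the exact
        // version that was pinned; the pin hides the breakage until an
        // update is finally forced.
        let fragile = format!("{:?}", p).starts_with("Point { x: 1.0");
        println!("fragile assumption holds: {}", fragile);
    }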

Dependency pinning can have really extreme consequences. Unless you make sure to update your pins often, you may one day find yourself having to take a sudden leap — because you have relied on a very old version of a dependency that is now known to be vulnerable, and in order to update it you suddenly have to rewrite a lot of code to follow the API changes. In the long term, this approach simply does not scale: the effort needed to keep things working grows exponentially.

We try hard to unpin dependencies and test packages with their newest versions. However, we often end up discovering that the newer versions of dependencies simply are not compatible with the packages in question. Sadly, upstreams often either ignore reports of these incompatibilities or are even actively hostile towards us for not following their pins.

Bundling/vendoring dependencies

Now, for the worst of all — one that combines all the aforementioned issues, and adds even more. Bundling (often called vendoring in newspeak) means including the dependencies of your program along with it. The exact consequences of bundling vary depending on the method used.

In open source software, bundling usually means either including the sources of your dependencies along with your program or making the build system fetch them automatically, and then building them along with the program. In closed source software, it usually means linking the program to its dependencies statically or including the dependency libraries along with the program.

The baseline problem is the same as with pinned dependencies — if one of them turns out to be buggy or vulnerable, the users need to wait for a new release to update the bundled dependency. In open source software or closed source software using dynamic libraries, the packager has at least a reasonable chance of replacing the problematic dependency or unbundling it entirely (i.e. forcing the system library). In statically linked closed source software, it is often impossible to even reliably determine what libraries were actually used, not to mention their exact versions. Your distribution can no longer reliably monitor security vulnerabilities; the trust is shifted to software vendors.
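
About the best a packager can do then is heuristics, such as scanning binaries for the version banners that bundled libraries tend to leave behind. A rough sketch of the idea, with a made-up banner string:

    use std::{env, fs};

    fn main() -> std::io::Result<()> {
        // Hypothetical banner that a bundled, vulnerable library version
        // might embed in every binary it is statically linked into.
        let needle: &[u8] = b"examplelib version 1.2.3";

        let path = env::args().nth(1).expect("usage: scan <binary>");
        let data = fs::read(&path)?;

        // Naive byte-wise search; real tooling would keep per-library
        // signatures and scan the whole package database.
        let found = data.windows(needle.len()).any(|w| w == needle);

        println!(
            "{}: {}",
            path,
            if found { "possible bundled copy" } else { "no match" }
        );
        Ok(())
    }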

However, modern software sometimes takes things a step further — and vendors modified dependencies. The horror of it! Now the packager not only needs to work on replacing the library but often also has to figure out what was changed compared to the original version and rebase those changes. In the worst cases, the code becomes so disconnected from upstream that the program author is no longer capable of updating the vendored dependency properly.

Sadly, this kind of vendoring is becoming more common with the rapid development happening these days. The cause is twofold. On one hand, downstream consumers find it easier to fork and patch a dependency than to work with upstreams. On the other hand, many upstreams are not really concerned with fixing bugs and implementing feature requests that do not affect their own projects. Even if the fork is considered only a stop-gap measure, it often takes a great deal of effort to push the changes upstream afterwards and re-synchronize the codebases.

We are strongly opposed to bundling dependencies. Whenever possible, we try to unbundle them — sometimes having to actually patch the build systems to reuse system libraries. However, this is a lot of work, and often it is not even possible because of custom patching, including the kind of patching that has been explicitly rejected upstream. To list a few examples: Mozilla products rely on SQLite 3 patches that collide with regular usage of this library, and Rust bundles a fork of LLVM.

Summary

Static linking, dependency pinning and bundling are three bad practices that have a serious impact on the time and effort needed to eliminate vulnerabilities from production systems. They can make the difference between being able to replace a vulnerable library within a few minutes and having to spend a lot of time and effort locating multiple copies of the vulnerable library, then patching and rebuilding all the software that includes them.

The major Linux distributions have had policies against these practices for a very long time, and have been putting a lot of effort into eliminating them. Nevertheless, it feels more and more like a Sisyphean task. While we have been able to successfully resolve these problems in many packages, whole new ecosystems have been built on top of these bad practices — and it does not seem that upstreams care about fixing them at all.

New programming languages such as Go and Rust rely entirely on static linking, and there’s nothing we can do about it. Instead of packaging the dependencies and having programs use the newest versions, we just fetch the versions pinned by upstream and make big blobs out of them. And while upstreams brag about how they have magically resolved all the security issues you could ever think of (entirely ignoring classes of security issues other than memory-related ones), we just hope that we won’t suddenly be caught with our pants down when a common pinned dependency of many packages turns out to be vulnerable.

Added 2021-04-27: The deadlock vulnerability through embedded app-emulation/containers-storage (CVE-2021-20291) is an example of how a vulnerability in a single Go package requires hunting down all affected ebuilds.

63 thoughts on “The modern packager’s security nightmare”

  1. The unstable ABI/static linking of modern languages is a real letdown, and the saddest part about this is that it could have been done differently if dynamic linking had been a top priority for their developers. I’ll leave you an article about “How Swift achieved dynamic linking where Rust couldn’t” for when you are bored and want to get pissed off a bit more: https://gankra.github.io/blah/swift-abi/

  2. I think the dependency pinning issue should be at least partially solvable with a tool like this added to CI: https://github.com/RustSec/cargo-audit

    But of course not all upstream developers are gonna use it, so the global issue remains. Cargo should probably audit lock files by default.

    1. There’s also a bot on GitHub that reports outdated pinned dependencies for Python; but I don’t know whether it works for Rust too.

      However, this doesn’t really resolve the underlying issue. Tools make it possible to work around the problem if you only make new releases fast enough. They don’t make things work smoothly. It’s like adding a device to push square wheels and then claiming they work as well as round ones.

      1. You may be referring to Dependabot, and it does support Rust:

        https://dependabot.com/rust/

        It sends pull requests to the project, and even includes some handy information like the upstream changelog. Here’s an example pull request it made:

        https://github.com/bowlofeggs/rpick/pull/39

        This, of course, doesn’t necessarily help at the distribution packaging level, but it does help an upstream project stay on top of updates, and to stay aware of security issues.

    1. I don’t really know the Node ecosystem (and I don’t really want to know it, I guess), so I can’t really tell.

    2. I know nothing of node but I was bored (and tired of waiting for nodejs in gentoo) and I made an experimental nodejs overlay [major help needed :)].
      My approach is to unbundle and unpin everything (currently there are >2000 packages but I’m not scared of the number).
      What have I seen in nodejs packages? Bundling is the way it works, so you end up with a 1 GB atom folder, and pinning is widespread, but at least they enforce semantic versioning. Also, I see that a lot of packages depend on/bundle 4-6 year old dependencies that are deprecated (by upstream) and unmaintained.

  3. > This also implies that you need to have a system that keeps track of what library versions are used in individual programs.

    I assume such a system doesn’t exist for gentoo? How far would checking EGO_SUM and CRATES help? (Those don’t list bundled dependencies, but would it be a start?)

    Also, is this really a new problem? What’s the story with tracking C++ header-only libraries, npm dependency lock files, or maven poms?

    1. There’s no such system at the moment. It is possible that the approach used by modern eclasses would help for Go/Rust. Still, somebody needs to do the work.

      As for header-only libraries (or pure static libraries), subslots provide a ‘usually good enough’ solution to that. That is, as long as all consumers depend on them directly.

      I don’t know much about NPM, and I know we haven’t managed to deal with Maven yet.

      1. > Still, somebody needs to do the work.

        I felt like doing a bit of it. https://github.com/jcaesar/ebuild-crates-check (Sorry, no README yet.)

        Out of the 1983 crates in use over all overlays, 43 of them have advisories open. https://gist.github.com/jcaesar/53ed6ef09233987cbb7c8b21ba336ef3 But now I’m unsure what to do about them. Teach my tool to open issues on bugs.gentoo.org for new advisories? Manually examine whether the advised issues can even cause a problem? Hm…

  4. Thanks for that very nice summary. The only thing I’m missing is the fact that containerized package formats like AppImage, Flatpak, Snaps or Docker images have more or less the same issues as bundling/vendoring dependencies and accordingly have a similar if not worse security impact.

    Unfortunately, like many application developers, at least one of the bigger Linux distributions has also fallen for that easy path to the dark side of packaging and has unnecessarily started shipping dummy packages which pull in containerized applications. (Yes, I am looking at you, Ubuntu.)

  5. Too true. We’re in the dark age of software security and data safety. Companies have almost no reason to invest in it. Apparently the number of opportunistic software engineers who do not care about this either seems to be growing, and a lot of younger ones learn this bad behavior.

    In my opinion this isn’t a problem which can be sorted out at the engineering level; it’s a social and political one: the question is who can be held responsible. As long as companies and their leaders cannot be sued out of (economic) existence for data breaches and the like, nothing will change.

    If the law forced long support time frames for software and computer systems, publication of the software components/libraries used, and prompt publication and correction of security issues and other severe bugs, the software development and packaging world would be a different one, simply because of the changed requirements.

    Unfortunately our societies do not care or even understand the problem.

  6. Ideas:
    – Use-Flagging: BLOB_UPSTREAM_UNSUPPORTIVE
    – Installing in (para-)virtualized environments only
    – In the worst case it may be better to package mask this sort of software and not invest developer time

    1. I’d read the “Gotta go deeper” part of Let’s Be Real About Dependencies before making that decision.

      TL;DR: Due to the historical lack of a distro-independent dependency management system, pretty much any large C or C++ project is going to be a horror show of “library as a single header that’s designed to be vendored and has drifted away from every other copy in the distro package repository” and bespoke implementations of things.

  7. The bit about rust and llvm is (as far as I understand) completely false. Rust supports (and tests against) stock LLVM. Can you elaborate where you got your information there?

    1. Gentoo rust maintainer here:

      I’m not sure about testing right now, maybe they do now. But some time ago tests were run only against the internal llvm copy. I think an option to download their pre-built llvm for builds/tests has even landed nowadays, which narrows the test/build scope even more.

      as for “supported” (quoted on purpose), sure it kinda is, but in reality it is not. The internal llvm copy is pinned and contains some specific codegen patches on top, so it differs from system-llvm.
      also, the rust ebuild has a test suite, which I do run on bumps, and I see codegen failures with system-llvm more often than with the internal one. not really surprising.
      we even have a bug for that with more points: https://bugs.gentoo.org/735154
      lu_zero will have more info on specific examples, because they were able to trigger a specific breakage with system-llvm and not bundled one.
      to be specific, I run tests on amd64, arm64 and ppc64le, with both system-llvm and bundled one. system-llvm in most cases fails more tests, but not always and situation has been better lately.

      major blocker: one can’t really build firefox (or other mozilla rust consumers) with system-llvm 11 and rust llvm != 11. So one can’t build firefox/thunderbird/spidermonkey if there’s a mismatch between llvm versions.

      and cherry on top, debian, fedora and other distros do prohibit bundling of llvm. but they have a rather dated version of rust compared to gentoo, so they can move slower.
      In gentoo we have to keep a bootstrap path (because rust has specific bootstrap version requirements), and for stability I’ve been forcing the llvm dep of rust to be the same as the internal llvm, sans the patches.

      as for FreeBSD, last time I checked they used the bundled one, despite llvm being a base system component. Maybe that has changed, they are making a lot of progress, but maybe not.
      they’ve also switched to using bundled libssh2 and libgit2 (as we did as well) because updating system copies of those packages broke cargo to the point where rust was unable to update itself.

      there are a lot of problems not visible to consumers, but developers and packagers see and deal with them daily, as the blog post title says.

  8. One upstream that drove bundling to excess is MoonchildProductions with their Firefox fork called PaleMoon. They removed all the configure switches for enabling the use of system libs, so you can only use their bundled stuff. Security at its best.
    I was even badmouthed for spreading FUD when I tried to convince the developers of another Firefox fork called Waterfox not to merge with Moonchild’s Basilisk browser (unbranded PaleMoon):

    https://github.com/MrAlex94/Waterfox/issues/1290

    Unfortunately the original issue at MoonchildProductions regarding bundled libs has been removed.

    I’m just glad that MoonchildProductions did this stupid move before I started to waste my time adding PaleMoon to Gentoo. I also convinced some other Gentoo devs to not waste their time on this as long as this ignorant upstream continues with their bundling stupidity.

      1. This is not 100% true. The clause in the license only states that if you are compiling a build of Pale Moon with modifications, you must build with the --disable-official-branding flag and cannot call your package Pale Moon.

  9. OK, this sucks. I just looked at sys-apps/ripgrep’s ebuild. Why is dependency pinning so widespread in the Rust ecosystem? Are updates to its dependencies more or less always breaking a program or what?

    With regard to static linking:
    Wouldn’t it be enough to generate a list of build-time dependencies from the ebuilds of statically linked programs, compare that to the output of ’emerge -uDN @world’, and then trigger an einfo to run something like ’emerge @static-rebuild’ when a match is found?

    Each time an update to one of the behemoths (libreoffice, firefox, etc.) is pushed to the tree, I have to wait a full day anyway until the build is finished on my low-end system.

    1. Dependency pinning is usually not widespread.

      In Gentoo we use the provided Cargo.lock to pin the dependencies in the ebuild, but if the need arises we can use the same crate and generate a Cargo.lock with newer dependencies.

      By default `cargo install` does not use the Cargo.lock.

      Yet I have heard people argue that it should, because they like to have a set of known-to-work dependencies.

      I suggest that anybody with strong feelings about it first read how cargo works and then pour effort into making the current situation better instead of perpetuating imprecise takes ^^.

  10. Hi! Rust core team (though not compiler team) member here.

    There’s a lot of opinion in this piece, which is totally fine of course, but I wanted to point out a factual inaccuracy:

    > Rust bundles a huge fork of LLVM

    It is not super huge, and we try to upstream patches regularly to minimize the fork. We also regularly re-base all current patches on each release of LLVM, so it’s less ‘fork’ and more ‘maintain some extra commits to fix bugs.’

    > and explicitly refuses to support to distributions using the genuine LLVM libraries.

    We always support the latest LLVM release, and try to maintain compatibility with older ones for as long as is reasonable and possible.

    IIRC, the last time we raised the base LLVM requirement was in Rust 1.49, at the end of last year. The minimum version it was raised to was LLVM 9, which was released in September of 2019. The current release is 11.

    Again, because of the lack of the extra patches, you may see miscompilation bugs if you use the stock LLVM. There’s pros and cons to every choice here, of course.

    For more on this, please see https://rustc-dev-guide.rust-lang.org/backend/updating-llvm.html

    The Rust project is and has been interested in collaborating on issues to make things easier for folks when there’s demand. We’ve sought feedback from distros in the past and made changes to make things easier! I would hope taking advantage of the stuff above would solve at least some of your issues with packaging rustc itself. If there are other ways that we can bridge this gap, please get in touch.

    1. >> Rust bundles a huge fork of LLVM

      > It is not super huge, and we try to upstream patches regularly to minimize the fork.

      I am sorry about that. I didn’t mean to say that the divergence is huge but that LLVM is huge. I tend to edit myself a few times, and I must’ve accidentally changed the meaning at that.

    2. If I had one (only one!) free wish from the Rust developers, I’d ask for better support for dynamic linking :-)

      I do understand that defining a stable ABI is not an easy task.

      At the moment the only option is to build a library in Rust, but export only a “C” API/ABI, and let the Rust application consume that “C” API/ABI, right?
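
      Something along these lines, I suppose (a minimal sketch; the crate and symbol names are made up, and the crate would be built with crate-type = ["cdylib"]):

        // lib.rs of a hypothetical crate built as a cdylib: the exported
        // symbol gets a C calling convention and an unmangled name, so the
        // resulting shared library can be loaded and replaced like any
        // other .so, independently of the Rust compiler version.
        #[no_mangle]
        pub extern "C" fn mylib_add(a: i32, b: i32) -> i32 {
            a + b
        }

      The downside being that everything crossing the boundary has to be expressible in C terms (no generics, traits or rich enums).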

  11. Hi Michael, I enjoyed your post as it provides some useful context for those unaware of common issues faced by distros. Do you know of any other blogs with similar posts?

    The issues you describe seem very much like a cursed problem found in game development, if you aren’t familiar with that term there’s a great GDC youtube video on it.

    What are your thoughts regarding allowing both behaviors in a way that effectively favors dynamic linking under the hood?

    I would think the first step would be getting visibility on the problem of similarly classified code between dynamic and static binaries.

    There really isn’t much to be done regarding upstream maintainers not paying attention to issues, unfortunately. The only thing I can come up with is making it easier to collaborate on issues, but it’s been my experience that these people have a vested interest in not changing their workflows (even if it would help them in the long run).

    Snap is a perfect example of what’s wrong with the linux ecosystem.

    1. I can think of three that touch somewhat related topics, and haven’t been linked here already:

      The Flameeyes’ classic on bundling:
      https://flameeyes.blog/2009/01/02/bundling-libraries-for-despair-and-insecurity/

      Michael Orlitzky’s on programming language design:
      https://orlitzky.com/articles/greybeards_tomb%3A_the_lost_treasure_of_language_design.xhtml

      Drew Devault directly on Rust:
      https://drewdevault.com/2021/02/09/Rust-move-fast-and-break-things.html

  12. apologies if this is a naive or dumb question, but in the bundling case, why is it bad to depend on the app developer to address security issues? it seems the user has to trust the os provider or the app developer to fix security issues as they arise. shouldn’t we look to the app developer themselves to keep their own apps updated for security? it seems like having every os manage security updates of every library is a lot of duplicated effort.

    1. I can think of a few reasons at least.

      Firstly, because it reduces the number of vendors you have to trust. Generally, you can expect that if somebody finds a vulnerability in a package, a CVE will be requested. Based on the CVE, the distribution security team will know about the issue, monitor it and ship any fixes or workarounds necessary. Upstream’s help is nice but you don’t have to strictly rely on it.

      Secondly, because distributions have hundreds of developers, including dedicated security teams. The bus factor is much higher than for most upstreams that can disappear at any time and leave the package full of holes. Unfortunately, this is happening far more often with ‘new’ upstreams.

      Thirdly, because upstream developers generally maintain only the very newest version of their package. You may think distributions are doing duplicate work, but in fact it is bundling that makes them do duplicate work. There is little duplicate work in fixing a vulnerability in a shared dependency once. There is a lot when you have to wait for 20 different packages to release a new fixed version, then test and update all of them. And it isn’t uncommon that the vulnerability fix comes on top of other changes that could introduce bugs or incompatibilities.

      Not to mention the case when someone bundles an older version of your library, which in turn bundles vulnerable versions of other stuff. Even if you address the issue quickly, the users depend on a whole chain of people who recursively address the problem.

  13. This post is very useful for helping application developers understand dependency issues from the PoV of those who have to maintain packages and systems. It’s really very good from that perspective. Application developers don’t always think about this.

    However, it also largely neglects discussion of the *tradeoffs* for developers making these kinds of decisions. I understand that pinning dependencies limits flexibility for updating, and that can be frustrating. But, if I want to ensure that I’m shipping correct, tested software, the alternative for the developer is to run tests against a combinatorial number of possible dependency versions. For a large project, this is not feasible, even if one limits it to recently released dependencies. I want to ship software that I *know* works, and I can only do that if I know what my dependencies actually are so I can test against them. New updates break stuff **all the time** and, as anyone who has been around a while knows, software does not always improve monotonically with releases.

    1. This is a hard problem. I do understand your reasoning and sympathize with your concerns but it’s not really doable long term.

      If you are shipping an end-user application that can reasonably work in an isolated bubble, then sure, that could work. If you’re developing a library or module that is supposed to be used by other programs, it won’t. Even if you can reasonably assume that people will run it in a virtual environment or the like, the other dependencies of the final product may get in the way of the versions you’ve tested.

      https://github.com/dask/s3fs/issues/357 is one example of the problem. While my original report was from the perspective of using system-wide packages, a large number of replies are from people who ended up with dependency conflicts because of two dependencies requiring different versions of botocore.

      Secondly, what worked for you doesn’t necessarily work for end users. You’re probably testing your package on a few platforms at most. The versions you’ve used may contain major bugs on other platforms that you’re not aware of, nor can test.

      Finally, in my opinion this attitude actually lets things get worse. I don’t think most of the people deliberately want to break your software. I’m afraid that some breakage simply needs to happen for people to be more conscious of the problem. If everything is always pinned, people see no reason to try to avoid breaking stuff.

      1. PHP web dev here.
        While I have been bitten a couple of times by system-distributed dependencies missing bugfixes that are present in the bundled versions (recent example: bugs in the abandonware xmlrpc-epi lib fixed in the php sources but neither upstream nor by the debian maintainers), I think that the ‘old school’ approach of having distro maintainers deal with a good chunk of the security/update headaches has many advantages compared to letting the app developers themselves take care of it, such as:
        – relying on the distro updates makes life easy for the end user, as they don’t have to track a million software sources. Debian was a godsend compared to windows as it has everything+kitchen-sink in the base repo
        – distros will often maintain an app version much longer than upstream, adding only security fixes. good! enterprise users love long lived apps
        – distro maintainers are basically extra hands that upstream devs are able to take advantage of, as well as the more-eyes which make bugs shallow

        The problem I see is that developers increasingly adopt a very entitled attitude of ‘latest version or the highway’ (aka the let’s-only-offer-our-product-as-a-cloud-service syndrome). It’s not without reason, as maintaining many versions of anything comes with a cost, but still…

        Want a security fix? You can only have it along with new features and changes in api and gui. Oh, and the extra features, which you probably never asked for.

        The fact that VMs and isolation technologies make it easy to sandbox apps from each other does not help end users as much as developers think, as end users still face the dilemma of either upgrading to the latest and greatest and paying the integration cost that it entails, or firewalling off the obsolete app in a concrete bunker and letting it run without updates, hoping it never breaks.

        The fun thing is that, as Josh stated, developers _know_ that updates break things all the time. But they keep pushing the release cycle ever faster instead of slowing down! With the atomization of dependencies, the complexity of having a coherent set of working components snowballs of course, but the proposed solution is never to slow down – it is to bundle instead…
        Devs can at least bundle and ship their dependencies, but what can users do? They are lucky when an LTS release is available – though some apps now think that 3 years is sufficient for lts, and they don’t even have overlapping lts versions (did someone here mention a compiler version from 2019 as being old enough ???).

        I also find it quite candid to admit that CI and test matrices are not good at solving the problem of finding bugs with dependencies, as that’s one of their big selling points :-O

        If we are not able to tackle the problem at the source (pun intended), we definitely will not be able to tackle it at the ecosystem level.

          The reason we developers push for faster cycles is simply the wide adoption of continuous integration, in practice and in terms of mindset. And Gentoo is, and always was, in a way, a great vehicle for that if you think of it. Most of us update our systems somewhere between daily and weekly, I’d guess? And we do so with --update --deep --newuse --changed-use and revdep-rebuild and all, and on different platforms, too. We’re really one big CI machine. And I think the open source world as a whole is better for it.

          But back to the question of fast vs slow update cycles in terms of software maintenance. I tend to believe that the frequency of updates is inversely proportional to their impact on a code base. In other words, updating a slowly developed dependency is more likely to break many things at once, causing a lot of work at once (obviously) for downstream users, while a dependency that releases frequently will break things all the time, but in fairly small ways, and thus cause less work at once.

          Now, a fast release-cycle typically benefits both upstream and downstream. Upstream gets feedback on the impact of smaller change sets sooner, and downstream will typically be able to “adopt on the go” during normal feature development. Downstream thus is not overloaded with a major update (halting feature development) due to major breakage in many spots at a time that is convenient to upstream but not downstream, other than not adopting upstream’s new major release now or ever. Such non-adoption has been quite a problem for a long time. (Just look at how many packages were removed due to not adopting supported python versions in the last half year.)

          But: There is one problem with this fast approach. Upstream typically will not be able to maintain ABI stability in a traditional way, or at least that ABI stability has a shorter life than traditionally, and that’s where bundling comes into play. It’s essentially a (foul) way for downstream to emulate traditional ABI stability “internally” at the expense of upstream’s ability to develop their product and push out fixes of any kind — performance, features, or hardening. And *that* is a problem, because what’s supposed to be a win-win turns into a “I win and I don’t care about my surroundings” kind of situation. Now, will the bundle’s provider assume responsibility if their *bundle* now exhibits a security bug that may even be fixed upstream already? Or would they *still* say, “ultimately it was upstream’s fault?”. I think I know the answer.

          Maybe we’ll see a Heartbleed-like vulnerability sometime, and we’ll see what impact it may have on the community and the question at hand, then.

      2. Referencing my previous comment and your reply:

        Yes, I agree that for libraries, it is a different situation. In my case, I have a large end-user application. It’s compiled statically, so no one is ever going to have conflicting packages. When there are bugs, we fix them or we’ll lose users.

        I think the approach you’re recommending is appropriate for dynamically-linked libraries. It makes a lot of sense. But I don’t think it’s a one-size-fits-all pattern.

    2. I think the core of the problem is this desire: “I want to ensure that I’m shipping correct, tested software”. Your users will inevitably run your code differently than your tests, so this desire simply can’t be satisfied. Also if you want that, then only ship *your* code (no bundled code, so you’re off the hook for bugs in those). Just clarify with which dependencies your code has been tested (that can be useful info for your users), and when a problem comes up because of a bug in some dependent library, forward the bug to them (and inform the users who reported that bug, to clear your name ;-)

    3. > But, if I want to ensure that I’m shipping correct, tested software, the alternative for the developer is to run tests against a combinatorial number of all possible dependencies.

      There is actually an alternative: Guix and Nix.
      If you know all the commits of your channels and you know the configuration of your system, then you know precisely what you have deployed.
      It also comes with a bunch of other nice features, like having multiple conflicting versions of packages (wanna test the latest commit or an in-development branch of Blender in Guix? guix environment --with-commit=blender=whatever --ad-hoc blender -- blender), atomic upgrades without needing a fancy file system, both system generations and per-profile generations so you can roll back upgrades or just run something from a previous profile generation without rolling it back (still no fancy file system necessary), and probably a bunch more things I left out.
      Problem is, cargo doesn’t get along nicely with Guix. I am trying to package Supertag and I have to recompile all of its dependencies (even transitive ones) every time I try to build it. And yes, I also ran into some pinning issues.

  14. I am more a Linux user than a dev, but it seems this post addresses the results, but not the causes, of a problem that is huge within the GNU/Linux ecosystem: first, packaging that is too slow nowadays, and second, the lack of a stable API for user applications.

    The first: It seems to me the packaging systems of traditional linux distributions come from a time when you shipped software on floppy disks. Nowadays as a developer you are almost always presented with outdated libraries. Plus, it takes even longer to get your software packaged and distributed within a certain distribution. Keep in mind there are very many of these, and almost no one is able to package his/her software for all of these dists and their deps. Modern language environments seem to try to solve this by shipping their own deps?

    Second: It seems to me the whole problem would not exist if the GNU/Linux world had at one point agreed on a stable API for (graphical) user applications. If every distribution shipped a stable set of such a standardized API, there would be much less need to ship your own dependencies again and again. Once built, your application would run everywhere, independent of the Linux distribution. See macOS/Cocoa for a solution. The GNUstep project once began to port this approach, but never succeeded (and even nowadays does not have anything like stable API versioning).

    So maybe it’s time for a new approach to gather all the different ways on how to ship software within the GNU/Linux world? (And no, it’s not flatpak/AppImage for sure).

    1. Rather than the lack of a stable API, I think the fundamental cause is that Linux / BSD / your-flavor-of-Unix (and also Windows and Mac) don’t have a default mechanism to allow coexistence of multiple versions of the same software / library. No, packaging multiple versions under independent package names such as “Python2” and “Python3” does not count.

      A fundamental mechanism for multi-version coexistence should let the developer/user select which version of an application links to, or calls, which version of another piece of software / library. In this scheme, an update to a program would relink everything that relies on that program against the latest version by default, with a mechanism for other software to fall back to older versions on a software-by-software basis when such an update breaks a particular piece of software. This way, security updates are deployed speedily (only fundamental system software needs to be tested before new libraries are submitted to the package repository), while any potential breakage of non-essential user applications can be worked around temporarily by users in a few easy clicks (or a few lines of commands).

      NixOS and Guix are doing the “pinning dependencies” thing, so their multi-version coexistence mechanism is still quite nightmarish in the eyes of the blog author here. To my knowledge, only GoboLinux has implemented this kind of speedy flexibility, but they don’t have a large team to package everything plus the kitchen sink, so their design hasn’t had a chance to shine. I am not sure if GoboLinux has gotten the program config / user setting files into the versioning mechanism yet. If that piece of the puzzle is also done, then dependency hell will finally be history.

      1. I think we are talking about the same issue. You can only have a stable API if you have a versioned one (at least if you want to make any progress).

        I think macOS does have its versioned SDKs, which is pretty much what I meant. Even at 10.14 you can execute an app that was built against a 10.4 SDK, because that API has been kept maintained and stable.

        1. Versioning declared by application / library developers is different from versioning auto-handled by the OS software update system.

          Versioning declared by application / library developers can’t mitigate accidents where a security update breaks unpopular software / hardware, leaving the users of such software / hardware stuck in the snow.

  15. For a few years, I maintained paraview in Gentoo. paraview is the showcase application for vtk (the visualization toolkit), which it builds upon and “vendors”. paraview, vtk _and_ cmake are all products of the same company.
    Regular releases of vtk happen but when they release paraview, paraview uses a git snapshot of vtk at the time of release. For years upstream promised distros they would stop. I believe we are still waiting.

    That was my horror story as a maintainer. I work in “eResearch”. We have issues with reproducibility – which is a major scientific problem. It is often found that if you want to reproduce some results exactly, you may have to recreate the software stack used in the original work in the exact same state and with all the flaws :( most likely in an emulator of some kind, or a container if your stuff is recent enough. https://www.nature.com/articles/d41586-020-02462-7

    1. Yes, reproducibility is a big problem, and pinning dependencies goes a long way towards achieving that.

      The modern packager wouldn’t be complaining about dependency pinning being a “security nightmare”. They would be rolling their sleeves up and building tooling to leverage the rich information provided by Cargo.lock. Security update? Maintain a reverse index of dependencies to packages and simply rebuild packages when they change.
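
      A rough sketch of what such a reverse index could look like, assuming the toml and serde crates (the package names and lockfile paths below are made up):

        use serde::Deserialize;
        use std::collections::BTreeMap;

        // Only the fields we need from Cargo.lock; unknown fields are ignored.
        #[derive(Deserialize)]
        struct Lockfile {
            #[serde(default)]
            package: Vec<Package>,
        }

        #[derive(Deserialize)]
        struct Package {
            name: String,
            version: String,
        }

        fn main() -> Result<(), Box<dyn std::error::Error>> {
            // (distribution package, path to the Cargo.lock it was built from)
            let inputs = [
                ("app-misc/foo", "foo/Cargo.lock"),
                ("net-misc/bar", "bar/Cargo.lock"),
            ];

            // "crate version" -> distribution packages that embed it
            let mut index: BTreeMap<String, Vec<&str>> = BTreeMap::new();
            for (pkg, lock_path) in inputs {
                let lock: Lockfile = toml::from_str(&std::fs::read_to_string(lock_path)?)?;
                for dep in lock.package {
                    index
                        .entry(format!("{} {}", dep.name, dep.version))
                        .or_default()
                        .push(pkg);
                }
            }

            // An advisory against e.g. "smallvec 1.6.0" now maps directly
            // to the set of packages that need a rebuild.
            for (crate_id, packages) in &index {
                println!("{} -> {:?}", crate_id, packages);
            }
            Ok(())
        }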

      Dynamic linking is an outdated, outmoded way of running a computer system. It is only suitable for languages like C, and is unsuitable for languages like Rust which make heavy use of monomorphization and macros. Rebuilding is the only solution here.

      Unlike with other ecosystems like Node, ignoring Rust is not really going to be possible either.

      1. Personally I don’t like having giant rust/go binaries because of the static link. Code that can be shared between binaries should go into a library.

      2. ‘Simply rebuild’ may sound sensible when Rust is not widespread. If you practically have to rebuild your whole system every week, it’s not going to be nice. Even if you don’t rebuild yourself and just use a binary package vendor, the resulting transfers will be huge.

        There are systems installed in places with poor Internet connectivity. There are systems without Internet access at all.

        Cargo does not scale.

  16. Just for interest. I understand the benefits of dynamic linking since it automatically uses (at loadtime) the fixed/newer library. However, I also understand that most languages have no defined ABI, since this is a hard problem.

    AFAIK, only C (and Swift?) has a stable, usable ABI. C++ only has a kind of ABI that fails as soon as templates are used. If I get it right, this is solved in Gentoo via subslots/extra versions/etc.

    Let’s assume Boost as an example. If a vulnerability is found in Boost template code, the Boost ebuild is updated, the package manager is aware of the dependencies and rebuilds all of them.

    So the security problem is kind of solved here. It triggers _a lot of_ rebuilds, but at least this is an automated process.

    I’m asking myself why this is not possible with Rust packages. AFAIK Rust works in this way: 1. Download all dependencies, 2. Build them, put them into some cache, 3. Build the actual package, 4. Statically link them all together. Currently, Gentoo does step 1 in advance, so 1. downloads can be signed, 2. downloads are shared across Rust packages, 3. downloads are not part of the sandboxed build process.

    Is there some cargo feature to do step 2 also in advance? In this case several rust packages could exist that are not functional by themselves but only serve as dependencies. For example, the regex crate could be “precompiled” and loaded into Cargo’s cache in advance (in several versions). If I get it right, this would solve at least some of the security concerns, since portage would see an updated regex ebuild, merge it, and trigger a rebuild of every package that depends on regex.
    Of course, the binaries themselves are still linked statically, but the “keep your crate dependencies updated” task is shifted towards the Gentoo maintainers and fully automated.

    I mean, the typical rust developer doesn’t come into contact with the fact that rust is statically linking its libraries. It is all fully automated with Cargo. If Rust introduced a stable ABI and dynamic linking, packages could switch nearly instantly. But that also means that Rust dependencies are managed in a clear way (via Cargo). Maybe this management can be used not only to download the packages in advance but also to propagate the dependencies towards the distribution’s package manager.

    1. > Is there some cargo feature to do step 2 also in advance?

      I haven’t tried it for a whole system, but one trick that’s suggested for caching intermediate artifacts across projects, if you don’t mind turning cargo clean into a nuclear option, is setting a single value for the CARGO_TARGET_DIR environment variable for all the projects you’re compiling.

  17. Thanks Michał for your excellent post!
    I also think you are right: central package management + dynamic linking is the correct way!
    But I would also like to defend rust a bit here.

    Why am I a rust developer (the perspective of a cross platform dev)?

    I have been working on cross-platform applications for a long time now. Amongst my target platforms, there is one which ignores the fact that there is such a thing as a central package manager (at least until now: https://docs.microsoft.com/en-us/windows/package-manager/).
    As a consequence, you are basically forced to bundle dependencies for software for this OS. Maybe this is the only point where I disagree with you, Michał (“it’s not really doable long term”).
    I’m not saying that its path is desirable, and also not saying that it is an example of good security, but somehow Windows has managed to survive since the 80s. But maybe I just have to think in even longer terms. Time will tell, but my guess is: chaos always finds a way to survive!

    If you want to develop, the first step is a cross toolchain. Here gentoo’s crossdev did an amazing job!
    As a next step I had to cross compile all the libraries, together with all their dependencies. In my case cairo and / or gtk+. Yes, I just could have used fedora’s repo of precompiled mingw libraries: https://fedoraproject.org/wiki/MinGW, but what if you want some special features and you need to tweak some configure option?
    Back to gentoo / crossdev!
    A lot of ebuilds had some issues with cross compiling, which I tried to fix and push upstream. I spent a lot of time. And as I did the next release, some of the packages had changed their build system from autotools to meson. More fun with fixing!

    When I started with rust, things changed:
    Only one command to set up a cross toolchain for windows (you still need a mingw gcc toolchain for the linking). Also only one command to set up a cross toolchain for arm (my raspberry pi), or for musl binaries (single-binary docker containers, ~10MB only), or for wasm (web applications)!
    No messing with different build systems and manual cross-compiling.
    And you don’t even need platform-dependent code anymore: a single codebase for all those targets! It just compiles. How cool is that?

    This is the reason why I think rust did not do everything wrong!
    Rust built a bridge for me from my linux environment to other platforms. Therefore I hope that rust is not a threat but an opportunity.

    I hope that many rust core / compiler devs will read Michał’s post!
    Maybe they will finish the incomplete bridge and eventually introduce some solution to the bundling problem!
    I think both sides would profit a lot here.
    Rust would leave its cargo capsule and make rust software more attractive to end users (most end users just won’t install rust software if they first need to install cargo as rust’s package manager).

  18. I’m still in the early stages of developing cloud-based applications and architectures, and have put my focus mainly on node.js using both javascript and typescript. I also use a containerization method that combines docker and gcp source repositories. I use a combination of dependabot on github, where I keep my source code, and container analysis on gcp, which links my github repo to gcp source repositories and has a workflow to update the docker containers that I use for app engine and cloud run deployments. I write all my code in vs code on debian buster under wsl, which has countless extensions to keep you up to date on version control, with links to vulnerability documentation and guidance on how to properly update packages and deal with breaking changes. It might sound all over the place, but this way I can use the latest stable versions of packages while coding my app, with dependabot scanning my source repo, gcp container analysis alerting me if dependabot missed any vulnerabilities, and a containerized microservice architecture so that breaking changes only break a single atomic aspect of my application, since the microservices run in isolated containers glued together with functions. Not to mention npm install and npm audit are pretty good about letting you know of any new vulnerabilities and provide links to documentation on how to fix them. Running deprecated packages is a sign of laziness and a lack of pride in the application you develop, when there are tons of resources out there to automate things and inform you that updates to your packages are needed.

  19. Hello there. ‘[T]hey become the integral part of the program’ is poor English. It should be: ‘they become an integral part of the program’.

  20. Michael, many thanks for the description! This seems indeed unsustainable. One can only guess how much work this produces.

    Now, it does not seem to be a good option for packagers to try to control the developers’ behavior, or to go on complaining, which means just taking some kind of victim position.

    There also seems to be a degree of inconsistency: library developers obviously want stable interfaces in their own dependencies (otherwise they would not pin versions), but many do not want to guarantee stable interfaces themselves for *their* users. They want their stuff to be quickly present in common distributions, but on the other hand, they often do not want to use the current library versions which these distributions provide.

    Meanwhile, users certainly want backward compatibility and stable systems (I remember vividly the unhappiness around GNOME 3).

    I also agree with Michael that a lot of this comes down to providing backward-compatible and stable APIs. Distributions should definitely push for that.

    Perhaps it is a solution (or part of it) that distributions make the situation more transparent, group libraries and applications into tiers, and provide their users with choices about which tier of API stability they want to install. The system package managers would display the tier for each package. Something like a traffic-light system:

    – “Green” for libraries and applications that provide stable APIs, use dynamic linking, and have a stable ABI.

    – “Yellow” for these that may not have a stable ABI and may use static linking, but provide thorough, machine-readable information about dependencies, and use and provide stable APIs. Yes, that would include Rust software, at the time being. Users could still expect timely security updates for such packages, but would need to be prepared to suffer enormous download sizes.

    – “Red” for those that use bundling or “vendoring”, or do not provide a stable API, or do not make explicit which static dependencies are used exactly.

    And yes, this classification would be transitive – if an application uses a “Yellow” library, it can only be “Yellow” itself, and the same goes for “Red”. Grouping down according to API stability would need to happen automatically, no questions asked, and grouping up again should only be possible after at least one year of probation time.

    Users could configure from which set of packages they want to install, and they could set exceptions for difficult cases like web browsers. They could decide themselves which software is important enough that it warrants such exceptions.

    My expectation is that most users want systems which at the core, and with more than 98% of packages, consist of stable, secure, well-vetted packages, and for a few packages with actually important new features, they would perhaps make an exception. Maybe a new version of a language like rust or OpenCV, just for toying around. (And even for this class of applications, the Guix package manager running on top is possibly a better choice, because it can isolate unstable versions).

    Such a system would promote a culture of stability of external APIs (which requires a minimum bit of skill but is technically no problem at all, just a matter of intent, as e.g. the Linux kernel shows quite well). Libraries and applications that want to remain “green” would need to be careful not to choose dependencies that turn yellow or red – it is their responsibility which dependencies they choose. Which is also somewhat a matter of trust, but this is something which app developers and library authors can figure out between themselves.

    The result would consist of a core distribution of “Green” packages, with an optional set of “Yellow” and “red” add-ons, which might perhaps be used mainly by developers who do not care if their app does not run yet on a “Green” system.

    Such a stability-graded tier system would also be a great advantage, e.g. for scientists, who almost never have the man-power, resources, or interest to constantly port working software to newer libraries. And the same is probably true for many businesses and corporations – for these, stability counts; that’s why some still use COBOL.

    And finally, there would be less reason left for software developers to complain that distributions do not package their new stuff. (They could instead complain that users do not install and use their stuff, but that would obviously be a bit silly….)

  21. While I sympathize with your situation, it’s only fair that I point out that, when people try to argue that C++ is superior to Rust because C++ has a de facto stable ABI, your The impact of C++ templates on library ABI is part of one of my explanations for why Rust has no stable ABI. (Given that Rust DEFAULTS to using monomorphic dispatch.)

    I pair it with The Day The Standard Library Died by cor3ntin, which talks about the costs of C++ maintaining that stability and point out that Rust v1.0 didn’t ship with things that required breaking the ABI, like automatic structure packing. (Any bystanders who don’t know what structure packing is are invited to read ESR’s The Lost Art of Structure Packing.)

    1. Ugh. Don’t post while tired. I hate that WordPress has neither a post preview nor an edit window in its default configuration.

      Could you please close that <a>?

      1. Done. Don’t worry, I hate wordpress too. I just don’t have time to look for an alternative that would support commenting and be reasonably sane.

  22. Extremely interesting post, and very general. All distros have a similar problem, and each new language introduces new ways to track dependencies.

    Sometimes it is for good reasons, e.g. features of the language that don’t fit well in the model of pre-compiled dynamic libraries. The problem already started with C++, as some pointed out, e.g. header-only template libraries are the ultimate form of bundled libraries, because now they are dispatched across your binary code in the form of deeply embedded inlined code, in a way that depends on the actual template instantiation arguments. Rust has a similar problem.

    Having worked on the first standardized C++ ABI (the Itanium C++ ABI that was later adopted by GCC), I can tell you that over twenty years ago, we fully recognized the problem, but had no clue how to solve it. It must be really hard to solve, since even today, we don’t know how to deal with it.

    Short of a fully abstracted binary representation that would be able to do dynamic instantiations, like a supercharged JVM or WASM, we will always have a trade-off between dynamic code with C-level interfaces or static code with more sophisticated semantics for generics.

    Even if we solve that problem, we will also always have a problem that pushes people to pin specific versions. Sometimes, a library you depend on changes semantics. The most egregious example of this is LLVM, which broke compatibility with every release, not just at the binary level, but even at the source code level. It’s a nightmare.

  23. Without taking a bold stance on how developers should start operating tomorrow, or even next month, I think there are a few points worth tossing into this discussion.

    – Dynamically-linked libraries used to be considered an optimization, not an essential security feature. When I was using Guix some time ago, they did this thing called “grafting”, where they would patch the binary of a library or executable, I think again primarily as an optimization.

    – This problem has more or less been solved, in theory, with https://wiki.haskell.org/The_Monad.Reader/Issue2/EternalCompatibilityInTheory … the basic idea being that developers pin to an API version for every dependency, but then any time they release an API-breaking change in their code, they are obligated to re-implement the old API in terms of the new one.

    – Going more in the direction another commenter suggested, “you’ve got to be able to fix it at the source”, you could probably at least make progress in concretizing the API-implementation surface by writing everything in a decent type theory (Coq, Agda, F*, some future version of Haskell with ergonomic dependent types and termination-checking) and rendering all of your assumptions explicit. This can, in principle, ensure any properties developers collectively care about enough to formalize. It would slow down software development, but maybe that’s a good thing?

    The least painful way I can think of to start prototyping some of this stuff tomorrow-ish is to start using Agda to write Javascript code.

  24. An alternative approach would be to make software that didn’t need constant “updates” and thus could be finished.

    Ada’s an example of a language that generally splits a library into a specification and a body; so long as the specification remains the same, using a newer version with a newer body is trivial. Now, it’s still possible to break the software, sure, but it’s a lot less likely to happen accidentally. If I had to make software in one of these lesser languages, I’d also demand a specific version of some library, because these people lack standards that should’ve been adopted in the last millennium.

    > However, this is a lot of work, and often it is not even possible because of custom patching, including the kind of patching that has been explicitly rejected upstream.

    One of the most pathetic qualities of modern software is how little of it is reusable. The OO approach, the real one like in Smalltalk, allows for changing behaviour without needing to dig too deep, as in those examples. A Lisp approach, with hooks and ways to add to the code no differently than the author, would also work. Even the approach used in Ada generics, allowing for parameterization of code that doesn’t need to focus too deeply on the particulars of the types, would result in much less repetition in many cases. However, SQLite is written in the C language and probably not suited to many changes at all, and the LLVM compiler playset is also clearly inadequate if the Rust people feel the need to fork it so.

    It’s a Sisyphean task, sure, but eventually the rock will kill this Sisyphus and perhaps then the intelligent people can build something that will work this time.
