I have been a Gentoo developer for over 10 years now, and I have done a lot of different things throughout that period. However, Python was pretty much always somewhere within my area of interest. I don’t really recall how it all started. Maybe it had something to do with Portage being written in Python. Maybe it was the natural next step after programming in Perl.
I feel like the upcoming switch to Python 3.9 is the last step in the prolonged effort of catching up with Python. Over the last few years, we’ve been working really hard to move Python support forward, to bump neglected packages, to enable testing where tests are available, to test packages on new targets and to unmask new targets as soon as possible. We have improved the processes a lot. Back when we were switching to Python 3.4, it took almost a year from the first false start to the actual change. We started using Python 3.5 by default only after upstream had dropped bugfix support for it. A month from now, we are going to start using Python 3.9 even before 3.10 final is released.
I think this is a great opportunity to look back and see what has changed in the Gentoo Python ecosystem over the last 10 years.
Continue reading “10 Years’ Perspective on Python in Gentoo”
PyPy is an alternative Python implementation. While it does replace a large part of the interpreter, a large part of the standard library is shared with CPython. As a result, PyPy is frequently affected by the same vulnerabilities as CPython, and we have to backport security fixes to it.
Backporting security fixes inside CPython is relatively easy. All main Python branches are in a single repository, so it’s just a matter of cherry-picking the commits. Normally, you can easily move commits between two related git repositories using git-style patches, but this isn’t going to work for two repositories with unrelated histories.
Does this mean manually patching PyPy and rewriting commit messages by hand? Luckily, there’s a relatively simple git am trick that can help you avoid that.
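The excerpt above does not spell out the trick itself, so the following is only a rough sketch of the general idea of replaying commits with their original messages, not necessarily the approach the full post describes. It assumes both projects are available as local git checkouts; the repository paths, the revision range and the `lib-python/2.7` target directory in the usage note are hypothetical examples. The sketch pipes `git format-patch --stdout` into `git am`, using the `-p<n>` and `--directory` options to remap paths between the two trees. A patch may still fail to apply, in which case it has to be resolved by hand.

```python
import subprocess
from typing import List


def format_patch_cmd(src_repo: str, rev_range: str) -> List[str]:
    # Export the commits as an mbox stream, keeping author, date and message.
    return ["git", "-C", src_repo, "format-patch", "--stdout", rev_range]


def am_cmd(dst_repo: str, strip: int = 1, directory: str = "") -> List[str]:
    # -p<n> and --directory are passed through to `git apply`; the prefix is
    # stripped first and the new root is prepended afterwards, which remaps
    # the paths in the patch onto a different tree layout.
    cmd = ["git", "-C", dst_repo, "am", f"-p{strip}"]
    if directory:
        cmd.append(f"--directory={directory}")
    return cmd


def transplant(src_repo: str, dst_repo: str, rev_range: str,
               strip: int = 1, directory: str = "") -> bool:
    """Replay rev_range from src_repo onto dst_repo; True if all applied."""
    patches = subprocess.run(format_patch_cmd(src_repo, rev_range),
                             check=True, capture_output=True).stdout
    # Without an mbox argument, `git am` reads the patches from stdin.
    result = subprocess.run(am_cmd(dst_repo, strip, directory), input=patches)
    if result.returncode != 0:
        # git am stops at the first failing patch and keeps its state, so the
        # original commit message is reused once the conflict is resolved.
        print("Fix and stage the failed patch, then run: git am --continue")
    return result.returncode == 0
```

For instance, `transplant("cpython", "pypy", "abc123^..abc123", strip=2, directory="lib-python/2.7")` would try to map a change under `Lib/` in the first checkout onto `lib-python/2.7/` in the second. All of these names are placeholders.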
Continue reading “Moving commits between independent git histories”
One of the comments on the post “The modern packager’s security nightmare” posed a very important question:
why is it bad to depend on the app developer to address security issues?
In fact, I believe it is important enough to justify a whole post discussing the problem. To clarify, the wider context is bundling dependencies, i.e. relying on the application developer to ensure that all the dependencies included with the application are free of vulnerabilities.
In my opinion, the root of security in open source software is auditing in the broad sense. Since the code is public, everyone can read it, analyze it and test it. However, with a typical system install including thousands of packages from hundreds of different upstreams, it is practically impossible even for large companies (not to mention individuals) to audit all that code. Instead, we assume that with a large enough number of eyes looking at the code, all vulnerabilities will eventually be found and published.
On top of auditing we add trust. Today, CVE authorities are at the root of our vulnerability trust. We trust them to reliably publish reports of vulnerabilities found in various packages. However, once again we can’t expect users to manually make sure that the huge number of packages they are running is free of vulnerabilities. Instead, the trust is passed down the hierarchy to software authors and distributions.
Both software authors and distribution packagers share a common goal — ensuring that their end users are running working, secure software. Why do I believe then that the user’s trust is better placed in distribution packagers than in software authors? I am going to explain this in three points.
Continue reading “Why not rely on app developer to handle security?”
One of the most important tasks of the distribution packager is to ensure that the software shipped to our users is free of security vulnerabilities. While finding and fixing the vulnerable code is usually considered upstream’s responsibility, the packager needs to ensure that all these fixes reach the end users ASAP. With the aid of central package management and dynamic linking, the Linux distributions have pretty much perfected the deployment of security fixes. Ideally, fixing a vulnerable dependency is as simple as patching a single shared library via the distribution’s automated update system.
Of course, this works only if the package in question actually follows good security practices. Over the years, many Linux distributions (at the very least Debian, Fedora and Gentoo) have been fighting bad security practices with some success. However, times have changed. Today, for every 10 packages fixed, a completely new ecosystem emerges with bad security practices at its very center. Go, Rust and to some extent Python are just a few examples of programming languages that have integrated bad security practices into the very fabric of their existence, and recreated the same old problems in entirely new ways.
The root issue of bundling dependencies has been discussed many times before. The Gentoo Wiki explains why you should not bundle dependencies, and links to more material about it. I would like to take a somewhat wider approach, and discuss not only bundling (or vendoring) dependencies but also two closely related problems: static linking and pinning dependencies.
Continue reading “The modern packager’s security nightmare”
As of today, the most common implementation of the LZMA algorithm on open source operating systems is the xz format. However, there are a few others available. Notably, a few packages found in the Gentoo repository use the superior lzip format. Does this mean you may end up needing a separate decompressor installed for each format? Not necessarily.
Back in 2017, I entertained a curious idea. Since lzip and xz are both container formats built on top of the original LZMA algorithm, and xz features backwards-compatible support for the earlier container format used by lzma-utils, how hard would it be to implement a decoder for the lzip format as well? Not very hard, it turned out. After all, most of the code was already there; I just had to implement the additional container format. With some kind help from XZ upstream, I did that.
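To illustrate what “container format” means here, the following is a minimal sketch that tells the formats apart by their leading magic bytes. It is not taken from the xz patches and says nothing about how they work internally; the function name `detect_container` is made up for this example. It relies only on the fact that xz streams start with the six bytes `FD 37 7A 58 5A 00` and lzip members start with the ASCII string `LZIP`, while the legacy lzma-utils format has no magic bytes at all.

```python
import lzma

XZ_MAGIC = b"\xfd7zXZ\x00"  # 6-byte magic that starts every .xz stream
LZIP_MAGIC = b"LZIP"        # 4-byte magic that starts every lzip member


def detect_container(header: bytes) -> str:
    """Guess the LZMA container format from the first bytes of a file."""
    if header.startswith(XZ_MAGIC):
        return "xz"
    if header.startswith(LZIP_MAGIC):
        return "lzip"
    # The legacy .lzma format used by lzma-utils has no magic bytes,
    # so it cannot be recognized by a simple prefix check.
    return "unknown"


# Python's lzma module writes the xz container by default.
print(detect_container(lzma.compress(b"hello")))
# A hand-written prefix of an lzip header: magic plus version byte.
print(detect_container(b"LZIP\x01"))
```

A decoder that supports several containers can dispatch on such a check and then hand the payload to the shared LZMA decoding code.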
Sadly, the patches have not been merged yet, even though more than three years have passed. Today I’ve rebased them and updated them to follow changes in XZ itself. For anyone interested, they can be found on the lzip2 branch of my xz fork. When built with my patches, xz happily decompresses .lz files in addition to the regular set. Thanks to a tiny patchset, you don’t have to build yet another program to unpack a few distfiles.