Moving commits between independent git histories

PyPy is an alternative Python implementation. While it does replace a large part of the interpreter, a large part of the standard library is shared with CPython. As a result, PyPy is frequently affected by the same vulnerabilities as CPython, and we have to backport security fixes to it.

Backporting security fixes inside CPython is relatively easy. All main Python branches are in a single repository, so it’s just a matter of cherry-picking the commits. Normally, you can easily move patches between two related git repositories using git-style patches but this isn’t going to work for two repositories with unrelated histories.

Does this mean manually patching PyPy and rewriting commit messages by hand? Luckily, there’s a relatively simple git am trick that can help you avoid that.
Continue reading “Moving commits between independent git histories”

Why not rely on app developer to handle security?

One of the comments to the The modern packager’s security nightmare post posed a very important question: why is it bad to depend on the app developer to address security issues? In fact, I believe it is important enough to justify a whole post discussing the problem. To clarify, the wider context is bundling dependencies, i.e. relying on the application developer to ensure that all the dependencies included with the application to be free of vulnerabilities.

In my opinion, the root of security in open source software is widely understood auditing. Since the code is public, everyone can read it, analyze it, test it. However, with a typical system install including thousands of packages from hundreds of different upstreams, it is really impossible even for large companies (not to mention individuals) to be able to audit all that code. Instead, we assume that with large enough number of eyes looking at the code, all vulnerabilities will eventually be found and published.

On top of auditing we add trust. Today, CVE authorities are at the root of our vulnerability trust. We trust them to reliably publish reports of vulnerabilities found in various packages. However, once again we can’t expect users to manually make sure that the huge number of the packages they are running are free of vulnerabilities. Instead, the trust is hierarchically moved down to software authors and distributions.

Both software authors and distribution packagers share a common goal — ensuring that their end users are running working, secure software. Why do I believe then that the user’s trust is better placed in distribution packagers than in software authors? I am going to explain this in three points.
Continue reading “Why not rely on app developer to handle security?”

The modern packager’s security nightmare

One of the most important tasks of the distribution packager is to ensure that the software shipped to our users is free of security vulnerabilities. While finding and fixing the vulnerable code is usually considered upstream’s responsibility, the packager needs to ensure that all these fixes reach the end users ASAP. With the aid of central package management and dynamic linking, the Linux distributions have pretty much perfected the deployment of security fixes. Ideally, fixing a vulnerable dependency is as simple as patching a single shared library via the distribution’s automated update system.

Of course, this works only if the package in question is actually following good security practices. Over the years, many Linux distributions (at the very least, Debian, Fedora and Gentoo) have been fighting these bad practices with some success. However, today the times have changed. Today, for every 10 packages fixed, a completely new ecosystem emerges with the bad security practices at its central point. Go, Rust and to some extent Python are just a few examples of programming languages that have integrated the bad security practices into the very fabric of their existence, and recreated the same old problems in entirely new ways.

The root issue of bundling dependencies has been discussed many times before. The Gentoo Wiki explains why you should not bundle dependencies, and links to more material about it. I would like to take a bit wider approach, and discuss not only bundling (or vendoring) dependencies but also two closely relevant problems: static linking and pinning dependencies.
Continue reading “The modern packager’s security nightmare”

lzip decompression support for xz-utils

As of today, the most common implementation of the LZMA algorithm on open source operating systems is the xz format. However, there are a few others available. Notably, a few packages found in the Gentoo repository use the superior lzip format. Does this mean you may end up having to have separate decompressors for both formats installed? Not necessarily.

Back in 2017, I’ve entertained a curious idea. Since both lzip and xz are both container formats built on top of the original LZMA algorithm, and xz features backwards-compatible support for the earlier container format used by lzma-utils, how hard would it be to implement a decoder for the lzip format as well? Not very hard, it turned out. After all, most of the code was already there — I’ve just had to implement the additional container format. With some kind help of XZ upstream, I’ve done that.

Sadly, the patches have not been merged yet. More than three years have passed now in waiting. Today I’ve rebased them and updated to follow changes in XZ itself. For anyone interested, it can be found on the lzip2 branch of my xz fork. After building xz with my patches, it now happily decompresses .lz files in addition to the regular set. Thanks to a tiny patchset you don’t have to build yet another program to unpack a few distfiles.

Distribution Kernels: module rebuilds, better ZFS support and UEFI executables

The primary goal of the Distribution Kernel project is provide a seamless kernel upgrade experience to Gentoo users. Initially, this meant configuring, building and installing the kernel during the @world upgrade. However, you had to manually rebuild the installed kernel modules (and @module-rebuild is still broken), and sometimes additionally rebuild the initramfs after doing that.

To address this, we have introduced a new dist-kernel USE flag. This flag is automatically added to all ebuilds installing kernel modules. When it is enabled, the linux-mod eclass adds a dependency on virtual/dist-kernel package. This virtual, in turn, is bound to the newest version of dist-kernel installed. As a result, whenever you upgrade your dist-kernel all the module packages will also be rebuilt via slot rebuilds. The manual @module-rebuild should no longer be necessary!

ZFS users have pointed out that after rebuilding sys-fs/zfs-kmod package, they need to rebuild the initramfs for Dracut to include the new module. We have combined the dist-kernel rebuild feature with pkg_postinst() to rebuild the initramfs whenever zfs-kmod is being rebuilt (and the dist-kernel is used). As a result, ZFS should no longer require any manual attention — as long as rebuilds succeed, the new kernel and initramfs should be capable of running on ZFS root once the @world upgrade finishes.

Finally, we have been asked to provide support for uefi=yes Dracut option. When this option is enabled, Dracut combines the EFI stub, kernel and generated initramfs into a single UEFI executable that can be booted directly. The dist-kernels now detect this scenario, and install the generated executable in place of the kernel, so everything works as expected. Note that due to implementation limitations, we also install an empty initramfs as otherwise kernel-install.d scripts would insist on creating another initramfs. Also note that until Dracut is fixed to use correct EFI stub path, you have to set the path manually in /etc/dracut.conf: