From build-dir to venv — testing Python packages in Gentoo

A lot of Python packages assume that their tests will be run after installing the package. This is quite a reasonable assumption, given that tests are primarily run in dedicated testing environments such as CI deployments, or via test runners such as tox. However, this does not necessarily fit the Gentoo packaging model, where packages are installed system-wide and the tests are run between the compile and install phases.

In a great many cases, things work out of the box (because the modules are found relative to the current directory) or require only minimal PYTHONPATH adjustments. In others, we found it necessary to put varying amounts of effort into creating a local installation of the package that is suitable for testing.

In this post, I would like to briefly explore the various solutions to the problem that we’ve used over the years, from simple use of the build directory to the newest ideas based on virtual environments.

Testing against the build directory

As I have indicated above, a great many packages work just fine with the correct PYTHONPATH setting. However, not all packages provide ready-to-use source trees, and even if they do, there’s the matter of either having to specify the path to them manually or having more or less reliable automation guess it. Fortunately, there’s a simple solution.

The traditional distutils/setuptools build process consists of two phases: the build phase and the install phase. The build phase is primarily about copying the files from their respective source directories to a unified package tree in a build directory, while the install phase is generally about installing the files found in the build directory. Besides just reintegrating sources, the build phase may also involve other important tasks: compiling extensions written in C or converting sources from Python 2 to Python 3 (which is becoming rare). Given that the build command is run in src_compile, this makes the build directory a good candidate for use in tests.

This is precisely what distutils-r1.eclass does out of the box. It ensures that the build commands write to a predictable location, and it adds that location to PYTHONPATH. This ensures that the just-built package is found by Python when trying to import its modules, unless a package residing in the current directory takes precedence. In either case, it means that most of the time things just work, and sometimes we just have to resort to simple hacks such as changing the current directory.
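The precedence rule mentioned above can be observed with a small experiment. The following is only a sketch, not what the eclass does: the module name dift_demo_mod and the directory names are made up for illustration, and the child interpreter is started with -c so that the current directory ends up first on the module search path.

```python
import os
import subprocess
import sys
import tempfile
from pathlib import Path

# Hypothetical module name, used only for this demonstration.
MODULE = "dift_demo_mod"


def imported_origin(cwd: Path, build_lib: Path) -> str:
    """Import MODULE in a child interpreter and report which copy was used."""
    env = dict(os.environ, PYTHONPATH=str(build_lib))
    # PYTHONSAFEPATH (Python 3.11+) would drop the current directory from sys.path.
    env.pop("PYTHONSAFEPATH", None)
    result = subprocess.run(
        [sys.executable, "-c", f"import {MODULE}; print({MODULE}.ORIGIN)"],
        cwd=cwd, env=env, capture_output=True, text=True, check=True,
    )
    return result.stdout.strip()


def demo() -> tuple:
    with tempfile.TemporaryDirectory() as tmp:
        source_tree = Path(tmp, "source")      # stands in for the unpacked sources
        build_lib = Path(tmp, "build", "lib")  # stands in for the build directory
        elsewhere = Path(tmp, "elsewhere")     # a directory without the module
        for directory in (source_tree, build_lib, elsewhere):
            directory.mkdir(parents=True)
        (source_tree / f"{MODULE}.py").write_text("ORIGIN = 'source tree'\n")
        (build_lib / f"{MODULE}.py").write_text("ORIGIN = 'build directory'\n")
        return (
            imported_origin(source_tree, build_lib),  # current directory wins
            imported_origin(elsewhere, build_lib),    # PYTHONPATH is used
        )


if __name__ == "__main__":
    print(demo())
```

Running the import from the source tree picks up the local copy, while running it from an unrelated directory picks up the copy on PYTHONPATH. This is why changing the current directory is sometimes all that a test suite needs.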

distutils_install_for_testing (home layout)

While the build directory method worked for many packages, it had its limitations. To list a few I can think of:

  • Script wrappers for entry points were not created (and even regular scripts were not added to PATH due to a historical mistake), so tests that relied on being able to call installed executables did not work.
  • Package metadata (.egg-info) was not included, so the pkg_resources module (and now the more modern importlib.metadata) may have had trouble finding the package.
  • Namespace packages were not handled properly.

The last point was the deal breaker here. Remember that we’re talking of the times when Python 2.7 was still widely supported. If we were testing a zope.foo package that happened to depend on zope.bar, then we were in trouble. The top-level zope package that we’ve just added to PYTHONPATH had only the foo submodule, but bar had to be obtained from system site-packages!
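The general shape of this problem can be reproduced on modern Python. The sketch below is a simplification: it uses a plain __init__.py rather than the pkg_resources machinery that was common in the Python 2.7 days, and the names nsdemo, build and system are hypothetical stand-ins for zope, the build directory and system site-packages. It also shows the contrasting case where no __init__.py is present and Python 3 treats both directories as portions of one namespace.

```python
import os
import subprocess
import sys
import tempfile
from pathlib import Path

# Code run in a child interpreter: try both submodules and report the result.
PROBE = """
import importlib
for name in ("nsdemo.foo", "nsdemo.bar"):
    try:
        importlib.import_module(name)
        print(name, "ok")
    except ImportError:
        print(name, "missing")
"""


def importable(build_has_init: bool) -> dict:
    """Return which nsdemo submodules can be imported in the given setup."""
    with tempfile.TemporaryDirectory() as tmp:
        build = Path(tmp, "build")    # plays the role of the build directory
        system = Path(tmp, "system")  # plays the role of system site-packages
        (build / "nsdemo").mkdir(parents=True)
        (system / "nsdemo").mkdir(parents=True)
        (build / "nsdemo" / "foo.py").write_text("")
        (system / "nsdemo" / "bar.py").write_text("")
        if build_has_init:
            # A regular package: the first match on the path hides later ones.
            (build / "nsdemo" / "__init__.py").write_text("")
        env = dict(
            os.environ,
            PYTHONPATH=os.pathsep.join([str(build), str(system)]),
        )
        out = subprocess.run(
            [sys.executable, "-c", PROBE],
            cwd=tmp, env=env, capture_output=True, text=True, check=True,
        ).stdout
        return {
            name: status == "ok"
            for name, status in (line.split() for line in out.splitlines())
        }


if __name__ == "__main__":
    print("with __init__.py:   ", importable(True))
    print("without __init__.py:", importable(False))
```

With the __init__.py in place, nsdemo.bar cannot be imported because the copy in the first directory shadows the second one. Without it, both submodules are found.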

Back in the day, I did not know much about the internals of these things. I was looking for an easy working solution, and I found one. I discovered that using setup.py install --home=... (vs setup.py install --root=... that we used to install into D) happened to install a layout that made namespaces just work! This was just great!

This is how the original implementation of distutils_install_for_testing came about. The rough idea was to put this --home install layout on PYTHONPATH and reap all the benefits of having the package installed before running tests.

Root layout

The original dift layout was good while it worked. But then it stopped. I don’t know the exact version of setuptools or the exact change, but the magic just stopped working. The good news is that this happened just a few months ago, when we were already deep into removing Python 2.7, so we did not have to worry about namespaces that much (namespaces are much easier in Python 3, as they work via empty directories without special magic).

The simplest solution I could think of was to stop relying on the home layout, and instead use the same root layout as used for our regular installs. This did not include as much magic but solved the important problems nevertheless. Entry point wrappers were installed, and namespaces worked of their own accord most of the time.
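To make the difference between the two layouts more concrete, here is a rough sketch that computes where modules and scripts would end up in each case. It is not taken from the eclass. It assumes that sysconfig's posix_home scheme describes the same lib/python and bin directories that setup.py install --home used, and that a --root install simply prepends the root to the interpreter's regular paths. The function names and the example directories are made up for illustration.

```python
import sysconfig
from pathlib import Path


def home_layout(home: str) -> dict:
    """Directories used by a --home style install: flat lib/python and bin."""
    return {
        "modules": sysconfig.get_path(
            "purelib", "posix_home", vars={"base": str(home)}),
        "scripts": sysconfig.get_path(
            "scripts", "posix_home", vars={"base": str(home)}),
    }


def root_layout(root: str) -> dict:
    """Directories used by a --root style install: system paths under root."""
    def inside_root(path: str) -> str:
        p = Path(path)
        # Strip the leading "/" (or drive) and re-anchor the path under root.
        return str(Path(root) / p.relative_to(p.anchor))

    return {
        "modules": inside_root(sysconfig.get_path("purelib")),
        "scripts": inside_root(sysconfig.get_path("scripts")),
    }


if __name__ == "__main__":
    print(home_layout("/tmp/dift-home"))
    print(root_layout("/tmp/dift-root"))
```

In both cases the "modules" directory is what would go onto PYTHONPATH and the "scripts" directory is what would go onto PATH. The sketch only considers pure-Python modules. A real implementation would also have to account for the platform-specific library directory.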

I’ve added a new --via-root parameter to change dift mode, and --via-home to force the old behavior. By the end of January, I had flipped the default, and we have been happily using the new layout since then. Except that it didn’t really solve all the problems.

Virtualenv layout

The biggest limitation of both dift layouts is that they rely on PYTHONPATH. However, not everything in the Python world respects path overrides. To list just two examples: the test suite of werkzeug relies on overwriting PYTHONPATH for spawned processes, and tox fails to find its own installed package.

I have tried various hacks to resolve this, to no avail. The solution that somewhat worked was to require the package to be actually installed before running the tests, but that was really inconvenient. Interestingly enough, virtualenvs rely on some internal Python magic to actually override the module search path without relying on PYTHONPATH.
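As far as I understand it, the "magic" boils down to a pyvenv.cfg file: when the interpreter finds one next to its executable or one directory above it, it redirects sys.prefix to that directory and picks up site-packages from there. The following is only a simplified sketch of that lookup. The real startup logic is more involved, and the helper names are hypothetical.

```python
import sys
from pathlib import Path
from typing import Optional


def find_pyvenv_cfg(executable: str) -> Optional[Path]:
    """Look for pyvenv.cfg next to the interpreter or one directory above it."""
    # Symlinks are deliberately not resolved: the venv's python is usually
    # a symlink to the system interpreter, and the lookup uses its own path.
    exe_dir = Path(executable).parent
    for candidate in (exe_dir / "pyvenv.cfg", exe_dir.parent / "pyvenv.cfg"):
        if candidate.is_file():
            return candidate
    return None


def running_in_venv() -> bool:
    """True when the interpreter's prefix points at a virtual environment."""
    return sys.prefix != sys.base_prefix


if __name__ == "__main__":
    print("pyvenv.cfg:", find_pyvenv_cfg(sys.executable))
    print("in venv:   ", running_in_venv())
```

Since the search path is derived from the location of the executable rather than from an environment variable, it survives in child processes even when a test suite resets PYTHONPATH.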

The most recent dift --via-venv variant that I’ve just submitted for mailing list review uses exactly this. That is, it uses the built-in Python 3 venv module (not to be confused with the third-party virtualenv).

Now, normally a virtualenv creates an isolated environment where all dependencies have to be installed explicitly. However, there is a --system-site-packages option that avoids this. The packages installed inside the virtualenv (i.e. the tested package) will take precedence but other packages will be imported from the system site-packages directory. That’s just what we need!
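A minimal sketch of the idea, using only the standard venv module, could look like the following. It is not the eclass implementation. The module dift_venv_demo is a hypothetical stand-in for the tested package and is simply written into the environment's site-packages instead of being installed properly. PYTHONPATH is deliberately set to a bogus directory to mimic a test suite that overwrites it.

```python
import os
import subprocess
import tempfile
import venv
from pathlib import Path


def venv_python(env_dir: Path) -> Path:
    """Path to the interpreter inside the virtual environment."""
    if os.name == "nt":
        return env_dir / "Scripts" / "python.exe"
    return env_dir / "bin" / "python"


def demo() -> dict:
    with tempfile.TemporaryDirectory() as tmp:
        env_dir = Path(tmp, "venv")
        # Equivalent to: python -m venv --system-site-packages --without-pip
        venv.create(
            env_dir,
            system_site_packages=True,
            with_pip=False,
            symlinks=(os.name != "nt"),
        )
        python = venv_python(env_dir)

        def run(code: str) -> str:
            # Clobber PYTHONPATH on purpose; the venv must not depend on it.
            env = dict(os.environ, PYTHONPATH=str(Path(tmp, "bogus")))
            return subprocess.run(
                [str(python), "-c", code],
                cwd=tmp, env=env, capture_output=True, text=True, check=True,
            ).stdout.strip()

        # Ask the venv's interpreter where its site-packages directory is.
        site_packages = Path(
            run("import sysconfig; print(sysconfig.get_path('purelib'))"))
        (site_packages / "dift_venv_demo.py").write_text("ORIGIN = 'venv'\n")

        cfg = (env_dir / "pyvenv.cfg").read_text()
        return {
            "origin": run("import dift_venv_demo; print(dift_venv_demo.ORIGIN)"),
            "system_site_packages": "include-system-site-packages = true" in cfg,
        }


if __name__ == "__main__":
    print(demo())
```

The module is found even though PYTHONPATH points elsewhere, and the generated pyvenv.cfg records that system site-packages remain visible for the dependencies.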

I have so far tested this new method on two problematic packages (werkzeug and tox). It might be just the thing that resolves all the problems that were previously resolved via the home layout. Or it might not. I do not know yet whether we’ll be switching the default again. Time will tell.
