DISTUTILS_USE_SETUPTOOLS, QA spam and… more QA spam?

Update: the information provided in this post is out of date. As of today, Python 3.7 is no longer relevant from the DISTUTILS_USE_SETUPTOOLS perspective, and ‘rdepend’ is no longer valid when entry points are used.

I suppose that most of the Gentoo developers have seen at least one of the ‘uses a probably incorrect DISTUTILS_USE_SETUPTOOLS value’ bugs by now. Over 350 have been filed so far, and new ones are filed practically daily. The truth is, I never intended for this QA check to result in bugs being filed against packages, and certainly not that many bugs.

This is not an important problem to be fixed immediately. The vast majority of Python packages depend on setuptools at build time (this is why the build-time dependency is the eclass’ default), and being able to unmerge setuptools is not a likely scenario. The underlying idea was that the QA check would make it easier to update DISTUTILS_USE_SETUPTOOLS when bumping packages.

Nobody has asked me for my opinion, and now we have hundreds of bugs that are not very helpful. In fact, the effort involved in going through all the bugmail, updating packages and closing the bugs greatly exceeds the negligible gain. Nevertheless, some people actually did it. I have bad news for them: setuptools upstream has changed the entry point mechanism, and most of the values will have to change again. Let me elaborate on that.

The current logic

The current eclass logic revolves around three primary values:

  • no: the package does not use setuptools
  • bdepend: the package uses setuptools at build time only
  • rdepend: the package uses setuptools at both build time and runtime

There’s also support for pyproject.toml but it’s tangential to the problem at hand, so let’s ignore it.

Besides the build system, the setuptools package includes a pkg_resources sub-package that can be used to access a package’s metadata and resources. The two primary uses of rdepend revolve around it. These are:

  1. console_scripts entry points — i.e. autogenerated executable scripts that call a function within the installed package rather than containing the program code itself.
  2. Direct uses of pkg_resources in the modules installed by the package.
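To illustrate the first case, here is a minimal sketch of what an entry point wrapper boils down to: it resolves a ‘module:function’ reference and calls the result. The resolve_entry_point() helper is a hypothetical name used for illustration only, not the code that setuptools actually generates, and json:dumps merely stands in for a real package’s entry point.

```python
import importlib


def resolve_entry_point(spec):
    """Resolve a 'module:attribute' reference to the object it names."""
    module_name, _, attr_path = spec.partition(":")
    obj = importlib.import_module(module_name)
    # Walk dotted attribute paths such as 'Class.method'; skip empty parts
    # so that a bare module reference ('module') resolves to the module.
    for attr in filter(None, attr_path.split(".")):
        obj = getattr(obj, attr)
    return obj


# The wrapper script itself contains no program logic; it only looks up
# the function and calls it.
main = resolve_entry_point("json:dumps")
print(main([1, 2]))
```

The interesting part for dependencies is not the call itself but the lookup: something has to find the reference in the installed package’s metadata, and that is what pkg_resources has historically been used for.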

Both of these cases were equivalent from a dependency standpoint. Well, not anymore.

Entry points via importlib.metadata

Well, the big deal is the importlib.metadata module that was added in Python 3.8 (there’s also a relevant importlib.resources module since Python 3.7). It is a built-in module that provides routines to access the installed package metadata, and therefore renders another part of pkg_resources redundant.
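As a rough illustration of what the module offers, the sketch below lists the console_scripts entry points of everything installed for the running interpreter, using nothing but the standard library. It assumes Python 3.8 or newer, and console_scripts() is a hypothetical helper name. It deliberately iterates over distributions rather than calling entry_points() directly, because the return type of the latter has changed between Python versions.

```python
from importlib.metadata import distributions


def console_scripts():
    """Return sorted (name, target) pairs of installed console_scripts."""
    found = []
    for dist in distributions():
        for ep in dist.entry_points:
            if ep.group == "console_scripts":
                # ep.value is the 'module:function' reference.
                found.append((ep.name, ep.value))
    return sorted(found)


for name, target in console_scripts():
    print(f"{name} -> {target}")
```

What gets printed depends entirely on what is installed, so no particular output should be expected.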

The big deal is that the new versions of setuptools have embraced it, and no longer require pkg_resources to run entry points. To be more precise, the new logic selects the built-in module as the first choice, with fallback to the importlib_metadata backport and finally to pkg_resources.
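The order of preference can be sketched as a chain of import attempts. This is a simplified illustration of the fallback described above, not a copy of the script that setuptools generates; pick_metadata_backend() and CANDIDATES are names made up for this example.

```python
import importlib

# Candidates in the order of preference described above.
CANDIDATES = ("importlib.metadata", "importlib_metadata", "pkg_resources")


def pick_metadata_backend(candidates=CANDIDATES):
    """Return the name of the first candidate module that can be imported."""
    for name in candidates:
        try:
            importlib.import_module(name)
        except ImportError:
            # Not available on this interpreter; try the next one.
            continue
        return name
    raise ImportError("no entry point backend available")


print(pick_metadata_backend())
```

On Python 3.8 and newer the first candidate always succeeds, so neither the backport nor pkg_resources is ever touched.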

This means that the vast majority of packages that used to depend on setuptools at runtime no longer strictly do. With Python 3.8 and newer, they have no additional runtime dependencies and just require setuptools at build time. With older versions of Python, they prefer importlib_metadata over pkg_resources. In both cases, the packages can still use pkg_resources directly, though.
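Whether a package still uses pkg_resources directly can be checked roughly by looking for imports in its installed modules. The following is a minimal sketch under that assumption, not the logic of the actual QA check; imports_pkg_resources() is a hypothetical helper, and it will miss dynamic imports.

```python
import ast


def imports_pkg_resources(source):
    """Return True if the given Python source imports pkg_resources."""
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            # 'import pkg_resources' or 'import pkg_resources.extern'
            if any(a.name.split(".")[0] == "pkg_resources" for a in node.names):
                return True
        elif isinstance(node, ast.ImportFrom):
            # 'from pkg_resources import ...' (absolute imports only)
            module = node.module or ""
            if node.level == 0 and module.split(".")[0] == "pkg_resources":
                return True
    return False


print(imports_pkg_resources("from pkg_resources import iter_entry_points\n"))
print(imports_pkg_resources("import importlib.metadata\n"))
```

Parsing the source instead of grepping for the string avoids false positives from comments and string literals.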

How to resolve it via the eclass?

Now, technically speaking this means replacing rdepend with three new variants:

  • scripts: build-time dependency on setuptools plus a runtime dependency on importlib_metadata that is conditional on the Python implementation (only versions older than 3.8 need it), for pure entry point usage.
  • rdepend: runtime dependency on setuptools, for pure pkg_resources usage.
  • scripts+rdepend: for packages that combine both.
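To make the difference between the variants concrete, here is a small Python model of the runtime dependencies each value would imply. It is only an illustration: the eclass itself is written in bash, the Deps class and runtime_deps() are hypothetical names, and I am assuming that rdepend keeps the build-time dependency as it does in the current logic.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Deps:
    build_setuptools: bool
    runtime_setuptools: bool
    # Only takes effect on Python versions older than 3.8.
    runtime_importlib_metadata: bool


VARIANTS = {
    "no": Deps(False, False, False),
    "bdepend": Deps(True, False, False),
    "scripts": Deps(True, False, True),
    "rdepend": Deps(True, True, False),  # build-time dep is an assumption
    "scripts+rdepend": Deps(True, True, True),
}


def runtime_deps(value, python_version):
    """Return the runtime dependencies for a value and a (major, minor) pair."""
    deps = VARIANTS[value]
    result = []
    if deps.runtime_setuptools:
        result.append("setuptools")
    if deps.runtime_importlib_metadata and python_version < (3, 8):
        result.append("importlib_metadata")
    return result


for version in ((3, 7), (3, 8)):
    print(version, {v: runtime_deps(v, version) for v in VARIANTS})
```

In this model, scripts yields exactly the same runtime dependencies as bdepend for any Python version from 3.8 on.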

Of course, this means that the existing packages would get a humongous number of new bug reports, often requesting a change to the value that was updated recently. The number could be smaller if we changed the existing meaning of rdepend to mean importlib.metadata, and introduced a new value for pkg_resources.

Still, that’s not the best part. The real fun idea is that once we remove Python 3.7, all Python versions would have importlib.metadata built-in and the distinction will no longer be necessary. Eventually, everyone would have to update the value again, this time to bdepend. Great, right?

…or not to resolve it?

Now that we’ve discussed the solution recommended to me, let’s consider an alternative. For the vast majority of packages, the runtime dependency on setuptools is unnecessary. If the user uses Python 3.8+ or has importlib_metadata installed (which is somewhat likely, due to direct dependencies on it), pkg_resources will not be used by the entry points. Nevertheless, setuptools is still pretty common as a build-time dependency and, as I said before, it makes little sense to uninstall it.

We can simply keep things as they are. Sure, the dependencies will not be 100% optimal. Yet the dependency on setuptools will ensure that entry points continue working even if the user does not have importlib_metadata installed. We will eventually want to update the DISTUTILS_USE_SETUPTOOLS logic, but we can wait with that until Python versions older than 3.8 become irrelevant and we are back to three main variants.
