Dependency pitfalls regarding slots, slot ops and any-of deps

During my work on Gentoo, I have seen many types of dependency pitfalls that developers fell in. Sad to say, their number is increasing with new EAPI features — we are constantly increasing new ways into failures rather than working on simplifying things. I can’t say the learning curve is getting much steeper but it is considerably easier to make a mistake.

In this article, I would like to point out a few common misunderstandings and pitfalls regarding slots, slot operators and any-of (|| ()) deps. All of those constructs are used to express dependencies that can be usually be satisfied by multiple packages or package versions that can be installed in parallel, and missing this point is often the cause of trouble.

Separate package dependencies are not combined into a single slot

One of the most common mistakes is to assume that multiple package dependency specifications listed in one package are going to be combined somehow. However, there is no such guarantee, and when a package becomes slotted this fact actually becomes significant.

Of course, some package managers take various precautions to prevent the following issues. However, such precautions not only can not be relied upon but may also violate the PMS.

For example, consider the following dependency specification:

>=app-misc/foo-2
<app-misc/foo-5

It is a common way of expressing version ranges (in this case, versions 2*, 3* and 4* are acceptable). However, if app-misc/foo is slotted and there are versions satisfying the dependencies in different slots, there is no guarantee that the dependency could not be satisfied by installing foo-1 (satisfies <foo-5) and foo-6 (satisfies >=foo-2) in two slots!

Similarly, consider:

app-misc/foo[foo]
bar? ( app-misc/foo[baz] )

This one is often used to apply multiple sets of USE flags to a single package. Once again, if the package is slotted, there is no guarantee that the dependency specifications will not be satisfied by installing two slots with different USE flag configurations.

However, those problems mostly apply to fully slotted packages such as sys-libs/db where multiple slots are actually meaningfully usable by a package. With the more common use of multiple slots to provide incompatible versions of the package (e.g. binary compatibility slots), there is a more important problem: that even a single package dependency can match the wrong slot.

For non-truly multi-slotted packages, the solution to all those problems is simple: always specify the correct slot. For truly multi-slotted packages, there is no easy solution.

For example, a version range has to be expressed using an any-of dep:

|| (
     =sys-libs/db-5*
     =sys-libs/db-4*
)

Multiple sets of USE flags? Well, if you really insist, you can combine them for each matching slot separately…

|| (
    ( sys-libs/db:5.3 tools? ( sys-libs/db:5.3[cxx] ) )  
    ( sys-libs/db:5.1 tools? ( sys-libs/db:5.1[cxx] ) )  
	…
)

The ‘equals’ slot operator and multiple slots

A similar problem applies to the use of the EAPI 5 ‘equals’ slot operator. The PMS notes that:

=
Indicates that any slot value is acceptable. In addition, for runtime dependencies, indicates that the package will break unless a matching package with slot and sub-slot equal to the slot and sub-slot of the best installed version at the time the package was installed is available.

[…]

To implement the equals slot operator, the package manager will need to store the slot/sub-slot pair of the best installed version of the matching package. […]

PMS, 8.2.6.3 Slot Dependencies

The significant part is that the slot and subslot is recorded for the best package version matched by the specification containing the operator. So again, if the operator is used on multiple dependencies that can match multiple slots, multiple slots can actually be recorded.

Again, this becomes really significant in truly slotted packages:

|| (
     =sys-libs/db-5*
     =sys-libs/db-4*
)
sys-libs/db:=

While one may expect the code to record the slot of sys-libs/db used by the package, this may actually record any newer version that is installed while the package is being built. In other words, this may implicitly bind to db-6* (and pull it in too).

For this to work, you need to ensure that the dependency with the slot operator can not match any version newer than the two requested:

|| (
     =sys-libs/db-5*
     =sys-libs/db-4*
)
<sys-libs/db-6:=

In this case, the dependency with the operator could still match earlier versions. However, the other dependency enforces (as long as it’s in DEPEND) that at least one of the two versions specified is installed at build-time, and therefore is used by the operator as the best version matching it.

The above block can easily be extended by a single set of USE dependencies (being applied to all the package dependencies including the one with slot operator). For multiple conditional sets of USE dependencies, finding a correct solution becomes harder…

The meaning of any-of dependencies

Since I have already started using the any-of dependencies in the examples, I should point out yet another problem. Many of Gentoo developers do not understand how any-of dependencies work, and make wrong assumptions about them.

In an any-of group, at least one immediate child element must be matched. A blocker is considered to be matched if its associated package dependency specification is not matched.

PMS, 8.2.3 Any-of Dependency Specifications

So, PMS guarantees that if at least one of the immediate child elements (package dependencies, nested blocks) of the any-of block, the dependency is considered satisfied. This is the only guarantee PMS gives you. The two common mistakes are to assume that the order is significant and that any kind of binding between packages installed at build time and at run time is provided.

Consider an any-of dependency specification like the following:

|| (
    A
    B
    C
)

In this case, it is guaranteed that at least one of the listed packages is installed at the point appropriate for the dependency class. If none of the packages are installed already, it is customary to assume the Package Manager will prefer the first one — while this is not specified and may depend on satisfiability of the dependencies, it is a reasonable assumption to make.

If multiple packages are installed, it is undefined which one is actually going to be used. In fact, the package may even provide the user with explicit run time choice of the dependency used, or use multiple of them. Assuming that A will be preferred over B, and B over C is simply wrong.

Furthermore, if one of the packages is uninstalled, while one of the remaining ones is either already installed or being installed, the dependency is still considered satisfied. It is wrong to assume that in any case the Package Manager will bind to the package used at install time, or cause rebuilds when switching between the packages.

The ‘equals’ slot operator in any-of dependencies

Finally, I am reaching the point of lately recurring debates. Let me make it clear: our current policy states that under no circumstances may := appear anywhere inside any-of dependency blocks.

Why? Because it is meaningless, it is contradictory. It is not even undefined behavior, it is a case where requirements put for the slot operator can not be satisfied. To explain this, let me recall the points made in the preceding sections.

First of all, the implementation of the ‘equals’ slot operator requires the Package Manager to explicitly bind the slot/subslot of the dependency to the installed version. This can only happen if the dependency is installed — and an any-of block only guarantees that one of them will actually be installed. Therefore, an any-of block may trigger a case when PMS-enforced requirements can not be satisfied.

Secondly, the definition of an any-of block allows replacing one of the installed packages with another at run time, while the slot operator disallows changing the slot/subslot of one of the packages. The two requested behaviors are contradictory and do not make sense. Why bind to a specific version of one package, while any version of the other package is allowed?

Thirdly, the definition of an any-of block does not specify any particular order/preference of packages. If the listed packages do not block one another, you could end up having multiple of them installed, and bound to specific slots/subslots. Therefore, the Package Manager should allow you to replace A:1 with B:2 but not with B:1 nor with A:2. We’re reaching insanity now.

Now, all of the above is purely theoretical. The Package Manager can do pretty much anything given invalid input, and that is why many developers wrongly assume that slot operators work inside any-of. The truth is: they do not, the developer just did not test all the cases correctly. The Portage behavior varies from allowing replacements with no rebuilds, to requiring both of mutually exclusive packages to be installed simultaneously.