The Gentoo Profile Stacking Problem

I thought I’d write a bit about a long-standing problem that the hardened team has been facing with Gentoo’s profile system. Ever since I joined the team around 2009, we’ve had to deal with the “profile stacking problem”. Most users and devs just merrily go along using `eselect profile` to pick the profile closest to the type of system they want and then tweak the various files under /etc/portage, adding a USE flag here, and keywording or unmasking a package there, until they get the “perfect” system. What I want to do in this post is expose just what goes into designing the profiles that we publicly export.

I was inspired to write this because of bug #492312. There we want to re-introduce the hardened desktop profile for amd64, x86 and arm. I say “re-introduce” because we had to remove it and its sibling profiles /server and /developer. So what was going on there?

To start, let me give you a nice piece of Python code:

import portage

# Print each directory in the active profile stack, in the order it is applied.
for p in portage.settings.profiles:
    print(p)

What this little snippet does is print out the profile stack, that is, the directories as they inherit from one another via the parent file. It’s a useful tool because profile stacking can get very hard to follow. When the parent file has something simple like just “..”, the inheritance is easy: that directory just inherits all the package.mask, package.unmask, etc. of the parent directory, as you would expect from the shell meaning of “..”. But what happens when the parent file looks like this:

../../../base
../../../default/linux
../../../arch/amd64
..

as it does for hardened/linux/amd64? Well, then we get some interesting behavior. The first line says inherit from base. Easy enough: since base inherits from nothing else, you get all of base’s settings. The second line says inherit from default/linux, which also doesn’t inherit from anything. These settings just add to and override those from base. Easy enough. Ah! But now we come to arch/amd64, where the parent file says

../base
../../features/multilib/lib32

and the inheritance continues to those directories in order. Finally “..” in hardened/linux/amd64 means inherit hardened/linux which sets most of hardened’s needs via make.defaults, package.mask, use.mask and friends. But alas, hardened/linux has its own parent file which reads

../../releases/13.0

and the trip down the rabbit hole continues! If you are starting to get a little lost, don’t feel bad. It is hard to wrap your brain around stacking, which is why that little script above is so useful. But the difficulty in following profile stacking is not the real problem. If you’re like me, you’re too proud to admit you can’t get your head around any complexity ;) No, the real problem is that you can’t control the stacking order.
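The walk above can be reproduced without a Gentoo install. What follows is a minimal sketch, not portage’s actual code. It assumes a depth-first rule (each listed parent is fully expanded in order, then the directory itself is appended), which is consistent with the stacks printed in this post. `PARENTS` and `profile_stack` are names made up for the illustration, and the parent entries for features/multilib/lib32 and releases/13.0 are not quoted in the post but inferred from the script’s output.

```python
import posixpath

# Parent files keyed by profile directory (relative to the profiles root).
# Entries marked "inferred" are assumptions, not quoted from the tree.
PARENTS = {
    "hardened/linux/amd64": [
        "../../../base",
        "../../../default/linux",
        "../../../arch/amd64",
        "..",
    ],
    "arch/amd64": ["../base", "../../features/multilib/lib32"],
    "hardened/linux": ["../../releases/13.0"],
    "features/multilib/lib32": [".."],              # inferred
    "releases/13.0": ["..", "../../eapi-5-files"],  # inferred
}


def profile_stack(profile, parents=PARENTS):
    """Return the stack for `profile`: expanded parents first, then itself."""
    stack = []
    for rel in parents.get(profile, []):
        # Resolve the relative entry against the directory holding the parent file.
        parent = posixpath.normpath(posixpath.join(profile, rel))
        stack.extend(profile_stack(parent, parents))
    stack.append(profile)
    return stack


if __name__ == "__main__":
    for p in profile_stack("hardened/linux/amd64"):
        print(p)
```

Because a directory is only appended after all of its parents, its position in the final list is decided entirely by where it appears in somebody else’s parent file.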

To demonstrate, let me refer again to bug #492312.  There we’d like to have a profile which reads

hardened/linux/amd64/desktop

Okay, but what should we put for its parent file? We’ll need “..” in there to inherit all of hardened/linux/amd64’s settings, but we also would like targets/desktop. So let’s try a parent file that looks like this

..
../../../../targets/desktop

In that case, our little script tells us that our profile stacks as follows:

/usr/portage/profiles/base
/usr/portage/profiles/default/linux
/usr/portage/profiles/arch/base
/usr/portage/profiles/features/multilib
/usr/portage/profiles/features/multilib/lib32
/usr/portage/profiles/arch/amd64
/usr/portage/profiles/releases
/usr/portage/profiles/eapi-5-files
/usr/portage/profiles/releases/13.0
/usr/portage/profiles/hardened/linux
/usr/portage/profiles/hardened/linux/amd64
/usr/portage/profiles/targets/desktop
/usr/portage/profiles/hardened/linux/amd64/desktop

And if you switch the order of .. and targets/desktop, you get

/usr/portage/profiles/targets/desktop
/usr/portage/profiles/base
/usr/portage/profiles/default/linux
/usr/portage/profiles/arch/base
/usr/portage/profiles/features/multilib
/usr/portage/profiles/features/multilib/lib32
/usr/portage/profiles/arch/amd64
/usr/portage/profiles/releases
/usr/portage/profiles/eapi-5-files
/usr/portage/profiles/releases/13.0
/usr/portage/profiles/hardened/linux
/usr/portage/profiles/hardened/linux/amd64
/usr/portage/profiles/hardened/linux/amd64/desktop

The problem with the first ordering is that targets/desktop overrides hardened/linux/amd64, and so any USE flags that we may turn off or on in hardened can get reversed in desktop. The example here is the jit flag: Just-In-Time compilers write executable code on the fly in areas of memory which must be both writeable and executable. But a PaX hardened kernel will not allow WX mmap-ings because this is an obvious exploit vector. Rather, in hardened, we prefer slower and safer methods for compiling/interpreting code on the fly than JIT.

Okay, so what about the second ordering? It may look strange to have targets/desktop before base, but that in itself is not an issue. Here we have the same problem as above but in an even more subtle way! (See my comment #9 of bug #492312.) Consider a fairly important package like dev-libs/libxml2. In the current state of the tree, `emerge -vp dev-libs/libxml2` would give

 [ebuild   R    ] dev-libs/libxml2-2.9.1-r1:2  USE="ipv6 python* ...

for both stacking choices. But if at some point in the future, someone added the following to profiles/default/linux/package.use

#Python support causes problems on xyz
#Don't pull it in if we don't need it
dev-libs/libxml2  -python

then the vanilla profile default/linux/amd64/13.0/desktop and our hardened profile with targets/desktop last would not change, since they have “dev-libs/libxml2 python” in package.use near the bottom of the stack, where it is applied after default/linux. But our proposed hardened profile with targets/desktop on top, where it is applied before default/linux, would give

 dev-libs/libxml2-2.9.1-r1:2  USE="ipv6 readline ... -python ...
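To see why the position of targets/desktop matters here, the override rule can be modelled in a few lines. This is a simplified sketch, not how portage itself computes USE: it assumes package.use entries are applied in stack order with later profiles winning, and that the “dev-libs/libxml2 python” entry lives in targets/desktop. `resolve_use` and the data tables are hypothetical names for the illustration.

```python
def resolve_use(stack, package_use, package):
    """Apply per-profile package.use entries in stack order; later entries win."""
    enabled = set()
    for profile in stack:
        for flag in package_use.get(profile, {}).get(package, []):
            if flag.startswith("-"):
                enabled.discard(flag[1:])  # "-python" switches the flag off
            else:
                enabled.add(flag)          # "python" switches it on
    return enabled


# Hypothetical package.use contents, following the scenario in the post.
PACKAGE_USE = {
    "default/linux": {"dev-libs/libxml2": ["-python"]},   # the future change
    "targets/desktop": {"dev-libs/libxml2": ["python"]},  # assumed location
}

# The part of the stack shared by both orderings.
COMMON = [
    "base", "default/linux", "arch/base", "features/multilib",
    "features/multilib/lib32", "arch/amd64", "releases", "eapi-5-files",
    "releases/13.0", "hardened/linux", "hardened/linux/amd64",
]
LEAF = "hardened/linux/amd64/desktop"

DESKTOP_LAST = COMMON + ["targets/desktop", LEAF]
DESKTOP_FIRST = ["targets/desktop"] + COMMON + [LEAF]

if __name__ == "__main__":
    pkg = "dev-libs/libxml2"
    print("targets/desktop last: ", sorted(resolve_use(DESKTOP_LAST, PACKAGE_USE, pkg)))
    print("targets/desktop first:", sorted(resolve_use(DESKTOP_FIRST, PACKAGE_USE, pkg)))
```

With targets/desktop last, python survives the change in default/linux; with targets/desktop first, default/linux has the final word and python is dropped.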

So, both choices for the ordering of “..” and “targets/desktop” in our parent file for hardened/linux/amd64/desktop lead to situations where we can’t control what packages get what USE flags. What we would like is a stacking that looks something like this

...
/usr/portage/profiles/targets/desktop
/usr/portage/profiles/hardened/linux/amd64
/usr/portage/profiles/hardened/linux/amd64/desktop

but how do we get that with our current inheritance mechanism? One idea that Magnus (Zorry) had was to gut this portion of portage and replace the parsing of the parent file with something along the lines of openrc’s depend() { … } clause. Then we just locally say what has to come before/after what and we let the algorithm figure it out. It sounds like an interesting problem to take on, if there were two of me and a good chance that it would actually get implemented.

In the meantime, we limp along with what we have and do ad hoc fixes, as changes in one part of the profiles mean we have to adjust other things. Since we are all responsible for different areas of the tree’s profiles, inevitably we cause one another breakage even with the best of intentions. For example, a few days ago, Mike (vapier) removed a masking on the uclibc USE flag in the base profile. Doing so makes perfect sense. He didn’t tell me (and why should he have to?), but this led to a small breakage in hardened/linux/uclibc/amd64 and friends, where I had to relax that masking. I only discovered this upon a catalyst run, which is a bit annoying.
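Zorry’s before/after idea amounts to a topological sort over locally declared constraints. Here is a minimal sketch, assuming Python 3.9+ for the standard `graphlib` module. The `AFTER` table is a hypothetical set of declarations made up for the illustration; nothing like it exists in the tree or in portage today.

```python
from graphlib import TopologicalSorter

# Hypothetical declarations: each profile lists the profiles that must be
# applied before it, much like "after" in an openrc depend() block.
AFTER = {
    "default/linux": ["base"],
    "arch/amd64": ["default/linux"],
    "hardened/linux": ["arch/amd64"],
    "targets/desktop": ["arch/amd64"],
    # The constraint we cannot express with parent files today:
    "hardened/linux/amd64": ["hardened/linux", "targets/desktop"],
    "hardened/linux/amd64/desktop": ["hardened/linux/amd64"],
}


def ordered_stack(after=AFTER):
    """Return one stacking order that satisfies every declared constraint."""
    return list(TopologicalSorter(after).static_order())


if __name__ == "__main__":
    for p in ordered_stack():
        print(p)
```

Each profile only states what it needs locally, and the sorter raises `graphlib.CycleError` if two declarations contradict each other, which is arguably better than silently getting the wrong USE flags.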

6 thoughts on “The Gentoo Profile Stacking Problem”

  1. Kai

    As a general rule I would guess that deeper nested profiles are more specialized versions of their parents. What about reordering the result you get from the python snippet so that you create a new stack by picking depth=1 profiles first, then appending depth=2 profiles and so on? That way you can still influence the general order but still get the highest specialized profiles last in the parsing process…

    Instead of using this brute-force method to sort by depth, you could use a stable sorting algorithm like bucketsort, insertionsort, or mergesort, with the sort key being the depth of the profile. It would make profile stacking at least a bit more predictable while still allowing one profile to be preferred over another.
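A minimal sketch of the depth ordering Kai describes, assuming “depth” means the number of path components below the profiles root. Python’s `sorted` is stable, so profiles of equal depth keep the relative order that the parent files gave them. `depth` and `sort_by_depth` are hypothetical helper names, and the input is the first stack from the post.

```python
def depth(profile):
    """Number of path components, e.g. 'base' -> 1, 'default/linux' -> 2."""
    return profile.count("/") + 1


def sort_by_depth(stack):
    """Stable sort: shallow profiles first, ties keep their original order."""
    return sorted(stack, key=depth)


# The stack from the post with ".." listed before targets/desktop.
STACK = [
    "base", "default/linux", "arch/base", "features/multilib",
    "features/multilib/lib32", "arch/amd64", "releases", "eapi-5-files",
    "releases/13.0", "hardened/linux", "hardened/linux/amd64",
    "targets/desktop", "hardened/linux/amd64/desktop",
]

if __name__ == "__main__":
    for p in sort_by_depth(STACK):
        print(depth(p), p)
```

Under this rule targets/desktop (depth 2) is applied before hardened/linux/amd64 (depth 3), whichever way round the two entries appear in the parent file.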

    1. blueness Post author

      Kai,
      That’s a very good idea actually. One possibility I was thinking of was to add depth measures to the parent file, actually just a depth of 1. Here I was inspired by an ancient language called Prolog (yes, I’m that old!) which is recursive/deep by nature, and where you can stop the recursion/deeper call with a ! command. So “..!” would inherit the parent but not anything beyond. But I’m not sure yet what approach is easiest or best. I would think Zorry’s design is best but hard to implement, and some depth control is less than optimal but easier to implement.
      Along these lines, when I added the hardened/linux/uclibc profiles, I purposely was very shallow. These stack as follows:

      /usr/portage/profiles/base
      /usr/portage/profiles/default/linux
      /usr/portage/profiles/hardened/linux/uclibc
      /usr/portage/profiles/hardened/linux/uclibc/amd64

      skipping over default/linux/amd64, arch/amd64 and many others. The reason is that much of the keywording/masking there is for glibc systems. In some places it is as much work to undo those choices as it is to just start fresh.

      1. Kai

        Well, one should not try to fix complicated things by adding even more complicated things, but instead make the complex problem simpler. With a defined ordering by specialization, you’d at least know that a USE flag change in a level above can be easily overridden by one USE flag change in your own profile. This makes it much easier to fix the problems.

        If you introduce hard-to-follow dependencies, this will not fix the complexity, but add even more problems.

        With an ordering by depth it’s pretty simple: a USE flag change on level 1 can be overridden by a USE flag change on level 2. That simple. If you depend on specific settings, just set them in your own profile, and they will never flip if someone changes them at the base level. Because you always know: you are at a deeper level AND the ordering in the parent profiles gives you a higher priority.

        I’d see it as a simplification of a complex problem; behavior is then easier to predict and track.

        1. reavertm

          Kai’s proposal looks good on paper. I have some doubts, however.
          The problem may be that a single profile node cannot have two immediate parents.

          This likely means that, in order to recreate the functionality of the existing profiles this way, especially wrt hardened (to ensure it prevails), there may have to be a hardened leaf profile equivalent of each non-hardened leaf profile.
          To illustrate:

          base
          +- default/linux
          +- arch/base
          +- features/multilib
          +- features/multilib/lib32
          +- arch/amd64
          | +- releases
          | +- eapi-5-files
          | +- 13.0
          | +- hardened-13.0
          +- arch/x86
            +- releases
            +- eapi-5-files
            +- 13.0
            +- hardened-13.0

          Or something similar.

          This looks like generating a lot of permutations (even if symlinks were used).

          Perhaps some hybrid approach: level-based overriding (Kai) + explicit “include and override” (stacked profiles)….

  2. Jacob

    I wrote a tool (in Java; sorry) that parses a Gentoo profile, and with verbose output turned on, I believe it reported when certain things get added and removed. I was involved in a project that also needed to build on the official profiles and this tool helped us see how exactly to do that effectively.

    If you’re interested, I can pass the source code along.

    1. blueness Post author

      Jacob, sure pass it along: blueness@gentoo.org. Python is the language of choice for portage which is why I chose it. Anyhow, maybe a general tool for working with profiles is in order. Let me see what you did.
