Portage – Marius Mauch

Time to say goodbye

So, time has come for me to realize that my time with Gentoo is over. I
haven’t actually been doing much Gentoo work over the last months due
to personal reasons (nothing Gentoo related), and I don’t see that
situation changing in the near future. In fact I’ve already reassigned
or dropped most of my responsibilities in Gentoo a while ago, so there
are just a few pet projects left to give away:
– my gentoo-stats project (in the portage/gentoo-stats svn repository).
I know quite a few people are interested in the idea of collecting
various statistical data from gentoo user systems, and I’d encourage
everyone who wants to implement such a system to at least look at it (I
might even have finished it if I hadn’t wasted my time focusing on
the wrong problems). There is also quite a bit of documentation that
should help to get you started
– a graphical security update tool (see bug #190397)

So if anyone wants to adopt those, complete or just parts, just take
them. As for Portage, Zac has practically already filled my role.

So I guess that wraps it up. It’s been a nice ride most of the time,
but now it’s time for me to leave the Gentoo train.

More extensions to package set support

After writing my previous post about set operators I’ve added a few more things related to package sets to portage. First, operators can now also be used inside sets.conf files using the extend, remove and intersect options, each taking a whitespace-separated list of set names (without the @ prefix) and working analogously to the operators in set expressions described in the previous post. The main difference is that the evaluation order is now fixed (unions come first, differences second and intersections last), while in expressions it’s left-to-right.
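To make the fixed evaluation order concrete, here is a minimal sketch in plain Python. It is not portage code: the function name resolve_set and the example package names are made up for illustration, and sets are modelled as ordinary Python sets of strings.

```python
def resolve_set(base, all_sets, extend="", remove="", intersect=""):
    """Combine package sets in the fixed order used for sets.conf options.

    Hypothetical helper, not part of portage. Each option is a
    whitespace-separated string of set names without the @ prefix.
    """
    result = set(base)
    # 1. unions come first
    for name in extend.split():
        result |= all_sets[name]
    # 2. differences second
    for name in remove.split():
        result -= all_sets[name]
    # 3. intersections last
    for name in intersect.split():
        result &= all_sets[name]
    return result


# Made-up example data
example_sets = {
    "world": {"app-editors/vim", "sys-apps/portage", "kde-base/kdelibs"},
    "system": {"sys-apps/portage", "sys-libs/glibc"},
}

world_without_system = resolve_set(
    set(), example_sets, extend="world", remove="system"
)
```

Because the order is fixed, it doesn’t matter in which order the three options are written down; the result is always the same.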

The second new feature is that package sets can now be (re)defined on the emerge command line. This is done using the following syntax:
emerge '@setname{key1=value1,key2=value2}'
where setname can either be an existing package set, or a new one to define a set without having to modify any files. Note the quotes that are necessary to ensure that emerge gets the argument as-is without interference from the shell. The nice thing is this syntax also works inside set expressions. The not-so-nice thing is that for now there are a few restrictions about the values you can use, as there is no quoting mechanism implemented yet (this is planned however). So using any of the following characters or whitespace inside the braces will lead to undefined behavior: { } @ = ,
Another restriction is that you may not redefine package sets that are created by a multiset section in sets.conf (as those use different options that only make sense when defining multiple sets at once).
Note that for redefining existing package sets you only have to pass those options that should be different from the sets.conf definition.
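For illustration, the syntax above could be taken apart roughly like this. This is only a sketch and not how emerge actually parses its arguments: parse_set_argument is a made-up name, it only handles a single set argument (no expressions), and it simply rejects the characters listed above instead of doing anything undefined.

```python
import re

# '@name' optionally followed by '{key=value,...}'. Whitespace and the
# characters { } @ are rejected inside the braces, since no quoting
# mechanism exists. (Hypothetical pattern, for illustration only.)
_SET_ARG = re.compile(r"^@(?P<name>[^{}@=,\s]+)(?:\{(?P<opts>[^{}@\s]*)\})?$")


def parse_set_argument(arg):
    """Split '@setname{key1=value1,key2=value2}' into (name, options)."""
    match = _SET_ARG.match(arg)
    if match is None:
        raise ValueError("not a valid set argument: %r" % (arg,))
    options = {}
    opts = match.group("opts")
    if opts:
        for pair in opts.split(","):
            key, sep, value = pair.partition("=")
            # every entry needs exactly one '=' and a non-empty key
            if not sep or not key or "=" in value:
                raise ValueError("malformed option: %r" % (pair,))
            options[key] = value
    return match.group("name"), options
```

When redefining an existing set, the returned options would only contain the values that differ from the sets.conf definition.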

And last but not least, to make the above features a bit easier to use there is also a new DummyPackageSet class that can be used to build a package set only by using operators, and/or to include a few packages without having to edit an external file. So it’s even easier than before to define a new set @world-without-system, using
[world-without-system]
class=portage.sets.base.DummyPackageSet
extend=world
remove=system

Package set operators

Ok, just a quick notice that portage 2.2_rc10 (or 2.2 final if there isn’t another RC) will not only support package sets as defined in sets.conf, but also expressions to generate unions, intersections and differences of multiple package sets. This allows you, for example, to temporarily exclude @system from @world (assuming you have @system in your world_sets file) by running emerge @world-@system.
Other operators are / for intersections (select only atoms included in both sets) and + for unions. The latter is useful as expressions can contain more than one operator, e.g. emerge @kde+@gnome/@installed to reinstall all kde and gnome packages that are already installed (assuming kde and gnome sets are defined somewhere).

This feature is just a few minutes old, so it will probably be extended or otherwise changed in the future. Current restrictions include

– strict left-to-right evaluation order
– only defined package sets can be used as operands (no package names)
– feature is currently only available on the commandline, not via sets.conf
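The strict left-to-right evaluation can be sketched in a few lines of Python. This is not the emerge implementation, just an illustration: evaluate_expression is a made-up name, sets are plain Python sets, and an operator is assumed to count only when it is directly followed by @ (so a hyphen inside a set name is left alone).

```python
import re


def evaluate_expression(expression, all_sets):
    """Evaluate e.g. '@kde+@gnome/@installed' strictly left to right."""
    # Split on operators that are directly followed by '@'; the capture
    # group keeps the operators in the resulting list.
    parts = re.split(r"([-+/])(?=@)", expression)
    operands, operators = parts[0::2], parts[1::2]

    def lookup(token):
        # only defined package sets are valid operands, no package names
        if not token.startswith("@") or token[1:] not in all_sets:
            raise ValueError("unknown package set: %r" % (token,))
        return set(all_sets[token[1:]])

    result = lookup(operands[0])
    for operator, token in zip(operators, operands[1:]):
        other = lookup(token)
        if operator == "+":      # union
            result |= other
        elif operator == "-":    # difference
            result -= other
        else:                    # '/': intersection
            result &= other
    return result


# Made-up example data
demo_sets = {
    "kde": {"kde-base/kdelibs", "kde-base/konsole"},
    "gnome": {"gnome-base/gdm", "gnome-base/nautilus"},
    "installed": {"kde-base/konsole", "gnome-base/gdm", "app-editors/vim"},
}
reinstall = evaluate_expression("@kde+@gnome/@installed", demo_sets)
```

With this data reinstall contains only kde-base/konsole and gnome-base/gdm, as the union of kde and gnome is built first and then intersected with installed.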

And while I’m at it, I’ve also added a new AgeSet class to select installed packages that are older/newer than a given number of days.
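The idea behind such an age-based selection can be sketched as follows. This is not the AgeSet class itself; select_by_age, its parameters and the install_times mapping (package name to installation time as a Unix timestamp) are all assumptions made for the example.

```python
import time

SECONDS_PER_DAY = 86400


def select_by_age(install_times, days, mode="older", now=None):
    """Return the packages installed more ('older') or less ('newer')
    than the given number of days ago. Hypothetical helper."""
    if mode not in ("older", "newer"):
        raise ValueError("mode must be 'older' or 'newer'")
    if now is None:
        now = time.time()
    cutoff = now - days * SECONDS_PER_DAY
    if mode == "older":
        return {pkg for pkg, stamp in install_times.items() if stamp < cutoff}
    return {pkg for pkg, stamp in install_times.items() if stamp >= cutoff}
```

Passing now explicitly is only there to make the function easy to test; normally the current time would be used.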

Portage-2.2_pre2 is in the tree

As of a few minutes ago a portage-2.2 test release is finally available
for public consumption. This is a test release (somewhere between
alpha and beta I’d say), NOT a release candidate, so expect a few rough
edges and not always up2date/complete documentation.

Please see the shipped NEWS and RELEASE-NOTES for changes from the 2.1
series, and check bugs.gentoo.org before reporting issues in
#gentoo-portage.

Note for Ebuild developers: This test release includes a partially
rewritten version of repoman that’s not heavily tested, so do not use
it for committing anything to the tree and double check its reports
with other tools or a 2.1 version.

Marius

Portage-2.2 preview

So, while Zac has been keeping everyone distracted with new portage-2.1 releases over the last months I’ve been mostly working on the new features in trunk, which will become portage-2.2, and I think it’s time to give a short preview about things to expect as we plan to release it before the end of the year, so the feature set probably won’t change much from now on:

– The most important new things will be package sets. Sounds boring at first, I know, but due to a flexible framework they allow us (and you) to do interesting things, like eventually replacing glsa-check and revdep-rebuild (while the security set is pretty much identical to what glsa-check did, the set for rebuilding packages with broken linkage is very experimental, incomplete and not enabled by default yet). Or simply update or remerge all packages in a specific category. And that’s not even touching the power of the CommandOutputSet class 😉
– Support for GLEP 42 news, as an alternative for package maintainers to the elog framework
– Visibility filtering based on licenses, aka ACCEPT_LICENSE, which allows you for example to build an RMS-approved system and will render the interactive license prompts currently found in some packages obsolete
– A new FEATURES flag to keep libraries that are still used on a soname change, including a simple way to rebuild all packages using the old library (using another package set). A bit too late for the expat issue, but hopefully it helps to prevent future incidents of that kind
– And of course all the things that already appeared in portage-2.1.3

But no light without darkness, there will be some important changes requiring your attention:

– While not set in stone yet, the behavior of system and world will likely change to match that of other package sets and single packages. Currently emerge world is the same as emerge --noreplace world, meaning that installed packages aren’t rebuilt (unlike emerge $foo which will rebuild $foo). With 2.2 emerge world is likely going to be the same as emerge $(< /var/lib/portage/world); if you want the old behavior you’ll have to use --noreplace. That change also has other benefits beyond consistency, like removing the restriction that world/system could not be combined with other packages on the commandline.
– “world” will likely no longer include “system”; if you want to update both you’ll have to specify both
– Due to a change in the namespace many portage related tools will require an update or generate a lot of deprecation warnings.

As said, it’s just a preview, and some things are still work in progress, but it should give you a first impression of what portage-2.2 will be about. I think we might create the first test releases in late November, but that’s no promise. Though if you want to test it you don’t have to wait that long, just install subversion and read http://www.gentoo.org/proj/en/portage/doc/testing.xml (that’s especially recommended for maintainers of portage related tools), just don’t expect everything to work perfectly yet.

[RFC] Properties of package sets

One feature missing from portage is package sets. Before we
(re)start working on that however I’d like to get some feedback about
what properties/features people would expect from portage package set
support.
Some key questions:

– should they simply act like aliases for multiple packages? E.g.
should `emerge -C sets/kde` be equivalent to `emerge -C kdepkg1 kdepkg2
kdepkg3 …`? Or does the behavior need to be “smarter” in some ways?

– what kind of atoms should be supported in sets? Simple and versioned
atoms for sure, but what about complex atoms (use-conditional, any-of,
blockers)?

– should sets be supported everywhere, or only in selected use cases?
(everywhere would include depstrings for example)

– what use cases are there for package sets? Other than the established
“system” and “world”, and the planned “all” and “security” sets.

– how/where should sets be stored/distributed?

News on the portage front

So, it’s been a while since my last post, so people may wonder what happened since then within portage. Well, besides the usual maintenance releases of 2.1.2 there hasn’t been a lot of exciting stuff as I’ve been mostly inactive, but there are still a number of interesting things:
– We’ve decided that trunk will be released as 2.2, not 2.1.3, due to the structural changes in the codebase (which aren’t complete yet)
– I’ve finished Alec’s work on the portage implementation of GLEP 42 and added Portage support to the eselect module that was shipped with Paludis and made it compliant with the GLEP, now we just have to wait for the eselect and Paludis people to get their act together for the module to be released (bug 179064)
– The new preserve-libs feature is now more or less complete except for the support in revdep-rebuild (more on this in a later post)
– KEYWORDS=”-*” is now completely unsupported, gvisible() will throw a warning if it encounters packages using it (see KEYWORDS.stupid for reasons)
– Zac merged the license visibility code (aka ACCEPT_LICENSE)
– lots of other minor things Zac merged that I don’t remember now, but most of those are also in 2.1.2
– I’ve added some basic instructions to our project page on how interested people can use/test portage versions or svn without having to install them system-wide

There are still a lot of things I’d like to do, but most of those have been on the todo list for so long that it’s unlikely to get them into 2.2, as my time and motivation are quite limited these days.

diet for portage/__init__.py

So, as I said earlier I’ve now moved the dbapi stuff into its own subpackage, and portage/__init__.py (formerly portage.py) has now shrunk to 5k lines. However, that’s still way too much for me, so I’ll see what I can remove from it next, likely candidates are config() and/or doebuild stuff.
Hopefully at some point no module will have more than 1k lines, so things get manageable again and we can start working again without getting lost in files that span hundreds of pages, and maybe even break some of the larger functions/classes (config, fetch, treewalk, …) down into smaller pieces. Now what’s the point of breaking things up? Well, one thing is that the smaller a code block the easier it usually is to reuse it. Same for replacing it with something better. Also, as I have to determine what symbols each new module actually uses to rewrite the import statements, it might give us a better view on which symbols are actually used and on the dependencies between modules, and eventually give us a clue how to group them better (so that semantically related symbols are in the same namespace).

Namespace sanitizing and splitting up the tree

Something that’s bugged me for a while in portage was the crappy namespace handling we had since whenever we moved the python modules to /usr/lib/portage/pym. Originally there was no real problem as we only had a single module portage.py, so all you needed was an ‘import portage’, but over time more modules were created, which Nick started to name portage_foo.py due to the lack of a “portage” python package to use as container. Also there were a number of modules without any “portage” part in the name, such as xpak, cvstree, output or the cache package, which could potentially cause a namespace collision with other packages in site-packages or even the standard library, not a very pleasant thought.
But as of today that’s history, I finally fixed this annoyance and moved all the portage related code into the new “portage” package (so portage.py is now portage/__init__.py and portage_foo.py is now portage/foo.py). For now the code is mostly a 1:1 translation, but over time it hopefully gets a bit cleaner by removing redundant qualifiers. Also this now allows us to split the big portage.py (or now __init__.py) up further without fearing namespace collisions, I’ll probably move the dbapi classes into their own package later this week.
But what does this all mean to you? If you’re just a normal user it shouldn’t affect you in any way (assuming I didn’t screw up anything and Zac updates the ebuild accordingly). If you have some custom scripts or are a developer of a tool using the portage API you should prepare for updating it after portage-2.1.3 is released, though for the time being the old names should just continue to work as I’ve also added some symlinks to avoid a large-scale API breakage.
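For tool authors who want their code to run with both layouts, one possible approach is to try the new module name first and fall back to the old one. The helper below is only a sketch of that idea, not something portage ships: import_first is a made-up name, and portage.foo / portage_foo are the placeholder names from above.

```python
import importlib


def import_first(*module_names):
    """Return the first module from module_names that can be imported."""
    last_error = None
    for name in module_names:
        try:
            return importlib.import_module(name)
        except ImportError as exc:
            # remember the failure and try the next candidate
            last_error = exc
    raise ImportError(
        "none of these modules could be imported: %s" % ", ".join(module_names)
    ) from last_error


# Usage with the placeholder names (needs portage to be installed):
# foo = import_first("portage.foo", "portage_foo")
```

Once the old names are finally dropped, the fallback simply never gets used and can be removed at leisure.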

On another note I fully agree with Diego on the idea of splitting the tree up. I’ve never been a big fan of the recent overlay hype, but at this point it’s still manageable. Also besides any technical problems a tree split would increase the “repo hunting” problem which we’re already starting to see and is IMHO one of the major downsides of most other (rpm-based) distributions, and that’s something I’d like to avoid in Gentoo.

Getting rid of KEYWORDS=-*, step 2

After raising the awareness about KEYWORDS=”-*” being a stupid thing to use in the last months, today I decided to eliminate the remaining reason for using it (one couldn’t unmask a package that had KEYWORDS=”” without editing it) by adding support for a new token in package.keywords. So now when portage-2.1.3 goes live all these live-cvs-completely-unsupported packages can stop using the broken KEYWORDS=”-*” and use KEYWORDS=”” instead without losing functionality. And once we get the tree clean from those KEYWORDS=”-*” abusers we can also finally fix the -* handling for package.keywords to do what it should do (act like ACCEPT_KEYWORDS).