Eclasses, Portage and PMS

Recently I had a little IRC bikeshed with bonsaikitten on the topic of .la file removal. As one of the maintainers of autotools-utils.eclass, I tend to like people actually using that eclass and keeping the .la removal algorithm there; bonsaikitten would like to see it in Portage instead.

To prove my point, let’s take a look at the process of making a change in an eclass:

  1. getting eclass maintainers’ approval,
  2. sending patches to gentoo-dev for review,
  3. PROFIT!

The whole process usually takes one week, and the change is effective as soon as the user syncs the tree. This means that if a particular change aims to fix an issue with an ebuild yet to be committed, the ebuild commit can simply be delayed until the eclass change is in. And since the ebuild is committed after the eclass change, users won’t even notice the breakage.

Although getting a change into a single PM can usually be faster, it only becomes effective when the user upgrades the PM. This usually means that the ebuild author has to either work around the problem or delay the commit until the fix goes stable, and even then a number of users could still be hit by the bug.

For PMS, I think the situation is clear: it takes a lot of time to get a change into PMS, get a new EAPI approved, get it implemented and stabilized, and finally get the blessing to use it in the tree.

Although it may sound like it, I’m not rejecting PMS. PMS has its place, but I really don’t see a reason to put everything into it just for the fun of it. In my opinion, PMS should cover the most fundamental functions, which are either very simple (and thus unlikely to introduce bugs) or highly relevant to the PM internals.

For example, emake in Portage reuses the MAKEOPTS variable, which can be considered private/internal. elog relies on a PM-specific output system, and doins… well, one can say it could use some magic to optimize the merge process (OTOH, ebuilds already assume it does not).

econf on the other hand, although pretty common, doesn’t fully fit into PMS. The only magic it uses, it uses for libdir; and it is very specific to autoconf. But the same magic needs to be implemented in multilib.eclass to let non-autoconf build systems handle libdir correctly.

Returning to the topic: .la removal is not suitable for either PMS or PM because:

  1. it is very specific to autotools and libtool,
  2. it requires either a smart algo or some magic to determine which files to remove and which ones to keep,
  3. and for those kept, more magic could be required.

A quite sane algo is implemented in autotools-utils right now. As more packages are migrated to it, maintainers can give us feedback on it and help us improve it. And if it fails on a new package, we can commit a fix before the package hits end users.

If Portage started removing .la files on its own, we would end up with either:

  1. needing a really smart algo which always works right, or else random breakages for a number of users, with the only solution being ‘upgrade portage and rebuild offending packages’;
  2. implementing some kind of Portage-specific variables to control the .la removal better.

I really don’t like either of those. So, just migrate your ebuilds! Distutils packages use distutils.eclass, CMake packages use cmake-utils.eclass. There’s no reason not to inherit a dedicated eclass for autotools, with autotools-specific quirks.

Ah, and please finally stop pushing everything into PMS just because some devs break eclass APIs. If someone breaks policy in a given instance, that person should be punished. Locking all devs in cages is no solution.

libtinynotify — a smaller implementation of Desktop Notifications

Just a quick note. This week I began a new project, and it’s called libtinynotify. The tagline would probably be something like ‘from the creator of uam, another piece of software to make your systems smaller’. But in fact, I don’t think it will. I just used libnotify, and thought it could be done much better.

The main goal of libtinynotify is to keep it simple. When I used the original libnotify in autoupnp, I noticed that it forces my library to link not only with gobject but even with gdk-pixbuf! I reported the bug upstream and they didn’t really care. Why would a simple notifications library force gdk-pixbuf on all users? That’s not really a good dependency set for a preloaded library, which autoupnp is now.

And that’s basically how it all started. First, I wanted to create a smaller, GLib-free variant of libnotify with a compatible API but that’s obviously impossible (due to GObject). And I really didn’t want to push GObject dep into the API. Thus, libtinynotify comes with a new, shiny API.

Although it’s still early work and the API can change rapidly, it does its job already. I tried to make it pretty flexible for everyday tasks while keeping it simple. And I think I did pretty well, though I’m open to comments.

If someone wants to give it a try, it’s x11-libs/libtinynotify in the mgorny overlay (live ebuild). My playground for it is net-misc/autoupnp (also a live ebuild, in the mgorny overlay).

PMS Test Suite: D-Bus → …?

The D-Bus communication method currently used within PMS Test Suite shows more and more disadvantages over time. With the final evaluation approaching, it will remain the solution for the GSoC period of development. But after that I’ll probably jump straight to replacing it with something better.

Right now, I can list at least the following issues with my current IPC:

  • it requires the system bus to be running which may not be really useful for smaller systems,
  • …and which makes using it in a Prefix environment a painful experience,
  • …and makes running multiple instances of pmsts a random failure,
  • it limits the package to using Python 2, directly and indirectly (via GLib event loop),
  • …and the GLib Python event loop fails to propagate exceptions correctly,
  • …and it wants the PID out of subprocess.

Looking for a new solution

Before deciding which way to proceed, let’s take a look at what exactly we need. And we need:

  • an IPC mechanism which would work fine within the limited ebuild environment (sandbox, userpriv),
  • …which would not require us to touch or prepare builddirs,
  • …which would work fine with failing (or not even being started) ebuilds as well,
  • and an event loop which would allow us to asynchronously communicate with ebuilds and wait for a single subprocess to finish,
  • …and it all has to work both with Python 2 & 3.

And it would be great if I could avoid introducing additional dependencies.

asyncore?

Right now, the most promising solution seems to be using the asyncore Python module and a UNIX socket. Considering that gentoopm is already able to provide us with the userpriv UIDs for all PMs, we can make the socket userpriv-aware and not world-writable.

We’d then use asyncore.loop() to handle the communication, with a timeout of a few seconds so that we can regularly check whether the subprocess has terminated.
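A minimal sketch of that loop idea follows. It is only an illustration, not the PMS Test Suite code: it uses select.select() rather than asyncore (asyncore was deprecated and later removed from the Python standard library in 3.12), and the names make_server, read_message, serve_until_exit and demo, as well as the message text, are made up for the example. It also reads each connection in a blocking way, which a real implementation would probably want to avoid.

```python
import os
import select
import socket
import subprocess
import sys
import tempfile

# Stand-in for an ebuild phase reporting back; hypothetical message format.
CLIENT = (
    "import socket, sys\n"
    "s = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)\n"
    "s.connect(sys.argv[1])\n"
    "s.sendall(b'pkg_setup started')\n"
    "s.close()\n"
)


def make_server(path):
    """Create a listening UNIX socket at the given path."""
    server = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    server.bind(path)
    server.listen(5)
    return server


def read_message(server):
    """Accept one connection and read it until the peer closes it."""
    conn, _ = server.accept()
    try:
        chunks = []
        while True:
            data = conn.recv(4096)
            if not data:
                return b"".join(chunks)
            chunks.append(data)
    finally:
        conn.close()


def serve_until_exit(server, child, poll_timeout=0.2):
    """Collect messages until the child has exited and nothing is pending."""
    messages = []
    try:
        while True:
            # Check the child first, so that connections made just before
            # it exited are still drained from the backlog.
            exited = child.poll() is not None
            timeout = 0 if exited else poll_timeout
            readable, _, _ = select.select([server], [], [], timeout)
            if readable:
                messages.append(read_message(server))
            elif exited:
                break
    finally:
        server.close()
    return messages


def demo():
    tmpdir = tempfile.mkdtemp()
    path = os.path.join(tmpdir, "pmsts.sock")
    server = make_server(path)
    child = subprocess.Popen([sys.executable, "-c", CLIENT, path])
    try:
        return serve_until_exit(server, child)
    finally:
        os.unlink(path)
        os.rmdir(tmpdir)
```

The timeout plays the role described above: the loop wakes up regularly to check the subprocess even when no ebuild is talking to it.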

One remaining question is the socket path. We could either:

  1. just make it a well-known name,
  2. make it random and write it into the eclass at generation time,
  3. make it random and pass through the environment.

The first solution has the disadvantage that only one instance of the PMS Test Suite could be running at once. On the other hand, running more than one at once seems to be a bad idea anyway (unless they’re running completely different test suites with unique ebuild names and separate repos). By using a common socket name, the other instance could just ping the first one and fail nicely rather than failing quite randomly.

And the third solution has the big disadvantage that we would start relying on an arbitrary variable getting passed through the PM. Although this is possible and most probably will even work, I don’t think it’s a good idea. And it could stop working at any point.

I’ll probably go with the first one. I guess gentoopm would have to provide us with the root path too.
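The ‘ping the first instance and fail nicely’ idea with a well-known socket name could look roughly like the sketch below. The function name claim_socket is hypothetical, and the probe is just a connection attempt rather than any real protocol; there is also a small race between the probe and the bind which a real implementation would need to handle.

```python
import errno
import os
import socket


def claim_socket(path):
    """Return a listening socket, or None if another instance owns the path."""
    probe = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    try:
        probe.connect(path)
    except socket.error as exc:
        if exc.errno not in (errno.ENOENT, errno.ECONNREFUSED):
            raise
        # Nobody is listening; remove a stale socket file if one was left.
        if os.path.exists(path):
            os.unlink(path)
    else:
        # Somebody accepted the connection: another instance is running.
        return None
    finally:
        probe.close()

    server = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    server.bind(path)
    server.listen(1)
    return server
```

A second instance getting None back can print a clear error and exit, instead of colliding with the first one halfway through a test run.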

uam can now mount CDs and DVDs!

Today I have released uam-0.2. The new release adds a long-awaited feature: the ability to mount and unmount ejectable media such as CDs and DVDs. And it does so in a way much simpler than I expected.

But first, what is uam?

uam is my old project, dating back to the times of HAL. Although HAL is long gone, uam is still there shining. It is a simple, lightweight media automounter using udev rules only.

Unlike HAL (or udisks) it doesn’t introduce any additional daemons. It just installs a few udev rules and helper scripts. When a media device is added or removed, udev calls the scripts and they perform all the mount/umount operations as necessary. HAL and udisks are no longer required, and neither are mounter daemons.

Isn’t that a very limited solution?

Of course uam can’t be as flexible as the HAL/udisks approach. You can’t (easily) get it to do things like asking the user for permission or a password; well, it doesn’t even create mounted media icons on your desktop. But is that what you really want it to do?

You could say uam is one of those plug & play apps. Type emerge uam, press Enter, and newly-inserted media should start appearing in /media. If you want to fine-tune it a little, there is a config file with a few more switches and options at /etc/udev/uam.conf. You can set mount options, mountpoint naming, device filtering…

But how does it handle CDs and DVDs without a daemon?

Before, it wasn’t possible to mount CDs without some kind of a polling daemon. HAL/udisks provided such a daemon; I was even considering adding such a daemon to uam. The other solution was to use sys-apps/pmount which allows unprivileged users to mount removable media.

None of these is necessary any longer. Nowadays, the kernel can poll ejectable drives itself and report media change (and eject) events through udev. As of 0.2, uam handles those events and is able to mount CDs as well.

In order to do that, the kernel polling has to be enabled. This can be done either per-device:

echo 5000 > /sys/block/sr0/events_poll_msecs

or by setting a common polling interval through the events_dfl_poll_msecs parameter of the block module:

echo 5000 > /sys/module/block/parameters/events_dfl_poll_msecs

The interval is specified in milliseconds, i.e. the above examples set it to 5 seconds. Smaller intervals result in quicker mounting of CDs, larger ones result in less polling overhead.
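For those who prefer to set the interval from a script, the same write can be sketched in Python. The helper name set_poll_interval is made up for this example; the path you pass would be one of the two sysfs files shown above, and writing to them requires root.

```python
def set_poll_interval(param_path, seconds):
    """Write a polling interval, given in seconds, as milliseconds."""
    millis = int(round(seconds * 1000))
    with open(param_path, "w") as f:
        f.write("%d\n" % millis)
    return millis
```

For instance, set_poll_interval("/sys/block/sr0/events_poll_msecs", 5) writes the same value as the first echo command.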

PMS Test Suite: getting the test results

One of the key problems in PMS Test Suite is getting actual test results. Given the whole complexity of the build process, including privilege dropping, sandbox, collision protection and auto-pretending, it is not that easy to check whether a particular test succeeded without risking a lot of false positives.

The simple attempt: succeed or die!

The simplest method of all would be to assume the test is supposed to either complete and merge successfully or die. Although that will work in many cases, it has many limitations.

First of all, to make it work as expected, the actual test code has to be executed. If for some reason the test code is not executed, we end up with a false positive. Consider the test checking phase function execution order. If for some reason pkg_postinst() isn’t called at all, no test code runs in that phase, so there is nothing that could die to report it.

Moreover, if a test is supposed to fail, we can’t be sure whether it failed for our reason or because of some unrelated PM bug. We could try to implement some method of grabbing the failure message and parsing it, but that would imply relying on a particular output format. That’s not really what I’m interested in.

On the other hand, that is the most straightforward method of checking test results. It doesn’t introduce additional dependencies and is PM-safe, and that’s why the most basic EbuildTestCase class of PMS Test Suite uses it. Well, to be more exact, it checks vardb before and after running the tests to see which ones were merged and which ones failed to merge.
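The before/after vardb comparison boils down to a set difference. The sketch below only illustrates that idea and is not the actual EbuildTestCase code; the function name classify_results, the package names and the data layout (a dict mapping each test package to whether it is expected to merge) are assumptions made for the example.

```python
def classify_results(expected, vardb_before, vardb_after):
    """Map each test package to True if its merge outcome was as expected.

    expected: dict of package name -> True if the test should merge,
              False if it is supposed to die.
    vardb_before, vardb_after: iterables of installed package names.
    """
    merged = set(vardb_after) - set(vardb_before)
    results = {}
    for pkg, should_merge in expected.items():
        results[pkg] = (pkg in merged) == should_merge
    return results


# Hypothetical test packages, for illustration only.
expected = {
    "pms-test/phase-order-0": True,
    "pms-test/must-die-0": False,
    "pms-test/broken-0": True,
}
before = ["sys-apps/portage-2"]
after = ["sys-apps/portage-2", "pms-test/phase-order-0"]
outcome = classify_results(expected, before, after)
```

Note that "pms-test/must-die-0" counts as a success here even if the PM never started it, which is exactly the false positive discussed above.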

Passing more complex test results

Due to the problems pointed out above, I’ve decided to introduce a more complex test result checking method. Originally, it was supposed to use files to store ebuild output, but early testing showed that this concept has a few weaknesses.

Most importantly, FEATURES=userpriv resulted in some phase functions being run as root and others as the portage user. I’ve decided that hacking around permissions, sandbox and other potential obstacles to get that concept working was not worth the effort.

That’s why the current implementation uses D-Bus for communication between the actual tests and the test runner. I was a little surprised by the fact that neither Portage nor pkgcore had any trouble with letting the test code reach the system bus.

Right now, the DbusEbuildTestCase handles all necessary D-Bus integration. It creates a D-Bus object for each running test, integrates the pms-test-dbus eclass with tests and provides methods to submit and check the test results.

Not all D-Bus test cases have to actually submit any output. Simpler ones just ping the D-Bus object in pkg_setup() to let it know that the test was actually started. This avoids the case where a test is expected to die and is considered to have done so only because the PM didn’t start it at all (e.g. due to insufficient permissions, when emerge assumes --pretend).
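The effect of the ping on result checking can be illustrated with a tiny decision function. This is a simplified sketch and not the DbusEbuildTestCase logic itself; the function name evaluate and the result strings are made up for the example.

```python
def evaluate(expect_failure, merged, pinged):
    """Decide a test result from the merge state and the pkg_setup() ping."""
    if not pinged:
        # The PM never ran the test code, so a missing merge proves nothing.
        return "not started"
    if expect_failure:
        return "pass" if not merged else "fail"
    return "pass" if merged else "fail"
```

Without the ping, a must-die test that was never started would be indistinguishable from one that died as intended; with it, the first case is reported separately.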