Gentoo musl Support Expansion for Qt/KDE Week 11

This week has mostly been dedicated to fixing older, harder problems that I had previously put off. I spent a lot of time learning the AccountsService codebase and setting up systems with LDAP authentication, but after reading a couple of issues on its GitLab page it turned out that it didn’t need a rewrite. More on that later.

To start with, I added a CMAKE_SKIP_TESTS variable to cmake.eclass. Currently you need to specify skipped tests with myctestargs=( -E '(test1|test2|test3)' ). This works fine for the most part, but it gets really messy if you need to add skipped tests in more than one place, because ctest does not allow you to pass -E multiple times. I ran into this when fixing tests for kde-apps/okular. Most Okular tests only pass when it’s installed (#653618), but the ebuild already skips some tests for other reasons. So I needed to first disable some tests unconditionally, and then disable more conditionally with something like “has_version ${P} || append tests”. To solve it I introduced an array and then joined it with myctestargs+=( -E '('$( IFS='|'; echo "${CMAKE_SKIP_TESTS[*]}")')' ), but since this is useful for a lot more ebuilds than just Okular I decided to implement it in the eclass.
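As a rough illustration of the array-joining trick, here is a minimal standalone bash sketch. The test names are placeholders, and the real eclass code may well differ from this; it only shows how several additions to one array collapse into a single -E regex.

```shell
#!/bin/bash
# Placeholder test names, not taken from any real ebuild.
CMAKE_SKIP_TESTS=( test1 test2 )

# A later (possibly conditional) addition just appends to the same array.
CMAKE_SKIP_TESTS+=( test3 )

myctestargs=()
if [[ ${#CMAKE_SKIP_TESTS[@]} -gt 0 ]]; then
    # "${array[*]}" joins the elements with the first character of IFS.
    # Setting IFS inside $( ... ) keeps the change local to the subshell.
    myctestargs+=( -E "($( IFS='|'; echo "${CMAKE_SKIP_TESTS[*]}" ))" )
fi

# Print one argument per line.
printf '%s\n' "${myctestargs[@]}"
```

Because everything ends up in one array, ctest only ever sees a single -E option no matter how many places add to the list.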

The second thing I worked on was AccountsService. It’s a daemon that retrieves a list of users on the system and exposes them over a DBus interface. It’s used for showing users in things like login managers and account settings panels. I actually started working on this a long time ago, but I put it off because it seemed to require a bigger rewrite and I had more important things to do back then.
AccountsService has two issues on musl. It uses fgetspent_r, which is a glibc extension, and wtmp, which is only implemented as stubs in musl (https://wiki.musl-libc.org/faq.html#Q:-Why-is-the-utmp/wtmp-functionality-only-implemented-as-stubs?). I asked in #musl to figure out an fgetspent_r replacement, but we ended up discussing why it is bad to enumerate /etc/passwd to get a list of users. For example, it does not respect non-local (LDAP/NIS) users. So AS needed a bigger rewrite, or so we thought :).
So I started by setting up two virtual machines, one LDAP client and one LDAP server. Having never used LDAP before this was a little hard, but I got it working. I also needed to set up QEMU networking so that my VMs could connect to each other, and I set up an LDAP web UI called ldap-ui so I could easily get an overview of my LDAP instance. Because AS works by providing a DBus interface I also learned to use the qdbusviewer and dbus-send tools. Before taking a deep dive into the AS source code I wrote some small test programs to get comfortable with the DBus C API, the passwd and shadow database functions, and GLib.
I then started reading the AccountsService source code to understand it better. Its main.c just sets up a daemon that’s mostly defined in daemon.c, and the rest of the source files are mostly helpers and wrappers. When the daemon initializes it sets up user enumerators using the entry_generator_* functions. The main one is entry_generator_fgetpwent. This generator enumerates /etc/passwd and uses fgetspent_r to read the matching shadow entries, and my idea was to replace it with getpwent + getspnam. But there are two other generators, requested_users and cachedir. requested_users takes a requested user (e.g. when manually entering username and password in the login manager) and adds it to /var/lib/AccountsService/users. cachedir looks at that directory and adds those entries to the daemon. It turns out that requesting a non-local LDAP user with the requested_users generator works completely fine, and the login information is cached in that directory so that the cachedir generator can expose it for future logins.
I then looked at some issues in the AccountsService GitLab, and it turns out that enumerating only /etc/passwd is intentional, so that the login screen doesn’t blow up with thousands of users on a big LDAP domain, for example. So the rewrite was sadly not needed, but I learned a lot! Fixing fgetspent_r and wtmp still needs to get done, but I already have a fix for that.

Another thing I spent a lot of time on this week was poxml. This is also an old issue that I put off, mostly because it was too hard at the time. The build fails because it can’t find the function gl_get_setlocale_null_lock in libgettextpo.so. This shared object belongs to GNU Gettext, so I figured something was wrong there. Looking at the shared object with nm --dynamic /usr/lib/libgettextpo.so I could see that the function was undefined, bad! We reported this issue upstream and got into a long conversation. Apparently Bruno (GNU) used Alpine Linux, which packages GNU libintl, while Gentoo uses the musl libintl implementation. GNU libintl actually provides gl_get_setlocale_null_lock, which explains why it worked on Alpine without issue. After grepping for gl_get_setlocale_null_lock I found this:
/* When it is known that the gl_get_setlocale_null_lock function is defined by a dependency library, it should not be defined here. */
#if OMIT_SETLOCALE_LOCK
/* do nothing */
#else
/* define gl_get_setlocale_null_lock */
#endif

So I tried just forcing the check to false, and it worked! I then looked at the build system, expecting something like AC_SEARCH_LIBS([gl_get_setlocale_null_lock], [intl], ...) *set OMIT_SETLOCALE_LOCK*, but it turns out that the autotools build just forces OMIT_SETLOCALE_LOCK to 1. This is clearly wrong, so I sent another comment upstream and temporarily fixed it in the Gentoo tree. Instead of doing it properly I made an ugly hack so that we don’t get complacent (sam’s idea), and hopefully we can get it resolved upstream instead :D.
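To check a fix like this, the nm inspection from earlier can be scripted. The following is only a small sketch: nm_lists_undefined is a hypothetical helper name of my own, and it assumes the usual nm --dynamic output where undefined symbols appear as “U name” without an address column.

```shell
#!/bin/sh
# Hypothetical helper: reads `nm --dynamic` output on stdin and succeeds
# (exit status 0) only if the given symbol is listed as undefined ("U").
nm_lists_undefined() {
    awk -v sym="$1" '
        # Undefined entries have no address, so the type letter is field 1.
        $1 == "U" {
            name = $2
            sub(/@.*/, "", name)   # drop a symbol version suffix, if any
            if (name == sym) found = 1
        }
        END { exit (found ? 0 : 1) }
    '
}
```

It could then be used as nm --dynamic /usr/lib/libgettextpo.so | nm_lists_undefined gl_get_setlocale_null_lock, which should stop succeeding once the library defines the function itself.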

To summarize, I feel like this week has gone pretty well. I’ve solved everything that was left, and now I’m ready to start writing a lot of documentation. A lot of the AccountsService setup and work was ultimately unnecessary, but I still learned a lot.
