Lilblue Linux: release 20140520

A couple of days ago, I pushed out a new build of Lilblue Linux [1], which is my attempt to turn embedded Linux on its head and use uClibc [2] instead of glibc as the standard C library for a fully featured XFCE4 desktop for amd64. Its userland is built with Gentoo’s hardened toolchain, and the image ships with a kernel built using hardened-sources, which include the Grsec/PaX patches for added security, but its main distinguishing feature from mainstream Gentoo is uClibc. Even though Lilblue is something of an experimental project which grew out of my attempt to get more and more packages to build against uClibc, the system works better than I’d originally expected and there are very few glitches which are uClibc-specific. You get pretty much everything you’d expect in a desktop, including all your multimedia goodies, office software, games and browsers. mplayer2 works flawlessly!

But all is not well in the land of uClibc these days. It has been over two years since the last release, 0.9.33.2 on May 15, 2012, and there are about 80 commits sitting in the 0.9.33 branch, many of which address critical issues found since 0.9.33.2. This causes problems for projects building around uClibc, such as buildroot, and there has even been talk on the mailing lists of buildroot dropping uClibc as its main libc in favor of either glibc or musl [3]. Buildroot is maintaining about 50 backported patches, while Mike’s (aka vapier’s) latest patchset has 20. I seem to always have to insert a backported patch of my own here or there, or ask Mike to include it in his patchset.

For this release, I did something that I have mixed feelings about. Instead of 0.9.33.2 + backported patches, I used the latest HEAD of the 0.9.33 git branch. This saved me the trouble of getting more patches backported into a new revision of our 0.9.33.2 ebuild, or of “cheating” by putting the patches into /etc/portage/patches/sys-libs/uclibc, but it did expose a well-known problem in uClibc, namely how its header files stack. A libc’s header files typically include one another to form a stack [4]. For example, on glibc, sched.h stacks as follows:

    sched.h
        features.h
            sys/cdefs.h
                features.h
                bits/wordsize.h
            gnu/stubs.h
        bits/types.h
            features.h
            bits/wordsize.h
            bits/typesizes.h
        stddef.h
        time.h
            features.h
            stddef.h
            bits/time.h
                bits/types.h
                bits/timex.h
                    bits/types.h
            bits/types.h
            xlocale.h
        bits/sched.h

Here sched.h includes features.h, bits/types.h, stddef.h, time.h and bits/sched.h. In turn, features.h includes sys/cdefs.h and gnu/stubs.h, and so on. Each level of indentation indicates another level of inclusion. Circular and repeated inclusions are made harmless by #ifndef include guards.
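The guard mechanism can be sketched in a single C file. This is only an illustration: the guard macro EXAMPLE_POINT_H, the struct and the function are hypothetical names, and the two guarded sections stand in for one header being reached twice through different inclusion paths, the way features.h is in the stack above.

```c
/* First "inclusion" of a hypothetical header: the guard macro is not yet
   defined, so the body is compiled and the guard is set. */
#ifndef EXAMPLE_POINT_H
#define EXAMPLE_POINT_H
struct example_point { int x; int y; };
#endif /* EXAMPLE_POINT_H */

/* Second "inclusion" of the same text, as if reached through another header.
   The guard is already defined, so the preprocessor skips the body; without
   the guard this would be a struct redefinition error. */
#ifndef EXAMPLE_POINT_H
#define EXAMPLE_POINT_H
struct example_point { int x; int y; };
#endif /* EXAMPLE_POINT_H */

/* Uses the single surviving definition of the struct. */
int example_point_sum(struct example_point p)
{
    return p.x + p.y;
}
```

Note that a guard only stops the same text from being compiled twice. It does nothing to limit how much a header drags in, which is where the trouble described below comes from.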

At least one reason for this structure is to abstract away differences in architectures and ABIs in an effort to present a hopefully POSIX-compliant interface to the rest of userland. So, for example, glibc’s sys/syscall.h looks the same on amd64 as on mipsel, but it includes asm/unistd.h, which is different on the two architectures. Each architecture’s asm/unistd.h has its own internal #ifdefs for the different ABIs proper to the architecture, and each #ifdef section in turn defines the values of the various syscalls appropriately for its ABI [5]. Another reason for this stacked inclusion is to make sure that certain definitions, macros or prototypes defined in one header are made available in another header in the same way as they are made available in a C file. This is the reason given, for instance, in the uClibc commit 2e2dc998 which I examine below.

Let’s see where uClibc’s header problems begin. Take a look at Gentoo’s bug #486782, where cdrtools-3.01_alpha17 fails to build against uClibc because its readcd/readcd.c defines “BOOL clone;” which collides with the declaration of clone() in bits/sched.h [6]. Nowhere is sched.h included in readcd.c; instead, bits/sched.h gets pulled in indirectly because stdio.h is included! Comment 7 reveals the stacking problem. stdio.h’s stacking is complex, but following just the bad chain, we see that stdio.h includes bits/uClibc_stdio.h, which includes bits/uClibc_mutex.h, which includes pthread.h, which includes sched.h, which includes bits/sched.h. Whew! If you’re wondering what stdio.h should have to do with sched.h, then you see the problem: too much information is being exposed here. Joerg’s comment on the bug pretty much sums it up: “The related include files (starting from what stdio.h includes) most likely expose the problem because they seem to expose implementation details that do not belong to the scope of visibility of the using code.”

Back to my bump from 0.9.33.2 to the HEAD of the 0.9.33 branch. This bump unexpectedly exposed bugs #510766 and #510770. Here we find that =media-libs/nas-1.9.4 and =app-text/texlive-core-2012-r1, both of which build just fine against 0.9.33.2, fail against HEAD 0.9.33 because of a name collision with abs(). Unlike the case with cdrtools, where the blame is squarely on uClibc, I think this is a case of enough blame to go around. Both of those packages define abs() as a macro even though it is supposed to be a function prototyped in stdlib.h, as per POSIX.1-2001 [7]. At least nas tries to check if abs() has already been defined as a macro, but it’s still not enough of a check to avoid the name collision. Unfortunately, given its archaic imake system, it’s not as easy as just adding AC_CHECK_FUNCS([abs]) to configure.ac. texlive-core at least uses GNU autotools, but its collection of utilities defines abs() in several different places, making a fix messy. On the other hand, why do we suddenly have stdlib.h being pulled in after those macros with HEAD 0.9.33 whereas we didn’t with release 0.9.33.2? It turns out to be uClibc’s tiny commit 2e2dc998, which I quote here:

	sched.h: include stdlib.h for malloc/free
	Signed-off-by: Bernhard Reutner-Fischer <rep.dot.nop@gmail.com>

	diff --git a/libc/sysdeps/linux/common/bits/sched.h b/libc/sysdeps/linux/common/bits/sched.h
	index 7d6273f..878550d 100644
	--- a/libc/sysdeps/linux/common/bits/sched.h
	+++ b/libc/sysdeps/linux/common/bits/sched.h
	@@ -109,6 +109,7 @@ struct __sched_param
	 /* Size definition for CPU sets.  */
	 # define __CPU_SETSIZE	1024
	 # define __NCPUBITS	(8 * sizeof (__cpu_mask))
	+# include <stdlib.h>
	 
	 /* Type for array elements in 'cpu_set_t'.  */
	 typedef unsigned long int __cpu_mask;

Both packages pull in stdio.h after their macro definition of abs(). But now stdio.h, which pulls in bits/sched.h, further pulls in stdlib.h with the function prototype of abs() and … BOOM! … we get

    /usr/include/stdlib.h:713:12: error: expected identifier or '(' before 'int'
    /usr/include/stdlib.h:713:12: error: expected ')' before '>' token
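To see why the order matters, here is a small sketch in C. The failing order is kept in a comment because it does not compile; the macro body is an assumed shape (the '>' in the error message suggests a comparison of this kind), and EXAMPLE_ABS and the two helper functions are hypothetical names. It shows one possible way for a package to sidestep the collision, not the fix that was applied in the bugs above.

```c
/* Failing order, left as a comment since it does not compile:
 *
 *     #define abs(x) ((x) > 0 ? (x) : -(x))
 *     #include <stdlib.h>
 *
 * The preprocessor rewrites the prototype "int abs(int)" in stdlib.h
 * using the macro, and the compiler then sees a syntax error.
 */

#include <stdlib.h>   /* declares the real function: int abs(int) */

/* Package-local macro under its own (hypothetical) name, so it can never
   rewrite a libc prototype no matter which headers get pulled in. */
#define EXAMPLE_ABS(x) ((x) > 0 ? (x) : -(x))

/* Generic absolute value via the macro; works for floating point too. */
double example_abs_double(double v)
{
    return EXAMPLE_ABS(v);
}

/* Integer absolute value via the libc function prototyped in stdlib.h. */
int example_abs_int(int v)
{
    return abs(v);
}
```

Giving the macro its own name makes the package independent of how a particular libc happens to stack its headers, which is exactly the property the two packages lacked.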

Untangling the implementation details is going to be a thorny problem. And, given uClibc’s faltering release schedule, things are probably not going to get better soon. I have looked at the issue a bit, but not enough to start unraveling it. It’s easier just to apply hacky patches to the odd package here and there than to rethink uClibc’s internal implementations. If we are going to start rethinking implementation, then musl [8] is much more exciting. uClibc is used in lots of embedded systems and the header issue is not going to be a show stopper for it or for Lilblue, but it does make alternatives like musl look more attractive.

References:

[1] https://wiki.gentoo.org/wiki/Project:Hardened_uClibc/Lilblue

[2] http://www.uclibc.org

[3] See Petazzoni’s email to the uClibc community.

[4] I wrote a little Python script to generate these stacks since creating them manually is tedious. You can download it from my dev space: header-stack.py. Note that the stacking is influenced by #ifdefs throughout, e.g. #ifdef __USE_GNU, which the script ignores, but it does give a good starting place for how the stacking goes.

[5] As of glibc 2.17, on mips, asm/unistd.h defines the various __NR_* values in a flat file with three #ifdef sections for _MIPS_SIM_ABI32, _MIPS_SIM_ABI64 and _MIPS_SIM_NABI32, respectively ABI=o32, n64 and n32. Using my script from [4], the stacking looks as follows:

    sys/syscall.h
        asm/unistd.h
            asm/sgidefs.h
        bits/syscall.h
            sgidefs.h

In contrast, on amd64, each ABI is broken out further into its own file, with asm/unistd_32.h, asm/unistd_x32.h or asm/unistd_64.h included into asm/unistd.h for __i386__, __ILP32__, or the remaining 64-bit case respectively. Here the stacking is

    sys/syscall.h
        asm/unistd.h
            asm/unistd_32.h
            asm/unistd_x32.h
            asm/unistd_64.h
        bits/syscall.h

Remember, on both architectures, sys/syscall.h is identical, and that is the file you should include in your C programs, not any of the asm/* headers, which often carry warnings not to include them directly.
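As a minimal sketch of what that looks like in practice, assuming a Linux system: the function name example_raw_getpid is hypothetical, while syscall() and SYS_getpid are the standard Linux interfaces. The source only ever names sys/syscall.h, and the per-architecture number is filled in by whatever that header stacks underneath.

```c
#define _GNU_SOURCE        /* must precede all includes; asks the libc to declare syscall() */
#include <sys/syscall.h>   /* portable entry point: provides SYS_getpid / __NR_getpid */
#include <unistd.h>        /* syscall(), getpid() */

/* Invoke getpid through the raw syscall interface.  The numeric value of
   SYS_getpid differs per architecture and ABI, but this source does not. */
long example_raw_getpid(void)
{
    return syscall(SYS_getpid);
}
```

Calling example_raw_getpid() from main() should return the same value as getpid(), whether the file is built for amd64 or for mipsel.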

[6] man 2 clone

[7] man 3 abs

[8] http://www.musl-libc.org/
