alt-libc: The state of uClibc and musl in Gentoo (part 1)

About five years ago, I became interested in alternative C standard libraries like uClibc and musl and their application to more than just embedded systems.  Diving into the implementation details of those old familiar C functions can be tedious, but also challenging, especially under constraints of size, performance, resource consumption, features and correctness.  Yet these details can make a big difference, as you can see from this comparison of glibc, uClibc, dietlibc and musl.  I first encountered libc performance issues when I was doing number crunching work for my Ph.D. in physics at Cornell in the late ’80s.  I was using IBM RS6000’s (yes!) and had to jump into the assembly.  It was lots of fun and I’ve loved this sort of low level stuff ever since.

Over the past four years, I’ve been working towards producing stage3 tarballs for both uClibc and musl systems on various arches, and with help from generous contributors (thanks guys!), we now have a pretty good selection on the mirrors.  These stages are not strictly speaking embedded in that they do not make use of busybox to provide their base system.  Rather, they employ the same packages as our glibc stages and use coreutils, util-linux, net-tools and friends.  Except for small details here and there, they only differ from our regular stages in the libc they use.

If you read my last blog posting on this new release engineering tool I’m developing called GRS, you’ll know that I recently hit a milestone in this work.  I just released three hardened, fully featured XFCE4 desktop systems for amd64.  These systems are identical to each other except for their libc, again modulo a few details here and there.  I affectionately dubbed these Bluemoon for glibc, Lilblue for uClibc, and Bluedragon for musl.  (If you’re curious about the names, read their homepages.)  You can grab all three off my dev space, or off the mirrors under experimental/amd64/{musl,uclibc} if you’re looking for just Lilblue or Bluedragon (the glibc system is too boring to merit mirror space).  I’ve been maintaining Lilblue for a couple of years now, but with GRS, I can easily maintain all three, and it’s nice to have them for comparison.

If you play with these systems, don’t expect to be blown away by some amazing differences.  They are there and they are noticeable, but they are also subtle.  For example, you’re not going to notice code correctness in, say, pthread_cancel() unless you’re coding some application and expect certain behavior but don’t get it because of some bad code in your libc.  Rather, the idea here is to push the limits of uClibc and musl to see what breaks and then fix it, at least on amd64 for now.  Each system includes about 875 packages in the release tarballs, and an extra 5000 or so binpkgs built using GRS.  This leads to lots of breakage which I can isolate and address.  Often the problem is in the package itself, but occasionally it’s the libc and that’s where the fun begins!  I’ve asked Patrick Lauer for some space where I can set up my GRS builds and serve out the binpkgs.  Hopefully he’ll be able to set me up with something.  I’ll also be able to make the build.log files of failing packages available via the web, so that GRS will double as a poor man’s tinderbox.

In a future article I’ll discuss musl, but in the remainder of this post, I want to highlight some big ticket items we’ve hit in uClibc.  I’ve spent a lot of time building up machinery to maintain the stages and desktops, so now I want to focus my attention on fixing the libc problems.  The following laundry list is as much a TODO for me as it is for your entertainment.  I won’t blame you if you want to skip it.  The selection comes from Gentoo’s bugzilla and I have yet to compare it to upstream’s bugs since I’m sure there’s some overlap.

Currently there are thirteen uClibc stage3’s being supported:

  • stage3-{amd64,i686}-uclibc-{hardened,vanilla}
  • stage3-armv7a_{softfp,hardfp}-uclibc-{hardened,vanilla}
  • stage3-mipsel3-uclibc-{hardened,vanilla}
  • stage3-mips32r2-uclibc-{hardened,vanilla}
  • stage3-ppc-uclibc-vanilla

Here hardened and vanilla refer to the toolchain hardening, the same as we do for regular glibc systems.  Some bugs are arch-specific, others are common.  Let’s look at each in turn.

* amd64 and i686 are the only stages considered stable and are distributed alongside our regular stage3 releases in the mirrors.  However, back in May I hit a rather serious bug (#548950) in amd64, which is our only 64-bit arch.  The problem was in the implementation of pread64() and pwrite64() and was triggered by a change in the fsck code with e2fsprogs-1.42.12.  The bug led to data corruption of ext3/ext4 filesystems, which is a very serious issue for a release advertised as stable.  The problem was that the wrong _syscall wrapper was being used for amd64.  If we don’t require 64-bit alignment, and you don’t on a 64-bit arch (see uClibc/libc/sysdeps/linux/x86_64/bits/uClibc_arch_features.h), then you need to use _syscall4, not _syscall6.

The issue was actually fixed by Mike Frysinger (vapier) in uClibc’s git HEAD but not in the 0.9.33 branch, which is the basis of the stages.  Unfortunately, there hasn’t been a new release of uClibc in over three years, so backporting meant disentangling the fix from some new thread cancel stuff, which was annoying.
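A quick way to sanity-check positional I/O on a freshly built stage is a round trip through pwrite() and pread() at a non-zero offset.  The following is only a minimal sketch, not the e2fsprogs code path that exposed the bug: the helper name positional_io_roundtrip() and the temp file template are made up for illustration, and on amd64 the plain pread()/pwrite() names are assumed to end up in the same 64-bit implementation.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical helper: write a short record at `offset` with pwrite(),
 * read it back with pread(), and confirm the file ends exactly where the
 * record ends.  Returns 0 on success and -1 on any mismatch or error. */
int positional_io_roundtrip(off_t offset)
{
    char path[] = "/tmp/preadtestXXXXXX";   /* illustrative template */
    const char msg[] = "uclibc";
    char buf[sizeof msg] = {0};
    int ok = -1;

    int fd = mkstemp(path);
    if (fd < 0)
        return -1;
    unlink(path);                           /* file vanishes on close */

    if (pwrite(fd, msg, sizeof msg, offset) == (ssize_t)sizeof msg &&
        pread(fd, buf, sizeof buf, offset) == (ssize_t)sizeof buf &&
        memcmp(msg, buf, sizeof msg) == 0 &&
        /* a mangled offset argument would put the data somewhere else */
        lseek(fd, 0, SEEK_END) == offset + (off_t)sizeof msg)
        ok = 0;

    close(fd);
    return ok;
}
```

Calling positional_io_roundtrip(4096) from a small main() and checking for 0 is enough for a smoke test.  The size check via lseek() is there because a wrapper that passes the offset in the wrong argument slot could still read back what it wrote, just at the wrong place in the file.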

* The armv7a stages are close to being stable, but they are still being distributed in the mirrors under experimental.  The problem here is not uClibc specific, but is due to hardened gcc-4.8, and it affects all our hardened arm stages.  With gcc-4.8, we’re turning on -fstack-check=yes by default in the specs and this breaks alloca().  The workaround for now is to use gcc-4.7, but we should turn off -fstack-check for arm until bug #518598 (PR65958) is fixed.

* I hit this one weird bug when building the mips stages, bug #544756.  An illegal instruction is encountered when building any version of python using gcc-4.9 with -O1 optimization or above, yet it succeeds with -O0.  What I suspect happened here is that some change in the optimization code for mips between gcc-4.8 and 4.9 introduced the problem.  I need to distill out some reduced code before I submit to gcc upstream.  For now I’ve p.masked >=gcc-4.9 in default/linux/uclibc/mips.  Since mips lacks stable keywords, this actually brings the mips stages better in line with the other stages that use gcc-4.8 (or 4.7 in the case of hardened arm).

* Unfortunately, ppc is plagued with bug #517160.  PIE code is broken on ppc and causes a seg fault in plt_pic32.__uClibc_main ().  Since PIE is an integral part of how we do hardening in Gentoo, there’s no hardened ppc-uclibc stage.  Luckily, there are no known issues with the vanilla ppc stage3.

Finally, there are five interesting bugs which are common to all arches.  These problems lie in the implementation of functions in uClibc and deviate from the expected behavior.  I’m particularly grateful to René, who found them by running the test suite for various packages.

* Bug 527954 – One of wget’s tests makes use of fnmatch(3) which intelligently matches file names or paths.  On uClibc, there is an unexpected failure in a test where it should return a non-match when fnmatch’s options contains FNM_PATHNAME and a matching slash is not present in both strings.  Apparently this is a duplicate of a really old bug (#181275).
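The rule being tested can be sketched in a few lines of C.  This is not the wget test itself: the helper name path_match() and the patterns mentioned afterwards are only illustrative.  With FNM_PATHNAME set, a slash in the string has to be matched by a literal slash in the pattern, so a wildcard is not allowed to swallow it.

```c
#include <fnmatch.h>

/* Hypothetical helper around fnmatch(3) with FNM_PATHNAME.
 * Returns 1 on a match, 0 on a clean non-match, -1 on any other result. */
int path_match(const char *pattern, const char *path)
{
    int rc = fnmatch(pattern, path, FNM_PATHNAME);

    if (rc == 0)
        return 1;
    if (rc == FNM_NOMATCH)
        return 0;
    return -1;  /* implementation-specific error code */
}
```

On a conforming libc, path_match("src/*.c", "src/main.c") gives 1, while path_match("*.c", "src/main.c") and path_match("*", "a/b") both give 0 because the pattern has no slash to pair with the one in the string.  A libc with the problem described above would be expected to get one of the non-match cases wrong.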

* Bug 544118 – René noticed this problem in an e2fsprogs test for libss.so.  The failure here is due to the fact that setbuf(3) is ineffective at changing the buffer size of stdout if it is run after some printf(3).  Output to stdout is buffered while output to stderr is not.  This particular test tries to preserve the order of output from a sequence of writes to stdout and stderr by setting the buffer size of stdout to zero.  But setbuf() only works on uClibc if it is invoked before any output to stdout.  As soon as there is even one printf(), all invocations of setbuf(stdout, …) are ineffective!
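A small probe makes the difference visible without having to eyeball interleaved terminal output.  This is a sketch under a few assumptions: it uses a tmpfile() stream as a stand-in for stdout, the helper name bytes_on_disk_after_setbuf() is made up, and it relies on fstat() to see how many bytes have reached the file without any explicit fflush().  Note that ISO C only guarantees the effect of setbuf() when it is called before any other operation on the stream, so the late call probes behavior that programs rely on in practice rather than something the standard promises.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>
#include <sys/stat.h>

/* Hypothetical probe: optionally write one byte, switch the stream to
 * unbuffered with setbuf(), write one more byte, then report how many
 * bytes are already in the file (no fflush() is ever called).
 * Returns -1 if the temp file or the fstat() call fails. */
long bytes_on_disk_after_setbuf(int write_before_setbuf)
{
    struct stat st;
    long size = -1;
    FILE *f = tmpfile();

    if (!f)
        return -1;

    if (write_before_setbuf)
        fputs("a", f);      /* normally just sits in the stdio buffer */

    setbuf(f, NULL);        /* request an unbuffered stream */
    fputs("b", f);          /* should go straight to the file if honored */

    if (fstat(fileno(f), &st) == 0)
        size = (long)st.st_size;

    fclose(f);
    return size;
}
```

With write_before_setbuf set to 0 the result should be 1 everywhere.  With it set to 1, a libc that honors the late setbuf() would presumably report that output has already reached the file, while one that ignores it would report 0 because everything is still sitting in the buffer.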

* Bug 543972 – This one came up in the gzip test suite.  One of the tests there checks to make sure that gzip properly fails if it runs out of disk space.  It uses /dev/full, which is a pseudo-device provided by the kernel that pretends to always be full.  The expectation is that fclose() should set errno = ENOSPC when closing /dev/full.  It does on glibc but it doesn’t in uClibc.  It actually happens when piping stdout to /dev/full, so the problem may even be in dup(2).
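The expectation can be checked directly from C without going through gzip.  The sketch below is an assumption-laden reduction, not the gzip test: the helper name errno_from_full_device() is invented, and it opens /dev/full with fopen() rather than redirecting stdout, so it would not exercise dup(2) if that is where the problem really lies.

```c
#include <errno.h>
#include <stdio.h>

/* Hypothetical probe: buffer a few bytes for /dev/full and let fclose()
 * do the flush.  Returns the errno left by a failing fclose(), 0 if the
 * close unexpectedly succeeded, or -1 if /dev/full could not be opened. */
int errno_from_full_device(void)
{
    FILE *f = fopen("/dev/full", "w");

    if (!f)
        return -1;

    fputs("data", f);       /* expected to be buffered, so no error yet */

    errno = 0;
    if (fclose(f) == EOF)   /* the flush is where the write should fail */
        return errno;

    return 0;
}
```

On glibc this returns ENOSPC.  A return value of 0 would mean the write error was silently lost, which is the kind of behavior the gzip test is there to catch.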

* Bug 543668 – There is some problem in uClibc’s dlopen()/dlclose() code.  I wrote about this in a previous blog post and also hit it with syslog-ng’s plugin system.  A seg fault occurs when unmapping a plugin from the memory space of a process during do_close() in uClibc/ldso/libdl/libdl.c.  My guess is that the problem lies in uClibc’s accounting of the mappings: when it tries to unmap an area of memory which is not mapped or was previously unmapped, it seg faults.