A few thoughts on libc++ and _GNU_SOURCE

This week I tried to make libc++ work without _GNU_SOURCE being predefined; the predefinition causes me some trouble when compiling LLVM against musl. As mentioned in my last post, g++ and clang++ unconditionally predefine _GNU_SOURCE for any C++ code, because libstdc++ and libc++ simply won’t work without it. This is an old and well-known issue [1], but unfortunately it has never been fixed. So I boldly tried to fix it for libc++, and failed 🙁
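If you want to check what your own toolchain does, a small probe like the one below may help. It is only a sketch: the function names are made up for illustration, and __USE_GNU is a glibc-internal macro, so on other C libraries the second probe will simply report false.

```cpp
#include <cassert>
#include <cstdlib>  // on glibc this pulls in <features.h>

// True if the compiler (or the build flags) defined _GNU_SOURCE
// before this file was preprocessed.
constexpr bool gnu_source_defined() {
#ifdef _GNU_SOURCE
  return true;
#else
  return false;
#endif
}

// True if glibc's <features.h> turned _GNU_SOURCE into its internal
// __USE_GNU switch. Always false on C libraries that lack this macro.
constexpr bool use_gnu_defined() {
#ifdef __USE_GNU
  return true;
#else
  return false;
#endif
}

// __USE_GNU should never appear without _GNU_SOURCE.
static_assert(!use_gnu_defined() || gnu_source_defined(),
              "__USE_GNU defined without _GNU_SOURCE");
```

Printing both values from a one-line main() and compiling the file once as C and once as C++ should show the difference the compiler driver makes.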

Simply put, libc++ depends on some non-standard C functions that are only available when _GNU_SOURCE is defined. For example, strtoll_l() is a non-POSIX function hidden behind _GNU_SOURCE in <stdlib.h>, and it is used by libc++’s header <locale>. A naive idea might be to define _GNU_SOURCE in <locale>. That doesn’t work, because <stdlib.h> may already have been included and expanded before <locale>; at that point defining _GNU_SOURCE is too late, since the include guard turns any later inclusion of <stdlib.h> into a no-op.
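To make the "too late" part concrete, here is a minimal single-file simulation. Nothing in it touches the real headers: FAKE_STDLIB_H, FAKE_GNU_SOURCE and FAKE_STRTOLL_L_DECLARED are hypothetical stand-ins for the include guard, the feature test macro and the hidden declaration, and the two guarded regions play the role of two inclusions of the same header.

```cpp
#include <cassert>

// --- first "inclusion" of a hypothetical fake_stdlib.h ---
// (user code included it before the library header got a chance to run)
#ifndef FAKE_STDLIB_H
#define FAKE_STDLIB_H
#ifdef FAKE_GNU_SOURCE
#define FAKE_STRTOLL_L_DECLARED 1  // stands in for the strtoll_l() prototype
#endif
#endif

// --- what a naive <locale> would do ---
#define FAKE_GNU_SOURCE 1

// --- second "inclusion" of fake_stdlib.h: the guard skips everything ---
#ifndef FAKE_STDLIB_H
#define FAKE_STDLIB_H
#ifdef FAKE_GNU_SOURCE
#define FAKE_STRTOLL_L_DECLARED 1
#endif
#endif

// Reports whether the stand-in declaration became visible.
constexpr bool strtoll_l_visible() {
#ifdef FAKE_STRTOLL_L_DECLARED
  return true;
#else
  return false;
#endif
}

// The macro is defined by now, but the declaration never appeared.
static_assert(!strtoll_l_visible(), "late definition should have no effect");
```

Moving the #define above the first guarded region flips the result, which is exactly the ordering constraint described above.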

To address the above problem, we need to define _GNU_SOURCE before any inclusion of <stdlib.h>. So a straightforward idea is to define _GNU_SOURCE in <cstdlib>, which is the only place in libc++ where <stdlib.h> is directly included (other C++ headers usually include <cstdlib> instead). Unfortunately this doesn’t work either. If you read glibc’s headers, you’ll notice that symbols like strtoll_l are not guarded by _GNU_SOURCE directly, but by another macro: __USE_GNU. __USE_GNU is defined in <features.h> only when _GNU_SOURCE is defined, so in principle the two have the same effect. But this leads to an unpleasant consequence: <features.h> might already have been included before <cstdlib>, so defining _GNU_SOURCE there doesn’t necessarily mean __USE_GNU gets defined; and without __USE_GNU, the symbols we want from <stdlib.h> are still hidden.
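The extra level of indirection can be sketched the same way. Again this is a simulation with made-up FAKE_* macros rather than glibc's real headers: the first region plays <features.h> being pulled in early by some other C header, and the second plays <cstdlib> defining the macro just before its first inclusion of <stdlib.h>.

```cpp
#include <cassert>

// --- some earlier C header pulls in a hypothetical fake_features.h ---
#ifndef FAKE_FEATURES_H
#define FAKE_FEATURES_H
#ifdef FAKE_GNU_SOURCE
#define FAKE_USE_GNU 1  // internal switch derived from the public macro
#endif
#endif

// --- stand-in for <cstdlib>: define the macro, then include the C header ---
#define FAKE_GNU_SOURCE 1

#ifndef FAKE_STDLIB_H
#define FAKE_STDLIB_H
// Nested inclusion of fake_features.h: already guarded, so it is skipped
// and FAKE_USE_GNU keeps the state it had the first time around.
#ifndef FAKE_FEATURES_H
#define FAKE_FEATURES_H
#ifdef FAKE_GNU_SOURCE
#define FAKE_USE_GNU 1
#endif
#endif
#ifdef FAKE_USE_GNU
#define FAKE_STRTOLL_L_DECLARED 1  // stands in for the strtoll_l() prototype
#endif
#endif

constexpr bool gnu_source_defined() {
#ifdef FAKE_GNU_SOURCE
  return true;
#else
  return false;
#endif
}

constexpr bool strtoll_l_visible() {
#ifdef FAKE_STRTOLL_L_DECLARED
  return true;
#else
  return false;
#endif
}

// The public macro is set in time for fake_stdlib.h, yet the symbol stays hidden.
static_assert(gnu_source_defined() && !strtoll_l_visible(),
              "internal switch was latched before the macro was defined");
```

Note that this time the stdlib stand-in is being included for the first time, so its own include guard is not the culprit; the stale state lives one header further down.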

Then here comes the third idea: just define _GNU_SOURCE before any inclusion of <features.h>! That way we make sure __USE_GNU is properly defined this time. This works in theory; the problem is that we don’t know exactly when <features.h> will first be included. Almost every C header implicitly includes <features.h> somewhere, which means we need to define _GNU_SOURCE before the inclusion of any C header in libc++’s headers: <cstdio>, <cstdlib>, <cstring>, etc.

Defining _GNU_SOURCE in <cstdio>, <cstdlib>, etc. seems like no big deal. Doing that, we don’t need the C++ compiler to predefine _GNU_SOURCE for libc++, and user code won’t be polluted by _GNU_SOURCE anymore. Flawless, isn’t it? In fact, no. Recall the purpose of avoiding _GNU_SOURCE in the first place: to prevent user code from being polluted by non-standard symbols. With our “solution”, even though _GNU_SOURCE is no longer predefined, the symbols hidden behind it are still exposed to user code anyway. So this isn’t a “real” solution.
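Here is a small sketch of why the symbols leak regardless. It is a simulation with hypothetical names (FAKE_GNU_SOURCE, fake_nonstandard_extension), and it goes one step further than described above by having the wrapper header #undef the macro afterwards, purely as an assumption to show that even this extra tidying would not help.

```cpp
#include <cassert>

// --- stand-in for a libc++ wrapper header such as <cstdlib> ---
#define FAKE_GNU_SOURCE 1  // defined by the library, not by the user

//   --- stand-in for the C library header it includes ---
#ifdef FAKE_GNU_SOURCE
inline int fake_nonstandard_extension() { return 1; }  // "hidden" symbol
#endif

#undef FAKE_GNU_SOURCE  // hypothetical clean-up of the macro itself

// --- user code, which never asked for GNU extensions ---
constexpr bool user_sees_macro() {
#ifdef FAKE_GNU_SOURCE
  return true;
#else
  return false;
#endif
}

// This compiles, so the declaration is visible to the user even though
// the macro is gone.
inline bool user_can_call_extension() {
  return fake_nonstandard_extension() == 1;
}

static_assert(!user_sees_macro(), "macro was cleaned up");
```

The macro is only a switch; once the declarations have been emitted into the translation unit, removing the switch does not take them back.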

This issue is not as trivial as it first appears; no wonder it has never been fixed despite frequent complaints. A large part of the nastiness is due to the abuse of feature test macros in libc; perhaps when C++ modules become a reality [2], C++ library writers won’t be bothered by macro pollution anymore.

[1] http://web.mit.edu/darwin/src/modules/gcc3/libstdc++-v3/docs/html/faq/#3_5
[2] http://clang.llvm.org/docs/Modules.html