Build GNU-free executables with clang

Previously we discussed how to build an LLVM C++ runtime stack with libc++, libc++abi and libunwind. Along with musl, we now have a GNU-free C/C++ runtime environment. Unfortunately, clang is used to living with GCC and glibc, and it takes some extra effort to make it work with our new environment. In my last post, I demonstrated how to wrap clang to build GNU-free executables. As the name of this project implies, we want a native clang, not an ugly wrapper. So in this post, I’m going to show you how to build a native clang that works “out of the box”.

Before getting into it, let’s first analyze what dependencies a program built by clang typically has. For C programs, it’s of course glibc; for C++ programs there’s also libstdc++. Yet there’s a lesser-known library that every program relies on: libgcc. Sometimes it’s statically linked into the executable; other times it’s dynamically linked in the form of “libgcc_s”. libgcc is a low-level runtime library provided by GCC. More information can be found at: https://gcc.gnu.org/onlinedocs/gccint/Libgcc.html. LLVM has its own replacement for it, compiler-rt, which we’ll need later.

The biggest obstacle preventing clang from working with musl is that musl has its own dynamic linker, which clang does not recognize. A naive workaround is to rename musl’s linker to the same name as glibc’s, but that would obviously mess up the whole system. We’ll have to take an alternative approach (one I’m personally reluctant to take): modifying clang/LLVM’s source code.
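To see which dynamic linker a given executable actually asks for, you can look at the program interpreter recorded in its program headers. The following is only a small sketch: interp_of is a helper name I made up for illustration, and it simply filters the output of readelf -l.

```shell
# Print the dynamic linker (program interpreter) found in `readelf -l`
# output read from stdin. interp_of is a hypothetical helper name,
# not a binutils command.
interp_of() {
    sed -n 's/.*Requesting program interpreter: \([^]]*\)].*/\1/p'
}
```

Use it as: readelf -l a.out | interp_of. On x86_64, a glibc-linked executable typically names /lib64/ld-linux-x86-64.so.2, while a musl-linked one names /lib/ld-musl-x86_64.so.1.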

Two rudimentary patches that work on x86_64 platforms can be found here: https://github.com/zzlei/musl-clang. As their names imply, one patch is for the LLVM source root, and the other for clang. I assume you’ve already checked out LLVM, clang and compiler-rt to the right locations, say $LLVM, $LLVM/tools/clang and $LLVM/projects/compiler-rt respectively. After applying the patches, issue the following commands to build them all together:

$ mkdir $LLVM/build && cd $LLVM/build
$ cmake -DGCC_INSTALL_PREFIX=/usr \
    -DDEFAULT_SYSROOT=/usr/x86_64-pc-linux-musl \
    -DCLANG_DEFAULT_CXX_STDLIB=libc++ \
    -DLLVM_DEFAULT_TARGET_TRIPLE=x86_64-pc-linux-musl ..
$ make

Some explanations:
DEFAULT_SYSROOT tells clang where to find musl’s headers and libraries. It points to the location where the musl toolchain is installed.
GCC_INSTALL_PREFIX specifies where GCC is installed. clang needs this to find crtbegin.o and crtend.o. This part is a bit thorny, as neither musl nor clang provides these files. We’ll need to replace them with some other vendor’s later in this project.
CLANG_DEFAULT_CXX_STDLIB tells clang to use libc++ by default. A vanilla clang on Linux uses libstdc++ unless told otherwise.
LLVM_DEFAULT_TARGET_TRIPLE informs clang that we’re targeting musl libc; without this, clang won’t find the correct dynamic linker.
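One way to check what these settings do to the link step, without actually linking anything, is clang’s -### option, which prints the commands the driver would run (on stderr). The sketch below picks the dynamic linker and the crt*.o startup objects out of that output. driver_link_facts is a made-up helper name, and the filter assumes that no path on the link line contains spaces.

```shell
# Read `clang -###` output on stdin and report the dynamic linker and the
# crt*.o startup objects that the driver would hand to the linker.
# driver_link_facts is a hypothetical helper name; it assumes that
# arguments are double-quoted and that paths contain no spaces.
driver_link_facts() {
    tr ' ' '\n' | tr -d '"' | awk '
        prev == "-dynamic-linker" { print "dynamic-linker: " $0 }
        /crt[a-zA-Z0-9]*\.o$/      { print "crt object: " $0 }
        { prev = $0 }'
}
```

Use it as: ./bin/clang -### hello.c 2>&1 | driver_link_facts. If the target triple took effect, the dynamic linker line should name musl’s linker, and the crtbegin.o and crtend.o lines show where the driver found those files.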

After putting the freestanding C++ runtime libraries we previously built under /usr/x86_64-pc-linux-musl/usr/lib, we should have a native clang that “almost” works out of the box. Why “almost”? Because we still need to feed one option to clang: “-rtlib=compiler-rt”, indicating the use of compiler-rt instead of libgcc. I’m still struggling to set this option permanently at build time; hopefully I don’t have to modify too much of clang’s code to achieve this…

Now, let’s take a final look at our product:

$ ./bin/clang++ hello.cc -rtlib=compiler-rt
$ readelf -d a.out | grep NEEDED
0x0000000000000001 (NEEDED) Shared library: [libc++.so.1]
0x0000000000000001 (NEEDED) Shared library: [libc++abi.so.1]
0x0000000000000001 (NEEDED) Shared library: [libc.so]

Great!
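If you’d rather not eyeball the list every time, the same check can be scripted. This is just a sketch: is_gnu_free is a name I made up, and the pattern only covers libgcc_s, libstdc++ and glibc’s libc.so.6, so it is not an exhaustive list of GNU runtime libraries.

```shell
# Read `readelf -d` output on stdin and succeed only if no NEEDED entry
# names libgcc_s, libstdc++ or glibc's libc.so.6. Offending lines are
# printed. is_gnu_free is a hypothetical helper name, and the list of
# libraries is illustrative rather than complete.
is_gnu_free() {
    ! grep -E '\(NEEDED\).*\[(libgcc_s\.so|libstdc\+\+\.so|libc\.so\.6)'
}
```

Use it as: readelf -d a.out | is_gnu_free && echo "GNU-free". musl’s libc.so does not trip the check, because the pattern only matches glibc’s versioned name libc.so.6.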