libc++abi vs. libcxxrt

If you’ve read my previous posts, you may already know that a C++ standard library needs something called an “ABI library” to do certain low-level work for it. In the case of libc++, it supports libsupc++, libc++abi or libcxxrt as its ABI library. Among them, libsupc++ is not very well-known because it’s a part of GCC and usually not shipped standalone. In this post, we’ll be concentrating on libc++abi and libcxxrt.

libc++abi is LLVM’s follow-up effort after it successfully developed a new C++ standard library, libc++. Developed by the same party, libc++abi works seamlessly with libc++. It’s now used on Apple’s platforms.

libcxxrt is developed by PathScale [1], a commercial compiler vendor, together with the BSD communities. It’s now used in the three BSD derivatives and has proved to be of industry quality.

While refactoring the ebuild for libc++, I hit a tough question: which ABI library to use as the default. libc++abi seems a more natural choice to me because it’s from the same bloodline as libc++. But libcxxrt should also be solid enough since it’s used in those rock-steady BSDs without any problem. Out of curiosity, I decided to do a simple benchmark between the two rivals.

I’m using a C++ program suggested by lu_zero for this benchmark [2]. I linked this program with libc++abi or libcxxrt (dynamically or statically), ran it three times and recorded the average runtime under each configuration. OpenMP is disabled for static linking because, on Gentoo, the OpenMP library used by clang is currently available only as a shared library. The test machine is a Gentoo VM with 4 cores and 4 GB of memory. The parameter SPP (in the source code) is changed from its default to 100 to reduce runtime. Below are the benchmark figures:

libc++abi, shared: 50.178s
libcxxrt, shared: 54.558s

libc++abi, static: 1m38.484s
libcxxrt, static: 1m35.134s

As the figures show, libc++abi is slightly faster with dynamic linking but a bit slower with static linking. Considering system fluctuation, the difference is nearly negligible. Since it shares the same bloodline as libc++, I’m going to keep libc++abi as the default ABI library in Gentoo 🙂
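
To put the gaps in proportion, the averages above can be turned into percentages. This is only a small sketch: pct_slower is a hypothetical helper name, and the static timings are converted to seconds by hand (1m35.134s = 95.134s, 1m38.484s = 98.484s).

```shell
# pct_slower FAST SLOW: print how much slower SLOW is than FAST, in percent.
# (hypothetical helper, not part of the benchmark itself)
pct_slower() {
    LC_ALL=C awk -v fast="$1" -v slow="$2" \
        'BEGIN { printf "%.1f\n", (slow - fast) / fast * 100 }'
}

# Shared linking: libc++abi 50.178s vs. libcxxrt 54.558s
pct_slower 50.178 54.558    # prints 8.7

# Static linking: libcxxrt 95.134s vs. libc++abi 98.484s
pct_slower 95.134 98.484    # prints 3.5
```

Three-run averages alone do not show how much of each gap is run-to-run noise, so these percentages are best read as rough indications.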

 

[1] http://www.pathscale.com/
[2] https://github.com/lighttransport/nanort/tree/master/examples/path_tracer

GSoC 2016: code submission status

This post serves as a tracker of code submitted in the domain of this GSoC project.

This summer, I worked out a bunch of patches that enhance clang/llvm with support for musl-libc, and had those patches contributed upstream. With these patches, clang is now able to correctly link binaries with musl.

llvm musl-libc support:
http://llvm.org/viewvc/llvm-project?view=revision&revision=272660
http://llvm.org/viewvc/llvm-project?view=revision&revision=273726

clang musl-libc support:
http://llvm.org/viewvc/llvm-project?view=revision&revision=272662
http://llvm.org/viewvc/llvm-project?view=revision&revision=272825
http://llvm.org/viewvc/llvm-project?view=revision&revision=273735
http://llvm.org/viewvc/llvm-project?view=revision&revision=277985

There’s still a pending compatibility issue that prevents llvm itself from being built on musl as is. musl’s and llvm’s developers have different views on this issue [1], and I haven’t yet found a solution that pleases both sides. Currently we’re using a downstream patch in Gentoo to make llvm and musl compatible [2].

To have clang not link binaries with libgcc, I contributed another patch to clang that allows compiler-rt to be used as the default runtime library:
http://llvm.org/viewvc/llvm-project?view=revision&revision=276848

With those upstream enhancements, I wrote several ebuilds for Gentoo to construct a GCC-free C++ runtime environment, including:

  • create new packages for LLVM’s libunwind and libc++abi
  • enhance libc++ to support libc++abi and libunwind
  • make llvm compatible with musl-libc
  • enhance clang to support libc++ as the default stdlib and compiler-rt as the default rtlib
  • create a profile for using clang as the default compiler in Gentoo

Code submitted to Gentoo:
https://github.com/gentoo/gentoo/commits/master?author=zzlei

Code under review for Gentoo:
https://github.com/gentoo/gentoo/pull/2048
https://github.com/gentoo/gentoo/pull/2049

Once the pending pull requests are merged, I’ll deliver a proper Gentoo stage3 with clang as the default compiler and all packages (except for the kernel) built with clang.

 

[1] http://www.openwall.com/lists/musl/2014/04/15/5
[2] https://github.com/gentoo/gentoo/blob/master/sys-devel/llvm/files/llvm-3.8-musl-fixes.patch

Use clang as a native compiler in Gentoo

My GSoC project for Gentoo this year is soon coming to an end. There are still a few missing pieces, but the goal, supporting clang as a native compiler, is almost achieved. There’s nothing preventing you from using clang natively in your system now, and I’ll show you how in this post 🙂

First, you should install a GCC-free C++ runtime stack, composed of libunwind, libcxxrt and libcxx. There are two versions of libunwind, one from nongnu and the other from LLVM. I encourage you to use the LLVM version, which proved more robust in my previous experiments [1]. To explicitly install LLVM’s libunwind, type:

$ emerge llvm-libunwind

followed by:

$ USE=libunwind emerge libcxx

libcxxrt will be pulled in automatically by emerge since libcxx depends on it. If we don’t explicitly install llvm-libunwind, the nongnu version will be pulled in instead.

Then it’s time to install clang:

$ USE='clang default-libcxx default-compiler-rt' emerge llvm

The USE flags default-libcxx and default-compiler-rt tell clang to use libc++ as the default C++ library and compiler-rt as the default runtime library, in place of libstdc++ and libgcc respectively, thus removing the dependence on GCC. If your system is based on musl-libc, you also need to add -sanitize to the USE flags to disable LLVM’s sanitizers, which won’t compile on musl at the moment.
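
For comparison, on a clang built without these two USE flags, the same libraries can be requested per invocation with clang’s -stdlib and -rtlib options. The wrapper below is a minimal sketch; clangxx_gcc_free is a hypothetical name, and it assumes libc++ and compiler-rt are already installed where clang can find them.

```shell
# clangxx_gcc_free ARGS...: run clang++ with libc++ and compiler-rt selected
# explicitly, which is what the two USE flags turn into defaults.
# (hypothetical wrapper for illustration)
clangxx_gcc_free() {
    clang++ -stdlib=libc++ -rtlib=compiler-rt "$@"
}
```

It would be called like clang++ itself, for example clangxx_gcc_free -o hello hello.cpp, where hello.cpp stands for any C++ source file.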

Remember to apply the ~amd64 keyword to all the packages involved, since the features mentioned above are only available in the latest versions of those packages. Additionally, the latest version of LLVM is currently masked for testing; you need to put the following line in package.unmask to unmask it:

# in file /etc/portage/profile/package.unmask
=sys-devel/llvm-3.8.1-r1

Then we are ready to go! Now you can use clang to compile any C/C++ program and the resulting binary will be GCC-free: no dependence on libgcc or libstdc++. Also set CC=clang and CXX=clang++ to ensure all your future packages are compiled by clang. In fact, I encourage you to rebuild @world right away, after which all packages in your system will be GCC-free:

$ emerge -e @world
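
As a rough illustration, the CC and CXX settings mentioned above could look like this. Putting them in /etc/portage/make.conf is an assumption on my part; that file is the usual home for build settings on Gentoo.

```shell
# Assumed location: /etc/portage/make.conf
# Have Portage build packages with clang instead of gcc.
CC="clang"
CXX="clang++"
```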

NOTE: there’s one package you should pay attention to: libmnl. It’s pulled in by another package, iproute2, and unfortunately is mis-compiled by clang [1]. There are two solutions: 1) apply the ~amd64 keyword to libmnl so the latest version is used, which works with clang; 2) apply the USE flag minimal to iproute2 so it doesn’t need libmnl at all.
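
Either workaround comes down to one line of Portage configuration. The sketch below writes that line into a directory given as an argument, so it can be tried on a scratch directory first. apply_libmnl_workaround is a hypothetical helper, the category names net-libs and sys-apps are my assumption, and it assumes package.accept_keywords and package.use are plain files rather than directories.

```shell
# apply_libmnl_workaround CONFIG_DIR OPTION   (hypothetical helper)
#   OPTION 1: accept ~amd64 for libmnl, so the newer, clang-friendly version is used
#   OPTION 2: build iproute2 with USE=minimal, so libmnl is not needed at all
apply_libmnl_workaround() {
    config_dir=$1
    case $2 in
        1) echo 'net-libs/libmnl ~amd64' >> "$config_dir/package.accept_keywords" ;;
        2) echo 'sys-apps/iproute2 minimal' >> "$config_dir/package.use" ;;
        *) echo "usage: apply_libmnl_workaround CONFIG_DIR 1|2" >&2; return 1 ;;
    esac
}
```

On a real system CONFIG_DIR would typically be /etc/portage, and only one of the two options is needed.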

At this point, you may wonder if we can just uninstall GCC once and for all. Unfortunately, the answer is no. There are two cases where GCC is still irreplaceable. The first is when you upgrade your kernel. Currently clang is not capable of compiling the kernel without heavy patching, so you still need the good old GCC.

The second case is when you compile C++ code. When I say the binary built by clang is GCC-free, it’s actually not 100% true 🙁 There are two pieces from GCC needed by every C++ program: crtbegin and crtend; unfortunately they’re not provided by any library mentioned above. I tried borrowing the implementation of these two files from NetBSD, and they seem to work right out of the box. But another problem is that clang on Linux is hardcoded to use GCC’s crtbegin/end; NetBSD’s crtbegin/end won’t be recognized unless clang’s behavior is altered. This issue isn’t unresolvable and I’ll see if I can find a workaround for it. For now, just don’t remove your GCC 🙂

This project is not finished. There are two pieces to be delivered soon: 1) a new package libcxxabi, which is developed by LLVM, will replace libcxxrt as the default C++ ABI library; 2) a new profile for native clang where USE flags, keywords and masks are all set appropriately so you don’t need to do it yourself.

Stay tuned!

 

[1] https://blogs.gentoo.org/gsoc2016-native-clang/2016/07/24/a-new-gentoo-stage4-musl-clang/

A new Gentoo stage4: musl + clang

I’m glad to announce that I just successfully deployed the GNU-free toolchain, which I’ve been working on so far, into a musl-based Gentoo stage4. Everything in this stage4, except for the kernel, is built by clang with non-GNU runtime libs. I’m now using this as my main Linux system, and luckily nothing has broken so far 🙂

To be more specific, the runtime environment of this stage4 is composed of the following components:

  • C runtime: musl
  • C++ runtime
    • C++ stdlib: libc++
    • C++ ABI lib: libcxxrt [1]
    • stack unwinding lib: libunwind
  • compiler: clang

Let’s take program /usr/bin/ld.gold for example, which is written in C++, and see its runtime dependencies:

$ ldd /usr/bin/ld.gold
	/lib/ld-musl-x86_64.so.1 (0x559f8b598000)
	libz.so.1 => /lib/libz.so.1 (0x7f5a5b7a9000)
	libc++.so.1 => /usr/lib/libc++.so.1 (0x7f5a5b6ea000)
	libcxxrt.so.1 => /usr/lib/libcxxrt.so.1 (0x7f5a5b4cd000)
	libunwind.so.1 => /usr/lib/libunwind.so.1 (0x7f5a5b4c4000)
	libc.so => /lib/ld-musl-x86_64.so.1 (0x559f8b598000)

Apparently none of the above libs is related to gcc. So this program is just free from gcc!
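
The same check can be done mechanically by filtering the ldd output for the two GCC runtime libraries discussed in these posts, libgcc_s and libstdc++. This is a minimal sketch; gnu_runtime_deps is a hypothetical helper name, and the sample input is the ld.gold listing shown above.

```shell
# gnu_runtime_deps: read ldd output on stdin and print only the lines that
# name libgcc_s or libstdc++. Exits non-zero when nothing matches.
# (hypothetical helper for illustration)
gnu_runtime_deps() {
    grep -E 'libgcc_s|libstdc\+\+'
}

# Try it on the listing above; on a live system, pipe in `ldd /usr/bin/ld.gold`.
gnu_runtime_deps <<'EOF' || echo "no GCC runtime libraries found"
/lib/ld-musl-x86_64.so.1 (0x559f8b598000)
libz.so.1 => /lib/libz.so.1 (0x7f5a5b7a9000)
libc++.so.1 => /usr/lib/libc++.so.1 (0x7f5a5b6ea000)
libcxxrt.so.1 => /usr/lib/libcxxrt.so.1 (0x7f5a5b4cd000)
libunwind.so.1 => /usr/lib/libunwind.so.1 (0x7f5a5b4c4000)
libc.so => /lib/ld-musl-x86_64.so.1 (0x559f8b598000)
EOF
```

Note that libc++.so.1 does not match the libstdc++ pattern, so for this listing the filter finds nothing and the fallback message is printed.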

Actually, I figured out how to glue clang and those non-GNU C++ runtime libs together weeks ago, but this is the first time I’ve put it into serious use. The most challenging part was rebuilding @world with this toolchain; I was really nervous that countless packages would break after I issued the command emerge -e @world. Surprisingly enough, only two packages broke.

The first broken package is iproute2, which depends on libmnl, and libmnl is mis-compiled by clang. libmnl has an unpleasant past with clang [2], but its new versions are fine. So the fix is simple: just upgrade libmnl to the latest version available (use ~amd64).

The second broken package, ironically, is clang/llvm itself. It turned out the libunwind I was using lacks a few functions essential to llvm. There are actually two versions of libunwind available: one is from nongnu [3], which has existed in Gentoo’s repo for a while; the other is developed by LLVM and is almost functionally equivalent to the former. Initially I just used the nongnu version, since there was no package for LLVM’s libunwind yet. Unfortunately, the nongnu one works fine with everything except llvm itself. I just had to stop being lazy and write an ebuild for LLVM’s libunwind. Luckily again, it has everything we need, including those functions missing in the other libunwind.

After that, while using this new system, I encountered a few other broken packages. But they’re incompatible with either musl or the gold linker (yes, it’s the /usr/bin/ld.gold I just showed you). There’s nothing wrong with clang or the C++ runtime libs.

I also found an interesting fact: the building of gcc involves several steps of bootstrapping, and the final executable has no dependence on any external C++ runtime libs, though gcc itself is written in C++. This means gcc still works even if clang or the C++ runtime gets broken. So I don’t need to worry about breaking packages since I can still rebuild them with gcc 🙂

Now that I have a working stage4, the next step is to make gcc-config support clang, so clang can act like a real native compiler.

Stay tuned!

 

[1] libcxxrt is functionally equivalent to libc++abi, and is planned to be replaced by the latter later in this project
[2] https://bugs.chromium.org/p/chromium/issues/detail?id=548786
[3] http://www.nongnu.org/libunwind/

Hello GSoC 2016!

I’m very glad that I’ve been accepted by Gentoo as a participant in Google Summer of Code this year. During this summer, I’ll be working on building a clang-based toolchain for Gentoo.

Clang is a modern C/C++ compiler developed by LLVM, famous for its modular design and non-intrusive license. The ideal result of this project is to provide a Gentoo profile, where clang is the default compiler in place of gcc.

As clang is written in C++, it needs a C++ runtime to work, which basically consists of a C standard library, a C++ standard library, a C++ ABI library and a stack unwinder. On a typical Linux host, glibc and libstdc++ are the de facto C and C++ standard libraries respectively. The functionality of the C++ ABI library is also integrated in libstdc++; the stack unwinder is implemented in libgcc.

libstdc++ and libgcc are both parts of GCC, which won’t be available when we deploy clang as the default compiler. Luckily, besides clang, LLVM also developed a complete implementation of the C++ runtime, consisting of three libraries: libc++, libc++abi and libunwind. Unlike in GCC, the C++ ABI library is implemented separately. To decouple our toolchain further from the GNU toolset, we’ll use musl as the libc.

Sum it up

In this project, we’ll build a toolchain with clang as the compiler, musl as libc and a C++ runtime composed of libc++, libc++abi and libunwind. If everything goes smoothly, this setup will be offered as a Gentoo profile; users who like the neat features of clang thus have the chance to say goodbye to GCC 🙂

I’ll update this blog regularly to reflect my most recent progress and share technical stuff that might be helpful to others. Stay tuned!