The C++11 ABI incompatibility problem in Gentoo

Gentoo allows users to have multiple versions of gcc installed and we (mostly?) support systems where userland is partially build with different versions.  There are both advantages and disadvantages to this and in this post, I’m going to talk about one of the disadvantages, the C++11 ABI incompatibility problem.  I don’t exactly have a solution, but at least we can define what the problem is and track it [1].

First what is C++11?  Its a new standard of C++ which is just now making its way through GCC and clang as experimental.  The current default standard is C++98 which you can verify by just reading the defined value of __cplusplus using the preprocessor.

$  g++ -x c++ -E -P - <<< __cplusplus
199711L
$  g++ -x c++ --std=c++98 -E -P - <<< __cplusplus
199711L
$  g++ -x c++ --std=c++11 -E -P - <<< __cplusplus
201103L

This shouldn’t be surprising, even good old C has standards:

$ gcc -x c -std=c90 -E -P - <<< __STDC_VERSION__
__STDC_VERSION__
$ gcc -x c -std=c99 -E -P - <<< __STDC_VERSION__
199901L
$ gcc -x c -std=c11 -E -P - <<< __STDC_VERSION__
201112L

We’ll leave the interpretation of these values as an exercise to the reader.  [2]

The specs for these different standards at least allow for different syntax and semantics in the language.  So here’s an example of how C++98 and C++11 differ in this respect:

// I build with both --std=c++98 and --std=c++11
#include <iostream>
using namespace std;
int main() {
    int i, a[] = { 5, -3, 2, 7, 0 };
    for (i = 0; i < sizeof(a)/sizeof(int); i++)
        cout << a[i] << endl ;
    return 0;
}
// I build with only --std=c++11
#include <iostream>
using namespace std;
int main() {
    int a[] = { 5, -3, 2, 7, 0 };
    for (auto& x : a)
        cout << x << endl ;
    return 0;
}

I think most people would agree that the C++11 way of iterating over arrays (or other objects like vectors) is sexy.  In fact C++11 is filled with sexy syntax, especially when it come to its threading and atomics, and so coders are seduced.  This is an upstream choice and it should be reflected in their build system with –std= sprinkled where needed.  I hope you see why you should never add –std= to your CFLAGS or CXXFLAGS.

The syntactic/semantic differences is the first “incompatiblity” and it is really not our problem downstream.  Our problem in Gentoo comes because of ABI incompatibilities between the two standards arrising from two sources: 1) Linking between objects compiled with –std=c++98 and –std=c++11 is not guaranteed to work.  2) Neither is linking between objects both compiled with –std=c+11 but with different versions of GCC differing in their minior release number.  (The minor release number is x in gcc-4.x.y.)

To see this problem in action, let’s consider the following little snippet of code which uses a C++11 only function [3]

#include <chrono>
using namespace std;
int main() {
    auto x = chrono::steady_clock::now;
}

Now if we compile that with gcc-4.8.3 and check its symbols we get the following:

$ $ g++ --version
g++ (Gentoo Hardened 4.8.3 p1.1, pie-0.5.9) 4.8.3
$ g++ --std=c++11 -c test.cpp
$ readelf -s test.o
Symbol table '.symtab' contains 12 entries:
Num:    Value          Size Type    Bind   Vis      Ndx Name
  0: 0000000000000000     0 NOTYPE  LOCAL  DEFAULT  UND
  1: 0000000000000000     0 FILE    LOCAL  DEFAULT  ABS test.cpp
  2: 0000000000000000     0 SECTION LOCAL  DEFAULT    1
  3: 0000000000000000     0 SECTION LOCAL  DEFAULT    3
  4: 0000000000000000     0 SECTION LOCAL  DEFAULT    4
  5: 0000000000000000     0 SECTION LOCAL  DEFAULT    6
  6: 0000000000000000     0 SECTION LOCAL  DEFAULT    7
  7: 0000000000000000     0 SECTION LOCAL  DEFAULT    5
  8: 0000000000000000    78 FUNC    GLOBAL DEFAULT    1 main
  9: 0000000000000000     0 NOTYPE  GLOBAL DEFAULT  UND _GLOBAL_OFFSET_TABLE_
 10: 0000000000000000     0 NOTYPE  GLOBAL DEFAULT  UND _ZNSt6chrono3_V212steady_
 11: 0000000000000000     0 NOTYPE  GLOBAL DEFAULT  UND __stack_chk_fail

We can now confirm that that symbol is in fact in libstdc++.so for 4.8.3 but NOT for 4.7.3 as follows:

$ readelf -s /usr/lib/gcc/x86_64-pc-linux-gnu/4.8.3/libstdc++.so.6 | grep _ZNSt6chrono3_V212steady_
  1904: 00000000000e5698     1 OBJECT  GLOBAL DEFAULT   13 _ZNSt6chrono3_V212steady_@@GLIBCXX_3.4.19
  3524: 00000000000c8b00    89 FUNC    GLOBAL DEFAULT   11 _ZNSt6chrono3_V212steady_@@GLIBCXX_3.4.19
$ readelf -s /usr/lib/gcc/x86_64-pc-linux-gnu/4.7.3/libstdc++.so.6 | grep _ZNSt6chrono3_V212steady_
$

Okay, so we’re just seeing an example of things in flux.  Big deal?  If you finish linking test.cpp and check what it links against you get what you expect:

$ g++ --std=c++11 -o test.gcc48 test.o
$ ./test.gcc48
$ ldd test.gcc48
        linux-vdso.so.1 (0x000002ce333d0000)
        libstdc++.so.6 => /usr/lib/gcc/x86_64-pc-linux-gnu/4.8.3/libstdc++.so.6 (0x000002ce32e88000)
        libm.so.6 => /lib64/libm.so.6 (0x000002ce32b84000)
        libgcc_s.so.1 => /usr/lib/gcc/x86_64-pc-linux-gnu/4.8.3/libgcc_s.so.1 (0x000002ce3296d000)
        libc.so.6 => /lib64/libc.so.6 (0x000002ce325b1000)
        /lib64/ld-linux-x86-64.so.2 (0x000002ce331af000)

Here’s where the wierdness comes in.  Suppose we now switch to gcc-4.7.3 and repeat.  Things don’t quite work as expected:

$ g++ --version
g++ (Gentoo Hardened 4.7.3-r1 p1.4, pie-0.5.5) 4.7.3
$ g++ --std=c++11 -o test.gcc47 test.cpp
$ ldd test.gcc47
        linux-vdso.so.1 (0x000003bec8a9c000)
        libstdc++.so.6 => /usr/lib/gcc/x86_64-pc-linux-gnu/4.8.3/libstdc++.so.6 (0x000003bec8554000)
        libm.so.6 => /lib64/libm.so.6 (0x000003bec8250000)
        libgcc_s.so.1 => /usr/lib/gcc/x86_64-pc-linux-gnu/4.8.3/libgcc_s.so.1 (0x000003bec8039000)
        libc.so.6 => /lib64/libc.so.6 (0x000003bec7c7d000)
        /lib64/ld-linux-x86-64.so.2 (0x000003bec887b000)

Note that it says its linking against 4.8.3/libstdc++.so.6 and not 4.7.3.  That’s because of the order in which the library paths are search is defined in /etc/ld.so.conf.d/05gcc-x86_64-pc-linux-gnu.conf and this file is sorted that way it is on purpose.  So maybe it’ll run!  Let’s try:

$ ./test.gcc47
./test.gcc47: relocation error: ./test.gcc47: symbol _ZNSt6chrono12steady_clock3nowEv, version GLIBCXX_3.4.17 not defined in file libstdc++.so.6 with link time reference

Nope, no joy.  So what’s going on?  Let’s look at the symbols in both test.gcc47 and test.gcc48:

$ readelf -s test.gcc47  | grep chrono
  9: 0000000000000000     0 FUNC    GLOBAL DEFAULT  UND _ZNSt6chrono12steady_cloc@GLIBCXX_3.4.17 (4)
 50: 0000000000000000     0 FUNC    GLOBAL DEFAULT  UND _ZNSt6chrono12steady_cloc
$ readelf -s test.gcc48  | grep chrono
  9: 0000000000000000     0 FUNC    GLOBAL DEFAULT  UND _ZNSt6chrono3_V212steady_@GLIBCXX_3.4.19 (4)
 49: 0000000000000000     0 FUNC    GLOBAL DEFAULT  UND _ZNSt6chrono3_V212steady_

Whoah!  The symbol wasn’t mangled the same way!  Looking more carefully at *all* the chrono symbols in 4.8.3/libstdc++.so.6 and 4.7.3/libstdc++.so.6 we see the problem.

$ readelf -s /usr/lib/gcc/x86_64-pc-linux-gnu/4.8.3/libstdc++.so.6 | grep chrono
  353: 00000000000e5699     1 OBJECT  GLOBAL DEFAULT   13 _ZNSt6chrono3_V212system_@@GLIBCXX_3.4.19
 1489: 000000000005e0e0    86 FUNC    GLOBAL DEFAULT   11 _ZNSt6chrono12system_cloc@@GLIBCXX_3.4.11
 1605: 00000000000e1a3f     1 OBJECT  GLOBAL DEFAULT   13 _ZNSt6chrono12system_cloc@@GLIBCXX_3.4.11
 1904: 00000000000e5698     1 OBJECT  GLOBAL DEFAULT   13 _ZNSt6chrono3_V212steady_@@GLIBCXX_3.4.19
 2102: 00000000000c8aa0    86 FUNC    GLOBAL DEFAULT   11 _ZNSt6chrono3_V212system_@@GLIBCXX_3.4.19
 3524: 00000000000c8b00    89 FUNC    GLOBAL DEFAULT   11 _ZNSt6chrono3_V212steady_@@GLIBCXX_3.4.19
$ readelf -s /usr/lib/gcc/x86_64-pc-linux-gnu/4.7.3/libstdc++.so.6 | grep chrono
 1478: 00000000000c6260    72 FUNC    GLOBAL DEFAULT   12 _ZNSt6chrono12system_cloc@@GLIBCXX_3.4.11
 1593: 00000000000dd9df     1 OBJECT  GLOBAL DEFAULT   14 _ZNSt6chrono12system_cloc@@GLIBCXX_3.4.11
 2402: 00000000000c62b0    75 FUNC    GLOBAL DEFAULT   12 _ZNSt6chrono12steady_cloc@@GLIBCXX_3.4.17

Only 4.7.3/libstdc++.so.6 has _ZNSt6chrono12steady_cloc@@GLIBCXX_3.4.17.  Normally when libraries change their exported symbols, they change their SONAME, but this is not the case here, as running `readelf -d` on both shows.  GCC doesn’t bump the SONAME that way for reasons explained in [4].  Great, so just switch around the order of path search in /etc/ld.so.conf.d/05gcc-x86_64-pc-linux-gnu.conf.  Then we get the problem the other way around:

$ ./test.gcc47
$ ./test.gcc48
./test.gcc48: /usr/lib/gcc/x86_64-pc-linux-gnu/4.7.3/libstdc++.so.6: version `GLIBCXX_3.4.19' not found (required by ./test.gcc48)

So no problem if your system has only gcc-4.7.  No problem if it has only 4.8.  But if it has both, then compiling C++11 with 4.7 and linking against libstdc++ for 4.8 (or vice versa) and you get breakage at the binary level.  This is the C++11 ABI incompatibility problem in Gentoo.  As an exercise for the reader, fix!

Ref.

[1] Bug 542482 – (c++11-abi) [TRACKER] c++11 abi incompatibility

[2] This is an old professor’s trick for saying, hey go find out why c90 doesn’t define a value for __STDC_VERSION__ and let me know, ‘cuz I sure as hell don’t!

[3] This example was inspired by bug #513386.  You can verify that it requires –std=c++11 by dropping the flag and getting yelled at by the compiler.

[4] Upstream explains why in comment #5 of GCC bug #61758.  The entire bug is dedicated to this issue.