The upcoming clang 16 release features substantial improvements to configuration file support. Notably, it adds support for specifying multiple files and better default locations. This enabled Gentoo to finally replace the default-* flags used on sys-devel/clang, effectively giving our users the ability to change defaults without rebuilding the whole of clang.
This change has also been partially backported to clang 15.0.2 in Gentoo, and (unless major problems are reported) will be part of the stable clang 15.x release (currently planned for the upcoming 15.0.3).
In this post, I’d like to briefly describe the new configuration file features, how much of this has been backported to 15.x in Gentoo, and how defaults are going to be selected from now on.
Configuration file support in clang 16.x
Configuration files have been supported for at least a few clang releases now, but they weren’t very useful for us before. With clang 16, I have taken the opportunity to finally change that.
In clang 16, configuration files can be both specified explicitly and loaded from default locations. Default configuration files are loaded first (unless explicitly disabled by --no-default-config or a non-empty CLANG_NO_DEFAULT_CONFIG envvar, the latter intended to be used in clang’s test suites), and configuration files specified via --config= options are loaded afterwards. This permits explicit files to override the options specified in default configs. However, it should be noted that some values are appended rather than overridden, and there is no way to “reset” them right now.
As built in Gentoo, clang looks for configuration files in two locations: /etc/clang and the executable directory. Technically, the build system also permits specifying a “user” configuration directory, but it’s not practically useful, as it provides no way of referencing the user’s home directory. Effectively, we only use /etc/clang. The sys-devel/clang-common package installs a default set of configuration files there.
The default config lookup algorithm looks for <triple>-<driver>.cfg first. If this file is not found, it looks for separate <triple>.cfg and <driver>.cfg files and loads both. This enables the first location to be used as an override, without suffering from the appending problem. Triple is the effective target triple (i.e. accounting for options such as --target= and -m32) and driver is the string corresponding to the driver mode, e.g. clang, clang++ or clang-cpp (but it does not account for the -x c++ option!).
So for example, on a typical amd64 system clang will first try:
    /etc/clang/x86_64-pc-linux-gnu-clang.cfg
with fallback to loading both of:
    /etc/clang/x86_64-pc-linux-gnu.cfg
    /etc/clang/clang.cfg
If -m32 is used, this will be i386-pc-linux-gnu* instead. If clang++ is called, this will be *clang++.cfg, etc.
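The lookup order described above can be sketched as a small shell function. This is only a model for reasoning about which files get picked up, not clang's actual code: default_configs is a hypothetical helper name, it looks at a single directory only, and the relative order of the two fallback files is an assumption (the rule above only says that both are loaded).

```shell
# Hypothetical helper: print the default config file(s) that would be
# loaded from one directory, following the lookup rule described above.
# Usage: default_configs <dir> <triple> <driver>
default_configs() {
    dir=$1 triple=$2 driver=$3

    # 1. A combined <triple>-<driver>.cfg wins outright.
    if [ -f "$dir/$triple-$driver.cfg" ]; then
        printf '%s\n' "$dir/$triple-$driver.cfg"
        return 0
    fi

    # 2. Otherwise load <triple>.cfg and <driver>.cfg, whichever exist.
    #    (The order between these two is an assumption of this sketch.)
    for name in "$triple" "$driver"; do
        if [ -f "$dir/$name.cfg" ]; then
            printf '%s\n' "$dir/$name.cfg"
        fi
    done
}
```

Calling default_configs /etc/clang x86_64-pc-linux-gnu clang++ would list the files relevant to a clang++ invocation on amd64. Because the combined file short-circuits the fallback, it replaces the other two files rather than adding to them.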
Explicit configuration files are specified using the --config=<file> option. They are loaded after the default configs, in the order they are listed. They can either be specified by full path, or by bare filename. In the latter case, clang looks for them in the directories listed earlier.
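To make the path-versus-filename distinction concrete, here is a minimal sketch of how a --config= argument could be resolved. resolve_config is a hypothetical helper, the directories are passed in by the caller, and treating any argument that contains a slash as a path is my reading of the clang manual rather than something tested here.

```shell
# Hypothetical helper: resolve a --config= argument to a file.
# Usage: resolve_config <name-or-path> <dir>...
resolve_config() {
    name=$1
    shift
    case $name in
        */*)
            # Contains a slash: treated as a path, no directory search.
            if [ -f "$name" ]; then
                printf '%s\n' "$name"
                return 0
            fi
            return 1
            ;;
    esac
    # Bare filename: the first directory that has it wins.
    for dir in "$@"; do
        if [ -f "$dir/$name" ]; then
            printf '%s\n' "$dir/$name"
            return 0
        fi
    done
    return 1
}
```

For example, resolve_config gentoo-runtimes.cfg /etc/clang searches /etc/clang for the file, while resolve_config ./my.cfg /etc/clang only ever considers ./my.cfg.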
Configuration files use response file syntax, i.e. you specify command-line options inside them as you would pass them on the command-line. They also support including additional files via @<filename> syntax. An example configuration file could specify:
-I/opt/mystuffs/include -L/opt/mystuffs/lib @gentoo-runtimes.cfg
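When debugging a chain of @ includes, it can help to flatten a configuration file into the list of options it contributes. The sketch below does that under heavy simplifications: expand_cfg is a hypothetical helper, it splits on plain whitespace only (no quoting or backslash escapes), it skips lines starting with #, and it assumes relative includes are resolved against the directory of the including file, which is how I understand the clang manual.

```shell
# Hypothetical helper: print the options a config file contributes, one per
# line, following @<file> includes.
# Simplified: no quoting, no escapes, no include-cycle detection.
# The body runs in a subshell so that recursive calls keep their own variables.
expand_cfg() (
    set -f                              # do not glob-expand option words
    file=$1
    dir=$(dirname "$file")
    while IFS= read -r line || [ -n "$line" ]; do
        case $line in
            '#'*) continue ;;           # whole-line comment
        esac
        for word in $line; do           # naive whitespace splitting
            case $word in
                @/*) expand_cfg "${word#@}" ;;        # absolute include
                @*)  expand_cfg "$dir/${word#@}" ;;   # relative include
                *)   printf '%s\n' "$word" ;;
            esac
        done
    done < "$file"
)
```

Running expand_cfg on a file with the contents shown above would print the -I and -L options followed by whatever gentoo-runtimes.cfg in the same directory expands to.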
Full documentation: Clang Compiler User’s Manual: configuration files.
The backport to clang 15.0.2
There are notable differences in configuration file support in clang 15.x:
- the new --config=<file> spelling is not supported, you have to use --config <file> (yes, compilers don’t follow getopt_long rules)
- only one configuration file can be loaded
- --config disables loading default configuration files
- there is no explicit --no-default-config option
The rules for loading configuration files were also different (and not very reliable). I have mostly backported the new lookup rules. However, since this version does not support loading multiple configuration files (and I did not want to diverge too far from vanilla), only <triple>-<driver>.cfg and <driver>.cfg names are supported (i.e. the pure-triple variant is not).
I think this divergence is acceptable because Gentoo did not enable configuration file support before (i.e. did not specify the system configuration directory), so there is no reason to assume that any regular Gentoo users would have relied on the prior logic.
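For comparison with the clang 16 behavior, the reduced lookup in the 15.x backport can be modelled as follows. default_config_15 is a hypothetical helper that considers a single directory, and the assumption that the combined name is tried before <driver>.cfg follows from the rules being a backport of the new ones.

```shell
# Hypothetical helper modelling the Gentoo 15.x backport: at most one file
# is loaded, and <triple>.cfg is never considered.
# Usage: default_config_15 <dir> <triple> <driver>
default_config_15() {
    dir=$1 triple=$2 driver=$3
    for name in "$triple-$driver" "$driver"; do
        if [ -f "$dir/$name.cfg" ]; then
            printf '%s\n' "$dir/$name.cfg"
            return 0                    # first match wins, nothing else loads
        fi
    done
    return 1                            # no default config found
}
```

The practical consequence is that a bare <triple>.cfg placed in /etc/clang has no effect on 15.x, whereas on 16 it would be loaded whenever no combined file exists.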
Use of configuration files in Gentoo
We currently install two “base” configuration files: gentoo-gcc-install.cfg and gentoo-runtimes.cfg.
gentoo-gcc-install.cfg is used to provide the path to the GCC installation. Its initial contents are e.g.:
    # This file is maintained by gcc-config.
    # It is used to specify the selected GCC installation.
    --gcc-install-dir=/usr/lib/gcc/x86_64-pc-linux-gnu/12.2.0
The intent is that gcc-config would update this file when switching between versions and clang would not have to include logic for reading from Gentoo-specific files anymore. However, the gcc-config part is not yet implemented, and it is not clear if this solution will be workable long-term, as it probably breaks support for sysroots.
gentoo-runtimes.cfg is used to control the default runtimes and link editor used. Its initial contents are controlled by USE flags on the clang-common package and are e.g.:
    # This file is initially generated by sys-devel/clang-runtime.
    # It is used to control the default runtimes using by clang.
    --rtlib=libgcc
    --unwindlib=libgcc
    --stdlib=libstdc++
    -fuse-ld=bfd
On top of these two files, we install the actual configuration files for the three driver modes relevant to Gentoo compilations to use: clang.cfg, clang++.cfg and clang-cpp.cfg. All of them have the following contents:
    # This configuration file is used by clang driver.
    @gentoo-runtimes.cfg
    @gentoo-gcc-install.cfg
Effectively, they defer to loading the two base files.
Future use of config files
Why now, you might ask? After all, configuration files have been there for a while now, and I haven’t taken the effort to make them usable before. Well, it all started with the total mayhem caused by clang 15.x becoming more strict. While we managed to convince upstream to revert that change and defer it to 16.x, it became important for us to be able to easily test for the breakage involved in clang changing its behavior.
One part of this effort was starting to package clang 16 snapshots. If you’d like to help test them, you can unmask them using the following package.accept_keywords snippet:
    =dev-ml/llvm-ocaml-16.0.0_pre* **
    =dev-python/lit-16.0.0_pre* **
    =dev-util/lldb-16.0.0_pre* **
    =dev-python/clang-python-16.0.0_pre* **
    =sys-devel/clang-16.0.0_pre* **
    =sys-devel/clang-common-16.0.0_pre* **
    =sys-devel/clang-runtime-16.0.0_pre* **
    =sys-devel/lld-16.0.0_pre* **
    =sys-devel/llvm-16.0.0_pre* **
    =sys-devel/llvm-common-16.0.0_pre* **
    =sys-libs/compiler-rt-16.0.0_pre* **
    =sys-libs/compiler-rt-sanitizers-16.0.0_pre* **
    =sys-libs/libcxx-16.0.0_pre* **
    =sys-libs/libcxxabi-16.0.0_pre* **
    =sys-libs/libcxxrt-16.0.0_pre* **
    =sys-libs/libomp-16.0.0_pre* **
    =sys-libs/llvm-libunwind-16.0.0_pre* **
    =dev-libs/libclc-16.0.0_pre* **
    ~sys-devel/llvmgold-16 **
    ~sys-devel/clang-toolchain-symlinks-16 **
    ~sys-devel/lld-toolchain-symlinks-16 **
    ~sys-devel/llvm-toolchain-symlinks-16 **
The other part is providing support for configuration files that can be used to easily pass -Werror= and -Wno-error= flags reliably to all clang invocations, and therefore adjust its behavior while testing.
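As an illustration of what such a testing configuration might look like, the sketch below writes a small config file that promotes a few diagnostics to errors. write_werror_cfg is a hypothetical helper, and the specific warnings listed are my own picks for the example, not a list Gentoo ships.

```shell
# Hypothetical helper: write a config file that adjusts which diagnostics
# are treated as errors. The warnings chosen here are illustrative only.
# Usage: write_werror_cfg <output-file>
write_werror_cfg() {
    cat > "$1" <<'EOF'
# Promote selected diagnostics to errors for a test build.
-Werror=implicit-function-declaration
-Werror=implicit-int
# Keep this one as a plain warning even if -Werror is in effect.
-Wno-error=int-conversion
EOF
}

# Example (the --config= spelling needs clang 16):
#   write_werror_cfg /tmp/strict.cfg
#   clang --config=/tmp/strict.cfg -c foo.c
```

To apply such a file to every invocation rather than a single command, one option would presumably be to add an @ line for it to the driver configuration files in /etc/clang, in the same way they already pull in the two base files.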
With this, we should be able to assess the damage earlier. However, between the size of the clang 16 tracker and upstream continuing to make the behavior more strict, I’m starting to have doubts whether clang will continue to be a general-purpose compiler that can be reliably used as a replacement for GCC, and not just a corporation-driven tool used to compile a few big projects.