The impact of C++ templates on library ABI

Author: Michał Górny
Date: 20 Aug 2012
Copyright: http://creativecommons.org/licenses/by/3.0/
Source: http://dev.gentoo.org/~mgorny/articles/the-impact-of-cxx-templates-on-library-abi.rst

Preamble

The general aspect of maintaining binary compatibility of C++ library interfaces has already been covered thoroughly many times. A good list of articles on the topic can be found on the wiki page of the ABI compliance checker tool [1]. Sadly, those articles usually cover C++ templates only briefly, if at all.

In fact, the topic is fairly complex, and I believe that, given the overall usefulness and popularity of templates, it deserves a more thorough treatment. Thus, in this article I will try to address the issues arising from the use of templates, along with methods of dealing with them and of preventing them.

Both the overall topic of templates as a programming technique and the wide topic of ABI are already explained in detail in many other articles and guides. Moreover, I do not believe I am fluent enough to cover those topics in detail here. Thus, I will assume that the reader of this article is already familiar with both the general topic of templates in C++ and the basic aspects of an ABI and its compatibility.

Moreover, in the problems and solutions listed here I will assume that the toolchain in question conforms to the C++98 standard and properly supports templates with regard to multiple instantiations.

[1] http://ispras.linuxbase.org/index.php/ABI_compliance_checker#Articles

Research question: integral type sizes on various platforms

I’m a bit curious about the sizes of various integral types on different platforms, and I’d really appreciate a little help from people running uncommon architectures or toolchains. I’ve prepared a little package which just tries to get various type sizes using the C++ compiler, and I’d be grateful if you could run it and paste the results in a comment.

To run it:

wget http://dev.gentoo.org/~mgorny/cxx-type-sizes-0.tar.bz2
tar -xf cxx-type-sizes-0.tar.bz2
cd cxx-type-sizes-0/
./configure
make
cat output/_all

It will try to compile a few programs, and then run them. Then it concatenates the results into output/_all and that’s the file I’d like to get, along with your platform, toolchain, CHOST and ARCH, ABI and everything else you consider relevant.
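A program of this kind can be very small. The following is only a sketch of the idea, not the actual contents of the package: the `sizes_as_text` function, the `REPORT_SIZE` macro, the choice of types and the output format are all made up for illustration. It sticks to C++98, so `long long` is left out.

```cpp
#include <cstddef>
#include <sstream>
#include <string>

// Hypothetical helper macro: append "type: size" for a single type.
#define REPORT_SIZE(out, type) \
    ((out) << #type << ": " << sizeof(type) << '\n')

// Collect the sizes of a few common types, one per line.
// Sizes are expressed in units of sizeof(char), which is always 1.
std::string sizes_as_text()
{
    std::ostringstream out;
    REPORT_SIZE(out, char);
    REPORT_SIZE(out, short);
    REPORT_SIZE(out, int);
    REPORT_SIZE(out, long);
    REPORT_SIZE(out, wchar_t);
    REPORT_SIZE(out, std::size_t);
    REPORT_SIZE(out, std::ptrdiff_t);
    REPORT_SIZE(out, void*);  // not integral, listed only for comparison
    return out.str();
}
```

Writing the returned string to standard output from `main` would give a listing comparable in spirit to one of the files under output/, although the real package may well format its results differently.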

I’d really like to get a single output for each architecture, and possibly additional outputs if some toolchain or other magic produced different results than the previous one. I’ll then put the results into a nice table. Thanks in advance.

Current results.

GitHub — or how to re-centralize a DVCS

What are the most important advantages of git? I think one that should come up pretty early is that it is «distributed», or «decentralized». This simply means that the actual, complete repository moves along with the project. At some point, you might think that where it is hosted shouldn’t matter that much.

Although git doesn’t facilitate working completely without a centralized server (because you need to find updates somewhere), it should be pretty clear that the repository content should be independent of the hosting service. In other words, the hosting service should serve the repository, not dictate its contents.

I think all the madness started on Google Code project hosting. There, the project wikis were hosted as a subdirectory of the svnroot (e.g. in gecko-mediaplayer sources). I’m not sure if I can say «it is wrong». On one hand, it’s bad to keep completely separate codebases in the same repository. On the other, the design of Subversion is simply pure madness, and so everything in the repository follows it…

Then again, Google got git right. When a particular project decides to use git there, it gets three separate repositories (look at pkgcore sources for an example).

GitHub got wikis right as well. But GitHub Pages… They actually misuse branches in a horrible, messy way. Just look at how to create project pages manually: they tell you to create an «orphan» branch!

In other words, they tell you to create two repositories in a single repository. Two independent histories. Complete madness! And whether you want it or not, you pull them with every single clone you do. Yes, that could be some kind of advantage but nevertheless it has nothing to do with the source code.

There’s also this old, ignored issue that they encourage you to rename your README file to use their invented suffix just to have it rendered correctly. Once again, the hosting service enforces the layout of your repository. And I get really angry getting all those README.md.bz2 in my docdir. This is all against the purpose of markup…

In short, the sole purpose of markup formats like Markdown, reStructuredText or asciidoc is to provide complete markup on top of plain text, text which should still be completely usable in any regular text viewer. And this means that their naming should also follow the common text file naming rules, which means either uppercase names on *nix or a .txt suffix on Windows. No custom .md, and certainly not .asciidoc!

It’s really sad that the most common git hosting sites, instead of encouraging people to use git correctly, force them to hack around it to achieve some minor madness.

vim: smart C/C++ boilerplate templates

A simple vim script for those who are interested. It is triggered when new C/C++ files are created (e.g. via vim new-file.cxx), and fills them in with boilerplate unit code. What’s special about it is that it tries to find hints about that code in other files in the same or a parent directory.

A C API for C++ and Python ones — or making of libh2o

Lately I have spent a lot of time working on a small project of mine called libh2o. Its goal is to provide a library of routines implementing IAPWS IF97 equations for water and steam properties. The core is written in C, and it provides a nice-to-use API for C++ and Python.

At first, I thought about not providing a «high level» C API at all. It was like: if you want to use plain C, you’ve gotta glue all the low-level equations yourself. However, after some thinking I decided to provide one, and built the two remaining APIs (C++ and Python) on top of it.

The main reason for doing this was that Python (well, CPython) is written in C. Although I’ve seen people writing Python extensions in C++, and even using some C++ features to make them a little nicer, that’s still a bunch of ugly C hacks and pointer casts. I don’t see a really good reason to write a Python extension in C++, nor to make it depend on a C++ compiler when it’s all limited to the C-based CPython API anyway.

And that meant that I either had to duplicate all the high-level logic in the Python extension, or just create a C API first and reuse that. Since the whole logic was simple enough to be covered completely and clearly in C, I chose the latter.

As happens when people choose C, I had to implement some kind of poor man’s object orientation. Not something as extensive (and ugly) as GObject (someone, please kill it!) but the few bits necessary to keep the state. In other words, a structure keeping the «object» and a bunch of nicely named functions taking it as their first argument.

Before I learnt C++, I would have assumed that the object structure should be a private (and obscured) blob, and the object type should be a pointer to an incomplete type. The user should just grab that pointer from a «constructor», pass it around and finally free it through a «destructor». Advantage: the exact struct contents are not part of the ABI.
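To make that first approach concrete, here is a minimal sketch of an opaque handle. It is not libh2o code: the `demo_state` type, its two fields and all function names are hypothetical. It is written so that it compiles as C++; in plain C the header part would additionally need a `typedef struct demo_state demo_state;`.

```cpp
#include <cstdlib>

// --- what would go into the public header (all names hypothetical) ---
extern "C" {
struct demo_state;                              // incomplete type: layout stays hidden
demo_state* demo_state_new(double p, double t); // the «constructor»
double demo_state_get_p(const demo_state* s);
double demo_state_get_t(const demo_state* s);
void demo_state_free(demo_state* s);            // the «destructor»
}

// --- what would stay inside the library's private source file ---
struct demo_state {
    double p;   // assumed meaning: pressure
    double t;   // assumed meaning: temperature
};

extern "C" {
demo_state* demo_state_new(double p, double t)
{
    // The object always lives on the heap; the caller only sees a pointer.
    demo_state* s = static_cast<demo_state*>(std::malloc(sizeof(demo_state)));
    if (s) {
        s->p = p;
        s->t = t;
    }
    return s;   // null on allocation failure
}

double demo_state_get_p(const demo_state* s) { return s->p; }
double demo_state_get_t(const demo_state* s) { return s->t; }

void demo_state_free(demo_state* s) { std::free(s); }
}
```

Because callers never see the members, fields can be added or reordered without breaking them; the price is a heap allocation and a mandatory `demo_state_free` call for every object.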

But now I’ve decided to go the other way, a way similar to how C++ classes work. I’ve created a structure with explicitly listed private fields (and a very simple /*private:*/ comment), and used that as the public type. It doesn’t need to keep any memory allocated, and is simple enough to be allocated on the stack. Advantages: no need for a destructor, and the ability to embed that struct in the C++ class which will wrap it.

Then the usual stuff: a bunch of functions with common prefixes. One prefix for the «namespace», another one for the function (new, get…). All in a nice and clear fashion, either to be used directly or wrapped in the C++ or Python APIs.
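The overall shape of such an API might look roughly like the sketch below. None of it is taken from libh2o: the `demo_` prefix, the `demo_point_t` struct with its two fields, and the `demo::Point` wrapper are hypothetical names chosen only to show the pattern of a public struct with «private» fields, prefixed functions, and a C++ class embedding the struct by value.

```cpp
// --- C-style API: public struct plus prefixed functions (hypothetical names) ---
extern "C" {
typedef struct demo_point {
    /*private:*/
    double p_;  // assumed meaning: pressure
    double t_;  // assumed meaning: temperature
} demo_point_t;

// "demo_" plays the role of the namespace, "point_new"/"point_get_*" name the function.
demo_point_t demo_point_new(double p, double t)
{
    demo_point_t pt;
    pt.p_ = p;
    pt.t_ = t;
    return pt;  // returned by value: nothing to allocate, nothing to free
}

double demo_point_get_p(const demo_point_t* pt) { return pt->p_; }
double demo_point_get_t(const demo_point_t* pt) { return pt->t_; }
}

// --- C++ wrapper built on top of the C API ---
namespace demo {
class Point {
    demo_point_t raw_;  // the C struct is embedded directly, so no destructor is needed
public:
    Point(double p, double t) : raw_(demo_point_new(p, t)) {}
    double p() const { return demo_point_get_p(&raw_); }
    double t() const { return demo_point_get_t(&raw_); }
};
}
```

A C caller would write `demo_point_t pt = demo_point_new(2.0, 350.0);` and pass `&pt` to the getters, while a C++ caller would use `demo::Point` directly. The flip side of this design is that the struct layout is now visible to callers and therefore becomes part of the ABI.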