{"id":745,"date":"2019-07-01T20:17:46","date_gmt":"2019-07-01T20:17:46","guid":{"rendered":"http:\/\/blogs.gentoo.org\/lu_zero\/?p=745"},"modified":"2019-07-01T20:21:25","modified_gmt":"2019-07-01T20:21:25","slug":"building-crates-so-they-look-like-cabi-libraries","status":"publish","type":"post","link":"https:\/\/blogs.gentoo.org\/lu_zero\/2019\/07\/01\/building-crates-so-they-look-like-cabi-libraries\/","title":{"rendered":"Building crates so they look like C(ABI) Libraries"},"content":{"rendered":"<blockquote><p>I presented <a href=\"https:\/\/github.com\/lu-zero\/cargo-c\">cargo-c<\/a> at <a href=\"https:\/\/rustlab.it\">rustlab 2019<\/a>; here is a longer follow-up to <a href=\"https:\/\/github.com\/lu-zero\/rustlab-it-2019\">this<\/a>.<\/p><\/blockquote>\n<h2>Mixing Rust and C<\/h2>\n<p>One of the best selling points of <strong>rust<\/strong> is being highly <strong>interoperable<\/strong> with the C-ABI, in addition to <strong>safety<\/strong>, <strong>speed<\/strong> and its amazing community.<\/p>\n<p>This comes in really handy when you have <strong>well optimized<\/strong> hand-crafted asm kernels you&#8217;d like to use as they are:<\/p>\n<ul>\n<li>They are small and have a clear interface, usually with strict boundaries on what they read\/write by their very nature.<\/li>\n<li>You&#8217;d basically rewrite them as they are using some <a href=\"https:\/\/doc.rust-lang.org\/unstable-book\/language-features\/asm.html\">inline assembly<\/a> for dubious gains.<\/li>\n<li>Both <a href=\"https:\/\/github.com\/alexcrichton\/cc-rs\">cc-rs<\/a> and <a href=\"https:\/\/crates.io\/crates\/nasm-rs\">nasm-rs<\/a> make the process of building and linking relatively painless.<\/li>\n<\/ul>\n<p>Also, if you plan to integrate some rust component into a foreign-language project, it is quite straightforward to link the <code>staticlib<\/code> produced by cargo into your main project.<\/p>\n<p>If you have a pure-rust crate and you want to export it to the world as if it were a normal
C (shared\/dynamic) library, it gets quite gory.<\/p>\n<h2>Well behaved C-API Library structure<\/h2>\n<p>Usually when you want to use a C-library in your own project you should expect it to provide the following:<\/p>\n<ul>\n<li>A header file, telling the compiler which symbols it should expect<\/li>\n<li>A static library<\/li>\n<li>A dynamic library<\/li>\n<li>A pkg-config file giving you directions on <strong>where<\/strong> to find the header and <strong>what<\/strong> you need to pass to the linker to correctly link the library, be it static or dynamic<\/li>\n<\/ul>\n<h3>Header file<\/h3>\n<p>In C you usually keep a list of function prototypes and type definitions in a separate file and then include it in your source file to let the compiler know what to expect.<\/p>\n<p>Since you rely on a quite simple <a href=\"https:\/\/en.wikipedia.org\/wiki\/C_preprocessor\">preprocessor<\/a> to do that, you have to be careful about adding <strong>guards<\/strong> so the file does not get included more than once and, in order to avoid clashes, you install it in a subdirectory of your include dir.<\/p>\n<p>Since the location of the header might not be part of the default search path, you usually store this information in <strong>pkg-config<\/strong>.<\/p>\n<h3>Static Libraries<\/h3>\n<p>Static libraries are quite <strong>simple<\/strong> in concept (and execution):<\/p>\n<ul>\n<li>they are an <a href=\"https:\/\/en.wikipedia.org\/wiki\/Ar_%28Unix%29\">archive<\/a> of object code files.<\/li>\n<li>the linker simply reads them as it would read freshly produced <code>.o<\/code> files and links everything together.<\/li>\n<\/ul>\n<p>There is a <strong>pitfall<\/strong> though:<\/p>\n<ul>\n<li>On some platforms, even if you want to make a fully static binary, you end up dynamically linking some system library for a number of reasons.<br \/>\n<blockquote><p>The worst offenders are the <strong>pthread<\/strong> libraries and in some cases the compiler builtins (e.g.
<a href=\"https:\/\/gcc.gnu.org\/onlinedocs\/gccint\/Libgcc.html\"><code>libgcc_s<\/code><\/a>)<\/p><\/blockquote>\n<\/li>\n<li>The information about which libraries these are is usually not known<\/li>\n<\/ul>\n<p><strong>rustc<\/strong> comes to the rescue with <a href=\"https:\/\/doc.rust-lang.org\/nightly\/rustc\/command-line-arguments.html#a--print-print-compiler-information\"><code>--print native-static-libs<\/code><\/a>. It isn&#8217;t the best example of integration, since it&#8217;s a string produced on <strong>stderr<\/strong> and it behaves as a side-effect of the actual building, but it is still a good step in the right direction.<\/p>\n<p><a href=\"https:\/\/www.freedesktop.org\/wiki\/Software\/pkg-config\/\">pkg-config<\/a> is the de-facto standard way to preserve the information and have the build systems know about it (I guess you are seeing a pattern now).<\/p>\n<h3>Dynamic Libraries<\/h3>\n<p>A <a href=\"https:\/\/en.wikipedia.org\/wiki\/Library_(computing)#Shared_library\">shared<\/a> or dynamic library is a specially crafted lump of executable code that gets linked to the binary as it is being executed.<br \/>\nThe advantages compared to statically linking everything are mainly two:<\/p>\n<ul>\n<li><strong>Saving disk space<\/strong>: without link-time pruning you end up carrying multiple copies of the same library with every binary using it.<\/li>\n<li><strong>Safer and simpler updates<\/strong>: If you need to update, say, openssl, you do that once instead of updating the 100+ consumers of it on your system.<\/li>\n<\/ul>\n<p>Getting this feature right involves some inherent complexity and constraints; the most problematic one is <strong>ABI stability<\/strong>:<\/p>\n<ul>\n<li>The dynamic linker needs to find the symbols the binary expects and have them with the correct size<\/li>\n<li>If you change the in-memory layout of a struct or how the function names are represented you should make sure the linker is
aware.<\/li>\n<\/ul>\n<p>Usually that means that depending on your platform you have some versioning information you should provide when you are preparing your library. This can be as simple as telling the compile-time linker to embed the version information (e.g. <a href=\"https:\/\/developer.apple.com\/library\/archive\/documentation\/DeveloperTools\/Conceptual\/DynamicLibraries\/100-Articles\/CreatingDynamicLibraries.html#\/\/apple_ref\/doc\/uid\/TP40002073-SW20\">Mach-O dylib<\/a> or <a href=\"https:\/\/www.ibm.com\/developerworks\/library\/l-shobj\/\">ELF<\/a>) in the library or as complex as crafting a <a href=\"https:\/\/sourceware.org\/binutils\/docs\/ld\/VERSION.html#VERSION\">version script<\/a>.<\/p>\n<p>Compared to crafting a <code>staticlib<\/code> there are more moving parts and more platform-specific knowledge is required.<\/p>\n<p>Sadly in this case <strong>rustc<\/strong> does not provide any help for now: even if the C-ABI is stable and set in stone, the <strong>rust<\/strong> <a href=\"https:\/\/github.com\/rust-lang\/rfcs\/pull\/2603\">mangling strategy<\/a> is not finalized yet, and it is a large part of being ABI stable, so the work on fully supporting dynamic libraries is yet to be completed.<\/p>\n<p>Dynamic libraries on most platforms have a means to store which other dynamic libraries they rely on and the paths in which to look for them.
When the information is incomplete, or you are storing the library in a non-standard path, <strong>pkg-config<\/strong> comes to the rescue again, helpfully storing the information for you.<\/p>\n<h3>Pkg-config<\/h3>\n<p>It is your single point of truth as long as your build system supports it and the libraries you want to use craft it properly.<br \/>\nIt simplifies your life a lot if you want to keep around multiple versions of a library or you are doing non-system packaging (e.g.: <a href=\"https:\/\/brew.sh\/\">Homebrew<\/a> or <a href=\"https:\/\/wiki.gentoo.org\/wiki\/Project:Prefix\">Gentoo Prefix<\/a>).<br \/>\nBesides the <code>search path<\/code>, <code>link line<\/code> and <code>dependency<\/code> information I mentioned above, it also stores the library version and inter-library compatibility relationships.<br \/>\nIf you are publishing a C-library and you aren&#8217;t providing a <code>.pc<\/code> file, <strong>please<\/strong> consider doing it.<\/p>\n<h2>Producing a C-compatible library out of a crate<\/h2>\n<p>I explained what we are expected to produce; now let&#8217;s see what we can do on the <strong>rust<\/strong> side:<\/p>\n<ul>\n<li>We need to export C-ABI-compatible symbols, that means we have to:<\/li>\n<li>Decorate the data types we want to export with <code>#[repr(C)]<\/code><\/li>\n<li>Decorate the functions with <code>#[no_mangle]<\/code> and prefix them with <code>extern \"C\"<\/code><\/li>\n<li>Tell <code>rustc<\/code> the crate type is both <code>staticlib<\/code> and <code>cdylib<\/code><\/li>\n<li>Pass <code>rustc<\/code> the platform-correct link line so the library produced has the right information inside.<br \/>\n<blockquote><p><strong>NOTE<\/strong>: On some platforms, besides the version information, the install path must also be encoded in the library.<\/p><\/blockquote>\n<\/li>\n<li>Generate the header file so that the C compiler knows about them.<\/li>\n<li>Produce a <code>pkg-config<\/code> file with the correct information<br
\/>\n<blockquote><p><strong>NOTE<\/strong>: It requires knowing where the whole lot will be eventually installed.<\/p><\/blockquote>\n<\/li>\n<\/ul>\n<p><code>cargo<\/code> does not support installing libraries at all (since for now rust dynamic libraries should <strong>not<\/strong> be used at all) so we are a bit on our own.<\/p>\n<p>For <a href=\"https:\/\/github.com\/xiph\/rav1e\">rav1e<\/a> I did that the hard way and then I came up with an easy way for you to use (and that I used for doing the same again with <a href=\"https:\/\/github.com\/RustAudio\/lewton\">lewton<\/a>, spending about 1\/2 day instead of several <del>weeks<\/del> months).<\/p>\n<h3>The hard way<\/h3>\n<blockquote><p>As seen in <a href=\"https:\/\/github.com\/lu-zero\/crav1e\">crav1e<\/a>, you can explore the history there.<\/p><\/blockquote>\n<p>It isn&#8217;t the fully hard way since before <a href=\"https:\/\/github.com\/lu-zero\/cargo-c\">cargo-c<\/a> there were already nice tools to avoid some time-consuming tasks, such as <a href=\"https:\/\/github.com\/eqrion\/cbindgen\/\">cbindgen<\/a>.<br \/>\nIn a terse summary what I had to do was:<\/p>\n<ul>\n<li>Come up with an external build system since <code>cargo<\/code> itself cannot install anything and has no direct knowledge of the install path information. I used <code>Make<\/code> since it is simple and sufficiently widespread; anything richer would probably get in the way and be more time consuming to set up.<\/li>\n<li>Figure out how to extract the information provided in <code>Cargo.toml<\/code> so I have it at <code>Makefile<\/code> level. I gave up and duplicated it since parsing <code>toml<\/code> or <code>json<\/code> is pointlessly complicated for a prototype.<\/li>\n<li>Write down the platform-specific logic on how to build (and install) the libraries. It ended up living in the <code>build.rs<\/code> and the <code>Makefile<\/code>.
Thanks again to <a href=\"https:\/\/github.com\/dwbuiten\">Derek<\/a> for taking care of the <strong>Windows<\/strong>-specific details.<\/li>\n<li>Use <a href=\"https:\/\/github.com\/eqrion\/cbindgen\/\">cbindgen<\/a> to generate the C header (and in the process smooth some of its rough edges).<\/li>\n<li>Since we already have a build system, add more targets for testing and continuous integration purposes.<\/li>\n<\/ul>\n<p>If you do not want to use <a href=\"https:\/\/crates.io\/crates\/cargo-c\">cargo-c<\/a>, I spun off the <code>cdylib<\/code>-link line logic into a <a href=\"https:\/\/crates.io\/crates\/cdylib-link-lines\">stand alone crate<\/a> so you can use it in your <code>build.rs<\/code>.<\/p>\n<h3>The easier way<\/h3>\n<p>Using a <code>Makefile<\/code> and a separate crate with a customized <code>build.rs<\/code> works fine and keeps the developers that care just about writing in rust fully shielded from the gory details and contraptions presented above.<\/p>\n<p>But it comes with some additional churn:<\/p>\n<ul>\n<li>Keeping the API in sync<\/li>\n<li>Duplicating the release work<\/li>\n<li>Having the users confused about where to report issues or where to find the actual sources. (Users tend to miss the information presented in obvious places such as the README way too often.)<\/li>\n<\/ul>\n<p>So to try to minimize it I came up with a <code>cargo<\/code> applet that provides two subcommands:<\/p>\n<ul>\n<li><strong>cbuild<\/strong> to build the libraries, the .pc file and header.<\/li>\n<li><strong>cinstall<\/strong> to install the whole lot if already built, or to build and then install it.<\/li>\n<\/ul>\n<p>They are two subcommands since it is quite common to <strong>build<\/strong> as user and then <strong>install<\/strong> as root.
If you are using <code>rustup<\/code> and root does not have cargo you can get away with using <code>--destdir<\/code> and then <code>sudo install<\/code> or craft your local package if your distribution provides a means to do that.<\/p>\n<p>Everything I mentioned in the hard way happens under the hood and, barring bugs in the current implementation, you should be completely oblivious of the details.<\/p>\n<h3>Using cargo-c<\/h3>\n<blockquote><p>As seen in <a href=\"https:\/\/github.com\/RustAudio\/lewton\">lewton<\/a> and <a href=\"https:\/\/github.com\/xiph\/rav1e\">rav1e<\/a>.<\/p><\/blockquote>\n<ul>\n<li><a href=\"https:\/\/github.com\/RustAudio\/lewton\/pull\/50\/commits\/557cb4ce35beedf6d6bfaa481f29936094a71669\">Create<\/a> a <code>capi.rs<\/code> with the C-API you want to expose and use <code>#[cfg(cargo_c)]<\/code> to hide it when you build a normal rust library.<\/li>\n<li><a href=\"https:\/\/github.com\/RustAudio\/lewton\/pull\/50\/commits\/e7ea8fff6423213d1892e86d51c0c499d8904dc1\">Make sure<\/a> you have a <code>lib<\/code> target and, if you are using a workspace, that the first member is the crate you want to export; that means that you might have <a href=\"https:\/\/github.com\/xiph\/rav1e\/pull\/1381\/commits\/7d558125f42f4b503bcdcda5a82765da76a227e0#diff-80398c5faae3c069e4e6aa2ed11b28c0R94\">to add a <code>\".\"<\/code> member at the start of the list<\/a>.<\/li>\n<li>Remember to <a href=\"https:\/\/github.com\/RustAudio\/lewton\/pull\/51\/files\">add<\/a> a <code>cbindgen.toml<\/code> and fill it with at least the include guard; you probably also want to set the language to <code>C<\/code> (it defaults to <code>C++<\/code>).<\/li>\n<li>Once you are happy with the result, update your documentation to tell the user to install <code>cargo-c<\/code> and run <code>cargo cinstall --prefix=\/usr --destdir=\/tmp\/some-place<\/code> or something along those lines.<\/li>\n<\/ul>\n<h2>Coming next<\/h2>\n<p><a href=\"https:\/\/github.com\/lu-zero\/cargo-c\">cargo-c<\/a> is a
young project and far from being complete even if it is functional for my needs.<\/p>\n<p>Help in improving it is <a href=\"https:\/\/github.com\/lu-zero\/cargo-c\/issues\">welcome<\/a>; there are plenty of rough edges and bugs to find and squash.<\/p>\n<h2>Thanks<\/h2>\n<p>Thanks to <a href=\"https:\/\/github.com\/est31\/\">est31<\/a> and <a href=\"https:\/\/github.com\/sdroege\/\">sdroege<\/a> for the in-depth review in <strong>#rust-av<\/strong> and <a href=\"https:\/\/github.com\/kodabb\/\">kodabb<\/a> for the last minute edits.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>I presented cargo-c at rustlab 2019; here is a longer follow-up to this. Mixing Rust and C One of the best selling points of rust is being highly interoperable with the C-ABI, in addition to safety, speed and its amazing community. This comes in really handy when you have well optimized hand-crafted asm kernels you&#8217;d &hellip; <a href=\"https:\/\/blogs.gentoo.org\/lu_zero\/2019\/07\/01\/building-crates-so-they-look-like-cabi-libraries\/\" class=\"more-link\">Continue reading <span class=\"screen-reader-text\">Building crates so they look like C(ABI)
Libraries<\/span><\/a><\/p>\n","protected":false},"author":10,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"spay_email":"","jetpack_publicize_message":"","jetpack_is_tweetstorm":false,"jetpack_publicize_feature_enabled":true},"categories":[1],"tags":[37,31],"jetpack_publicize_connections":[],"jetpack_featured_media_url":"","jetpack_sharing_enabled":true,"jetpack_shortlink":"https:\/\/wp.me\/p1aGWH-c1","_links":{"self":[{"href":"https:\/\/blogs.gentoo.org\/lu_zero\/wp-json\/wp\/v2\/posts\/745"}],"collection":[{"href":"https:\/\/blogs.gentoo.org\/lu_zero\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/blogs.gentoo.org\/lu_zero\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/blogs.gentoo.org\/lu_zero\/wp-json\/wp\/v2\/users\/10"}],"replies":[{"embeddable":true,"href":"https:\/\/blogs.gentoo.org\/lu_zero\/wp-json\/wp\/v2\/comments?post=745"}],"version-history":[{"count":3,"href":"https:\/\/blogs.gentoo.org\/lu_zero\/wp-json\/wp\/v2\/posts\/745\/revisions"}],"predecessor-version":[{"id":748,"href":"https:\/\/blogs.gentoo.org\/lu_zero\/wp-json\/wp\/v2\/posts\/745\/revisions\/748"}],"wp:attachment":[{"href":"https:\/\/blogs.gentoo.org\/lu_zero\/wp-json\/wp\/v2\/media?parent=745"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/blogs.gentoo.org\/lu_zero\/wp-json\/wp\/v2\/categories?post=745"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/blogs.gentoo.org\/lu_zero\/wp-json\/wp\/v2\/tags?post=745"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}