{"id":280,"date":"2014-06-23T17:26:11","date_gmt":"2014-06-23T15:26:11","guid":{"rendered":"https:\/\/blogs.gentoo.org\/mgorny\/?p=280"},"modified":"2014-06-23T19:23:39","modified_gmt":"2014-06-23T17:23:39","slug":"inlining-marchnative-for-distcc","status":"publish","type":"post","link":"https:\/\/blogs.gentoo.org\/mgorny\/2014\/06\/23\/inlining-marchnative-for-distcc\/","title":{"rendered":"Inlining -march=native for distcc"},"content":{"rendered":"<p><kbd>-march=native<\/kbd> is a\u00a0gcc flag that enables auto-detection of\u00a0CPU architecture and\u00a0properties. Not only does it allow you to avoid finding the\u00a0correct value of\u00a0<kbd>-march=<\/kbd>, but it also enables instruction sets that do not fit any standard CPU profile and\u00a0detects the\u00a0cache sizes.<\/p>\n<p>Sadly, <kbd>-march=native<\/kbd> itself can&#8217;t really work well with distcc. Since the\u00a0detection is performed when compiling, remote gcc invocations would use the\u00a0architecture of\u00a0the\u00a0distcc host rather than the\u00a0client. Therefore, the\u00a0resulting executables would be a\u00a0mix of\u00a0code targeting the\u00a0different architectures of\u00a0the\u00a0distcc hosts.<\/p>\n<p>You may also find <kbd>-march=native<\/kbd> a\u00a0bit opaque. For example, we had multiple bug reports about <a rel='external' href='https:\/\/bugs.gentoo.org\/show_bug.cgi?id=500032'>LLVM failing to build with -march=atom<\/a>. However, some of\u00a0the\u00a0reporters were using <kbd>-march=native<\/kbd>, so we weren&#8217;t able to immediately identify the\u00a0duplicates.<\/p>\n<p>In\u00a0this article, I will briefly guide you through replacing <kbd>-march=native<\/kbd> with expanded compiler flags, for the\u00a0benefit of\u00a0distcc compatibility and\u00a0more explicit build logs.<\/p>\n<p><!--more--><\/p>\n<h3>Obtaining the\u00a0native flags from gcc<\/h3>\n<p>The\u00a0first step towards replacing <kbd>-march=native<\/kbd> is to\u00a0determine which flags are\u00a0enabled by\u00a0it. 
Various people suggest <a rel='external' href='http:\/\/stackoverflow.com\/questions\/5470257\/how-to-see-which-flags-march-native-will-activate'>multiple ways of\u00a0obtaining <kbd>-march=native<\/kbd> flags<\/a>. For\u00a0example, you can use the\u00a0following call:<\/p>\n<pre><code>$ gcc -### -march=native -x c -\r\nUsing built-in specs.\r\nCOLLECT_GCC=\/usr\/x86_64-pc-linux-gnu\/gcc-bin\/4.8.3\/gcc\r\nCOLLECT_LTO_WRAPPER=\/usr\/libexec\/gcc\/x86_64-pc-linux-gnu\/4.8.3\/lto-wrapper\r\nTarget: x86_64-pc-linux-gnu\r\n<em>[\u2026]<\/em>\r\nThread model: posix\r\ngcc version 4.8.3 (Gentoo 4.8.3 p1.1, pie-0.5.9) \r\nCOLLECT_GCC_OPTIONS='-march=native'\r\n \/usr\/libexec\/gcc\/x86_64-pc-linux-gnu\/4.8.3\/cc1 -quiet - <strong>\"-march=k8-sse3\" -mcx16 -msahf -mno-movbe -mno-aes -mno-pclmul -mno-popcnt -mno-abm -mno-lwp -mno-fma -mno-fma4 -mno-xop -mno-bmi -mno-bmi2 -mno-tbm -mno-avx -mno-avx2 -mno-sse4.2 -mno-sse4.1 -mno-lzcnt -mno-rtm -mno-hle -mno-rdrnd -mno-f16c -mno-fsgsbase -mno-rdseed -mno-prfchw -mno-adx -mfxsr -mno-xsave -mno-xsaveopt --param \"l1-cache-size=64\" --param \"l1-cache-line-size=64\" --param \"l2-cache-size=512\" \"-mtune=k8\"<\/strong> -quiet -dumpbase - -auxbase - -fstack-protector -o \/tmp\/cckZDyUR.s\r\n<em>[\u2026]<\/em><\/code><\/pre>\n<p>For those more curious, a\u00a0similar call can be\u00a0made with <kbd>-x c++<\/kbd> for\u00a0the\u00a0C++ compiler flags. The\u00a0expanded optimization flags can be\u00a0found in\u00a0the\u00a0<kbd>cc1<\/kbd> (or\u00a0<kbd>cc1plus<\/kbd> in\u00a0case of\u00a0C++) command line. I\u00a0have highlighted the\u00a0relevant flags \u2014 usually you&#8217;re looking for various <kbd>-m<\/kbd> flags and\u00a0<kbd>--param<\/kbd>s related to caches.<\/p>\n<p>You may also notice <kbd>-fstack-protector<\/kbd> there. 
This is because <a rel='external' href='http:\/\/sources.gentoo.org\/gitweb\/?p=proj\/gentoo-news.git;a=blob;f=2014\/2014-06-15-gcc48_ssp\/2014-06-15-gcc48_ssp.en.txt'>nowadays Gentoo enables it by\u00a0default<\/a>. If you are using a\u00a0non-Gentoo distcc host (why would you have a\u00a0non-Gentoo host in\u00a0the\u00a0first place?), you may want to pass it explicitly as\u00a0well.<\/p>\n<p>You may find the\u00a0above output a\u00a0bit too verbose. While this technically isn&#8217;t a\u00a0problem, it clutters the\u00a0build logs. So, let&#8217;s filter it a\u00a0bit.<\/p>\n<h3>Filtering out redundant flags<\/h3>\n<p>Most of\u00a0the\u00a0<kbd>-m<\/kbd> flags listed above are redundant, being either equivalent to\u00a0the\u00a0defaults, or\u00a0enabled implicitly by\u00a0<kbd>-march<\/kbd>. For\u00a0example, on the\u00a0host providing the\u00a0example output none of\u00a0the\u00a0<kbd>-mno-*<\/kbd> flags were actually required, and\u00a0<kbd>-mfxsr<\/kbd> was enabled implicitly.<\/p>\n<p>You can safely assume that in\u00a0Gentoo all <kbd>-m<\/kbd> flags are disabled by\u00a0default. To find out what flags are implied by\u00a0the\u00a0<kbd>-march<\/kbd>, let&#8217;s look at the\u00a0gcc sources.<\/p>\n<pre><code>$ tar -xf \/var\/cache\/portage\/distfiles\/gcc-4.8.3.tar.bz2\r\n$ find gcc-4.8.3\/gcc\/config -name '*.c' -exec grep k8-sse3 {} +\r\n<strong>gcc-4.8.3\/gcc\/config\/i386\/i386.c<\/strong>:      {\"k8-sse3\", PROCESSOR_K8, CPU_K8,\r\ngcc-4.8.3\/gcc\/config\/i386\/driver-i386.c:\tcpu = \"k8-sse3\";<\/code><\/pre>\n<p>The\u00a0first file has what we&#8217;re looking for. Inside, you can find:<\/p>\n<pre><code>      {\"k8-sse3\", PROCESSOR_K8, CPU_K8,\r\n\tPTA_64BIT | PTA_MMX | PTA_3DNOW | PTA_3DNOW_A | PTA_SSE\r\n\t| PTA_SSE2 | PTA_SSE3 | PTA_NO_SAHF | PTA_PRFCHW | PTA_FXSR},<\/code><\/pre>\n<p>So <kbd>-march=k8-sse3<\/kbd> would enable <kbd>-mmmx<\/kbd>, <kbd>-m3dnow<\/kbd>, <kbd>-msse<\/kbd> and\u00a0so on. 
If\u00a0you compare this list with the\u00a0output obtained before, you will notice that the\u00a0<kbd>-march<\/kbd> option didn't enable any flags that would need to be\u00a0disabled explicitly, so all <kbd>-mno-*<\/kbd> flags can be\u00a0omitted. Similarly, <kbd>-mfxsr<\/kbd> is redundant. But <kbd>-mcx16<\/kbd> and\u00a0<kbd>-msahf<\/kbd> seem relevant since the\u00a0former is not listed there at all, and the\u00a0latter is disabled by\u00a0default.<\/p>\n<p>After filtering out the\u00a0unnecessary flags, we can create both\u00a0distcc- and eye-friendly CFLAGS like:<\/p>\n<pre><code>CFLAGS='-O2 -pipe -march=k8-sse3 -mcx16 -msahf --param l1-cache-size=64 --param l1-cache-line-size=64 --param l2-cache-size=512'<\/code><\/pre>\n","protected":false},"excerpt":{"rendered":"<p>-march=native is a\u00a0gcc flag that enables auto-detection of\u00a0CPU architecture and\u00a0properties. Not only does it allow you to avoid finding the\u00a0correct value of\u00a0-march=, but it also enables instruction sets that do not fit any standard CPU profile and\u00a0detects the\u00a0cache sizes. Sadly, -march=native itself can&#8217;t really work well with distcc. 
Since the\u00a0detection is performed when compiling, remote gcc invocations &hellip; <a href=\"https:\/\/blogs.gentoo.org\/mgorny\/2014\/06\/23\/inlining-marchnative-for-distcc\/\" class=\"more-link\">Continue reading<span class=\"screen-reader-text\"> &#8220;Inlining -march=native for distcc&#8221;<\/span><\/a><\/p>\n","protected":false},"author":137,"featured_media":0,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"jetpack_publicize_message":"","jetpack_is_tweetstorm":false,"jetpack_publicize_feature_enabled":true},"categories":[3],"tags":[],"jetpack_publicize_connections":[],"jetpack_featured_media_url":"","_links":{"self":[{"href":"https:\/\/blogs.gentoo.org\/mgorny\/wp-json\/wp\/v2\/posts\/280"}],"collection":[{"href":"https:\/\/blogs.gentoo.org\/mgorny\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/blogs.gentoo.org\/mgorny\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/blogs.gentoo.org\/mgorny\/wp-json\/wp\/v2\/users\/137"}],"replies":[{"embeddable":true,"href":"https:\/\/blogs.gentoo.org\/mgorny\/wp-json\/wp\/v2\/comments?post=280"}],"version-history":[{"count":12,"href":"https:\/\/blogs.gentoo.org\/mgorny\/wp-json\/wp\/v2\/posts\/280\/revisions"}],"predecessor-version":[{"id":293,"href":"https:\/\/blogs.gentoo.org\/mgorny\/wp-json\/wp\/v2\/posts\/280\/revisions\/293"}],"wp:attachment":[{"href":"https:\/\/blogs.gentoo.org\/mgorny\/wp-json\/wp\/v2\/media?parent=280"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/blogs.gentoo.org\/mgorny\/wp-json\/wp\/v2\/categories?post=280"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/blogs.gentoo.org\/mgorny\/wp-json\/wp\/v2\/tags?post=280"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}