{"id":562,"date":"2017-03-17T15:58:06","date_gmt":"2017-03-17T14:58:06","guid":{"rendered":"https:\/\/blogs.gentoo.org\/mgorny\/?p=562"},"modified":"2018-08-13T18:16:44","modified_gmt":"2018-08-13T16:16:44","slug":"why-you-cant-rely-on-repository-format-pms","status":"publish","type":"post","link":"https:\/\/blogs.gentoo.org\/mgorny\/2017\/03\/17\/why-you-cant-rely-on-repository-format-pms\/","title":{"rendered":"Why you can&#8217;t rely on repository format (PMS)"},"content":{"rendered":"<p>You should know already that you are not supposed to rely on Portage internals in ebuilds \u2014 all variables, functions and helpers that are not defined by the\u00a0PMS. You probably know that you are not supposed to touch various configuration files, vdb and\u00a0other Portage files as well. What most people don&#8217;t seem to understand, you are not supposed to make any assumptions about the\u00a0ebuild repository either. In this post, I will expand on this and\u00a0try to explain why.<\/p>\n<p><!--more--><\/p>\n<h2>What PMS specifies, what you can rely on<\/h2>\n<p>I think the\u00a0first confusing point is that PMS actually defines the\u00a0repository format pretty thoroughly. However, it <em>does not<\/em> specify that you can rely on that format being visible from within the\u00a0ebuild environment. It just defines a\u00a0few interfaces that you can reliably use, some of them in\u00a0fact quite consistent with the\u00a0repository layout.<\/p>\n<p>You should really look at the\u00a0PMS-defined repository format as an\u00a0<em>input specification<\/em>. This is the\u00a0format that the\u00a0developers are supposed to use when writing ebuilds, and\u00a0that all basic tools are supposed to support. However, it does not prevent the\u00a0package managers from defining and\u00a0using other package formats, as long as they provide the\u00a0environment compliant with the\u00a0PMS.<\/p>\n<p>In fact, this is how binary packages are implemented in\u00a0Gentoo. 
The\u00a0PMS does not define any specific format for them. It only defines a\u00a0few basic rules and\u00a0facilities, and\u00a0both Portage and\u00a0Paludis implement their own binary package formats. The\u00a0package managers expose APIs required by the\u00a0PMS, and\u00a0can use them to run the\u00a0necessary pkg_* phases.<\/p>\n<p>However, the\u00a0problem is not limited to two currently used binary package formats. This is a\u00a0generic goal of being able to define any new package format in the\u00a0future, and\u00a0make it work out of the\u00a0box with existing ebuilds. Imagine just a\u00a0few possibilities: more compact repository formats (i.e. not requiring hundreds of unpacked files), fetching only needed ebuild files\u2026<\/p>\n<p>Sadly, none of\u00a0this can even start being implemented if developers continuously insist on relying on a\u00a0specific repository layout.<\/p>\n<h2>The *DIR variables<\/h2>\n<p>Let&#8217;s get into the\u00a0details and\u00a0iterate over the\u00a0few relevant variables here.<\/p>\n<p>First of all, <var>FILESDIR<\/var>. This is the\u00a0directory where ebuild support files are provided throughout src_* phases. However, there is no guarantee that this will be exactly the\u00a0directory you created in the\u00a0ebuild repository. The\u00a0package manager just needs to provide the\u00a0files in some directory, and\u00a0this directory may not actually exist before the\u00a0first src_* phase. This implies that the\u00a0support files may not even exist at all when installing from a\u00a0binary package, and\u00a0may be created (copied, unpacked) later when doing a\u00a0source build.<\/p>\n<p>The\u00a0next variable listed by the\u00a0PMS is <var>DISTDIR<\/var>. While this variable is somewhat similar to the\u00a0previous one, some developers are actually eager to make the\u00a0opposite assumption. Once again, the\u00a0package manager may provide the\u00a0path to <em>any directory<\/em> that contains the\u00a0downloaded files. 
This may be a\u00a0\u2018shadow\u2019 directory containing only files for this package, or it can be any system downloads directory containing lots of\u00a0other files. Once again, you can&#8217;t assume that <var>DISTDIR<\/var> will exist before src_*, and\u00a0that it will exist at all (and\u00a0contain necessary files) when the\u00a0build is performed using a\u00a0binary package.<\/p>\n<p>The\u00a0two remaining variables I would like to discuss are <var>PORTDIR<\/var> and\u00a0<var>ECLASSDIR<\/var>. Those two are a\u00a0cause of real mayhem: they are completely unsuited for the\u00a0multi-repository layout that modern package managers use and\u00a0they enforce a\u00a0particular source repository layout (they are not available outside src_* phases). They pretty much block any effort on\u00a0improvement, and\u00a0sadly their removal is continuously blocked by a\u00a0few short-sighted developers. Nevertheless, work on removing them is in\u00a0progress.<\/p>\n<h2>Environment saving<\/h2>\n<p>While we&#8217;re discussing those matters, a\u00a0short note on environment saving is in\u00a0order. By <q>environment saving<\/q> we usually mean the\u00a0magic that causes the\u00a0variables set in\u00a0one phase function to be carried to a\u00a0phase function following it, possibly over a\u00a0disjoint sequence of actions (e.g. install followed by\u00a0uninstall).<\/p>\n<p>A\u00a0common misunderstanding is to assume the\u00a0Portage model of\u00a0environment saving \u2014 i.e. basically dumping a\u00a0whole ebuild environment including functions into a\u00a0file. However, this is not sanctioned by the\u00a0PMS. The\u00a0rules require the\u00a0package manager to save only <em>variables<\/em>, and\u00a0only those that are not defined in\u00a0global scope. If phase functions define functions, there is no guarantee that those functions will be preserved or restored. 
If phases redefine global variables, there is no guarantee that the\u00a0redefinition will be preserved.<\/p>\n<p>In\u00a0fact, the\u00a0specific wording used in the\u00a0PMS allows a\u00a0completely different implementation to be used. The\u00a0package manager may just snapshot defined functions after processing the\u00a0global scope, or\u00a0even not snapshot them at\u00a0all and\u00a0instead re-read the\u00a0ebuild (and\u00a0re-inherit eclasses) every time the\u00a0execution continues. In\u00a0this case, any functions defined during a\u00a0phase function are lost.<\/p>\n<h2>Is there a future in this?<\/h2>\n<p>I hope this clears up all the\u00a0misunderstandings on how to write ebuilds so that they will work reliably, both for source and binary builds. If those rules are followed, our users can finally start expecting some fun features to come. However, before that happens we need to fix the\u00a0few existing violations \u2014 and\u00a0for that to happen, we need a\u00a0few developers to stop thinking only of their own convenience.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>You should know already that you are not supposed to rely on Portage internals in ebuilds \u2014 all variables, functions and helpers that are not defined by the\u00a0PMS. You probably know that you are not supposed to touch various configuration files, vdb and\u00a0other Portage files as well. 
What most people don&#8217;t seem to understand, you &hellip; <a href=\"https:\/\/blogs.gentoo.org\/mgorny\/2017\/03\/17\/why-you-cant-rely-on-repository-format-pms\/\" class=\"more-link\">Continue reading<span class=\"screen-reader-text\"> &#8220;Why you can&#8217;t rely on repository format (PMS)&#8221;<\/span><\/a><\/p>\n","protected":false},"author":137,"featured_media":0,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"jetpack_publicize_message":"","jetpack_is_tweetstorm":false,"jetpack_publicize_feature_enabled":true},"categories":[11],"tags":[],"jetpack_publicize_connections":[],"jetpack_featured_media_url":"","_links":{"self":[{"href":"https:\/\/blogs.gentoo.org\/mgorny\/wp-json\/wp\/v2\/posts\/562"}],"collection":[{"href":"https:\/\/blogs.gentoo.org\/mgorny\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/blogs.gentoo.org\/mgorny\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/blogs.gentoo.org\/mgorny\/wp-json\/wp\/v2\/users\/137"}],"replies":[{"embeddable":true,"href":"https:\/\/blogs.gentoo.org\/mgorny\/wp-json\/wp\/v2\/comments?post=562"}],"version-history":[{"count":8,"href":"https:\/\/blogs.gentoo.org\/mgorny\/wp-json\/wp\/v2\/posts\/562\/revisions"}],"predecessor-version":[{"id":571,"href":"https:\/\/blogs.gentoo.org\/mgorny\/wp-json\/wp\/v2\/posts\/562\/revisions\/571"}],"wp:attachment":[{"href":"https:\/\/blogs.gentoo.org\/mgorny\/wp-json\/wp\/v2\/media?parent=562"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/blogs.gentoo.org\/mgorny\/wp-json\/wp\/v2\/categories?post=562"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/blogs.gentoo.org\/mgorny\/wp-json\/wp\/v2\/tags?post=562"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}