I got the idea of writing a Portage metadata cache backend based on extended file attributes. We are talking about file metadata after all, and the key=value format fits the cache quite well. I have it working now.

Along the way I hit a couple of interesting issues. The cache can have arbitrarily long lines, but all the file systems I tested limit how long an attribute value can be. I decided to just split the values into multiple attributes when they are too long. I also found out that ext4 and btrfs use the wrong errno to signal that the value is too long. man xattr_set says it should be E2BIG, but both of those file systems return ENOSPC. I opened an upstream kernel bug about this to see what they think:
This is what it looks like currently:
    betelgeuse@pena /mnt/test/dev-java/java-config $ getfattr -d java-config-2.1.7.ebuild | head
    # file: java-config-2.1.7.ebuild
    user.CDEPEND="1:"
    user.DEFINED_PHASES="1:compile install postinst postrm unpack"
    user.DEPEND="1:dev-lang/python >=sys-apps/sed-4 virtual/python"
    user.DESCRIPTION="1:Java environment configuration tool"
    user.EAPI="1:0"
    user.HOMEPAGE="1:http://www.gentoo.org/proj/en/java/"
    user.INHERITED="1:"
    user.IUSE="1:"
    user.KEYWORDS="1:~alpha ~amd64 ~arm ~ia64 ~ppc ~ppc64 ~x86 ~x86-fbsd"
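To illustrate the splitting idea, here is a minimal Python sketch. It is not the actual backend code, just one way it could be done. It assumes that the "1:" prefix seen above is a count of parts, and it invents a naming scheme (user.KEY.2, user.KEY.3, ...) for the extra parts. MAX_VALUE_LEN, HEADER_ROOM and the function names are hypothetical as well; real limits depend on the file system.

```python
import errno
import os

MAX_VALUE_LEN = 255  # hypothetical limit; the real one is file system specific
HEADER_ROOM = 8      # bytes reserved in the first part for the "<parts>:" prefix


def split_value(key, value, max_len=MAX_VALUE_LEN):
    """Return {attribute name: bytes} with every value at most max_len long.

    Assumed layout: user.KEY holds "<parts>:<first chunk>", further
    chunks go into user.KEY.2, user.KEY.3, ...
    """
    if max_len <= HEADER_ROOM:
        raise ValueError("max_len too small")
    data = value.encode("utf-8")
    first_len = max_len - HEADER_ROOM
    rest = data[first_len:]
    chunks = [data[:first_len]]
    chunks += [rest[i:i + max_len] for i in range(0, len(rest), max_len)]
    prefix = ("%d:" % len(chunks)).encode("ascii")
    attrs = {"user." + key: prefix + chunks[0]}
    for n, chunk in enumerate(chunks[1:], start=2):
        attrs["user.%s.%d" % (key, n)] = chunk
    return attrs


def join_value(key, attrs):
    """Inverse of split_value: rebuild the original string."""
    count, _, first = attrs["user." + key].partition(b":")
    parts = [first]
    parts += [attrs["user.%s.%d" % (key, n)] for n in range(2, int(count) + 1)]
    # Decode only after joining, since a chunk may end mid-character.
    return b"".join(parts).decode("utf-8")


def write_entry(path, key, value, max_len=MAX_VALUE_LEN):
    """Store one cache entry on path using os.setxattr (Linux only)."""
    try:
        for name, chunk in split_value(key, value, max_len).items():
            os.setxattr(path, name, chunk)
    except OSError as e:
        # E2BIG is the documented errno; ext4 and btrfs returned ENOSPC.
        if e.errno in (errno.E2BIG, errno.ENOSPC):
            raise ValueError("value too long for this file system") from e
        raise
```

With a small max_len, split_value("DEPEND", "x" * 40, max_len=16) produces three attributes, and join_value puts them back together. Short values such as EAPI stay in a single attribute, like in the getfattr output above.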
As for performance, the current implementation seems to perform about the same as the default cache for emerge -uDpv world.
These results are with a warm file system cache.
Results on btrfs/xattrs:
xfs does a little better because it has a longer limit for attribute values. I guess that most of the time is spent doing something other than cache lookups, but I will try to profile later. The code isn’t committed anywhere outside my git-svn checkout of Portage trunk yet, but I will try to see if this is something zmedico would accept into Portage trunk. It is probably not going to be a documented option any time soon, though.