Bonding Period 2 – Modernization of Portage

In order to get familiar with the portage codebase, we decided that I’d fix a few bugs. This blog post talks about the second half of the community bonding period (weeks 3 and 4) where I try to do that.

Bugs, bugs and more bugs

When it comes to bugs, the paradox of choice is real: there is a heap of them to choose from (1439 at the time of writing). Most are quality-of-life improvements, as the portage team has put in a lot of effort to make sure portage does its job without many errors. After searching, we decided to work on bug 634576.


Portage uses backtracking to calculate the dependencies of a package, which is a computationally intensive and time-consuming process. If someone runs emerge on 10 packages, portage calculates the dependencies one by one. If one package name is misspelled, portage still calculates the dependencies of the other packages before recognizing that the name is wrong. It fails, but only after all that computation, which is then wasted. At the time this bug was filed, portage did not cache its calculations, so on the next run all the dependencies were calculated again. Ideally, portage should recognize that the package does not exist and “fail faster”.
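To make the “fail faster” idea concrete, here is a minimal sketch of it in Python. It is not portage's actual code: resolve_all, known_packages and slow_resolver are hypothetical names made up for illustration. The point is only that every requested name is checked before any expensive dependency calculation starts.

```python
def resolve_all(requested, known_packages, resolve_dependencies):
    """Validate every requested name before doing any expensive work."""
    unknown = [name for name in requested if name not in known_packages]
    if unknown:
        # Fail fast: no dependency calculation has run yet.
        raise LookupError("no such package(s): " + ", ".join(unknown))
    return {name: resolve_dependencies(name) for name in requested}


# Toy data, purely illustrative.
known = {"www-client/chromium", "dev-lang/ghc"}
calls = []


def slow_resolver(name):
    """Stands in for the expensive backtracking dependency calculation."""
    calls.append(name)
    return []


try:
    resolve_all(["www-client/chromium", "idonotexist"], known, slow_resolver)
except LookupError as err:
    print(err)

# The resolver was never called, so no work was wasted.
print(calls)
```

With the check up front, the misspelled name is reported before the resolver runs at all, which is the behaviour the bug report asks for.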

Reproducing the bug

The bug was marked as confirmed, which means the portage team was able to reproduce it. So we tried

# emerge www-client/chromium "<cython-3" libreoffice dev-lang/ghc
  dev-haskell/doctest dev-ruby/actionpack firefox tensorflow idonotexist

and to our surprise, emerge failed fast. We couldn’t just close the bug without giving context, so we had to find the commit that fixed it.

Git bisect

One of the mentors, Sam James, suggested we use git bisect. It is a clever feature of git that binary-searches the commit history, and I was very glad to see binary search being used in the real world. Git bisect also has an option for automated testing: we write a program (or script), and git bisect uses its exit code to classify each commit as “good” or “bad”. We noticed that when portage fails fast, it fails within 1.8 seconds. So we wrote the following script.

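#!/usr/bin/env python

import subprocess
import sys
import time

# Time one emerge run whose last argument is a package that does not exist.
# Assumption: this call reuses the command from "Reproducing the bug"; the
# exact invocation used during the bisect is not shown in this post.
a = time.time()
subprocess.run(["emerge", "www-client/chromium", "<cython-3", "libreoffice",
                "dev-lang/ghc", "dev-haskell/doctest", "dev-ruby/actionpack",
                "firefox", "tensorflow", "idonotexist"])
b = time.time()
t = b - a

if t > 1.8:  # If t goes above 1.8, it means dependencies are being calculated
    sys.exit(0)  # Says to git bisect this is good (we want to find the bad commit)
elif t < 0.2:  # Finished almost instantly, so emerge presumably could not run at all
    sys.exit(125)  # Says to git bisect to skip this commit and check a nearby one
else:
    sys.exit(1)  # Bad commit (we want to find this)

Note that “good” and “bad” are inverted here: we are looking for the commit that fixed the bug, so the old, slow behaviour counts as “good”. We need the exit(125) line because some of the commits leave the portage repo in an inconsistent state. When the script exits with code 125, it tells git bisect to skip the current commit and check nearby commits instead. We ran the script with the following command.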

git bisect run ./script

The output of the run can be found here.
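For readers who have not used it, here is a rough simulation of how git bisect narrows down a commit from those exit codes. This is a simplified sketch, not git's real algorithm: the commit names, the check function and the skip handling are all made up for illustration, and it assumes a linear history where every commit after the first “bad” one is also “bad”.

```python
GOOD, BAD, SKIP = 0, 1, 125  # exit codes as interpreted by `git bisect run`


def bisect(commits, test):
    """Return the earliest commit that could be shown to be BAD.

    Assumes commits[0] is known good and commits[-1] is known bad.
    """
    lo, hi = 0, len(commits) - 1  # lo: last known good, hi: first known bad
    while hi - lo > 1:
        middle = (lo + hi) // 2
        # Try the midpoint first, then its neighbours if it cannot be tested.
        candidates = sorted(range(lo + 1, hi), key=lambda i: abs(i - middle))
        for i in candidates:
            result = test(commits[i])
            if result != SKIP:
                break
        else:
            # Every remaining commit was skipped, so we cannot narrow further.
            break
        if result == GOOD:
            lo = i
        else:
            hi = i
    return commits[hi]


# Hypothetical history: the fix lands in c6, and c4 cannot be tested.
commits = ["c%d" % n for n in range(10)]


def check(commit):
    if commit == "c4":
        return SKIP
    return BAD if int(commit[1:]) >= 6 else GOOD


print(bisect(commits, check))
```

When too many commits in a row have to be skipped, the search can only report a range instead of a single commit, which is roughly the situation we ended up in.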

Due to EAPI differences and portage being in an inconsistent state at some commits, we could not identify the exact commit that fixed the bug. It is somewhere around commit “0f3070198c56a8bc3b23e3965ab61136d3de76ae”, from around 2021, when caching capabilities were added to portage. With this information, the bug was closed successfully.


To summarize, we looked at a few bugs and closed one using git bisect and a small Python script. I am also getting familiar with the codebase and have been looking for places to work on. emerge --version seems like a nice and simple corner to start. That marks the end of the “community bonding period”. The “official coding period” starts next week and I can barely contain my excitement!!

This entry was posted in 2023 GSoC, Modernization of Portage with C++.
