Week 5 Report, Automated Gentoo System Updater

Progress on Week 5

The week started off with some feedback from the community in the forums. Here are some nice ideas that the community suggested implementing:

  1. Fall back to the latest version of the package if an error is encountered during an update;
  2. Add an option to control Portage niceness;
  3. Estimate update time;
  4. Notify users about obsolete USE flags;
  5. Think of a way to make the updater work on binpkg servers.

I will attempt to do 1-4 during Google Summer of Code.

There were also some suggestions on improving the workflow and many different opinions were voiced. The discussion is still ongoing, but it has already yielded some positive results.

I’ve made some progress on the parser: it can now detect whether an update ended in an error. The log format and general output flow were modified to simplify parsing. The most noticeable change is how updater.sh is launched. Previously the whole script (~250 lines of Bash) was launched at once; now each function from the script is launched separately. An additional flag (--report) was added to make use of the parser; it parses the most recent log in the log directory.
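
As a rough illustration of the --report flow described above, the sketch below picks the newest log file in a directory and checks whether the run ended in an error. The `log_` file prefix, the directory layout and the `ERROR` marker are assumptions made for the example; they are not the actual gentoo_update log format.

```python
from pathlib import Path
from typing import Optional


def latest_log(log_dir: Path) -> Optional[Path]:
    """Return the most recently modified log file, or None if there is none."""
    logs = [p for p in log_dir.glob("log_*") if p.is_file()]
    return max(logs, key=lambda p: p.stat().st_mtime, default=None)


def ended_in_error(log_text: str, marker: str = "ERROR") -> bool:
    """Hypothetical check: any line starting with the marker counts as a failure."""
    return any(line.startswith(marker) for line in log_text.splitlines())
```

A simpler log format pays off here: the fewer shapes a line can take, the fewer branches a check like `ended_in_error` needs.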

Furthermore, I spent some time organizing testing a bit better. I updated container versions and created a better naming convention for my containers so that I don’t get lost in them. The 08-05-2023 Desktop image on openrc is used to test glsa-check, and the most recent openrc basic image is used to test the updating functionality.

Challenges

The parser has turned out to be much harder than I anticipated. First of all, I had to make some changes to both the Python and the Bash code to create simpler log output, which reduced the number of if/else statements in the parser.

Secondly, there were some motivation issues. It was a bit hard to focus on the parser, because a much better approach would be to add machine-readable output to Portage instead of parsing logs. I talked to my mentor about it and we decided to continue working on the parser, mainly because modifying Portage in any significant way would take far too much time.

Plans for Week 6

In week 6 the plan is to add error parsing and comprehension to the parser. This means I will have to find different ways to make Portage break, and then try to make the parser understand the errors that have occurred. Should be really fun!

After that is done, I can focus on using this information to create nice-looking update reports.

Posted in 2023 GSoC, Automated System Updater | 2 Comments

Weekly report 4, LLVM libc

Hello! This is a combined report for both week 3 and 4.

In these two weeks I’ve fixed several issues in LLVM libc, but quite a
lot of time has also been spent purely learning things. I will start
by going over what I’ve learned, and then refer to related issues.

Posted in Bootstapping LLVM | Leave a comment

Week 4 – Modernization of Portage

Another week of GSoC. The days go by really fast. This was again a productive week. The first half went towards understanding the unit tests for portage and the second half towards solving a bug.

Testing in portage

Tests are one of the most important components of any software. Portage, being no exception, employs unit tests. Until now, I had not bothered to look into the tests. We have a bash script, runtests; I ran it and watched for things to succeed. Sam felt that I needed a better understanding of the tests, for various reasons. So, I started looking into them.

Portage’s tests are single-threaded. It takes between 300 and 450 seconds to run all the tests portage has, depending on the speed of the machine. It would be nice to have the unit tests run in parallel, but there are several caveats to that. For one, portage needs to virtualize various things, including runtime parameters and a filesystem (to test the changes portage makes). Sharing one virtualized environment among many threads did not seem like a plausible idea. So, a new virtual environment has to be created for each thread, and the threading has to happen outside the virtual environment creation phase.

So, I added the functionality to start and stop testing at the nth test file. With this functionality, the plan is to count the number of tests, split them into groups and assign each group to a separate thread. This adds a bit of overhead, as a virtual environment has to be created for each thread, but it will make the tests faster. The implementation can be found in this pull request. It is not merged yet because the long-term goal is to get rid of runtests and exclusively use standardized Python tools like pytest-xdist for running tests in parallel. There is also work going on to make portage tests run properly with pytest-xdist. I am not sure if this work will block that. It should not, but still, we are holding the merge.
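
The grouping step can be sketched as follows. This is only an illustration of splitting an ordered list of test files into contiguous start/stop ranges, one per worker; the function name and the half-open range convention are assumptions and do not come from the pull request.

```python
from typing import List, Tuple


def split_ranges(total: int, workers: int) -> List[Tuple[int, int]]:
    """Split indices 0..total-1 into at most `workers` contiguous (start, stop) ranges."""
    if total <= 0 or workers <= 0:
        return []
    workers = min(workers, total)
    base, extra = divmod(total, workers)
    ranges = []
    start = 0
    for i in range(workers):
        # The first `extra` workers take one additional test file each.
        stop = start + base + (1 if i < extra else 0)
        ranges.append((start, stop))  # stop is exclusive
        start = stop
    return ranges


print(split_ranges(10, 3))
```

Each range would then be handed to a separate worker, which creates its own virtual environment and runs only the test files in its slice.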

Bug 528836

From day one, I wanted to work on the dependency resolution system of portage. But it is obviously not a simple job, so Sam advised me to get familiar with the algorithm by fixing bugs related to it. Sam chose a bug for me to fix: 528836. The problem is that two conflicting packages are pulled in when only one should have been pulled. The bug was not reproducible with the current state of portage and the ebuild repository. There were a few hurdles along the way, but finally, we were able to reproduce the bug by restoring portage and the ebuild repository to their 2017 state.

We are not yet sure if the bug is due to portage or some misconfiguration in the ebuild repository. We will continue to work on it and I will keep you posted.

Next week’s plan

Next week’s plan is to write tests for this bug to make sure it doesn’t happen again. We will also try to squeeze in a few more quality of life changes if time permits.

Posted in 2023 GSoC, Modernization of Portage with C++ | Leave a comment

Week 4 Report, Automated Gentoo System Updater

This article is a summary of all the changes made on Automated Gentoo System Updater project during week 4 of GSoC.

Project is hosted on Github.

Progress on Week 4

I started the week by discovering that my updates to the ebuild were not accepted in the GURU overlay. The issue arose due to a misuse of the USE flags feature in the ebuild. Maintainers of GURU (big thanks to antecrescent!) pointed out my mistake and explained how to fix it, which I did by submitting 2 more commits (commit1 and commit2).

Then I proceeded to write an introductory blog post. It can currently be read on the Gentoo GSoC blog. I delayed posting about it on the forums because I was waiting for the newest ebuild version to be merged to the main branch in the GURU overlay (and because I was a bit anxious, to be honest 😰). But in the end I decided to stop waiting, and just mentioned in the blog post that gentoo_update can also be installed via pip.
The forum post can be found here.

The updater also received some improvements overall. I found errors in the --args flag (used for passing custom parameters to emerge): in some cases it was not reading all parameters correctly. To fix the problem I changed the input type; it now receives a string of space-separated parameters instead of a list, for example “quiet-build=y color=y”.
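
To illustrate the new input format, here is a minimal sketch of how a space-separated string such as “quiet-build=y color=y” could be turned into emerge options. The function name and the exact way gentoo_update forwards these values are assumptions for the example.

```python
from typing import List


def build_emerge_args(args: str) -> List[str]:
    """Turn 'quiet-build=y color=y' into ['--quiet-build=y', '--color=y']."""
    # str.split() without arguments also tolerates repeated or leading spaces.
    return [f"--{param}" for param in args.split()]


print(build_emerge_args("quiet-build=y color=y"))
```

Passing a single string avoids the ambiguity of list-style command line arguments, where it is easy to lose a parameter to shell quoting.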

Also the packaging with Python’s setuptools was improved, now there are no warnings during wheel building.

Finally, I started working on the parser. Right now it can only split the output to different categories.

Overall, the week was not a very productive one, but many bugs and imperfections were discovered and fixed which is great because I can now focus on the parser!

Challenges

It was a bit challenging to understand why USE flags were not a good solution in my case, but once I got it, it suddenly became obvious 🤓

I used USE flags to install optional dependencies for the updater. However, USE flags are typically meant to guide Portage in the program’s build and compilation processes, and in this case USE flags don’t change the outcome of how the program is built. If these flags are ever removed, it would trigger an unnecessary recompilation of the updater. The proper management of optional dependencies, as recommended by antecrescent, involves using the optfeature eclass. This approach provides users with dependency information and prompts them to consider installing dependencies by themselves.

Then it was a bit tricky to get rid of warnings from setuptools (it feels like I’m struggling with setuptools every week 😔). The warnings said that updater.sh and even the tests directory were treated as Python packages, which was a problem because updater.sh is a Bash script, and tests contains scripts and a Docker Compose file used for tests; neither was meant to be a package. I found a solution in Gentoo’s Python Guide, which suggested a proper way to exclude the packages to avoid issues with Portage.

Plans for Week 5

Mostly I plan to work on the parser the whole week. Here is the checklist from last week:

  • Split log into multiple sections, i.e. updated programs, what needs restarting etc.
  • Summarize sections and create a report
  • If updater exits with error, create a separate error report

Posted in 2023 GSoC, Automated System Updater | Leave a comment

gentoo_update Introduction

Introduction

gentoo_update (Github repo) is a tool that automatically updates Gentoo Linux.

Motivation

Gentoo Linux gives users maximum flexibility and control over the system. A great example of this is the OS upgrade process. Users have a large selection of different command utilities and a bunch of configuration options to choose from to tailor the upgrade process to their needs. Here is the list of some tools that are commonly used during an upgrade:

[
    eix, equery, emaint, euse, etc-update, dispatch-conf,  
    eselect, elogv, needrestart, eclean, eclean-kernel, 
    qcheck, revdep-rebuild, glsa-check, layman
]

For a successful upgrade, knowing how to use many of these tools is essential. While experienced users might find this manageable, it can be overwhelming for new or inexperienced users.

Additionally, users often delay updates due to the time required, which can compromise system security. Regular updates are vital for maintaining security, so it is recommended to update the system daily.

This project addresses both issues:

  1. complicated update process
  2. potential security issues caused by the lack of regular upgrades

Functionality

Here are some of the things that gentoo_update will be able to do:

  • Install only security updates from Gentoo Linux Security Advisory by default.
  • Optionally run a full system upgrade (@world) with different parameters.
  • Detect and handle update errors.
  • Schedule updates.
  • Generate a post-update report and send it via email and/or IRC chat.
  • Send push notifications to a mobile app.

The program comprises three core components: the updater, the parser, and the notification sender. The updater is a Bash script that executes emerge to update the system and generates detailed logs for each action performed. Upon successful completion of the updater, the parser reads the logs and compiles an update report. The notification sender then dispatches this report to users.

Usage

At the moment gentoo_update can only install GLSA and @world updates and store the output in a dedicated directory. It is available in the GURU overlay as app-admin/gentoo_update, and on PyPI. Generally, installing the program from the GURU overlay is the preferred method, but PyPI will always have the most recent version (at the time of writing the newest version is 0.1.6).

After enabling GURU overlay it can be installed via:

emerge --ask app-admin/gentoo_update

Alternatively, it can be installed with pip:

emerge --ask dev-python/pip
pip install gentoo_update --break-system-packages

Here are some use cases:
Security update
Running the command without specifying --update-mode will use glsa-check to install security patches.

gentoo-update

@world update
Run full system update, merge all new configuration files, restart all services that were updated and display elogs:

gentoo-update --update-mode full \
              --config-update-mode merge \
              --daemon-restart y \
              --read-logs y

Override default behavior and show build logs:

gentoo-update --update-mode full --args "quiet-build=n"

After an update a log file will be created in /var/log/portage/gentoo_update/log_<timestamp>.

Conclusion

gentoo_update aims to be a useful tool that will automate and simplify updating Gentoo Linux. By default it only installs updates from GLSA, but can also be used to update @world, and it can be installed from GURU or PyPI.

I would love to receive some feedback and/or suggestions for this project, so feel free to reach out to me via GitHub, email or IRC (LabBrat).

Posted in 2023 GSoC, Automated System Updater | 3 Comments

Week 4 report on porting Gentoo packages to modern C

Hello all,

This is my week 4 report on Modern C porting of Gentoo’s packages.

Well, nothing interesting to report this week: I just followed my
proposal and focused on -Wimplicit-int bugs for the first half of the
week before moving on to -Wimplicit-function-declaration.

However, if you follow my PRs on github [1], you will notice that I
sometimes fix bugs and send patches that are not on my proposal’s
timeline. This happens for multiple reasons: sometimes I randomly come
across a bug that requires a rather easy patch, and other times I come
across a package that is not listed in the tracker bug and send in a
patch. I’ve informed my mentor (Sam) about this, and he is fine with me
taking bugs at random and sometimes diverting from my proposal’s
workflow.

As I keep solving bugs, I’ve also set up a system with the llvm profile,
against which I keep testing recent packages and my patches. I do plan
to at least get a desktop environment working on the llvm profile. So
far I’ve tried GNOME and Mate; both of them require some work,
especially forcing some tools to their GNU versions instead of their
LLVM counterparts. For example, the gtk package currently can’t be
installed directly on the llvm profile: it requires overriding OBJCOPY
to GNU objcopy instead of llvm-objcopy and forcing LD (the linker) to
GNU bfd instead of lld, which is the default linker in llvm. Not to
mention there are bugs/build failures occurring specifically when
building with libcxx.

Next, I will adhere to my proposal and work on more
-Wimplicit-function-declaration bugs.

Hopefully I’ll have some spare time this week to do some more
experiments on the llvm profile. Of GNOME and Mate, the two desktop
environments I tested on that profile, the mate meta package seems to
require fewer patches (only a couple of packages from the meta package)
than GNOME. For GNOME, Vala is still a blocker and another important
package (NetworkManager) is failing on the llvm profile, most probably
due to libcxx quirks.

Till then, see yah!

[1]: https://github.com/gentoo/gentoo/pulls/listout

Posted in 2023 GSoC, Modern C Package Porting | Leave a comment

Week 3 – Modernization of Portage

It is the third week of the coding period. It was a mostly uneventful week. Most of it was spent trying to understand the dependency resolution algorithm. In the second part of the week I also did some refactoring and added some type hints.

Update on the blog posts

I lost my password to access this blog and also had trouble resetting it. That is why I have not been able to post every week. With help from BlueKnight, I got my access back. So, I am dumping the blog posts I have written, all at once. From next week, I expect posts to be at regular intervals (one per week). Sorry about the bulk posting, hope you don’t mind.

Portage vs Pacman

The most significant part of portage is its dependency resolution system. It is very different from all other package managers due to the unique concept of “USE” flags.

As many can guess, a graph data structure is used extensively to find the dependencies of a package. The process is fairly simple in most package managers. Arch Linux’s pacman, for example, constructs a graph with package names as nodes and their dependencies as children. After that it’s a matter of a simple graph traversal. The detected dependencies are copied to a list and everything is downloaded from a server. The implementation can be found here.
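
The traversal idea can be sketched in a few lines. This is a toy depth-first walk over a hand-written dependency table, not pacman’s actual implementation; the package names and the `deps` mapping are made up for the example.

```python
from typing import Dict, List


def resolve(package: str, deps: Dict[str, List[str]]) -> List[str]:
    """Return `package` and its transitive dependencies, dependencies first."""
    order: List[str] = []
    seen = set()

    def visit(name: str) -> None:
        if name in seen:
            return  # already handled, which also guards against cycles
        seen.add(name)
        for dep in deps.get(name, []):
            visit(dep)
        order.append(name)

    visit(package)
    return order


toy_deps = {"app": ["libfoo", "libbar"], "libfoo": ["libc"], "libbar": ["libc"]}
print(resolve("app", toy_deps))
```

Every package is visited once, so the cost grows with the size of the graph. That is a very different picture from a resolver that has to try alternative configurations and undo them.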

Portage uses a different algorithm called backtracking. Portage finds the package’s USE flags and backtracks the packages related to those USE flags. It is recursive and the process repeats. The time complexity is O(N!), and as I understand it, that is the reason portage is slow, not Python. Even after hours of looking at the codebase it is hard to make sense of the exact workings. Due to the sheer number of features portage offers, the process is many times more complex than pacman’s. To give an idea, the dependency resolution part (not all of it) of pacman is about 927 lines of C code at the time of writing, and the portage equivalent is about 11814 lines of Python code. I also had the code profiled, and depgraph.py alone takes about 95% of the total runtime. So it is useful to learn about the algorithm. The profiling results can be found here.

Here is the image version: [image: profile]

Portage also uses granular, file-level locks, whereas pacman uses one lock for the whole runtime. This allows multiple instances of portage to run at the same time.

Studying portage’s codebase made it much easier for me to understand pacman’s codebase quickly, even though I don’t have much experience in C. I am glad I got to work on portage, but also sad that I cannot understand the codebase enough (yet) to contribute in a deeper way. Portage is years of work of many smart minds and it is unreasonable of me to expect to understand things fast.

Refactoring / Tidying up portage

The second half of the week went towards finding something to improve in portage. I added a few type hints and refactored some big functions into smaller ones for better readability. I also found some unreachable code and removed it. I think these are nice improvements. I sent a pull request, which can be found here. It is not merged yet, but I think it will be soon.

Summary

To summarize, the week was spent delving more into the cProfile results of portage and comparing it with pacman. I also refactored the codebase a little, which makes it a tiny bit better.

I want to look at portage’s ability to resolve circular dependencies but I am not sure if I can do it (if it is a simple task). If it were simple, portage developers would have done it already, but I still want to give it a try. I’ll keep you posted. That marks the end of week three. See you in week four’s post!

Posted in 2023 GSoC, Modernization of Portage with C++ | Leave a comment

Week 2 – Modernization of Portage

It is the second week of the coding period and it has been a productive one. It started according to plan and diverged in the second half, for the better. The first half went towards type annotation and the second half towards dummy_threading deprecation.

Type annotation

In the words of Sam, my mentor, “I’d consider a GSOC project complete if some 50% of the codebase is just type annotated”. Many portage developers were excited when we talked about adding type hints and docstrings.

Adding type hints and tidying up the codebase will also give me more exposure to the underlying functions. So, we decided, this week I’ll do type annotations.

Deciding on the type hints style

Python 3.9 adds a simpler “native” style type annotation, but portage has a minimum supported python version of 3.7.

Eg.

import typing

# Python 3.7 style
a: typing.Optional[typing.List[str]] = None
b: typing.Dict[str, int] = {}

# Python 3.9 style (built-in generics; the "X | None" union needs Python 3.10+)
a: list[str] | None = None
b: dict[str, int] = {}

We can see that the 3.9 style is arguably more readable and we don’t have to import the typing module.

There is talk on increasing portage’s MSV to python 3.9, but till that is accepted, we decided to stick with python 3.7 style type annotations.

So, I added some type annotations and docstrings. Sam reviewed them and suggested a few changes. After making the necessary changes, the pull request was merged. It can be found here. Portage’s need for type hints and docstrings can be felt in @ajakk’s comment here.
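
For a sense of what such a change looks like on a function, here is a small sketch in the Python 3.7 compatible style. The function is hypothetical and written only for this example; it is not taken from Portage.

```python
from typing import Dict, List, Optional


def find_versions(name: str, installed: Dict[str, List[str]]) -> Optional[List[str]]:
    """Return the installed versions of a package.

    :param name: package name, for example "dev-lang/python"
    :param installed: mapping of package name to installed versions
    :return: list of versions, or None if the package is not installed
    """
    return installed.get(name)


print(find_versions("dev-lang/python", {"dev-lang/python": ["3.11.4"]}))
```

The annotations and the docstring together tell a reader what goes in and what comes out without having to trace the callers.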

dummy_threading deprecation

dummy_threading existed in Python because threading was not stabilized on a few platforms. As I said in the first blog post, portage has to work on many platforms and different architectures. So, portage had to account for the cases where threading might not be available.

But things have stabilized, and since Python 3.7 dummy_threading is deprecated. Since portage’s MSV is Python 3.7, we decided to remove all dummy_threading-related code.
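
The kind of fallback that becomes unnecessary looks roughly like the sketch below. It shows the general import pattern only and is not a copy of Portage’s code.

```python
try:
    import threading
except ImportError:
    # Fallback once needed on builds without thread support. On current
    # Python this branch is effectively dead: dummy_threading was deprecated
    # in 3.7 and removed in 3.9.
    import dummy_threading as threading

lock = threading.Lock()
with lock:
    print("lock acquired")
```

Once the minimum supported version guarantees a working threading module, the try/except collapses to a plain `import threading`.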

I removed everything related to dummy_threading and updated the tests to match the changed code. It took some 8 commits. I squashed everything and sent a pull request to GitHub. It can be found here. Sam checked it once and it was held for some time to see if someone would find a mistake. Sam also said he wanted to take one more look with a fresh set of eyes. It was merged shortly after.

Summary

TL;DR: I sent two pull requests and both were merged.

  • The first one with some type hints and docstrings.
  • The second one where all code related to dummy_threading is removed.

So, that marks the end of week two. Next week’s work probably is towards more type annotations and some minor improvements. I’ll keep you posted. Until then, take care!

Posted in 2023 GSoC, Modernization of Portage with C++ | Leave a comment

Week 1 – Modernization of Portage

Coding period starts

So, it’s the first week of the official coding period and I wanted to write some code and get it merged into the master branch (I understand it’s a bit overambitious of me, but a man can wish). As I said in the first blog post, portage is relied upon by many people for different use cases, and if something were a simple fix, the Gentoo developers would have done it already. I can’t just storm in, make changes, and expect things to work.

So, we tried to find a place that has very little impact on how portage runs and ended up at emerge --version.

The problem.

emerge’s --version command takes a bit longer than most programs. For example,

$ time gcc --version
Executed in 1.07 millis

$ time python --version
Executed in 2.70 millis

$ time black --version
Executed in 84.64 millis

Guess how much time emerge --version takes.

$ time emerge --version
Executed in 709.19 millis

There is something sinister going on. So we decided to profile the code. The live profile results can be found here.

Here is the image version: [image: profile]
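
For readers who want to try something similar, a profile like this can be collected with Python’s built-in cProfile module. The sketch below profiles a stand-in function; which profiler and viewer were used for the results above is not stated here, so treat the choice of cProfile and the `slow_step` function as assumptions.

```python
import cProfile
import io
import pstats


def slow_step() -> int:
    """Stand-in for an expensive call such as a version lookup."""
    return sum(i * i for i in range(200_000))


def profile_report(func) -> str:
    """Run `func` under cProfile and return the top entries by cumulative time."""
    profiler = cProfile.Profile()
    profiler.enable()
    func()
    profiler.disable()
    out = io.StringIO()
    pstats.Stats(profiler, stream=out).sort_stats("cumulative").print_stats(5)
    return out.getvalue()


report = profile_report(slow_step)
print(report)
```

Sorting by cumulative time puts the functions that dominate the run at the top, which is how a single expensive call stands out.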

Diving into the profile results

Here, the total run takes 750 milliseconds.

  • Until emerge_main, it takes 71 milliseconds. That’s for some imports, command line argument parsing etc., and cannot be avoided.
  • From emerge_main to run_action takes 167 milliseconds. This is mainly due to the creation of the emerge_config object, which is needed to assess different variables (emerge --version outputs more information than only a version number). This can be reduced, but only with significant code restructuring and code duplication. We still have about 500 milliseconds to account for, so let’s look into that.
  • Notice that there is almost no difference between run_action and getportageversion. This means that getting the portage version takes around 1 millisecond. That makes sense because PORTAGE_VERSION is just a variable defined in lib/portage/__init__.py.
  • getgccversion() takes 460 milliseconds. That’s concerning because we just noticed that gcc --version takes around 1 millisecond.

Diving into getgccversion()

It turns out getgccversion gets gcc’s version in one of two ways: gcc --dumpversion or gcc-config -c. If the former code path is taken, getgccversion takes a couple of milliseconds, but if the latter code path is taken, getgccversion takes 450 milliseconds. I tried to find out when the former code path is taken, and it is almost never; it also seemed like gcc-config -c is unnecessary. So, I avoided the call and created a new code path just for --version (so that no other part of portage is affected).
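
The cheap code path can be approximated as below. This is a simplified sketch that only calls gcc --dumpversion and returns None when gcc is not available; it is not Portage’s getgccversion and it deliberately ignores CHOST handling. The function name is made up for the example.

```python
import shutil
import subprocess
from typing import Optional


def gcc_dumpversion() -> Optional[str]:
    """Return the output of `gcc --dumpversion`, or None if gcc is unavailable."""
    gcc = shutil.which("gcc")
    if gcc is None:
        return None
    result = subprocess.run(
        [gcc, "--dumpversion"], capture_output=True, text=True, check=False
    )
    if result.returncode != 0:
        return None
    return result.stdout.strip() or None


print(gcc_dumpversion())
```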

Patch

Together with a new code path for --version, with the help of my mentor Sam James, we also refactored a few big functions into smaller ones, and added a few quality of life changes like f-strings etc. The pull request can be found here. With this change, emerge --version goes from 750ms to 240ms.

But we did not consider the edge case where the CHOST of the system where packages are compiled differs from that of the system where the binpkgs are used. Though it does not affect the functionality of portage in any way, it could give wrong information to the end user who runs emerge --version. So, the pull request is not merged yet and we are working on solutions to the problem. One of the main reasons for the delay is that I don’t completely understand what exactly gcc-config does. It is a bash script and I have very little knowledge of bash. We are working on a solution and will try to get the changes merged into master.

Side benefit

Studying the portage codebase for emerge --version has indeed been fruitful. I am getting more familiar with the codebase and I was able to find an unreachable code block (duplicated logic). I submitted a pull request and it was merged. Sam commented that I am getting familiar with the codebase. That felt good.

Next week

I need more understanding of portage’s internals. Sam suggested that I add docstrings and type annotations to the codebase. That’ll help new developers as well as help me understand the codebase more. So, next week will most probably be spent type annotating and adding docstrings. I’ll also spend a bit of time learning bash so that I can work on gcc-config and more, as portage/gentoo relies a lot on bash.

Conclusion

Overall, the first week was a productive one. Though the pull request is not merged yet, it has good changes with respect to refactoring. If the new code path is not successful, we’d drop those commits and merge the rest; hopefully we can fix the patch to work on all CHOSTs. See you next week! Have a good one!

Posted in 2023 GSoC, Modernization of Portage with C++ | Leave a comment

Bonding Period 2 – Modernization of Portage

Context

In order to get familiar with the portage codebase, we decided that I’d fix a few bugs. This blog post talks about the second half of the community bonding period (weeks 3 and 4) where I try to do that.

Bugs, bugs and more bugs

When it comes to bugs, the paradox of choice is real. There is a heap of them to choose from (1439 at the moment of writing). Most of the bugs are quality of life improvements, as the portage team has put in a lot of effort to make sure portage does its job without many errors. After searching, we decided to work on bug 634576.

634576

Portage uses backtracking to calculate the dependencies of a package, and it is a computationally intensive and time-consuming process. If a person were to ask emerge to install 10 packages, portage would calculate the dependencies one by one, and if one package name were misspelled, portage would calculate the dependencies of the other packages before recognizing that the name is wrong. It fails, but only after calculating the dependencies of the other packages. At that point, all the computation done is wasted. At the time this bug was filed, portage did not cache its calculations, so in the next run all the dependencies were calculated again. Ideally, portage should recognize that the package does not exist and “fail faster”.
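
The “fail faster” idea amounts to validating every requested name before any expensive work starts. The sketch below shows that ordering with made-up helper names and a plain set standing in for the package database; it is not how Portage implements the check.

```python
from typing import Iterable, List, Set


def unknown_packages(requested: Iterable[str], known: Set[str]) -> List[str]:
    """Return the requested names that do not exist, preserving order."""
    return [name for name in requested if name not in known]


def emerge_plan(requested: List[str], known: Set[str]) -> List[str]:
    """Validate every name up front, then hand over to the expensive resolver."""
    missing = unknown_packages(requested, known)
    if missing:
        raise ValueError(f"unknown packages: {', '.join(missing)}")
    return requested  # placeholder for the real dependency calculation


known_names = {"firefox", "libreoffice"}
print(unknown_packages(["firefox", "idonotexist"], known_names))
```

With the cheap membership check done first, a typo costs one lookup instead of a full dependency calculation for every other package on the command line.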

Reproducing the bug

The bug was confirmed and so that means the portage team was able to reproduce it. So we tried

# emerge www-client/chromium "<cython-3" libreoffice dev-lang/ghc
  dev-haskell/doctest dev-ruby/actionpack firefox tensorflow idonotexist

and to our surprise, emerge failed fast. We can’t just close the bug without giving context and so we had to find the commit that fixed it.

Git bisect

One of the mentors, Sam James, suggested we use git bisect. It is a clever feature of git. I was very glad when I read about git bisect. It was very cool to see binary search being used in the real world. Git bisect has an option for automated testing. We write a program (or script), and based on its exit code, git bisect marks commits as “good” or “bad”. We noticed that if portage fails fast, it fails within 1.8 seconds. So we wrote the following script.

#!/usr/bin/env python

import subprocess
import sys
import time

# Time a full emerge run that includes one package that does not exist.
start = time.time()
subprocess.run(
    [
        "emerge",
        "www-client/chromium",
        "libreoffice-dev",
        "dev-lang/ghc",
        "dev-haskell/doctest",
        "dev-ruby/actionpack",
        "firefox",
        "tensorflow",
        "idonotexist",
    ]
)
elapsed = time.time() - start

if elapsed > 1.8:  # Above 1.8 s means dependencies are being calculated
    sys.exit(0)  # Tells git bisect this commit is good (we want to find the bad commit)
elif elapsed < 0.2:
    sys.exit(125)  # Tells git bisect to skip this commit and test a nearby one
else:
    sys.exit(1)  # Bad commit (we want to find this)

We need the exit(125) line because some of the commits leave the portage repo in an inconsistent state. When the script exits with code 125, git bisect skips the current commit and checks nearby commits. We ran the script with the following command.

git bisect run ./script

The output of the run can be found here.

Due to EAPI differences and the portage status being inconsistent between commits, we could not identify the exact commit that fixed it, but it was somewhere around commit “0f3070198c56a8bc3b23e3965ab61136d3de76ae”, which was around 2021, when caching capabilities were added to portage. With this information, the bug was closed successfully.
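
The binary search behind git bisect can be sketched as follows. This is a simplified model with no skipped commits; the function name and the list-of-strings representation of history are assumptions made for the example.

```python
from typing import Callable, List


def first_bad(commits: List[str], is_bad: Callable[[str], bool]) -> str:
    """Return the first commit for which is_bad is True.

    Assumes commits are ordered oldest to newest, the first one is good,
    the last one is bad, and every commit after the first bad one is bad.
    """
    lo, hi = 0, len(commits) - 1  # commits[lo] is good, commits[hi] is bad
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if is_bad(commits[mid]):
            hi = mid
        else:
            lo = mid
    return commits[hi]


history = list("abcdefgh")
print(first_bad(history, lambda commit: commit >= "e"))
```

Each test halves the remaining range, so even a few thousand commits need only around a dozen runs of the test script. Skipped commits blur the boundary, which is why the result above is an approximate commit rather than an exact one.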

Summary

To summarize, we looked at a few bugs and closed one using git bisect and a small Python script. I am also getting familiar with the codebase and I have been looking at places to work on. emerge --version seems like a nice and simple corner to start. That marks the end of the “community bonding period”. The “official coding period” starts next week and I can barely contain my excitement!!

Posted in 2023 GSoC, Modernization of Portage with C++ | Leave a comment