Weekly report 6, LLVM libc

Hi! This week I have been working on LLVM/Clang support for
Crossdev. This is currently done by swapping out the different Crossdev
stages for ones that make sense for LLVM.

Currently stage0 is replaced by a check that LLVM can target the
target triple’s architecture, done by inspecting the LLVM_TARGETS USE flag.

Stage1, which normally installs libc headers and compiles a -stage1 C
compiler, is replaced by installing libc headers and compiling
compiler-rt.

Stage2 (kernel headers) is the same.

Stage3 (libc install) is the same.

Stage4, which compiles a full compiler, is skipped completely.
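
A rough sketch of the stage0 check described above, written in Python purely for illustration (crossdev itself is a shell script). The architecture-to-backend table and the function name llvm_can_target are assumptions made for this example, not crossdev's actual code:

```python
# Hypothetical mapping from the CPU part of a target triple to the LLVM
# backend name that would have to be enabled via LLVM_TARGETS.
# Only a few entries are listed; the table is an assumption for illustration.
ARCH_TO_LLVM_TARGET = {
    "aarch64": "AArch64",
    "riscv32": "RISCV",
    "riscv64": "RISCV",
    "i686": "X86",
    "x86_64": "X86",
}


def llvm_can_target(ctarget: str, enabled_targets: set[str]) -> bool:
    """Return True if the triple's architecture maps to an enabled backend."""
    arch = ctarget.split("-", 1)[0]
    backend = ARCH_TO_LLVM_TARGET.get(arch)
    return backend is not None and backend in enabled_targets


if __name__ == "__main__":
    enabled = {"X86", "RISCV"}  # pretend these were read from the USE flags
    for triple in ("riscv64-gentoo-linux-musl", "aarch64-gentoo-linux-musl"):
        print(triple, llvm_can_target(triple, enabled))
```

Unknown architectures are treated as "cannot target" here, so the check fails early instead of later in the middle of a build.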

Another needed change was to make the compiler-rt ebuild cross-aware,
with changes like making the assembler and C compiler target the target triple, and
including headers from the crossdev /usr/${CTARGET}/usr/include
directory instead of using the host’s libc headers. I got some help from
wikky here, thanks!

Currently doing ‘crossdev --llvm -t riscv64-gentoo-linux-musl &&
riscv64-gentoo-linux-musl-emerge dash’ produces a working binary that
can be run using qemu-user like this: ‘qemu-riscv64 -L
/usr/riscv64-gentoo-linux-musl /usr/riscv64-gentoo-linux-musl/bin/dash’,
which I think is very cool! However, there are still some issues with
dynamically linking libraries built with the cross compiler. For example,
xz-utils installs liblzma.so, which fails with “exec format
error” (http://sprunge.us/HkSmms). I am currently looking into that.

Another thing I’m still a little uncertain about is where to put all
environment variables and compilation options, whether it’s crossdev’s
job or the ebuilds’. This is something I will come back to, and I have
some changes locally on my computer.
Crossdev patches:
https://github.com/alfredfo/crossdev/commit/ec65dee4b4c359bf3e0fc374d31e05b147fa3f0d
Compiler-rt patches:
https://github.com/alfredfo/catcream_repo/blob/master/sys-libs/compiler-rt/compiler-rt-17.0.0.9999.ebuild

Later during the week I made ebuilds for LLVM libc and
libc-hdrgen (generates LLVM libc headers from TableGen specification
files). Normally you build LLVM libc together with libc-hdrgen, but when
cross compiling it’s a better idea to split these and keep libc-hdrgen a
tool installed on the build host. I have played with only building
libc headers for bootstrapping with crossdev but I haven’t figured it all out yet.
To only install headers you can use the install-libc-headers target, but
it seems like CMake still wants to build things. There’s also the scudo
allocator that needs to be statically linked to LLVM libc. My idea is to
make a USE=static-scudo flag for compiler-rt that gets set in crossdev
when compiling compiler-rt for an LLVM libc target.
These are also kept locally until I’ve figured out how to cross compile
in stages.

Many “small random issues” and technicalities have also popped up during
this week that’ve taken quite a long time, but are not really worth
digging into here.

Next week I will continue with this until I can use it to work on LLVM
libc. In the worst case I could temporarily make a “franken LLVM libc ebuild” that
does everything (headers, compiler-rt, scudo, LLVM libc) in one shot,
but it should definitely be possible to do it separately.

I also forgot to update my llvm-common changes with the new
elisp-site-file-install function that was inspired by my PR 🙁
… will fix that tomorrow.

Thanks for reading!

Posted in Bootstapping LLVM, Uncategorized

Creating custom lxd gentoo containers from stage-3 tarballs

Much of this is based on the incredible guide by user (and my mentor) Juippis and his work over at The ultimate testing system with lxd. In fact, most of what follows comes directly from Juippis himself.

The reason for creating custom gentoo containers was purely for testing. I really needed a way to seamlessly test my PRs against gcc-13, clang-16 and musl + clang-16, while also being lazy, because testing them manually meant introducing errors and missing test cases. So I asked Juippis for help and I’ll try to write down what he taught me. I don’t really expect anyone else to read this; it mainly serves me as a quick guide for setting up an lxc container. Don’t take it as anything more than that.

First of all, you need to install lxd on your host system. For that I would recommend heading over to Gentoo’s lxd wiki page.

Once that is done, you can follow Juippis’s guide for setting up an lxd container with glibc and gcc-13 or the newest gcc available. His guide is pretty verbose and straightforward and I don’t really expect any hiccups.

Now onto the steps for creating an lxd container from a stage-3 llvm tarball.

  • Install distrobuilder
  • Create a sub-folder gentoo under a folder distrobuilder (example: mkdir -p ~/distrobuilder/gentoo)
  • After cd-ing into ~/distrobuilder/gentoo, download gentoo.yaml from https://raw.githubusercontent.com/lxc/lxc-ci/master/images/gentoo.yaml
  • Create another folder for this profile, let’s say llvm (example: mkdir llvm), inside ~/distrobuilder/gentoo where you downloaded gentoo.yaml
  • cd into the llvm folder
  • Now using distrobuilder you can create the lxd image with the following command: sudo distrobuilder build-lxd ../gentoo.yaml -o image.architecture=amd64 -o image.variant=openrc -o source.variant=llvm-openrc
  • With the source.variant option you can control what is being downloaded. So source.variant=llvm-openrc will download the llvm openrc stage-3 tarball for creating the image.
  • Once the download finishes, you can import the rootfs with the following command: lxc image import lxd.tar.xz rootfs.squashfs --alias gentoo-amd64-llvm
  • Launch your lxc image with lxc launch gentoo-amd64-llvm gentoo-llvm-test and log into it the usual way.
  • That’s it, you now have a gentoo lxd image with the llvm openrc profile

Note:

For using the musl profile, you’ll need to modify the gentoo.yaml file a bit, specifically comment out the following lines:

echo en_US.UTF-8 UTF-8 > /etc/locale.gen
locale-gen

This is because musl only uses the C.UTF-8 locale.
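
If you rebuild the image often, the edit can be scripted. Below is a minimal sketch in Python; the helper name comment_out and the substring matching are assumptions made for this example, and it expects the two lines to sit inside a shell script section of gentoo.yaml, where a leading # is a plain shell comment:

```python
def comment_out(text: str, needles: list[str]) -> str:
    """Prefix every line containing one of the needles with '# '.

    Indentation is preserved so the YAML block the line lives in stays valid.
    Matching is a plain substring test, so pick needles that are specific.
    """
    result = []
    for line in text.splitlines():
        stripped = line.lstrip()
        if any(n in line for n in needles) and not stripped.startswith("#"):
            indent = len(line) - len(stripped)
            line = line[:indent] + "# " + stripped
        result.append(line)
    return "\n".join(result) + "\n"


if __name__ == "__main__":
    sample = (
        "      echo en_US.UTF-8 UTF-8 > /etc/locale.gen\n"
        "      locale-gen\n"
        "      echo done\n"
    )
    print(comment_out(sample, ["/etc/locale.gen", "locale-gen"]), end="")
```

Review the result before building, since a substring match can also hit lines you did not intend to disable.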

So I now have three test-pr scripts, test-pr-gcc, test-pr-clang16, and test-pr-mclang16, which I use to test my PRs against gcc-13, clang-16 on glibc and clang-16 on musl libc respectively.

Posted in 2023 GSoC, Modern C Package Porting

Week 8 Report, Automated Gentoo System Updater

This article is a summary of all the changes made to the Automated Gentoo System Updater project during week 8 of GSoC.

Project is hosted on GitHub.

Progress on Week 8

Currently, the updater supports two methods of notifications: IRC bot and email.

The IRC bot was built using Python’s sockets library with SSL support. Although functional, it remains quite basic and encounters issues with sending out the report properly in approximately 20% of cases. The issue seems to occur during connection to irc.libera.chat servers, though the exact problem remains unclear.

In addition, there’s an option to send the report via email using SendGrid. This service was selected due to its free registration and simplicity of use because it only requires an API key.

Challenges

The initial challenge involved figuring out an effective way to send the report to the IRC chat. The program has a short 10-second buffer to ensure the message is sent properly. However, with reports that could be tens or hundreds of lines long, this process can take a bit longer. The current solution is to send a brief report that merely indicates if the update was successful. After this, the bot will ask if a more detailed report is needed.
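
To make the "brief report first, details on request" idea concrete, here is a minimal sketch. The function names and the 400-character limit are assumptions chosen for this example (IRC limits a whole protocol line to 512 bytes including the trailing CRLF, so the payload has to stay well below that); this is not the bot's actual code:

```python
import textwrap

# Conservative payload size per message. This counts characters, not bytes,
# so reports with many non-ASCII characters would need a smaller value.
IRC_MAX_PAYLOAD = 400


def brief_report(success: bool, package_count: int) -> str:
    """One-line summary that is sent right away."""
    status = "succeeded" if success else "failed"
    return f"Update {status}, {package_count} package(s) affected. Ask for details if needed."


def split_for_irc(report: str, limit: int = IRC_MAX_PAYLOAD) -> list[str]:
    """Split a detailed report into messages that each fit within the limit."""
    chunks = []
    for line in report.splitlines():
        if not line.strip():
            continue  # skip blank lines, an empty message carries nothing
        chunks.extend(textwrap.wrap(line, width=limit))
    return chunks


if __name__ == "__main__":
    print(brief_report(True, 3))
    for message in split_for_irc("first line\n\n" + "x" * 1000):
        print(len(message), message[:20])
```

The detailed chunks would only be sent after someone asks for them, which keeps the first message well inside the short time window.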

Future plans also involve setting up a local email relay using sendmail and postfix. However, this method comes with several challenges. For instance, only one MTA (mail transfer agent) can be installed, which must be reflected in the ebuild. Also, configuring an email relay on Linux systems typically involves more steps, which requires writing comprehensive documentation.

Plans for Week 9

This week, I plan to start working on the web app’s design and architecture layout.

At the same time, there are several code enhancements that need to be implemented. For instance, the current logger only covers the updater script, neglecting the parser, reporter, and notifier. Thus, it needs to be extended to cover all components of the program.
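
One way to extend logging to every component is Python's logger hierarchy: configure a single parent logger and let each module use a child of it. A minimal sketch, assuming a parent logger named gentoo_update and one child per component (these names are illustrative, not the project's actual layout):

```python
import logging


def setup_logging(level: int = logging.INFO) -> logging.Logger:
    """Configure the shared parent logger once and return it."""
    root = logging.getLogger("gentoo_update")
    if not root.handlers:  # avoid adding duplicate handlers on repeated calls
        handler = logging.StreamHandler()
        handler.setFormatter(
            logging.Formatter("%(asctime)s %(name)s %(levelname)s: %(message)s")
        )
        root.addHandler(handler)
    root.setLevel(level)
    return root


if __name__ == "__main__":
    setup_logging()
    # Each component asks for its own child logger. Records propagate to the
    # parent's handler, so no per-module configuration is needed.
    for component in ("updater", "parser", "reporter", "notifier"):
        logging.getLogger(f"gentoo_update.{component}").info("component started")
```

Because the logger name is part of the format string, every line in the log shows which component produced it.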

In terms of report formatting, the report is currently structured as a dictionary. However, it would be more beneficial to refactor it into a Python object, such as a dataclass.
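
As a rough idea of what that refactor could look like, here is a sketch using dataclasses. The class and field names are assumptions based on the report contents described in these posts, not the project's real data model:

```python
from dataclasses import asdict, dataclass, field
from typing import Optional


@dataclass
class PackageUpdate:
    name: str
    old_version: str
    new_version: str
    use_flags: list[str] = field(default_factory=list)


@dataclass
class UpdateReport:
    success: bool
    packages: list[PackageUpdate] = field(default_factory=list)
    error_type: Optional[str] = None  # e.g. "blocked packages"
    error_details: Optional[str] = None


if __name__ == "__main__":
    report = UpdateReport(
        success=True,
        packages=[PackageUpdate("app-shells/dash", "0.5.11", "0.5.12", ["static"])],
    )
    # asdict() converts back to plain dictionaries, so code that still
    # expects the old dictionary layout can be migrated step by step.
    print(asdict(report))
```

Compared to a bare dictionary, a misspelled field name fails immediately with an error instead of silently creating a new key.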

Lastly, the way gentoo_update accepts its CLI flags could be improved. Currently, either y or n must be passed to the CLI, as in:

gentoo_update --update-mode full --read-logs y --read-news y

It looks a bit cumbersome, since if the flag is present then y is already implied.
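
With argparse this is what action="store_true" is for. A minimal sketch of the improved interface; the option names are taken from the command above, while the default value and everything else are assumptions for this example:

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(prog="gentoo_update")
    # The default value is an assumption made for this sketch.
    parser.add_argument("--update-mode", default="full")
    # store_true: the option takes no value. Present means True, absent False.
    parser.add_argument("--read-logs", action="store_true")
    parser.add_argument("--read-news", action="store_true")
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args(["--update-mode", "full", "--read-logs"])
    print(args.update_mode, args.read_logs, args.read_news)
```

The call from the example would then shrink to gentoo_update --update-mode full --read-logs --read-news.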

Posted in 2023 GSoC, Automated System Updater

Week 8 report on porting Gentoo packages to modern C

Hello all,
I’m here with my week 8 report on Modern C porting of Gentoo’s packages.

Testing environments are set. I now have three environments to test my
PRs on.
– GCC 13 with glibc
– Clang-16 with llvm profile
– Clang-16 with musl-llvm profile

Much of the credit goes to juippis, who gave me the instructions for creating
custom lxc images using gentoo stage-3 tarballs. This has helped me
immensely; I can now have a testing environment ready in only a couple of
minutes and keep untouched clean environments at the ready.

Coming to my work, it has remained the same: I’ve picked up various
random bugs from the tracker list and worked on them. But I’ve come to
the realization that my work isn’t just limited to c99 or c11 porting.
It’s a mix of c99 porting, using Clang-16 as the default compiler
and perhaps using lld as the system linker as well. Which of course I’m
very happy about.

Another thing that Sam brought up is that it’s always best to inform
him whether or not I’m sending patches upstream, because it’s in
my initial proposal to send patches upstream and sometimes it’s very
important, because oftentimes the developers of the packages know
the codebase better and can offer more insight into what would be the
best practice.

Coming next week, I plan to work more on reducing the number of bugs on the
tracker, mainly picking up bugs from the tracker and sending patches for them.

Also, I’ll work with Sam and Joonas on my already submitted patches as they
have started to review my PRs. Not to mention I have to take care of
sending patches upstream whenever possible, as Sam mentioned.

Till then, see ya!

Posted in 2023 GSoC, Modern C Package Porting

Week 6+7 Report, Automated Gentoo System Updater

Progress on Weeks 6 + 7

These 2 weeks were spent on the parser and the reporter. During this time, I’ve added many features to them, but there are still many more things left to be done. Due to the limited time of GSoC I will implement additional features after the program ends.

Here is a list of features that were implemented so far:

  • If the update was successful, report will show:
    • updated package names
    • package versions in the format “old -> new”
    • USE flags of those packages
    • disk usage before and after the update
  • If the emerge pretend has failed, report will show:
    • error type (for now only supports ‘blocked packages’ error)
    • error details (for blocked package it will show problematic packages)
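
To illustrate the two report shapes listed above, here is a small rendering sketch. The function names and the exact text layout are made up for this example and do not show the reporter's real output:

```python
def render_success(packages, disk_before: str, disk_after: str) -> str:
    """packages: iterable of (name, old_version, new_version, use_flags)."""
    lines = ["Update succeeded. Updated packages:"]
    for name, old, new, use_flags in packages:
        flags = " ".join(use_flags) if use_flags else "(none)"
        lines.append(f"  {name}: {old} -> {new}  USE: {flags}")
    lines.append(f"Disk usage: {disk_before} -> {disk_after}")
    return "\n".join(lines)


def render_blocked(blockers) -> str:
    """blockers: iterable of package names involved in the block."""
    lines = ["Update failed. Error type: blocked packages", "Problematic packages:"]
    lines.extend(f"  {name}" for name in blockers)
    return "\n".join(lines)


if __name__ == "__main__":
    print(render_success([("app-shells/dash", "0.5.11", "0.5.12", ["static"])], "10G", "11G"))
    print(render_blocked(["net-print/cups"]))
```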

And here are the errors that I plan to add support for in the future:

  • Mutually exclusive USE flags
  • Errors due to licenses
  • OOM
  • Not enough disk space
  • Network issues during the update

I also had a good idea about how to go about testing gentoo_update. Basically, I can set up a CI/CD pipeline that will detect newly published stage3 Docker containers, and whenever a new container is detected, run gentoo_update on it and check the output. Eventually, it will run into some errors that I will then use to improve gentoo_update. The pipeline itself can be set up with Jenkins, for example. This idea is a bit out of scope for my proposal, so I will work on it after GSoC 🙂

Challenges

While trying to find ways to generate errors in Portage I realized how hard it is to break Portage intentionally, and it’s almost impossible without deliberately creating faulty ebuilds and USE flags, which of course is a good thing!

So far I only managed to test out ‘blocked package’ error, here is how it was done:

  • Create a simple Bash script that prints out some ASCII art (prints an owl, in my case);
  • Set up a local repository, and add an ebuild for this script;
  • Install the script on the system;
  • Version bump the script, for example 0.1 -> 0.2;
  • Then add RDEPEND="!net-print/cups" to the ebuild, which will raise an error if cups is installed. cups is just a package that was installed on my system, any other package will do;
  • Run update @world and look how Portage starts to throw errors 🙂

What sounds like a couple simple steps actually took me about 2 days to figure out… Although challenging, it actually is very fun to find ways to break things 🙂

Plans for Week 8

Week 8 will be dedicated to writing code to send reports via emails and IRC chats. But before that, I need to do some more work to improve integration between the updater, parser and reporter.

Ideally, I also need to spend some more time on error catching and improving overall stability of gentoo_update.

Posted in 2023 GSoC, Automated System Updater

Week 7 report on porting Gentoo packages to modern C

Hello all.

First of all, I would like to give the good news of passing the Mid-Term
evaluation. My mentor/s have provided me some valuable advice that I
would like to incorporate in my work in the following weeks ahead.

Coming to my work, as I said in my last report, I picked up where I left
off before week 6. I sent in some patches (none upstream, unfortunately).
This week I mainly worked with Juippis (my other mentor) on reviews of
my already submitted PRs. We came across some challenges while doing so,
namely reproduction of a bug: juippis and sam_ were able
to reproduce the bug, but I couldn’t. It was most probably due to
compiler-rt. I still have to send in a proper fix for that bug. Which
brings us to the second topic of setting up a test environment. Juippis
has an excellent guide on using lxc containers for setting up a test
environment.

So there’s that.

In the coming weeks, the priority will be setting up the test environment with
Juippis’s guide so we don’t have to face the aforementioned scenarios
again.

Apart from that, I mainly want to do three things:
– stick to my proposal and work on -Wstrict-prototypes,
– work on bringing down the number of bugs on the tracker; there are still
quite a lot, and oftentimes more keep getting added, and
– work more with mentors on code/PR reviews.

Till then, see yah!

Posted in 2023 GSoC, Modern C Package Porting

Week 6 report on porting Gentoo packages to modern C

Hello all,

This week I couldn’t do much as I caught a bit of a cold and fell ill. But
I’m doing much better now and will begin working again starting this
week. I plan on making up for last week’s work in the coming week and,
in case there is still remaining work, I will make it up in the
extra/emergency week at the end. This was also the reason I could not update my blog last week.

So the plan for the coming week is to pick up where I left off
and finish any remaining work. Then start with
-Wdeprecated-non-prototype and -Wstrict-prototypes, which aligns with my
proposal timeline for weeks 7 to 10.

Our evaluation for midterm opens this week (today 10th of July) and will
be open for the entire week. I’m super nervous and excited about it.
*fingers crossed*

Till then, see ya. Take care.

Posted in 2023 GSoC, Modern C Package Porting

Week 5 report on porting Gentoo packages to modern C

I’m writing this report on 13th July, almost two weeks late. See the week 6 report for the reason: I had fallen a bit sick.

Hello all, this is my week 5’s report for my project “Porting Gentoo’s
packages to Modern C”.

First things first, we now have MATE desktop and related packages
ported. Not only just in Modern C, but it’s now compatible with
gettext-0.22, too [1]. So if you are using llvm-musl or the llvm profile
you can use MATE desktop.

While fixing MATE settings-daemon I’ve learned two very valuable
lessons (thanks to my mentor Sam):
– Getting feedback from upstream devs is important
– Casting variables to fix incompatible-function-pointer-type errors is
not always correct; it might only temporarily fix the problem/silence
the warning.
I’m going to keep these two points in mind for the upcoming
weeks.

Apart from the MATE work, I mostly adhered to my proposal timeline and
fixed more -Wimplicit-function-declaration bugs, [2][3] and more.

Strictly according to my proposal, the coming two weeks (week 6 and 7)
are to be focused on -Wdeprecated-non-prototype. But in my experience
till now there are not many bugs of this type. I’ll obviously keep an
eye out for these bug types but I’ll most likely be solving more of
-Wimplicit-function-declaration or -Wincompatible-function-pointer-types
type of bugs, as they seem to dominate the bug list/tracker.

Our midterm evaluation is also coming up, opens 10th this month, hence
working towards that (mainly communicating with my mentors on any things
they expect of me or would like to see/get done before the evaluation).
Needless to say super excited about that.

Till then, see ya!

[1]: https://github.com/mate-desktop/mate-panel/pull/1375
[2]: https://github.com/gentoo/gentoo/pull/31671
[3]: https://github.com/gentoo/gentoo/pull/31670

Posted in 2023 GSoC, Modern C Package Porting

Week 5 – Modernization of Portage

Hey everyone, this week was a fun and satisfying one. Let’s get into it.

Context

I wanted to work on the dependency resolution system of portage. It is a scary codebase, so Sam suggested I start with bugs related to the dependency resolution system. We decided on bug 528836. The bug is relatively simple (though it took me a relatively long time to understand). In gentoo, there are virtual packages. If multiple packages can provide the same functionality / library, then there is a virtual package that depends on any one of them. Any package needing that functionality / library can depend on the virtual package and not worry about the specifics. The problem in this bug is that a package has (let’s say) two dependencies, where one depends on a concrete package and the other depends on the corresponding virtual package. Now portage tries to emerge both sides of the virtual package, which leads to conflicts. Ideally, the ebuild maintainers should have made both dependencies depend on the virtual package rather than the actual one, but nonetheless portage should have been able to figure it out. The first task was to reproduce the bug in a gentoo system.
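
A toy model of the situation, to show why the order of decisions matters. This is not how portage's resolver works internally; the package names and the choose_provider helper are invented for this sketch:

```python
def choose_provider(providers: list[str], already_selected: set[str]) -> str:
    """Pick a provider for a virtual package.

    Prefer a provider that is already being pulled in by another dependency
    and only fall back to the first listed provider otherwise.
    """
    for candidate in providers:
        if candidate in already_selected:
            return candidate
    return providers[0]


if __name__ == "__main__":
    # The hypothetical virtual can be satisfied by either implementation,
    # and the two implementations cannot be installed side by side.
    providers = ["impl-b", "impl-a"]

    # One dependency needs impl-a directly, the other needs the virtual.
    selected = {"impl-a"}

    naive = providers[0]  # ignores what is already selected
    aware = choose_provider(providers, selected)
    print("naive choice:", naive)  # would pull in both implementations
    print("aware choice:", aware)  # reuses the one already selected
```

The naive choice ends up with both implementations in the install set, which is the kind of conflict described in the bug.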

There were several hurdles along the way. The bug was very old (from 2014). It is not reproducible in the current state of portage or the ebuild repository. Luckily, we got an old stage3 from Andrey, one of the mentors. Gentoo moved to git from CVS only recently, so we had to graft the historical gentoo repo into the current repo to restore it to an older state. The major hurdle was my inexperience with ebuilds. Though I have been using gentoo for a few years, I never bothered to create ebuilds or study them. So when I had to look through the ebuilds to figure out what was going on, it was a bit overwhelming. After reading through pages and pages of man pages, PMS and the gentoo wiki, and with a lot of help from Sam, we were able to reproduce the bug.

Writing a test for the bug

Sam suggested that I write a test for portage that would expose this behaviour. It had its own hurdles, but finally we were able to do it. It is not integrated into portage, but it can be found here. I really want to thank Sam again for his patience towards me. I sometimes ask the silliest of things, but he explains them with a smile. I could not be more thankful.

The next step will be trying to fix the bug in portage or declaring the ebuild invalid (which is reasonable). We will also work towards integrating the test into portage. Sam will have to decide on that. I will keep you posted whatever happens.

Unreachable code

I also sent a pull request removing some unreachable / legacy code. At the time of submitting the pull request, GitHub’s pypy37 runner (one of the targets portage is tested against) had some issues. The tests will be rerun and the commit will be merged into master soon.

Next week

The midterm evaluations are coming up. The next week will go towards getting ready for that, fixing the above bug and maybe a few more type annotations. I’ll see you all next week.

Posted in 2023 GSoC, Modernization of Portage with C++

Weekly report 5, LLVM libc

Hey! This week I’ve spent most of my time figuring out how to bootstrap
an LLVM cross compiler toolchain targeting a hosted Linux environment. I
have also resolved the wint_t issue from last week. Both of these things
took way longer than expected, but I also learned a lot more than
expected so it was worth it.

I’ll start with discussing the LLVM cross compiler setup. My initial
idea on how to bootstrap a toolchain was to simply specify LLVM_TARGETS
for the target architecture when building LLVM, then compile compiler-rt
for the target triple, and then the libc. This is indeed true, but the official
cross compilation instructions tell you to specify a sysroot where the
libc is already built, and that’s not possible when bootstrapping from
scratch.

As the compiler-rt cross compilation documentation only tells you to use
an already set up sysroot, which I didn’t have, I had to try my way
forward. This actually took me a few days, and I did things like trying
to bootstrap with a barebones build of compiler-rt, mixing in some GCC
things, and a lot of hacks. I then studied
mussel for a while until finding out about
headers-only “builds” for glibc and musl. It turns out that the only
thing compiler-rt needs the sysroot for is libc headers, and those can
be generated without a functioning compiler for both musl and
glibc. This is done by setting CC=true to pass all the configure tests
and then running ‘make install-headers’ (for musl) into a temporary install
directory to generate the headers needed for bootstrapping
compiler-rt.

export CC=true
./configure \
--target=${CTARGET} \
--prefix="${MUSL_HEADERS}/usr" \
--syslibdir="${MUSL_HEADERS}/lib" \
--disable-gcc-wrapper
make install-headers

After this is done you can pass the following CFLAGS:
‘-nostdinc -I*path to temporary musl install dir*/usr/include’ to the
compiler-rt build.

-DCMAKE_ASM_COMPILER_TARGET="${CTARGET}"
-DCMAKE_C_COMPILER_TARGET="${CTARGET}"
-DCMAKE_C_COMPILER_WORKS=1
-DCMAKE_CXX_COMPILER_WORKS=1
-DCMAKE_C_FLAGS="--target=${CTARGET} -isystem ${MUSL_HEADERS}/usr/include -nostdinc -v"

After this is done you can export
LIBCC="${COMPILER_RT_BUILDDIR}"/lib/linux/libclang_rt.builtins-aarch64.a
to the musl build to use the previously built compiler-rt builtins for
the actual libc build.

To then build actual binaries targeting the newly built libc you can do something like this:

clang --target="${CTARGET}" main.c -c -nostdinc -nostdlib -I"${MUSL_HEADERS}"/usr/include -v

ld.lld -static main.o \
"${COMPILER_RT_BUILDDIR}"/lib/linux/libclang_rt.builtins-aarch64.a \
"${MUSLLIB}"/crti.o "${MUSLLIB}"/crt1.o "${MUSLLIB}"/crtn.o "${MUSLLIB}"/libc.a

Running the binary with qemu-user:
$ cat /etc/portage/package.use/qemu
> app-emulation/qemu static-user QEMU_USER_TARGETS: aarch64
$ emerge qemu
$ qemu-aarch64 a.out
> hello, world

Afterwards it feels pretty obvious that the headers were needed, and I
could’ve probably figured it out a lot sooner by for example examining
crossdev a bit closer. But I am happy I did play with this since I
learned things like what the different runtime libraries did, what’s
needed to link a binary, and a lot more. Here’s a complete script that
does everything:
gist.
Next I will integrate this into crossdev. Another thing I need to think
about is how to do a header-only install of LLVM libc. Currently the
headers get generated with libc-hdrgen and installed with the
install-libc target. Probably this can be done by packaging a standalone
libc-hdrgen binary and using that for bootstrapping. I could also
temporarily “cheat” and do a compiler-rt+libc build to get going.

Next I also figured out what the wint_t problem is and why it occurs when
building LLVM libc in fullbuild mode on a musl system (see last week’s
report). The problem here is that on a musl system, /usr/include will be
first in the include path, regardless of CFLAGS="-ffreestanding". (For
C++ they will be after the standard C++ headers and then
#include_next‘ed, so no difference.) I thought at first that this was a
bug since you don’t want to target an environment where the libc is
available (hosted environment) when building in freestanding
mode. However, after asking in #musl IRC, it turns out this is actually fine since the
musl headers respect the __STDC_HOSTED__ macro that gets set when using
-ffreestanding, and there is a clear standard specifying what should be
available in a freestanding environment.

The problem arises because LLVM libc assumes that the Clang headers will
be used when passing -ffreestanding, and therefore relies on Clang header
internals. Specifically the __need_wint_t macro for stddef.h which is
in no way standardized and only an implementation detail. My thought
here was that instead of relying on CFLAGS="-ffreestanding" to use the
Clang headers, we should figure out another way, using the build
system, to force Clang headers. Another way to solve this would be
to also rely on musl internals (__NEED_wint_t for stddef.h).

After discussing this we agreed to first actually get the libc built,
and then decide on a strategy once we know how many times similar issues
pop up. If there are only a few instances of this then more #defines are
fine, else we could do something like the gcc buildbot target. My only
worry with this is that it will keep biting us in the ass as more things
get added.
https://github.com/llvm/llvm-project/issues/63510

Another thing worth noting is that my ‘USE=emacs llvm-common’ PR inspired a
new elisp-common.eclass function called elisp-make-site-file
https://github.com/gentoo/gentoo/commit/a4e8704d22916a96725e0ef819d912ae82270d28
because mgorny thought that my sitefiles were a waste of inodes :D.
https://github.com/gentoo/gentoo/pull/31635. I also got my
__unix__->__linux__ CL merged into LLVM. I do however have some worries
that this could’ve broken some things on macOS as seen in my comment:

> done! I think there should be something addressing pthread_once_t and
> once_flag for other Unix platforms though. Both of these would've
> previously, before this commit, been valid on macOS, as __unix__ is
> defined and __futex_word is just an aligned 32 bit int. No internal
> Linux headers were used here before that would've caused an error.

https://reviews.llvm.org/D153729

Next week I will try to make Crossdev be able to use LLVM/Clang by
integrating the things I did this week.

Posted in Bootstapping LLVM