Bonding period 1 – Modernization of Portage

Hello everyone,

I am Berin Aniesh, one of the four contributors for Gentoo through GSoC 2023. You can read more about us here. In this post, I want to talk about the project I am working on and the first two weeks of the community bonding period.

Title and Project Scope

The title of the project is “Modernization of portage codebase by refactoring and rewriting performance critical parts as C++ extensions”.

Portage is probably the most versatile package manager on the planet, and this has been its boon and bane at the same time. This versatility, combined with Portage’s feature richness, has made it useful not only to Gentoo users but also to projects like Chromium OS, Flatcar Container Linux and numerous other downstream projects. On Linux, it can support any underlying stack (e.g. glibc vs musl, hardened systems, systemd vs OpenRC, etc.). Besides Linux, it can also run on BSD and macOS. It supports compile-time feature selection through USE flags. Taking all these factors into account, together with the fact that Portage supports numerous architectures, seeing Portage perform its duties as designed is a huge feat of engineering. And above all, everything in Portage is written by passionate volunteers. If anything, understanding the landscape of Gentoo has given me huge respect for the Gentoo developers and the community.

Community

My time in the community bonding period was spent, well, bonding with the community, understanding the spread of Gentoo and studying the codebase.

The Gentoo community is one of the most knowledgeable and welcoming of the Linux communities. Gentoo differs from other Linux distros in that it urges the community to contribute. This, combined with the patience needed to run Gentoo as a daily driver, has brought together a community that is mature, knowledgeable and composed. I learnt so many things, trivial and non-trivial, just by hanging around in #gentoo-chat on libera.chat. If anyone reading this has not participated in #gentoo-chat, I really recommend that you have a look around in your free time.

Setting up an IRC Bouncer

Though mailing lists exist, the main mode of communication for most things Gentoo is IRC. There is just one small problem: IRC doesn’t work like other modern chat applications, as there is no centralized server to save our messages. Someone can’t drop us a message and expect it to get delivered all the time. The recipient has to be online at the time of sending to receive the message. This is not feasible, so we have to set up something called an IRC bouncer, which stays online 24/7, receives messages on our behalf, stores them and relays them to us when we are online. We have two options: find a managed bouncer service or self-host one. You guessed it, I of course went with self-hosting.

Setting up the bouncer was not so easy, as it is a combination of many technologies that work together. There are so many options to choose from, and each one only documents its own very specific job. I had to learn about IRC servers and clients and how they interoperate, websockets, SSL, nginx and much more. In the end, with a lot of help from #gentoo-chat and my friend catcream, I did manage to set up a bouncer on Linode with soju from emersion and his IRC client, gamja.

Though the process was painful, I think it was worth the effort. Chatting through the bouncer I set up is indeed very satisfying.

Getting familiar with the codebase

Portage is a huge codebase with a lot of intertwined functions, and getting familiar with nearly twenty years of work is no easy task. The mentors understood this and helped in every way possible. Besides reading the codebase, my mentor Mr. Sam James suggested that I try fixing a few bugs so that I get familiar with the code. I’ll be searching Bugzilla for simpler bugs that are easy to fix, and fixing them. That is the plan for the next two weeks.

Summary

So, in the first two weeks of the community bonding period,

  • I got familiar with the community and made a lot of friends
  • I am getting to know about the places where portage is used
  • I am getting familiar with the portage codebase.

That marks the end of two weeks into GSoC. I have been thoroughly enjoying my time at Gentoo. I am thankful for the opportunity and I hope to make some meaningful contribution to Gentoo in the upcoming weeks.

Posted in 2023 GSoC, Modernization of Portage with C++

Week 3 report on porting Gentoo packages to modern C

Hello all,

I’m here with my week 3 report for Modern C porting of Gentoo’s
packages. For this week I diverted from my initial idea a bit and
focused on the “C++17 does not allow register storage class specifier”
type of error. Basically, C++11 deprecated the register storage class
specifier and it was completely removed in C++17, so C++ packages that
use the register keyword fail with this kind of error. The general fix
is either to remove the keyword or to replace it with *int* where
applicable.
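
As a rough illustration of the "remove the keyword" approach, here is a minimal Python sketch of the kind of textual substitution that a sed one-liner performs. The function name strip_register and the sample snippet are made up for this example, and a plain regex like this would also touch comments and string literals, so real patches still need to be reviewed by hand.

```python
import re

# Match the standalone word "register" plus the whitespace that follows it.
# Identifiers such as register_fn or unregister are not matched.
_REGISTER = re.compile(r"\bregister\s+")

def strip_register(source: str) -> str:
    """Drop the 'register' storage class specifier from C/C++ source text."""
    return _REGISTER.sub("", source)

# Hypothetical input line, not taken from any real package.
sample = "for (register int i = 0; i < n; ++i) { register char *p = buf; }"
print(strip_register(sample))
```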

For example, in this PR [1] for the fox toolkit, I’m using sed to remove
register keywords from various folders of the source. Whenever possible
I’m sending patches upstream as well; for example, I’ve sent this [2]
patch upstream while also applying it to the Gentoo tree.

Not to mention, I’ve been sticking to my proposal timeline as well and
patching -Wimplicit-int errors [3][4][5] (initially I planned on taking
up this particular bug type in the latter half of the week). But
sometimes the error messages are not that straightforward and can be
misleading. For example, in this PR [3], although the bug report says
that it might be a -Wimplicit-int bug, it was rather just a wrong header
inclusion: the data type off_t was not being found, so it was assumed
to be the parameter.

Next:
As per my proposal, I’m going to stick to -Wimplicit-int for the
remaining days of the week and then slowly move to
-Wimplicit-function-declaration. I did come across some
-Wimplicit-function-declaration type of bugs, which I’ve patched and
created PRs for [6][7].

I’ve also come across two main language-specific blockers with
clang-16, namely Fortran and Vala. Lots of packages from sci-libs
depend on the package sci-libs/lapack, which is failing to build
with clang-16. As a result, I have to keep some of the sci-libs bugs on
hold for now. On the Vala front, there is not much we (as in Gentoo
people/devs) can do; this is a main blocker for some GNOME packages.

That’s it for this week, hopefully, I’ll be able to patch more packages
in the coming weeks.

[1]: https://github.com/gentoo/gentoo/pull/31357/files#diff-aaf358e6aacd565fcdd354d1ff87b08b1b2c679aaf978af900caa00c70a7978eR46
[2]: https://github.com/VirtualGL/virtualgl/commit/441c4e77d8e33edb28d3015b573e4e45bb13d684
[3]: https://github.com/gentoo/gentoo/pull/31520
[4]: https://github.com/gentoo/gentoo/pull/31464
[5]: https://github.com/gentoo/gentoo/pull/31411
[6]: https://github.com/gentoo/gentoo/pull/31476
[7]: https://github.com/gentoo/gentoo/pull/31513

Posted in 2023 GSoC, Modern C Package Porting

Week 3 Report, Automated Gentoo System Updater

This article is a summary of all the changes made to the Automated Gentoo System Updater
project during week 3 of GSoC’23.

Project is hosted on Github.

Progress on Week 3

Project finally received some Github stars!

It has also received 2 issues (#7 and #8). In #7 someone suggested that update.sh should not be installed in the PATH, and that only gentoo-update be exposed as the entry point. #8 asked whether the program will resolve circular dependencies. I was more than happy to solve/answer both issues, and hope to see more in the near future!

The ebuild that I submitted to the GURU repository apparently didn’t pass some CI tests and generated a bunch of errors, which I received by email. Luckily for me, some nice maintainers found and fixed the issues, and those problems are solved now. More details are in bugs 908307 and 908308. I also received some recommendations in GURU’s IRC chat about how to avoid similar issues in the future.

I have further improved the updater program code, here is the changelog:

  • remove updater.sh from PATH
  • Read PORTAGE_LOGDIR variable from make.conf and use it to store logs.
    By default it will use /var/log/portage/gentoo_update
  • Before running needrestart, eclean or revdep-rebuild check if it’s installed, print
    error message if not
  • Add type hints to Python functions and methods
  • Change set -e to set -euo pipefail to improve stability of Bash scripts
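
Two of the changelog items above (reading PORTAGE_LOGDIR and checking for optional tools) can be sketched roughly as follows. This is not the updater's actual code: the function names are hypothetical, the make.conf parsing is deliberately simplistic (it only looks for a plain PORTAGE_LOGDIR="..." line and does no variable expansion), and falling back to the default path from the changelog when the variable is absent is an assumption.

```python
import re
import shutil

# Default location named in the changelog above.
DEFAULT_LOGDIR = "/var/log/portage/gentoo_update"

def read_logdir(make_conf_text: str) -> str:
    """Return PORTAGE_LOGDIR from make.conf-style text, or the default."""
    for line in make_conf_text.splitlines():
        # Commented-out lines start with '#' and therefore do not match.
        m = re.match(r'\s*PORTAGE_LOGDIR\s*=\s*"?([^"#\s]+)"?', line)
        if m:
            return m.group(1)
    return DEFAULT_LOGDIR

def missing_tools(tools):
    """Return the tools from the list that are not found in PATH."""
    return [t for t in tools if shutil.which(t) is None]

# Hypothetical make.conf content for illustration.
conf = 'CFLAGS="-O2 -pipe"\nPORTAGE_LOGDIR="/var/log/portage"\n'
print(read_logdir(conf))
for tool in missing_tools(["needrestart", "eclean", "revdep-rebuild"]):
    print(f"error: {tool} is not installed, skipping")
```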

The ebuild also received some improvements. It now defines 2 USE flags that install optional dependencies.

Lastly, I started coding the parser. Parser’s end goal is to read the log file created by the updater and to create an easy-to-read report that will briefly describe what changes were made on the system.

Oh, and I finally learned how to post blogs on https://blogs.gentoo.org/gsoc, it wasn’t that
hard after all.

Challenges

I found it very hard to debug ebuild issues. The documentation is actually very helpful, but it’s also gigantic and takes a lot of time to go through. Thankfully the GURU team came through to solve the issues I was having.

Apart from that, I am thinking about ways to automate version bumping the updater in the GURU repository.

Here are the steps I am taking now to version bump the ebuild:

  • Push a tag in the Github repository
  • Sync GURU overlay
  • Modify ebuild, usually just version bump, but this week I also added USE flags
  • Run test with -> ebuild gentoo_update-0.1.5.ebuild test
  • Update Manifest if tests were passed -> ebuild gentoo_update-0.1.5.ebuild manifest
  • Commit changes to dev branch

I think all of these steps can be automated with Github Actions.
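
As a small starting point for that automation, the two ebuild commands from the list above can be generated from the version number. This is only a sketch: the function name bump_commands and the idea of deriving the file name from a package name and version are illustrative, while the commands themselves are the ones quoted in the steps.

```python
def bump_commands(version: str, package: str = "gentoo_update") -> list[list[str]]:
    """Build the ebuild commands from the steps above for a given version."""
    ebuild_file = f"{package}-{version}.ebuild"
    return [
        ["ebuild", ebuild_file, "test"],      # run the test phase first
        ["ebuild", ebuild_file, "manifest"],  # then regenerate the Manifest
    ]

for cmd in bump_commands("0.1.5"):
    print(" ".join(cmd))
```

A workflow could pass each list to subprocess.run(cmd, check=True), so that a failing test stops the run before the Manifest is updated.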

Plans for Week 4

The first thing to do is to write a post on Gentoo Forums about the updater. This task is already a bit delayed because of technical issues with the updater and the ebuild, but now everything is ready 🦾😊

To maximize the usefulness of gentoo_update it’s very important to get some feedback from community as soon as possible, and it will also be nice to have some more Github issues to work on. I’m planning to post on forums before next Tuesday.

Then, there of course is the parser. I plan to add following features:

  • Split log into multiple sections, e.g. updated programs, what needs restarting etc.
  • Summarize sections and create a report
  • If updater exits with error, create a separate error report
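
To make the first two points concrete, here is a minimal sketch of splitting a log into sections and summarizing them. The log format is not described in this post, so the "### " section marker, the section names and the sample entries are all invented for illustration; the real parser may look quite different.

```python
def split_sections(log_text: str, marker: str = "### ") -> dict:
    """Group log lines under the most recent marker line (hypothetical format)."""
    sections = {"preamble": []}
    current = "preamble"
    for line in log_text.splitlines():
        if line.startswith(marker):
            # A marker line starts a new section named after its text.
            current = line[len(marker):].strip()
            sections.setdefault(current, [])
        elif line.strip():
            sections[current].append(line)
    return sections

def summarize(sections: dict) -> str:
    """One line per non-empty section: its name and number of entries."""
    return "\n".join(
        f"{name}: {len(lines)} line(s)" for name, lines in sections.items() if lines
    )

# Made-up log content, not real updater output.
log = "### updated\nexample/foo-1.0\nexample/bar-2.0\n### restart\nexample-daemon\n"
print(summarize(split_sections(log)))
```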

If there is free time left, it would be great to work on a Github Actions workflow to automate version bumping the ebuild.

Posted in 2023 GSoC, Automated System Updater

Weekly report 2, LLVM libc

Hi! This week I’ve continued my work on getting Python to run. It has
mostly involved defining a lot of missing functions and types for
Python.

These are mostly taken from musl libc, but some things are also just
implemented using a no op.

In LLVM libc there are currently some headers without all the needed
declarations available. That makes the “HAVE_*_H” configuration tests
pass, but the build then fails later. For some of these cases I simply
did ‘#undef’ and hoped the functionality wasn’t needed.

My current plan is to just get Python to build, and then go back to
fix things properly, within reason.

I also found a bug in LLVM libc where the TableGen specifications for
truncate and ftruncate had mixed up argument types. Because the
libc headers are generated from these specifications, the unistd.h
header had incorrect declarations for these functions. As this did not
add any new things, and was easy to implement, I got it committed into
upstream LLVM instead of keeping it locally like the other fixes. Sam
suggested that if I get stuck on something and want to do something
else for a while, I could go through the spec files and look for errors
similar to this (truncate commit on phabricator).
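
For reference, POSIX declares these as int truncate(const char *path, off_t length) and int ftruncate(int fildes, off_t length): one takes a path name, the other an open file descriptor. Python's os module exposes the same pair, which makes the difference easy to see. The post does not say exactly which argument types were mixed up in the spec, so the sketch below only illustrates the two signatures.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "demo.bin")
    with open(path, "wb") as f:
        f.write(b"x" * 100)

    # truncate(path, length): the first argument is a path name.
    os.truncate(path, 40)
    size_after_truncate = os.path.getsize(path)

    # ftruncate(fd, length): the first argument is an open file descriptor.
    fd = os.open(path, os.O_RDWR)
    try:
        os.ftruncate(fd, 10)
    finally:
        os.close(fd)
    size_after_ftruncate = os.path.getsize(path)

print(size_after_truncate, size_after_ftruncate)
```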

At the end of the week I went back to fix an error I got while setting
up LLVM libc, that being an error related to PRId64 format string. Sam
suggested that I could use this to learn how to use cvise, which I
will do next week.

Next week I aim to be finished with Python, and in that case I’ll
continue with getting Crossdev to work with LLVM by making it able to
compile compiler-rt for non-host triples.

Posted in Bootstapping LLVM

Week 2 report on porting Gentoo packages to modern C

This is my week 2 report for my SoC 2023 project “Modern C porting of
Gentoo packages” at Gentoo Linux.

Current:
– I’ve stuck to my proposal and mainly worked on the
“-Wincompatible-function-pointer-types” bugs. Honestly, nothing very
interesting happened.
– I was not able to work for 2 days due to some personal health issues; I
plan on making up for them in the following weeks/days.
– MUSL testing environment (chroot) is set up and bugs are being tested
against it. There are some bugs that still need improvements/fixes on
musl.
– Got more of my bugs reviewed by my mentor/s, though I still need to work
on the requested updates.

Next:
– While the “-Wincompatible-function-pointer-types” bugs are not
completely removed from the bug list, I do plan on working on
different kinds of bugs in the coming weeks, while also trying to keep
up with the aforementioned bug type.
– For the first half of the coming two weeks the plan is to work on
“-Wimplicit-function-declaration” type of bugs, and “-Wimplicit-int” in
the latter half. To be honest, I didn’t see many bugs of the latter type
in the bug list, so if there are fewer I can dedicate some of the
time to the “-Wincompatible-function-pointer-types” bugs.
– Since I have the musl testing environment up and running, I plan on
testing/patching most of the bugs in the musl environment, especially the
ones that were found in the musl-clang environment.

That is it. Hopefully, I’ll come across something interesting for
people reading here.

Posted in Modern C Package Porting

Weekly report 1, LLVM libc

Hey! I had to start GSoC on Sunday last week due to school, and I didn’t
think that I’d write a weekly report for the first week, but I decided to
do it anyway.

My plan for week 1 was:
>This week I will set up a LLVM toolchain and sysroot for compiling
>programs targeting LLVM libc. I will also start setting up a
>“llvm-libc/Linux from Scratch” chroot.

Because I played with LLVM libc before last week I had already completed this
goal. Going forward I will only work in the sysroot until setting up
crossdev because it’s simple and gives me everything I need to fix
dependencies like Python.

So far the project has been going pretty smoothly, but I’ve also run
into some issues which I will comment on.

The first issue was regarding SSP when setting up the LLVM
toolchain. LLVM libc currently does not support stack smashing
protection, and somehow compiled binaries automatically want it if the
toolchain was built with SSP enabled, even when compiling with
-fno-stack-protector. Probably this has something to do with internal
libraries getting built with it. I spent quite a bit of time on this
because I forgot CXXFLAGS was a thing and only set CFLAGS, thinking that
something else was causing it :). This works out of the box on some
other distributions, and is therefore not covered in the setup docs;
it shows up here because Gentoo enables SSP in the clang config files.

(See:
https://blogs.gentoo.org/mgorny/2022/10/07/clang-in-gentoo-now-sets-default-runtimes-via-config-file/)

I then moved on to work on Python. When I talked with LLVM libc
developers about this project they told me that the biggest obstacle for
Python would be the missing libm functions, so I decided to use Julia’s
openlibm instead of the built in one.

> I’d guess that python wouldn’t quite work yet since we don’t have all
> of the double precision math functions yet, though you might be able
> to fudge it by creating entrypoints that just call the single
> precision versions.

Openlibm just compiled and worked out of the box with Python by
configuring with --with-libm=*libopenlibm.a*, and then substituting
math.h for openlibm_math.h.

Next week I will mostly work on Python, because libm isn’t the only
issue. Things like fileno, wide strings and pthread_cond are used in
Python and are not in LLVM libc yet. posixsource.c is particularly annoying.
Yes, I have a time machine.


catcream

Posted in Bootstapping LLVM

Week 1 report on porting Gentoo packages to modern C

Hello all,

This is my week 1 report for my project “Modern C porting of Gentoo
packages” as a Google summer of code student at Gentoo Linux foundation.
Some of you might recognize me from last year, yep this is my second
time (and unfortunately last).

I’ll try to divide my report into two sections: current, where
I’ll discuss the current status, and next, where I’ll discuss what
I’m going to do next.

Current:
Getting to the report itself, I’ve been mainly sticking to the plan and
working on *-Wincompatible-function-pointer-types* bugs. The idea is to
reduce such bugs from the bug list [1] completely or as much as
possible. Since I was already somewhat familiar with the workflow,
environment, and tools for Gentoo, I started a bit early and have been
working on the aforementioned bugs during the community bonding period,
which gave me some time to set up a musl machine. This has helped me
solve some of the more specific bugs, as
*-Wincompatible-function-pointer-types* is not limited to only one
particular lib, in this case, glibc.

On the topic of keeping things on track, I have started sending some of
my patches upstream for review, and fortunately two of them
got merged [2][3] while the others are under review.

Next:
The plan moving forward is to fix more of these bugs and send patches
upstream while also waiting for reviews from upstream and Gentoo maintainers on my patches.

I’ve also started to work on masking all nss packages on musl, I’ll
keep working on that as well [4].

[1]: https://bugs.gentoo.org/870412
[2]: https://github.com/gssapi/gssproxy/commit/f6ab3193e64ecc9db4d253b6dd99991f461b6081
[3]: https://gitlab.com/gnuwget/wget2/-/commit/ca851a9a2780dada078b093d65295a440899313e
[4]: https://github.com/gentoo/gentoo/pull/31243

Posted in Modern C Package Porting

Gentoo Google Summer of Code (GSoC) for 2023

Gentoo is excited to announce that the Gentoo Google Summer of Code has accepted a group of talented contributors to participate in this year’s program. We extend our congratulations and welcome them aboard!

Google Summer of Code is a global program that provides a unique opportunity for students and young professionals to work on open-source projects under the guidance of experienced mentors.

We received a high volume of impressive applications from individuals around the world, each demonstrating their passion and skills for open-source projects. The selection process was challenging, but we are pleased to have accepted the following four contributors:

  • Alfred Persson Forsberg – IRC Handle: catcream
  • Berin Aniesh – IRC Handle: hyperedge
  • Stepan Kulikov – IRC Handle: labbrat
  • Brahmajit Das – IRC Handle: listout

Posted in 2023 GSoC

Refining ROCm Packages in Gentoo — project summary

12 weeks quickly slipped away, and I’m proud to say that the packaging quality of ROCm in Gentoo did improve during this project.

Two sets of major deliverables were achieved: new ebuilds of the ROCm-5.1.3 tool-chain that depend purely on vanilla llvm/clang, and rocm.eclass along with the ROCm-5.1.3 libraries utilizing it. Each brings one great QA improvement compared to the original ROCm packaging method.

Beyond these, I also maintained rocprofiler and rocm-opencl-runtime, bumping their versions with nontrivial changes. I discovered several bugs and talked to upstream. I also wrote ROCm wiki pages, which started my journey on the Gentoo wiki.

By writing rocm.eclass, I learnt quite a lot about eclass writing: how to design, how to balance needs and QA concerns, how to write comments and examples well, etc. I’m really grateful to those Gentoo developers who pointed out my mistakes and helped me polish my eclass.

Since I’m working on top of the Gentoo repo, my work is scattered around rather than living in a repo of my own. My major products can be seen in [0], where all my PRs to ::gentoo are located. My weekly reports can be found on the Gentoo GSoC blog.

[0] My finished PRs for gentoo during GSoC 2022

Details are as follows:

First, it’s about ROCm on vanilla llvm/clang

Originally, ROCm has its own llvm fork, which has some modifications not upstreamed yet. In the original Gentoo ROCm packaging roadmap, sys-devel/llvm-roc was introduced as the ROCm forked llvm/clang. This is the simple way, and it worked well for ROCm-only packages [1]. But it causes trouble if a large project like blender pulls in dependencies using vanilla llvm, which results in symbol collisions [2].

So, when I noticed [1] in week 1, I began my journey of porting ROCm to vanilla clang. I was very lucky, because at that time clang-14.0.5 had just been released, eliminating major obstacles for porting (previous versions more or less have bugs). After some quick hacking I succeeded, which is recorded in the week 1 report [3]. In that week I successfully built blender with hip cycles (GPU-accelerated render code written in HIP), and rendered some example projects on a Radeon RX 6700XT.

While I was thrilled with porting the ROCm tool-chain to vanilla clang, my mentor pointed out that I had carelessly brought some serious bugs into ::gentoo. In week 2, I managed to fix the bugs I created, and set up a reproducible test ground using docker, to make testing easier and cleaner and to keep such bugs from happening again. Details can be found in week 2’s report [4].

After that there was no non-trivial progress in porting to vanilla clang, only bug fixes and ebuild polishing, until I met MIOpen in the last week.

The story of debugging MIOpen assemblies

In week 12 rocm.eclass was almost in its final shape, so I began to land ROCm libraries [1] including sci-libs/miopen. ROCm libraries are usually written in “high level” languages like HIP, and dev-util/hip was already ported to vanilla clang in good shape, so there was no need to worry about compilation problems. However, MIOpen has various hand-written assemblies for JIT, which caused several test failures [5]. It was frustrating because I’m unfamiliar with AMDGPU assembly, so I was close to giving up (my mentor also suggested that I stop working on it during GSoC). Thus, I reported my problem to upstream in [5], along with my debugging attempts.

Thanks to my testing system mentioned previously, I had set up not only standard environments, but also one snapshot with full llvm/clang debug symbols. I quickly located the problem and reported it upstream via an issue, but I still didn’t know why the error was happening.

On the second day, I decided to look at the assembly and debugging results once again. This time fortune was on my side, and I discovered that the key issue is LLVM treating Y and N in metadata as boolean values, not strings (they should be kernel parameter names) [6]. I provided a fix in [7], and all tests passed on both Radeon VII and Radeon RX 6700XT. Amazing! I also mentioned how excited I was in week 12’s report [8].
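
The Y/N confusion resembles a classic YAML pitfall: under YAML 1.1-style rules, unquoted scalars such as Y, N, yes and no can resolve to booleans, while quoted ones stay strings. The sketch below assumes that this is the mechanism at play; it is not LLVM's actual parser, and resolve_scalar is a hypothetical helper written only to show why a parameter named Y would need quoting to survive as a string.

```python
# Plain scalars that YAML 1.1 resolves to booleans.
_TRUE = {"y", "Y", "yes", "Yes", "YES", "true", "True", "TRUE", "on", "On", "ON"}
_FALSE = {"n", "N", "no", "No", "NO", "false", "False", "FALSE", "off", "Off", "OFF"}

def resolve_scalar(token: str):
    """Resolve one scalar token: quoted stays a string, plain may become bool."""
    if len(token) >= 2 and token[0] == token[-1] and token[0] in "'\"":
        return token[1:-1]  # quoted: always a string
    if token in _TRUE:
        return True
    if token in _FALSE:
        return False
    return token  # any other plain scalar is kept as a string here

for tok in ["Y", "N", "'Y'", "X"]:
    print(tok, "->", repr(resolve_scalar(tok)))
```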

[1] For example, ROCm libraries in https://github.com/ROCmSoftwarePlatform
[2] https://bugs.gentoo.org/693200
[3] Week 1 Report for Refining ROCm Packages in Gentoo
[4] Week 2 Report for Refining ROCm Packages in Gentoo
[5] https://github.com/ROCmSoftwarePlatform/MIOpen/issues/1731
[6] https://github.com/ROCmSoftwarePlatform/MIOpen/issues/1731#issuecomment-1236913096
[7] https://github.com/littlewu2508/gentoo/commit/40eb81f151f43eb5d833dc7440b02f12dab04b89
[8] Week 12 Report for Refining ROCm Packages in Gentoo

The second deliverable is rocm.eclass

The most challenging part for me was writing rocm.eclass. I started writing it in week 4 [9], and finished my design in week 8 [10] (including 10 days of temporary leave). In weeks 9-12, I posted 7 revisions of rocm.eclass to the gentoo-dev mailing list [10,11], and received many helpful comments. On the Github PR [12], I also got lots of suggestions from Gentoo developers.

Eventually, I finished rocm.eclass, providing amdgpu_targets USE_EXPAND, ROCM_REQUIRED_USE, and ROCM_USE_DEP to control which gpu targets to compile, and coherency among dependencies. The eclass provides get_amdgpu_flags for src_configure and check_amdgpu for ensuring AMDGPU device accessibility in src_test. Finally, rocm.eclass is merged into ::gentoo in [13].

[9] Week 4 Report for Refining ROCm Packages in Gentoo
[10] https://archives.gentoo.org/gentoo-dev/threads/2022-08/
[11] https://archives.gentoo.org/gentoo-dev/threads/2022-09/
[12] https://github.com/gentoo/gentoo/pull/26784
[13] https://gitweb.gentoo.org/repo/gentoo.git/commit/?id=cf8a6a845b68b578772f2ae0d2703f203c6dec33

Other coding products

Merged ebuilds

rocprofiler

I have bumped dev-util/rocprofiler and its dependencies to version 5.1.3, and fixed proprietary aql profiler lib loading, so ROCm stack on Gentoo stays fully open-sourced without losing most profiling functionalities [14].

[14] https://github.com/ROCm-Developer-Tools/rocprofiler/issues/38

Unmerged ebuilds

Due to limited time and a long testing period, the ebuilds of the ROCm-5.1.3 libraries (the ones using rocm.eclass) did not get merged. They can be found in this PR.
dev-libs/rocm-opencl-runtime is a critical package because it provides opencl, and many users still use opencl for GPGPU since HIP is still new. I bumped it to 5.1.3 to match the vanilla clang tool-chain, and enabled its src_test, so users can make sure that vanilla clang isn’t breaking anything. The PR is located here.

Bug fixes

Fixing existing bugs was also a part of my GSoC. I have created various PRs and closed the corresponding bugs on Gentoo Bugzilla: #822828, #853718, #851795, #851792, #852236, #850937, #836248, #836274, #866839. Also, many bug fixes happened before new packages entered the gentoo main repo, or the bugs were found by myself in the first place, so there is no record of them on Bugzilla.

Last but not least, the wiki page

I have created 3 pages [15-17], filling important information about ROCm. I also received a lot of help from the Gentoo community, mainly focused on refining my wiki to meet the standards.

[15] https://wiki.gentoo.org/wiki/ROCm
[16] https://wiki.gentoo.org/wiki/HIP
[17] https://wiki.gentoo.org/wiki/Rocprofiler

Comparison with original plan

The original plan in the proposal also contained rocm.eclass. But it only allocated the last week for “investigation on vanilla clang”. In week 1, my mentor and I added “porting ROCm on vanilla clang” to the plan, and this became the new major deliverable. Due to the time limit, packaging high level frameworks like pytorch and tensorflow was abandoned. I only got CuPy working [18], showing rocm.eclass functionality on packages that depend on ROCm libraries.

I think the changed plan and deliverables better reflect the “Refining” in the project title, because what I did greatly improves the quality of existing ebuilds, rather than introducing more ebuilds.

[18] https://github.com/littlewu2508/gentoo/commit/3d142fa4b4ada560c053c2fd3c8c1501c82aace2

Posted in ROCm Packages

Week 12 Report for Refining ROCm Packages in Gentoo

Although this is the final week, I would like to say that it is as exciting as the first week.

I kept polishing rocm.eclass with the help of Michał and my mentor, and it is now in good shape [1]. I must admit that writing an eclass takes a beginner like me much more time than I expected. In my proposal, I left 4 weeks to finish it: 2 weeks for implementation and 2 weeks for polishing. In reality, I implemented it within 2 weeks, but polished it for 4 weeks. I introduced a lot of QA issues without being aware of them, which increased the number of review-modify cycles. During this process, I learnt a lot:

1. Always re-read the eclass thoroughly after modification, especially comments and examples. Many times I forgot that there was an example far from the change that should have been updated because a function changed its behavior.

2. Read the bash manual carefully, because proper usage of features like bash arrays can greatly simplify code.

3. Consider the maintenance difficulty of the eclass. I wrote an oddly specific `src_test`, which could cover all the cases of ROCm packages. But it’s not worth it, because specialized code should be placed in ebuilds, not in an eclass. So instead, I kept only the most common part, `check_amdgpu`, and got rid of the phase functions, which made the eclass much cleaner.

I also found some bugs and their solutions. As I mentioned in week 10’s report, I observed many test failures in sci-libs/miopen based on vanilla clang. This week, I figured out that they have 3 different causes, and I’ve provided fixes for two of them ([2, 3]). For the third issue, I’ve found its root cause [4]. I believe there will be a simple solution to this.

For the gcc-12 issues, I also came up with a brutal workaround [5]: undef the __noinline__ macro before including stdc++ headers and define it again afterwards. I also observed that clang-15 does not fix this issue as expected, and provided a MWE at [6].

I’m also writing wiki pages, filling in installation and development guides.

In this 12-week project, I proposed to deliver rocm.eclass, and packages like pytorch and tensorflow with rocm enabled. Instead, I delivered rocm.eclass as proposed, but migrated the ROCm toolchain to vanilla clang. I think porting the ROCm toolchain to vanilla clang is closer to my project title “Refining ROCm Packages” 🙂

[1] https://github.com/gentoo/gentoo/pull/26784
[2] https://github.com/littlewu2508/gentoo/commit/2bfae2e26a23d78b634a87ef4a0b3f0cc242dbc4
[3] https://github.com/littlewu2508/gentoo/commit/cd11b542aec825338ec396bce5c63bbced534e27
[4] https://github.com/ROCmSoftwarePlatform/MIOpen/issues/1731
[5] https://github.com/littlewu2508/gentoo/commit/2a49b4db336b075f2ac1fdfbc907f828105ea7e1
[6] https://github.com/llvm/llvm-project/issues/57544

Posted in ROCm Packages