Week 5 Report for RISC-V Support for Gentoo Prefix

Hello Everyone!

I hope you are all doing great. This is my report for the fifth week of Gentoo GSoC.
We finally have a completely working Prefix for the lp64d RISC-V profile, with all three stages compiling successfully. All the major pull requests have been merged.
As for the patches, the prefix profile symlink PR has been merged[1], so the RISC-V Prefix profile is now available upstream. I have keyworded prefix-toolkit[2], as it is required to complete the Stage 3 installation. The patch for binutils is in progress.
As discussed in last week’s report, there were some minor issues that needed work. The scanelf issue has been resolved. There is still an issue in the host freedom-u-sdk that doesn’t let binutils compile completely and causes errors when running ar and ranlib due to missing libraries. I am working on it and it will be resolved soon. After emerge -uDNv system, though, ar and ranlib don’t have those issues. After Stage 3 has been completely compiled, the script starts emerging all the packages again; this will be solved soon by adding patches to the script.
Now that Prefix on RISC-V is working, I have started testing packages for the EESSI overlay and Gentoo. The workflow is to post bugs using bugz, use the nattka bot to make the package list, add it to the package list, and then use tatt to test the packages. I have received the list of packages that EESSI uses and that need to be tested; I will get them tested and keyworded in the upcoming weeks. I am also willing to test Prefix with the RISC-V profile on more systems to catch issues in corner cases, if any.
To summarize, I have spent most of the time on documenting, testing, and learning about testing packages. Next week I will continue by solving the bugs, testing packages, and working on documentation. Also, the mentors have been really helpful and reachable.
Regards,

wiredhikari

Posted in RISC-V Prefix

Daily blog July 14 by Catcream

Today I have worked on “forgotten” patches and also made some new ones. I’ve also done some testing on Plasma desktop packages and a little cross compiling with distcc.


Posted in Uncategorized

Daily blog July 13 by Catcream

Today I’ve set up my router, an Espressobin v7 (aarch64), to use Gentoo musl. I thought I’d do it before I started working, but it took a looot longer than expected due to some issues, like macvlan.ko causing a kernel panic (wat). I have learnt a lot about cross compiling, using u-boot and various nftables things.

I also made some more musl-compat packages and I’ve come a long way in merging kde-apps-meta. I wrote down everything that’s not working and exactly what needs to be done. Some of my patches are in ::gentoo but not upstream, some are just local on my fork, etc, etc… This will make it easier to focus on what needs to be done going forward.

A quick summary of what’s left: apps that depend on QtWebEngine do not work yet, and there are a few applications that do not build, like filelight (that one I’ve previously made a patch for but I still got errors this time), mysql-connector-c and k3b. My goal is to fix everything in kde-apps-meta this week, except packages that depend on qtwebengine, because I’ll work on that next week.

Posted in musl KDE, Uncategorized

Daily blog July 12 by Catcream

This day has been mostly dedicated to reinstalling my musl development desktop. Throughout this project I’ve been putting various small patches into /etc/portage/patches, as well as “make install:ing” some things and forgetting about them.

Posted in musl KDE

Week 4 Report for Refining ROCm Packages in Gentoo

The fourth week of working on packaging ROCm was quite smooth. There were some bug fixes, and also major improvements to rocm.eclass.

Bug fixes cover rocBLAS and rocFFT. For rocBLAS, I backported a patch to sci-libs/rocBLAS-5.0.2-r1 and dev-util/Tensile-r1 to pass `-j N` from ${MAKEOPTS} to TensileCreateLibrary when building rocBLAS, which fixed [1]. As for rocFFT, I corrected its BDEPEND [2], added the missing sys-libs/omp dependency for omp.h [3], and made it depend on dev-util/rocm-cmake-5.0.2-r1, which does not install files to unexpected paths [4]. However, as gcc-12.1.0 landed, bugs about clang expanding the __noinline__ macro in g++-v12/bits/shared_ptr_base.h emerged [5,6]. Details can be seen in [5], and I’m working on resolving this (see PR [7]).
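
To illustrate the idea of forwarding the job count, here is a minimal Python sketch of extracting `N` from a MAKEOPTS-style string. It is not the backported patch: the helper name `jobs_from_makeopts` and the fallback value of 1 are assumptions made for this example. In real ebuilds, the `makeopts_jobs` helper from multiprocessing.eclass serves this purpose.

```python
import re


def jobs_from_makeopts(makeopts: str, default: int = 1) -> int:
    """Return the job count given by -j/--jobs in a MAKEOPTS-style string.

    Hypothetical helper for illustration only.
    """
    # Accept "-j8", "-j 8", "--jobs=8" and "--jobs 8".
    pattern = r"(?:^|\s)(?:-j\s*|--jobs[=\s]\s*)(\d+)"
    matches = re.findall(pattern, makeopts)
    # If the option is repeated, the last occurrence wins; a bare "-j"
    # without a number falls back to the default.
    return int(matches[-1]) if matches else default


# Build the "-j N" pair that would be handed on to the library generator.
jobs = jobs_from_makeopts("-j8 -l9")
extra_args = ["-j", str(jobs)]
```

The resulting `extra_args` list stands for the `-j N` pair mentioned above; how it is spliced into the real build command is left out on purpose.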

For rocm.eclass, I finished the draft of three major parts: USE_EXPAND, src_configure and src_test. I also wrote the get_amdgpu_flags function used by src_configure.

About the USE_EXPAND: I haven’t written a profiles/desc/amdgpu_targets.desc yet, so the flag descriptions are missing.
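
As a rough picture of what turning USE_EXPAND flags into compiler targets involves, the following Python sketch collects the architectures from a list of enabled USE flags. It is only an illustration and not the code of get_amdgpu_flags: the function names and the semicolon-separated output (the form CMake uses for lists) are assumptions.

```python
PREFIX = "amdgpu_targets_"


def enabled_amdgpu_targets(use_flags):
    """Collect the GPU architectures selected through AMDGPU_TARGETS flags."""
    targets = set()
    for flag in use_flags:
        # A USE_EXPAND value shows up as <variable>_<value> in lower case,
        # e.g. AMDGPU_TARGETS="gfx906" becomes amdgpu_targets_gfx906.
        if flag.startswith(PREFIX):
            targets.add(flag[len(PREFIX):])
    return sorted(targets)


def as_cmake_list(targets):
    """Join the targets in the semicolon-separated form CMake uses for lists."""
    return ";".join(targets)


flags = ["test", "amdgpu_targets_gfx906", "amdgpu_targets_gfx1030"]
target_list = as_cmake_list(enabled_amdgpu_targets(flags))
```

Unrelated flags such as `test` are ignored, so the same flag list can be passed in unfiltered.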

My latest work on rocm.eclass is located at https://github.com/littlewu2508/gentoo/blob/rocm-5.1.3/eclass/rocm.eclass. Below are its status and some questions I’d like to share:

1. Default architectures. Now that I have implemented the USE_EXPAND of AMDGPU_TARGETS, I need to specify the default value of each USE flag. The straightforward way is to enable all targets by default, but that can be **extremely** slow and disk-hungry when compiling ROCm libraries such as rocBLAS or rocFFT (expect several hours of compilation if the CPU is not powerful enough). Currently I define a variable, OFFICIAL_AMDGPU_TARGETS, taken from the ROCm installation documents [8]. Although the range of working hardware is much larger, and different components have their own support matrices, AMD promises to fully support these enterprise cards. Enterprise users can just emerge ROCm packages without setting any specific USE flag and have an out-of-the-box experience on Gentoo. Users with consumer cards can read the wiki page (covered later in my GSoC project) for instructions on setting the correct USE flag.

2. Whether to set -DSKIP_RPATH=true in mycmakeargs. Previously this was set to avoid including rpath with USE=benchmark when building ROCm packages like sci-libs/roc-* and sci-libs/hip-*. The test and benchmark executables are called “clients” (taking rocBLAS as an example, clients are programs that use its functions and link against librocblas.so). In order to run tests and benchmarks before the libraries are installed to the system, rpath is set on these executables. Gentoo does not have a src_benchmark phase, so the benchmark binaries are simply installed, and the user can run them afterwards (I actually use them in my research to tune algorithms). So there should be no rpath in the benchmark binaries, and this is achieved by setting -DSKIP_RPATH=true. However, after this the test programs cannot execute, because their rpath is also eliminated, so I have to specify LD_LIBRARY_PATH in src_test manually. Another solution is to not skip rpath but run chrpath on the affected binaries, which means maintainers have to write a dedicated src_install and remember to add a chrpath command for every new executable when bumping versions. The third solution is to patch CMakeLists.txt to include rpath only in test programs, but this method also introduces more maintenance work. What’s your opinion?

3. Detecting the AMDGPU in src_test. This blocks https://bugs.gentoo.org/817440, and I also raised questions in the bug report. Tinderbox cannot run tests on ROCm packages like rocBLAS, because there is no AMDGPU available. I implemented a detection mechanism, with one problem left: if no GPU is available, should the test fail or exit normally? Personally, I think the best solution is to detect the AMDGPU during the pretend or setup phase, and turn off the test USE flag if no GPU is available or the compiled architecture does not match the detected GPU. But is it possible to change USE flags inside ebuild phase functions?
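
The detection question in item 3 can be made concrete with a small Python sketch. It assumes that detection means checking the `/dev/kfd` device node exposed by the amdkfd driver; the post does not say how the eclass actually detects the GPU, so the path check, the function name and the three status strings are all assumptions.

```python
import os


def amdgpu_test_status(kfd_path="/dev/kfd"):
    """Classify whether GPU tests can run (hypothetical helper).

    Returns "ok", "no-device" or "no-permission".
    """
    # No device node: no usable AMDGPU (or the driver is not loaded).
    if not os.path.exists(kfd_path):
        return "no-device"
    # The node exists but the current user may not open it, for example
    # because the user is not in the group that owns it.
    if not os.access(kfd_path, os.R_OK | os.W_OK):
        return "no-permission"
    return "ok"


status = amdgpu_test_status()
```

Keeping "no-device" and "no-permission" apart lets the caller decide separately whether a missing GPU should skip the tests or fail them, which is exactly the open question above.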

Despite these issues, I managed to get a working version of rocm.eclass and used it on rocBLAS. The USE_EXPAND works successfully, and src_test can properly detect hardware and execute in both sandboxed vanilla Gentoo and non-sandboxed Gentoo Prefix. There are still things to work on in rocm.eclass:

1. ROCM_USEDEP, similar to PYTHON_USEDEP. For example, if hipBLAS is built for architectures gfx906 and gfx1030, then its dependency, rocBLAS, must also contain gfx906 and gfx1030.
2. SRC_URI.
3. A way to automatically add PORTAGE_USERNAME to the render group, so it can access the AMDGPU and perform src_test. I don’t have any clue about this yet; maybe a meta package in acct-group can do this?
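
For the first item, a ROCM_USEDEP value could plausibly look like the string PYTHON_USEDEP expands to, with one conditional USE dependency per target. The Python sketch below builds such a string; the function name and the exact output format are assumptions modelled on PYTHON_USEDEP, not the eclass implementation.

```python
def rocm_usedep(targets):
    """Build a USE-dependency string that forwards each selected target.

    Hypothetical helper; the "flag(-)?" form means: if the flag is enabled
    here, require it in the dependency too, and treat it as disabled when
    the dependency does not have the flag at all.
    """
    return ",".join(f"amdgpu_targets_{target}(-)?" for target in targets)


usedep = rocm_usedep(["gfx906", "gfx1030"])
dependency = f"sci-libs/rocBLAS[{usedep}]"
```

With the hipBLAS example from the list, `dependency` is a rocBLAS atom that requires the same gfx906 and gfx1030 flags whenever they are enabled on the depending package.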

In the coming week I’ll finish rocm.eclass as planned and send it out for early review. Meanwhile I’ll continue fixing bugs [5,6,9,10], answering questions about enabling ROCm in packages [11,12], and preparing to land ROCm-5.1.3. One of my friends is also plugging a Radeon VII into their arm64 server, and if everything goes well I can try ROCm on arm64 (according to the kernel documentation, the GPGPU driver, amdkfd, supports amd64, arm64 and ppc64) and add the ~arm64 keyword in the future.

[1] https://bugs.gentoo.org/852236
[2] https://bugs.gentoo.org/836248
[3] https://bugs.gentoo.org/850937
[4] https://bugs.gentoo.org/836274
[5] https://bugs.gentoo.org/857126
[6] https://bugs.gentoo.org/857660
[7] https://github.com/gentoo/gentoo/pull/26311
[8] https://docs.amd.com/bundle/ROCm-Getting-Started-Guide-v5.1.3/page/Overview_of_ROCm_Installation.html
[9] https://bugs.gentoo.org/842366
[10] https://bugs.gentoo.org/836275
[11] https://github.com/gentoo/gentoo/pull/25836
[12] https://github.com/gentoo/gentoo/pull/25837

Posted in Uncategorized

Week 3 Report for Refining ROCm Packages in Gentoo

This week I was quite busy with other work (school related: I’m at the end of the semester, and the official summer vacation starts next week), so there was not much progress on the plan mentioned in week 2’s report. Another reason is that I focused on investigating and learning things such as ebuild writing.

AMD released ROCm-5.2 earlier this week, so I gave it a try at https://github.com/littlewu2508/gentoo/tree/rocm-5.2 using the llvm-14 backend (same as 5.1.3). I observed two bugs:
1. dev-libs/rocm-comgr-5.2.0 calls clang to compile for gfx1036 and uses `-mcode-object-version=5`, which is unsupported by clang-14. This can be patched out, and then all tests pass except for the existing issue [1].
2. dev-util/rocm-device-libs-5.2.0 causes lld to throw a linking error when compiling HIP programs: `lld: error: undefined symbol: __oclc_ABI_version`. Due to limited time I did not dig in and find the cause. I suspect it’s caused by incompatibilities between the newest rocm-device-libs and llvm/clang-14. Currently rocm-device-libs-5.1.3 serves hip-5.2 well, so I decided to look at this carefully and consult upstream later.

Nevertheless, I installed hip-5.2 and it worked, but that did not make blender-3.2 HIP cycles work on the Radeon VII. After reading the blender developer forum and investigating the versioning of HIP, I found the answer in [2]: HIP cycles on Vega devices needs a future release of HIP.

There is also a Gentoo user interested in installing the newest version of ROCm. I’ve been answering their questions about resolving errors and warnings when bumping to ROCm 5.1.3 and 5.2.0 [4][5], which provided valuable information. But I don’t plan to bump to ROCm-5.2.0 quickly because of the two incompatibilities mentioned above, so I suggest waiting for the next version of clang (probably clang-15). Also, I have not seen any urgent need for ROCm-5.2.x.

Another interesting thing is ROCm on APUs. I have a Ryzen 4800u laptop, and 2 years ago, in the age of ROCm-3.5, the iGPU was reported as gfx902, but HIP programs compiled for gfx902 caused weird behaviours such as a dead screen. Now rocminfo shows that it is gfx90c, and HIP programs run smoothly. So I wondered whether the full ROCm stack can be installed and run on this APU. Sadly, rocBLAS fails to compile: clang throws an internal error when compiling the gigantic Tensile kernel for gfx90c.

The important job this week was learning eclass syntax. I went through the eclass writing guide and read some eclass examples (mainly llvm.org.eclass, because it has a similar USE_EXPAND case). I started writing rocm.eclass, and am currently working on handling the USE_EXPAND of AMDGPU_TARGETS and determining the compilation architectures based on AMDGPU_TARGETS. I’m developing it at [3], and I hope the core functionality can be finished by the end of week 4. Then I’ll open a PR to get comments, and mail it to the gentoo-dev mailing list for more suggestions.

[1] https://github.com/RadeonOpenCompute/ROCm-CompilerSupport/issues/45
[2] https://bugs.gentoo.org/693200#c32
[3] https://github.com/littlewu2508/gentoo/blob/rocm-5.1.3/eclass/rocm.eclass
[4] https://bugs.gentoo.org/851702#c9
[5] https://bugs.gentoo.org/693200#c35

Posted in ROCm Packages

Week 2 Report for Refining ROCm Packages in Gentoo

The second week of refining ROCm ebuilds was quite busy. I deployed docker to perform clean builds, which found two hidden bugs in hip, and there is also progress on completing rocm-5.1.3 against vanilla llvm/clang.

After learning a lesson from bug #853184, I realized that a clean environment to build and test in is essential to find hidden bugs, especially missing dependencies. I found the cause of #853184 and fixed it in [1]. With the help of clean builds, I found bug #853718 and fixed it in [2]. I also reproduced bug #843263 and provided a fix in [3].

I also fixed an old bug #853184 with [12].

Another bug fix is [4], for a serious issue of incorrect manifests, bugs #851792 and #851795. Andrew Ammerlaan also pointed out the QA issue of directly calling python3 to execute scripts instead of using EPYTHON. I will address that in week 3.

Then it’s about progress on rocm-5.1.3 against vanilla llvm/clang. The major achievements are:

1. Michał Górny told me the policy for packaging llvm/clang, so the brutal patch in [6] is not suitable. I studied the patch and found it unnecessary, as long as we add `--rocm-path=/usr` and `--hip-device-lib-path=/usr/lib/amdgcn/bitcode` when calling clang to compile HIP sources. So I patched hipcc.pl in dev-util/hip and comgr-compiler.cpp in dev-libs/rocm-comgr to explicitly add `--rocm-path=/usr`. Note that the need for the rocm-comgr patch is not obvious, because a test called “compile_hip_test_in_process” won’t appear and fail unless dev-util/hip is merged (hip depends on rocm-comgr, but rocm-comgr does not depend on hip), so I guess that’s why Debian and Fedora have not encountered this issue. I suppose they are also packaging hip and will meet similar problems, so it would be really helpful if the ROCm teams of the major distributions could discuss and share information on packaging hip.

2. I packaged dev-util/hip-5.1.3; it’s currently in [7]. It works, although I’m not satisfied with the tens of sed commands and ten patches needed. Upstream hip is currently not distribution-friendly. I fixed the cmake issue mentioned in week 1’s report, also mentioned in [8]. I also encountered a bug when trying to turn on USE=profile, and the solution is backporting two patches (see details in [9]), meaning that this release of hip cannot build itself because some important fixes are not included. Add the hard-coded clang-runtime include paths and the abuse of `-isystem`, and I really find hip the most challenging of the ROCm packages.

3. Blender still works after removing the clang patch mentioned in 1.; details can be found in [10]. I also tried backporting a patch to enable HIP cycles (a render engine for blender) on the Radeon VII, but it failed with a GPU memory access error, which indicates that hip needs further tuning [11].

4. The version 5.1.3 ebuilds are in good shape [7], including the low-level runtimes {roct-thunk-interface, rocr-runtime, rocminfo} and the toolchain {rocm-device-libs, rocm-comgr, rocm-cmake, hip}, and are waiting for a PR. The commits are squashed, but you can see my original history of battling against hip in the unrebased tree [16]. rocBLAS is also bumped to 5.1.3 and running tests, but I decided to rewrite it to make use of rocm.eclass later.

5. rocm-comgr upstream noticed my bug report [17].
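
The effect of the hipcc.pl and rocm-comgr patches from item 1 can be pictured as adding two path flags to every clang invocation unless the caller already set them. The Python sketch below shows that logic only. The real patches are Perl and C++ changes, and the function name and the keep-existing-flags behaviour are assumptions for this example.

```python
# Default locations named in the report for a system-wide installation.
DEFAULTS = {
    "--rocm-path=": "--rocm-path=/usr",
    "--hip-device-lib-path=": "--hip-device-lib-path=/usr/lib/amdgcn/bitcode",
}


def with_rocm_paths(clang_args):
    """Return clang arguments with the ROCm path flags added when absent."""
    args = list(clang_args)
    # Only add a default when no flag with the same prefix is present,
    # so an explicit user choice is never overridden.
    missing = [
        flag
        for prefix, flag in DEFAULTS.items()
        if not any(arg.startswith(prefix) for arg in args)
    ]
    return missing + args


command = ["clang"] + with_rocm_paths(["-x", "hip", "saxpy.cpp"])
```

Here `saxpy.cpp` is just a placeholder file name; the point is that the two flags end up in front of whatever arguments the caller passed.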

So now hip-5.1.3 seems to be ready, and my test system does not show bugs. I’ll PR my rocm-5.1.3 branch [7] right after [3] gets merged.

Next week I shall land hip-5.1.3 in ::gentoo and prepare a draft of rocm.eclass. There will also be bug fixes, concentrating on rocBLAS not respecting MAKEOPTS (#852236), the rocprofiler QA issue [5], and the rocFFT build issue with hip-5.1.3 [13]. For the long term, I’ll also investigate the embedded header in libhipamd64.so and libhiprtc-builtins.so, which blocks CuPy, and how well vanilla libomp supports ROCm OpenMP offloading compared to aomp (llvm-roc), which is related to rocSPARSE [14].

Summary: I fixed existing bugs in ::gentoo, so the blockers are gone [15]. I finished dev-util/hip-5.1.3 and its 5.1.3 dependencies. Too many hacks had to be applied to hip; it would be helpful to share information with other distribution developers and to report those issues or open PRs upstream.

[1] https://github.com/gentoo/gentoo/pull/26018
[2] https://gitweb.gentoo.org/repo/gentoo.git/commit/93ff73188c29fe12088f6166df669847cde9b2b4
[3] https://github.com/gentoo/gentoo/pull/26090
[4] https://github.com/gentoo/gentoo/pull/25891
[5] https://github.com/gentoo/gentoo/pull/25891#issuecomment-1163481516
[6] https://github.com/gentoo/gentoo/pull/25999
[7] https://github.com/littlewu2508/gentoo/tree/rocm-5.1.3-submit
[8] https://bugs.gentoo.org/693200#c23
[9] https://github.com/ROCm-Developer-Tools/hipamd/issues/18#issuecomment-1167198811
[10] https://bugs.gentoo.org/693200#c24
[11] https://developer.blender.org/D15242
[12] https://github.com/gentoo/gentoo/pull/26039
[13] https://bugs.gentoo.org/693200#c25
[14] https://github.com/gentoo/gentoo/pull/25318
[15] https://github.com/justxi/rocm/issues/8#issuecomment-1166165426
[16] https://github.com/littlewu2508/gentoo/tree/rocm-5.1.3
[17] https://github.com/RadeonOpenCompute/ROCm-CompilerSupport/issues/45

Posted in ROCm Packages

Week 1 Report for Refining ROCm Packages in Gentoo

This is my first week of GSoC at Gentoo, and I found it unexpectedly exciting. The first week centered on making dev-util/hip rely on vanilla clang. In https://github.com/littlewu2508/gentoo/tree/blender-rocm, I bumped rocm-device-libs, rocm-comgr and hip to 5.1.3 with vanilla llvm/clang as the backend; after that I bumped blender to 3.2.0 and enabled its HIP cycles, and it worked on a Radeon 6700XT (see [1])! That means I made a good start on replacing llvm-roc with the system llvm, which was originally the last item in my GSoC proposal. So I changed the plan a bit, to move the last week’s plan forward.

The story began when I heard that blender 3.2.0 was finally released with HIP cycles support on Linux, so I decided to try it out. I also searched Bugzilla and noticed a proposal to use llvm.eclass and a rocm USE flag [1].

After a quick bump of media-gfx/blender and its required dependencies, I enabled HIP cycles in the ebuild and started emerging. The build was surprisingly smooth, since the build commands simply call hipcc without too many arguments, and hipcc is already in good shape. However, blender aborted when I tried to use HIP cycles at runtime; the error suggested that more than one llvm library was linked in. I realized that some dependencies like mesa link vanilla llvm, while blender itself has to link llvm-roc since it has components compiled with hipcc. I reported my trial in [1] and Sebastian Parborg confirmed the reason for my failure, so I opened another bug about llvm-roc at [2]. There I described the situation and gave two possible solutions: use vanilla clang as hip’s backend, or make llvm-roc another slot of llvm/clang. That is actually the last week’s plan in my GSoC proposal, but at that time I didn’t realize the importance of making llvm-roc compatible with the system llvm, since I had never encountered a package that uses both llvm and HIP. In the bug report I said that the second solution should be easier, so I preferred it, but in my heart I think the first one is more elegant, so I would try it first and fall back to the second solution if I failed. As a result, I started my journey of removing llvm-roc from the ROCm dependency tree.

The first thing was to modify rocm-device-libs. With the help of Michał Górny (who pointed out in [2] that packages should not assume llvm has “BUILD_SHARED_LIBS=ON” and link llvm components, knowledge++), I patched the source to make it rely only on llvm:14 (Fedora developers have also discussed this and would like to upstream their patches). Then came rocm-comgr, where I encountered serious problems. With help from Yuyi Wang, I figured out a patch [3] (however, I do not understand why Debian and Fedora don’t need it), and I plan to upstream it to the ROCm team in the future. After that only four test failures remained, but they took me a long time to debug, and I found that neither the Debian and Fedora teams nor I had come to a solution yet, so I decided to open a GitHub issue upstream at https://github.com/RadeonOpenCompute/ROCm-CompilerSupport/issues/45. While writing the ebuild I used llvm.eclass to determine the llvm prefix and `clang -print-resource-dir` to locate CLANG_RESOURCE_DIR, which is in `/usr/lib/clang/<version>` rather than the default relative path in llvm-project (knowledge+=2).
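
Asking clang for its resource directory, instead of assuming a relative path, can be sketched in a few lines of Python. This is an illustration and not part of the ebuild: the function name and the injectable `run` parameter (added so the lookup can be exercised without clang installed) are assumptions.

```python
import subprocess


def clang_resource_dir(clang="clang", run=subprocess.run):
    """Return the resource directory reported by the given clang binary."""
    # -print-resource-dir makes clang print the directory and exit.
    result = run(
        [clang, "-print-resource-dir"],
        capture_output=True,
        text=True,
        check=True,  # raise if clang exits with an error
    )
    return result.stdout.strip()
```

Calling `clang_resource_dir()` on a system with clang in PATH returns the path that would be used as CLANG_RESOURCE_DIR; a versioned binary name can be passed through the `clang` argument.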

Then it was all about HIP. I encountered many issues with finding the correct include locations, and fixed them one by one. In the end I arrived at a new hipvars.pm and a patch to hipcc.pl, disabling the poisoning `-isystem` and correcting many paths. Now directly calling hipcc works, and blender rendered successfully using HIP cycles! I was amazed at this result.

Then I continued testing by compiling rocBLAS-5.1.3 using this new ROCm toolchain. Sadly, there are paths that need to be corrected in the cmake files. I’ve done some fixes, but more are needed before rocBLAS can be configured. Bumping the high-level libs using this new toolchain will be the major task of the coming week. Another job is to finalize and push the low-level runtimes and toolchain into ::gentoo via PRs, starting from https://github.com/gentoo/gentoo/pull/25785. I’ll also fix existing bugs when I bump the versions of those in sci-libs. For https://bugs.gentoo.org/852236 I already have a solution. The bugs about not respecting CFLAGS/LDFLAGS I shall investigate; I think the problem is shared with https://bugs.gentoo.org/851792. I’ll check them one by one.

So, the plan is changed as follows:

I am currently halfway through week 11’s task. So the plan of week 11 is merged into week 1, meaning that the tasks of weeks 1-10 are postponed by one week.

Also, since I’m using ROCm-5.1.3 as the testing ground for the new toolchain, I would like to make use of rocm.eclass, if possible. That means the original weeks 5-8 would be moved after week 2 (between CuPy and TensorFlow).

In conclusion, in the first week I was persuaded by [1] that [2] is an important blocker, so the task of week 11 is no longer optional but essential, and gets promoted. The good news is that I’m making nice progress on this issue, and I believe I’m the first Gentoo user to package and use blender-3.2 with HIP cycles. The bad news is that I’m not finished with hacking the cmake modules for HIP.

[1] https://bugs.gentoo.org/693200
[2] https://bugs.gentoo.org/851702
[3] https://github.com/RadeonOpenCompute/ROCm-CompilerSupport/issues/45#issuecomment-1155975910

Also, there is a summary of bug reports in week 1:
1. 822828, 693200, 851702, 842405, 842405

The summary of closed pull requests during week 1:
1. https://github.com/gentoo/gentoo/pull/25861

The summary of currently opened pull requests:

1. rocprofiler QA fixes: https://github.com/gentoo/gentoo/pull/25891 Status: open for review
2. dev-libs/ocl-icd prefix adoption: https://github.com/gentoo/gentoo/pull/25785 Status: fixing
3. sys-devel/clang ROCm patch: https://github.com/gentoo/gentoo/pull/25999 Status: open for review
4. dev-util/premake prefix adoption (this is related to https://github.com/GPUOpen-LibrariesAndSDKs/HIPRTSDK) https://github.com/gentoo/gentoo/pull/25825 Status: open for review

Posted in ROCm Packages

Daily blog July 11 by Catcream

As I had been using the #gentoo-soc-musl channel as a mini-blog, Blueknight suggested that I do small daily blog posts instead.

Today I’ve worked on a lot of different and seemingly unrelated stuff. To start with, I fixed plocate to use fprintf + exit instead of error. Upstream wanted a proper patch. This was not as trivial as just replacing the calls, since the code used the error_message_count variable. Still pretty straightforward though.

Posted in musl KDE

Week 4 Report for RISC-V Support for Gentoo Prefix

Hello everyone,
The fourth week of Google Summer of Code has come to an end, and here is my weekly report.
This week we got a working Prefix, with Stage 3 compiling to the end. The profile for lp64d got merged as well.
Earlier I used to pull my Gentoo fork for testing, but as the pull request for the lp64d no-multilib profile[1] got merged, I tested with the official Gentoo repository.
The current approach is to fix the bugs locally and see how far Stage 3 goes. After the fixes we made in previous weeks, there are 2 major bugs that need to be fixed to complete Stage 3:
1. We need to add a check for scanelf in the bootstrap script, as it is needed on the host for ncurses to install during Stage 2.
2. The libfl.so.6 library is missing, due to which riscv64-pc-linux-gnu-ar and riscv64-pc-linux-gnu-ranlib fail to execute.
I tried bootstrapping with these issues fixed locally, and it continued to the end.
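
The kind of host check described for scanelf can be sketched as follows. The real bootstrap script is a shell script, so this Python version only illustrates the logic; the function name and the injectable `which` parameter are assumptions.

```python
import shutil


def missing_host_tools(tools=("scanelf",), which=shutil.which):
    """Return the required host tools that cannot be found on PATH."""
    # shutil.which returns None when no executable with that name exists.
    return [tool for tool in tools if which(tool) is None]


missing = missing_host_tools()
if missing:
    print("missing host tools:", ", ".join(missing))
```

Running the check before Stage 2 starts would report the problem up front instead of letting the ncurses installation fail later.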
It was decided that we add support for lp64d for now, and accordingly I have added the patch to Gentoo Prefix[2]. Other profiles are under the 17.0 directory while RISC-V is under 20.0; this has been handled accordingly.
I also continued the documentation on “Porting Prefix to New Architecture” and on setting up a RISC-V environment for testing Prefix.
In the upcoming weeks I plan to continue testing with the latest packages and work on the libfl.so.6 and scanelf issues. After that I will work on keywording packages for RISC-V.


Regards,
wiredhikari

Posted in RISC-V Prefix