{"id":382,"date":"2022-09-11T10:10:09","date_gmt":"2022-09-11T10:10:09","guid":{"rendered":"https:\/\/blogs.gentoo.org\/gsoc\/?p=382"},"modified":"2022-09-11T10:10:14","modified_gmt":"2022-09-11T10:10:14","slug":"week-10-report-for-refining-rocm-packages-in-gentoo","status":"publish","type":"post","link":"https:\/\/blogs.gentoo.org\/gsoc\/2022\/09\/11\/week-10-report-for-refining-rocm-packages-in-gentoo\/","title":{"rendered":"Week 10 Report for Refining ROCm Packages in Gentoo"},"content":{"rendered":"<p>This week I learned a lot from Ulrich&#8217;s comments on rocm.eclass. I polished the eclass to v3 and sent it to the gentoo-dev mailing list. However, I observed another error introduced in v3, and I&#8217;ll include a fix for it in v4 in the coming days.<\/p>\n<p>The other half of my time was spent testing sci-libs\/roc-* packages on various platforms using rocm.eclass. rocm.eclass did its job as expected, so I believe it can be merged after v4.<\/p>\n<p>With src_test enabled, I found various test failures. rocBLAS-5.1.3 fails 3 tests on Radeon RX 6700XT by slightly exceeding the tolerance, which does not seem to be a big issue; rocFFT-5.1.3 fails 16 suites on Radeon VII [1], which is serious and confirmed by upstream, so I suggest masking the <code>amdgpu_targets_gfx906<\/code> USE flag for rocFFT-5.1.3; and just today I observed that MIOpen is failing many tests, probably due to vanilla clang. I&#8217;ll open issues and report those test failures upstream. Running the test suites takes a lot of time and often drains the GPU. It may take more than 15 hours to test rocBLAS, even with a performant CPU like the Ryzen 5950X. If I use the GPU to render graphics (run a desktop environment) and run tests simultaneously, it often results in an amdgpu driver failure. 
I hope one day we can have a testing farm for ROCm packages, but that would be expensive because there are a lot of GPU architectures, and compilation takes a lot of time.<\/p>\n<p>I planned to finish the drafts of the wiki pages [2,3], but it turns out I ran out of time. I&#8217;ll catch up in week 11. My mentor was also busy in week 10, so my PR about rocm-opencl-runtime is still pending review. Now we are working on resolving a dependency issue of ROCm packages: incompatibilities with gcc-12 and gcc-11.3.0. Due to two bugs, the current stable gcc, gcc-11.3.0, cannot compile some ROCm packages [4], and the current unstable gcc, gcc-12, is unable to compile nearly all ROCm packages [5].<\/p>\n<p>I&#8217;ll continue with what was postponed in week 10: landing rocm.eclass and sci-libs packages, preparing cupy, fixing bugs, and writing the wiki pages. I&#8217;ll investigate MIOpen&#8217;s situation as well.<\/p>\n<p>[1] https:\/\/github.com\/ROCmSoftwarePlatform\/rocFFT\/issues\/369<br \/>\n[2] https:\/\/wiki.gentoo.org\/wiki\/ROCm<br \/>\n[3] https:\/\/wiki.gentoo.org\/wiki\/HIP<br \/>\n[4] https:\/\/bugs.gentoo.org\/842405<br \/>\n[5] https:\/\/bugs.gentoo.org\/857660<\/p>\n","protected":false},"excerpt":{"rendered":"<p>This week I learned a lot from Ulrich&#8217;s comments on rocm.eclass. I polished the eclass to v3 and sent it to the gentoo-dev mailing list. 
However, I observed another error introduced in v3, and I&#8217;ll include a fix for it in &hellip; <a href=\"https:\/\/blogs.gentoo.org\/gsoc\/2022\/09\/11\/week-10-report-for-refining-rocm-packages-in-gentoo\/\">Continue reading <span class=\"meta-nav\">&rarr;<\/span><\/a><\/p>\n","protected":false},"author":179,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":[],"categories":[9],"tags":[],"jetpack_featured_media_url":"","_links":{"self":[{"href":"https:\/\/blogs.gentoo.org\/gsoc\/wp-json\/wp\/v2\/posts\/382"}],"collection":[{"href":"https:\/\/blogs.gentoo.org\/gsoc\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/blogs.gentoo.org\/gsoc\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/blogs.gentoo.org\/gsoc\/wp-json\/wp\/v2\/users\/179"}],"replies":[{"embeddable":true,"href":"https:\/\/blogs.gentoo.org\/gsoc\/wp-json\/wp\/v2\/comments?post=382"}],"version-history":[{"count":1,"href":"https:\/\/blogs.gentoo.org\/gsoc\/wp-json\/wp\/v2\/posts\/382\/revisions"}],"predecessor-version":[{"id":383,"href":"https:\/\/blogs.gentoo.org\/gsoc\/wp-json\/wp\/v2\/posts\/382\/revisions\/383"}],"wp:attachment":[{"href":"https:\/\/blogs.gentoo.org\/gsoc\/wp-json\/wp\/v2\/media?parent=382"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/blogs.gentoo.org\/gsoc\/wp-json\/wp\/v2\/categories?post=382"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/blogs.gentoo.org\/gsoc\/wp-json\/wp\/v2\/tags?post=382"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}