Week 9 Report for Refining ROCm Packages in Gentoo

This week I mainly focused on dev-libs/rocm-opencl-runtime.

I bumped dev-libs/rocm-opencl-runtime to 5.1.3. That’s relatively easy. The difficult part is enabling its tests. I came across a major problem, which is oclgl test requiring X server. I compiled using debug options and use gdb to dive into the code, but found there is no simple solution. Currently the test needs a X server where OpenGL vender is AMD. Xvfb only provides llvmpipe, not meeting the requirements. I consulted some friends, they said NVIDIA recommends using EGL when there is no X [1], but apparently ROCm can only get OpenGL from X [2]. So my workaround is to let user passing an X display into the ebuild, by reading the environment variable OCLGL_DISPLAY (DISPLAY variable will be wiped when calling emerge, while this can survive). If no display is detected, or glxinfo shows the OpenGL vendor is not AMD, then src_test dies, throwing indications about running an X server using amdgpu driver.

I was also trapped by CRLF problem in src_test of dev-libs/rocm-opencl-runtime. Tests in oclperf.exclude should be skipped for oclperf test, but it did not. After numerous trials, I finally found that this file is using CRLF, not LF, which causes the exclusion failed 🙁

Nevertheless, rocm-opencl-runtime tests passed on Radeon RX 6700XT! A good thing, because I know many user in Gentoo rely on this package to provide opencl in their computation, and the correctness is vital. Before we does not have src_test enabled. The PR is now in [6].

Other works including starting wiki writing [3,4], refine rocm.eclass according to feedback (not much, see gentoo-dev mailing list), and found a bug of dev-util/hipFindHIP.cmake module is not in the correct place. Fix can be found in [5] but I need to further polish the patch before PR.

If no further suggestions on rocm.eclass, I’ll land rocm.eclass in ::gentoo next week, and start bumping the sci-libs version already done locally.

[1] https://developer.nvidia.com/blog/egl-eye-opengl-visualization-without-x-server/
[2] https://github.com/RadeonOpenCompute/ROCm-OpenCL-Runtime/blob/bbdc87e08b322d349f82bdd7575c8ce94d31d276/tests/ocltst/module/common/OCLGLCommonLinux.cpp
[3] https://wiki.gentoo.org/wiki/ROCm
[4] https://wiki.gentoo.org/wiki/HIP
[5] https://github.com/littlewu2508/gentoo/tree/hip-correct-cmake
[6] https://github.com/gentoo/gentoo/pull/26870

This entry was posted in ROCm Packages. Bookmark the permalink.

Leave a Reply

Your email address will not be published.