Congrats on the job! It's exciting to see developments in CUDA competitors.
One of the issues I've had with ROCm is the not-so-great support for consumer GPUs, specifically the RX 7XXX series. Do you think there is any chance it will improve in the future?
Not the GP, but I have an RX 7700S running Ubuntu and I cannot for the life of me get ROCm to play nice with my GPU. I tried all sorts of env vars, but I keep getting segfaults when I try to run PyTorch. Or it just ends up running on my CPU.
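For anyone chasing the same "segfault or silently on CPU" symptom, here is a minimal sketch of a check that reports whether the installed PyTorch is a ROCm build at all and whether it can see a GPU. It relies on `torch.version.hip` and `torch.cuda.is_available()` (ROCm builds of PyTorch reuse the `torch.cuda` namespace). The function name `describe_torch_backend` is made up for illustration, and this is only a first diagnostic step, not a fix.

```python
def describe_torch_backend():
    """Report which backend the installed PyTorch build targets.

    Hypothetical helper for illustration; returns a plain dict so the
    result is easy to paste into a bug report.
    """
    report = {
        "installed": False,
        "hip_version": None,
        "gpu_visible": False,
        "device_name": None,
    }
    try:
        import torch
    except ImportError:
        # PyTorch is not installed in this environment.
        return report

    report["installed"] = True
    # torch.version.hip is a version string on ROCm builds and None on
    # CPU-only or CUDA builds.
    report["hip_version"] = getattr(torch.version, "hip", None)
    # On ROCm builds the torch.cuda API is backed by HIP.
    report["gpu_visible"] = bool(torch.cuda.is_available())
    if report["gpu_visible"]:
        report["device_name"] = torch.cuda.get_device_name(0)
    return report


if __name__ == "__main__":
    for key, value in describe_torch_backend().items():
        print(f"{key}: {value}")
```

If `hip_version` comes back as `None`, the installed wheel is a CPU or CUDA build, and no environment variable will make it use the AMD GPU. If it is set but `gpu_visible` is false, the problem is more likely in the ROCm user-space stack or device permissions.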
I work for AMD. To be clear, my new job is about integrating ROCm into the distribution, not just about shipping ROCm packages that can run on Debian.
I'll be doing things like creating new packages in main, helping to get support for the HIP language embedded into existing dpkg tooling, helping to get GPU architecture awareness integrated into the Debian CI infrastructure, helping to enable ROCm support in other libraries and applications packaged for Debian, and ensuring that everything in Debian is successfully imported into the Ubuntu universe repositories.
Integrating HIP support into Debian so that it feels as natural as C or C++ and 'just works' across dozens of GPUs is a job for more than one person. That is why I'm glad there have been so many volunteers in the community stepping forward to help with various pieces.
Could you please tell AMD that it is a major competitive advantage for Nvidia that they keep releasing driver updates for cards for many, many years after launch? Even very old cards still get current drivers.
AMD, it seems, just drops your card from the current releases within a few years. That makes me favor Nvidia.
The only driver I'm aware of is the AMDGPU driver in the Linux kernel. It is updated with every release of Linux and is used for all modern AMD GPUs. I find that the drivers generally work well. My complaints are more about the user space libraries.
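To tell kernel-driver problems apart from user-space library problems, a rough sketch like the following can confirm that the AMDGPU driver is loaded and has exposed its device nodes. The paths are the usual Linux locations (`/sys/module/amdgpu`, `/dev/kfd` for the compute interface, `/dev/dri/renderD*` for render nodes). The function name `amdgpu_status` and its parameters are hypothetical, and the roots are configurable only so the check can be exercised without real hardware.

```python
import glob
import os


def amdgpu_status(sys_root="/sys", dev_root="/dev"):
    """Summarize whether the amdgpu kernel driver appears to be active.

    Hypothetical helper for illustration. It only inspects the filesystem
    and does not talk to the GPU.
    """
    module_dir = os.path.join(sys_root, "module", "amdgpu")
    kfd_node = os.path.join(dev_root, "kfd")
    render_glob = os.path.join(dev_root, "dri", "renderD*")
    return {
        # Present when the amdgpu module is loaded.
        "amdgpu_module_loaded": os.path.isdir(module_dir),
        # The node ROCm user space opens for compute.
        "kfd_node_present": os.path.exists(kfd_node),
        # Render nodes; these may belong to any GPU driver, not only amdgpu.
        "render_nodes": sorted(glob.glob(render_glob)),
    }


if __name__ == "__main__":
    for key, value in amdgpu_status().items():
        print(f"{key}: {value}")
```

If all three look healthy and ROCm applications still fail, that points at the user-space libraries rather than the driver.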
The good news is that I have at least one AMD GPU of each architecture from Vega to RDNA 3 / CDNA 2 on the Debian ROCm CI. Debian Trixie has packages built and tested for every modern discrete AMD GPU from Vega to RDNA 3 / CDNA 2. (I'd have liked to include RDNA 4 / CDNA 3, but the effort was quite resource-constrained and the packages are a bit old. I'm hoping to improve on that going forward, but Trixie is already in feature freeze, so it will have to wait for the next release.)
I personally own much of the equipment for the Debian ROCm CI and I can promise I will continue testing new releases on old hardware for a very long time.
The driver for AMD's XDNA NPU landed in Linux 6.14 [1]. However, the Xilinx AI runtime still needs to be packaged. That may take some time. The NPU runtime stack is based on the Xilinx AI toolchain, which is not yet as mature as the ROCm stack. There are a few related packages in Debian, but AMD and Debian both have a lot of work to do to get support for the NPU integrated into the distribution. I probably won't directly be doing the packaging of the runtime, but I've been helping to nudge the process along.
It's perhaps worth mentioning that Framework has directly supported Debian in providing access to hardware with AMD NPUs and iGPUs. I'm typing this message on one of two Framework 13 laptops that they donated to support Debian in this effort. I will be using it both for testing gfx1103 support on Debian and for testing the NPU packages when they become available. Framework also generously offered to provide one of those desktop systems you linked for the Debian ROCm CI [2]. It would also be used as a CI worker for the NPU runtime libraries once those are packaged.
It's not just about drivers in isolation, but about what features those drivers and cards support. Support for older APIs for doing compute on AMD cards gets dropped in newer drivers, and newer APIs aren't supported on older cards. With Nvidia, CUDA has been supported continuously for probably 15 years now, while in the AMD world you've been expected to throw out all your old code and port it to a new API every 3 years.
I've been volunteering with Debian to help package ROCm for four years now, but today it officially became my full-time job. AMA.