Hi everyone,
First of all, let me say that I’ve been running a custom image built with BlueBuild on top of uBlue’s images for a while, and it has always been a great experience. I think that says a lot about the quality of these projects.
I have a question about the base silverblue-nvidia image, though. I’ve got a custom image based on it, and recently (I think after the hwe repo was integrated into main) the nvidia card in my laptop stopped being detected. After searching the error online, I found that it was because I had the nvidia-open driver instead of the regular nvidia one (which is what my card needs). I validated that by checking /etc/nvidia/kernel.conf, which indeed says kernel-open, and the Containerfile points at the open module as well.
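For reference, this is roughly how I checked (the kernel.conf contents are what I saw on my system, the rest is just generic package inspection):

```bash
# What the image is configured for -- on my system this says kernel-open
cat /etc/nvidia/kernel.conf

# Which nvidia packages actually ended up in the image
rpm -qa | grep -i nvidia
```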
My question is: it looks like bluefin-nvidia uses the regular (non-open) driver and there’s a separate image for the -open version. Should there be two different base images, a silverblue-nvidia and a silverblue-nvidia-open?
Or is it just better to customise my Containerfile and copy over the steps to use the regular nvidia driver instead?
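If the Containerfile route is the way to go, I imagine it would look roughly like the sketch below: base on the plain main image and layer the proprietary akmod RPMs from the akmods builds. The image names, tags, and RPM paths here are guesses on my part, and the real -nvidia images do more than just the kmod, so the actual steps in ublue-os/main would be the thing to copy:

```dockerfile
# Rough sketch only -- image names, tags, and paths are my assumptions,
# not the exact ones used in ublue-os/main.

# Stage with the prebuilt proprietary (non-open) nvidia akmod RPMs
FROM ghcr.io/ublue-os/akmods-nvidia:main-41 AS nvidia-akmods

# My custom image, based on the plain main image instead of silverblue-nvidia
FROM ghcr.io/ublue-os/silverblue-main:41

# Pull in the kmod RPMs from the akmods stage and layer them with rpm-ostree
COPY --from=nvidia-akmods /rpms /tmp/akmods-rpms
RUN rpm-ostree install /tmp/akmods-rpms/kmods/*nvidia*.rpm && \
    rm -rf /tmp/akmods-rpms && \
    ostree container commit
```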
Nvidia is dropping support for the proprietary driver. The newest cards (like the 50XX series) only work with the open one, but the open modules only support Turing/16XX and newer devices.
So once the closed driver is dropped, the older (pre-Turing) cards will have to move to the in-kernel driver (nouveau/NVK).
Nvidia is recommending the open driver for all new GPUs, including datacenter ones, so it doesn’t sound like they are interested in keeping the proprietary driver around.
From their website:
Supported GPUs
Not every GPU is compatible with the open-source GPU kernel modules.
For cutting-edge platforms such as NVIDIA Grace Hopper or NVIDIA Blackwell, you must use the open-source GPU kernel modules. The proprietary drivers are unsupported on these platforms.
For newer GPUs from the Turing, Ampere, Ada Lovelace, or Hopper architectures, NVIDIA recommends switching to the open-source GPU kernel modules.
For older GPUs from the Maxwell, Pascal, or Volta architectures, the open-source GPU kernel modules are not compatible with your platform. Continue to use the NVIDIA proprietary driver.
For mixed deployments with older and newer GPUs in the same system, continue to use the proprietary driver.
If you are not sure, NVIDIA provides a new detection helper script to help guide you on which driver to pick. For more information, see the Using the installation helper script section later in this post.
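As a rough paraphrase of that list (this is my own sketch, not NVIDIA’s actual helper script), the decision basically comes down to the GPU architecture:

```bash
# My own paraphrase of the list above -- not NVIDIA's detection helper script.
# Given an architecture name, suggest which kernel module flavour to use.
arch="$1"   # e.g. pascal, turing, ada, blackwell
case "$arch" in
  grace-hopper|blackwell)   echo "open kernel modules (required)" ;;
  turing|ampere|ada|hopper) echo "open kernel modules (recommended)" ;;
  maxwell|pascal|volta)     echo "proprietary driver (open modules not compatible)" ;;
  *)                        echo "not sure -- use NVIDIA's detection helper script" ;;
esac
```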
Thanks all for your responses (can’t mention all of you yet).
I was already aware that for newer cards the nvidia-open driver is the way to go, but I was wondering if it would make sense to follow what’s already in place, for example in bluefin, and publish both images.
Sure, the proprietary driver will go away at some point (like the legacy one did), but until then, I think it could be helpful and more consistent to have all -nvidia images use the proprietary driver and the -nvidia-open ones use the open version.
At least the akmods repo still builds both of them, so it should be reasonably simple to include the right one in my own recipe. Looks like I’ve got some homework to do.
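Once I’ve rebuilt with the non-open akmod, I’ll probably sanity-check the result with something like this (the license strings are what the two variants report, as far as I understand):

```bash
# The open kernel modules report "Dual MIT/GPL",
# the proprietary driver reports "NVIDIA"
modinfo nvidia | grep -i license
```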
But surely we don’t want to use the open drivers just yet? From what I read, aren’t they still drastically lagging behind the closed drivers in performance?
No, they’re basically the same. If there were drastic differences, it wouldn’t make sense for NVIDIA to be recommending them for datacenter use.
Benchmarks: