Cannot use Nvidia Runtime in Docker Since Update to Fedora 42

I’ve encountered a strange issue when launching docker containers with the nvidia runtime.

Previously (two weeks ago), I was able to launch a container with a command like this:

docker run --rm -it --gpus all -v $(pwd):/config linuxserver/ffmpeg

But ever since I rebooted my PC late last week (which automatically updated me to Fedora 42), the same command above gives me the following error message:

docker: Error response from daemon: failed to create task for container: failed to create shim task: OCI runtime create failed: runc create failed: unable to start container process: error during container init: error running prestart hook #0: exit status 1, stdout: , stderr: Auto-detected mode as 'legacy'
nvidia-container-cli: detection error: open failed: /usr/lib/libnvidia-tls.so.570.144: no such file or directory: unknown

I’ve done some poking around and could not locate the file /usr/lib/libnvidia-tls.so.570.144. However, I do see a similar file /usr/lib/libnvidia-tls.so.570.153.02.

When I run the ldconfig command, I get the following warnings:

ldconfig: Can't link /lib/libnvidia-tls.so.570.144 to libnvidia-tls.so.570.153.02
ldconfig: Can't link /lib/libnvidia-gpucomp.so.570.144 to libnvidia-gpucomp.so.570.153.02
ldconfig: Can't link /lib/libnvidia-glvkspirv.so.570.144 to libnvidia-glvkspirv.so.570.153.02
ldconfig: Can't link /lib/libnvidia-glsi.so.570.144 to libnvidia-glsi.so.570.153.02
ldconfig: Can't link /lib/libnvidia-glcore.so.570.144 to libnvidia-glcore.so.570.153.02
ldconfig: Can't link /lib/libnvidia-eglcore.so.570.144 to libnvidia-eglcore.so.570.153.02

So it looks like there was a driver update that didn't fully clean up those files somehow, but I am not sure how to fix that.
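
In case it helps narrow things down, these two checks should show which version the loaded kernel module reports versus which libnvidia-tls builds the linker cache knows about (both are standard paths for the proprietary driver; this is just a sketch of the checks rather than output I captured):

# driver version reported by the loaded kernel module
cat /proc/driver/nvidia/version
# libnvidia-tls versions known to the dynamic linker cache
ldconfig -p | grep libnvidia-tls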

Any suggestions? Everything else is working fine otherwise :confused:

Try the snippet you can find with less -p '== "Test CUDA' /usr/share/ublue-os/just/40-nvidia.just:

podman run \
  --user 1000:1000 \
  --security-opt=no-new-privileges \
  --cap-drop=ALL \
  --security-opt label=type:nvidia_container_t  \
  --device=nvidia.com/gpu=all \
  docker.io/nvidia/samples:vectoradd-cuda11.2.1

It produces this output for me:

[Vector addition of 50000 elements]
Copy input data from the host memory to the CUDA device
CUDA kernel launch with 196 blocks of 256 threads
Copy output data from the CUDA device to the host memory
Test PASSED
Done

Recently I ran into the same thing. Something about the extra security settings seems to be what makes it work.

Note that this same command line does not work for me if I simply swap podman for docker.

But I was already using podman in my original script, so I stopped there.

I hope that gives you one extra troubleshooting data point.

If you still see the same kind of output then it is worth pursuing your current line of questioning.

Otherwise, it looks like you have some work to do to mount all of the needed libs into linuxserver/ffmpeg (even with podman).

Taking a step back: the ffmpeg command is already in /usr/sbin, and it is also available via brew.

What are you trying to accomplish?

Thank you for responding!

I come from Debian, and using this container was the most reliable way I could get my hands on the latest and greatest ffmpeg (with all the latest codecs) without bricking my install :melting_face:. I do sometimes use my GPU for hardware-accelerated encoding/decoding, which is why I'm passing the --gpus flag.

When I installed Bluefin, I did try the built-in ffmpeg, but it doesn't have the NVENC encoders compiled in, so I didn't dive further into it since I already had a pretty good solution with that docker image.
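
For reference, a quick way to check whether any given ffmpeg build has NVENC compiled in is something like the following (the grep is just a filter; an empty result means no NVENC encoders):

# list the available encoders and keep only the NVENC ones
ffmpeg -hide_banner -encoders | grep nvenc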

I did try the podman command you suggested and it does appear to work just fine:

[Vector addition of 50000 elements]
Copy input data from the host memory to the CUDA device
CUDA kernel launch with 196 blocks of 256 threads
Copy output data from the CUDA device to the host memory
Test PASSED

Since this command worked, I modified it to run the same docker image:

podman run \
  --user 1000:1000 \
  --security-opt=no-new-privileges \
  --cap-drop=ALL \
  --security-opt label=type:nvidia_container_t  \
  --device=nvidia.com/gpu=all \
  -v $(pwd):/config \
  linuxserver/ffmpeg

Note: Removing the --security-opt and --cap-drop flags works just fine too.

Alternative command
podman run \
  --user 1000:1000 \
  --device=nvidia.com/gpu=all \
  -v $(pwd):/config \
  linuxserver/ffmpeg

The container was able to launch successfully:

ffmpeg version 7.1.1 Copyright (c) 2000-2025 the FFmpeg developers
  built with gcc 13 (Ubuntu 13.3.0-6ubuntu2~24.04)
  configuration: --disable-debug --disable-doc --disable-ffplay --enable-alsa --enable-cuda-llvm --enable-cuvid --enable-ffprobe --enable-gpl --enable-libaom --enable-libass --enable-libdav1d --enable-libfdk_aac --enable-libfontconfig --enable-libfreetype --enable-libfribidi --enable-libharfbuzz --enable-libkvazaar --enable-liblc3 --enable-libmp3lame --enable-libopencore-amrnb --enable-libopencore-amrwb --enable-libopenjpeg --enable-libopus --enable-libplacebo --enable-librav1e --enable-librist --enable-libshaderc --enable-libsrt --enable-libsvtav1 --enable-libtheora --enable-libv4l2 --enable-libvidstab --enable-libvmaf --enable-libvorbis --enable-libvpl --enable-libvpx --enable-libvvenc --enable-libwebp --enable-libx264 --enable-libx265 --enable-libxml2 --enable-libxvid --enable-libzimg --enable-libzmq --enable-nonfree --enable-nvdec --enable-nvenc --enable-opencl --enable-openssl --enable-stripping --enable-vaapi --enable-vdpau --enable-version3 --enable-vulkan
  libavutil      59. 39.100 / 59. 39.100
  libavcodec     61. 19.101 / 61. 19.101
  libavformat    61.  7.100 / 61.  7.100
  libavdevice    61.  3.100 / 61.  3.100
  libavfilter    10.  4.100 / 10.  4.100
  libswscale      8.  3.100 /  8.  3.100
  libswresample   5.  3.100 /  5.  3.100
  libpostproc    58.  3.100 / 58.  3.100
Universal media converter
usage: ffmpeg [options] [[infile options] -i infile]... {[outfile options] outfile}...

Use -h to get full help or, even better, run 'man ffmpeg'

So it sounds like a decent workaround is to just update my alias in my .bashrc file, but I am also very curious to understand why docker is trying to load a library from the previous nvidia driver.
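
The alias I have in mind is roughly this (the name and the /config mount are just my own choices):

# sketch of the .bashrc alias: podman with CDI GPU access and the current directory mounted at /config
alias ffmpeg-gpu='podman run --rm -it --device=nvidia.com/gpu=all -v "$(pwd)":/config linuxserver/ffmpeg'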

Any idea as to how I could cleanup these old drivers?


I have not tried the command myself, but it seems like that output is coming from inside the container - and not the host - right?

So if you docker run -it ... linuxserver/ffmpeg -- bash and, from inside the container, run ldconfig, what do you see?
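
Something like this should get you a shell in there (I believe the image's entrypoint is ffmpeg, so it has to be overridden):

# get an interactive shell in the image instead of running ffmpeg
docker run --rm -it --entrypoint bash linuxserver/ffmpeg
# then, inside the container, see what the linker cache knows about
ldconfig -p | grep nvidia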

I think it's an issue with not being able to automatically map the /lib/libnvidia-*.so libraries to their locations on the host. Or something like that.

I have done a bunch of work with distrobox, and when it is used with the nvidia=true option in distrobox.ini, it mounts each library file individually. That looks like this (output truncated):

$ mount | grep -E 'nvidia.*\.so' |head -20
composefs on /usr/lib/gbm/nvidia-drm_gbm.so type overlay (ro,relatime,seclabel,lowerdir+=/run/ostree/.private/cfsroot-lower,datadir+=/sysroot/ostree/repo/objects,redirect_dir=on,metacopy=on)
composefs on /usr/lib/libEGL_nvidia.so.0 type overlay (ro,relatime,seclabel,lowerdir+=/run/ostree/.private/cfsroot-lower,datadir+=/sysroot/ostree/repo/objects,redirect_dir=on,metacopy=on)
composefs on /usr/lib/libEGL_nvidia.so.570.153.02 type overlay (ro,relatime,seclabel,lowerdir+=/run/ostree/.private/cfsroot-lower,datadir+=/sysroot/ostree/repo/objects,redirect_dir=on,metacopy=on)
composefs on /usr/lib/libGLESv1_CM_nvidia.so.1 type overlay (ro,relatime,seclabel,lowerdir+=/run/ostree/.private/cfsroot-lower,datadir+=/sysroot/ostree/repo/objects,redirect_dir=on,metacopy=on)
composefs on /usr/lib/libGLESv1_CM_nvidia.so.570.153.02 type overlay (ro,relatime,seclabel,lowerdir+=/run/ostree/.private/cfsroot-lower,datadir+=/sysroot/ostree/repo/objects,redirect_dir=on,metacopy=on)

There are 100 of those lines!

$ mount | grep -E 'nvidia.*\.so' | wc -l
100
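
For reference, that option lives in the distrobox assemble manifest; a minimal entry looks roughly like this (the box name and image are just examples):

# distrobox.ini (sketch)
[cuda-box]
image=registry.fedoraproject.org/fedora-toolbox:42
nvidia=true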

But it seems you want to use at least some of what is in the container. The goal is current ffmpeg codecs - I get it.

Bluefin-dx does come with a recent CUDA toolkit preinstalled. (They have been refactoring the image build process and I have lost track of where that is done now, or I would link to it here.)

That is what is coloring how I am thinking about your problem.

It is probably a mixture of access to the correct library versions and SELinux permissions.

Sounds like a spicy problem. But glad you have something working.

I debated with myself whether to reply or not. I didn’t really think I had much to contribute.

I am glad I was able to help in some small way.

I’ve been having the same issue ever since I installed Aurora on May 13th, but I hadn’t put any effort into it until this morning.

In my case, I have a docker compose file that starts Ollama and Open WebUI and uses the NVIDIA Container Toolkit. But I currently have a workaround (Alpaca), and I want to eventually run it in Kubernetes (k3s) anyway. Or podman. I prefer podman to docker anyway.

I suspect the same thing. I’m on the stable-daily channel and the NVIDIA driver version is 575.64.03. I get this error message (ollamactl is a convenience script I wrote that does docker compose up or down):

$ ollamactl up           
[+] Running 0/2
 ⠙ Container ollama-openwebui-1  Starting                                                                                                                                    0.1s 
 ⠙ Container ollama-ollama-1     Starting                                                                                                                                    0.1s 
Error response from daemon: failed to create task for container: failed to create shim task: OCI runtime create failed: runc create failed: unable to start container process: error during container init: error running prestart hook #0: exit status 1, stdout: , stderr: Auto-detected mode as 'legacy'
nvidia-container-cli: detection error: open failed: /usr/lib/libnvidia-tls.so.575.64: no such file or directory: unknown

What I find is that there is no .so or .so.1 (or whatever digit) symlink for libnvidia-tls.so.575.64.03, while all of the other libnvidia* files have corresponding .so.{1-9} symlinks.

$ ls -la /usr/lib/libnvidia*
lrwxrwxrwx. 1 root root       32 31 déc  1969 /usr/lib/libnvidia-allocator.so.1 -> libnvidia-allocator.so.575.64.03
-rwxr-xr-x. 1 root root   191352 31 déc  1969 /usr/lib/libnvidia-allocator.so.575.64.03
-rwxr-xr-x. 1 root root 30029768 31 déc  1969 /usr/lib/libnvidia-eglcore.so.575.64.03
lrwxrwxrwx. 1 root root       26 31 déc  1969 /usr/lib/libnvidia-egl-gbm.so.1 -> libnvidia-egl-gbm.so.1.1.2
-rwxr-xr-x. 1 root root    27600 31 déc  1969 /usr/lib/libnvidia-egl-gbm.so.1.1.2
lrwxrwxrwx. 1 root root       31 31 déc  1969 /usr/lib/libnvidia-egl-wayland.so.1 -> libnvidia-egl-wayland.so.1.1.19
-rwxr-xr-x. 1 root root    78328 31 déc  1969 /usr/lib/libnvidia-egl-wayland.so.1.1.19
lrwxrwxrwx. 1 root root       26 31 déc  1969 /usr/lib/libnvidia-egl-xcb.so.1 -> libnvidia-egl-xcb.so.1.0.2
-rwxr-xr-x. 1 root root    77684 31 déc  1969 /usr/lib/libnvidia-egl-xcb.so.1.0.2
lrwxrwxrwx. 1 root root       27 31 déc  1969 /usr/lib/libnvidia-egl-xlib.so.1 -> libnvidia-egl-xlib.so.1.0.2
-rwxr-xr-x. 1 root root    77760 31 déc  1969 /usr/lib/libnvidia-egl-xlib.so.1.0.2
lrwxrwxrwx. 1 root root       29 31 déc  1969 /usr/lib/libnvidia-encode.so -> libnvidia-encode.so.575.64.03
lrwxrwxrwx. 1 root root       29 31 déc  1969 /usr/lib/libnvidia-encode.so.1 -> libnvidia-encode.so.575.64.03
-rwxr-xr-x. 1 root root   325140 31 déc  1969 /usr/lib/libnvidia-encode.so.575.64.03
-rwxr-xr-x. 1 root root 32086156 31 déc  1969 /usr/lib/libnvidia-glcore.so.575.64.03
-rwxr-xr-x. 1 root root   798460 31 déc  1969 /usr/lib/libnvidia-glsi.so.575.64.03
-rwxr-xr-x. 1 root root 13386612 31 déc  1969 /usr/lib/libnvidia-glvkspirv.so.575.64.03
-rwxr-xr-x. 1 root root 96984532 31 déc  1969 /usr/lib/libnvidia-gpucomp.so.575.64.03
lrwxrwxrwx. 1 root root       25 31 déc  1969 /usr/lib/libnvidia-ml.so.1 -> libnvidia-ml.so.575.64.03
-rwxr-xr-x. 1 root root  2422416 31 déc  1969 /usr/lib/libnvidia-ml.so.575.64.03
lrwxrwxrwx. 1 root root       27 31 déc  1969 /usr/lib/libnvidia-nvvm.so.4 -> libnvidia-nvvm.so.575.64.03
-rwxr-xr-x. 1 root root 94346880 31 déc  1969 /usr/lib/libnvidia-nvvm.so.575.64.03
lrwxrwxrwx. 1 root root       29 31 déc  1969 /usr/lib/libnvidia-opencl.so.1 -> libnvidia-opencl.so.575.64.03
-rwxr-xr-x. 1 root root 21559024 31 déc  1969 /usr/lib/libnvidia-opencl.so.575.64.03
lrwxrwxrwx. 1 root root       34 31 déc  1969 /usr/lib/libnvidia-opticalflow.so.1 -> libnvidia-opticalflow.so.575.64.03
-rwxr-xr-x. 1 root root    46316 31 déc  1969 /usr/lib/libnvidia-opticalflow.so.575.64.03
lrwxrwxrwx. 1 root root       37 31 déc  1969 /usr/lib/libnvidia-ptxjitcompiler.so.1 -> libnvidia-ptxjitcompiler.so.575.64.03
-rwxr-xr-x. 1 root root 44553256 31 déc  1969 /usr/lib/libnvidia-ptxjitcompiler.so.575.64.03
-rwxr-xr-x. 1 root root    21880 31 déc  1969 /usr/lib/libnvidia-tls.so.575.64.03

I guess that I would have to submit an issue upstream to the packager for nvidia-driver-libs. (I found the package name by running rpm -qf /usr/lib/libnvidia-tls.so.575.64.03.)

Well, it does work with podman. I had to set the podman compose provider to podman-compose, or else docker compose was used because it takes precedence when it is installed. (I’m going to set it permanently in my user’s containers.conf; see man containers.conf.)

export PODMAN_COMPOSE_PROVIDER=podman-compose
podman compose up -d
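
The permanent version should be roughly this, going by man containers.conf (the key name is my reading of the man page, so double-check it against your podman version):

# ~/.config/containers/containers.conf (sketch)
[engine]
compose_providers = ["podman-compose"]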

I saw that using sudo with docker compose may work, but I really don’t like the idea of rootful containers anyway.