Well, this is definitely an Nvidia driver issue. I was worried about this; I haven’t owned an Nvidia device in years because of it…
Ironically the card works perfectly fine with PCI-passthrough into a Windows VM!
- I’ve reverted to the non-nvidia image of Bluefin-DX (ostree-image-signed:docker://ghcr.io/ublue-os/bluefin-dx:latest)
- I created a new Windows 11 virtual machine using Virt-Manager
- I added a PCI host device to the VM, selecting the Nvidia card
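For anyone repeating the passthrough steps, it helps to know the card's PCI address before picking it in Virt-Manager. The following is only a minimal sketch: it assumes the standard sysfs layout under /sys/bus/pci/devices and relies on 0x10de being Nvidia's PCI vendor ID. The function name `nvidia_pci_addresses` and its optional directory argument are hypothetical, added only so the function can be pointed at a different tree.

```shell
# Print the PCI address of every device whose vendor ID is Nvidia's (0x10de).
# $1 (optional, hypothetical helper argument): directory to scan instead of
# the real sysfs tree; defaults to /sys/bus/pci/devices.
nvidia_pci_addresses() {
  root="${1:-/sys/bus/pci/devices}"
  for dev in "$root"/*; do
    # Skip entries that have no readable vendor file.
    [ -r "$dev/vendor" ] || continue
    if [ "$(cat "$dev/vendor")" = "0x10de" ]; then
      basename "$dev"
    fi
  done
}
```

Calling `nvidia_pci_addresses` with no arguments on the host should list addresses in the same form the kernel log uses (for example the 0000:01:00.0 seen in my log), which is what to look for in the PCI host device list.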
Sure enough, I’m back to a stable host system (battery life is reasonable, suspend/resume works, etc.). Meanwhile the Nvidia card works fine inside the Windows VM! I was even able to suspend/resume with the Windows VM running!
Somewhat regretting my Yoga Pro 9i purchase now… I wonder how long it will be before there is a stable kernel/nvidia-driver combination for my machine. Here is the nvidia module output from the kernel log; the driver version seems to be 550.76:
Apr 26 22:43:18 fedora kernel: nvidia: module license 'NVIDIA' taints kernel.
Apr 26 22:43:18 fedora kernel: Disabling lock debugging due to kernel taint
Apr 26 22:43:18 fedora kernel: nvidia: module license taints kernel.
Apr 26 22:43:18 fedora kernel: nvidia-nvlink: Nvlink Core is being initialized, major device number 236
Apr 26 22:43:18 fedora kernel:
Apr 26 22:43:18 fedora kernel: nvidia 0000:01:00.0: enabling device (0000 -> 0003)
Apr 26 22:43:18 fedora kernel: nvidia 0000:01:00.0: vgaarb: VGA decodes changed: olddecodes=io+mem,decodes=none:owns=none
Apr 26 22:43:18 fedora kernel: NVRM: loading NVIDIA UNIX x86_64 Kernel Module 550.76 Wed Apr 10 20:41:20 UTC 2024
Apr 26 22:43:18 fedora kernel: nvidia_uvm: module uses symbols nvUvmInterfaceDisableAccessCntr from proprietary module nvidia, inheriting taint.
Apr 26 22:43:18 fedora kernel: nvidia-uvm: Loaded the UVM driver, major device number 234.
Apr 26 22:43:18 fedora kernel: nvidia-modeset: Loading NVIDIA Kernel Mode Setting Driver for UNIX platforms 550.76 Wed Apr 10 20:05:49 UTC 2024
Apr 26 22:43:18 fedora kernel: [drm] [nvidia-drm] [GPU ID 0x00000100] Loading driver
Apr 26 22:43:19 fedora kernel: ACPI Warning: \_SB.NPCF._DSM: Argument #4 type mismatch - Found [Buffer], ACPI requires [Package] (20230628/nsarguments-61)
Apr 26 22:43:19 fedora kernel: ACPI Warning: \_SB.PC00.RP12.PXSX._DSM: Argument #4 type mismatch - Found [Buffer], ACPI requires [Package] (20230628/nsarguments-61)
Apr 26 22:43:20 fedora kernel: nvidia-modeset: WARNING: GPU:0: Unable to read EDID for display device DP-0
Apr 26 22:43:20 fedora kernel: nvidia-modeset: WARNING: GPU:0: Unable to read EDID for display device DP-0
Apr 26 22:43:20 fedora kernel: [drm] Initialized nvidia-drm 0.0.0 20160202 for 0000:01:00.0 on minor 4
Apr 26 22:43:20 fedora kernel: nvidia 0000:01:00.0: [drm] Cannot find any crtc or sizes
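The version can be pulled out of a log like the one above rather than read by eye. This is a rough sketch that assumes the "NVRM: loading NVIDIA UNIX ... Kernel Module" line keeps the format shown in my log; the function name `nvidia_version_from_log` is made up for illustration.

```shell
# Read kernel log lines on stdin and print the driver version from the first
# "NVRM: loading NVIDIA UNIX <arch> Kernel Module <version>" line found.
nvidia_version_from_log() {
  sed -n 's/.*NVRM: loading NVIDIA UNIX [^ ]* Kernel Module  *\([0-9][0-9.]*\).*/\1/p' | head -n 1
}
```

On a host where the driver is loaded, something like `journalctl -k -b | nvidia_version_from_log` should print the version; `modinfo -F version nvidia` is another way to check when the module is installed.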
And of course the latest Bluefin-DX kernel:
❯ uname -a
Linux myhost 6.8.7-300.fc40.x86_64 #1 SMP PREEMPT_DYNAMIC Wed Apr 17 19:21:08 UTC 2024 x86_64 GNU/Linux
I was actually able to add the PCI card to a VM running Fedora 40 as well, but for some reason the QXL adapter (the VM has two GPUs: a software QXL adapter and the physical Nvidia card) doesn’t take control of the login screen. The VM comes up and I can ssh into it, but I can’t log in from the GDM login screen because it isn’t visible. It seems to render on the Nvidia output, which may need a physical monitor attached. I’ll look into it more, as it may be a good way to try newer kernels/drivers to see if things stabilize…
I will update this post if I come across anything new, in case anyone else is using the same laptop…