I’m running bazzite-deck with an AMD Ryzen 9 7900 CPU and an XFX Radeon 7900 GRE GPU. I’ve recently encountered some crashes when gaming in desktop mode (specifically in Helldivers 2; other, less demanding games have been fine). Sequence of events:
- Picture on the monitor freezes, but I can still hear the game and can still talk in Discord
- Audio cuts out, picture is still frozen
- Screen goes black
- The session drops me back into Steam Game Mode
- I’m able to return to desktop mode, open Discord, reopen the game, and continue without issue
I checked journalctl and believe this is the start of the relevant logs.
Jun 12 20:50:05 bazzite kernel: amdgpu 0000:03:00.0: amdgpu: Dumping IP State
Jun 12 20:50:05 bazzite kernel: amdgpu 0000:03:00.0: amdgpu: Dumping IP State Completed
Jun 12 20:50:05 bazzite kernel: amdgpu 0000:03:00.0: amdgpu: ring gfx_0.0.0 timeout, signaled seq=37073695, emitted seq=37073697
Jun 12 20:50:05 bazzite kernel: amdgpu 0000:03:00.0: amdgpu: Process information: process main pid 13369 thread vkd3d_queue pid 13501
Jun 12 20:50:05 bazzite kernel: amdgpu 0000:03:00.0: amdgpu: Starting gfx_0.0.0 ring reset
Jun 12 20:50:05 bazzite kernel: [drm:gfx_v11_0_bad_op_irq [amdgpu]] *ERROR* Illegal opcode in command stream
Jun 12 20:50:06 bazzite flatpak[8972]: 20:50:06.168 › [HDStreamingConsumableModal] Setting bitrates
Jun 12 20:50:06 bazzite flatpak[8972]: 20:50:06.169 › [HDStreamingConsumableModal] Setting bitrates
Jun 12 20:50:06 bazzite flatpak[8972]: 20:50:06.234 › [RTCControlSocket(default)] Sending heartbeat with last received sequence number: 22
Jun 12 20:50:06 bazzite flatpak[8972]: 20:50:06.290 › [RTCControlSocket(default)] Heartbeat ACK received
Jun 12 20:50:07 bazzite kernel: amdgpu 0000:03:00.0: amdgpu: MES failed to respond to msg=RESET
Jun 12 20:50:07 bazzite kernel: [drm:amdgpu_mes_reset_legacy_queue [amdgpu]] *ERROR* failed to reset legacy queue
Jun 12 20:50:07 bazzite kernel: amdgpu 0000:03:00.0: amdgpu: Ring gfx_0.0.0 reset failure
Jun 12 20:50:07 bazzite kernel: amdgpu 0000:03:00.0: amdgpu: GPU reset begin!
Jun 12 20:50:11 bazzite kernel: amdgpu 0000:03:00.0: amdgpu: failed to suspend display audio
Jun 12 20:50:13 bazzite kernel: amdgpu 0000:03:00.0: amdgpu: MES failed to respond to msg=REMOVE_QUEUE
Jun 12 20:50:13 bazzite kernel: [drm:amdgpu_mes_unmap_legacy_queue [amdgpu]] *ERROR* failed to unmap legacy queue
Jun 12 20:50:14 bazzite kernel: [drm:gfx_v11_0_hw_fini [amdgpu]] *ERROR* failed to halt cp gfx
Jun 12 20:50:14 bazzite kernel: amdgpu 0000:03:00.0: amdgpu: MODE1 reset
Jun 12 20:50:14 bazzite kernel: amdgpu 0000:03:00.0: amdgpu: GPU mode1 reset
Jun 12 20:50:14 bazzite kernel: amdgpu 0000:03:00.0: amdgpu: GPU smu mode1 reset
Jun 12 20:50:14 bazzite kernel: amdgpu 0000:03:00.0: amdgpu: GPU reset succeeded, trying to resume
Jun 12 20:50:14 bazzite kernel: [drm] PCIE GART of 512M enabled (table at 0x0000008000300000).
Jun 12 20:50:14 bazzite kernel: [drm] VRAM is lost due to GPU reset!
Jun 12 20:50:14 bazzite kernel: amdgpu 0000:03:00.0: amdgpu: PSP is resuming...
Jun 12 20:50:14 bazzite kernel: amdgpu 0000:03:00.0: amdgpu: reserve 0x1300000 from 0x83fc000000 for PSP TMR
Jun 12 20:50:14 bazzite kernel: amdgpu 0000:03:00.0: amdgpu: RAP: optional rap ta ucode is not available
Jun 12 20:50:14 bazzite kernel: amdgpu 0000:03:00.0: amdgpu: SECUREDISPLAY: securedisplay ta ucode is not available
Jun 12 20:50:14 bazzite kernel: amdgpu 0000:03:00.0: amdgpu: SMU is resuming...
Jun 12 20:50:14 bazzite kernel: amdgpu 0000:03:00.0: amdgpu: smu driver if version = 0x0000003d, smu fw if version = 0x00000040, smu fw program = 0, smu fw version = 0x004e8000 (78.128.0)
Jun 12 20:50:14 bazzite kernel: amdgpu 0000:03:00.0: amdgpu: SMU driver if version not matched
Jun 12 20:50:15 bazzite kernel: amdgpu 0000:03:00.0: amdgpu: SMU is resumed successfully!
Jun 12 20:50:15 bazzite kernel: [drm] DMUB hardware initialized: version=0x07002D00
Jun 12 20:50:15 bazzite kernel: amdgpu 0000:03:00.0: amdgpu: ring gfx_0.0.0 uses VM inv eng 0 on hub 0
Jun 12 20:50:15 bazzite kernel: amdgpu 0000:03:00.0: amdgpu: ring comp_1.0.0 uses VM inv eng 1 on hub 0
Jun 12 20:50:15 bazzite kernel: amdgpu 0000:03:00.0: amdgpu: ring comp_1.1.0 uses VM inv eng 4 on hub 0
Jun 12 20:50:15 bazzite kernel: amdgpu 0000:03:00.0: amdgpu: ring comp_1.2.0 uses VM inv eng 6 on hub 0
Jun 12 20:50:15 bazzite kernel: amdgpu 0000:03:00.0: amdgpu: ring comp_1.3.0 uses VM inv eng 7 on hub 0
Jun 12 20:50:15 bazzite kernel: amdgpu 0000:03:00.0: amdgpu: ring comp_1.0.1 uses VM inv eng 8 on hub 0
Jun 12 20:50:15 bazzite kernel: amdgpu 0000:03:00.0: amdgpu: ring comp_1.1.1 uses VM inv eng 9 on hub 0
Jun 12 20:50:15 bazzite kernel: amdgpu 0000:03:00.0: amdgpu: ring comp_1.2.1 uses VM inv eng 10 on hub 0
Jun 12 20:50:15 bazzite kernel: amdgpu 0000:03:00.0: amdgpu: ring comp_1.3.1 uses VM inv eng 11 on hub 0
Jun 12 20:50:15 bazzite kernel: amdgpu 0000:03:00.0: amdgpu: ring sdma0 uses VM inv eng 12 on hub 0
Jun 12 20:50:15 bazzite kernel: amdgpu 0000:03:00.0: amdgpu: ring sdma1 uses VM inv eng 13 on hub 0
Jun 12 20:50:15 bazzite kernel: amdgpu 0000:03:00.0: amdgpu: ring vcn_unified_0 uses VM inv eng 0 on hub 8
Jun 12 20:50:15 bazzite kernel: amdgpu 0000:03:00.0: amdgpu: ring vcn_unified_1 uses VM inv eng 1 on hub 8
Jun 12 20:50:15 bazzite kernel: amdgpu 0000:03:00.0: amdgpu: ring jpeg_dec uses VM inv eng 4 on hub 8
Jun 12 20:50:15 bazzite kernel: amdgpu 0000:03:00.0: amdgpu: ring mes_kiq_3.1.0 uses VM inv eng 14 on hub 0
Jun 12 20:50:15 bazzite kernel: amdgpu 0000:03:00.0: amdgpu: GPU reset(2) succeeded!
Jun 12 20:50:15 bazzite steam[8118]: radv/amdgpu: The CS has been cancelled because the context is lost. This context is innocent.
Jun 12 20:50:15 bazzite audit[7495]: ANOM_ABEND auid=1000 uid=1000 gid=1000 ses=2 subj=unconfined_u:unconfined_r:xserver_t:s0-s0:c0.c1023 pid=7495 comm="Xwayland:cs0" exe="/usr/bin/Xwayland" sig=6 res=1
Jun 12 20:50:15 bazzite kwin_wayland_wrapper[7495]: amdgpu: The CS has cancelled because the context is lost. This context is innocent.
Jun 12 20:50:15 bazzite kwin_wayland[7408]: kwin_scene_opengl: A graphics reset not attributable to the current GL context occurred.
Jun 12 20:50:15 bazzite systemd-coredump[16444]: Process 7495 (Xwayland) of user 1000 terminated abnormally with signal 6/ABRT, processing...
Jun 12 20:50:15 bazzite systemd[1]: Created slice system-systemd\x2dcoredump.slice - Slice /system/systemd-coredump.
Jun 12 20:50:15 bazzite audit: BPF prog-id=314 op=LOAD
Jun 12 20:50:15 bazzite audit: BPF prog-id=315 op=LOAD
Jun 12 20:50:15 bazzite audit: BPF prog-id=316 op=LOAD
Jun 12 20:50:15 bazzite kernel: snd_hda_intel 0000:03:00.1: azx_get_response timeout, switching to polling mode: last cmd=0x00872400
Jun 12 20:50:15 bazzite kernel: snd_hda_intel 0000:03:00.1: spurious response 0x0:0x0, last cmd=0x872400
Jun 12 20:50:15 bazzite kernel: snd_hda_intel 0000:03:00.1: spurious response 0x0:0x0, last cmd=0x872400
Jun 12 20:50:15 bazzite kernel: snd_hda_intel 0000:03:00.1: spurious response 0x0:0x0, last cmd=0x872400
Jun 12 20:50:15 bazzite kernel: snd_hda_intel 0000:03:00.1: spurious response 0x0:0x0, last cmd=0x872400
Jun 12 20:50:15 bazzite kernel: snd_hda_intel 0000:03:00.1: spurious response 0x0:0x0, last cmd=0x872400
Jun 12 20:50:15 bazzite kernel: snd_hda_intel 0000:03:00.1: spurious response 0x0:0x0, last cmd=0x872400
Jun 12 20:50:15 bazzite kernel: snd_hda_intel 0000:03:00.1: spurious response 0x0:0x0, last cmd=0x872400
Jun 12 20:50:15 bazzite kernel: snd_hda_intel 0000:03:00.1: spurious response 0x0:0x0, last cmd=0x872400
Jun 12 20:50:15 bazzite kernel: snd_hda_intel 0000:03:00.1: spurious response 0x0:0x0, last cmd=0x872400
Jun 12 20:50:15 bazzite kernel: snd_hda_intel 0000:03:00.1: spurious response 0x0:0x0, last cmd=0x872400
Jun 12 20:50:15 bazzite audit[1]: SERVICE_START pid=1 uid=0 auid=4294967295 ses=4294967295 subj=system_u:system_r:init_t:s0 msg='unit=systemd-coredump@0-16444-0 comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=success'
Jun 12 20:50:15 bazzite kwin_wayland[7408]: kwin_wayland_drm: Checking test buffer failed!
Jun 12 20:50:15 bazzite systemd[1]: Started systemd-coredump@0-16444-0.service - Process Core Dump (PID 16444/UID 0).
Jun 12 20:50:15 bazzite systemd[1]: Stopping sddm.service - Simple Desktop Display Manager...
Jun 12 20:50:15 bazzite sddm[7281]: Signal received: SIGTERM
Jun 12 20:50:15 bazzite sddm-helper[7285]: Signal received: SIGTERM
Jun 12 20:50:15 bazzite systemd[2961]: Stopped target plasma-workspace-wayland.target.
Jun 12 20:50:15 bazzite systemd[2961]: Stopped target plasma-workspace.target - KDE Plasma Workspace.
Jun 12 20:50:15 bazzite systemd[2961]: Stopped target xdg-desktop-autostart.target - Startup of XDG autostart applications.
Jun 12 20:50:15 bazzite systemd[2961]: Stopping app-steam@autostart.service - Steam...
Jun 12 20:50:15 bazzite systemd[2961]: Stopping app-geoclue\x2ddemo\x2dagent@autostart.service - Geoclue Demo agent...
The logs continue from there, listing services shutting down, KDE Plasma shutting down, the Game Mode session starting, etc. I’m a little unsure how much of it matters, since systemd-coredump reports core dumps of things like Discord only after everything else had shut down and a new session had started. I doubt the rest is relevant, since the start of this excerpt already shows a GPU reset.
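For anyone sifting through a longer dump like this, here is a rough sketch of how the GPU-related kernel lines could be separated from the surrounding noise (Discord, audit, snd_hda_intel). It assumes the default journalctl line layout seen in the excerpt above (`Mon DD HH:MM:SS host kernel: message`); the function names and the keyword list are made up for illustration and are not part of any existing tool.

```python
import re

# Matches default-format journal lines such as
#   "Jun 12 20:50:05 bazzite kernel: amdgpu 0000:03:00.0: amdgpu: ..."
# and keeps only kernel messages that mention amdgpu or a [drm...] tag.
KERNEL_GPU = re.compile(
    r"^(\w{3}\s+\d+ \d{2}:\d{2}:\d{2}) \S+ kernel: (.*(?:amdgpu|\[drm).*)$"
)


def gpu_kernel_lines(journal_text):
    """Return (timestamp, message) pairs for amdgpu/drm kernel lines."""
    hits = []
    for line in journal_text.splitlines():
        match = KERNEL_GPU.match(line.strip())
        if match:
            hits.append((match.group(1), match.group(2)))
    return hits


def reset_milestones(hits):
    """Keep only the lines marking the timeout and reset stages.

    The keyword list is an assumption based on the messages in this excerpt.
    """
    keywords = ("timeout", "reset", "VRAM is lost", "Illegal opcode")
    return [(ts, msg) for ts, msg in hits if any(k in msg for k in keywords)]


if __name__ == "__main__":
    import sys

    # Usage: journalctl -b | python this_script.py
    for ts, msg in reset_milestones(gpu_kernel_lines(sys.stdin.read())):
        print(ts, msg)
```

Run against the excerpt above, this would keep lines such as the `ring gfx_0.0.0 timeout` and `VRAM is lost due to GPU reset!` messages and drop the per-ring `uses VM inv eng` lines, which makes it easier to compare one crash against the next.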
I’m a bit out of my league here and could use some assistance. Is this more likely a kernel/driver thing or a hardware thing? What additional information could I gather or what additional troubleshooting can I do?
It’s possible this is related to the thread “Random Reboots, without the usual suspects”, but that thread describes reboots rather than being kicked from desktop mode to Game Mode, and the journalctl logs there don’t seem particularly related to my case.