[Bug Report] Bazzite does not handle OOM properly

Hello, I work in Blender, and sometimes I push things a bit too far and run out of memory. I have 32 GB of DDR5, although the same issue affected me on 32 GB of DDR4 too. Not that the generation or RAM speed matters here; I just want people to know that this has been affecting me for a while.

However, unlike on Windows (as far back as the early versions of Windows NT Workstation and Windows XP) or on macOS, when Bazzite runs out of memory the entire system goes down. (As far as I’m aware this might just be a Bazzite thing, but I will give it the benefit of the doubt and say it might also happen on base Fedora Atomic.) Everything freezes and I’m forced to hard-restart the computer (flick the PSU switch), losing my work. This happens almost every day for me and it affects my work. I’m here to report that OOM handling is not working on Bazzite, or that whatever happens when the system runs out of memory is not enough to prevent the entire system from breaking. I mean this nicely and just wish for Bazzite to improve, of course.

With just 2 Blender instances open I’m already using a staggering 24 GB of RAM.
(This is everything open on my computer right now in total)

That leaves only 6 GB of RAM available. Of course I understand that what GNOME or most services do is not Bazzite’s fault, and I can’t blame Bazzite for taking up all my RAM.
But still, that’s LITTLE wiggle room. If I were to launch a third Blender instance out of nowhere, there is a good chance that for a second I could go OOM and the entire system would go down, which is why I bring up this problem.

We ship systemd-oomd with the same config that Fedora uses.

Huh... well, now I’m really confused, because yeah, you’re right, it does use that.

Tell you what: I’ll record you a video showing what happens on OOM for me. (This has happened on both AM4 and AM5 for me.) (It just happened again: I ran out of memory, the entire system froze and would not come back, and I had to pull the power plug again.)

Yeah, no, I can confirm. When Bazzite runs out of memory it just freezes and never comes back. OOM handling does not appear to save the system at all. Even if it reaches “OOM” territory just a LITTLE bit, the entire system goes down.

I have had a few OOM events running Wan 2 models in ComfyUI with ROCm. I will say that sometimes oomd will kill the process causing the OOM, but not always. I have had a few OOMs where the system would crawl to a halt. It was never clear to me when it would happen, and it wasn’t consistent.

I have a feeling that the default config for oomd is not aggressive enough which allows some system hangs to happen. I haven’t had a chance to look at the logs myself about why this is.
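For anyone who wants to test that hunch, systemd-oomd reads drop-ins from /etc/systemd/oomd.conf.d/. Below is a minimal sketch that only stages such a drop-in in a temporary directory for review. The file name 99-faster-kill.conf and both values are my own illustrative assumptions, not tested recommendations, and the "upstream default" comments are from memory of oomd.conf(5); Fedora may ship different values.

```shell
#!/bin/sh
# Sketch: stage a systemd-oomd drop-in for review. Nothing is installed here.
# The file name and the values are illustrative assumptions, not tuned settings.
staging_dir="$(mktemp -d)"
dropin="$staging_dir/99-faster-kill.conf"

cat > "$dropin" <<'EOF'
[OOM]
# Start killing once this share of swap is used (upstream default: 90%).
SwapUsedLimit=80%
# How long memory pressure must stay above the limit before oomd acts
# (upstream default: 30s).
DefaultMemoryPressureDurationSec=10s
EOF

echo "Staged drop-in at: $dropin"
echo "To install it after review:"
echo "  sudo install -D -m 0644 $dropin /etc/systemd/oomd.conf.d/99-faster-kill.conf"
echo "  sudo systemctl restart systemd-oomd"
```

After installing, `systemd-analyze cat-config systemd/oomd.conf` should show the merged configuration and `oomctl` should show what oomd is currently monitoring. Lower values make kills more likely, including ones you did not want, so it is probably worth changing one setting at a time.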

Please, I’m begging. Fix it…

I can’t take these crashes anymore.

A few weeks ago I had an OOM event on one of my PCs (AMD Ryzen 5, 32 GB DDR5). I had left Firefox open with many tabs, and it slowly ate the free memory until almost all of it was used, and then the OOM killer closed Firefox. I know this because a day or two later, when powering on the TV attached to it over HDMI, I saw a message telling me that Firefox was closed because the system was out of memory, and everything kept working OK.

So, this is to say that I think systemd-oomd works when memory is claimed slowly by a program, but maybe the problem is when it is claimed faster, not giving oomd time to kill the process.
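One way to check the "too fast for oomd" theory is to watch the kernel's pressure stall information (PSI), which is the signal systemd-oomd uses for its memory pressure kills. A minimal sketch, assuming a kernel that exposes /proc/pressure/memory; the function names are made up for illustration:

```shell
#!/bin/sh
# Sketch: sample available memory and memory pressure once.
# Function names are hypothetical helpers, not part of any tool.

# Print MemAvailable in KiB from a meminfo-style file (default: /proc/meminfo).
mem_available_kib() {
    awk '/^MemAvailable:/ { print $2 }' "${1:-/proc/meminfo}"
}

# Print the "full" avg10 value from a PSI-style file: the percentage of the
# last 10 seconds in which all non-idle tasks were stalled on memory.
psi_full_avg10() {
    awk '$1 == "full" { sub(/^avg10=/, "", $2); print $2 }' "${1:-/proc/pressure/memory}"
}

# Take one sample; each source is skipped if the system does not expose it.
if [ -r /proc/meminfo ]; then
    echo "MemAvailable: $(mem_available_kib) KiB"
fi
if [ -r /proc/pressure/memory ]; then
    echo "memory pressure (full, avg10): $(psi_full_avg10)%"
fi
```

Running this once per second while a heavy job starts should show whether available memory collapses within a few seconds, which would leave little time for a daemon that waits for sustained pressure before acting.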

At the same time, I don’t know if a more aggressive oomd setting could result in false positives, killing programs erroneously…

Here are instructions on how to manually switch from swap on zram to disk-based swap with zswap enabled. This may help resolve some memory issues, as it allows data to be moved out of memory completely (freeing it for other uses), rather than compressing it but keeping it in memory (which is what zram does). This worked for me on Bluefin, and I assume Bazzite is similar enough for this to work there too.

Step 1, switch off zram:

Create a file /etc/systemd/zram-generator.conf which overrides the system zram settings. The file can be empty. In mine I’ve put a comment to myself:

# This empty file overrides `/usr/lib/systemd/zram-generator.conf`.

Step 2, create a swapfile:

This assumes the system is using btrfs (the default on Bluefin, so I assume Bazzite is the same). Change the size to whatever makes sense for you.

sudo btrfs filesystem mkswapfile --size 8G /var/swapfile

Add the swapfile to /etc/fstab. The entry for me looks like this:

/var/swapfile    none    swap    pri=0    0 0

Step 3, enable zswap (optional):

sudo rpm-ostree kargs --append='zswap.enabled=1'

Step 4, reboot.

Step 5, confirm settings:

The command cat /proc/swaps can confirm the active swap. It should now show only your swapfile.

Example:

> cat /proc/swaps
Filename				Type		Size		Used		Priority
/var/swapfile                           file		8388604		0		0

If you enabled zswap, check it is enabled:

> sudo dmesg | grep -F zswap
[    0.000000] Command line: BOOT_IMAGE=(hd1,gpt5)/ostree/default-2ee8f203d4c5b97fd5fc04daf879a9dad7dc1c405c8aba2b6ddbda907ce9816e/vmlinuz-6.18.13-200.fc43.x86_64 rd.luks.uuid=luks-0c755fd5-99d6-4578-bb46-328eb72fd038 rd.lvm.lv=vg0/root rhgb quiet root=UUID=8e069d8e-b937-4ad1-a692-cd921ee55f6f rootflags=subvol=ub_root rw ostree=/ostree/boot.0/default/2ee8f203d4c5b97fd5fc04daf879a9dad7dc1c405c8aba2b6ddbda907ce9816e/0 rd.luks.options=discard zswap.enabled=1
[    0.048097] Kernel command line: BOOT_IMAGE=(hd1,gpt5)/ostree/default-2ee8f203d4c5b97fd5fc04daf879a9dad7dc1c405c8aba2b6ddbda907ce9816e/vmlinuz-6.18.13-200.fc43.x86_64 rd.luks.uuid=luks-0c755fd5-99d6-4578-bb46-328eb72fd038 rd.lvm.lv=vg0/root rhgb quiet root=UUID=8e069d8e-b937-4ad1-a692-cd921ee55f6f rootflags=subvol=ub_root rw ostree=/ostree/boot.0/default/2ee8f203d4c5b97fd5fc04daf879a9dad7dc1c405c8aba2b6ddbda907ce9816e/0 rd.luks.options=discard zswap.enabled=1
[    1.116601] zswap: loaded using pool lzo
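The dmesg output above shows what was passed at boot; the current state can also be read from sysfs. A small sketch, assuming the kernel exposes zswap's parameters under /sys/module/zswap/parameters (the helper name zswap_state is hypothetical):

```shell
#!/bin/sh
# Sketch: report whether zswap is currently enabled.
# zswap_state is a hypothetical helper; it reads a file containing Y or N.
zswap_state() {
    param_file="${1:-/sys/module/zswap/parameters/enabled}"
    if [ ! -r "$param_file" ]; then
        # No such file: zswap is not available or sysfs is not mounted.
        echo "unknown"
        return 0
    fi
    case "$(cat "$param_file")" in
        Y|y|1) echo "enabled" ;;
        *)     echo "disabled" ;;
    esac
}

echo "zswap: $(zswap_state)"
```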

I am not well enough informed about the tradeoffs to have an opinion on whether the Bazzite defaults should be changed, but I know that Chris Down recommends disk swap with zswap instead of zram (ref).

My experience has been that it eventually kills something and recovers, but this takes a LONG time. Encountering this frequently is very frustrating. I have layered earlyoom, which is supposed to be much quicker about killing an application before the system freezes up entirely, and this has been a better experience on my end. I would prefer graceful swapping to disk, even at the cost of slower performance, over having memory-hungry applications killed.