After a very surreal experience on GitHub with an AI responding to and actually fixing an issue I had with some software, it piqued my interest, and now I've been bitten by the bug.
So I started down the rabbit hole of playing with local LLMs on my main server. It was too slow to be usable, but I have several Bazzite boxes sitting around, so I thought I would see how Bazzite goes as an LLM server.
Machine:
CPU = Intel 12th Gen i3
RAM = 32GB DDR4
GPU = RX 9060 XT 16GB
Bazzite = latest Deck edition
I just installed ROCm via rpm-ostree, Gear Lever from Bazaar (it used to be included in Bazzite??), and the LM Studio AppImage.
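For anyone following along, the ROCm layering step looks roughly like this (package names are assumptions and may differ by Fedora release, so verify with `dnf search rocm` first):

```shell
# Layer ROCm tooling onto the immutable Bazzite base
rpm-ostree install rocminfo rocm-smi

# Layered packages only take effect after a reboot
systemctl reboot

# After reboot, confirm the GPU is visible to ROCm
rocminfo | grep -i gfx
```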
I set up LM Studio using Gear Lever, and after that I could add it to Steam. Then I installed the gpt-oss-20b model, set LM Studio to use the ROCm backend, switched back to gamescope, launched LM Studio, enabled local server mode, and that's it.
I've connected Open WebUI and an instance of SearXNG to it, and now I have a really fast local LLM that does really fast web searching and research for me. It's a perfect use for the console when I'm not using it to play games, as this one doesn't get used much at all.
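If you want to point your own scripts at the same server, LM Studio's local server mode exposes an OpenAI-compatible API (port 1234 by default; your port and model name may differ). A minimal sketch using only the Python standard library:

```python
import json
import urllib.request

# Default LM Studio server endpoint -- adjust host/port to your setup
LMSTUDIO_URL = "http://localhost:1234/v1/chat/completions"

def build_chat_request(prompt: str, model: str = "gpt-oss-20b") -> dict:
    """Build an OpenAI-style chat completion payload for the local server."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "temperature": 0.7,
    }

def ask(prompt: str) -> str:
    """Send a prompt to the local LM Studio server and return the reply text."""
    data = json.dumps(build_chat_request(prompt)).encode("utf-8")
    req = urllib.request.Request(
        LMSTUDIO_URL, data=data, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req) as resp:
        body = json.load(resp)
    return body["choices"][0]["message"]["content"]
```

With the server running you would just call `ask("some question")`; this is the same API shape that Open WebUI talks to behind the scenes.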
Tokens per second over gigabit LAN is around 80 to 85 with the 20b model fully loaded into VRAM with a 90k token limit. Super impressed; it looks like Bazzite performs really well for this sort of thing. The Steam metrics overlay is cool too: you can see RAM, VRAM and power usage all on screen, really great for testing.
I find the 9060 XT extremely good for a low-cost AI card. I have three users all accessing it and it holds up perfectly fine, just as good as, if not better than, the official OpenAI free online LLM.
Next I will try to set all the services up on the one machine using Podman, as I think everything, Open WebUI, SearXNG and LM Studio, will run on a low-end machine and function perfectly fine.
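A rough sketch of what that Podman setup might look like (image tags, ports, and the env var wiring are assumptions here, so check each project's docs before copying):

```shell
# Shared network so the containers can find each other by name
podman network create llmnet

# SearXNG for web search
podman run -d --name searxng --network llmnet -p 8081:8080 \
  docker.io/searxng/searxng:latest

# Open WebUI, pointed at LM Studio's OpenAI-compatible endpoint on the host
# (host.containers.internal resolves to the host on recent Podman versions)
podman run -d --name open-webui --network llmnet -p 3000:8080 \
  -e OPENAI_API_BASE_URL=http://host.containers.internal:1234/v1 \
  -v open-webui:/app/backend/data \
  ghcr.io/open-webui/open-webui:main
```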