Has ujust ollama-web been removed?

UPDATE: I’ve found a message on Discord saying that Alpaca doesn’t expose an API, which is very likely what I need to make its ollama instance talk with other applications, right?

Therefore, I guess that installing ollama through Homebrew would better fit my needs, apart from the fact that if I try something like

brew install ollama
brew services start ollama
ollama run llama3.2

it will run using my CPU instead of taking advantage of the Nvidia GPU as Alpaca was doing… am I missing a step?

I think you’re 100% right on all of the above (Alpaca doesn’t expose the API, and installing ollama with brew is the way to go), BUT I’m not so sure ollama won’t automatically use your GPU. Can you check that it is not? See the docs on GPU Discovery. Not sure if this applies to brew, etc., but I think it should?
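
If it helps, a quick way to check is to look at what ollama itself reports while a model is loaded (assuming the brew build behaves like upstream ollama):

ollama ps     # the PROCESSOR column should say something like "100% GPU" rather than "100% CPU"
nvidia-smi    # while a model is loaded, the ollama process should show up here with some VRAM allocated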

If it is not, looks like you can force it to use your GPU with this? ollama/docs/gpu.md at 8bccae4f92bced8222efc04f0b25573df450bb89 · ollama/ollama · GitHub
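
If I’m reading that page right, the idea is to pin ollama to the GPU’s UUID via an env var before starting the server. A rough sketch (the UUID is a placeholder you’d get from nvidia-smi):

nvidia-smi -L                                     # list GPUs with their UUIDs
CUDA_VISIBLE_DEVICES=GPU-xxxxxxxx ollama serve    # start the server pinned to that GPU (placeholder UUID)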

Yeah, sorry, my previous message was ambiguous. I meant to say that I guessed ollama was a better fit, so I did try installing it with brew, and I did encounter an issue with GPU detection.

The GPU is an NVIDIA 4060 (laptop), configured and recognized by the system, as proven by its presence in the System Monitor and by the fact that Alpaca makes use of it.
However, even if I set the CUDA_VISIBLE_DEVICES env variable as suggested in the link you attached, the ollama installed by brew ignores it.

Did you (or anyone else) get ollama from brew working with an nvidia GPU?

I am not using ollama, so please verify what I am about to share.

I did a project with pytorch and CUDA support recently. I have an RTX 3050 Ti Mobile.

Here are 2 suggestions:

  1. standard nvidia env vars (I ended up not needing these):
    export __NV_PRIME_RENDER_OFFLOAD=1
    export __GLX_VENDOR_LIBRARY_NAME=nvidia
  2. Put this in a .desktop file for the app in ~/.local/share/applications (re-login required) - this did the trick for me in most apps like kitty, vscode, etc. (see the sketch below this list):
    PrefersNonDefaultGPU=true
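
For the second option, a minimal sketch of what such a launcher could look like (Name and Exec are placeholders for whatever app you want on the dGPU):

[Desktop Entry]
Type=Application
Name=Kitty (dGPU)
Exec=kitty
PrefersNonDefaultGPU=true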

Gotcha! I’m very sorry - I don’t know!

Problem is that ollama is not recognizing the nvidia GPU even when run through the command line, not only when used in external IDEs, so I fear the problem is a different one :melting_face:

Gah, unfortunately we may be out of luck with brew and ollama, judging by the mac folks’ experience (link below). Maybe try the direct installer in this case and fiddle with the path and whatnot till it works. Sorry! I don’t have a strong GPU, so I never worried about it running on the CPU.

https://www.reddit.com/r/ollama/comments/1h7grjl/m3_macbook_pro_18gb_not_using_gpu/
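
For reference, the “direct installer” is the upstream one-liner below; on an image-based system like Bluefin you’d presumably have to point it at a user-writable prefix, which is the path-fiddling part:

curl -fsSL https://ollama.com/install.sh | sh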

I’d just grab the old service unit and toss it into /etc/containers/systemd:

Sorry, but I may need a bit more detailed instructions about that :sweat_smile:

You mean I should copy just the part below [Service] and paste it into /etc/containers/systemd? And then?

Also, I see the linked script mentions containers, so I thought: can’t I just install ollama inside a Docker or Podman container?
I’ve seen in the ollama docs that it can also be installed inside a container as long as the Nvidia Container Toolkit is installed on the system. Does Bluefin have it included? Because if so, I might try that way as well…

Yep! Then you start/enable it like any other service unit: systemctl start ollama or whatever you call it. This is if you want to run it as a service on your machine. It’s useful if you want a centralized ollama instance and then connect a bunch of apps to it, so you manage the LLM in one place.
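
I don’t have the old unit at hand, but as a rough sketch, a quadlet file at /etc/containers/systemd/ollama.container could look something like this (assuming the ollama/ollama image and that Podman’s CDI support for the Nvidia toolkit is set up; names and mounts are just examples):

[Unit]
Description=Ollama server

[Container]
Image=docker.io/ollama/ollama:latest
ContainerName=ollama
PublishPort=11434:11434
Volume=ollama:/root/.ollama
AddDevice=nvidia.com/gpu=all

[Service]
Restart=always

[Install]
WantedBy=multi-user.target default.target

After a systemctl daemon-reload it should show up as ollama.service and can be started like any other unit.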

can’t I just install ollama inside a docker or podman container?

Yeah, this is a systemd service unit that will handle that for you; it’s using the ollama/ollama container from Docker Hub. You can also install it manually if you follow their instructions and run it that way: https://hub.docker.com/r/ollama/ollama

Pretty sure the nvidia image has everything it needs but if it’s missing anything we can add that.
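
If I’m remembering the Docker Hub instructions right, the manual route (after the nvidia-ctk setup) boils down to roughly the following; double-check against the page itself:

docker run -d --gpus=all -v ollama:/root/.ollama -p 11434:11434 --name ollama ollama/ollama
docker exec -it ollama ollama run llama3.2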

For those with a 780M iGPU, I can confirm that GPU usage does work when running ollama in Docker using the compose file found in the discussion below. Not sure if this is worth documenting.

Tried that and it works like a charm!

Following the instructions on Docker Hub, I ran these two commands first; no prior toolkit installation was needed (not sure if they are necessary, but I ran them just in case):

sudo nvidia-ctk runtime configure --runtime=docker
sudo systemctl restart docker

and then spun up an ollama container using the following compose file:

---
services:
  ollama:
    image: ollama/ollama
    container_name: ollama
    restart: unless-stopped
    ports:
      - 11434:11434
    volumes:
      - ./ollama_v:/root/.ollama
    deploy:
      resources:
        reservations:
          devices:
            - capabilities:
                - gpu

It exposes the ollama API on localhost:11434 (or whatever port is mapped instead when spinning up the container), and with that I can connect it both to the Alpaca GUI (by specifying it as a “remote instance” in the settings) and to Zed and JetBrains IDEs.
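
For anyone following along, a quick sanity check that the containerized API is reachable (assuming the default port mapping above):

curl http://localhost:11434/api/version    # should return a small JSON blob with the server version
curl http://localhost:11434/api/tags       # lists the models currently pulled into the container's volume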

Idk how common my use case is, but if it’s something more people run into, it could be worth adding Docker instead of Homebrew as a “more advanced” option in the Bluefin docs?

Yeah if you wouldn’t mind PRing it on the AI page that’d be sweet! I can take a look later today. Those last five lines are ridiculous, lol.

I’m kinda new to PRing big projects, but I should have done it here

Let me know if there’s something wrong :raised_hands:

Wow! Really excellent documentation @shaked_coffee , thank you!

This topic was automatically closed 90 days after the last reply. New replies are no longer allowed.