Ollama integration #349
ericcurtin started this conversation in Ideas
-
Linked PR:
-
Thanks for creating this discussion :)
Would this allow for something like a chatbot with an ongoing set of requests to the server? Or would the server restart with each new prompt?
-
Another related PR:
Can confirm, Fedora 40 and Ollama work via podman-ollama using an AMD Radeon RX 7600 GPU.
-
There are a couple of ways we could tackle this.
Containerising Ollama and other AI frameworks, where possible, is advisable given their complexity as software stacks. This approach is already implicit in our strategies around tools like podman, bootc, etc.
One nice aspect of podman containers for Ollama is that we can expose specific containers to specific pieces of hardware, be it an NVIDIA GPU, an AMD GPU, a CPU, or some other component.
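As a hedged sketch of what that looks like with plain podman (the image is an example; the CDI device string assumes nvidia-container-toolkit has generated its CDI specs):

```sh
# NVIDIA GPU via CDI:
podman run --rm --device nvidia.com/gpu=all docker.io/ollama/ollama

# AMD GPU via the ROCm kernel device nodes:
podman run --rm --device /dev/kfd --device /dev/dri docker.io/ollama/ollama

# CPU only: no device flags needed.
podman run --rm docker.io/ollama/ollama
```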
This tool was written with the goal that Ollama should be boring: daemonless, rootless, serverless, portless, and ephemeral by default (though one can change those defaults if necessary).
https://github.com/ericcurtin/podman-ollama
Some use cases for this tool:
Running local Ollama LLMs
This is just a single command, as sketched below:
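The exact invocation here is an assumption based on the podman-ollama README; the prompt is a placeholder:

```sh
# One-shot prompt: spins up an ephemeral container, runs the request, exits.
podman-ollama "What is the meaning of life?"
```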
An ephemeral container is set up with a local server running, the request is executed, and the container dies.
Running Ollama as a server only
This has similar syntax to `podman generate` (since this is all a wrapper around existing tooling). We can set up quadlets with this command, which is a similar approach to the one Universal Blue took and, IMO, the most ideal approach for servers. A sketch of such a quadlet follows:
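As a rough illustration only (hand-written, not the tool's actual output; the image and volume path are assumptions, though 11434 is Ollama's default port), a quadlet unit for an Ollama server might look like:

```ini
# ~/.config/containers/systemd/ollama.container (hypothetical example)
[Unit]
Description=Ollama server

[Container]
Image=docker.io/ollama/ollama
PublishPort=11434:11434
Volume=%h/.ollama:/root/.ollama:Z

[Service]
Restart=always

[Install]
WantedBy=default.target
```

After a `systemctl --user daemon-reload`, quadlet generates `ollama.service`, which can be started with `systemctl --user start ollama`.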
Running Ollama as a client only
For this we can use the OLLAMA_HOST environment variable, as in the sketch below.
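For example (the hostname and model are placeholders), the stock ollama client honors OLLAMA_HOST and will talk to whichever server it names:

```sh
# Run a prompt against a remote or containerized server instead of a local one.
OLLAMA_HOST=ollama.example.com:11434 ollama run llama3 "Hello"
```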
Conclusion
These approaches would fit in well with bootc, Podman Desktop AI Lab, etc.
Thoughts?