Ollama integration #349
ericcurtin started this conversation in Ideas
-
Linked PR:
-
Thanks for creating this discussion :)
Would this allow for something like a chatbot with an ongoing set of requests to the server? Or would the server restart with each new prompt?
-
Another related PR:
Can confirm, Fedora 40 and Ollama work via podman-ollama using an AMD Radeon RX 7600 GPU.
-
There are a couple of ways we could tackle this.
Containerising Ollama and other AI frameworks, where possible, is advisable given their complexity as software stacks. This approach is already implicit in our strategies around tools like podman, bootc, etc.
One nice aspect of podman containers for Ollama is that we can expose specific containers to specific pieces of hardware, be it an NVIDIA GPU, an AMD GPU, a CPU, or some other component.
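As a hedged sketch of what that looks like with plain podman (the image is an example; the CDI device string assumes nvidia-container-toolkit has generated its CDI specs):

```sh
# NVIDIA GPU via CDI:
podman run --rm --device nvidia.com/gpu=all docker.io/ollama/ollama

# AMD GPU via the ROCm kernel device nodes:
podman run --rm --device /dev/kfd --device /dev/dri docker.io/ollama/ollama

# CPU only: no device flags needed.
podman run --rm docker.io/ollama/ollama
```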
This tool was written with the goal that Ollama should be boring: daemonless, rootless, serverless, portless, and ephemeral by default (though one can change those defaults if necessary).
https://github.com/ericcurtin/podman-ollama
Some use cases for this tool:
Running local Ollama LLMs
This is just a single command, as sketched below:
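The exact invocation here is an assumption based on the podman-ollama README; the prompt is a placeholder:

```sh
# One-shot prompt: spins up an ephemeral container, runs the request, exits.
podman-ollama "What is the meaning of life?"
```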
An ephemeral container is set up with a local server running, the request is executed, and the container dies.
Running Ollama as a server only
This has similar syntax to `podman generate` (since this is all a wrapper around existing tooling). We can set up quadlets with this command, which is a similar approach to the one Universal Blue took and, IMO, the most ideal approach for servers. A sketch of such a quadlet follows:
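As a rough illustration only (hand-written, not the tool's actual output; the image and volume path are assumptions, though 11434 is Ollama's default port), a quadlet unit for an Ollama server might look like:

```ini
# ~/.config/containers/systemd/ollama.container (hypothetical example)
[Unit]
Description=Ollama server

[Container]
Image=docker.io/ollama/ollama
PublishPort=11434:11434
Volume=%h/.ollama:/root/.ollama:Z

[Service]
Restart=always

[Install]
WantedBy=default.target
```

After a `systemctl --user daemon-reload`, quadlet generates `ollama.service`, which can be started with `systemctl --user start ollama`.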
Running Ollama as a client only
For this we can use the OLLAMA_HOST environment variable, as in the sketch below.
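For example (the hostname and model are placeholders), the stock ollama client honors OLLAMA_HOST and will talk to whichever server it names:

```sh
# Run a prompt against a remote or containerized server instead of a local one.
OLLAMA_HOST=ollama.example.com:11434 ollama run llama3 "Hello"
```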
Conclusion
These approaches would fit in well with bootc, Podman Desktop AI Lab, etc.
Thoughts?