
How to deploy deepseek-r1:1.5b or any LLM model using CSGHub on a Linux server for demo purposes? #919

Open
dxu104 opened this issue Jan 28, 2025 · 2 comments

Comments


dxu104 commented Jan 28, 2025

Hi Team,

I would like to deploy deepseek-r1:1.5b or any other LLM model on a Linux server using CSGHub for demo purposes.

Could you please provide a step-by-step tutorial or a video guide? A tutorial in Chinese is also acceptable.

Looking forward to your guidance!

@KinglyWayne
Member

CSGHub provides several different ways to build an LLM app.
You could use an inference endpoint to host the LLM server, or build a Gradio/Streamlit app to host a demo.
You can check our official doc for inference:
https://opencsg.com/docs/inferencefinetune/inference_finetune_intro
Currently hf-tgi/vllm/sglang are supported; llama.cpp/ollama support is planned.
If you are familiar with Gradio or Streamlit, you could try the Space function as well, for example with a small chat app like the sketch below.
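
For illustration, here is a minimal sketch of such a Gradio demo, assuming the model is already running behind an OpenAI-compatible inference endpoint. The base URL, token, and model id below are placeholders, not values from CSGHub itself:

```python
# Minimal Gradio chat demo that forwards prompts to an OpenAI-compatible
# inference endpoint. The base URL, token, and model id are placeholders:
# fill them in from your own CSGHub inference endpoint.
import gradio as gr
from openai import OpenAI

client = OpenAI(
    base_url="https://<your-endpoint>/v1",  # placeholder endpoint URL
    api_key="<your-access-token>",          # placeholder token
)

def chat(message, history):
    # With type="messages", history is already a list of
    # {"role": ..., "content": ...} dicts in OpenAI chat format.
    messages = history + [{"role": "user", "content": message}]
    resp = client.chat.completions.create(
        model="<deployed-model-id>",  # placeholder model id
        messages=messages,
    )
    return resp.choices[0].message.content

gr.ChatInterface(chat, type="messages").launch(
    server_name="0.0.0.0", server_port=7860
)
```

The same app can be pushed to a CSGHub Space, since Spaces host Gradio/Streamlit apps directly.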

@KinglyWayne
Member

For the DeepSeek R1 model, as far as I know, running R1 or R1-Zero requires at least 2×8 H100 GPUs, or more.
Personally I would suggest trying ds-distill-qwen or ds-distill-llama, which can be loaded with TGI or vLLM at a reasonable GPU requirement. For an even smaller GPU budget (such as a single GPU), you could use vLLM to serve an AWQ-quantized version. All of those functions are supported now; a rough sketch of the vLLM route is below.
If you run into any problems, feel free to comment 😄
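
As a concrete starting point, a minimal sketch of the single-GPU route using vLLM's offline Python API. The 1.5B distill checkpoint (deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B on the Hugging Face hub) fits unquantized; the AWQ option is noted in the comments:

```python
# Sketch: run a small DeepSeek distill model on a single GPU with vLLM's
# offline Python API. The 1.5B distill fits without quantization; for larger
# distills, point `model` at an AWQ checkpoint and add quantization="awq".
from vllm import LLM, SamplingParams

llm = LLM(
    model="deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B",
    gpu_memory_utilization=0.9,  # fraction of GPU memory vLLM may use
)

params = SamplingParams(temperature=0.6, max_tokens=512)
outputs = llm.generate(["Briefly explain what model distillation is."], params)
print(outputs[0].outputs[0].text)
```

For an HTTP endpoint instead of the offline API, `vllm serve deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B` exposes an OpenAI-compatible server (on port 8000 by default), which the Gradio demo above could point at.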
