New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

Sign up for GitHub

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Jump to bottom

support for llama 3 #43

Open

avianion opened this issue May 12, 2024 · 4 comments

avianion commented May 12, 2024

will this project plan to support llama 3 70b or 8b?

Owner

npuichigo commented May 12, 2024

llama3 should already be supported with template https://github.com/npuichigo/openai_trtllm/blob/main/templates/history_template_llama3.liquid. To get the model, please refer to https://github.com/NVIDIA/TensorRT-LLM/tree/main/examples/llama#llama-v3-updates

Author

avianion commented May 12, 2024

ok great. @npuichigo but what is the model name? it keeps saying to me model not found and i have tried many model names. with llama 3 70b

Owner

npuichigo commented May 13, 2024

it's ensemble if the structure looks like https://github.com/triton-inference-server/tensorrtllm_backend/tree/v0.9.0/all_models/inflight_batcher_llm

Author

avianion commented May 15, 2024

Should skip_special_tokens be True or False? and same with add_special_tokens in the preprocessing config.pbtxt?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment