
Issues: vllm-project/vllm

[Roadmap] vLLM Roadmap Q4 2024
#9006 opened Oct 1, 2024 by simon-mo
Open 19
vLLM's V1 Engine Architecture
#8779 opened Sep 24, 2024 by simon-mo
Open 9

Issues list

[Bug]: Granite 3.0 disconnect between parser and example template (labels: bug)
#10379 opened Nov 15, 2024 by wilbry
1 task done
[Feature]: NVIDIA Triton GenAI Perf Benchmark (labels: feature request, good first issue, help wanted)
#10377 opened Nov 15, 2024 by simon-mo
1 task done
[Bug]: Guided Decoding Broken in Streaming mode (labels: bug)
#10376 opened Nov 15, 2024 by JC1DA
1 task done
[Bug]: Torch profiling does not stop and cannot get traces for all workers (labels: bug)
#10365 opened Nov 15, 2024 by ruisearch42
1 task done
[Bug]: contine generation but do not return the output (labels: bug)
#10359 opened Nov 15, 2024 by siyuyuan
1 task done
[Usage]: cuda oom when serving multi task on same server (labels: usage)
#10345 opened Nov 15, 2024 by reneix
1 task done
[Misc]: Snowflake Arctic out of memory error with TP-8 (labels: bug)
#10344 opened Nov 14, 2024 by rajagond
1 task done
[Bug]: Out of Memory (OOM) Issues During MMLU Evaluation with lm_eval (labels: bug)
#10325 opened Nov 14, 2024 by wchen61
1 task done
[Installation]: Request to include vllm==0.6.2 for cuda 11.8 (labels: installation)
#10319 opened Nov 14, 2024 by amew0
1 task done
[Bug]: FusedMoE kernel performance depends on input prompt length while decoding (labels: bug)
#10313 opened Nov 14, 2024 by taegeonum
1 task done
[Usage]: how to use vllm to output code only (labels: usage)
#10309 opened Nov 14, 2024 by shaoyuyoung
1 task done
[Installation]: Build vllm environment error (labels: installation)
#10303 opened Nov 13, 2024 by Kawai1Ace
1 task done
[Bug]: undefined symbol: __nvJitLinkComplete_12_4, version libnvJitLink.so.12 (labels: bug)
#10300 opened Nov 13, 2024 by yananchen1989
1 task done
[Bug]: VLLLm crash when running Qwen/Qwen2.5-Coder-32B-Instruct on two H100 GPUs (labels: bug)
#10296 opened Nov 13, 2024 by noamwies
1 task done
[Bug]: Can't use yarn rope config for long context in Qwen2 model (labels: bug)
#10293 opened Nov 13, 2024 by FlyCarrot
1 task done