Pull requests: vllm-project/vllm

Pull requests list

[Docs] Update spec decode + structured output in compat matrix (labels: documentation, ready)
#12373 opened Jan 23, 2025 by russellb
[V1] Increase default batch size for H100/H200 (labels: ready)
#12369 opened Jan 23, 2025 by WoosukKwon
[Core] add and implement VLLM_LOGITS_PROCESSOR_THREADS
#12368 opened Jan 23, 2025 by akeshet
Update compressed-tensors version (labels: ci/build, ready)
#12367 opened Jan 23, 2025 by dsikka
Set weights_only=True when using torch.load() (labels: ready)
#12366 opened Jan 23, 2025 by russellb
[Misc] Enable proxy support in benchmark script
#12356 opened Jan 23, 2025 by jsato8094
[Misc] Add FA2 support to ViT MHA layer
#12355 opened Jan 23, 2025 by Isotr0py
[Bugfix] Path join when building local path for S3 clone (labels: ready)
#12353 opened Jan 23, 2025 by omer-dayan
Flops
#12341 opened Jan 23, 2025 by dianastea (Draft)
[Core] Optimizing cross-attention QKVParallelLinear computation
#12325 opened Jan 22, 2025 by NickLucche
[V1][Frontend] Coalesce bunched RequestOutputs (labels: frontend, performance, ready)
#12298 opened Jan 22, 2025 by njhill
[Core] tokens in queue metric
#12286 opened Jan 21, 2025 by annapendleton