Pull requests: vllm-project/vllm

Pull requests list

[Docs] Update spec decode + structured output in compat matrix (labels: documentation, ready)
#12373 opened Jan 23, 2025 by russellb
[V1] Increase default batch size for H100/H200 (labels: ready)
#12369 opened Jan 23, 2025 by WoosukKwon
[Core] add and implement VLLM_LOGITS_PROCESSOR_THREADS
#12368 opened Jan 23, 2025 by akeshet
Update compressed-tensors version (labels: ci/build, ready)
#12367 opened Jan 23, 2025 by dsikka
Set weights_only=True when using torch.load() (labels: ready)
#12366 opened Jan 23, 2025 by russellb
[Misc] Enable proxy support in benchmark script
#12356 opened Jan 23, 2025 by jsato8094
[Misc] Add FA2 support to ViT MHA layer
#12355 opened Jan 23, 2025 by Isotr0py
[Bugfix] Path join when building local path for S3 clone (labels: ready)
#12353 opened Jan 23, 2025 by omer-dayan
Flops
#12341 opened Jan 23, 2025 by dianastea (Draft)
[Core] Optimizing cross-attention QKVParallelLinear computation
#12325 opened Jan 22, 2025 by NickLucche
[V1][Frontend] Coalesce bunched RequestOutputs (labels: frontend, performance, ready)
#12298 opened Jan 22, 2025 by njhill
[Core] tokens in queue metric
#12286 opened Jan 21, 2025 by annapendleton