Pull requests: vllm-project/vllm
[Docs] Update spec decode + structured output in compat matrix
documentation
Improvements or additions to documentation
ready
ONLY add when PR is ready to merge/full CI is needed
#12373
opened Jan 23, 2025 by
russellb
[V1] Increase default batch size for H100/H200
ready
#12369
opened Jan 23, 2025 by
WoosukKwon
[Core] add and implement VLLM_LOGITS_PROCESSOR_THREADS
#12368
opened Jan 23, 2025 by
akeshet
Update compressed-tensors version
ci/build
ready
#12367
opened Jan 23, 2025 by
dsikka
Set weights_only=True when using torch.load()
ready
#12366
opened Jan 23, 2025 by
russellb
[Hardware][Intel-Gaudi] Enable FusedSDPA support for Intel Gaudi (HPU)
#12359
opened Jan 23, 2025 by
SanjuCSudhakaran
Draft
[Bugfix] Path join when building local path for S3 clone
ready
#12353
opened Jan 23, 2025 by
omer-dayan
[Bugfix] handle alignment of arguments in convert_sparse_cross_attention_mask_to_dense
#12347
opened Jan 23, 2025 by
tjohnson31415
[Build] Only build 9.0a for scaled_mm and sparse kernels
ci/build
#12339
opened Jan 23, 2025 by
LucasWilkinson
[Core] Optimizing cross-attention QKVParallelLinear computation
#12325
opened Jan 22, 2025 by
NickLucche
2 tasks
[Hardware][Gaudi][Bugfix] Fix error for guided decoding
#12317
opened Jan 22, 2025 by
zhouyu5
[do-not-merge][perf-benchmark] cleanup unused docker images/containers
ci/build
perf-benchmarks
#12306
opened Jan 22, 2025 by
khluu
[Feature][Spec Decode] Simplify the use of Eagle Spec Decode
#12304
opened Jan 22, 2025 by
ShangmingCai
[Hardware][Gaudi][Feature] Enable Dynamic MoE for Mixtral
#12303
opened Jan 22, 2025 by
zhenwei-intel
[Core] Make disaggregated prefill compatible with pipeline parallelism
#12301
opened Jan 22, 2025 by
YuhanLiu11
[V1][Frontend] Coalesce bunched RequestOutputs
ready
frontend
performance
Performance-related issues
#12298
opened Jan 22, 2025 by
njhill
[Kernel] Pipe attn_logits_soft_cap through paged attention TPU kernels
ci/build
#12294
opened Jan 22, 2025 by
fenghuizhang
[Core] Prefill Only Tokens Without KV Cache in Batch Requests (Disagg Prefill)
#12285
opened Jan 21, 2025 by
Shaoting-Feng
[CI/Build] Add label automation for structured-output / speculative-decoding
ci/build
#12280
opened Jan 21, 2025 by
russellb