Pull requests: vllm-project/vllm
[Docker] Non-root support for vllm-openai; add opt-in vllm-openai-nonroot target
ci/build
documentation
Improvements or additions to documentation
#40275
opened Apr 19, 2026 by
TheDuyIT
3 of 4 tasks
[Doc] Fix broken "full example" links in structured_outputs.md
documentation
Improvements or additions to documentation
structured-output
#40274
opened Apr 19, 2026 by
FrimpsManu
docs: expand load_weights guide with AutoWeightsLoader and manual patterns
documentation
Improvements or additions to documentation
#40271
opened Apr 19, 2026 by
RudrenduPaul
3 tasks
[Bugfix][Spec Decode] Wire draft_probs into probabilistic draft_model rejection
bug
Something isn't working
speculative-decoding
v1
#40269
opened Apr 19, 2026 by
bedeks
4 tasks
refactor: make is_neox_style configurable for glm
deepseek
Related to DeepSeek models
#40267
opened Apr 19, 2026 by
inisis
4 tasks
[Doc] Fix typos in token_embed pooling documentation
documentation
Improvements or additions to documentation
#40266
opened Apr 19, 2026 by
YifanLi3
Fix tp device index overflow
ci/build
cpu
Related to CPU backends
documentation
Improvements or additions to documentation
frontend
kv-connector
llama
Related to Llama models
multi-modality
Related to multi-modality (#4194)
needs-rebase
new-model
Requests to new models
nvidia
qwen
Related to Qwen models
speculative-decoding
tool-calling
v1
#40265
opened Apr 19, 2026 by
kari-smasint
2 tasks
[ROCm] Profiler api support for ROCm MORI toy proxy server in PD Disaggregation
documentation
Improvements or additions to documentation
kv-connector
rocm
Related to AMD ROCm
#40264
opened Apr 19, 2026 by
itej89
Contributor
3 of 4 tasks
[Doc] Fix CLI help examples: remove phantom --help=listgroup and --help=page modes
documentation
Improvements or additions to documentation
#40262
opened Apr 18, 2026 by
avasis-ai
2 tasks done
[Bugfix] Corrects estimate of torch memory use causing OOM due to incorrect KV cache space estimation when sleep mode on (Fixes #40256)
bug
Something isn't working
#40258
opened Apr 18, 2026 by
djparente
3 of 4 tasks
Fix view shape assignment for chan_scales
#40257
opened Apr 18, 2026 by
zjysteven
3 of 4 tasks
fix(libtorch_stable): guard mxfp4 ops behind ENABLE_NVFP4_SM100
#40255
opened Apr 18, 2026 by
svilendotorg
[ROCm] Add missing gfx1152, gfx1153, and enable all gpu arch to AITER in docker
ci/build
documentation
Improvements or additions to documentation
rocm
Related to AMD ROCm
#40254
opened Apr 18, 2026 by
thelittlefireman
[Bugfix] Stream MiniMax M2 tool-call deltas incrementally
bug
Something isn't working
frontend
tool-calling
#40253
opened Apr 18, 2026 by
wuyingjun-lucky
[Bugfix] Forward mm_processor_kwargs in offline generate APIs
bug
Something isn't working
frontend
#40251
opened Apr 18, 2026 by
wuyingjun-lucky
fix: remove redundant None default in dict.get() calls
documentation
Improvements or additions to documentation
kv-connector
#40250
opened Apr 18, 2026 by
hummbl-dev
[torch.compile] refactor config hashing through compile_factors and normalization
llama
Related to Llama models
v1
#40246
opened Apr 18, 2026 by
WorldExplored
Contributor
[Qwen][Bugfix] Fixes sigmoid activation in torch impl of RMSNormGated.
bug
Something isn't working
qwen
Related to Qwen models
ready
ONLY add when PR is ready to merge/full CI is needed
#40245
opened Apr 18, 2026 by
sighingnow
Collaborator
[Bugfix] GLM tool parser: fix streaming corruption for Optional[str]/array args
bug
Something isn't working
tool-calling
#40197
opened Apr 18, 2026 by
kulpsin
3 of 4 tasks
[Bugfix] Make Attention Backend Auto-Selection Batch-Invariance-Aware
bug
Something isn't working
v1
#40193
opened Apr 18, 2026 by
WorldExplored
Contributor
[Frontend] Add defer_loading and tool_reference support for Anthropic and OpenAI APIs
frontend
ready
ONLY add when PR is ready to merge/full CI is needed
#40190
opened Apr 18, 2026 by
JaredforReal
Contributor
4 tasks
[Distributed] Add MSCCL++ allreduce support for multi-node communication
nvidia
v1
#40188
opened Apr 18, 2026 by
AO114
[torch.compile] Remove layer name from unified_kv_cache_update / unified_mla_kv_cache_update to fix cold-start (#33267)
#40187
opened Apr 18, 2026 by
Doondi-Ashlesh
2 of 4 tasks