Pull requests: vllm-project/vllm

Pull requests list

[Docker] Non-root support for vllm-openai; add opt-in vllm-openai-nonroot target
Labels: ci/build, documentation
#40275 opened Apr 19, 2026 by TheDuyIT
3 of 4 tasks
[Doc] Fix broken "full example" links in structured_outputs.md
Labels: documentation, structured-output
#40274 opened Apr 19, 2026 by FrimpsManu
Fix MoE backend selection for LoRA (unquantized MoE)
#40273 opened Apr 19, 2026 by danisereb Contributor Draft
4 tasks
docs: expand load_weights guide with AutoWeightsLoader and manual patterns
Labels: documentation
#40271 opened Apr 19, 2026 by RudrenduPaul
3 tasks
【Feat】GPU KV Cache Pluggable Eviction Policy v1
#40270 opened Apr 19, 2026 by sjmshsh
refactor: make is_neox_style configurable for glm
Labels: deepseek
#40267 opened Apr 19, 2026 by inisis
4 tasks
[Doc] Fix typos in token_embed pooling documentation
Labels: documentation
#40266 opened Apr 19, 2026 by YifanLi3
Fix tp device index overflow
Labels: ci/build, cpu, documentation, frontend, kv-connector, llama, multi-modality, needs-rebase, new-model, nvidia, qwen, speculative-decoding, tool-calling, v1
#40265 opened Apr 19, 2026 by kari-smasint
2 tasks
[ROCm] Profiler API support for ROCm MORI toy proxy server in PD Disaggregation
Labels: documentation, kv-connector, rocm
#40264 opened Apr 19, 2026 by itej89 Contributor
3 of 4 tasks
[Doc] Fix CLI help examples: remove phantom --help=listgroup and --help=page modes
Labels: documentation
#40262 opened Apr 18, 2026 by avasis-ai
2 tasks done
Fix view shape assignment for chan_scales
#40257 opened Apr 18, 2026 by zjysteven
3 of 4 tasks
[ROCm] Add missing gfx1152, gfx1153, and enable all gpu arch to AITER in docker
Labels: ci/build, documentation, rocm
#40254 opened Apr 18, 2026 by thelittlefireman
fix: remove redundant None default in dict.get() calls
Labels: documentation, kv-connector
#40250 opened Apr 18, 2026 by hummbl-dev
[torch.compile] refactor config hashing through compile_factors and normalization
Labels: llama, v1
#40246 opened Apr 18, 2026 by WorldExplored Contributor
[Qwen][Bugfix] Fixes sigmoid activation in torch impl of RMSNormGated.
Labels: bug, qwen, ready
#40245 opened Apr 18, 2026 by sighingnow Collaborator
[Bugfix] GLM tool parser: fix streaming corruption for Optional[str]/array args
Labels: bug, tool-calling
#40197 opened Apr 18, 2026 by kulpsin
3 of 4 tasks
[Bugfix] Make Attention Backend Auto-Selection Batch-Invariance-Aware
Labels: bug, v1
#40193 opened Apr 18, 2026 by WorldExplored Contributor
[Frontend] Add defer_loading and tool_reference support for Anthropic and OpenAI APIs
Labels: frontend, ready
#40190 opened Apr 18, 2026 by JaredforReal Contributor
4 tasks