Pull requests: vllm-project/vllm
Pull requests list
[Bugfix][ROCm] Fix for warp_size uses on host (labels: rocm)
#21205 opened Jul 18, 2025 by gshtras
Add VLLM_DISTRIBUTED_INIT_TIMEOUT_SECONDS to set torch.distributed timeouts
#21203 opened Jul 18, 2025 by tlrmchlsmth
[Bugfix] Correct input_len/prefix_len for RandomDataset in benchmarking (labels: performance)
#21202 opened Jul 18, 2025 by ericehanley
1 task
ci: Add CUDA + arm64 release builds (labels: ci/build)
#21201 opened Jul 18, 2025 by seemethere
2 of 4 tasks
[BugFix][CPU] Fix `TorchSDPABackendImpl` doesn't have `use_irope` (labels: ready, v1)
[Compilation fix] add stubs to allow compilation without sm100
#21198 opened Jul 18, 2025 by mickaelseznec
4 tasks
[Kernel] Enable Hybrid Model Support in Triton Unified Attention Kernel (labels: v1)
#21197 opened Jul 18, 2025 by jvlunteren
[BugFix] Fix potential cuda-graph IMA (labels: bug, ready, v1)
[CI/Build] fix cpu_extension for apple silicon (labels: ci/build)
#21195 opened Jul 18, 2025 by ignaciosica
[V1] [Hybrid] Enable piecewise CUDA Graph for mamba layers (labels: v1)
#21194 opened Jul 18, 2025 by tdoublep
5 of 6 tasks
[Kernel][Performance] Tweak MoE Batched silu_mul_fp8_quant_deep_gemm kernel (labels: ready)
#21193 opened Jul 18, 2025 by varun-sundar-rabindranath
[Docs] Update Tensorizer usage documentation (labels: documentation)
#21190 opened Jul 18, 2025 by sangstar
[Attention] Clean up iRoPE in V1 (labels: ready, tpu, v1)
[Bug] DeepGemm: Fix TypeError: per_block_cast_to_fp8() missing 1 required positional argument: 'use_ue8m0' for SM100 (labels: bug, ready)
#21187 opened Jul 18, 2025 by yewentao256
[Bugfix][Model] Fix LoRA for Mistral-Small-3.1-24B-Instruct-2503 (labels: bug, multi-modality, ready)
[Bugfix] V1 Fix the cursor leakage issue during request scheduling. (labels: v1)
#21173 opened Jul 18, 2025 by CLFutureX
[Bugfix] Fixed the missing metrics in output (labels: frontend, v1)
#21171 opened Jul 18, 2025 by hsliuustc
3 of 4 tasks
[V0 deprecation] Remove long context LoRA (labels: ready, tpu)
#21169 opened Jul 18, 2025 by jeejeelee
4 tasks
[Feature][EPLB] Add support for unquantized models
#21168 opened Jul 18, 2025 by hsliuustc
3 of 4 tasks
[Bugfix] Mistral crashes on tool with no description
#21167 opened Jul 18, 2025 by HugoMichard
[Feature][OCP MX] Support mxfp6 and mixed mxfp6-mxfp4
#21166 opened Jul 18, 2025 by fxmarty-amd
2 tasks
[feat] move WEIGHT_SCALE_SUPPORTED into raise block
#21164 opened Jul 18, 2025 by weixiao-huang