Pull requests: NVIDIA/TensorRT-LLM
Pull requests list
[None][fix] Recover TRTLLM MoE Perf for DEP; autotuner cache alignment
#9562 opened Nov 30, 2025 by rosenrodt
1 task done
[None][fix] Let KV cache manager use single stream for cache block transfer (onboard/offload)
#9560 opened Nov 30, 2025 by eopXD
1 task done
[None][fix] Set symbols default visibility to hidden to avoid symbol collision
#9557 opened Nov 30, 2025 by yihwang-nv • Draft
[#8733][feat] Add Llama4 MoE handling to AutoDeploy
#9556 opened Nov 30, 2025 by tcherckez-nvidia
1 task done
[None][chore] Defer exposing context parallel configs
#9552 opened Nov 30, 2025 by brb-nv
1 task done
[#9550][feat] Add NVFP4 Cutlass MoE kernels for AutoDeploy
#9551 opened Nov 30, 2025 by nzmora-nvidia
1 task done
[https://nvbugs/5651854][fix] Fix dist-serving perf by clearing CPU affinity
#9549 opened Nov 29, 2025 by Shixiaowei02
[TRTLLM-9075][doc] Refine the Slurm examples
#9548 opened Nov 29, 2025 by Superjomn
1 task done
[TRTLLM-9488][feat] Use FlashInfer.sampling by default
#9545 opened Nov 28, 2025 by ixlmar
1 task done
[None][feat] Add chat template kwargs support to longbench-v2
#9544 opened Nov 28, 2025 by lfr-0531
1 task done
[TRTLLM-6222][feat] Extend cute_dsl_nvfp4_gemm to sm103.
#9543 opened Nov 28, 2025 by limin2021
1 task done
[None][fix] Skip Allreduce init for Attention DP • Release Blocker
#9542 opened Nov 28, 2025 by syuoni
1 task done
[None][fix] Option #2 Introduce inline namespace to avoid symbol collision
#9541 opened Nov 28, 2025 by yihwang-nv • Draft
[None][feat] Update Qwen3CodeToolParser to align tool-calling parameters
#9540 opened Nov 28, 2025 by Wanli-Jiang
1 task done
[https://nvbugs/5651854][fix] revert #8805 to fix disagg perf issue
#9536 opened Nov 28, 2025 by reasonsolo
1 task done
[TRTLLM-9391][chore] Automatically estimate required workspace.
#9535 opened Nov 28, 2025 by bobboli
1 task
[None][fix] Add a timeout in MNNVL throughput to prevent hangs if one rank crashes
#9532 opened Nov 28, 2025 by djns99
1 task done
[https://nvbugs/5690172][fix] Fix Qwen3-235B ATP accuracy issue with PDL
#9530 opened Nov 28, 2025 by syuoni
1 task done
[#9150][feat] AutoDeploy: reviewer comments for #9150
#9527 opened Nov 27, 2025 by lucaslie
1 task
[TRTLLM-9242][doc] Add examples showcasing OpenAI-compatible APIs
#9520 opened Nov 27, 2025 by JunyiXu-nv
1 task done
[https://nvbugs/5652062][fix] Rectify the checking rule for finishing a request
#9516 opened Nov 27, 2025 by ziyixiong-nv
1 task