Pull requests: NVIDIA/TensorRT-LLM
Pull requests list
[None][feat] Add RocketKV usage doc and e2e accuracy test on LongBenchV2
#9572 opened Dec 1, 2025 by heyuhhh
1 task
[None][fix] Replace hash method with unique_id for cutedsl MoE runners.
#9569 opened Dec 1, 2025 by hyukn
1 task done
[None][fix] Recover TRTLLM MoE Perf for DEP; autotuner cache alignment
#9562 opened Nov 30, 2025 by rosenrodt
1 task done
[None][fix] Let KV cache manager use single stream for cache block transfer (onboard/offload)
#9560 opened Nov 30, 2025 by eopXD
1 task done
[None][fix] Set symbols default visibility to hidden to avoid symbol collision
#9557 opened Nov 30, 2025 by yihwang-nv • Draft
[#8733][feat] Add Llama4 MoE handling to AutoDeploy
#9556 opened Nov 30, 2025 by tcherckez-nvidia
1 task done
[None][chore] Defer exposing context parallel configs
#9552 opened Nov 30, 2025 by brb-nv
1 task done
[#9550][feat] Add NVFP4 Cutlass MoE kernels for AutoDeploy
#9551 opened Nov 30, 2025 by nzmora-nvidia
1 task done
[https://nvbugs/5651854][fix] Fix dist-serving perf by clearing CPU affinity
#9549 opened Nov 29, 2025 by Shixiaowei02
[TRTLLM-9488][feat] use FlashInfer.sampling by default
#9545 opened Nov 28, 2025 by ixlmar
1 task done
[None][feat] add chat template kwargs support to longbench-v2
#9544 opened Nov 28, 2025 by lfr-0531
1 task done
[None][fix] Skip Allreduce init for Attention DP (label: Release Blocker)
#9542 opened Nov 28, 2025 by syuoni
1 task done
[None][fix] Option #2 Introduce inline namespace to avoid symbol collision
#9541 opened Nov 28, 2025 by yihwang-nv • Draft
[None][feat] Update Qwen3CodeToolParser to align tool-calling parameters
#9540 opened Nov 28, 2025 by Wanli-Jiang
1 task done
[TRTLLM-9391][chore] Automatically estimate required workspace.
#9535 opened Nov 28, 2025 by bobboli
1 task
[None][fix] Add a timeout in MNNVL throughput to prevent hangs if one rank crashes
#9532 opened Nov 28, 2025 by djns99
1 task done
[#9150][feat] AutoDeploy: reviewer comments for #9150
#9527 opened Nov 27, 2025 by lucaslie
1 task
[TRTLLM-9242][doc] Add examples showcasing openai compatible APIs
#9520 opened Nov 27, 2025 by JunyiXu-nv
1 task done