Pull requests: NVIDIA/TransformerEngine
Pull requests list
[DO NOT MERGE] Get seqlens and offsets in O(N) space instead of O(N*N) space (labels: do not merge)
#2530 opened Dec 17, 2025 by KshitijLakhani • Draft
13 tasks
[JAX] Fix incorrect calculation of segment pos from segment ids (labels: attention, bug, jax)
#2523 opened Dec 16, 2025 by KshitijLakhani
5 of 13 tasks
[JAX] Calculate seqlens and offsets in O(N) space instead of O(N*N) space for THD sequences (labels: attention)
#2522 opened Dec 16, 2025 by KshitijLakhani • Draft
13 tasks
Documentation for cpu offloading (labels: documentation)
#2520 opened Dec 16, 2025 by pggPL
8 of 13 tasks
[DO NOT MERGE] Testing v2.6 + pr2201 (labels: attention)
#2513 opened Dec 12, 2025 by KshitijLakhani • Draft
13 tasks
[common] Add support for cuBLASLt GEMM for GroupedTensor MoE
#2502 opened Dec 10, 2025 by pggPL
8 tasks done
Add logic for block-scaled tensors with GEMM swizzled scales (labels: enhancement, MoE, performance, refactor)
#2486 opened Dec 6, 2025 by timmoon10
14 of 19 tasks
[JAX] Remove unused TE DPA module dtype which fixes cuDNN backend detection to properly use input dtypes (labels: attention, jax)
#2485 opened Dec 5, 2025 by jberchtold-nvidia
8 of 13 tasks
[JAX] Estimate post-RHT amax using regular amax (labels: fp4)
#2479 opened Dec 4, 2025 by jberchtold-nvidia • Draft
13 tasks
Add support for SWA (left, right) with FusedAttention (labels: 2.12.0)
#2477 opened Dec 4, 2025 by sudhakarsingh27
22 of 28 tasks
[PyTorch] Documentation for op fuser API (labels: documentation)
#2447 opened Dec 3, 2025 by timmoon10
8 of 13 tasks
Fix transformer 2.9.0 (torch 2.9.1 used by SGLang 0.5.5) build
#2445 opened Dec 2, 2025 by yiakwy-xpu-ml-framework-team
13 tasks
[Common] Comm+GEMM overlap API updated to support cuBlasMp backend (incl. framework API)
#2443 opened Dec 2, 2025 by denera
5 of 13 tasks
[JAX] Better error message when Q, K, V are sharded differently (labels: attention, jax)
#2440 opened Dec 2, 2025 by jberchtold-nvidia
8 of 13 tasks