Pull requests: pytorch/FBGEMM
Pull requests list
Build and test script updates for OSS (labels: cla signed, fb-exported, meta-exported)
#5252 opened Dec 17, 2025 by q10
Move the prefetched info to preallocated buffers (labels: cla signed, fb-exported, meta-exported)
#5251 opened Dec 17, 2025 by chouxi
Enable direct MX4→BF16 dequantization to reduce memory (python side) (2/2) (labels: cla signed, fb-exported, meta-exported)
#5250 opened Dec 17, 2025 by armandsauzay
Add aarch64 intrinsic-based dequantization to autovec routine (labels: cla signed, fb-exported, meta-exported)
#5249 opened Dec 17, 2025 by Nicoshev
Unify output tensor layout for split-K decode kernel (labels: cla signed, fb-exported, meta-exported)
#5248 opened Dec 17, 2025 by Aya-ZIbra
Choose _autovec version of GenerateEmbeddingSpMDMRowWiseSparse on AArch64 (labels: cla signed, fb-exported, meta-exported)
#5247 opened Dec 17, 2025 by MatzeB
Specialize more cases to improve EmbeddingSpMDMNBitBenchmark (labels: cla signed, fb-exported, meta-exported)
#5245 opened Dec 17, 2025 by MatzeB
Add EmbeddingSpMDMNBitRowWiseSparse autovectorized variant (labels: cla signed, fb-exported, meta-exported)
#5244 opened Dec 17, 2025 by MatzeB
Add canonical setup and build script for OSS (labels: cla signed, fb-exported, meta-exported)
#5239 opened Dec 17, 2025 by q10
Fix FBGEMM CI about blackwell attention tests (labels: cla signed, fb-exported, meta-exported)
#5238 opened Dec 17, 2025 by Alkaid-Benetnash
reorganize blackwell_fmha_test.py a bit (labels: cla signed, fb-exported, meta-exported)
#5237 opened Dec 17, 2025 by henrylhtsang
Optimize group_index_select_or_add_2d_kernel on ROCm by adding a separate codepath for small embedding dimensions (labels: cla signed, module: rocm)
#5233 opened Dec 16, 2025 by aryaman-gupta
Port part of cutlass decode PR to fix static assertion w/ TileShape<64, 256, 128> (labels: cla signed, fb-exported, meta-exported)
#5232 opened Dec 16, 2025 by Alkaid-Benetnash
support object cache in ssd l2 cache and add more unit tests (labels: cla signed, fb-exported, meta-exported)
#5228 opened Dec 16, 2025 by zhaojuanmao
Optimizing 4-bit dequant to FP32 on AArch64 using vectorized intrinsics in EmbeddingSpMDMAutovec (labels: cla signed)
#5224 opened Dec 15, 2025 by marma01
Upgrade GitHub Actions for Node 24 compatibility (labels: cla signed, module: rocm)
#5222 opened Dec 13, 2025 by salmanmkc
Change to TORCH_CHECK_VALUE for sparse ops (labels: cla signed, fb-exported, meta-exported)
#5215 opened Dec 11, 2025 by spcyppt
Tune max segment length per cta in triton table batched embeddings, and expose the param via cli (labels: cla signed, fb-exported, meta-exported)
#5212 opened Dec 10, 2025 by OmarPavel
Update heuristic to support variant batch sizes (labels: cla signed, fb-exported, meta-exported)
#5211 opened Dec 10, 2025 by zjing14
Use H100 runners for OSS CI (labels: cla signed, fb-exported, meta-exported)
#5205 opened Dec 9, 2025 by q10