Pull requests: pytorch/FBGEMM
Pull requests list
Update OSS build script to support AMD and CPU variants
#5257 opened Dec 18, 2025 by q10 · Labels: cla signed, fb-exported, meta-exported
Optimize benchmark index generation with std::sample()
#5254 opened Dec 17, 2025 by terdogan · Labels: cla signed, fb-exported, meta-exported
Remove unused dedup_map and associated includes from benchmarks
#5253 opened Dec 17, 2025 by terdogan · Labels: cla signed, fb-exported, meta-exported
Move the prefetched info to preallocated buffers
#5251 opened Dec 17, 2025 by chouxi · Labels: cla signed, fb-exported, meta-exported
Enable direct MX4→BF16 dequantization to reduce memory (python side) (2/2)
#5250 opened Dec 17, 2025 by armandsauzay · Labels: cla signed, fb-exported, meta-exported
Add aarch64 intrinsic-based dequantization to autovec routine
#5249 opened Dec 17, 2025 by Nicoshev · Labels: cla signed, fb-exported, meta-exported
Choose _autovec version of GenerateEmbeddingSpMDMRowWiseSparse on AArch64
#5247 opened Dec 17, 2025 by MatzeB · Labels: cla signed, fb-exported, meta-exported
Specialize more cases to improve EmbeddingSpMDMNBitBenchmark
#5245 opened Dec 17, 2025 by MatzeB · Labels: cla signed, fb-exported, meta-exported
Add EmbeddingSpMDMNBitRowWiseSparse autovectorized variant
#5244 opened Dec 17, 2025 by MatzeB · Labels: cla signed, fb-exported, meta-exported
Optimize group_index_select_or_add_2d_kernel on ROCm by adding a separate codepath for small embedding dimensions
#5233 opened Dec 16, 2025 by aryaman-gupta · Labels: cla signed, module: rocm
support object cache in ssd l2 cache and add more unit tests
#5228 opened Dec 16, 2025 by zhaojuanmao · Labels: cla signed, fb-exported, meta-exported
Optimizing 4-bit dequant to FP32 on AArch64 using vectorized intrinsics in EmbeddingSpMDMAutovec
#5224 opened Dec 15, 2025 by marma01 · Labels: cla signed
Upgrade GitHub Actions for Node 24 compatibility
#5222 opened Dec 13, 2025 by salmanmkc · Labels: cla signed, module: rocm
Change to TORCH_CHECK_VALUE for sparse ops
#5215 opened Dec 11, 2025 by spcyppt · Labels: cla signed, fb-exported, meta-exported
Tune max segment length per cta in triton table batched embeddings, and expose the param via cli
#5212 opened Dec 10, 2025 by OmarPavel · Labels: cla signed, fb-exported, meta-exported
Update heuristic to support variant batch sizes
#5211 opened Dec 10, 2025 by zjing14 · Labels: cla signed, fb-exported, meta-exported
Use H100 runners for OSS CI
#5205 opened Dec 9, 2025 by q10 · Labels: cla signed, fb-exported, meta-exported
Modifying clear_all_staged_data to accomadate KV Tensor Deletion
#5202 opened Dec 9, 2025 by Raahul46 · Labels: cla signed, fb-exported, meta-exported
creating delete_rocksdb_checkpoint_dir function under KV Tensor
#5201 opened Dec 9, 2025 by Raahul46 · Labels: cla signed, fb-exported, meta-exported
Adding returnKVTensorMetaData flag to Staging Read Strategy
#5200 opened Dec 9, 2025 by Raahul46 · Labels: cla signed, fb-exported, meta-exported