pytorch/FBGEMM

FB (Facebook) + GEMM (General Matrix-Matrix Multiplication) - https://code.fb.com/ml-applications/fbgemm/


The FBGEMM Project is a repository of highly-optimized kernels used across deep learning applications.

The codebase is organized and published as three related packages: FBGEMM, FBGEMM-GPU, and FBGEMM-GenAI. Each package has its own set of features and documentation.

Project Overview

  • FBGEMM: A low-precision, high-performance matrix multiplication and convolution library for server-side inference. The documentation below provides an overview of FBGEMM, including its features, documentation, and community resources.

  • FBGEMM_GPU: A collection of PyTorch GPU operator libraries built on top of FBGEMM for training and inference, with a focus on recommendation systems applications (a usage sketch follows this list). Please see the documentation for more information.

  • FBGEMM_GPU GenAI: A collection of PyTorch GPU operator libraries that are designed for generative AI applications, such as FP8 row-wise quantization and collective communications. Please see the documentation for more information.
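As a rough, hedged sketch of how the FBGEMM_GPU operators are typically invoked from PyTorch: importing fbgemm_gpu registers operators under torch.ops.fbgemm. The jagged_to_padded_dense call below and its argument order are assumptions based on the FBGEMM_GPU documentation and may differ across versions, so check the installed build.

    import torch
    import fbgemm_gpu  # importing the package registers ops under torch.ops.fbgemm

    # NOTE: operator name and argument order are assumptions; verify against the
    # installed fbgemm_gpu version.
    values = torch.randn(6, 4)             # 6 rows of 4-dim features
    offsets = torch.tensor([0, 2, 3, 6])   # 3 variable-length sequences of 2, 1, 3 rows
    padded = torch.ops.fbgemm.jagged_to_padded_dense(values, [offsets], [3], 0.0)
    print(padded.shape)                    # expected: torch.Size([3, 3, 4])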

FBGEMM


FBGEMM (Facebook GEneral Matrix Multiplication) is a low-precision, high-performance matrix-matrix multiplication and convolution library for server-side inference.

The library provides efficient low-precision general matrix multiplication for small batch sizes and support for accuracy-loss minimizing techniques such as row-wise quantization and outlier-aware quantization. FBGEMM also exploits fusion opportunities in order to overcome the unique challenges of matrix multiplication at lower precision with bandwidth-bound operations.
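To make the row-wise quantization idea concrete, here is a minimal, illustrative sketch (not FBGEMM's internal code): each row of a float matrix gets its own int8 scale derived from its largest-magnitude entry, which limits the quantization error when row magnitudes vary widely. The helper names below are hypothetical.

    import torch

    def rowwise_quantize_int8(x):
        # Hypothetical helper: one scale per row, mapping each row's largest
        # absolute value to 127 (symmetric int8 quantization).
        max_abs = x.abs().amax(dim=1, keepdim=True).clamp(min=1e-8)
        scale = max_abs / 127.0
        q = torch.clamp(torch.round(x / scale), -128, 127).to(torch.int8)
        return q, scale.squeeze(1)

    def rowwise_dequantize(q, scale):
        # Approximate reconstruction of the original float values.
        return q.to(torch.float32) * scale.unsqueeze(1)

    x = torch.randn(4, 16)
    q, scale = rowwise_quantize_int8(x)
    err = (x - rowwise_dequantize(q, scale)).abs().max()
    print(err)  # small per-row quantization error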

FBGEMM is used as a backend of PyTorch quantized operators for x86 machines:
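A minimal sketch of selecting this backend and quantizing a small model follows; it uses standard PyTorch quantization APIs, and the tiny model is only illustrative.

    import torch

    # Select FBGEMM as PyTorch's quantized engine (the x86 server backend).
    torch.backends.quantized.engine = 'fbgemm'

    # Dynamic quantization: Linear weights are stored as int8, activations are
    # quantized on the fly at inference time.
    float_model = torch.nn.Sequential(torch.nn.Linear(16, 8), torch.nn.ReLU()).eval()
    quantized_model = torch.ao.quantization.quantize_dynamic(
        float_model, {torch.nn.Linear}, dtype=torch.qint8
    )
    print(quantized_model(torch.randn(4, 16)))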

See the full Documentation for more information on building, installing, and developing with FBGEMM, as well as the most up-to-date support matrix and API documentation for this library.

What's New?

Citation

For a high-level overview, design philosophy, and brief descriptions of various parts of FBGEMM, please see our blog post.

For those looking for the appropriate article to cite regarding FBGEMM, we recommend citing our paper:

@article{fbgemm,
  title={FBGEMM: Enabling High-Performance Low-Precision Deep Learning Inference},
  author={Khudia, Daya and Huang, Jianyu and Basu, Protonu and Deng, Summer and Liu, Haixin and Park, Jongsoo and Smelyanskiy, Mikhail},
  journal={arXiv preprint arXiv:2101.05615},
  year={2021}
}

Join the FBGEMM community

For questions, support, news updates, or feature requests, please feel free to open an issue or start a discussion on the GitHub repository.

For contributions, please see the CONTRIBUTING file for ways to help out.

License

FBGEMM is BSD licensed, as found in the LICENSE file.
