Quantization
Created On: Oct 09, 2019 | Last Updated On: Aug 19, 2025
We are centralizing all quantization-related development in torchao. Please check out our new doc page: https://docs.pytorch.org/ao/stable/index.html
Plan for the existing quantization flows:
1. Eager mode quantization (torch.ao.quantization.quantize, torch.ao.quantization.quantize_dynamic): please migrate to the torchao eager mode quantize_ API instead.
2. FX graph mode quantization (torch.ao.quantization.quantize_fx.prepare_fx, torch.ao.quantization.quantize_fx.convert_fx): please migrate to the torchao pt2e quantization API instead (torchao.quantization.pt2e.quantize_pt2e.prepare_pt2e, torchao.quantization.pt2e.quantize_pt2e.convert_pt2e).
3. pt2e quantization has been migrated to torchao (pytorch/ao); see pytorch/ao#2259 for more details.
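As a rough illustration of the eager mode migration in item 1, the sketch below applies the torchao `quantize_` API to a small model. The toy model, the helper names `build_model` and `quantize_with_torchao`, and the choice of `Int8WeightOnlyConfig` are assumptions made for this example, not something this page prescribes; check the torchao documentation for the config that best matches your existing flow. The import is guarded so the script still runs, unquantized, when torchao is not installed or does not provide that config.

```python
import torch
import torch.nn as nn


def build_model() -> nn.Module:
    # Hypothetical toy model used only for illustration.
    return nn.Sequential(nn.Linear(64, 32), nn.ReLU(), nn.Linear(32, 8)).eval()


def quantize_with_torchao(model: nn.Module):
    """Quantize the model in place with torchao.

    Returns the name of the config that was applied, or None if torchao
    (or the assumed config class) is not available in this environment.
    """
    try:
        # Assumption: a torchao release that exports both names.
        from torchao.quantization import Int8WeightOnlyConfig, quantize_
    except ImportError:
        return None
    # quantize_ modifies the model in place; by default it targets nn.Linear.
    quantize_(model, Int8WeightOnlyConfig())
    return "Int8WeightOnlyConfig"


model = build_model()
used_config = quantize_with_torchao(model)

with torch.no_grad():
    output = model(torch.randn(4, 64))

print(used_config, tuple(output.shape))
```

Unlike `torch.ao.quantization.quantize_dynamic`, which returns a converted copy of the model, `quantize_` mutates the module it is given, so keep a separate copy if the original floating-point weights are still needed.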
We plan to delete torch.ao.quantization in 2.10 if there are no blockers; otherwise, in the earliest PyTorch version after all blockers are cleared.
Quantization API Reference (Kept since APIs are still public)
The Quantization API Reference contains documentation of quantization APIs, such as quantization passes, quantized tensor operations, and supported quantized modules and functions.