DTypeConfig#
- class torch.ao.quantization.backend_config.DTypeConfig(input_dtype=None, output_dtype=None, weight_dtype=None, bias_dtype=None, is_dynamic=None)[source]#
Config object that specifies the supported data types passed as arguments to quantize ops in the reference model spec, for input and output activations, weights, and biases.
For example, consider the following reference model:
quant1 - [dequant1 - fp32_linear - quant2] - dequant2
The pattern in the square brackets refers to the reference pattern of statically quantized linear. Setting the input dtype as torch.quint8 in the DTypeConfig means we pass in torch.quint8 as the dtype argument to the first quantize op (quant1). Similarly, setting the output dtype as torch.quint8 means we pass in torch.quint8 as the dtype argument to the second quantize op (quant2).
Note that the dtype here does not refer to the interface dtypes of the op. For example, the "input dtype" here is not the dtype of the input tensor passed to the quantized linear op. Though it can still be the same as the interface dtype, this is not always the case; e.g. the interface dtype is fp32 in dynamic quantization, but the "input dtype" specified in the DTypeConfig would still be quint8. The semantics of dtypes here are the same as the semantics of the dtypes specified in the observers.
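The dynamic linear case described above can be written out as a DTypeConfig. This is a sketch using the documented fields; the field values below mirror the typical int8 dynamic configuration:

```python
import torch
from torch.ao.quantization.backend_config import DTypeConfig

# Dynamic quantization: the linear op's interface takes and returns fp32
# tensors, but the activation is quantized on the fly as quint8, so the
# "input dtype" in the spec is still quint8, with is_dynamic=True.
dynamic_int8_dtype_config = DTypeConfig(
    input_dtype=torch.quint8,
    output_dtype=torch.float,
    weight_dtype=torch.qint8,
    bias_dtype=torch.float,
    is_dynamic=True,
)
print(dynamic_int8_dtype_config.input_dtype)  # torch.quint8
print(dynamic_int8_dtype_config.is_dynamic)   # True
```

Here the "output dtype" is fp32 because dynamic quantization produces a floating-point result; only the input activation and weight carry quantized dtypes in the spec.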
These dtypes are matched against the ones specified in the user's QConfig. If there is a match, and the QConfig satisfies the constraints specified in the DTypeConfig (if any), then we will quantize the given pattern using this DTypeConfig. Otherwise, the QConfig is ignored and the pattern will not be quantized.
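As a sketch of how this matching is wired up: a backend attaches DTypeConfigs to a pattern via BackendPatternConfig, and the dtypes carried by the user's QConfig observers are compared against them. The default_qconfig used below is just one example of a matching QConfig:

```python
import torch
from torch.ao.quantization import default_qconfig
from torch.ao.quantization.backend_config import (
    BackendPatternConfig,
    DTypeConfig,
)

# The backend declares which dtype combination it supports for linear.
weighted_int8_dtype_config = DTypeConfig(
    input_dtype=torch.quint8,
    output_dtype=torch.quint8,
    weight_dtype=torch.qint8,
    bias_dtype=torch.float,
)

# Attach the DTypeConfig to the linear pattern.
linear_config = BackendPatternConfig(torch.nn.Linear) \
    .add_dtype_config(weighted_int8_dtype_config)

# The user's QConfig carries observers whose dtypes are matched against
# the DTypeConfig above: quint8 activations and qint8 weights here, so
# this QConfig would match and the pattern would be quantized.
print(default_qconfig.activation().dtype)  # torch.quint8
print(default_qconfig.weight().dtype)      # torch.qint8
```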
Example usage:
>>> dtype_config1 = DTypeConfig(
...     input_dtype=torch.quint8,
...     output_dtype=torch.quint8,
...     weight_dtype=torch.qint8,
...     bias_dtype=torch.float)
>>> dtype_config2 = DTypeConfig(
...     input_dtype=DTypeWithConstraints(
...         dtype=torch.quint8,
...         quant_min_lower_bound=0,
...         quant_max_upper_bound=255,
...     ),
...     output_dtype=DTypeWithConstraints(
...         dtype=torch.quint8,
...         quant_min_lower_bound=0,
...         quant_max_upper_bound=255,
...     ),
...     weight_dtype=DTypeWithConstraints(
...         dtype=torch.qint8,
...         quant_min_lower_bound=-128,
...         quant_max_upper_bound=127,
...     ),
...     bias_dtype=torch.float)
>>> dtype_config1.input_dtype
torch.quint8
>>> dtype_config2.input_dtype
torch.quint8
>>> dtype_config2.input_dtype_with_constraints
DTypeWithConstraints(dtype=torch.quint8, quant_min_lower_bound=0, quant_max_upper_bound=255, scale_min_lower_bound=None, scale_max_upper_bound=None)
- classmethod from_dict(dtype_config_dict)[source]#
- Create a DTypeConfig from a dictionary with the following items (all optional):
  - "input_dtype": torch.dtype or DTypeWithConstraints
  - "output_dtype": torch.dtype or DTypeWithConstraints
  - "weight_dtype": torch.dtype or DTypeWithConstraints
  - "bias_dtype": torch.dtype
  - "is_dynamic": bool
- Return type:
  - DTypeConfig
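A minimal sketch of from_dict, assuming a recent torch.ao.quantization build. The dictionary keys are the ones listed above; plain torch.dtype values are accepted wherever a DTypeWithConstraints is allowed:

```python
import torch
from torch.ao.quantization.backend_config import DTypeConfig

dtype_config_dict = {
    "input_dtype": torch.quint8,
    "output_dtype": torch.quint8,
    "weight_dtype": torch.qint8,
    "bias_dtype": torch.float,
    "is_dynamic": False,
}

# Build a DTypeConfig from its dictionary form; every key is optional.
dtype_config = DTypeConfig.from_dict(dtype_config_dict)
print(dtype_config.input_dtype)   # torch.quint8
print(dtype_config.weight_dtype)  # torch.qint8
print(dtype_config.is_dynamic)    # False
```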