DTypeConfig#
- class torch.ao.quantization.backend_config.DTypeConfig(input_dtype=None, output_dtype=None, weight_dtype=None, bias_dtype=None, is_dynamic=None)[source]#
Config object that specifies the supported data types passed as arguments to quantize ops in the reference model spec, for input and output activations, weights, and biases.
For example, consider the following reference model:
quant1 - [dequant1 - fp32_linear - quant2] - dequant2
The pattern in the square brackets refers to the reference pattern of statically quantized linear. Setting the input dtype as torch.quint8 in the DTypeConfig means we pass in torch.quint8 as the dtype argument to the first quantize op (quant1). Similarly, setting the output dtype as torch.quint8 means we pass in torch.quint8 as the dtype argument to the second quantize op (quant2).
Note that the dtype here does not refer to the interface dtypes of the op. For example, the "input dtype" here is not the dtype of the input tensor passed to the quantized linear op. Though it can still be the same as the interface dtype, this is not always the case; e.g. the interface dtype is fp32 in dynamic quantization, but the "input dtype" specified in the DTypeConfig would still be quint8. The semantics of dtypes here are the same as the semantics of the dtypes specified in the observers.
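The dynamic linear case described above can be written out as a DTypeConfig. This is a sketch using the documented fields; the field values below mirror the typical int8 dynamic configuration:

```python
import torch
from torch.ao.quantization.backend_config import DTypeConfig

# Dynamic quantization: the linear op's interface takes and returns fp32
# tensors, but the activation is quantized on the fly as quint8, so the
# "input dtype" in the spec is still quint8, with is_dynamic=True.
dynamic_int8_dtype_config = DTypeConfig(
    input_dtype=torch.quint8,
    output_dtype=torch.float,
    weight_dtype=torch.qint8,
    bias_dtype=torch.float,
    is_dynamic=True,
)
print(dynamic_int8_dtype_config.input_dtype)  # torch.quint8
print(dynamic_int8_dtype_config.is_dynamic)   # True
```

Here the "output dtype" is fp32 because dynamic quantization produces a floating-point result; only the input activation and weight carry quantized dtypes in the spec.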
These dtypes are matched against the ones specified in the user's QConfig. If there is a match, and the QConfig satisfies the constraints specified in the DTypeConfig (if any), then we will quantize the given pattern using this DTypeConfig. Otherwise, the QConfig is ignored and the pattern will not be quantized.
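As a sketch of how this matching is wired up: a backend attaches DTypeConfigs to a pattern via BackendPatternConfig, and the dtypes carried by the user's QConfig observers are compared against them. The default_qconfig used below is just one example of a matching QConfig:

```python
import torch
from torch.ao.quantization import default_qconfig
from torch.ao.quantization.backend_config import (
    BackendPatternConfig,
    DTypeConfig,
)

# The backend declares which dtype combination it supports for linear.
weighted_int8_dtype_config = DTypeConfig(
    input_dtype=torch.quint8,
    output_dtype=torch.quint8,
    weight_dtype=torch.qint8,
    bias_dtype=torch.float,
)

# Attach the DTypeConfig to the linear pattern.
linear_config = BackendPatternConfig(torch.nn.Linear) \
    .add_dtype_config(weighted_int8_dtype_config)

# The user's QConfig carries observers whose dtypes are matched against
# the DTypeConfig above: quint8 activations and qint8 weights here, so
# this QConfig would match and the pattern would be quantized.
print(default_qconfig.activation().dtype)  # torch.quint8
print(default_qconfig.weight().dtype)      # torch.qint8
```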
Example usage:
>>> dtype_config1 = DTypeConfig(
...     input_dtype=torch.quint8,
...     output_dtype=torch.quint8,
...     weight_dtype=torch.qint8,
...     bias_dtype=torch.float)
>>> dtype_config2 = DTypeConfig(
...     input_dtype=DTypeWithConstraints(
...         dtype=torch.quint8,
...         quant_min_lower_bound=0,
...         quant_max_upper_bound=255,
...     ),
...     output_dtype=DTypeWithConstraints(
...         dtype=torch.quint8,
...         quant_min_lower_bound=0,
...         quant_max_upper_bound=255,
...     ),
...     weight_dtype=DTypeWithConstraints(
...         dtype=torch.qint8,
...         quant_min_lower_bound=-128,
...         quant_max_upper_bound=127,
...     ),
...     bias_dtype=torch.float)
>>> dtype_config1.input_dtype
torch.quint8
>>> dtype_config2.input_dtype
torch.quint8
>>> dtype_config2.input_dtype_with_constraints
DTypeWithConstraints(dtype=torch.quint8, quant_min_lower_bound=0, quant_max_upper_bound=255, scale_min_lower_bound=None, scale_max_upper_bound=None)
- classmethod from_dict(dtype_config_dict)[source]#
- Create a DTypeConfig from a dictionary with the following items (all optional):
  - "input_dtype": torch.dtype or DTypeWithConstraints
  - "output_dtype": torch.dtype or DTypeWithConstraints
  - "weight_dtype": torch.dtype or DTypeWithConstraints
  - "bias_dtype": torch.dtype
  - "is_dynamic": bool
- Return type:
  - DTypeConfig
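A minimal sketch of from_dict, assuming a recent torch.ao.quantization build. The dictionary keys are the ones listed above; plain torch.dtype values are accepted wherever a DTypeWithConstraints is allowed:

```python
import torch
from torch.ao.quantization.backend_config import DTypeConfig

dtype_config_dict = {
    "input_dtype": torch.quint8,
    "output_dtype": torch.quint8,
    "weight_dtype": torch.qint8,
    "bias_dtype": torch.float,
    "is_dynamic": False,
}

# Build a DTypeConfig from its dictionary form; every key is optional.
dtype_config = DTypeConfig.from_dict(dtype_config_dict)
print(dtype_config.input_dtype)   # torch.quint8
print(dtype_config.weight_dtype)  # torch.qint8
print(dtype_config.is_dynamic)    # False
```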