quantize_dynamic
- torch.ao.quantization.quantize_dynamic(model, qconfig_spec=None, dtype=torch.qint8, mapping=None, inplace=False)
Converts a float model to a dynamic (i.e. weights-only) quantized model.
Replaces specified modules with dynamic weight-only quantized versions and outputs the quantized model.
For the simplest usage, provide the dtype argument, which can be float16 or qint8. By default, weight-only quantization is performed on layers with large weight sizes - i.e. Linear and RNN variants.
Fine-grained control is possible with qconfig_spec and mapping, which act similarly to quantize(). If qconfig_spec is provided as a dictionary of QConfig instances, the dtype argument is ignored.
- Parameters
model – input model
qconfig_spec –
Either:
A dictionary that maps from the name or type of a submodule to a quantization configuration. The qconfig applies to all submodules of a given module unless a qconfig is specified for a submodule (i.e. the submodule already has a qconfig attribute). Entries in the dictionary need to be QConfig instances.
A set of types and/or submodule names to apply dynamic quantization to, in which case the dtype argument is used to specify the bit-width
inplace – carry out model transformations in-place; the original module is mutated
mapping – maps the type of a submodule to the type of the corresponding dynamically quantized version with which the submodule needs to be replaced
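A minimal sketch of the simplest usage described above: passing a set of module types as qconfig_spec so that dtype controls the bit-width. The model architecture and sizes here are illustrative assumptions, not from the source.

```python
import torch
import torch.nn as nn

# A small float model containing Linear layers — the kind of module
# dynamic (weights-only) quantization targets by default.
model = nn.Sequential(
    nn.Linear(64, 128),
    nn.ReLU(),
    nn.Linear(128, 10),
)

# Simplest usage: qconfig_spec is a set of types, so the dtype argument
# selects the bit-width (qint8 here) for the weights of those modules.
qmodel = torch.ao.quantization.quantize_dynamic(
    model, qconfig_spec={nn.Linear}, dtype=torch.qint8
)

# The Linear submodules are replaced by dynamically quantized versions;
# inference runs with int8 weights and float activations.
x = torch.randn(1, 64)
out = qmodel(x)
```

With inplace=False (the default), the original float model is left untouched and a new quantized model is returned; passing inplace=True mutates model directly instead.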