quantize_dynamic

class torch.ao.quantization.quantize_dynamic(model, qconfig_spec=None, dtype=torch.qint8, mapping=None, inplace=False)[source]

Converts a float model to a dynamic (i.e. weights-only) quantized model.

Replaces specified modules with dynamic weight-only quantized versions and outputs the quantized model.

For the simplest usage, provide the dtype argument, which can be torch.float16 or torch.qint8. By default, weight-only quantization is performed for layers with large weight sizes, i.e. Linear and RNN variants.
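As a minimal sketch of this simple usage, the following quantizes the Linear layers of a small toy model (the model itself is just an illustrative stand-in):

```python
import torch
import torch.nn as nn
from torch.ao.quantization import quantize_dynamic

# A toy float model; Linear layers are the default target
# for dynamic quantization.
model = nn.Sequential(nn.Linear(16, 8), nn.ReLU(), nn.Linear(8, 4))

# Quantize the Linear weights to qint8; activations stay in float
# and are quantized dynamically at runtime.
qmodel = quantize_dynamic(model, dtype=torch.qint8)

x = torch.randn(2, 16)
out = qmodel(x)  # forward pass runs with dynamically quantized Linear layers
```

Since inplace defaults to False, the original model is left untouched and a quantized copy is returned.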

Fine-grained control is possible with the qconfig and mapping arguments, which act similarly to quantize(). If qconfig is provided, the dtype argument is ignored.

Parameters
  • model – input model

  • qconfig_spec

    Either:

    • A dictionary that maps from the name or type of a submodule to a quantization configuration. A qconfig applies to all submodules of a given module unless a qconfig is specified for a submodule (i.e. the submodule already has a qconfig attribute). Entries in the dictionary need to be QConfig instances.

    • A set of types and/or submodule names to apply dynamic quantization to, in which case the dtype argument is used to specify the bit-width

  • inplace – carry out model transformations in-place; the original module is mutated

  • mapping – maps the type of a submodule to the type of the corresponding dynamically quantized version with which the submodule needs to be replaced
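To illustrate the qconfig_spec parameter, the sketch below passes a set of submodule names so that only one named layer is quantized (the module and layer names here are hypothetical, chosen for the example):

```python
import torch
import torch.nn as nn
from torch.ao.quantization import quantize_dynamic

class Net(nn.Module):
    def __init__(self):
        super().__init__()
        self.fc1 = nn.Linear(16, 8)
        self.fc2 = nn.Linear(8, 4)

    def forward(self, x):
        return self.fc2(torch.relu(self.fc1(x)))

model = Net()

# Restrict dynamic quantization to the submodule named 'fc1';
# fc2 remains a float nn.Linear.
qmodel = quantize_dynamic(model, qconfig_spec={"fc1"}, dtype=torch.qint8)
```

Passing a dictionary of QConfig instances instead of a set gives the same per-module selectivity while also letting you choose the observers used for each entry.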