Linear#

class torch.ao.nn.quantized.Linear(in_features, out_features, bias_=True, dtype=torch.qint8)[source]#

A quantized linear module with quantized tensors as inputs and outputs. We adopt the same interface as torch.nn.Linear; please see https://pytorch.org/docs/stable/nn.html#torch.nn.Linear for documentation.

Similar to Linear, attributes will be randomly initialized at module creation time and will be overwritten later.

Variables
  • weight (Tensor) – the non-learnable quantized weights of the module of shape (out_features, in_features).

  • bias (Tensor) – the non-learnable bias of the module of shape (out_features). If bias is True, the values are initialized to zero.

  • scale – scale parameter of output Quantized Tensor, type: double

  • zero_point – zero_point parameter for output Quantized Tensor, type: long

Examples:

>>> m = nn.quantized.Linear(20, 30)
>>> input = torch.randn(128, 20)
>>> input = torch.quantize_per_tensor(input, 1.0, 0, torch.quint8)
>>> output = m(input)
>>> print(output.size())
torch.Size([128, 30])
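The doctest above assumes `nn` refers to the quantized namespace. A self-contained version might look like the sketch below; note that constructing the module packs qint8 weights, which requires a quantized backend (such as fbgemm or qnnpack) to be available in the torch build.

```python
import torch
from torch.ao.nn.quantized import Linear as QLinear

# Quantized linear layer; weights start as randomly initialized qint8
# values and would normally be overwritten by a conversion utility.
m = QLinear(20, 30)

# Inputs must be quantized tensors (quint8 here).
x = torch.randn(128, 20)
qx = torch.quantize_per_tensor(x, scale=1.0, zero_point=0, dtype=torch.quint8)

out = m(qx)               # output is also a quantized tensor
print(out.size())         # torch.Size([128, 30])
print(m.scale, m.zero_point)  # output quantization parameters
```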
classmethod from_float(mod, use_precomputed_fake_quant=False)[source]#

Create a quantized module from an observed float module

Parameters
  • mod (Module) – a float module, either produced by torch.ao.quantization utilities or provided by the user

  • use_precomputed_fake_quant (bool) – if True, the module will reuse min/max values from the precomputed fake quant module.
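In practice, from_float is usually invoked indirectly by torch.ao.quantization.convert during eager-mode static quantization rather than called by hand. A minimal sketch of that workflow (the wrapper module `M` and the layer sizes are illustrative assumptions, not part of this API):

```python
import torch
import torch.ao.quantization as tq

# Illustrative float model: quantize input, apply a linear layer,
# dequantize the output back to fp32.
class M(torch.nn.Module):
    def __init__(self):
        super().__init__()
        self.quant = tq.QuantStub()       # fp32 -> quantized
        self.fc = torch.nn.Linear(20, 30)
        self.dequant = tq.DeQuantStub()   # quantized -> fp32

    def forward(self, x):
        return self.dequant(self.fc(self.quant(x)))

m = M().eval()
m.qconfig = tq.get_default_qconfig(torch.backends.quantized.engine)
prepared = tq.prepare(m)
prepared(torch.randn(8, 20))      # calibration pass: observers record min/max
quantized = tq.convert(prepared)  # calls Linear.from_float under the hood
# quantized.fc is now a torch.ao.nn.quantized.Linear
```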

classmethod from_reference(ref_qlinear, output_scale, output_zero_point)[source]#

Create a (fbgemm/qnnpack) quantized module from a reference quantized module

Parameters
  • ref_qlinear (Module) – a reference quantized linear module, either produced by torch.ao.quantization utilities or provided by the user

  • output_scale (float) – scale for output Tensor

  • output_zero_point (int) – zero point for output Tensor