GroupNorm

class torch.nn.modules.normalization.GroupNorm(num_groups, num_channels, eps=1e-05, affine=True, device=None, dtype=None)[source]

Applies Group Normalization over a mini-batch of inputs.

This layer implements the operation as described in the paper Group Normalization.

y = \frac{x - \mathrm{E}[x]}{\sqrt{\mathrm{Var}[x] + \epsilon}} * \gamma + \beta

The input channels are separated into num_groups groups, each containing num_channels / num_groups channels. num_channels must be divisible by num_groups. The mean and standard deviation are calculated separately over each group. \gamma and \beta are learnable per-channel affine transform parameter vectors of size num_channels if affine is True. The variance is calculated via the biased estimator, equivalent to torch.var(input, unbiased=False).
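The per-group statistics described above can be reproduced by hand. The sketch below (shapes and the group split of 6 channels into 3 groups are illustrative choices, not fixed by the API) reshapes the input so each group's channels and spatial positions share one reduction axis, then applies the formula with the biased variance:

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
x = torch.randn(4, 6, 5, 5)           # (N, C, H, W)
m = nn.GroupNorm(3, 6, affine=False)  # affine off: compare pure normalization

# Collapse each group's channels and spatial dims into one axis:
# (N, num_groups, C/num_groups * H * W)
xg = x.view(4, 3, -1)
mean = xg.mean(dim=-1, keepdim=True)
var = xg.var(dim=-1, unbiased=False, keepdim=True)  # biased estimator
manual = ((xg - mean) / torch.sqrt(var + 1e-5)).view_as(x)

print(torch.allclose(manual, m(x), atol=1e-5))  # True
```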

This layer uses statistics computed from input data in both training and evaluation modes.
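In practical terms, GroupNorm keeps no running statistics, so (unlike BatchNorm) switching between train() and eval() does not change its output. A quick check:

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
m = nn.GroupNorm(3, 6)
x = torch.randn(2, 6, 4, 4)

m.train()
out_train = m(x)
m.eval()
out_eval = m(x)

print(torch.equal(out_train, out_eval))  # True
```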

Parameters
  • num_groups (int) – number of groups to separate the channels into

  • num_channels (int) – number of channels expected in input

  • eps (float) – a value added to the denominator for numerical stability. Default: 1e-5

  • affine (bool) – when set to True, this module has learnable per-channel affine parameters initialized to ones (for weights) and zeros (for biases). Default: True.
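The affine flag controls whether the module registers the weight (\gamma) and bias (\beta) parameters. A short illustration of the two settings:

```python
import torch.nn as nn

# affine=True (the default): per-channel weight and bias parameters,
# initialized to ones and zeros respectively.
m = nn.GroupNorm(3, 6)
print(m.weight.shape, m.bias.shape)  # torch.Size([6]) torch.Size([6])
print(m.weight.sum().item(), m.bias.sum().item())  # 6.0 0.0

# affine=False: no learnable parameters are registered.
m_plain = nn.GroupNorm(3, 6, affine=False)
print(m_plain.weight is None)  # True
```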

Shape:

  • Input: (N, C, *) where C = num_channels

  • Output: (N, C, *) (same shape as input)

Examples:

>>> input = torch.randn(20, 6, 10, 10)
>>> # Separate 6 channels into 3 groups
>>> m = nn.GroupNorm(3, 6)
>>> # Separate 6 channels into 6 groups (equivalent with InstanceNorm)
>>> m = nn.GroupNorm(6, 6)
>>> # Put all 6 channels into a single group (equivalent with LayerNorm)
>>> m = nn.GroupNorm(1, 6)
>>> # Activating the module
>>> output = m(input)
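The InstanceNorm and LayerNorm equivalences noted in the comments can be verified numerically. The sketch below disables all affine parameters so that only the normalization itself is compared:

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
x = torch.randn(20, 6, 10, 10)

# GroupNorm with one channel per group matches InstanceNorm2d
# (which defaults to affine=False).
gn_inst = nn.GroupNorm(6, 6, affine=False)
inst = nn.InstanceNorm2d(6)
print(torch.allclose(gn_inst(x), inst(x), atol=1e-5))  # True

# GroupNorm with a single group matches LayerNorm over (C, H, W).
gn_layer = nn.GroupNorm(1, 6, affine=False)
ln = nn.LayerNorm([6, 10, 10], elementwise_affine=False)
print(torch.allclose(gn_layer(x), ln(x), atol=1e-5))  # True
```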