GaussianNLLLoss
- class torch.nn.modules.loss.GaussianNLLLoss(*, full=False, eps=1e-06, reduction='mean')
Gaussian negative log likelihood loss.
The targets are treated as samples from Gaussian distributions with expectations and variances predicted by the neural network. For a target tensor modelled as having Gaussian distribution with a tensor of expectations input and a tensor of positive variances var, the loss is:

$$\text{loss} = \frac{1}{2}\left(\log\left(\max\left(\text{var},\ \text{eps}\right)\right) + \frac{\left(\text{input} - \text{target}\right)^2}{\max\left(\text{var},\ \text{eps}\right)}\right) + \text{const.}$$

where eps is used for stability. By default, the constant term of the loss function is omitted unless full is True. If var is not the same size as input (due to a homoscedastic assumption), it must either have a final dimension of 1 or have one fewer dimension (with all other sizes being the same) for correct broadcasting.
- Parameters
  - full (bool, optional) – include the constant term in the loss calculation. Default: False.
  - eps (float, optional) – value used to clamp var (see note below), for stability. Default: 1e-6.
  - reduction (str, optional) – specifies the reduction to apply to the output: 'none' | 'mean' | 'sum'. 'none': no reduction will be applied, 'mean': the output is the average of all batch member losses, 'sum': the output is the sum of all batch member losses. Default: 'mean'.
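The formula and the parameters above can be checked against the module with a few lines of code. The following is a minimal sketch, not the library's implementation: `manual_gaussian_nll` is a hypothetical helper written for this comparison, and it assumes the constant term added by `full=True` is the usual Gaussian normalisation term, 0.5 * log(2π) per element.

```python
import math

import torch
import torch.nn as nn

torch.manual_seed(0)
input = torch.randn(5, 2)       # predicted expectations
target = torch.randn(5, 2)      # observed targets
var = torch.rand(5, 2) + 0.1    # strictly positive predicted variances


def manual_gaussian_nll(input, target, var, full=False, eps=1e-6):
    """Hypothetical helper: element-wise loss following the formula in the text."""
    var = var.clamp(min=eps)
    loss = 0.5 * (torch.log(var) + (input - target) ** 2 / var)
    if full:
        # Assumed value of the constant term (Gaussian normalisation).
        loss = loss + 0.5 * math.log(2 * math.pi)
    return loss


elementwise = manual_gaussian_nll(input, target, var)

# The three reduction modes of the module.
none_loss = nn.GaussianNLLLoss(reduction="none")(input, target, var)
mean_loss = nn.GaussianNLLLoss(reduction="mean")(input, target, var)
sum_loss = nn.GaussianNLLLoss(reduction="sum")(input, target, var)

# Same loss with the constant term included.
full_loss = nn.GaussianNLLLoss(full=True, reduction="none")(input, target, var)
```

Under these assumptions, `none_loss` should match `elementwise`, while `mean_loss` and `sum_loss` should match its mean and sum. Because the constant does not depend on `input` or `var`, leaving it out (the default) changes the reported loss value but not the gradients.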
- Shape:
  - Input: $(N, *)$ or $(*)$, where $*$ means any number of additional dimensions
  - Target: $(N, *)$ or $(*)$, same shape as the input, or same shape as the input but with one dimension equal to 1 (to allow for broadcasting)
  - Var: $(N, *)$ or $(*)$, same shape as the input, or same shape as the input but with one dimension equal to 1, or same shape as the input but with one fewer dimension (to allow for broadcasting), or a scalar value
  - Output: scalar if reduction is 'mean' (default) or 'sum'. If reduction is 'none', then $(N, *)$, same shape as the input
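The shape rules for var can be illustrated with a short sketch. It assumes a (5, 2) input and a constant variance of 0.5, chosen only so that the three layouts are directly comparable. The scalar-value case is left out here.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
loss_fn = nn.GaussianNLLLoss(reduction="none")  # keep per-element losses

input = torch.randn(5, 2)
target = torch.randn(5, 2)

# Three accepted layouts for var, all holding the same value (0.5).
var_same = torch.full((5, 2), 0.5)     # same shape as input (heteroscedastic layout)
var_last_one = torch.full((5, 1), 0.5)  # final dimension of size 1
var_one_fewer = torch.full((5,), 0.5)   # one fewer dimension than input

out_same = loss_fn(input, target, var_same)
out_last_one = loss_fn(input, target, var_last_one)
out_one_fewer = loss_fn(input, target, var_one_fewer)

# With reduction='none', the output has the shape of the input.
print(out_same.shape, out_last_one.shape, out_one_fewer.shape)
```

Since all three variance tensors broadcast to the same values, the three outputs should be identical, and each should have the shape of input.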
Examples
>>> import torch
>>> import torch.nn as nn
>>> loss = nn.GaussianNLLLoss()
>>> input = torch.randn(5, 2, requires_grad=True)
>>> target = torch.randn(5, 2)
>>> var = torch.ones(5, 2, requires_grad=True)  # heteroscedastic
>>> output = loss(input, target, var)
>>> output.backward()

>>> loss = nn.GaussianNLLLoss()
>>> input = torch.randn(5, 2, requires_grad=True)
>>> target = torch.randn(5, 2)
>>> var = torch.ones(5, 1, requires_grad=True)  # homoscedastic
>>> output = loss(input, target, var)
>>> output.backward()
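In practice, the expectations and variances come from a network rather than from fixed tensors. The sketch below shows one possible setup; it is an illustration, not a recommended architecture. `MeanVarianceNet`, the synthetic data, the hyperparameters, and the use of softplus to keep the predicted variance positive are all assumptions made for this example.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class MeanVarianceNet(nn.Module):
    """Hypothetical regressor with one head for the mean and one for the variance."""

    def __init__(self, in_features, hidden=16):
        super().__init__()
        self.body = nn.Sequential(nn.Linear(in_features, hidden), nn.Tanh())
        self.mean_head = nn.Linear(hidden, 1)
        self.var_head = nn.Linear(hidden, 1)

    def forward(self, x):
        h = self.body(x)
        mean = self.mean_head(h)
        # Softplus maps any real number to a positive one (assumed design choice).
        var = F.softplus(self.var_head(h)) + 1e-6
        return mean, var


torch.manual_seed(0)
x = torch.randn(64, 3)
y = x.sum(dim=1, keepdim=True) + 0.1 * torch.randn(64, 1)  # synthetic targets

model = MeanVarianceNet(in_features=3)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-2)
loss_fn = nn.GaussianNLLLoss()

losses = []
for _ in range(50):
    optimizer.zero_grad()
    mean, var = model(x)          # both have shape (64, 1)
    loss = loss_fn(mean, y, var)  # argument order: input, target, var
    loss.backward()
    optimizer.step()
    losses.append(loss.item())
```

Predicting the variance through a positive transform keeps var in the range the loss expects, so eps only acts as a safeguard.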
Note
The clamping of var is ignored with respect to autograd, and so the gradients are unaffected by it.
- Reference:
Nix, D. A. and Weigend, A. S., “Estimating the mean and variance of the target probability distribution”, Proceedings of 1994 IEEE International Conference on Neural Networks (ICNN’94), Orlando, FL, USA, 1994, pp. 55-60 vol.1, doi: 10.1109/ICNN.1994.374138.
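The note on clamping can be observed directly by comparing the module with a hand-written loss that clamps through `torch.clamp`. This is a small sketch under assumed values: eps is set to an unusually large 0.1 and var to 0.01, so that the clamp is active. The naive version is written only for contrast.

```python
import torch
import torch.nn as nn

eps = 0.1  # deliberately large so that var < eps and the clamp is active
input = torch.tensor([0.0])
target = torch.tensor([1.0])

# Module: var is clamped for the forward value, but autograd ignores the clamp.
var = torch.tensor([0.01], requires_grad=True)
loss = nn.GaussianNLLLoss(eps=eps, reduction="sum")(input, target, var)
loss.backward()

# Naive version: torch.clamp is part of the autograd graph, so no gradient
# reaches entries of var that lie below eps.
naive_var = torch.tensor([0.01], requires_grad=True)
clamped = torch.clamp(naive_var, min=eps)
naive_loss = 0.5 * (torch.log(clamped) + (input - target) ** 2 / clamped)
naive_loss.sum().backward()

print("module gradient:", var.grad)
print("naive gradient: ", naive_var.grad)
```

Both versions produce the same loss value, but only the module passes a nonzero gradient back to var. A predicted variance that falls below eps can therefore still be updated during training.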