SmoothL1Loss

class torch.nn.SmoothL1Loss(size_average=None, reduce=None, reduction='mean', beta=1.0)

Creates a criterion that uses a squared term if the absolute element-wise error falls below beta and an L1 term otherwise. It is less sensitive to outliers than torch.nn.MSELoss and in some cases prevents exploding gradients (e.g. see the paper Fast R-CNN by Ross Girshick).

For a batch of size N, the unreduced loss can be described as:

$$\ell(x, y) = L = \{l_1, \dots, l_N\}^T$$

with

$$l_n = \begin{cases}
0.5 (x_n - y_n)^2 / beta, & \text{if } |x_n - y_n| < beta \\
|x_n - y_n| - 0.5 \cdot beta, & \text{otherwise}
\end{cases}$$

If reduction is not 'none', then:

$$\ell(x, y) = \begin{cases}
\operatorname{mean}(L), & \text{if reduction} = \text{'mean';} \\
\operatorname{sum}(L), & \text{if reduction} = \text{'sum'.}
\end{cases}$$
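The piecewise rule above can be sketched in plain Python. This is an illustration of the formula only, not PyTorch's actual implementation:

```python
# Pure-Python sketch of the per-element smooth L1 rule (illustration only,
# not the library's implementation).

def smooth_l1(x, y, beta=1.0):
    """Unreduced smooth L1 losses between predictions x and targets y."""
    losses = []
    for xn, yn in zip(x, y):
        d = abs(xn - yn)
        if d < beta:
            losses.append(0.5 * d ** 2 / beta)   # quadratic segment
        else:
            losses.append(d - 0.5 * beta)        # linear (L1) segment
    return losses

# reduction='mean' averages the unreduced losses L
unreduced = smooth_l1([0.5, 2.0], [0.0, 0.0])    # [0.125, 1.5]
mean_loss = sum(unreduced) / len(unreduced)      # 0.8125
```

Note how the error 0.5 falls in the quadratic segment (0.5 · 0.5² / 1 = 0.125) while the error 2.0 falls in the linear segment (2.0 − 0.5 = 1.5).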

Note

Smooth L1 loss can be seen as exactly L1Loss, but with the $|x - y| < beta$ portion replaced with a quadratic function such that its slope is 1 at $|x - y| = beta$. The quadratic segment smooths the L1 loss near $|x - y| = 0$.

Note

Smooth L1 loss is closely related to HuberLoss, being equivalent to huber(x, y) / beta (note that Smooth L1's beta hyper-parameter is also known as delta for Huber). This leads to the following differences:

  • As beta -> 0, Smooth L1 loss converges to L1Loss, while HuberLoss converges to a constant 0 loss. When beta is 0, Smooth L1 loss is equivalent to L1 loss.

  • As beta -> +∞, Smooth L1 loss converges to a constant 0 loss, while HuberLoss converges to MSELoss.

  • For Smooth L1 loss, as beta varies, the L1 segment of the loss has a constant slope of 1. For HuberLoss, the slope of the L1 segment is beta.
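The huber(x, y) / beta relation can be checked numerically. Both helpers below are illustrative re-implementations of the per-element formulas, not the torch.nn modules:

```python
# Numeric check that smooth_l1(d, beta) == huber(d, delta=beta) / beta
# for an element-wise error d (illustrative, not the library code).

def smooth_l1_elem(d, beta):
    d = abs(d)
    return 0.5 * d ** 2 / beta if d < beta else d - 0.5 * beta

def huber_elem(d, delta):
    d = abs(d)
    return 0.5 * d ** 2 if d < delta else delta * (d - 0.5 * delta)

beta = 2.0
for diff in [0.1, 1.0, 1.9, 2.5, 10.0]:
    assert abs(smooth_l1_elem(diff, beta) - huber_elem(diff, beta) / beta) < 1e-12
```

Dividing Huber by its delta rescales the quadratic segment by 1/delta and fixes the linear segment's slope at 1, which is exactly the slope difference listed above.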

Parameters
  • size_average (bool, optional) – Deprecated (see reduction). By default, the losses are averaged over each loss element in the batch. Note that for some losses, there are multiple elements per sample. If the field size_average is set to False, the losses are instead summed for each minibatch. Ignored when reduce is False. Default: True

  • reduce (bool, optional) – Deprecated (see reduction). By default, the losses are averaged or summed over observations for each minibatch depending on size_average. When reduce is False, returns a loss per batch element instead and ignores size_average. Default: True

  • reduction (str, optional) – Specifies the reduction to apply to the output: 'none' | 'mean' | 'sum'. 'none': no reduction will be applied, 'mean': the sum of the output will be divided by the number of elements in the output, 'sum': the output will be summed. Note: size_average and reduce are in the process of being deprecated, and in the meantime, specifying either of those two args will override reduction. Default: 'mean'

  • beta (float, optional) – Specifies the threshold at which to change between L1 and L2 loss. The value must be non-negative. Default: 1.0

Shape:
  • Input: (*), where * means any number of dimensions.

  • Target: (*), same shape as the input.

  • Output: scalar. If reduction is 'none', then (*), same shape as the input.
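A minimal usage example, assuming PyTorch is installed:

```python
import torch
import torch.nn as nn

loss_fn = nn.SmoothL1Loss(beta=1.0)            # default reduction='mean'
input = torch.tensor([0.5, 2.0], requires_grad=True)
target = torch.tensor([0.0, 0.0])

loss = loss_fn(input, target)                  # (0.125 + 1.5) / 2 = 0.8125
loss.backward()                                # gradients flow back to `input`
```

With reduction='none' the same call would instead return the per-element tensor [0.125, 1.5].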

forward(input, target)

Runs the forward pass.

Return type

Tensor