Logistic regression: Loss and regularization
Page Summary
- Logistic regression models are trained similarly to linear regression models but use Log Loss instead of squared loss and require regularization.
- Log Loss is used in logistic regression because the rate of change isn't constant, requiring varying precision levels, unlike the squared loss used in linear regression.
- Regularization, such as L2 regularization or early stopping, is crucial in logistic regression to prevent overfitting due to the model's asymptotic nature.
Logistic regression models are trained using the same process as linear regression models, with two key distinctions:
- Logistic regression models use Log Loss as the loss function instead of squared loss.
- Applying regularization is critical to prevent overfitting.
The following sections discuss these two considerations in more depth.
Log Loss
In the Linear regression module, you used squared loss (also called L2 loss) as the loss function. Squared loss works well for a linear model where the rate of change of the output values is constant. For example, given the linear model $y' = b + 3x_1$, each time you increment the input value $x_1$ by 1, the output value $y'$ increases by 3.
However, the rate of change of a logistic regression model is not constant. As you saw in Calculating a probability, the sigmoid curve is S-shaped rather than linear. When the log-odds ($z$) value is closer to 0, small increases in $z$ result in much larger changes to $y$ than when $z$ is a large positive or negative number. The following table shows the sigmoid function's output for input values from 5 to 10, as well as the corresponding precision required to capture the differences in the results.
| input | logistic output | required digits of precision |
|---|---|---|
| 5 | 0.993 | 3 |
| 6 | 0.997 | 3 |
| 7 | 0.999 | 3 |
| 8 | 0.9997 | 4 |
| 9 | 0.9999 | 4 |
| 10 | 0.99995 | 5 |
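The table's values can be reproduced with a few lines of Python; this is a minimal sketch assuming the standard sigmoid definition $y = \frac{1}{1 + e^{-z}}$:

```python
import math

def sigmoid(z):
    """Standard sigmoid: maps a log-odds value z to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

# As z grows, successive outputs differ only in later decimal places,
# so more digits of precision are needed to tell them apart.
for z in range(5, 11):
    print(z, sigmoid(z))
```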
If you used squared loss to calculate errors for the sigmoid function, as the output got closer and closer to 0 and 1, you would need more memory to preserve the precision needed to track these values.
Instead, the loss function for logistic regression is Log Loss. The Log Loss equation returns the logarithm of the magnitude of the change, rather than just the distance from data to prediction. Log Loss is calculated as follows:
$\text{Log Loss} = -\frac{1}{N}\sum_{i=1}^{N} \left[ y_i\log(y_i') + (1 - y_i)\log(1 - y_i') \right]$
where:
- \(N\) is the number of labeled examples in the dataset
- \(i\) is the index of an example in the dataset (e.g., \((x_3, y_3)\) is the third example in the dataset)
- \(y_i\) is the label for the \(i\)th example. Since this is logistic regression, \(y_i\) must either be 0 or 1.
- \(y_i'\) is your model's prediction for the \(i\)th example (somewhere between 0 and 1), given the set of features in \(x_i\).
This form of the Log Loss function calculates the mean Log Loss across all points in the dataset. Using mean Log Loss (as opposed to total Log Loss) is desirable in practice, because it enables us to decouple tuning of the batch size and the learning rate.
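As a sketch, the mean Log Loss formula translates directly into Python, assuming predictions are strictly between 0 and 1 so both logarithms are defined:

```python
import math

def mean_log_loss(y_true, y_pred):
    """Mean Log Loss: y_true values are 0 or 1; y_pred values are in (0, 1)."""
    total = 0.0
    for y, p in zip(y_true, y_pred):
        # Only one of the two terms is nonzero for each example,
        # since y is either 0 or 1.
        total += y * math.log(p) + (1 - y) * math.log(1 - p)
    return -total / len(y_true)

# A confident, correct model incurs low loss; a confident, wrong
# model is penalized heavily.
print(mean_log_loss([1, 0], [0.9, 0.1]))  # low loss
print(mean_log_loss([1, 0], [0.1, 0.9]))  # high loss
```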
Regularization in logistic regression
Regularization, a mechanism for penalizing model complexity during training, is extremely important in logistic regression modeling. Without regularization, the asymptotic nature of logistic regression would keep driving loss towards 0 in cases where the model has a large number of features. Consequently, most logistic regression models use one of the following two strategies to decrease model complexity:
- L2 regularization
- Early stopping: Limiting the number of training steps to halt training while loss is still decreasing.
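To make the effect of L2 regularization concrete, here is a minimal one-feature sketch (not the course's reference implementation): gradient descent on mean Log Loss plus a penalty $\lambda w^2$, whose gradient term $2\lambda w$ pulls the weight toward 0. The toy dataset and hyperparameters below are illustrative assumptions.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def train(xs, ys, lam, lr=0.5, steps=2000):
    """Minimize mean Log Loss + lam * w**2 with gradient descent."""
    w = b = 0.0
    n = len(xs)
    for _ in range(steps):
        dw = db = 0.0
        for x, y in zip(xs, ys):
            err = sigmoid(w * x + b) - y  # gradient of Log Loss w.r.t. z
            dw += err * x
            db += err
        w -= lr * (dw / n + 2 * lam * w)  # L2 penalty contributes 2*lam*w
        b -= lr * db / n
    return w, b

# Perfectly separable toy data: without regularization, driving loss
# toward 0 keeps inflating w; the L2 penalty keeps it bounded.
xs, ys = [-2.0, -1.0, 1.0, 2.0], [0, 0, 1, 1]
w_free, _ = train(xs, ys, lam=0.0)
w_reg, _ = train(xs, ys, lam=0.1)
print(w_free, w_reg)  # the regularized weight is smaller
```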
Key terms:
- Gradient descent
- Linear regression
- Log Loss
- Logistic regression
- Loss function
- Overfitting
- Regularization
- Squared loss
Except as otherwise noted, the content of this page is licensed under the Creative Commons Attribution 4.0 License, and code samples are licensed under the Apache 2.0 License. For details, see the Google Developers Site Policies. Java is a registered trademark of Oracle and/or its affiliates.
Last updated 2025-10-03 UTC.