log_loss

sklearn.metrics.log_loss(y_true, y_pred, *, normalize=True, sample_weight=None, labels=None)

Log loss, aka logistic loss or cross-entropy loss.

This is the loss function used in (multinomial) logistic regression and extensions of it such as neural networks, defined as the negative log-likelihood of a logistic model that returns y_pred probabilities for its training data y_true. The log loss is only defined for two or more labels. For a single sample with true label \(y \in \{0,1\}\) and a probability estimate \(p = \operatorname{Pr}(y = 1)\), the log loss is:

\[L_{\log}(y, p) = -(y \log (p) + (1 - y) \log (1 - p))\]
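As a hand check of the formula above, the following minimal sketch evaluates the per-sample loss in plain Python and averages it over the four samples used in the Examples section. The helper name `binary_log_loss` is hypothetical and not part of scikit-learn, and the encoding "spam" = 1 is an assumption that follows the alphabetical label ordering described for y_pred.

```python
import math


def binary_log_loss(y, p):
    """Per-sample log loss for a true label y in {0, 1} and p = Pr(y = 1)."""
    return -(y * math.log(p) + (1 - y) * math.log(1 - p))


# Assumed encoding: "ham" -> 0, "spam" -> 1 (alphabetical order).
y_true = [1, 0, 0, 1]
p_spam = [0.9, 0.1, 0.2, 0.65]

losses = [binary_log_loss(y, p) for y, p in zip(y_true, p_spam)]
mean_loss = sum(losses) / len(losses)
print(round(mean_loss, 5))  # 0.21616
```

Only one of the two terms is active for each sample: the first when y = 1, the second when y = 0.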

Read more in the User Guide.

Parameters:
y_true : array-like or label indicator matrix

Ground truth (correct) labels for n_samples samples.

y_pred : array-like of float, shape = (n_samples, n_classes) or (n_samples,)

Predicted probabilities, as returned by a classifier’s predict_proba method. If y_pred.shape = (n_samples,) the probabilities provided are assumed to be those of the positive class. The labels in y_pred are assumed to be ordered alphabetically, as done by LabelBinarizer.

y_pred values are clipped to [eps, 1-eps] where eps is the machine precision for y_pred’s dtype.
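The two accepted shapes of y_pred can be compared directly. The sketch below assumes the binary case with string labels, where the sorted order is ["ham", "spam"], so "spam" is treated as the positive class; the variable names are illustrative only.

```python
from sklearn.metrics import log_loss

y_true = ["spam", "ham", "ham", "spam"]

# Two-column form: columns follow the sorted label order ["ham", "spam"].
proba_2d = [[0.1, 0.9], [0.9, 0.1], [0.8, 0.2], [0.35, 0.65]]

# One-column form: only the probability of the positive class ("spam").
proba_1d = [row[1] for row in proba_2d]

loss_2d = log_loss(y_true, proba_2d)
loss_1d = log_loss(y_true, proba_1d)

# Both forms describe the same predictions, so the losses should agree.
print(loss_2d, loss_1d)
```

If the columns of a two-dimensional y_pred are not in sorted label order, the loss is computed against the wrong class, so it is worth checking the order against the classifier's classes_ attribute.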

normalize : bool, default=True

If true, return the mean loss per sample. Otherwise, return the sum of the per-sample losses.
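A short sketch of the relationship between the two modes, reusing the data from the Examples section: without sample weights, the summed loss should equal the mean loss multiplied by the number of samples.

```python
from sklearn.metrics import log_loss

y_true = ["spam", "ham", "ham", "spam"]
y_pred = [[0.1, 0.9], [0.9, 0.1], [0.8, 0.2], [0.35, 0.65]]

mean_loss = log_loss(y_true, y_pred)                    # normalize=True (default)
total_loss = log_loss(y_true, y_pred, normalize=False)  # sum over samples

# The sum equals the mean times n_samples when no weights are given.
print(mean_loss, total_loss, mean_loss * len(y_true))
```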

sample_weight : array-like of shape (n_samples,), default=None

Sample weights.
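The sketch below illustrates one way to read sample_weight, under the assumption that with normalize=True the result is the weighted average of the per-sample losses (sum of weight times loss, divided by the sum of weights). The weight values are made up for illustration.

```python
import math

from sklearn.metrics import log_loss

y_true = [1, 0, 0, 1]
p_pos = [0.9, 0.1, 0.2, 0.65]    # probability of the positive class
weights = [2.0, 1.0, 1.0, 0.5]   # illustrative values only

# Per-sample losses computed by hand from the binary formula.
losses = [
    -(y * math.log(p) + (1 - y) * math.log(1 - p))
    for y, p in zip(y_true, p_pos)
]

# Assumed behavior: weighted average of the per-sample losses.
expected = sum(w * l for w, l in zip(weights, losses)) / sum(weights)

weighted = log_loss(y_true, p_pos, sample_weight=weights)
print(weighted, expected)
```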

labels : array-like, default=None

If not provided, labels will be inferred from y_true. If labels is None and y_pred has shape (n_samples,) the labels are assumed to be binary and are inferred from y_true.

Added in version 0.18.
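The labels argument matters most when y_true does not contain every class, for example when scoring a small batch. In the sketch below every sample is "ham", so the class set cannot be inferred from y_true alone; passing labels supplies both the full set of classes and the column order of y_pred. The data is illustrative.

```python
from sklearn.metrics import log_loss

# Every sample in this batch is "ham", so y_true reveals only one class.
y_true = ["ham", "ham"]
y_pred = [[0.9, 0.1], [0.8, 0.2]]

# Without labels, the call is expected to fail because log loss
# needs at least two labels.
try:
    log_loss(y_true, y_pred)
except ValueError as exc:
    print("labels required:", exc)

# Passing labels supplies the full class set and the column order.
loss = log_loss(y_true, y_pred, labels=["ham", "spam"])
print(loss)
```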

Returns:
loss : float

Log loss, aka logistic loss or cross-entropy loss.

Notes

The logarithm used is the natural logarithm (base-e).
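Because the natural logarithm is used, the loss is expressed in nats. A minimal sketch of converting a value to bits, assuming a base-2 figure is wanted for comparison with other tools: divide by ln 2.

```python
import math

p = 0.5                              # predicted probability of the true class
loss_nats = -math.log(p)             # natural logarithm, as used by log_loss
loss_bits = loss_nats / math.log(2)  # the same loss expressed in base 2

print(loss_nats, loss_bits)
```

A prediction of 0.5 for the true class therefore costs about 0.693 nats, which is exactly one bit.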

References

C.M. Bishop (2006). Pattern Recognition and Machine Learning. Springer, p. 209.

Examples

>>> from sklearn.metrics import log_loss
>>> log_loss(["spam", "ham", "ham", "spam"],
...          [[.1, .9], [.9, .1], [.8, .2], [.35, .65]])
0.21616...

Gallery examples

Probability Calibration curves

Probability Calibration for 3-class classification

Plot classification probability

Gradient Boosting Out-of-Bag estimates

Gradient Boosting regularization

Probabilistic predictions with Gaussian process classification (GPC)

Importance of Feature Scaling