In machine learning (ML), a learning curve (or training curve) is a graphical representation that shows how a model's performance on a training set (and usually a validation set) changes with the number of training iterations (epochs) or the amount of training data.[1] Typically, the number of training epochs or training set size is plotted on the x-axis, and the value of the loss function (and possibly some other metric such as the cross-validation score) on the y-axis.
More abstractly, learning curves plot the relationship between learning effort and predictive performance, where "learning effort" usually means the number of training samples, and "predictive performance" means accuracy on testing samples.[3]
Learning curves have many useful purposes in ML, including:[4][5][6]
choosing model parameters during design,
adjusting optimization to improve convergence,
and diagnosing problems such as overfitting (or underfitting).
Learning curves can also be tools for determining how much a model benefits from adding more training data, and whether the model suffers more from a variance error or a bias error. If both the validation score and the training score converge to a certain value, then the model will no longer significantly benefit from more training data.[7]
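As a minimal sketch of a learning curve over training set size, the following Python example fits a one-variable least-squares line on growing prefixes of a synthetic training set and reports the training and validation error for each size. Everything in it is an assumption made for illustration: the synthetic data (y = 1 + 2x plus Gaussian noise), the helper names `fit_line`, `mse`, `make_data` and `size_learning_curve`, and the use of mean squared error, where lower is better, in place of a score.

```python
import random

def fit_line(xs, ys):
    """Closed-form ordinary least squares for y = b + w * x."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    w = sxy / sxx
    return my - w * mx, w

def mse(model, xs, ys):
    """Mean squared error of the line (b, w) on the given samples."""
    b, w = model
    return sum((b + w * x - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

def make_data(rng, n, noise=0.2):
    """Hypothetical synthetic data: y = 1 + 2x + Gaussian noise."""
    xs = [rng.uniform(-1, 1) for _ in range(n)]
    ys = [1.0 + 2.0 * x + rng.gauss(0, noise) for x in xs]
    return xs, ys

def size_learning_curve(sizes, seed=0):
    """Return (n, training MSE, validation MSE) for each training set size n."""
    rng = random.Random(seed)
    train_x, train_y = make_data(rng, max(sizes))
    val_x, val_y = make_data(rng, 500)  # fixed held-out validation set
    rows = []
    for n in sizes:
        # Train on the first n samples only, then evaluate on both sets.
        model = fit_line(train_x[:n], train_y[:n])
        rows.append((n,
                     mse(model, train_x[:n], train_y[:n]),
                     mse(model, val_x, val_y)))
    return rows

rows = size_learning_curve([2, 5, 10, 50, 200])
for n, train_err, val_err in rows:
    print(f"n={n:4d}  train MSE={train_err:.4f}  validation MSE={val_err:.4f}")
```

Under these assumptions the two errors should approach each other as n grows, which is the situation in which more data brings little further benefit. A common reading of such curves is that a persistent gap between training and validation error points toward a variance problem, while two curves that meet at a poor value point toward a bias problem.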
When creating a function to approximate the distribution of some data, it is necessary to define a loss function $L(f_\theta(X), Y)$ to measure how good the model output is (e.g., accuracy for classification tasks or mean squared error for regression). We then define an optimization process which finds model parameters $\theta$ such that $L(f_\theta(X), Y)$ is minimized, referred to as $\theta^*$.
Many optimization algorithms are iterative, repeating the same step (such as backpropagation) until the process converges to an optimal value. Gradient descent is one such algorithm. If $\theta_i^*$ is the approximation of the optimal $\theta$ after $i$ steps, a learning curve is the plot of $L(f_{\theta_i^*}(X), Y)$.
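The iterative view can be illustrated with a short Python sketch that runs batch gradient descent on a one-variable linear model and records the loss after every step; the recorded list is the learning curve over iterations. The synthetic data, the learning rate, the step count and the names `mse` and `gradient_descent_curve` are assumptions chosen for the example, not part of any particular library.

```python
import random

def mse(theta, xs, ys):
    """Mean squared error of the linear model f(x) = theta[0] + theta[1] * x."""
    b, w = theta
    return sum((b + w * x - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

def gradient_descent_curve(xs, ys, steps=200, lr=0.05):
    """Run batch gradient descent; return the final parameters and the loss per step."""
    b, w = 0.0, 0.0
    n = len(xs)
    curve = [mse((b, w), xs, ys)]  # loss before any update (step 0)
    for _ in range(steps):
        # Gradients of the MSE with respect to the intercept b and the slope w.
        grad_b = sum(2 * (b + w * x - y) for x, y in zip(xs, ys)) / n
        grad_w = sum(2 * (b + w * x - y) * x for x, y in zip(xs, ys)) / n
        b -= lr * grad_b
        w -= lr * grad_w
        curve.append(mse((b, w), xs, ys))  # loss after this step
    return (b, w), curve

# Hypothetical synthetic data: y = 1 + 2x + small Gaussian noise.
rng = random.Random(0)
xs = [rng.uniform(-1, 1) for _ in range(50)]
ys = [1.0 + 2.0 * x + rng.gauss(0, 0.1) for x in xs]

theta, curve = gradient_descent_curve(xs, ys)
for i in (0, 10, 50, 200):
    print(f"step {i:3d}: loss = {curve[i]:.4f}")
```

Plotting `curve` against the step index gives the training curve for this run. With a learning rate this small relative to the curvature of the loss, the recorded values should fall steadily and flatten out near the noise level of the data.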
Mohr, Felix and van Rijn, Jan N. "Learning Curves for Decision Making in Supervised Machine Learning - A Survey." arXiv preprint arXiv:2201.12150 (2022).