
Learning curve (machine learning)

From Wikipedia, the free encyclopedia
Plot of machine learning model performance over time or experience.

[Figure: learning curve plot of training set size vs. training score (loss) and cross-validation score]

In machine learning (ML), a learning curve (or training curve) is a graphical representation that shows how a model's performance on a training set (and usually a validation set) changes with the number of training iterations (epochs) or with the amount of training data.[1] Typically, the number of training epochs or the training set size is plotted on the x-axis, and the value of the loss function (and possibly some other metric, such as the cross-validation score) on the y-axis.

Synonyms include error curve, experience curve, improvement curve and generalization curve.[2]

More abstractly, learning curves plot predictive performance as a function of learning effort, where "learning effort" usually means the number of training samples and "predictive performance" means accuracy on test samples.[3]

Learning curves have many useful purposes in ML, including:[4][5][6]

  • choosing model parameters during design,
  • adjusting optimization to improve convergence,
  • and diagnosing problems such as overfitting (or underfitting).

Learning curves can also be tools for determining how much a model benefits from adding more training data, and whether the model suffers more from a variance error or a bias error. If both the validation score and the training score converge to a certain value, then the model will no longer significantly benefit from more training data.[7]

Formal definition


When creating a function to approximate the distribution of some data, it is necessary to define a loss function $L(f_{\theta}(X), Y)$ to measure how good the model output is (e.g., accuracy for classification tasks or mean squared error for regression). We then define an optimization process which finds model parameters $\theta$ such that $L(f_{\theta}(X), Y)$ is minimized; these optimal parameters are referred to as $\theta^{*}$.
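As a concrete sketch of this definition (on hypothetical data, not an example from the source), one can take mean squared error as the loss $L$ and an ordinary least-squares solve as the optimization process that produces $\theta^{*}$ for a linear model:

```python
import numpy as np

def mse_loss(y_pred, y_true):
    """L(f_theta(X), Y): mean squared error between predictions and targets."""
    return float(np.mean((y_pred - y_true) ** 2))

def fit_theta_star(X, Y):
    """theta*: parameters minimizing the MSE of the linear model f_theta(x) = X @ theta."""
    theta, *_ = np.linalg.lstsq(X, Y, rcond=None)
    return theta

# Synthetic data: y = 2 + 3x plus a little noise.
rng = np.random.default_rng(0)
X = np.column_stack([np.ones(50), rng.uniform(-1, 1, 50)])  # bias column + one feature
Y = 2.0 + 3.0 * X[:, 1] + rng.normal(0, 0.1, 50)

theta_star = fit_theta_star(X, Y)
train_loss = mse_loss(X @ theta_star, Y)
```

Here `mse_loss` plays the role of $L$ and `fit_theta_star` the role of the optimization process; both names are illustrative, not from the source.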

Training curve for amount of data


If the training data is

$\{x_1, x_2, \dots, x_n\}, \{y_1, y_2, \dots, y_n\}$

and the validation data is

$\{x_1', x_2', \dots, x_m'\}, \{y_1', y_2', \dots, y_m'\}$,

a learning curve is the plot of the two curves

  1. $i \mapsto L(f_{\theta^{*}(X_i, Y_i)}(X_i), Y_i)$
  2. $i \mapsto L(f_{\theta^{*}(X_i, Y_i)}(X_i'), Y_i')$

where $X_i = \{x_1, x_2, \dots, x_i\}$, and $Y_i$, $X_i'$, $Y_i'$ are defined analogously.
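A minimal sketch of these two curves (NumPy only, with synthetic data that is not from the source): for each training-set size $i$, fit $\theta^{*}(X_i, Y_i)$ on the first $i$ samples, then evaluate the loss on both the training subset and the validation set:

```python
import numpy as np

def make_data(n, rng):
    """Noisy linear data y = 1 + 2x; returns a design matrix (bias + feature) and targets."""
    x = rng.uniform(-1, 1, n)
    X = np.column_stack([np.ones(n), x])
    Y = 1.0 + 2.0 * x + rng.normal(0, 0.2, n)
    return X, Y

rng = np.random.default_rng(1)
X, Y = make_data(100, rng)    # training pool {x_1..x_n}, {y_1..y_n}
Xv, Yv = make_data(30, rng)   # validation data {x_1'..x_m'}, {y_1'..y_m'}

train_curve, val_curve = [], []
sizes = range(5, 101, 5)
for i in sizes:
    Xi, Yi = X[:i], Y[:i]                                # X_i, Y_i: first i samples
    theta, *_ = np.linalg.lstsq(Xi, Yi, rcond=None)      # theta*(X_i, Y_i)
    train_curve.append(np.mean((Xi @ theta - Yi) ** 2))  # curve 1: loss on X_i, Y_i
    val_curve.append(np.mean((Xv @ theta - Yv) ** 2))    # curve 2: loss on validation data
```

As the section on diagnostics suggests, both curves should converge toward the noise floor of the data as $i$ grows, at which point additional samples no longer help.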

Training curve for number of iterations


Many optimization algorithms are iterative, repeating the same step (such as backpropagation) until the process converges to an optimal value; gradient descent is one such algorithm. If $\theta_i^{*}$ is the approximation of the optimal $\theta$ after $i$ steps, a learning curve is the plot of

  1. $i \mapsto L(f_{\theta_i^{*}(X, Y)}(X), Y)$
  2. $i \mapsto L(f_{\theta_i^{*}(X, Y)}(X'), Y')$
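The iteration-based curves can be sketched with plain gradient descent on the same kind of linear least-squares problem (NumPy only; the data, step size, and iteration count are illustrative assumptions, not from the source):

```python
import numpy as np

# Synthetic training and validation data: y = 1 + 2x plus noise.
rng = np.random.default_rng(2)
x = rng.uniform(-1, 1, 80)
X = np.column_stack([np.ones(80), x])
Y = 1.0 + 2.0 * x + rng.normal(0, 0.2, 80)
xv = rng.uniform(-1, 1, 30)
Xv = np.column_stack([np.ones(30), xv])
Yv = 1.0 + 2.0 * xv + rng.normal(0, 0.2, 30)

theta = np.zeros(2)   # initial parameters before any steps
lr = 0.1              # step size
train_curve, val_curve = [], []
for step in range(200):
    grad = 2 * X.T @ (X @ theta - Y) / len(Y)           # gradient of the MSE loss
    theta -= lr * grad                                  # theta_i -> theta_{i+1}
    train_curve.append(np.mean((X @ theta - Y) ** 2))   # curve 1: loss on X, Y
    val_curve.append(np.mean((Xv @ theta - Yv) ** 2))   # curve 2: loss on X', Y'
```

Plotting `train_curve` and `val_curve` against the step index $i$ gives the two learning curves defined above.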


References

  1. ^ Mohr, Felix; van Rijn, Jan N. (2022). "Learning Curves for Decision Making in Supervised Machine Learning – A Survey". arXiv:2201.12150.
  2. ^ Viering, Tom; Loog, Marco (2023-06-01). "The Shape of Learning Curves: A Review". IEEE Transactions on Pattern Analysis and Machine Intelligence. 45 (6): 7799–7819. arXiv:2103.10948. Bibcode:2023ITPAM..45.7799V. doi:10.1109/TPAMI.2022.3220744. ISSN 0162-8828. PMID 36350870.
  3. ^ Perlich, Claudia (2010), "Learning Curves in Machine Learning", in Sammut, Claude; Webb, Geoffrey I. (eds.), Encyclopedia of Machine Learning, Boston, MA: Springer US, pp. 577–580, doi:10.1007/978-0-387-30164-8_452, ISBN 978-0-387-30164-8. Retrieved 2023-07-06.
  4. ^ Madhavan, P.G. (1997). "A New Recurrent Neural Network Learning Algorithm for Time Series Prediction" (PDF). Journal of Intelligent Systems. p. 113, Fig. 3.
  5. ^ "Machine Learning 102: Practical Advice". Tutorial: Machine Learning for Astronomy with Scikit-learn. Archived from the original on 2012-07-30. Retrieved 2019-02-15.
  6. ^ Meek, Christopher; Thiesson, Bo; Heckerman, David (Summer 2002). "The Learning-Curve Sampling Method Applied to Model-Based Clustering". Journal of Machine Learning Research. 2 (3): 397. Archived from the original on 2013-07-15.
  7. ^ scikit-learn developers. "Validation curves: plotting scores to evaluate models — scikit-learn 0.20.2 documentation". Retrieved 2019-02-15.