
scikit-learn

From Wikipedia, the free encyclopedia

Python library for machine learning

scikit-learn

Original author(s)	David Cournapeau
Developer(s)	Google Summer of Code project
Initial release	June 2007

Stable release	1.7.0^[1] / 6 June 2025

Repository	github.com/scikit-learn/scikit-learn
Written in	Python, Cython, C and C++^[2]
Operating system	Linux, macOS, Windows
Type	Library for machine learning
License	New BSD License
Website	scikit-learn.org

scikit-learn (formerly scikits.learn and also known as sklearn) is a free and open-source machine learning library for the Python programming language.^[3] It features various classification, regression and clustering algorithms including support-vector machines, random forests, gradient boosting, k-means and DBSCAN, and is designed to interoperate with the Python numerical and scientific libraries NumPy and SciPy. Scikit-learn is a NumFOCUS fiscally sponsored project.^[4]

Overview


The scikit-learn project started as scikits.learn, a Google Summer of Code project by French data scientist David Cournapeau. The name of the project derives from its role as a "scientific toolkit for machine learning", originally developed and distributed as a third-party extension to SciPy.^[5] The original codebase was later rewritten by other developers. In 2010, contributors Fabian Pedregosa, Gaël Varoquaux, Alexandre Gramfort and Vincent Michel, from the French Institute for Research in Computer Science and Automation in Saclay, France, took leadership of the project and released the first public version of the library on February 1, 2010.^[6] In November 2012, scikit-learn and scikit-image were described as two of the "well-maintained and popular" scikits libraries.^[7] In 2019, The GitHub Blog noted that scikit-learn was one of the most popular machine learning libraries on GitHub.^[8]

Features


Large catalogue of well-established machine learning algorithms and data pre-processing methods (i.e. feature engineering)
Utility methods for common data-science tasks, such as splitting data into train and test sets, cross-validation and grid search
Consistent way of running machine learning models (estimator.fit() and estimator.predict()), which libraries can implement
Declarative way of structuring a data science process (the Pipeline), including data pre-processing and model fitting
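
The features listed above can be combined in a few lines. The following is a minimal sketch, not an example taken from the project's documentation: the iris dataset, the choice of StandardScaler and SVC as pipeline steps, the step names "scale" and "svc", and the parameter grid are all illustrative assumptions.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Load a small bundled dataset (an illustrative choice).
X, y = load_iris(return_X_y=True)

# Split the data into train and test sets.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

# Declare pre-processing and model fitting as one Pipeline.
pipeline = Pipeline([
    ("scale", StandardScaler()),  # hypothetical step name
    ("svc", SVC()),               # hypothetical step name
])

# Grid search with 5-fold cross-validation over an assumed grid of C values.
param_grid = {"svc__C": [0.1, 1.0, 10.0]}
search = GridSearchCV(pipeline, param_grid, cv=5)
search.fit(X_train, y_train)

# The fitted search object exposes the same estimator interface.
test_accuracy = search.score(X_test, y_test)
print(search.best_params_, test_accuracy)
```

Because the scaler sits inside the pipeline, it is refitted on the training portion of each cross-validation fold, so the held-out fold does not influence the scaling.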

Examples


Fitting a random forest classifier:

>>> from sklearn.ensemble import RandomForestClassifier
>>> classifier = RandomForestClassifier(random_state=0)
>>> X = [[1, 2, 3],  # 2 samples, 3 features
...      [11, 12, 13]]
>>> y = [0, 1]  # classes of each sample
>>> classifier.fit(X, y)
RandomForestClassifier(random_state=0)
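
Once fitted, the same classifier can label unseen samples through predict(). The sketch below repeats the fitting step so that it runs on its own; the values in new_samples are illustrative assumptions chosen to lie close to the two training samples.

```python
from sklearn.ensemble import RandomForestClassifier

# Same toy data as in the fitting example: 2 samples, 3 features.
X = [[1, 2, 3],
     [11, 12, 13]]
y = [0, 1]  # classes of each sample

classifier = RandomForestClassifier(random_state=0)
classifier.fit(X, y)

# Hypothetical unseen samples, one near each training sample.
new_samples = [[4, 5, 6],
               [14, 15, 16]]

# predict() returns one class label per sample;
# predict_proba() returns one row of class probabilities per sample.
predictions = classifier.predict(new_samples)
probabilities = classifier.predict_proba(new_samples)
print(predictions)
print(probabilities)
```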

Implementation


scikit-learn is largely written in Python, and uses NumPy extensively for high-performance linear algebra and array operations. Furthermore, some core algorithms are written in Cython to improve performance. Support vector machines are implemented by a Cython wrapper around LIBSVM; logistic regression and linear support vector machines by a similar wrapper around LIBLINEAR. In such cases, extending these methods with Python may not be possible.
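
From the user's side, the wrapped libraries are reached through ordinary estimators. The sketch below is illustrative and rests on a few assumptions: the synthetic dataset and its size are arbitrary, and logistic regression is routed to LIBLINEAR by passing solver="liblinear" explicitly, since recent releases use a different default solver.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.svm import SVC, LinearSVC

# Synthetic binary classification data (sizes are an arbitrary choice).
X, y = make_classification(n_samples=200, n_features=5, random_state=0)

# Estimators whose fitting is delegated to the wrapped C/C++ libraries.
models = {
    "SVC (LIBSVM)": SVC(kernel="rbf"),
    "LinearSVC (LIBLINEAR)": LinearSVC(),
    "LogisticRegression (LIBLINEAR)": LogisticRegression(solver="liblinear"),
}

# All three share the same fit()/score() interface despite the
# different back ends.
scores = {}
for name, model in models.items():
    model.fit(X, y)
    scores[name] = model.score(X, y)  # accuracy on the training data

for name, score in scores.items():
    print(f"{name}: {score:.3f}")
```

The scores here are computed on the training data, so they only show that each back end was fitted, not how well the models generalize.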

scikit-learn integrates well with many other Python libraries, such as Matplotlib and plotly for plotting, NumPy for array vectorization, Pandas dataframes, SciPy, and many more.

History


scikit-learn was initially developed by David Cournapeau as a Google Summer of Code project in 2007. Later that year, Matthieu Brucher joined the project and started to use it as a part of his thesis work. In 2010, INRIA, the French Institute for Research in Computer Science and Automation, got involved and the first public release (v0.1 beta) was published in late January 2010.

Awards


2019 Inria-French Academy of Sciences-Dassault Systèmes Innovation Prize^[9]
2022 Open Science Award for Open Source Research Software^[10]


References


[1] "Release 1.7.0". 6 June 2025. Retrieved 16 June 2025.
[2] "The scikit-learn Open Source Project on Open Hub: Languages Page". Open Hub. Retrieved 14 July 2018.
[3] Fabian Pedregosa; Gaël Varoquaux; Alexandre Gramfort; Vincent Michel; Bertrand Thirion; Olivier Grisel; Mathieu Blondel; Peter Prettenhofer; Ron Weiss; Vincent Dubourg; Jake Vanderplas; Alexandre Passos; David Cournapeau; Matthieu Perrot; Édouard Duchesnay (2011). "scikit-learn: Machine Learning in Python". Journal of Machine Learning Research. 12: 2825–2830. arXiv:1201.0490. Bibcode:2011JMLR...12.2825P.
[4] "NumFOCUS Sponsored Projects". NumFOCUS. Retrieved 2021-10-25.
[5] Dreijer, Janto. "scikit-learn".
[6] "About us — scikit-learn 0.20.1 documentation". scikit-learn.org.
[7] Eli Bressert (2012). SciPy and NumPy: an overview for developers. O'Reilly. p. 43. ISBN 978-1-4493-6162-4.
[8] "The State of the Octoverse: machine learning". The GitHub Blog. GitHub. 2019-01-24. Retrieved 2019-10-17.
[9] "The 2019 Inria-French Academy of Sciences-Dassault Systèmes Innovation Prize: scikit-learn, a success story for machine learning free software | Inria". www.inria.fr. Retrieved 2025-03-19.
[10] Badolato, Anne-Marie (2022-02-07). "Open Science Awards for Open Source Research Software". Ouvrir la Science. Retrieved 2025-03-19.


Retrieved from "https://en.wikipedia.org/w/index.php?title=Scikit-learn&oldid=1296041713"
