CatBoost

From Wikipedia, the free encyclopedia
Open-source software library developed by Yandex
Original author(s): Andrey Gulin[1] / Yandex
Developer(s): Yandex and CatBoost Contributors[2]
Initial release: July 18, 2017[3][4]
Stable release: 1.2.3[5] / February 23, 2024
Written in: Python, R, C++, Java
Operating system: Linux, macOS, Windows
Type: Machine learning
License: Apache License 2.0
Website: catboost.ai

CatBoost[6] is an open-source software library developed by Yandex. It provides a gradient boosting framework which, among other features, handles categorical features using a permutation-driven alternative to the classical algorithm.[7] It runs on Linux, Windows, and macOS, and is available in Python[8] and R.[9] Models built with CatBoost can be used for predictions in C++, Java,[10] C#, Rust, Core ML, ONNX, and PMML. The source code is licensed under the Apache License and available on GitHub.[6]
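
The permutation-driven handling of categorical features can be illustrated with an ordered target statistic, the idea described in the paper cited as [7]. The following is a minimal sketch, not the library's implementation: the function name, the `prior` and `a` smoothing parameters, and the explicit `permutation` argument are assumptions made for illustration. Each example is encoded using only the targets of examples that precede it in a random permutation, so its own target never leaks into its encoding.

```python
import random


def ordered_target_statistic(categories, targets, prior, a=1.0,
                             permutation=None, seed=0):
    """Encode one categorical column with an ordered target statistic.

    Illustrative sketch only. For the example at position i, the encoding is
    (sum of earlier targets with the same category + a * prior) / (count + a),
    where "earlier" means earlier in the given permutation.
    """
    n = len(categories)
    if permutation is None:
        # Hypothetical default: draw one random permutation of the examples.
        permutation = list(range(n))
        random.Random(seed).shuffle(permutation)

    sums = {}    # running sum of targets seen so far, per category
    counts = {}  # running number of examples seen so far, per category
    encoded = [0.0] * n
    for i in permutation:
        c = categories[i]
        s = sums.get(c, 0.0)
        k = counts.get(c, 0)
        # Only examples visited before i contribute to its encoding.
        encoded[i] = (s + a * prior) / (k + a)
        sums[c] = s + targets[i]
        counts[c] = k + 1
    return encoded
```

With `categories = ["a", "b", "a", "a"]`, `targets = [1, 0, 0, 1]`, `prior = 0.5` and the identity permutation, the first occurrence of each category receives the prior, and later occurrences blend the prior with the targets seen so far.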

In 2017, InfoWorld magazine named the library among "The best machine learning tools",[11] along with TensorFlow, PyTorch, XGBoost, and 8 other libraries.

Kaggle listed CatBoost as one of the most frequently used machine learning (ML) frameworks in the world. It ranked in the top 8 most frequently used ML frameworks in the 2020 survey[12] and in the top 7 in the 2021 survey.[13]

As of April 2022, CatBoost was installed about 100,000 times per day from the PyPI repository.[14]

Features

CatBoost has gained popularity compared to other gradient boosting algorithms primarily due to the following features:[15]

  • Native handling for categorical features[16]
  • Fast GPU training[17]
  • Visualizations and tools for model and feature analysis
  • Use of oblivious (symmetric) trees for faster execution
  • Ordered boosting to overcome overfitting[7]
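
The oblivious (symmetric) trees mentioned above apply the same split condition to every node on a given level, so a tree of depth d can be evaluated as d comparisons that together form a d-bit leaf index. The sketch below illustrates this under stated assumptions: the function name and the `splits` / `leaf_values` layout are hypothetical and do not reflect CatBoost's internal model format.

```python
def oblivious_tree_predict(x, splits, leaf_values):
    """Evaluate one oblivious (symmetric) tree on a feature vector x.

    Illustrative sketch only.
    splits: list of (feature_index, threshold) pairs, one per tree level.
    leaf_values: list of 2 ** len(splits) leaf predictions.
    """
    if len(leaf_values) != 2 ** len(splits):
        raise ValueError("leaf_values must have 2 ** depth entries")

    index = 0
    for feature_index, threshold in splits:
        # Every node on this level uses the same test, so the outcome
        # is a single bit appended to the leaf index.
        bit = 1 if x[feature_index] > threshold else 0
        index = (index << 1) | bit
    return leaf_values[index]


# A depth-2 tree: level 0 tests feature 0, level 1 tests feature 1.
splits = [(0, 0.5), (1, 10.0)]
leaf_values = [1.0, 2.0, 3.0, 4.0]
```

For example, `oblivious_tree_predict([0.7, 3.0], splits, leaf_values)` takes the "greater than" branch on the first level and the other branch on the second, which selects leaf index 2. Because the sequence of tests does not depend on earlier outcomes, the lookup reduces to a fixed number of comparisons and one table access.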

History

In 2009 Andrey Gulin developed MatrixNet, a proprietary gradient boosting library that was used at Yandex to rank search results. Since 2009 MatrixNet has been used in different projects at Yandex, including recommendation systems and weather prediction.

In 2014–2015 Andrey Gulin and a team of researchers started a new project called Tensornet, aimed at solving the problem of "how to work with categorical data". It resulted in several proprietary gradient boosting libraries with different approaches to handling categorical data.

In 2016 the Machine Learning Infrastructure team, led by Anna Dorogush, started working on gradient boosting at Yandex, including MatrixNet and Tensornet. They implemented and open-sourced the next version of the gradient boosting library, called CatBoost, which supports categorical and text data, GPU training, model analysis, and visualization tools.

CatBoost was open-sourced in July 2017 and is under active development by Yandex and the open-source community.

Application

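  • JetBrains uses CatBoost for code completion.[18]
  • Cloudflare uses CatBoost for bot detection.[19]
  • Careem uses CatBoost to predict the future destinations of rides.[20]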

References

  1. "Andrey Gulin - People - Research at Yandex". research.yandex.com.
  2. "catboost/catboost". GitHub.
  3. "Yandex open sources CatBoost, a gradient boosting machine learning library". TechCrunch. 18 July 2017. Retrieved 2020-08-30.
  4. Yegulalp, Serdar (2017-07-18). "Yandex open sources CatBoost machine learning library". InfoWorld. Retrieved 2020-08-30.
  5. "Releases · catboost/catboost". GitHub. Retrieved 2024-03-14.
  6. "catboost/catboost". August 30, 2020 – via GitHub.
  7. Prokhorenkova, Liudmila; Gusev, Gleb; Vorobev, Aleksandr; Dorogush, Anna Veronika; Gulin, Andrey (2019-01-20). "CatBoost: unbiased boosting with categorical features". arXiv:1706.09516 [cs.LG].
  8. "Python Package Index PYPI: catboost". Retrieved 2020-08-20.
  9. "Conda force package catboost-r". Retrieved 2020-08-30.
  10. "Maven Repository: ai.catboost » catboost-prediction". mvnrepository.com. Retrieved 2020-08-30.
  11. InfoWorld staff (27 September 2017). "Bossie Awards 2017: The best machine learning tools". InfoWorld.
  12. "State of Data Science and Machine Learning 2020".
  13. "State of Data Science and Machine Learning 2021".
  14. "PyPI Stats catboost". PyPI Stats.
  15. Joseph, Manu (2020-02-29). "The Gradient Boosters V: CatBoost". Deep & Shallow. Retrieved 2020-08-30.
  16. Dorogush, Anna Veronika; Ershov, Vasily; Gulin, Andrey (2018-10-24). "CatBoost: gradient boosting with categorical features support". arXiv:1810.11363 [cs.LG].
  17. "CatBoost Enables Fast Gradient Boosting on Decision Trees Using GPUs". NVIDIA Developer Blog. 2018-12-13. Retrieved 2020-08-30.
  18. "Code Completion, Episode 4: Model Training". JetBrains Developer Blog. 2021-08-20.
  19. "Stop the Bots: Practical Lessons in Machine Learning". The Cloudflare Blog. 2019-02-20.
  20. "How Careem's Destination Prediction Service speeds up your ride". Careem. 2019-02-19.

Retrieved from "https://en.wikipedia.org/w/index.php?title=CatBoost&oldid=1277472721"