
ORF: OCaml Random Forests


Sources

v1.0.1.tar.gz
sha256=7e3977bf99284fca63144dad27bdb5f024e59425188b58246b89bf4770f43791

Description

Random Forests (RFs) can do classification or regression modeling.

Random Forests are one of the workhorses of modern machine learning. Notably, they cannot over-fit the training set, are fast to train, predict quickly, parallelize well, and give you a reasonable model even without tuning the default hyper-parameters. In other words, it is hard to shoot yourself in the foot while training or exploiting a Random Forests model. By comparison, with deep neural networks it is very easy to shoot yourself in the foot.

Using out-of-bag (OOB) samples, you can even get an estimate of a RF's performance without the need for a held-out (test) data set.
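
Since ORF's public API is not shown on this page, here is a minimal, self-contained sketch of the idea behind OOB evaluation; the function below is hypothetical, not ORF's. Each tree is trained on a bootstrap sample, and the rows that sample never drew act as a free validation set for that tree.

```ocaml
(* Sketch of the out-of-bag (OOB) idea; hypothetical code, not ORF's
   actual API. Each tree of the forest is trained on a bootstrap
   sample (n rows drawn with replacement); rows never drawn are
   "out of bag" for that tree and can be used to estimate its
   generalization error without a held-out test set. *)

(* Draw a bootstrap sample of [n] row indices; also return the
   indices that were left out (the OOB rows for this tree). *)
let bootstrap_and_oob rng n =
  let in_bag = Array.make n false in
  let sample =
    Array.init n (fun _ ->
        let i = Random.State.int rng n in
        in_bag.(i) <- true;
        i)
  in
  let oob = List.filter (fun i -> not in_bag.(i)) (List.init n (fun i -> i)) in
  (sample, oob)

let () =
  let rng = Random.State.make [| 42 |] in
  let _sample, oob = bootstrap_and_oob rng 1000 in
  (* On average, ~36.8% (1/e) of the rows are OOB for any given tree. *)
  Printf.printf "OOB rows for this tree: %d / 1000\n" (List.length oob)
```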

Their only drawback is that RFs, being an ensemble model, cannot predict values outside the range of values seen in the training set (a serious limitation if you are trying to optimize or minimize something in order to discover outliers relative to your training-set samples).
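
To see why, recall that each regression tree predicts an average of the training targets falling into a leaf, and the forest averages tree outputs, so every prediction is bounded by the minimum and maximum training target. A deliberately tiny illustration (not ORF code):

```ocaml
(* A one-leaf "tree" makes the bound obvious: its prediction is the
   mean of the training targets, so no input, however extreme, can
   produce a value outside [min target, max target]. *)
let mean ys =
  Array.fold_left (+.) 0.0 ys /. float_of_int (Array.length ys)

let () =
  let train_targets = [| 1.0; 2.0; 3.0 |] in
  let predict (_x : float) = mean train_targets in
  (* x = 1000.0 is far outside anything seen in training, yet the
     prediction stays within [1.0, 3.0]. *)
  Printf.printf "f(1000.0) = %g\n" (predict 1000.0)
```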

For the moment, this implementation only considers a sparse vector of integers as features; i.e., categorical variables will need to be one-hot encoded. For classification, the dependent variable must be an integer (encoding a class label). For regression, the dependent variable must be a float.
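
As an illustration of the required encoding (the sparse (index, value) representation below is an assumption for the sketch, not ORF's documented input format), a categorical variable with k levels becomes k integer dimensions, exactly one of which is set to 1:

```ocaml
(* Hypothetical one-hot encoding into a sparse (feature_index, value)
   list of integers; ORF's exact input format may differ. A categorical
   variable with k possible values occupies k dimensions starting at
   [offset]; only the active dimension is stored. *)
let categories = [| "red"; "green"; "blue" |]

let one_hot offset cat =
  let k = Array.length categories in
  let rec find i =
    if i >= k then invalid_arg ("unknown category: " ^ cat)
    else if String.equal categories.(i) cat then i
    else find (i + 1)
  in
  [ (offset + find 0, 1) ]  (* single non-zero entry *)

let () =
  (* "green" -> feature index 1 (offset 0), value 1 *)
  one_hot 0 "green"
  |> List.iter (fun (i, v) -> Printf.printf "%d:%d " i v);
  print_newline ()
```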

Bibliography

Breiman, Leo. (1996). "Bagging Predictors". Machine Learning, 24(2), 123-140.

Breiman, Leo. (2001). "Random Forests". Machine Learning, 45(1), 5-32.

Geurts, P., Ernst, D., & Wehenkel, L. (2006). "Extremely Randomized Trees". Machine Learning, 63(1), 3-42.

Published: 03 Jul 2024

Dependencies (9)

  1. line_oriented
  2. parany >= "11.0.0"
  3. ocaml >= "4.12"
  4. molenc >= "16.15.0"
  5. minicli
  6. dune >= "2.8"
  7. dolog >= "4.0.0"
  8. cpm >= "11.0.0"
  9. batteries >= "3.2.0"

Dev Dependencies

None

Used by

None

Conflicts

None

