May 2, 2024 · May 3, 2024 · May 5, 2024 · May 5, 2024 · May 6, 2024 · May 3, 2024
diff --git a/pgml-cms/docs/api/sql-extension/pgml.train/README.md b/pgml-cms/docs/api/sql-extension/pgml.train/README.md
 | `task`          | `'regression'`                                        | The objective of the experiment: `regression`, `classification` or `cluster`                                                                                                                                                                                                                 |
 | `relation_name` | `'public.search_logs'`                                | The Postgres table or view where the training data is stored or defined.                                                                                                                                                                                                                     |
 | `y_column_name` | `'clicked'`                                           | The name of the label (aka "target" or "unknown") column in the training table.                                                                                                                                                                                                              |
-| `algorithm`     | `'xgboost'`                                           | <p>The algorithm to train on the dataset. |
+| `algorithm`     | `'xgboost'`                                           | <p>The algorithm to train on the dataset.</p> |
 | `hyperparams`   | `{ "n_estimators": 25 }`                              | The hyperparameters to pass to the algorithm for training, JSON formatted.                                                                                                                                                                                                                   |
 | `search`        | `grid`                                                | If set, PostgresML will perform a hyperparameter search to find the best hyperparameters for the algorithm. See [hyperparameter-search.md](hyperparameter-search.md "mention") for details.                                                                                                  |
 | `search_params` | `{ "n_estimators": [5, 10, 25, 100] }`                | Search parameters used in the hyperparameter search, using the scikit-learn notation, JSON formatted.                                                                                                                                                                                        |
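A minimal sketch of how the arguments in the table above might be combined into a single call, using the example values from the table. The project name is hypothetical, the `project_name` argument name is an assumption (it is not shown in this hunk), and `public.search_logs` with a `clicked` column is the illustrative table from the rows above, not a dataset that ships with PostgresML.

```sql
-- Sketch only: assumes a table public.search_logs with a label column "clicked".
-- 'Search Click Model' is a hypothetical project name.
SELECT * FROM pgml.train(
    project_name  => 'Search Click Model',
    task          => 'regression',
    relation_name => 'public.search_logs',
    y_column_name => 'clicked',
    algorithm     => 'xgboost',
    -- grid search over the candidate values listed in search_params
    search        => 'grid',
    search_params => '{ "n_estimators": [5, 10, 25, 100] }'
);
```

With `search` set, each candidate in `search_params` is tried in turn; to train a single fixed configuration instead, drop `search` and `search_params` and pass `hyperparams => '{ "n_estimators": 25 }'`.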
diff --git a/pgml-cms/docs/api/sql-extension/pgml.train/classification.md b/pgml-cms/docs/api/sql-extension/pgml.train/classification.md

 ## Example

-This example trains models on the sklearn digits dataset which is a copy of the test set of the [UCI ML hand-written digits datasets](https://archive.ics.uci.edu/ml/datasets/Optical+Recognition+of+Handwritten+Digits target="_blank"). This demonstrates using a table with a single array feature column for classification. You could do something similar with a vector column.
+This example trains models on the sklearn digits dataset which is a copy of the test set of the [UCI ML hand-written digits datasets](https://archive.ics.uci.edu/ml/datasets/Optical+Recognition+of+Handwritten+Digits). This demonstrates using a table with a single array feature column for classification. You could do something similar with a vector column.

 ```sql
 -- load the sklearn digits dataset

 ## Algorithms

-We currently support classification algorithms from [scikit-learn](https://scikit-learn.org/ target="_blank"), [XGBoost](https://xgboost.readthedocs.io/ target="_blank"), [LightGBM](https://lightgbm.readthedocs.io/ target="_blank") and [Catboost](https://catboost.ai/ target="_blank").
+We currently support classification algorithms from [scikit-learn](https://scikit-learn.org/), [XGBoost](https://xgboost.readthedocs.io/), [LightGBM](https://lightgbm.readthedocs.io/) and [Catboost](https://catboost.ai/).

 ### Gradient Boosting

 | Algorithm               | Reference                                                                                                                  |
 | ----------------------- | -------------------------------------------------------------------------------------------------------------------------- |
-| `xgboost`               | [XGBClassifier](https://xgboost.readthedocs.io/en/stable/python/python\_api.html#xgboost.XGBClassifier target="_blank")                    |
-| `xgboost_random_forest` | [XGBRFClassifier](https://xgboost.readthedocs.io/en/stable/python/python\_api.html#xgboost.XGBRFClassifier target="_blank")                |
-| `lightgbm`              | [LGBMClassifier](https://lightgbm.readthedocs.io/en/latest/pythonapi/lightgbm.LGBMClassifier.html#lightgbm.LGBMClassifier target="_blank") |
-| `catboost`              | [CatBoostClassifier](https://catboost.ai/en/docs/concepts/python-reference\_catboostclassifier target="_blank")                            |
+| `xgboost`               | [XGBClassifier](https://xgboost.readthedocs.io/en/stable/python/python\_api.html#xgboost.XGBClassifier)                    |
+| `xgboost_random_forest` | [XGBRFClassifier](https://xgboost.readthedocs.io/en/stable/python/python\_api.html#xgboost.XGBRFClassifier)                |
+| `lightgbm`              | [LGBMClassifier](https://lightgbm.readthedocs.io/en/latest/pythonapi/lightgbm.LGBMClassifier.html#lightgbm.LGBMClassifier) |
+| `catboost`              | [CatBoostClassifier](https://catboost.ai/en/docs/concepts/python-reference\_catboostclassifier)                            |

 #### Examples


 | Algorithm                 | Reference                                                                                                                                |
 | ------------------------- | ---------------------------------------------------------------------------------------------------------------------------------------- |
-| `ada_boost`               | [AdaBoostClassifier](https://scikit-learn.org/stable/modules/generated/sklearn.ensemble.AdaBoostClassifier.html target="_blank")                         |
-| `bagging`                 | [BaggingClassifier](https://scikit-learn.org/stable/modules/generated/sklearn.ensemble.BaggingClassifier.html target="_blank")                           |
-| `extra_trees`             | [ExtraTreesClassifier](https://scikit-learn.org/stable/modules/generated/sklearn.ensemble.ExtraTreesClassifier.html target="_blank")                     |
-| `gradient_boosting_trees` | [GradientBoostingClassifier](https://scikit-learn.org/stable/modules/generated/sklearn.ensemble.GradientBoostingClassifier.html target="_blank")         |
-| `random_forest`           | [RandomForestClassifier](https://scikit-learn.org/stable/modules/generated/sklearn.ensemble.RandomForestClassifier.html target="_blank")                 |
-| `hist_gradient_boosting`  | [HistGradientBoostingClassifier](https://scikit-learn.org/stable/modules/generated/sklearn.ensemble.HistGradientBoostingClassifier.html target="_blank") |
+| `ada_boost`               | [AdaBoostClassifier](https://scikit-learn.org/stable/modules/generated/sklearn.ensemble.AdaBoostClassifier.html)                         |
+| `bagging`                 | [BaggingClassifier](https://scikit-learn.org/stable/modules/generated/sklearn.ensemble.BaggingClassifier.html)                           |
+| `extra_trees`             | [ExtraTreesClassifier](https://scikit-learn.org/stable/modules/generated/sklearn.ensemble.ExtraTreesClassifier.html)                     |
+| `gradient_boosting_trees` | [GradientBoostingClassifier](https://scikit-learn.org/stable/modules/generated/sklearn.ensemble.GradientBoostingClassifier.html)         |
+| `random_forest`           | [RandomForestClassifier](https://scikit-learn.org/stable/modules/generated/sklearn.ensemble.RandomForestClassifier.html)                 |
+| `hist_gradient_boosting`  | [HistGradientBoostingClassifier](https://scikit-learn.org/stable/modules/generated/sklearn.ensemble.HistGradientBoostingClassifier.html) |

 #### Examples


 | Algorithm    | Reference                                                                                 |
 | ------------ | ----------------------------------------------------------------------------------------- |
-| `svm`        | [SVC](https://scikit-learn.org/stable/modules/generated/sklearn.svm.SVC.html target="_blank")             |
-| `nu_svm`     | [NuSVC](https://scikit-learn.org/stable/modules/generated/sklearn.svm.NuSVC.html target="_blank")         |
-| `linear_svm` | [LinearSVC](https://scikit-learn.org/stable/modules/generated/sklearn.svm.LinearSVC.html target="_blank") |
+| `svm`        | [SVC](https://scikit-learn.org/stable/modules/generated/sklearn.svm.SVC.html)             |
+| `nu_svm`     | [NuSVC](https://scikit-learn.org/stable/modules/generated/sklearn.svm.NuSVC.html)         |
+| `linear_svm` | [LinearSVC](https://scikit-learn.org/stable/modules/generated/sklearn.svm.LinearSVC.html) |

 #### Examples


 | Algorithm                     | Reference                                                                                                                               |
 | ----------------------------- | --------------------------------------------------------------------------------------------------------------------------------------- |
-| `linear`                      | [LogisticRegression](https://scikit-learn.org/stable/modules/generated/sklearn.linear\_model.LogisticRegression.html target="_blank")                   |
-| `ridge`                       | [RidgeClassifier](https://scikit-learn.org/stable/modules/generated/sklearn.linear\_model.RidgeClassifier.html target="_blank")                         |
-| `stochastic_gradient_descent` | [SGDClassifier](https://scikit-learn.org/stable/modules/generated/sklearn.linear\_model.SGDClassifier.html target="_blank")                             |
-| `perceptron`                  | [Perceptron](https://scikit-learn.org/stable/modules/generated/sklearn.linear\_model.Perceptron.html target="_blank")                                   |
-| `passive_aggressive`          | [PassiveAggressiveClassifier](https://scikit-learn.org/stable/modules/generated/sklearn.linear\_model.PassiveAggressiveClassifier.html target="_blank") |
+| `linear`                      | [LogisticRegression](https://scikit-learn.org/stable/modules/generated/sklearn.linear\_model.LogisticRegression.html)                   |
+| `ridge`                       | [RidgeClassifier](https://scikit-learn.org/stable/modules/generated/sklearn.linear\_model.RidgeClassifier.html)                         |
+| `stochastic_gradient_descent` | [SGDClassifier](https://scikit-learn.org/stable/modules/generated/sklearn.linear\_model.SGDClassifier.html)                             |
+| `perceptron`                  | [Perceptron](https://scikit-learn.org/stable/modules/generated/sklearn.linear\_model.Perceptron.html)                                   |
+| `passive_aggressive`          | [PassiveAggressiveClassifier](https://scikit-learn.org/stable/modules/generated/sklearn.linear\_model.PassiveAggressiveClassifier.html) |

 #### Examples


 | Algorithm          | Reference                                                                                                                               |
 | ------------------ | --------------------------------------------------------------------------------------------------------------------------------------- |
-| `gaussian_process` | [GaussianProcessClassifier](https://scikit-learn.org/stable/modules/generated/sklearn.gaussian\_process.GaussianProcessClassifier.html target="_blank") |
+| `gaussian_process` | [GaussianProcessClassifier](https://scikit-learn.org/stable/modules/generated/sklearn.gaussian\_process.GaussianProcessClassifier.html) |

 #### Examples
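
As a hedged sketch of how the `gaussian_process` entry in the table above might be used: the call follows the shape of the other examples in this file (`pgml.train('Handwritten Digits', algorithm => ...)`) and assumes the `Handwritten Digits` project was already created by an earlier `pgml.train` call that set the task, relation and label column.

```sql
-- Assumes the 'Handwritten Digits' project already exists, so only the
-- algorithm needs to be named; task, relation and label are reused.
SELECT * FROM pgml.train('Handwritten Digits', algorithm => 'gaussian_process');
```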

diff --git a/pgml-cms/docs/api/sql-extension/pgml.train/clustering.md b/pgml-cms/docs/api/sql-extension/pgml.train/clustering.md

 ## Example

-This example trains models on the sklearn digits dataset -- which is a copy of the test set of the [UCI ML hand-written digits datasets](https://archive.ics.uci.edu/ml/datasets/Optical+Recognition+of+Handwritten+Digits target="_blank"). This demonstrates using a table with a single array feature column for clustering. You could do something similar with a vector column.
+This example trains models on the sklearn digits dataset -- which is a copy of the test set of the [UCI ML hand-written digits datasets](https://archive.ics.uci.edu/ml/datasets/Optical+Recognition+of+Handwritten+Digits). This demonstrates using a table with a single array feature column for clustering. You could do something similar with a vector column.

 ```sql
 SELECT pgml.load_dataset('digits');

 | Algorithm              | Reference                                                                                                         |
 | ---------------------- | ----------------------------------------------------------------------------------------------------------------- |
-| `affinity_propagation` | [AffinityPropagation](https://scikit-learn.org/stable/modules/generated/sklearn.cluster.AffinityPropagation.html target="_blank") |
-| `birch`                | [Birch](https://scikit-learn.org/stable/modules/generated/sklearn.cluster.Birch.html target="_blank")                             |
-| `kmeans`               | [K-Means](https://scikit-learn.org/stable/modules/generated/sklearn.cluster.KMeans.html target="_blank")                          |
-| `mini_batch_kmeans`    | [MiniBatchKMeans](https://scikit-learn.org/stable/modules/generated/sklearn.cluster.MiniBatchKMeans.html target="_blank")         |
+| `affinity_propagation` | [AffinityPropagation](https://scikit-learn.org/stable/modules/generated/sklearn.cluster.AffinityPropagation.html) |
+| `birch`                | [Birch](https://scikit-learn.org/stable/modules/generated/sklearn.cluster.Birch.html)                             |
+| `kmeans`               | [K-Means](https://scikit-learn.org/stable/modules/generated/sklearn.cluster.KMeans.html)                          |
+| `mini_batch_kmeans`    | [MiniBatchKMeans](https://scikit-learn.org/stable/modules/generated/sklearn.cluster.MiniBatchKMeans.html)         |

 ### Examples