
Commit d0e696c

Document inference w/ preprocessing (#520)

Co-authored-by: Montana Low <montana.low@gmail.com>

1 parent 6f73eaa, commit d0e696c

File tree

2 files changed: +10 -1 lines changed

pgml-docs
- docs/user_guides/training
  - preprocessing.md
- mkdocs.yml

`pgml-docs/docs/user_guides/training/preprocessing.md`

Lines changed: 9 additions & 1 deletion

Original file line number	Diff line number	Diff line change
`@@ -29,7 +29,7 @@ There are 3 steps to preprocessing data:`
`29`	`29`	These preprocessing steps may be specified on a per-column basis to the [train()](/user_guides/training/overview/) function. By default, PostgresML does minimal preprocessing on training data, and will raise an error during analysis if NULL values are encountered without a preprocessor. All types other than `TEXT` are treated as quantitative variables and cast to floating point representations before passing them to the underlying algorithm implementations.
`30`	`30`
`31`	`31`	```postgresql title="pgml.train()"
`32`		`-select pgml.train(`
	`32`	`+SELECT pgml.train(`
`33`	`33`	`project_name => 'preprocessed_model',`
`34`	`34`	`task => 'classification',`
`35`	`35`	`relation_name => 'weather_data',`
`@@ -52,6 +52,14 @@ In some cases, it may make sense to use multiple steps for a single column. For`
`52`	`52`	`!!! note`
`53`	`53`	`TEXT is used in this document to also refer to VARCHAR and CHAR(N) types.`
`54`	`54`
	`55`	`+## Predicting with Preprocessors`
	`56`	`+`
	`57`	+A model that has been trained with preprocessors should use a Postgres tuple for prediction, rather than a `FLOAT4[]`. Tuples may contain multiple different types (like `TEXT` and `BIGINT`), while an ARRAY may only contain a single type. You can use parentheses around values to create a Postgres tuple.
	`58`	`+`
	`59`	+```postgresql title="pgml.predict()"
	`60`	`+ SELECT pgml.predict('preprocessed_model', ('jan', 'nimbus', 0.5, 7));`
	`61`	+```
	`62`	`+`
`55`	`63`	`## Categorical encodings`
`56`	`64`	`Encoding categorical variables is an O(N log(M)) operation, where N is the number of rows and M is the number of distinct categories.`
`57`	`65`
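
The tuple-versus-array distinction described in the added section can be checked with plain PostgreSQL, independent of PostgresML. The following is a minimal sketch rather than part of the commit: the explicit casts are added only to make each field's type visible, and the values simply mirror the `pgml.predict()` example above.

```postgresql
-- A parenthesized list of values is a row constructor (a tuple).
-- Its fields may have different types; ROW(...) is the equivalent explicit form.
SELECT pg_typeof(('jan'::TEXT, 'nimbus'::TEXT, 0.5::FLOAT4, 7::BIGINT));  -- record

-- An array has exactly one element type, so every value is cast to it.
SELECT pg_typeof(ARRAY[0.5, 7]::FLOAT4[]);  -- real[]

-- Mixing text and numbers in one array raises an error, because 'jan'
-- cannot be converted to the numeric element type:
-- SELECT ARRAY['jan', 0.5];
```

This is why a model trained on a mix of `TEXT` and numeric columns needs the tuple form at prediction time: a `FLOAT4[]` has no way to carry the text values.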

`pgml-docs/mkdocs.yml`

Lines changed: 1 addition & 0 deletions

Original file line number	Diff line number	Diff line change
`@@ -127,6 +127,7 @@ nav:`
`127`	`127`	`- Training:`
`128`	`128`	`- Training Overview: user_guides/training/overview.md`
`129`	`129`	`- Algorithm Selection: user_guides/training/algorithm_selection.md`
	`130`	`+ - Preprocessing Data: user_guides/training/preprocessing.md`
`130`	`131`	`- Hyperparameter Search: user_guides/training/hyperparameter_search.md`
`131`	`132`	`- Joint Optimization: user_guides/training/joint_optimization.md`
`132`	`133`	`- Predictions:`

0 commit comments
