Document inference w/ preprocessing #520


Merged
montanalow merged 1 commit into master from montana/docs
Jan 30, 2023
10 changes: 9 additions & 1 deletion pgml-docs/docs/user_guides/training/preprocessing.md
@@ -29,7 +29,7 @@ There are 3 steps to preprocessing data:
These preprocessing steps may be specified on a per-column basis to the [train()](/user_guides/training/overview/) function. By default, PostgresML does minimal preprocessing on training data, and will raise an error during analysis if NULL values are encountered without a preprocessor. All types other than `TEXT` are treated as quantitative variables and cast to floating point representations before passing them to the underlying algorithm implementations.

```postgresql title="pgml.train()"
-select pgml.train(
+SELECT pgml.train(
project_name => 'preprocessed_model',
task => 'classification',
relation_name => 'weather_data',
@@ -52,6 +52,14 @@ In some cases, it may make sense to use multiple steps for a single column. For
!!! note
TEXT is used in this document to also refer to VARCHAR and CHAR(N) types.

## Predicting with Preprocessors

A model trained with preprocessors should be given a Postgres tuple for prediction, rather than a `FLOAT4[]`. A tuple may contain values of different types (like `TEXT` and `BIGINT`), while an ARRAY may contain only a single type. Wrap the values in parentheses to create a Postgres tuple.

```postgresql title="pgml.predict()"
SELECT pgml.predict('preprocessed_model', ('jan', 'nimbus', 0.5, 7));
```
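
As a minimal sketch of the tuple-versus-array distinction, the statements below reuse the model name and values from the example above. The explicit `ROW(...)` constructor is standard Postgres syntax and builds the same tuple as bare parentheses. The final statement is included only to show why an ARRAY does not work for mixed types; it is expected to fail rather than return a result.

```postgresql
-- Equivalent to the parenthesized form: ROW(...) builds the same tuple explicitly.
SELECT pgml.predict('preprocessed_model', ROW('jan', 'nimbus', 0.5, 7));

-- A tuple keeps each value's own type (text and numeric values side by side).
SELECT ROW('jan', 'nimbus', 0.5, 7);

-- An ARRAY must resolve to a single element type, so mixing text with
-- numbers raises an error instead of producing an array.
SELECT ARRAY['jan', 'nimbus', 0.5, 7];
```

The explicit `ROW` keyword can be easier to spot than bare parentheses when reading a query, but either form produces a tuple.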

## Categorical encodings
Encoding categorical variables is an O(N log(M)) operation, where N is the number of rows and M is the number of distinct categories.

1 change: 1 addition & 0 deletions pgml-docs/mkdocs.yml
@@ -127,6 +127,7 @@ nav:
- Training:
- Training Overview: user_guides/training/overview.md
- Algorithm Selection: user_guides/training/algorithm_selection.md
- Preprocessing Data: user_guides/training/preprocessing.md
- Hyperparameter Search: user_guides/training/hyperparameter_search.md
- Joint Optimization: user_guides/training/joint_optimization.md
- Predictions: