
add product to nav #1545


Merged
chillenberger merged 21 commits into master from dan-korvus-nav-update
Jul 10, 2024
1c71108 add product to nav (chillenberger, Jun 27, 2024)
41ceca2 add pgcat icon as font, update footer (chillenberger, Jun 27, 2024)
1b59410 Changed structure and cleaned up Korvus docs (SilasMarvin, Jul 2, 2024)
61264ef Korvus docs in a decent place (SilasMarvin, Jul 3, 2024)
29624bb Added pgml.transform_stream docs and cleaned up pgml docs a bit (SilasMarvin, Jul 8, 2024)
0977a52 Move PgCat under open source (SilasMarvin, Jul 8, 2024)
a526c6f Fix spelling error (SilasMarvin, Jul 8, 2024)
5bf2e05 Update the docs landing page (SilasMarvin, Jul 9, 2024)
0cf476b Update product section of docs (SilasMarvin, Jul 9, 2024)
c2cce09 Add correct route for enterprise plan (SilasMarvin, Jul 9, 2024)
71a9655 Merge branch 'master' into dan-korvus-nav-update (chillenberger, Jul 9, 2024)
8bdd015 Clean up semantic search example app (SilasMarvin, Jul 9, 2024)
a831255 add korvus icon, widen nav dropdown bridge, align product dropdown text (chillenberger, Jul 9, 2024)
2f995d6 update korvus icon font (chillenberger, Jul 10, 2024)
c0a4c8c Korvus blog post (SilasMarvin, Jul 10, 2024)
f26ad3e Updated date for korvus launch blog post (SilasMarvin, Jul 10, 2024)
89608ce Cloud docs outline (#1553) (montanalow, Jul 10, 2024)
89925e1 Merge remote-tracking branch 'origin/silas-docs-overhaul' into dan-ko… (chillenberger, Jul 10, 2024)
2464fb6 Merge remote-tracking branch 'origin/silas-docs-overhaul' into dan-ko… (chillenberger, Jul 10, 2024)
a9847b6 update links to korvus nav (chillenberger, Jul 10, 2024)
c32fdcb update solutions links and footer (chillenberger, Jul 10, 2024)
Added pgml.transform_stream docs and cleaned up pgml docs a bit
SilasMarvin committed Jul 8, 2024
commit 29624bbdcefeddc37ff3662c71511d663aa91e6d
2 changes: 1 addition & 1 deletion pgml-cms/docs/SUMMARY.md
@@ -28,6 +28,7 @@
* [Token Classification](open-source/pgml/api/pgml.transform/token-classification.md)
* [Translation](open-source/pgml/api/pgml.transform/translation.md)
* [Zero-shot Classification](open-source/pgml/api/pgml.transform/zero-shot-classification.md)
* [pgml.transform_stream()](open-source/pgml/api/pgml.transform_stream.md)
* [pgml.deploy()](open-source/pgml/api/pgml.deploy.md)
* [pgml.decompose()](open-source/pgml/api/pgml.decompose.md)
* [pgml.chunk()](open-source/pgml/api/pgml.chunk.md)
@@ -43,7 +44,6 @@
* [Hyperparameter Search](open-source/pgml/api/pgml.train/hyperparameter-search.md)
* [Joint Optimization](open-source/pgml/api/pgml.train/joint-optimization.md)
* [pgml.tune()](open-source/pgml/api/pgml.tune.md)
* [Guides](open-source/pgml/guides/README.md)
* [Korvus](open-source/korvus/README.md)
* [API](open-source/korvus/api/README.md)
* [Collections](open-source/korvus/api/collections.md)
2 changes: 2 additions & 0 deletions pgml-cms/docs/open-source/korvus/README.md
@@ -4,6 +4,8 @@ description: Korvus is an SDK for JavaScript, Python and Rust implements common

# Korvus

Korvus is an all-in-one, open-source RAG (Retrieval-Augmented Generation) pipeline built for PostgresML. It combines LLMs, vector memory, embedding generation, reranking, summarization and custom models into a single query, maximizing performance and simplifying your search architecture.

Korvus can be installed using standard package managers for JavaScript, Python, and Rust. Since the SDK is written in Rust, the JavaScript and Python packages come with no additional dependencies.

For key features, a quick start, and the code, see [the Korvus GitHub](https://github.com/postgresml/korvus).
44 changes: 44 additions & 0 deletions pgml-cms/docs/open-source/pgml/README.md
@@ -0,0 +1,44 @@
---
description: >-
  The PostgresML extension for PostgreSQL provides Machine Learning and Artificial
  Intelligence APIs with access to algorithms to train your models, or download
  state-of-the-art open source models from Hugging Face.
---

# SQL extension

`pgml` is a PostgreSQL extension which adds SQL functions to the database. Those functions provide access to AI models downloaded from Hugging Face, and classical machine learning algorithms like XGBoost and LightGBM.

Our SQL API is stable and safe to use in your applications, while the models and algorithms we support continue to evolve and improve.

## Common Tasks

See the [API](api/) for a full list of all functions provided by `pgml`.

Common tasks include:
- [Splitting text - pgml.chunk()](api/pgml.chunk)
- [Generating embeddings - pgml.embed()](api/pgml.embed)
- [Generating text - pgml.transform()](api/pgml.transform/text-generation)
- [Streaming generated text - pgml.transform_stream()](api/pgml.transform_stream)

## Open-source LLMs

PostgresML defines four SQL functions which use [🤗 Hugging Face](https://huggingface.co/transformers) transformers and embeddings models, running directly in the database:

| Function | Description |
|---------------|-------------|
| [pgml.embed()](api/pgml.embed) | Generate embeddings using latest sentence transformers from Hugging Face. |
| [pgml.transform()](api/pgml.transform/) | Text generation using LLMs like Llama, Mixtral, and many more, with models downloaded from Hugging Face. |
| [pgml.transform_stream()](api/pgml.transform_stream) | Streaming version of [pgml.transform()](api/pgml.transform/), which fetches partial responses as they are being generated by the model, substantially decreasing time to first token. |
| [pgml.tune()](api/pgml.tune) | Perform fine tuning tasks on Hugging Face models, using data stored in the database. |

## Classical machine learning

PostgresML defines four SQL functions which allow training regression, classification, and clustering models on tabular data:

| Function | Description |
|---------------|-------------|
| [pgml.train()](api/pgml.train/) | Train a model on PostgreSQL tables or views using any algorithm from Scikit-learn, with the additional support for XGBoost, LightGBM and Catboost. |
| [pgml.predict()](api/pgml.predict/) | Run inference on live application data using a model trained with [pgml.train()](api/pgml.train/). |
| [pgml.deploy()](api/pgml.deploy) | Deploy a specific version of a model trained with pgml.train(), using your own accuracy metrics. |
| [pgml.load_dataset()](api/pgml.load_dataset) | Load any of the toy datasets from Scikit-learn or any dataset from Hugging Face. |
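
For illustration only (this snippet is not part of the commit), here is a minimal sketch of how `pgml.transform_stream()` might be consumed through a cursor, so that partial responses are fetched while the model is still generating. The model name, the prompt, and the `max_new_tokens` value are assumptions chosen for the example, and the named arguments `task`, `input`, and `args` are assumed to mirror those of `pgml.transform()`.

```postgresql
BEGIN;

-- Assumed model and arguments; substitute any text-generation model
-- that is available to your database.
DECLARE c CURSOR FOR
    SELECT pgml.transform_stream(
        task  => '{
            "task": "text-generation",
            "model": "meta-llama/Meta-Llama-3-8B-Instruct"
        }'::JSONB,
        input => 'AI is going to',
        args  => '{"max_new_tokens": 100}'::JSONB
    );

-- Each FETCH pulls the next rows from the stream instead of
-- waiting for the full completion.
FETCH 2 FROM c;
FETCH 2 FROM c;

CLOSE c;
COMMIT;
```

A cursor is used here because a plain `SELECT` would only return once the whole result set is ready. Fetching in small batches is what lets a client show output early.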
193 changes: 11 additions & 182 deletions pgml-cms/docs/open-source/pgml/api/README.md
@@ -1,196 +1,25 @@
---
description: >-
  The PostgresML extension for PostgreSQL provides Machine Learning and Artificial
  Intelligence APIs with access to algorithms to train your models, or download
  state-of-the-art open source models from Hugging Face.
description: The pgml extension API.
---

# SQL extension
# PGML API

PostgresML is a PostgreSQL extension which adds SQL functions to the database. Those functions provide access to AI models downloaded from Hugging Face, and classical machine learning algorithms like XGBoost and LightGBM.
The API docs provide a brief overview of the available functions exposed by `pgml`.

Our SQL API is stable and safe to use in your applications, while the models and algorithms we support continue to evolve and improve.
<!-- For more in depth guides on specific features see the [Guides section](). -->

## Open-source LLMs

PostgresML defines two SQL functions which use [🤗 Hugging Face](https://huggingface.co/transformers) transformers and embeddings models, running directly in the database:
<!-- For example applications see our [Example apps section](). -->

| Function | Description |
|---------------|-------------|
| [pgml.embed()](pgml.embed) | Generate embeddings using latest sentence transformers from Hugging Face. |
| [pgml.embed()](pgml.embed) | Generate embeddings using the latest sentence transformers from Hugging Face. |
| [pgml.transform()](pgml.transform/) | Text generation using LLMs like Llama, Mixtral, and many more, with models downloaded from Hugging Face. |
| pgml.transform_stream() | Streaming version of [pgml.transform()](pgml.transform/), which fetches partial responses as they are being generated by the model, substantially decreasing time to first token. |
| [pgml.transform_stream()](pgml.transform_stream) | Streaming version of [pgml.transform()](pgml.transform/), which fetches partial responses as they are being generated by the model, substantially decreasing time to first token. |
| [pgml.tune()](pgml.tune) | Perform fine tuning tasks on Hugging Face models, using data stored in the database. |

### Example

Using a SQL function for interacting with open-source models makes things really easy:

{% tabs %}
{% tab title="SQL" %}

```postgresql
SELECT pgml.embed(
    'Alibaba-NLP/gte-base-en-v1.5',
    'This text will be embedded using the Alibaba-NLP/gte-base-en-v1.5 model.'
) AS embedding;
```

{% endtab %}
{% tab title="Output" %}

```
embedding
-------------------------------------------
{-0.028478337,-0.06275077,-0.04322059, [...]
```

{% endtab %}
{% endtabs %}

Using the `pgml` SQL functions inside regular queries, it's possible to add embeddings and LLM-generated text inside any query, without the data ever leaving the database, removing the cost of a remote network call.

## Classical machine learning

PostgresML defines four SQL functions which allow training regression, classification, and clustering models on tabular data:

| Function | Description |
|---------------|-------------|
| [pgml.train()](pgml.train/) | Train a model on PostgreSQL tables or views using any algorithm from Scikit-learn, with the additional support for XGBoost, LightGBM and Catboost. |
| [pgml.predict()](pgml.predict/) | Run inference on live application data using a model trained with [pgml.train()](pgml.train/). |
| [pgml.deploy()](pgml.deploy) | Deploy a specific version of a model trained with pgml.train(), using your own accuracy metrics. |
| pgml.load_dataset() | Load any of the toy datasets from Scikit-learn or any dataset from Hugging Face. |

### Example

#### Load data

Using `pgml.load_dataset()`, we can load an example classification dataset from Scikit-learn:

{% tabs %}
{% tab title="SQL" %}

```postgresql
SELECT *
FROM pgml.load_dataset('digits');
```

{% endtab %}
{% tab title="Output" %}

```
table_name | rows
-------------+------
pgml.digits | 1797
(1 row)
```

{% endtab %}
{% endtabs %}

#### Train a model

Once we have some data, we can train a model on this data using [pgml.train()](pgml.train/):

{% tabs %}
{% tab title="SQL" %}

```postgresql
SELECT *
FROM pgml.train(
    project_name => 'My project name',
    task => 'classification',
    relation_name => 'pgml.digits',
    y_column_name => 'target',
    algorithm => 'xgboost'
);
```

{% endtab %}
{% tab title="Output" %}

```
INFO: Metrics: {
"f1": 0.8755124,
"precision": 0.87670505,
"recall": 0.88005465,
"accuracy": 0.87750554,
"mcc": 0.8645154,
"fit_time": 0.33504912,
"score_time": 0.001842427
}

project | task | algorithm | deployed
-----------------+----------------+-----------+----------
My project name | classification | xgboost | t
(1 row)

```

{% endtab %}
{% endtabs %}

[pgml.train()](pgml.train/) reads data from the table, using the `target` column as the label, automatically splits the dataset into test and train sets, and trains an XGBoost model. Our extension supports more than 50 machine learning algorithms, and you can train a model using any of them by just changing the name of the `algorithm` argument.


#### Real time inference

Now that we have a model, we can use it to predict new data points, in real time, on live application data:

{% tabs %}
{% tab title="SQL" %}

```postgresql
SELECT
    target,
    pgml.predict(
        'My project name',
        image
    ) AS prediction
FROM
    pgml.digits
LIMIT 1;
```

{% endtab %}
{% tab title="Output" %}

```
target | prediction
--------+------------
0 | 0
(1 row)
```

{% endtab %}
{% endtabs %}

#### Change model version

The train function automatically deploys the best model into production, using the precision score relevant to the type of the model. If you prefer to deploy models using your own accuracy metrics, the [pgml.deploy()](pgml.deploy) function can manually change which model version is used for subsequent database queries:

{% tabs %}
{% tab title="SQL" %}

```postgresql
SELECT *
FROM
    pgml.deploy(
        'My project name',
        strategy => 'most_recent',
        algorithm => 'xgboost'
    );
```

{% endtab %}
{% tab title="Output" %}

```
project | strategy | algorithm
-----------------+-------------+-----------
My project name | most_recent | xgboost
(1 row)
```

{% endtab %}
{% endtabs %}
| [pgml.load_dataset()](pgml.load_dataset) | Load any of the toy datasets from Scikit-learn or any dataset from Hugging Face. |
| [pgml.decompose()](pgml.decompose) | Reduces the number of dimensions in a vector via matrix decomposition. |
| [pgml.chunk()](pgml.chunk) | Break large bodies of text into smaller pieces via commonly used splitters. |
| [pgml.generate()](pgml.generate) | Perform inference with custom models. |
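
As a hedged illustration of the `pgml.chunk()` entry in the table above (this snippet is not part of the commit), the sketch below assumes the function takes a splitter name, the text to split, and an optional JSON object of keyword arguments. The splitter name `recursive_character` and the `chunk_size` and `chunk_overlap` values are assumptions chosen for the example.

```postgresql
-- Split a short passage into overlapping chunks.
-- The splitter name and keyword arguments are assumed, not taken from this diff.
SELECT *
FROM pgml.chunk(
    'recursive_character',
    'PostgresML brings models to your data. Chunking splits long documents into pieces small enough to embed.',
    '{"chunk_size": 50, "chunk_overlap": 10}'
);
```

The chunks produced this way would typically be passed to `pgml.embed()` one row at a time, which keeps each input within the embedding model's context window.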
@@ -0,0 +1 @@
# pgml.load_dataset()