Commit 3049db3

move guides under pgml

1 parent: ba4b3a7

File tree

23 files changed: +44 −44 lines

pgml-cms/blog/generating-llm-embeddings-with-open-source-models-in-postgresml.md (1 addition, 1 deletion)

```diff
@@ -120,7 +120,7 @@ LIMIT 5;
 
 ## Generating embeddings from natural language text
 
-PostgresML provides a simple interface to generate embeddings from text in your database. You can use the [`pgml.embed`](https://postgresml.org/docs/guides/transformers/embeddings) function to generate embeddings for a column of text. The function takes a transformer name and a text value. The transformer will automatically be downloaded and cached on your connection process for reuse. You can see a list of potential good candidate models to generate embeddings on the [Massive Text Embedding Benchmark leaderboard](https://huggingface.co/spaces/mteb/leaderboard).
+PostgresML provides a simple interface to generate embeddings from text in your database. You can use the [`pgml.embed`](https://postgresml.org/docs/open-source/pgml/guides/transformers/embeddings) function to generate embeddings for a column of text. The function takes a transformer name and a text value. The transformer will automatically be downloaded and cached on your connection process for reuse. You can see a list of potential good candidate models to generate embeddings on the [Massive Text Embedding Benchmark leaderboard](https://huggingface.co/spaces/mteb/leaderboard).
 
 Since our corpus of documents (movie reviews) are all relatively short and similar in style, we don't need a large model. [`Alibaba-NLP/gte-base-en-v1.5`](https://huggingface.co/Alibaba-NLP/gte-base-en-v1.5) will be a good first attempt. The great thing about PostgresML is you can always regenerate your embeddings later to experiment with different embedding models.
```
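The `pgml.embed` call this hunk documents takes a transformer name and a text value. A minimal sketch of how it is invoked, against a database with the pgml extension installed (the `movie_reviews` table and `review_body` column are hypothetical, chosen to match the blog's movie-review corpus):

```sql
-- Generate one embedding with the model named in the diff;
-- the result is a vector of floats.
SELECT pgml.embed('Alibaba-NLP/gte-base-en-v1.5', 'This movie was great!') AS embedding;

-- Embedding a column of text, as the quoted paragraph describes
-- (movie_reviews.review_body is an illustrative table/column name):
SELECT pgml.embed('Alibaba-NLP/gte-base-en-v1.5', review_body) AS embedding
FROM movie_reviews
LIMIT 5;
```

On first use the model is downloaded and cached per connection process, so the first call is slower than subsequent ones.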

pgml-cms/blog/introducing-the-openai-switch-kit-move-from-closed-to-open-source-ai-in-minutes.md (1 addition, 1 deletion)

```diff
@@ -210,7 +210,7 @@ We have truncated the output to two items
 
 !!!
 
-We also have asynchronous versions of the create and `create_stream` functions, respectively named `create_async` and `create_stream_async`. Check out [our documentation](https://postgresml.org/docs/guides/opensourceai) for a complete guide to the open-source AI SDK, including guides on how to specify custom models.
+We also have asynchronous versions of the create and `create_stream` functions, respectively named `create_async` and `create_stream_async`. Check out [our documentation](https://postgresml.org/docs/open-source/pgml/guides/opensourceai) for a complete guide to the open-source AI SDK, including guides on how to specify custom models.
 
 PostgresML is free and open source. To run the above examples yourself, [create an account](https://postgresml.org/signup), install korvus, and get running!
```

pgml-cms/blog/semantic-search-in-postgres-in-15-minutes.md (1 addition, 1 deletion)

```diff
@@ -152,7 +152,7 @@ SELECT '[1,2,3]'::vector <=> '[2,3,4]'::vector;
 
 !!!
 
-Other distance functions have similar formulas and provide convenient operators to use as well. It may be worth testing other operators to see which performs better for your use case. For more information on the other distance functions, take a look at our [Embeddings guide](https://postgresml.org/docs/guides/embeddings/vector-similarity).
+Other distance functions have similar formulas and provide convenient operators to use as well. It may be worth testing other operators to see which performs better for your use case. For more information on the other distance functions, take a look at our [Embeddings guide](https://postgresml.org/docs/open-source/pgml/guides/embeddings/vector-similarity).
 
 Going back to our search example, we can compute the cosine distance between our query embedding and our documents:
```
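The hunk header quotes pgvector's cosine-distance operator (`<=>`). As a sanity check, the same computation can be done outside the database; a minimal sketch in plain Python, using the vectors `[1,2,3]` and `[2,3,4]` from the quoted SQL:

```python
import math

def cosine_distance(a, b):
    """Cosine distance as computed by pgvector's <=> operator: 1 - cosine similarity."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return 1.0 - dot / (norm_a * norm_b)

# Same vectors as the SQL example: SELECT '[1,2,3]'::vector <=> '[2,3,4]'::vector;
print(cosine_distance([1, 2, 3], [2, 3, 4]))  # small positive number, ~0.007
```

A distance near 0 means the vectors point in nearly the same direction; identical directions give exactly 0.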

pgml-cms/docs/SUMMARY.md (13 additions, 13 deletions)

```diff
@@ -72,20 +72,20 @@
 
 ## Guides
 
-* [Embeddings](guides/embeddings/README.md)
-  * [In-database Generation](guides/embeddings/in-database-generation.md)
-  * [Dimensionality Reduction](guides/embeddings/dimensionality-reduction.md)
-  * [Aggregation](guides/embeddings/vector-aggregation.md)
-  * [Similarity](guides/embeddings/vector-similarity.md)
-  * [Normalization](guides/embeddings/vector-normalization.md)
-* [Search](guides/improve-search-results-with-machine-learning.md)
-* [Chatbots](guides/chatbots/README.md)
+* [Embeddings](open-source/pgml/guides/embeddings/README.md)
+  * [In-database Generation](open-source/pgml/guides/embeddings/in-database-generation.md)
+  * [Dimensionality Reduction](open-source/pgml/guides/embeddings/dimensionality-reduction.md)
+  * [Aggregation](open-source/pgml/guides/embeddings/vector-aggregation.md)
+  * [Similarity](open-source/pgml/guides/embeddings/vector-similarity.md)
+  * [Normalization](open-source/pgml/guides/embeddings/vector-normalization.md)
+* [Search](open-source/pgml/guides/improve-search-results-with-machine-learning.md)
+* [Chatbots](open-source/pgml/guides/chatbots/README.md)
   * [Example Application](use-cases/chatbots.md)
-* [Supervised Learning](guides/supervised-learning.md)
-* [Unified RAG](guides/unified-rag.md)
-* [OpenSourceAI](guides/opensourceai.md)
-* [Natural Language Processing](guides/natural-language-processing.md)
-* [Vector database](guides/vector-database.md)
+* [Supervised Learning](open-source/pgml/guides/supervised-learning.md)
+* [Unified RAG](open-source/pgml/guides/unified-rag.md)
+* [OpenSourceAI](open-source/pgml/guides/opensourceai.md)
+* [Natural Language Processing](open-source/pgml/guides/natural-language-processing.md)
+* [Vector database](open-source/pgml/guides/vector-database.md)
 
 ## Resources
```

pgml-cms/docs/guides/chatbots/README.md renamed to pgml-cms/docs/open-source/pgml/guides/chatbots/README.md (3 additions, 3 deletions)

````diff
@@ -108,11 +108,11 @@ What does an `embedding` look like? `Embeddings` are just vectors (for our use c
 embedding_1 = embed("King") # embed returns something like [0.11, -0.32, 0.46, ...]
 ```
 
-<figure><img src="../../.gitbook/assets/embedding_king.png" alt=""><figcaption><p>The flow of word -> token -> embedding</p></figcaption></figure>
+<figure><img src="../../../../.gitbook/assets/embedding_king.png" alt=""><figcaption><p>The flow of word -> token -> embedding</p></figcaption></figure>
 
 `Embeddings` aren't limited to words, we have models that can embed entire sentences.
 
-<figure><img src="../../.gitbook/assets/embeddings_tokens.png" alt=""><figcaption><p>The flow of sentence -> tokens -> embedding</p></figcaption></figure>
+<figure><img src="../../../../.gitbook/assets/embeddings_tokens.png" alt=""><figcaption><p>The flow of sentence -> tokens -> embedding</p></figcaption></figure>
 
 Why do we care about `embeddings`? `Embeddings` have a very interesting property. Words and sentences that have close [semantic similarity](https://en.wikipedia.org/wiki/Semantic\_similarity) sit closer to one another in vector space than words and sentences that do not have close semantic similarity.
 
@@ -157,7 +157,7 @@ print(context)
 
 There is a lot going on with this, let's check out this diagram and step through it.
 
-<figure><img src="../../.gitbook/assets/chatbot_flow.png" alt=""><figcaption><p>The flow of taking a document, splitting it into chunks, embedding those chunks, and then retrieving a chunk based off of a user's query</p></figcaption></figure>
+<figure><img src="../../../../.gitbook/assets/chatbot_flow.png" alt=""><figcaption><p>The flow of taking a document, splitting it into chunks, embedding those chunks, and then retrieving a chunk based off of a user's query</p></figcaption></figure>
 
 Step 1: We take the document and split it into chunks. Chunks are typically a paragraph or two in size. There are many ways to split documents into chunks, for more information check out [this guide](https://www.pinecone.io/learn/chunking-strategies/).
````
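The chunk-and-retrieve flow that the figure caption and Step 1 describe can be sketched end to end. This is an illustrative toy, not PostgresML's implementation: `embed` here is a fake deterministic word-count embedding (a real system would use a transformer model), and the chunker is a naive paragraph split:

```python
import hashlib
import math

def embed(text):
    """Toy stand-in for a real embedding model: hash words into an 8-dim vector."""
    vec = [0.0] * 8
    for word in text.lower().split():
        h = int(hashlib.md5(word.encode()).hexdigest(), 16)
        vec[h % 8] += 1.0
    return vec

def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a)) or 1.0
    nb = math.sqrt(sum(x * x for x in b)) or 1.0
    return dot / (na * nb)

# Step 1: split the document into chunks (here: one chunk per paragraph).
document = "Postgres stores data.\n\nKings and queens are royalty."
chunks = [c.strip() for c in document.split("\n\n") if c.strip()]

# Step 2: embed every chunk, keeping the text alongside its vector.
index = [(chunk, embed(chunk)) for chunk in chunks]

# Step 3: embed the user's query and retrieve the closest chunk.
query = "queens and kings are royalty."
best = max(index, key=lambda pair: cosine_similarity(embed(query), pair[1]))
print(best[0])  # prints the chunk whose embedding best matches the query
```

Because this toy embedding ignores word order, the reordered query still matches the royalty chunk exactly; real sentence embeddings capture far more than word overlap, but the retrieval loop is the same shape.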

pgml-cms/docs/guides/embeddings/README.md renamed to pgml-cms/docs/open-source/pgml/guides/embeddings/README.md (1 addition, 1 deletion)

```diff
@@ -39,7 +39,7 @@ Vectors can be stored in the native Postgres [`ARRAY[]`](https://www.postgresql.
 
 !!! warning
 
-Other cloud providers claim to offer embeddings "inside the database", but [benchmarks](../../resources/benchmarks/mindsdb-vs-postgresml.md) show that they are orders of magnitude slower than PostgresML. The reason is they don't actually run inside the database with hardware acceleration. They are thin wrapper functions that make network calls to remote service providers. PostgresML is the only cloud that puts GPU hardware in the database for full acceleration, and it shows.
+Other cloud providers claim to offer embeddings "inside the database", but [benchmarks](../../../../resources/benchmarks/mindsdb-vs-postgresml.md) show that they are orders of magnitude slower than PostgresML. The reason is they don't actually run inside the database with hardware acceleration. They are thin wrapper functions that make network calls to remote service providers. PostgresML is the only cloud that puts GPU hardware in the database for full acceleration, and it shows.
 
 !!!
```

File renamed without changes.
File renamed without changes.
File renamed without changes.
File renamed without changes.

0 commit comments

