
Commit cc5d24e

Update docs for serverless v2 (#1485)

1 parent b44f381 · commit cc5d24e

57 files changed: 212 additions, 227 deletions


packages/pgml-rds-proxy/README.md

Lines changed: 1 addition & 1 deletion

@@ -76,7 +76,7 @@ SELECT
 FROM
     dblink(
         'postgresml',
-        'SELECT * FROM pgml.embed(''intfloat/e5-small'', ''embed this text'') AS embedding'
+        'SELECT * FROM pgml.embed(''Alibaba-NLP/gte-base-en-v1.5'', ''embed this text'') AS embedding'
     ) AS t1(embedding real[386]);
 ```

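For readers following along, here is a minimal Python sketch of issuing the updated query through the proxy. It assumes psycopg2, an illustrative DSN, and the `postgresml` dblink connection configured earlier in the README:

```python
import psycopg2

# Call pgml.embed on a remote PostgresML database through dblink, as in the
# README above. The DSN is illustrative; 'postgresml' is the dblink connection
# name configured earlier in the README.
conn = psycopg2.connect("postgres://user:pass@my-rds-host:5432/postgres")
with conn.cursor() as cur:
    cur.execute(
        """
        SELECT *
        FROM dblink(
            'postgresml',
            'SELECT * FROM pgml.embed(''Alibaba-NLP/gte-base-en-v1.5'', ''embed this text'') AS embedding'
        ) AS t1(embedding real[386]);
        """
    )
    print(cur.fetchone()[0][:5])  # first few values of the embedding array
```
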
pgml-apps/pgml-chat/pgml_chat/main.py

Lines changed: 3 additions & 4 deletions

@@ -123,7 +123,7 @@ def handler(signum, frame):
         "--chat_completion_model",
         dest="chat_completion_model",
         type=str,
-        default="HuggingFaceH4/zephyr-7b-beta",
+        default="meta-llama/Meta-Llama-3-8B-Instruct",
     )

     parser.add_argument(

@@ -195,9 +195,8 @@ def handler(signum, frame):
     )

     splitter = Splitter(splitter_name, splitter_params)
-    model_name = "hkunlp/instructor-xl"
-    model_embedding_instruction = "Represent the %s document for retrieval: " % (bot_topic)
-    model_params = {"instruction": model_embedding_instruction}
+    model_name = "Alibaba-NLP/gte-base-en-v1.5"
+    model_params = {}

     model = Model(model_name, "pgml", model_params)
     pipeline = Pipeline(args.collection_name + "_pipeline", model, splitter)

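In context, the new defaults assemble like this; a minimal sketch assuming the pre-1.0 `pgml` Python SDK that pgml-chat uses, with an illustrative collection name and splitter settings:

```python
from pgml import Model, Pipeline, Splitter

# Splitter settings are illustrative; pgml-chat reads its own from arguments.
splitter = Splitter("recursive_character", {"chunk_size": 1500, "chunk_overlap": 40})

# The gte model takes no instruction parameter, so model_params is now empty.
model = Model("Alibaba-NLP/gte-base-en-v1.5", "pgml", {})

# "my_collection" is illustrative; pgml-chat derives this from --collection_name.
pipeline = Pipeline("my_collection_pipeline", model, splitter)
```
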
pgml-cms/blog/generating-llm-embeddings-with-open-source-models-in-postgresml.md

Lines changed: 13 additions & 7 deletions

@@ -122,14 +122,14 @@ LIMIT 5;

 PostgresML provides a simple interface to generate embeddings from text in your database. You can use the [`pgml.embed`](https://postgresml.org/docs/guides/transformers/embeddings) function to generate embeddings for a column of text. The function takes a transformer name and a text value. The transformer will automatically be downloaded and cached on your connection process for reuse. You can see a list of good candidate models to generate embeddings on the [Massive Text Embedding Benchmark leaderboard](https://huggingface.co/spaces/mteb/leaderboard).

-Since our corpus of documents (movie reviews) is all relatively short and similar in style, we don't need a large model. [`intfloat/e5-small`](https://huggingface.co/intfloat/e5-small) will be a good first attempt. The great thing about PostgresML is you can always regenerate your embeddings later to experiment with different embedding models.
+Since our corpus of documents (movie reviews) is all relatively short and similar in style, we don't need a large model. [`Alibaba-NLP/gte-base-en-v1.5`](https://huggingface.co/Alibaba-NLP/gte-base-en-v1.5) will be a good first attempt. The great thing about PostgresML is you can always regenerate your embeddings later to experiment with different embedding models.

-It takes a couple of minutes to download and cache the `intfloat/e5-small` model to generate the first embedding. After that, it's pretty fast.
+It takes a couple of minutes to download and cache the `Alibaba-NLP/gte-base-en-v1.5` model to generate the first embedding. After that, it's pretty fast.

 Note how we prefix the text we want to embed with either `passage:` or `query:`; the e5 model requires us to prefix our data with `passage:` if we're generating embeddings for our corpus and `query:` if we want to find semantically similar content.

 ```postgresql
-SELECT pgml.embed('intfloat/e5-small', 'passage: hi mom');
+SELECT pgml.embed('Alibaba-NLP/gte-base-en-v1.5', 'passage: hi mom');
 ```

 This is a pretty powerful function, because we can pass any arbitrary text to any open source model, and it will generate an embedding for us. We can benchmark how long it takes to generate an embedding for a single review, using client-side timings in Postgres:

@@ -147,7 +147,7 @@ Aside from using this function with strings passed from a client, we can use it
 ```postgresql
 SELECT
     review_body,
-    pgml.embed('intfloat/e5-small', 'passage: ' || review_body)
+    pgml.embed('Alibaba-NLP/gte-base-en-v1.5', 'passage: ' || review_body)
 FROM pgml.amazon_us_reviews
 LIMIT 1;
 ```

@@ -171,7 +171,7 @@ Time to generate an embedding increases with the length of the input text, and v
 ```postgresql
 SELECT
     review_body,
-    pgml.embed('intfloat/e5-small', 'passage: ' || review_body) AS embedding
+    pgml.embed('Alibaba-NLP/gte-base-en-v1.5', 'passage: ' || review_body) AS embedding
 FROM pgml.amazon_us_reviews
 LIMIT 1000;
 ```

@@ -190,7 +190,7 @@ We can also do a quick sanity check to make sure we're really getting value out
 SELECT
     review_body,
     pgml.embed(
-        'intfloat/e5-small',
+        'Alibaba-NLP/gte-base-en-v1.5',
         'passage: ' || review_body,
         '{"device": "cpu"}'
     ) AS embedding

@@ -224,6 +224,12 @@ You can also find embedding models that outperform OpenAI's `text-embedding-ada-

 The current leading model is `hkunlp/instructor-xl`. Instructor models take an additional `instruction` parameter which includes context for the embeddings use case, similar to prompts before text generation tasks.

+!!! note
+
+"Alibaba-NLP/gte-base-en-v1.5" surpassed the quality of instructor-xl and should be used instead, but we've left this documentation available for existing users.
+
+!!!
+
 Instructions can provide a "classification" or "topic" for the text:

 #### Classification

@@ -325,7 +331,7 @@ BEGIN
 UPDATE pgml.amazon_us_reviews
 SET review_embedding_e5_large = pgml.embed(
-    'intfloat/e5-large',
+    'Alibaba-NLP/gte-base-en-v1.5',
     'passage: ' || review_body
 )
 WHERE id BETWEEN i AND i + 10

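The last hunk above regenerates embeddings in small batches. A client-side sketch of the same pattern, assuming psycopg2 and the `pgml.amazon_us_reviews` table from the post; the DSN and batch size are illustrative, and the column name is kept from the diff (its vector dimension must match the model's output):

```python
import psycopg2

# Batched re-embedding, mirroring UPDATE ... WHERE id BETWEEN i AND i + 10
# from the diff above. DSN and batch size are illustrative.
conn = psycopg2.connect("postgres://user:pass@host:5432/db")
conn.autocommit = True  # commit each batch so progress survives interruption

BATCH = 10
with conn.cursor() as cur:
    cur.execute("SELECT min(id), max(id) FROM pgml.amazon_us_reviews")
    lo, hi = cur.fetchone()
    for i in range(lo, hi + 1, BATCH + 1):
        cur.execute(
            """
            UPDATE pgml.amazon_us_reviews
            SET review_embedding_e5_large = pgml.embed(
                'Alibaba-NLP/gte-base-en-v1.5',
                'passage: ' || review_body
            )
            WHERE id BETWEEN %s AND %s
            """,
            (i, i + BATCH),
        )
```
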
pgml-cms/blog/introducing-the-openai-switch-kit-move-from-closed-to-open-source-ai-in-minutes.md

Lines changed: 4 additions & 4 deletions

@@ -44,7 +44,7 @@ The Switch Kit is an open-source AI SDK that provides a drop-in replacement for
 const pgml = require("pgml");
 const client = pgml.newOpenSourceAI();
 const results = client.chat_completions_create(
-  "HuggingFaceH4/zephyr-7b-beta",
+  "meta-llama/Meta-Llama-3-8B-Instruct",
   [
     {
       role: "system",

@@ -65,7 +65,7 @@ console.log(results);
 import pgml
 client = pgml.OpenSourceAI()
 results = client.chat_completions_create(
-    "HuggingFaceH4/zephyr-7b-beta",
+    "meta-llama/Meta-Llama-3-8B-Instruct",
     [
         {
             "role": "system",

@@ -96,7 +96,7 @@ print(results)
   ],
   "created": 1701291672,
   "id": "abf042d2-9159-49cb-9fd3-eef16feb246c",
-  "model": "HuggingFaceH4/zephyr-7b-beta",
+  "model": "meta-llama/Meta-Llama-3-8B-Instruct",
   "object": "chat.completion",
   "system_fingerprint": "eecec9d4-c28b-5a27-f90b-66c3fb6cee46",
   "usage": {

@@ -113,7 +113,7 @@ We don't charge per token, so OpenAI “usage” metrics are not particularly re

 !!!

-The above is an example using our open-source AI SDK with zephyr-7b-beta, an incredibly popular and highly efficient 7 billion parameter model.
+The above is an example using our open-source AI SDK with Meta-Llama-3-8B-Instruct, an incredibly popular and highly efficient 8 billion parameter model.

 Notice there is a near one-to-one relation between the parameters and return type of OpenAI's `chat.completions.create` and our `chat_completion_create`.

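Assembled from the hunks above, the updated Python example runs end to end like this; the SDK call is exactly the one shown in the diff, and the messages are illustrative, in the spirit of the post:

```python
import pgml

# Drop-in replacement for the OpenAI client, per the Switch Kit post.
client = pgml.OpenSourceAI()
results = client.chat_completions_create(
    "meta-llama/Meta-Llama-3-8B-Instruct",
    [
        {
            "role": "system",
            "content": "You are a friendly chatbot who always responds in the style of a pirate.",
        },
        {
            "role": "user",
            "content": "How many helicopters can a human eat in one sitting?",
        },
    ],
)
print(results)
```
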
pgml-cms/blog/llm-based-pipelines-with-postgresml-and-dbt-data-build-tool.md

Lines changed: 2 additions & 2 deletions

@@ -119,7 +119,7 @@ vars:
   splitter_name: "recursive_character"
   splitter_parameters: {"chunk_size": 100, "chunk_overlap": 20}
   task: "embedding"
-  model_name: "intfloat/e5-base"
+  model_name: "intfloat/e5-small-v2"
   query_string: 'Lorem ipsum 3'
   limit: 2
 ```

@@ -129,7 +129,7 @@ Here's a summary of the key parameters:
 * `splitter_name`: Specifies the name of the splitter, set as "recursive_character".
 * `splitter_parameters`: Defines the parameters for the splitter, such as a chunk size of 100 and a chunk overlap of 20.
 * `task`: Indicates the task being performed, specified as "embedding".
-* `model_name`: Specifies the name of the model to be used, set as "intfloat/e5-base".
+* `model_name`: Specifies the name of the model to be used, set as "intfloat/e5-small-v2".
 * `query_string`: Provides a query string, set as 'Lorem ipsum 3'.
 * `limit`: Specifies a limit of 2, indicating the maximum number of results to be processed.

pgml-cms/blog/personalize-embedding-results-with-application-data-in-your-database.md

Lines changed: 2 additions & 2 deletions

@@ -137,7 +137,7 @@ We can find a customer that our embeddings model feels is close to the sentiment
 ```postgresql
 WITH request AS (
   SELECT pgml.embed(
-    'intfloat/e5-large',
+    'Alibaba-NLP/gte-base-en-v1.5',
     'query: I love all Star Wars, but Empire Strikes Back is particularly amazing'
   )::vector(1024) AS embedding
 )

@@ -214,7 +214,7 @@ Now we can write our personalized SQL query. It's nearly the same as our query f
 -- create a request embedding on the fly
 WITH request AS (
   SELECT pgml.embed(
-    'intfloat/e5-large',
+    'Alibaba-NLP/gte-base-en-v1.5',
     'query: Best 1980''s scifi movie'
   )::vector(1024) AS embedding
 ),

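A minimal client-side sketch of generating the request embedding above, assuming psycopg2; the DSN is illustrative. Note that gte-base-en-v1.5 produces 768-dimensional embeddings, so the sketch casts to `vector(768)`; the posts' `vector(1024)` casts date from the 1024-dimensional e5-large model:

```python
import psycopg2

# Generate a request embedding on the fly, as in the CTEs above.
# DSN is illustrative. gte-base-en-v1.5 emits 768 dimensions, hence vector(768).
conn = psycopg2.connect("postgres://user:pass@host:5432/db")
with conn.cursor() as cur:
    cur.execute(
        "SELECT pgml.embed(%s, %s)::vector(768)",
        (
            "Alibaba-NLP/gte-base-en-v1.5",
            "query: I love all Star Wars, but Empire Strikes Back is particularly amazing",
        ),
    )
    embedding = cur.fetchone()[0]  # returned as a '[...]' string by pgvector
```
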
pgml-cms/blog/pgml-chat-a-command-line-tool-for-deploying-low-latency-knowledge-based-chatbots-part-i.md

Lines changed: 2 additions & 4 deletions

@@ -127,9 +127,7 @@ cp .env.template .env
 ```bash
 OPENAI_API_KEY=<OPENAI_API_KEY>
 DATABASE_URL=<POSTGRES_DATABASE_URL starts with postgres://>
-MODEL=hkunlp/instructor-xl
-MODEL_PARAMS={"instruction": "Represent the document for retrieval:"}
-QUERY_PARAMS={"instruction": "Represent the question for retrieving supporting documents:"}
+MODEL=Alibaba-NLP/gte-base-en-v1.5
 SYSTEM_PROMPT=<> # System prompt used for OpenAI chat completion
 BASE_PROMPT=<> # Base prompt used for OpenAI chat completion for each turn
 SLACK_BOT_TOKEN=<SLACK_BOT_TOKEN> # Slack bot token to run Slack chat service

@@ -332,7 +330,7 @@ Once the discord app is running, you can interact with the chatbot on Discord as

 ### PostgresML vs. Hugging Face + Pinecone

-To evaluate query latency, we performed an experiment with 10,000 Wikipedia documents from the SQuAD dataset. Embeddings were generated using the intfloat/e5-large model.
+To evaluate query latency, we performed an experiment with 10,000 Wikipedia documents from the SQuAD dataset. Embeddings were generated using the Alibaba-NLP/gte-base-en-v1.5 model.

 For PostgresML, we used a GPU-powered serverless database running on NVIDIA A10G GPUs with the client in the us-west-2 region. For Hugging Face, we used their inference API endpoint running on NVIDIA A10G GPUs in the us-east-1 region and a client in the same us-east-1 region. Pinecone was used as the vector search index for the Hugging Face embeddings.

pgml-cms/blog/speeding-up-vector-recall-5x-with-hnsw.md

Lines changed: 2 additions & 2 deletions

@@ -45,7 +45,7 @@ Let's run that query again:
 ```postgresql
 WITH request AS (
   SELECT pgml.embed(
-    'intfloat/e5-large',
+    'Alibaba-NLP/gte-base-en-v1.5',
     'query: Best 1980''s scifi movie'
   )::vector(1024) AS embedding
 )

@@ -100,7 +100,7 @@ Now let's try the query again utilizing the new HNSW index we created.
 ```postgresql
 WITH request AS (
   SELECT pgml.embed(
-    'intfloat/e5-large',
+    'Alibaba-NLP/gte-base-en-v1.5',
     'query: Best 1980''s scifi movie'
   )::vector(1024) AS embedding
 )

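The queries above rely on an HNSW index built earlier in the post. For reference, a sketch of creating one with pgvector (0.5.0+) from Python; the DSN and index name are illustrative, and `vector_cosine_ops` assumes cosine distance is the search metric:

```python
import psycopg2

# Build an HNSW index on the embedding column used by the queries above.
# DSN and index name are illustrative; requires pgvector 0.5.0 or newer.
conn = psycopg2.connect("postgres://user:pass@host:5432/db")
conn.autocommit = True  # CREATE INDEX CONCURRENTLY cannot run in a transaction
with conn.cursor() as cur:
    cur.execute(
        """
        CREATE INDEX CONCURRENTLY review_embedding_hnsw_idx
        ON pgml.amazon_us_reviews
        USING hnsw (review_embedding_e5_large vector_cosine_ops)
        """
    )
```
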
pgml-cms/blog/the-1.0-sdk-is-here.md

Lines changed: 2 additions & 2 deletions

@@ -50,7 +50,7 @@ const pipeline = pgml.newPipeline("my_pipeline", {
   text: {
     splitter: { model: "recursive_character" },
     semantic_search: {
-      model: "intfloat/e5-small",
+      model: "Alibaba-NLP/gte-base-en-v1.5",
     },
   },
 });

@@ -90,7 +90,7 @@ pipeline = Pipeline(
     "text": {
         "splitter": {"model": "recursive_character"},
         "semantic_search": {
-            "model": "intfloat/e5-small",
+            "model": "Alibaba-NLP/gte-base-en-v1.5",
         },
     },
 },

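A runnable sketch of the updated 1.0 pipeline in Python, assuming the async `pgml` 1.0 SDK and its `Collection.add_pipeline` registration call; the collection name is illustrative:

```python
import asyncio
from pgml import Collection, Pipeline

async def main():
    # Collection name is illustrative; the pipeline schema is from the diff above.
    collection = Collection("my_collection")
    pipeline = Pipeline(
        "my_pipeline",
        {
            "text": {
                "splitter": {"model": "recursive_character"},
                "semantic_search": {
                    "model": "Alibaba-NLP/gte-base-en-v1.5",
                },
            },
        },
    )
    # Registers the pipeline so documents added to the collection are
    # split and embedded automatically (1.0 SDK, async API).
    await collection.add_pipeline(pipeline)

asyncio.run(main())
```
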
pgml-cms/blog/tuning-vector-recall-while-generating-query-embeddings-in-the-database.md

Lines changed: 6 additions & 6 deletions

@@ -124,7 +124,7 @@ We'll start with semantic search. Given a user query, e.g. "Best 1980's scifi mo
 ```postgresql
 WITH request AS (
   SELECT pgml.embed(
-    'intfloat/e5-large',
+    'Alibaba-NLP/gte-base-en-v1.5',
     'query: Best 1980''s scifi movie'
   )::vector(1024) AS embedding
 )

@@ -171,7 +171,7 @@ Generating a query plan more quickly and only computing the values once, may mak
 There's some good stuff happening in those query results, so let's break it down:

 * **It's fast** - We're able to generate a request embedding on the fly with a state-of-the-art model, and search 5M reviews in 152ms, including fetching the results back to the client 😍. You can't even generate an embedding from OpenAI's API in that time, much less search 5M reviews in some other database with it.
-* **It's good** - The `review_body` results are very similar to the "Best 1980's scifi movie" request text. We're using the `intfloat/e5-large` open source embedding model, which outperforms OpenAI's `text-embedding-ada-002` in most [quality benchmarks](https://huggingface.co/spaces/mteb/leaderboard).
+* **It's good** - The `review_body` results are very similar to the "Best 1980's scifi movie" request text. We're using the `Alibaba-NLP/gte-base-en-v1.5` open source embedding model, which outperforms OpenAI's `text-embedding-ada-002` in most [quality benchmarks](https://huggingface.co/spaces/mteb/leaderboard).
   * Qualitatively: the embeddings understand our request for `scifi` being equivalent to `Sci-Fi`, `sci-fi`, `SciFi`, and `sci fi`, as well as `1980's` matching `80s` and `80's`, and being close to `seventies` (last place). We didn't have to configure any of this, and the most enthusiastic review for "best" is at the top while the least enthusiastic is at the bottom, so the model has appropriately captured "sentiment".
   * Quantitatively: the `cosine_similarity` of all results is high and tight, 0.90-0.95 on a scale from -1:1. We can be confident we recalled very similar results from our 5M candidates, even though it would take 485 times as long to check all of them directly.
 * **It's reliable** - The model is stored in the database, so we don't need to worry about managing a separate service. If you repeat this query over and over, the timings will be extremely consistent, because we don't have to deal with things like random network congestion.

@@ -254,7 +254,7 @@ Now we can quickly search for movies by what people have said about them:
 ```postgresql
 WITH request AS (
   SELECT pgml.embed(
-    'intfloat/e5-large',
+    'Alibaba-NLP/gte-base-en-v1.5',
     'Best 1980''s scifi movie'
   )::vector(1024) AS embedding
 )

@@ -312,7 +312,7 @@ SET ivfflat.probes = 300;
 ```postgresql
 WITH request AS (
   SELECT pgml.embed(
-    'intfloat/e5-large',
+    'Alibaba-NLP/gte-base-en-v1.5',
     'Best 1980''s scifi movie'
   )::vector(1024) AS embedding
 )

@@ -401,7 +401,7 @@ SET ivfflat.probes = 1;
 ```postgresql
 WITH request AS (
   SELECT pgml.embed(
-    'intfloat/e5-large',
+    'Alibaba-NLP/gte-base-en-v1.5',
     'query: Best 1980''s scifi movie'
   )::vector(1024) AS embedding
 )

@@ -457,7 +457,7 @@ SQL is a very expressive language that can handle a lot of complexity. To keep t
 -- create a request embedding on the fly
 WITH request AS (
   SELECT pgml.embed(
-    'intfloat/e5-large',
+    'Alibaba-NLP/gte-base-en-v1.5',
     'query: Best 1980''s scifi movie'
   )::vector(1024) AS embedding
 ),

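To close the loop, a hedged sketch of the request CTE driving a recall query, since the surrounding SELECT isn't part of these hunks. The table and column names and the pgvector cosine-distance operator (`<=>`) are illustrative, and the embedding column's dimension must match the model's 768-dimensional output:

```python
import psycopg2

# Request-embedding CTE plus a vector recall query, following the pattern in
# the hunks above. Table/column names and the <=> operator are illustrative.
conn = psycopg2.connect("postgres://user:pass@host:5432/db")
with conn.cursor() as cur:
    cur.execute(
        """
        WITH request AS (
            SELECT pgml.embed(
                'Alibaba-NLP/gte-base-en-v1.5',
                'query: Best 1980''s scifi movie'
            )::vector(768) AS embedding
        )
        SELECT
            review_body,
            1 - (review_embedding <=> (SELECT embedding FROM request)) AS cosine_similarity
        FROM reviews
        ORDER BY review_embedding <=> (SELECT embedding FROM request)
        LIMIT 5
        """
    )
    for review_body, cosine_similarity in cur.fetchall():
        print(round(cosine_similarity, 3), review_body[:80])
```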