
Commit fa262b4

rename embedding models

1 parent 6d60aac

50 files changed (+275, −181 lines)


packages/pgml-rds-proxy/README.md

Lines changed: 1 addition & 1 deletion

@@ -76,7 +76,7 @@ SELECT
 FROM
   dblink(
     'postgresml',
-    'SELECT * FROM pgml.embed(''intfloat/e5-small'', ''embed this text'') AS embedding'
+    'SELECT * FROM pgml.embed(''intfloat/e5-small-v2'', ''embed this text'') AS embedding'
   ) AS t1(embedding real[386]);
 ```

pgml-apps/pgml-chat/pgml_chat/main.py

Lines changed: 2 additions & 3 deletions

@@ -195,9 +195,8 @@ def handler(signum, frame):
     )

     splitter = Splitter(splitter_name, splitter_params)
-    model_name = "hkunlp/instructor-xl"
-    model_embedding_instruction = "Represent the %s document for retrieval: " % (bot_topic)
-    model_params = {"instruction": model_embedding_instruction}
+    model_name = "intfloat/e5-small-v2"
+    model_params = {}

     model = Model(model_name, "pgml", model_params)
     pipeline = Pipeline(args.collection_name + "_pipeline", model, splitter)
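The switch above drops the instructor-specific `instruction` parameter entirely, since e5-family models need no extra model params. A minimal plain-Python sketch of that distinction (the helper name and the branching are illustrative, not part of pgml-chat):

```python
def embedding_params(model_name: str, bot_topic: str = "") -> dict:
    """Build model params: instructor models need an instruction, e5 models do not."""
    if model_name.startswith("hkunlp/instructor"):
        # instructor-style models take a task description alongside the text
        return {"instruction": "Represent the %s document for retrieval: " % bot_topic}
    # e5-family models rely on "passage:"/"query:" text prefixes instead
    return {}

print(embedding_params("intfloat/e5-small-v2"))
print(embedding_params("hkunlp/instructor-xl", "wiki"))
```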

pgml-cms/blog/generating-llm-embeddings-with-open-source-models-in-postgresml.md

Lines changed: 13 additions & 7 deletions

@@ -122,14 +122,14 @@ LIMIT 5;
 
 PostgresML provides a simple interface to generate embeddings from text in your database. You can use the [`pgml.embed`](https://postgresml.org/docs/guides/transformers/embeddings) function to generate embeddings for a column of text. The function takes a transformer name and a text value. The transformer will automatically be downloaded and cached on your connection process for reuse. You can see a list of potential good candidate models to generate embeddings on the [Massive Text Embedding Benchmark leaderboard](https://huggingface.co/spaces/mteb/leaderboard).
 
-Since our corpus of documents (movie reviews) are all relatively short and similar in style, we don't need a large model. [`intfloat/e5-small`](https://huggingface.co/intfloat/e5-small) will be a good first attempt. The great thing about PostgresML is you can always regenerate your embeddings later to experiment with different embedding models.
+Since our corpus of documents (movie reviews) are all relatively short and similar in style, we don't need a large model. [`intfloat/e5-small-v2`](https://huggingface.co/intfloat/e5-small-v2) will be a good first attempt. The great thing about PostgresML is you can always regenerate your embeddings later to experiment with different embedding models.
 
-It takes a couple of minutes to download and cache the `intfloat/e5-small` model to generate the first embedding. After that, it's pretty fast.
+It takes a couple of minutes to download and cache the `intfloat/e5-small-v2` model to generate the first embedding. After that, it's pretty fast.
 
 Note how we prefix the text we want to embed with either `passage:` or `query:`; the e5 model requires us to prefix our data with `passage:` if we're generating embeddings for our corpus and `query:` if we want to find semantically similar content.
 
 ```postgresql
-SELECT pgml.embed('intfloat/e5-small', 'passage: hi mom');
+SELECT pgml.embed('intfloat/e5-small-v2', 'passage: hi mom');
 ```
 
 This is a pretty powerful function, because we can pass any arbitrary text to any open source model, and it will generate an embedding for us. We can benchmark how long it takes to generate an embedding for a single review, using client-side timings in Postgres:

@@ -147,7 +147,7 @@ Aside from using this function with strings passed from a client, we can use it
 ```postgresql
 SELECT
   review_body,
-  pgml.embed('intfloat/e5-small', 'passage: ' || review_body)
+  pgml.embed('intfloat/e5-small-v2', 'passage: ' || review_body)
 FROM pgml.amazon_us_reviews
 LIMIT 1;
 ```

@@ -171,7 +171,7 @@ Time to generate an embedding increases with the length of the input text, and v
 ```postgresql
 SELECT
   review_body,
-  pgml.embed('intfloat/e5-small', 'passage: ' || review_body) AS embedding
+  pgml.embed('intfloat/e5-small-v2', 'passage: ' || review_body) AS embedding
 FROM pgml.amazon_us_reviews
 LIMIT 1000;
 ```

@@ -190,7 +190,7 @@ We can also do a quick sanity check to make sure we're really getting value out
 SELECT
   review_body,
   pgml.embed(
-    'intfloat/e5-small',
+    'intfloat/e5-small-v2',
     'passage: ' || review_body,
     '{"device": "cpu"}'
   ) AS embedding

@@ -224,6 +224,12 @@ You can also find embedding models that outperform OpenAI's `text-embedding-ada-
 
 The current leading model is `hkunlp/instructor-xl`. Instructor models take an additional `instruction` parameter which includes context for the embeddings use case, similar to prompts before text generation tasks.
 
+!!! note
+
+"intfloat/e5-small-v2" surpassed the quality of instructor-xl, and should be used instead, but we've left this documentation available for existing users.
+
+!!!
+
 Instructions can provide a "classification" or "topic" for the text:
 
 #### Classification

@@ -325,7 +331,7 @@ BEGIN
 
 UPDATE pgml.amazon_us_reviews
 SET review_embedding_e5_large = pgml.embed(
-  'intfloat/e5-large',
+  'intfloat/e5-small-v2',
   'passage: ' || review_body
 )
 WHERE id BETWEEN i AND i + 10
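The hunks above repeatedly apply the e5 convention of prefixing corpus text with `passage:` and search text with `query:`. A small illustrative helper for that convention (the function name is ours, not part of PostgresML); the prefixed string is what you'd pass to `pgml.embed`:

```python
def e5_prefix(text: str, kind: str = "passage") -> str:
    """Prefix text per the e5 convention: 'passage:' for corpus, 'query:' for searches."""
    if kind not in ("passage", "query"):
        raise ValueError("kind must be 'passage' or 'query'")
    return f"{kind}: {text}"

# Corpus documents get the passage prefix, search strings the query prefix.
print(e5_prefix("hi mom"))                            # passage: hi mom
print(e5_prefix("Best 1980's scifi movie", "query"))  # query: Best 1980's scifi movie
```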

pgml-cms/blog/personalize-embedding-results-with-application-data-in-your-database.md

Lines changed: 2 additions & 2 deletions

@@ -137,7 +137,7 @@ We can find a customer that our embeddings model feels is close to the sentiment
 ```postgresql
 WITH request AS (
   SELECT pgml.embed(
-    'intfloat/e5-large',
+    'intfloat/e5-small-v2',
     'query: I love all Star Wars, but Empire Strikes Back is particularly amazing'
   )::vector(1024) AS embedding
 )

@@ -214,7 +214,7 @@ Now we can write our personalized SQL query. It's nearly the same as our query f
 -- create a request embedding on the fly
 WITH request AS (
   SELECT pgml.embed(
-    'intfloat/e5-large',
+    'intfloat/e5-small-v2',
     'query: Best 1980''s scifi movie'
   )::vector(1024) AS embedding
 ),

pgml-cms/blog/pgml-chat-a-command-line-tool-for-deploying-low-latency-knowledge-based-chatbots-part-i.md

Lines changed: 2 additions & 4 deletions

@@ -127,9 +127,7 @@ cp .env.template .env
 ```bash
 OPENAI_API_KEY=<OPENAI_API_KEY>
 DATABASE_URL=<POSTGRES_DATABASE_URL starts with postgres://>
-MODEL=hkunlp/instructor-xl
-MODEL_PARAMS={"instruction":"Represent the document for retrieval:"}
-QUERY_PARAMS={"instruction":"Represent the question for retrieving supporting documents:"}
+MODEL=intfloat/e5-small-v2
 SYSTEM_PROMPT=<> # System prompt used for OpenAI chat completion
 BASE_PROMPT=<> # Base prompt used for OpenAI chat completion for each turn
 SLACK_BOT_TOKEN=<SLACK_BOT_TOKEN> # Slack bot token to run Slack chat service

@@ -332,7 +330,7 @@ Once the discord app is running, you can interact with the chatbot on Discord as
 
 ### PostgresML vs. Hugging Face + Pinecone
 
-To evaluate query latency, we performed an experiment with 10,000 Wikipedia documents from the SQuAD dataset. Embeddings were generated using the intfloat/e5-large model.
+To evaluate query latency, we performed an experiment with 10,000 Wikipedia documents from the SQuAD dataset. Embeddings were generated using the intfloat/e5-small-v2 model.
 
 For PostgresML, we used a GPU-powered serverless database running on NVIDIA A10G GPUs with a client in the us-west-2 region. For HuggingFace, we used their inference API endpoint running on NVIDIA A10G GPUs in the us-east-1 region and a client in the same us-east-1 region. Pinecone was used as the vector search index for HuggingFace embeddings.

pgml-cms/blog/speeding-up-vector-recall-5x-with-hnsw.md

Lines changed: 2 additions & 2 deletions

@@ -45,7 +45,7 @@ Let's run that query again:
 ```postgresql
 WITH request AS (
   SELECT pgml.embed(
-    'intfloat/e5-large',
+    'intfloat/e5-small-v2',
     'query: Best 1980''s scifi movie'
   )::vector(1024) AS embedding
 )

@@ -100,7 +100,7 @@ Now let's try the query again utilizing the new HNSW index we created.
 ```postgresql
 WITH request AS (
   SELECT pgml.embed(
-    'intfloat/e5-large',
+    'intfloat/e5-small-v2',
     'query: Best 1980''s scifi movie'
   )::vector(1024) AS embedding
 )

pgml-cms/blog/the-1.0-sdk-is-here.md

Lines changed: 2 additions & 2 deletions

@@ -50,7 +50,7 @@ const pipeline = pgml.newPipeline("my_pipeline", {
   text: {
     splitter: { model: "recursive_character" },
     semantic_search: {
-      model: "intfloat/e5-small",
+      model: "intfloat/e5-small-v2",
     },
   },
 });

@@ -90,7 +90,7 @@ pipeline = Pipeline(
     "text": {
         "splitter": {"model": "recursive_character"},
         "semantic_search": {
-            "model": "intfloat/e5-small",
+            "model": "intfloat/e5-small-v2",
         },
     },
 },

pgml-cms/blog/tuning-vector-recall-while-generating-query-embeddings-in-the-database.md

Lines changed: 6 additions & 6 deletions

@@ -124,7 +124,7 @@ We'll start with semantic search. Given a user query, e.g. "Best 1980's scifi mo
 ```postgresql
 WITH request AS (
   SELECT pgml.embed(
-    'intfloat/e5-large',
+    'intfloat/e5-small-v2',
     'query: Best 1980''s scifi movie'
   )::vector(1024) AS embedding
 )

@@ -171,7 +171,7 @@ Generating a query plan more quickly and only computing the values once, may mak
 There's some good stuff happening in those query results, so let's break it down:
 
 * **It's fast** - We're able to generate a request embedding on the fly with a state-of-the-art model, and search 5M reviews in 152ms, including fetching the results back to the client 😍. You can't even generate an embedding from OpenAI's API in that time, much less search 5M reviews in some other database with it.
-* **It's good** - The `review_body` results are very similar to the "Best 1980's scifi movie" request text. We're using the `intfloat/e5-large` open source embedding model, which outperforms OpenAI's `text-embedding-ada-002` in most [quality benchmarks](https://huggingface.co/spaces/mteb/leaderboard).
+* **It's good** - The `review_body` results are very similar to the "Best 1980's scifi movie" request text. We're using the `intfloat/e5-small-v2` open source embedding model, which outperforms OpenAI's `text-embedding-ada-002` in most [quality benchmarks](https://huggingface.co/spaces/mteb/leaderboard).
 * Qualitatively: the embeddings understand our request for `scifi` being equivalent to `Sci-Fi`, `sci-fi`, `SciFi`, and `sci fi`, as well as `1980's` matching `80s` and `80's` and is close to `seventies` (last place). We didn't have to configure any of this and the most enthusiastic for "best" is at the top, the least enthusiastic is at the bottom, so the model has appropriately captured "sentiment".
 * Quantitatively: the `cosine_similarity` of all results are high and tight, 0.90-0.95 on a scale from -1:1. We can be confident we recalled very similar results from our 5M candidates, even though it would take 485 times as long to check all of them directly.
 * **It's reliable** - The model is stored in the database, so we don't need to worry about managing a separate service. If you repeat this query over and over, the timings will be extremely consistent, because we don't have to deal with things like random network congestion.

@@ -254,7 +254,7 @@ Now we can quickly search for movies by what people have said about them:
 ```postgresql
 WITH request AS (
   SELECT pgml.embed(
-    'intfloat/e5-large',
+    'intfloat/e5-small-v2',
     'Best 1980''s scifi movie'
   )::vector(1024) AS embedding
 )

@@ -312,7 +312,7 @@ SET ivfflat.probes = 300;
 ```postgresql
 WITH request AS (
   SELECT pgml.embed(
-    'intfloat/e5-large',
+    'intfloat/e5-small-v2',
     'Best 1980''s scifi movie'
   )::vector(1024) AS embedding
 )

@@ -401,7 +401,7 @@ SET ivfflat.probes = 1;
 ```postgresql
 WITH request AS (
   SELECT pgml.embed(
-    'intfloat/e5-large',
+    'intfloat/e5-small-v2',
     'query: Best 1980''s scifi movie'
   )::vector(1024) AS embedding
 )

@@ -457,7 +457,7 @@ SQL is a very expressive language that can handle a lot of complexity. To keep t
 -- create a request embedding on the fly
 WITH request AS (
   SELECT pgml.embed(
-    'intfloat/e5-large',
+    'intfloat/e5-small-v2',
     'query: Best 1980''s scifi movie'
   )::vector(1024) AS embedding
 ),
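The `cosine_similarity` values this blog discusses (0.90-0.95 on a -1 to 1 scale) come from the standard dot-product-over-norms formula. A self-contained sketch on toy vectors, independent of pgvector:

```python
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors: dot(a, b) / (|a| * |b|)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Parallel vectors score 1.0; orthogonal vectors score 0.0.
print(cosine_similarity([1.0, 2.0], [2.0, 4.0]))  # 1.0 (within float error)
print(cosine_similarity([1.0, 0.0], [0.0, 1.0]))  # 0.0
```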

pgml-cms/blog/using-postgresml-with-django-and-embedding-search.md

Lines changed: 3 additions & 3 deletions

@@ -58,7 +58,7 @@ class EmbedSmallExpression(models.Expression):
         self.embedding_field = field
 
     def as_sql(self, compiler, connection, template=None):
-        return f"pgml.embed('intfloat/e5-small', {self.embedding_field})", None
+        return f"pgml.embed('intfloat/e5-small-v2', {self.embedding_field})", None
 ```
 
 And that's it! In just a few lines of code, we're generating and storing high quality embeddings automatically in our database. No additional setup is required, and all the AI complexity is taken care of by PostgresML.

@@ -70,7 +70,7 @@ Django Rest Framework provides the bulk of the implementation. We just added a `M
 ```python
 results = TodoItem.objects.annotate(
     similarity=RawSQL(
-        "pgml.embed('intfloat/e5-small', %s)::vector(384) <=> embedding",
+        "pgml.embed('intfloat/e5-small-v2', %s)::vector(384) <=> embedding",
         [query],
     )
 ).order_by("similarity")

@@ -115,7 +115,7 @@ In return, you'll get your to-do item alongside the embedding of the `descriptio
 
 The embedding contains 384 floating point numbers; we removed most of them in this blog post to make sure it fits on the page.
 
-You can try creating multiple to-do items for fun and profit. If the description is changed, so will the embedding, demonstrating how the `intfloat/e5-small` model understands the semantic meaning of your text.
+You can try creating multiple to-do items for fun and profit. If the description is changed, so will the embedding, demonstrating how the `intfloat/e5-small-v2` model understands the semantic meaning of your text.
 
 ### Searching
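The `as_sql` change above only swaps the model name baked into the generated SQL string. A standalone sketch of the fragment that expression produces (the helper function is illustrative; the real Django expression returns a `(sql, params)` tuple):

```python
def embed_expression_sql(embedding_field: str) -> str:
    """Render the SQL fragment an EmbedSmallExpression-style class would emit."""
    return f"pgml.embed('intfloat/e5-small-v2', {embedding_field})"

# The field name is interpolated directly into the SQL, so it must be a
# trusted column identifier, never user input.
print(embed_expression_sql("description"))
```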

pgml-cms/docs/api/client-sdk/README.md

Lines changed: 3 additions & 3 deletions

@@ -80,7 +80,7 @@ const pipeline = pgml.newPipeline("sample_pipeline", {
   text: {
     splitter: { model: "recursive_character" },
     semantic_search: {
-      model: "intfloat/e5-small",
+      model: "intfloat/e5-small-v2",
     },
   },
 });

@@ -98,7 +98,7 @@ pipeline = Pipeline(
     "text": {
         "splitter": {"model": "recursive_character"},
         "semantic_search": {
-            "model": "intfloat/e5-small",
+            "model": "intfloat/e5-small-v2",
         },
     },
 },

@@ -111,7 +111,7 @@ await collection.add_pipeline(pipeline)
 
 The pipeline configuration is a key/value object, where the key is the name of a column in a document, and the value is the action the SDK should perform on that column.
 
-In this example, the documents contain a column called `text` which we are instructing the SDK to chunk the contents of using the recursive character splitter, and to embed those chunks using the Hugging Face `intfloat/e5-small` embeddings model.
+In this example, the documents contain a column called `text` which we are instructing the SDK to chunk the contents of using the recursive character splitter, and to embed those chunks using the Hugging Face `intfloat/e5-small-v2` embeddings model.
 
 ### Add documents
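As the SDK docs above describe, the pipeline configuration maps a document column name to the actions performed on it. A plain-Python sketch of that config shape (mirroring the Python snippet in the diff; the dict is just data, no SDK required):

```python
pipeline_schema = {
    "text": {  # key: document column to process
        "splitter": {"model": "recursive_character"},          # chunking action
        "semantic_search": {"model": "intfloat/e5-small-v2"},  # embedding action
    },
}

print(pipeline_schema["text"]["semantic_search"]["model"])
```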
