
Commit fa262b4

rename embedding models

1 parent 6d60aac

50 files changed (+275, −181 lines)


packages/pgml-rds-proxy/README.md

Lines changed: 1 addition & 1 deletion

@@ -76,7 +76,7 @@ SELECT
 FROM
   dblink(
     'postgresml',
-    'SELECT * FROM pgml.embed(''intfloat/e5-small'', ''embed this text'') AS embedding'
+    'SELECT * FROM pgml.embed(''intfloat/e5-small-v2'', ''embed this text'') AS embedding'
   ) AS t1(embedding real[386]);
 ```

pgml-apps/pgml-chat/pgml_chat/main.py

Lines changed: 2 additions & 3 deletions

@@ -195,9 +195,8 @@ def handler(signum, frame):
     )

     splitter = Splitter(splitter_name, splitter_params)
-    model_name = "hkunlp/instructor-xl"
-    model_embedding_instruction = "Represent the %s document for retrieval: " % (bot_topic)
-    model_params = {"instruction": model_embedding_instruction}
+    model_name = "intfloat/e5-small-v2"
+    model_params = {}

     model = Model(model_name, "pgml", model_params)
     pipeline = Pipeline(args.collection_name + "_pipeline", model, splitter)
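The switch above drops the instructor-specific `instruction` parameter entirely, since e5-family models need no extra model params. A minimal plain-Python sketch of that distinction (the helper name and the branching are illustrative, not part of pgml-chat):

```python
def embedding_params(model_name: str, bot_topic: str = "") -> dict:
    """Build model params: instructor models need an instruction, e5 models do not."""
    if model_name.startswith("hkunlp/instructor"):
        # instructor-style models take a task description alongside the text
        return {"instruction": "Represent the %s document for retrieval: " % bot_topic}
    # e5-family models rely on "passage:"/"query:" text prefixes instead
    return {}

print(embedding_params("intfloat/e5-small-v2"))
print(embedding_params("hkunlp/instructor-xl", "wiki"))
```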

pgml-cms/blog/generating-llm-embeddings-with-open-source-models-in-postgresml.md

Lines changed: 13 additions & 7 deletions

@@ -122,14 +122,14 @@ LIMIT 5;
 
 PostgresML provides a simple interface to generate embeddings from text in your database. You can use the [`pgml.embed`](https://postgresml.org/docs/guides/transformers/embeddings) function to generate embeddings for a column of text. The function takes a transformer name and a text value. The transformer will automatically be downloaded and cached on your connection process for reuse. You can see a list of potential good candidate models to generate embeddings on the [Massive Text Embedding Benchmark leaderboard](https://huggingface.co/spaces/mteb/leaderboard).
 
-Since our corpus of documents (movie reviews) are all relatively short and similar in style, we don't need a large model. [`intfloat/e5-small`](https://huggingface.co/intfloat/e5-small) will be a good first attempt. The great thing about PostgresML is you can always regenerate your embeddings later to experiment with different embedding models.
+Since our corpus of documents (movie reviews) are all relatively short and similar in style, we don't need a large model. [`intfloat/e5-small-v2`](https://huggingface.co/intfloat/e5-small-v2) will be a good first attempt. The great thing about PostgresML is you can always regenerate your embeddings later to experiment with different embedding models.
 
-It takes a couple of minutes to download and cache the `intfloat/e5-small` model to generate the first embedding. After that, it's pretty fast.
+It takes a couple of minutes to download and cache the `intfloat/e5-small-v2` model to generate the first embedding. After that, it's pretty fast.
 
 Note how we prefix the text we want to embed with either `passage:` or `query:`; the e5 model requires us to prefix our data with `passage:` if we're generating embeddings for our corpus and `query:` if we want to find semantically similar content.
 
 ```postgresql
-SELECT pgml.embed('intfloat/e5-small', 'passage: hi mom');
+SELECT pgml.embed('intfloat/e5-small-v2', 'passage: hi mom');
 ```
 
 This is a pretty powerful function, because we can pass any arbitrary text to any open source model, and it will generate an embedding for us. We can benchmark how long it takes to generate an embedding for a single review, using client-side timings in Postgres:

@@ -147,7 +147,7 @@ Aside from using this function with strings passed from a client, we can use it
 ```postgresql
 SELECT
   review_body,
-  pgml.embed('intfloat/e5-small', 'passage: ' || review_body)
+  pgml.embed('intfloat/e5-small-v2', 'passage: ' || review_body)
 FROM pgml.amazon_us_reviews
 LIMIT 1;
 ```

@@ -171,7 +171,7 @@ Time to generate an embedding increases with the length of the input text, and v
 ```postgresql
 SELECT
   review_body,
-  pgml.embed('intfloat/e5-small', 'passage: ' || review_body) AS embedding
+  pgml.embed('intfloat/e5-small-v2', 'passage: ' || review_body) AS embedding
 FROM pgml.amazon_us_reviews
 LIMIT 1000;
 ```

@@ -190,7 +190,7 @@ We can also do a quick sanity check to make sure we're really getting value out
 SELECT
   review_body,
   pgml.embed(
-    'intfloat/e5-small',
+    'intfloat/e5-small-v2',
     'passage: ' || review_body,
     '{"device": "cpu"}'
   ) AS embedding

@@ -224,6 +224,12 @@ You can also find embedding models that outperform OpenAI's `text-embedding-ada-
 
 The current leading model is `hkunlp/instructor-xl`. Instructor models take an additional `instruction` parameter which includes context for the embeddings use case, similar to prompts before text generation tasks.
 
+!!! note
+
+"intfloat/e5-small-v2" surpassed the quality of instructor-xl, and should be used instead, but we've left this documentation available for existing users.
+
+!!!
+
 Instructions can provide a "classification" or "topic" for the text:
 
 #### Classification

@@ -325,7 +331,7 @@ BEGIN
 
 UPDATE pgml.amazon_us_reviews
 SET review_embedding_e5_large = pgml.embed(
-  'intfloat/e5-large',
+  'intfloat/e5-small-v2',
   'passage: ' || review_body
 )
 WHERE id BETWEEN i AND i + 10
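The hunks above repeatedly apply the e5 convention of prefixing corpus text with `passage:` and search text with `query:`. A small illustrative helper for that convention (the function name is ours, not part of PostgresML); the prefixed string is what you'd pass to `pgml.embed`:

```python
def e5_prefix(text: str, kind: str = "passage") -> str:
    """Prefix text per the e5 convention: 'passage:' for corpus, 'query:' for searches."""
    if kind not in ("passage", "query"):
        raise ValueError("kind must be 'passage' or 'query'")
    return f"{kind}: {text}"

# Corpus documents get the passage prefix, search strings the query prefix.
print(e5_prefix("hi mom"))                            # passage: hi mom
print(e5_prefix("Best 1980's scifi movie", "query"))  # query: Best 1980's scifi movie
```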

pgml-cms/blog/personalize-embedding-results-with-application-data-in-your-database.md

Lines changed: 2 additions & 2 deletions

@@ -137,7 +137,7 @@ We can find a customer that our embeddings model feels is close to the sentiment
 ```postgresql
 WITH request AS (
   SELECT pgml.embed(
-    'intfloat/e5-large',
+    'intfloat/e5-small-v2',
     'query: I love all Star Wars, but Empire Strikes Back is particularly amazing'
   )::vector(1024) AS embedding
 )

@@ -214,7 +214,7 @@ Now we can write our personalized SQL query. It's nearly the same as our query f
 -- create a request embedding on the fly
 WITH request AS (
   SELECT pgml.embed(
-    'intfloat/e5-large',
+    'intfloat/e5-small-v2',
     'query: Best 1980''s scifi movie'
   )::vector(1024) AS embedding
 ),

pgml-cms/blog/pgml-chat-a-command-line-tool-for-deploying-low-latency-knowledge-based-chatbots-part-i.md

Lines changed: 2 additions & 4 deletions

@@ -127,9 +127,7 @@ cp .env.template .env
 ```bash
 OPENAI_API_KEY=<OPENAI_API_KEY>
 DATABASE_URL=<POSTGRES_DATABASE_URL starts with postgres://>
-MODEL=hkunlp/instructor-xl
-MODEL_PARAMS={"instruction":"Represent the document for retrieval:"}
-QUERY_PARAMS={"instruction":"Represent the question for retrieving supporting documents:"}
+MODEL=intfloat/e5-small-v2
 SYSTEM_PROMPT=<> # System prompt used for OpenAI chat completion
 BASE_PROMPT=<> # Base prompt used for OpenAI chat completion for each turn
 SLACK_BOT_TOKEN=<SLACK_BOT_TOKEN> # Slack bot token to run Slack chat service

@@ -332,7 +330,7 @@ Once the discord app is running, you can interact with the chatbot on Discord as
 
 ### PostgresML vs. Hugging Face + Pinecone
 
-To evaluate query latency, we performed an experiment with 10,000 Wikipedia documents from the SQuAD dataset. Embeddings were generated using the intfloat/e5-large model.
+To evaluate query latency, we performed an experiment with 10,000 Wikipedia documents from the SQuAD dataset. Embeddings were generated using the intfloat/e5-small-v2 model.
 
 For PostgresML, we used a GPU-powered serverless database running on NVIDIA A10G GPUs with a client in the us-west-2 region. For HuggingFace, we used their inference API endpoint running on NVIDIA A10G GPUs in the us-east-1 region and a client in the same us-east-1 region. Pinecone was used as the vector search index for HuggingFace embeddings.

pgml-cms/blog/speeding-up-vector-recall-5x-with-hnsw.md

Lines changed: 2 additions & 2 deletions

@@ -45,7 +45,7 @@ Let's run that query again:
 ```postgresql
 WITH request AS (
   SELECT pgml.embed(
-    'intfloat/e5-large',
+    'intfloat/e5-small-v2',
     'query: Best 1980''s scifi movie'
   )::vector(1024) AS embedding
 )

@@ -100,7 +100,7 @@ Now let's try the query again utilizing the new HNSW index we created.
 ```postgresql
 WITH request AS (
   SELECT pgml.embed(
-    'intfloat/e5-large',
+    'intfloat/e5-small-v2',
     'query: Best 1980''s scifi movie'
   )::vector(1024) AS embedding
 )

pgml-cms/blog/the-1.0-sdk-is-here.md

Lines changed: 2 additions & 2 deletions

@@ -50,7 +50,7 @@ const pipeline = pgml.newPipeline("my_pipeline", {
   text: {
     splitter: { model: "recursive_character" },
     semantic_search: {
-      model: "intfloat/e5-small",
+      model: "intfloat/e5-small-v2",
     },
   },
 });

@@ -90,7 +90,7 @@ pipeline = Pipeline(
     "text": {
         "splitter": {"model": "recursive_character"},
         "semantic_search": {
-            "model": "intfloat/e5-small",
+            "model": "intfloat/e5-small-v2",
         },
     },
 },

pgml-cms/blog/tuning-vector-recall-while-generating-query-embeddings-in-the-database.md

Lines changed: 6 additions & 6 deletions

@@ -124,7 +124,7 @@ We'll start with semantic search. Given a user query, e.g. "Best 1980's scifi mo
 ```postgresql
 WITH request AS (
   SELECT pgml.embed(
-    'intfloat/e5-large',
+    'intfloat/e5-small-v2',
     'query: Best 1980''s scifi movie'
   )::vector(1024) AS embedding
 )

@@ -171,7 +171,7 @@ Generating a query plan more quickly and only computing the values once, may mak
 There's some good stuff happening in those query results, so let's break it down:
 
 * **It's fast** - We're able to generate a request embedding on the fly with a state-of-the-art model, and search 5M reviews in 152ms, including fetching the results back to the client 😍. You can't even generate an embedding from OpenAI's API in that time, much less search 5M reviews in some other database with it.
-* **It's good** - The `review_body` results are very similar to the "Best 1980's scifi movie" request text. We're using the `intfloat/e5-large` open source embedding model, which outperforms OpenAI's `text-embedding-ada-002` in most [quality benchmarks](https://huggingface.co/spaces/mteb/leaderboard).
+* **It's good** - The `review_body` results are very similar to the "Best 1980's scifi movie" request text. We're using the `intfloat/e5-small-v2` open source embedding model, which outperforms OpenAI's `text-embedding-ada-002` in most [quality benchmarks](https://huggingface.co/spaces/mteb/leaderboard).
 * Qualitatively: the embeddings understand our request for `scifi` being equivalent to `Sci-Fi`, `sci-fi`, `SciFi`, and `sci fi`, as well as `1980's` matching `80s` and `80's` and is close to `seventies` (last place). We didn't have to configure any of this and the most enthusiastic for "best" is at the top, the least enthusiastic is at the bottom, so the model has appropriately captured "sentiment".
 * Quantitatively: the `cosine_similarity` of all results are high and tight, 0.90-0.95 on a scale from -1:1. We can be confident we recalled very similar results from our 5M candidates, even though it would take 485 times as long to check all of them directly.
 * **It's reliable** - The model is stored in the database, so we don't need to worry about managing a separate service. If you repeat this query over and over, the timings will be extremely consistent, because we don't have to deal with things like random network congestion.

@@ -254,7 +254,7 @@ Now we can quickly search for movies by what people have said about them:
 ```postgresql
 WITH request AS (
   SELECT pgml.embed(
-    'intfloat/e5-large',
+    'intfloat/e5-small-v2',
     'Best 1980''s scifi movie'
   )::vector(1024) AS embedding
 )

@@ -312,7 +312,7 @@ SET ivfflat.probes = 300;
 ```postgresql
 WITH request AS (
   SELECT pgml.embed(
-    'intfloat/e5-large',
+    'intfloat/e5-small-v2',
     'Best 1980''s scifi movie'
   )::vector(1024) AS embedding
 )

@@ -401,7 +401,7 @@ SET ivfflat.probes = 1;
 ```postgresql
 WITH request AS (
   SELECT pgml.embed(
-    'intfloat/e5-large',
+    'intfloat/e5-small-v2',
     'query: Best 1980''s scifi movie'
   )::vector(1024) AS embedding
 )

@@ -457,7 +457,7 @@ SQL is a very expressive language that can handle a lot of complexity. To keep t
 -- create a request embedding on the fly
 WITH request AS (
   SELECT pgml.embed(
-    'intfloat/e5-large',
+    'intfloat/e5-small-v2',
     'query: Best 1980''s scifi movie'
   )::vector(1024) AS embedding
 ),
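The `cosine_similarity` values this blog discusses (0.90-0.95 on a -1 to 1 scale) come from the standard dot-product-over-norms formula. A self-contained sketch on toy vectors, independent of pgvector:

```python
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors: dot(a, b) / (|a| * |b|)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Parallel vectors score 1.0; orthogonal vectors score 0.0.
print(cosine_similarity([1.0, 2.0], [2.0, 4.0]))  # 1.0 (within float error)
print(cosine_similarity([1.0, 0.0], [0.0, 1.0]))  # 0.0
```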

pgml-cms/blog/using-postgresml-with-django-and-embedding-search.md

Lines changed: 3 additions & 3 deletions

@@ -58,7 +58,7 @@ class EmbedSmallExpression(models.Expression):
         self.embedding_field = field
 
     def as_sql(self, compiler, connection, template=None):
-        return f"pgml.embed('intfloat/e5-small', {self.embedding_field})", None
+        return f"pgml.embed('intfloat/e5-small-v2', {self.embedding_field})", None
 ```
 
 And that's it! In just a few lines of code, we're generating and storing high quality embeddings automatically in our database. No additional setup is required, and all the AI complexity is taken care of by PostgresML.

@@ -70,7 +70,7 @@ Django Rest Framework provides the bulk of the implementation. We just added a `M
 ```python
 results = TodoItem.objects.annotate(
     similarity=RawSQL(
-        "pgml.embed('intfloat/e5-small', %s)::vector(384) <=> embedding",
+        "pgml.embed('intfloat/e5-small-v2', %s)::vector(384) <=> embedding",
         [query],
     )
 ).order_by("similarity")

@@ -115,7 +115,7 @@ In return, you'll get your to-do item alongside the embedding of the `descriptio
 
 The embedding contains 384 floating point numbers; we removed most of them in this blog post to make sure it fits on the page.
 
-You can try creating multiple to-do items for fun and profit. If the description is changed, so will the embedding, demonstrating how the `intfloat/e5-small` model understands the semantic meaning of your text.
+You can try creating multiple to-do items for fun and profit. If the description is changed, so will the embedding, demonstrating how the `intfloat/e5-small-v2` model understands the semantic meaning of your text.
 
 ### Searching
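The `as_sql` change above only swaps the model name baked into the generated SQL string. A standalone sketch of the fragment that expression produces (the helper function is illustrative; the real Django expression returns a `(sql, params)` tuple):

```python
def embed_expression_sql(embedding_field: str) -> str:
    """Render the SQL fragment an EmbedSmallExpression-style class would emit."""
    return f"pgml.embed('intfloat/e5-small-v2', {embedding_field})"

# The field name is interpolated directly into the SQL, so it must be a
# trusted column identifier, never user input.
print(embed_expression_sql("description"))
```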

pgml-cms/docs/api/client-sdk/README.md

Lines changed: 3 additions & 3 deletions

@@ -80,7 +80,7 @@ const pipeline = pgml.newPipeline("sample_pipeline", {
   text: {
     splitter: { model: "recursive_character" },
     semantic_search: {
-      model: "intfloat/e5-small",
+      model: "intfloat/e5-small-v2",
     },
   },
 });

@@ -98,7 +98,7 @@ pipeline = Pipeline(
     "text": {
         "splitter": {"model": "recursive_character"},
         "semantic_search": {
-            "model": "intfloat/e5-small",
+            "model": "intfloat/e5-small-v2",
         },
     },
 },

@@ -111,7 +111,7 @@ await collection.add_pipeline(pipeline)
 
 The pipeline configuration is a key/value object, where the key is the name of a column in a document, and the value is the action the SDK should perform on that column.
 
-In this example, the documents contain a column called `text` which we are instructing the SDK to chunk the contents of using the recursive character splitter, and to embed those chunks using the Hugging Face `intfloat/e5-small` embeddings model.
+In this example, the documents contain a column called `text` which we are instructing the SDK to chunk the contents of using the recursive character splitter, and to embed those chunks using the Hugging Face `intfloat/e5-small-v2` embeddings model.
 
 ### Add documents
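As the SDK docs above describe, the pipeline configuration maps a document column name to the actions performed on it. A plain-Python sketch of that config shape (mirroring the Python snippet in the diff; the dict is just data, no SDK required):

```python
pipeline_schema = {
    "text": {  # key: document column to process
        "splitter": {"model": "recursive_character"},          # chunking action
        "semantic_search": {"model": "intfloat/e5-small-v2"},  # embedding action
    },
}

print(pipeline_schema["text"]["semantic_search"]["model"])
```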
