Description
I found a specific query that causes a crash in postgres.
I reproduced this on both Postgres 15 and 16 on my Debian installation, and it also crashes Postgres in the latest Docker image (ghcr.io/postgresml/postgresml:2.7.3).
To reproduce, run the following sequence:
SELECT pgml.load_dataset('ag_news');
DROP TABLE phrases2;
CREATE TABLE phrases2 (id serial PRIMARY KEY, phrase text, embedding vector(384));
INSERT INTO phrases2 (phrase, embedding) SELECT text, pgml.embed('all-MiniLM-L6-v2', text)::vector FROM pgml.ag_news LIMIT 10000;
After the import into phrases2 completes, exit the psql client and start a new psql session. Then run this query:
WITH Embeddings AS (
    SELECT pgml.embed('all-MiniLM-L6-v2', p.phrase) AS embedding, id
    FROM phrases2 p
    LIMIT 45
)
SELECT p.id, p.phrase, 1 - (p.embedding <=> e.embedding::vector) AS similarity
FROM Embeddings e
JOIN phrases2 p ON true
ORDER BY (p.embedding <=> e.embedding::vector)
LIMIT 10;
This produces the following log output, and the server process aborts:
: CommandLine Error: Option 'nvptx-no-f16-math' registered more than once!
LLVM ERROR: inconsistency in registered CommandLine options
2023-11-29 20:30:51.296 UTC [29] LOG: server process (PID 162) was terminated by signal 6: Aborted
I found that initializing the embedding first mitigates the bug. If you start a new psql client and run a single embed call before the query, as below, the crash does not occur:
SELECT pgml.embed('all-MiniLM-L6-v2', 'initialize the framework')::vector;
WITH Embeddings AS (
    SELECT pgml.embed('all-MiniLM-L6-v2', p.phrase) AS embedding, id
    FROM phrases2 p
    LIMIT 45
)
SELECT p.id, p.phrase, 1 - (p.embedding <=> e.embedding::vector) AS similarity
FROM Embeddings e
JOIN phrases2 p ON true
ORDER BY (p.embedding <=> e.embedding::vector)
LIMIT 10;