pgvector support for Java, Kotlin, Groovy, and Scala
Supports JDBC, Spring JDBC, Groovy SQL, and Slick
For Maven, add to pom.xml under <dependencies>:
<dependency>
  <groupId>com.pgvector</groupId>
  <artifactId>pgvector</artifactId>
  <version>0.1.6</version>
</dependency>
For sbt, add to build.sbt:
libraryDependencies += "com.pgvector" % "pgvector" % "0.1.6"
For other build tools, see this page.
And follow the instructions for your database library:
- Java - JDBC, Spring JDBC, Hibernate, R2DBC
- Kotlin - JDBC
- Groovy - JDBC, Groovy SQL
- Scala - JDBC, Slick
Or check out some examples:
- Embeddings with OpenAI
- Binary embeddings with Cohere
- Sentence embeddings with Deep Java Library
- Hybrid search with Deep Java Library (Reciprocal Rank Fusion)
- Sparse search with Text Embeddings Inference
- Extended-connectivity fingerprints with the Chemistry Development Kit
- Recommendations with Disco
- Horizontal scaling with Citus
- Bulk loading with COPY
JDBC (Java)
Import the PGvector class
import com.pgvector.PGvector;
Enable the extension
Statement setupStmt = conn.createStatement();
setupStmt.executeUpdate("CREATE EXTENSION IF NOT EXISTS vector");
Register the types with your connection
PGvector.registerTypes(conn);
Create a table
Statement createStmt = conn.createStatement();
createStmt.executeUpdate("CREATE TABLE items (id bigserial PRIMARY KEY, embedding vector(3))");
Insert a vector
PreparedStatement insertStmt = conn.prepareStatement("INSERT INTO items (embedding) VALUES (?)");
insertStmt.setObject(1, new PGvector(new float[] {1, 1, 1}));
insertStmt.executeUpdate();
Get the nearest neighbors
PreparedStatement neighborStmt = conn.prepareStatement("SELECT * FROM items ORDER BY embedding <-> ? LIMIT 5");
neighborStmt.setObject(1, new PGvector(new float[] {1, 1, 1}));
ResultSet rs = neighborStmt.executeQuery();
while (rs.next()) {
    System.out.println((PGvector) rs.getObject("embedding"));
}
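To make the query above concrete, here is a minimal in-memory sketch of what an exact (unindexed) nearest-neighbor search computes: sort every row by its L2 distance to the query and keep the first k. The class and method names are hypothetical and are not part of pgvector-java; the real work happens inside PostgreSQL.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical illustration only; not part of pgvector-java.
final class ExactNearestNeighbors {
    // Euclidean (L2) distance between two vectors of equal length.
    static double l2Distance(float[] a, float[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException("different dimensions");
        }
        double sum = 0;
        for (int i = 0; i < a.length; i++) {
            double d = a[i] - b[i];
            sum += d * d;
        }
        return Math.sqrt(sum);
    }

    // Returns the positions of the k rows closest to the query, nearest first.
    // This mirrors "ORDER BY embedding <-> ? LIMIT k" without any index.
    static List<Integer> nearest(List<float[]> rows, float[] query, int k) {
        List<Integer> ids = new ArrayList<>();
        for (int i = 0; i < rows.size(); i++) {
            ids.add(i);
        }
        ids.sort(Comparator.comparingDouble(i -> l2Distance(rows.get(i), query)));
        return new ArrayList<>(ids.subList(0, Math.min(k, ids.size())));
    }
}
```

An approximate index trades some of this exactness for speed, which is why the results of an indexed query can differ from a full scan.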
Add an approximate index
Statement indexStmt = conn.createStatement();
indexStmt.executeUpdate("CREATE INDEX ON items USING hnsw (embedding vector_l2_ops)");
// or
indexStmt.executeUpdate("CREATE INDEX ON items USING ivfflat (embedding vector_l2_ops) WITH (lists = 100)");
Use vector_ip_ops for inner product and vector_cosine_ops for cosine distance
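The three operator classes correspond to three different measures. As a rough sketch of what each one means, the plain-Java helper below computes L2 distance, inner product, and cosine distance for two float arrays. The class is hypothetical and not part of pgvector-java; it only illustrates the math, and it assumes both inputs have the same length.

```java
// Hypothetical illustration only; not part of pgvector-java.
final class VectorDistances {
    // Euclidean (L2) distance: square root of the summed squared differences.
    static double l2Distance(float[] a, float[] b) {
        checkSameLength(a, b);
        double sum = 0;
        for (int i = 0; i < a.length; i++) {
            double d = a[i] - b[i];
            sum += d * d;
        }
        return Math.sqrt(sum);
    }

    // Inner (dot) product: larger means more similar.
    static double innerProduct(float[] a, float[] b) {
        checkSameLength(a, b);
        double sum = 0;
        for (int i = 0; i < a.length; i++) {
            sum += (double) a[i] * b[i];
        }
        return sum;
    }

    // Cosine distance: 1 minus the cosine of the angle between the vectors.
    // Returns NaN if either vector is all zeros.
    static double cosineDistance(float[] a, float[] b) {
        double dot = innerProduct(a, b);
        double normA = Math.sqrt(innerProduct(a, a));
        double normB = Math.sqrt(innerProduct(b, b));
        return 1.0 - dot / (normA * normB);
    }

    private static void checkSameLength(float[] a, float[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException("different dimensions");
        }
    }
}
```

Pick the operator class that matches the distance your queries order by, otherwise the index may not be used for that query.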
See a full example
Spring JDBC
Import the PGvector class
import com.pgvector.PGvector;
Enable the extension
jdbcTemplate.execute("CREATE EXTENSION IF NOT EXISTS vector");
Create a table
jdbcTemplate.execute("CREATE TABLE items (id bigserial PRIMARY KEY, embedding vector(3))");
Insert a vector
Object[] insertParams = new Object[] { new PGvector(new float[] {1, 1, 1}) };
jdbcTemplate.update("INSERT INTO items (embedding) VALUES (?)", insertParams);
Get the nearest neighbors
Object[] neighborParams = new Object[] { new PGvector(new float[] {1, 1, 1}) };
List<Map<String, Object>> rows = jdbcTemplate.queryForList("SELECT * FROM items ORDER BY embedding <-> ? LIMIT 5", neighborParams);
for (Map<String, Object> row : rows) {
    System.out.println(row.get("embedding"));
}
Add an approximate index
jdbcTemplate.execute("CREATE INDEX ON items USING hnsw (embedding vector_l2_ops)");
// or
jdbcTemplate.execute("CREATE INDEX ON items USING ivfflat (embedding vector_l2_ops) WITH (lists = 100)");
Use vector_ip_ops for inner product and vector_cosine_ops for cosine distance
See a full example
Hibernate 6.4+ has a vector module (use this instead of com.pgvector.pgvector).
For Maven, add to pom.xml under <dependencies>:
<dependency>
  <groupId>org.hibernate.orm</groupId>
  <artifactId>hibernate-vector</artifactId>
  <version>6.4.0.Final</version>
</dependency>
Define an entity
import jakarta.persistence.*;
import org.hibernate.annotations.Array;
import org.hibernate.annotations.JdbcTypeCode;
import org.hibernate.type.SqlTypes;

@Entity
class Item {
    @Id
    @GeneratedValue
    private Long id;

    @Column
    @JdbcTypeCode(SqlTypes.VECTOR)
    @Array(length = 3) // dimensions
    private float[] embedding;

    public void setEmbedding(float[] embedding) {
        this.embedding = embedding;
    }
}
Insert a vector
Item item = new Item();
item.setEmbedding(new float[] {1, 1, 1});
entityManager.persist(item);
Get the nearest neighbors
List<Item> items = entityManager
    .createQuery("FROM Item ORDER BY l2_distance(embedding, :embedding) LIMIT 5", Item.class)
    .setParameter("embedding", new float[] {1, 1, 1})
    .getResultList();
See a full example
R2DBC PostgreSQL 1.0.3+ supports the vector type (use this instead of com.pgvector.pgvector).
For Maven, add to pom.xml under <dependencies>:
<dependency>
  <groupId>org.postgresql</groupId>
  <artifactId>r2dbc-postgresql</artifactId>
  <version>1.0.3.RELEASE</version>
</dependency>
JDBC (Kotlin)
Import the PGvector class
import com.pgvector.PGvector
Enable the extension
val setupStmt = conn.createStatement()
setupStmt.executeUpdate("CREATE EXTENSION IF NOT EXISTS vector")
Register the types with your connection
PGvector.registerTypes(conn)
Create a table
val createStmt = conn.createStatement()
createStmt.executeUpdate("CREATE TABLE items (id bigserial PRIMARY KEY, embedding vector(3))")
Insert a vector
val insertStmt = conn.prepareStatement("INSERT INTO items (embedding) VALUES (?)")
insertStmt.setObject(1, PGvector(floatArrayOf(1.0f, 1.0f, 1.0f)))
insertStmt.executeUpdate()
Get the nearest neighbors
val neighborStmt = conn.prepareStatement("SELECT * FROM items ORDER BY embedding <-> ? LIMIT 5")
neighborStmt.setObject(1, PGvector(floatArrayOf(1.0f, 1.0f, 1.0f)))
val rs = neighborStmt.executeQuery()
while (rs.next()) {
    println(rs.getObject("embedding") as PGvector?)
}
Add an approximate index
val indexStmt = conn.createStatement()
indexStmt.executeUpdate("CREATE INDEX ON items USING hnsw (embedding vector_l2_ops)")
// or
indexStmt.executeUpdate("CREATE INDEX ON items USING ivfflat (embedding vector_l2_ops) WITH (lists = 100)")
Use vector_ip_ops for inner product and vector_cosine_ops for cosine distance
See a full example
JDBC (Groovy)
Import the PGvector class
import com.pgvector.PGvector
Enable the extension
def setupStmt = conn.createStatement()
setupStmt.executeUpdate("CREATE EXTENSION IF NOT EXISTS vector")
Register the types with your connection
PGvector.registerTypes(conn)
Create a table
def createStmt = conn.createStatement()
createStmt.executeUpdate("CREATE TABLE items (id bigserial PRIMARY KEY, embedding vector(3))")
Insert a vector
def insertStmt = conn.prepareStatement("INSERT INTO items (embedding) VALUES (?)")
insertStmt.setObject(1, new PGvector([1, 1, 1] as float[]))
insertStmt.executeUpdate()
Get the nearest neighbors
def neighborStmt = conn.prepareStatement("SELECT * FROM items ORDER BY embedding <-> ? LIMIT 5")
neighborStmt.setObject(1, new PGvector([1, 1, 1] as float[]))
def rs = neighborStmt.executeQuery()
while (rs.next()) {
    println((PGvector) rs.getObject("embedding"))
}
Add an approximate index
def indexStmt = conn.createStatement()
indexStmt.executeUpdate("CREATE INDEX ON items USING hnsw (embedding vector_l2_ops)")
// or
indexStmt.executeUpdate("CREATE INDEX ON items USING ivfflat (embedding vector_l2_ops) WITH (lists = 100)")
Use vector_ip_ops for inner product and vector_cosine_ops for cosine distance
See a full example
Groovy SQL
Import the PGvector class
import com.pgvector.PGvector
Enable the extension
sql.execute "CREATE EXTENSION IF NOT EXISTS vector"
Create a table
sql.execute "CREATE TABLE items (id bigserial PRIMARY KEY, embedding vector(3))"
Insert a vector
def params = [new PGvector([1, 1, 1] as float[])]
sql.executeInsert "INSERT INTO items (embedding) VALUES (?)", params
Get the nearest neighbors
def params = [new PGvector([1, 1, 1] as float[])]
sql.eachRow("SELECT * FROM items ORDER BY embedding <-> ? LIMIT 5", params) { row ->
    println row.embedding
}
Add an approximate index
sql.execute "CREATE INDEX ON items USING hnsw (embedding vector_l2_ops)"
// or
sql.execute "CREATE INDEX ON items USING ivfflat (embedding vector_l2_ops) WITH (lists = 100)"
Use vector_ip_ops for inner product and vector_cosine_ops for cosine distance
See a full example
JDBC (Scala)
Import the PGvector class
import com.pgvector.PGvector
Enable the extension
val setupStmt = conn.createStatement()
setupStmt.executeUpdate("CREATE EXTENSION IF NOT EXISTS vector")
Register the types with your connection
PGvector.registerTypes(conn)
Create a table
val createStmt = conn.createStatement()
createStmt.executeUpdate("CREATE TABLE items (id bigserial PRIMARY KEY, embedding vector(3))")
Insert a vector
val insertStmt = conn.prepareStatement("INSERT INTO items (embedding) VALUES (?)")
insertStmt.setObject(1, new PGvector(Array[Float](1, 1, 1)))
insertStmt.executeUpdate()
Get the nearest neighbors
val neighborStmt = conn.prepareStatement("SELECT * FROM items ORDER BY embedding <-> ? LIMIT 5")
neighborStmt.setObject(1, new PGvector(Array[Float](1, 1, 1)))
val rs = neighborStmt.executeQuery()
while (rs.next()) {
  println(rs.getObject("embedding").asInstanceOf[PGvector])
}
Add an approximate index
val indexStmt = conn.createStatement()
indexStmt.executeUpdate("CREATE INDEX ON items USING hnsw (embedding vector_l2_ops)")
// or
indexStmt.executeUpdate("CREATE INDEX ON items USING ivfflat (embedding vector_l2_ops) WITH (lists = 100)")
Use vector_ip_ops for inner product and vector_cosine_ops for cosine distance
See a full example
Slick
Import the PGvector class
import com.pgvector.PGvector
Enable the extension
db.run(sqlu"CREATE EXTENSION IF NOT EXISTS vector")
Add a vector column
class Items(tag: Tag) extends Table[(String)](tag, "items") {
  def embedding = column[String]("embedding", O.SqlType("vector(3)"))
  def * = (embedding)
}
Insert a vector
val embedding = new PGvector(Array[Float](1, 1, 1)).toString
db.run(sqlu"INSERT INTO items (embedding) VALUES ($embedding::vector)")
Get the nearest neighbors
val embedding = new PGvector(Array[Float](1, 1, 1)).toString
db.run(sql"SELECT * FROM items ORDER BY embedding <-> $embedding::vector LIMIT 5".as[(String)])
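The Slick snippets above send the vector as a string and cast it with ::vector. As a rough sketch of what such a string looks like, the helper below builds and parses the bracketed, comma-separated text form that pgvector accepts (for example [1,2,3]). It is written in Java to match the rest of the reference examples, the class is hypothetical, and it makes no claim about the exact output of PGvector's own toString.

```java
// Hypothetical illustration only; not part of pgvector-java.
final class VectorLiteral {
    // Builds a literal such as "[1.0,2.0,3.0]" from a float array.
    static String toLiteral(float[] values) {
        StringBuilder sb = new StringBuilder("[");
        for (int i = 0; i < values.length; i++) {
            if (i > 0) {
                sb.append(',');
            }
            sb.append(values[i]);
        }
        return sb.append(']').toString();
    }

    // Parses a literal such as "[1, 2, 3]" back into a float array.
    static float[] parse(String literal) {
        String body = literal.trim();
        if (!body.startsWith("[") || !body.endsWith("]")) {
            throw new IllegalArgumentException("expected [v1,v2,...]");
        }
        body = body.substring(1, body.length() - 1).trim();
        if (body.isEmpty()) {
            return new float[0];
        }
        String[] parts = body.split(",");
        float[] out = new float[parts.length];
        for (int i = 0; i < parts.length; i++) {
            out[i] = Float.parseFloat(parts[i].trim());
        }
        return out;
    }
}
```

Because the column is mapped as a String in this setup, rows read back from the query arrive in the same text form and can be parsed the same way.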
Add an approximate index
db.run(sqlu"CREATE INDEX ON items USING hnsw (embedding vector_l2_ops)")
// or
db.run(sqlu"CREATE INDEX ON items USING ivfflat (embedding vector_l2_ops) WITH (lists = 100)")
Use vector_ip_ops for inner product and vector_cosine_ops for cosine distance
See a full example
Create a vector from an array
PGvector vec = new PGvector(new float[] {1, 2, 3});
Or a List<T>
List<Float> list = List.of(Float.valueOf(1), Float.valueOf(2), Float.valueOf(3));
PGvector vec = new PGvector(list);
Get an array
float[] arr = vec.toArray();
Create a half vector from an array
PGhalfvec vec = new PGhalfvec(new float[] {1, 2, 3});
Or a List<T>
List<Float> list = List.of(Float.valueOf(1), Float.valueOf(2), Float.valueOf(3));
PGhalfvec vec = new PGhalfvec(list);
Get an array
float[] arr = vec.toArray();
Create a binary vector from a byte array
PGbit vec = new PGbit(new byte[] {(byte) 0b00000000, (byte) 0b11111111});
Or a boolean array
PGbit vec = new PGbit(new boolean[] {true, false, true});
Or a string
PGbit vec = new PGbit("101");
Get the length (number of bits)
int length = vec.length();
Get a byte array
byte[] bytes = vec.toByteArray();
Or a boolean array
boolean[] bits = vec.toArray();
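The three constructors above describe the same kind of value in different shapes. The sketch below shows, in plain Java, how a boolean array, a bit string, and a byte array relate to each other. The class is hypothetical and not part of pgvector-java, and the most-significant-bit-first order used when unpacking bytes is an assumption made for illustration.

```java
// Hypothetical illustration only; not part of pgvector-java.
final class BitVectors {
    // Renders {true, false, true} as "101".
    static String toBitString(boolean[] bits) {
        StringBuilder sb = new StringBuilder(bits.length);
        for (boolean bit : bits) {
            sb.append(bit ? '1' : '0');
        }
        return sb.toString();
    }

    // Parses "101" into {true, false, true}.
    static boolean[] fromBitString(String s) {
        boolean[] bits = new boolean[s.length()];
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (c != '0' && c != '1') {
                throw new IllegalArgumentException("expected only 0 and 1");
            }
            bits[i] = c == '1';
        }
        return bits;
    }

    // Unpacks bytes into bits, 8 per byte.
    // Assumption: the most significant bit of each byte comes first.
    static boolean[] fromBytes(byte[] bytes) {
        boolean[] bits = new boolean[bytes.length * 8];
        for (int i = 0; i < bytes.length; i++) {
            for (int j = 0; j < 8; j++) {
                bits[i * 8 + j] = ((bytes[i] >> (7 - j)) & 1) == 1;
            }
        }
        return bits;
    }
}
```

A byte array always describes a multiple of 8 bits, so the two-byte example above has length 16, while the boolean array and string forms can describe any length.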
Create a sparse vector from an array
PGsparsevec vec = new PGsparsevec(new float[] {1, 0, 2, 0, 3, 0});
Or a map of non-zero elements
Map<Integer, Float> map = new HashMap<Integer, Float>();
map.put(Integer.valueOf(0), Float.valueOf(1));
map.put(Integer.valueOf(2), Float.valueOf(2));
map.put(Integer.valueOf(4), Float.valueOf(3));
PGsparsevec vec = new PGsparsevec(map, 6);
Note: Indices start at 0
Get the number of dimensions
int dim = vec.getDimensions();
Get the indices of non-zero elements
int[] indices = vec.getIndices();
Get the values of non-zero elements
float[] values = vec.getValues();
Get an array
float[] arr = vec.toArray();
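A sparse vector stores only the non-zero elements plus the total number of dimensions. The sketch below shows that idea in plain Java: it splits a dense array into 0-based indices and values, rebuilds the dense array, and renders the {index:value,...}/dimensions text form that pgvector uses in SQL, where indices start at 1. The class is hypothetical and not part of pgvector-java.

```java
// Hypothetical illustration only; not part of pgvector-java.
final class SparseSketch {
    final int dimensions;
    final int[] indices;   // 0-based positions of the non-zero elements
    final float[] values;  // the non-zero elements, in the same order

    private SparseSketch(int dimensions, int[] indices, float[] values) {
        this.dimensions = dimensions;
        this.indices = indices;
        this.values = values;
    }

    // Keeps only the non-zero elements of a dense array.
    static SparseSketch fromDense(float[] dense) {
        int count = 0;
        for (float v : dense) {
            if (v != 0f) {
                count++;
            }
        }
        int[] indices = new int[count];
        float[] values = new float[count];
        int n = 0;
        for (int i = 0; i < dense.length; i++) {
            if (dense[i] != 0f) {
                indices[n] = i;
                values[n] = dense[i];
                n++;
            }
        }
        return new SparseSketch(dense.length, indices, values);
    }

    // Rebuilds the dense array, filling the gaps with zeros.
    float[] toDense() {
        float[] dense = new float[dimensions];
        for (int i = 0; i < indices.length; i++) {
            dense[indices[i]] = values[i];
        }
        return dense;
    }

    // Renders the SQL text form; note the shift to 1-based indices.
    String toText() {
        StringBuilder sb = new StringBuilder("{");
        for (int i = 0; i < indices.length; i++) {
            if (i > 0) {
                sb.append(',');
            }
            sb.append(indices[i] + 1).append(':').append(values[i]);
        }
        return sb.append("}/").append(dimensions).toString();
    }
}
```

The off-by-one between the 0-based Java indices and the 1-based SQL text is the detail most likely to cause mistakes when building sparse vectors by hand.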
View the changelog
Everyone is encouraged to help improve this project. Here are a few ways you can help:
- Report bugs
- Fix bugs and submit pull requests
- Write, clarify, or fix documentation
- Suggest or add new features
To get started with development:
git clone https://github.com/pgvector/pgvector-java.git
cd pgvector-java
createdb pgvector_java_test
mvn test
To run an example:
cd examples/loading
createdb pgvector_example
mvn package
java -jar target/example-jar-with-dependencies.jar