pgvector/pgvector-python

pgvector support for Python

Supports Django, SQLAlchemy, SQLModel, Psycopg 3, Psycopg 2, asyncpg, pg8000, and Peewee

Installation

Run:

pip install pgvector

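And follow the instructions for your database library: Django, SQLAlchemy, SQLModel, Psycopg 3, Psycopg 2, asyncpg, pg8000, or Peewee

Or check out the examples in the repository's examples directory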

Django

Create a migration to enable the extension

from pgvector.django import VectorExtension

class Migration(migrations.Migration):
    operations = [
        VectorExtension()
    ]

Add a vector field to your model

from pgvector.django import VectorField

class Item(models.Model):
    embedding = VectorField(dimensions=3)

Also supports HalfVectorField, BitField, and SparseVectorField

Insert a vector

item = Item(embedding=[1, 2, 3])
item.save()

Get the nearest neighbors to a vector

from pgvector.django import L2Distance

Item.objects.order_by(L2Distance('embedding', [3, 1, 2]))[:5]

Also supports MaxInnerProduct, CosineDistance, L1Distance, HammingDistance, and JaccardDistance
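As a rough guide to what these expressions order by, the distances can be sketched in plain Python. This is not pgvector code: the function names are made up for illustration, and the negated inner product reflects the assumption that MaxInnerProduct maps to pgvector's `<#>` operator, which returns the negative inner product so that ascending order puts the best match first.

```python
import math

# Illustrative helpers only; none of these names come from pgvector.

def l2_distance(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def negative_inner_product(a, b):
    """Negated dot product, so smaller values mean a larger inner product."""
    return -sum(x * y for x, y in zip(a, b))

def cosine_distance(a, b):
    """One minus the cosine similarity of the two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1 - dot / (norm_a * norm_b)

def l1_distance(a, b):
    """Sum of absolute element-wise differences (taxicab distance)."""
    return sum(abs(x - y) for x, y in zip(a, b))

stored = [1, 2, 3]
query = [3, 1, 2]
print(l2_distance(stored, query))             # about 2.449
print(negative_inner_product(stored, query))  # -11
print(cosine_distance(stored, query))         # about 0.214
print(l1_distance(stored, query))             # 4
```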

Get the distance

Item.objects.annotate(distance=L2Distance('embedding', [3, 1, 2]))

Get items within a certain distance

Item.objects.alias(distance=L2Distance('embedding', [3, 1, 2])).filter(distance__lt=5)

Average vectors

from django.db.models import Avg

Item.objects.aggregate(Avg('embedding'))

Also supports Sum
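Both aggregates work element-wise across the stored vectors. A minimal pure-Python sketch of the arithmetic, assuming all vectors share the same number of dimensions (the helper names are hypothetical and not part of pgvector):

```python
def sum_vectors(vectors):
    """Element-wise sum of equal-length vectors; None for an empty input."""
    if not vectors:
        return None
    dimensions = len(vectors[0])
    if any(len(v) != dimensions for v in vectors):
        raise ValueError('all vectors must have the same number of dimensions')
    return [sum(column) for column in zip(*vectors)]

def average_vectors(vectors):
    """Element-wise mean of equal-length vectors; None for an empty input."""
    total = sum_vectors(vectors)
    if total is None:
        return None
    return [value / len(vectors) for value in total]

print(average_vectors([[1, 2, 3], [3, 4, 5]]))  # [2.0, 3.0, 4.0]
print(sum_vectors([[1, 2, 3], [3, 4, 5]]))      # [4, 6, 8]
```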

Add an approximate index

from pgvector.django import HnswIndex, IvfflatIndex

class Item(models.Model):
    class Meta:
        indexes = [
            HnswIndex(
                name='my_index',
                fields=['embedding'],
                m=16,
                ef_construction=64,
                opclasses=['vector_l2_ops']
            ),
            # or
            IvfflatIndex(
                name='my_index',
                fields=['embedding'],
                lists=100,
                opclasses=['vector_l2_ops']
            )
        ]

Use vector_ip_ops for inner product and vector_cosine_ops for cosine distance

Half-Precision Indexing

Index vectors at half-precision

from django.contrib.postgres.indexes import OpClass
from django.db.models.functions import Cast
from pgvector.django import HnswIndex, HalfVectorField

class Item(models.Model):
    class Meta:
        indexes = [
            HnswIndex(
                OpClass(Cast('embedding', HalfVectorField(dimensions=3)), name='halfvec_l2_ops'),
                name='my_index',
                m=16,
                ef_construction=64
            )
        ]

Note: Add 'django.contrib.postgres' to INSTALLED_APPS to use OpClass

Get the nearest neighbors

distance = L2Distance(Cast('embedding', HalfVectorField(dimensions=3)), [3, 1, 2])
Item.objects.order_by(distance)[:5]
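To get a feel for what casting to half precision does to the values, the rounding can be approximated with the standard library's `struct` module, whose `e` format is the IEEE 754 16-bit float. This sketch assumes the half vector type stores elements in that format; `to_half_precision` is a made-up helper, not a pgvector function.

```python
import struct

def to_half_precision(values):
    """Round each value to the nearest 16-bit float and return Python floats."""
    return [struct.unpack('<e', struct.pack('<e', v))[0] for v in values]

# Small integers survive the round trip unchanged.
print(to_half_precision([1, 2, 3]))

# Most decimal fractions are rounded slightly.
print(to_half_precision([0.1, 0.25]))

# Two bytes per element at half precision versus four at single precision.
print(struct.calcsize('<e'), struct.calcsize('<f'))
```

The index is smaller because each element takes half the space, at the cost of the rounding shown above.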

SQLAlchemy

Enable the extension

session.execute(text('CREATE EXTENSION IF NOT EXISTS vector'))

Add a vector column

from pgvector.sqlalchemy import Vector

class Item(Base):
    embedding = mapped_column(Vector(3))

Also supports HALFVEC, BIT, and SPARSEVEC

Insert a vector

item = Item(embedding=[1, 2, 3])
session.add(item)
session.commit()

Get the nearest neighbors to a vector

session.scalars(select(Item).order_by(Item.embedding.l2_distance([3, 1, 2])).limit(5))

Also supports max_inner_product, cosine_distance, l1_distance, hamming_distance, and jaccard_distance

Get the distance

session.scalars(select(Item.embedding.l2_distance([3, 1, 2])))

Get items within a certain distance

session.scalars(select(Item).filter(Item.embedding.l2_distance([3, 1, 2]) < 5))

Average vectors

from pgvector.sqlalchemy import avg

session.scalars(select(avg(Item.embedding))).first()

Also supports sum

Add an approximate index

index = Index(
    'my_index',
    Item.embedding,
    postgresql_using='hnsw',
    postgresql_with={'m': 16, 'ef_construction': 64},
    postgresql_ops={'embedding': 'vector_l2_ops'}
)
# or
index = Index(
    'my_index',
    Item.embedding,
    postgresql_using='ivfflat',
    postgresql_with={'lists': 100},
    postgresql_ops={'embedding': 'vector_l2_ops'}
)

index.create(engine)

Use vector_ip_ops for inner product and vector_cosine_ops for cosine distance

Half-Precision Indexing

Index vectors at half-precision

from pgvector.sqlalchemy import HALFVEC
from sqlalchemy.sql import func

index = Index(
    'my_index',
    func.cast(Item.embedding, HALFVEC(3)).label('embedding'),
    postgresql_using='hnsw',
    postgresql_with={'m': 16, 'ef_construction': 64},
    postgresql_ops={'embedding': 'halfvec_l2_ops'}
)

Get the nearest neighbors

order = func.cast(Item.embedding, HALFVEC(3)).l2_distance([3, 1, 2])
session.scalars(select(Item).order_by(order).limit(5))

Arrays

Add an array column

from pgvector.sqlalchemy import Vector
from sqlalchemy import ARRAY

class Item(Base):
    embeddings = mapped_column(ARRAY(Vector(3)))

And register the types with the underlying driver

For Psycopg 3, use

from pgvector.psycopg import register_vector
from sqlalchemy import event

@event.listens_for(engine, "connect")
def connect(dbapi_connection, connection_record):
    register_vector(dbapi_connection)

For async connections with Psycopg 3, use

from pgvector.psycopg import register_vector_async
from sqlalchemy import event

@event.listens_for(engine.sync_engine, "connect")
def connect(dbapi_connection, connection_record):
    dbapi_connection.run_async(register_vector_async)

For Psycopg 2, use

from pgvector.psycopg2 import register_vector
from sqlalchemy import event

@event.listens_for(engine, "connect")
def connect(dbapi_connection, connection_record):
    register_vector(dbapi_connection, arrays=True)

SQLModel

Enable the extension

session.exec(text('CREATE EXTENSION IF NOT EXISTS vector'))

Add a vector column

from pgvector.sqlalchemy import Vector

class Item(SQLModel, table=True):
    embedding: Any = Field(sa_type=Vector(3))

Also supports HALFVEC, BIT, and SPARSEVEC

Insert a vector

item = Item(embedding=[1, 2, 3])
session.add(item)
session.commit()

Get the nearest neighbors to a vector

session.exec(select(Item).order_by(Item.embedding.l2_distance([3, 1, 2])).limit(5))

Also supports max_inner_product, cosine_distance, l1_distance, hamming_distance, and jaccard_distance

Get the distance

session.exec(select(Item.embedding.l2_distance([3, 1, 2])))

Get items within a certain distance

session.exec(select(Item).filter(Item.embedding.l2_distance([3, 1, 2]) < 5))

Average vectors

from pgvector.sqlalchemy import avg

session.exec(select(avg(Item.embedding))).first()

Also supports sum

Add an approximate index

from sqlmodel import Index

index = Index(
    'my_index',
    Item.embedding,
    postgresql_using='hnsw',
    postgresql_with={'m': 16, 'ef_construction': 64},
    postgresql_ops={'embedding': 'vector_l2_ops'}
)
# or
index = Index(
    'my_index',
    Item.embedding,
    postgresql_using='ivfflat',
    postgresql_with={'lists': 100},
    postgresql_ops={'embedding': 'vector_l2_ops'}
)

index.create(engine)

Use vector_ip_ops for inner product and vector_cosine_ops for cosine distance

Psycopg 3

Enable the extension

conn.execute('CREATE EXTENSION IF NOT EXISTS vector')

Register the types with your connection

from pgvector.psycopg import register_vector

register_vector(conn)

For connection pools, use

def configure(conn):
    register_vector(conn)

pool = ConnectionPool(..., configure=configure)

For async connections, use

from pgvector.psycopg import register_vector_async

await register_vector_async(conn)

Create a table

conn.execute('CREATE TABLE items (id bigserial PRIMARY KEY, embedding vector(3))')

Insert a vector

embedding = np.array([1, 2, 3])
conn.execute('INSERT INTO items (embedding) VALUES (%s)', (embedding,))
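Registering the types is what lets a NumPy array be passed directly. For reference, pgvector's text form of a vector is a bracketed, comma-separated list such as `[1,2,3]`. The helpers below are a hypothetical sketch of converting to and from that form; they are not part of this library.

```python
def to_vector_literal(values):
    """Format a sequence of numbers as a pgvector text literal, e.g. '[1.0,2.0,3.0]'."""
    return '[' + ','.join(str(float(v)) for v in values) + ']'

def from_vector_literal(text):
    """Parse a pgvector text literal back into a list of floats."""
    text = text.strip()
    if not (text.startswith('[') and text.endswith(']')):
        raise ValueError('expected a literal of the form [x,y,z]')
    inner = text[1:-1].strip()
    return [float(part) for part in inner.split(',')] if inner else []

literal = to_vector_literal([1, 2, 3])
print(literal)                       # [1.0,2.0,3.0]
print(from_vector_literal(literal))  # [1.0, 2.0, 3.0]
```

Such a literal could presumably be sent as an ordinary string parameter with an explicit cast in the SQL, for example `VALUES (%s::vector)`; that usage is an assumption and is not exercised here.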

Get the nearest neighbors to a vector

conn.execute('SELECT * FROM items ORDER BY embedding <-> %s LIMIT 5', (embedding,)).fetchall()

Add an approximate index

conn.execute('CREATE INDEX ON items USING hnsw (embedding vector_l2_ops)')
# or
conn.execute('CREATE INDEX ON items USING ivfflat (embedding vector_l2_ops) WITH (lists = 100)')

Use vector_ip_ops for inner product and vector_cosine_ops for cosine distance
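The operator class has to match the distance used in queries. One way to keep the two in step is a small helper that builds the statement from a distance name. The helper and its parameter names are hypothetical; only the three operator classes and the `WITH` options come from the statements in this README.

```python
# Distance name -> operator class, as listed in this README.
OPCLASSES = {
    'l2': 'vector_l2_ops',
    'inner_product': 'vector_ip_ops',
    'cosine': 'vector_cosine_ops',
}

def create_index_sql(table, column, distance='l2', method='hnsw', **options):
    """Build a CREATE INDEX statement for an approximate index.

    Identifiers are interpolated as-is, so pass only trusted names.
    """
    if method not in ('hnsw', 'ivfflat'):
        raise ValueError('method must be hnsw or ivfflat')
    if distance not in OPCLASSES:
        raise ValueError('unknown distance: ' + distance)
    sql = f'CREATE INDEX ON {table} USING {method} ({column} {OPCLASSES[distance]})'
    if options:
        with_clause = ', '.join(f'{name} = {value}' for name, value in options.items())
        sql += f' WITH ({with_clause})'
    return sql

print(create_index_sql('items', 'embedding', distance='cosine'))
print(create_index_sql('items', 'embedding', method='ivfflat', lists=100))
```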

Psycopg 2

Enable the extension

cur = conn.cursor()
cur.execute('CREATE EXTENSION IF NOT EXISTS vector')

Register the types with your connection or cursor

from pgvector.psycopg2 import register_vector

register_vector(conn)

Create a table

cur.execute('CREATE TABLE items (id bigserial PRIMARY KEY, embedding vector(3))')

Insert a vector

embedding = np.array([1, 2, 3])
cur.execute('INSERT INTO items (embedding) VALUES (%s)', (embedding,))

Get the nearest neighbors to a vector

cur.execute('SELECT * FROM items ORDER BY embedding <-> %s LIMIT 5', (embedding,))
cur.fetchall()

Add an approximate index

cur.execute('CREATE INDEX ON items USING hnsw (embedding vector_l2_ops)')
# or
cur.execute('CREATE INDEX ON items USING ivfflat (embedding vector_l2_ops) WITH (lists = 100)')

Use vector_ip_ops for inner product and vector_cosine_ops for cosine distance
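The `lists = 100` value above is only an example. The pgvector documentation suggests starting points of rows / 1000 for tables of up to 1M rows and sqrt(rows) beyond that, with sqrt(lists) probes at query time. A sketch of that rule of thumb follows; the function names are made up here, and the results are starting points to tune rather than fixed answers.

```python
import math

def suggested_lists(row_count):
    """Starting value for the ivfflat lists option, given the table's row count."""
    if row_count <= 1_000_000:
        return max(1, row_count // 1000)
    return math.isqrt(row_count)

def suggested_probes(lists):
    """Starting value for the number of lists probed per query."""
    return max(1, math.isqrt(lists))

print(suggested_lists(100_000))    # 100
print(suggested_lists(4_000_000))  # 2000
print(suggested_probes(100))       # 10
```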

asyncpg

Enable the extension

await conn.execute('CREATE EXTENSION IF NOT EXISTS vector')

Register the types with your connection

from pgvector.asyncpg import register_vector

await register_vector(conn)

or your pool

async def init(conn):
    await register_vector(conn)

pool = await asyncpg.create_pool(..., init=init)

Create a table

await conn.execute('CREATE TABLE items (id bigserial PRIMARY KEY, embedding vector(3))')

Insert a vector

embedding = np.array([1, 2, 3])
await conn.execute('INSERT INTO items (embedding) VALUES ($1)', embedding)

Get the nearest neighbors to a vector

await conn.fetch('SELECT * FROM items ORDER BY embedding <-> $1 LIMIT 5', embedding)
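Without an index the query above is an exact search: every row is compared with the query vector and the closest five are returned. A brute-force equivalent in plain Python can serve as a reference when checking how many true neighbors an approximate index returns. Both helpers are hypothetical and work on in-memory `(id, embedding)` pairs rather than a database.

```python
import heapq
import math

def nearest_neighbors(rows, query, limit=5):
    """Return the `limit` rows closest to `query` by L2 distance, nearest first.

    `rows` is an iterable of (id, embedding) pairs.
    """
    return heapq.nsmallest(limit, rows, key=lambda row: math.dist(row[1], query))

def recall(exact_ids, approximate_ids):
    """Fraction of the exact neighbor ids that also appear in the approximate result."""
    exact = set(exact_ids)
    return len(exact & set(approximate_ids)) / len(exact)

rows = [(1, [1, 2, 3]), (2, [3, 1, 2]), (3, [10, 10, 10])]
print(nearest_neighbors(rows, [3, 1, 2], limit=2))  # ids 2 then 1
print(recall([2, 1], [2, 3]))                       # 0.5
```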

Add an approximate index

await conn.execute('CREATE INDEX ON items USING hnsw (embedding vector_l2_ops)')
# or
await conn.execute('CREATE INDEX ON items USING ivfflat (embedding vector_l2_ops) WITH (lists = 100)')

Use vector_ip_ops for inner product and vector_cosine_ops for cosine distance

pg8000

Enable the extension

conn.run('CREATE EXTENSION IF NOT EXISTS vector')

Register the types with your connection

from pgvector.pg8000 import register_vector

register_vector(conn)

Create a table

conn.run('CREATE TABLE items (id bigserial PRIMARY KEY, embedding vector(3))')

Insert a vector

embedding = np.array([1, 2, 3])
conn.run('INSERT INTO items (embedding) VALUES (:embedding)', embedding=embedding)

Get the nearest neighbors to a vector

conn.run('SELECT * FROM items ORDER BY embedding <-> :embedding LIMIT 5', embedding=embedding)

Add an approximate index

conn.run('CREATE INDEX ON items USING hnsw (embedding vector_l2_ops)')
# or
conn.run('CREATE INDEX ON items USING ivfflat (embedding vector_l2_ops) WITH (lists = 100)')

Use vector_ip_ops for inner product and vector_cosine_ops for cosine distance

Peewee

Add a vector column

from pgvector.peewee import VectorField

class Item(BaseModel):
    embedding = VectorField(dimensions=3)

Also supports HalfVectorField, FixedBitField, and SparseVectorField

Insert a vector

item = Item.create(embedding=[1, 2, 3])

Get the nearest neighbors to a vector

Item.select().order_by(Item.embedding.l2_distance([3, 1, 2])).limit(5)

Also supports max_inner_product, cosine_distance, l1_distance, hamming_distance, and jaccard_distance
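The last two distances apply to bit values rather than numeric vectors. A plain-Python sketch of what they measure, using strings of `0` and `1` as stand-ins for bit columns (the helpers are illustrative only; returning 1.0 when neither value has a set bit is a convention chosen for this sketch):

```python
def hamming_distance(a, b):
    """Number of positions at which two equal-length bit strings differ."""
    if len(a) != len(b):
        raise ValueError('bit strings must have the same length')
    return sum(x != y for x, y in zip(a, b))

def jaccard_distance(a, b):
    """One minus the ratio of shared set bits to bits set in either string."""
    if len(a) != len(b):
        raise ValueError('bit strings must have the same length')
    both = sum(x == '1' and y == '1' for x, y in zip(a, b))
    either = sum(x == '1' or y == '1' for x, y in zip(a, b))
    if either == 0:
        return 1.0  # sketch convention: no set bits means nothing in common
    return 1 - both / either

print(hamming_distance('101', '111'))  # 1
print(jaccard_distance('101', '111'))  # about 0.333
```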

Get the distance

Item.select(Item.embedding.l2_distance([3, 1, 2]).alias('distance'))

Get items within a certain distance

Item.select().where(Item.embedding.l2_distance([3, 1, 2]) < 5)

Average vectors

from peewee import fn

Item.select(fn.avg(Item.embedding).coerce(True)).scalar()

Also supports sum

Add an approximate index

Item.add_index('embedding vector_l2_ops',using='hnsw')

Use vector_ip_ops for inner product and vector_cosine_ops for cosine distance

Reference

Half Vectors

Create a half vector from a list

vec = HalfVector([1, 2, 3])

Or a NumPy array

vec = HalfVector(np.array([1, 2, 3]))

Get a list

lst=vec.to_list()

Get a NumPy array

arr=vec.to_numpy()

Sparse Vectors

Create a sparse vector from a list

vec = SparseVector([1, 0, 2, 0, 3, 0])

Or a NumPy array

vec = SparseVector(np.array([1, 0, 2, 0, 3, 0]))

Or a SciPy sparse array

arr = coo_array(([1, 2, 3], ([0, 2, 4],)), shape=(6,))
vec = SparseVector(arr)

Or a dictionary of non-zero elements

vec = SparseVector({0: 1, 2: 2, 4: 3}, 6)

Note: Indices start at 0
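The dictionary form and the dense list describe the same vector. The sketch below converts between them in plain Python and also renders pgvector's sparse text format, `{index:value,...}/dimensions`, whose indices start at 1 as in SQL arrays (hence the shift by one). All three helpers are hypothetical and independent of the SparseVector class.

```python
def sparse_to_dense(elements, dimensions):
    """Expand a {zero_based_index: value} dictionary into a dense list."""
    dense = [0.0] * dimensions
    for index, value in elements.items():
        if not 0 <= index < dimensions:
            raise ValueError('index out of range: ' + str(index))
        dense[index] = float(value)
    return dense

def dense_to_sparse(values):
    """Collect the non-zero elements of a dense list, keyed by zero-based index."""
    return {index: value for index, value in enumerate(values) if value != 0}

def sparse_to_text(elements, dimensions):
    """Render the sparse text format, shifting indices to start at 1."""
    body = ','.join(f'{index + 1}:{value}' for index, value in sorted(elements.items()))
    return '{' + body + '}/' + str(dimensions)

elements = {0: 1, 2: 2, 4: 3}
print(sparse_to_dense(elements, 6))            # [1.0, 0.0, 2.0, 0.0, 3.0, 0.0]
print(dense_to_sparse([1, 0, 2, 0, 3, 0]))     # {0: 1, 2: 2, 4: 3}
print(sparse_to_text(elements, 6))             # {1:1,3:2,5:3}/6
```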

Get the number of dimensions

dim=vec.dimensions()

Get the indices of non-zero elements

indices=vec.indices()

Get the values of non-zero elements

values=vec.values()

Get a list

lst=vec.to_list()

Get a NumPy array

arr=vec.to_numpy()

Get a SciPy sparse array

arr=vec.to_coo()

History

View the changelog

Contributing

Everyone is encouraged to help improve this project.

To get started with development:

git clone https://github.com/pgvector/pgvector-python.git
cd pgvector-python
pip install -r requirements.txt
createdb pgvector_python_test
pytest

To run an example:

cd examples/loading
pip install -r requirements.txt
createdb pgvector_example
python3 example.py
