openGauss VectorStore

This notebook covers how to get started with the openGauss VectorStore. openGauss is a high-performance relational database with native vector storage and retrieval capabilities. This integration enables ACID-compliant vector operations within LangChain applications, combining traditional SQL functionality with modern AI-driven similarity search.

Setup

Launch openGauss Container

docker run --name opengauss \
-d \
-e GS_PASSWORD='MyStrongPass@123' \
-p 8888:5432 \
opengauss/opengauss-server:latest

Install langchain-opengauss

pip install langchain-opengauss

System Requirements:

  • openGauss ≥ 7.0.0
  • Python ≥ 3.8
  • psycopg2-binary

Credentials

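Use the credentials of your openGauss instance. For the container launched above, the password is the value passed as GS_PASSWORD; the Configuration section lists gaussdb as the default user.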
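One way to keep credentials out of the notebook is to read them from environment variables. The sketch below is only an illustration: the OPENGAUSS_* variable names and the helper connection_kwargs_from_env are hypothetical, while the keys (host, port, user, password, database) and their defaults follow the Connection Settings table in this document.

```python
import os


def connection_kwargs_from_env(env=None):
    """Collect openGauss connection settings from environment variables.

    The OPENGAUSS_* names are hypothetical; the keys and defaults mirror the
    Connection Settings table (host, port, user, password, database).
    """
    env = os.environ if env is None else env
    password = env.get("OPENGAUSS_PASSWORD")
    if not password:
        # The table lists no default password, so refuse to guess one.
        raise ValueError("OPENGAUSS_PASSWORD is not set")
    return {
        "host": env.get("OPENGAUSS_HOST", "localhost"),
        "port": int(env.get("OPENGAUSS_PORT", "8888")),
        "user": env.get("OPENGAUSS_USER", "gaussdb"),
        "password": password,
        "database": env.get("OPENGAUSS_DATABASE", "postgres"),
    }


# Example with an explicit mapping instead of the real environment.
example = connection_kwargs_from_env({"OPENGAUSS_PASSWORD": "MyStrongPass@123"})
print(example["host"], example["port"], example["user"])
```

Assuming OpenGaussSettings accepts these fields as keyword arguments, as the Connection Settings table suggests, the resulting dictionary could be unpacked into it alongside table_name and the other options.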

Initialization

pip install -qU langchain-openai

import getpass
import os

if not os.environ.get("OPENAI_API_KEY"):
    os.environ["OPENAI_API_KEY"] = getpass.getpass("Enter API key for OpenAI: ")

from langchain_openai import OpenAIEmbeddings

# text-embedding-3-large returns 3072-dimensional vectors by default; request
# 384 dimensions so the embeddings match embedding_dimension below.
embeddings = OpenAIEmbeddings(model="text-embedding-3-large", dimensions=384)

from langchain_opengauss import OpenGauss, OpenGaussSettings

# Configure with schema validation
config = OpenGaussSettings(
    table_name="test_langchain",
    embedding_dimension=384,
    index_type="HNSW",
    distance_strategy="COSINE",
)
vector_store = OpenGauss(embedding=embeddings, config=config)

Manage vector store

Add items to vector store

from langchain_core.documents import Document

document_1 = Document(page_content="foo", metadata={"source": "https://example.com"})

document_2 = Document(page_content="bar", metadata={"source": "https://example.com"})

document_3 = Document(page_content="baz", metadata={"source": "https://example.com"})

documents = [document_1, document_2, document_3]

vector_store.add_documents(documents=documents, ids=["1", "2", "3"])

API Reference: Document

Update items in vector store

updated_document = Document(
    page_content="qux", metadata={"source": "https://another-example.com"}
)

# If the id already exists, the stored document is updated
vector_store.add_documents(documents=[updated_document], ids=["1"])

Delete items from vector store

vector_store.delete(ids=["3"])
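The add, update, and delete calls above all key on document ids. As a rough mental model only, the toy in-memory class below mimics that behavior with a dictionary; ToyStore is hypothetical and says nothing about how langchain-opengauss stores rows internally.

```python
class ToyStore:
    """In-memory stand-in that mimics id-based add, update, and delete."""

    def __init__(self):
        self._docs = {}  # id -> document

    def add_documents(self, documents, ids):
        if len(documents) != len(ids):
            raise ValueError("documents and ids must have the same length")
        for doc_id, doc in zip(ids, documents):
            # Writing to an existing id replaces the stored document (upsert).
            self._docs[doc_id] = doc
        return list(ids)

    def delete(self, ids):
        for doc_id in ids:
            self._docs.pop(doc_id, None)  # ignore ids that are not present

    def get(self, doc_id):
        return self._docs.get(doc_id)


store = ToyStore()
store.add_documents(["foo", "bar", "baz"], ids=["1", "2", "3"])
store.add_documents(["qux"], ids=["1"])  # same id: replaces "foo"
store.delete(ids=["3"])
print(store.get("1"), store.get("2"), store.get("3"))  # qux bar None
```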

Query vector store

Once your vector store has been created and the relevant documents have been added you will most likely wish to query it during the running of your chain or agent.

Query directly

Performing a simple similarity search can be done as follows:

results = vector_store.similarity_search(
    query="thud", k=1, filter={"source": "https://another-example.com"}
)
for doc in results:
    print(f"* {doc.page_content} [{doc.metadata}]")

If you want to execute a similarity search and receive the corresponding scores you can run:

results = vector_store.similarity_search_with_score(
    query="thud", k=1, filter={"source": "https://example.com"}
)
for doc, score in results:
    print(f"* [SIM={score:.3f}] {doc.page_content} [{doc.metadata}]")
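To make the roles of k, filter, and the score concrete, here is a brute-force sketch of a filtered nearest-neighbor search over toy two-dimensional vectors. It is an illustration under stated assumptions: the helper search_with_score and the sample rows are hypothetical, Euclidean distance is used for brevity, and the real store delegates this work to an openGauss index. Whether a returned score is a distance or a similarity depends on the implementation; in this sketch smaller means closer.

```python
import math


def search_with_score(rows, query_vec, k, metadata_filter=None):
    """rows: list of (content, metadata, vector). Returns [(content, distance)]."""
    metadata_filter = metadata_filter or {}
    # Keep only rows whose metadata matches every key/value in the filter.
    candidates = [
        row
        for row in rows
        if all(row[1].get(key) == value for key, value in metadata_filter.items())
    ]
    scored = [(content, math.dist(vec, query_vec)) for content, _, vec in candidates]
    scored.sort(key=lambda pair: pair[1])  # smaller distance = closer
    return scored[:k]


rows = [
    ("qux", {"source": "https://another-example.com"}, [1.0, 0.0]),
    ("bar", {"source": "https://example.com"}, [0.6, 0.8]),
    ("baz", {"source": "https://example.com"}, [0.0, 1.0]),
]
hits = search_with_score(
    rows, [1.0, 0.0], k=1, metadata_filter={"source": "https://example.com"}
)
for content, score in hits:
    print(f"* [DIST={score:.3f}] {content}")
```

Note that "qux" is the exact match for the query vector but is excluded by the filter before ranking, so the nearest remaining row is returned instead.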

Query by turning into retriever

You can also transform the vector store into a retriever for easier usage in your chains.

retriever = vector_store.as_retriever(search_type="mmr", search_kwargs={"k": 1})
retriever.invoke("thud")
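The "mmr" search type stands for maximal marginal relevance, which trades relevance to the query against redundancy among the results already picked. The sketch below shows the textbook formulation on toy vectors; mmr_select and its lambda_mult parameter are written for illustration and are not necessarily how langchain-opengauss implements the search.

```python
import math


def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def mmr_select(query_vec, candidate_vecs, k, lambda_mult=0.5):
    """Return indices of up to k candidates chosen by maximal marginal relevance."""
    selected = []
    remaining = list(range(len(candidate_vecs)))
    while remaining and len(selected) < k:
        best_idx, best_score = None, -math.inf
        for idx in remaining:
            relevance = cosine_similarity(query_vec, candidate_vecs[idx])
            # Redundancy: similarity to the closest already-selected candidate.
            redundancy = max(
                (
                    cosine_similarity(candidate_vecs[idx], candidate_vecs[s])
                    for s in selected
                ),
                default=0.0,
            )
            score = lambda_mult * relevance - (1.0 - lambda_mult) * redundancy
            if score > best_score:
                best_idx, best_score = idx, score
        selected.append(best_idx)
        remaining.remove(best_idx)
    return selected


query = [1.0, 0.0]
candidates = [[1.0, 0.0], [0.99, 0.1], [0.5, 0.85]]
print(mmr_select(query, candidates, k=2, lambda_mult=0.3))  # favors diversity
print(mmr_select(query, candidates, k=2, lambda_mult=1.0))  # pure relevance
```

With a low lambda_mult the second pick skips the near-duplicate of the first result in favor of a more distinct vector; with lambda_mult=1.0 the ranking reduces to plain similarity.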

Usage for retrieval-augmented generation

For guides on how to use this vector store for retrieval-augmented generation (RAG), see the RAG tutorials in the LangChain documentation.

Configuration

Connection Settings

| Parameter | Default | Description |
|---|---|---|
| host | localhost | Database server address |
| port | 8888 | Database connection port |
| user | gaussdb | Database username |
| password | - | Complex password string |
| database | postgres | Default database name |
| min_connections | 1 | Connection pool minimum size |
| max_connections | 5 | Connection pool maximum size |
| table_name | langchain_docs | Name of the table for storing vector data and metadata |
| index_type | IndexType.HNSW | Vector index algorithm type. Options: HNSW or IVFFLAT. Default is HNSW. |
| vector_type | VectorType.vector | Type of vector representation to use. Default is Vector. |
| distance_strategy | DistanceStrategy.COSINE | Vector similarity metric to use for retrieval. Options: euclidean (L2 distance), cosine (angular distance, ideal for text embeddings), manhattan (L1 distance for sparse data), negative_inner_product (dot product for normalized vectors). Default is cosine. |
| embedding_dimension | 1536 | Dimensionality of the vector embeddings. |
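The four distance strategies named in the table can be written out in a few lines of plain Python. This is a sketch of the standard mathematical definitions on toy vectors, intended only to show what each metric measures; the function names are chosen for illustration, and the database computes these natively.

```python
import math


def euclidean(a, b):
    """L2 distance: straight-line distance between the two points."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def manhattan(a, b):
    """L1 distance: sum of absolute coordinate differences."""
    return sum(abs(x - y) for x, y in zip(a, b))


def cosine_distance(a, b):
    """1 - cosine similarity: depends on the angle, not on vector length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def negative_inner_product(a, b):
    """Negated dot product, so that smaller still means more similar."""
    return -sum(x * y for x, y in zip(a, b))


a = [1.0, 2.0, 2.0]
b = [2.0, 1.0, 2.0]
for metric in (euclidean, manhattan, cosine_distance, negative_inner_product):
    print(f"{metric.__name__}: {metric(a, b):.4f}")
```

For vectors normalized to unit length, cosine distance and negative inner product produce the same ranking, which is why the table recommends the latter for normalized vectors.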

Supported Combinations

| Vector Type | Dimensions | Index Types | Supported Distance Strategies |
|---|---|---|---|
| vector | ≤2000 | HNSW/IVFFLAT | COSINE/EUCLIDEAN/MANHATTAN/INNER_PROD |
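A small pre-flight check can catch an unsupported combination before any table is created. The helper below is a hypothetical sketch that encodes only the single row of the table above, using the strategy names exactly as that table spells them; it is not part of langchain-opengauss.

```python
MAX_VECTOR_DIMENSION = 2000
SUPPORTED_INDEX_TYPES = {"HNSW", "IVFFLAT"}
SUPPORTED_DISTANCES = {"COSINE", "EUCLIDEAN", "MANHATTAN", "INNER_PROD"}


def check_combination(embedding_dimension, index_type, distance_strategy):
    """Return a list of problems; an empty list means the table allows it."""
    problems = []
    if not 1 <= embedding_dimension <= MAX_VECTOR_DIMENSION:
        problems.append(
            f"embedding_dimension={embedding_dimension} is outside "
            f"1..{MAX_VECTOR_DIMENSION}"
        )
    if index_type.upper() not in SUPPORTED_INDEX_TYPES:
        problems.append(f"unsupported index_type: {index_type}")
    if distance_strategy.upper() not in SUPPORTED_DISTANCES:
        problems.append(f"unsupported distance_strategy: {distance_strategy}")
    return problems


print(check_combination(384, "HNSW", "COSINE"))   # no problems
print(check_combination(3072, "HNSW", "COSINE"))  # dimension exceeds the limit
```

The second call uses 3072, the default output size of OpenAI's text-embedding-3-large, to show that such embeddings would need to be reduced in dimension before they fit the vector type.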

Performance Optimization

Index Tuning Guidelines

HNSW Parameters:

  • m: 16-100 (balance between recall and memory)
  • ef_construction: 64-1000 (must be > 2*m)

IVFFLAT Recommendations:

import math

total_rows = 500_000  # example value: replace with the table's actual row count

lists = min(
    int(math.sqrt(total_rows)) if total_rows > 1e6 else int(total_rows / 1000),
    2000,  # openGauss maximum
)
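Wrapping the heuristic above in a function makes it easy to see what it yields at different table sizes. The function name suggest_lists and the sample row counts are illustrative; the formula and the cap of 2000 are taken from the snippet above.

```python
import math

OPENGAUSS_MAX_LISTS = 2000  # cap stated in the recommendation above


def suggest_lists(total_rows):
    """Apply the IVFFLAT heuristic: rows/1000 up to 1M rows, sqrt(rows) beyond."""
    if total_rows > 1e6:
        candidate = int(math.sqrt(total_rows))
    else:
        candidate = int(total_rows / 1000)
    return min(candidate, OPENGAUSS_MAX_LISTS)


for rows in (50_000, 1_000_000, 2_000_000, 9_000_000):
    print(f"{rows:>9} rows -> lists={suggest_lists(rows)}")
```

One caveat follows directly from the formula: for tables with fewer than 1000 rows it evaluates to 0, so a lower bound of 1 would likely be needed in practice.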

Connection Pooling

OpenGaussSettings(min_connections=3, max_connections=20)

Limitations

  • bit and sparsevec vector types are currently in development
  • Maximum vector dimensions: 2000 for the vector type

API reference

For detailed documentation of all openGauss VectorStore features and configurations, head to the API reference: https://python.langchain.com/api_reference/en/latest/vectorstores/opengauss.OpenGuass.html
