
TileDB

TileDB is a powerful engine for indexing and querying dense and sparse multi-dimensional arrays.

TileDB offers approximate nearest neighbor (ANN) search capabilities using the TileDB-Vector-Search module. It provides serverless execution of ANN queries and storage of vector indexes both on local disk and in cloud object stores (e.g. AWS S3).

This notebook shows how to use the TileDB vector database.

%pip install --upgrade --quiet tiledb-vector-search langchain-community langchain-huggingface

Basic Example

from langchain_community.document_loaders import TextLoader
from langchain_community.vectorstores import TileDB
from langchain_huggingface import HuggingFaceEmbeddings
from langchain_text_splitters import CharacterTextSplitter

# Load the source text and split it into chunks of up to about 1000 characters.
raw_documents = TextLoader("../../how_to/state_of_the_union.txt").load()
text_splitter = CharacterTextSplitter(chunk_size=1000, chunk_overlap=0)
documents = text_splitter.split_documents(raw_documents)

# Embed each chunk and write a FLAT index to local disk.
model_name = "sentence-transformers/all-mpnet-base-v2"
embeddings = HuggingFaceEmbeddings(model_name=model_name)
db = TileDB.from_documents(
    documents, embeddings, index_uri="/tmp/tiledb_index", index_type="FLAT"
)

# Retrieve the chunks most similar to the query.
query = "What did the president say about Ketanji Brown Jackson"
docs = db.similarity_search(query)
docs[0].page_content
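
To make the search step above more concrete, here is a minimal pure-Python sketch of exhaustive nearest-neighbor search. It assumes, as the name suggests, that a FLAT index compares the query against every stored vector, and it assumes Euclidean distance as the metric. The helper names (`euclidean_distance`, `flat_search`) and the toy vectors are hypothetical and are not part of TileDB or LangChain.

```python
import math


def euclidean_distance(a, b):
    """Euclidean (L2) distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def flat_search(query_vector, indexed_vectors, k=4):
    """Compare the query to every stored vector and return the k closest.

    Returns a list of (position, distance) pairs, closest first.
    """
    scored = [
        (position, euclidean_distance(query_vector, vector))
        for position, vector in enumerate(indexed_vectors)
    ]
    scored.sort(key=lambda pair: pair[1])
    return scored[:k]


# Toy 2-D "embeddings" standing in for real sentence embeddings (illustrative only).
toy_vectors = [[0.0, 0.0], [1.0, 0.0], [0.0, 2.0], [3.0, 3.0]]
nearest = flat_search([0.9, 0.1], toy_vectors, k=2)
print(nearest)
```

The cost of this approach grows linearly with the number of stored vectors, which is the usual motivation for approximate indexes on larger collections.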

Similarity search by vector

embedding_vector = embeddings.embed_query(query)
docs = db.similarity_search_by_vector(embedding_vector)
docs[0].page_content

Similarity search with score

docs_and_scores = db.similarity_search_with_score(query)
docs_and_scores[0]
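
Scored results are often post-filtered before being passed downstream. The sketch below shows one way to do that on a list of (document, score) pairs. It assumes the score is a distance, so that lower values mean closer matches; check the metric your index uses before relying on this. The helper `filter_by_max_distance` is hypothetical, and the sample pairs are made-up placeholders rather than real output from the notebook.

```python
def filter_by_max_distance(docs_and_scores, max_distance):
    """Keep only the (document, score) pairs whose score is at most max_distance.

    Assumes the score is a distance: smaller means more similar.
    """
    return [(doc, score) for doc, score in docs_and_scores if score <= max_distance]


# Placeholder results; plain strings stand in for Document objects.
sample_results = [
    ("chunk about the Supreme Court", 0.42),
    ("chunk about infrastructure", 1.37),
]
kept = filter_by_max_distance(sample_results, max_distance=1.0)
print(kept)
```

With real results, the same function could be applied directly to the list returned by `similarity_search_with_score`, since it only unpacks each pair.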

Maximal Marginal Relevance Search (MMR)

In addition to using similarity search in the retriever object, you can also use mmr as the retriever's search type.

retriever = db.as_retriever(search_type="mmr")
retriever.invoke(query)

Or use max_marginal_relevance_search directly:

db.max_marginal_relevance_search(query, k=2, fetch_k=10)
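
For readers unfamiliar with MMR, the following is a minimal pure-Python sketch of the general technique, not the library's implementation. It greedily picks, from a pool of candidates (the role that `fetch_k` appears to play above), the `k` items that balance relevance to the query against redundancy with items already chosen. The names `cosine_similarity` and `mmr_select`, the `lambda_mult` weighting, and the toy vectors are assumptions made for illustration.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def mmr_select(query_vector, candidate_vectors, k=2, lambda_mult=0.5):
    """Greedy maximal marginal relevance; returns indices of the chosen candidates.

    lambda_mult=1.0 ranks purely by relevance; lower values penalize
    candidates that are similar to ones already selected.
    """
    selected = []
    remaining = list(range(len(candidate_vectors)))
    while remaining and len(selected) < k:
        best_index, best_score = None, -math.inf
        for i in remaining:
            relevance = cosine_similarity(query_vector, candidate_vectors[i])
            # Highest similarity to anything already picked (0.0 on the first round).
            redundancy = max(
                (cosine_similarity(candidate_vectors[i], candidate_vectors[j])
                 for j in selected),
                default=0.0,
            )
            score = lambda_mult * relevance - (1 - lambda_mult) * redundancy
            if score > best_score:
                best_index, best_score = i, score
        selected.append(best_index)
        remaining.remove(best_index)
    return selected


# Candidates 0 and 1 are near-duplicates; candidate 2 is less relevant but different.
toy_query = [1.0, 0.0]
toy_candidates = [[0.9, 0.1], [0.9, 0.12], [0.7, -0.7]]
diverse = mmr_select(toy_query, toy_candidates, k=2, lambda_mult=0.5)
relevance_only = mmr_select(toy_query, toy_candidates, k=2, lambda_mult=1.0)
print(diverse, relevance_only)
```

In this toy case the balanced setting skips the near-duplicate in favor of the more distinct candidate, while `lambda_mult=1.0` falls back to plain relevance ranking.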
