Llama-Index

Quick start

To use the integration, first install it via pip install llama-index-vector-stores-lancedb. You can then run the script below to try it out:

    import logging
    import sys

    # Uncomment to see debug logs
    # logging.basicConfig(stream=sys.stdout, level=logging.DEBUG)
    # logging.getLogger().addHandler(logging.StreamHandler(stream=sys.stdout))

    from llama_index.core import SimpleDirectoryReader, Document, StorageContext
    from llama_index.core import VectorStoreIndex
    from llama_index.vector_stores.lancedb import LanceDBVectorStore
    import textwrap
    import openai

    openai.api_key = "sk-..."

    documents = SimpleDirectoryReader("./data/your-data-dir/").load_data()
    print("Document ID:", documents[0].doc_id, "Document Hash:", documents[0].hash)

    ## For LanceDB cloud :
    # vector_store = LanceDBVectorStore(
    #     uri="db://db_name", # your remote DB URI
    #     api_key="sk_..", # lancedb cloud api key
    #     region="your-region" # the region you configured
    #     ...
    # )

    vector_store = LanceDBVectorStore(
        uri="./lancedb", mode="overwrite", query_type="vector"
    )
    storage_context = StorageContext.from_defaults(vector_store=vector_store)

    index = VectorStoreIndex.from_documents(
        documents, storage_context=storage_context
    )

    lance_filter = "metadata.file_name = 'paul_graham_essay.txt' "
    retriever = index.as_retriever(vector_store_kwargs={"where": lance_filter})
    response = retriever.retrieve("What did the author do growing up?")

Check out the complete example here: LlamaIndex demo
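The retrieve call in the script returns a list of scored nodes. A minimal sketch for printing them follows, assuming each result exposes score and get_content() the way LlamaIndex's NodeWithScore does. summarize_hits is a hypothetical helper, not part of either library.

```python
import textwrap


def summarize_hits(hits, width=80):
    """Return one short line per retrieved node: score, then a text preview."""
    lines = []
    for hit in hits:
        # The score can be None for some retrievers, so fall back to 0.0.
        score = hit.score if hit.score is not None else 0.0
        # shorten() collapses whitespace and truncates to at most `width` chars.
        preview = textwrap.shorten(hit.get_content(), width=width, placeholder="...")
        lines.append(f"{score:.3f}  {preview}")
    return lines


# Usage, with `response` from the script above:
# for line in summarize_hits(response):
#     print(line)
```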

Filtering

For metadata filtering, you can use a Lance SQL-like string filter, as demonstrated in the example above. You can also filter using the MetadataFilters class from LlamaIndex:

    from llama_index.core.vector_stores import (
        MetadataFilters,
        FilterOperator,
        FilterCondition,
        MetadataFilter,
    )

    query_filters = MetadataFilters(
        filters=[
            MetadataFilter(
                key="creation_date",
                operator=FilterOperator.EQ,
                value="2024-05-23",
            ),
            MetadataFilter(
                key="file_size",
                value=75040,
                operator=FilterOperator.GT,
            ),
        ],
        condition=FilterCondition.AND,
    )
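The two filtering styles can express the same conditions. As an illustration only, the sketch below renders (key, operator, value) conditions into a Lance SQL-like string of the kind used in the quick start. Plain tuples stand in for MetadataFilter, and SQL_OPERATORS, sql_literal, and to_lance_where are hypothetical names. This is not how the integration translates filters internally.

```python
# Hypothetical mapping from operator names to SQL comparison symbols.
SQL_OPERATORS = {"EQ": "=", "NE": "!=", "GT": ">", "GTE": ">=", "LT": "<", "LTE": "<="}


def sql_literal(value):
    """Quote strings (doubling embedded single quotes); leave numbers bare."""
    if isinstance(value, str):
        return "'" + value.replace("'", "''") + "'"
    return str(value)


def to_lance_where(conditions, combine="AND"):
    """Render (key, operator, value) tuples as a SQL-like filter string."""
    clauses = [
        # The "metadata." prefix follows the quick-start filter string.
        f"metadata.{key} {SQL_OPERATORS[op]} {sql_literal(value)}"
        for key, op, value in conditions
    ]
    return f" {combine} ".join(clauses)


where = to_lance_where(
    [("creation_date", "EQ", "2024-05-23"), ("file_size", "GT", 75040)]
)
print(where)
# metadata.creation_date = '2024-05-23' AND metadata.file_size > 75040
```

A string built this way could be passed as vector_store_kwargs={"where": where}, as in the quick start.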

Hybrid Search

For complete documentation, refer here. This example uses the colbert reranker. Make sure to install the necessary dependencies for the reranker you choose.

    from lancedb.rerankers import ColbertReranker

    reranker = ColbertReranker()
    vector_store._add_reranker(reranker)

    query_engine = index.as_query_engine(
        filters=query_filters,
        vector_store_kwargs={
            "query_type": "hybrid",
        },
    )

    response = query_engine.query("How much did Viaweb charge per month?")

As the snippet above shows, you can set or override query_type when creating the query engine or retriever, regardless of the value the vector store was constructed with.
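The same override can be applied to a retriever. The sketch below forwards query_type, and optionally a where string, through vector_store_kwargs. make_retriever is a hypothetical wrapper. The examples on this page pass "where" and "query_type" in separate calls, so combining them in one call is an assumption.

```python
def make_retriever(index, query_type="vector", where=None):
    """Build a retriever, forwarding LanceDB options via vector_store_kwargs."""
    vector_store_kwargs = {"query_type": query_type}
    if where is not None:
        # Optional Lance SQL-like filter string, as in the quick start.
        vector_store_kwargs["where"] = where
    return index.as_retriever(vector_store_kwargs=vector_store_kwargs)


# Usage, with `index` from the quick start:
# retriever = make_retriever(index, query_type="hybrid")
# nodes = retriever.retrieve("How much did Viaweb charge per month?")
```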

API reference

The exhaustive list of parameters for the LanceDBVectorStore vector store is:
- connection: Optional, lancedb.db.LanceDBConnection connection object to use. If not provided, a new connection will be created.
- uri: Optional[str], the URI of your database. Defaults to "/tmp/lancedb".
- table_name: Optional[str], name of your table in the database. Defaults to "vectors".
- table: Optional[Any], lancedb.db.LanceTable object to be passed. Defaults to None.
- vector_column_name: Optional[Any], column name to use for vectors in the table. Defaults to 'vector'.
- doc_id_key: Optional[str], column name to use for document IDs in the table. Defaults to 'doc_id'.
- text_key: Optional[str], column name to use for text in the table. Defaults to 'text'.
- api_key: Optional[str], API key to use for a LanceDB Cloud database. Defaults to None.
- region: Optional[str], region to use for a LanceDB Cloud database. Only for LanceDB Cloud. Defaults to None.
- nprobes: Optional[int], the number of probes to use. Only applicable if an ANN index has been created on the table; otherwise it is ignored. Defaults to 20.
- refine_factor: Optional[int], refine the results by reading extra elements and re-ranking them in memory. Defaults to None.
- reranker: Optional[Any], the reranker to use for LanceDB. Defaults to None.
- overfetch_factor: Optional[int], the factor by which to fetch more results. Defaults to 1.
- mode: Optional[str], the mode to use for LanceDB. Defaults to "overwrite".
- query_type: Optional[str], the type of query to use for LanceDB. Defaults to "vector".
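One way to keep local and cloud configurations apart is to build the constructor arguments in small helpers. The sketch below uses only parameters from the list above. The helper names and the LANCEDB_API_KEY environment variable fallback are assumptions made for illustration.

```python
import os


def local_store_kwargs(uri="/tmp/lancedb", table_name="vectors"):
    """Arguments for a local database, spelling out the documented defaults."""
    return {
        "uri": uri,
        "table_name": table_name,
        "mode": "overwrite",
        "query_type": "vector",
    }


def cloud_store_kwargs(db_name, region, api_key=None):
    """Arguments for LanceDB Cloud; api_key and region apply only there."""
    # Assumption: fall back to an environment variable chosen for this sketch.
    api_key = api_key or os.environ.get("LANCEDB_API_KEY")
    if not api_key:
        raise ValueError("an API key is required for a LanceDB Cloud database")
    return {"uri": f"db://{db_name}", "api_key": api_key, "region": region}


# Usage:
# vector_store = LanceDBVectorStore(**local_store_kwargs(uri="./lancedb"))
```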

Methods

  • from_table(cls, table: lancedb.db.LanceTable) -> LanceDBVectorStore : (class method) Creates an instance from a LanceDB table.

  • _add_reranker(self, reranker: lancedb.rerankers.Reranker) -> None : Add a reranker to an existing vector store.

    • Usage :
      from lancedb.rerankers import ColbertReranker

      reranker = ColbertReranker()
      vector_store._add_reranker(reranker)
  • _table_exists(self, tbl_name: Optional[str] = None) -> bool : Returns True if tbl_name exists in the database.
  • create_index(
    self, scalar: Optional[bool] = False, col_name: Optional[str] = None, num_partitions: Optional[int] = 256, num_sub_vectors: Optional[int] = 96, index_cache_size: Optional[int] = None, metric: Optional[str] = "l2",
    ) -> None
    : Creates a scalar index (for non-vector columns) or a vector index on a table. Make sure your vector column has enough data before creating an index on it.

  • add(self, nodes: List[BaseNode], **add_kwargs: Any) -> List[str] : Adds nodes to the table.

  • delete(self, ref_doc_id: str) -> None : Delete nodes with the given ref_doc_id.

  • delete_nodes(self, node_ids: List[str]) -> None : Delete nodes with the given node_ids.
  • query(self, query: VectorStoreQuery, **kwargs: Any) -> VectorStoreQueryResult : Query the index (VectorStoreIndex) for the top k most similar nodes. Accepts a LlamaIndex VectorStoreQuery object.
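The methods above can be combined for routine maintenance. The sketch below replaces a document's nodes and creates a vector index only once the table has enough rows. Both helpers are hypothetical. The min_rows threshold of 256 is an assumed value, not a documented limit, and row_count must be supplied by the caller.

```python
def replace_document(vector_store, ref_doc_id, new_nodes):
    """Drop every node derived from ref_doc_id, then add its replacement nodes."""
    vector_store.delete(ref_doc_id)
    # add() returns the IDs of the nodes it stored.
    return vector_store.add(new_nodes)


def index_if_large_enough(vector_store, row_count, min_rows=256):
    """Create a vector index only once the table holds at least min_rows rows."""
    if row_count < min_rows:
        return False
    # These values repeat the documented defaults of create_index.
    vector_store.create_index(num_partitions=256, num_sub_vectors=96, metric="l2")
    return True
```

For a scalar index on a non-vector column, the documented signature suggests a call such as vector_store.create_index(scalar=True, col_name="doc_id").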
