Baidu Cloud ElasticSearch VectorSearch

Baidu Cloud VectorSearch is a fully managed, enterprise-level distributed search and analysis service that is 100% compatible with open source. It provides a low-cost, high-performance, and reliable retrieval and analysis platform for structured and unstructured data. As a vector database, it supports multiple index types and similarity distance methods.
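To make "similarity distance methods" concrete, here is a minimal pure-Python sketch of two commonly used measures, cosine similarity and Euclidean (L2) distance. Which measures Baidu Cloud VectorSearch actually offers is not stated here, so these two are assumptions chosen for illustration, and the vectors are toy values.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors (ignores magnitude)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def l2_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Straight-line distance between two vectors (sensitive to magnitude)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


# Toy vectors, for illustration only.
query = [1.0, 0.0]
same_direction = [2.0, 0.0]  # points the same way as query, but is twice as long
orthogonal = [0.0, 1.0]      # at a right angle to query

cos_same = cosine_similarity(query, same_direction)
cos_orth = cosine_similarity(query, orthogonal)
l2_same = l2_distance(query, same_direction)
l2_orth = l2_distance(query, orthogonal)
```

Cosine similarity treats `same_direction` as a perfect match because only the direction counts, while L2 distance still reports a gap because the lengths differ. That difference is why the choice of distance method matters when the index is created.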

Baidu Cloud ElasticSearch provides a privilege management mechanism that lets you configure cluster privileges freely, further ensuring data security.

This notebook shows how to use functionality related to the Baidu Cloud ElasticSearch VectorStore. To run it, you should have a Baidu Cloud ElasticSearch instance up and running:

Read the help document to quickly familiarize yourself with and configure a Baidu Cloud ElasticSearch instance.

After the instance is up and running, follow these steps to split documents, get embeddings, connect to the Baidu Cloud ElasticSearch instance, index documents, and perform vector retrieval.

We need to install the following Python packages first.

%pip install --upgrade --quiet langchain-community elasticsearch==7.11.0

First, we want to use QianfanEmbeddings, so we have to get the Qianfan AK and SK. Details on Qianfan are available at Baidu Qianfan Workshop.

import getpass
import os

# Prompt for each credential only if it is not already set in the environment.
if "QIANFAN_AK" not in os.environ:
    os.environ["QIANFAN_AK"] = getpass.getpass("Your Qianfan AK:")
if "QIANFAN_SK" not in os.environ:
    os.environ["QIANFAN_SK"] = getpass.getpass("Your Qianfan SK:")

Secondly, split documents and get embeddings.

from langchain_community.document_loaders import TextLoader
from langchain_text_splitters import CharacterTextSplitter

loader = TextLoader("../../../state_of_the_union.txt")
documents = loader.load()
text_splitter = CharacterTextSplitter(chunk_size=1000, chunk_overlap=0)
docs = text_splitter.split_documents(documents)

from langchain_community.embeddings import QianfanEmbeddingsEndpoint

embeddings = QianfanEmbeddingsEndpoint()

API Reference: TextLoader | CharacterTextSplitter | QianfanEmbeddingsEndpoint
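To give a rough idea of what splitting with `chunk_size=1000` and `chunk_overlap=0` does, the sketch below greedily packs paragraphs into chunks that stay within a character budget. It is a simplified illustration, not the actual `CharacterTextSplitter` implementation; the function name `split_into_chunks` and the `"\n\n"` separator default are assumptions made for this example.

```python
from typing import List


def split_into_chunks(text: str, chunk_size: int = 1000, separator: str = "\n\n") -> List[str]:
    """Greedily pack separator-delimited pieces into chunks of at most chunk_size characters.

    A single piece longer than chunk_size is kept whole rather than cut.
    """
    chunks: List[str] = []
    current = ""
    for piece in text.split(separator):
        if not piece:
            continue  # skip empty pieces produced by repeated separators
        candidate = piece if not current else current + separator + piece
        if not current or len(candidate) <= chunk_size:
            # The piece starts a new chunk or still fits into the current one.
            current = candidate
        else:
            # The piece would overflow the budget: close the chunk, start another.
            chunks.append(current)
            current = piece
    if current:
        chunks.append(current)
    return chunks


# Toy input with a tiny budget so the behaviour is easy to follow.
sample = "aaaa\n\nbbbb\n\ncccc"
chunks = split_into_chunks(sample, chunk_size=10)
print(chunks)
```

With no overlap, every paragraph lands in exactly one chunk, so each chunk is embedded and indexed once.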

Then, create an instance that can access Baidu Cloud ElasticSearch.

# Create a BES instance and index the docs.
from langchain_community.vectorstores import BESVectorStore

bes = BESVectorStore.from_documents(
    documents=docs,
    embedding=embeddings,
    bes_url="your bes cluster url",
    index_name="your vector index",
)
bes.client.indices.refresh(index="your vector index")

API Reference: BESVectorStore

Finally, query and retrieve data.

query = "What did the president say about Ketanji Brown Jackson"
docs = bes.similarity_search(query)
print(docs[0].page_content)
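Conceptually, a similarity search embeds the query and returns the stored documents whose vectors lie closest to it. The sketch below shows that idea as a brute-force scan over a toy in-memory index. It is not how Baidu Cloud ElasticSearch executes the query: the `top_k` helper, the document IDs, the two-dimensional vectors, and the choice of Euclidean distance are all assumptions made for illustration.

```python
import heapq
import math
from typing import Dict, List, Sequence, Tuple


def top_k(
    query_vec: Sequence[float],
    index: Dict[str, Sequence[float]],
    k: int = 2,
) -> List[Tuple[str, float]]:
    """Return the k (doc_id, distance) pairs closest to query_vec, nearest first."""
    scored = ((doc_id, math.dist(query_vec, vec)) for doc_id, vec in index.items())
    return heapq.nsmallest(k, scored, key=lambda pair: pair[1])


# Hypothetical pre-computed embeddings, keyed by document ID.
toy_index = {
    "doc-a": [1.0, 0.0],
    "doc-b": [0.0, 1.0],
    "doc-c": [1.0, 1.0],
}

results = top_k([1.0, 0.1], toy_index, k=2)
for doc_id, distance in results:
    print(doc_id, round(distance, 3))
```

A real vector index avoids scanning every document by using an index structure, which is where the index types mentioned at the top of this page come in.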

Please feel free to contact liuboyao@baidu.com or chenweixu01@baidu.com if you encounter any problems during use, and we will do our best to support you.

Related

Vector store conceptual guide
Vector store how-to guides
