DingoDB

DingoDB is a distributed multi-mode vector database, which combines the characteristics of data lakes and vector databases, and can store data of any type and size (Key-Value, PDF, audio, video, etc.). It has real-time low-latency processing capabilities to achieve rapid insight and response, and can efficiently conduct instant analysis and process multi-modal data.

You'll need to install langchain-community with pip install -qU langchain-community to use this integration.

This notebook shows how to use functionality related to the DingoDB vector database.

To run, you should have a DingoDB instance up and running.

%pip install --upgrade --quiet dingodb
# or install latest:
%pip install --upgrade --quiet git+https://git@github.com/dingodb/pydingo.git

We want to use OpenAIEmbeddings so we have to get the OpenAI API Key.

import getpass
import os

if "OPENAI_API_KEY" not in os.environ:
    os.environ["OPENAI_API_KEY"] = getpass.getpass("OpenAI API Key:")

OpenAI API Key:········

from langchain_community.document_loaders import TextLoader
from langchain_community.vectorstores import Dingo
from langchain_openai import OpenAIEmbeddings
from langchain_text_splitters import CharacterTextSplitter

API Reference: TextLoader | Dingo | OpenAIEmbeddings | CharacterTextSplitter

from langchain_community.document_loaders import TextLoader

loader = TextLoader("../../how_to/state_of_the_union.txt")
documents = loader.load()
text_splitter = CharacterTextSplitter(chunk_size=1000, chunk_overlap=0)
docs = text_splitter.split_documents(documents)

embeddings = OpenAIEmbeddings()

API Reference: TextLoader

from dingodb import DingoDB

index_name = "langchain_demo"

dingo_client = DingoDB(user="", password="", host=["127.0.0.1:13000"])
# First, check if our index already exists. If it doesn't, we create it
if (
    index_name not in dingo_client.get_index()
    and index_name.upper() not in dingo_client.get_index()
):
    # we create a new index, modify to your own
    dingo_client.create_index(
        index_name=index_name, dimension=1536, metric_type="cosine", auto_id=False
    )

# The OpenAI embedding model `text-embedding-ada-002` uses 1536 dimensions
docsearch = Dingo.from_documents(
    docs, embeddings, client=dingo_client, index_name=index_name
)


query = "What did the president say about Ketanji Brown Jackson"
docs = docsearch.similarity_search(query)

print(docs[0].page_content)
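To illustrate what a similarity search over a cosine-metric index computes conceptually, here is a minimal, dependency-free sketch. It is not DingoDB's implementation: the helper names `cosine_similarity` and `top_k`, the sample texts, and the 3-dimensional vectors are all hypothetical stand-ins for real 1536-dimensional embeddings, and a real vector database uses an index rather than a linear scan.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(query_vec, stored, k=4):
    """Return the k (text, score) pairs most similar to query_vec, best first."""
    scored = [(text, cosine_similarity(query_vec, vec)) for text, vec in stored]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Hypothetical toy "embeddings"; real ones in this notebook have 1536 dimensions.
stored = [
    ("chunk about the Supreme Court", [0.9, 0.1, 0.0]),
    ("chunk about the economy", [0.1, 0.9, 0.0]),
    ("chunk about infrastructure", [0.0, 0.2, 0.9]),
]
query_vec = [1.0, 0.0, 0.0]

results = top_k(query_vec, stored, k=2)
for text, score in results:
    print(f"{score:.3f}  {text}")
```

The chunk whose vector points in nearly the same direction as the query vector ranks first, which is the behavior the `metric_type="cosine"` setting asks the index to approximate.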

Adding More Text to an Existing Index

More text can be embedded and upserted to an existing Dingo index using the add_texts function.

vectorstore = Dingo(embeddings, "text", client=dingo_client, index_name=index_name)

vectorstore.add_texts(["More text!"])

Maximal Marginal Relevance Searches

In addition to using similarity search in the retriever object, you can also use mmr as the retriever.

retriever = docsearch.as_retriever(search_type="mmr")
matched_docs = retriever.invoke(query)
for i, d in enumerate(matched_docs):
    print(f"\n## Document {i}\n")
    print(d.page_content)

Or use max_marginal_relevance_search directly:

found_docs = docsearch.max_marginal_relevance_search(query, k=2, fetch_k=10)
for i, doc in enumerate(found_docs):
    print(f"{i + 1}.", doc.page_content, "\n")
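To make the idea behind maximal marginal relevance more concrete, the following is a minimal, dependency-free sketch of the usual MMR selection rule: out of a pool of candidates (the role `fetch_k` plays above), repeatedly pick the one that is relevant to the query but not redundant with what has already been picked, until `k` items are chosen. This is not the code DingoDB or LangChain runs; `mmr_select`, `cosine`, the `lambda_mult` trade-off weight, and the 2-dimensional toy vectors are illustrative assumptions.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def mmr_select(query_vec, candidate_vecs, k=2, lambda_mult=0.5):
    """Return indices of up to k candidates chosen by maximal marginal relevance.

    lambda_mult=1.0 ranks purely by relevance to the query;
    lower values penalize similarity to already selected candidates.
    """
    selected = []
    remaining = list(range(len(candidate_vecs)))
    while remaining and len(selected) < k:
        best_idx, best_score = None, -math.inf
        for i in remaining:
            relevance = cosine(query_vec, candidate_vecs[i])
            # Highest similarity to anything already chosen (0.0 if nothing chosen yet).
            redundancy = max(
                (cosine(candidate_vecs[i], candidate_vecs[j]) for j in selected),
                default=0.0,
            )
            score = lambda_mult * relevance - (1 - lambda_mult) * redundancy
            if score > best_score:
                best_idx, best_score = i, score
        selected.append(best_idx)
        remaining.remove(best_idx)
    return selected


# Hypothetical toy data: candidates 0 and 1 are near-duplicates, candidate 2 differs.
query_vec = [1.0, 0.5]
candidates = [[1.0, 0.0], [1.0, 0.05], [0.0, 1.0]]

diverse = mmr_select(query_vec, candidates, k=2, lambda_mult=0.5)
relevance_only = mmr_select(query_vec, candidates, k=2, lambda_mult=1.0)
print(diverse, relevance_only)
```

With the balanced weight, the second pick skips the near-duplicate in favor of the dissimilar candidate, whereas ranking by relevance alone returns both near-duplicates.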

Related

Vector store conceptual guide
Vector store how-to guides
