
DingoDB

DingoDB is a distributed multi-modal vector database that combines the characteristics of data lakes and vector databases. It can store data of any type and size (Key-Value, PDF, audio, video, etc.), and it offers real-time, low-latency processing for analyzing multi-modal data.

You'll need to install langchain-community with pip install -qU langchain-community to use this integration.

This notebook shows how to use functionality related to the DingoDB vector database.

To run, you should have a DingoDB instance up and running.

%pip install --upgrade --quiet dingodb
# or install latest:
%pip install --upgrade --quiet git+https://git@github.com/dingodb/pydingo.git

We want to use OpenAIEmbeddings so we have to get the OpenAI API Key.

import getpass
import os

if "OPENAI_API_KEY" not in os.environ:
    os.environ["OPENAI_API_KEY"] = getpass.getpass("OpenAI API Key:")
from langchain_community.document_loaders import TextLoader
from langchain_community.vectorstores import Dingo
from langchain_openai import OpenAIEmbeddings
from langchain_text_splitters import CharacterTextSplitter

loader = TextLoader("../../how_to/state_of_the_union.txt")
documents = loader.load()
text_splitter = CharacterTextSplitter(chunk_size=1000, chunk_overlap=0)
docs = text_splitter.split_documents(documents)

embeddings = OpenAIEmbeddings()
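For intuition about what chunk_size and chunk_overlap control, here is a minimal sketch of fixed-width chunking in plain Python. It is not how CharacterTextSplitter works internally: the real splitter first splits on a separator (by default a blank line) and then merges pieces up to chunk_size, so its chunk boundaries will differ. The function name chunk_text is hypothetical and used only for illustration.

```python
def chunk_text(text: str, chunk_size: int = 1000, chunk_overlap: int = 0) -> list[str]:
    """Cut text into windows of at most chunk_size characters.

    Consecutive windows share chunk_overlap characters. This is a
    simplified illustration, not the LangChain implementation.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be in [0, chunk_size)")

    # Each new window starts this many characters after the previous one.
    step = chunk_size - chunk_overlap
    return [text[start:start + chunk_size] for start in range(0, len(text), step)]


if __name__ == "__main__":
    for piece in chunk_text("abcdefghij", chunk_size=4, chunk_overlap=1):
        print(repr(piece))
```

With chunk_overlap=0, as in the notebook, every character lands in exactly one chunk; a positive overlap repeats the tail of each chunk at the head of the next, which can help when a relevant sentence straddles a boundary.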
from dingodb import DingoDB

index_name = "langchain_demo"

dingo_client = DingoDB(user="", password="", host=["127.0.0.1:13000"])
# First, check if our index already exists. If it doesn't, we create it
if (
    index_name not in dingo_client.get_index()
    and index_name.upper() not in dingo_client.get_index()
):
    # we create a new index, modify to your own
    dingo_client.create_index(
        index_name=index_name, dimension=1536, metric_type="cosine", auto_id=False
    )

# The OpenAI embedding model `text-embedding-ada-002` uses 1536 dimensions
docsearch = Dingo.from_documents(
    docs, embeddings, client=dingo_client, index_name=index_name
)
query = "What did the president say about Ketanji Brown Jackson"
docs = docsearch.similarity_search(query)
print(docs[0].page_content)
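Because the index above was created with metric_type="cosine", it may help to see what a cosine-ranked search computes. The sketch below is a brute-force, in-memory illustration in plain Python. It does not call DingoDB, which performs the search server-side and may use an approximate index. The names cosine_similarity and top_k are hypothetical helpers, and the two-dimensional vectors stand in for real 1536-dimensional embeddings.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors of equal length."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # convention chosen for this sketch: zero vectors match nothing
    return dot / (norm_a * norm_b)


def top_k(query_vec: list[float], doc_vecs: list[list[float]], k: int = 4) -> list[int]:
    """Return indices of the k document vectors most similar to the query."""
    scored = [(cosine_similarity(query_vec, vec), idx) for idx, vec in enumerate(doc_vecs)]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [idx for _, idx in scored[:k]]


if __name__ == "__main__":
    toy_docs = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
    print(top_k([1.0, 0.1], toy_docs, k=2))
```

The dimension check in cosine_similarity mirrors a practical constraint: the dimension passed to create_index has to match the length of the vectors the embedding model produces.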

Adding More Text to an Existing Index

More text can be embedded and upserted to an existing Dingo index using the add_texts function.

vectorstore = Dingo(embeddings, "text", client=dingo_client, index_name=index_name)

vectorstore.add_texts(["More text!"])

Maximal Marginal Relevance Searches

In addition to using similarity search in the retriever object, you can also use mmr as the retriever's search type.

retriever = docsearch.as_retriever(search_type="mmr")
matched_docs = retriever.invoke(query)
for i, d in enumerate(matched_docs):
    print(f"\n## Document {i}\n")
    print(d.page_content)

Or use max_marginal_relevance_search directly:

found_docs = docsearch.max_marginal_relevance_search(query, k=2, fetch_k=10)
for i, doc in enumerate(found_docs):
    print(f"{i + 1}.", doc.page_content, "\n")
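To make the idea behind maximal marginal relevance concrete, here is a minimal plain-Python sketch of the greedy selection step. It is an illustration under stated assumptions, not the code LangChain or DingoDB runs. The list doc_vecs plays the role of the fetch_k candidates, k is the number of results to keep, and lambda_mult (assumed here to range from 0 for maximum diversity to 1 for pure relevance) weights relevance against redundancy. The function names are hypothetical.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def mmr_select(
    query_vec: list[float],
    doc_vecs: list[list[float]],
    k: int = 2,
    lambda_mult: float = 0.5,
) -> list[int]:
    """Greedily pick k candidate indices, trading relevance against redundancy."""
    selected: list[int] = []
    candidates = list(range(len(doc_vecs)))

    while candidates and len(selected) < k:
        best_idx, best_score = candidates[0], -math.inf
        for idx in candidates:
            relevance = cosine_similarity(query_vec, doc_vecs[idx])
            # Redundancy: similarity to the closest already-selected document.
            redundancy = max(
                (cosine_similarity(doc_vecs[idx], doc_vecs[j]) for j in selected),
                default=0.0,
            )
            score = lambda_mult * relevance - (1.0 - lambda_mult) * redundancy
            if score > best_score:
                best_idx, best_score = idx, score
        selected.append(best_idx)
        candidates.remove(best_idx)

    return selected


if __name__ == "__main__":
    toy_docs = [[1.0, 0.0], [0.9, 0.1], [0.5, 0.5]]
    print(mmr_select([1.0, 0.0], toy_docs, k=2, lambda_mult=0.3))
    print(mmr_select([1.0, 0.0], toy_docs, k=2, lambda_mult=1.0))
```

In the toy data, the second vector is nearly a duplicate of the first. With lambda_mult=1.0 the selection reduces to plain similarity ranking and keeps both near-duplicates; with a lower value the redundancy penalty pushes the second pick toward the more distinct third vector.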
