How to handle multiple retrievers when doing query analysis
Sometimes, a query analysis technique may allow for selection of whichretriever to use. To use this, you will need to add some logic to select the retriever to do. We will show a simple example (using mock data) of how to do that.
Setup
Install dependencies
%pip install-qU langchain langchain-community langchain-openai langchain-chroma
Note: you may need to restart the kernel to use updated packages.
Set environment variables
We'll use OpenAI in this example:
import getpass
import os
if"OPENAI_API_KEY"notin os.environ:
os.environ["OPENAI_API_KEY"]= getpass.getpass()
# Optional, uncomment to trace runs with LangSmith. Sign up here: https://smith.langchain.com.
# os.environ["LANGSMITH_TRACING"] = "true"
# os.environ["LANGSMITH_API_KEY"] = getpass.getpass()
Create Index
We will create a vectorstore over fake information.
from langchain_chromaimport Chroma
from langchain_openaiimport OpenAIEmbeddings
from langchain_text_splittersimport RecursiveCharacterTextSplitter
texts=["Harrison worked at Kensho"]
embeddings= OpenAIEmbeddings(model="text-embedding-3-small")
vectorstore= Chroma.from_texts(texts, embeddings, collection_name="harrison")
retriever_harrison= vectorstore.as_retriever(search_kwargs={"k":1})
texts=["Ankush worked at Facebook"]
embeddings= OpenAIEmbeddings(model="text-embedding-3-small")
vectorstore= Chroma.from_texts(texts, embeddings, collection_name="ankush")
retriever_ankush= vectorstore.as_retriever(search_kwargs={"k":1})
API Reference:OpenAIEmbeddings |RecursiveCharacterTextSplitter
Query analysis
We will use function calling to structure the output. We will let it return multiple queries.
from typingimport List, Optional
from pydanticimport BaseModel, Field
classSearch(BaseModel):
"""Search for information about a person."""
query:str= Field(
...,
description="Query to look up",
)
person:str= Field(
...,
description="Person to look things up for. Should be `HARRISON` or `ANKUSH`.",
)
from langchain_core.output_parsers.openai_toolsimport PydanticToolsParser
from langchain_core.promptsimport ChatPromptTemplate
from langchain_core.runnablesimport RunnablePassthrough
from langchain_openaiimport ChatOpenAI
output_parser= PydanticToolsParser(tools=[Search])
system="""You have the ability to issue search queries to get information to help answer user information."""
prompt= ChatPromptTemplate.from_messages(
[
("system", system),
("human","{question}"),
]
)
llm= ChatOpenAI(model="gpt-4o-mini", temperature=0)
structured_llm= llm.with_structured_output(Search)
query_analyzer={"question": RunnablePassthrough()}| prompt| structured_llm
We can see that this allows for routing between retrievers
query_analyzer.invoke("where did Harrison Work")
Search(query='work history', person='HARRISON')
query_analyzer.invoke("where did ankush Work")
Search(query='work history', person='ANKUSH')
Retrieval with query analysis
So how would we include this in a chain? We just need some simple logic to select the retriever and pass in the search query
from langchain_core.runnablesimport chain
API Reference:chain
retrievers={
"HARRISON": retriever_harrison,
"ANKUSH": retriever_ankush,
}
@chain
defcustom_chain(question):
response= query_analyzer.invoke(question)
retriever= retrievers[response.person]
return retriever.invoke(response.query)
custom_chain.invoke("where did Harrison Work")
[Document(page_content='Harrison worked at Kensho')]
custom_chain.invoke("where did ankush Work")
[Document(page_content='Ankush worked at Facebook')]