
Self RAG 🤳

Self-RAG is a Retrieval-Augmented Generation (RAG) strategy that improves the quality of retrieved information, the generated text, and its validation, without sacrificing flexibility. Unlike traditional RAG, Self-RAG retrieves information only when needed, can skip retrieval entirely, and evaluates its own output while generating text. It also selects the best output according to configurable preferences.

Official Paper

Figure: agent-based RAG workflow (Self RAG: Source)

Official Implementation

Self-RAG starts by generating a response directly, without retrieval, when the question does not require extra information. For questions that need more detail, it retrieves the necessary passages first.
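
The decision flow can be sketched roughly as follows. This is only an illustrative outline, not the paper's implementation: `retrieve`, `generate`, and `critique` are hypothetical helpers standing in for the retrieval decision and reflection-token scoring that Self-RAG performs.

```python
# Illustrative sketch of the Self-RAG control flow (not the official implementation).
# retrieve(), generate(), and critique are hypothetical helpers standing in for the
# model's retrieval decision and its reflection-token scoring.
def self_rag(question, retrieve, generate, critique):
    # 1. Decide whether retrieval is needed at all.
    if not critique.needs_retrieval(question):
        return generate(question, context=None)

    # 2. Retrieve candidate passages and draft one answer per passage.
    candidates = []
    for passage in retrieve(question):
        answer = generate(question, context=passage)
        # 3. Self-evaluate: passage relevance, answer support, overall usefulness.
        score = critique.score(question, passage, answer)
        candidates.append((score, answer))

    # 4. Return the answer preferred by the critique scores.
    return max(candidates, key=lambda pair: pair[0])[1]
```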

Here’s a code snippet for defining a retriever using LangChain:

```python
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain_community.document_loaders import WebBaseLoader
from langchain_community.vectorstores import LanceDB
from langchain_openai import OpenAIEmbeddings

urls = [
    "https://lilianweng.github.io/posts/2023-06-23-agent/",
    "https://lilianweng.github.io/posts/2023-03-15-prompt-engineering/",
    "https://lilianweng.github.io/posts/2023-10-25-adv-attack-llm/",
]

# Load the pages and flatten the per-URL document lists into a single list
docs = [WebBaseLoader(url).load() for url in urls]
docs_list = [item for sublist in docs for item in sublist]

# Split into small, overlapping chunks for indexing
text_splitter = RecursiveCharacterTextSplitter.from_tiktoken_encoder(
    chunk_size=100, chunk_overlap=50
)
doc_splits = text_splitter.split_documents(docs_list)

# Add documents to LanceDB and expose the vector store as a retriever
vectorstore = LanceDB.from_documents(
    documents=doc_splits,
    embedding=OpenAIEmbeddings(),
)
retriever = vectorstore.as_retriever()
```
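
As a quick check (this usage line is not in the original snippet and assumes an `OPENAI_API_KEY` is set in the environment, with the query string being just an example), the retriever can be queried like any LangChain retriever:

```python
# Example query against the retriever built above; assumes OPENAI_API_KEY is set.
docs = retriever.invoke("What are the main components of an LLM-powered agent?")
for d in docs:
    print(d.metadata.get("source"), "->", d.page_content[:80])
```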

The following functions grade the retrieved documents and, if needed, rewrite the question to improve retrieval results:

```python
from typing import Literal

from langchain_core.messages import HumanMessage
from langchain_core.prompts import PromptTemplate
from langchain_openai import ChatOpenAI
from pydantic import BaseModel, Field


def grade_documents(state) -> Literal["generate", "rewrite"]:
    """Grade the relevance of the retrieved documents to the user question."""

    class grade(BaseModel):
        binary_score: str = Field(description="Relevance score 'yes' or 'no'")

    model = ChatOpenAI(temperature=0, model="gpt-4-0125-preview", streaming=True)
    llm_with_tool = model.with_structured_output(grade)

    prompt = PromptTemplate(
        template="""You are a grader assessing relevance of a retrieved document to a user question.
        Here is the retrieved document:

        {context}

        Here is the user question: {question}
        If the document contains keyword(s) or semantic meaning related to the user question, grade it as relevant.
        Give a binary score 'yes' or 'no' to indicate whether the document is relevant to the question.""",
        input_variables=["context", "question"],
    )
    chain = prompt | llm_with_tool

    # The state carries the conversation: the question is the first message,
    # the retrieved context is the last one.
    messages = state["messages"]
    last_message = messages[-1]
    question = messages[0].content
    docs = last_message.content

    scored_result = chain.invoke({"question": question, "context": docs})
    score = scored_result.binary_score
    return "generate" if score == "yes" else "rewrite"


def rewrite(state):
    """Ask the model to rephrase the question to improve retrieval."""
    messages = state["messages"]
    question = messages[0].content

    msg = [
        HumanMessage(
            content=f"""
            Look at the input and try to reason about the underlying semantic intent / meaning.
            Here is the initial question:
            -------
            {question}
            -------
            Formulate an improved question: """,
        )
    ]

    model = ChatOpenAI(temperature=0, model="gpt-4-0125-preview", streaming=True)
    response = model.invoke(msg)
    return {"messages": [response]}
```
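
For illustration, the two functions could be exercised outside of a graph like this. The message-state layout (question first, retrieved context last) mirrors what `grade_documents` expects; the example question and the manual state construction are assumptions made for this sketch, not part of the original code:

```python
# Hypothetical driver for the grading/rewrite step; the state layout and the
# example question are assumptions made for this sketch.
from langchain_core.messages import AIMessage, HumanMessage

question = "How does prompt engineering relate to adversarial attacks on LLMs?"
retrieved = retriever.invoke(question)
context = "\n\n".join(doc.page_content for doc in retrieved)

state = {"messages": [HumanMessage(content=question), AIMessage(content=context)]}

if grade_documents(state) == "rewrite":
    # Documents judged irrelevant: ask the model for a better-phrased query and retry.
    improved_question = rewrite(state)["messages"][0].content
    retrieved = retriever.invoke(improved_question)
```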
