FLARE 💥
FLARE (Forward-Looking Active REtrieval augmented generation) is a generic retrieval-augmented generation method that actively decides when and what to retrieve. It first predicts the upcoming sentence to anticipate future content. If that prediction contains low-confidence tokens, FLARE uses it as the query to retrieve relevant documents and then regenerates the sentence with them.
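
To make the loop concrete, here is a minimal sketch of the idea in plain Python. It is an illustration under stated assumptions, not LangChain's `FlareChain` implementation: `flare_generate`, `toy_generate`, and `toy_retrieve` are hypothetical names, the generator is assumed to return (token, probability) pairs for one sentence, and the whole draft sentence is used as the query, which is a simplification.

```python
from typing import Callable, List, Tuple

# One tentatively generated sentence: (token, probability) pairs.
Sentence = List[Tuple[str, float]]


def flare_generate(
    question: str,
    generate_sentence: Callable[[str, List[str]], Sentence],
    retrieve: Callable[[str], List[str]],
    min_prob: float = 0.45,
    max_sentences: int = 5,
) -> str:
    """Simplified FLARE loop: retrieve only when the draft looks uncertain."""
    answer: List[str] = []
    for _ in range(max_sentences):
        context = question + " " + " ".join(answer)
        # 1. Predict the upcoming sentence without any retrieved documents.
        draft = generate_sentence(context, [])
        if not draft:
            break  # the generator has nothing more to say
        # 2. Any token below min_prob triggers retrieval, with the draft as query.
        if any(prob < min_prob for _, prob in draft):
            query = " ".join(token for token, _ in draft)
            docs = retrieve(query)
            # 3. Regenerate the same sentence, now grounded in the documents.
            draft = generate_sentence(context, docs)
            if not draft:
                break
        answer.append(" ".join(token for token, _ in draft))
    return " ".join(answer)


# Hypothetical stand-ins for an LLM and a retriever, for illustration only.
queries_seen: List[str] = []


def toy_generate(context: str, docs: List[str]) -> Sentence:
    if "Paris" in context:
        return []  # answer already given, stop generating
    if docs:
        return [("Paris", 0.9)]  # confident once documents are available
    return [("Lyon", 0.2)]  # low-confidence guess without documents


def toy_retrieve(query: str) -> List[str]:
    queries_seen.append(query)
    return ["The capital of France is Paris."]
```

Calling `flare_generate("What is the capital of France?", toy_generate, toy_retrieve)` drafts a low-confidence sentence, retrieves with that draft as the query, and keeps the regenerated sentence. The `min_prob` threshold here plays the same role as the `min_prob` argument passed to `FlareChain` in the snippet that follows.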

Here’s a code snippet for using FLARE with LangChain:
from langchain.vectorstores import LanceDB
from langchain.document_loaders import ArxivLoader
from langchain.chains import FlareChain
from langchain.prompts import PromptTemplate
from langchain.chains import LLMChain
from langchain.llms import OpenAI

llm = OpenAI()

# load dataset
# (doc_chunks, embeddings, and table are assumed to be created in this step)

# LanceDB retriever
vector_store = LanceDB.from_documents(doc_chunks, embeddings, connection=table)
retriever = vector_store.as_retriever()

# define flare chain
# min_prob is the confidence threshold: tokens below it trigger retrieval
flare = FlareChain.from_llm(
    llm=llm,
    retriever=retriever,
    max_generation_len=300,
    min_prob=0.45,
)

# input_text is the user question, defined elsewhere
result = flare.run(input_text)