codeaudit/ai2-scholarqa-lib

Repo housing the open sourced code for the ai2 scholar qa app and also the corresponding library


Live App | Paper URL | PyPI | HuggingFace Dataset | Demo walkthrough

This repo houses the code for the live demo and can be run as local docker containers or embedded into another application as a python package.


Ai2 Scholar QA is a system for answering scientific queries and generating literature reviews by gathering evidence from multiple documents across our corpus (11M+ full texts and 100M+ abstracts) and synthesizing an organized report with evidence for each claim. Based on the RAG architecture, Scholar QA has a retrieval component and a three-step generator pipeline.

  • Retrieval:

    The retrieval component consists of two sub-components:

    i. Retriever - Based on the user query, relevant evidence passages are fetched using the Semantic Scholar public API's snippet/search end point, which looks up an index of open source papers. Further, we also use the API's keyword search to supplement the results from the index with paper abstracts. The user query is first pre-processed to extract metadata for filtering the candidate papers and re-phrasing the query as needed with the help of an LLM - Prompt. (A sketch of the underlying snippet search call appears after this overview.)

    ii. Reranker - The results from the retriever are then reranked with mixedbread-ai/mxbai-rerank-large-v1, and the top k passages are retained and aggregated at the paper level to combine all the passages from a single paper.

These components are encapsulated in the PaperFinder class.

  • Multi-step Generation:

    The generation pipeline uses an LLM (Claude 3.7 Sonnet by default) and comprises three steps:

    i. Quote Extraction - The user query along with the aggregated passages from the retrieval component are used to extract the exact quotes relevant to answering the query - Prompt.

    ii. Planning and Clustering - First, an organization outline is generated for the report with section headings and the corresponding format of each section. The quotes from step (i) are clustered and assigned to each heading - Prompt.

    iii. Summary Generation - Each section is generated based on the quotes assigned to that section and all the prior section text generated in the report - Prompt.

    These steps are encapsulated in the MultiStepQAPipeline class. For sections that are determined to have a list format, we also generate literature review tables that compare and contrast all papers referenced in that section. We generate these tables using the pipeline proposed by the ArxivDIGESTables paper, which is available here.

Both PaperFinder and MultiStepQAPipeline are in turn wrapped inside ScholarQA, which is the main class powering our system.
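For a sense of what the Retriever calls under the hood, the sketch below queries the Semantic Scholar public API's snippet/search end point directly. This is an illustration only, not the library's actual code; the response field names ("data", "snippet", "text") are assumptions based on the public API.

```python
# Illustrative direct call to the Semantic Scholar snippet/search endpoint
# used by the Retriever; response field names are assumptions.
import os
import requests

resp = requests.get(
    "https://api.semanticscholar.org/graph/v1/snippet/search",
    params={"query": "effects of sleep deprivation on memory", "limit": 10},
    headers={"x-api-key": os.environ["S2_API_KEY"]},
)
resp.raise_for_status()
for item in resp.json().get("data", []):
    print(item.get("snippet", {}).get("text", "")[:120])
```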

For more info, please refer to our blogpost.

The code in this repo can be used as a Dockerized web app, an async API, or as a Python library. We start with the common configuration setup required for these modes and then describe each mode separately below.

  • Common Setup

  • Environment Variables

Ai2 Scholar QA requires the Semantic Scholar API and LLMs for its core functionality of retrieval and generation, so please create a .env file in the root directory with the following variables, or include them in your runtime environment directly:

```bash
export S2_API_KEY=
export ANTHROPIC_API_KEY=
export OPENAI_API_KEY=
```

S2_API_KEY: Used to retrieve the relevant paper passages, keyword search results and associated metadata via the Semantic Scholar public API.

ANTHROPIC_API_KEY: Ai2 Scholar QA uses Anthropic's Claude 3.7 Sonnet as the primary LLM for generation, but any model served by litellm from the providers listed here will work. Please configure the corresponding api key here.

OPENAI_API_KEY: OpenAI's GPT-4o is configured as the fallback LLM.

Note: We also use OpenAI's text moderation API to validate and filter harmful queries. If you don't have access to an OpenAI API key, this feature will be disabled.

If you use Modal to serve your models, please configure MODAL_TOKEN and MODAL_TOKEN_SECRET here as well.

  • Web App

  • Application Configuration

The web app is initialized with a json config outlining the logging and pipeline attributes to be used at runtime. Please refer to default.json for the default runtime config.

{"logs": {"log_dir":"logs","llm_cache_dir":"llm_cache","event_trace_loc":"scholarqa_traces","tracing_mode":"local"  },"run_config": {"retrieval_service":"public_api","retriever_args": {"n_retrieval":256,"n_keyword_srch":20    },"reranker_service":"modal","reranker_args": {"app_name":"ai2-scholar-qa","api_name":"inference_api","batch_size":256,"gen_options": {}    },"paper_finder_args": {"n_rerank":50,"context_threshold":0.5    },"pipeline_args": {"validate":true,"llm":"anthropic/claude-3-5-sonnet-20241022","decomposer_llm":"anthropic/claude-3-5-sonnet-20241022"    }  }}

The config is used to populate the AppConfig instance. It wraps the logging and pipeline instances, which are initialized with the config and are outlined below:

Logging

```python
class LogsConfig(BaseModel):
    log_dir: str = Field(default="logs", description="Directory to store logs, event traces and litellm cache")
    llm_cache_dir: str = Field(default="llm_cache", description="Sub directory to cache llm calls")
    event_trace_loc: str = Field(default="scholarqa_traces", description="Sub directory to store event traces "
                                                                         "OR the GCS bucket name")
    tracing_mode: Literal["local", "gcs"] = Field(default="local", description="Mode to store event traces (local or gcs)")
```

Note:

i. Event traces are json documents containing a trace of the entire pipeline, i.e. the results of retrieval, reranking, each step of the QA pipeline and associated costs, if any.

ii. llm_cache_dir is used to initialize the local disk cache for caching llm calls via litellm.

iii. The traces are stored locally in {log_dir}/{event_trace_loc} by default. They can also be persisted in a Google Cloud Storage (GCS) bucket: set tracing_mode="gcs" and event_trace_loc=<GCS bucket name> here, and add export GOOGLE_APPLICATION_CREDENTIALS=<Service Account Key json file path> to .env.

iv. By default, the working directory is ./api, so the log_dir will be created inside it as a sub-directory unless the config is modified.
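For example, the GCS setup from note (iii) would look roughly like the sketch below. The import path is an assumption, and the bucket name is a placeholder.

```python
# Sketch: LogsConfig (defined above) set up for GCS tracing per note (iii).
# The module path below is an assumption; "my-trace-bucket" is a placeholder.
from scholarqa.config.config_setup import LogsConfig  # hypothetical import path

logs_config = LogsConfig(
    log_dir="logs",
    llm_cache_dir="llm_cache",
    event_trace_loc="my-trace-bucket",  # GCS bucket name when tracing_mode="gcs"
    tracing_mode="gcs",
)
# The service account key must also be visible to the runtime, e.g. in .env:
# export GOOGLE_APPLICATION_CREDENTIALS=<Service Account Key json file path>
```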

You can also activate Langsmith-based log traces if you have an API key configured. Please add the following environment variables:

```bash
LANGCHAIN_API_KEY
LANGCHAIN_TRACING_V2
LANGCHAIN_ENDPOINT
LANGCHAIN_PROJECT
```

Pipeline

```python
class RunConfig(BaseModel):
    retrieval_service: str = Field(default="public_api", description="Service to use for paper retrieval")
    retriever_args: dict = Field(default=None, description="Arguments for the retrieval service")
    reranker_service: str = Field(default="modal", description="Service to use for paper reranking")
    reranker_args: dict = Field(default=None, description="Arguments for the reranker service")
    paper_finder_args: dict = Field(default=None, description="Arguments for the paper finder service")
    pipeline_args: dict = Field(default=None, description="Arguments for the Scholar QA pipeline service")
```

Note:

i. *(retrieval, reranker)_service can be used to indicate the type of retriever/reranker you want to instantiate. Ai2 Scholar QA uses the FullTextRetriever and ModalReranker respectively, which are chosen based on the default public_api and modal config values. To choose a SentenceTransformers reranker, replace modal with cross_encoder or biencoder, or define your own types.

ii. *(retriever, reranker, paper_finder, pipeline)_args are used to initialize the corresponding instances of the pipeline components, e.g. retriever = FullTextRetriever(**run_config.retriever_args). You can initialize multiple runs and customize your pipeline.

iii. If the reranker_args are not defined, the app resorts to using only the retrieval service.
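To make notes (i)-(iii) concrete, here is a hypothetical sketch of how the *_service and *_args values could be wired to the component classes named above. The app's real factory logic lives in the library and may differ; only the class names and import paths are taken from this README.

```python
# Hypothetical wiring of RunConfig values to pipeline components; the actual
# app factory logic may differ.
from scholarqa.rag.retriever_base import FullTextRetriever
from scholarqa.rag.reranker.modal_engine import ModalReranker
from scholarqa.rag.reranker.reranker_base import CrossEncoderScores

def build_components(run_config: "RunConfig"):
    retriever = FullTextRetriever(**run_config.retriever_args)  # note (ii)
    if not run_config.reranker_args:
        return retriever, None  # note (iii): retrieval-only
    if run_config.reranker_service == "modal":
        reranker = ModalReranker(**run_config.reranker_args)
    else:  # e.g. "cross_encoder" for a SentenceTransformers reranker
        reranker = CrossEncoderScores(**run_config.reranker_args)
    return retriever, reranker
```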

  • docker-compose.yaml

The web app initializes 4 docker containers - one each for the API, GUI, nginx proxy and sonar - with their own Dockerfiles. The api container config can also be used to declare environment variables:

```yaml
api:
  build: ./api
  volumes:
    - ./api:/api
    - ./secret:/secret
  environment:
    # This ensures that errors are printed as they occur, which
    # makes debugging easier.
    - PYTHONUNBUFFERED=1
    - LOG_LEVEL=INFO
    - CONFIG_PATH=run_configs/default.json
  ports:
    - 8000:8000
  env_file:
    - .env
```

environment.CONFIG_PATH indicates the path of the application configuration json file, and env_file indicates the path of the file with environment variables.

  • Running the Webapp

Please refer to DOCKER.md for more info on setting up the docker app.

i. Clone the repo

```bash
git clone git@github.com:allenai/ai2-scholarqa-lib.git
cd ai2-scholarqa-lib
```

ii. Run docker-compose

docker compose up --build

The docker compose command takes a while to run the first time, as it installs torch and related dependencies. You can get verbose output with the following command:

docker compose build --progress plain

Below we show videos of the app startup, the UI, and the backend logging while processing a user query.

Startup

Screen.Recording.2025-02-07.at.6.29.38.PM.mov

UI

Screen.Recording.2025-02-07.at.6.59.47.PM.mov

Backend

Screen.Recording.2025-02-07.at.7.05.40.PM.mov
  • Async API

The Ai2 Scholar QA UI is powered by an async api at the back end in app.py, which is run from dev.sh.

i. The query_corpusqa end point is first called with the query and a uuid as the user_id, and it returns a task_id.


ii. Subsequently, query_corpusqa is polled with the task_id to get the updated status of the async task, until the task status is COMPLETED.

Sample response
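A rough client-side sketch of this flow follows. The endpoint path and the query/user_id/task_id fields come from the description above; everything else, including the status field name and payload shape, is an assumption.

```python
# Sketch of a client driving the async API; payload and response field names
# beyond query/user_id/task_id are assumptions.
import time
import uuid
import requests

BASE_URL = "http://localhost:8000"  # port published by docker-compose

# i. Kick off the task with the query and a uuid as the user_id.
resp = requests.post(f"{BASE_URL}/query_corpusqa",
                     json={"query": "Which is the 9th planet in our solar system?",
                           "user_id": str(uuid.uuid4())})
task_id = resp.json()["task_id"]

# ii. Poll with the task_id until the status is COMPLETED.
while True:
    status = requests.post(f"{BASE_URL}/query_corpusqa", json={"task_id": task_id}).json()
    if status.get("task_status") == "COMPLETED":
        break
    time.sleep(5)
print(status)
```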

  • Python Package

```bash
conda create -n scholarqa python=3.11.3
conda activate scholarqa
pip install ai2-scholar-qa

# to use sentence transformer models as re-ranker
pip install 'ai2-scholar-qa[all]'
```

Both the webapp and the api are powered by the same pipeline, represented by the ScholarQA class. The pipeline consists of a retrieval component, PaperFinder, which comprises a retriever and optionally a reranker, and a three-step generator component, MultiStepQAPipeline. Each component is extensible and can be replaced by custom instances/classes as required.

Sample usage

```python
from scholarqa.rag.reranker.reranker_base import CrossEncoderScores
from scholarqa.rag.reranker.modal_engine import ModalReranker
from scholarqa.rag.retrieval import PaperFinderWithReranker
from scholarqa.rag.retriever_base import FullTextRetriever
from scholarqa import ScholarQA
from scholarqa.llms.constants import CLAUDE_37_SONNET

# Retrieval class/steps
retriever = FullTextRetriever(n_retrieval=256, n_keyword_srch=20)  # full text and keyword search
reranker = CrossEncoderScores(model_name_or_path="mixedbread-ai/mxbai-rerank-large-v1")  # sentence transformer

# Reranker if deployed on Modal; modal_app_name and modal_api_name are modal specific arguments.
# Please refer https://github.com/allenai/ai2-scholarqa-lib/blob/aps/readme_fixes/docs/MODAL.md for more info
reranker = ModalReranker(app_name='<modal_app_name>', api_name='<modal_api_name>', batch_size=256, gen_options=dict())

# Wraps around the retriever with `retrieve_passages()` and `retrieve_additional_papers()`, and the reranker with rerank();
# any modifications to the retrieval output can be made here
paper_finder = PaperFinderWithReranker(retriever, reranker, n_rerank=50, context_threshold=0.5)

# For wrapper class with MultiStepQAPipeline integrated
scholar_qa = ScholarQA(paper_finder=paper_finder, llm_model=CLAUDE_37_SONNET)  # llm_model can be any litellm model

print(scholar_qa.answer_query("Which is the 9th planet in our solar system?"))
```

Pipeline steps (Modular usage)

Continuing from the sample usage above, below is a breakdown of the pipeline execution inside the ScholarQA class.

```python
from scholarqa import ScholarQA
from scholarqa.rag.multi_step_qa_pipeline import MultiStepQAPipeline
from scholarqa.llms.constants import CLAUDE_37_SONNET
from scholarqa.llms.prompts import SYSTEM_PROMPT_QUOTE_PER_PAPER, SYSTEM_PROMPT_QUOTE_CLUSTER, PROMPT_ASSEMBLE_SUMMARY
from scholarqa.utils import NUMERIC_META_FIELDS, CATEGORICAL_META_FIELDS

# Custom MultiStepQAPipeline class/steps with llm_model as any litellm supported model
mqa_pipeline = MultiStepQAPipeline(llm_model=CLAUDE_37_SONNET)
query = "Which is the 9th planet in our solar system?"
scholar_qa = ScholarQA(paper_finder=paper_finder, multi_step_pipeline=mqa_pipeline, llm_model=CLAUDE_37_SONNET)

# Decompose the query to get filters like year, venue, fos, citations, etc. along with
# a re-written version of the query and a query suitable for keyword search.
llm_processed_query = scholar_qa.preprocess_query(query)

# Paper finder step - retrieve relevant paper passages from the semantic scholar index and api
full_text_src, keyword_srch_res = scholar_qa.find_relevant_papers(llm_processed_query.result)
retrieved_candidates = full_text_src + keyword_srch_res

# Rerank the retrieved candidates based on the query with a cross encoder.
# Keyword search results are returned with associated metadata; metadata is retrieved separately for full text search results.
keyword_srch_metadata = [
    {k: v for k, v in paper.items()
     if k == "corpus_id" or k in NUMERIC_META_FIELDS or k in CATEGORICAL_META_FIELDS}
    for paper in keyword_srch_res
]
reranked_df, paper_metadata = scholar_qa.rerank_and_aggregate(
    query,
    retrieved_candidates,
    filter_paper_metadata={str(paper["corpus_id"]): paper for paper in keyword_srch_metadata},
)

# Step 1 - quote extraction
per_paper_quotes = scholar_qa.step_select_quotes(query, reranked_df, sys_prompt=SYSTEM_PROMPT_QUOTE_PER_PAPER)

# Step 2 - outline planning and clustering
cluster_json = scholar_qa.step_clustering(query, per_paper_quotes.result, sys_prompt=SYSTEM_PROMPT_QUOTE_CLUSTER)

# Changing to the expected format in the summary generation prompt
plan_json = {f'{dim["name"]} ({dim["format"]})': dim["quotes"] for dim in cluster_json.result["dimensions"]}

# Step 2.1 - extend the clustered snippets in plan_json with their inline citations
per_paper_summaries_extd = scholar_qa.extract_quote_citations(reranked_df, per_paper_quotes.result, plan_json, paper_metadata)

# Step 3 - generating output as per the outline
answer = list(scholar_qa.step_gen_iterative_summary(query, per_paper_summaries_extd, plan_json, sys_prompt=PROMPT_ASSEMBLE_SUMMARY))
```
  • Custom Pipeline

  • API end points

    The api end points in app.py can be extended with a fastapi APIRouter in another script, e.g. custom_app.py:

```python
from fastapi import APIRouter, FastAPI
from scholarqa.app import create_app as create_app_base
from scholarqa.app import app_config
from scholarqa.models import ToolRequest

def create_app() -> FastAPI:
    app = create_app_base()
    custom_router = APIRouter()

    @custom_router.post("/retrieval")
    def retrieval(tool_request: ToolRequest, task_id: str):
        scholar_qa = app_config.load_scholarqa(task_id)
        # a re-written version of the query and a query suitable for keyword search
        llm_processed_query = scholar_qa.preprocess_query(tool_request.query)
        full_text_src, keyword_srch_res = scholar_qa.find_relevant_papers(llm_processed_query.result)
        retrieved_candidates = full_text_src + keyword_srch_res
        return retrieved_candidates

    app.include_router(custom_router)
    return app
```

    To run custom_app.py, simply replace scholarqa.app:create_app in dev.sh with <package>.custom_app:create_app.

  • ScholarQA class

    To extend the existing ScholarQA functionality in a new class, you can either create a subclass of ScholarQA or a new class altogether. Either way, lazy_load_scholarqa in app.py should be reimplemented in the new api script to ensure the correct class is initialized.
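    For instance, a minimal subclass might look like the sketch below. This is hypothetical; only answer_query() is shown elsewhere in this README, so its exact signature is an assumption.

```python
# Hypothetical ScholarQA subclass; lazy_load_scholarqa in the api script would
# need to be re-pointed at AuditedScholarQA for the app to use it.
from scholarqa import ScholarQA

class AuditedScholarQA(ScholarQA):
    def answer_query(self, query, *args, **kwargs):
        answer = super().answer_query(query, *args, **kwargs)
        # custom post-processing / audit logging would go here
        return answer
```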

  • Pipeline Components

    The components of the pipeline are individually extensible. We have abstract classes that can be extended to achieve the desired customization for retrieval, and the MultiStepQAPipeline can be extended/modified as needed for generation. A sketch of a custom retriever follows below.
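    For example, a custom retriever could subclass FullTextRetriever and override retrieve_passages(), which is named in the sample usage above. This is a sketch; the method's exact signature and return type are assumptions, hence the *args/**kwargs passthrough.

```python
# Hypothetical retriever that de-duplicates passages before reranking.
# retrieve_passages() is referenced in this README, but its signature and
# return type are assumptions here.
from scholarqa.rag.retriever_base import FullTextRetriever

class DedupFullTextRetriever(FullTextRetriever):
    def retrieve_passages(self, *args, **kwargs):
        passages = super().retrieve_passages(*args, **kwargs)
        seen, unique = set(), []
        for passage in passages:
            key = str(passage)
            if key not in seen:
                seen.add(key)
                unique.append(passage)
        return unique
```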

  • Modal deployment

    If you would prefer to serve your models via Modal, please refer to MODAL.md for more info and sample code that we used to deploy the reranker model in the live demo.

  • Citation

    Please cite the work as follows:
```bibtex
@inproceedings{Singh2025Ai2SQ,
  title={Ai2 Scholar QA: Organized Literature Synthesis with Attribution},
  author={Amanpreet Singh and Joseph Chee Chang and Chloe Anastasiades and Dany Haddad and Aakanksha Naik and Amber Tanaka and Angele Zamarron and Cecile Nguyen and Jena D. Hwang and Jason Dunkleberger and Matt Latzke and Smita Rao and Jaron Lochner and Rob Evans and Rodney Kinney and Daniel S. Weld and Doug Downey and Sergey Feldman},
  year={2025},
  url={https://api.semanticscholar.org/CorpusID:277786810}
}
```
