DEEP-PolyU/LinearRAG
A relation-free graph construction method for efficient GraphRAG. It eliminates LLM token costs during graph construction, making GraphRAG faster and more efficient than ever.
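As a rough picture of what "relation-free" can mean, the toy sketch below links passages only through the entities they mention and never extracts relation triples. It is not the LinearRAG implementation: `extract_entities` is a hypothetical regex stand-in for a real entity recognizer, the function names are invented for illustration, and the embedding-based semantic linking and scoring are omitted entirely.

```python
import re
from collections import defaultdict


def extract_entities(text):
    """Toy stand-in for an NER model: every capitalized word counts as an entity."""
    return {word.lower() for word in re.findall(r"\b[A-Z][a-z]+\b", text)}


def build_graph(passages):
    """Link each entity to the passages that mention it; no relations are extracted."""
    entity_to_passages = defaultdict(set)
    passage_to_entities = []
    for idx, passage in enumerate(passages):  # one pass over the corpus
        entities = extract_entities(passage)
        passage_to_entities.append(entities)
        for entity in entities:
            entity_to_passages[entity].add(idx)
    return entity_to_passages, passage_to_entities


def retrieve(query, graph, hops=2):
    """Expand from query entities to passages, then onward through shared entities."""
    entity_to_passages, passage_to_entities = graph
    frontier = {e for e in extract_entities(query) if e in entity_to_passages}
    seen_entities = set(frontier)
    ranked = []  # passage indices in order of discovery
    for _ in range(hops):
        reached = set()
        for entity in frontier:
            reached |= entity_to_passages[entity]
        new_passages = sorted(reached - set(ranked))
        ranked.extend(new_passages)
        # Entities first seen in the newly reached passages form the next frontier.
        frontier = set()
        for idx in new_passages:
            frontier |= passage_to_entities[idx]
        frontier -= seen_entities
        seen_entities |= frontier
    return ranked


passages = [
    "Alice founded Acme in Berlin.",
    "Acme later acquired Globex.",
    "Globex is headquartered in Tokyo.",
]
graph = build_graph(passages)
print(retrieve("Where is the company acquired by Acme headquartered?", graph))
```

In this toy corpus the query mentions only Acme, yet the third passage is reached because it shares the entity Globex with the second. Construction touches each entity mention once, which is the intuition behind a linear-cost graph build.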
- ✅ Context-Preserving: Relation-free graph construction that relies on lightweight entity recognition and semantic linking to capture context comprehensively.
- ✅ Complex Reasoning: Enables deep retrieval via semantic bridging, achieving multi-hop reasoning in a single retrieval pass without requiring explicit relational graphs.
- ✅ High Scalability: Zero LLM token consumption during graph construction, faster processing, and linear time and space complexity.
- [2025-10-27] We release LinearRAG, a relation-free graph construction method for efficient GraphRAG.
- [2025-06-06] We release GraphRAG-Bench, the benchmark for evaluating GraphRAG models.
- [2025-01-21] We release the GraphRAG survey.
Step 1: Install Python packages
pip install -r requirements.txt
Step 2: Download the spaCy language model
python -m spacy download en_core_web_trf
Note: For the medical dataset, you need to install the scientific/biomedical spaCy model:
pip install https://s3-us-west-2.amazonaws.com/ai2-s2-scispacy/releases/v0.5.3/en_core_sci_scibert-0.5.3.tar.gz
Step 3: Set up your OpenAI API key
export OPENAI_API_KEY="your-api-key-here"
export OPENAI_BASE_URL="your-base-url-here"
Step 4: Download Datasets
Download the datasets from HuggingFace and place them in the dataset/ folder:
git clone https://huggingface.co/datasets/Zly0523/linear-rag
cp -r linear-rag/dataset/* dataset/
Step 5: Prepare Embedding Model
Make sure the embedding model is available at:
model/all-mpnet-base-v2/

SPACY_MODEL="en_core_web_trf"
EMBEDDING_MODEL="model/all-mpnet-base-v2"
DATASET_NAME="2wikimultihop"
LLM_MODEL="gpt-4o-mini"
MAX_WORKERS=16
python run.py \
    --spacy_model ${SPACY_MODEL} \
    --embedding_model ${EMBEDDING_MODEL} \
    --dataset_name ${DATASET_NAME} \
    --llm_model ${LLM_MODEL} \
    --max_workers ${MAX_WORKERS}
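Before launching run.py, it can help to confirm that the pieces from Steps 3 to 5 are in place. The following is a minimal sketch of such a check, not part of the repository: the `preflight` helper is hypothetical, and it assumes the script is run from the repository root and that both environment variables from Step 3 are expected to be set.

```python
import os
from pathlib import Path


def preflight(env=os.environ, root="."):
    """Return a list of human-readable problems; an empty list means ready to run."""
    problems = []
    # Step 3: API credentials exported in the shell.
    for var in ("OPENAI_API_KEY", "OPENAI_BASE_URL"):
        if not env.get(var):
            problems.append(f"environment variable {var} is not set")
    # Steps 4 and 5: dataset folder and local embedding model directory.
    for folder in ("dataset", "model/all-mpnet-base-v2"):
        path = Path(root) / folder
        if not path.is_dir() or not any(path.iterdir()):
            problems.append(f"{folder}/ is missing or empty")
    return problems


if __name__ == "__main__":
    issues = preflight()
    for issue in issues:
        print("NOT READY:", issue)
    if not issues:
        print("All checks passed.")
```

The check only verifies that the folders exist and are non-empty; it does not validate the dataset contents or the model files themselves.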
If you find this work helpful, please consider citing us:
@article{zhuang2025linearrag,
  title={LinearRAG: Linear Graph Retrieval Augmented Generation on Large-scale Corpora},
  author={Zhuang, Luyao and Chen, Shengyuan and Xiao, Yilin and Zhou, Huachi and Zhang, Yujing and Chen, Hao and Zhang, Qinggang and Huang, Xiao},
  journal={arXiv preprint arXiv:2510.10114},
  year={2025}
}
This project is licensed under the GNU General Public License v3.0 (License).
✉️ Email: zhuangluyao523@gmail.com


