Official code of the ACL 2025 paper "SimGRAG: Leveraging Similar Subgraphs for Knowledge Graphs Driven Retrieval-Augmented Generation"
YZ-Cai/SimGRAG
This is the repository for the ACL 2025 paper "SimGRAG: Leveraging Similar Subgraphs for Knowledge Graphs Driven Retrieval-Augmented Generation". SimGRAG is a KG-driven RAG approach that supports various KG-based tasks, such as question answering and fact verification.
It supports plug-and-play usability with the following three components:
- Large language model: For generation.
- Embedding model: For node and relation embedding.
- Vector database: For storing the embeddings of the nodes and relations in the knowledge graph and supporting efficient similarity search.
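To illustrate what the similarity search in the third component does, here is a minimal sketch in plain Python. It is not the repository's code: the function names, the toy three-dimensional vectors, and the choice of cosine similarity are all assumptions made for illustration. In practice the vector database performs this search over real embeddings, with whatever metric it is configured to use.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k_similar(query, embeddings, k=2):
    """Return the k (name, score) pairs whose vectors are most similar to the query."""
    scored = [(name, cosine_similarity(query, vec)) for name, vec in embeddings.items()]
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored[:k]


# Hypothetical toy embeddings for two nodes and one relation of a knowledge graph.
toy_embeddings = {
    "Inception": [0.9, 0.1, 0.0],
    "Christopher Nolan": [0.7, 0.7, 0.1],
    "directed_by": [0.0, 0.2, 0.9],
}

for name, score in top_k_similar([1.0, 0.0, 0.0], toy_embeddings, k=2):
    print(name, round(score, 3))
```

A brute-force scan like this is linear in the number of stored vectors; a dedicated vector database is used precisely so that the search stays efficient on large knowledge graphs.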
This repository is built on open-source solutions of these components:
- Ollama for running the Llama 3 70B large language model
- Nomic embedding model for node and relation embedding
- Milvus for vector database
You can replace any of these components with your preferred alternative; all you need is to prepare the corresponding APIs. Next, we provide the preparation steps for the components we used.
Please visit the Ollama website to install Ollama on your local environment. After installation, you can use the following command to run the Llama 3 70B model:

    ollama run llama3:70b

Then, you can use the following command to start the service needed by SimGRAG:

    bash ollama_server.sh

You can clone the Nomic embedding model from Hugging Face with the following commands:

    mkdir -p data/raw
    cd data/raw
    git clone https://huggingface.co/nomic-ai/nomic-embed-text-v1

Please visit the Milvus website to install Milvus on your local environment. After installation, you can follow its documentation to start the service needed by SimGRAG.
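Once the Ollama service is running, a quick way to confirm that the model answers is to send it a single request. The sketch below uses Ollama's `/api/generate` HTTP endpoint with the standard library only. The address `http://localhost:11434` is Ollama's default and is an assumption here, since `ollama_server.sh` may bind a different host or port. The helper names `build_generate_request` and `generate` are hypothetical and not part of SimGRAG.

```python
import json
import urllib.request

# Assumption: Ollama's default address; adjust if ollama_server.sh uses another port.
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_generate_request(prompt, model="llama3:70b"):
    """Build the JSON payload for a single non-streaming generation request."""
    return {"model": model, "prompt": prompt, "stream": False}


def generate(prompt, model="llama3:70b", url=OLLAMA_URL, timeout=600):
    """Send the prompt to a running Ollama server and return the generated text."""
    body = json.dumps(build_generate_request(prompt, model)).encode("utf-8")
    request = urllib.request.Request(
        url, data=body, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        # With "stream": False the server replies with one JSON object
        # whose "response" field holds the generated text.
        return json.loads(response.read().decode("utf-8"))["response"]
```

With the server up, `print(generate("Who directed Inception?"))` should print a short answer; a connection error usually means the service is not listening at the assumed address.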
Please download the MetaQA dataset following the URL in the repository and put it in the data/raw folder.
Please download the FactKG dataset following the URL in the repository and put it in the data/raw folder.
After preparation, the directories should be organized as follows:

    SimGraphRAG
    ├── data
    │   └── raw
    │       ├── nomic-embed-text-v1
    │       ├── MetaQA
    │       └── FactKG
    ├── configs
    ├── pipeline
    ├── prompts
    └── src

You can find the configuration files in the configs folder. You can modify the configuration files to fit your needs.
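Before running the pipeline, it can help to confirm that the directory layout described above is in place. The following is a small sketch, not part of the repository: `EXPECTED_DIRS` simply mirrors the tree shown above, and `missing_directories` is a hypothetical helper name.

```python
from pathlib import Path

# Directories taken from the layout above, relative to the repository root.
EXPECTED_DIRS = [
    "data/raw/nomic-embed-text-v1",
    "data/raw/MetaQA",
    "data/raw/FactKG",
    "configs",
    "pipeline",
    "prompts",
    "src",
]


def missing_directories(root):
    """Return the expected directories that do not exist under the given root."""
    root = Path(root)
    return [rel for rel in EXPECTED_DIRS if not (root / rel).is_dir()]
```

Calling `missing_directories(".")` from the repository root returns an empty list when everything is in place, and otherwise lists the paths that still need to be created or downloaded.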
For MetaQA, you can run the following commands:

    cd pipeline
    python metaQA_index.py
    python metaQA_query1hop.py
    python metaQA_query2hop.py
    python metaQA_query3hop.py

For FactKG, you can run the following commands:

    cd pipeline
    python factKG_index.py
    python factKG_query.py

The results are written to the file specified by "output_filename" in the configuration file, for example "results/FactKG_query.txt". Each line of the result file is a dictionary in which the key "correct" indicates whether the final answer is correct.
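Since each line of the result file is a dictionary with a "correct" key, overall accuracy can be computed with a few lines of Python. This is a sketch under assumptions, not the repository's own evaluation script. The exact serialization of each line is not stated here, so the parser tries JSON first and falls back to a Python dictionary literal. The value of "correct" is assumed to be a boolean or another truthy/falsy value. All function names are hypothetical.

```python
import ast
import json


def parse_result_line(line):
    """Parse one result line, assumed to be either JSON or a Python dict literal."""
    line = line.strip()
    try:
        return json.loads(line)
    except json.JSONDecodeError:
        return ast.literal_eval(line)


def accuracy(lines):
    """Fraction of non-empty result lines whose "correct" value is truthy."""
    records = [parse_result_line(line) for line in lines if line.strip()]
    if not records:
        return 0.0
    return sum(1 for record in records if record.get("correct")) / len(records)


def accuracy_from_file(path):
    """Compute accuracy over a result file such as results/FactKG_query.txt."""
    with open(path, encoding="utf-8") as handle:
        return accuracy(handle)
```

For example, `accuracy_from_file("results/FactKG_query.txt")` returns a value between 0 and 1 once the FactKG query script has finished.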
If you use this project in your research, please cite this paper:
    @inproceedings{simgrag2025,
        title = "{SimGRAG}: Leveraging Similar Subgraphs for Knowledge Graphs Driven Retrieval-Augmented Generation",
        author = "Cai, Yuzheng and Guo, Zhenyue and Pei, Yiwen and Bian, Wanrui and Zheng, Weiguo",
        booktitle = "Findings of the Association for Computational Linguistics: ACL 2025",
        month = jul,
        year = "2025",
        address = "Vienna, Austria",
        publisher = "Association for Computational Linguistics",
        url = "https://aclanthology.org/2025.findings-acl.163/",
        pages = "3139--3158",
        ISBN = "979-8-89176-256-5"
    }