FactGenius: Combining Zero-Shot Prompting and Fuzzy Relation Mining to Improve Fact Verification with Knowledge Graphs
usage: python fine_tune_hf.py [-h] [--batch_size BATCH_SIZE] [--lr LR] [--model MODEL] [--epochs EPOCHS] [--freeze FREEZE] [--dbpedia_path DBPEDIA_PATH] [--data_path DATA_PATH] [--plot_roc]

optional arguments:
  -h, --help            show this help message and exit
  --batch_size BATCH_SIZE
  --lr LR
  --model MODEL
  --epochs EPOCHS
  --dbpedia_path DBPEDIA_PATH
  --data_path DATA_PATH
  --plot_roc            If set, the ROC curve will be plotted and saved.
The script outputs evaluation results on the test set, broken down across all five reasoning types as reported in the paper, and also saves the model in the ./results directory.
usage: python llm_fact_check.py [-h] [--data_path DATA_PATH] [--dbpedia_path DBPEDIA_PATH] [--evidence_path EVIDENCE_PATH] [--set {test,train,val}] [--num_proc NUM_PROC] [--llm_knowledge] [--vllm_url VLLM_URL]

optional arguments:
  -h, --help            show this help message and exit
  --data_path DATA_PATH
  --dbpedia_path DBPEDIA_PATH
  --evidence_path EVIDENCE_PATH
                        Path to the evidence JSONs predicted by the LLM.
  --set {test,train,val}
  --num_proc NUM_PROC
  --llm_knowledge       If set, the instruction will be claim-only LLM-based fact checking.
  --vllm_url VLLM_URL   URL of the vLLM server, e.g., http://g002:8000
Note: The output of graph filtering with the LLM is already available in the repo, so you can skip the inference-server setup and jump straight to fine-tuning the pre-trained models and evaluating them.
Request access to the model at https://huggingface.co/meta-llama/Meta-Llama-3-8B-Instruct.
Make sure you are logged in to Hugging Face (you can check with huggingface-cli whoami) and have access to the model.
python -m vllm.entrypoints.openai.api_server --model meta-llama/Meta-Llama-3-8B-Instruct
The inference server should be up and running at http://hostname:8000 (port 8000 by default). Remember to replace the --vllm_url argument with the appropriate server URL when using the LLM.
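Before running the scripts below, you can sanity-check that the server is reachable. This is a minimal sketch, not code from the repo; it assumes the standard OpenAI-compatible /v1/models route (which vLLM's api_server exposes by default) and the example host used in this README:

```python
import requests

# List the models served by the vLLM OpenAI-compatible server started above.
# Replace the host/port with whatever you pass to --vllm_url.
VLLM_URL = "http://g002:8000"

resp = requests.get(f"{VLLM_URL}/v1/models", timeout=10)
resp.raise_for_status()
for model in resp.json().get("data", []):
    print(model["id"])  # should include meta-llama/Meta-Llama-3-8B-Instruct
```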
python llm_fact_check.py --set test --llm_knowledge --vllm_url http://g002:8000
python fine_tune_hf.py --model roberta-base --batch_size 32
The fine-tuned model is pushed to https://huggingface.co/SushantGautam/KG-LLM-roberta-base-claim_only. It can be loaded with the Hugging Face Transformers library and used for inference with the appropriate tokenizer, as in the sketch below.
The fine-tuning logs and metrics can be found at https://wandb.ai/ubl/FactKG_IN9550/runs/ui83kr9l.
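For example, the checkpoint can be loaded for inference roughly as follows. This is a minimal sketch, not code from the repo; the example claim is made up, and the mapping from class index to label depends on the checkpoint's config:

```python
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

# Load the claim-only fine-tuned classifier from the Hugging Face Hub.
model_id = "SushantGautam/KG-LLM-roberta-base-claim_only"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(model_id)
model.eval()

# Classify a single claim (hypothetical example input).
claim = "Adolfo Suarez Madrid-Barajas Airport is located in Spain."
inputs = tokenizer(claim, return_tensors="pt", truncation=True)
with torch.no_grad():
    logits = model(**inputs).logits
print(logits.argmax(dim=-1).item())  # predicted class index; label names come from model.config.id2label
```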
python llm_filter_relation.py --set train --vllm_url http://g002:8000
python llm_filter_relation.py --set val --vllm_url http://g002:8000
python llm_filter_relation.py --set test --vllm_url http://g002:8000
This will output JSON files in the ./llm_train, ./llm_val, and ./llm_test directories. The output directories have been saved as a zip file: llm_v1_jsons.zip.
Assume the ./llm_train, ./llm_val, and ./llm_test directories with the JSONs are available inside the ./llm_v1_jsons/ directory. You can get them by unzipping the llm_v1_jsons.zip file:
unzip llm_v1_jsons.zip
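If you want to sanity-check the unzipped JSONs before mining, something like the following works. This is a minimal sketch; the per-file naming and internal structure are produced by llm_filter_relation.py and are not documented here, so it only counts and peeks at the files:

```python
import json
from pathlib import Path

# Count the LLM-filtered relation JSONs per split (assumption: the unzipped
# split directories sit under ./llm_v1_jsons/ as described above).
for split_dir in ["llm_train", "llm_val", "llm_test"]:
    files = sorted(Path("./llm_v1_jsons", split_dir).glob("*.json"))
    print(split_dir, len(files), "JSON files")
    if files:
        sample = json.loads(files[0].read_text())
        print("  sample type:", type(sample).__name__)
```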
python mine_llm_filtered_relation.py --set train --outputPath ./llm_v1/ --jsons_path ./llm_v1_jsons/
python mine_llm_filtered_relation.py --set val --outputPath ./llm_v1/ --jsons_path ./llm_v1_jsons/
python mine_llm_filtered_relation.py --set test --outputPath ./llm_v1/ --jsons_path ./llm_v1_jsons/
This will output CSV files in the ./llm_v1/ directory. The output is already available in the repo.
python mine_llm_filtered_relation.py --set train --outputPath ./llm_v1_singleStage/ --jsons_path ./llm_v1_jsons/ --skip_second_stage
python mine_llm_filtered_relation.py --set val --outputPath ./llm_v1_singleStage/ --jsons_path ./llm_v1_jsons/ --skip_second_stage
python mine_llm_filtered_relation.py --set test --outputPath ./llm_v1_singleStage/ --jsons_path ./llm_v1_jsons/ --skip_second_stage
This will output CSV files in the ./llm_v1_singleStage/ directory. The output is already available in the repo.
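To inspect the mined-relation CSVs (either variant), a quick peek with pandas is enough. This is a minimal sketch; the file names and column layout are whatever mine_llm_filtered_relation.py writes and are not documented here:

```python
from pathlib import Path

import pandas as pd

# Preview the CSVs produced by mine_llm_filtered_relation.py.
for out_dir in ["./llm_v1", "./llm_v1_singleStage"]:
    for csv_path in sorted(Path(out_dir).glob("*.csv")):
        df = pd.read_csv(csv_path)
        print(csv_path, df.shape, list(df.columns)[:5])
```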
python llm_fact_check.py --set test --vllm_url http://g002:8000 --evidence_path ./llm_v1_jsons
python fine_tune_hf.py --model bert-base-uncased --batch_size 64 --data_path ./llm_v1/
python fine_tune_hf.py --model roberta-base --batch_size 32 --data_path ./llm_v1_singleStage/
python fine_tune_hf.py --model roberta-base --batch_size 32 --data_path ./llm_v1/
The fine-tuned models are pushed to https://huggingface.co/SushantGautam/KG-LLM-bert-base, https://huggingface.co/SushantGautam/KG-LLM-roberta-base-single_stage and https://huggingface.co/SushantGautam/KG-LLM-roberta-base. They can be loaded with the Hugging Face Transformers library and used for inference with the appropriate tokenizer, as shown below.
The fine-tuning logs and metrics can be found at https://wandb.ai/ubl/FactKG_IN9550/runs/7sf9kelb, https://wandb.ai/ubl/FactKG_IN9550/runs/sweuj8g6 and https://wandb.ai/ubl/FactKG_IN9550/runs/m5vqnfcr.
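Any of these checkpoints can be loaded the same way as the claim-only model above, for example via the text-classification pipeline. This is a minimal sketch, not code from the repo; note that these models were fine-tuned on claims paired with mined evidence, so at inference time the claim and evidence have to be combined into the input string exactly the way fine_tune_hf.py formats them during training:

```python
from transformers import pipeline

# Load one of the evidence-conditioned checkpoints from the Hub.
clf = pipeline("text-classification", model="SushantGautam/KG-LLM-roberta-base")

# Hypothetical claim; for these checkpoints the mined evidence has to be
# appended to the claim following the preprocessing in fine_tune_hf.py,
# which is not reproduced here.
print(clf("Adolfo Suarez Madrid-Barajas Airport is located in Spain."))
```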