FactGenius: Combining Zero-Shot Prompting and Fuzzy Relation Mining to Improve Fact Verification with Knowledge Graphs
usage: python fine_tune_hf.py [-h] [--batch_size BATCH_SIZE] [--lr LR] [--model MODEL] [--epochs EPOCHS] [--freeze FREEZE] [--dbpedia_path DBPEDIA_PATH] [--data_path DATA_PATH] [--plot_roc]

optional arguments:
  -h, --help            show this help message and exit
  --batch_size BATCH_SIZE
  --lr LR
  --model MODEL
  --epochs EPOCHS
  --dbpedia_path DBPEDIA_PATH
  --data_path DATA_PATH
  --plot_roc            If set, the ROC curve will be plotted and saved.
The script outputs evaluation results on the test set, broken down across all five reasoning types as reported in the paper, and also saves the model in the ./results directory.
usage: python llm_fact_check.py [-h] [--data_path DATA_PATH] [--dbpedia_path DBPEDIA_PATH] [--evidence_path EVIDENCE_PATH] [--set {test,train,val}] [--num_proc NUM_PROC] [--llm_knowledge] [--vllm_url VLLM_URL]

optional arguments:
  -h, --help            show this help message and exit
  --data_path DATA_PATH
  --dbpedia_path DBPEDIA_PATH
  --evidence_path EVIDENCE_PATH
                        Path to the evidence JSONs predicted by the LLM.
  --set {test,train,val}
  --num_proc NUM_PROC
  --llm_knowledge       If set, the instruction will be claim-only LLM-based fact checking.
  --vllm_url VLLM_URL   URL of the vLLM server, e.g., http://g002:8000
Note: The output of graph filtering with the LLM is already available in the repo, so you can skip the inference-server setup and jump straight to fine-tuning the pre-trained models and evaluating them.
Request access to the model at https://huggingface.co/meta-llama/Meta-Llama-3-8B-Instruct.
Make sure you are logged in to Hugging Face (you can check with huggingface-cli whoami) and have access to the model.
python -m vllm.entrypoints.openai.api_server --model meta-llama/Meta-Llama-3-8B-Instruct
The inference server should be up and running at http://hostname:8000 (port 8000 by default). Remember to replace the --vllm_url argument with the appropriate server URL when using the LLM.
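Before running the scripts below, you can sanity-check that the server is reachable. This is a minimal sketch, not code from the repo; it assumes the standard OpenAI-compatible /v1/models route (which vLLM's api_server exposes by default) and the example host used in this README:

```python
import requests

# List the models served by the vLLM OpenAI-compatible server started above.
# Replace the host/port with whatever you pass to --vllm_url.
VLLM_URL = "http://g002:8000"

resp = requests.get(f"{VLLM_URL}/v1/models", timeout=10)
resp.raise_for_status()
for model in resp.json().get("data", []):
    print(model["id"])  # should include meta-llama/Meta-Llama-3-8B-Instruct
```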
python llm_fact_check.py --set test --llm_knowledge --vllm_url http://g002:8000
python fine_tune_hf.py --model roberta-base --batch_size 32
The fine-tuned model is pushed to https://huggingface.co/SushantGautam/KG-LLM-roberta-base-claim_only. It can be loaded with the Hugging Face Transformers library and used for inference with the appropriate tokenizer, as in the sketch below.
The fine-tuning logs and metrics can be found at https://wandb.ai/ubl/FactKG_IN9550/runs/ui83kr9l.
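For example, the checkpoint can be loaded for inference roughly as follows. This is a minimal sketch, not code from the repo; the example claim is made up, and the mapping from class index to label depends on the checkpoint's config:

```python
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

# Load the claim-only fine-tuned classifier from the Hugging Face Hub.
model_id = "SushantGautam/KG-LLM-roberta-base-claim_only"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(model_id)
model.eval()

# Classify a single claim (hypothetical example input).
claim = "Adolfo Suarez Madrid-Barajas Airport is located in Spain."
inputs = tokenizer(claim, return_tensors="pt", truncation=True)
with torch.no_grad():
    logits = model(**inputs).logits
print(logits.argmax(dim=-1).item())  # predicted class index; label names come from model.config.id2label
```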
python llm_filter_relation.py --set train --vllm_url http://g002:8000
python llm_filter_relation.py --set val --vllm_url http://g002:8000
python llm_filter_relation.py --set test --vllm_url http://g002:8000
This will output JSON files in the ./llm_train, ./llm_val, and ./llm_test directories. The output directories have been saved as a zip file: llm_v1_jsons.zip.
Assume the ./llm_train, ./llm_val, and ./llm_test directories with the JSONs are available inside the ./llm_v1_jsons/ directory. You can get them by unzipping the llm_v1_jsons.zip file:
unzip llm_v1_jsons.zip
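If you want to sanity-check the unzipped JSONs before mining, something like the following works. This is a minimal sketch; the per-file naming and internal structure are produced by llm_filter_relation.py and are not documented here, so it only counts and peeks at the files:

```python
import json
from pathlib import Path

# Count the LLM-filtered relation JSONs per split (assumption: the unzipped
# split directories sit under ./llm_v1_jsons/ as described above).
for split_dir in ["llm_train", "llm_val", "llm_test"]:
    files = sorted(Path("./llm_v1_jsons", split_dir).glob("*.json"))
    print(split_dir, len(files), "JSON files")
    if files:
        sample = json.loads(files[0].read_text())
        print("  sample type:", type(sample).__name__)
```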
python mine_llm_filtered_relation.py --set train --outputPath ./llm_v1/ --jsons_path ./llm_v1_jsons/
python mine_llm_filtered_relation.py --set val --outputPath ./llm_v1/ --jsons_path ./llm_v1_jsons/
python mine_llm_filtered_relation.py --set test --outputPath ./llm_v1/ --jsons_path ./llm_v1_jsons/
This will output CSV files in the ./llm_v1/ directory. The output is already available in the repo.
python mine_llm_filtered_relation.py --set train --outputPath ./llm_v1_singleStage/ --jsons_path ./llm_v1_jsons/ --skip_second_stage
python mine_llm_filtered_relation.py --set val --outputPath ./llm_v1_singleStage/ --jsons_path ./llm_v1_jsons/ --skip_second_stage
python mine_llm_filtered_relation.py --set test --outputPath ./llm_v1_singleStage/ --jsons_path ./llm_v1_jsons/ --skip_second_stage
This will output CSV files in the ./llm_v1_singleStage/ directory. The output is already available in the repo.
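To inspect the mined-relation CSVs (either variant), a quick peek with pandas is enough. This is a minimal sketch; the file names and column layout are whatever mine_llm_filtered_relation.py writes and are not documented here:

```python
from pathlib import Path

import pandas as pd

# Preview the CSVs produced by mine_llm_filtered_relation.py.
for out_dir in ["./llm_v1", "./llm_v1_singleStage"]:
    for csv_path in sorted(Path(out_dir).glob("*.csv")):
        df = pd.read_csv(csv_path)
        print(csv_path, df.shape, list(df.columns)[:5])
```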
python llm_fact_check.py --set test --vllm_url http://g002:8000 --evidence_path ./llm_v1_jsons
python fine_tune_hf.py --model bert-base-uncased --batch_size 64 --data_path ./llm_v1/
python fine_tune_hf.py --model roberta-base --batch_size 32 --data_path ./llm_v1_singleStage/
python fine_tune_hf.py --model roberta-base --batch_size 32 --data_path ./llm_v1/
The fine-tuned models are pushed to https://huggingface.co/SushantGautam/KG-LLM-bert-base, https://huggingface.co/SushantGautam/KG-LLM-roberta-base-single_stage and https://huggingface.co/SushantGautam/KG-LLM-roberta-base. They can be loaded with the Hugging Face Transformers library and used for inference with the appropriate tokenizer, as shown below.
The fine-tuning logs and metrics can be found at https://wandb.ai/ubl/FactKG_IN9550/runs/7sf9kelb, https://wandb.ai/ubl/FactKG_IN9550/runs/sweuj8g6 and https://wandb.ai/ubl/FactKG_IN9550/runs/m5vqnfcr.
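Any of these checkpoints can be loaded the same way as the claim-only model above, for example via the text-classification pipeline. This is a minimal sketch, not code from the repo; note that these models were fine-tuned on claims paired with mined evidence, so at inference time the claim and evidence have to be combined into the input string exactly the way fine_tune_hf.py formats them during training:

```python
from transformers import pipeline

# Load one of the evidence-conditioned checkpoints from the Hub.
clf = pipeline("text-classification", model="SushantGautam/KG-LLM-roberta-base")

# Hypothetical claim; for these checkpoints the mined evidence has to be
# appended to the claim following the preprocessing in fine_tune_hf.py,
# which is not reproduced here.
print(clf("Adolfo Suarez Madrid-Barajas Airport is located in Spain."))
```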