
Document Visual Question Answering


This repo hosts the basic functional code for our approach, entitled HyperDQA, in the Document Visual Question Answering competition hosted as a part of the Workshop on Text and Documents in the Deep Learning Era at CVPR 2020. Our approach stands at position 4 on the Leaderboard.

Read more about our approach in this blog post!

Installation

Virtual Environment Python 3 (Recommended)

  1. Clone the repository
git clone https://github.com/anisha2102/docvqa.git
  2. Install libraries (ideally inside a virtual environment; see the sketch below)
pip install -r requirements.txt
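The heading above recommends a virtual environment, but the steps do not create one. A minimal sketch using Python's built-in venv module (the environment name docvqa-env is arbitrary and assumes python3 is on your PATH):

python3 -m venv docvqa-env
source docvqa-env/bin/activate

Activate the environment before running the pip install step above.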

Downloads

  1. Download the dataset
The dataset for Task 1 can be downloaded from the Downloads section of the competition website. It consists of document images and their corresponding OCR transcriptions.

  2. Download the pretrained model
Download the pretrained model for LayoutLM-Base, Uncased from here.

Prepare dataset

python create_dataset.py \
    <data-ocr-folder> \
    <data-documents-folder> \
    <path-to-train_v1.0.json> \
    <train-output-json-path> \
    <validation-output-json-path>
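For example, with the competition data unpacked under a hypothetical ./data directory (these paths are placeholders, not locations required by the script), the call might look like:

python create_dataset.py \
    ./data/ocr \
    ./data/documents \
    ./data/train_v1.0.json \
    ./data/train.json \
    ./data/val.json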

Train the model

CUDA_VISIBLE_DEVICES=0 python run_docvqa.py \
    --data_dir <data-folder> \
    --model_type layoutlm \
    --model_name_or_path <pretrained-model-path> \
    --do_lower_case \
    --max_seq_length 512 \
    --do_train \
    --num_train_epochs 15 \
    --logging_steps 500 \
    --evaluate_during_training \
    --save_steps 500 \
    --do_eval \
    --output_dir <data-folder>/<exp-folder> \
    --per_gpu_train_batch_size 8 \
    --overwrite_output_dir \
    --cache_dir <data-folder>/models \
    --skip_match_answers \
    --val_json <validation-output-json-path> \
    --train_json <train-output-json-path>

Here <pretrained-model-path> is the LayoutLM-Base, Uncased model directory, e.g. ./models/layoutlm-base-uncased.
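If you only want to evaluate an already trained checkpoint, a run along the following lines should work, reusing the evaluation flags from the command above and pointing --model_name_or_path at the experiment folder (this is a sketch, not an invocation documented in the repo):

CUDA_VISIBLE_DEVICES=0 python run_docvqa.py \
    --data_dir <data-folder> \
    --model_type layoutlm \
    --model_name_or_path <data-folder>/<exp-folder> \
    --do_lower_case \
    --max_seq_length 512 \
    --do_eval \
    --output_dir <data-folder>/<exp-folder> \
    --cache_dir <data-folder>/models \
    --skip_match_answers \
    --val_json <validation-output-json-path> \
    --train_json <train-output-json-path>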

Model Checkpoints

Download the pytorch_model.bin file from the Google Drive link below and copy it to the models folder.

Google Drive Link
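For example, assuming the file was downloaded to the current directory and you keep checkpoints under a ./models folder (the exact location is your choice; the path here is purely illustrative):

mkdir -p ./models
cp pytorch_model.bin ./models/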

Demo

Try out the demo on a sample datapoint with demo.ipynb
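Assuming Jupyter is installed in the environment (it can be added with pip install notebook if requirements.txt does not pull it in), launch the notebook with:

jupyter notebook demo.ipynb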

Acknowledgements

The code and pretrained models are based on LayoutLM and HuggingFace Transformers. Many thanks for their amazing open-source contributions.

