# structured-neural-summarization
A repository with the code for the paper with the same title. The experiments are based on the more general-purpose graph neural network library OpenGNN, which you can install by following its README.md.
Experiments are based around the `train_and_eval.py` script. Besides the main experiments, this repo also contains the following folders:
- Parsers: A collection of scripts to parse and process various datasets into the format used by the experiments
- Data: A collection of scripts and utility functions to handle and analyse the formatted data
- Models: Bash script wrappers around the main script with model/hyperparameter combinations for different experiments
As an example, we will show how to run a sequenced-graph to sequence model with attention on the CNN/DailyMail dataset. This assumes the processed data is located in `/data/naturallanguage/cnn_dailymail/split/{train,valid,test}/{inputs,targets}.jsonl.gz`.
For instructions on how to process the data, see the corresponding subfolder.
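Before training, you can sanity-check the processed data directly. The inputs are gzipped JSONL files with one graph per line; below is a minimal sketch for peeking at the first example. Only the `node_labels` and `edges` fields (referenced by the vocabulary commands further down) are assumed; any other fields depend on the parser that produced the data.

```python
import gzip
import json

# Path follows the layout described above; adjust if your data lives elsewhere.
path = "/data/naturallanguage/cnn_dailymail/split/train/inputs.jsonl.gz"

with gzip.open(path, "rt", encoding="utf-8") as f:
    first = json.loads(next(f))  # each line is one JSON-encoded graph

# `node_labels` and `edges` are the fields indexed by the vocabulary commands below.
print(sorted(first.keys()))
print("nodes:", len(first.get("node_labels", [])))
print("edge entries:", len(first.get("edges", [])))
```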
Start by building vocabularies for the node and edge labels on the input side and the tokens on the output side by running:
```
ognn-build-vocab --field_name node_labels \
                 --save_vocab /data/naturallanguage/cnn_dailymail/node.vocab \
                 /data/naturallanguage/cnn_dailymail/split/train/inputs.jsonl.gz

ognn-build-vocab --no_pad_token --field_name edges --string_index 0 \
                 --save_vocab /data/naturallanguage/cnn_dailymail/edge.vocab \
                 /data/naturallanguage/cnn_dailymail/split/train/inputs.jsonl.gz

ognn-build-vocab --with_sequence_tokens \
                 --save_vocab /data/naturallanguage/cnn_dailymail/output.vocab \
                 /data/naturallanguage/cnn_dailymail/split/train/targets.jsonl.gz
```
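To double-check the result, you can count the entries in each generated vocabulary file. This sketch assumes the vocabularies are written as plain text with one entry per line, which may not match OpenGNN's actual on-disk format:

```python
# Assumption: each *.vocab file stores one vocabulary entry per line.
for name in ("node", "edge", "output"):
    path = f"/data/naturallanguage/cnn_dailymail/{name}.vocab"
    with open(path, encoding="utf-8") as f:
        print(name, "vocab size:", sum(1 for _ in f))
```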
Then run:

```
python train_and_eval.py
```
This will create the model directory `cnndailymail_summarizer`, which contains TensorFlow checkpoint and event files that can be monitored in TensorBoard.
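To confirm from Python that checkpoints are actually being written (independently of TensorBoard), a quick check along these lines should work; `cnndailymail_summarizer` is the model directory created above:

```python
import tensorflow as tf

# Model directory created by train_and_eval.py.
model_dir = "cnndailymail_summarizer"

# Returns the prefix of the most recent checkpoint, or None if none exists yet.
print(tf.train.latest_checkpoint(model_dir))
```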
We can also directly pass the file we wish to run inference on:
```
python train_and_eval.py --infer_source_file /data/naturallanguage/cnn_dailymail/split/test/inputs.jsonl.gz \
                         --infer_predictions_file /data/naturallanguage/cnn_dailymail/split/test/predictions.jsonl
```
Then, to print the metrics on the predictions, run:
```
python rouge_evaluator /data/naturallanguage/cnn_dailymail/split/test/summaries.jsonl.gz \
                       /data/naturallanguage/cnn_dailymail/split/test/predictions.jsonl
```
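If the repository's evaluator is not convenient in your setup, ROUGE scores can also be computed with the third-party `rouge-score` package (`pip install rouge-score`). Note that this is a substitute, not the `rouge_evaluator` script above, and the example pair below is made up:

```python
from rouge_score import rouge_scorer

# Score one (reference, prediction) pair; loop over your files to aggregate.
scorer = rouge_scorer.RougeScorer(["rouge1", "rouge2", "rougeL"], use_stemmer=True)

reference = "the summary produced by a human annotator"
prediction = "the summary produced by the model"

scores = scorer.score(reference, prediction)
for name, result in scores.items():
    print(name, f"F1={result.fmeasure:.3f}")
```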