The source code for my bachelor's thesis "Abstractive Summarization of Meetings"
Bastian/Abstractive-Summarization-of-Meetings
This project contains the source code for my bachelor's thesis "Abstractive Text Summarization of Meetings".
This project was only tested with Python 3.6 but should also work with more recent versions of Python. For dependency versions, take a look at the `requirements.txt` file.
`python prepare_data.py` reads the `data.[train|dev|test].tsv` files and generates three TFRecord data files: `train.tf_record`, `eval.tf_record`, and `test.tf_record`. These files are used for training.
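Before running `prepare_data.py`, it can help to check that the TSV files are well formed. The following is a minimal sketch, not part of this project: the helper name `column_count_histogram` is hypothetical, and it assumes only that the files are tab-separated with one example per line. It counts how many rows have each number of columns, so rows with a missing or extra field stand out.

```python
import csv
from collections import Counter


def column_count_histogram(tsv_path):
    """Return a Counter mapping 'number of columns' to 'number of rows'.

    Hypothetical helper: it assumes a tab-separated file with one example
    per line and makes no assumption about what the columns mean.
    """
    histogram = Counter()
    with open(tsv_path, newline="", encoding="utf-8") as handle:
        # QUOTE_NONE keeps quotation marks in the text as literal characters.
        reader = csv.reader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
        for row in reader:
            histogram[len(row)] += 1
    return histogram
```

If the result for a file such as `data.train.tsv` has more than one key, some rows have a different number of fields than the rest and are worth inspecting.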
`python main.py --run_mode=train_and_evaluate` starts the training.
`python main.py --run_mode=test` can be used to calculate BLEU and ROUGE scores on the test data. It prints the results to the console and writes three files, `test-inputs.txt`, `test-predictions.txt`, and `test-targets.txt`, to the `/outputs` folder. These files contain the sentences in a human-readable format.
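To get a rough feel for what a ROUGE score measures, the sketch below computes a simplified ROUGE-1 F1 (unigram overlap) between predictions and targets. This is not the scoring code used by `main.py`: the function names are hypothetical, tokenization is plain whitespace splitting, and it assumes that the prediction and target files are line-aligned, which the source does not state.

```python
from collections import Counter


def rouge1_f1(prediction, target):
    """Simplified ROUGE-1 F1: unigram overlap between two sentences."""
    pred_tokens = prediction.lower().split()
    target_tokens = target.lower().split()
    if not pred_tokens or not target_tokens:
        return 0.0
    # Multiset intersection counts each shared token at most as often
    # as it occurs in both sentences.
    overlap = sum((Counter(pred_tokens) & Counter(target_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(target_tokens)
    return 2 * precision * recall / (precision + recall)


def mean_rouge1_f1(predictions, targets):
    """Average the simplified score over line-aligned sentence pairs."""
    pairs = list(zip(predictions, targets))
    if not pairs:
        return 0.0
    return sum(rouge1_f1(p, t) for p, t in pairs) / len(pairs)
```

Under the line-alignment assumption, the two lists could be read from `test-predictions.txt` and `test-targets.txt` with `open(...).read().splitlines()`. The numbers will generally differ from the scores the project reports, since real ROUGE implementations use different tokenization and additional variants.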
`python main.py --run_mode=predict` takes the content from the `/data/predict.txt` file and creates two files in the output folder: `predict-inputs.txt` and `predict-predictions.txt`.
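The commands described above can also be run in sequence from a small driver script. The following is a minimal sketch under the assumption that it is started from the repository root, where `prepare_data.py` and `main.py` live. The function names `build_commands` and `run_pipeline` are hypothetical and not part of this project; only the script names and `--run_mode` values come from this README.

```python
import subprocess
import sys

# Run modes as listed in this README, in the order they are described.
RUN_MODES = ("train_and_evaluate", "test", "predict")


def build_commands(python_executable=sys.executable, run_modes=RUN_MODES):
    """Build the command lines: data preparation first, then each run mode."""
    commands = [[python_executable, "prepare_data.py"]]
    for mode in run_modes:
        commands.append([python_executable, "main.py", f"--run_mode={mode}"])
    return commands


def run_pipeline(commands):
    """Run each command in order and stop at the first failure."""
    for command in commands:
        print("Running:", " ".join(command))
        # check=True raises CalledProcessError on a non-zero exit code.
        subprocess.run(command, check=True)
```

Calling `run_pipeline(build_commands())` would prepare the data, train, test, and predict in one go. Passing a shorter `run_modes` tuple, for example `("predict",)`, skips the steps that are not needed.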
The data in the `predict.txt` and `data.[train|dev|test].tsv` files is taken from the AMI Corpus and processed using the NITE XML Toolkit. The code that parses the corpus can be found at Meeting-Parser.
The AMI Corpus license can be found here: AMI Meeting Corpus License.
Main parts of the code are taken from the Texar examples for BERT and Transformers. They can be found under the following links:
These examples are licensed under the Apache License 2.0. Copied files contain a link to their original version in the file header. Any of my modifications are also licensed under the same license.
This project was inspired by the GitHub repository Abstractive Summarization With Transfer Learning. This project uses no source code from that repository, though. That repository is also based on the Texar examples and thus has similar code.
This project is licensed under the Apache License 2.0.