This paper describes our system developed for the coreference resolution task of the CRAFT Shared Tasks 2019. The CRAFT corpus is more challenging than other existing corpora because it contains full-text articles. We employed an existing span-based state-of-the-art neural coreference resolution system as a baseline. We enhance the system with two techniques to capture long-distance coreferent pairs. First, we filter noisy mentions based on parse trees while increasing the number of antecedent candidates. Second, instead of relying on LSTMs, we integrate the highly expressive language model BERT into our model. Experimental results show that our proposed systems significantly outperform the baseline. The best-performing system obtained F-scores of 44%, 48%, 39%, 49%, 40%, and 57% on the test set with the B3, BLANC, CEAFE, CEAFM, LEA, and MUC metrics, respectively. Additionally, the proposed model is able to detect coreferent pairs at long distances, even at distances of more than 200 sentences.
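As an illustrative sketch only (not the authors' implementation), the syntax-based mention filtering described in the abstract can be read as: keep only candidate spans that align with noun-phrase constituents in a parse tree, so that fewer, cleaner mentions compete over a larger antecedent window. The helper names below and the use of `nltk.Tree` are assumptions for the sake of a runnable example.

```python
# Sketch of syntax-based mention filtering (assumed realization, not the
# paper's code): discard candidate spans that do not match an NP node.
from nltk import Tree

def np_spans(parse: Tree) -> set[tuple[int, int]]:
    """Collect (start, end) token spans of all NP constituents."""
    spans = set()

    def walk(node, offset: int) -> int:
        if isinstance(node, str):  # leaf token
            return offset + 1
        start = offset
        for child in node:
            offset = walk(child, offset)
        if node.label() == "NP":
            spans.add((start, offset))
        return offset

    walk(parse, 0)
    return spans

def filter_mentions(candidates, parse: Tree):
    """Keep only candidate spans aligned with an NP in the parse tree,
    removing noisy spans before antecedent scoring."""
    allowed = np_spans(parse)
    return [span for span in candidates if span in allowed]

# Example: "The gene encodes a protein", spans over token indices.
tree = Tree.fromstring(
    "(S (NP (DT The) (NN gene)) (VP (VBZ encodes) (NP (DT a) (NN protein))))")
print(filter_mentions([(0, 2), (1, 3), (3, 5)], tree))  # -> [(0, 2), (3, 5)]
```

With fewer surviving mentions per document, the antecedent window can be widened at the same computational budget, which is one plausible reading of how the filtering supports long-distance pairs.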
@inproceedings{trieu-etal-2019-coreference, title = "Coreference Resolution in Full Text Articles with {BERT} and Syntax-based Mention Filtering", author = "Trieu, Hai-Long and Duong Nguyen, Anh-Khoa and Nguyen, Nhung and Miwa, Makoto and Takamura, Hiroya and Ananiadou, Sophia", editor = "Kim, Jin-Dong and N{\'e}dellec, Claire and Bossy, Robert and Del{\'e}ger, Louise", booktitle = "Proceedings of the 5th Workshop on BioNLP Open Shared Tasks", month = nov, year = "2019", address = "Hong Kong, China", publisher = "Association for Computational Linguistics", url = "https://aclanthology.org/D19-5727/", doi = "10.18653/v1/D19-5727", pages = "196--205", abstract = "This paper describes our system developed for the coreference resolution task of the CRAFT Shared Tasks 2019. The CRAFT corpus is more challenging than other existing corpora because it contains full-text articles. We employed an existing span-based state-of-the-art neural coreference resolution system as a baseline. We enhance the system with two techniques to capture long-distance coreferent pairs. First, we filter noisy mentions based on parse trees while increasing the number of antecedent candidates. Second, instead of relying on LSTMs, we integrate the highly expressive language model {BERT} into our model. Experimental results show that our proposed systems significantly outperform the baseline. The best-performing system obtained F-scores of 44{\%}, 48{\%}, 39{\%}, 49{\%}, 40{\%}, and 57{\%} on the test set with the B3, BLANC, CEAFE, CEAFM, LEA, and MUC metrics, respectively. Additionally, the proposed model is able to detect coreferent pairs at long distances, even at distances of more than 200 sentences."}
[Coreference Resolution in Full Text Articles with BERT and Syntax-based Mention Filtering](https://aclanthology.org/D19-5727/) (Trieu et al., BioNLP 2019)