Grammatical error correction (GEC) is a promising task aimed at correcting errors in a text. Many methods have been proposed to facilitate this task with remarkable results. However, most of them only focus on enhancing textual feature extraction without exploring the usage of other modalities’ information (e.g., speech), which can also provide valuable knowledge to help the model detect grammatical errors. To shore up this deficiency, we propose a novel framework that integrates both speech and text features to enhance GEC. In detail, we create new multimodal GEC datasets for English and German by generating audio from text using the advanced text-to-speech models. Subsequently, we extract acoustic and textual representations by a multimodal encoder that consists of a speech and a text encoder. A mixture-of-experts (MoE) layer is employed to selectively align representations from the two modalities, and then a dot attention mechanism is used to fuse them as final multimodal representations. Experimental results on CoNLL14, BEA19 English, and Falko-MERLIN German show that our multimodal GEC models achieve significant improvements over strong baselines and achieve a new state-of-the-art result on the Falko-MERLIN test set.
Tao Fang, Jinpeng Hu, Derek F. Wong, Xiang Wan, Lidia S. Chao, and Tsung-Hui Chang. 2023.Improving Grammatical Error Correction with Multimodal Feature Integration. InFindings of the Association for Computational Linguistics: ACL 2023, pages 9328–9344, Toronto, Canada. Association for Computational Linguistics.
@inproceedings{fang-etal-2023-improving, title = "Improving Grammatical Error Correction with Multimodal Feature Integration", author = "Fang, Tao and Hu, Jinpeng and Wong, Derek F. and Wan, Xiang and Chao, Lidia S. and Chang, Tsung-Hui", editor = "Rogers, Anna and Boyd-Graber, Jordan and Okazaki, Naoaki", booktitle = "Findings of the Association for Computational Linguistics: ACL 2023", month = jul, year = "2023", address = "Toronto, Canada", publisher = "Association for Computational Linguistics", url = "https://aclanthology.org/2023.findings-acl.594/", doi = "10.18653/v1/2023.findings-acl.594", pages = "9328--9344", abstract = "Grammatical error correction (GEC) is a promising task aimed at correcting errors in a text. Many methods have been proposed to facilitate this task with remarkable results. However, most of them only focus on enhancing textual feature extraction without exploring the usage of other modalities' information (e.g., speech), which can also provide valuable knowledge to help the model detect grammatical errors. To shore up this deficiency, we propose a novel framework that integrates both speech and text features to enhance GEC. In detail, we create new multimodal GEC datasets for English and German by generating audio from text using the advanced text-to-speech models. Subsequently, we extract acoustic and textual representations by a multimodal encoder that consists of a speech and a text encoder. A mixture-of-experts (MoE) layer is employed to selectively align representations from the two modalities, and then a dot attention mechanism is used to fuse them as final multimodal representations. Experimental results on CoNLL14, BEA19 English, and Falko-MERLIN German show that our multimodal GEC models achieve significant improvements over strong baselines and achieve a new state-of-the-art result on the Falko-MERLIN test set."}
<?xml version="1.0" encoding="UTF-8"?><modsCollection xmlns="http://www.loc.gov/mods/v3"><mods ID="fang-etal-2023-improving"> <titleInfo> <title>Improving Grammatical Error Correction with Multimodal Feature Integration</title> </titleInfo> <name type="personal"> <namePart type="given">Tao</namePart> <namePart type="family">Fang</namePart> <role> <roleTerm authority="marcrelator" type="text">author</roleTerm> </role> </name> <name type="personal"> <namePart type="given">Jinpeng</namePart> <namePart type="family">Hu</namePart> <role> <roleTerm authority="marcrelator" type="text">author</roleTerm> </role> </name> <name type="personal"> <namePart type="given">Derek</namePart> <namePart type="given">F</namePart> <namePart type="family">Wong</namePart> <role> <roleTerm authority="marcrelator" type="text">author</roleTerm> </role> </name> <name type="personal"> <namePart type="given">Xiang</namePart> <namePart type="family">Wan</namePart> <role> <roleTerm authority="marcrelator" type="text">author</roleTerm> </role> </name> <name type="personal"> <namePart type="given">Lidia</namePart> <namePart type="given">S</namePart> <namePart type="family">Chao</namePart> <role> <roleTerm authority="marcrelator" type="text">author</roleTerm> </role> </name> <name type="personal"> <namePart type="given">Tsung-Hui</namePart> <namePart type="family">Chang</namePart> <role> <roleTerm authority="marcrelator" type="text">author</roleTerm> </role> </name> <originInfo> <dateIssued>2023-07</dateIssued> </originInfo> <typeOfResource>text</typeOfResource> <relatedItem type="host"> <titleInfo> <title>Findings of the Association for Computational Linguistics: ACL 2023</title> </titleInfo> <name type="personal"> <namePart type="given">Anna</namePart> <namePart type="family">Rogers</namePart> <role> <roleTerm authority="marcrelator" type="text">editor</roleTerm> </role> </name> <name type="personal"> <namePart type="given">Jordan</namePart> <namePart type="family">Boyd-Graber</namePart> <role> <roleTerm authority="marcrelator" type="text">editor</roleTerm> </role> </name> <name type="personal"> <namePart type="given">Naoaki</namePart> <namePart type="family">Okazaki</namePart> <role> <roleTerm authority="marcrelator" type="text">editor</roleTerm> </role> </name> <originInfo> <publisher>Association for Computational Linguistics</publisher> <place> <placeTerm type="text">Toronto, Canada</placeTerm> </place> </originInfo> <genre authority="marcgt">conference publication</genre> </relatedItem> <abstract>Grammatical error correction (GEC) is a promising task aimed at correcting errors in a text. Many methods have been proposed to facilitate this task with remarkable results. However, most of them only focus on enhancing textual feature extraction without exploring the usage of other modalities’ information (e.g., speech), which can also provide valuable knowledge to help the model detect grammatical errors. To shore up this deficiency, we propose a novel framework that integrates both speech and text features to enhance GEC. In detail, we create new multimodal GEC datasets for English and German by generating audio from text using the advanced text-to-speech models. Subsequently, we extract acoustic and textual representations by a multimodal encoder that consists of a speech and a text encoder. A mixture-of-experts (MoE) layer is employed to selectively align representations from the two modalities, and then a dot attention mechanism is used to fuse them as final multimodal representations. Experimental results on CoNLL14, BEA19 English, and Falko-MERLIN German show that our multimodal GEC models achieve significant improvements over strong baselines and achieve a new state-of-the-art result on the Falko-MERLIN test set.</abstract> <identifier type="citekey">fang-etal-2023-improving</identifier> <identifier type="doi">10.18653/v1/2023.findings-acl.594</identifier> <location> <url>https://aclanthology.org/2023.findings-acl.594/</url> </location> <part> <date>2023-07</date> <extent unit="page"> <start>9328</start> <end>9344</end> </extent> </part></mods></modsCollection>
%0 Conference Proceedings%T Improving Grammatical Error Correction with Multimodal Feature Integration%A Fang, Tao%A Hu, Jinpeng%A Wong, Derek F.%A Wan, Xiang%A Chao, Lidia S.%A Chang, Tsung-Hui%Y Rogers, Anna%Y Boyd-Graber, Jordan%Y Okazaki, Naoaki%S Findings of the Association for Computational Linguistics: ACL 2023%D 2023%8 July%I Association for Computational Linguistics%C Toronto, Canada%F fang-etal-2023-improving%X Grammatical error correction (GEC) is a promising task aimed at correcting errors in a text. Many methods have been proposed to facilitate this task with remarkable results. However, most of them only focus on enhancing textual feature extraction without exploring the usage of other modalities’ information (e.g., speech), which can also provide valuable knowledge to help the model detect grammatical errors. To shore up this deficiency, we propose a novel framework that integrates both speech and text features to enhance GEC. In detail, we create new multimodal GEC datasets for English and German by generating audio from text using the advanced text-to-speech models. Subsequently, we extract acoustic and textual representations by a multimodal encoder that consists of a speech and a text encoder. A mixture-of-experts (MoE) layer is employed to selectively align representations from the two modalities, and then a dot attention mechanism is used to fuse them as final multimodal representations. Experimental results on CoNLL14, BEA19 English, and Falko-MERLIN German show that our multimodal GEC models achieve significant improvements over strong baselines and achieve a new state-of-the-art result on the Falko-MERLIN test set.%R 10.18653/v1/2023.findings-acl.594%U https://aclanthology.org/2023.findings-acl.594/%U https://doi.org/10.18653/v1/2023.findings-acl.594%P 9328-9344
Tao Fang, Jinpeng Hu, Derek F. Wong, Xiang Wan, Lidia S. Chao, and Tsung-Hui Chang. 2023.Improving Grammatical Error Correction with Multimodal Feature Integration. InFindings of the Association for Computational Linguistics: ACL 2023, pages 9328–9344, Toronto, Canada. Association for Computational Linguistics.