Grapheme-to-Phoneme (G2P) has many applications in NLP and speech fields. Most existing work focuses heavily on languages with abundant training datasets, which limits the scope of target languages to less than 100 languages. This work attempts to apply zero-shot learning to approximate G2P models for all low-resource and endangered languages in Glottolog (about 8k languages). For any unseen target language, we first build the phylogenetic tree (i.e. language family tree) to identify top-k nearest languages for which we have training sets. Then we run models of those languages to obtain a hypothesis set, which we combine into a confusion network to propose a most likely hypothesis as an approximation to the target language. We test our approach on over 600 unseen languages and demonstrate it significantly outperforms baselines.
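The abstract's pipeline (find the top-k nearest training languages in the family tree, run their G2P models, combine the hypotheses) can be sketched roughly as follows. This is a hypothetical illustration, not the authors' code: the toy family paths, the tree-distance heuristic, and the position-wise majority vote (a crude stand-in for the paper's confusion-network combination) are all assumptions for demonstration; the real system uses Glottolog's phylogenetic tree and trained per-language models.

```python
# Hypothetical sketch of the paper's pipeline: select the top-k nearest
# training languages via phylogenetic-tree distance, then merge their
# phoneme hypotheses with a simple position-wise majority vote.
from collections import Counter
from itertools import zip_longest

# Toy family paths (root -> ... -> language); real paths come from Glottolog.
FAMILY_PATHS = {
    "spanish":    ["indo-european", "italic", "romance", "western", "spanish"],
    "portuguese": ["indo-european", "italic", "romance", "western", "portuguese"],
    "italian":    ["indo-european", "italic", "romance", "italian"],
    "german":     ["indo-european", "germanic", "west", "german"],
}

def tree_distance(a, b):
    """Steps from each language up to their lowest common ancestor."""
    pa, pb = FAMILY_PATHS[a], FAMILY_PATHS[b]
    common = 0
    for x, y in zip(pa, pb):
        if x != y:
            break
        common += 1
    return (len(pa) - common) + (len(pb) - common)

def top_k_nearest(target, candidates, k):
    """Rank candidate training languages by tree distance to the target."""
    return sorted(candidates, key=lambda c: tree_distance(target, c))[:k]

def vote_combine(hypotheses):
    """Position-wise majority vote over phoneme sequences (ties: first seen).
    The paper instead aligns hypotheses into a confusion network before voting."""
    merged = []
    for column in zip_longest(*hypotheses):
        phones = [p for p in column if p is not None]
        merged.append(Counter(phones).most_common(1)[0][0])
    return merged
```

For example, with Spanish as the "unseen" target, `top_k_nearest("spanish", ["portuguese", "italian", "german"], 2)` prefers Portuguese and Italian over German, and `vote_combine` then picks the majority phoneme at each position across the nearest languages' outputs.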
Xinjian Li, Florian Metze, David Mortensen, Shinji Watanabe, and Alan Black. 2022. Zero-shot Learning for Grapheme to Phoneme Conversion with Language Ensemble. In Findings of the Association for Computational Linguistics: ACL 2022, pages 2106–2115, Dublin, Ireland. Association for Computational Linguistics.
@inproceedings{li-etal-2022-zero, title = "Zero-shot Learning for Grapheme to Phoneme Conversion with Language Ensemble", author = "Li, Xinjian and Metze, Florian and Mortensen, David and Watanabe, Shinji and Black, Alan", editor = "Muresan, Smaranda and Nakov, Preslav and Villavicencio, Aline", booktitle = "Findings of the Association for Computational Linguistics: ACL 2022", month = may, year = "2022", address = "Dublin, Ireland", publisher = "Association for Computational Linguistics", url = "https://aclanthology.org/2022.findings-acl.166/", doi = "10.18653/v1/2022.findings-acl.166", pages = "2106--2115", abstract = "Grapheme-to-Phoneme (G2P) has many applications in NLP and speech fields. Most existing work focuses heavily on languages with abundant training datasets, which limits the scope of target languages to less than 100 languages. This work attempts to apply zero-shot learning to approximate G2P models for all low-resource and endangered languages in Glottolog (about 8k languages). For any unseen target language, we first build the phylogenetic tree (i.e. language family tree) to identify top-$k$ nearest languages for which we have training sets. Then we run models of those languages to obtain a hypothesis set, which we combine into a confusion network to propose a most likely hypothesis as an approximation to the target language. We test our approach on over 600 unseen languages and demonstrate it significantly outperforms baselines."}
<?xml version="1.0" encoding="UTF-8"?><modsCollection xmlns="http://www.loc.gov/mods/v3"><mods ID="li-etal-2022-zero"> <titleInfo> <title>Zero-shot Learning for Grapheme to Phoneme Conversion with Language Ensemble</title> </titleInfo> <name type="personal"> <namePart type="given">Xinjian</namePart> <namePart type="family">Li</namePart> <role> <roleTerm authority="marcrelator" type="text">author</roleTerm> </role> </name> <name type="personal"> <namePart type="given">Florian</namePart> <namePart type="family">Metze</namePart> <role> <roleTerm authority="marcrelator" type="text">author</roleTerm> </role> </name> <name type="personal"> <namePart type="given">David</namePart> <namePart type="family">Mortensen</namePart> <role> <roleTerm authority="marcrelator" type="text">author</roleTerm> </role> </name> <name type="personal"> <namePart type="given">Shinji</namePart> <namePart type="family">Watanabe</namePart> <role> <roleTerm authority="marcrelator" type="text">author</roleTerm> </role> </name> <name type="personal"> <namePart type="given">Alan</namePart> <namePart type="family">Black</namePart> <role> <roleTerm authority="marcrelator" type="text">author</roleTerm> </role> </name> <originInfo> <dateIssued>2022-05</dateIssued> </originInfo> <typeOfResource>text</typeOfResource> <relatedItem type="host"> <titleInfo> <title>Findings of the Association for Computational Linguistics: ACL 2022</title> </titleInfo> <name type="personal"> <namePart type="given">Smaranda</namePart> <namePart type="family">Muresan</namePart> <role> <roleTerm authority="marcrelator" type="text">editor</roleTerm> </role> </name> <name type="personal"> <namePart type="given">Preslav</namePart> <namePart type="family">Nakov</namePart> <role> <roleTerm authority="marcrelator" type="text">editor</roleTerm> </role> </name> <name type="personal"> <namePart type="given">Aline</namePart> <namePart type="family">Villavicencio</namePart> <role> <roleTerm authority="marcrelator" 
type="text">editor</roleTerm> </role> </name> <originInfo> <publisher>Association for Computational Linguistics</publisher> <place> <placeTerm type="text">Dublin, Ireland</placeTerm> </place> </originInfo> <genre authority="marcgt">conference publication</genre> </relatedItem> <abstract>Grapheme-to-Phoneme (G2P) has many applications in NLP and speech fields. Most existing work focuses heavily on languages with abundant training datasets, which limits the scope of target languages to less than 100 languages. This work attempts to apply zero-shot learning to approximate G2P models for all low-resource and endangered languages in Glottolog (about 8k languages). For any unseen target language, we first build the phylogenetic tree (i.e. language family tree) to identify top-k nearest languages for which we have training sets. Then we run models of those languages to obtain a hypothesis set, which we combine into a confusion network to propose a most likely hypothesis as an approximation to the target language. We test our approach on over 600 unseen languages and demonstrate it significantly outperforms baselines.</abstract> <identifier type="citekey">li-etal-2022-zero</identifier> <identifier type="doi">10.18653/v1/2022.findings-acl.166</identifier> <location> <url>https://aclanthology.org/2022.findings-acl.166/</url> </location> <part> <date>2022-05</date> <extent unit="page"> <start>2106</start> <end>2115</end> </extent> </part></mods></modsCollection>
%0 Conference Proceedings%T Zero-shot Learning for Grapheme to Phoneme Conversion with Language Ensemble%A Li, Xinjian%A Metze, Florian%A Mortensen, David%A Watanabe, Shinji%A Black, Alan%Y Muresan, Smaranda%Y Nakov, Preslav%Y Villavicencio, Aline%S Findings of the Association for Computational Linguistics: ACL 2022%D 2022%8 May%I Association for Computational Linguistics%C Dublin, Ireland%F li-etal-2022-zero%X Grapheme-to-Phoneme (G2P) has many applications in NLP and speech fields. Most existing work focuses heavily on languages with abundant training datasets, which limits the scope of target languages to less than 100 languages. This work attempts to apply zero-shot learning to approximate G2P models for all low-resource and endangered languages in Glottolog (about 8k languages). For any unseen target language, we first build the phylogenetic tree (i.e. language family tree) to identify top-k nearest languages for which we have training sets. Then we run models of those languages to obtain a hypothesis set, which we combine into a confusion network to propose a most likely hypothesis as an approximation to the target language. We test our approach on over 600 unseen languages and demonstrate it significantly outperforms baselines.%R 10.18653/v1/2022.findings-acl.166%U https://aclanthology.org/2022.findings-acl.166/%U https://doi.org/10.18653/v1/2022.findings-acl.166%P 2106-2115
[Zero-shot Learning for Grapheme to Phoneme Conversion with Language Ensemble](https://aclanthology.org/2022.findings-acl.166/) (Li et al., Findings 2022)