Word embeddings are reliable feature representations of words used to obtain high-quality results for various NLP applications. Uncontextualized word embeddings are used in many NLP tasks today, especially in resource-limited settings where high memory capacity and GPUs are not available. Given the historical success of word embeddings in NLP, we propose a retrospective on some of the most well-known word embedding algorithms. In this work, we deconstruct Word2vec, GloVe, and others into a common form, unveiling some of the common conditions that seem to be required for making performant word embeddings. We believe that the theoretical findings in this paper can provide a basis for more informed development of future models.
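The "common form" the abstract alludes to builds on a well-known result: Levy and Goldberg (2014) showed that skip-gram with negative sampling implicitly factorizes a shifted PMI word-context matrix, and this paper generalizes that factorization reading to GloVe and related models. Below is a minimal sketch of that reading using a count-based stand-in; the toy corpus, window size, shift k, dimensionality d, and the SVD step are all illustrative assumptions, not the paper's actual derivation or experimental setup.

```python
import numpy as np

# Toy corpus, window size, shift k, and dimensionality d are all
# illustrative assumptions, not the paper's setup.
corpus = "the cat sat on the mat while the dog sat on the rug".split()
vocab = sorted(set(corpus))
idx = {w: i for i, w in enumerate(vocab)}
window, k, d = 2, 1.0, 2

# Count word-context co-occurrences within a symmetric window.
counts = np.zeros((len(vocab), len(vocab)))
for i, w in enumerate(corpus):
    for j in range(max(0, i - window), min(len(corpus), i + window + 1)):
        if j != i:
            counts[idx[w], idx[corpus[j]]] += 1

# Shifted positive PMI matrix: max(PMI(w, c) - log k, 0).
# With k = 1 this reduces to plain PPMI.
total = counts.sum()
pw = counts.sum(axis=1, keepdims=True) / total
pc = counts.sum(axis=0, keepdims=True) / total
with np.errstate(divide="ignore"):
    pmi = np.log((counts / total) / (pw * pc))
sppmi = np.maximum(pmi - np.log(k), 0.0)

# A rank-d factorization of the association matrix yields the embeddings;
# SVD stands in here for the gradient-based factorization the neural
# models perform implicitly.
u, s, _ = np.linalg.svd(sppmi)
word_vecs = u[:, :d] * np.sqrt(s[:d])
print({w: word_vecs[idx[w]].round(2) for w in vocab})
```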
Kian Kenyon-Dean, Edward Newell, and Jackie Chi Kit Cheung. 2020. Deconstructing word embedding algorithms. In Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP), pages 8479–8484, Online. Association for Computational Linguistics.
@inproceedings{kenyon-dean-etal-2020-deconstructing,
    title = "Deconstructing word embedding algorithms",
    author = "Kenyon-Dean, Kian and
      Newell, Edward and
      Cheung, Jackie Chi Kit",
    editor = "Webber, Bonnie and
      Cohn, Trevor and
      He, Yulan and
      Liu, Yang",
    booktitle = "Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP)",
    month = nov,
    year = "2020",
    address = "Online",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2020.emnlp-main.681/",
    doi = "10.18653/v1/2020.emnlp-main.681",
    pages = "8479--8484",
    abstract = "Word embeddings are reliable feature representations of words used to obtain high quality results for various NLP applications. Uncontextualized word embeddings are used in many NLP tasks today, especially in resource-limited settings where high memory capacity and GPUs are not available. Given the historical success of word embeddings in NLP, we propose a retrospective on some of the most well-known word embedding algorithms. In this work, we deconstruct Word2vec, GloVe, and others, into a common form, unveiling some of the common conditions that seem to be required for making performant word embeddings. We believe that the theoretical findings in this paper can provide a basis for more informed development of future models."
}
<?xml version="1.0" encoding="UTF-8"?>
<modsCollection xmlns="http://www.loc.gov/mods/v3">
  <mods ID="kenyon-dean-etal-2020-deconstructing">
    <titleInfo>
      <title>Deconstructing word embedding algorithms</title>
    </titleInfo>
    <name type="personal">
      <namePart type="given">Kian</namePart>
      <namePart type="family">Kenyon-Dean</namePart>
      <role>
        <roleTerm authority="marcrelator" type="text">author</roleTerm>
      </role>
    </name>
    <name type="personal">
      <namePart type="given">Edward</namePart>
      <namePart type="family">Newell</namePart>
      <role>
        <roleTerm authority="marcrelator" type="text">author</roleTerm>
      </role>
    </name>
    <name type="personal">
      <namePart type="given">Jackie</namePart>
      <namePart type="given">Chi</namePart>
      <namePart type="given">Kit</namePart>
      <namePart type="family">Cheung</namePart>
      <role>
        <roleTerm authority="marcrelator" type="text">author</roleTerm>
      </role>
    </name>
    <originInfo>
      <dateIssued>2020-11</dateIssued>
    </originInfo>
    <typeOfResource>text</typeOfResource>
    <relatedItem type="host">
      <titleInfo>
        <title>Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP)</title>
      </titleInfo>
      <name type="personal">
        <namePart type="given">Bonnie</namePart>
        <namePart type="family">Webber</namePart>
        <role>
          <roleTerm authority="marcrelator" type="text">editor</roleTerm>
        </role>
      </name>
      <name type="personal">
        <namePart type="given">Trevor</namePart>
        <namePart type="family">Cohn</namePart>
        <role>
          <roleTerm authority="marcrelator" type="text">editor</roleTerm>
        </role>
      </name>
      <name type="personal">
        <namePart type="given">Yulan</namePart>
        <namePart type="family">He</namePart>
        <role>
          <roleTerm authority="marcrelator" type="text">editor</roleTerm>
        </role>
      </name>
      <name type="personal">
        <namePart type="given">Yang</namePart>
        <namePart type="family">Liu</namePart>
        <role>
          <roleTerm authority="marcrelator" type="text">editor</roleTerm>
        </role>
      </name>
      <originInfo>
        <publisher>Association for Computational Linguistics</publisher>
        <place>
          <placeTerm type="text">Online</placeTerm>
        </place>
      </originInfo>
      <genre authority="marcgt">conference publication</genre>
    </relatedItem>
    <abstract>Word embeddings are reliable feature representations of words used to obtain high quality results for various NLP applications. Uncontextualized word embeddings are used in many NLP tasks today, especially in resource-limited settings where high memory capacity and GPUs are not available. Given the historical success of word embeddings in NLP, we propose a retrospective on some of the most well-known word embedding algorithms. In this work, we deconstruct Word2vec, GloVe, and others, into a common form, unveiling some of the common conditions that seem to be required for making performant word embeddings. We believe that the theoretical findings in this paper can provide a basis for more informed development of future models.</abstract>
    <identifier type="citekey">kenyon-dean-etal-2020-deconstructing</identifier>
    <identifier type="doi">10.18653/v1/2020.emnlp-main.681</identifier>
    <location>
      <url>https://aclanthology.org/2020.emnlp-main.681/</url>
    </location>
    <part>
      <date>2020-11</date>
      <extent unit="page">
        <start>8479</start>
        <end>8484</end>
      </extent>
    </part>
  </mods>
</modsCollection>
%0 Conference Proceedings
%T Deconstructing word embedding algorithms
%A Kenyon-Dean, Kian
%A Newell, Edward
%A Cheung, Jackie Chi Kit
%Y Webber, Bonnie
%Y Cohn, Trevor
%Y He, Yulan
%Y Liu, Yang
%S Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP)
%D 2020
%8 November
%I Association for Computational Linguistics
%C Online
%F kenyon-dean-etal-2020-deconstructing
%X Word embeddings are reliable feature representations of words used to obtain high quality results for various NLP applications. Uncontextualized word embeddings are used in many NLP tasks today, especially in resource-limited settings where high memory capacity and GPUs are not available. Given the historical success of word embeddings in NLP, we propose a retrospective on some of the most well-known word embedding algorithms. In this work, we deconstruct Word2vec, GloVe, and others, into a common form, unveiling some of the common conditions that seem to be required for making performant word embeddings. We believe that the theoretical findings in this paper can provide a basis for more informed development of future models.
%R 10.18653/v1/2020.emnlp-main.681
%U https://aclanthology.org/2020.emnlp-main.681/
%U https://doi.org/10.18653/v1/2020.emnlp-main.681
%P 8479-8484