Automatic text dating (ATD) is a challenging task because texts usually lack explicit temporal mentions. Existing state-of-the-art approaches learn word representations via language models, but most of them ignore the diachronic change of words, which may impair text modeling. Moreover, few of them consider text modeling for long diachronic documents. In this paper, we present a time-aware language model, TALM, that learns temporal word representations by transferring language models of general domains to time-specific ones. We also build a hierarchical modeling approach that represents diachronic documents by encoding them with temporal word representations. Experiments on a Chinese diachronic corpus show that our model effectively captures the implicit temporal information of words and outperforms state-of-the-art approaches in historical text dating.
Han Ren, Hai Wang, Yajie Zhao, and Yafeng Ren. 2023. Time-Aware Language Modeling for Historical Text Dating. In Findings of the Association for Computational Linguistics: EMNLP 2023, pages 13646–13656, Singapore. Association for Computational Linguistics.
@inproceedings{ren-etal-2023-time,
    title = "Time-Aware Language Modeling for Historical Text Dating",
    author = "Ren, Han and Wang, Hai and Zhao, Yajie and Ren, Yafeng",
    editor = "Bouamor, Houda and Pino, Juan and Bali, Kalika",
    booktitle = "Findings of the Association for Computational Linguistics: EMNLP 2023",
    month = dec,
    year = "2023",
    address = "Singapore",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2023.findings-emnlp.911/",
    doi = "10.18653/v1/2023.findings-emnlp.911",
    pages = "13646--13656",
    abstract = "Automatic text dating(ATD) is a challenging task since explicit temporal mentions usually do not appear in texts. Existing state-of-the-art approaches learn word representations via language models, whereas most of them ignore diachronic change of words, which may affect the efforts of text modeling. Meanwhile, few of them consider text modeling for long diachronic documents. In this paper, we present a time-aware language model named TALM, to learn temporal word representations by transferring language models of general domains to those of time-specific ones. We also build a hierarchical modeling approach to represent diachronic documents by encoding them with temporal word representations. Experiments on a Chinese diachronic corpus show that our model effectively captures implicit temporal information of words, and outperforms state-of-the-art approaches in historical text dating as well."
}
<?xml version="1.0" encoding="UTF-8"?><modsCollection xmlns="http://www.loc.gov/mods/v3"><mods ID="ren-etal-2023-time"> <titleInfo> <title>Time-Aware Language Modeling for Historical Text Dating</title> </titleInfo> <name type="personal"> <namePart type="given">Han</namePart> <namePart type="family">Ren</namePart> <role> <roleTerm authority="marcrelator" type="text">author</roleTerm> </role> </name> <name type="personal"> <namePart type="given">Hai</namePart> <namePart type="family">Wang</namePart> <role> <roleTerm authority="marcrelator" type="text">author</roleTerm> </role> </name> <name type="personal"> <namePart type="given">Yajie</namePart> <namePart type="family">Zhao</namePart> <role> <roleTerm authority="marcrelator" type="text">author</roleTerm> </role> </name> <name type="personal"> <namePart type="given">Yafeng</namePart> <namePart type="family">Ren</namePart> <role> <roleTerm authority="marcrelator" type="text">author</roleTerm> </role> </name> <originInfo> <dateIssued>2023-12</dateIssued> </originInfo> <typeOfResource>text</typeOfResource> <relatedItem type="host"> <titleInfo> <title>Findings of the Association for Computational Linguistics: EMNLP 2023</title> </titleInfo> <name type="personal"> <namePart type="given">Houda</namePart> <namePart type="family">Bouamor</namePart> <role> <roleTerm authority="marcrelator" type="text">editor</roleTerm> </role> </name> <name type="personal"> <namePart type="given">Juan</namePart> <namePart type="family">Pino</namePart> <role> <roleTerm authority="marcrelator" type="text">editor</roleTerm> </role> </name> <name type="personal"> <namePart type="given">Kalika</namePart> <namePart type="family">Bali</namePart> <role> <roleTerm authority="marcrelator" type="text">editor</roleTerm> </role> </name> <originInfo> <publisher>Association for Computational Linguistics</publisher> <place> <placeTerm type="text">Singapore</placeTerm> </place> </originInfo> <genre authority="marcgt">conference publication</genre> 
</relatedItem> <abstract>Automatic text dating(ATD) is a challenging task since explicit temporal mentions usually do not appear in texts. Existing state-of-the-art approaches learn word representations via language models, whereas most of them ignore diachronic change of words, which may affect the efforts of text modeling. Meanwhile, few of them consider text modeling for long diachronic documents. In this paper, we present a time-aware language model named TALM, to learn temporal word representations by transferring language models of general domains to those of time-specific ones. We also build a hierarchical modeling approach to represent diachronic documents by encoding them with temporal word representations. Experiments on a Chinese diachronic corpus show that our model effectively captures implicit temporal information of words, and outperforms state-of-the-art approaches in historical text dating as well.</abstract> <identifier type="citekey">ren-etal-2023-time</identifier> <identifier type="doi">10.18653/v1/2023.findings-emnlp.911</identifier> <location> <url>https://aclanthology.org/2023.findings-emnlp.911/</url> </location> <part> <date>2023-12</date> <extent unit="page"> <start>13646</start> <end>13656</end> </extent> </part></mods></modsCollection>
%0 Conference Proceedings
%T Time-Aware Language Modeling for Historical Text Dating
%A Ren, Han
%A Wang, Hai
%A Zhao, Yajie
%A Ren, Yafeng
%Y Bouamor, Houda
%Y Pino, Juan
%Y Bali, Kalika
%S Findings of the Association for Computational Linguistics: EMNLP 2023
%D 2023
%8 December
%I Association for Computational Linguistics
%C Singapore
%F ren-etal-2023-time
%X Automatic text dating(ATD) is a challenging task since explicit temporal mentions usually do not appear in texts. Existing state-of-the-art approaches learn word representations via language models, whereas most of them ignore diachronic change of words, which may affect the efforts of text modeling. Meanwhile, few of them consider text modeling for long diachronic documents. In this paper, we present a time-aware language model named TALM, to learn temporal word representations by transferring language models of general domains to those of time-specific ones. We also build a hierarchical modeling approach to represent diachronic documents by encoding them with temporal word representations. Experiments on a Chinese diachronic corpus show that our model effectively captures implicit temporal information of words, and outperforms state-of-the-art approaches in historical text dating as well.
%R 10.18653/v1/2023.findings-emnlp.911
%U https://aclanthology.org/2023.findings-emnlp.911/
%U https://doi.org/10.18653/v1/2023.findings-emnlp.911
%P 13646-13656