A fast, flexible, and comprehensive framework for quantitative text analysis in R. Provides functionality for corpus management, creating and manipulating tokens and n-grams, exploring keywords in context, forming and manipulating sparse matrices of documents by features and feature co-occurrences, analyzing keywords, computing feature similarities and distances, applying content dictionaries, applying supervised and unsupervised machine learning, visually representing text and text analyses, and more.
| Version: | 4.3.1 |
| Depends: | R (≥ 4.1.0), methods |
| Imports: | fastmatch, jsonlite, lifecycle, magrittr, Matrix (≥ 1.5-0), Rcpp (≥ 0.12.12), SnowballC, stopwords, stringi, xml2, yaml |
| LinkingTo: | Rcpp |
| Suggests: | rmarkdown, spelling, testthat, formatR, tm (≥ 0.6), knitr, lsa, rlang, slam |
| Enhances: | dplyr, lda, purrr, spacyr, stm, text2vec, tibble, tidytext, tokenizers, topicmodels |
| Published: | 2025-07-10 |
| DOI: | 10.32614/CRAN.package.quanteda |
| Author: | Kenneth Benoit [cre, aut, cph], Kohei Watanabe [aut], Haiyan Wang [aut], Paul Nulty [aut], Adam Obeng [aut], Stefan Müller [aut], Akitaka Matsuo [aut], William Lowe [aut], Christian Müller [ctb], Olivier Delmarcelle [ctb], European Research Council [fnd] (ERC-2011-StG 283794-QUANTESS) |
| Maintainer: | Kenneth Benoit <kbenoit at lse.ac.uk> |
| BugReports: | https://github.com/quanteda/quanteda/issues |
| License: | GPL-3 |
| URL: | https://quanteda.io |
| NeedsCompilation: | yes |
| Language: | en-GB |
| Citation: | quanteda citation info |
| Materials: | README, NEWS |
| In views: | NaturalLanguageProcessing |
| CRAN checks: | quanteda results |