textreuse: Detect Text Reuse and Document Similarity

Tools for measuring similarity among documents and detecting passages which have been reused. Implements shingled n-gram, skip n-gram, and other tokenizers; similarity/dissimilarity functions; pairwise comparisons; minhash and locality sensitive hashing algorithms; and a version of the Smith-Waterman local alignment algorithm suitable for natural language.
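As a rough illustration of three of the techniques named above (shingled n-grams, Jaccard similarity, and minhash), here is a minimal Python sketch. It is not the package's R implementation or API: the function names, the whitespace tokenization, and the choice of salted BLAKE2b as the hash family are all assumptions made for this example.

```python
import hashlib


def shingle_ngrams(text, n=3):
    """Return the set of overlapping word n-grams (shingles) in text.

    Assumption: tokens are lowercase whitespace-separated words.
    """
    words = text.lower().split()
    return {" ".join(words[i:i + n]) for i in range(len(words) - n + 1)}


def jaccard_similarity(a, b):
    """Size of the intersection divided by the size of the union."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def minhash_signature(shingles, num_hashes=64):
    """Keep the minimum hash value under each of num_hashes salted hashes.

    Assumption: shingles is non-empty; salted BLAKE2b stands in for a
    family of independent hash functions.
    """
    signature = []
    for seed in range(num_hashes):
        salt = seed.to_bytes(8, "big")
        signature.append(min(
            int.from_bytes(
                hashlib.blake2b(s.encode("utf-8"), digest_size=8, salt=salt).digest(),
                "big",
            )
            for s in shingles
        ))
    return signature


def estimate_similarity(sig_a, sig_b):
    """Fraction of signature positions that agree; approximates Jaccard."""
    matches = sum(1 for x, y in zip(sig_a, sig_b) if x == y)
    return matches / len(sig_a)


a = shingle_ngrams("the quick brown fox jumps over the lazy dog")
b = shingle_ngrams("the quick brown fox leaps over the lazy dog")
exact = jaccard_similarity(a, b)  # 4 shared shingles out of 10 distinct: 0.4
approx = estimate_similarity(minhash_signature(a), minhash_signature(b))
```

The exact score compares the full shingle sets, while the minhash estimate compares only fixed-length signatures. That trade of accuracy for size is what makes locality-sensitive hashing practical on large corpora.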

Version: 0.1.5
Depends: R (≥ 3.1.1)
Imports: assertthat (≥ 0.1), digest (≥ 0.6.8), dplyr (≥ 0.8.0), NLP (≥ 0.1.8), Rcpp (≥ 0.12.0), RcppProgress (≥ 0.1), stringr (≥ 1.0.0), tibble (≥ 3.0.1), tidyr (≥ 0.3.1)
LinkingTo: BH, Rcpp, RcppProgress
Suggests: testthat (≥ 0.11.0), knitr (≥ 1.11), rmarkdown (≥ 0.8), covr
Published: 2020-05-15
DOI: 10.32614/CRAN.package.textreuse
Author: Lincoln Mullen [aut, cre]
Maintainer: Lincoln Mullen <lincoln at lincolnmullen.com>
BugReports: https://github.com/ropensci/textreuse/issues
License: MIT + file LICENSE
URL: https://docs.ropensci.org/textreuse, https://github.com/ropensci/textreuse
NeedsCompilation: yes
Materials: README, NEWS
In views: NaturalLanguageProcessing
CRAN checks: textreuse results

Documentation:

Reference manual: textreuse.html, textreuse.pdf
Vignettes: Text alignment (source, R code)
Introduction to the textreuse packages (source, R code)
Minhash and locality-sensitive hashing (source, R code)
Pairwise comparisons for document similarity (source, R code)

Downloads:

Package source: textreuse_0.1.5.tar.gz
Windows binaries: r-devel: textreuse_0.1.5.zip, r-release: textreuse_0.1.5.zip, r-oldrel: textreuse_0.1.5.zip
macOS binaries: r-release (arm64): textreuse_0.1.5.tgz, r-oldrel (arm64): textreuse_0.1.5.tgz, r-release (x86_64): textreuse_0.1.5.tgz, r-oldrel (x86_64): textreuse_0.1.5.tgz
Old sources: textreuse archive

Reverse dependencies:

Reverse suggests: textrank

Linking:

Please use the canonical form https://CRAN.R-project.org/package=textreuse to link to this page.

