tok: Fast Text Tokenization

Interfaces with the 'Hugging Face' tokenizers library to provide implementations of today's most used tokenizers such as the 'Byte-Pair Encoding' algorithm <https://huggingface.co/docs/tokenizers/index>. It's extremely fast for both training new vocabularies and tokenizing texts.
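
As a rough illustration of the 'Byte-Pair Encoding' idea mentioned above, the following Python sketch learns merge rules by repeatedly fusing the most frequent adjacent symbol pair, then applies them to a new word. It is a toy, character-level version written for this page. It is not the tok API and not the 'Hugging Face' tokenizers implementation. The function names (`train_bpe`, `merge_pair`, `encode`) and the tie-breaking rule are assumptions made for the example.

```python
from collections import Counter


def merge_pair(symbols, pair):
    """Replace every adjacent occurrence of `pair` in `symbols` with one fused symbol."""
    out = []
    i = 0
    while i < len(symbols):
        if i + 1 < len(symbols) and (symbols[i], symbols[i + 1]) == pair:
            out.append(symbols[i] + symbols[i + 1])
            i += 2
        else:
            out.append(symbols[i])
            i += 1
    return tuple(out)


def train_bpe(words, num_merges):
    """Learn up to `num_merges` merge rules from a list of words (toy BPE)."""
    # Each word starts as a tuple of single characters, weighted by frequency.
    vocab = Counter(tuple(w) for w in words)
    merges = []
    for _ in range(num_merges):
        # Count adjacent symbol pairs across the whole corpus.
        pairs = Counter()
        for symbols, freq in vocab.items():
            for a, b in zip(symbols, symbols[1:]):
                pairs[(a, b)] += freq
        if not pairs:
            break
        # Most frequent pair wins; ties go to the lexicographically smallest
        # pair (an assumption of this sketch, chosen for determinism).
        best = min(pairs, key=lambda p: (-pairs[p], p))
        merges.append(best)
        new_vocab = Counter()
        for symbols, freq in vocab.items():
            new_vocab[merge_pair(symbols, best)] += freq
        vocab = new_vocab
    return merges


def encode(word, merges):
    """Tokenize `word` by applying the learned merges in training order."""
    symbols = tuple(word)
    for pair in merges:
        symbols = merge_pair(symbols, pair)
    return list(symbols)


if __name__ == "__main__":
    rules = train_bpe(["low", "low", "lower", "lowest"], num_merges=2)
    print(rules)
    print(encode("lowly", rules))
```

Production tokenizers typically work on bytes rather than characters and add normalization, pre-tokenization, and special tokens, none of which this sketch attempts.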

Version: 0.2.1
Depends: R (≥ 4.2.0)
Imports: R6, cli
Suggests: rmarkdown, testthat (≥ 3.0.0), hfhub (≥ 0.1.1), withr
Published: 2025-09-30
DOI: 10.32614/CRAN.package.tok
Author: Daniel Falbel [aut, cre], Regouby Christophe [ctb], Posit [cph]
Maintainer: Daniel Falbel <daniel at posit.co>
BugReports: https://github.com/mlverse/tok/issues
License: MIT + file LICENSE
URL: https://github.com/mlverse/tok
NeedsCompilation: yes
SystemRequirements: Cargo (Rust's package manager), rustc >= 1.75
Materials: README, NEWS
CRAN checks: tok results

Documentation:

Reference manual: tok.html, tok.pdf

Downloads:

Package source: tok_0.2.1.tar.gz
Windows binaries: r-devel: tok_0.2.1.zip, r-release: tok_0.2.1.zip, r-oldrel: tok_0.2.1.zip
macOS binaries: r-release (arm64): tok_0.2.1.tgz, r-oldrel (arm64): tok_0.2.1.tgz, r-release (x86_64): tok_0.2.1.tgz, r-oldrel (x86_64): tok_0.2.1.tgz
Old sources: tok archive

Linking:

Please use the canonical form https://CRAN.R-project.org/package=tok to link to this page.

