wordpiece.data: Data for Wordpiece-Style Tokenization

Provides data to be used by the wordpiece algorithm in order to tokenize text into somewhat meaningful chunks. Included vocabularies were retrieved from <https://huggingface.co/bert-base-cased/resolve/main/vocab.txt> and <https://huggingface.co/bert-base-uncased/resolve/main/vocab.txt> and parsed into an R-friendly format.
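As a rough illustration of how such a vocabulary is used, the sketch below maps each token to its line index (the layout of a BERT `vocab.txt`, one token per line) and then splits a word by greedy longest-match-first lookup, marking continuation pieces with `##`. It is written in Python for self-containment and does not use this package's API. The names `load_vocab`, `tokenize_word`, and `TOY_VOCAB` are hypothetical, and the toy vocabulary is invented for the example rather than taken from the BERT files.

```python
def load_vocab(lines):
    """Map each token to its zero-based line index, as in a vocab.txt file."""
    vocab = {}
    for index, line in enumerate(lines):
        token = line.strip()
        if token:
            vocab[token] = index
    return vocab


def tokenize_word(word, vocab, unk="[UNK]"):
    """Split one word into wordpieces by greedy longest-match-first lookup."""
    pieces = []
    start = 0
    while start < len(word):
        end = len(word)
        piece = None
        # Try the longest remaining substring first, then shrink from the right.
        while end > start:
            candidate = word[start:end]
            if start > 0:
                candidate = "##" + candidate  # continuation-piece marker
            if candidate in vocab:
                piece = candidate
                break
            end -= 1
        if piece is None:
            # No piece matches at this position: the whole word is unknown.
            return [unk]
        pieces.append(piece)
        start = end
    return pieces


# Toy vocabulary (an assumption for illustration, not the real BERT vocabulary).
TOY_VOCAB = load_vocab(["[PAD]", "[UNK]", "token", "##ize", "##s", "un", "##able"])

print(tokenize_word("tokenizes", TOY_VOCAB))
print(tokenize_word("xyz", TOY_VOCAB))
```

With a real vocabulary, the same lookup would run over tens of thousands of entries; the cased and uncased files differ in whether the text is lowercased before lookup.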

Version: 2.0.0
Depends: R (≥ 3.5.0)
Suggests: testthat (≥ 3.0.0)
Published: 2022-03-03
DOI: 10.32614/CRAN.package.wordpiece.data
Author: Jonathan Bratt [aut], Jon Harmon [aut, cre], Bedford Freeman & Worth Pub Grp LLC DBA Macmillan Learning [cph], Google, Inc [cph] (original BERT vocabularies)
Maintainer: Jon Harmon <jonthegeek at gmail.com>
BugReports: https://github.com/macmillancontentscience/wordpiece.data/issues
License: Apache License (≥ 2)
URL: https://github.com/macmillancontentscience/wordpiece.data
NeedsCompilation: no
Materials: README, NEWS
CRAN checks: wordpiece.data results

Documentation:

Reference manual: wordpiece.data.html, wordpiece.data.pdf

Downloads:

Package source: wordpiece.data_2.0.0.tar.gz
Windows binaries: r-devel: wordpiece.data_2.0.0.zip, r-release: wordpiece.data_2.0.0.zip, r-oldrel: wordpiece.data_2.0.0.zip
macOS binaries: r-release (arm64): wordpiece.data_2.0.0.tgz, r-oldrel (arm64): wordpiece.data_2.0.0.tgz, r-release (x86_64): wordpiece.data_2.0.0.tgz, r-oldrel (x86_64): wordpiece.data_2.0.0.tgz
Old sources: wordpiece.data archive

Reverse dependencies:

Reverse imports: wordpiece

Linking:

Please use the canonical form https://CRAN.R-project.org/package=wordpiece.data to link to this page.

