Package: pangoling 1.0.1

Bruno Nicenboim

pangoling: Access to Large Language Model Predictions

Provides access to word predictability estimates using large language models (LLMs) based on 'transformer' architectures via integration with the 'Hugging Face' ecosystem. The package interfaces with pre-trained neural networks and supports both causal/auto-regressive LLMs (e.g., 'GPT-2'; Radford et al., 2019) and masked/bidirectional LLMs (e.g., 'BERT'; Devlin et al., 2019, <doi:10.48550/arXiv.1810.04805>) to compute the probability of words, phrases, or tokens given their linguistic context. By enabling a straightforward estimation of word predictability, the package facilitates research in psycholinguistics, computational linguistics, and natural language processing (NLP).
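As a hedged sketch of the intended workflow: the exported function causal_words_pred() (listed below) returns the log-probability of each word given its preceding context under a causal model such as GPT-2. Argument names (`x`, `by`, `model`) and the column names of the bundled df_sent dataset are taken from the package's help index and should be checked against the installed version.

```r
library(pangoling)

# df_sent ships with the package: two word-by-word sentences, one row per
# word, with a sentence-identifier column (names assumed here).
# causal_words_pred() returns the natural-log probability of each word
# given its preceding words within the same sentence.
df_sent$lp <- causal_words_pred(
  x     = df_sent$word,
  by    = df_sent$sent_n,  # group words by sentence
  model = "gpt2"           # any Hugging Face causal checkpoint
)
head(df_sent)
```

The first call downloads the checkpoint from the Hugging Face hub, so it requires a working Python backend (see install_py_pangoling()) and an internet connection.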

Authors: Bruno Nicenboim [aut, cre], Chris Emmerly [ctb], Giovanni Cassani [ctb], Lisa Levinson [rev], Utku Turk [rev]

pangoling_1.0.1.tar.gz
pangoling_1.0.1.zip (r-4.5) | pangoling_1.0.1.zip (r-4.4) | pangoling_1.0.1.zip (r-4.3)
pangoling_1.0.1.tgz (r-4.5-any) | pangoling_1.0.1.tgz (r-4.4-any) | pangoling_1.0.1.tgz (r-4.3-any)
pangoling_1.0.1.tar.gz (r-4.5-noble) | pangoling_1.0.1.tar.gz (r-4.4-noble)
pangoling_1.0.1.tgz (r-4.4-emscripten) | pangoling_1.0.1.tgz (r-4.3-emscripten)
pangoling.pdf | pangoling.html
pangoling/json (API)
NEWS

# Install 'pangoling' in R:
install.packages('pangoling', repos = c('https://ropensci.r-universe.dev', 'https://cloud.r-project.org'))
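pangoling calls Python 'transformers' through 'reticulate' (both appear in the dependency list below), and exports helpers to set up and check that Python side. A minimal post-install sketch, using only functions named in this page's export list (the cache path is illustrative):

```r
library(pangoling)

# One-time setup: install the Python packages pangoling needs
# (managed via reticulate), then verify they are visible from R.
if (!installed_py_pangoling()) {
  install_py_pangoling()
}

# Optional: keep downloaded Hugging Face checkpoints in a fixed folder
# so models are not re-downloaded across projects.
set_cache_folder("~/hf_cache")
```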

Reviews: rOpenSci Software Review #575

Bug tracker: https://github.com/ropensci/pangoling/issues

Pkgdown site: https://docs.ropensci.org

Datasets:
  • df_jaeger14 - Self-Paced Reading Dataset on Chinese Relative Clauses
  • df_sent - Example dataset: Two word-by-word sentences

On CRAN:

Conda:

nlp, psycholinguistics, transformers

4.90 score 8 stars 24 exports 26 dependencies

Last updated 13 days ago from: 967d98b74e (on main). Checks: 4 OK, 5 NOTE. Indexed: yes.

Target          | Result | Latest binary
Doc / Vignettes | OK     | Mar 11 2025
R-4.5-win       | OK     | Mar 11 2025
R-4.5-mac       | OK     | Mar 11 2025
R-4.5-linux     | OK     | Mar 11 2025
R-4.4-win       | NOTE   | Mar 11 2025
R-4.4-mac       | NOTE   | Mar 11 2025
R-4.4-linux     | NOTE   | Mar 11 2025
R-4.3-win       | NOTE   | Mar 11 2025
R-4.3-mac       | NOTE   | Mar 11 2025

Exports: causal_config, causal_lp, causal_lp_mats, causal_next_tokens_pred_tbl, causal_next_tokens_tbl, causal_pred_mats, causal_preload, causal_targets_pred, causal_tokens_lp_tbl, causal_tokens_pred_lst, causal_words_pred, install_py_pangoling, installed_py_pangoling, masked_config, masked_lp, masked_preload, masked_targets_pred, masked_tokens_pred_tbl, masked_tokens_tbl, ntokens, perplexity_calc, set_cache_folder, tokenize_lst, transformer_vocab

Dependencies: cachem, cli, data.table, fastmap, glue, here, jsonlite, lattice, lifecycle, magrittr, Matrix, memoise, pillar, png, rappdirs, Rcpp, RcppTOML, reticulate, rlang, rprojroot, rstudioapi, tidyselect, tidytable, utf8, vctrs, withr

Troubleshooting the use of Python in R

Rendered from troubleshooting.Rmd using knitr::rmarkdown on Mar 11 2025.

Last update: 2025-03-11
Started: 2025-03-11

Using a BERT model to get the predictability of words in their context

Rendered from intro-bert.Rmd using knitr::rmarkdown on Mar 11 2025.

Last update: 2025-03-11
Started: 2025-03-11

Using a GPT-2 transformer model to get word predictability

Rendered from intro-gpt2.Rmd using knitr::rmarkdown on Mar 11 2025.

Last update: 2025-03-11
Started: 2025-03-11

Worked-out example: Surprisal from a causal (GPT) model as a cognitive processing bottleneck in reading

Rendered from example.Rmd using knitr::rmarkdown on Mar 11 2025.

Last update: 2025-03-11
Started: 2025-03-11
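The worked-out example above is built around surprisal, which is simply the negative log-probability of a word given its context; when a function such as causal_words_pred() returns natural-log probabilities, converting them to the bits scale common in reading research is one line. A minimal sketch with illustrative values:

```r
# Surprisal in bits: -log2 P(word | context) = -lp / log(2).
lp <- c(-2.3, -0.7, -5.1)   # illustrative natural-log probabilities
surprisal <- -lp / log(2)
surprisal                   # higher = less predictable, slower reading
```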

Citation

Development and contributors

Readme and manuals

Help Manual

Help page                                                                    | Topics
Returns the configuration of a causal model                                  | causal_config
Generate next tokens after a context and their predictability using a causal transformer model | causal_next_tokens_pred_tbl
Generate a list of predictability matrices using a causal transformer model  | causal_pred_mats
Preloads a causal language model                                             | causal_preload
Compute predictability using a causal transformer model                      | causal_targets_pred, causal_tokens_pred_lst, causal_words_pred
Self-Paced Reading Dataset on Chinese Relative Clauses                       | df_jaeger14
Example dataset: Two word-by-word sentences                                  | df_sent
Install the Python packages needed for 'pangoling'                           | install_py_pangoling
Check if the required Python dependencies for 'pangoling' are installed      | installed_py_pangoling
Returns the configuration of a masked model                                  | masked_config
Preloads a masked language model                                             | masked_preload
Get the predictability of a target word (or phrase) given a left and right context | masked_targets_pred
Get the possible tokens and their log probabilities for each mask in a sentence | masked_tokens_pred_tbl
The number of tokens in a string or vector of strings                        | ntokens
Calculates perplexity                                                        | perplexity_calc
Set cache folder for HuggingFace transformers                                | set_cache_folder
Tokenize an input                                                            | tokenize_lst
Returns the vocabulary of a model                                            | transformer_vocab
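For bidirectional models, the analogous entry point in the index above is masked_targets_pred(), which scores a target given both a left and a right context, and perplexity_calc() aggregates per-word log-probabilities into a perplexity. A hedged sketch (argument names are assumed from the help descriptions, and perplexity_calc() is assumed to take natural-log probabilities, i.e. exp(-mean(lp)); verify both against the installed documentation):

```r
library(pangoling)

# Predictability of "tree" with BERT, given context on both sides.
# Requires the Python backend and a model download on first use.
masked_targets_pred(
  prev_contexts  = "The apple doesn't fall far from the",
  targets        = "tree",
  after_contexts = ".",
  model          = "bert-base-uncased"
)

# Perplexity from natural-log probabilities: the geometric mean of 1/P.
perplexity_calc(c(-2.3, -0.7, -5.1))
```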

Usage by other packages (reverse dependencies)

