Fast and portable character string processing in R (with the Unicode ICU)
gagolews/stringi
A comprehensive tutorial and reference manual is available at https://stringi.gagolewski.com/.
Check out stringx for a set of wrappers around stringi with a base R-compatible API. To learn more about R, check out Marek's open-access (free!) textbook Deep R Programming.
stringi (pronounced “stringy”, IPA [strinɡi]) is THE R package for string/text/natural language processing. It is very fast, consistent, convenient, and, thanks to the ICU (International Components for Unicode) library, portable across all locales and platforms.
Available features include:
- string concatenation, padding, wrapping,
- substring extraction,
- pattern searching (e.g., with Java-like regular expressions),
- collation and sorting,
- random string generation,
- case mapping and folding,
- string transliteration,
- Unicode normalisation,
- date-time formatting and parsing,
and many more.
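A few of the features listed above can be sketched as follows. This is a minimal illustration rather than a tour of the API: it assumes stringi is installed, and the input strings are made up for the example.

```r
library(stringi)

# Padding and concatenation
id <- stri_pad_left("7", width = 3, pad = "0")      # "007"
label <- stri_join("item", id, sep = "-")           # "item-007"

# Substring extraction (1-based, inclusive indices)
first_word <- stri_sub("hello world", from = 1, to = 5)   # "hello"

# Pattern searching with an ICU (Java-like) regular expression
has_digit <- stri_detect_regex(c("abc", "a1c"), "\\d")    # FALSE TRUE

# Locale-aware sorting (the locale is passed on to the collator options)
sorted <- stri_sort(c("banana", "apple", "cherry"), locale = "en_US")

# Case mapping
upper <- stri_trans_toupper("stringi")              # "STRINGI"

# Transliteration: strip diacritics with ICU's Latin-ASCII transform
ascii <- stri_trans_general("G\u0105golewski", "Latin-ASCII")  # "Gagolewski"

# Unicode normalisation: compose "e" + combining acute accent into one code point
nfc <- stri_trans_nfc("e\u0301")
stri_length(nfc)                                    # 1
```

All functions share the `stri_` prefix and are vectorised over their main arguments, so the same calls work on whole character vectors.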
Package Maintainer: Marek Gagolewski
Authors and Contributors: Marek Gagolewski, with contributions from Bartłomiej Tartanus and many others.
The package's API was inspired by that of the early (pre-tidyverse; v0.6.2) version of Hadley Wickham's stringr package (and since the 2015 v1.0.0, stringr is powered by stringi).
Homepage: https://stringi.gagolewski.com/
Citation: Gagolewski M., stringi: Fast and portable character string processing in R, Journal of Statistical Software 103(2), 2022, 1–59, https://dx.doi.org/10.18637/jss.v103.i02.
CRAN Entry: https://CRAN.R-project.org/package=stringi
System Requirements: R >= 3.4, ICU4C >= 61 (refer to the INSTALL file for more details)
License: stringi's source code is distributed under the open source BSD-3-clause license. For more details, see LICENSE.
This git repository also contains a custom subset of ICU4C source code which is copyrighted by Unicode, Inc. and others. A binary version of the Unicode Character Database is included. For more details on copyright holders, see LICENSE. The ICU project is covered by the Unicode license, a simple, permissive non-copyleft free software license, compatible with the GNU GPL. The ICU license is intended to allow ICU to be included in free software projects as well as in proprietary or commercial products.
Changes: see the NEWS file.
How to access the stringi C++ API from within an Rcpp-based R package
