A fresh approach to string manipulation in R

License: MIT (see LICENSE.md)

tidyverse/stringr



Overview

Strings are not glamorous, high-profile components of R, but they do play a big role in many data cleaning and preparation tasks. The stringr package provides a cohesive set of functions designed to make working with strings as easy as possible. If you’re not familiar with strings, the best place to start is the chapter on strings in R for Data Science.

stringr is built on top of stringi, which uses the ICU C library to provide fast, correct implementations of common string manipulations. stringr focusses on the most important and commonly used string manipulation functions whereas stringi provides a comprehensive set covering almost anything you can imagine. If you find that stringr is missing a function that you need, try looking in stringi. Both packages share similar conventions, so once you’ve mastered stringr, you should find stringi similarly easy to use.
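To make the relationship concrete, here is a minimal sketch that asks the same question through both packages and then reaches into stringi for an operation that, as far as this sketch assumes, has no dedicated stringr verb. The vector `x` and the variable names are illustrative only; `stri_detect_regex()` and `stri_reverse()` are stringi functions.

```r
library(stringr)
library(stringi)  # installed alongside stringr, which depends on it

x <- c("why", "video", "cross")

# The same question asked through each package: does the string contain a vowel?
via_stringr <- str_detect(x, "[aeiou]")
via_stringi <- stri_detect_regex(x, "[aeiou]")
all(via_stringr == via_stringi)
#> [1] TRUE

# Reversing a string: assumed here to have no stringr verb, but stringi has one.
reversed <- stri_reverse("stringr")
reversed
#> [1] "rgnirts"
```

Note the shared shape of the calls: the string vector comes first and the pattern second, which is why moving between the two packages tends to be straightforward.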

Installation

# The easiest way to get stringr is to install the whole tidyverse:
install.packages("tidyverse")

# Alternatively, install just stringr:
install.packages("stringr")

Usage

All functions in stringr start with str_ and take a vector of strings as the first argument:

library(stringr)

x <- c("why", "video", "cross", "extra", "deal", "authority")
str_length(x)
#> [1] 3 5 5 5 4 9
str_c(x, collapse = ", ")
#> [1] "why, video, cross, extra, deal, authority"
str_sub(x, 1, 2)
#> [1] "wh" "vi" "cr" "ex" "de" "au"

Most string functions work with regular expressions, a concise language for describing patterns of text. For example, the regular expression "[aeiou]" matches any single character that is a vowel:

str_subset(x, "[aeiou]")
#> [1] "video"     "cross"     "extra"     "deal"      "authority"
str_count(x, "[aeiou]")
#> [1] 0 3 1 2 2 4

There are eight main verbs that work with patterns:

  • str_detect(x, pattern) tells you if there’s any match to the pattern:

    str_detect(x, "[aeiou]")
    #> [1] FALSE  TRUE  TRUE  TRUE  TRUE  TRUE
  • str_count(x, pattern) counts the number of matches to the pattern:

    str_count(x, "[aeiou]")
    #> [1] 0 3 1 2 2 4
  • str_subset(x, pattern) extracts the elements of x that contain a match:

    str_subset(x, "[aeiou]")
    #> [1] "video"     "cross"     "extra"     "deal"      "authority"
  • str_locate(x, pattern) gives the position of the match:

    str_locate(x, "[aeiou]")
    #>      start end
    #> [1,]    NA  NA
    #> [2,]     2   2
    #> [3,]     3   3
    #> [4,]     1   1
    #> [5,]     2   2
    #> [6,]     1   1
  • str_extract(x, pattern) extracts the text of the match:

    str_extract(x, "[aeiou]")
    #> [1] NA  "i" "o" "e" "e" "a"
  • str_match(x, pattern) extracts parts of the match defined by parentheses:

    # extract the characters on either side of the vowel
    str_match(x, "(.)[aeiou](.)")
    #>      [,1]  [,2] [,3]
    #> [1,] NA    NA   NA
    #> [2,] "vid" "v"  "d"
    #> [3,] "ros" "r"  "s"
    #> [4,] NA    NA   NA
    #> [5,] "dea" "d"  "a"
    #> [6,] "aut" "a"  "t"
  • str_replace(x, pattern, replacement) replaces the matches with new text:

    str_replace(x, "[aeiou]", "?")
    #> [1] "why"       "v?deo"     "cr?ss"     "?xtra"     "d?al"      "?uthority"
  • str_split(x, pattern) splits up a string into multiple pieces:

    str_split(c("a,b", "c,d,e"), ",")
    #> [[1]]
    #> [1] "a" "b"
    #>
    #> [[2]]
    #> [1] "c" "d" "e"
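As the outputs above show, str_extract() and str_replace() act on the first match in each string only. The sketch below uses the `_all` variants that stringr also exports to process every match; the input strings are illustrative.

```r
library(stringr)

# Every vowel in each string, returned as a list with one element per input.
all_vowels <- str_extract_all(c("video", "deal"), "[aeiou]")
all_vowels
#> [[1]]
#> [1] "i" "e" "o"
#>
#> [[2]]
#> [1] "e" "a"

# Replace every vowel rather than just the first one.
masked <- str_replace_all("authority", "[aeiou]", "?")
masked
#> [1] "??th?r?ty"
```

Because a string can contain any number of matches, str_extract_all() returns a list rather than a character vector.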

As well as regular expressions (the default), there are three other pattern matching engines:

  • fixed(): match exact bytes
  • coll(): match human letters
  • boundary(): match boundaries
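A minimal sketch of how each engine changes what counts as a match. The input strings are made up for illustration; `ignore_case` is an argument of coll(), and "word" is one of the boundary types boundary() accepts.

```r
library(stringr)

dots <- c("a.b", "axb")

# fixed(): the pattern is compared literally, so "." means an actual dot.
fixed_hits <- str_detect(dots, fixed("."))
fixed_hits
#> [1]  TRUE FALSE

# Default regex engine: "." is a wildcard that matches any character.
regex_hits <- str_detect(dots, ".")
regex_hits
#> [1] TRUE TRUE

# coll(): compares using collation rules; here case differences are ignored.
coll_hits <- str_detect(c("apple", "cherry"), coll("A", ignore_case = TRUE))
coll_hits
#> [1]  TRUE FALSE

# boundary(): matches boundaries (here, between words) instead of characters.
n_words <- str_count("These are   some words.", boundary("word"))
n_words
#> [1] 4
```

fixed() is usually the fastest option when the pattern is a plain literal, while coll() is slower but respects locale-specific comparison rules.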

RStudio Addin

The RegExplain RStudio addin provides a friendly interface for working with regular expressions and functions from stringr. This addin allows you to interactively build your regexp, check the output of common string matching functions, consult the interactive help pages, or use the included resources to learn regular expressions.

This addin can easily be installed with devtools:

# install.packages("devtools")
devtools::install_github("gadenbuie/regexplain")

Compared to base R

R provides a solid set of string operations, but because they have grown organically over time, they can be inconsistent and a little hard to learn. Additionally, they lag behind the string operations in other programming languages, so that some things that are easy to do in languages like Ruby or Python are rather hard to do in R. stringr improves on base R in three ways:

  • Uses consistent function and argument names. The first argument is always the vector of strings to modify, which makes stringr work particularly well in conjunction with the pipe:

    letters %>%
      .[1:10] %>%
      str_pad(3, "right") %>%
      str_c(letters[2:11])
    #>  [1] "a  b" "b  c" "c  d" "d  e" "e  f" "f  g" "g  h" "h  i" "i  j" "j  k"
  • Simplifies string operations by eliminating options that you don’t need 95% of the time.

  • Produces outputs that can easily be used as inputs. This includes ensuring that missing inputs result in missing outputs, and zero length inputs result in zero length outputs.

Learn more in vignette("from-base").
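The point about missing and zero length inputs can be sketched by comparing base R's paste0() with str_c(). The input vector is illustrative.

```r
library(stringr)

x <- c("a", NA, "c")

# Base R converts the missing value into the literal text "NA".
base_out <- paste0(x, "!")
base_out
#> [1] "a!"  "NA!" "c!"

# stringr keeps the missing value missing.
stringr_out <- str_c(x, "!")
stringr_out
#> [1] "a!" NA   "c!"

# Zero length input: base R still returns one string, stringr returns none.
base_empty <- paste0(character(0), "!")
stringr_empty <- str_c(character(0), "!")
length(base_empty)
#> [1] 1
length(stringr_empty)
#> [1] 0
```

Keeping NA and empty vectors intact means downstream steps see that data is absent instead of receiving a plausible-looking string.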
