str_like(ignore_case) is deprecated, with str_like() now always case sensitive to better follow the conventions of the SQL LIKE operator (
In str_replace_all(), a replacement function now receives all values in a single vector. This radically improves performance at the cost of breaking some existing uses (#462).
New vignette("locale-sensitive") about locale sensitive functions (
New str_ilike() that follows the conventions of the SQL ILIKE operator (
New str_to_camel(), str_to_snake(), and str_to_kebab() for changing “programming” case (
str_* now errors if pattern includes any NAs (
str_dup() gains a sep argument so you can add a separator between every repeated value (
str_sub<- now gives a more informative error if value is not the correct length.
str_view() displays a message when called with a zero-length character vector (
New [[.stringr_pattern method to match existing [.stringr_pattern (
R CMD check fixes
Some minor documentation improvements.
str_trunc() now correctly truncates strings when side is "left" or "center" (
stringr functions now consistently implement the tidyverse recycling rules (#372). There are two main changes:
Only vectors of length 1 are recycled. Previously, (e.g.) str_detect(letters, c("x", "y")) worked, but it now errors.
str_c() ignores NULLs, rather than treating them as length 0 vectors.
Additionally, many more arguments now throw errors, rather than warnings, if supplied the wrong type of input.
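The recycling changes described above can be illustrated with a small sketch. It assumes a stringr version that already implements the tidyverse recycling rules; the variable names are only illustrative.

```r
library(stringr)

# A length-1 pattern is still recycled against a longer vector.
recycled <- str_detect(letters[1:3], "b")

# NULL inputs to str_c() are dropped instead of being treated as
# zero-length vectors.
joined <- str_c("a", NULL, "b")

# Two vectors of different lengths, neither of length 1, are an error.
mismatch <- tryCatch(
  str_detect(letters, c("x", "y")),
  error = function(e) "error"
)
```

Wrapping the mismatched call in tryCatch() is only there so the sketch runs to completion; in ordinary code the error would surface directly.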
regex() and friends now generate class names with stringr_ prefix (#384).
str_detect(), str_starts(), str_ends() and str_subset() now error when used with either an empty string ("") or a boundary(). These operations didn’t really make sense (str_detect(x, "") returned TRUE for all non-empty strings) and made it easy to make mistakes when programming.
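As a rough illustration of the empty-pattern change, the sketch below captures the error and shows one possible alternative for finding non-empty strings. Using str_length() here is a suggestion of this example, not something the changelog prescribes.

```r
library(stringr)

x <- c("apple", "")

# An empty pattern is now rejected; tryCatch() records that it failed.
outcome <- tryCatch(
  str_detect(x, ""),
  error = function(e) "error"
)

# If the intent was "which strings are non-empty?", test the length instead.
non_empty <- str_length(x) > 0
```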
Many tweaks to the documentation to make it more useful and consistent.
New vignette("from-base") by
New str_escape() escapes regular expression metacharacters, providing an alternative to fixed() if you want to compose a pattern from user supplied strings (#408).
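A minimal sketch of where str_escape() helps: the user-supplied string below (a made-up value) contains a regex metacharacter, and escaping it lets it be embedded safely in a larger pattern.

```r
library(stringr)

user_input <- "a.b"          # hypothetical user-supplied string
candidates <- c("a.b", "axb")

# Unescaped, "." matches any character, so both candidates match.
loose <- str_detect(candidates, user_input)

# Escaped, the "." is matched literally.
strict <- str_detect(candidates, str_escape(user_input))

# Unlike fixed(), the escaped string can be combined with regex anchors.
anchored <- str_c("^", str_escape(user_input), "$")
```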
New str_equal() compares two character vectors using Unicode rules, optionally ignoring case (#381).
str_extract() can now optionally extract a capturing group instead of the complete match (#420).
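The sketch below contrasts the complete match with a single capturing group. The input strings and the pattern are invented for illustration.

```r
library(stringr)

x <- c("width=10", "height=20", "none")
pattern <- "(\\w+)=(\\d+)"

# Default: the complete match.
whole <- str_extract(x, pattern)

# group = 2: only the second capturing group (the digits).
values <- str_extract(x, pattern, group = 2)
```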
New str_flatten_comma() is a special case of str_flatten() designed for comma separated flattening and can correctly apply the Oxford comma when there are only two elements (#444).
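A short sketch of the two-element case mentioned above, using ", and " as the final separator:

```r
library(stringr)

# With three elements the Oxford comma is kept.
three <- str_flatten_comma(c("a", "b", "c"), last = ", and ")

# With two elements the comma is dropped, giving "a and b".
two <- str_flatten_comma(c("a", "b"), last = ", and ")
```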
New str_split_1() is tailored for the special case of splitting up a single string (#409).
New str_split_i() extracts a single piece from a string (#278,
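The two splitting helpers can be compared in a small sketch; the inputs are made up.

```r
library(stringr)

# str_split_1() takes one string and returns a character vector
# rather than a list.
pieces <- str_split_1("a-b-c", "-")

# str_split_i() returns the i-th piece of every string.
second <- str_split_i(c("a-b-c", "d-e"), "-", 2)
```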
New str_like() allows the use of SQL wildcards (#280,
New str_rank() to complete the set of order/rank/sort functions (#353).
New str_sub_all() to extract multiple substrings from each string.
New str_unique() is a wrapper around stri_unique() and returns unique string values in a character vector (#249,
str_view() uses ANSI colouring rather than an HTML widget (#370). This works in more places and requires fewer dependencies. It includes a number of other small improvements:
and pattern (#407).
str_view_all() is now redundant (and hence deprecated) (#455).
New str_width() returns the display width of a string (#380).
stringr is now licensed as MIT (#351).
Better error message if you supply a non-string pattern (#378).
A new data source for sentences has fixed many small errors.
str_extract() and str_extract_all() now work correctly when pattern is a boundary().
str_flatten() gains a last argument that optionally overrides the final separator (#377). It gains a na.rm argument to remove missing values (since it’s a summary function) (#439).
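A brief sketch of the two new str_flatten() arguments, with invented inputs:

```r
library(stringr)

# last overrides the separator before the final element.
with_last <- str_flatten(c("a", "b", "c"), ", ", last = " and ")

# A missing value makes the whole result missing by default ...
with_na <- str_flatten(c("a", NA, "c"), ", ")

# ... unless na.rm = TRUE removes it first.
without_na <- str_flatten(c("a", NA, "c"), ", ", na.rm = TRUE)
```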
str_pad() gains use_width argument to control whether to use the total code point width or the number of code points as “width” of a string (#190).
str_replace() and str_replace_all() can use standard tidyverse formula shorthand for replacement function (#331).
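As a sketch of the formula shorthand, the example below upper-cases each match. It uses a vectorised function (toupper()) on purpose, so the result does not depend on whether the replacement function is called per match or once with all matches.

```r
library(stringr)

x <- c("abc", "bcd")

# ~ toupper(.x) is shorthand for function(.x) toupper(.x).
shorthand <- str_replace_all(x, "[ab]", ~ toupper(.x))

# The same replacement written as an ordinary function.
explicit <- str_replace_all(x, "[ab]", function(m) toupper(m))
```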
str_starts() and str_ends() now correctly respect regex operator precedence (
str_wrap() breaks only at whitespace by default; set whitespace_only = FALSE to return to the previous behaviour (#335,
word() now returns the whole sentence when using a negative start parameter whose absolute value is greater than or equal to the number of words. (
Hot patch release to resolve R CMD check failures.
str_interp() now renders lists consistently, independent of the presence of additional placeholders (
New str_starts() and str_ends() functions to detect patterns at the beginning or end of strings (
str_subset(), str_detect(), and str_which() get a negate argument, which is useful when you want the elements that do NOT match (#259,
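A small sketch of negate across the three functions, using a made-up vector:

```r
library(stringr)

x <- c("apple", "banana", "avocado")

# Which elements do NOT start with "a"?
flags <- str_detect(x, "^a", negate = TRUE)
kept  <- str_subset(x, "^a", negate = TRUE)
index <- str_which(x, "^a", negate = TRUE)
```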
New str_to_sentence() function to capitalize with sentence case (
str_replace_all() with a named vector now respectsmodifier functions (#207)
str_trunc() is once again vectorised correctly (#203,
str_view() handles NA values more gracefully (#217). I’ve also tweaked the sizing policy so hopefully it should work better in notebooks, while preserving the existing behaviour in knit documents (#232).
If you see Error : object ‘ignore.case’ is not exported by 'namespace:stringr', this is because the long deprecated str_join(), ignore.case() and perl() have now been removed.
str_glue() and str_glue_data() provide convenient wrappers around glue() and glue_data() from the glue package (#157).
str_flatten() is a wrapper around stri_flatten() and clearly conveys flattening a character vector into a single string (#186).
New str_remove() and str_remove_all() functions. These wrap str_replace() and str_replace_all() to remove patterns from strings. (
str_squish() removes spaces from both the left and right side of strings, and also converts multiple spaces (or space-like characters) to a single space within strings (
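The removal and squishing helpers described above can be sketched as follows; the inputs are invented.

```r
library(stringr)

# str_remove() drops the first match, str_remove_all() drops every match.
first_only <- str_remove("a-b-c", "-")
all_gone   <- str_remove_all("a-b-c", "-")

# str_squish() trims both ends and collapses internal runs of whitespace.
squished <- str_squish("  a   b  ")
```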
str_sub() gains omit_na argument for ignoring NA. Accordingly, str_replace() now ignores NAs and keeps the original strings. (
str_trunc() now preserves NAs (
str_trunc() now throws an error when width is shorter than ellipsis (
Long deprecated str_join(), ignore.case() and perl() have now been removed.
str_match_all() now returns NA if an optional group doesn’t match (previously it returned ""). This is more consistent with str_match() and other match failures (#134).
In str_replace(), replacement can now be a function that is called once for each match and whose return value is used to replace the match.
New str_which() mimics grep() (#129).
A new vignette (vignette("regular-expressions")) describes the details of the regular expressions supported by stringr. The main vignette (vignette("stringr")) has been updated to give a high-level overview of the package.
str_order() and str_sort() gain explicit numeric argument for sorting mixed numbers and strings.
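A quick sketch of what the numeric argument changes, using hypothetical file names:

```r
library(stringr)

x <- c("file10", "file2", "file1")

# Default: digits are compared character by character.
alphabetical <- str_sort(x)

# numeric = TRUE: runs of digits are compared as numbers.
natural <- str_sort(x, numeric = TRUE)
natural_order <- str_order(x, numeric = TRUE)
```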
str_replace_all() now throws an error if replacement is not a character vector. If replacement is NA_character_ it replaces the complete string with NA (#124).
All functions that take a locale (e.g. str_to_lower() and str_sort()) default to “en” (English) to ensure that the default is consistent across platforms.
Add sample datasets: fruit, words and sentences.
fixed(), regex(), and coll() now throw an error if you use them with anything other than a plain string (#60). I’ve clarified that the replacement for perl() is regex() not regexp() (#61).
boundary() has improved defaults when splitting on non-word boundaries (#58,
str_detect() now can detect boundaries (by checking for a str_count() > 0) (#120). str_subset() works similarly.
str_extract() and str_extract_all() now work with boundary(). This is particularly useful if you want to extract logical constructs like words or sentences. str_extract_all() respects the simplify argument when used with fixed() matches.
str_subset() now respects custom options for fixed() patterns (#79,
str_replace() and str_replace_all() now behave correctly when a replacement string contains $s, \\\\1, etc. (#83, #99).
str_split() gains a simplify argument to match str_extract_all() etc.
str_view() and str_view_all() create HTML widgets that display regular expression matches (#96).
word() returns NA for indexes greater than number of words (#112).
stringr is now powered by stringi instead of base R regular expressions. This improves Unicode support, and makes most operations considerably faster. If you find stringr inadequate for your string processing needs, I highly recommend looking at stringi in more detail.
stringr gains a vignette, currently a straightforward update of the article that appeared in the R Journal.
str_c() now returns a zero length vector if any of its inputs are zero length vectors. This is consistent with all other functions, and standard R recycling rules. Similarly, using str_c("x", NA) now yields NA. If you want "xNA", use str_replace_na() on the inputs.
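The str_c() behaviour described above can be checked with a short sketch:

```r
library(stringr)

# A zero-length input gives a zero-length result.
empty <- str_c("x", character())

# Missing values propagate ...
missing_out <- str_c("x", NA_character_)

# ... unless they are first converted to the string "NA".
literal_na <- str_c("x", str_replace_na(NA_character_))
```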
str_replace_all() gains a convenient syntax for applying multiple pairs of pattern and replacement to the same vector:
    input <- c("abc", "def")
    str_replace_all(input, c("[ad]" = "!", "[cf]" = "?"))
str_match() now returns NA if an optional group doesn’t match (previously it returned ""). This is more consistent with str_extract() and other match failures.
New str_subset() keeps values that match a pattern. It’s a convenient wrapper for x[str_detect(x)] (#21,
New str_order() and str_sort() allow you to sort and order strings in a specified locale.
New str_conv() to convert strings from specified encoding to UTF-8.
New modifier boundary() allows you to count, locate and split by character, word, line and sentence boundaries.
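A minimal sketch of counting and splitting with boundary(), using an invented sentence. It relies on the default behaviour of word boundaries, which skip whitespace and punctuation.

```r
library(stringr)

x <- "The quick brown fox."

# Count words and characters.
n_words <- str_count(x, boundary("word"))
n_chars <- str_count(x, boundary("character"))

# Split into words; the trailing full stop is not returned as a piece.
words_found <- str_split(x, boundary("word"))[[1]]
```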
The documentation got a lot of love, and very similar functions (e.g. first and all variants) are now documented together. This should hopefully make it easier to locate the function you need.
ignore.case(x) has been deprecated in favour of fixed|regex|coll(x, ignore.case = TRUE); perl(x) has been deprecated in favour of regex(x).
str_join() is deprecated; please use str_c() instead.
fixed path in str_wrap example so it works for more R installations.
remove dependency on plyr
Zero input to str_split_fixed returns 0 row matrix with n columns
Export str_join
new modifier perl that switches to Perl regular expressions
str_match now uses new base function regmatches to extract matches - this should hopefully be faster than my previous pure R algorithm
new str_wrap function which gives strwrap output in a more convenient format
new word function extracts words from a string given user defined separator (thanks to suggestion by David Cooper)
str_locate now returns consistent type when matching empty string (thanks to Stavros Macrakis)
new str_count counts number of matches in a string.
str_pad and str_trim receive performance tweaks - for large vectors this should give at least a two order of magnitude speed up
str_length returns NA for invalid multibyte strings
fix small bug in internal recyclable function