Movatterモバイル変換

The string (or character) data type typically requires moremanipulation to be helpful for data analysts. Thus, there is a need fora robust package that is up to the task.forstringr isa new package built on top of ‘stringr’ to execute various stringmanipulations in R programming. The main aim of ‘forstringr’ is tosimplify string manipulation for R beginners. This package combines itspower with the adaptability of other manipulation tools such as tidyrand dplyr. Like in the stringr package, most functions inforstringr begin withstr_. For a quick videotutorial, I gave a talk at Africa R users meetup, which you can findhere.

Installation

install.packages("forstringr")

if(!require("devtools")){install.packages("devtools")}devtools::install_github("gbganalyst/forstringr")

Usage

This section provides a concise overview of the different functionsavailable in theforstringr package. These functions servevarious purposes and are designed to aid in string manipulationtasks.

length_omit_na()

library(forstringr)#> Loading required package: stringrethnicity<-c("Hausa",NA,"Yoruba","Ijaw","Igbo",NA,"Ibibio","Tiv","Fulani","Kanuri","Others")# count all the observations, including NAs.length(ethnicity)#> [1] 11# count all the observations, without NAs.length_omit_na(ethnicity)#> [1] 9

str_title_case()

str_title_case() converts string to title case,capitalizing only the first letter of each word while ignoring articles,prepositions, and conjunctions.

Please note thatstr_title_case() is different fromstringr::str_to_title() which converts to title case, whereonly the first letter of each word is capitalized.

words<-"the quick brown fox jumps over a lazy dog"str_title_case(words)# from forstringr package#> [1] "The Quick Brown Fox Jumps over a Lazy Dog"str_to_title(words)# from stringr package#> [1] "The Quick Brown Fox Jumps Over A Lazy Dog"

str_left()

Given a character vector,str_left() returns the leftside of a string. For examples:

str_left("Nigeria")#> [1] "N"str_left("Nigeria",n =3)#> [1] "Nig"str_left(c("Female","Male","Male","Female"))#> [1] "F" "M" "M" "F"

str_right()

Given a character vector,str_right() returns the rightside of a string. For examples:

str_right("July 20, 2022",4)#> [1] "2022"str_right("Sale Price",n =5)#> [1] "Price"

str_mid()

Like in Microsoft Excel, thestr_mid()returns a specificnumber of characters from a text string, starting at the position youspecify, based on the number of characters you select.

str_mid("Super Eagle",7,5)#> [1] "Eagle"str_mid("Oyo Ibadan",5,6)#> [1] "Ibadan"

str_split_extract()

If you want to split up a string into pieces and extract the resultsusing a specific index position, then, you will usestr_split_extract(). You can interpret it as follows:

Given a character string,S, extract the element at agiven position,k, from the result of splittingS by a given pattern,m. For example:

top_10_richest_nig<-c("Aliko Dangote","Mike Adenuga","Femi Otedola","Arthur Eze","Abdulsamad Rabiu","Cletus Ibeto","Orji Uzor Kalu","ABC Orjiakor","Jimoh Ibrahim","Tony Elumelu")first_name<-str_split_extract(top_10_richest_nig," ",1)first_name#>  [1] "Aliko"      "Mike"       "Femi"       "Arthur"     "Abdulsamad"#>  [6] "Cletus"     "Orji"       "ABC"        "Jimoh"      "Tony"

str_extract_part()

first_name<-str_extract_part(top_10_richest_nig,pattern =" ",before =TRUE)first_name#>  [1] "Aliko"      "Mike"       "Femi"       "Arthur"     "Abdulsamad"#>  [6] "Cletus"     "Orji Uzor"  "ABC"        "Jimoh"      "Tony"revenue<-c("$159","$587","$891","$207","$793")str_extract_part(revenue,pattern ="$",before =FALSE)#> [1] "159" "587" "891" "207" "793"

str_englue()

You can dynamically label ggplot2 plots with the glue operators{} or{{}} usingstr_englue().For example, any value wrapped in{ } will be inserted intothe string and you automatically inserts a given variable name using{{ }}.

It is important to note thatstr_englue() must be usedinside a function.str_englue("{{ var }}") defuses theargumentvar and transforms it to a string using thedefault name operation.

library(ggplot2)histogram_plot<-function(df, var, binwidth) { df%>%ggplot(aes(x = {{ var }}))+geom_histogram(binwidth = binwidth)+labs(title =str_englue("A histogram of {{var}} with binwidth {binwidth}"))}iris%>%histogram_plot(Sepal.Length,binwidth =0.1)

str_rm_whitespace_df()

Extra spaces are accidentally entered when working with survey data,and problems can arise when evaluating such data because of extraspaces. Therefore, the functionstr_rm_whitespace_df()eliminates your data frame unnecessary leading, trailing, or otherwhitespaces.

# a dataframe with whitespacesrichest_in_nigeria#> # A tibble: 10 × 5#>     Rank Name                   `Net worth`         Age `Source of Wealth`#>    <dbl> <chr>                  <chr>             <dbl> <chr>#>  1     1 " Aliko Dangote"       "$14 Billion"        64 "  Cement and Sugar "#>  2     2 "Mike Adenuga"         "$7.9  Billion "     68 "Telecommunication,    …#>  3     3 "Femi   Otedola"       "$5.9   Billion"     59 "Oil  and Gas"#>  4     4 " Arthur Eze"          "$5 Billion"         73 "Oil and Gas"#>  5     5 "Abdulsamad     Rabiu" "$3.7 Billion"       61 "Cement   and Sugar"#>  6     6 " Cletus Ibeto "       " $3.5 Billion"      69 "Automobile, Real Estat…#>  7     7 "Orji Uzor Kalu"       "$3.2 Billion"       61 "Furniture,    Publishi…#>  8     8 "ABC Orjiakor "        "  $1.2 Billion"     63 "Oil and Gas"#>  9     9 "  Jimoh Ibrahim"      "$1 Billion "        54 "Insurance, Oil and Gas…#> 10    10 "Tony   Elumelu"       "$900    Million"    58 "  Banking  "

# a dataframe with no whitespacesstr_rm_whitespace_df(richest_in_nigeria)#> # A tibble: 10 × 5#>     Rank Name             `Net worth`    Age `Source of Wealth`#>    <dbl> <chr>            <chr>        <dbl> <chr>#>  1     1 Aliko Dangote    $14 Billion     64 Cement and Sugar#>  2     2 Mike Adenuga     $7.9 Billion    68 Telecommunication, Oil, and Gas#>  3     3 Femi Otedola     $5.9 Billion    59 Oil and Gas#>  4     4 Arthur Eze       $5 Billion      73 Oil and Gas#>  5     5 Abdulsamad Rabiu $3.7 Billion    61 Cement and Sugar#>  6     6 Cletus Ibeto     $3.5 Billion    69 Automobile, Real Estate#>  7     7 Orji Uzor Kalu   $3.2 Billion    61 Furniture, Publishing#>  8     8 ABC Orjiakor     $1.2 Billion    63 Oil and Gas#>  9     9 Jimoh Ibrahim    $1 Billion      54 Insurance, Oil and Gas, Real Estate#> 10    10 Tony Elumelu     $900 Million    58 Banking

Movatterモバイル変換

forstringr

Overview

Installation

Usage

`length_omit_na()`

`str_title_case()`

`str_left()`

`str_right()`

`str_mid()`

`str_split_extract()`

`str_extract_part()`

`str_englue()`

`str_rm_whitespace_df()`