Overview of selection features:
tidyselect implements a DSL for selecting variables. It provides helpers for selecting variables:
var1:var10: variables lying between var1 on the left and var10 on the right.
starts_with("a"): names that start with "a".
ends_with("z"): names that end with "z".
contains("b"): names that contain "b".
matches("x.y"): names that match the regular expression x.y.
num_range("x", 1:4): names following the pattern x1, x2, ..., x4.
all_of(vars)/any_of(vars): names stored in the character vector vars. all_of(vars) will error if any of the variables aren't present; any_of(vars) will match just the variables that exist.
everything(): all variables.
last_col(): furthest column on the right.
where(is.numeric): all variables where is.numeric() returns TRUE.
It also provides operators for combining those selections:
!selection: only variables that don't match selection.
selection1 & selection2: only variables included in both selection1 and selection2.
selection1 | selection2: all variables that match either selection1 or selection2.
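The difference between all_of() and any_of() is easiest to see with a name that does not exist. The following is a minimal sketch, assuming dplyr is installed and using the built-in iris data; the vector wanted and the column name "Petal.Depth" are made up for illustration.

```r
library(dplyr)

# Hypothetical character vector: "Petal.Depth" is not a column of iris.
wanted <- c("Sepal.Length", "Petal.Depth")

# any_of() silently skips names that are not present.
present <- iris %>% select(any_of(wanted)) %>% names()

# all_of() raises an error because "Petal.Depth" does not exist;
# the error is caught here so the script keeps running.
strict <- tryCatch(
  iris %>% select(all_of(wanted)) %>% names(),
  error = function(e) "error"
)

# where() selects columns by a predicate applied to their contents.
numeric_cols <- iris %>% select(where(is.numeric)) %>% names()
```

Here any_of() is the forgiving choice when a column may or may not be present, while all_of() is the safer choice when a missing column should be treated as a bug.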
When writing code inside packages you can substitute "var" for var to avoid R CMD check notes.
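As a rough illustration of the package use case, the sketch below passes column names as strings. The function keep_measurements() is hypothetical and not part of any package; it assumes dplyr is installed and uses the built-in iris data.

```r
library(dplyr)

# Hypothetical package function. The columns are named with strings,
# so the function body contains no bare symbols for R CMD check to
# report as undefined global variables.
keep_measurements <- function(data) {
  dplyr::select(data, "Sepal.Length", "Petal.Length")
}

result_names <- names(keep_measurements(iris))
```

The strings select the same columns as the bare names Sepal.Length and Petal.Length would.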
Simple examples
Here we show the usage for the basic selection operators. See the specific help pages to learn about helpers like starts_with().
The selection language can be used in functions like dplyr::select() or tidyr::pivot_longer(). Let's first attach the tidyverse:
library(tidyverse)
iris <- as_tibble(iris)  # for better printing
Select variables by name:
starwars %>% select(height)
#> # A tibble: 87 x 1
#>   height
#>    <int>
#> 1    172
#> 2    167
#> 3     96
#> 4    202
#> # i 83 more rows

iris %>% pivot_longer(Sepal.Length)
#> # A tibble: 150 x 6
#>   Sepal.Width Petal.Length Petal.Width Species name         value
#>         <dbl>        <dbl>       <dbl> <fct>   <chr>        <dbl>
#> 1         3.5          1.4         0.2 setosa  Sepal.Length   5.1
#> 2         3            1.4         0.2 setosa  Sepal.Length   4.9
#> 3         3.2          1.3         0.2 setosa  Sepal.Length   4.7
#> 4         3.1          1.5         0.2 setosa  Sepal.Length   4.6
#> # i 146 more rows

Select multiple variables by separating them with commas. Note how the order of columns is determined by the order of inputs:
starwars %>% select(homeworld, height, mass)
#> # A tibble: 87 x 3
#>   homeworld height  mass
#>   <chr>      <int> <dbl>
#> 1 Tatooine     172    77
#> 2 Tatooine     167    75
#> 3 Naboo         96    32
#> 4 Tatooine     202   136
#> # i 83 more rows

Functions like tidyr::pivot_longer() don't take variables with dots. In this case use c() to select multiple variables:
iris %>% pivot_longer(c(Sepal.Length, Petal.Length))
#> # A tibble: 300 x 5
#>   Sepal.Width Petal.Width Species name         value
#>         <dbl>       <dbl> <fct>   <chr>        <dbl>
#> 1         3.5         0.2 setosa  Sepal.Length   5.1
#> 2         3.5         0.2 setosa  Petal.Length   1.4
#> 3         3           0.2 setosa  Sepal.Length   4.9
#> 4         3           0.2 setosa  Petal.Length   1.4
#> # i 296 more rows

Operators:
The : operator selects a range of consecutive variables:
starwars %>% select(name:mass)
#> # A tibble: 87 x 3
#>   name           height  mass
#>   <chr>           <int> <dbl>
#> 1 Luke Skywalker    172    77
#> 2 C-3PO             167    75
#> 3 R2-D2              96    32
#> 4 Darth Vader       202   136
#> # i 83 more rows

The ! operator negates a selection:
starwars %>% select(!(name:mass))
#> # A tibble: 87 x 11
#>   hair_color skin_color  eye_color birth_year sex   gender    homeworld species
#>   <chr>      <chr>       <chr>          <dbl> <chr> <chr>     <chr>     <chr>
#> 1 blond      fair        blue            19   male  masculine Tatooine  Human
#> 2 <NA>       gold        yellow         112   none  masculine Tatooine  Droid
#> 3 <NA>       white, blue red             33   none  masculine Naboo     Droid
#> 4 none       white       yellow          41.9 male  masculine Tatooine  Human
#> # i 83 more rows
#> # i 3 more variables: films <list>, vehicles <list>, starships <list>

iris %>% select(!c(Sepal.Length, Petal.Length))
#> # A tibble: 150 x 3
#>   Sepal.Width Petal.Width Species
#>         <dbl>       <dbl> <fct>
#> 1         3.5         0.2 setosa
#> 2         3           0.2 setosa
#> 3         3.2         0.2 setosa
#> 4         3.1         0.2 setosa
#> # i 146 more rows

iris %>% select(!ends_with("Width"))
#> # A tibble: 150 x 3
#>   Sepal.Length Petal.Length Species
#>          <dbl>        <dbl> <fct>
#> 1          5.1          1.4 setosa
#> 2          4.9          1.4 setosa
#> 3          4.7          1.3 setosa
#> 4          4.6          1.5 setosa
#> # i 146 more rows

& and | take the intersection or the union of two selections:
iris %>% select(starts_with("Petal") & ends_with("Width"))
#> # A tibble: 150 x 1
#>   Petal.Width
#>         <dbl>
#> 1         0.2
#> 2         0.2
#> 3         0.2
#> 4         0.2
#> # i 146 more rows

iris %>% select(starts_with("Petal") | ends_with("Width"))
#> # A tibble: 150 x 3
#>   Petal.Length Petal.Width Sepal.Width
#>          <dbl>       <dbl>       <dbl>
#> 1          1.4         0.2         3.5
#> 2          1.4         0.2         3
#> 3          1.3         0.2         3.2
#> 4          1.5         0.2         3.1
#> # i 146 more rows

To take the difference between two selections, combine the & and ! operators:
iris %>% select(starts_with("Petal") & !ends_with("Width"))
#> # A tibble: 150 x 1
#>   Petal.Length
#>          <dbl>
#> 1          1.4
#> 2          1.4
#> 3          1.3
#> 4          1.5
#> # i 146 more rows