| Title: | A Grammar of Nested Data Manipulation |
| Version: | 0.3.0 |
| Author: | Mark Rieke [aut], Bolívar Aponte Rolón |
| Maintainer: | Bolívar Aponte Rolón <bolaponte@pm.me> |
| Description: | Provides functions for manipulating nested data frames in a list-column using 'dplyr' <https://dplyr.tidyverse.org/> syntax. Rather than unnesting, then manipulating a data frame, 'nplyr' allows users to manipulate each nested data frame directly. 'nplyr' is a wrapper for 'dplyr' functions that provide tools for common data manipulation steps: filtering rows, selecting columns, summarising grouped data, among others. |
| License: | MIT + file LICENSE |
| URL: | https://github.com/jibarozzo/nplyr, https://jibarozzo.github.io/nplyr/ |
| BugReports: | https://github.com/jibarozzo/nplyr/issues |
| Depends: | R (≥ 3.5.0) |
| Imports: | assertthat, dplyr, magrittr, purrr, rlang, tidyr |
| Encoding: | UTF-8 |
| RoxygenNote: | 7.3.2 |
| Suggests: | gapminder, knitr, readr, rmarkdown, stringr, testthat (≥3.0.0), tibble |
| Config/testthat/edition: | 3 |
| VignetteBuilder: | knitr |
| LazyData: | true |
| NeedsCompilation: | no |
| Packaged: | 2025-05-28 22:15:12 UTC; baponte |
| Repository: | CRAN |
| Date/Publication: | 2025-05-29 14:50:02 UTC |
Pipe operator
Description
See magrittr::%>% for details.
Usage
lhs %>% rhs
Arguments
lhs | A value or the magrittr placeholder. |
rhs | A function call using the magrittr semantics. |
Value
The result of calling rhs(lhs).
Example survey data regarding job satisfaction
Description
A toy dataset containing 500 responses to a job satisfaction survey. The responses were randomly generated using the Qualtrics survey platform.
Usage
job_survey
Format
A data frame with 500 rows and 6 variables:
- survey_name
name of survey
- Q1
respondent age
- Q2
city the respondent resides in
- Q3
field that the respondent works in
- Q4
respondent's job satisfaction (on a scale from extremely satisfied to extremely dissatisfied)
- Q5
respondent's annual salary, in thousands of dollars
Nested filtering joins
Description
Nested filtering joins filter rows from .nest_data based on the presence or absence of matches in y:
nest_semi_join() returns all rows from .nest_data with a match in y.
nest_anti_join() returns all rows from .nest_data without a match in y.
Usage
nest_semi_join(.data, .nest_data, y, by = NULL, copy = FALSE, ...)
nest_anti_join(.data, .nest_data, y, by = NULL, copy = FALSE, ...)
Arguments
.data | A data frame, data frame extension (e.g., a tibble), or a lazy data frame (e.g., from dbplyr or dtplyr). |
.nest_data | A list-column containing data frames |
y | A data frame, data frame extension (e.g., a tibble), or a lazy data frame (e.g., from dbplyr or dtplyr). |
by | A character vector of variables to join by or a join specification created with If To join on different variables between the objects in To join by multiple variables, use a vector with length >1. For example, To perform a cross-join, generating all combinations of each object in |
copy | If |
... | One or more unquoted expressions separated by commas. Variable names can be used if they were positions in the data frame, so expressions like |
Details
nest_semi_join() and nest_anti_join() are largely wrappers for dplyr::semi_join() and dplyr::anti_join() and maintain the functionality of semi_join() and anti_join() within each nested data frame. For more information on semi_join() or anti_join(), please refer to the documentation in dplyr.
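As a minimal sketch of the two filtering joins, the block below uses a small made-up data set instead of gapminder. The objects toy, toy_nest, and lookup and the list-column name nested are hypothetical, chosen only for illustration. The code assumes nplyr, dplyr, and tidyr are installed.

```r
library(nplyr)

# Hypothetical toy data (not part of the package): two groups sharing a key column.
toy <- dplyr::tibble(
  grp = c("a", "a", "b", "b"),
  key = c("k1", "k2", "k1", "k3"),
  val = 1:4
)
toy_nest <- tidyr::nest(toy, nested = -grp)

# Hypothetical lookup table to join against.
lookup <- dplyr::tibble(key = c("k1", "k2"))

# Keep the rows of each nested data frame that have a match in `lookup`.
semi <- nest_semi_join(toy_nest, nested, lookup, by = "key")

# Keep the rows of each nested data frame that have no match in `lookup`.
anti <- nest_anti_join(toy_nest, nested, lookup, by = "key")

# Rows remaining in each nested data frame (groups "a" and "b", in that order).
semi_rows <- vapply(semi$nested, nrow, integer(1))
anti_rows <- vapply(anti$nested, nrow, integer(1))
```

Only the nested data frames change; the outer data frame keeps one row per group in both results.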
Value
An object of the same type as .data. Each object in the column .nest_data will also be of the same type as the input. Each object in .nest_data has the following properties:
Rows are a subset of the input, but appear in the same order.
Columns are not modified.
Data frame attributes are preserved.
Groups are taken from .nest_data. The number of groups may be reduced.
See Also
Other joins: nest-mutate-joins, nest_nest_join()
Examples
gm_nest <- gapminder::gapminder %>% tidyr::nest(country_data = -continent)
gm_codes <- gapminder::country_codes %>% dplyr::slice_sample(n = 10)
gm_nest %>% nest_semi_join(country_data, gm_codes, by = "country")
gm_nest %>% nest_anti_join(country_data, gm_codes, by = "country")
Nested mutating joins
Description
Nested mutating joins add columns from y to each of the nested data frames in .nest_data, matching observations based on the keys. There are four nested mutating joins:
Inner join
nest_inner_join() only keeps observations from .nest_data that have a matching key in y.
The most important property of an inner join is that unmatched rows in either input are not included in the result.
Outer joins
There are three outer joins that keep observations that appear in at least one of the data frames:
nest_left_join() keeps all observations in .nest_data.
nest_right_join() keeps all observations in y.
nest_full_join() keeps all observations in .nest_data and y.
Usage
nest_inner_join(.data, .nest_data, y, by = NULL, copy = FALSE, suffix = c(".x", ".y"), ..., keep = FALSE)
nest_left_join(.data, .nest_data, y, by = NULL, copy = FALSE, suffix = c(".x", ".y"), ..., keep = FALSE)
nest_right_join(.data, .nest_data, y, by = NULL, copy = FALSE, suffix = c(".x", ".y"), ..., keep = FALSE)
nest_full_join(.data, .nest_data, y, by = NULL, copy = FALSE, suffix = c(".x", ".y"), ..., keep = FALSE)
Arguments
.data | A data frame, data frame extension (e.g., a tibble), or a lazy data frame (e.g., from dbplyr or dtplyr). |
.nest_data | A list-column containing data frames |
y | A data frame, data frame extension (e.g., a tibble), or a lazy data frame (e.g., from dbplyr or dtplyr). |
by | A character vector of variables to join by or a join specification created with If To join on different variables between the objects in To join by multiple variables, use a vector with length >1. For example, To perform a cross-join, generating all combinations of each object in |
copy | If |
suffix | If there are non-joined duplicate variables in |
... | Other parameters passed onto methods. Includes:
|
keep | Should the join keys from both |
Details
nest_inner_join(), nest_left_join(), nest_right_join(), and nest_full_join() are largely wrappers for dplyr::inner_join(), dplyr::left_join(), dplyr::right_join(), and dplyr::full_join() and maintain the functionality of these verbs within each nested data frame. For more information on inner_join(), left_join(), right_join(), or full_join(), please refer to the documentation in dplyr.
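The difference between the four joins is easiest to see from the number of rows each one leaves in every nested data frame. The sketch below uses hypothetical toy data (toy, toy_nest, codes, and the list-column nested are made-up names) and assumes nplyr, dplyr, and tidyr are installed.

```r
library(nplyr)

# Hypothetical toy data (not part of the package).
toy <- dplyr::tibble(
  grp = c("a", "a", "b", "b"),
  key = c("k1", "k2", "k1", "k3"),
  val = 1:4
)
toy_nest <- tidyr::nest(toy, nested = -grp)

# Hypothetical table whose `label` column is added by the joins.
codes <- dplyr::tibble(key = c("k1", "k2"), label = c("one", "two"))

joins <- list(
  inner = nest_inner_join(toy_nest, nested, codes, by = "key"),
  left  = nest_left_join(toy_nest, nested, codes, by = "key"),
  right = nest_right_join(toy_nest, nested, codes, by = "key"),
  full  = nest_full_join(toy_nest, nested, codes, by = "key")
)

# Rows per nested data frame (groups "a" and "b") for each kind of join.
rows_per_join <- lapply(joins, function(res) vapply(res$nested, nrow, integer(1)))

# Columns of a nested data frame after a join: the original ones plus `label`.
left_cols <- names(joins$left$nested[[1]])
```

Group "b" contains the key "k3", which is absent from codes, and lacks "k2", so it is the group whose row count differs between the inner, left, right, and full joins.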
Value
An object of the same type as .data. Each object in the column .nest_data will also be of the same type as the input. The order of the rows and columns of each object in .nest_data is preserved as much as possible. Each object in .nest_data has the following properties:
For nest_inner_join(), a subset of rows in each object in .nest_data.
For nest_left_join(), all rows in each object in .nest_data.
For nest_right_join(), a subset of rows in each object in .nest_data, followed by unmatched y rows.
For nest_full_join(), all rows in each object in .nest_data, followed by unmatched y rows.
Output columns include all columns from each .nest_data and all non-key columns from y. If keep = TRUE, the key columns from y are included as well.
If non-key columns in any object in .nest_data and y have the same name, suffixes are added to disambiguate. If keep = TRUE and key columns in .nest_data and y have the same name, suffixes are added to disambiguate these as well.
If keep = FALSE, output columns included in by are coerced to their common type between the objects in .nest_data and y.
Groups are taken from .nest_data.
See Also
Other joins: nest-filter-joins, nest_nest_join()
Examples
gm_nest <- gapminder::gapminder %>% tidyr::nest(country_data = -continent)
gm_codes <- gapminder::country_codes
gm_nest %>% nest_inner_join(country_data, gm_codes, by = "country")
gm_nest %>% nest_left_join(country_data, gm_codes, by = "country")
gm_nest %>% nest_right_join(country_data, gm_codes, by = "country")
gm_nest %>% nest_full_join(country_data, gm_codes, by = "country")
Arrange rows within nested data frames by column values
Description
nest_arrange() orders the rows of nested data frames by the values of selected columns.
Usage
nest_arrange(.data, .nest_data, ..., .by_group = FALSE)
Arguments
.data | A data frame, data frame extension (e.g., a tibble), or a lazy data frame (e.g., from dbplyr or dtplyr). |
.nest_data | A list-column containing data frames |
... | Variables, or functions of variables. Use |
.by_group | If |
Details
nest_arrange() is largely a wrapper for dplyr::arrange() and maintains the functionality of arrange() within each nested data frame. For more information on arrange(), please refer to the documentation in dplyr.
Value
An object of the same type as .data. Each object in the column .nest_data will also be of the same type as the input. Each object in .nest_data has the following properties:
All rows appear in the output, but (usually) in a different place.
Columns are not modified.
Groups are not modified.
Data frame attributes are preserved.
See Also
Other single table verbs: nest_filter(), nest_mutate(), nest_rename(), nest_select(), nest_slice(), nest_summarise()
Examples
gm_nest <- gapminder::gapminder %>% tidyr::nest(country_data = -continent)
gm_nest %>% nest_arrange(country_data, pop)
gm_nest %>% nest_arrange(country_data, desc(pop))
Count observations in a nested data frame by group
Description
nest_count() lets you quickly count the unique values of one or more variables within each nested data frame. nest_count() results in a summary with one row for each set of variables to count by. nest_add_count() is equivalent, except that it retains all rows and adds a new column with group-wise counts.
Usage
nest_count(.data, .nest_data, ..., wt = NULL, sort = FALSE, name = NULL)
nest_add_count(.data, .nest_data, ..., wt = NULL, sort = FALSE, name = NULL)
Arguments
.data | A data frame, data frame extension (e.g., a tibble), or a lazy data frame (e.g., from dbplyr or dtplyr). |
.nest_data | A list-column containing data frames |
... | Variables to group by. |
wt | Frequency weights. Can be
|
sort | If |
name | The name of the new column in the output. |
Details
nest_count() and nest_add_count() are largely wrappers for dplyr::count() and dplyr::add_count() and maintain the functionality of count() and add_count() within each nested data frame. For more information on count() and add_count(), please refer to the documentation in dplyr.
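To contrast the summarising behaviour of nest_count() with the row-preserving behaviour of nest_add_count(), here is a small sketch on hypothetical toy data (toy, toy_nest, the list-column nested, and the weight column wgt are made-up names). It assumes nplyr, dplyr, and tidyr are installed.

```r
library(nplyr)

# Hypothetical toy data (not part of the package).
toy <- dplyr::tibble(
  grp = c("a", "a", "a", "b"),
  key = c("k1", "k1", "k2", "k3"),
  wgt = c(1, 2, 5, 4)
)
toy_nest <- tidyr::nest(toy, nested = -grp)

# One row per distinct `key` within each nested data frame.
counted <- nest_count(toy_nest, nested, key)

# All rows kept; a count column `n` is appended to each nested data frame.
added <- nest_add_count(toy_nest, nested, key)

# With `wt`, `n` holds the sum of `wgt` per `key` instead of a row count.
weighted <- nest_count(toy_nest, nested, key, wt = wgt)

counted$nested[[1]]
added$nested[[1]]
weighted$nested[[1]]
```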
Value
An object of the same type as .data. Each object in the column .nest_data will also be of the same type as the input. nest_count() and nest_add_count() group each object in .nest_data transiently, so the output returned in .nest_data will have the same groups as the input.
Examples
gm_nest <- gapminder::gapminder %>% tidyr::nest(country_data = -continent)
# count the number of times each country appears in each nested tibble
gm_nest %>% nest_count(country_data, country)
gm_nest %>% nest_add_count(country_data, country)
# count the sum of population for each country in each nested tibble
gm_nest %>% nest_count(country_data, country, wt = pop)
gm_nest %>% nest_add_count(country_data, country, wt = pop)
Subset distinct/unique rows within a nested data frame
Description
nest_distinct() selects only unique/distinct rows in a nested data frame.
Usage
nest_distinct(.data, .nest_data, ..., .keep_all = FALSE)
Arguments
.data | A data frame, data frame extension (e.g., a tibble), or a lazy data frame (e.g., from dbplyr or dtplyr). |
.nest_data | A list-column containing data frames |
... | Optional variables to use when determining uniqueness. If there are multiple rows for a given combination of inputs, only the first row will be preserved. If omitted, will use all variables. |
.keep_all | If |
Details
nest_distinct() is largely a wrapper for dplyr::distinct() and maintains the functionality of distinct() within each nested data frame. For more information on distinct(), please refer to the documentation in dplyr.
Value
An object of the same type as .data. Each object in the column .nest_data will also be of the same type as the input. Each object in .nest_data has the following properties:
Rows are a subset of the input but appear in the same order.
Columns are not modified if ... is empty or .keep_all is TRUE. Otherwise, nest_distinct() first calls dplyr::mutate() to create new columns within each object in .nest_data.
Groups are not modified.
Data frame attributes are preserved.
Examples
gm_nest <- gapminder::gapminder %>% tidyr::nest(country_data = -continent)
gm_nest %>% nest_distinct(country_data, country)
gm_nest %>% nest_distinct(country_data, country, year)
Drop rows containing missing values in a column of nested data frames
Description
nest_drop_na() is used to drop rows containing missing values from each data frame in a column of nested data frames.
Usage
nest_drop_na(.data, .nest_data, ...)
Arguments
.data | A data frame, data frame extension (e.g., a tibble), or a lazy data frame (e.g., from dbplyr or dtplyr). |
.nest_data | A list-column containing data frames |
... | Columns within |
Details
nest_drop_na() is a wrapper for tidyr::drop_na() and maintains the functionality of drop_na() within each nested data frame. For more information on drop_na(), please refer to the documentation in 'tidyr'.
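The following sketch shows that only the named column is inspected for missing values. The data (toy, toy_nest, the list-column nested, and the columns x and y) are hypothetical, and the code assumes nplyr, dplyr, and tidyr are installed.

```r
library(nplyr)

# Hypothetical toy data (not part of the package) with NAs in two columns.
toy <- dplyr::tibble(
  grp = c("a", "a", "b", "b"),
  x   = c(1, NA, 3, 4),
  y   = c("u", "v", NA, "w")
)
toy_nest <- tidyr::nest(toy, nested = -grp)

# Drop rows where `x` is NA; NAs in `y` are left alone.
dropped <- nest_drop_na(toy_nest, nested, x)

rows_left <- vapply(dropped$nested, nrow, integer(1))
dropped$nested[[2]]
```

Group "a" loses the row with a missing x, while group "b" keeps both rows even though one of them has a missing y.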
Value
An object of the same type as .data. Each object in the column .nest_data will have rows dropped according to the presence of NAs.
See Also
Other tidyr verbs: nest_extract(), nest_fill(), nest_replace_na(), nest_separate(), nest_unite()
Examples
gm <- gapminder::gapminder
# randomly insert NAs into the data frame & nest
set.seed(123)
gm <- gm %>% dplyr::mutate(pop = dplyr::if_else(runif(nrow(gm)) >= 0.9, NA_integer_, pop))
gm_nest <- gm %>% tidyr::nest(country_data = -continent)
# drop rows where an NA exists in column `pop`
gm_nest %>% nest_drop_na(country_data, pop)
Extract a character column into multiple columns using regex groups in a column of nested data frames
Description
nest_extract() is used to extract capturing groups from a column in a nested data frame into new columns using regular expressions. If the groups don't match, or the input is NA, the output will be NA.
Usage
nest_extract(.data, .nest_data, col, into, regex = "([[:alnum:]]+)", remove = TRUE, convert = FALSE, ...)
Arguments
.data | A data frame, data frame extension (e.g., a tibble), or a lazy data frame (e.g., from dbplyr or dtplyr). |
.nest_data | A list-column containing data frames |
col | Column name or position within This argument is passed by expression and supports quasiquotation (you can unquote column names or column positions). |
into | Names of new variables to create as character vector. Use |
regex | A string representing a regular expression used to extract the desired values. There should be one group (defined by |
remove | If |
convert | If NB: this will cause string |
... | Additional arguments passed on to |
Details
nest_extract() is a wrapper for tidyr::extract() and maintains the functionality of extract() within each nested data frame. For more information on extract(), please refer to the documentation in 'tidyr'.
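A short sketch of the NA behaviour described above, using hypothetical toy data (toy, toy_nest, the list-column nested, and the column comb are made-up for illustration) and assuming nplyr, dplyr, and tidyr are installed:

```r
library(nplyr)

# Hypothetical toy data (not part of the package): one value is NA and
# one ("cd") does not match the two-group pattern used below.
toy <- dplyr::tibble(
  grp  = c("a", "a", "b", "b"),
  comb = c("a-b", NA, "cd", "e-f")
)
toy_nest <- tidyr::nest(toy, nested = -grp)

# Two capture groups separated by a hyphen, one new column per group.
extracted <- nest_extract(
  toy_nest,
  nested,
  col = comb,
  into = c("var1", "var2"),
  regex = "([[:alnum:]]+)-([[:alnum:]]+)"
)

extracted$nested[[1]]
extracted$nested[[2]]
```

With the default remove = TRUE, the source column comb is dropped, leaving only var1 and var2 in each nested data frame; rows whose input is NA or does not match the pattern hold NA in both.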
Value
An object of the same type as .data. Each object in the column .nest_data will have new columns created according to the capture groups specified in the regular expression.
See Also
Other tidyr verbs: nest_drop_na(), nest_fill(), nest_replace_na(), nest_separate(), nest_unite()
Examples
set.seed(123)
gm <- gapminder::gapminder
gm <- gm %>% dplyr::mutate(comb = sample(c(NA, "a-b", "a-d", "b-c", "d-e"), size = nrow(gm), replace = TRUE))
gm_nest <- gm %>% tidyr::nest(country_data = -continent)
gm_nest %>% nest_extract(country_data, col = comb, into = c("var1","var2"), regex = "([[:alnum:]]+)-([[:alnum:]]+)")
Fill missing values in a column of nested data frames
Description
nest_fill() is used to fill missing values in selected columns of nested data frames using the next or previous entry.
Usage
nest_fill(.data, .nest_data, ..., .direction = c("down", "up", "downup", "updown"))
Arguments
.data | A data frame, data frame extension (e.g., a tibble), or a lazy data frame (e.g., from dbplyr or dtplyr). |
.nest_data | A list-column containing data frames |
... |
|
.direction | Direction in which to fill missing values. Currently either "down" (the default), "up", "downup" (i.e. first down and then up) or "updown" (first up and then down). |
Details
nest_fill() is a wrapper for tidyr::fill() and maintains the functionality of fill() within each nested data frame. For more information on fill(), please refer to the documentation in 'tidyr'.
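Because filling happens inside each nested data frame, a value is never carried across from one nested data frame to the next. The sketch below illustrates this with hypothetical toy data (toy, toy_nest, and the list-column nested are made-up names); it assumes nplyr, dplyr, and tidyr are installed.

```r
library(nplyr)

# Hypothetical toy data (not part of the package). Group "b" starts with an NA.
toy <- dplyr::tibble(
  grp = c("a", "a", "a", "b", "b"),
  val = c(1, NA, NA, NA, 5)
)
toy_nest <- tidyr::nest(toy, nested = -grp)

# Fill downwards: each NA takes the previous non-missing value in its own group.
filled <- nest_fill(toy_nest, nested, val, .direction = "down")

filled$nested[[1]]$val
filled$nested[[2]]$val
```

The leading NA in group "b" stays missing under "down" because there is no earlier value inside that nested data frame to copy from.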
Value
An object of the same type as .data. Each object in the column .nest_data will have the chosen columns filled in the direction specified by .direction.
See Also
Other tidyr verbs: nest_drop_na(), nest_extract(), nest_replace_na(), nest_separate(), nest_unite()
Examples
set.seed(123)
gm <- gapminder::gapminder %>% dplyr::mutate(pop = dplyr::if_else(runif(dplyr::n()) >= 0.9, NA_integer_, pop))
gm_nest <- gm %>% tidyr::nest(country_data = -continent)
gm_nest %>% nest_fill(country_data, pop, .direction = "down")
Subset rows in nested data frames using column values
Description
nest_filter() is used to subset nested data frames, retaining all rows that satisfy your conditions. To be retained, the row must produce a value of TRUE for all conditions. Note that when a condition evaluates to NA the row will be dropped, unlike base subsetting with [.
nest_filter() subsets the rows within .nest_data, applying the expressions in ... to the column values to determine which rows should be retained. It can be applied to both grouped and ungrouped data.
Usage
nest_filter(.data, .nest_data, ..., .preserve = FALSE)
Arguments
.data | A data frame, data frame extension (e.g., a tibble), or a lazy data frame (e.g., from dbplyr or dtplyr). |
.nest_data | A list-column containing data frames |
... | Expressions that return a logical value, and are defined in terms of the variables in |
.preserve | Relevant when |
Details
nest_filter() is largely a wrapper for dplyr::filter() and maintains the functionality of filter() within each nested data frame. For more information on filter(), please refer to the documentation in dplyr.
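The wrapper relationship can be sketched by hand: mapping dplyr::filter() over the list-column should give roughly what nest_filter() returns. This is an illustration under that assumption, not a statement about nplyr's actual source. The data (toy, toy_nest, the list-column nested) are hypothetical, and the code assumes nplyr, dplyr, tidyr, and purrr are installed.

```r
library(nplyr)

# Hypothetical toy data (not part of the package); note the NA in group "a".
toy <- dplyr::tibble(
  grp = c("a", "a", "a", "b", "b"),
  val = c(10, NA, 30, 5, 50)
)
toy_nest <- tidyr::nest(toy, nested = -grp)

# Filter inside each nested data frame with nplyr.
kept <- nest_filter(toy_nest, nested, val > 8)

# Hand-written approximation (an assumption about how the wrapper behaves):
# apply dplyr::filter() to every nested data frame.
manual <- dplyr::mutate(
  toy_nest,
  nested = purrr::map(nested, function(d) dplyr::filter(d, val > 8))
)

# The row where `val` is NA is dropped, because `NA > 8` is NA rather than TRUE.
kept$nested[[1]]$val
```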
Value
An object of the same type as .data. Each object in the column .nest_data will also be of the same type as the input. Each object in .nest_data has the following properties:
Rows are a subset of the input, but appear in the same order.
Columns are not modified.
The number of groups may be reduced (if .preserve is not TRUE).
Data frame attributes are preserved.
See Also
Other single table verbs: nest_arrange(), nest_mutate(), nest_rename(), nest_select(), nest_slice(), nest_summarise()
Examples
gm_nest <- gapminder::gapminder %>% tidyr::nest(country_data = -continent)
# apply a filter
gm_nest %>% nest_filter(country_data, year > 1972)
# apply multiple filters
gm_nest %>% nest_filter(country_data, year > 1972, pop < 10000000)
# apply a filter on grouped data
gm_nest %>% nest_group_by(country_data, country) %>% nest_filter(country_data, pop > mean(pop))
Group nested data frames by one or more variables
Description
nest_group_by() takes a set of nested tbls and converts it to a set of nested grouped tbls where operations are performed "by group". nest_ungroup() removes grouping.
Usage
nest_group_by(.data, .nest_data, ..., .add = FALSE, .drop = TRUE)
nest_ungroup(.data, .nest_data, ...)
Arguments
.data | A data frame, data frame extension (e.g., a tibble), or a lazy data frame (e.g., from dbplyr or dtplyr). |
.nest_data | A list-column containing data frames |
... | In |
.add | When |
.drop | Drop groups formed by factor levels that don't appear in the data? The default is |
Details
nest_group_by() and nest_ungroup() are largely wrappers for dplyr::group_by() and dplyr::ungroup() and maintain the functionality of group_by() and ungroup() within each nested data frame. For more information on group_by() or ungroup(), please refer to the documentation in dplyr.
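A small sketch of how grouping changes the class of each nested data frame and the result of a later summary. The data (toy, toy_nest, the list-column nested) are hypothetical, and the code assumes nplyr, dplyr, and tidyr are installed.

```r
library(nplyr)

# Hypothetical toy data (not part of the package).
toy <- dplyr::tibble(
  grp = c("a", "a", "a", "b"),
  key = c("k1", "k1", "k2", "k3"),
  val = c(1, 3, 10, 4)
)
toy_nest <- tidyr::nest(toy, nested = -grp)

# Group every nested data frame by `key`.
grouped <- nest_group_by(toy_nest, nested, key)
is_grouped <- inherits(grouped$nested[[1]], "grouped_df")

# Later verbs now work per group inside each nested data frame.
summarised <- nest_summarise(grouped, nested, mean_val = mean(val))

# Remove the grouping again.
ungrouped <- nest_ungroup(grouped, nested)
still_grouped <- inherits(ungrouped$nested[[1]], "grouped_df")

summarised$nested[[1]]
```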
Value
An object of the same type as .data. Each object in the column .nest_data will be returned as a grouped data frame with class grouped_df, unless the combination of ... and .add yields an empty set of grouping columns, in which case a tibble will be returned.
Examples
gm_nest <- gapminder::gapminder %>% tidyr::nest(country_data = -continent)
# grouping doesn't change .nest_data, just .nest_data class:
gm_nest_grouped <- gm_nest %>% nest_group_by(country_data, year)
gm_nest_grouped
# It changes how it acts with other nplyr verbs:
gm_nest_grouped %>%
  nest_summarise(
    country_data,
    lifeExp = mean(lifeExp),
    pop = mean(pop),
    gdpPercap = mean(gdpPercap)
  )
# ungrouping removes variable groups:
gm_nest_grouped %>% nest_ungroup(country_data)
Create, modify, and delete columns in nested data frames
Description
nest_mutate() adds new variables to and preserves existing ones within the nested data frames in .nest_data. nest_transmute() adds new variables to and drops existing ones from the nested data frames in .nest_data.
Usage
nest_mutate(.data, .nest_data, ...)
nest_transmute(.data, .nest_data, ...)
Arguments
.data | A data frame, data frame extension (e.g., a tibble), or a lazy data frame (e.g., from dbplyr or dtplyr). |
.nest_data | A list-column containing data frames |
... | Name-value pairs. The name gives the name of the column in the output. The value can be:
|
Details
nest_mutate() and nest_transmute() are largely wrappers for dplyr::mutate() and dplyr::transmute() and maintain the functionality of mutate() and transmute() within each nested data frame. For more information on mutate() or transmute(), please refer to the documentation in dplyr.
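Since each expression is evaluated inside a nested data frame, summary functions such as mean() see only that data frame's rows. The sketch below shows this and the difference between the two verbs; the data (toy, toy_nest, the list-column nested) and the new column name centered are hypothetical. It assumes nplyr, dplyr, and tidyr are installed.

```r
library(nplyr)

# Hypothetical toy data (not part of the package).
toy <- dplyr::tibble(
  grp = c("a", "a", "b", "b"),
  val = c(1, 3, 4, 6)
)
toy_nest <- tidyr::nest(toy, nested = -grp)

# mean(val) is computed separately within each nested data frame.
mutated <- nest_mutate(toy_nest, nested, centered = val - mean(val))

# Same computation, but only the newly created column is kept.
transmuted <- nest_transmute(toy_nest, nested, centered = val - mean(val))

names(mutated$nested[[1]])
names(transmuted$nested[[1]])
```

Both groups end up with the centered values -1 and 1, which would not be the case if the mean had been taken over the whole unnested column.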
Value
An object of the same type as .data. Each object in the column .nest_data will also be of the same type as the input. Each object in .nest_data has the following properties:
For nest_mutate():
Columns from each object in .nest_data will be preserved according to the .keep argument.
Existing columns that are modified by ... will always be returned in their original location.
New columns created through ... will be placed according to the .before and .after arguments.
For nest_transmute():
Columns created or modified through ... will be returned in the order specified by ....
Unmodified grouping columns will be placed at the front.
The number of rows is not affected.
Columns given the value NULL will be removed.
Groups will be recomputed if a grouping variable is mutated.
Data frame attributes will be preserved.
See Also
Other single table verbs: nest_arrange(), nest_filter(), nest_rename(), nest_select(), nest_slice(), nest_summarise()
Examples
gm_nest <- gapminder::gapminder %>% tidyr::nest(country_data = -continent)
# add or modify columns:
gm_nest %>%
  nest_mutate(
    country_data,
    lifeExp = NULL,
    gdp = gdpPercap * pop,
    pop = pop/1000000
  )
# use dplyr::across() to apply transformation to multiple columns
gm_nest %>%
  nest_mutate(
    country_data,
    dplyr::across(c(lifeExp:gdpPercap), mean)
  )
# nest_transmute() drops unused columns when mutating:
gm_nest %>%
  nest_transmute(
    country_data,
    country = country,
    year = year,
    pop = pop/1000000
  )
Nested nest join
Description
nest_nest_join() returns all rows and columns in .nest_data with a new nested-df column that contains all matches from y. When there is no match, the list contains a 0-row tibble.
Usage
nest_nest_join(.data, .nest_data, y, by = NULL, copy = FALSE, keep = FALSE, name = NULL, ...)
Arguments
.data | A data frame, data frame extension (e.g., a tibble), or a lazy data frame (e.g., from dbplyr or dtplyr). |
.nest_data | A list-column containing data frames |
y | A data frame, data frame extension (e.g., a tibble), or a lazy data frame (e.g., from dbplyr or dtplyr). |
by | A character vector of variables to join by or a join specification created with If To join on different variables between the objects in To join by multiple variables, use a vector with length >1. For example, To perform a cross-join, generating all combinations of each object in |
copy | If |
keep | Should the join keys from both |
name | The name of the list column nesting joins create. If |
... | One or more unquoted expressions separated by commas. Variable names can be used if they were positions in the data frame, so expressions like |
Details
nest_nest_join() is largely a wrapper around dplyr::nest_join() and maintains the functionality of nest_join() within each nested data frame. For more information on nest_join(), please refer to the documentation in dplyr.
Value
An object of the same type as .data. Each object in the column .nest_data will also be of the same type as the input.
See Also
Other joins: nest-filter-joins, nest-mutate-joins
Examples
gm_nest <- gapminder::gapminder %>% tidyr::nest(country_data = -continent)
gm_codes <- gapminder::country_codes
gm_nest %>% nest_nest_join(country_data, gm_codes, by = "country")
Change column order within a nested data frame
Description
nest_relocate() changes column positions within a nested data frame, using the same syntax as nest_select() or dplyr::select() to make it easy to move blocks of columns at once.
Usage
nest_relocate(.data, .nest_data, ..., .before = NULL, .after = NULL)
Arguments
.data | A data frame, data frame extension (e.g., a tibble), or a lazy data frame (e.g., from dbplyr or dtplyr). |
.nest_data | A list-column containing data frames |
... | Columns to move. |
.before, .after | Destination of columns selected by |
Details
nest_relocate() is largely a wrapper for dplyr::relocate() and maintains the functionality of relocate() within each nested data frame. For more information on relocate(), please refer to the documentation in dplyr.
Value
An object of the same type as .data. Each object in the column .nest_data will also be of the same type as the input. Each object in .nest_data has the following properties:
Rows are not affected.
The same columns appear in the output, but (usually) in a different place.
Data frame attributes are preserved.
Groups are not affected.
Examples
gm_nest <- gapminder::gapminder %>% tidyr::nest(country_data = -continent)
gm_nest %>% nest_relocate(country_data, year)
gm_nest %>% nest_relocate(country_data, pop, .after = year)
Rename columns in nested data frames
Description
nest_rename() changes the names of individual variables using new_name = old_name syntax; nest_rename_with() renames columns using a function.
Usage
nest_rename(.data, .nest_data, ...)
nest_rename_with(.data, .nest_data, .fn, .cols = dplyr::everything(), ...)
Arguments
.data | A data frame, data frame extension (e.g., a tibble), or a lazy data frame (e.g., from dbplyr or dtplyr). |
.nest_data | A list-column containing data frames |
... | For For |
.fn | A function used to transform the selected |
.cols | Columns to rename; defaults to all columns. |
Details
nest_rename() and nest_rename_with() are largely wrappers for dplyr::rename() and dplyr::rename_with() and maintain the functionality of rename() and rename_with() within each nested data frame. For more information on rename() or rename_with(), please refer to the documentation in dplyr.
Value
An object of the same type as .data. Each object in the column .nest_data will also be of the same type as the input. Each object in .nest_data has the following properties:
Rows are not affected.
Column names are changed; column order is preserved.
Data frame attributes are preserved.
Groups are updated to reflect new names.
See Also
Other single table verbs: nest_arrange(), nest_filter(), nest_mutate(), nest_select(), nest_slice(), nest_summarise()
Examples
gm_nest <- gapminder::gapminder %>% tidyr::nest(country_data = -continent)
gm_nest %>% nest_rename(country_data, population = pop)
gm_nest %>% nest_rename_with(country_data, stringr::str_to_lower)
Replace NAs with specified values in a column of nested data frames
Description
nest_replace_na() is used to replace missing values in selected columns of nested data frames using values specified by column.
Usage
nest_replace_na(.data, .nest_data, replace, ...)
Arguments
.data | A data frame, data frame extension (e.g., a tibble), or a lazy data frame (e.g., from dbplyr or dtplyr). |
.nest_data | A list-column containing data frames |
replace | A list of values, with one value for each column in that has |
... | Additional arguments for |
Details
nest_replace_na() is a wrapper for tidyr::replace_na() and maintains the functionality of replace_na() within each nested data frame. For more information on replace_na(), please refer to the documentation in 'tidyr'.
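As a brief sketch, the replace argument takes a named list that maps a column name to its replacement value. The data below (toy, toy_nest, the list-column nested) are hypothetical, and the code assumes nplyr, dplyr, and tidyr are installed.

```r
library(nplyr)

# Hypothetical toy data (not part of the package).
toy <- dplyr::tibble(
  grp = c("a", "a", "a", "b", "b"),
  val = c(1, NA, NA, NA, 5)
)
toy_nest <- tidyr::nest(toy, nested = -grp)

# Replace every NA in `val` with 0 inside each nested data frame.
replaced <- nest_replace_na(toy_nest, nested, replace = list(val = 0))

replaced$nested[[1]]$val
replaced$nested[[2]]$val
```

The replacement is given as a double here so that it matches the type of the val column.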
Value
An object of the same type as .data. Each object in the column .nest_data will have NAs replaced in the specified columns.
See Also
Other tidyr verbs: nest_drop_na(), nest_extract(), nest_fill(), nest_separate(), nest_unite()
Examples
set.seed(123)
gm <- gapminder::gapminder %>% dplyr::mutate(pop = dplyr::if_else(runif(dplyr::n()) >= 0.9, NA_integer_, pop))
gm_nest <- gm %>% tidyr::nest(country_data = -continent)
gm_nest %>% nest_replace_na(.nest_data = country_data, replace = list(pop = -500))
Subset columns in nested data frames using their names and types
Description
nest_select() selects (and optionally renames) variables in nested data frames, using a concise mini-language that makes it easy to refer to variables based on their name (e.g., a:f selects all columns from a on the left to f on the right). You can also use predicate functions like is.numeric to select variables based on their properties.
Usage
nest_select(.data, .nest_data, ...)
Arguments
.data | A data frame, data frame extension (e.g., a tibble), or a lazy data frame (e.g., from dbplyr or dtplyr). |
.nest_data | A list-column containing data frames |
... | One or more unquoted expressions separated by commas. Variable names can be used if they were positions in the data frame, so expressions like |
Details
nest_select() is largely a wrapper for dplyr::select() and maintains the functionality of select() within each nested data frame. For more information on select(), please refer to the documentation in dplyr.
Value
An object of the same type as .data. Each object in the column .nest_data will also be of the same type as the input. Each object in .nest_data has the following properties:
Rows are not affected.
Output columns are a subset of input columns, potentially with a different order. Columns will be renamed if new_name = old_name form is used.
Data frame attributes are preserved.
Groups are maintained; you can't select off grouping variables.
See Also
Other single table verbs: nest_arrange(), nest_filter(), nest_mutate(), nest_rename(), nest_slice(), nest_summarise()
Examples
gm_nest <- gapminder::gapminder %>% tidyr::nest(country_data = -continent)
gm_nest %>% nest_select(country_data, country, year, pop)
gm_nest %>% nest_select(country_data, dplyr::where(is.numeric))
Separate a character column into multiple columns in a column of nested data frames
Description
nest_separate() is used to separate a single character column into multiple columns using a regular expression or a vector of character positions in a list of nested data frames.
Usage
nest_separate(.data, .nest_data, col, into, sep = "[^[:alnum:]]+", remove = TRUE, convert = FALSE, extra = "warn", fill = "warn", ...)
Arguments
.data | A data frame, data frame extension (e.g., a tibble), or a lazy data frame (e.g., from dbplyr or dtplyr). |
.nest_data | A list-column containing data frames |
col | Column name or position within. Must be present in all data frames in This argument is passed by expression and supports quasiquotation (you can unquote column names or column positions). |
into | Names of new variables to create as character vector. Use |
sep | Separator between columns. If character, If numeric, |
remove | If |
convert | If NB: this will cause string |
extra | If
|
fill | If
|
... | Additional arguments passed on to |
Details
nest_separate() is a wrapper for tidyr::separate() and maintains the functionality of separate() within each nested data frame. For more information on separate(), please refer to the documentation in 'tidyr'.
Value
An object of the same type as .data. Each object in the column .nest_data will have the specified column split according to the regular expression or the vector of character positions.
See Also
Other tidyr verbs: nest_drop_na(), nest_extract(), nest_fill(), nest_replace_na(), nest_unite()
Examples
set.seed(123)
gm <- gapminder::gapminder %>% dplyr::mutate(comb = paste(continent, year, sep = "-"))
gm_nest <- gm %>% tidyr::nest(country_data = -continent)
gm_nest %>% nest_separate(country_data, col = comb, into = c("var1","var2"), sep = "-")
Subset rows in nested data frames using their positions
Description
nest_slice() lets you index rows in nested data frames by their (integer) locations. It allows you to select, remove, and duplicate rows. It is accompanied by a number of helpers for common use cases:
nest_slice_head() and nest_slice_tail() select the first or last rows of each nested data frame in .nest_data.
nest_slice_sample() randomly selects rows from each data frame in .nest_data.
nest_slice_min() and nest_slice_max() select the rows with the lowest or highest values of a variable within each nested data frame in .nest_data.
If .nest_data is a grouped data frame, the operation will be performed on each group, so that (e.g.) nest_slice_head(df, nested_dfs, n = 5) will return the first five rows in each group for each nested data frame.
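The grouped behaviour can be sketched as follows, assuming the nested data frames are first grouped with nest_group_by() (the same helper used in the nest_summarise() examples). With n = 1, each nested data frame should end up with one row per country.

```r
library(nplyr)

gm_nest <- gapminder::gapminder %>%
  tidyr::nest(country_data = -continent)

# Group each nested data frame by country, then keep the first row per group.
first_per_country <- gm_nest %>%
  nest_group_by(country_data, country) %>%
  nest_slice_head(country_data, n = 1)

# Rows kept per continent versus number of distinct countries per continent.
rows_kept <- purrr::map_int(first_per_country$country_data, nrow)
countries <- purrr::map_int(gm_nest$country_data, ~ dplyr::n_distinct(.x$country))

rows_kept
countries
```

If the two vectors match, the slice was applied within each group rather than to each nested data frame as a whole.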
Usage
nest_slice(.data, .nest_data, ..., .preserve = FALSE)
nest_slice_head(.data, .nest_data, ...)
nest_slice_tail(.data, .nest_data, ...)
nest_slice_min(.data, .nest_data, order_by, ..., with_ties = TRUE)
nest_slice_max(.data, .nest_data, order_by, ..., with_ties = TRUE)
nest_slice_sample(.data, .nest_data, ..., weight_by = NULL, replace = FALSE)
Arguments
.data | A data frame, data frame extension (e.g., a tibble), or a lazy data frame (e.g., from dbplyr or dtplyr). |
.nest_data | A list-column containing data frames |
... | For Provide either positive values to keep, or negative values to drop. The values provided must be either all positive or all negative. Indices beyond the number of rows in the input are silently ignored. For Additionally:
|
.preserve | Relevant when |
order_by | Variable or function of variables to order by. |
with_ties | Should ties be kept together? The default, |
weight_by | Sampling weights. This must evaluate to a vector of non-negative numbers the same length as the input. Weights are automatically standardised to sum to 1. |
replace | Should sampling be performed with ( |
Details
nest_slice() and its helpers are largely wrappers for dplyr::slice() and its helpers and maintain the functionality of slice() and its helpers within each nested data frame. For more information on slice() or its helpers, please refer to the documentation in dplyr.
Value
An object of the same type as .data. Each object in the column .nest_data will also be of the same type as the input. Each object in .nest_data has the following properties:
Each row may appear 0, 1, or many times in the output.
Columns are not modified.
Groups are not modified.
Data frame attributes are preserved.
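A short sketch of the row properties listed above: repeating a positive index duplicates that row, and an index past the end of a nested data frame selects nothing. The index 100000 is an arbitrary value assumed to exceed the row count of every nested data frame in this dataset.

```r
library(nplyr)

gm_nest <- gapminder::gapminder %>%
  tidyr::nest(country_data = -continent)

# Repeating an index duplicates that row: row 1 appears twice, row 2 once.
dup <- gm_nest %>% nest_slice(country_data, 1, 1, 2)

# An index beyond the number of rows is ignored, leaving zero rows.
none <- gm_nest %>% nest_slice(country_data, 100000)

# Row counts per nested data frame for each result.
purrr::map_int(dup$country_data, nrow)
purrr::map_int(none$country_data, nrow)

# Columns are left untouched by slicing.
names(dup$country_data[[1]])
```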
See Also
Other single table verbs: nest_arrange(), nest_filter(), nest_mutate(), nest_rename(), nest_select(), nest_summarise()
Examples
gm_nest <- gapminder::gapminder %>% tidyr::nest(country_data = -continent)
# select the 1st, 3rd, and 5th rows in each data frame in country_data
gm_nest %>% nest_slice(country_data, 1, 3, 5)
# or select all but the 1st, 3rd, and 5th rows:
gm_nest %>% nest_slice(country_data, -1, -3, -5)
# first and last rows based on existing order:
gm_nest %>% nest_slice_head(country_data, n = 5)
gm_nest %>% nest_slice_tail(country_data, n = 5)
# rows with minimum and maximum values of a variable:
gm_nest %>% nest_slice_min(country_data, lifeExp, n = 5)
gm_nest %>% nest_slice_max(country_data, lifeExp, n = 5)
# randomly select rows with or without replacement:
gm_nest %>% nest_slice_sample(country_data, n = 5)
gm_nest %>% nest_slice_sample(country_data, n = 5, replace = TRUE)
Summarise each group in nested data frames to fewer rows
Description
nest_summarise() creates a new set of nested data frames. Each will have one (or more) rows for each combination of grouping variables; if there are no grouping variables, the output will have a single row summarising all observations in .nest_data. Each nested data frame will contain one column for each grouping variable and one column for each of the summary statistics that you have specified.
nest_summarise() and nest_summarize() are synonyms.
Usage
nest_summarise(.data, .nest_data, ..., .groups = NULL)
nest_summarize(.data, .nest_data, ..., .groups = NULL)
Arguments
.data | A data frame, data frame extension (e.g., a tibble), or a lazy data frame (e.g., from dbplyr or dtplyr). |
.nest_data | A list-column containing data frames |
... | Name-value pairs of functions. The name will be the name of the variable in the result. The value can be:
|
.groups |
|
Details
nest_summarise() is largely a wrapper for dplyr::summarise() and maintains the functionality of summarise() within each nested data frame. For more information on summarise(), please refer to the documentation in dplyr.
Value
An object of the same type as .data. Each object in the column .nest_data will usually be of the same type as the input. Each object in .nest_data has the following properties:
The rows come from the underlying group_keys().
The columns are a combination of the grouping keys and the summary expressions that you provide.
The grouping structure is controlled by the .groups argument; the output may be another grouped_df, a tibble, or a rowwise data frame.
Data frame attributes are not preserved, because nest_summarise() fundamentally creates a new data frame for each object in .nest_data.
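To make the output properties concrete, the sketch below groups each nested data frame by country and summarises with .groups = "drop", on the assumption that the argument is forwarded to dplyr::summarise() unchanged. The result should have one row per country, with the grouping key followed by the summary columns and no remaining grouping.

```r
library(nplyr)

gm_nest <- gapminder::gapminder %>%
  tidyr::nest(country_data = -continent)

# Summarise per country and explicitly drop the grouping afterwards.
by_country <- gm_nest %>%
  nest_group_by(country_data, country) %>%
  nest_summarise(
    country_data,
    n = dplyr::n(),
    median_pop = median(pop),
    .groups = "drop"
  )

# Inspect the first nested result: grouping key plus the two summaries.
first_summary <- by_country$country_data[[1]]
names(first_summary)
dplyr::is_grouped_df(first_summary)
```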
See Also
Other single table verbs: nest_arrange(), nest_filter(), nest_mutate(), nest_rename(), nest_select(), nest_slice()
Examples
gm_nest <- gapminder::gapminder %>% tidyr::nest(country_data = -continent)
# a summary applied to an ungrouped tbl returns a single row
gm_nest %>%
  nest_summarise(
    country_data,
    n = dplyr::n(),
    median_pop = median(pop)
  )
# usually, you'll want to group first
gm_nest %>%
  nest_group_by(country_data, country) %>%
  nest_summarise(
    country_data,
    n = dplyr::n(),
    median_pop = median(pop)
  )
Unite multiple columns into one in a column of nested data frames
Description
nest_unite() is used to combine multiple columns into one in a column of nested data frames.
Usage
nest_unite(
  .data,
  .nest_data,
  col,
  ...,
  sep = "_",
  remove = TRUE,
  na.rm = FALSE
)
Arguments
.data | A data frame, data frame extension (e.g., a tibble), or a lazy data frame (e.g., from dbplyr or dtplyr). |
.nest_data | A list-column containing data frames |
col | The name of the new column, as a string or symbol. This argument is passed by expression and supports quasiquotation (you can unquote strings and symbols). The name is captured from the expression with |
... | Columns to unite. |
sep | Separator to use between values. |
remove | If |
na.rm | If |
Details
nest_unite() is a wrapper for tidyr::unite() and maintains the functionality of unite() within each nested data frame. For more information on unite(), please refer to the documentation in 'tidyr'.
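A minimal sketch of the sep and remove arguments, assuming they are forwarded to tidyr::unite() as the wrapper description suggests. The column name country_year is made up for this example; with remove = FALSE the input columns should stay alongside the new one.

```r
library(nplyr)

gm_nest <- gapminder::gapminder %>%
  tidyr::nest(country_data = -continent)

# Combine country and year with a hyphen, keeping the input columns.
gm_united <- gm_nest %>%
  nest_unite(
    country_data,
    col = country_year, # hypothetical name for the new column
    country, year,
    sep = "-",
    remove = FALSE
  )

# Look at the united values in the first nested data frame.
first_df <- gm_united$country_data[[1]]
head(first_df$country_year)
```

The united column should match what paste(country, year, sep = "-") would produce row by row.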
Value
An object of the same type as .data. Each object in the column .nest_data will have a new column created as a combination of existing columns.
See Also
Other tidyr verbs: nest_drop_na(), nest_extract(), nest_fill(), nest_replace_na(), nest_separate()
Examples
set.seed(123)
gm <- gapminder::gapminder
gm_nest <- gm %>% tidyr::nest(country_data = -continent)
gm_nest %>% nest_unite(country_data, col = comb, year, pop)
Example survey data regarding personal life satisfaction
Description
A toy dataset containing 750 responses to a personal satisfaction survey. The responses were randomly generated using the Qualtrics survey platform.
Usage
personal_survey
Format
A data frame with 750 rows and 6 variables:
- survey_name
name of survey
- Q1
respondent age
- Q2
city the respondent resides in
- Q3
field that the respondent works in
- Q4
respondent's personal life satisfaction (on a scale from extremely satisfied to extremely dissatisfied)
- Q5
open text response elaborating on personal life satisfaction
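One way the two example surveys might be used together is sketched below: both are placed in a single list-column and then trimmed to the same columns with nest_select(). The tibble layout and the names survey and survey_data are assumptions made for this illustration, not part of the datasets.

```r
library(nplyr)

# Put both example surveys into one list-column (one row per survey).
surveys <- tibble::tibble(
  survey = c("job", "personal"), # hypothetical labels for this sketch
  survey_data = list(nplyr::job_survey, nplyr::personal_survey)
)

# Keep only the age (Q1) and satisfaction (Q4) columns in each survey.
surveys_small <- surveys %>%
  nest_select(survey_data, Q1, Q4)

# Row counts per survey and the columns that remain.
purrr::map_int(surveys_small$survey_data, nrow)
purrr::map(surveys_small$survey_data, names)
```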