
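The goal of tidyr is to help you create tidy data. Tidy data is data where:

1. Each variable is a column; each column is a variable.
2. Each observation is a row; each row is an observation.
3. Each value is a cell; each cell is a single value.
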
Tidy data describes a standard way of storing data that is used wherever possible throughout the tidyverse. If you ensure that your data is tidy, you’ll spend less time fighting with the tools and more time working on your analysis. Learn more about tidy data in `vignette("tidy-data")`.
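
As a rough illustration, the same measurements can be stored in an untidy layout and in a tidy one. The data below is invented for this example, and the sketch assumes the tibble package is installed (it is a dependency of tidyr):

```r
# Hypothetical data, invented purely for illustration.

# Untidy: 1999 and 2000 are values of a "year" variable,
# but here they are spread across column names.
untidy <- tibble::tibble(
  country = c("A", "B"),
  `1999`  = c(10, 20),
  `2000`  = c(15, 25)
)

# Tidy: each variable (country, year, cases) is a column,
# each observation (one country in one year) is a row,
# and each cell holds a single value.
tidy <- tibble::tibble(
  country = c("A", "A", "B", "B"),
  year    = c(1999, 2000, 1999, 2000),
  cases   = c(10, 15, 20, 25)
)
```

Both objects hold the same four measurements; only the tidy version lets you refer to `year` and `cases` as ordinary columns.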
```r
# The easiest way to get tidyr is to install the whole tidyverse:
install.packages("tidyverse")

# Alternatively, install just tidyr:
install.packages("tidyr")

# Or the development version from GitHub:
# install.packages("pak")
pak::pak("tidyverse/tidyr")
```

```r
library(tidyr)
```

tidyr functions fall into five main categories:

- “Pivoting” which converts between long and wide forms. tidyr 1.0.0 introduces `pivot_longer()` and `pivot_wider()`, replacing the older `spread()` and `gather()` functions. See `vignette("pivot")` for more details.
- “Rectangling”, which turns deeply nested lists (as from JSON) into tidy tibbles. See `unnest_longer()`, `unnest_wider()`, `hoist()`, and `vignette("rectangle")` for more details.
- Nesting converts grouped data to a form where each group becomes a single row containing a nested data frame, and unnesting does the opposite. See `nest()`, `unnest()`, and `vignette("nest")` for more details.
- Splitting and combining character columns. Use `separate_wider_delim()`, `separate_wider_position()`, and `separate_wider_regex()` to pull a single character column into multiple columns; use `unite()` to combine multiple columns into a single character column.
- Make implicit missing values explicit with `complete()`; make explicit missing values implicit with `drop_na()`; replace missing values with next/previous value with `fill()`, or a known value with `replace_na()`.
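
The five categories above can be sketched on a small example. This is a minimal illustration, not a complete tour: the `scores` and `users` data are made up, the variable names are arbitrary, and the `separate_wider_delim()` call assumes tidyr 1.3.0 or later.

```r
library(tidyr)

# Hypothetical example data: one row per student, one column per test.
scores <- tibble::tibble(
  student = c("ana", "ben"),
  test_1  = c(90, 72),
  test_2  = c(85, NA)
)

# Pivoting: wide -> long, then back to wide.
long <- pivot_longer(
  scores,
  cols = c(test_1, test_2),
  names_to = "test",
  values_to = "score"
)
wide <- pivot_wider(long, names_from = test, values_from = score)

# Rectangling: turn a column of named lists (as parsed JSON might
# produce) into one column per element.
users <- tibble::tibble(user = list(
  list(name = "ana", age = 31),
  list(name = "ben", age = 27)
))
users_wide <- unnest_wider(users, user)

# Nesting: one row per student with a list-column of data frames,
# then the reverse.
nested   <- nest(long, data = c(test, score))
unnested <- unnest(nested, cols = data)

# Splitting and combining character columns.
parts  <- separate_wider_delim(long, test, delim = "_", names = c("kind", "number"))
joined <- unite(parts, "test", kind, number, sep = "_")

# Missing values.
explicit_only <- drop_na(long, score)                    # drop the row whose score is NA
restored      <- complete(explicit_only, student, test)  # bring that combination back as NA
filled_down   <- fill(long, score)                       # carry the previous value forward
zeroed        <- replace_na(long, list(score = 0))       # substitute a known value
```

Printing each intermediate object is the quickest way to see what a given verb does to the shape of the data.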
tidyr supersedes reshape2 (2010-2014) and reshape (2005-2010). Somewhat counterintuitively, each iteration of the package has done less. tidyr is designed specifically for tidying data, not general reshaping (reshape2), or the general aggregation (reshape).
data.table provides high-performance implementations of `melt()` and `dcast()`.
If you’d like to read more about data reshaping from a CS perspective, I’d recommend the following three papers:

- Wrangler: Interactive visual specification of data transformation scripts
- An interactive framework for data cleaning (Potter’s wheel)
- On efficiently implementing SchemaSQL on a SQL database system

To guide your reading, here’s a translation between the terminology used in different places:

| tidyr 1.0.0 | pivot longer | pivot wider |
|---|---|---|
| tidyr < 1.0.0 | gather | spread |
| reshape(2) | melt | cast |
| spreadsheets | unpivot | pivot |
| databases | fold | unfold |
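
The first two rows of the table can be checked directly, since the older `gather()` and `spread()` functions are still available in tidyr. A minimal sketch, using made-up data and arbitrary variable names:

```r
library(tidyr)

# Hypothetical wide data, invented for illustration.
wide <- tibble::tibble(id = c(1, 2), x = c("a", "b"), y = c("c", "d"))

# Wide -> long: pivot_longer() in tidyr 1.0.0, gather() before that.
long_new <- pivot_longer(wide, cols = c(x, y), names_to = "key", values_to = "value")
long_old <- gather(wide, key = "key", value = "value", x, y)

# The two results hold the same rows in a different order:
# pivot_longer() walks the input row by row, gather() column by column.
long_old_sorted <- long_old[order(long_old$id, long_old$key), ]

# Long -> wide: pivot_wider() in tidyr 1.0.0, spread() before that.
wide_new <- pivot_wider(long_new, names_from = key, values_from = value)
wide_old <- spread(long_old, key, value)
```

Both routes recover the original `id`, `x`, and `y` columns, which is why the newer functions can replace the older ones.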

If you encounter a clear bug, please file a minimal reproducible example on GitHub. For questions and other discussion, please use community.rstudio.com.

Please note that the tidyr project is released with a Contributor Code of Conduct. By contributing to this project, you agree to abide by its terms.