tidyr

Overview

The goal of tidyr is to help you create tidy data. Tidy data is data where:

Each variable is a column; each column is a variable.
Each observation is a row; each row is an observation.
Each value is a cell; each cell is a single value.

Tidy data describes a standard way of storing data that is used wherever possible throughout the tidyverse. If you ensure that your data is tidy, you’ll spend less time fighting with the tools and more time working on your analysis. Learn more about tidy data in vignette("tidy-data").
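To make the three rules concrete, here is a minimal sketch. The `untidy` data frame and its values are made up for illustration; its year columns hold values of a variable in the column names, so the table breaks the first rule. One way to tidy it, assuming tidyr 1.0.0 or later is installed, is pivot_longer():

```r
library(tidyr)

# Hypothetical data: the column names "1999" and "2000" are values of a
# "year" variable, so this table is not tidy.
untidy <- data.frame(
  country = c("A", "B"),
  `1999` = c(10, 20),
  `2000` = c(15, 25),
  check.names = FALSE,
  stringsAsFactors = FALSE
)

# Move the year columns into a single "year" column and their cell
# values into a single "cases" column.
tidy <- pivot_longer(
  untidy,
  cols = c(`1999`, `2000`),
  names_to = "year",
  values_to = "cases"
)

tidy
```

After the call, each row of `tidy` is one country-year observation, and `country`, `year`, and `cases` are each a column.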

Installation

# The easiest way to get tidyr is to install the whole tidyverse:
install.packages("tidyverse")

# Alternatively, install just tidyr:
install.packages("tidyr")

# Or the development version from GitHub:
# install.packages("pak")
pak::pak("tidyverse/tidyr")

Getting started

library(tidyr)

tidyr functions fall into five main categories:

“Pivoting”, which converts between long and wide forms. tidyr 1.0.0 introduces pivot_longer() and pivot_wider(), replacing the older spread() and gather() functions. See vignette("pivot") for more details.
“Rectangling”, which turns deeply nested lists (as from JSON) into tidy tibbles. See unnest_longer(), unnest_wider(), hoist(), and vignette("rectangle") for more details.
Nesting converts grouped data to a form where each group becomes a single row containing a nested data frame, and unnesting does the opposite. See nest(), unnest(), and vignette("nest") for more details.
Splitting and combining character columns. Use separate_wider_delim(), separate_wider_position(), and separate_wider_regex() to pull a single character column into multiple columns; use unite() to combine multiple columns into a single character column.
Make implicit missing values explicit with complete(); make explicit missing values implicit with drop_na(); replace missing values with next/previous value with fill(), or a known value with replace_na().
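As a rough illustration of the splitting, nesting, and missing-value categories above, here is a small sketch. The `readings` data frame and its column names are hypothetical, and the code assumes tidyr 1.3.0 or later (needed for separate_wider_delim()):

```r
library(tidyr)

# Hypothetical example data, used only for illustration.
readings <- data.frame(
  site = c("a", "a", "b"),
  day  = c(1, 2, 1),
  code = c("x-1", "y-2", "z-3"),
  stringsAsFactors = FALSE
)

# Splitting: pull one character column apart into two columns.
parts <- separate_wider_delim(
  readings, code,
  delim = "-", names = c("letter", "number")
)

# Combining: paste the two columns back into a single character column.
combined <- unite(parts, "code", letter, number, sep = "-")

# Nesting: one row per site, with the other columns in a list-column.
nested <- nest(readings, data = c(day, code))
# Unnesting: expand the list-column back into regular rows.
unnested <- unnest(nested, data)

# Missing values: site "b" has no row for day 2, so that combination is
# implicitly missing. complete() adds it with NA in `code`.
completed <- complete(readings, site, day)

# Replace the NA with the previous value, or with a known value.
filled   <- fill(completed, code, .direction = "down")
replaced <- replace_na(completed, list(code = "unknown"))

# Drop rows containing NA, making the missing value implicit again.
without_na <- drop_na(completed)
```

Note that fill() here works on the data frame as a whole; to fill within each site separately you would group the data first.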

Related work

tidyr supersedes reshape2 (2010-2014) and reshape (2005-2010). Somewhat counterintuitively, each iteration of the package has done less. tidyr is designed specifically for tidying data, not general reshaping (reshape2), or the general aggregation (reshape).

data.table provides high-performance implementations of melt() and dcast().

If you’d like to read more about data reshaping from a CS perspective, I’d recommend the following three papers:

To guide your reading, here’s a translation between the terminology used in different places:

tidyr 1.0.0	pivot longer	pivot wider
tidyr < 1.0.0	gather	spread
reshape(2)	melt	cast
spreadsheets	unpivot	pivot
databases	fold	unfold
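The two columns of the table are inverse operations. The following sketch shows a round trip using the tidyr 1.0.0 names; the `long` data frame is hypothetical example data, not taken from the package:

```r
library(tidyr)

# Hypothetical example data in long form: one row per id/key pair.
long <- data.frame(
  id    = c(1, 1, 2, 2),
  key   = c("height", "weight", "height", "weight"),
  value = c(170, 65, 180, 80),
  stringsAsFactors = FALSE
)

# "pivot wider" (spread / cast / pivot / unfold):
# each distinct key becomes its own column.
wide <- pivot_wider(long, names_from = key, values_from = value)

# "pivot longer" (gather / melt / unpivot / fold):
# the height and weight columns collapse back into key/value pairs.
long_again <- pivot_longer(
  wide,
  cols = c(height, weight),
  names_to = "key",
  values_to = "value"
)
```

Here `wide` has one row per id with `height` and `weight` columns, and `long_again` holds the same id, key, and value data as `long`.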

Getting help

If you encounter a clear bug, please file a minimal reproducible example on GitHub. For questions and other discussion, please use community.rstudio.com.

Please note that the tidyr project is released with a Contributor Code of Conduct. By contributing to this project, you agree to abide by its terms.
