
cut: Convert Numeric to Factor

cut    R Documentation

Convert Numeric to Factor

Description

cut divides the range of x into intervals and codes the values in x according to the interval in which they fall. The leftmost interval corresponds to level one, the next leftmost to level two and so on.

Usage

cut(x, ...)

## Default S3 method:
cut(x, breaks, labels = NULL,
    include.lowest = FALSE, right = TRUE, dig.lab = 3,
    ordered_result = FALSE, ...)

Arguments

x

a numeric vector which is to be converted to a factor by cutting.

breaks

either a numeric vector of two or more unique cut points or a single number (greater than or equal to 2) giving the number of intervals into which x is to be cut.

labels

labels for the levels of the resulting category. By default, labels are constructed using "(a,b]" interval notation. If labels = FALSE, simple integer codes are returned instead of a factor.

include.lowest

logical, indicating if an ‘x[i]’ equal to the lowest (or highest, for right = FALSE) ‘breaks’ value should be included.

right

logical, indicating if the intervals should be closed on the right (and open on the left) or vice versa.

dig.lab

integer which is used when labels are not given. It determines the number of digits used in formatting the break numbers.

ordered_result

logical: should the result be an ordered factor?

...

further arguments passed to or from other methods.
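How right and include.lowest interact is easiest to see on data that sits exactly on the break points. The following is a small sketch; the vectors x and br and the result names are made up for illustration.

```r
x  <- c(0, 1, 2, 3)
br <- c(0, 1, 2, 3)

# Default (right = TRUE): intervals (0,1], (1,2], (2,3]; the value 0 falls outside.
right_closed <- cut(x, breaks = br)

# include.lowest = TRUE closes the first interval on the left: [0,1].
right_closed_all <- cut(x, breaks = br, include.lowest = TRUE)

# right = FALSE: intervals [0,1), [1,2), [2,3); now the value 3 falls outside.
left_closed <- cut(x, breaks = br, right = FALSE)

# With right = FALSE, include.lowest = TRUE closes the last interval instead: [2,3].
left_closed_all <- cut(x, breaks = br, right = FALSE, include.lowest = TRUE)

levels(right_closed_all)  # "[0,1]" "(1,2]" "(2,3]"
levels(left_closed_all)   # "[0,1)" "[1,2)" "[2,3]"
```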

Details

When breaks is specified as a single number, the range of the data is divided into breaks pieces of equal length, and then the outer limits are moved away by 0.1% of the range to ensure that the extreme values both fall within the break intervals. (If x is a constant vector, equal-length intervals are created, one of which includes the single value.)
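The rule for a single-number breaks can be reproduced by hand. This is a sketch of the description above for non-constant data, not the actual source of cut; the names n, rx, dx and manual are illustrative.

```r
x <- c(0, 2, 5, 10)
n <- 2                       # requested number of intervals

rx <- range(x)               # 0 and 10
dx <- diff(rx)               # 10

# Equal-length pieces over the data range ...
manual <- seq(rx[1], rx[2], length.out = n + 1)
# ... then the two outer limits are moved away by 0.1% of the range.
manual[c(1, n + 1)] <- c(rx[1] - dx / 1000, rx[2] + dx / 1000)

codes_auto   <- cut(x, breaks = n,      labels = FALSE)
codes_manual <- cut(x, breaks = manual, labels = FALSE)

identical(codes_auto, codes_manual)  # TRUE
```

Because the lower limit is moved below the minimum, the smallest value lands in the first interval even though include.lowest is FALSE.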

If a labels parameter is specified, its values are used to name the factor levels. If none is specified, the factor level labels are constructed as "(b1, b2]", "(b2, b3]" etc. for right = TRUE and as "[b1, b2)", ... if right = FALSE. In this case, dig.lab indicates the minimum number of digits that should be used in formatting the numbers b1, b2, .... A larger value (up to 12) will be used if needed to distinguish between any pair of endpoints: if this fails labels such as "Range3" will be used. Formatting is done by formatC.
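A short sketch of both labelling modes follows. The data, break points and level names ("low", "mid", "high") are invented for the example.

```r
x <- c(1, 5, 9)

# Explicit labels: one per interval, in left-to-right order.
named <- cut(x, breaks = c(0, 3, 6, 10), labels = c("low", "mid", "high"))
as.character(named)            # "low" "mid" "high"

# Constructed labels: dig.lab controls how many digits of each break are shown.
br <- c(0, 1.2345, 2)
levels(cut(1, breaks = br))               # "(0,1.23]"   "(1.23,2]"
levels(cut(1, breaks = br, dig.lab = 5))  # "(0,1.2345]" "(1.2345,2]"
```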

The default method will sort a numeric vector of breaks, but other methods are not required to; labels will correspond to the intervals after sorting.
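The consequence for the default method is that labels are matched to the sorted intervals, not to the order in which the breaks were typed. A minimal illustration, with made-up data and label names:

```r
x <- c(1, 7)

# Breaks given out of order; the default method sorts them to 0, 5, 10.
unsorted <- cut(x, breaks = c(10, 0, 5), labels = c("lower", "upper"))
sorted   <- cut(x, breaks = c(0, 5, 10), labels = c("lower", "upper"))

as.character(unsorted)       # "lower" "upper"
identical(unsorted, sorted)  # TRUE
```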

As from R 3.2.0, getOption("OutDec") is consulted when labels are constructed for labels = NULL.

Value

A factor is returned, unless labels = FALSE which results in an integer vector of level codes.

Values which fall outside the range of breaks are coded as NA, as are NaN and NA values.
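Both return types, and the NA coding, can be checked on a small vector. The input below is invented for illustration.

```r
x <- c(-1, 0.5, 1.5, NA, NaN, 5)

f     <- cut(x, breaks = c(0, 1, 2))                  # factor
codes <- cut(x, breaks = c(0, 1, 2), labels = FALSE)  # integer codes

is.factor(f)      # TRUE
is.integer(codes) # TRUE
codes             # NA 1 2 NA NA NA: -1 and 5 lie outside (0, 2]
```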

Note

Instead of table(cut(x, br)), hist(x, br, plot = FALSE) is more efficient and less memory hungry. Instead of cut(*, labels = FALSE), findInterval() is more efficient.
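The equivalences behind this note can be sketched as below. One caveat: findInterval() uses left-closed intervals by default, so it is compared against cut(..., right = FALSE) here, and the two agree only for values inside the range of the breaks. The data are invented.

```r
x  <- c(0.5, 1, 2.7, 3.9)
br <- 0:4

# Integer codes: findInterval() versus cut() with left-closed intervals.
codes_cut <- cut(x, breaks = br, right = FALSE, labels = FALSE)
codes_fi  <- findInterval(x, br)
identical(codes_cut, codes_fi)        # TRUE

# Counts per interval: hist() versus table(cut()), both right-closed.
counts_cut  <- as.vector(table(cut(x, breaks = br)))
counts_hist <- graphics::hist(x, breaks = br, plot = FALSE)$counts
all(counts_cut == counts_hist)        # TRUE
```

Unlike cut, findInterval() returns 0 or length(br) rather than NA for values outside the breaks, so out-of-range data need separate handling.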

References

Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988) The New S Language. Wadsworth & Brooks/Cole.

See Also

split for splitting a variable according to a group factor; factor, tabulate, table, findInterval.

quantile for ways of choosing breaks of roughly equal content (rather than length).

.bincode for a bare-bones version.
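As a sketch of the quantile suggestion above, quartile break points give four groups of roughly equal size even for skewed data. The seed, sample size and distribution are arbitrary choices for the example.

```r
set.seed(1)                 # arbitrary seed, only for reproducibility
x <- stats::rexp(1000)      # skewed data

# Break points at the quartiles; include.lowest = TRUE keeps the minimum.
br <- stats::quantile(x, probs = seq(0, 1, by = 0.25))
q  <- cut(x, breaks = br, include.lowest = TRUE)

table(q)                    # about 250 observations per level

# Equal-length intervals, for contrast, put most of the data in the first bin.
table(cut(x, breaks = 4))
```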

Examples

Z <- stats::rnorm(10000)
table(cut(Z, breaks = -6:6))
sum(table(cut(Z, breaks = -6:6, labels = FALSE)))
sum(graphics::hist(Z, breaks = -6:6, plot = FALSE)$counts)

cut(rep(1,5), 4) #-- dummy
tx0 <- c(9, 4, 6, 5, 3, 10, 5, 3, 5)
x <- rep(0:8, tx0)
stopifnot(table(x) == tx0)

table( cut(x, breaks = 8))
table( cut(x, breaks = 3*(-2:5)))
table( cut(x, breaks = 3*(-2:5), right = FALSE))

##--- some values OUTSIDE the breaks :
table(cx  <- cut(x, breaks = 2*(0:4)))
table(cxl <- cut(x, breaks = 2*(0:4), right = FALSE))
which(is.na(cx));  x[is.na(cx)]  #-- the first 9  values  0
which(is.na(cxl)); x[is.na(cxl)] #-- the last  5  values  8

## Label construction:
y <- stats::rnorm(100)
table(cut(y, breaks = pi/3*(-3:3)))
table(cut(y, breaks = pi/3*(-3:3), dig.lab = 4))

table(cut(y, breaks =  1*(-3:3), dig.lab = 4))
# extra digits don't "harm" here
table(cut(y, breaks =  1*(-3:3), right = FALSE))
#- the same, since no exact INT!

## sometimes the default dig.lab is not enough to avoid confusion:
aaa <- c(1,2,3,4,5,2,3,4,5,6,7)
cut(aaa, 3)
cut(aaa, 3, dig.lab = 4, ordered_result = TRUE)

## one way to extract the breakpoints
labs <- levels(cut(aaa, 3))
cbind(lower = as.numeric( sub("\\((.+),.*", "\\1", labs) ),
      upper = as.numeric( sub("[^,]*,([^]]*)\\]", "\\1", labs) ))
