Title: | The R Base Package |
---|---|
Description: | Base R functions. |
Authors: | R Core Team and contributors worldwide |
Maintainer: | R Core Team <[email protected]> |
License: | Part of R 4.4.1 |
Version: | 4.4.1 |
Built: | 2024-06-15 17:27:47 UTC |
Source: | base |
Base R functions
This package contains the basic functions which let R function as a language: arithmetic, input/output, basic programming support, etc. Its contents are available through inheritance from any environment.
For a complete list of functions, use library(help = "base").
Bin a numeric vector and return integer codes for the binning.
.bincode(x, breaks, right = TRUE, include.lowest = FALSE)
x | a numeric vector which is to be converted to integer codes by binning. |
breaks | a numeric vector of two or more cut points, sorted in increasing order. |
right | logical, indicating if the intervals should be closed on the right (and open on the left) or vice versa. |
include.lowest | logical, indicating if an ‘x[i]’ equal to the lowest (or highest, for |
This is a ‘barebones’ version of cut.default(labels = FALSE) intended for use in other functions which have checked the arguments passed. (Note the different order of the arguments they have in common.)
Unlike cut, the breaks do not need to be unique. An input can only fall into a zero-length interval if it is closed at both ends, so only if include.lowest = TRUE and it is the first (or last for right = FALSE) interval.
An integer vector of the same length as x indicating which bin each element falls into (the leftmost bin being bin 1). NaN and NA elements of x are mapped to NA codes, as are values outside the range of breaks.
## An example with non-unique breaks:
x <- c(0, 0.01, 0.5, 0.99, 1)
b <- c(0, 0, 1, 1)
.bincode(x, b, TRUE)
.bincode(x, b, FALSE)
.bincode(x, b, TRUE, TRUE)
.bincode(x, b, FALSE, TRUE)
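The effect of right and include.lowest on the end points can be seen in a small sketch. The vectors scores and cuts are made-up illustration data, not part of the package.

```r
# Hypothetical data: scores binned at the cut points 0, 50 and 100.
scores <- c(0, 25, 50, 75, 100, 120, NA)
cuts   <- c(0, 50, 100)

# Default (right = TRUE): intervals (0, 50] and (50, 100];
# 0 lies outside every interval and is mapped to NA.
codes_default <- .bincode(scores, cuts)

# include.lowest = TRUE closes the first interval at 0: [0, 50], (50, 100].
codes_closed <- .bincode(scores, cuts, right = TRUE, include.lowest = TRUE)

# right = FALSE flips the closure to [0, 50) and [50, 100); now 100 is excluded.
codes_left <- .bincode(scores, cuts, right = FALSE)
```

In all three calls the value 120 and the NA input yield NA codes, as described above.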
A pairlist of the names of open graphics devices is stored in .Devices. The name of the active device (see dev.cur) is stored in .Device. Both are symbols and so appear in the base namespace.
.Device
.Devices
.Device is a length-one character vector.
.Devices is a pairlist of length-one character vectors. The first entry is always "null device", and there are as many entries as the maximal number of graphics devices which have been simultaneously active. If a device has been removed, its entry will be "" until the device number is reused.
Devices may add attributes to the character vector: for example devices which write to a file may record its path in attribute "filepath".
.Machine is a variable holding information on the numerical characteristics of the machine R is running on, such as the largest double or integer and the machine's precision.
.Machine
The algorithm is based on Cody's (1988) subroutine MACHAR. As all current implementations of R use 32-bit integers and use IEC 60559 floating-point (double precision) arithmetic, the "integer" and "double" related values are the same for almost all R builds.
Note that on most platforms smaller positive values than .Machine$double.xmin can occur. On a typical R platform the smallest positive double is about 5e-324.
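A brief sketch of how these values behave in practice, assuming a typical IEC 60559 platform with round-to-nearest arithmetic. The variable names are illustrative only.

```r
eps <- .Machine$double.eps

# Adding eps to 1 produces a different double ...
visible <- (1 + eps) != 1
# ... while adding half of it rounds back to 1 (assumes round-to-nearest).
absorbed <- (1 + eps / 2) == 1

# Values below double.xmin can still be positive (subnormal numbers).
subnormal <- .Machine$double.xmin / 2
still_positive <- subnormal > 0
```

This is why tolerances for floating-point comparison are usually expressed in terms of .Machine$double.eps rather than .Machine$double.xmin.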
A list with components
double.eps | the smallest positive floating-point number |
double.neg.eps | a small positive floating-point number |
double.xmin | the smallest non-zero normalized floating-point number, a power of the radix, i.e., |
double.xmax | the largest normalized floating-point number. Typically, it is equal to |
double.base | the radix for the floating-point representation: normally |
double.digits | the number of base digits in the floating-point significand: normally |
double.rounding | the rounding action, one of |
double.guard | the number of guard digits for multiplication with truncating arithmetic. It is 1 if floating-point arithmetic truncates and more than |
double.ulp.digits | the largest negative integer |
double.neg.ulp.digits | the largest negative integer |
double.exponent | the number of bits (decimal places if |
double.min.exp | the largest in magnitude negative integer |
double.max.exp | the smallest positive power of |
integer.max | the largest integer which can be represented. Always |
sizeof.long | the number of bytes in a C ‘long’ type: |
sizeof.longlong | the number of bytes in a C ‘long long’ type. Will be zero if there is no such type, otherwise usually |
sizeof.longdouble | the number of bytes in a C ‘long double’ type. Will be zero if there is no such type (or its use was disabled when R was built), otherwise possibly |
sizeof.pointer | the number of bytes in the C |
sizeof.time_t | the number of bytes in the C |
longdouble.eps, longdouble.neg.eps, longdouble.digits, ... | introduced in R 4.0.0. When |
In the (typical) case where capabilities("long.double") is true, R uses the ‘long double’ C type in quite a few places internally for accumulators in e.g. sum, reading non-integer numeric constants into (binary) double precision numbers, or arithmetic such as x %% y; also, ‘long double’ can be read by readBin.
For this reason, in that case, .Machine contains ten further components, longdouble.eps, *.neg.eps, *.digits, *.rounding, *.guard, *.ulp.digits, *.neg.ulp.digits, *.exponent, *.min.exp, and *.max.exp, computed entirely analogously to their double.* counterparts, see there.
sizeof.longdouble only tells you the amount of storage allocated for a long double. Often what is stored is the 80-bit extended double type of IEC 60559, padded to the double alignment used on the platform; this seems to be the case for the common R platforms using ix86 and x86_64 chips. There are other implementations of long double, usually in software, for example on Sparc Solaris and AIX.
Note that it is legal for a platform to have a ‘long double’ C type which is identical to the ‘double’ type; this happens on ARM CPUs. In that case capabilities("long.double") will be false but on versions of R prior to 4.0.4, .Machine may contain "longdouble.kind" elements.
Uses a C translation of Fortran code in the reference, modified by the R Core Team to defeat over-optimization in modern compilers.
Cody, W. J. (1988). MACHAR: A subroutine to dynamically determine machine parameters. Transactions on Mathematical Software, 14(4), 303–311. doi:10.1145/50063.51907.
.Platform for details of the platform.
.Machine
## or for a neat printout
noquote(unlist(format(.Machine)))
.Platform is a list with some details of the platform under which R was built. This provides means to write OS-portable R code.
.Platform
A list with at least the following components:
OS.type | character string, giving the Operating System (family) of the computer. One of |
file.sep | character string, giving the file separator used on your platform: |
dynlib.ext | character string, giving the file name extension of dynamically loadable libraries, e.g., |
GUI | character string, giving the type of GUI in use, or |
endian | character string, |
pkgType | character string, the preferred setting for This should not be used to identify the OS. |
path.sep | character string, giving the path separator, used on your platform, e.g., |
r_arch | character string, possibly |
.Platform$GUI is set to "AQUA" under the macOS GUI, R.app. This has a number of consequences:
‘/usr/local/bin’ is appended to the PATH environment variable.
the default graphics device is set to quartz.
selects native (rather than Tk) widgets for the graphics = TRUE options of menu and select.list.
HTML help is displayed in the internal browser.
the spreadsheet-like data editor/viewer uses a Quartz version rather than the X11 one.
R.version and Sys.info give more details about the OS. In particular, R.version$platform is the canonical name of the platform under which R was compiled. osVersion may give more details about the platform R is running on.
.Machine for details of the arithmetic used, and system for invoking platform-specific system commands.
capabilities and extSoftVersion (and links there) for availability of capabilities partly external to R but used from R functions.
## Note: this can be done in a system-independent way by dir.exists()
if(.Platform$OS.type == "unix") {
   system.test <- function(...) system(paste("test", ...)) == 0L
   dir.exists2 <- function(dir)
       sapply(dir, function(d) system.test("-d", d))
   dir.exists2(c(R.home(), "/tmp", "~", "/NO")) # > T T T F
}
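As a further illustration of OS-portable code, the following sketch branches on .Platform$OS.type. The helper config_dir and the application name "myapp" are hypothetical, and the choice of directories is an assumption made for the example, not a convention of R.

```r
# Hypothetical helper: pick a per-user configuration directory by OS family.
config_dir <- function(app = "myapp") {
  if (.Platform$OS.type == "windows") {
    # APPDATA is normally set on Windows; fall back to the home directory.
    base <- Sys.getenv("APPDATA", unset = path.expand("~"))
  } else {
    base <- path.expand("~")
  }
  # file.path() joins its arguments with .Platform$file.sep by default.
  file.path(base, app)
}

config_dir("demo")
```

Building paths with file.path() rather than pasting separators by hand keeps such code independent of the platform.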
Abbreviate strings to at least minlength characters, such that they remain unique (if they were), unless strict = TRUE.
abbreviate(names.arg, minlength = 4, use.classes = TRUE, dot = FALSE, strict = FALSE, method = c("left.kept", "both.sides"), named = TRUE)
names.arg | a character vector of names to be abbreviated, or an object to be coerced to a character vector by |
minlength | the minimum length of the abbreviations. |
use.classes | logical: should lowercase characters be removed first? |
dot | logical: should a dot ( |
strict | logical: should |
method | a character string specifying the method used with default |
named | logical: should |
The default algorithm (method = "left.kept") used is similar to that of S. For a single string it works as follows. First spaces at the ends of the string are stripped. Then (if necessary) any other spaces are stripped. Next, lower case vowels are removed followed by lower case consonants. Finally if the abbreviation is still longer than minlength upper case letters and symbols are stripped.
Characters are always stripped from the end of the strings first. If an element of names.arg contains more than one word (words are separated by spaces) then at least one letter from each word will be retained.
Missing (NA) values are unaltered.
If use.classes is FALSE then the only distinction is to be between letters and space.
A character vector containing abbreviations for the character strings in its first argument. Duplicates in the original names.arg will be given identical abbreviations. If any non-duplicated elements have the same minlength abbreviations then, if method = "both.sides" the basic internal abbreviate() algorithm is applied to the characterwise reversed strings; if there are still duplicated abbreviations and if strict = FALSE as by default, minlength is incremented by one and new abbreviations are found for those elements only. This process is repeated until all unique elements of names.arg have unique abbreviations.
If named is true, the character version of names.arg is attached to the returned value as a names attribute: no other attributes are retained.
If an input element contains non-ASCII characters, the corresponding value will be in UTF-8 and marked as such (see Encoding).
If use.classes is true (the default), this is really only suitable for English, and prior to R 3.3.0 did not work correctly with non-ASCII characters in multibyte locales. It will warn if used with non-ASCII characters (and required to reduce the length). It is unlikely to work well with inputs not in the Unicode Basic Multilingual Plane nor on (rare) platforms where wide characters are not encoded in Unicode.
As from R 3.3.0 the concept of ‘vowel’ is extended from English vowels by including characters which are accented versions of lower-case English vowels (including ‘o with stroke’). Of course, there are languages (even Western European languages such as Welsh) with other vowels.
x <- c("abcd", "efgh", "abce")
abbreviate(x, 2)
abbreviate(x, 2, strict = TRUE) # >> 1st and 3rd are == "ab"

(st.abb <- abbreviate(state.name, 2))
stopifnot(identical(unname(st.abb),
                    abbreviate(state.name, 2, named=FALSE)))
table(nchar(st.abb)) # out of 50, 3 need 4 letters :
as <- abbreviate(state.name, 3, strict = TRUE)
as[which(as == "Mss")]

## and without distinguishing vowels:
st.abb2 <- abbreviate(state.name, 2, FALSE)
cbind(st.abb, st.abb2)[st.abb2 != st.abb, ]

## method = "both.sides" helps: no 4-letters, and only 4 3-letters:
st.ab2 <- abbreviate(state.name, 2, method = "both")
table(nchar(st.ab2))
## Compare the two methods:
cbind(st.abb, st.ab2)
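The interplay of strict and named can also be sketched on a small made-up vector. The strings in labels are illustrative only, and no particular abbreviations are claimed here.

```r
# Hypothetical labels sharing a long common prefix.
labels <- c("International", "Internal", "Interval")

# Default strict = FALSE: abbreviations stay unique, possibly by
# growing beyond minlength for clashing elements.
abb <- abbreviate(labels, minlength = 4)

# strict = TRUE keeps to minlength, so the results need not be unique.
abb_strict <- abbreviate(labels, minlength = 4, strict = TRUE)

# named = FALSE drops the names attribute that carries the original strings.
plain <- abbreviate(labels, minlength = 4, named = FALSE)
```

Whether uniqueness or a fixed width matters more depends on the use: axis labels usually need uniqueness, fixed-width reports may prefer strict = TRUE.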
Searches for approximate matches to pattern (the first argument) within each element of the string x (the second argument) using the generalized Levenshtein edit distance (the minimal possibly weighted number of insertions, deletions and substitutions needed to transform one string into another).
agrep(pattern, x, max.distance = 0.1, costs = NULL, ignore.case = FALSE, value = FALSE, fixed = TRUE, useBytes = FALSE)
agrepl(pattern, x, max.distance = 0.1, costs = NULL, ignore.case = FALSE, fixed = TRUE, useBytes = FALSE)
pattern | a non-empty character string to be matched. For |
x | character vector where matches are sought. Coerced by |
max.distance | maximum distance allowed for a match. Expressed either as integer, or as a fraction of the pattern length times the maximal transformation cost (will be replaced by the smallest integer not less than the corresponding fraction), or a list with possible components
If |
costs | a numeric vector or list with names partially matching ‘insertions’, ‘deletions’ and ‘substitutions’ giving the respective costs for computing the generalized Levenshtein distance, or |
ignore.case | if |
value | if |
fixed | logical. If |
useBytes | logical. If |
The Levenshtein edit distance is used as measure of approximateness: it is the (possibly cost-weighted) total number of insertions, deletions and substitutions required to transform one string into another.
This uses the tre code by Ville Laurikari (https://github.com/laurikari/tre), which supports MBCS character matching.
The main effect of useBytes = TRUE is to avoid errors/warnings about invalid inputs and spurious matches in multibyte locales. It inhibits the conversion of inputs with marked encodings, and is forced if any input is found which is marked as "bytes" (see Encoding).
agrep returns a vector giving the indices of the elements that yielded a match, or, if value is TRUE, the matched elements (after coercion, preserving names but no other attributes).
agrepl returns a logical vector.
Since someone who read the description carelessly even filed a bug report on it, do note that this matches substrings of each element of x (just as grep does) and not whole elements. See also adist in package utils, which optionally returns the offsets of the matched substrings.
Original version in R < 2.10.0 by David Meyer. Current version by Brian Ripley and Kurt Hornik.
grep, adist. A different interface to approximate string matching is provided by aregexec().
agrep("lasy", "1 lazy 2")
agrep("lasy", c(" 1 lazy 2", "1 lasy 2"), max.distance = list(sub = 0))
agrep("laysy", c("1 lazy", "1", "1 LAZY"), max.distance = 2)
agrep("laysy", c("1 lazy", "1", "1 LAZY"), max.distance = 2, value = TRUE)
agrep("laysy", c("1 lazy", "1", "1 LAZY"), max.distance = 2, ignore.case = TRUE)
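The following sketch puts the index, logical and value forms of the result side by side, and shows that matching is against substrings. The vector candidates is made up for illustration.

```r
candidates <- c("1 lazy 2", "1 lasy 2", "nothing here")

# Default max.distance = 0.1: for a 4-character pattern this is rounded
# up to one allowed edit, so "lazy" matches via a single substitution.
hits <- agrep("lasy", candidates)

# Forbidding substitutions leaves only the element containing "lasy" itself.
no_sub <- agrep("lasy", candidates, max.distance = list(sub = 0))

# agrepl() gives one logical per element; value = TRUE returns the strings.
flags   <- agrepl("lasy", candidates)
matched <- agrep("lasy", candidates, value = TRUE)
```

Note that both matching elements are longer than the pattern: the pattern only needs to be close to some substring of each element.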
Given a set of logical vectors, are all of the values true?
all(..., na.rm = FALSE)
... | zero or more logical vectors. Other objects of zero length are ignored, and the rest are coerced to logical ignoring any class. |
na.rm | logical. If true |
This is a generic function: methods can be defined for it directly or via the Summary group generic. For this to work properly, the arguments ... should be unnamed, and dispatch is on the first argument.
Coercion of types other than integer (raw, double, complex, character, list) gives a warning as this is often unintentional.
This is a primitive function.
The value is a logical vector of length one.
Let x denote the concatenation of all the logical vectors in ... (after coercion), after removing NAs if requested by na.rm = TRUE.
The value returned is TRUE if all of the values in x are TRUE (including if there are no values), and FALSE if at least one of the values in x is FALSE. Otherwise the value is NA (which can only occur if na.rm = FALSE and ... contains no FALSE values and at least one NA value).
This is part of the S4 Summary group generic. Methods for it must use the signature x, ..., na.rm.
That all(logical(0)) is true is a useful convention: it ensures that
all(all(x), all(y)) == all(x, y)
even if x has length zero.
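The three possible results, and the empty-vector convention, can be checked with a few calls. The variable names below are illustrative only.

```r
r_true  <- all(c(TRUE, TRUE))              # every value is TRUE
r_false <- all(c(TRUE, FALSE, NA))         # one FALSE settles the answer
r_na    <- all(c(TRUE, NA))                # depends on the missing value
r_narm  <- all(c(TRUE, NA), na.rm = TRUE)  # the NA is removed first
r_empty <- all(logical(0))                 # no values, so none is FALSE

# The convention keeps the nested and the flat form in agreement.
x <- logical(0); y <- c(TRUE, TRUE)
agree <- identical(all(all(x), all(y)), all(x, y))
```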
Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988) The New S Language. Wadsworth & Brooks/Cole.
any, the ‘complement’ of all, and stopifnot(*) which is an all(*) ‘insurance’.
range(x <- sort(round(stats::rnorm(10) - 1.2, 1)))
if(all(x < 0)) cat("all x values are negative\n")
all(logical(0)) # true, as all zero of the elements are true.
all.equal(x, y) is a utility to compare R objects x and y testing ‘near equality’. If they are different, comparison is still made to some extent, and a report of the differences is returned. Do not use all.equal directly in if expressions: either use isTRUE(all.equal(....)) or identical if appropriate.
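A short sketch of why the isTRUE() wrapper is needed: on failure all.equal() returns a character report rather than FALSE. The variable names are illustrative only.

```r
x <- 0.1 + 0.2

# Exact comparison fails because of floating-point rounding.
exact <- x == 0.3

# all.equal() returns TRUE here, but a character vector when values differ.
near   <- all.equal(x, 0.3)
report <- all.equal(1, 1.1)

# isTRUE() turns either outcome into a single TRUE/FALSE, safe inside if().
if (isTRUE(all.equal(x, 0.3))) message("x is 0.3 up to numerical tolerance")
safe <- isTRUE(report)
```

Using report directly as an if() condition would be an error, since it is a character string and not a logical value.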
all.equal(target, current, ...)
## Default S3 method:
all.equal(target, current, ..., check.class = TRUE)
## S3 method for class 'numeric'
all.equal(target, current, tolerance = sqrt(.Machine$double.eps), scale = NULL, countEQ = FALSE, formatFUN = function(err, what) format(err), ..., check.attributes = TRUE, check.class = TRUE, giveErr = FALSE)
## S3 method for class 'list'
all.equal(target, current, ..., check.attributes = TRUE, use.names = TRUE)
## S3 method for class 'environment'
all.equal(target, current, all.names = TRUE, evaluate = TRUE, ...)
## S3 method for class 'function'
all.equal(target, current, check.environment = TRUE, ...)
## S3 method for class 'POSIXt'
all.equal(target, current, ..., tolerance = 1e-3, scale, check.tzone = TRUE)
attr.all.equal(target, current, ..., check.attributes = TRUE, check.names = TRUE)
target | R object. |
current | other R object, to be compared with |
... | further arguments for different methods, notably the following two, for numerical comparison: |
tolerance | numeric |
scale | |
countEQ | logical indicating if the |
formatFUN | a |
check.attributes | logical indicating if the |
check.class | logical indicating if the |
giveErr | |
use.names | logical indicating if |
all.names | logical passed to |
evaluate | for the |
check.environment | logical requiring that the |
check.tzone | logical indicating if the |
check.names | logical indicating if the |
all.equal is a generic function, dispatching methods on the target argument. To see the available methods, use methods("all.equal"), but note that the default method also does some dispatching, e.g. using the raw method for logical targets.
Remember that arguments which follow ... must be specified by (unabbreviated) name. It is inadvisable to pass unnamed arguments in ... as these will match different arguments in different methods.
Numerical comparisons for scale = NULL (the default) are typically on a relative difference scale unless the target values are close to zero or infinite. Specifically, the scale is computed as the mean absolute value of target. If this scale is finite and exceeds tolerance, differences are expressed relative to it; otherwise, absolute differences are used. Note that this scale and all further steps are computed only for those vector elements where target is not NA and differs from current. If countEQ is true, the equal and NA cases are counted in determining the “sample” size.
If scale is numeric (and positive), absolute comparisons are made after scaling (dividing) by scale. Note that if all of scale is close to 1 (specifically, within 1e-7), the difference is still reported as being on an absolute scale.
For complex target, the modulus (Mod) of the difference is used: all.equal.numeric is called so arguments tolerance and scale are available.
The list method compares components of target and current recursively, passing all other arguments, as long as both are “list-like”, i.e., fulfill either is.vector or is.list.
The environment method works via the list method, and is also used for reference classes (unless a specific all.equal method is defined).
The method for date-time objects uses all.equal.numeric to compare times (in "POSIXct" representation) with a default tolerance of 0.001 seconds, ignoring scale. A time zone mismatch between target and current is reported unless check.tzone = FALSE.
attr.all.equal is used for comparing attributes, returning NULL or a character vector.
Either TRUE (NULL for attr.all.equal) or a vector of mode "character" describing the differences between target and current.
Chambers, J. M. (1998) Programming with Data. A Guide to the S Language. Springer (for =).
identical, isTRUE, ==, and all for exact equality testing.
all.equal(pi, 355/113)
# not precise enough (default tol) > relative error

quarts <- 1/4 + 1:10 # exact
d45 <- pi*quarts ; one <- rep(1, 10)
tan(d45) == one # mostly FALSE, as typically exact; embarrassingly,
tanpi(quarts) == one # (is always FALSE (Fedora 34; gcc 11.2.1))
stopifnot(all.equal(
          tan(d45), one)) # TRUE, but not if we are picky:
all.equal(tan(d45), one, tolerance = 0) # to see difference
all.equal(tan(d45), one, tolerance = 0, scale = 1)# "absolute diff.."
all.equal(tan(d45), one, tolerance = 0, scale = 1+(-2:2)/1e9) # "absolute"
all.equal(tan(d45), one, tolerance = 0, scale = 1+(-2:2)/1e6) # "scaled"

## advanced: equality of environments
ae <- all.equal(as.environment("package:stats"),
                asNamespace("stats"))
stopifnot(is.character(ae), length(ae) > 10,
          ## were incorrectly "considered equal" in R <= 3.1.1
          all.equal(asNamespace("stats"), asNamespace("stats")))

## A situation where 'countEQ = TRUE' makes sense:
x1 <- x2 <- (1:100)/10; x2[2] <- 1.1*x1[2]
## 99 out of 100 pairs (x1[i], x2[i]) are equal:
plot(x1,x2, main = "all.equal.numeric() -- not counting equal parts")
all.equal(x1,x2) ## "Mean relative difference: 0.1"
mtext(paste("all.equal(x1,x2) :", all.equal(x1,x2)), line= -2)
##' extract the 'Mean relative difference' as number:
all.eqNum <- function(...) as.numeric(sub(".*:", '', all.equal(...)))
set.seed(17)
## When x2 is jittered, typically all pairs (x1[i],x2[i]) do differ:
summary(r <- replicate(100, all.eqNum(x1, x2*(1+rnorm(x1)*1e-7))))
mtext(paste("mean(all.equal(x1, x2*(1 + eps_k))) {100 x} Mean rel.diff.=",
            signif(mean(r), 3)), line = -4, adj=0)
## With argument countEQ=TRUE, get "the same" (w/o need for jittering):
mtext(paste("all.equal(x1,x2, countEQ=TRUE) :",
            signif(all.eqNum(x1,x2, countEQ=TRUE), 3)), line= -6, col=2)

## Using giveErr=TRUE :
x1. <- x1 * (1+ 1e-9*rnorm(x1))
str(all.equal(x1, x1., giveErr=TRUE))
## logi TRUE
## - attr(*, "err")= num 8.66e-10
## - attr(*, "what")= chr "relative"

## Used with stopifnot(), still *showing* diff:
all.equalShow <- function (...) {
   r <- all.equal(..., giveErr=TRUE)
   cat(attr(r,"what"), "err:", attr(r,"err"), "\n")
   c(r) # can drop attributes, as not used anymore
}
# checks, showing error in any case:
stopifnot(all.equalShow(x1, x1.)) # -> relative err: 8.66002e-10
tryCatch(error=identity, stopifnot(all.equalShow(x1, 2*x1))) -> eAe
stopifnot(inherits(eAe, "error"))
# stopifnot(all.equal....()) giving smart msg:
cat(conditionMessage(eAe), "\n")

two <- structure(2, foo = 1, class = "bar")
all.equal(two^20, 2^20) # lots of diff
all.equal(two^20, 2^20, check.attributes = FALSE)# "target is bar, current is numeric"
all.equal(two^20, 2^20, check.attributes = FALSE, check.class = FALSE) # TRUE

## comparison of date-time objects
now <- Sys.time()
stopifnot(
    all.equal(now, now + 1e-4) # TRUE (default tolerance = 0.001 seconds)
)
all.equal(now, now + 0.2)
all.equal(now, as.POSIXlt(now, "UTC"))
stopifnot(
    all.equal(now, as.POSIXlt(now, "UTC"), check.tzone = FALSE) # TRUE
)
Return a character vector containing all the names which occur in an expression or call.
all.names(expr, functions = TRUE, max.names = -1L, unique = FALSE)
all.vars(expr, functions = FALSE, max.names = -1L, unique = TRUE)
expr | an expression or call from which the names are to be extracted. |
functions | a logical value indicating whether function names should be included in the result. |
max.names | the maximum number of names to be returned. |
unique | a logical value which indicates whether duplicate names should be removed from the value. |
These functions differ only in the default values for their arguments.
A character vector with the extracted names.
substitute to replace symbols with values in an expression.
all.names(expression(sin(x+y)))
all.names(quote(sin(x+y))) # or a call
all.vars(expression(sin(x+y)))
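That the two functions differ only in their defaults can be seen by setting the arguments explicitly. The expression e is made up for illustration.

```r
e <- quote(sin(x + y) * x)

with_funs <- all.names(e)                     # function names included, duplicates kept
vars_dup  <- all.names(e, functions = FALSE)  # variables only, duplicates kept
vars      <- all.vars(e)                      # variables only, duplicates removed

# all.names() with the defaults of all.vars() gives the same answer.
same <- identical(all.names(e, functions = FALSE, unique = TRUE), vars)
```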
Given a set of logical vectors, is at least one of the values true?
any(..., na.rm = FALSE)
... | zero or more logical vectors. Other objects of zero length are ignored, and the rest are coerced to logical ignoring any class. |
na.rm | logical. If true |
This is a generic function: methods can be defined for it directly or via the Summary group generic. For this to work properly, the arguments ... should be unnamed, and dispatch is on the first argument.
Coercion of types other than integer (raw, double, complex, character, list) gives a warning as this is often unintentional.
This is a primitive function.
The value is a logical vector of length one.
Let x denote the concatenation of all the logical vectors in ... (after coercion), after removing NAs if requested by na.rm = TRUE.
The value returned is TRUE if at least one of the values in x is TRUE, and FALSE if all of the values in x are FALSE (including if there are no values). Otherwise the value is NA (which can only occur if na.rm = FALSE and ... contains no TRUE values and at least one NA value).
This is part of the S4 Summary group generic. Methods for it must use the signature x, ..., na.rm.
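The three possible results can be checked with a few calls. The variable names are illustrative only.

```r
r_true  <- any(c(FALSE, TRUE, NA))          # one TRUE settles the answer
r_false <- any(c(FALSE, FALSE))             # every value is FALSE
r_na    <- any(c(FALSE, NA))                # depends on the missing value
r_narm  <- any(c(FALSE, NA), na.rm = TRUE)  # the NA is removed first
r_empty <- any(logical(0))                  # no values, so none is TRUE
```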
Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988) The New S Language. Wadsworth & Brooks/Cole.
all, the ‘complement’ of any.
range(x <- sort(round(stats::rnorm(10) - 1.2, 1)))
if(any(x < 0)) cat("x contains negative values\n")
Transpose an array by permuting its dimensions and optionally resizing it.
aperm(a, perm, ...)
## Default S3 method:
aperm(a, perm = NULL, resize = TRUE, ...)
## S3 method for class 'table'
aperm(a, perm = NULL, resize = TRUE, keep.class = TRUE, ...)
a | the array to be transposed. |
perm | the subscript permutation vector, usually a permutation of the integers |
resize | a flag indicating whether the vector should be resized as well as having its elements reordered (default |
keep.class | logical indicating if the result should be of the same class as |
... | potential further arguments of methods. |
A transposed version of arraya
, with subscripts permuted asindicated by the arrayperm
. Ifresize
isTRUE
,the array is reshaped as well as having its elements permuted, thedimnames
are also permuted; ifresize = FALSE
then thereturned object has the same dimensions asa
, and the dimnamesare dropped. In each case other attributes are copied froma
.
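The effect of resize can be seen on a small array. This is an illustrative sketch only; the dimnames used here are made up for the example.

```r
a <- array(1:24, dim = 2:4,
           dimnames = list(c("r1", "r2"),
                           c("c1", "c2", "c3"),
                           paste0("s", 1:4)))

# Move the third subscript to the front: p[k, i, j] is a[i, j, k]
p <- aperm(a, c(3, 1, 2))
dim(p)             # extents follow the permutation
dimnames(p)[[1]]   # dimnames are permuted as well

# Same element reordering, but the original extents are kept
# and the dimnames are dropped
q <- aperm(a, c(3, 1, 2), resize = FALSE)
dim(q)
dimnames(q)
```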
The function t provides a faster and more convenient way of transposing matrices.
Jonathan Rougier did the faster C implementation.
Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988) The New S Language. Wadsworth & Brooks/Cole.
t, to transpose matrices.
# interchange the first two subscripts on a 3-way array x
x <- array(1:24, 2:4)
xt <- aperm(x, c(2,1,3))
stopifnot(t(xt[,,2]) == x[,,2],
          t(xt[,,3]) == x[,,3],
          t(xt[,,4]) == x[,,4])

UCB <- aperm(UCBAdmissions, c(2,1,3))
UCB[1,,]
summary(UCB) # UCB is still a contingency table
Add elements to a vector.
append(x, values, after = length(x))
x | the vector the values are to be appended to. |
values | to be included in the modified vector. |
after | a subscript, after which the values are to be appended. |
A vector containing the values in x with the elements of values appended after the specified element of x.
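As a short sketch of how after controls the insertion point (the vector x below is only an example):

```r
x <- c("a", "b", "c")

at_end   <- append(x, "z")                       # default: after = length(x)
at_start <- append(x, "z", after = 0)            # before the first element
in_mid   <- append(x, c("y", "z"), after = 1)    # after the first element

print(at_end)
print(at_start)
print(in_mid)
```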
Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988) The New S Language. Wadsworth & Brooks/Cole.
append(1:5, 0:1, after = 3)
Returns a vector or array or list of values obtained by applying a function to margins of an array or matrix.
apply(X, MARGIN, FUN, ..., simplify = TRUE)
X | an array, including a matrix. |
MARGIN | a vector giving the subscripts which the function will be applied over. E.g., for a matrix |
FUN | the function to be applied: see ‘Details’. In the case of functions like |
... | optional arguments to |
simplify | a logical indicating whether results should be simplified if possible. |
If X is not an array but an object of a class with a non-null dim value (such as a data frame), apply attempts to coerce it to an array via as.matrix if it is two-dimensional (e.g., a data frame) or via as.array.
FUN is found by a call to match.fun and typically is either a function or a symbol (e.g., a backquoted name) or a character string specifying a function to be searched for from the environment of the call to apply.
Arguments in ... cannot have the same name as any of the other arguments, and care may be needed to avoid partial matching to MARGIN or FUN. In general-purpose code it is good practice to name the first three arguments if ... is passed through: this both avoids partial matching to MARGIN or FUN and ensures that a sensible error message is given if arguments named X, MARGIN or FUN are passed through ....
If each call to FUN returns a vector of length n, and simplify is TRUE, then apply returns an array of dimension c(n, dim(X)[MARGIN]) if n > 1. If n equals 1, apply returns a vector if MARGIN has length 1 and an array of dimension dim(X)[MARGIN] otherwise. If n is 0, the result has length 0 but not necessarily the ‘correct’ dimension.
If the calls to FUN return vectors of different lengths, or if simplify is FALSE, apply returns a list of length prod(dim(X)[MARGIN]), with dim set to dim(X)[MARGIN] if MARGIN has length greater than one.
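The shape rules above can be checked on a small matrix. This is a sketch under the assumption of a 2 x 3 integer matrix m; the anonymous function is only there to force results of different lengths.

```r
m <- matrix(1:6, nrow = 2)          # rows are (1, 3, 5) and (2, 4, 6)

r1 <- apply(m, 1, sum)              # n = 1, MARGIN of length 1: a vector
r2 <- apply(m, 1, range)            # n = 2: array of dim c(n, dim(m)[1])
r3 <- apply(m, 2, range)            # n = 2: array of dim c(n, dim(m)[2])

# Results of different lengths cannot be simplified, so a list is returned
r4 <- apply(m, 1, function(v) v[v > 3])

# simplify = FALSE always gives a list
r5 <- apply(m, 1, sum, simplify = FALSE)

str(list(r1 = r1, r2 = r2, r3 = r3, r4 = r4, r5 = r5))
```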
In all cases the result is coerced by as.vector to one of the basic vector types before the dimensions are set, so that (for example) factor results will be coerced to a character array.
Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988) The New S Language. Wadsworth & Brooks/Cole.
lapply and there, simplify2array; tapply, and convenience functions sweep and aggregate.
## Compute row and column sums for a matrix:
x <- cbind(x1 = 3, x2 = c(4:1, 2:5))
dimnames(x)[[1]] <- letters[1:8]
apply(x, 2, mean, trim = .2)
col.sums <- apply(x, 2, sum)
row.sums <- apply(x, 1, sum)
rbind(cbind(x, Rtot = row.sums), Ctot = c(col.sums, sum(col.sums)))

stopifnot( apply(x, 2, is.vector))

## Sort the columns of a matrix
apply(x, 2, sort)

## keeping named dimnames
names(dimnames(x)) <- c("row", "col")
x3 <- array(x, dim = c(dim(x),3),
            dimnames = c(dimnames(x), list(C = paste0("cop.",1:3))))
identical(x, apply( x, 2, identity))
identical(x3, apply(x3, 2:3, identity))

##- function with extra args:
cave <- function(x, c1, c2) c(mean(x[c1]), mean(x[c2]))
apply(x, 1, cave, c1 = "x1", c2 = c("x1","x2"))

ma <- matrix(c(1:4, 1, 6:8), nrow = 2)
ma
apply(ma, 1, table) #--> a list of length 2
apply(ma, 1, stats::quantile) # 5 x n matrix with rownames

stopifnot(dim(ma) == dim(apply(ma, 1:2, sum)))

## Example with different lengths for each call
z <- array(1:24, dim = 2:4)
zseq <- apply(z, 1:2, function(x) seq_len(max(x)))
zseq         ## a 2 x 3 matrix
typeof(zseq) ## list
dim(zseq)    ## 2 3
zseq[1,]
apply(z, 3, function(x) seq_len(max(x)))
# a list without a dim attribute
Displays the argument names and corresponding default values of a (non-primitive or primitive) function.
args(name)
name | a function (a primitive or a closure, i.e., “non-primitive”). If |
This function is mainly used interactively to print the argument list of a function. For programming, consider using formals instead.
For a closure, a closure with identical formal argument list but an empty (NULL) body.
For a primitive (function), a closure with the documented usage and NULL body. Note that some primitives do not make use of named arguments and match by position rather than name.
NULL in case of a non-function.
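A small sketch of these return values, using a hypothetical function f defined only for this illustration:

```r
# f is a made-up closure used only to show what args() returns
f <- function(x, y = 2, ...) x + y

a <- args(f)
is.function(a)        # a closure ...
body(a)               # ... whose body is NULL
names(formals(a))     # ... with the same formal arguments as f

# For programmatic access to the argument list, formals() is the usual tool
arg_names <- names(formals(f))

# A non-function gives NULL
not_fun <- args(1)
```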
Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988) The New S Language. Wadsworth & Brooks/Cole.
formals, help; str also prints the argument list of a function.
## "regular" (non-primitive) functions "print their arguments"
## (by returning another function with NULL body which you also see):
args(ls)
args(graphics::plot.default)
utils::str(ls) # (just "prints": does not show a NULL)

## You can also pass a string naming a function.
args("scan")
## ...but :: package specification doesn't work in this case.
tryCatch(args("graphics::plot.default"), error = print)

## As explained above, args() gives a function with empty body:
list(is.f = is.function(args(scan)), body = body(args(scan)))

## Primitive functions mostly behave like non-primitive functions.
args(c)
args(`+`)
## primitive functions without well-defined argument list return NULL:
args(`if`)
These unary and binary operators perform arithmetic on numeric or complex vectors (or objects which can be coerced to them).
+ x
- x
x + y
x - y
x * y
x / y
x ^ y
x %% y
x %/% y
x, y | numeric or complex vectors or objects which can be coerced to such, or other objects for which methods have been written. |
The unary and binary arithmetic operators are generic functions: methods can be written for them individually or via the Ops group generic function. (See Ops for how dispatch is computed.)
If applied to arrays the result will be an array if this is sensible (for example it will not if the recycling rule has been invoked).
Logical vectors will be coerced to integer or numeric vectors, FALSE having value zero and TRUE having value one.
1 ^ y and y ^ 0 are 1, always. x ^ y should also give the proper limit result when either (numeric) argument is infinite (one of Inf or -Inf).
Objects such as arrays or time-series can be operated on this way provided they are conformable.
For double arguments, %% can be subject to catastrophic loss of accuracy if x is much larger than y, and a warning is given if this is detected.
%% and x %/% y can be used for non-integer y, e.g. 1 %/% 0.2, but the results are subject to representation error and so may be platform-dependent. Mathematically, the answer to 1 %/% 0.2 should be 5, but because the IEC 60559 representation of 0.2 is a binary fraction slightly larger than 0.2, most platforms give 4.
Users are sometimes surprised by the value returned, for example why (-8)^(1/3) is NaN. For double inputs, R makes use of IEC 60559 arithmetic on all platforms, together with the C system function ‘pow’ for the ^ operator. The relevant standards define the result in many corner cases. In particular, the result in the example above is mandated by the C99 standard. On many Unix-alike systems the command man pow gives details of the values in a large number of corner cases.
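The cube-root case can be explored as follows. This is a sketch, not a recommendation: the sign/abs workaround and the complex route are two common ways to get a root, and the last line shows an easily overlooked effect of operator precedence.

```r
# A negative double raised to a non-integer power is NaN
real_pow  <- (-8)^(1/3)

# One way to obtain the real cube root of a negative number
real_root <- sign(-8) * abs(-8)^(1/3)

# Complex arithmetic gives the principal complex root instead
cplx_root <- as.complex(-8)^(1/3)

# Without parentheses, ^ binds tighter than unary minus: -(8^(1/3))
no_parens <- -8^(1/3)

print(list(real_pow, real_root, cplx_root, no_parens))
```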
Arithmetic on type double in R is supposed to be done in ‘round to nearest, ties to even’ mode, but this does depend on the compiler and FPU being set up correctly.
Unary + and unary - return a numeric or complex vector. All attributes (including class) are preserved if there is no coercion: logical x is coerced to integer and names, dims and dimnames are preserved.
The binary operators return vectors containing the result of the element by element operations. If involving a zero-length vector the result has length zero. Otherwise, the elements of shorter vectors are recycled as necessary (with a warning when they are recycled only fractionally). The operators are + for addition, - for subtraction, * for multiplication, / for division and ^ for exponentiation.
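A brief sketch of the recycling and coercion behaviour just described (all object names are illustrative):

```r
# The shorter vector is recycled to the length of the longer one
s_full <- 1:6 + 1:2

# A zero-length operand gives a zero-length result
s_zero <- 1:6 + numeric(0)

# Fractional recycling still returns a result but signals a warning
warned <- tryCatch({ 1:5 + 1:2; FALSE }, warning = function(w) TRUE)

# Unary minus coerces logical to integer and keeps the names
neg <- -c(a = TRUE, b = FALSE)

print(s_full)
print(neg)
```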
%% indicates x mod y (“x modulo y”), i.e., computes the ‘remainder’ r <- x %% y, and %/% indicates integer division, where R uses “floored” integer division, i.e., q <- x %/% y := floor(x/y), as promoted by Donald Knuth, see the Wikipedia page on ‘Modulo operation’, and hence sign(r) == sign(y). It is guaranteed that
x == (x %% y) + y * (x %/% y)
(up to rounding error) unless y == 0 where the result of %% is NA_integer_ or NaN (depending on the typeof of the arguments) or for some non-finite arguments, e.g., when the RHS of the identity above amounts to Inf - Inf.
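The identity and the sign rule can be verified on a few integer pairs with mixed signs. The vectors below are arbitrary test values chosen for this sketch.

```r
x <- c(-7L, -1L, 0L, 5L, 7L)
y <- c( 3L, -3L, 3L, -3L, 2L)

q <- x %/% y          # floored quotient
r <- x %% y           # remainder, taking the sign of y when non-zero

identity_holds <- all(x == r + y * q)
sign_matches   <- all(r == 0L | sign(r) == sign(y))

# Division by zero: the remainder depends on the type of the arguments
rem_int <- 5L %% 0L   # NA_integer_
rem_dbl <- 5 %% 0     # NaN

print(rbind(x, y, q, r))
```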
If either argument is complex the result will be complex, otherwise if one or both arguments are numeric, the result will be numeric. If both arguments are of type integer, the type of the result of / and ^ is numeric and for the other operators it is integer (with overflow, which occurs at ±(2^31 - 1), returned as NA_integer_ with a warning).
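The result types and the overflow behaviour can be inspected with typeof. This is a minimal sketch; here ‘numeric’ in the text corresponds to type "double".

```r
t_add <- typeof(1L + 2L)     # integer operands, integer result
t_div <- typeof(1L / 2L)     # / always gives a double for integer operands
t_pow <- typeof(2L ^ 2L)     # so does ^

big <- .Machine$integer.max  # the largest representable integer

# Integer overflow gives NA_integer_ (the warning is suppressed here)
overflowed <- suppressWarnings(big + 1L)

# Mixing in a double avoids the overflow
as_double <- big + 1

print(c(t_add, t_div, t_pow))
```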
The rules for determining the attributes of the result are rather complicated. Most attributes are taken from the longer argument. Names will be copied from the first if it is the same length as the answer, otherwise from the second if that is. If the arguments are the same length, attributes will be copied from both, with those of the first argument taking precedence when the same attribute is present in both arguments. For time series, these operations are allowed only if the series are compatible, when the class and tsp attribute of whichever is a time series (the same, if both are) are used. For arrays (and an array result) the dimensions and dimnames are taken from first argument if it is an array, otherwise the second.
These operators are members of the S4 Arith group generic, and so methods can be written for them individually as well as for the group generic (or the Ops group generic), with arguments c(e1, e2) (with e2 missing for a unary operator).
R is dependent on OS services (and they on FPUs) for floating-point arithmetic. On all current R platforms IEC 60559 (also known as IEEE 754) arithmetic is used, but some things in those standards are optional. In particular, the support for denormal aka subnormal numbers (those outside the range given by .Machine) may differ between platforms and even between calculations on a single platform.
Another potential issue is signed zeroes: on IEC 60559 platforms there are two zeroes with internal representations differing by sign. Where possible R treats them as the same, but for example direct output from C code often does not do so and may output ‘-0.0’ (and on Windows whether it does so or not depends on the version of Windows). One place in R where the difference might be seen is in division by zero: 1/x is Inf or -Inf depending on the sign of zero x. Another place is identical(0, -0, num.eq = FALSE).
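The two places where a signed zero becomes visible can be sketched as follows (object names are illustrative):

```r
pos_zero <- 0
neg_zero <- -0

same_value <- pos_zero == neg_zero                          # compared as numbers
same_ident <- identical(pos_zero, neg_zero)                 # default num.eq = TRUE
same_bits  <- identical(pos_zero, neg_zero, num.eq = FALSE) # bitwise comparison

# Division by zero reveals the sign
div_pos <- 1 / pos_zero
div_neg <- 1 / neg_zero

print(c(div_pos, div_neg))
```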
All logical operations involving a zero-length vector have a zero-length result.
The binary operators are sometimes called as functions as e.g. `&`(x, y): see the description of how argument-matching is done in Ops.
** is translated in the parser to ^, but this was undocumented for many years. It appears as an index entry in Becker et al. (1988), pointing to the help for Deprecated but is not actually mentioned on that page. Even though it had been deprecated in S for 20 years, it was still accepted in R in 2008.
Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988) The New S Language. Wadsworth & Brooks/Cole.
D. Goldberg (1991). What Every Computer Scientist Should Know about Floating-Point Arithmetic. ACM Computing Surveys, 23(1), 5–48. doi:10.1145/103162.103163. Also available at https://docs.oracle.com/cd/E19957-01/806-3568/ncg_goldberg.html.
For the IEC 60559 (aka IEEE 754) standard: https://www.iso.org/standard/57469.html and https://en.wikipedia.org/wiki/IEEE_754.
On the integer division and remainder (modulo) computations, %% and %/%: https://en.wikipedia.org/wiki/Modulo_operation, and Donald Knuth (1972) The Art of Computer Programming, Vol.1.
sqrt for miscellaneous and Special for special mathematical functions.
Syntax for operator precedence.
%*% for matrix multiplication.
x <- -1:12
x + 1
2 * x + 3
x %% 3 # is periodic 2 0 1 2 0 1 ...
x %% -3 # (ditto) -1 0 -2 -1 0 -2 ...
x %/% 5
x %% Inf # now is defined by limit (gave NaN in earlier versions of R)

## Illustrating PR#18677, see above
1 %/% print(0.2, digits=19)
Creates or tests for arrays.
array(data = NA, dim = length(data), dimnames = NULL)
as.array(x, ...)
is.array(x)
data | a vector (including a list or |
dim | the dim attribute for the array to be created, that is an integer vector of length one or more giving the maximal indices in each dimension. |
dimnames | either |
x | an R object. |
... | additional arguments to be passed to or from methods. |
An array in R can have one, two or more dimensions. It is simply a vector which is stored with additional attributes giving the dimensions (attribute "dim") and optionally names for those dimensions (attribute "dimnames").
A two-dimensional array is the same thing as a matrix.
One-dimensional arrays often look like vectors, but may be handled differently by some functions: str does distinguish them in recent versions of R.
The "dim" attribute is an integer vector of length one or more containing non-negative values: the product of the values must match the length of the array.
The "dimnames" attribute is optional: if present it is a list with one component for each dimension, either NULL or a character vector of the length given by the element of the "dim" attribute for that dimension.
is.array is a primitive function.
For a list array, the print method prints entries of length not one in the form ‘integer,7’ indicating the type and length.
array returns an array with the extents specified in dim and naming information in dimnames. The values in data are taken to be those in the array with the leftmost subscript moving fastest. If there are too few elements in data to fill the array, then the elements in data are recycled. If data has length zero, NA of an appropriate type is used for atomic vectors (0 for raw vectors) and NULL for lists.
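A short sketch of the fill order and of the zero-length case (the object names are illustrative):

```r
# The leftmost subscript moves fastest when data fills the array
a <- array(1:24, dim = 2:4)
fill_order <- c(a[2, 1, 1], a[1, 2, 1], a[1, 1, 2])

# Zero-length atomic data is replaced by NA of the matching type
e <- array(integer(0), dim = c(2, 2))

# Zero-length list data gives NULL entries
l <- array(list(), dim = c(1, 2))

print(fill_order)
print(e)
```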
Unlike matrix, array does not currently remove any attributes left by as.vector from a classed list data, so can return a list array with a class attribute.
as.array is a generic function for coercing to arrays. The default method does so by attaching a dim attribute to it. It also attaches dimnames if x has names. The sole purpose of this is to make it possible to access the dim[names] attribute at a later time.
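For a named vector, the default as.array method behaves as sketched below (v is an example vector made up for this illustration):

```r
v  <- c(a = 1, b = 2, c = 3)
av <- as.array(v)

is.array(v)      # a plain named vector has no dim attribute
is.array(av)     # the coerced object does
dim(av)          # one dimension, of the vector's length
dimnames(av)     # names carried over as dimnames
```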
is.array returns TRUE or FALSE depending on whether its argument is an array (i.e., has a dim attribute of positive length) or not. It is generic: you can write methods to handle specific classes of objects, see InternalMethods.
is.array is a primitive function.
Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988) The New S Language. Wadsworth & Brooks/Cole.
dim(as.array(letters))
array(1:3, c(2,4)) # recycle 1:3 "2 2/3 times"
#     [,1] [,2] [,3] [,4]
#[1,]    1    3    2    1
#[2,]    2    1    3    2
array2DF converts an array, including list arrays commonly returned by tapply, into data frames for use in further analysis or plotting functions.
array2DF(x, responseName = "Value", sep = "", base = list(LETTERS), simplify = TRUE, allowLong = TRUE)
x | an array object. |
responseName | character string, used for creating column name(s) in the result, if required. |
sep | character string, used as separator when creating new names, if required. |
base | character vector, giving an initial set of names to create dimnames of |
simplify | logical, whether to attempt simplification of the result. |
allowLong | logical, specifying whether a long format data frame should be returned if |
The main use of array2DF is to convert an array, as typically returned by tapply, into a data frame.
When simplify = FALSE, this is similar to as.data.frame.table, except that it works for list arrays as well as atomic arrays. Specifically, the resulting data frame has one row for each element of the array, with one column for each dimension of the array giving the corresponding dimnames. The contents of the array are placed in a column whose name is given by the responseName argument. The mode of this column is the same as that of x, usually an atomic vector or a list.
If x does not have dimnames, they are automatically created using base and sep.
In the default case, when simplify = TRUE, some common cases are handled specially.
If all components of x are data frames with identical column names (with possibly different numbers of rows), they are rbind-ed to form the response. The additional columns giving dimnames are repeated according to the number of rows, and responseName is ignored in this case.
If all components of x are unnamed atomic vectors and allowLong = TRUE, each component is treated as a single-column data frame with column name given by responseName, and processed as above.
In all other cases, an attempt to simplify is made by simplify2array. If this results in multiple unnamed columns, names are constructed using responseName and sep.
A data frame with at least length(dim(x)) + 1 columns. The first length(dim(x)) columns each represent one dimension of x and give the corresponding values of dimnames, which are implicitly created if necessary. The remaining columns contain the contents of x, after attempted simplification if requested.
tapply, as.data.frame.table, split, aggregate.
s1 <- with(ToothGrowth,
           tapply(len, list(dose, supp), mean, simplify = TRUE))
s2 <- with(ToothGrowth,
           tapply(len, list(dose, supp), mean, simplify = FALSE))

str(s1) # atomic array
str(s2) # list array

str(array2DF(s1, simplify = FALSE)) # Value column is vector
str(array2DF(s2, simplify = FALSE)) # Value column is list
str(array2DF(s2, simplify = TRUE))  # simplified to vector

### The remaining examples use the default 'simplify = TRUE'

## List array with list components: columns are lists (no simplification)
with(ToothGrowth,
     tapply(len, list(dose, supp),
            function(x) t.test(x)[c("p.value", "alternative")])) |>
    array2DF() |> str()

## List array with data frame components: columns are atomic (simplified)
with(ToothGrowth,
     tapply(len, list(dose, supp),
            function(x) with(t.test(x), data.frame(p.value, alternative)))) |>
    array2DF() |> str()

## named vectors
with(ToothGrowth,
     tapply(len, list(dose, supp), quantile)) |> array2DF()

## unnamed vectors: long format
with(ToothGrowth,
     tapply(len, list(dose, supp), sample, size = 5)) |> array2DF()

## unnamed vectors: wide format
with(ToothGrowth,
     tapply(len, list(dose, supp), sample, size = 5)) |>
    array2DF(allowLong = FALSE)

## unnamed vectors of unequal length
with(ToothGrowth[-1, ],
     tapply(len, list(dose, supp), sample, replace = TRUE)) |>
    array2DF(allowLong = FALSE)

## unnamed vectors of unequal length with allowLong = TRUE
## (within-group bootstrap)
with(ToothGrowth[-1, ],
     tapply(len, list(dose, supp), sample, replace = TRUE)) |>
    array2DF() |> str()

## data frame input
tapply(ToothGrowth, ~ dose + supp, FUN = with,
       data.frame(n = length(len), mean = mean(len), sd = sd(len))) |>
    array2DF()
Functions to check if an object is a data frame, or coerce it if possible.
as.data.frame(x, row.names = NULL, optional = FALSE, ...)

## S3 method for class 'character'
as.data.frame(x, ..., stringsAsFactors = FALSE)

## S3 method for class 'list'
as.data.frame(x, row.names = NULL, optional = FALSE, ...,
              cut.names = FALSE, col.names = names(x), fix.empty.names = TRUE,
              check.names = !optional, stringsAsFactors = FALSE)

## S3 method for class 'matrix'
as.data.frame(x, row.names = NULL, optional = FALSE, make.names = TRUE, ...,
              stringsAsFactors = FALSE)

as.data.frame.vector(x, row.names = NULL, optional = FALSE, ...,
                     nm = deparse1(substitute(x)))

is.data.frame(x)
x | any R object. |
row.names |
|
optional | logical. If |
... | additional arguments to be passed to or from methods. |
stringsAsFactors | logical: should the character vector be converted to a factor? |
cut.names | logical or integer; indicating if column names with more than 256 (or |
col.names | (optional) character vector of column names. |
fix.empty.names | logical indicating if empty column names, i.e., |
check.names | logical; passed to the |
make.names | a |
nm | a |
as.data.frame is a generic function with many methods, and users and packages can supply further methods. For classes that act as vectors, often a copy of as.data.frame.vector will work as the method.
Since R 4.3.0, the default method will call as.data.frame.vector for atomic (as by is.atomic) x.
Direct calls of as.data.frame.class are still possible (base package!), for 12 atomic base classes, but are deprecated where calling as.data.frame.vector instead is recommended.
If a list is supplied, each element is converted to a column in the data frame. Similarly, each column of a matrix is converted separately. This can be overridden if the object has a class which has a method for as.data.frame: two examples are matrices of class "model.matrix" (which are included as a single column) and list objects of class "POSIXlt" which are coerced to class "POSIXct".
Arrays can be converted to data frames. One-dimensional arrays are treated like vectors and two-dimensional arrays like matrices. Arrays with more than two dimensions are converted to matrices by ‘flattening’ all dimensions after the first and creating suitable column labels.
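The list, matrix and higher-dimensional array cases can be sketched as follows. The inputs are made up for illustration, and only the resulting shapes are shown rather than the generated column labels.

```r
# Each list element becomes a column
df_list <- as.data.frame(list(id = 1:3, name = c("a", "b", "c")))

# Each matrix column becomes a column, keeping the column names
m <- matrix(1:6, nrow = 2, dimnames = list(NULL, c("x", "y", "z")))
df_mat <- as.data.frame(m)

# A 2 x 3 x 4 array is flattened to 2 rows and 3 * 4 columns
arr <- array(1:24, dim = 2:4)
df_arr <- as.data.frame(arr)

dim(df_list)
names(df_mat)
dim(df_arr)
```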
Character variables are converted to factor columns only when stringsAsFactors = TRUE, and even then not if protected by I.
If a data frame is supplied, all classes preceding "data.frame" are stripped, and the row names are changed if that argument is supplied.
If row.names = NULL, row names are constructed from the names or dimnames of x if it has any; otherwise they are the integer sequence starting at one. Few of the methods check for duplicated row names. Names are removed from vector columns unless I.
as.data.frame returns a data frame, normally with all row names "" if optional = TRUE.
is.data.frame returns TRUE if its argument is a data frame (that is, has "data.frame" amongst its classes) and FALSE otherwise.
Chambers, J. M. (1992) Data for models. Chapter 3 of Statistical Models in S eds J. M. Chambers and T. J. Hastie, Wadsworth & Brooks/Cole.
data.frame, as.data.frame.table for the table method (which has additional arguments if called directly).
Functions to convert between character representations and objects of class "Date" representing calendar dates.
as.Date(x, ...)
## S3 method for class 'character'
as.Date(x, format, tryFormats = c("%Y-%m-%d", "%Y/%m/%d"),
        optional = FALSE, ...)
## S3 method for class 'numeric'
as.Date(x, origin, ...)
## S3 method for class 'POSIXct'
as.Date(x, tz = "UTC", ...)

## S3 method for class 'Date'
format(x, format = "%Y-%m-%d", ...)
## S3 method for class 'Date'
as.character(x, ...)
x | an object to be converted. |
format | a |
tryFormats |
|
optional |
|
origin | a |
tz | a time zone name. |
... | further arguments to be passed from or to other methods. |
The usual vector re-cycling rules are applied to x and format so the answer will be of length that of the longer of the vectors.
Locale-specific conversions to and from character strings are used where appropriate and available. This affects the names of the days and months.
The as.Date methods accept character strings, factors, logical NA and objects of classes "POSIXlt" and "POSIXct". (The last is converted to days by ignoring the time after midnight in the representation of the time in specified time zone, default UTC.) Also objects of class "date" (from package date) and "dates" (from package chron). Character strings are processed as far as necessary for the format specified: any trailing characters are ignored.
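A few character conversions, as a sketch: the strings are invented for illustration, and the last call relies on the optional argument shown in the usage above.

```r
d1 <- as.Date("2001-02-03")                        # first tryFormats entry
d2 <- as.Date("2001/02/03")                        # second tryFormats entry
d3 <- as.Date("03.02.2001", format = "%d.%m.%Y")   # explicit format
d4 <- as.Date("2001-02-03 10:15:00")               # trailing characters ignored

# With optional = TRUE an unparseable string gives NA instead of an error
d5 <- as.Date("not a date", optional = TRUE)

format(c(d1, d2, d3, d4))
```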
as.Date will accept numeric data (the number of days since an epoch), since R 4.3.0 also when origin is not supplied.
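Numeric input is a day count relative to origin. The sketch below always passes origin explicitly so that it does not depend on the R 4.3.0 behaviour mentioned above; the day numbers are arbitrary.

```r
# Day 0 relative to the origin is the origin itself
epoch <- as.Date(0, origin = "1970-01-01")

# 19000 days after 1970-01-01
later <- as.Date(19000, origin = "1970-01-01")

# Going the other way: a Date is stored as days since 1970-01-01
days <- as.numeric(as.Date("1970-01-10"))

format(c(epoch, later))
```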
The format and as.character methods ignore any fractional part of the date.
The format and as.character methods return a character vector representing the date. NA dates are returned as NA_character_.
The as.Date methods return an object of class "Date".
Most systems record dates internally as the number of days since some origin, but this is fraught with problems, including:
Is the origin day 0 or day 1? As the ‘Examples’ show, Excel manages to use both choices for its two date systems.
If the origin is far enough back, the designers may show their ignorance of calendar systems. For example, Excel's designer thought 1900 was a leap year (claiming to copy the error from earlier DOS spreadsheets), and Matlab's designer chose the non-existent date of ‘January 0, 0000’ (there is no such day), not specifying the calendar. (There is such a year in the ‘Gregorian’ calendar as used in ISO 8601:2004, but that does say that it is only to be used for years before 1582 with the agreement of the parties in information exchange.)
The only safe procedure is to check the other systems' values for known dates: reports on the Internet (including R-help) are more often wrong than right.
The default formats follow the rules of the ISO 8601 international standard which expresses a day as "2001-02-03".
If the date string does not specify the date completely, the returned answer may be system-specific. The most common behaviour is to assume that a missing year, month or day is the current one. If it specifies a date incorrectly, reliable implementations will give an error and the date is reported as NA. Unfortunately some common implementations (such as ‘glibc’) are unreliable and guess at the intended meaning.
Years before 1 CE (aka 1 AD) will probably not be handled correctly.
International Organization for Standardization (2004, 1988, 1997, ...) ISO 8601. Data elements and interchange formats – Information interchange – Representation of dates and times. For links to versions available on-line see (at the time of writing) https://www.qsl.net/g1smd/isopdf.htm.
Date for details of the date class; locales to query or set a locale.
Your system's help pages on strftime and strptime to see how to specify their formats. Windows users will find no help page for strptime: code based on ‘glibc’ is used (with corrections), so all the format specifiers described here are supported, but with no alternative number representation nor era available in any locale.
## locale-specific version of the date
format(Sys.Date(), "%a %b %d")

## read in date info in format 'ddmmmyyyy'
## This will give NA(s) in some locales; setting the C locale
## as in the commented lines will overcome this on most systems.
## lct <- Sys.getlocale("LC_TIME"); Sys.setlocale("LC_TIME", "C")
x <- c("1jan1960", "2jan1960", "31mar1960", "30jul1960")
z <- as.Date(x, "%d%b%Y")
## Sys.setlocale("LC_TIME", lct)
z

## read in date/time info in format 'm/d/y'
dates <- c("02/27/92", "02/27/92", "01/14/92", "02/28/92", "02/01/92")
as.Date(dates, "%m/%d/%y")

## date given as number of days since 1900-01-01 (a date in 1989)
as.Date(32768, origin = "1900-01-01")
## Excel is said to use 1900-01-01 as day 1 (Windows default) or
## 1904-01-01 as day 0 (Mac default), but this is complicated by Excel
## incorrectly treating 1900 as a leap year.
## So for dates (post-1901) from Windows Excel
as.Date(35981, origin = "1899-12-30") # 1998-07-05
## and Mac Excel
as.Date(34519, origin = "1904-01-01") # 1998-07-05
## (these values come from http://support.microsoft.com/kb/214330)

## Experiment shows that Matlab's origin is 719529 days before ours,
## (it takes the non-existent 0000-01-01 as day 1)
## so Matlab day 734373 can be imported as
as.Date(734373) - 719529 # 2010-08-23
## (value from
## http://www.mathworks.de/de/help/matlab/matlab_prog/represent-date-and-times-in-MATLAB.html)

## Time zone effect
z <- ISOdate(2010, 04, 13, c(0,12)) # midnight and midday UTC
as.Date(z) # in UTC
## these time zone names are common
as.Date(z, tz = "NZ")
as.Date(z, tz = "HST") # Hawaii
## locale-specific version of the dateformat(Sys.Date(),"%a %b %d")## read in date info in format 'ddmmmyyyy'## This will give NA(s) in some locales; setting the C locale## as in the commented lines will overcome this on most systems.## lct <- Sys.getlocale("LC_TIME"); Sys.setlocale("LC_TIME", "C")x<- c("1jan1960","2jan1960","31mar1960","30jul1960")z<- as.Date(x,"%d%b%Y")## Sys.setlocale("LC_TIME", lct)z## read in date/time info in format 'm/d/y'dates<- c("02/27/92","02/27/92","01/14/92","02/28/92","02/01/92")as.Date(dates,"%m/%d/%y")## date given as number of days since 1900-01-01 (a date in 1989)as.Date(32768, origin="1900-01-01")## Excel is said to use 1900-01-01 as day 1 (Windows default) or## 1904-01-01 as day 0 (Mac default), but this is complicated by Excel## incorrectly treating 1900 as a leap year.## So for dates (post-1901) from Windows Excelas.Date(35981, origin="1899-12-30")# 1998-07-05## and Mac Excelas.Date(34519, origin="1904-01-01")# 1998-07-05## (these values come from http://support.microsoft.com/kb/214330)## Experiment shows that Matlab's origin is 719529 days before ours,## (it takes the non-existent 0000-01-01 as day 1)## so Matlab day 734373 can be imported asas.Date(734373)-719529# 2010-08-23## (value from## http://www.mathworks.de/de/help/matlab/matlab_prog/represent-date-and-times-in-MATLAB.html)## Time zone effectz<- ISOdate(2010,04,13, c(0,12))# midnight and midday UTCas.Date(z)# in UTC## these time zone names are commonas.Date(z, tz="NZ")as.Date(z, tz="HST")# Hawaii
A generic function coercing an R object to an environment. A number or a character string is converted to the corresponding environment on the search path.
as.environment(x)
x | an R object to convert. If it is already an environment, just return it. If it is a positive number, return the environment corresponding to that position on the search list. If it is If it is a list, the equivalent of If |
This is a primitive generic function: you can write methods to handle specific classes of objects, see InternalMethods.
The corresponding environment object.
John Chambers
environment for creation and manipulation, search; list2env.
as.environment(1) ## the global environment
identical(globalenv(), as.environment(1)) ## is TRUE
try( ## <<- stats need not be attached
    as.environment("package:stats"))
ee <- as.environment(list(a = "A", b = pi, ch = letters[1:8]))
ls(ee) # names of objects in ee
utils::ls.str(ee)
as.function is a generic function which is used to convert objects to functions.
as.function.default works on a list x, which should contain the concatenation of a formal argument list and an expression or an object of mode "call" which will become the function body. The function will be defined in a specified environment, by default that of the caller.
as.function(x, ...)

## Default S3 method:
as.function(x, envir = parent.frame(), ...)
x | object to convert, a list for the default method. |
... | additional arguments to be passed to or from methods. |
envir | environment in which the function should be defined. |
The desired function.
Peter Dalgaard
function; alist which is handy for the construction of argument lists, etc.
as.function(alist(a = , b = 2, a+b))
as.function(alist(a = , b = 2, a+b))(3)
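The role of the envir argument can be sketched as follows. This is an illustrative example, not part of the original help page; the names f, g, e and k are hypothetical.

```r
# Build function(a, b = 2) a + b from a formal-argument list plus a body.
f <- as.function(alist(a = , b = 2, a + b))
f(3)   # 5

# 'envir' decides where free variables in the body are looked up.
e <- new.env()
assign("k", 10, envir = e)          # 'k' exists only in e
g <- as.function(alist(x = , x * k), envir = e)
g(2)   # 20
identical(environment(g), e)        # TRUE
```

Because the last element of the list becomes the body, alist is convenient here: it leaves a + b and x * k unevaluated.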
Functions to manipulate objects of classes "POSIXlt" and "POSIXct" representing calendar dates and times.
as.POSIXct(x, tz = "", ...)
as.POSIXlt(x, tz = "", ...)

## S3 method for class 'character'
as.POSIXlt(x, tz = "", format,
           tryFormats = c("%Y-%m-%d %H:%M:%OS",
                          "%Y/%m/%d %H:%M:%OS",
                          "%Y-%m-%d %H:%M",
                          "%Y/%m/%d %H:%M",
                          "%Y-%m-%d",
                          "%Y/%m/%d"),
           optional = FALSE, ...)
## Default S3 method:
as.POSIXlt(x, tz = "", optional = FALSE, ...)
## S3 method for class 'numeric'
as.POSIXlt(x, tz = "", origin, ...)

## S3 method for class 'Date'
as.POSIXct(x, tz = "UTC", ...)
## S3 method for class 'Date'
as.POSIXlt(x, tz = "UTC", ...)
## S3 method for class 'numeric'
as.POSIXct(x, tz = "", origin, ...)

## S3 method for class 'POSIXlt'
as.double(x, ...)
x | R object to be converted. |
tz | a character string. The time zone specification to be used for the conversion, if one is required. System-specific (see time zones), but |
... | further arguments to be passed to or from other methods. |
format | character string giving a date-time format as used by |
tryFormats |
|
optional |
|
origin | a date-time object, or something which can be coerced by |
The as.POSIX* functions convert an object to one of the two classes used to represent date/times (calendar dates plus time to the nearest second). They can convert objects of the other class and of class "Date" to these classes. Dates without times are treated as being at midnight UTC.
They can also convert character strings of the formats "2001-02-03" and "2001/02/03" optionally followed by white space and a time in the format "14:52" or "14:52:03". (Formats such as "01/02/03" are ambiguous but can be converted via a format specification by strptime.) Fractional seconds are allowed. Alternatively, format can be specified for character vectors or factors: if it is not specified and no standard format works for all non-NA inputs an error is thrown.
If format is specified, remember that some of the format specifications are locale-specific, and you may need to set the LC_TIME category appropriately via Sys.setlocale. This most often affects the use of %a, %A (weekday names), %b, %B (month names) and %p (AM/PM).
Logical NAs can be converted to either of the classes, but no other logical vectors can be.
If you are given a numeric time as the number of seconds since an epoch, see the examples.
Character input is first converted to class "POSIXlt" by strptime: numeric input is first converted to "POSIXct". Any conversion that needs to go between the two date-time classes requires a time zone: conversion from "POSIXlt" to "POSIXct" will validate times in the selected time zone. One issue is what happens at transitions to and from DST, for example in the UK
as.POSIXct(strptime("2011-03-27 01:30:00", "%Y-%m-%d %H:%M:%S"))
as.POSIXct(strptime("2010-10-31 01:30:00", "%Y-%m-%d %H:%M:%S"))
are respectively invalid (the clocks went forward at 1:00 GMT to 2:00 BST) and ambiguous (the clocks went back at 2:00 BST to 1:00 GMT). What happens in such cases is OS-specific: one should expect the first to be NA, but the second could be interpreted as either BST or GMT (and common OSes give both possible values). Note too (see strftime) that OS facilities may not format invalid times correctly.
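The two UK transitions can be probed without depending on the session's own time zone by passing tz explicitly. The sketch below is illustrative only: it assumes the zone name "Europe/London" is known to your system, and the object names gap, fold, ok and res are hypothetical. No output is shown for the first two cases because, as noted above, it is OS-specific.

```r
tz  <- "Europe/London"   # assumption: this zone name exists on your system
fmt <- "%Y-%m-%d %H:%M:%S"

gap  <- strptime("2011-03-27 01:30:00", fmt, tz = tz)  # inside the spring gap
fold <- strptime("2010-10-31 01:30:00", fmt, tz = tz)  # occurs twice in autumn
ok   <- strptime("2011-06-01 12:00:00", fmt, tz = tz)  # ordinary summer time

# Converting "POSIXlt" to "POSIXct" is the step that validates the time.
res <- lapply(list(gap = gap, fold = fold, ok = ok), as.POSIXct)
sapply(res, is.na)        # which inputs come back as NA is OS-specific
format(res$ok, tz = "UTC", usetz = TRUE)  # the unambiguous time, shown in UTC
```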
as.POSIXct and as.POSIXlt return an object of the appropriate class. If tz was specified, as.POSIXlt will give an appropriate "tzone" attribute. Date-times known to be invalid will be returned as NA.
Some of the concepts used have to be extended backwards in time (the usage is said to be ‘proleptic’). For example, the origin of time for the "POSIXct" class, ‘1970-01-01 00:00.00 UTC’, is before UTC was defined. More importantly, conversion is done assuming the Gregorian calendar which was introduced in 1582 and not used near-universally until the 20th century. One of the re-interpretations assumed by ISO 8601:2004 is that there was a year zero, even though current year numbering (and zero) is a much later concept (525 CE for year numbers from 1 CE).
Conversions between "POSIXlt" and "POSIXct" of future times are speculative except in UTC. The main uncertainty is in the use of and transitions to/from DST (most systems will assume the continuation of current rules but these can be changed at short notice).
If you want to extract specific aspects of a time (such as the day of the week) just convert it to class "POSIXlt" and extract the relevant component(s) of the list, or if you want a character representation (such as a named day of the week) use the format method.
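As a small illustration of extracting components (a sketch added for clarity; the fixed timestamp and the name t0 are arbitrary choices, not from the original page):

```r
t0 <- as.POSIXlt("2024-06-15 17:27:47", tz = "UTC")

t0$hour          # 17
t0$wday          # 6: days since Sunday, i.e. a Saturday
t0$mon + 1       # 6: 'mon' counts months after January
t0$year + 1900   # 2024: 'year' counts years since 1900

# A character representation goes through the format method;
# the weekday name depends on the LC_TIME locale.
format(t0, "%A")
```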
If a time zone is needed and that specified is invalid on your system, what happens is system-specific but attempts to set it will probably be ignored.
Conversion from character needs to find a suitable format unless one is supplied (by trying common formats in turn): this can be slow for long inputs.
DateTimeClasses for details of the classes; strptime for conversion to and from character representations.
Sys.timezone for details of the (system-specific) naming of time zones.
locales for locale-specific aspects.
(z <- Sys.time())             # the current datetime, as class "POSIXct"
unclass(z)                    # a large integer
floor(unclass(z)/86400)       # the number of days since 1970-01-01 (UTC)
(now <- as.POSIXlt(Sys.time())) # the current datetime, as class "POSIXlt"
str(unclass(now))             # the internal list ; use now$hour, etc :
now$year + 1900               # see ?DateTimeClasses
months(now); weekdays(now)    # see ?months; using LC_TIME locale

## suppose we have a time in seconds since 1960-01-01 00:00:00 GMT
## (the origin used by SAS)
z <- 1472562988
# ways to convert this
as.POSIXct(z, origin = "1960-01-01")                # local
as.POSIXct(z, origin = "1960-01-01", tz = "GMT")    # in UTC

## SPSS dates (R-help 2006-02-16)
z <- c(10485849600, 10477641600, 10561104000, 10562745600)
as.Date(as.POSIXct(z, origin = "1582-10-14", tz = "GMT"))

## Stata date-times: milliseconds since 1960-01-01 00:00:00 GMT
## format %tc excludes leap-seconds, assumed here
## For format %tC including leap seconds, see foreign::read.dta()
z <- 1579598122120
op <- options(digits.secs = 3)
# avoid rounding down: milliseconds are not exactly representable
as.POSIXct((z+0.1)/1000, origin = "1960-01-01")
options(op)

## Matlab 'serial day number' (days and fractional days)
z <- 7.343736909722223e5 # 2010-08-23 16:35:00
as.POSIXct((z - 719529)*86400, origin = "1970-01-01", tz = "UTC")

as.POSIXlt(Sys.time(), "GMT") # the current time in UTC

## These may not be correct names on your system
as.POSIXlt(Sys.time(), "America/New_York")  # in New York
as.POSIXlt(Sys.time(), "EST5EDT")           # alternative.
as.POSIXlt(Sys.time(), "EST" )   # somewhere in Eastern Canada
as.POSIXlt(Sys.time(), "HST")    # in Hawaii
as.POSIXlt(Sys.time(), "Australia/Darwin")

tab <- file.path(R.home("share"), "zoneinfo", "zone1970.tab")
if(file.exists(tab)) { # typically on Windows; *not* on Linux
  cols <- c("code", "coordinates", "TZ", "comments")
  tmp <- read.delim(tab, header = FALSE,
                    comment.char = "#", col.names = cols)
  if(interactive()) View(tmp)
  head(tmp, 10)
}
Change the class of an object to indicate that it should be treated ‘as is’.
I(x)
x | an object |
Function I has two main uses.
In function data.frame. Protecting an object by enclosing it in I() in a call to data.frame inhibits the conversion of character vectors to factors and the dropping of names, and ensures that matrices are inserted as single columns. I can also be used to protect objects which are to be added to a data frame, or converted to a data frame via as.data.frame.
It achieves this by prepending the class "AsIs" to the object's classes. Class "AsIs" has a few of its own methods, including for [, as.data.frame, print and format.
In function formula. There it is used to inhibit the interpretation of operators such as "+", "-", "*" and "^" as formula operators, so they are used as arithmetical operators. This is interpreted as a symbol by terms.formula.
A copy of the object with class "AsIs" prepended to the class(es).
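Both uses can be seen in a short sketch. This example is illustrative and not taken from the original page; the object names m, df_plain, df_asis, labels_plain and labels_asis are hypothetical.

```r
## Use 1: data.frame(). Without I() a matrix is split into one column
## per matrix column; with I() it is kept as a single column.
m <- matrix(1:6, nrow = 3)
df_plain <- data.frame(id = 1:3, m = m)
df_asis  <- data.frame(id = 1:3, m = I(m))
ncol(df_plain)                 # 3
ncol(df_asis)                  # 2
inherits(df_asis$m, "AsIs")    # TRUE

## Use 2: formulas. Inside I(), "+" is ordinary arithmetic, so the
## right-hand side is one term instead of two.
labels_plain <- attr(stats::terms(y ~ x + z), "term.labels")
labels_asis  <- attr(stats::terms(y ~ I(x + z)), "term.labels")
labels_plain                   # "x" "z"
labels_asis                    # "I(x + z)"
```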
Chambers, J. M. (1992) Linear models. Chapter 4 of Statistical Models in S eds J. M. Chambers and T. J. Hastie, Wadsworth & Brooks/Cole.
Split an array or matrix by its margins.
asplit(x, MARGIN)
x | an array, including a matrix. |
MARGIN | a vector giving the margins to split by. E.g., for a matrix |
Since R 4.1.0, one can also obtain the splits (less efficiently) using apply(x, MARGIN, identity, simplify = FALSE). The values of the splits can also be obtained (less efficiently) by split(x, slice.index(x, MARGIN)).
A “list array” with dimension dv and each element an array of dimension de and dimnames preserved as available, where dv and de are, respectively, the dimensions of x included and not included in MARGIN.
## A 3-dimensional array of dimension 2 x 3 x 4:
d <- 2 : 4
x <- array(seq_len(prod(d)), d)
x
## Splitting by margin 2 gives a 1-d list array of length 3
## consisting of 2 x 4 arrays:
asplit(x, 2)
## Splitting by margins 1 and 2 gives a 2 x 3 list array
## consisting of 1-d arrays of length 4:
asplit(x, c(1, 2))
## Compare to
split(x, slice.index(x, c(1, 2)))

## A 2 x 3 matrix:
(x <- matrix(1 : 6, 2, 3))
## To split x by its rows, one can use
asplit(x, 1)
## or less efficiently
split(x, slice.index(x, 1))
split(x, row(x))
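The shapes involved can be checked directly. The following sketch is an added illustration (the names s and s2 are hypothetical) using the same 2 x 3 x 4 array as above:

```r
x <- array(seq_len(24), dim = 2:4)

s <- asplit(x, c(1, 2))   # split by margins 1 and 2
dim(s)                    # 2 3: the dimensions included in MARGIN
as.vector(s[[1, 2]])      # 3 9 15 21, i.e. x[1, 2, ]

s2 <- asplit(x, 2)        # split by margin 2 only
length(s2)                # 3
dim(s2[[1]])              # 2 4: the dimensions not included in MARGIN
```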
Assign a value to a name in an environment.
assign(x, value, pos = -1, envir = as.environment(pos),
       inherits = FALSE, immediate = TRUE)
x | a variable name, given as a character string. No coercion is done, and the first element of a character vector of length greater than one will be used, with a warning. |
value | a value to be assigned to |
pos | where to do the assignment. By default, assigns into the current environment. See ‘Details’ for other possibilities. |
envir | the |
inherits | should the enclosing frames of the environment be inspected? |
immediate | an ignored compatibility feature. |
There are no restrictions on the name given as x: it can be a non-syntactic name (see make.names).
The pos argument can specify the environment in which to assign the object in any of several ways: as -1 (the default), as a positive integer (the position in the search list); as the character string name of an element in the search list; or as an environment (including using sys.frame to access the currently active function calls). The envir argument is an alternative way to specify an environment, but is primarily for back compatibility.
assign does not dispatch assignment methods, so it cannot be used to set elements of vectors, names, attributes, etc.
Note that assignment to an attached list or data frame changes the attached copy and not the original object: see attach and with.
This function is invoked for its side effect, which is assigning value to the variable x. If no envir is specified, then the assignment takes place in the currently active environment.
If inherits is TRUE, enclosing environments of the supplied environment are searched until the variable x is encountered. The value is then assigned in the environment in which the variable is encountered (provided that the binding is not locked: see lockBinding: if it is, an error is signaled). If the symbol is not encountered then assignment takes place in the user's workspace (the global environment).
If inherits is FALSE, assignment takes place in the initial frame of envir, unless an existing binding is locked or there is no existing binding and the environment is locked (when an error is signaled).
Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988) The New S Language. Wadsworth & Brooks/Cole.
<-, get, the inverse of assign(), exists, environment.
for(i in 1:6) { #-- Create objects 'r.1', 'r.2', ... 'r.6' --
    nam <- paste("r", i, sep = ".")
    assign(nam, 1:i)
}
ls(pattern = "^r..$")

##-- Global assignment within a function:
myf <- function(x) {
    innerf <- function(x) assign("Global.res", x^2, envir = .GlobalEnv)
    innerf(x+1)
}
myf(3)
Global.res # 16

a <- 1:4
assign("a[1]", 2)
a[1] == 2          # FALSE
get("a[1]") == 2   # TRUE
Assign a value to a name.
x <- value
x <<- value
value -> x
value ->> x

x = value
x | a variable name (possibly quoted). |
value | a value to be assigned to |
There are three different assignment operators: two of them have leftwards and rightwards forms.
The operators <- and = assign into the environment in which they are evaluated. The operator <- can be used anywhere, whereas the operator = is only allowed at the top level (e.g., in the complete expression typed at the command prompt) or as one of the subexpressions in a braced list of expressions.
The operators <<- and ->> are normally only used in functions, and cause a search to be made through parent environments for an existing definition of the variable being assigned. If such a variable is found (and its binding is not locked) then its value is redefined, otherwise assignment takes place in the global environment. Note that their semantics differ from that in the S language, but are useful in conjunction with the scoping rules of R. See ‘The R Language Definition’ manual for further details and examples.
In all the assignment operator expressions, x can be a name or an expression defining a part of an object to be replaced (e.g., z[[1]]). A syntactic name does not need to be quoted, though it can be (preferably by backticks).
The leftwards forms of assignment <- = <<- group right to left, the other from left to right.
value. Thus one can use a <- b <- c <- 6.
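A common use of <<- with R's scoping rules is a closure that keeps state between calls. The sketch below is an added illustration; make_counter, counter, first, second, p and q are hypothetical names.

```r
make_counter <- function() {
  n <- 0
  function() {
    n <<- n + 1   # finds 'n' in the enclosing function's environment
    n
  }
}

counter <- make_counter()
first  <- counter()   # 1
second <- counter()   # 2
exists("n", envir = globalenv(), inherits = FALSE)  # FALSE unless you defined 'n' yourself

## Leftwards assignment groups right to left and returns its value,
## so chained assignment works:
p <- q <- 6
```

Here <<- never reaches the global environment, because a binding for n is found one level up.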
Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988) The New S Language. Wadsworth & Brooks/Cole.
Chambers, J. M. (1998) Programming with Data. A Guide to the S Language. Springer (for =).
assign (and its inverse get), for “subassignment” such as x[i] <- v, see [<-; further, environment.
The database is attached to the R search path. This means that the database is searched by R when evaluating a variable, so objects in the database can be accessed by simply giving their names.
attach(what, pos = 2L, name = deparse1(substitute(what), backtick=FALSE),
       warn.conflicts = TRUE)
what | ‘database’. This can be a |
pos | integer specifying position in |
name | name to use for the attached database. Names starting with |
warn.conflicts | logical. If NB: Even though the name is |
When evaluating a variable or function name R searches for that name in the databases listed by search. The first name of the appropriate type is used.
By attaching a data frame (or list) to the search path it is possible to refer to the variables in the data frame by their names alone, rather than as components of the data frame (e.g., in the example below, height rather than women$height).
By default the database is attached in position 2 in the search path, immediately after the user's workspace and before all previously attached packages and previously attached databases. This can be altered to attach later in the search path with the pos option, but you cannot attach at pos = 1.
The database is not actually attached. Rather, a new environment is created on the search path and the elements of a list (including columns of a data frame) or objects in a save file or an environment are copied into the new environment. If you use <<- or assign to assign to an attached database, you only alter the attached copy, not the original object. (Normal assignment will place a modified version in the user's workspace: see the examples.) For this reason attach can lead to confusion.
One useful ‘trick’ is to use what = NULL (or equivalently a length-zero list) to create a new environment on the search path into which objects can be assigned by assign or load or sys.source.
Names starting "package:" are reserved for library and should not be used by end users. Attached files are by default given the name file:what. The name argument given for the attached environment will be used by search and can be used as the argument to as.environment.
The environment is returned invisibly with a "name" attribute.
attach has the side effect of altering the search path and this can easily lead to the wrong object of a particular name being found. People do often forget to detach databases.
In interactive use, with is usually preferable to the use of attach/detach, unless what is a save()-produced file in which case attach() is a (safety) wrapper for load().
In programming, functions should not change the search path unless that is their purpose. Often with can be used within a function. If not, good practice is to
Always use a distinctive name argument, and
To immediately follow the attach call by an on.exit call to detach using the distinctive name.
This ensures that the search path is left unchanged even if the function is interrupted or if code after the attach call changes the search path.
Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988) The New S Language. Wadsworth & Brooks/Cole.
library, detach, search, objects, environment, with.
require(utils)

summary(women$height)   # refers to variable 'height' in the data frame
attach(women)
summary(height)         # The same variable now available by name
height <- height*2.54   # Don't do this. It creates a new variable
                        # in the user's workspace
find("height")
summary(height)         # The new variable in the workspace
rm(height)
summary(height)         # The original variable.
height <<- height*25.4  # Change the copy in the attached environment
find("height")
summary(height)         # The changed copy
detach("women")
summary(women$height)   # unchanged

## Not run: ## create an environment on the search path and populate it
sys.source("myfuns.R", envir = attach(NULL, name = "myfuns"))

## End(Not run)
Get or set specific attributes of an object.
attr(x, which, exact = FALSE)
attr(x, which) <- value
x | an object whose attributes are to be accessed. |
which | a non-empty character string specifying which attribute is to be accessed. |
exact | logical: should |
value | an object, the new value of the attribute, or |
These functions provide access to a single attribute of an object. The replacement form causes the named attribute to take the value specified (or create a new attribute with the value given).
The extraction function first looks for an exact match to which amongst the attributes of x, then (unless exact = TRUE) a unique partial match. (Setting options(warnPartialMatchAttr = TRUE) causes partial matches to give warnings.)
The replacement function only uses exact matches.
Note that some attributes (namely class, comment, dim, dimnames, names, row.names and tsp) are treated specially and have restrictions on the values which can be set. (Note that this is not true of levels which should be set for factors via the levels replacement function.)
The extractor function allows (and does not match) empty and missing values of which: the replacement function does not.
NULL objects cannot have attributes and attempting to assign one by attr gives an error.
Both are primitive functions.
For the extractor, the value of the attribute matched, or NULL if no exact match is found and no or more than one partial match is found.
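The matching rules can be illustrated with a short sketch (added here for clarity; the attribute name "myattribute" and the variables partial and strict are hypothetical):

```r
x <- 1:10
attr(x, "myattribute") <- "hello"

partial <- attr(x, "my")                 # unique partial match: "hello"
strict  <- attr(x, "my", exact = TRUE)   # NULL: no exact match

## The replacement form only matches exactly, so this creates a
## second attribute called "my" instead of changing "myattribute".
attr(x, "my") <- "other"
names(attributes(x))                     # "myattribute" "my"
attr(x, "myattribute")                   # still "hello"
```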
Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988) The New S Language. Wadsworth & Brooks/Cole.
# create a 2 by 5 matrix
x <- 1:10
attr(x,"dim") <- c(2, 5)
These functions access an object's attributes. The first form below returns the object's attribute list. The replacement forms use the list on the right-hand side of the assignment as the object's attributes (if appropriate).
attributes(x)
attributes(x) <- value
mostattributes(x) <- value
x | any R object. |
value | an appropriate named |
Unlike attr it is not an error to set attributes on a NULL object: it will first be coerced to an empty list.
Note that some attributes (namely class, comment, dim, dimnames, names, row.names and tsp) are treated specially and have restrictions on the values which can be set. (Note that this is not true of levels which should be set for factors via the levels replacement function.)
Attributes are not stored internally as a list and should be thought of as a set and not a vector, i.e., the order of the elements of attributes() does not matter. This is also reflected by identical()'s behaviour with the default argument attrib.as.set = TRUE. Attributes must have unique names (and NA is taken as "NA", not a missing value).
Assigning attributes first removes all attributes, then sets any dim attribute and then the remaining attributes in the order given: this ensures that setting a dim attribute always precedes the dimnames attribute.
The mostattributes assignment takes special care for the dim, names and dimnames attributes, and assigns them only when known to be valid whereas an attributes assignment would give an error if any are not. It is principally intended for arrays, and should be used with care on classed objects. For example, it does not check that row.names are assigned correctly for data frames.
The names of a pairlist are not stored as attributes, but are reported as if they were (and can be set by the replacement form of attributes).
NULL objects cannot have attributes and attempts to assign them will promote the object to an empty list.
Both assignment and replacement forms of attributes are primitive functions.
Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988) The New S Language. Wadsworth & Brooks/Cole.
x <- cbind(a = 1:3, pi = pi) # simple matrix with dimnames
attributes(x)

## strip an object's attributes:
attributes(x) <- NULL
x # now just a vector of length 6

mostattributes(x) <- list(mycomment = "really special", dim = 3:2,
   dimnames = list(LETTERS[1:3], letters[1:5]), names = paste(1:6))
x # dim(), but not {dim}names
autoload creates a promise-to-evaluate autoloader and stores it with name name in .AutoloadEnv environment. When R attempts to evaluate name, autoloader is run, the package is loaded and name is re-evaluated in the new package's environment. The result is that R behaves as if package was loaded but it does not occupy memory.
.Autoloaded contains the names of the packages for which autoloading has been promised.
autoload(name, package, reset = FALSE, ...)
autoloader(name, package, ...)

.AutoloadEnv
.Autoloaded
name | string giving the name of an object. |
package | string giving the name of a package containing the object. |
reset | logical: for internal use by |
... | other arguments to |
This function is invoked for its side-effect. It has no return value.
require(stats)
autoload("interpSpline", "splines")
search()
ls("Autoloads")
.Autoloaded

x <- sort(stats::rnorm(12))
y <- x^2
is <- interpSpline(x, y)
search() ## now has splines
detach("package:splines")
search()
is2 <- interpSpline(x, y+x)
search() ## and again
detach("package:splines")
Solves a triangular system of linear equations.
backsolve(r, x, k = ncol(r), upper.tri = TRUE,
             transpose = FALSE)
forwardsolve(l, x, k = ncol(l), upper.tri = FALSE,
             transpose = FALSE)
r, l | an upper (or lower) triangular matrix giving the coefficients for the system to be solved. Values below (above) the diagonal are ignored. |
x | a matrix whose columns give the right-hand sides for the equations. |
k | the number of columns of |
upper.tri | logical; if |
transpose | logical; if |
Solves a system of linear equations where the coefficient matrix is upper (or ‘right’, ‘R’) or lower (‘left’, ‘L’) triangular.
x <- backsolve (R, b) solves R x = b, and
x <- forwardsolve(L, b) solves L x = b, respectively.
The r/l must have at least k rows and columns, and x must have at least k rows.
This is a wrapper for the level-3 BLAS routine dtrsm.
The solution of the triangular system. The result will be a vector if x is a vector and a matrix if x is a matrix.
Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988)The New S Language.Wadsworth & Brooks/Cole.
Dongarra, J. J., Bunch, J. R., Moler, C. B. and Stewart, G. W. (1978)LINPACK Users Guide. Philadelphia: SIAM Publications.
## upper triangular matrix 'r':
r <- rbind(c(1,2,3),
           c(0,1,1),
           c(0,0,2))
( y <- backsolve(r, x <- c(8,4,2)) ) # -1 3 1
r %*% y # == x = (8,4,2)
backsolve(r, x, transpose = TRUE) # 8 -12 -5
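The example above only exercises the upper-triangular case. As a minimal sketch of the lower-triangular counterpart and of the k argument (the matrix l and right-hand side b are made-up values, chosen purely for illustration), one might write:

```r
## Hypothetical lower-triangular system; the numbers are illustrative only.
l <- rbind(c(2, 0, 0),
           c(1, 1, 0),
           c(3, 2, 1))
b <- c(4, 5, 13)

y <- forwardsolve(l, b)          # forward substitution: l %*% y == b
stopifnot(all.equal(drop(l %*% y), b))

## k restricts the solve to the leading k x k block of l and the
## first k rows of the right-hand side.
y2 <- forwardsolve(l, b, k = 2)
stopifnot(all.equal(drop(l[1:2, 1:2] %*% y2), b[1:2]))

## The same system written as the transpose of an upper-triangular matrix.
stopifnot(all.equal(y, backsolve(t(l), b, transpose = TRUE)))
```

Because b is a plain vector here, both results come back as vectors rather than one-column matrices.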
Utilities to ‘balance’ objects of class "POSIXlt".
unCfillPOSIXlt(x) is a fast primitive version of balancePOSIXlt(x, fill.only=TRUE, classed=FALSE) or equivalently, unclass(balancePOSIXlt(x, fill.only=TRUE)), from which it takes its name.
balancePOSIXlt(x, fill.only = FALSE, classed = TRUE)
unCfillPOSIXlt(x)
x | an R object inheriting from |
fill.only | a |
classed | a |
Note that "POSIXlt" objects x may have their (9 to 11) list components of different lengths, by simply recycling them to full length. Prior to R 4.3.0, this has worked in printing, formatting, and conversion to "POSIXct", but often not for length(), conversion to "Date" or indexing, i.e., subsetting, [, or subassigning, [<-.
Relatedly, components sec, min, hour, mday and mon could have been out of their designated range (say, 0–23 for hours) and still work correctly, e.g. in conversions and printing. This is supported as well, since R 4.3.0, at least when the values are not extreme.
Function balancePOSIXlt(x) will now return a version of the "POSIXlt" object x which by default is balanced in both ways: All the internal list components are of full length, and their values are inside their ranges as specified in as.POSIXlt's ‘Details on POSIXlt’. Setting fill.only = TRUE will only recycle the list components to full length, but not check them at all. This is particularly faster when all components of x are already of full length.
Experimentally, balancePOSIXlt() and other functions returning POSIXlt objects now set a logical attribute "balanced" with NA meaning “filled-in”, i.e., not “ragged” and TRUE means (fully) balanced.
For more details about many aspects of valid POSIXlt objects, notably their internal list components, see ‘DateTimeClasses’, e.g., as.POSIXlt, notably the section ‘Details on POSIXlt’.
## FIXME: this should also work for regular (non-UTC) time zones.
TZ <-"UTC"
# Could be
# d1 <- as.POSIXlt("2000-01-02 3:45", tz = TZ)
# on systems (almost all) which have tm_zone.
oldTZ <- Sys.getenv('TZ', unset = "unset")
Sys.setenv(TZ = "UTC")
d1 <- as.POSIXlt("2000-01-02 3:45")
d1$min <- d1$min + (0:16)*20L
(f1 <- format(d1))
str(unclass(d1)) # only $min is of length > 1
df <- balancePOSIXlt(d1, fill.only = TRUE) # a "POSIXlt" object
str(unclass(df)) # all of length 17; 'min' unchanged
db <- balancePOSIXlt(d1, classed = FALSE) # a list
stopifnot(identical(
    unCfillPOSIXlt(d1),
    balancePOSIXlt(d1, fill.only = TRUE, classed = FALSE)))
str(db) # of length 17 *and* in range

if(oldTZ == "unset") Sys.unsetenv('TZ') else Sys.setenv(TZ = oldTZ)
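The out-of-range case described above can also be sketched with a single time stamp. This is only an illustration, assuming R 4.3.0 or later; the date and the 45-minute offset are arbitrary choices, not part of the original examples:

```r
## Assumes R >= 4.3.0, where balancePOSIXlt() is available.
lt <- as.POSIXlt("2024-01-31 23:30:00", tz = "UTC")
lt$min <- lt$min + 45L            # 75 minutes: outside the nominal 0-59 range

bal <- balancePOSIXlt(lt)         # components brought back into their ranges

## Both objects denote the same instant; only the representation differs.
stopifnot(as.POSIXct(bal) == as.POSIXct(lt))
unlist(unclass(bal)[c("min", "hour", "mday", "mon")])
```

The carry from minutes propagates through hour, day and month, which is why the comparison is made on the "POSIXct" conversion rather than on individual components.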
basename removes all of the path up to and including the last path separator (if any).
dirname returns the part of the path up to but excluding the last path separator, or "." if there is no path separator.
basename(path)
dirname(path)
path | character vector, containing path names. |
Tilde expansion of the path will be performed.
Trailing path separators are removed before dissecting the path, and for dirname any trailing file separators are removed from the result.
A character vector of the same length as path. A zero-length input will give a zero-length output with no error.
Paths not containing any separators are taken to be in the current directory, so dirname returns ".".
If an element of path is NA, so is the result.
"" is not a valid pathname, but is returned unchanged.
On Windows this will accept either \ or / as the path separator, but dirname will return a path using / (except if on a network share, when the leading \\ will be preserved). Expect these only to be able to handle complete paths, and not for example just a network share or a drive.
UTF-8-encoded path names not valid in the current locale can be used.
These are not wrappers for the POSIX system functions of the same names: in particular they do not have the special handling of the path "/" and of returning "." for empty strings.
basename(file.path("","p1","p2","p3", c("file1", "file2")))
dirname (file.path("","p1","p2","p3", "filename"))
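The edge cases listed above (trailing separators, paths without a separator, NA) can be checked directly. The following is a small sketch with made-up path names, written with / separators as on a Unix-alike:

```r
## Illustrative paths only; none of them needs to exist on disk.
p <- c("/usr/lib/R/", "file.txt", "a/b/c.tar.gz", NA)

bn <- basename(p)   # trailing separator is dropped before dissecting
dn <- dirname(p)    # "." when the path contains no separator

data.frame(path = p, dirname = dn, basename = bn)
```

The data frame is used here only to show the three vectors side by side; both functions are vectorised and return one element per input path.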
Bessel Functions of integer and fractional order, of first and second kind, $J_\nu$ and $Y_\nu$, and Modified Bessel functions (of first and third kind), $I_\nu$ and $K_\nu$.
besselI(x, nu, expon.scaled = FALSE)
besselK(x, nu, expon.scaled = FALSE)
besselJ(x, nu)
besselY(x, nu)
x | numeric, |
nu | numeric; the order (maybe fractional and negative) of the corresponding Bessel function. |
expon.scaled | logical; if |
If expon.scaled = TRUE, $e^{-x} I_\nu(x)$ or $e^{x} K_\nu(x)$ are returned.
For $\nu < 0$, formulae 9.1.2 and 9.6.2 from Abramowitz & Stegun are applied (which is probably suboptimal), except for besselK which is symmetric in nu.
The current algorithms will give warnings about accuracy loss for large arguments. In some cases, these warnings are exaggerated, and the precision is perfect. For large nu, say in the order of millions, the current algorithms are rarely useful.
Numeric vector with the (scaled, if expon.scaled = TRUE) values of the corresponding Bessel function.
The length of the result is the maximum of the lengths of the parameters. All parameters are recycled to that length.
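The scaling, symmetry and recycling rules above can be verified numerically. This is a minimal sketch; the argument values x and the order nu are arbitrary choices made for illustration:

```r
x  <- c(0.5, 1, 5, 50)   # illustrative arguments
nu <- 1.5                # illustrative fractional order

## expon.scaled = TRUE multiplies by exp(-x) for I and by exp(x) for K
stopifnot(
  all.equal(besselI(x, nu, expon.scaled = TRUE), exp(-x) * besselI(x, nu)),
  all.equal(besselK(x, nu, expon.scaled = TRUE), exp( x) * besselK(x, nu))
)

## besselK is symmetric in nu
stopifnot(all.equal(besselK(x, -nu), besselK(x, nu)))

## x and nu are recycled to the longer of the two lengths
length(besselJ(1:6, nu = c(0, 1)))   # 6
```

The scaled forms are mainly useful for large x, where the unscaled I grows and the unscaled K decays roughly exponentially.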
Original Fortran code: W. J. Cody, Argonne National Laboratory
Translation to C and adaptation to R: Martin Maechler [email protected].
The C code is a translation of Fortran routines from https://netlib.org/specfun/ribesl, ‘../rjbesl’, etc. The four source code files for bessel[IJKY] each contain a paragraph “Acknowledgement” and “References”, a short summary of which is
besselI | based on (code) by David J. Sookne, see Sookne (1973)... Modifications... An earlier version was published in Cody (1983). |
besselJ | as besselI |
besselK | based on (code) by J. B. Campbell (1980)... Modifications... |
besselY | draws heavily on Temme's Algol program for... and on Campbell's programs for.... ... heavily modified. |
Abramowitz, M. and Stegun, I. A. (1972). Handbook of Mathematical Functions. Dover, New York; Chapter 9: Bessel Functions of Integer Order.
In order of “Source” citation above:
Sookne, David J. (1973). Bessel Functions of Real Argument and Integer Order. Journal of Research of the National Bureau of Standards, 77B, 125–132. doi:10.6028/jres.077B.012.
Cody, William J. (1983). Algorithm 597: Sequence of modified Bessel functions of the first kind. ACM Transactions on Mathematical Software, 9(2), 242–245. doi:10.1145/357456.357462.
Campbell, J.B. (1980). On Temme's algorithm for the modified Bessel function of the third kind. ACM Transactions on Mathematical Software, 6(4), 581–586. doi:10.1145/355921.355928.
Campbell, J.B. (1979). Bessel functions J_nu(x) and Y_nu(x) of float order and float argument. Computer Physics Communications, 18, 133–142. doi:10.1016/0010-4655(79)90030-4.
Temme, Nico M. (1976). On the numerical evaluation of the ordinary Bessel function of the second kind. Journal of Computational Physics, 21, 343–350. doi:10.1016/0021-9991(76)90032-2.
Other special mathematical functions, such as gamma, $\Gamma(x)$, and beta, $B(x)$.
require(graphics)

nus <- c(0:5, 10, 20)

x <- seq(0, 4, length.out = 501)
plot(x, x, ylim = c(0, 6), ylab = "", type = "n",
     main = "Bessel Functions I_nu(x)")
for(nu in nus) lines(x, besselI(x, nu = nu), col = nu + 2)
legend(0, 6, legend = paste("nu=", nus), col = nus + 2, lwd = 1)

x <- seq(0, 40, length.out = 801); yl <- c(-.5, 1)
plot(x, x, ylim = yl, ylab = "", type = "n",
     main = "Bessel Functions J_nu(x)")
abline(h=0, v=0, lty=3)
for(nu in nus) lines(x, besselJ(x, nu = nu), col = nu + 2)
legend("topright", legend = paste("nu=", nus), col = nus + 2, lwd = 1, bty="n")

## Negative nu's --------------------------------------------------
xx <- 2:7
nu <- seq(-10, 9, length.out = 2001)
## --- I() --- --- --- ---
matplot(nu, t(outer(xx, nu, besselI)), type = "l", ylim = c(-50, 200),
        main = expression(paste("Bessel ", I[nu](x), " for fixed ", x,
                                ", as ", f(nu))),
        xlab = expression(nu))
abline(v = 0, col = "light gray", lty = 3)
legend(5, 200, legend = paste("x=", xx), col=seq(xx), lty=1:5)

## --- J() --- --- --- ---
bJ <- t(outer(xx, nu, besselJ))
matplot(nu, bJ, type = "l", ylim = c(-500, 200),
        xlab = quote(nu), ylab = quote(J[nu](x)),
        main = expression(paste("Bessel ", J[nu](x), " for fixed ", x)))
abline(v = 0, col = "light gray", lty = 3)
legend("topright", legend = paste("x=", xx), col=seq(xx), lty=1:5)

## ZOOM into right part:
matplot(nu[nu > -2], bJ[nu > -2,], type = "l",
        xlab = quote(nu), ylab = quote(J[nu](x)),
        main = expression(paste("Bessel ", J[nu](x), " for fixed ", x)))
abline(h=0, v = 0, col = "gray60", lty = 3)
legend("topright", legend = paste("x=", xx), col=seq(xx), lty=1:5)

##--------------- x --> 0 -----------------------------
x0 <- 2^seq(-16, 5, length.out=256)
plot(range(x0), c(1e-40, 1), log = "xy", xlab = "x", ylab = "", type = "n",
     main = "Bessel Functions J_nu(x) near 0\n log - log scale") ; axis(2, at=1)
for(nu in sort(c(nus, nus+0.5)))
    lines(x0, besselJ(x0, nu = nu), col = nu + 2, lty= 1+ (nu%%1 > 0))
legend("right", legend = paste("nu=", paste(nus, nus+0.5, sep=", ")),
       col = nus + 2, lwd = 1, bty="n")

x0 <- 2^seq(-10, 8, length.out=256)
plot(range(x0), 10^c(-100, 80), log = "xy", xlab = "x", ylab = "", type = "n",
     main = "Bessel Functions K_nu(x) near 0\n log - log scale") ; axis(2, at=1)
for(nu in sort(c(nus, nus+0.5)))
    lines(x0, besselK(x0, nu = nu), col = nu + 2, lty= 1+ (nu%%1 > 0))
legend("topright", legend = paste("nu=", paste(nus, nus + 0.5, sep = ", ")),
       col = nus + 2, lwd = 1, bty="n")

x <- x[x > 0]
plot(x, x, ylim = c(1e-18, 1e11), log = "y", ylab = "", type = "n",
     main = "Bessel Functions K_nu(x)"); axis(2, at=1)
for(nu in nus) lines(x, besselK(x, nu = nu), col = nu + 2)
legend(0, 1e-5, legend=paste("nu=", nus), col = nus + 2, lwd = 1)

yl <- c(-1.6, .6)
plot(x, x, ylim = yl, ylab = "", type = "n",
     main = "Bessel Functions Y_nu(x)")
for(nu in nus){
    xx <- x[x > .6*nu]
    lines(xx, besselY(xx, nu=nu), col = nu+2)
}
legend(25, -.5, legend = paste("nu=", nus), col = nus+2, lwd = 1)

## negative nu in bessel_Y -- was bogus for a long time
curve(besselY(x, -0.1), 0, 10, ylim = c(-3,1), ylab = "")
for(nu in c(seq(-0.2, -2, by = -0.1)))
  curve(besselY(x, nu), add = TRUE)
title(expression(besselY(x, nu) * " " *
                 {nu == list(-0.1, -0.2, ..., -2)}))
These functions represent an interface for adjustments to environments and bindings within environments. They allow for locking environments as well as individual bindings, and for linking a variable to a function.
lockEnvironment(env, bindings = FALSE)
environmentIsLocked(env)
lockBinding(sym, env)
unlockBinding(sym, env)
bindingIsLocked(sym, env)

makeActiveBinding(sym, fun, env)
bindingIsActive(sym, env)
activeBindingFunction(sym, env)
env | an environment. |
bindings | logical specifying whether bindings should be locked. |
sym | a name object or character string. |
fun | a function taking zero or one arguments. |
The function lockEnvironment locks its environment argument. Locking the environment prevents adding or removing variable bindings from the environment. Changing the value of a variable is still possible unless the binding has been locked. The namespace environments of packages with namespaces are locked when loaded.
lockBinding locks individual bindings in the specified environment. The value of a locked binding cannot be changed. Locked bindings may be removed from an environment unless the environment is locked.
makeActiveBinding installs fun in environment env so that getting the value of sym calls fun with no arguments, and assigning to sym calls fun with one argument, the value to be assigned. This allows the implementation of things like C variables linked to R variables and variables linked to databases, and is used to implement setRefClass. It may also be useful for making thread-safe versions of some system globals. Currently active bindings are not preserved during package installation, but they can be created in .onLoad.
The functions bindingIsLocked and environmentIsLocked return a length-one logical vector. The remaining functions return NULL, invisibly.
Luke Tierney
# locking environments
e <- new.env()
assign("x", 1, envir = e)
get("x", envir = e)
lockEnvironment(e)
get("x", envir = e)
assign("x", 2, envir = e)
try(assign("y", 2, envir = e)) # error

# locking bindings
e <- new.env()
assign("x", 1, envir = e)
get("x", envir = e)
lockBinding("x", e)
try(assign("x", 2, envir = e)) # error
unlockBinding("x", e)
assign("x", 2, envir = e)
get("x", envir = e)

# active bindings
f <- local( {
    x <- 1
    function(v) {
       if (missing(v))
           cat("get\n")
       else {
           cat("set\n")
           x <<- v
       }
       x
    }
})
makeActiveBinding("fred", f, .GlobalEnv)
bindingIsActive("fred", .GlobalEnv)
fred
fred <- 2
fred
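Two points from the description are not shown in the examples above: an active binding can live in a private environment rather than the global one, and a locked binding can still be removed while its environment is unlocked. The following sketch illustrates both; the names ticket and const are hypothetical and exist only for this illustration:

```r
e <- new.env()

## Hypothetical read-only active binding backed by a private counter.
makeActiveBinding("ticket", local({
  n <- 0L
  function(v) {
    if (!missing(v)) stop("'ticket' is read-only")
    n <<- n + 1L          # every read calls this function once
    n
  }
}), e)

first  <- e$ticket
second <- e$ticket
stopifnot(bindingIsActive("ticket", e), second == first + 1L)

## A locked binding cannot be changed ...
assign("const", 42, envir = e)
lockBinding("const", e)
res <- try(assign("const", 0, envir = e), silent = TRUE)
stopifnot(inherits(res, "try-error"), bindingIsLocked("const", e))

## ... but it can be removed, because the environment itself is not locked.
rm("const", envir = e)
stopifnot(!exists("const", envir = e, inherits = FALSE))
```

Keeping the binding in its own environment avoids leaving an active variable behind in the workspace, which the fred example above does.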
Logical operations on integer vectors with elements viewed as sets of bits.
bitwNot(a)
bitwAnd(a, b)
bitwOr(a, b)
bitwXor(a, b)

bitwShiftL(a, n)
bitwShiftR(a, n)
a ,b | integer vectors; numeric vectors are coerced to integer vectors. |
n | non-negative integer vector of values up to 31. |
Each element of an integer vector has 32 bits.
Pairwise operations can result in integer NA.
Shifting is done assuming the values represent unsigned integers.
An integer vector of length the longer of the arguments, or zero length if one is zero-length.
The output element is NA if an input is NA (after coercion) or an invalid shift.
The logical operators, !, &, |, xor. Notably these do work bitwise for raw arguments.
The classes "octmode" and "hexmode" whose implementation of the standard logical operators is based on these functions.
Package bitops has similar functions for numeric vectors which differ in the way they treat integers $2^{31}$ or larger.
bitwNot(0:12) # -1 -2 ... -13
bitwAnd(15L, 7L) # 7
bitwOr (15L, 7L) # 15
bitwXor(15L, 7L) # 8
bitwXor(-1L, 1L) # -2

## The "same" for 'raw' instead of integer :
rr12 <- as.raw(0:12) ; rbind(rr12, !rr12)
c(r15 <- as.raw(15), r7 <- as.raw(7)) # 0f 07
r15 & r7 # 07
r15 | r7 # 0f
xor(r15, r7)# 08

bitwShiftR(-1, 1:31) # shifts of 2^32-1 = 4294967295
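A common use of these operations is packing several on/off flags into one integer. The sketch below assumes three hypothetical flags (READ, WRITE, EXEC); the names and bit positions are illustrative and not part of R:

```r
## Hypothetical permission flags, one bit each.
READ  <- 1L   # binary 001
WRITE <- 2L   # binary 010
EXEC  <- 4L   # binary 100

perm <- bitwOr(READ, EXEC)              # set two flags: 5
has_write <- bitwAnd(perm, WRITE) != 0L # test a flag: FALSE
perm <- bitwXor(perm, WRITE)            # toggle WRITE on: 7
perm <- bitwAnd(perm, bitwNot(EXEC))    # clear EXEC: 3

bitwShiftL(1L, 0:4)                     # 1 2 4 8 16
bitwShiftR(16L, 2L)                     # 4
```

Clearing a flag combines bitwAnd with bitwNot, since bitwNot(EXEC) has every bit set except the one being cleared.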
Get or set the body of a function which is basically all of the function definition but its formal arguments (formals), see the ‘Details’.
body(fun = sys.function(sys.parent()))
body(fun, envir = environment(fun)) <- value
fun | a function object, or see ‘Details’. |
envir | environment in which the function should be defined. |
value | an object, usually a language object: see section ‘Value’. |
For the first form, fun can be a character string naming the function to be manipulated, which is searched for from the parent frame. If it is not specified, the function calling body is used.
The bodies of all but the simplest are braced expressions, that is calls to {: see the ‘Examples’ section for how to create such a call.
body returns the body of the function specified. This is normally a language object, most often a call to {, but it can also be a symbol such as pi or a constant (e.g., 3 or "R") to be the return value of the function.
The replacement form sets the body of a function to the object on the right hand side, and (potentially) resets the environment of the function, and drops attributes. If value is of class "expression" the first element is used as the body: any additional elements are ignored, with a warning.
The three parts of a (non-primitive) function are its formals, body, and environment.
Further, see alist, args, function.
body(body)
f <- function(x) x^5
body(f) <- quote(5^x)
## or equivalently body(f) <- expression(5^x)
f(3) # = 125
body(f)

## creating a multi-expression body
e <- expression(y <- x^2, return(y)) # or a list
body(f) <- as.call(c(as.name("{"), e))
f
f(8)
## Using substitute() may be simpler than 'as.call(c(as.name("{",..)))':
stopifnot(identical(body(f), substitute({ y <- x^2; return(y) })))
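The ‘Value’ text notes that a body may be a constant as well as a braced call. A short sketch of both cases follows; the function names const and g are made up for the illustration:

```r
## A constant as body: the function simply returns that value.
const <- function() NULL
body(const) <- 3
stopifnot(identical(const(), 3))

## Replacing the body leaves the formal arguments untouched.
g <- function(x, y) x + y
body(g) <- quote({
  z <- x * y
  z + 1
})
stopifnot(g(2, 3) == 7, identical(names(formals(g)), c("x", "y")))
class(body(g))   # "{"
```

Using quote() with braces is a third way, besides as.call() and substitute() in the examples above, to build a multi-statement body.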
An analogue of the LISP backquote macro. bquote quotes its argument except that terms wrapped in .() are evaluated in the specified where environment. If splice = TRUE then terms wrapped in ..() are evaluated and spliced into a call.
bquote(expr, where = parent.frame(), splice = FALSE)
expr | |
where | An environment. |
splice | Logical; if |
require(graphics)

a <- 2

bquote(a == a)
quote(a == a)

bquote(a == .(a))
substitute(a == A, list(A = a))

plot(1:10, a*(1:10), main = bquote(a == .(a)))

## to set a function default arg
default <- 1
bquote( function(x, y = .(default)) x+y )

exprs <- expression(x <- 1, y <- 2, x + y)
bquote(function() {..(exprs)}, splice = TRUE)
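The difference between .() and ..() is easiest to see on the argument list of a call. A minimal sketch follows; f is never called, it serves only as a placeholder function name, and the values in args are arbitrary:

```r
n    <- 3
args <- list(1, quote(b), "x")

one     <- bquote(f(.(n)))                     # f(3)
spliced <- bquote(f(..(args)), splice = TRUE)  # f(1, b, "x")
single  <- bquote(f(.(args)))                  # one argument: the whole list

## .() is evaluated in 'where', not necessarily in the calling frame
env <- new.env()
assign("n", 10, envir = env)
scoped <- bquote(n^2 == .(n^2), where = env)   # n^2 == 100

length(spliced) - 1L   # 3 arguments
length(single)  - 1L   # 1 argument
```

Subtracting one from the length of a call gives its number of arguments, because the first element is the function name.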
Interrupt the execution of an expression and allow the inspection of the environment where browser was called from.
browser(text = "", condition = NULL, expr = TRUE, skipCalls = 0L)
text | a text string that can be retrieved once the browser is invoked. |
condition | a condition that can be retrieved once the browser is invoked. |
expr | a “condition”. By default, and whenever not false after being coerced to |
skipCalls | how many previous calls to skip when reporting the calling context. |
A call to browser can be included in the body of a function. When reached, this causes a pause in the execution of the current expression and allows access to the R interpreter.
The purpose of the text and condition arguments is to allow helper programs (e.g., external debuggers) to insert specific values here, so that the specific call to browser (perhaps its location in a source file) can be identified and special processing can be achieved. The values can be retrieved by calling browserText and browserCondition.
The purpose of the expr argument is to allow for the illusion of conditional debugging. It is an illusion, because execution is always paused at the call to browser, but control is only passed to the evaluator described below if expr is not FALSE after coercion to logical. In most cases it is going to be more efficient to use an if statement in the calling program, but in some cases using this argument will be simpler.
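The two styles of conditional debugging described here can be put side by side. In this sketch the helper names safe_log and safe_log2 are hypothetical, and the call that would actually open the browser is left commented out because it needs an interactive session:

```r
## Hypothetical helper: hand control to the browser only for suspicious input.
safe_log <- function(x) {
  browser(expr = any(x <= 0))   # browser is entered only if this is not FALSE
  log(x)
}

safe_log(c(1, 10))       # expr is FALSE here, so this runs straight through
## safe_log(c(1, -1))    # would stop at a Browse[n]> prompt (interactive use)

## The explicit form with an if statement:
safe_log2 <- function(x) {
  if (any(x <= 0)) browser()
  log(x)
}
```

The if form avoids reaching the browser call at all for ordinary input, which is the efficiency point made in the paragraph above.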
The skipCalls argument should be used when the browser() call is nested within another debugging function: it will look further up the call stack to report its location.
At the browser prompt the user can enter commands or R expressions, followed by a newline. The commands are
c | exit the browser and continue execution at the next statement. |
cont | synonym for c. |
f | finish execution of the current loop or function. |
help | print this list of commands. |
n | evaluate the next statement, stepping over function calls. For byte compiled functions interrupted by browser calls, n is equivalent to c. |
s | evaluate the next statement, stepping into function calls. Again, byte compiled functions make s equivalent to c. |
where | print a stack trace of all active function calls. |
r | invoke a "resume" restart if one is available; interpreted as an R expression otherwise. Typically "resume" restarts are established for continuing from user interrupts. |
Q | exit the browser and the current evaluation and return to the top-level prompt. |
Leading and trailing whitespace is ignored, except for an empty line. Handling of empty lines depends on the "browserNLdisabled" option; if it is TRUE, empty lines are ignored. If not, an empty line is the same as n (or s, if it was used most recently).
Anything else entered at the browser prompt is interpreted as an R expression to be evaluated in the calling environment: in particular typing an object name will cause the object to be printed, and ls() lists the objects in the calling frame. (If you want to look at an object with a name such as n, print it explicitly, or use autoprint via (n).)
The number of lines printed for the deparsed call can be limited by setting options(deparse.max.lines).
The browser prompt is of the form Browse[n]>: here n indicates the ‘browser level’. The browser can be called when browsing (and often is when debug is in use), and each recursive call increases the number. (The actual number is the number of ‘contexts’ on the context stack: this is usually 2 for the outer level of browsing and 1 when examining dumps in debugger.)
This is a primitive function but does argument matching in thestandard way.
Because the browser prompt is implemented using the restart and condition handling mechanism, it prevents error handlers set up before the breakpoint from being called or invoked. The implementation follows this model:
repeat withRestarts(
    withCallingHandlers(
        readEvalPrint(),
        error = function(cnd) {
            cat("Error:", conditionMessage(cnd), "\n")
            invokeRestart("browser")
        }
    ),
    browser = function(...) NULL
)

readEvalPrint <- function(env = parent.frame()) {
    print(eval(parse(prompt = "Browse[n]> "), env))
}
The restart invocation interrupts the lookup for condition handlers and transfers control to the next iteration of the debugger REPL.
Note that condition handlers for other classes (such as "warning") are still called and may cause a non-local transfer of control out of the debugger.
Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988) The New S Language. Wadsworth & Brooks/Cole.
Chambers, J. M. (1998) Programming with Data. A Guide to the S Language. Springer.
debug, and traceback for the stack on error. browserText for how to retrieve the text and condition.
A call to browser can provide context by supplying either a text argument or a condition argument. These functions can be used to retrieve either of these arguments.
browserText(n = 1)
browserCondition(n = 1)
browserSetDebug(n = 1)
n | The number of contexts to skip over; it must be non-negative. |
Each call to browser can supply either a text string or a condition. The functions browserText and browserCondition provide ways to retrieve those values. Since there can be multiple browser contexts active at any time we also support retrieving values from the different contexts. The innermost (most recently initiated) browser context is numbered 1: other contexts are numbered sequentially.
browserSetDebug provides a mechanism for initiating the browser in one of the calling functions. See sys.frame for a more complete discussion of the calling stack. To use browserSetDebug you select some calling function, determine how far back it is in the call stack and call browserSetDebug with n set to that value. Then, by typing c at the browser prompt you will cause evaluation to continue, and provided there are no intervening calls to browser or other interrupts, control will halt again once evaluation has returned to the closure specified. This is similar to the up functionality in GDB or the "step out" functionality in other debuggers.
browserText returns the text, while browserCondition returns the condition from the specified browser context.
browserSetDebug returns NULL, invisibly.
It may be of interest to allow for querying further up the set of browser contexts and this functionality may be added at a later date.
R. Gentleman
Return the names of all the built-in objects. These are fetched directly from the symbol table of the R interpreter.
builtins(internal = FALSE)
internal | a logical indicating whether only ‘internal’functions (which can be called via |
builtins() returns an unsorted list of the objects in the symbol table, that is all the objects in the base environment. These are the built-in objects plus any that have been added subsequently when the base package was loaded. It is less confusing to use ls(baseenv(), all.names = TRUE).
builtins(TRUE) returns an unsorted list of the names of internal functions, that is those which can be accessed as .Internal(foo(args ...)) for foo in the list.
A character vector.
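As a rough sketch of how the two forms and the recommended alternative might be compared in practice (the variable names b, bi and base_names are illustrative, and the exact contents depend on the R version in use):

```r
b  <- builtins()                 # names from the symbol table, unsorted
bi <- builtins(internal = TRUE)  # names usable as .Internal(foo(...))

is.character(b)
head(sort(bi))

## The alternative suggested above for listing the base environment:
base_names <- ls(baseenv(), all.names = TRUE)
head(base_names)
```

Sorting before inspecting is worthwhile because neither call to builtins returns its names in any particular order.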
Function by is an object-oriented wrapper for tapply applied to data frames.
by(data, INDICES, FUN, ..., simplify = TRUE)
data | an R object, normally a data frame, possibly a matrix. |
INDICES | a factor or a list of factors, each of length |
FUN | a function to be applied to (usually data-frame) subsets of |
... | further arguments to |
simplify | logical: see |
A data frame is split by row into data frames subsetted by the values of one or more factors, and function FUN is applied to each subset in turn.
For the default method, an object with dimensions (e.g., a matrix) is coerced to a data frame and the data frame method applied. Other objects are also coerced to a data frame, but FUN is applied separately to (subsets of) each column of the data frame.
An object of class "by", giving the results for each subset. This is always a list if simplify is false, otherwise a list or array (see tapply).
tapply, simplify2array. array2DF to convert result to a data frame. ave also applies a function block-wise.
require(stats)
by(warpbreaks[, 1:2], warpbreaks[,"tension"], summary)
by(warpbreaks[, 1], warpbreaks[, -1], summary)
by(warpbreaks, warpbreaks[,"tension"],
   function(x) lm(breaks ~ wool, data = x))

## now suppose we want to extract the coefficients by group
tmp1 <- with(warpbreaks,
             by(warpbreaks, tension,
                function(x) lm(breaks ~ wool, data = x)))
sapply(tmp1, coef)

## another way
tmp2 <- by(warpbreaks, ~ tension, with, coef(lm(breaks ~ wool)))
array2DF(tmp2, simplify = TRUE)
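The split-apply behaviour and the effect of simplify are easier to follow on a tiny data set. The data frame below is a made-up toy example, not one of the datasets shipped with R:

```r
## Hypothetical toy data, for illustration only.
df <- data.frame(g = c("a", "b", "a", "b", "a"),
                 x = c(1, 2, 3, 4, 8))

## One scalar per group: the result simplifies to an array of class "by".
res <- by(df, df$g, function(d) mean(d$x))
res[["a"]]   # 4
res[["b"]]   # 3

## With simplify = FALSE the result is always a list.
res_list <- by(df, df$g, function(d) range(d$x), simplify = FALSE)
res_list[["a"]]   # 1 8
```

Each call of the function receives a data frame holding only the rows of one group, which is why d$x inside it refers to that group's values.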
This is a generic function which combines its arguments.
The default method combines its arguments to form a vector. All arguments are coerced to a common type which is the type of the returned value, and all attributes except names are removed.
## S3 Generic function
c(...)

## Default S3 method:
c(..., recursive = FALSE, use.names = TRUE)
... | objects to be concatenated. All |
recursive | logical. If |
use.names | logical indicating if |
The output type is determined from the highest type of the components in the hierarchy NULL < raw < logical < integer < double < complex < character < list < expression. Pairlists are treated as lists, whereas non-vector components (such as names / symbols and calls) are treated as one-element lists which cannot be unlisted even if recursive = TRUE.
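The hierarchy can be checked directly with typeof(). The following is a small sketch; the variable names are illustrative only.

```r
t_int  <- typeof(c(TRUE, 1L))       # logical < integer
t_dbl  <- typeof(c(1L, 2.5))        # integer < double
t_chr  <- typeof(c(1.5, "a"))       # double  < character
t_list <- typeof(c(1, list("a")))   # any atomic type < list

## A symbol is a non-vector component, so the result is a list
## whose second element is the symbol itself
sym <- c(1, as.name("x"))
is.list(sym)
is.name(sym[[2]])
```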
If the output type is complex, logical, integer, and double NAs keep their imaginary parts zero when coerced, and hence will not become NA_complex_ (with imaginary part NA).
There is a c.factor method which combines factors into a factor.
c is sometimes used for its side effect of removing attributes except names, for example to turn an array into a vector. as.vector is a more intuitive way to do this, but also drops names. Note that c methods other than the default are not required to remove attributes (and they will almost certainly preserve a class attribute).
This is a primitive function.
NULL or an expression or a vector of an appropriate mode. (With no arguments the value is NULL.)
This function is S4 generic, but with argument list (x, ...).
Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988) The New S Language. Wadsworth & Brooks/Cole.
unlist and as.vector to produce attribute-free vectors.
c(1, 7:9)
c(1:5, 10.5, "next")

## uses with a single argument to drop attributes
x <- 1:4
names(x) <- letters[1:4]
x
c(x)          # has names
as.vector(x)  # no names
dim(x) <- c(2,2)
x
c(x)
as.vector(x)

## append to a list:
ll <- list(A = 1, c = "C")
## do *not* use
c(ll, d = 1:3) # which is == c(ll, as.list(c(d = 1:3)))
## but rather
c(ll, d = list(1:3))  # c() combining two lists

## descend through lists:
c(list(A = c(B = 1)), recursive = TRUE)
c(list(A = c(B = 1, C = 2), B = c(E = 7)), recursive = TRUE)
Create or test for objects of mode "call" (or "(", see Details).
call(name, ...)
is.call(x)
as.call(x)
name | a non-empty character string naming the function to be called. |
... | arguments to be part of the call. |
x | an arbitrary R object. |
call returns an unevaluated function call, that is, an unevaluated expression which consists of the named function applied to the given arguments (name must be a string which gives the name of a function to be called). Note that although the call is unevaluated, the arguments ... are evaluated.
call is a primitive, so the first argument is taken as name and the remaining arguments as arguments for the constructed call: if the first argument is named the name must partially match name.
is.call is used to determine whether x is a call (i.e., of mode "call" or "("). Note that is.call(x) is strictly equivalent to typeof(x) == "language". is.language() is also true for calls (but also for symbols and expressions where is.call() is false).
When is.call(cl) is true, class(cl) typically returns "call", except when cl is one of if, for, while, (, {, <-, =, each of which has its own class(cl) (equal to the “function” name), see the ‘Special calls’ example.
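A compact sketch of the distinction between typeof() and class() for calls. The objects are built with quote(), and the variable names are illustrative only.

```r
cl_if     <- quote(if (a) b)
cl_assign <- quote(x <- 1)
cl_plain  <- quote(round(10.5))

is.call(cl_if)      # TRUE
typeof(cl_if)       # "language", as for every call
class(cl_if)        # "if"
class(cl_assign)    # "<-"
class(cl_plain)     # "call"
```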
as.call(x): Objects of mode "list" can be coerced to mode "call". The first element of the list becomes the function part of the call, so should be a function or the name of one (as a symbol; a character string will not do).
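The following sketch builds the same call three ways to show which kinds of first element work. The variable names are illustrative. The failure case is wrapped in tryCatch(), so the block runs without stopping.

```r
cl_sym <- as.call(list(as.name("sum"), 1, 2))   # function given as a symbol
cl_fun <- as.call(list(sum, 1, 2))              # function object itself
val_sym <- eval(cl_sym)                         # 3
val_fun <- eval(cl_fun)                         # 3

## A character string in the function position yields a call object
## that cannot be evaluated
cl_chr  <- as.call(list("sum", 1, 2))
val_chr <- tryCatch(eval(cl_chr), error = function(e) "error")
```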
If you think of using as.call(string), consider using str2lang(string) which is an efficient version of parse(text=string). Note that call() and as.call(), when applicable, are much preferable to these parse() based approaches.
All three are primitive functions.
as.call is generic: you can write methods to handle specific classes of objects, see InternalMethods.
call should not be used to attempt to evade restrictions on the use of .Internal and other non-API calls.
Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988) The New S Language. Wadsworth & Brooks/Cole.
do.call for calling a function by name and argument list; Recall for recursive calling of functions; further is.language, expression, function.
Producing calls etc from character: str2lang and parse.
is.call(call) #-> FALSE: Functions are NOT calls

## set up a function call to round with argument 10.5
cl <- call("round", 10.5)
is.call(cl) # TRUE
cl
identical(quote(round(10.5)), # <- less functional, but the same
          cl) # TRUE
## such a call can also be evaluated.
eval(cl) # [1] 10

class(cl) # "call"
typeof(cl)# "language"
is.call(cl) && is.language(cl) # always TRUE for "call"s

A <- 10.5
call("round", A)        # round(10.5)
call("round", quote(A)) # round(A)
f <- "round"
call(f, quote(A))       # round(A)
## if we want to supply a function we need to use as.call or similar
f <- round
## Not run: call(f, quote(A))  # error: first arg must be character
(g <- as.call(list(f, quote(A))))
eval(g)
## alternatively but less transparently
g <- list(f, quote(A))
mode(g) <- "call"
g
eval(g)

## Special calls (and some regular ones):
L <- as.list(E <- setNames( , c("if", "for", "while", "repeat", "function",
                                "(", "{", "[", "<-", "<<-", "->", "=")))
for(i in seq_along(L)) L[[i]] <- call(E[[i]]) # instead of lapply(E, call) ..
list_ <- function (...) `names<-`(list(...), vapply(sys.call()[-1L], as.character, ""))
(Tab <- noquote(sapply(list_(is.call, typeof, class, mode),
                       \(F) sapply(L, F))))
## The 7 exceptions:
Tab[ Tab[,"class"] != "call" , c(3:4, 1:2)]
## see also the examples in the help for do.call
A downward-only version of Scheme's call with current continuation.
callCC(fun)
fun | function of one argument, the exit procedure. |
callCC provides a non-local exit mechanism that can be useful for early termination of a computation. callCC calls fun with one argument, an exit function. The exit function takes a single argument, the intended return value. If the body of fun calls the exit function then the call to callCC immediately returns, with the value supplied to the exit function as the value returned by callCC.
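As a sketch of early termination, the helper below (first_negative, a name invented for this illustration) leaves a loop as soon as a negative element is seen. If the exit function is never called, the value of the function body is returned as usual.

```r
first_negative <- function(x) {
  callCC(function(exit) {
    for (v in x) {
      if (v < 0) exit(v)   # leave callCC() immediately with value v
    }
    NULL                   # reached only if no element was negative
  })
}

first_negative(c(3, -2, -5))  # -2
first_negative(1:3)           # NULL
```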
Luke Tierney
# The following all return the value 1
callCC(function(k) 1)
callCC(function(k) k(1))
callCC(function(k) {k(1); 2})
callCC(function(k) repeat k(1))
Functions to pass R objects to compiled C/C++ code that has been loaded into R.
.Call(.NAME, ..., PACKAGE)
.External(.NAME, ..., PACKAGE)
.NAME | a character string giving the name of a C function,or an object of class |
... | arguments to be passed to the compiled code. Up to 65 for |
PACKAGE | if supplied, confine the search for a character string This argument follows This is intended to add safety for packages, which can ensure by using this argument that no other package can override their external symbols, and also speeds up the search (see ‘Note’). |
The functions are used to call compiled code which makes use of internal R objects, passing the arguments to the code as a sequence of R objects. They assume C calling conventions, so can usually also be used for C++ code.
For details about how to write code to use with these functions see the chapter on ‘System and foreign language interfaces’ in the ‘Writing R Extensions’ manual. They differ in the way the arguments are passed to the C code: .External allows for a variable or unlimited number of arguments.
These functions are primitive, and .NAME is always matched to the first argument supplied (which should not be named). For clarity, avoid using names in the arguments passed to ... that match or partially match .NAME.
An R object constructed in the compiled code.
Writing code for use with these functions will need to use internal R structures defined in ‘Rinternals.h’ and/or the macros in ‘Rdefines.h’.
If one of these functions is to be used frequently, do specify PACKAGE (to confine the search to a single DLL) or pass .NAME as one of the native symbol objects. Searching for symbols can take a long time, especially when many namespaces are loaded.
You may see PACKAGE = "base" for symbols linked into R. Do not use this in your own code: such symbols are not part of the API and may be changed without warning.
PACKAGE = "" used to be accepted (but was undocumented): it is now an error.
Chambers, J. M. (1998) Programming with Data. A Guide to the S Language. Springer. (.Call.)
The ‘Writing R Extensions’ manual.
Report on the optional features which have been compiled into this build of R.
capabilities(what = NULL,
             Xchk = any(nas %in% c("X11", "jpeg", "png", "tiff")))
what | character vector or |
Xchk |
|
A named logical vector. Current components are
jpeg | is the |
png | is the |
tiff | is the |
tcltk | is the tcltk package operational? Note that to make use of Tk you will almost always need to check that |
X11 | are the |
aqua | is the Note that this is distinct from |
http/ftp | does the default method for |
sockets | are |
libxml | is there support for integrating |
fifo | are FIFO connections supported? |
cledit | is command-line editing available in the current R session? This is false in non-interactive sessions. It will be true for the command-line interface if |
iconv | is internationalization conversion via |
NLS | is there Natural Language Support (for message translations)? |
Rprof | is there support for |
profmem | is there support for memory profiling? See |
cairo | is there support for the |
ICU | is ICU available for collation? See the help on Comparison and |
long.double | does this build use a Although not guaranteed, it is a reasonable assumption that if present long doubles will have at least as much range and accuracy as the ISO/IEC 60559 80-bit ‘extended precision’ format. Since R 4.0.0 |
libcurl | is |
Capabilities "jpeg", "png" and "tiff" refer to the X11-based versions of these devices. If capabilities("aqua") is true, then these devices with type = "quartz" will be available, and out-of-the-box will be the default type. Thus for example the tiff device will be available if capabilities("aqua") || capabilities("tiff") if the defaults are unchanged.
.Platform, extSoftVersion, and grSoftVersion (and links there) for availability of capabilities external to R but used from R functions.
capabilities()

if(!capabilities("ICU"))
   warning("ICU is not available")

## Does not call the internal X11-checking function:
capabilities(Xchk = FALSE)

## See also the examples for 'connections'.
Outputs the objects, concatenating the representations. cat performs much less conversion than print.
cat(... , file = "", sep = " ", fill = FALSE, labels = NULL,
    append = FALSE)
... | R objects (see ‘Details’ for the types of objects allowed). |
file | a connection, or a character string naming the file to print to. If |
sep | a character vector of strings to append after each element. |
fill | a logical or (positive) numeric controlling how the output is broken into successive lines. If |
labels | character vector of labels for the lines printed. Ignored if |
append | logical. Only used if the argument |
cat is useful for producing output in user-defined functions. It converts its arguments to character vectors, concatenates them to a single character vector, appends the given sep = string(s) to each element and then outputs them.
No line feeds (aka “newline”s) are output unless explicitly requested by ‘"\n"’ or if generated by filling (if argument fill is TRUE or numeric).
If file is a connection and open for writing it is written from its current position. If it is not open, it is opened for the duration of the call in "wt" mode and then closed again.
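A short sketch of writing to a file by name and through an already-open connection. The temporary file and the variable names are illustrative. The append argument applies when file is given as a file name.

```r
tmp <- tempfile(fileext = ".txt")
cat("first\n",  file = tmp)                  # creates (or overwrites) the file
cat("second\n", file = tmp, append = TRUE)   # adds to the end

## An already-open connection is written from its current position
con <- file(tmp, open = "at")                # opened for appending, text mode
cat("third\n", file = con)
close(con)

lines_written <- readLines(tmp)              # "first" "second" "third"
unlink(tmp)
```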
Currently only atomic vectors and names are handled, together with NULL and other zero-length objects (which produce no output). Character strings are output ‘as is’ (unlike print.default which escapes non-printable characters and backslash; use encodeString if you want to output encoded strings using cat). Other types of R object should be converted (e.g., by as.character or format) before being passed to cat. That includes factors, which are output as integer vectors.
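The factor case is a common surprise, so here is a minimal sketch. Output is collected with utils::capture.output; the variable names are illustrative.

```r
f <- factor(c("b", "a", "b"))

codes  <- utils::capture.output(cat(f))                 # integer codes
labels <- utils::capture.output(cat(as.character(f)))   # labels after conversion

codes    # "2 1 2"
labels   # "b a b"
```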
cat converts numeric/complex elements in the same way as print (and not in the same way as as.character which is used by the S equivalent), so options "digits" and "scipen" are relevant. However, it uses the minimum field width necessary for each element, rather than the same field width for all elements.
None (invisible NULL).
If any element of sep contains a newline character, it is treated as a vector of terminators rather than separators, an element being output after every vector element and a newline after the last. Entries are recycled as needed.
Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988) The New S Language. Wadsworth & Brooks/Cole.
print, format, and paste which concatenates into a string.
iter <- stats::rpois(1, lambda = 10)
## print an informative message
cat("iteration = ", iter <- iter + 1, "\n")

## 'fill' and label lines:
cat(paste(letters, 100* 1:26), fill = TRUE, labels = paste0("{", 1:10, "}:"))
Take a sequence of vector, matrix or data-frame arguments and combine by columns or rows, respectively. These are generic functions with methods for other R classes.
cbind(..., deparse.level = 1)
rbind(..., deparse.level = 1)
## S3 method for class 'data.frame'
rbind(..., deparse.level = 1, make.row.names = TRUE,
      stringsAsFactors = FALSE, factor.exclude = TRUE)
... | (generalized) vectors or matrices. These can be given as named arguments. Other R objects may be coerced as appropriate, or S4 methods may be used: see sections ‘Details’ and ‘Value’. (For the |
deparse.level | integer controlling the construction of labels in the case of non-matrix-like arguments (for the default method): |
make.row.names | (only for data frame method:) logical indicating if unique and valid |
stringsAsFactors | logical, passed to |
factor.exclude | if the data frames contain factors, the default |
The functions cbind and rbind are S3 generic, with methods for data frames. The data frame method will be used if at least one argument is a data frame and the rest are vectors or matrices. There can be other methods; in particular, there is one for time series objects. See the section on ‘Dispatch’ for how the method to be used is selected. If some of the arguments are of an S4 class, i.e., isS4(.) is true, S4 methods are sought also, and the hidden cbind / rbind functions from package methods may be called, which in turn build on cbind2 or rbind2, respectively. In that case, deparse.level is obeyed, similarly to the default method.
In the default method, all the vectors/matrices must be atomic (see vector) or lists. Expressions are not allowed. Language objects (such as formulae and calls) and pairlists will be coerced to lists: other objects (such as names and external pointers) will be included as elements in a list result. Any classes the inputs might have are discarded (in particular, factors are replaced by their internal codes).
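The loss of factor labels in the default method is easy to overlook. A minimal sketch, with illustrative variable names:

```r
f <- factor(c("b", "a", "c"))

m <- cbind(f = f, n = 1:3)
typeof(m)        # "integer": the factor contributed its internal codes
m[, "f"]         # 2 1 3

## Convert explicitly if the labels are wanted (gives a character matrix)
m_chr <- cbind(f = as.character(f), n = 1:3)
typeof(m_chr)    # "character"
```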
If there are several matrix arguments, they must all have the same number of columns (or rows) and this will be the number of columns (or rows) of the result. If all the arguments are vectors, the number of columns (rows) in the result is equal to the length of the longest vector. Values in shorter arguments are recycled to achieve this length (with a warning if they are recycled only fractionally).
When the arguments consist of a mix of matrices and vectors the number of columns (rows) of the result is determined by the number of columns (rows) of the matrix arguments. Any vectors have their values recycled or subsetted to achieve this length.
For cbind (rbind), vectors of zero length (including NULL) are ignored unless the result would have zero rows (columns), for S compatibility. (Zero-extent matrices do not occur in S3 and are not ignored in R.)
Matrices are restricted to less than 2^31 rows and columns even on 64-bit systems. So input vectors have the same length restriction: as from R 3.2.0 input matrices with more elements (but meeting the row and column restrictions) are allowed.
For the default method, a matrix combining the ... arguments column-wise or row-wise. (Exception: if there are no inputs or all the inputs are NULL, the value is NULL.)
The type of a matrix result is determined from the highest type of any of the inputs in the hierarchy raw < logical < integer < double < complex < character < list.
For cbind (rbind) the column (row) names are taken from the colnames (rownames) of the arguments if these are matrix-like. Otherwise from the names of the arguments or where those are not supplied and deparse.level > 0, by deparsing the expressions given, for deparse.level = 1 only if that gives a sensible name (a ‘symbol’, see is.symbol).
For cbind row names are taken from the first argument with appropriate names: rownames for a matrix, or names for a vector of length the number of rows of the result.
For rbind column names are taken from the first argument with appropriate names: colnames for a matrix, or names for a vector of length the number of columns of the result.
The cbind data frame method is just a wrapper for data.frame(..., check.names = FALSE). This means that it will split matrix columns in data frame arguments, and convert character columns to factors unless stringsAsFactors = FALSE is specified.
The rbind data frame method first drops all zero-column and zero-row arguments. (If that leaves none, it returns the first argument with columns otherwise a zero-column zero-row data frame.) It then takes the classes of the columns from the first data frame, and matches columns by name (rather than by position). Factors have their levels expanded as necessary (in the order of the levels of the level sets of the factors encountered) and the result is an ordered factor if and only if all the components were ordered factors. Old-style categories (integer vectors with levels) are promoted to factors.
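Matching by name and the expansion of factor levels can be sketched as follows; the data frames d1 and d2 are made up for this illustration.

```r
d1 <- data.frame(a = 1, b = "x", f = factor("u"))
d2 <- data.frame(f = factor("v"), b = "y", a = 2)   # same columns, other order

d <- rbind(d1, d2)

names(d)       # "a" "b" "f": order and classes come from the first data frame
d$a            # 1 2
levels(d$f)    # "u" "v": levels expanded as needed
```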
Note that for result column j, factor(., exclude = X(j)) is applied, where
  X(j) := if(isTRUE(factor.exclude)) {
             if(!NA.lev[j]) NA # else NULL
          } else factor.exclude
where NA.lev[j] is true iff any contributing data frame has had a factor in column j with an explicit NA level.
The method dispatching is not done via UseMethod(), but by C-internal dispatching. Therefore there is no need for, e.g., rbind.default.
The dispatch algorithm is described in the source file (‘.../src/main/bind.c’) as
1. For each argument we get the list of possible class memberships from the class attribute.
2. We inspect each class in turn to see if there is an applicable method.
3. If we find a method, we use it. Otherwise, if there was an S4 object among the arguments, we try S4 dispatch; otherwise, we use the default code.
If you want to combine other objects with data frames, it may be necessary to coerce them to data frames first. (Note that this algorithm can result in calling the data frame method if all the arguments are either data frames or vectors, and this will result in the coercion of character vectors to factors.)
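The dispatch rule can be observed with a toy S3 class. In this sketch the class "stamp" and its method rbind.stamp are hypothetical names invented for the illustration. The method only returns a marker string, so it is clear which code path was taken.

```r
## Hypothetical class "stamp" with its own rbind method
rbind.stamp <- function(..., deparse.level = 1) "rbind.stamp was called"

s <- structure(list(1), class = "stamp")

res_method  <- rbind(s, 1:3)     # a method is found via the class attribute of 's'
res_default <- rbind(1:3, 4:6)   # no class attribute anywhere: default code
```

Because the lookup is driven by class attributes alone, the unclassed vector 1:3 in the first call does not affect which method is chosen.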
Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988) The New S Language. Wadsworth & Brooks/Cole.
c to combine vectors (and lists) as vectors, data.frame to combine vectors and matrices as a data frame.
m <- cbind(1, 1:7) # the '1' (= shorter vector) is recycled
m
m <- cbind(m, 8:14)[, c(1, 3, 2)] # insert a column
m
cbind(1:7, diag(3)) # vector is subset -> warning

cbind(0, rbind(1, 1:3))
cbind(I = 0, X = rbind(a = 1, b = 1:3))  # use some names
xx <- data.frame(I = rep(0,2))
cbind(xx, X = rbind(a = 1, b = 1:3))   # named differently

cbind(0, matrix(1, nrow = 0, ncol = 4)) #> Warning (making sense)
dim(cbind(0, matrix(1, nrow = 2, ncol = 0))) #-> 2 x 1

## deparse.level
dd <- 10
rbind(1:4, c = 2, "a++" = 10, dd, deparse.level = 0) # middle 2 rownames
rbind(1:4, c = 2, "a++" = 10, dd, deparse.level = 1) # 3 rownames (default)
rbind(1:4, c = 2, "a++" = 10, dd, deparse.level = 2) # 4 rownames

## cheap row names:
b0 <- gl(3,4, labels=letters[1:3])
bf <- setNames(b0, paste0("o", seq_along(b0)))
df  <- data.frame(a = 1, B = b0, f = gl(4,3))
df. <- data.frame(a = 1, B = bf, f = gl(4,3))
new <- data.frame(a = 8, B ="B", f = "1")
(df1  <- rbind(df , new))
(df.1 <- rbind(df., new))
stopifnot(identical(df1, rbind(df,  new, make.row.names=FALSE)),
          identical(df1, rbind(df., new, make.row.names=FALSE)))
Seeks a unique match of its first argument among the elements of its second. If successful, it returns this element; otherwise, it performs an action specified by the third argument.
char.expand(input, target, nomatch = stop("no match"))
input | a character string to be expanded. |
target | a character vector with the values to be matched against. |
nomatch | an R expression to be evaluated in case expansion was not possible. |
This function is particularly useful when abbreviations are allowed in function arguments, and need to be uniquely expanded with respect to a target table of possible values.
A length-one character vector, one of the elements of target (unless nomatch is changed to be a non-error, when it can be a zero-length character string).
charmatch and pmatch for performing partial string matching.
locPars <- c("mean", "median", "mode")
char.expand("me", locPars, warning("Could not expand!"))
char.expand("mo", locPars)
Create or test for objects of type "character".
character(length = 0)
as.character(x, ...)
is.character(x)
length | a non-negative integer specifying the desired length. Double values will be coerced to integer: supplying an argument of length other than one is an error. |
x | object to be coerced or tested. |
... | further arguments passed to or from other methods. |
as.character and is.character are generic: you can write methods to handle specific classes of objects, see InternalMethods. Further, for as.character the default method calls as.vector, so, only if(is.object(x)) is true, dispatch is first on methods for as.character and then for methods for as.vector.
as.character represents real and complex numbers to 15 significant digits (technically the compiler's setting of the ISO C constant DBL_DIG, which will be 15 on machines supporting IEC 60559 arithmetic according to the C99 standard). This ensures that all the digits in the result will be reliable (and not the result of representation error), but does mean that conversion to character and back to numeric may change the number. If you want to convert numbers to character with the maximum possible precision, use format.
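The round-trip caveat can be sketched with 1/3. The choice of digits = 17 in the format() call is an assumption made for this illustration (17 significant digits are enough to identify any double uniquely); the variable names are illustrative too.

```r
x <- 1/3

s15 <- as.character(x)          # 15 significant digits
s17 <- format(x, digits = 17)   # more digits than as.character() keeps

as.numeric(s15) == x   # FALSE: the round trip changed the number
as.numeric(s17) == x   # TRUE
```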
character creates a character vector of the specified length. The elements of the vector are all equal to "".
as.character attempts to coerce its argument to character type; like as.vector it strips attributes including names. For lists and pairlists (including language objects such as calls) it deparses the elements individually, except that it extracts the first element of length-one character vectors, see the Abc example.
is.character returns TRUE or FALSE depending on whether its argument is of character type or not.
as.character breaks lines in language objects at 500 characters, and inserts newlines. Prior to 2.15.0 lines were truncated.
Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988) The New S Language. Wadsworth & Brooks/Cole.
options: options scipen and OutDec affect the conversion of numbers.
paste, substr and strsplit for character concatenation and splitting, chartr for character translation and case folding (e.g., upper to lower case) and sub, grep etc for string matching and substitutions. Note that help.search(keyword = "character") gives even more links.
deparse, which is normally preferable to as.character for language objects.
Quotes on how to specify character / string constants, including raw ones.
form <- y ~ a + b + c
as.character(form)  ## length 3
deparse(form)       ## like the input

a0 <- 11/999          # has a repeating decimal representation
(a1 <- as.character(a0))
format(a0, digits = 16) # shows 1 to 2 more digit(s)
a2 <- as.numeric(a1)
a2 - a0               # normally around -1e-17
as.character(a2)      # possibly different from a1
print(c(a0, a2), digits = 16)

as.character(list(A = "Abc", xy = c("x", "y"))) # "Abc"  "c(\"x\", \"y\")"
## i.e., "Abc" directly instead of deparsing to "\"Abc\""
charmatch seeks matches for the elements of its first argument among those of its second.
charmatch(x, table, nomatch = NA_integer_)
x | the values to be matched: converted to a character vector by |
table | the values to be matched against: converted to a character vector. Long vectors are not supported. |
nomatch | the (integer) value to be returned at non-matching positions. |
Exact matches are preferred to partial matches (those where the value to be matched has an exact match to the initial part of the target, but the target is longer).
If there is a single exact match or no exact match and a unique partial match then the index of the matching value is returned; if multiple exact or multiple partial matches are found then 0 is returned and if no match is found then nomatch is returned.
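The four possible outcomes can be seen in a single vectorised call. This is a sketch; the table tab and the variable names are illustrative.

```r
tab <- c("mean", "median", "mode")

res <- charmatch(c("med", "m", "mean", "xyz"), tab)
res   # 2 0 1 NA
## "med"  -> unique partial match with "median"  (index 2)
## "m"    -> several partial matches             (0)
## "mean" -> exact match                         (index 1)
## "xyz"  -> no match                            (nomatch, NA by default)

res_nomatch <- charmatch("xyz", tab, nomatch = -1L)   # -1
```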
NA values are treated as the string constant "NA".
An integer vector of the same length as x, giving the indices of the elements in table which matched, or nomatch.
This function is based on a C function written by Terry Therneau.
startsWith for another matching of initial parts of strings; grep or regexpr for more general (regexp) matching of strings.
charmatch("", "")                             # returns 1
charmatch("m",   c("mean", "median", "mode")) # returns 0
charmatch("med", c("mean", "median", "mode")) # returns 2
Translate characters in character vectors, in particular from upper to lower case or vice versa.
chartr(old, new, x)
tolower(x)
toupper(x)
casefold(x, upper = FALSE)
x | a character vector, or an object that can be coerced to character by |
old | a character string specifying the characters to be translated. If a character vector of length 2 or more is supplied, the first element is used with a warning. |
new | a character string specifying the translations. If a character vector of length 2 or more is supplied, the first element is used with a warning. |
upper | logical: translate to upper or lower case? |
chartr translates each character in x that is specified in old to the corresponding character specified in new. Ranges are supported in the specifications, but character classes and repeated characters are not. If old contains more characters than new, an error is signaled; if it contains fewer characters, the extra characters at the end of new are ignored.
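A brief sketch of the translation rules, including ranges and the two length-mismatch cases. The variable names are illustrative, and the error case is wrapped in tryCatch() so the block runs to completion.

```r
r_plain <- chartr("abc", "xyz", "aabbcc")   # "xxyyzz": a->x, b->y, c->z
r_range <- chartr("a-c", "A-C", "abcdef")   # "ABCdef": ranges expand to abc / ABC
r_extra <- chartr("ab", "xyz", "aabb")      # "xxyy": the extra "z" in 'new' is ignored

## 'old' longer than 'new' is an error
r_error <- tryCatch(chartr("abc", "x", "abc"), error = function(e) "error")
```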
tolower and toupper convert upper-case characters in a character vector to lower-case, or vice versa. Non-alphabetic characters are left unchanged. More than one character can be mapped to a single upper-case character.
casefold is a wrapper for tolower and toupper originally written for compatibility with S-PLUS.
A character vector of the same length and with the same attributes as x (after possible coercion).
Elements of the result will have the encoding declared as that of the current locale (see Encoding) if the corresponding input had a declared encoding and the current locale is either Latin-1 or UTF-8. The result will be in the current locale's encoding unless the corresponding input was in UTF-8 or Latin-1, when it will be in UTF-8.
These functions are platform-dependent, usually using OS services. The latter can be quite deficient, for example only covering ASCII characters in 8-bit locales. The definition of ‘alphabetic’ is platform-dependent and liable to change over time as most platforms are based on the frequently-updated Unicode tables.
sub and gsub for other substitutions in strings.
x <- "MiXeD cAsE 123"
chartr("iXs", "why", x)
chartr("a-cX", "D-Fw", x)
tolower(x)
toupper(x)

## "Mixed Case" Capitalizing - toupper( every first letter of a word ) :

.simpleCap <- function(x) {
    s <- strsplit(x, " ")[[1]]
    paste(toupper(substring(s, 1, 1)), substring(s, 2),
          sep = "", collapse = " ")
}
.simpleCap("the quick red fox jumps over the lazy brown dog")
## ->  [1] "The Quick Red Fox Jumps Over The Lazy Brown Dog"

## and the better, more sophisticated version:
capwords <- function(s, strict = FALSE) {
    cap <- function(s) paste(toupper(substring(s, 1, 1)),
                  {s <- substring(s, 2); if(strict) tolower(s) else s},
                             sep = "", collapse = " " )
    sapply(strsplit(s, split = " "), cap, USE.NAMES = !is.null(names(s)))
}
capwords(c("using AIC for model selection"))
## ->  [1] "Using AIC For Model Selection"
capwords(c("using AIC", "for MODEL selection"), strict = TRUE)
## ->  [1] "Using Aic"  "For Model Selection"
##                ^^^        ^^^^^
##               'bad'       'good'

## -- Very simple insecure crypto --
rot <- function(ch, k = 13) {
   p0 <- function(...) paste(c(...), collapse = "")
   A <- c(letters, LETTERS, " '")
   I <- seq_len(k); chartr(p0(A), p0(c(A[-I], A[I])), ch)
}

pw <- "my secret pass phrase"
(crypw <- rot(pw, 13)) #-> you can send this off

## now ``decrypt'' :
rot(crypw, 54 - 13) # -> the original:
stopifnot(identical(pw, rot(crypw, 54 - 13)))
Warn about extraneous arguments in the ... of its caller. A utility to be used e.g., in S3 methods which need a formal ... argument but do not make any use of it. This helps to catch user errors in calling the function in question (which is the caller of chkDots()).
chkDots(..., which.call = -1, allowed = character(0))
... | “the dots”, as passed from the caller. |
which.call | passed to |
allowed | not yet implemented: character vector of named elements in |
Martin Maechler, first version outside base, June 2012.
seq.default ## <- you will see ' chkDots(...) '

seq(1,5, foo = "bar") # gives warning via chkDots()

## warning with more than one ...-entry:
density.f <- function(x, ...) NextMethod("density")
x <- density(structure(rnorm(10), class = "f"), bar = TRUE, baz = TRUE)
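As a minimal sketch of the intended use (the generic total and its default method are hypothetical names chosen for illustration, not part of base R), a method can call chkDots(...) first so that unsupported arguments are reported rather than silently dropped:

```r
# Hypothetical generic and default method, for illustration only.
total <- function(x, ...) UseMethod("total")

total.default <- function(x, ...) {
  chkDots(...)          # warns if anything was passed through '...'
  sum(x)
}

total(1:4)              # no extra arguments, so no warning

# 'na.rm' is not a formal argument of the method, so it lands in '...'
# and chkDots() signals a warning; here the warning message is captured.
msg <- tryCatch(total(1:4, na.rm = TRUE),
                warning = function(w) conditionMessage(w))
msg
```

Capturing the message with tryCatch() is only a device for showing the warning text; in ordinary use the warning is simply printed and the method carries on.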
Compute the Cholesky factorization of a real symmetric positive-definite square matrix.
chol(x, ...)

## Default S3 method:
chol(x, pivot = FALSE, LINPACK = FALSE, tol = -1, ...)
x | an object for which a method exists. The default method applies to numeric (or logical) symmetric, positive-definite matrices. |
... | arguments to be passed to or from methods. |
pivot | logical: should pivoting be used? |
LINPACK | logical. Defunct and gives an error. |
tol | a numeric tolerance for use with |
chol is generic: the description here applies to the default method.
Note that only the upper triangular part of x is used, so that $R^{\top} R = x$ when x is symmetric.
If pivot = FALSE and x is not non-negative definite an error occurs. If x is positive semi-definite (i.e., some zero eigenvalues) an error will also occur as a numerical tolerance is used.
If pivot = TRUE, then the Cholesky decomposition of a positive semi-definite x can be computed. The rank of x is returned as attr(Q, "rank"), subject to numerical errors. The pivot is returned as attr(Q, "pivot"). It is no longer the case that t(Q) %*% Q equals x. However, setting pivot <- attr(Q, "pivot") and oo <- order(pivot), it is true that t(Q[, oo]) %*% Q[, oo] equals x, or, alternatively, t(Q) %*% Q equals x[pivot, pivot]. See the examples.
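The two identities for the pivoted factor can be checked directly. This is a small sketch on an arbitrarily chosen positive-definite matrix; the attribute-insensitive comparison is a precaution, not a requirement stated by the documentation:

```r
# A small positive-definite matrix whose largest diagonal entry comes last,
# so pivoting is expected to reorder the columns.
x <- matrix(c(2, 1, 1, 3), 2, 2)

Q     <- chol(x, pivot = TRUE)
pivot <- attr(Q, "pivot")
oo    <- order(pivot)

# Undo the pivoting on the factor ...
all.equal(crossprod(Q[, oo]), x, check.attributes = FALSE)
# ... or apply the pivoting to x instead.
all.equal(crossprod(Q), x[pivot, pivot], check.attributes = FALSE)

attr(Q, "rank")
```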
The value of tol is passed to LAPACK, with negative values selecting the default tolerance of (usually) nrow(x) * .Machine$double.neg.eps * max(diag(x)). The algorithm terminates once the pivot is less than tol.
Unsuccessful results from the underlying LAPACK code will result in an error giving a positive error code: these can only be interpreted by detailed study of the FORTRAN code.
The upper triangular factor of the Cholesky decomposition, i.e., the matrix $R$ such that $R^{\top} R = x$ (see example).
If pivoting is used, then two additional attributes "pivot" and "rank" are also returned.
The code does not check for symmetry.
If pivot = TRUE and x is not non-negative definite then there will be a warning message but a meaningless result will occur. So only use pivot = TRUE when x is non-negative definite by construction.
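Because symmetry is not checked, a caller who cannot guarantee it may want to test first. The wrapper below is a hypothetical helper (chol_checked is not a base function) that sketches one way to do so with isSymmetric():

```r
# Hypothetical helper: refuse asymmetric input before calling chol().
chol_checked <- function(x, ...) {
  stopifnot(is.matrix(x), isSymmetric(x))
  chol(x, ...)
}

m <- matrix(c(5, 1, 1, 3), 2, 2)
chol_checked(m)

# chol() itself uses only the upper triangle, so a wrong lower-left
# entry goes unnoticed:
bad <- matrix(c(5, 99, 1, 3), 2, 2)
identical(chol(bad), chol(m))

# The wrapper stops instead.
res <- try(chol_checked(bad), silent = TRUE)
inherits(res, "try-error")
```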
This is an interface to the LAPACK routines DPOTRF and DPSTRF.
LAPACK is from https://netlib.org/lapack/ and its guide is listed in the references.
Anderson, E. and ten others (1999) LAPACK Users' Guide. Third Edition. SIAM. Available on-line at https://netlib.org/lapack/lug/lapack_lug.html.
Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988) The New S Language. Wadsworth & Brooks/Cole.
chol2inv for its inverse (without pivoting), backsolve for solving linear systems with upper triangular left sides.
qr, svd for related matrix factorizations.
( m <- matrix(c(5,1,1,3),2,2) )
( cm <- chol(m) )
t(cm) %*% cm  #-- = 'm'
crossprod(cm) #-- = 'm'

# now for something positive semi-definite
x <- matrix(c(1:5, (1:5)^2), 5, 2)
x <- cbind(x, x[, 1] + 3*x[, 2])
colnames(x) <- letters[20:22]
m <- crossprod(x)
qr(m)$rank # is 2, as it should be

# chol() may fail, depending on numerical rounding:
# chol() unlike qr() does not use a tolerance.
try(chol(m))

(Q <- chol(m, pivot = TRUE))
## we can use this by
pivot <- attr(Q, "pivot")
crossprod(Q[, order(pivot)]) # recover m

## now for a non-positive-definite matrix
( m <- matrix(c(5,-5,-5,3), 2, 2) )
try(chol(m)) # fails
(Q <- chol(m, pivot = TRUE)) # warning
crossprod(Q) # not equal to m
Invert a symmetric, positive definite square matrix from its Cholesky decomposition. Equivalently, compute $(X^{\top} X)^{-1}$ from the ($R$ part) of the QR decomposition of $X$.
chol2inv(x, size = NCOL(x), LINPACK = FALSE)
x | a matrix. The first |
size | the number of columns of |
LINPACK | logical. Defunct and gives an error. |
The inverse of the matrix whose Cholesky decomposition was given.
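Both uses can be illustrated on small inputs. The sketch below assumes a full-column-rank X, so that qr() performs no column pivoting and its R factor can be passed on unchanged:

```r
# Inverse of a symmetric positive definite matrix via its Cholesky factor.
A <- matrix(c(4, 2, 2, 3), 2, 2)
all.equal(chol2inv(chol(A)), solve(A))

# The inverse of X'X from the R factor of a QR decomposition of X.
X <- cbind(1, 1:5, (1:5)^2)
R <- qr.R(qr(X))
all.equal(chol2inv(R), solve(crossprod(X)))
```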
Unsuccessful results from the underlying LAPACK code will result in an error giving a positive error code: these can only be interpreted by detailed study of the FORTRAN code.
This is an interface to the LAPACK routine DPOTRI. LAPACK is from https://netlib.org/lapack/ and its guide is listed in the references.
Anderson, E. and ten others (1999) LAPACK Users' Guide. Third Edition. SIAM. Available on-line at https://netlib.org/lapack/lug/lapack_lug.html.
Dongarra, J. J., Bunch, J. R., Moler, C. B. and Stewart, G. W. (1978) LINPACK Users Guide. Philadelphia: SIAM Publications.
cma <- chol(ma <- cbind(1, 1:3, c(1,3,7)))
ma %*% chol2inv(cma)
chooseOpsMethod is a function called by the Ops Group Generic when two suitable methods are found for a given call. It determines which method to use for the operation based on the objects being dispatched.
The function is first called with reverse = FALSE, where x corresponds to the first argument and y to the second argument of the group generic call. If chooseOpsMethod() returns FALSE for x, then chooseOpsMethod is called again, with x and y swapped, mx and my swapped, and reverse = TRUE.
chooseOpsMethod(x, y, mx, my, cl, reverse)
x ,y | the objects being dispatched on by the group generic. |
mx ,my | the methods found for objects |
cl | the call to the group generic. |
reverse | logical value indicating whether |
This function must return either TRUE or FALSE. A value of TRUE indicates that method mx should be used.
# Create two objects with custom Ops methods
foo_obj <- structure(1, class = "foo")
bar_obj <- structure(1, class = "bar")
`+.foo` <- function(e1, e2) "foo"
Ops.bar <- function(e1, e2) "bar"

invisible(foo_obj + bar_obj) # Warning: Incompatible methods

chooseOpsMethod.bar <- function(x, y, mx, my, cl, reverse) TRUE

stopifnot(exprs = {
  identical(foo_obj + bar_obj, "bar")
  identical(bar_obj + foo_obj, "bar")
})

# cleanup
rm(foo_obj, bar_obj, `+.foo`, Ops.bar, chooseOpsMethod.bar)
R possesses a simple generic function mechanism which can be used for an object-oriented style of programming. Method dispatch takes place based on the class of the first argument to the generic function.
class(x)
class(x) <- value
unclass(x)
inherits(x, what, which = FALSE)
nameOfClass(x)
isa(x, what)

oldClass(x)
oldClass(x) <- value
.class2(x)
x | an R object. |
what ,value | a character vector naming classes. |
which | logical affecting return value: see ‘Details’. |
Here, we describe the so called “S3” classes (and methods). For “S4” classes (and methods), see ‘Formal classes’ below.
Many R objects have a class attribute, a character vector giving the names of the classes from which the object inherits. (Functions oldClass and oldClass<- get and set the attribute, which can also be done directly.)
If the object does not have a class attribute, it has an implicit class, notably "matrix", "array", "function" or "numeric" or the result of typeof(x) (which is similar to mode(x)), but for type "language" and mode "call", where the following extra classes exist for the corresponding function calls: if, for, while, (, {, <-, =.
Note that for objects x of an implicit (or an S4) class, when a (S3) generic function foo(x) is called, method dispatch may use more classes than are returned by class(x), e.g., for a numeric matrix, the foo.numeric() method may apply. The exact full character vector of the classes which UseMethod() uses, is available as .class2(x) since R version 4.0.0. (This also applies to S4 objects when S3 dispatch is considered, see below.)
Beware that using .class2() for purposes other than teaching, diagnostics or debugging is more likely a misuse than a smart solution.
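The difference between class(x) and the classes actually used for dispatch can be seen on an integer matrix. The generic describe and its methods are hypothetical names used only for this sketch:

```r
# Hypothetical generic with a method for the implicit class "numeric".
describe <- function(x) UseMethod("describe")
describe.numeric <- function(x) "numeric method"
describe.default <- function(x) "default method"

m <- matrix(1:6, 2, 3)
class(m)       # "matrix" "array"
.class2(m)     # additionally contains "integer" and "numeric"
describe(m)    # dispatches to describe.numeric
```

Here .class2() is used purely for inspection, which is the kind of diagnostic use the note above has in mind.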
NULL objects (of implicit class "NULL") cannot have attributes (hence no class attribute) and attempting to assign a class is an error.
When a generic function fun is applied to an object with class attribute c("first", "second"), the system searches for a function called fun.first and, if it finds it, applies it to the object. If no such function is found, a function called fun.second is tried. If no class name produces a suitable function, the function fun.default is used (if it exists). If there is no class attribute, the implicit class is tried, then the default method.
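The search order can be traced with a toy generic. The names speak, speak.first, speak.second and speak.default are hypothetical and exist only for this sketch:

```r
# Hypothetical generic and methods, for illustration only.
speak <- function(x, ...) UseMethod("speak")
speak.second  <- function(x, ...) "speak.second"
speak.default <- function(x, ...) "speak.default"

obj <- structure(list(), class = c("first", "second"))
speak(obj)        # no speak.first exists yet, so speak.second is used

speak.first <- function(x, ...) "speak.first"
speak(obj)        # now the method for the first class is found first

speak(1:3)        # no matching class method at all: the default is used
```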
The function class returns the vector of names of classes an object inherits from. Correspondingly, class<- sets the classes an object inherits from. Assigning an empty character vector or NULL removes the class attribute, as for oldClass<- or direct attribute setting. Whereas it is clearer to explicitly assign NULL to remove the class, using an empty vector is more natural in e.g., class(x) <- setdiff(class(x), "ts").
unclass returns (a copy of) its argument with its class attribute removed. (It is not allowed for objects which cannot be copied, namely environments and external pointers.)
inherits indicates whether its first argument inherits from any of the classes specified in the what argument. If which is TRUE then an integer vector of the same length as what is returned. Each element indicates the position in the class(x) matched by the element of what; zero indicates no match. If which is FALSE then TRUE is returned by inherits if any of the names in what match with any class.
nameOfClass is an S3 generic. It is called by inherits to get the class name for what, allowing for what to be values other than a character vector. nameOfClass methods are expected to return a character vector of length 1.
isa tests whether x is an object of class(es) as given in what by using is if x is an S4 object, and otherwise giving TRUE iff all elements of class(x) are contained in what.
All but inherits and isa are primitive functions.
An additional mechanism of formal classes, nicknamed “S4”, is available in package methods which is attached by default. For objects which have a formal class, its name is returned by class as a character vector of length one and method dispatch can happen on several arguments, instead of only the first. However, S3 method selection attempts to treat objects from an S4 class as if they had the appropriate S3 class attribute, as does inherits. Therefore, S3 methods can be defined for S4 classes. See the ‘Introduction’ and ‘Methods_for_S3’ help pages for basic information on S4 methods and for the relation between these and S3 methods.
The replacement version of the function sets the class to the value provided. For classes that have a formal definition, directly replacing the class this way is strongly deprecated. The expression as(object, value) is the way to coerce an object to a particular class.
The analogue of inherits for formal classes is is. The two functions behave consistently with one exception: S4 classes can have conditional inheritance, with an explicit test. In this case, is will test the condition, but inherits ignores all conditional superclasses.
UseMethod dispatches on the class as returned by class (with some interpolated classes: see the link) rather than oldClass. However, group generics dispatch on the oldClass for efficiency, and internal generics only dispatch on objects for which is.object is true.
UseMethod, NextMethod, ‘group generic’, ‘internal generic’
x <- 10
class(x) # "numeric"
oldClass(x) # NULL
inherits(x, "a") #FALSE
class(x) <- c("a", "b")
inherits(x,"a") #TRUE
inherits(x, "a", TRUE) # 1
inherits(x, c("a", "b", "c"), TRUE) # 1 2 0

class( quote(pi) )           # "name"
## regular calls
class( quote(sin(pi*x)) )    # "call"
## special calls
class( quote(x <- 1) )       # "<-"
class( quote((1 < 2)) )      # "("
class( quote( if(8<3) pi ) ) # "if"

.class2(pi)               # "double" "numeric"
.class2(matrix(1:6, 2,3)) # "matrix" "array" "integer" "numeric"
Returns a matrix of integers indicating their column number in a matrix-like object, or a factor of column labels.
col(x, as.factor = FALSE)
.col(dim)
x | a matrix-like object, that is one with a two-dimensional |
dim | a matrix dimension, i.e., an integer valued numeric vector of length two (with non-negative entries). |
as.factor | a logical value indicating whether the value should be returned as a factor of column labels (created if necessary) rather than as numbers. |
An integer (or factor) matrix with the same dimensions as x and whose ij-th element is equal to j (or the j-th column label).
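A short sketch of the return value on an arbitrary 2-by-3 matrix, including the common idiom of combining row() and col() to select a triangle:

```r
m <- matrix(11:16, nrow = 2, ncol = 3)
col(m)                     # every entry holds its own column number
m[row(m) <= col(m)]        # entries on or above the main diagonal

# With column names, as.factor = TRUE returns a factor of column labels.
colnames(m) <- c("a", "b", "c")
col(m, as.factor = TRUE)
```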
Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988) The New S Language. Wadsworth & Brooks/Cole.
row to get rows; slice.index for a general way to get slice indices in an array.
# extract an off-diagonal of a matrix
ma <- matrix(1:12, 3, 4)
ma[row(ma) == col(ma) + 1]

# create an identity 5-by-5 matrix more slowly than diag(n = 5):
x <- matrix(0, nrow = 5, ncol = 5)
x[row(x) == col(x)] <- 1

(i34 <- .col(3:4))
stopifnot(identical(i34, .col(c(3,4)))) # 'dim' maybe "double"
Generate regular sequences.
from:to
   a:b
from | starting value of sequence. |
to | (maximal) end value of the sequence. |
a ,b |
|
The binary operator : has two meanings: for factors a:b is equivalent to interaction(a, b) (but the levels are ordered and labelled differently).
For other arguments from:to is equivalent to seq(from, to), and generates a sequence from from to to in steps of 1 or -1. Value to will be included if it differs from from by an integer up to a numeric fuzz of about 1e-7. Non-numeric arguments are coerced internally (hence without dispatching methods) to numeric; complex values will have their imaginary parts discarded with a warning.
For numeric arguments, a numeric vector. This will be of type integer if from is integer-valued and the result is representable in the R integer type, otherwise of type "double" (aka mode "numeric").
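The rules on step direction, inclusion of to and the result type can be seen on a few small cases; the last lines sketch a well-known pitfall when the upper limit may be zero:

```r
1:4                 # integer sequence
typeof(1:4)
0.5:3               # 0.5 1.5 2.5: steps of 1, stopping at or before 'to'
typeof(0.5:3)
3:1                 # steps of -1 when from > to

# A common pitfall: 1:n counts down when n is 0.
n <- 0
length(1:n)         # 2, because the sequence is c(1, 0)
length(seq_len(n))  # 0, which is usually what a loop wants
```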
For factors, an unordered factor with levels labelled as la:lb and ordered lexicographically (that is, lb varies fastest).
Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988) The New S Language. Wadsworth & Brooks/Cole. (for numeric arguments: S does not have : for factors.)
seq (a generalization of from:to).
As an alternative to using : for factors, interaction.
For : used in the formal representation of an interaction, see formula.
1:4
pi:6 # real
6:pi # integer

f1 <- gl(2, 3); f1
f2 <- gl(3, 2); f2
f1:f2 # a factor, the "cross" f1 x f2
Form row and column sums and means for numeric arrays (or data frames).
colSums (x, na.rm = FALSE, dims = 1)
rowSums (x, na.rm = FALSE, dims = 1)
colMeans(x, na.rm = FALSE, dims = 1)
rowMeans(x, na.rm = FALSE, dims = 1)

.colSums(x, m, n, na.rm = FALSE)
.rowSums(x, m, n, na.rm = FALSE)
.colMeans(x, m, n, na.rm = FALSE)
.rowMeans(x, m, n, na.rm = FALSE)
x | an array of two or more dimensions, containing numeric, complex, integer or logical values, or a numeric data frame. For |
na.rm | logical. Should missing values (including |
dims | integer: Which dimensions are regarded as ‘rows’ or ‘columns’ to sum over. For |
m ,n | the dimensions of the matrix |
These functions are equivalent to use of apply with FUN = mean or FUN = sum with appropriate margins, but are a lot faster. As they are written for speed, they blur over some of the subtleties of NaN and NA. If na.rm = FALSE and either NaN or NA appears in a sum, the result will be one of NaN or NA, but which might be platform-dependent.
Notice that omission of missing values is done on a per-column or per-row basis, so column means may not be over the same set of rows, and vice versa. To use only complete rows or columns, first select them with na.omit or complete.cases (possibly on the transpose of x).
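The effect of per-column omission is easiest to see on a small matrix with one missing entry. This is an illustrative sketch with made-up values:

```r
x <- cbind(a = c(1, NA, 3), b = c(10, 20, 60))

# Missing values are dropped column by column:
# 'a' averages rows 1 and 3, 'b' averages all three rows.
colMeans(x, na.rm = TRUE)

# Restricting to complete rows first puts both means on the same rows.
colMeans(x[complete.cases(x), , drop = FALSE])

# The apply() equivalent of the first call.
apply(x, 2, mean, na.rm = TRUE)
```

The two colMeans() calls agree for column a but not for column b, since row 2 contributes to b only in the first call.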
The versions with an initial dot in the name (.colSums() etc) are ‘bare-bones’ versions for use in programming: they apply only to numeric (like) matrices and do not name the result.
A numeric or complex array of suitable size, or a vector if the result is one-dimensional. For the first four functions the dimnames (or names for a vector result) are taken from the original array.
If there are no values in a range to be summed over (after removing missing values with na.rm = TRUE), that component of the output is set to 0 (*Sums) or NaN (*Means), consistent with sum and mean.
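The empty-range convention can be checked on a matrix with one all-missing column (an illustrative sketch; the column names are arbitrary):

```r
x <- matrix(c(1, 2, NA, NA), nrow = 2,
            dimnames = list(NULL, c("ok", "allNA")))

colSums(x, na.rm = TRUE)    # "allNA" has nothing left to sum: 0
colMeans(x, na.rm = TRUE)   # ... and nothing left to average: NaN

sum(numeric(0))             # 0
mean(numeric(0))            # NaN
```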
## Compute row and column sums for a matrix:
x <- cbind(x1 = 3, x2 = c(4:1, 2:5))
rowSums(x); colSums(x)
dimnames(x)[[1]] <- letters[1:8]
rowSums(x); colSums(x); rowMeans(x); colMeans(x)
x[] <- as.integer(x)
rowSums(x); colSums(x)
x[] <- x < 3
rowSums(x); colSums(x)
x <- cbind(x1 = 3, x2 = c(4:1, 2:5))
x[3, ] <- NA; x[4, 2] <- NA
rowSums(x); colSums(x); rowMeans(x); colMeans(x)
rowSums(x, na.rm = TRUE); colSums(x, na.rm = TRUE)
rowMeans(x, na.rm = TRUE); colMeans(x, na.rm = TRUE)

## an array
dim(UCBAdmissions)
rowSums(UCBAdmissions); rowSums(UCBAdmissions, dims = 2)
colSums(UCBAdmissions); colSums(UCBAdmissions, dims = 2)

## complex case
x <- cbind(x1 = 3 + 2i, x2 = c(4:1, 2:5) - 5i)
x[3, ] <- NA; x[4, 2] <- NA
rowSums(x); colSums(x); rowMeans(x); colMeans(x)
rowSums(x, na.rm = TRUE); colSums(x, na.rm = TRUE)
rowMeans(x, na.rm = TRUE); colMeans(x, na.rm = TRUE)
Provides access to a copy of the command line arguments supplied when this R session was invoked.
commandArgs(trailingOnly = FALSE)
trailingOnly | logical. Should only arguments after --args be returned? |
These arguments are captured before the standard R command line processing takes place. This means that they are the unmodified values. This is especially useful with the --args command-line flag to R, as all of the command line after that flag is skipped.
A character vector containing the name of the executable and the user-supplied command line arguments. The first element is the name of the executable by which R was invoked. The exact form of this element is platform dependent: it may be the fully qualified name, or simply the last component (or basename) of the application, or for an embedded R it can be anything the programmer supplied.
If trailingOnly = TRUE, a character vector of those arguments (if any) supplied after --args.
commandArgs()
## Spawn a copy of this application as it was invoked,
## subject to shell quoting issues
## system(paste(commandArgs(), collapse = " "))
"comment" Attribute
These functions set and query a comment attribute for any R objects. This is typically useful for data.frames or model fits.
Contrary to other attributes, the comment is not printed (by print or print.default).
Assigning NULL or a zero-length character vector removes the comment.
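Setting, querying and removing a comment can be sketched on a plain vector (the comment text is arbitrary):

```r
x <- 1:3
comment(x) <- "units: seconds"
x                      # prints without showing the comment
comment(x)

comment(x) <- NULL     # removes the attribute again
is.null(comment(x))
identical(x, 1:3)
```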
comment(x)
comment(x) <- value
x | any R object. |
value | a |
attributes and attr for other attributes.
x <- matrix(1:12, 3, 4)
comment(x) <- c("This is my very important data from experiment #0234",
                "Jun 5, 1998")
x
comment(x)
Binary operators which allow the comparison of values in atomic vectors.
x < y
x > y
x <= y
x >= y
x == y
x != y
x, y | atomic vectors, symbols, calls, or other objects for which methods have been written. |
The binary comparison operators are generic functions: methods can be written for them individually or via the Ops group generic function. (See Ops for how dispatch is computed.)
Comparison of strings in character vectors is lexicographic within the strings using the collating sequence of the locale in use: see locales. The collating sequence of locales such as ‘en_US’ is normally different from ‘C’ (which should use ASCII) and can be surprising. Beware of making any assumptions about the collation order: e.g. in Estonian Z comes between S and T, and collation is not necessarily character-by-character – in Danish aa sorts as a single letter, after z. In Welsh ng may or may not be a single sorting unit: if it is it follows g. Some platforms may not respect the locale and always sort in numerical order of the bytes in an 8-bit locale, or in Unicode code-point order for a UTF-8 locale (and may not sort in the same order for the same language in different character sets). Collation of non-letters (spaces, punctuation signs, hyphens, fractions and so on) is even more problematic.
Character strings can be compared with different marked encodings (see Encoding): they are translated to UTF-8 before comparison.
Raw vectors should not really be considered to have an order, but the numeric order of the byte representation is used.
At least one of x and y must be an atomic vector, but if the other is a list R attempts to coerce it to the type of the atomic vector: this will succeed if the list is made up of elements of length one that can be coerced to the correct type.
If the two arguments are atomic vectors of different types, one is coerced to the type of the other, the (decreasing) order of precedence being character, complex, numeric, integer, logical and raw.
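A few mixed-type comparisons illustrate the precedence order. The last call is left without a stated result because, once both operands are strings, the outcome follows the locale's collation of "10" and "9" rather than numeric order:

```r
1 == "1"        # TRUE: 1 is coerced to the string "1"
TRUE == 1       # TRUE: TRUE is coerced to numeric 1
TRUE == "TRUE"  # TRUE: TRUE is coerced to the string "TRUE"

# The numeric operand becomes a string, so this is a string
# comparison, not a numeric one.
10 < "9"
```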
Missing values (NA) and NaN values are regarded as non-comparable even to themselves, so comparisons involving them will always result in NA. Missing values can also result when character strings are compared and one is not valid in the current collation locale.
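In practice this means that == cannot be used to detect missing values; is.na() is the usual tool. A short sketch:

```r
x <- c(1, NA, NaN, 4)

x == NA          # all NA, never TRUE
x == x           # NA wherever x is missing or NaN

is.na(x)         # the way to test for missing values (TRUE for NaN too)
which(x > 2)     # which() drops the NA results
```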
Language objects such as symbols and calls can only be used as operands for == and !=; the other comparisons signal an error when one of the operands is a language object. Currently language objects are deparsed to character strings before comparison. This can be inefficient and may not be what is really wanted. For equality comparisons identical is usually a better choice.
A logical vector indicating the result of the element by element comparison. The elements of shorter vectors are recycled as necessary.
Objects such as arrays or time-series can be compared this way provided they are conformable.
These operators are members of the S4 Compare group generic, and so methods can be written for them individually as well as for the group generic (or the Ops group generic), with arguments c(e1, e2).
Do not use == and != for tests, such as in if expressions, where you must get a single TRUE or FALSE. Unless you are absolutely sure that nothing unusual can happen, you should use the identical function instead.
For numerical and complex values, remember == and != do not allow for the finite representation of fractions, nor for rounding error. Using all.equal with identical or isTRUE is almost always preferable; see the examples. (This also applies to the other comparison operators.)
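A minimal sketch of the recommended pattern for conditions, using a value that differs from 2 only by rounding error:

```r
a <- sqrt(2)^2

a == 2                        # FALSE on typical platforms: rounding error
isTRUE(all.equal(a, 2))       # TRUE: equal up to a small tolerance

# In a condition, wrap all.equal() in isTRUE(): when the values differ it
# returns a character description rather than FALSE.
if (isTRUE(all.equal(a, 2))) "close enough" else "different"

identical(c(1, 2), c(1, 2))   # always a single TRUE or FALSE
```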
These operators are sometimes called as functions as e.g. `<`(x, y): see the description of how argument-matching is done in Ops.
Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988) The New S Language. Wadsworth & Brooks/Cole.
Collation of character strings is a complex topic. For an introduction see https://en.wikipedia.org/wiki/Collating_sequence. The Unicode Collation Algorithm (https://unicode.org/reports/tr10/) is likely to be increasingly influential. Where available R by default makes use of ICU (https://icu.unicode.org/) for collation (except in a C locale).
Logic on how to combine results of comparisons, i.e., logical vectors.
factor for the behaviour with factor arguments.
Syntax for operator precedence.
capabilities for whether ICU is available, and icuSetCollate to tune the string collation algorithm when it is.
x <- stats::rnorm(20)
x < 1
x[x > 0]

x1 <- 0.5 - 0.3
x2 <- 0.3 - 0.1
x1 == x2                   # FALSE on most machines
isTRUE(all.equal(x1, x2))  # TRUE everywhere

# range of most 8-bit charsets, as well as of Latin-1 in Unicode
z <- c(32:126, 160:255)
x <- if(l10n_info()$MBCS) {
    intToUtf8(z, multiple = TRUE)
} else rawToChar(as.raw(z), multiple = TRUE)
## by number
writeLines(strwrap(paste(x, collapse=" "), width = 60))
## by locale collation
writeLines(strwrap(paste(sort(x), collapse=" "), width = 60))
Basic functions which support complex arithmetic in R, in addition to the arithmetic operators +, -, *, /, and ^.
complex(length.out = 0, real = numeric(), imaginary = numeric(),
        modulus = 1, argument = 0)
as.complex(x, ...)
is.complex(x)

Re(z)
Im(z)
Mod(z)
Arg(z)
Conj(z)
length.out | numeric. Desired length of the output vector, inputs being recycled as needed. |
real | numeric vector. |
imaginary | numeric vector. |
modulus | numeric vector. |
argument | numeric vector. |
x | an object, probably of mode |
z | an object of mode |
... | further arguments passed to or from other methods. |
Complex vectors can be created with complex. The vector can be specified either by giving its length, its real and imaginary parts, or modulus and argument. (Giving just the length generates a vector of complex zeroes.)
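The three ways of specifying a complex vector can be sketched as follows; the polar form is compared with all.equal() because the real part is only zero up to rounding:

```r
complex(3)                                   # length only: three complex zeroes
complex(real = 1:2, imaginary = c(0, -1))    # Cartesian form
z <- complex(modulus = 2, argument = pi/2)   # polar form
z                                            # approximately 0+2i
all.equal(z, 2i)
```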
as.complex attempts to coerce its argument to be of complex type: like as.vector it strips attributes including names.
Since R version 4.4.0, as.complex(x) for “number-like” x, i.e., types "logical", "integer", and "double", will always keep imaginary part zero, now also for NA's.
Up to R versions 3.2.x, all forms of NA and NaN were coerced to a complex NA, i.e., the NA_complex_ constant, for which both the real and imaginary parts are NA.
Since R 3.3.0, typically only objects which are NA in parts are coerced to complex NA, but others with NaN parts, are not. As a consequence, complex arithmetic where only NaN's (but no NA's) are involved typically will not give complex NA but complex numbers with real or imaginary parts of NaN.
All of these many different complex numbers fulfill is.na(.) but only one of them is identical to NA_complex_.
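The distinction between "is missing" and "is the NA_complex_ constant" can be sketched on a value with a NaN real part:

```r
z <- complex(real = NaN, imaginary = 0)
is.na(z)                      # TRUE: a NaN part counts as missing
identical(z, NA_complex_)     # FALSE: not the NA_complex_ constant

Im(as.complex(NA_real_))      # 0: the imaginary part stays zero
```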
Note that is.complex and is.numeric are never both TRUE.
The functions Re, Im, Mod, Arg and Conj have their usual interpretation as returning the real part, imaginary part, modulus, argument and complex conjugate for complex values. The modulus and argument are also called the polar coordinates. If $z = x + i y$ with real $x$ and $y$, for $r = \mathrm{Mod}(z) = \sqrt{x^2 + y^2}$, and $\phi = \mathrm{Arg}(z)$, $x = r \cos(\phi)$ and $y = r \sin(\phi)$. They are all internal generic primitive functions: methods can be defined for them individually or via the Complex group generic.
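The relations between Cartesian and polar coordinates can be verified numerically on a simple value (3 + 4i is chosen only because its modulus is a round number):

```r
z   <- 3 + 4i
r   <- Mod(z)    # 5
phi <- Arg(z)    # atan2(4, 3)

all.equal(r, sqrt(Re(z)^2 + Im(z)^2))
all.equal(c(Re(z), Im(z)), c(r * cos(phi), r * sin(phi)))
all.equal(complex(modulus = r, argument = phi), z)
Conj(z)          # 3-4i
```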
In addition to the arithmetic operators (seeArithmetic)+
,-
,*
,/
, and^
, the elementarytrigonometric, logarithmic, exponential, square root and hyperbolicfunctions are implemented for complex values.
Matrix multiplications (%*%
,crossprod
,tcrossprod
) are also defined for complex matrices(matrix
), and so aresolve
,eigen
orsvd
.
Internally, complex numbers are stored as a pair ofdoubleprecision numbers, either or both of which can beNaN
(includingNA
, seeNA_complex_
and above) orplus or minus infinity.
as.complex
is primitive and can have S4 methods set.
Re
,Im
,Mod
,Arg
andConj
constitute the S4 group genericComplex
and so S4 methods can beset for them individually or via the group generic.
Operations and functions involving complex NaN mostly rely on the C library's handling of ‘double complex’ arithmetic, which typically returns complex(re=NaN, im=NaN) (but we have not seen a guarantee for that). For + and -, R's own handling works strictly “coordinate wise”.

Operations involving complex NA, i.e., NA_complex_, return NA_complex_.

Only since R version 4.4.0 does as.complex("1i") give 1i; previously it returned NA_complex_ with a warning.
Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988) The New S Language. Wadsworth & Brooks/Cole.
Arithmetic; polyroot finds all n complex roots of a polynomial of degree n.
require(graphics)

0i ^ (-3:3)

matrix(1i^ (-6:5), nrow = 4) #- all columns are the same
0 ^ 1i # a complex NaN

## create a complex normal vector
z <- complex(real = stats::rnorm(100), imaginary = stats::rnorm(100))
## or also (less efficiently):
z2 <- 1:2 + 1i*(8:9)

## The Arg(.) is an angle:
zz <- (rep(1:4, length.out = 9) + 1i*(9:1))/10
zz.shift <- complex(modulus = Mod(zz), argument = Arg(zz) + pi)
plot(zz, xlim = c(-1,1), ylim = c(-1,1), col = "red", asp = 1,
     main = expression(paste("Rotation by "," ", pi == 180^o)))
abline(h = 0, v = 0, col = "blue", lty = 3)
points(zz.shift, col = "orange")

## as.complex(<some NA>): numbers keep Im = 0:
stopifnot(identical(as.complex(NA_real_), NA_real_ + 0i)) # has always been true
NAs <- vapply(list(NA, NA_integer_, NA_real_, NA_character_, NA_complex_),
              as.complex, 0+0i)
stopifnot(is.na(NAs), is.na(Re(NAs))) # has always been true
showC <- function(z) noquote(paste0("(", Re(z), ",", Im(z), ")"))
showC(NAs)
Im(NAs) # [0 0 0 NA NA] \ in R <= 4.3.x was [NA NA 0 NA NA]
stopifnot(Im(NAs)[1:3] == 0)

## The exact result of this *depends* on the platform, compiler, math-library:
(NpNA <- NaN + NA_complex_) ; str(NpNA) # *behaves* as 'cplx NA' ..
stopifnot(is.na(NpNA), is.na(NA_complex_),
          is.na(Re(NA_complex_)), is.na(Im(NA_complex_)))
showC(NpNA) # but does not always show '(NaN,NA)'
## and this is not TRUE everywhere:
identical(NpNA, NA_complex_)
showC(NA_complex_) # always == (NA,NA)
These functions provide a mechanism for handling unusual conditions, including errors and warnings.
tryCatch(expr, ..., finally)
withCallingHandlers(expr, ...)
globalCallingHandlers(...)

signalCondition(cond)

simpleCondition(message, call = NULL)
simpleError    (message, call = NULL)
simpleWarning  (message, call = NULL)
simpleMessage  (message, call = NULL)

errorCondition(message, ..., class = NULL, call = NULL)
warningCondition(message, ..., class = NULL, call = NULL)

## S3 method for class 'condition'
as.character(x, ...)
## S3 method for class 'error'
as.character(x, ...)
## S3 method for class 'condition'
print(x, ...)
## S3 method for class 'restart'
print(x, ...)

conditionCall(c)
## S3 method for class 'condition'
conditionCall(c)
conditionMessage(c)
## S3 method for class 'condition'
conditionMessage(c)

withRestarts(expr, ...)

computeRestarts(cond = NULL)
findRestart(name, cond = NULL)
invokeRestart(r, ...)
tryInvokeRestart(r, ...)
invokeRestartInteractively(r)

isRestart(x)
restartDescription(r)
restartFormals(r)

suspendInterrupts(expr)
allowInterrupts(expr)

.signalSimpleWarning(msg, call)
.handleSimpleError(h, msg, call)
.tryResumeInterrupt()
c | a condition object. |
call | call expression. |
cond | a condition object. |
expr | expression to be evaluated. |
finally | expression to be evaluated before returning or exiting. |
h | function. |
message | character string. |
msg | character string. |
name | character string naming a restart. |
r | restart object. |
x | object. |
class | character string naming a condition class. |
... | additional arguments; see details below. |
The condition system provides a mechanism for signaling and handling unusual conditions, including errors and warnings. Conditions are represented as objects that contain information about the condition that occurred, such as a message and the call in which the condition occurred. Currently conditions are S3-style objects, though this may eventually change.
Conditions are objects inheriting from the abstract class condition. Errors and warnings are objects inheriting from the abstract subclasses error and warning. The class simpleError is the class used by stop and all internal error signals. Similarly, simpleWarning is used by warning, and simpleMessage is used by message. The constructors by the same names take a string describing the condition as argument and an optional call. The functions conditionMessage and conditionCall are generic functions that return the message and call of a condition.
The function errorCondition can be used to construct error conditions of a particular class with additional fields specified as the ... argument. warningCondition is analogous for warnings.
Conditions are signaled by signalCondition. In addition, the stop and warning functions have been modified to also accept condition arguments.
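As an illustration of a classed condition carrying an extra field: the class name "valueError" and the field name value below are invented for this sketch and are not part of base R.

```r
# Build an error condition with a custom class and an extra data field.
cond <- errorCondition("value must be positive",
                       value = -3,            # extra field, passed via ...
                       class = "valueError")  # hypothetical class name

cond_class <- class(cond)            # custom class first, then "error", "condition"
cond_msg   <- conditionMessage(cond)

# stop() accepts the condition object; the handler is selected by class
# and can read the extra field from the condition.
bad_value <- tryCatch(stop(cond), valueError = function(e) e$value)
```

Matching on a dedicated class rather than on the message text keeps handlers robust if the wording of the message changes.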
The function tryCatch evaluates its expression argument in a context where the handlers provided in the ... argument are available. The finally expression is then evaluated in the context in which tryCatch was called; that is, the handlers supplied to the current tryCatch call are not active when the finally expression is evaluated.
Handlers provided in the ... argument to tryCatch are established for the duration of the evaluation of expr. If no condition is signaled when evaluating expr then tryCatch returns the value of the expression.
If a condition is signaled while evaluating expr then established handlers are checked, starting with the most recently established ones, for one matching the class of the condition. When several handlers are supplied in a single tryCatch then the first one is considered more recent than the second. If a handler is found then control is transferred to the tryCatch call that established the handler, the handler found and all more recent handlers are disestablished, the handler is called with the condition as its argument, and the result returned by the handler is returned as the value of the tryCatch call.
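The three outcomes described above (no condition, a handled condition, and the finally expression) can be sketched like this. safe_log and events are hypothetical names used only for the illustration.

```r
events <- character()

# Hypothetical helper: returns log(x), or a fallback chosen by the handler
# whose class matches the signaled condition.
safe_log <- function(x) {
  tryCatch(
    log(x),
    warning = function(w) NA_real_,             # e.g. log(-1) signals a warning
    error   = function(e) conditionMessage(e),  # e.g. log("a") signals an error
    finally = {
      events <<- c(events, "finally ran")       # evaluated on every exit path
    }
  )
}

ok     <- safe_log(1)    # no condition: the value of the expression
warned <- safe_log(-1)   # value returned by the warning handler
failed <- safe_log("a")  # value returned by the error handler
```

In each case the value of tryCatch is either the value of the expression or the value returned by the handler, and the finally expression runs once per call.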
Calling handlers are established by withCallingHandlers. If a condition is signaled and the applicable handler is a calling handler, then the handler is called by signalCondition in the context where the condition was signaled but with the available handlers restricted to those below the handler called in the handler stack. If the handler returns, then the next handler is tried; once the last handler has been tried, signalCondition returns NULL.
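Unlike a tryCatch handler, a calling handler runs without unwinding the stack, so evaluation can continue afterwards. A minimal sketch (the variable names are illustrative):

```r
seen <- character()

result <- withCallingHandlers(
  {
    warning("first")
    warning("second")
    "done"                                   # still reached after both warnings
  },
  warning = function(w) {
    seen <<- c(seen, conditionMessage(w))    # record the warning ...
    tryInvokeRestart("muffleWarning")        # ... and suppress its default display
  }
)
```

Both warnings are recorded in seen and the expression still evaluates to "done", because control returns to the point of the warning call after the handler has run.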
globalCallingHandlers establishes calling handlers globally. These handlers are only called as a last resort, after the other handlers dynamically registered with withCallingHandlers have been invoked. They are called before the error global option (which is the legacy interface for global handling of errors). Registering the same handler multiple times moves that handler on top of the stack, which ensures that it is called first. Global handlers are a good place to define a general purpose logger (for instance saving the last error object in the global workspace) or a general recovery strategy (e.g. installing missing packages via the retry_loadNamespace restart).
Like withCallingHandlers and tryCatch, globalCallingHandlers takes named handlers. Unlike these functions, it also has an options-like interface: you can establish handlers by passing a single list of named handlers. To unregister all global handlers, supply a single NULL. The list of deleted handlers is returned invisibly. Finally, calling globalCallingHandlers without arguments returns the list of currently established handlers, visibly.
User interrupts signal a condition of class interrupt that inherits directly from class condition before executing the default interrupt action.
Restarts are used for establishing recovery protocols. They can be established using withRestarts. One pre-established restart is an abort restart that represents a jump to top level.
findRestart and computeRestarts find the available restarts. findRestart returns the most recently established restart of the specified name. computeRestarts returns a list of all restarts. Both can be given a condition argument and will then ignore restarts that do not apply to the condition.
invokeRestart transfers control to the point where the specified restart was established and calls the restart's handler with the arguments, if any, given as additional arguments to invokeRestart. The restart argument to invokeRestart can be a character string, in which case findRestart is used to find the restart. If no restart is found, an error is thrown.
tryInvokeRestart is a variant of invokeRestart that returns silently when the restart cannot be found with findRestart. Because a condition of a given class might be signalled with arbitrary protocols (error, warning, etc), it is recommended to use this permissive variant whenever you are handling conditions signalled from a foreign context. For instance, invocation of a "muffleWarning" restart should be optional because the warning might have been signalled by the user or from a different package with the stop or message protocols. Only use invokeRestart when you have control of the signalling context, or when it is a logical error if the restart is not available.
New restarts for withRestarts can be specified in several ways. The simplest is in name = function form where the function is the handler to call when the restart is invoked. Another simple variant is as name = string where the string is stored in the description field of the restart object returned by findRestart; in this case the handler ignores its arguments and returns NULL. The most flexible form of a restart specification is as a list that can include several fields, including handler, description, and test. The test field should contain a function of one argument, a condition, that returns TRUE if the restart applies to the condition and FALSE if it does not; the default function returns TRUE for all conditions.
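A sketch of a recovery protocol built from the pieces above, using the simple name = function form. The function parse_number, the restart name use_value and the class parseError are all invented for this example.

```r
# Low-level code signals a classed error but offers a way to recover.
parse_number <- function(x) {
  withRestarts(
    {
      n <- suppressWarnings(as.numeric(x))
      if (is.na(n))
        stop(errorCondition(paste("not a number:", x), class = "parseError"))
      n
    },
    # Restart handler: its return value becomes the value of withRestarts().
    use_value = function(value) value
  )
}

# High-level code decides the recovery policy: substitute NA and carry on.
parsed <- withCallingHandlers(
  vapply(c("1", "x", "3"), parse_number, numeric(1), USE.NAMES = FALSE),
  parseError = function(e) invokeRestart("use_value", NA_real_)
)
```

Because the handler is a calling handler, the restart is still established when it runs; invokeRestart then transfers control back into parse_number, and the remaining elements are processed normally.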
One additional field that can be specified for a restart is interactive. This should be a function of no arguments that returns a list of arguments to pass to the restart handler. The list could be obtained by interacting with the user if necessary. The function invokeRestartInteractively calls this function to obtain the arguments to use when invoking the restart. The default interactive method queries the user for values for the formal arguments of the handler function.
Interrupts can be suspended while evaluating an expression using suspendInterrupts. Subexpressions can be evaluated with interrupts enabled using allowInterrupts. These functions can be used to make sure cleanup handlers cannot be interrupted.
.signalSimpleWarning, .handleSimpleError, and .tryResumeInterrupt are used internally and should not be called directly.
The tryCatch mechanism is similar to Java error handling. Calling handlers are based on Common Lisp and Dylan. Restarts are based on the Common Lisp restart mechanism.
stop and warning signal conditions, and try is essentially a simplified version of tryCatch. assertCondition in package tools tests that conditions are signalled and works with several of the above handlers.
tryCatch(1, finally = print("Hello"))
e <- simpleError("test error")
## Not run: 
 stop(e)
 tryCatch(stop(e), finally = print("Hello"))
 tryCatch(stop("fred"), finally = print("Hello"))

## End(Not run)
tryCatch(stop(e), error = function(e) e, finally = print("Hello"))
tryCatch(stop("fred"), error = function(e) e, finally = print("Hello"))
withCallingHandlers({ warning("A"); 1+2 }, warning = function(w) {})
## Not run: 
 { withRestarts(stop("A"), abort = function() {}); 1 }

## End(Not run)
withRestarts(invokeRestart("foo", 1, 2), foo = function(x, y) {x + y})

##--> More examples are part of
##-->   demo(error.catching)
conflicts reports on objects that exist with the same name in two or more places on the search path, usually because an object in the user's workspace or a package is masking a system object of the same name. This helps discover unintentional masking.
conflicts(where = search(), detail = FALSE)
where | A subset of the search path, by default the whole search path. |
detail | If |
If detail = FALSE, a character vector of masked objects. If detail = TRUE, a list of character vectors giving the masked or masking objects in that member of the search path. Empty vectors are omitted.
lm <- 1:3
conflicts(, TRUE)
## gives something like
# $.GlobalEnv
# [1] "lm"
#
# $package:base
# [1] "lm"

## Remove things from your "workspace" that mask others:
remove(list = conflicts(detail = TRUE)$.GlobalEnv)
Functions to create, open and close connections, i.e., “generalized files”, such as possibly compressed files, URLs, pipes, etc.
file(description = "", open = "", blocking = TRUE,
     encoding = getOption("encoding"), raw = FALSE,
     method = getOption("url.method", "default"))

url(description, open = "", blocking = TRUE,
    encoding = getOption("encoding"),
    method = getOption("url.method", "default"),
    headers = NULL)

gzfile(description, open = "", encoding = getOption("encoding"),
       compression = 6)

bzfile(description, open = "", encoding = getOption("encoding"),
       compression = 9)

xzfile(description, open = "", encoding = getOption("encoding"),
       compression = 6)

unz(description, filename, open = "", encoding = getOption("encoding"))

pipe(description, open = "", encoding = getOption("encoding"))

fifo(description, open = "", blocking = FALSE,
     encoding = getOption("encoding"))

socketConnection(host = "localhost", port, server = FALSE,
                 blocking = FALSE, open = "a+",
                 encoding = getOption("encoding"),
                 timeout = getOption("timeout"),
                 options = getOption("socketOptions"))

serverSocket(port)

socketAccept(socket, blocking = FALSE, open = "a+",
             encoding = getOption("encoding"),
             timeout = getOption("timeout"),
             options = getOption("socketOptions"))

open(con, ...)
## S3 method for class 'connection'
open(con, open = "r", blocking = TRUE, ...)

close(con, ...)
## S3 method for class 'connection'
close(con, type = "rw", ...)

flush(con)

isOpen(con, rw = "")
isIncomplete(con)

socketTimeout(socket, timeout = -1)
description | character string. A description of the connection: see ‘Details’. |
open | character string. A description of how to open the connection (if it should be opened initially). See section ‘Modes’ for possible values. |
blocking | logical. See the ‘Blocking’ section. |
encoding | the name of the encoding to be assumed. See the ‘Encoding’ section. |
raw | logical. If true, a ‘raw’ interface is used which will be more suitable for arguments which are not regular files, e.g. character devices. This suppresses the check for a compressed file when opening for text-mode reading, and asserts that the ‘file’ may not be seekable. |
method | character string, partially matched to |
headers | named character vector of HTTP headers to use in HTTP requests. It is ignored for non-HTTP URLs. The |
compression | integer in 0–9. The amount of compression to be applied when writing, from none to maximal available. For |
timeout | numeric: the timeout (in seconds) to be used for this connection. Beware that some OSes may treat very large values as zero: however the POSIX standard requires values up to 31 days to be supported. |
options | optional character vector with options. Currently only |
filename | a filename within a zip file. |
host | character string. Host name for the port. |
port | integer. The TCP port number. |
server | logical. Should the socket be a client or a server? |
socket | a server socket listening for connections. |
con | a connection. |
type | character string. Currently ignored. |
rw | character string. Empty or |
... | arguments passed to or from other methods. |
The first eleven functions create connections. By default the connection is not opened (except for a socket connection created by socketConnection or socketAccept and for a server socket connection created by serverSocket), but may be opened by setting a non-empty value of argument open.
For file the description is a path to the file to be opened (when tilde expansion is done) or a complete URL (when it is the same as calling url), or "" (the default) or "clipboard" (see the ‘Clipboard’ section). Use "stdin" to refer to the C-level ‘standard input’ of the process (which need not be connected to anything in a console or embedded version of R, and is not in RGui on Windows). See also stdin() for the subtly different R-level concept of stdin. See nullfile() for a platform-independent way to get the filename of the null device.
For url the description is a complete URL including scheme (such as ‘http://’, ‘https://’, ‘ftp://’ or ‘file://’). Method "internal" is that available since connections were introduced but now mainly defunct. Method "wininet" is only available on Windows (it uses the WinINet functions of that OS) and method "libcurl" (using the library of that name: https://curl.se/libcurl/) is nowadays required but was optional on Windows before R 4.2.0. Method "default" currently uses method "internal" for ‘file://’ URLs and "libcurl" for all others. Which methods support which schemes has varied by R version – currently "internal" supports only ‘file://’; "wininet" supports ‘file://’, ‘http://’ and ‘https://’. Proxies can be specified: see download.file.
For gzfile the description is the path to a file compressed by gzip: it can also open for reading uncompressed files and those compressed by bzip2, xz or lzma.

For bzfile the description is the path to a file compressed by bzip2.

For xzfile the description is the path to a file compressed by xz (https://en.wikipedia.org/wiki/Xz) or (for reading only) lzma (https://en.wikipedia.org/wiki/LZMA).
unz reads (only) single files within zip files, in binary mode. The description is the full path to the zip file, with ‘.zip’ extension if required.

For pipe the description is the command line to be piped to or from. This is run in a shell, on Windows that specified by the COMSPEC environment variable.

For fifo the description is the path of the fifo. (Support for fifo connections is optional but they are available on most Unix platforms and on Windows.)
The intention is that file and gzfile can be used generally for text input (from files, ‘http://’ and ‘https://’ URLs) and binary input respectively.
open, close and seek are generic functions: the following applies to the methods relevant to connections.

open opens a connection. In general functions using connections will open them if they are not open, but then close them again, so to leave a connection open call open explicitly.

close closes and destroys a connection. This will happen automatically in due course (with a warning) if there is no longer an R object referring to the connection.

flush flushes the output stream of a connection open for write/append (where implemented, currently for file and clipboard connections, stdout and stderr).
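The life cycle described above (create, open explicitly, use, close) might look like this for a file connection; the temporary path and variable names are arbitrary.

```r
path <- tempfile(fileext = ".txt")

con <- file(path)                    # created, but not opened yet
open_at_creation <- isOpen(con)

open(con, "w")                       # open explicitly so it stays open across calls
writeLines("alpha", con)
writeLines("beta", con)              # second write continues the same stream
flush(con)                           # push buffered output to the file
close(con)                           # closes and destroys the connection

lines_back <- readLines(path)        # readLines opens and closes its own connection
unlink(path)
```

Had the connection not been opened explicitly, each writeLines call would have opened and closed it again, and the second call would have overwritten the output of the first.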
If for a file or (on most platforms) a fifo connection the description is "", the file/fifo is immediately opened (in "w+" mode unless open = "w+b" is specified) and unlinked from the file system. This provides a temporary file/fifo to write to and then read from.
socketConnection(server=TRUE) creates a new temporary server socket listening on the given port. As soon as a new socket connection is accepted on that port, the server socket is automatically closed. serverSocket creates a listening server socket which can be used for accepting multiple socket connections by socketAccept. To stop listening for new connections, a server socket needs to be closed explicitly by close.

socketConnection and socketAccept support setting of socket-specific options. Currently only "no-delay" is implemented which enables the TCP_NODELAY socket option, causing the socket to flush send buffers immediately (instead of waiting to collect all output before sending). This option is useful for protocols that need fast request/response turn-around times.

socketTimeout sets connection timeout of a socket connection. A negative timeout can be given to query the old value.
file, pipe, fifo, url, gzfile, bzfile, xzfile, unz, socketConnection, socketAccept and serverSocket return a connection object which inherits from class "connection" and has a first more specific class.

open and flush return NULL, invisibly.
close returns either NULL or an integer status, invisibly. The status is from when the connection was last closed and is available only for some types of connections (e.g., pipes, files and fifos): typically zero values indicate success. Negative values will result in a warning; if writing, these may indicate write failures and should not be ignored. Connections should be closed explicitly when finished with to avoid wasting resources and to reduce the risk that some buffered data in output connections would be lost (see on.exit() for how to run code also in case of error).
isOpen returns a logical value, whether the connection is currently open.

isIncomplete returns a logical value, whether the last read attempt from a non-blocking connection provided no data (currently no data from a socket or an unterminated line in readLines), or for an output text connection whether there is unflushed output. See example below.

socketTimeout returns the old timeout value of a socket connection.
url and file support URL schemes ‘file://’, ‘http://’, ‘https://’ and ‘ftp://’.

method = "libcurl" allows more schemes: exactly which schemes is platform-dependent (see libcurlVersion), but all platforms will support ‘https://’ and most platforms will support ‘ftps://’.

Support for the ‘ftp://’ scheme by the "internal" method was deprecated in R 4.1.1 and removed in R 4.2.0.

Most methods do not percent-encode special characters such as spaces in ‘http://’ URLs (see URLencode), but it seems the "wininet" method does.
A note on ‘file://’ URLs (which are handled by the same internal code irrespective of argument method). The most general form (from RFC1738) is ‘file://host/path/to/file’, but R only accepts the form with an empty host field referring to the local machine.

On a Unix-alike, this is then ‘file:///path/to/file’, where ‘path/to/file’ is relative to ‘/’. So although the third slash is strictly part of the specification not part of the path, this can be regarded as a way to specify the file ‘/path/to/file’. It is not possible to specify a relative path using a file URL.

In this form the path is relative to the root of the filesystem, not a Windows concept. The standard form on Windows is ‘file:///d:/R/repos’: for compatibility with earlier versions of R and Unix versions, any other form is parsed by R as ‘file://’ plus path_to_file. Also, backslashes are accepted within the path even though RFC1738 does not allow them.

No attempt is made to decode a percent-encoded ‘file:’ URL: call URLdecode if necessary.
All the methods attempt to follow redirected HTTP and HTTPS URLs.

Server-side cached data is always accepted.

Function download.file and several contributed packages provide more comprehensive facilities to download from URLs.
Possible values for the argument open are

"r" or "rt": Open for reading in text mode.
"w" or "wt": Open for writing in text mode.
"a" or "at": Open for appending in text mode.
"rb": Open for reading in binary mode.
"wb": Open for writing in binary mode.
"ab": Open for appending in binary mode.
"r+", "r+b": Open for reading and writing.
"w+", "w+b": Open for reading and writing, truncating file initially.
"a+", "a+b": Open for reading and appending.
Not all modes are applicable to all connections: for example URLs can only be opened for reading. Only file and socket connections can be opened for both reading and writing. An unsupported mode is usually silently substituted.

If a file or fifo is created on a Unix-alike, its permissions will be the maximal allowed by the current setting of umask (see Sys.umask).
For many connections there is little or no difference between text and binary modes. For file-like connections on Windows, translation of line endings (between LF and CRLF) is done in text mode only (but text read operations on connections such as readLines, scan and source work for any form of line ending). Various R operations are possible in only one of the modes: for example pushBack is text-oriented and is only allowed on connections open for reading in text mode, and binary operations such as readBin, load and save can only be done on binary-mode connections.
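For instance, a binary round trip needs the connection to be opened in a binary mode on both sides. This is a small sketch with arbitrary data:

```r
path <- tempfile(fileext = ".bin")

out <- file(path, "wb")              # binary mode for writing
writeBin(c(1L, 2L, 3L), out)
close(out)

inp <- file(path, "rb")              # binary mode for reading
ints <- readBin(inp, what = "integer", n = 3L)
close(inp)

unlink(path)
```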
The mode of a connection is determined when actually opened, which is deferred if open = "" is given (the default for all but socket connections). An explicit call to open can specify the mode, but otherwise the mode will be "r". (gzfile, bzfile and xzfile connections are exceptions, as the compressed file always has to be opened in binary mode and no conversion of line-endings is done even on Windows, so the default mode is interpreted as "rb".) Most operations that need write access or text-only or binary-only mode will override the default mode of a not-yet-open connection.
Append modes need to be considered carefully for compressed-file connections. They do not produce a single compressed stream on the file, but rather append a new compressed stream to the file. Readers may or may not read beyond end of the first stream: currently R does so for gzfile, bzfile and xzfile connections.
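The following sketch appends a second gzip stream to an existing file and reads both back with R. The file name is temporary and the strings are arbitrary; other tools reading such a file may stop after the first stream.

```r
gz <- tempfile(fileext = ".gz")

con <- gzfile(gz, "w")               # first compressed stream
writeLines("first stream", con)
close(con)

con <- gzfile(gz, "a")               # appends a second, separate stream
writeLines("second stream", con)
close(con)

con <- gzfile(gz, "r")
both <- readLines(con)               # R reads across the stream boundary
close(con)

unlink(gz)
```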
R supports gzip, bzip2 and xz compression (also read-only support for its precursor, lzma compression).

For reading, the type of compression (if any) can be determined from the first few bytes of the file. Thus for file(raw = FALSE) connections, if open is "", "r" or "rt" the connection can read any of the compressed file types as well as uncompressed files. (Using "rb" will allow compressed files to be read byte-by-byte.) Similarly, gzfile connections can read any of the forms of compression and uncompressed files in any read mode.
(The type of compression is determined when the connection is created if open is unspecified and a file of that name exists. If the intention is to open the connection to write a file with a different form of compression under that name, specify open = "w" when the connection is created or unlink the file before creating the connection.)
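A short sketch of the automatic detection when reading: a file written with bzfile is read back through plain file and through gzfile. read_all is a hypothetical helper that makes sure each connection is closed.

```r
path <- tempfile(fileext = ".bz2")

con <- bzfile(path, "w")
writeLines(c("a", "b"), con)
close(con)

# Hypothetical helper: read every line, then close the connection.
read_all <- function(con) {
  on.exit(close(con))
  readLines(con)
}

via_file   <- read_all(file(path))    # compression detected from the first bytes
via_gzfile <- read_all(gzfile(path))  # gzfile also reads bzip2-compressed files

unlink(path)
```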
For write-mode connections, compression specifies how hard the compressor works to minimize the file size, and higher values need more CPU time and more working memory (up to ca 800Mb for xzfile(compression = 9)). For xzfile negative values of compression correspond to adding the xz argument -e: this takes more time (double?) to compress but may achieve (slightly) better compression. The default (6) has good compression and modest (100Mb memory) usage: but if you are using xz compression you are probably looking for high compression.
Choosing the type of compression involves tradeoffs: gzip, bzip2 and xz are successively less widely supported, need more resources for both compression and decompression, and achieve more compression (although individual files may buck the general trend). Typical experience is that bzip2 compression is 15% better on text files than gzip compression, and xz with maximal compression 30% better. The experience with R save files is similar, but on some large ‘.rda’ files xz compression is much better than the other two. With current computers decompression times even with compression = 9 are typically modest and reading compressed files is usually faster than uncompressed ones because of the reduction in disc activity.
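To get a feel for the tradeoff on your own data, the sizes produced by the different connection types can be compared directly. The sketch below uses a highly repetitive made-up text, so the ratios it yields are not representative of real files; written_size is a hypothetical helper.

```r
text <- rep("The quick brown fox jumps over the lazy dog.", 2000)

# Hypothetical helper: write `text` through the given connection constructor
# and report the size of the resulting file in bytes.
written_size <- function(opener) {
  path <- tempfile()
  on.exit(unlink(path))
  con <- opener(path, "w")
  writeLines(text, con)
  close(con)
  file.size(path)
}

sizes <- c(plain = written_size(file),
           gzip  = written_size(gzfile),
           bzip2 = written_size(bzfile),
           xz    = written_size(xzfile))
```

Inspecting sizes shows how far each method shrinks this particular input; which compressed format comes out smallest depends on the data.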
The encoding of the input/output stream of a connection can be specified by name in the same way as it would be given to iconv: see that help page for how to find out what encoding names are recognized on your platform. Additionally, "" and "native.enc" both mean the ‘native’ encoding, that is the internal encoding of the current locale and hence no translation is done.
When writing to a text connection, the connections code always assumes its input is in native encoding, so e.g. writeLines has to convert text to native encoding. The native encoding is UTF-8 on most systems (since R 4.2 also on recent Windows) and can represent all characters. writeLines does not do the conversion when useBytes=TRUE (for expert use only, only useful on systems with native encoding other than UTF-8), but the connections code still behaves as if the text was in native encoding, so any attempt to convert encoding (encoding argument other than "" and "native.enc") in connections will produce incorrect results.
When reading from a text connection, the connections code re-encodes the input to native encoding (from the encoding given by the encoding argument). On systems where UTF-8 is not the native encoding, one can read text not representable in the native encoding using readLines and scan by providing them with an unopened connection that has been created with the encoding argument specifying the input encoding. readLines and scan would then instruct the connections code to convert the text to UTF-8 (instead of native encoding) and they will return it marked (aka declared, see Encoding) as "UTF-8". Finally and for expert use only, one may disable re-encoding of input by specifying "" or "native.enc" as encoding for the connection, but then mark the text as being "UTF-8" or "latin1" via the encoding argument of readLines and scan.
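As an illustration of reading through an unopened connection that declares the input encoding, here is a minimal sketch. The file contents are invented for the example, and how the result is marked depends on the platform and R version.

```r
path <- tempfile(fileext = ".txt")
# The bytes of "café" in Latin-1: 0xe9 is e-acute there,
# but on its own it is not valid UTF-8.
writeBin(as.raw(c(0x63, 0x61, 0x66, 0xe9, 0x0a)), path)

con <- file(path, encoding = "latin1")  # created, but not opened
txt <- readLines(con)                   # opened and re-encoded for this call
close(con)                              # destroy the connection object

txt
Encoding(txt)   # how the returned string is marked
unlink(path)
```

Declaring the encoding on the connection, rather than converting afterwards with iconv, keeps the re-encoding in one place.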
Re-encoding only works for connections in text mode: reading from a connection with re-encoding specified in binary mode will read the stream of bytes, but mixing text and binary mode reads (e.g., mixing calls to readLines and readChar) is likely to lead to incorrect results.
The encodings "UCS-2LE" and "UTF-16LE" are treated specially, as they are appropriate values for Windows ‘Unicode’ text files. If the first two bytes are the Byte Order Mark 0xFEFF then these are removed as some implementations of iconv do not accept BOMs. Note that whereas most implementations will handle BOMs using encoding "UCS-2" and choose the appropriate byte order, some (including earlier versions of glibc) will not. There is a subtle distinction between "UTF-16" and "UCS-2" (see https://en.wikipedia.org/wiki/UTF-16): the use of characters in the ‘Supplementary Planes’ which need surrogate pairs is very rare so "UCS-2LE" is an appropriate first choice (as it is more widely implemented).
The encoding "UTF-8-BOM" is accepted for reading and will remove a Byte Order Mark if present (which it often is for files and webpages generated by Microsoft applications). If a BOM is required (it is not recommended) when writing it should be written explicitly, e.g. by writeChar("\ufeff", con, eos = NULL) or writeBin(as.raw(c(0xef, 0xbb, 0xbf)), binary_con).
Encoding names "utf8", "mac" and "macroman" are not portable, and not supported on all current R platforms. "UTF-8" is portable and "macintosh" is the official (and most widely supported) name for ‘Mac Roman’. (R maps "utf8" to "UTF-8" internally.)
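The "UTF-8-BOM" handling described above can be sketched as follows. The file and its contents are made up for the example; the three BOM bytes are the ones quoted in the text.

```r
path <- tempfile(fileext = ".csv")
bom <- as.raw(c(0xef, 0xbb, 0xbf))
# Create a small UTF-8 file that starts with a Byte Order Mark.
writeBin(c(bom, charToRaw("id,value\n1,10\n")), path)

con <- file(path, encoding = "UTF-8-BOM")
clean <- readLines(con)    # the BOM is removed from the first line
close(con)

# The BOM is still present in the file itself.
first_bytes <- readBin(path, what = "raw", n = 3L)
unlink(path)

clean
first_bytes
```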
Requesting a conversion that is not supported is an error, reported when the connection is opened. Exactly what happens when the requested translation cannot be done for invalid input is in general undocumented. On output the result is likely to be that up to the error, with a warning. On input, it will most likely be all or some of the input up to the error.
It may be possible to deduce the current native encoding from Sys.getlocale("LC_CTYPE"), but not all OSes record it.
Whether or not the connection blocks can be specified for file, url (default yes), fifo and socket connections (default not).
In blocking mode, functions using the connection do not return to the R evaluator until the read/write is complete. In non-blocking mode, operations return as soon as possible, so on input they will return with whatever input is available (possibly none) and for output they will return whether or not the write succeeded.
The function readLines behaves differently in respect of incomplete last lines in the two modes: see its help page.
Even when a connection is in blocking mode, attempts are made to ensure that it does not block the event loop and hence the operation of GUI parts of R. These do not always succeed, and the whole R process will be blocked during a DNS lookup on Unix, for example.
Most blocking operations on HTTP/FTP URLs and on sockets are subject to the timeout set by options("timeout"). Note that this is a timeout for no response, not for the whole operation. The timeout is set at the time the connection is opened (more precisely, when the last connection of that type – ‘http:’, ‘ftp:’ or socket – was opened).
Fifos default to non-blocking. That follows S version 4 and is probably most natural, but it does have some implications. In particular, opening a non-blocking fifo connection for writing (only) will fail unless some other process is reading on the fifo.
Opening a fifo for both reading and writing (in any mode: one can only append to fifos) connects both sides of the fifo to the R process, and provides a similar facility to file().
file can be used with description = "clipboard" in mode "r" only. This reads the X11 primary selection (see https://specifications.freedesktop.org/clipboards-spec/clipboards-latest.txt), which can also be specified as "X11_primary" and the secondary selection as "X11_secondary". On most systems the clipboard selection (that used by ‘Copy’ from an ‘Edit’ menu) can be specified as "X11_clipboard".
When a clipboard is opened for reading, the contents are immediately copied to internal storage in the connection.
Unix users wishing to write to one of the X11 selections may be able to do so via xclip (https://github.com/astrand/xclip) or xsel (https://www.vergenet.net/~conrad/software/xsel/), for example by pipe("xclip -i", "w") for the primary selection.
macOS users can use pipe("pbpaste") and pipe("pbcopy", "w") to read from and write to that system's clipboard.
In most cases these are translated to the native encoding.
The exceptions are file and pipe on Windows, where a description which is marked as being in UTF-8 is passed to Windows as a ‘wide’ character string. This allows files with names not in the native encoding to be opened on file systems which use Unicode file names (such as NTFS but not FAT32).
Most modern browsers do not support such URLs, and ‘https://’ ones are much preferred for use in R.
It is intended that R will continue to allow such URLs for as long as libcurl does, but as they become rarer this is increasingly untested. What ‘protocols’ the version of libcurl being used supports can be seen by calling libcurlVersion().
There is a limit on the number of connections which can be allocated (not necessarily open) at any one time. It is good practice to close connections when finished with, but if necessary garbage-collection will be invoked to close those connections without any R object referring to them.
The default limit is 128 (including the three terminal connections, stdin, stdout and stderr). This can be increased when R is started using the option --max-connections=N, where the maximum allowed value is 4096.
However, many types of connections use other resources which are themselves limited. Notably on Unix, ‘file descriptors’ are by default limited per process: this limits the number of connections using files, pipes and fifos. (The default limit is 256 on macOS (and Solaris) but 1024 on Linux. The limit can be raised in the shell used to launch R, for example by ulimit -n.) File descriptors are used for many other purposes including dynamically loading DSO/DLLs (see dyn.load) which may use up to 60% of the limit.
Windows has a default limit of 512 open C file streams: these are used by at least file, gzfile, bzfile, xzfile, pipe, url and unz connections applied to files (rather than URLs).
Package parallel's makeCluster uses socket connections to communicate with the worker processes, one per worker.
R's connections are modelled on those in S version 4 (see Chambers, 1998). However R goes well beyond the S model, for example in output text connections and URL, compressed and socket connections. The default open mode in R is "r" except for socket connections. This differs from S, where it is the equivalent of "r+", known as "*".
On (historic) platforms where vsnprintf does not return the needed length of output there is a 100,000 byte output limit on the length of a line for text output on fifo, gzfile, bzfile and xzfile connections: longer lines will be truncated with a warning.
Chambers, J. M. (1998) Programming with Data. A Guide to the S Language. Springer.
Ripley, B. D. (2001). “Connections.” R News, 1(1), 16–7. https://www.r-project.org/doc/Rnews/Rnews_2001-1.pdf.
textConnection, seek, showConnections, pushBack.
Functions making direct use of connections are (text-mode) readLines, writeLines, cat, sink, scan, parse, read.dcf, dput, dump and (binary-mode) readBin, readChar, writeBin, writeChar, load and save.
capabilities to see if fifo connections are supported by this build of R.
gzcon to wrap gzip (de)compression around a connection.
options HTTPUserAgent, internet.info and timeout are used by some of the methods for URL connections.
memCompress for more ways to (de)compress and references on data compression.
extSoftVersion for the versions of the zlib (for gzfile), bzip2 and xz libraries in use.
To flush output to the Windows and macOS consoles, see flush.console.
zzfil <- tempfile(fileext=".data")
zz <- file(zzfil, "w")  # open an output file connection
cat("TITLE extra line", "2 3 5 7", "", "11 13 17", file = zz, sep = "\n")
cat("One more line\n", file = zz)
close(zz)
readLines(zzfil)
unlink(zzfil)

zzfil <- tempfile(fileext=".gz")
zz <- gzfile(zzfil, "w")  # compressed file
cat("TITLE extra line", "2 3 5 7", "", "11 13 17", file = zz, sep = "\n")
close(zz)
readLines(zz <- gzfile(zzfil))
close(zz)
unlink(zzfil)
zz # an invalid connection

zzfil <- tempfile(fileext=".bz2")
zz <- bzfile(zzfil, "w")  # bzip2-ed file
cat("TITLE extra line", "2 3 5 7", "", "11 13 17", file = zz, sep = "\n")
close(zz)
zz # print() method: invalid connection
print(readLines(zz <- bzfile(zzfil)))
close(zz)
unlink(zzfil)

## An example of a file open for reading and writing
Tpath <- tempfile("test")
Tfile <- file(Tpath, "w+")
c(isOpen(Tfile, "r"), isOpen(Tfile, "w")) # both TRUE
cat("abc\ndef\n", file = Tfile)
readLines(Tfile)
seek(Tfile, 0, rw = "r") # reset to beginning
readLines(Tfile)
cat("ghi\n", file = Tfile)
readLines(Tfile)
Tfile # -> print() : "valid" connection
close(Tfile)
Tfile # -> print() : "invalid" connection
unlink(Tpath)

## We can do the same thing with an anonymous file.
Tfile <- file()
cat("abc\ndef\n", file = Tfile)
readLines(Tfile)
close(Tfile)

## Not run:
## fifo example -- may hang even with OS support for fifos
if(capabilities("fifo")) {
  zzfil <- tempfile(fileext="-fifo")
  zz <- fifo(zzfil, "w+")
  writeLines("abc", zz)
  print(readLines(zz))
  close(zz)
  unlink(zzfil)
}
## End(Not run)

## Unix examples of use of pipes

# read listing of current directory
readLines(pipe("ls -1"))

# remove trailing commas. Suppose
## Not run:
# % cat data2_
# 450, 390, 467, 654, 30, 542, 334, 432, 421,
# 357, 497, 493, 550, 549, 467, 575, 578, 342,
# 446, 547, 534, 495, 979, 479
## End(Not run)
# Then read this by
scan(pipe("sed -e s/,$// data2_"), sep = ",")

# convert decimal point to comma in output: see also write.table
# both R strings and (probably) the shell need \ doubled
zzfil <- tempfile("outfile")
zz <- pipe(paste("sed s/\\\\./,/ >", zzfil), "w")
cat(format(round(stats::rnorm(48), 4)), fill = 70, file = zz)
close(zz)
file.show(zzfil, delete.file = TRUE)

## Not run:
## example for a machine running a finger daemon
con <- socketConnection(port = 79, blocking = TRUE)
writeLines(paste0(system("whoami", intern = TRUE), "\r"), con)
gsub(" *$", "", readLines(con))
close(con)
## End(Not run)

## Not run:
## Two R processes communicating via non-blocking sockets
# R process 1
con1 <- socketConnection(port = 6011, server = TRUE)
writeLines(LETTERS, con1)
close(con1)

# R process 2
con2 <- socketConnection(Sys.info()["nodename"], port = 6011)
# as non-blocking, may need to loop for input
readLines(con2)
while(isIncomplete(con2)) {
  Sys.sleep(1)
  z <- readLines(con2)
  if(length(z)) print(z)
}
close(con2)

## examples of use of encodings
# write a file in UTF-8
cat(x, file = (con <- file("foo", "w", encoding = "UTF-8"))); close(con)
# read a 'Windows Unicode' file
A <- read.table(con <- file("students", encoding = "UCS-2LE")); close(con)
## End(Not run)
Constants built into R.
LETTERS
letters
month.abb
month.name
pi
R has a small number of built-in constants.
The following constants are available:
LETTERS: the 26 upper-case letters of the Roman alphabet;
letters: the 26 lower-case letters of the Roman alphabet;
month.abb: the three-letter abbreviations for the English month names;
month.name: the English names for the months of the year;
pi: the ratio of the circumference of a circle to its diameter.
These are implemented as variables in the base namespace taking appropriate values.
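Because the constants are ordinary variables, they can be used like any other vector, and they can also be masked by a user assignment. A short sketch follows; the names month_number, circle_area and vals are invented for the example.

```r
# Use month.abb as a lookup table from abbreviation to month number.
month_number <- stats::setNames(seq_along(month.abb), month.abb)
month_number[["Mar"]]

# LETTERS and letters correspond element by element.
identical(toupper(letters), LETTERS)

circle_area <- function(r) pi * r^2
circle_area(2)

# An assignment masks the constant; base::pi always reaches the original.
vals <- local({
    pi <- 3
    c(masked = pi, base = base::pi)
})
vals
```

Referring to base::pi explicitly is a defensive option in code that cannot rule out such masking.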
Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988) The New S Language. Wadsworth & Brooks/Cole.
Quotes for the parsing of character constants, NumericConstants for numeric constants.
## John Machin (ca 1706) computed pi to over 100 decimal places
## using the Taylor series expansion of the second term of
pi - 4*(4*atan(1/5) - atan(1/239))

## months in English
month.name
## months in your current locale
format(ISOdate(2000, 1:12, 1), "%B")
format(ISOdate(2000, 1:12, 1), "%b")
The R Who-is-who, describing who made significant contributions to the development of R.
contributors()
These are the basic control-flow constructs of the R language. They function in much the same way as control statements in any Algol-like language. They are all reserved words.
if(cond) expr
if(cond) cons.expr else alt.expr

for(var in seq) expr
while(cond) expr
repeat expr
break
next

x %||% y
cond | A length-one logical vector that is not |
var | A syntactical name for a variable. |
seq | An expression evaluating to a vector (including a list and an expression) or to a pairlist or |
expr, cons.expr, alt.expr, x, y | An expression in a formal sense. This is either a simple expression or a so-called compound expression, usually of the form |
break breaks out of a for, while or repeat loop; control is transferred to the first statement outside the inner-most loop. next halts the processing of the current iteration and advances the looping index. Both break and next apply only to the innermost of nested loops.
Note that it is a common mistake to forget to put braces ({ .. }) around your statements, e.g., after if(..) or for(....). In particular, you should not have a newline between } and else to avoid a syntax error in entering an if ... else construct at the keyboard or via source. For that reason, one (somewhat extreme) attitude of defensive programming is to always use braces, e.g., for if clauses.
The seq in a for loop is evaluated at the start of the loop; changing it subsequently does not affect the loop. If seq has length zero the body of the loop is skipped. Otherwise the variable var is assigned in turn the value of each element of seq. You can assign to var within the body of the loop, but this will not affect the next iteration. When the loop terminates, var remains as a variable containing its latest value.
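These rules for for loops can be checked directly. The sketch below uses made-up variable names (s, v, w, visited) and changes both the sequence and the loop variable inside the body.

```r
s <- 1:3
visited <- integer()
for (v in s) {
    s <- 10:20                 # changing seq now does not affect the loop
    visited <- c(visited, v)
    v <- v * 100               # nor does assigning to the loop variable
}
visited   # the original three elements were visited in turn
v         # the latest value assigned inside the body

# A zero-length seq skips the body entirely.
for (w in integer()) stop("never reached")
w
```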
The null coalescing operator %||% is a simple 1-line function: x %||% y is an idiomatic way to call
if (is.null(x)) y else x
# or equivalently, of course,
if(!is.null(x)) x else y
Inspired by Ruby, it was first proposed by Hadley Wickham.
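A typical use of %||% is supplying defaults for missing list components. This is a sketch only: the list opts and its component names are invented, and the fallback definition is there because base R provides the operator only from version 4.4.0.

```r
# Define an equivalent operator when running on R older than 4.4.0.
if (!exists("%||%", envir = baseenv(), inherits = FALSE)) {
    `%||%` <- function(x, y) if (is.null(x)) y else x
}

opts <- list(sep = ",")
sep        <- opts$sep   %||% "\t"   # present, so the stored value is used
quote_char <- opts$quote %||% "\""   # absent (NULL), so the default is used

# Only NULL triggers the right-hand side; FALSE, NA or "" do not.
kept_false <- FALSE %||% TRUE

# The right-hand side is evaluated lazily, so this does not signal an error.
lazy <- 1 %||% stop("never evaluated")
```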
if returns the value of the expression evaluated, or NULL invisibly if none was (which may happen if there is no else).
for, while and repeat return NULL invisibly. for sets var to the last used element of seq, or to NULL if it was of length zero.
break and next do not return a value as they transfer control within the loop.
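The return values described above can be captured by assignment. In this sketch the names r_if_true, r_if_false, r_for and r_while are made up for illustration.

```r
r_if_true  <- if (TRUE) "yes" else "no"   # value of the branch evaluated
r_if_false <- if (FALSE) "yes"            # no else and no branch run: NULL

r_for <- for (i in 1:3) i * 2             # loops themselves return NULL
k <- 0
r_while <- while (k < 2) k <- k + 1

list(r_if_true, r_if_false, r_for, r_while, i = i, k = k)
```

Since a loop always returns NULL, results have to be accumulated in a variable inside the body or produced with functions such as vapply.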
Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988) The New S Language. Wadsworth & Brooks/Cole.
Syntax for the basic R syntax and operators, Paren for parentheses and braces.
ifelse, switch for other ways to control flow.
for(i in 1:5) print(1:i)
for(n in c(2,5,10,20,50)) {
   x <- stats::rnorm(n)
   cat(n, ": ", sum(x^2), "\n", sep = "")
}
f <- factor(sample(letters[1:5], 10, replace = TRUE))
for(i in unique(f)) print(i)

res <- {}
res %||% "alternative result"
x <- head(x) %||% stop("parsed, but *not* evaluated..")

res <- if(sum(x) > 7.5) mean(x) # may be NULL
res %||% "sum(x) <= 7.5"
R is released under the ‘GNU Public License’: see license for details. The license describes your right to use R. Copyright is concerned with ownership of intellectual rights, and some of the software used has conditions that the copyright must be explicitly stated: see the ‘Details’ section. We are grateful to these people and other contributors (see contributors) for the ability to use their work.
The file ‘R_HOME/COPYRIGHTS’ lists the copyrights in full detail.
Given matrices x and y as arguments, return a matrix cross-product. This is formally equivalent to (but faster than) the call t(x) %*% y (crossprod) or x %*% t(y) (tcrossprod).
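The stated equivalences can be verified on small inputs. The matrices a and b below are arbitrary values chosen for the sketch.

```r
a <- matrix(c(1, 2, 3, 4, 5, 6), nrow = 3)   # 3 x 2
b <- matrix(c(1, 0, 2, 1, 0, 3), nrow = 3)   # 3 x 2

cp  <- crossprod(a, b)    # same as t(a) %*% b, a 2 x 2 matrix
tcp <- tcrossprod(a, b)   # same as a %*% t(b), a 3 x 3 matrix

# With a single argument the matrix is multiplied with itself.
gram <- crossprod(a)      # same as t(a) %*% a

stopifnot(all.equal(cp, t(a) %*% b),
          all.equal(tcp, a %*% t(b)),
          all.equal(gram, t(a) %*% a))
```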
These are generic functions since R 4.4.0: methods can be written individually or via the matOps group generic function; it dispatches to S3 and S4 methods.
crossprod(x, y = NULL, ...)
tcrossprod(x, y = NULL, ...)
x ,y | numeric or complex matrices (or vectors): |
... | potential further arguments for methods. |
A double or complex matrix, with appropriate dimnames taken from x and y.
When x or y are not matrices, they are treated as column or row matrices, but their names are usually not promoted to dimnames. Hence, currently, the last example has empty dimnames.
In the same situation, these matrix products (also %*%) are more flexible in promotion of vectors to row or column matrices, such that more cases are allowed, since R 3.2.0.
The propagation of NaN/Inf values, precision, and performance of matrix products can be controlled by options("matprod").
Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988) The New S Language. Wadsworth & Brooks/Cole.
(z <- crossprod(1:4))    # = sum(1 + 2^2 + 3^2 + 4^2)
drop(z)                  # scalar
x <- 1:4; names(x) <- letters[1:4]; x
tcrossprod(as.matrix(x)) # is
identical(tcrossprod(as.matrix(x)),
          crossprod(t(x)))
tcrossprod(x)            # no dimnames

m <- matrix(1:6, 2,3) ; v <- 1:3; v2 <- 2:1
stopifnot(identical(tcrossprod(v, m), v %*% t(m)),
          identical(tcrossprod(v, m), crossprod(v, t(m))),
          identical(crossprod(m, v2), t(m) %*% v2))
Report information on the C stack size and usage (if available).
Cstack_info()
On most platforms, C stack information is recorded when R is initialized and used for stack-checking. If this information is unavailable, the size will be returned as NA, and stack-checking is not performed.
The information on the stack base address is thought to be accurate on Windows, Linux (using glibc), macOS and FreeBSD but a heuristic is used on other platforms. Because this might be slightly inaccurate, the current usage could be estimated as negative. (The heuristic is not used on embedded uses of R on platforms where the stack base information is not thought to be accurate.)
The ‘evaluation depth’ is the number of nested R expressions currently under evaluation: this has a limit controlled by options("expressions").
An integer vector. This has named elements
size | The size of the stack (in bytes), or |
current | The estimated current usage (in bytes), possibly |
direction |
|
eval_depth | The current evaluation depth (including two calls for the call to |
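The named elements listed above can be inspected as in the following sketch. The helper depth_inside() is invented for the example, and the numbers reported vary by platform and session.

```r
info <- Cstack_info()
names(info)

# Fraction of the C stack in use, when the size is known.
if (!is.na(info[["size"]])) {
    info[["current"]] / info[["size"]]
}

# The evaluation depth grows with nested calls.
depth_inside <- function() Cstack_info()[["eval_depth"]]
c(top = info[["eval_depth"]], nested = depth_inside())
```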
Cstack_info()
Returns a vector whose elements are the cumulative sums, products, minima or maxima of the elements of the argument.
cumsum(x)
cumprod(x)
cummax(x)
cummin(x)
x | a numeric or complex (not |
These are generic functions: methods can be defined for them individually or via the Math group generic.
A vector of the same length and type as x (after coercion), except that cumprod returns a numeric vector for integer input (for consistency with *). Names are preserved.
An NA value in x causes the corresponding and following elements of the return value to be NA, as does integer overflow in cumsum (with a warning). In the complex case with NAs, these NA elements may have finite real or imaginary parts, notably for cumsum(), fulfilling the identity Im(cumsum(x)) ≡ cumsum(Im(x)).
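The NA propagation, integer overflow and type rules above can be seen on small inputs. This is a sketch with made-up values; the warning from the overflowing call is suppressed only to keep the output short.

```r
# Everything from the first NA onwards is NA.
with_na <- cumsum(c(1, 2, NA, 4))

# Integer overflow in cumsum() gives NA (with a warning) ...
big <- c(.Machine$integer.max, 1L)
overflowed <- suppressWarnings(cumsum(big))
# ... which can be avoided by summing in double precision.
as_double <- cumsum(as.numeric(big))

# cumprod() of integers returns a double vector.
prod_type <- typeof(cumprod(1:5))

# Names are carried over to the result.
named <- cumsum(c(a = 1, b = 2, c = 3))

with_na; overflowed; as_double; prod_type; named
```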
cumsum and cumprod are S4 generic functions: methods can be defined for them individually or via the Math group generic. cummax and cummin are individually S4 generic functions.
Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988) The New S Language. Wadsworth & Brooks/Cole. (cumsum only.)
cumsum(1:10)
cumprod(1:10)
cummin(c(3:1, 2:0, 4:2))
cummax(c(3:1, 2:0, 4:2))
Retrieve the headers for a URL for a supported protocol such as ‘http://’, ‘ftp://’, ‘https://’ and ‘ftps://’.
curlGetHeaders(url, redirect = TRUE, verify = TRUE,
               timeout = 0L, TLS = "")
url | character string specifying the URL. |
redirect | logical: should redirections be followed? |
verify | logical: should certificates be verified as valid and applying to that host? |
timeout | integer: the maximum time in seconds the request is allowed to take. Non-positive and invalid values are ignored (including the default). (Added in R 4.1.0.) |
TLS | character: the minimum version of the TLS protocol to be used for ‘https://’ URLs: the default ( |
This reports what curl -I -L or curl -I would report. For a ‘ftp://’ URL the ‘headers’ are a record of the conversation between client and server before data transfer.
Only 500 header lines will be reported: there is a limit of 20 redirections so this should suffice (and even 20 would indicate problems).
If argument timeout is not set to a positive integer this uses getOption("timeout") which defaults to 60 seconds. As the request cannot be interrupted you may want to consider a shorter value.
To see all the details of the interaction with the server(s) set options(internet.info = 1).
HTTP[S] servers are allowed to refuse requests to read the headers and some do: this will result in a status of 405.
For possible issues with secure URLs (especially on Windows) see download.file.
There is a security risk in not verifying certificates, but as only the headers are captured it is slight. Usually looking at the URL in a browser will reveal what the problem is (and it may well be machine-specific).
A character vector with integer attribute "status" (the last-received ‘status’ code). If redirection occurs this will include the headers for all the URLs visited.
For the interpretation of ‘status’ codes see https://en.wikipedia.org/wiki/List_of_HTTP_status_codes and https://en.wikipedia.org/wiki/List_of_FTP_server_return_codes. A successful FTP connection will usually have status 250, 257 or 350.
capabilities("libcurl") to see if this is supported. libcurlVersion for the version of libcurl in use.
options HTTPUserAgent and timeout are used.
## needs Internet access, results vary
curlGetHeaders("http://bugs.r-project.org")   ## this redirects to https://
## 2023-04: replaces slow and unreliable https://httpbin.org/status/404
curlGetHeaders("https://developer.R-project.org/inet-tests/not-found") ## returns status
cut divides the range of x into intervals and codes the values in x according to which interval they fall. The leftmost interval corresponds to level one, the next leftmost to level two and so on.
cut(x, ...)

## Default S3 method:
cut(x, breaks, labels = NULL,
    include.lowest = FALSE, right = TRUE, dig.lab = 3,
    ordered_result = FALSE, ...)
x | a numeric vector which is to be converted to a factor by cutting. |
breaks | either a numeric vector of two or more unique cut points or a single number (greater than or equal to 2) giving the number of intervals into which |
labels | labels for the levels of the resulting category. By default, labels are constructed using |
include.lowest | logical, indicating if an ‘x[i]’ equal to the lowest (or highest, for |
right | logical, indicating if the intervals should be closed on the right (and open on the left) or vice versa. |
dig.lab | integer which is used when labels are not given. It determines the number of digits used in formatting the break numbers. |
ordered_result | logical: should the result be an ordered factor? |
... | further arguments passed to or from other methods. |
When breaks is specified as a single number, the range of the data is divided into breaks pieces of equal length, and then the outer limits are moved away by 0.1% of the range to ensure that the extreme values both fall within the break intervals. (If x is a constant vector, equal-length intervals are created, one of which includes the single value.)
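The effect of giving breaks as a single number can be seen on a small vector. The values below are invented for the sketch.

```r
vals <- c(0, 2.5, 5, 7.5, 10)
f <- cut(vals, breaks = 4)

levels(f)        # four intervals; the outer limits extend slightly past 0 and 10
as.integer(f)    # interval code of each value
table(f)

# Thanks to the widened outer limits neither extreme is lost as NA.
anyNA(f)
```

Because intervals are closed on the right by default, 2.5 belongs to the first interval rather than the second.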
If a labels parameter is specified, its values are used to name the factor levels. If none is specified, the factor level labels are constructed as "(b1, b2]", "(b2, b3]" etc. for right = TRUE and as "[b1, b2)", ... if right = FALSE. In this case, dig.lab indicates the minimum number of digits that should be used in formatting the numbers b1, b2, .... A larger value (up to 12) will be used if needed to distinguish between any pair of endpoints: if this fails labels such as "Range3" will be used. Formatting is done by formatC.
The default method will sort a numeric vector of breaks, but other methods are not required to and labels will correspond to the intervals after sorting.
As from R 3.2.0, getOption("OutDec") is consulted when labels are constructed for labels = NULL.
A factor is returned, unless labels = FALSE which results in an integer vector of level codes.
Values which fall outside the range of breaks are coded as NA, as are NaN and NA values.
Instead of table(cut(x, br)), hist(x, br, plot = FALSE) is more efficient and less memory hungry. Instead of cut(*, labels = FALSE), findInterval() is more efficient.
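When swapping cut(*, labels = FALSE) for findInterval(), note that the two use different defaults. The sketch below, with invented break points, makes them agree by asking cut for left-closed intervals, and shows how out-of-range values differ.

```r
br <- c(0, 1, 2, 3)
xs <- c(0, 0.5, 1, 2.9)

# findInterval() uses left-closed intervals by default, cut() right-closed,
# so right = FALSE is needed for the codes to match inside the range.
codes_cut <- cut(xs, br, labels = FALSE, right = FALSE)
codes_fi  <- findInterval(xs, br)
identical(codes_cut, codes_fi)

# Outside the range of the breaks the two differ:
out_cut <- cut(5, br, labels = FALSE)   # NA
out_fi  <- findInterval(5, br)          # the number of breaks
```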
Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988) The New S Language. Wadsworth & Brooks/Cole.
split for splitting a variable according to a group factor; factor, tabulate, table, findInterval.
quantile for ways of choosing breaks of roughly equal content (rather than length).
.bincode for a bare-bones version.
Z <- stats::rnorm(10000)
table(cut(Z, breaks = -6:6))
sum(table(cut(Z, breaks = -6:6, labels = FALSE)))
sum(graphics::hist(Z, breaks = -6:6, plot = FALSE)$counts)

cut(rep(1,5), 4) #-- dummy
tx0 <- c(9, 4, 6, 5, 3, 10, 5, 3, 5)
x <- rep(0:8, tx0)
stopifnot(table(x) == tx0)

table( cut(x, breaks = 8))
table( cut(x, breaks = 3*(-2:5)))
table( cut(x, breaks = 3*(-2:5), right = FALSE))

##--- some values OUTSIDE the breaks :
table(cx  <- cut(x, breaks = 2*(0:4)))
table(cxl <- cut(x, breaks = 2*(0:4), right = FALSE))
which(is.na(cx));  x[is.na(cx)]  #-- the first 9 values 0
which(is.na(cxl)); x[is.na(cxl)] #-- the last 5 values 8

## Label construction:
y <- stats::rnorm(100)
table(cut(y, breaks = pi/3*(-3:3)))
table(cut(y, breaks = pi/3*(-3:3), dig.lab = 4))

table(cut(y, breaks = 1*(-3:3), dig.lab = 4))
# extra digits don't "harm" here
table(cut(y, breaks = 1*(-3:3), right = FALSE))
#- the same, since no exact INT!

## sometimes the default dig.lab is not enough to avoid confusion:
aaa <- c(1,2,3,4,5,2,3,4,5,6,7)
cut(aaa, 3)
cut(aaa, 3, dig.lab = 4, ordered_result = TRUE)

## one way to extract the breakpoints
labs <- levels(cut(aaa, 3))
cbind(lower = as.numeric( sub("\\((.+),.*", "\\1", labs) ),
      upper = as.numeric( sub("[^,]*,([^]]*)\\]", "\\1", labs) ))
Method for cut applied to date-time objects.
## S3 method for class 'POSIXt'
cut(x, breaks, labels = NULL, start.on.monday = TRUE, right = FALSE, ...)

## S3 method for class 'Date'
cut(x, breaks, labels = NULL, start.on.monday = TRUE, right = FALSE, ...)
x | an object inheriting from class |
breaks | a vector of cut points or number giving the number of intervals which |
labels | labels for the levels of the resulting category. By default, labels are constructed from the left-hand end of the intervals (which are included for the default value of |
start.on.monday | logical. If |
right, ... | arguments to be passed to or from other methods. |
Note that the default for right differs from the default method. Using include.lowest = TRUE will include both ends of the range of dates.
Using breaks = "quarter" will create intervals of 3 calendar months, with the intervals beginning on January 1, April 1, July 1 or October 1 (based upon min(x)) as appropriate.
A vector of breaks will be sorted before use: labels should correspond to the sorted vector.
A factor is returned, unless labels = FALSE which returns the integer level codes.
Values which fall outside the range of breaks are coded as NA, as are NA values.
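As a minimal sketch of the "quarter" behaviour described above (the dates below are made up for illustration), the levels are labelled by the first day of each quarter, and labels = FALSE gives the integer codes instead of a factor:

```r
# Hypothetical dates spanning three calendar quarters of 2001
d <- as.Date(c("2001-01-15", "2001-02-20", "2001-04-01", "2001-08-31"))

# Default: a factor whose levels are the left-hand ends of the intervals
q <- cut(d, "quarter")
levels(q)

# labels = FALSE: integer level codes rather than a factor
codes <- cut(d, "quarter", labels = FALSE)
codes
```

Because right = FALSE is the default for these methods, a date that falls exactly on a quarter boundary (here 2001-04-01) belongs to the interval that starts on that day.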
## random dates in a 10-week period
cut(ISOdate(2001, 1, 1) + 70*86400*stats::runif(100), "weeks")
cut(as.Date("2001/1/1") + 70*stats::runif(100), "weeks")

# The standards all have midnight as the start of the day, but some
# people incorrectly interpret it at the end of the previous day ...
tm <- seq(as.POSIXct("2012-06-01 06:00"), by = "6 hours", length.out = 24)
aggregate(1:24, list(day = cut(tm, "days")), mean)
# and a version with midnight included in the previous day:
aggregate(1:24, list(day = cut(tm, "days", right = TRUE)), mean)
Determine the class of an arbitrary R object.
data.class(x)
x | an R object. |
character string giving the class of x.
The class is the (first element) of the class attribute if this is non-NULL, or inferred from the object's dim attribute if this is non-NULL, or mode(x).
Simply speaking, data.class(x) returns what is typically useful for method dispatching. (Or, what the basic creator functions already and maybe eventually all will attach as a class attribute.)
For compatibility reasons, there is one exception to the rule above: When x is integer, the result of data.class(x) is "numeric" even when x is classed.
x <- LETTERS
data.class(factor(x))                 # has a class attribute
data.class(matrix(x, ncol = 13))      # has a dim attribute
data.class(list(x))                   # the same as mode(x)
data.class(x)                         # the same as mode(x)

stopifnot(data.class(1:2) == "numeric") # compatibility "rule"
The function data.frame() creates data frames, tightly coupled collections of variables which share many of the properties of matrices and of lists, used as the fundamental data structure by most of R's modeling software.
data.frame(..., row.names = NULL, check.rows = FALSE,
           check.names = TRUE, fix.empty.names = TRUE,
           stringsAsFactors = FALSE)
... | these arguments are of either the form |
row.names |
|
check.rows | if |
check.names | logical. If |
fix.empty.names | logical indicating if arguments which are “unnamed” (in the sense of not being formally called as |
stringsAsFactors | logical: should character vectors be converted to factors? The ‘factory-fresh’ default has been |
A data frame is a list of variables of the same number of rows with unique row names, given class "data.frame". If no variables are included, the row names determine the number of rows.
The column names should be non-empty, and attempts to use empty names will have unsupported results. Duplicate column names are allowed, but you need to use check.names = FALSE for data.frame to generate such a data frame. However, not all operations on data frames will preserve duplicated column names: for example matrix-like subsetting will force column names in the result to be unique.
data.frame converts each of its arguments to a data frame by calling as.data.frame(optional = TRUE). As that is a generic function, methods can be written to change the behaviour of arguments according to their classes: R comes with many such methods. Character variables passed to data.frame are converted to factor columns if not protected by I and argument stringsAsFactors is true. If a list or data frame or matrix is passed to data.frame it is as if each component or column had been passed as a separate argument (except for matrices protected by I).
Objects passed to data.frame should have the same number of rows, but atomic vectors (see is.vector), factors and character vectors protected by I will be recycled a whole number of times if necessary (including as elements of list arguments).
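A small sketch of the recycling rule, using made-up column names: shorter vectors are repeated only when their length divides the number of rows, otherwise data.frame signals an error.

```r
# 'grp' (length 2) and 'flag' (length 1) are recycled to match 'id' (length 6)
d <- data.frame(id = 1:6, grp = c("a", "b"), flag = TRUE)
d

# Length 2 does not divide 5, so this call fails; capture the message
msg <- tryCatch(data.frame(id = 1:5, grp = c("a", "b")),
                error = function(e) conditionMessage(e))
msg
```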
If row names are not supplied in the call to data.frame, the row names are taken from the first component that has suitable names, for example a named vector or a matrix with rownames or a data frame. (If that component is subsequently recycled, the names are discarded with a warning.) If row.names was supplied as NULL or no suitable component was found the row names are the integer sequence starting at one (and such row names are considered to be ‘automatic’, and not preserved by as.matrix).
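The difference between omitting row.names and supplying it as NULL can be sketched as follows (the vector v is hypothetical):

```r
v <- c(a = 1, b = 2, c = 3)

# row.names not supplied: taken from the names of the first suitable component
rn_from_names <- row.names(data.frame(v))

# row.names explicitly NULL: automatic row names 1, 2, 3
rn_automatic <- row.names(data.frame(v, row.names = NULL))

rn_from_names
rn_automatic
```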
If row names are supplied of length one and the data frame has a single row, the row.names is taken to specify the row names and not a column (by name or number).
Names are removed from vector inputs not protected by I.
A data frame, a matrix-like structure whose columns may be of differing types (numeric, logical, factor and character and so on).
How the names of the data frame are created is complex, and the rest of this paragraph is only the basic story. If the arguments are all named and simple objects (not lists, matrices or data frames) then the argument names give the column names. For an unnamed simple argument, a deparsed version of the argument is used as the name (with an enclosing I(...) removed). For a named matrix/list/data frame argument with more than one named column, the names of the columns are the name of the argument followed by a dot and the column name inside the argument: if the argument is unnamed, the argument's column names are used. For a named or unnamed matrix/list/data frame argument that contains a single column, the column name in the result is the column name in the argument. Finally, the names are adjusted to be unique and syntactically valid unless check.names = FALSE.
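The basic naming rules can be illustrated with a few calls; the matrix m and the argument name z are arbitrary choices for this sketch:

```r
m <- matrix(1:4, 2, dimnames = list(NULL, c("a", "b")))

n_unnamed_matrix <- names(data.frame(m))        # the matrix's own column names
n_named_matrix   <- names(data.frame(z = m))    # argument name, dot, column name
n_deparsed       <- names(data.frame(1:2))      # deparsed, then made syntactic
n_raw            <- names(data.frame(1:2, check.names = FALSE))
n_unique         <- names(data.frame(x = 1:2, x = 3:4))   # made unique
n_duplicated     <- names(data.frame(x = 1:2, x = 3:4, check.names = FALSE))

list(n_unnamed_matrix, n_named_matrix, n_deparsed, n_raw,
     n_unique, n_duplicated)
```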
In versions of R prior to 2.4.0 row.names had to be character: to ensure compatibility with such versions of R, supply a character vector as the row.names argument.
Chambers, J. M. (1992) Data for models. Chapter 3 of Statistical Models in S, eds J. M. Chambers and T. J. Hastie, Wadsworth & Brooks/Cole.
I, plot.data.frame, print.data.frame, row.names, names (for the column names), [.data.frame for subsetting methods and I(matrix(..)) examples; Math.data.frame etc, about Group methods for data.frames; read.table, make.names, list2DF for creating data frames from lists of variables.
L3 <- LETTERS[1:3]
char <- sample(L3, 10, replace = TRUE)
(d <- data.frame(x = 1, y = 1:10, char = char))
## The "same" with automatic column names:
data.frame(1, 1:10, sample(L3, 10, replace = TRUE))

is.data.frame(d)

## enable automatic conversion of character arguments to factor columns:
(dd <- data.frame(d, fac = letters[1:10], stringsAsFactors = TRUE))
rbind(class = sapply(dd, class), mode = sapply(dd, mode))

stopifnot(1:10 == row.names(d)) # {coercion}

(d0 <- d[, FALSE]) # data frame with 0 columns and 10 rows
(d.0 <- d[FALSE, ]) # <0 rows> data frame (3 named cols)
(d00 <- d0[FALSE, ]) # data frame with 0 columns and 0 rows
Return the matrix obtained by converting all the variables in a data frame to numeric mode and then binding them together as the columns of a matrix. Factors and ordered factors are replaced by their internal codes.
data.matrix(frame, rownames.force = NA)
frame | a data frame whose components are logical vectors, factors or numeric or character vectors. |
rownames.force | logical indicating if the resulting matrix should have character (rather than |
Logical and factor columns are converted to integers. Character columns are first converted to factors and then to integers. Any other column which is not numeric (according to is.numeric) is converted by as.numeric or, for S4 objects, as(, "numeric"). If all columns are integer (after conversion) the result is an integer matrix, otherwise a numeric (double) matrix.
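A short sketch of these conversion rules on a made-up data frame: the character column goes through factor codes (levels sorted, so "a" is 1 and "b" is 2), and a single double column is enough to make the whole result double.

```r
df <- data.frame(n = c(1.5, 2), l = c(TRUE, FALSE), s = c("b", "a"))

m <- data.matrix(df)
m
typeof(m)                       # double, because column 'n' is double

# Only logical and character columns: every column converts to integer
m_int <- data.matrix(df[c("l", "s")])
typeof(m_int)
```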
If frame inherits from class "data.frame", an integer or numeric matrix of the same dimensions as frame, with dimnames taken from the row.names (or NULL, depending on rownames.force) and names.
Otherwise, the result of as.matrix.
The default behaviour for data frames differs from R < 2.5.0 which always gave the result character rownames.
Chambers, J. M. (1992) Data for models. Chapter 3 of Statistical Models in S, eds J. M. Chambers and T. J. Hastie, Wadsworth & Brooks/Cole.
DF <- data.frame(a = 1:3, b = letters[10:12],
                 c = seq(as.Date("2004-01-01"), by = "week", length.out = 3),
                 stringsAsFactors = TRUE)
data.matrix(DF[1:2])
data.matrix(DF)
Returns a character string of the current system date and time.
date()
The string has the form "Fri Aug 20 11:11:00 1999", i.e., length 24, since it relies on POSIX's ctime ensuring the above fixed format. Timezone and Daylight Saving Time are taken account of, but not indicated in the result.
The day and month abbreviations are always in English, irrespective of locale.
Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988) The New S Language. Wadsworth & Brooks/Cole.
Sys.Date and Sys.time; Date and DateTimeClasses for objects representing date and time.
(d <- date())
nchar(d) == 24

## something similar in the current locale
## depending on ctime; e.g. %e could be %d:
format(Sys.time(), "%a %b %e %H:%M:%S %Y")
Description of the class "Date" representing calendar dates.
## S3 method for class 'Date'
summary(object, digits = 12, ...)

## S3 method for class 'Date'
print(x, max = NULL, ...)
object, x | a |
digits | number of significant digits for the computations. |
max | numeric or |
... | further arguments to be passed from or to other methods. |
Dates are represented as the number of days since 1970-01-01, with negative values for earlier dates. They are always printed following the rules of the current Gregorian calendar, even though that calendar was not in use long ago (it was adopted in 1752 in Great Britain and its colonies). When printing there is assumed to be a year zero.
It is intended that the date should be an integer value, but this is not enforced in the internal representation. Fractional days will be ignored when printing. It is possible to produce fractional days via the mean method or by adding or subtracting (see Ops.Date).
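The internal representation can be inspected with unclass(); the dates in this sketch are picked so that the day counts are easy to verify by hand:

```r
d <- as.Date(c("1970-01-01", "1970-01-10", "1969-12-31"))
day_counts <- unclass(d)        # days since 1970-01-01; negative before it
day_counts

# mean() of two consecutive days gives a fractional day count ...
m <- mean(as.Date(c("1970-01-01", "1970-01-02")))
frac <- unclass(m)
frac

# ... but the fraction is ignored when the date is formatted or printed
shown <- format(m)
shown
```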
When a date is converted to a date-time (for example by as.POSIXct or as.POSIXlt) its time is taken as midnight in UTC.
Printing dates involves conversion to class "POSIXlt" which treats dates of more than about 780 million years from present as NA.
For the many methods see methods(class = "Date"). Several are documented separately, see below.
Sys.Date for the current date.
weekdays for convenience extraction functions.
Methods with extra arguments and documentation:
Ops.Date for operators on "Date" objects.
format.Date for conversion to and from character strings.
axis.Date and hist.Date for plotting.
seq.Date, cut.Date, and round.Date for utility operations.
DateTimeClasses for date-time classes.
(today <- Sys.Date())
format(today, "%d %b %Y") # with month as a word
(tenweeks <- seq(today, length.out=10, by="1 week")) # next ten weeks
weekdays(today)
months(tenweeks)

(Dls <- as.Date(.leap.seconds))

## Show use of year zero:
(z <- as.Date("01-01-01")) # how it is printed depends on the OS
z - 365 # so year zero was a leap year.
as.Date("00-02-29")
# if you want a different format, consider something like (if supported)
## Not run:
format(z, "%04Y-%m-%d") # "0001-01-01"
format(z, "%_4Y-%m-%d") # "   1-01-01"
format(z, "%_Y-%m-%d") # "1-01-01"
## End(Not run)

## length(<Date>) <- n now works
ls <- Dls; length(ls) <- 12
l2 <- Dls; length(l2) <- 5 + length(Dls)
stopifnot(exprs = {
  ## length(.) <- * is compatible to subsetting/indexing:
  identical(ls, Dls[seq_along(ls)])
  identical(l2, Dls[seq_along(l2)])
  ## has filled with NA's
  is.na(l2[(length(Dls)+1):length(l2)])
})
Description of the classes "POSIXlt" and "POSIXct" representing calendar dates and times.
## S3 method for class 'POSIXct'
print(x, tz = "", usetz = TRUE, max = NULL, ...)

## S3 method for class 'POSIXct'
summary(object, digits = 15, ...)

time + z
z + time
time - z
time1 lop time2
x, object | an object to be printed or summarized from one of the date-time classes. |
tz, usetz | for timezone formatting, passed to |
max | numeric or |
digits | number of significant digits for the computations: should be high enough to represent the least important time unit exactly. |
... | further arguments to be passed from or to other methods. |
time | date-time objects. |
time1, time2 | date-time objects or character vectors. (Character vectors are converted by |
z | a numeric vector (in seconds). |
lop | one of |
There are two basic classes of date/times. Class "POSIXct" represents the (signed) number of seconds since the beginning of 1970 (in the UTC time zone) as a numeric vector. Class "POSIXlt" is internally a list of vectors with components named sec, min, hour for the time, mday, mon, and year, for the date, wday, yday for the day of the week and day of the year, isdst, a Daylight Saving Time flag, and sometimes (both optional) zone, a string for the time zone, and gmtoff, offset in seconds from GMT, see the section ‘Details on POSIXlt’ below for more details.
The classes correspond to the POSIX/C99 constructs of ‘calendar time’ (the time_t data type, “ct”), and ‘local time’ (or broken-down time, the ‘struct tm’ data type, “lt”), from which they also inherit their names.
"POSIXct" is more convenient for including in data frames, and "POSIXlt" is closer to human-readable forms. A virtual class "POSIXt" exists from which both of the classes inherit: it is used to allow operations such as subtraction to mix the two classes.
Logical comparisons and some arithmetic operations are available for both classes. One can add or subtract a number of seconds from a date-time object, but not add two date-time objects. Subtraction of two date-time objects is equivalent to using difftime. Be aware that "POSIXlt" objects will be interpreted as being in the current time zone for these operations unless a time zone has been specified.
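The arithmetic rules can be sketched with a fixed UTC time (the particular instant is arbitrary):

```r
t0 <- as.POSIXct("2024-01-01 00:00:00", tz = "UTC")

t1 <- t0 + 3600            # adding a number of seconds gives a date-time
later <- t1 > t0           # logical comparison

d <- t1 - t0               # subtracting two date-times gives a "difftime"
secs <- as.numeric(d, units = "secs")

# Adding two date-time objects is not defined; capture the error message
msg <- tryCatch(t0 + t1, error = function(e) conditionMessage(e))

d
secs
msg
```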
Both classes may have an attribute "tzone", specifying the time zone. Note however that their meanings differ, see the section ‘Time Zones’ below for more details.
Unfortunately, the conversion is complicated by the operation of time zones and leap seconds (according to this version of R's data, 27 days have been 86401 seconds long so far, the last being on (actually, immediately before) 2017-01-01: the times of the extra seconds are in the object .leap.seconds). The details of this are entrusted to the OS services where possible. It seems that some rare systems used to use leap seconds, but all known current platforms ignore them (as required by POSIX). This is detected and corrected for at build time, so "POSIXct" times used by R do not include leap seconds on any platform.
Using c on "POSIXlt" objects converts them to the current time zone, and on "POSIXct" objects drops "tzone" attributes if they are not all the same.
A few times have specific issues. First, the leap seconds are ignored, and real times such as "2005-12-31 23:59:60" are (probably) treated as the next second. However, they will never be generated by R, and are unlikely to arise as input. Second, on some OSes there is a problem in the POSIX/C99 standard with "1969-12-31 23:59:59 UTC", which is -1 in calendar time and that value is on those OSes also used as an error code. Thus as.POSIXct("1969-12-31 23:59:59", format = "%Y-%m-%d %H:%M:%S", tz = "UTC") may give NA, and hence as.POSIXct("1969-12-31 23:59:59", tz = "UTC") will give "1969-12-31 23:59:00". Other OSes (including the code used by R on Windows) report errors separately and so are able to handle that time as valid.
The print methods respect options("max.print").
"POSIXlt" objects will often have an attribute "tzone", a character vector of length 3 giving the time zone name (from the TZ environment variable or argument tz of functions creating "POSIXlt" objects; "" marks the current time zone) and the names of the base time zone and the alternate (daylight-saving) time zone. Sometimes this may just be of length one, giving the time zone name.
"POSIXct" objects may also have an attribute "tzone", a character vector of length one. If set to a non-empty value, it will determine how the object is converted to class "POSIXlt" and in particular how it is printed. This is usually desirable, but if you want to specify an object in a particular time zone but to be printed in the current time zone you may want to remove the "tzone" attribute.
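For "POSIXct" the "tzone" attribute only affects how the instant is displayed, not which instant is stored. A minimal sketch, assuming the time-zone database in use knows the name "Asia/Tokyo" (UTC+9, no DST):

```r
x <- as.POSIXct("2024-06-01 12:00:00", tz = "UTC")

y <- x
attr(y, "tzone") <- "Asia/Tokyo"   # same instant, different display zone

shown_x <- format(x, "%H:%M")
shown_y <- format(y, "%H:%M")
same_instant <- x == y

c(shown_x, shown_y)
same_instant
```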
Class "POSIXlt" is internally a named list of vectors representing date-times, with the following list components:
sec | 0–61: seconds, allowing for leap seconds. |
min | 0–59: minutes. |
hour | 0–23: hours. |
mday | 1–31: day of the month. |
mon | 0–11: months after the first of the year. |
year | years since 1900. |
wday | 0–6 day of the week, starting on Sunday. |
yday | 0–365: day of the year (365 only in leap years). |
isdst | Daylight Saving Time flag. Positive if in force, zero if not, negative if unknown. |
zone | (Optional.) The abbreviation for the time zone in force at that time: "" if unknown (but "" might also be used for UTC). |
gmtoff | (Optional.) The offset in seconds from GMT: positive values are East of the meridian. Usually NA if unknown, but 0 could mean unknown. |
The components must be in this order: that was only minimally checked prior to R 4.3.0. All objects created in R 4.3.0 have the optional components. From earlier versions of R, the last two components will not be present for times in UTC and are platform-dependent. Currently gmtoff is set on almost all current platforms: those based on BSD or glibc (including Linux and macOS) and those using the tzcode implementation shipped with R (including Windows and by default macOS).
Note that the internal list structure is somewhat hidden, as many methods (including length(x), print() and str()) apply to the abstract date-time vector, as for "POSIXct". One can extract and replace single components via [ indexing with two indices (see the examples).
The components of "POSIXlt" are integer vectors, except sec (double) and zone (character). However most users will coerce numeric values for the first to real and the rest bar zone to integer.
Components wday and yday are for information, and are not used in the conversion to calendar time nor for printing, format(), or in as.character().
However, component isdst is needed to distinguish times at the end of DST: typically 1am to 2am occurs twice, first in DST and then in standard time. At all other times isdst can be deduced from the first six values, but the behaviour if it is set incorrectly is platform-dependent. For example Linux/glibc when checked fixed up incorrect values in time zones which support DST but gave an error on value 1 in those without DST.
For “ragged” and out-of-range vs “balanced” "POSIXlt" objects, see balancePOSIXlt().
Classes "POSIXct" and "POSIXlt" are able to express fractions of a second where the latter allows for higher accuracy. Consequently, conversion of fractions between the two forms may not be exact, but will have better than microsecond accuracy.
Fractional seconds are printed only if options("digits.secs") is set: see strftime.
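A brief sketch of the effect of digits.secs, using a fraction (0.125) that is exactly representable in binary so that rounding does not blur the picture; the option is restored afterwards:

```r
x <- as.POSIXct("2024-01-01 10:20:30.125", tz = "UTC")

plain <- format(x)                 # fractional part not shown by default

op <- options(digits.secs = 3)     # ask for up to 3 decimal places
with_frac <- format(x)
options(op)                        # restore the previous setting

plain
with_frac
```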
The "POSIXlt" class can represent a very wide range of times (up to billions of years), but such times can only be interpreted with reference to a time zone.
The concept of time zones was first adopted in the nineteenth century, and the Gregorian calendar was introduced in 1582 but not universally adopted until 1927. OS services almost invariably assume the Gregorian calendar and may assume that the time zone that was first enacted for the location was in force before that date. (The earliest legislated time zone seems to have been London on 1847-12-01.) Some OSes assume the previous use of ‘local time’ based on the longitude of a location within the time zone.
Most operating systems represent POSIXct times as C type long. This means that on 32-bit OSes this covers the period 1902 to 2037. On all known 64-bit platforms and for the code we use on 32-bit Windows, the range of representable times is billions of years: however, not all can convert correctly times before 1902 or after 2037. A few benighted OSes used an unsigned type and so cannot represent times before 1970.
Where possible the platform limits are detected, and outside the limits we use our own C code. This uses the offset from GMT in use either for 1902 (when there was no DST) or that predicted for one of 2030 to 2037 (chosen so that the likely DST transition days are Sundays), and uses the alternate (daylight-saving) time zone only if isdst is positive or (if -1) if DST was predicted to be in operation in the 2030s on that day.
Note that there are places (e.g., Rome) whose offset from UTC varied in the years prior to 1902, and these will be handled correctly only where there is OS support.
There is no reason to assume that the DST rules will remain the same in the future: the US legislated in 2005 to change its rules as from 2007, with a possible future reversion. So conversions for times more than a year or two ahead are speculative. Other countries have changed their rules (and indeed, if DST is used at all) at a few days' notice. So representations and conversion of future dates are tentative. This also applies to dates after the in-use version of the time-zone database – not all platforms keep it up to date, which includes that shipped with older versions of R where used (which it is by default on Windows and macOS).
Some Unix-like systems (especially Linux ones) do not have environment variable TZ set, yet have internal code that expects it (as does POSIX). We have tried to work around this, but if you get unexpected results try setting TZ. See Sys.timezone for valid settings.
Great care is needed when comparing objects of class "POSIXlt". Not only are components and attributes optional; several components may have values meaning ‘not yet determined’ and the same time represented in different time zones will look quite different.
The order of the list components of "POSIXlt" objects must not be changed, as several C-based conversion methods rely on the order for efficiency.
Ripley, B. D. and Hornik, K. (2001). “Date-time classes.” R News, 1(2), 8–11. https://www.r-project.org/doc/Rnews/Rnews_2001-2.pdf.
Dates for dates without times.
as.POSIXct and as.POSIXlt for conversion between the classes.
strptime for conversion to and from character representations.
Sys.time for clock time as a "POSIXct" object.
difftime for time intervals.
balancePOSIXlt() for balancing or filling “ragged” POSIXlt objects.
cut.POSIXt, seq.POSIXt, round.POSIXt and trunc.POSIXt for methods for these classes.
weekdays for convenience extraction functions.
(z <- Sys.time()) # the current date, as class "POSIXct"

Sys.time() - 3600 # an hour ago

as.POSIXlt(Sys.time(), "GMT") # the current time in GMT
format(.leap.seconds) # the leap seconds in your time zone
print(.leap.seconds, tz = "America/Los_Angeles") # and in Seattle's

## look at *internal* representation of "POSIXlt" :
leapS <- as.POSIXlt(.leap.seconds)
names(unclass(leapS)) ; is.list(leapS)
## str() on inner structure needs unclass(.):
utils::str(unclass(leapS), vec.len = 7)
## show all (apart from "tzone" attr):
data.frame(unclass(leapS))

## Extracting *single* components of POSIXlt objects:
leapS[1 : 5, "year"]
leapS[17:22, "mon" ]

## length(.) <- n now works for "POSIXct" and "POSIXlt" :
for(lpS in list(.leap.seconds, leapS)) {
    ls <- lpS; length(ls) <- 12
    l2 <- lpS; length(l2) <- 5 + length(lpS)
    stopifnot(exprs = {
      ## length(.) <- * is compatible to subsetting/indexing:
      identical(ls, lpS[seq_along(ls)])
      identical(l2, lpS[seq_along(l2)])
      ## has filled with NA's
      is.na(l2[(length(lpS)+1):length(l2)])
    })
}
Reads or writes an R object from/to a file in Debian Control File format.
read.dcf(file, fields = NULL, all = FALSE, keep.white = NULL)

write.dcf(x, file = "", append = FALSE, useBytes = FALSE,
          indent = 0.1 * getOption("width"),
          width = 0.9 * getOption("width"),
          keep.white = NULL)
file | either a character string naming a file or a connection. |
fields | a character vector with the names of the fields to read from the DCF file. Default is to read all fields. |
all | a logical indicating whether in case of multiple occurrences of a field in a record, all these should be gathered. If |
keep.white | a character vector with the names of the fields for which whitespace should be kept as is, or |
x | the object to be written, typically a data frame. If not, it is attempted to coerce |
append | logical. If |
useBytes | logical to be passed to |
indent | a positive integer specifying the indentation for continuation lines in output entries. |
width | a positive integer giving the target column for wrapping lines in the output. |
DCF is a simple format for storing databases in plain text files that can easily be directly read and written by humans. DCF is used in various places to store R system information, like descriptions and contents of packages.
The DCF rules as implemented in R are:
A database consists of one or more records, each with one or more named fields. Not every record must contain each field. Fields may appear more than once in a record.
Regular lines start with a non-whitespace character.
Regular lines are of form tag:value, i.e., have a name tag and a value for the field, separated by : (only the first : counts). The value can be empty (i.e., whitespace only).
Lines starting with whitespace are continuation lines (to the preceding field) if at least one character in the line is non-whitespace. Continuation lines where the only non-whitespace character is a ‘.’ are taken as blank lines (allowing for multi-paragraph field values).
Records are separated by one or more empty (i.e., whitespace only) lines.
Individual lines may not be arbitrarily long; prior to R 3.0.2 the length limit was approximately 8191 bytes per line.
Note that read.dcf(all = FALSE) reads the file byte-by-byte. This allows a ‘DESCRIPTION’ file to be read and only its ASCII fields used, or its ‘Encoding’ field used to re-encode the remaining fields.
write.dcf does not write NA fields.
The default read.dcf(all = FALSE) returns a character matrix with one row per record and one column per field. Leading and trailing whitespace of field values is ignored unless a field is listed in keep.white. If a tag name is specified in the file, but the corresponding value is empty, then an empty string is returned. If the tag name of a field is specified in fields but never used in a record, then the corresponding value is NA. If fields are repeated within a record, the last one encountered is returned. Malformed lines lead to an error.
For read.dcf(all = TRUE) a data frame is returned, again with one row per record and one column per field. The columns are lists of character vectors for fields with multiple occurrences, and character vectors otherwise.
Note that an empty file is a valid DCF file, and read.dcf will return a zero-row matrix or data frame.
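The rules and return values above can be tried on a small hand-written file. The field names and values in this sketch are invented; the file is written to a temporary location and removed afterwards:

```r
tf <- tempfile(fileext = ".dcf")
writeLines(c("Package: demo",
             "Version: 0.1",
             "Description: first line",
             "  continued on a second line",   # continuation line
             "",                               # blank line ends the record
             "Package: other",
             "Depends: R",
             "Depends: stats"),                # repeated field
           tf)

# Default: character matrix; missing fields are NA, the last repeat wins
m <- read.dcf(tf)
m

# all = TRUE: data frame; repeated fields are gathered in a list column
df <- read.dcf(tf, all = TRUE)
df$Depends

unlink(tf)
```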
For write.dcf, invisible NULL.
As from R 3.4.0, ‘whitespace’ in all cases includes newlines.
https://www.debian.org/doc/debian-policy/ch-controlfields.html.
Note that R does not require encoding in UTF-8, which is a recent Debian requirement. Nor does it use the Debian-specific sub-format which allows comment lines starting with ‘#’.
available.packages, which uses read.dcf to read the indices of package repositories.
## Create a reduced version of the DESCRIPTION file in package 'splines'
x <- read.dcf(file = system.file("DESCRIPTION", package = "splines"),
              fields = c("Package", "Version", "Title"))
write.dcf(x)

## An online DCF file with multiple records
con <- url("https://cran.r-project.org/src/contrib/PACKAGES")
y <- read.dcf(con, all = TRUE)
close(con)
utils::str(y)
Set, unset or query the debugging flag on a function. The text and condition arguments are the same as those that can be supplied via a call to browser. They can be retrieved by the user once the browser has been entered, and provide a mechanism to allow users to identify which breakpoint has been activated.
debug(fun, text = "", condition = NULL, signature = NULL)
debugonce(fun, text = "", condition = NULL, signature = NULL)
undebug(fun, signature = NULL)
isdebugged(fun, signature = NULL)
debuggingState(on = NULL)
fun | any interpreted R function. |
text | a text string that can be retrieved when the browser is entered. |
condition | a condition that can be retrieved when the browser is entered. |
signature | an optional method signature. If specified, the method is debugged, rather than its generic. |
on | logical; a call to the support function |
When a function flagged for debugging is entered, normal execution is suspended and the body of the function is executed one statement at a time. A new browser context is initiated for each step (and the previous one destroyed).
At the debug prompt the user can enter commands or R expressions, followed by a newline. The commands are described in the browser help topic.
To debug a function which is defined inside another function, single-step through to the end of its definition, and then call debug on its name.
If you want to debug a function not starting at the very beginning, use trace(..., at = *) or setBreakpoint.
Using debug is persistent, and unless debugging is turned off the debugger will be entered on every invocation (note that if the function is removed and replaced the debug state is not preserved). Use debugonce() to enter the debugger only the next time the function is invoked.
To debug an S4 method by explicit signature, use signature. When specified, signature indicates the method of fun to be debugged. Note that debugging is implemented slightly differently for this case, as it uses the trace machinery, rather than the debugging bit. As such, text and condition cannot be specified in combination with a non-null signature. For methods which implement the .local rematching mechanism, the .local closure itself is the one that will be ultimately debugged (see isRematched).
isdebugged returns TRUE if a) signature is NULL and the closure fun has been debugged, or b) signature is not NULL, fun is an S4 generic, and the method of fun for that signature has been debugged. In all other cases, it returns FALSE.
The number of lines printed for the deparsed call when a function is entered for debugging can be limited by setting options(deparse.max.lines).
When debugging is enabled on a byte compiled function then the interpreted version of the function will be used until debugging is disabled.
debug and undebug invisibly return NULL.
isdebugged returns TRUE if the function or method is marked for debugging, and FALSE otherwise.
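The flag can be set, queried and cleared without ever entering the browser, which is handy for checking state in a script. A minimal sketch, using a throwaway function f that is defined only for this illustration:

```r
## A throwaway closure used only for this illustration.
f <- function(x) x + 1

debug(f)                      # set the debugging flag
flag_after_debug <- isdebugged(f)

undebug(f)                    # clear it again
flag_after_undebug <- isdebugged(f)

c(flag_after_debug, flag_after_undebug)
```

Since f is never called while the flag is set, nothing is stepped through here; calling f(1) between debug(f) and undebug(f) would start the browser.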
debugcall for conveniently debugging methods, browser notably for its ‘commands’, trace; traceback to see the stack after an Error: ... message; recover for another debugging approach.
## Not run: 
debug(library)
library(methods)
## End(Not run)
## Not run: 
debugonce(sample)
## only the first call will be debugged
sample(10, 1)
sample(10, 1)
## End(Not run)
A framework for specifying information about R code for use by the interpreter, compiler, and code analysis tools.
declare(...)
... | declaration expressions. |
A syntax for declaration expressions is still being developed.
Evaluating a declare() call ignores the arguments and returns NULL invisibly.
When a function is removed from R it should be replaced by a function which calls .Defunct.
.Defunct(new, package = NULL, msg)
new | character string: A suggestion for a replacement function. |
package | character string: The package to be used when suggesting where the defunct function might be listed. |
msg | character string: A message to be printed, if missing a default message is used. |
.Defunct is called from defunct functions. Functions should be listed in help("pkg-defunct") for an appropriate pkg, including base (with the alias added to the respective Rd file).
.Defunct signals an error of class defunctError with fields old, new, and package.
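Because the error carries its own class, callers can handle it selectively. The following is a sketch only: oldfun, "newfun" and "mypkg" are hypothetical names made up for the illustration.

```r
## Hypothetical defunct function: it does nothing except signal the error.
oldfun <- function() .Defunct("newfun", package = "mypkg")

## Catch the condition by class and inspect its fields.
res <- tryCatch(oldfun(), defunctError = function(e) e)

class(res)
res$new                 # the suggested replacement
conditionMessage(res)   # the default message, since 'msg' was not given
```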
base-defunct and so on which list the defunct functions in the packages.
delayedAssign creates a promise to evaluate the given expression if its value is requested. This provides direct access to the lazy evaluation mechanism used by R for the evaluation of (interpreted) functions.
delayedAssign(x, value, eval.env = parent.frame(1), assign.env = parent.frame(1))
x | a variable name (given as a quoted string in the function call) |
value | an expression to be assigned to |
eval.env | an environment in which to evaluate |
assign.env | an environment in which to assign |
Both eval.env and assign.env default to the currently active environment.
The expression assigned to a promise by delayedAssign will not be evaluated until it is eventually ‘forced’. This happens when the variable is first accessed.
When the promise is eventually forced, it is evaluated within the environment specified by eval.env (whose contents may have changed in the meantime). After that, the value is fixed and the expression will not be evaluated again, although the promise still keeps its expression.
This function is invoked for its side effect, which is assigning a promise to evaluate value to the variable x.
substitute, to see the expression associated with a promise, if assign.env is not the .GlobalEnv.
msg <- "old"
delayedAssign("x", msg)
substitute(x) # shows only 'x', as it is in the global env.
msg <- "new!"
x # new!

delayedAssign("x", {
    for(i in 1:3)
        cat("yippee!\n")
    10
})

x^2 #- yippee
x^2 #- simple number

ne <- new.env()
delayedAssign("x", pi + 2, assign.env = ne)
## See the promise {without "forcing" (i.e. evaluating) it}:
substitute(x, ne) # 'pi + 2'

### Promises in an environment [for advanced users]: ---------------------

e <- (function(x, y = 1, z) environment())(cos, "y", {cat(" HO!\n"); pi+2})
## How can we look at all promises in an env (w/o forcing them)?
gete <- function(e_) {
    ne <- names(e_)
    names(ne) <- ne
    lapply(lapply(ne, as.name),
           function(n) eval(substitute(substitute(X, e_), list(X=n))))
}
(exps <- gete(e))
sapply(exps, typeof)

(le <- as.list(e)) # evaluates ("force"s) the promises
stopifnot(identical(le, lapply(exps, eval))) # and another "Ho!"
Turn unevaluated expressions into character strings.
deparse(expr, width.cutoff = 60L,
        backtick = mode(expr) %in%
            c("call", "expression", "(", "function"),
        control = c("keepNA", "keepInteger", "niceNames", "showAttributes"),
        nlines = -1L)

deparse1(expr, collapse = " ", width.cutoff = 500L, ...)
expr | any R expression. |
width.cutoff | integer in |
backtick | logical indicating whether symbolic names should be enclosed in backticks if they do not follow the standard syntax. |
control | character vector (or |
nlines | integer: the maximum number of lines to produce. Negative values indicate no limit. |
collapse | a string, passed to |
... | further arguments passed to |
These functions turn unevaluated expressions (where ‘expression’ is taken in a wider sense than the strict concept of a vector of mode and type (typeof) "expression" used in expression) into character strings (a kind of inverse to parse).
A typical use of this is to create informative labels for data sets and plots. The example shows a simple use of this facility. It uses the functions deparse and substitute to create labels for a plot which are character string versions of the actual arguments to the function myplot.
The default for the backtick option is not to quote single symbols but only composite expressions. This is a compromise to avoid breaking existing code.
width.cutoff is a lower bound for the line lengths: deparsing a line proceeds until at least width.cutoff bytes have been output and e.g. arg = value expressions will not be split across lines.
deparse1() is a simple utility added in R 4.0.0 to ensure a string result (character vector of length one), typically used in name construction, as deparse1(substitute(.)).
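The difference between the two functions only shows up for long expressions. The sketch below uses an invented call (some_function and its arguments are placeholders, never evaluated) and a hypothetical helper label_of.

```r
## A call of roughly 90 characters: longer than the default width.cutoff of 60.
long_call <- quote(some_function(argument_one = 1, argument_two = 2,
                                 argument_three = 3, argument_four = 4))

pieces <- deparse(long_call)    # may be split over several strings
single <- deparse1(long_call)   # always a single string

length(pieces)
length(single)

## Typical use in name construction (label_of is a hypothetical helper):
label_of <- function(x) deparse1(substitute(x))
label_of(log(speed))
```

Code that needs exactly one label, for example for an axis title, is usually better served by deparse1() than by taking the first element of deparse().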
To avoid the risk of a source attribute out of sync with the actual function definition, the source attribute of a function will never be deparsed as an attribute.
Deparsing internal structures may not be accurate: for example the graphics display list recorded by recordPlot is not intended to be deparsed and .Internal calls will be shown as primitive calls.
Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988) The New S Language. Wadsworth & Brooks/Cole.
.deparseOpts for available control settings; dput() and dump() for related functions using identical internal deparsing functionality.
Quotes for quoting conventions, including backticks.
require(stats); require(graphics)

deparse(args(lm))
deparse(args(lm), width.cutoff = 500)

myplot <- function(x, y) {
    plot(x, y, xlab = deparse1(substitute(x)),
               ylab = deparse1(substitute(y)))
}

e <- quote(`foo bar`)
deparse(e)
deparse(e, backtick = TRUE)
e <- quote(`foo bar`+1)
deparse(e)
deparse(e, control = "all") # wraps it w/ quote( . )
Process the deparsing options for deparse, dput and dump.
.deparseOpts(control)

..deparseOpts
control | character vector of deparsing options. |
..deparseOpts is the character vector of possible deparsing options used by .deparseOpts().
.deparseOpts() is called by deparse, dput and dump to process their control argument.
The control argument is a vector containing zero or more of the following strings (exactly those in ..deparseOpts). Partial string matching is used.
"keepInteger":
Either surround integer vectors by as.integer() or use suffix L, so they are not converted to type double when parsed. This includes making sure that integer NAs are preserved (via NA_integer_ if there are no non-NA values in the vector, unless "S_compatible" is set).
"quoteExpressions":
Surround unevaluated expressions, but not formulas, with quote(), so they are not evaluated when re-parsed.
"showAttributes":
If the object has attributes (other than a source attribute, see srcref), use structure() to display them as well as the object value unless the only such attribute is names and the "niceNames" option is set. This ("showAttributes") is the default for deparse and dput.
"useSource":
If the object has a source attribute (srcref), display that instead of deparsing the object. Currently only applies to function definitions.
"warnIncomplete":
Some exotic objects such as environments, external pointers, etc. can not be deparsed properly. This option causes a warning to be issued if the deparser recognizes one of these situations.
Also, the parser in R < 2.7.0 would only accept strings of up to 8192 bytes, and this option gives a warning for longer strings.
"keepNA":
Integer, real and character NAs are surrounded by coercion functions where necessary to ensure that they are parsed to the same type. Since e.g. NA_real_ can be output in R, this is mainly used in connection with S_compatible.
"niceNames":
If true, lists and atomic vectors with non-NA names (see names) are deparsed as e.g., c(A = 1) instead of structure(1, names = "A"), independently of the "showAttributes" setting.
"all":
An abbreviated way to specify all of the options listed above plus "digits17". This is the default for dump, and, without "digits17", the options used by edit (which are fixed).
"delayPromises":
Deparse promises in the form <promise: expression> rather than evaluating them. The value and the environment of the promise will not be shown and the deparsed code cannot be sourced.
"S_compatible":
Make deparsing as far as possible compatible with S and R < 2.5.0. For compatibility with S, integer values of double vectors are deparsed with a trailing decimal point. Backticks are not used.
"hexNumeric":
Real and finite complex numbers are output in ‘"%a"’ format as binary fractions (coded as hexadecimal: see sprintf) with maximal opportunity to be recorded exactly to full precision. Complex numbers with one or both non-finite components are output as if this option were not set.
(This relies on that format being correctly supported: known problems on Windows are worked around as from R 3.1.2.)
"digits17":
Real and finite complex numbers are output using format ‘"%.17g"’ which may give more precision than the default (but the output will depend on the platform and there may be loss of precision when read back). Complex numbers with one or both non-finite components are output as if this option were not set.
"exact":
An abbreviated way to specify control = c("all", "hexNumeric") which is guaranteed to be exact for numbers, see also below.
For the most readable (but perhaps incomplete) display, use control = NULL. This displays the object's value, but not its attributes. The default in deparse is to display the attributes as well, but not to use any of the other options to make the result parseable. (dump uses more default options via control = "all", and printing of functions without sources uses c("keepInteger", "keepNA") to which one may add "warnIncomplete".)
Using control = "exact" (short for control = c("all", "hexNumeric")) comes closest to making deparse() an inverse of parse() (but we have not yet seen an example where "all", now including "digits17", would not have been as good). However, not all objects are deparse-able even with these options, and a warning will be issued if the function recognizes that it is being asked to do the impossible.
Only one of "hexNumeric" and "digits17" can be specified.
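The effect of the numeric options can be seen by deparsing the same values under different control settings and parsing the result back. A short sketch; the helper name roundtrip and the test vector are chosen only for the illustration.

```r
x <- c(0.1, 1/3, pi)

deparse(x)                          # default: readable, about 15 significant digits
deparse(x, control = "digits17")    # "%.17g" format
deparse(x, control = "hexNumeric")  # "%a" format (hexadecimal binary fractions)

## Parse the deparsed text back and compare with the original
## (roundtrip is a hypothetical helper for this illustration).
roundtrip <- function(obj, control)
    eval(parse(text = deparse(obj, control = control)))

exact_ok <- identical(roundtrip(x, "exact"), x)
exact_ok

## Integer type and names survive the default options as well:
y <- c(a = 1L, b = NA)
default_ok <- identical(roundtrip(y, c("keepNA", "keepInteger",
                                       "niceNames", "showAttributes")), y)
default_ok
```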
An integer value corresponding to the control options selected.
stopifnot(.deparseOpts("exact") == .deparseOpts(c("all", "hexNumeric")))
(iOpt.all <- .deparseOpts("all")) # a four digit integer

## one integer --> vector binary bits
int2bits <- function(x, base = 2L,
                     ndigits = 1 + floor(1e-9 + log(max(x,1), base))) {
    r <- numeric(ndigits)
    for (i in ndigits:1) {
        r[i] <- x%%base
        if (i > 1L)
            x <- x%/%base
    }
    rev(r) # smallest bit at left
}
int2bits(iOpt.all)
## What options does "all" contain ? =========
(depO.indiv <- setdiff(..deparseOpts, c("all", "exact")))
(oa <- depO.indiv[int2bits(iOpt.all) == 1])# 8 strings
stopifnot(identical(iOpt.all, .deparseOpts(oa)))

## ditto for "exact" instead of "all":
(iOpt.X <- .deparseOpts("exact"))
data.frame(opts = depO.indiv,
           all  = int2bits(iOpt.all),
           exact= int2bits(iOpt.X))
(oX <- depO.indiv[int2bits(iOpt.X) == 1]) # 8 strings, too
diffXall <- oa != oX
stopifnot(identical(iOpt.X, .deparseOpts(oX)),
          identical(oX[diffXall], "hexNumeric"),
          identical(oa[diffXall], "digits17"))
When an object is about to be removed from R it is first deprecated and should include a call to .Deprecated.
.Deprecated(new, package = NULL, msg, old = as.character(sys.call(sys.parent()))[1L])
new | character string: A suggestion for a replacement function. |
package | character string: The package to be used when suggesting where the deprecated function might be listed. |
msg | character string: A message to be printed, if missing a default message is used. |
old | character string specifying the function (default) or usage which is being deprecated. |
.Deprecated("new name") is called from deprecated functions. The original help page for these functions is often available at help("old-deprecated") (note the quotes). Deprecated functions should be listed in help("pkg-deprecated") for an appropriate pkg, including base.
.Deprecated signals a warning of class "deprecatedWarning" with fields old, new, and package.
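Since the warning has its own class, it can be intercepted or silenced without touching other warnings. A sketch under the assumption of a made-up wrapper old_mean that is being retired in favour of mean:

```r
## Hypothetical deprecated wrapper: still works, but warns on every call.
old_mean <- function(x) {
    .Deprecated("mean")
    mean(x)
}

## Intercept the warning by class and look at its fields.
w <- tryCatch(old_mean(1:3), deprecatedWarning = function(w) w)
class(w)
w$new

## The function still returns its value when the warning is muffled.
val <- suppressWarnings(old_mean(1:3))
val
```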
help("base-deprecated") and so on which list the deprecated functions in the packages.
det calculates the determinant of a matrix. determinant is a generic function that returns separately the modulus of the determinant, optionally on the logarithm scale, and the sign of the determinant.
det(x, ...)
determinant(x, logarithm = TRUE, ...)
x | numeric matrix: logical matrices are coerced to numeric. |
logarithm | logical; if |
... | optional arguments, currently unused. |
The determinant function uses an LU decomposition and the det function is simply a wrapper around a call to determinant.
Often, computing the determinant is not what you should be doing to solve a given problem.
For det, the determinant of x. For determinant, a list with components
modulus | a numeric value. The modulus (absolute value) of the determinant if |
sign | integer; either |
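The two components can be recombined into the plain determinant. A small sketch with a 2 x 2 matrix chosen so that the answer (2*3 - 1*1 = 5) is easy to verify by hand:

```r
x <- matrix(c(2, 1, 1, 3), nrow = 2)

## Modulus on the log scale plus a separate sign.
d <- determinant(x, logarithm = TRUE)
from_parts <- as.numeric(d$sign * exp(d$modulus))

## Modulus on the original scale.
d0 <- determinant(x, logarithm = FALSE)
from_parts0 <- as.numeric(d0$sign * d0$modulus)

c(det(x), from_parts, from_parts0)
```

Working with the log modulus is mainly useful when the determinant itself would overflow or underflow.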
(x <- matrix(1:4, ncol = 2))
unlist(determinant(x))
det(x)

det(print(cbind(1, 1:3, c(2,0,1))))
Detach a database, i.e., remove it from the search() path of available R objects. Usually this is either a data.frame which has been attached or a package which was attached by library.
detach(name, pos = 2L, unload = FALSE, character.only = FALSE, force = FALSE)
name | the object to detach. Defaults to |
pos | index position in |
unload | a logical value indicating whether or not to attempt to unload the namespace when a package is being detached. If the package has a namespace and |
character.only | a logical indicating whether |
force | logical: should a package be detached even though other attached packages depend on it? |
This is most commonly used with a single number argument referring to a position on the search list, and can also be used with an unquoted or quoted name of an item on the search list such as package:tools.
If a package has a namespace, detaching it does not by default unload the namespace (and may not even with unload = TRUE), and detaching will not in general unload any dynamically loaded compiled code (DLLs); see getLoadedDLLs and library.dynam.unload. Further, registered S3 methods from the namespace will not be removed, and because S3 methods are not tagged to their source on registration, it is in general not possible to safely un-register the methods associated with a given package. If you use library on a package whose namespace is loaded, it attaches the exports of the already loaded namespace. So detaching and re-attaching a package may not refresh some or all components of the package, and is inadvisable. The most reliable way to completely detach a package is to restart R.
The return value is invisible. It is NULL when a package is detached, otherwise the environment which was returned by attach when the object was attached (incorporating any changes since it was attached).
detach() without an argument removes the first item on the search path after the workspace. It is all too easy to call it too many or too few times, or to not notice that the search path has changed since an attach call.
Use of attach/detach is best avoided in functions (see the help for attach) and in interactive use and scripts it is prudent to detach by name.
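Detaching by name avoids depending on the current position in the search path. A minimal sketch, in which the data frame and the search-path name "demo_db" are invented for the illustration:

```r
demo_data <- data.frame(a = 1:3)

## Attach under an explicit name so it can be found again later.
attach(demo_data, name = "demo_db")
attached_before <- "demo_db" %in% search()

## Detach by that name rather than by position.
detach("demo_db", character.only = TRUE)
attached_after <- "demo_db" %in% search()

c(attached_before, attached_after)
```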
You cannot detach either the workspace (position 1) or the base package (the last item in the search list), and attempting to do so will throw an error.
Unloading some namespaces has undesirable side effects: e.g. unloading grid closes all graphics devices, and on some systems tcltk cannot be reloaded once it has been unloaded and may crash R if this is attempted.
Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988) The New S Language. Wadsworth & Brooks/Cole.
attach, library, search, objects, unloadNamespace, library.dynam.unload.
require(splines) # package
detach(package:splines)
## or also
library(splines)
pkg <- "package:splines"
detach(pkg, character.only = TRUE)

## careful: do not do this unless 'splines' is not already attached.
library(splines)
detach(2) # 'pos' used for 'name'

## an example of the name argument to attach
## and of detaching a database named by a character vector
attach_and_detach <- function(db, pos = 2)
{
    name <- deparse1(substitute(db))
    attach(db, pos = pos, name = name)
    print(search()[pos])
    detach(name, character.only = TRUE)
}
attach_and_detach(women, pos = 3)
Extract or replace the diagonal of a matrix, or construct a diagonal matrix.
diag(x = 1, nrow, ncol, names = TRUE)
diag(x) <- value
x | a matrix, vector or 1D |
nrow ,ncol | optional dimensions for the result when |
names | (when |
value | either a single value or a vector of length equal to that of the current diagonal. Should be of a mode which can be coerced to that of |
diag has four distinct usages:
1. x is a matrix, when it extracts the diagonal.
2. x is missing and nrow is specified, it returns an identity matrix.
3. x is a scalar (length-one vector) and the only argument, it returns a square identity matrix of size given by the scalar.
4. x is a ‘numeric’ (complex, numeric, integer, logical, or raw) vector, either of length at least 2 or there were further arguments. This returns a matrix with the given diagonal and zero off-diagonal entries.
It is an error to specify nrow or ncol in the first case.
If x is a matrix then diag(x) returns the diagonal of x. The resulting vector will have names if names is true and if the matrix x has matching column and rownames.
The replacement form sets the diagonal of the matrix x to the given value(s).
In all other cases the value is a diagonal matrix with nrow rows and ncol columns (if ncol is not given the matrix is square). Here nrow is taken from the argument if specified, otherwise inferred from x: if that is a vector (or 1D array) of length two or more, then its length is the number of rows, but if it is of length one and neither nrow nor ncol is specified, nrow = as.integer(x).
When a diagonal matrix is returned, the diagonal elements are one except in the fourth case, when x gives the diagonal elements: it will be recycled or truncated as needed, but fractional recycling and truncation will give a warning.
Using diag(x) can have unexpected effects if x is a vector that could be of length one. Use diag(x, nrow = length(x)) for consistent behaviour.
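The length-one pitfall is easy to reproduce. In this sketch v stands for a vector whose length happens to be one, as might arise from subsetting:

```r
v <- 5   # a "diagonal" that happens to have length one

## Treated as a size: a 5 x 5 identity matrix.
surprising <- diag(v)
dim(surprising)

## Treated as the diagonal itself: a 1 x 1 matrix containing 5.
intended <- diag(v, nrow = length(v))
dim(intended)
intended
```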
Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988) The New S Language. Wadsworth & Brooks/Cole.
dim(diag(3))
diag(10, 3, 4) # guess what?
all(diag(1:3) == {m <- matrix(0,3,3); diag(m) <- 1:3; m})

## other "numeric"-like diagonal matrices :
diag(c(1i,2i))    # complex
diag(TRUE, 3)     # logical
diag(as.raw(1:3)) # raw
(D2 <- diag(2:1, 4)); typeof(D2) # "integer"

require(stats)
## diag(<var-cov-matrix>) = variances
diag(var(M <- cbind(X = 1:5, Y = rnorm(5))))
#-> vector with names "X" and "Y"

rownames(M) <- c(colnames(M), rep("", 3))
M; diag(M) #  named as well
diag(M, names = FALSE) # w/o names
Returns suitably lagged and iterated differences.
diff(x, ...)

## Default S3 method:
diff(x, lag = 1, differences = 1, ...)

## S3 method for class 'POSIXt'
diff(x, lag = 1, differences = 1, ...)

## S3 method for class 'Date'
diff(x, lag = 1, differences = 1, ...)
x | a numeric vector or matrix containing the values to be differenced. |
lag | an integer indicating which lag to use. |
differences | an integer indicating the order of the difference. |
... | further arguments to be passed to or from methods. |
diff is a generic function with a default method and ones for classes "ts", "POSIXt" and "Date".
NA's propagate.
If x is a vector of length n and differences = 1, then the computed result is equal to the successive differences x[(1+lag):n] - x[1:(n-lag)].
If differences is larger than one this algorithm is applied recursively to x. Note that the returned value is a vector which is shorter than x.
If x is a matrix then the difference operations are carried out on each column separately.
Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988) The New S Language. Wadsworth & Brooks/Cole.
diff(1:10, 2)
diff(1:10, 2, 2)
x <- cumsum(cumsum(1:10))
diff(x, lag = 2)
diff(x, differences = 2)

diff(.leap.seconds)

## allows to pass units via ... to difftime()
diff(.leap.seconds, units = "weeks")
diff(as.Date(.leap.seconds), units = "weeks")
Time intervals creation, printing, and some arithmetic. The print() method calls these “time differences”.
time1 - time2

difftime(time1, time2, tz,
         units = c("auto", "secs", "mins", "hours",
                   "days", "weeks"))

as.difftime(tim, format = "%X", units = "auto", tz = "UTC")

## S3 method for class 'difftime'
format(x, ...)
## S3 method for class 'difftime'
units(x)
## S3 replacement method for class 'difftime'
units(x) <- value
## S3 method for class 'difftime'
as.double(x, units = "auto", ...)

## Group methods, notably for round(), signif(), floor(),
## ceiling(), trunc(), abs(); called directly, *not* as Math():
## S3 method for class 'difftime'
Math(x, ...)
time1 ,time2 | |
tz | an optional time zone specification to be used for the conversion, mainly for |
units | character string. Units in which the results are desired. Can be abbreviated. |
value | character string. Like |
tim | character string or numeric value specifying a time interval. |
format | character specifying the format of |
x | an object inheriting from class |
... | arguments to be passed to or from other methods. |
Function difftime calculates a difference of two date/time objects and returns an object of class "difftime" with an attribute indicating the units. The Math group method provides round, signif, floor, ceiling, trunc, abs, and sign methods for objects of this class, and there are methods for the group-generic (see Ops) logical and arithmetic operations.
If units = "auto", a suitable set of units is chosen, the largest possible (excluding "weeks") in which all the absolute differences are greater than one.
Subtraction of date-time objects gives an object of this class, by calling difftime with units = "auto". Alternatively, as.difftime() works on character-coded or numeric time intervals; in the latter case, units must be specified, and format has no effect.
Limited arithmetic is available on "difftime" objects: they can be added or subtracted, and multiplied or divided by a numeric vector. In addition, adding a numeric vector to a "difftime" object, or subtracting one from the other, implicitly converts the numeric vector to a "difftime" object with the same units as the "difftime" object. There are methods for mean and sum (via the Summary group generic), and diff via diff.default building on the "difftime" method for arithmetic, notably -.
The units of a "difftime" object can be extracted by the units function, which also has a replacement form. If the units are changed, the numerical value is scaled accordingly. The replacement version keeps attributes such as names and dimensions.
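Changing units and mixing in plain numbers can be illustrated on a small object. A sketch with made-up durations of 30 and 90 minutes:

```r
z <- as.difftime(c(30, 90), units = "mins")

## Changing the units rescales the numeric value.
units(z) <- "hours"
in_hours <- as.numeric(z)

## A plain number is taken to be in the object's own units (hours here).
shifted <- z + 1

## The value can also be read off in other units without modifying z.
in_secs <- as.numeric(z, units = "secs")

in_hours; shifted; in_secs
```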
Note that units = "days" means a period of 24 hours, hence takes no account of Daylight Savings Time. Differences in objects of class "Date" are computed as if in the UTC time zone.
The as.double method returns the numeric value expressed in the specified units. Using units = "auto" means the units of the object.
The format method simply formats the numeric value and appends the units as a text string.
Because R follows POSIX (and almost all computer clocks) in ignoring leap seconds, so do time differences. So in a UTC time zone
z <- as.POSIXct(c("2016-12-31 23:59:59", "2017-01-01 00:00:01"))
z[2] - z[1]
reports ‘Time difference of 2 secs’ but 3 seconds elapsed while the computer clock advanced by 2 seconds.
If you want the elapsed time interval, you need to add in any leap seconds for yourself.
Units such as "months" are not possible as they are not of constant length. To create intervals of months, quarters or years use seq.Date or seq.POSIXt.
(z <- Sys.time() - 3600)
Sys.time() - z                # just over 3600 seconds.

## time interval between release days of R 1.2.2 and 1.2.3.
ISOdate(2001, 4, 26) - ISOdate(2001, 2, 26)

as.difftime(c("0:3:20", "11:23:15"))
as.difftime(c("3:20", "23:15", "2:"), format = "%H:%M") # 3rd gives NA
(z <- as.difftime(c(0,30,60), units = "mins"))
as.numeric(z, units = "secs")
as.numeric(z, units = "hours")
format(z)
Retrieve or set the dimension of an object.
dim(x)
dim(x) <- value
x | an R object, for example a matrix, array or data frame. |
value | for the default method, either |
The functions dim and dim<- are internal generic primitive functions.
dim has a method for data.frames, which returns the lengths of the row.names attribute of x and of x (as the numbers of rows and columns respectively).
For an array (and hence in particular, for a matrix) dim retrieves the dim attribute of the object. It is NULL or a vector of mode integer.
The replacement method changes the "dim" attribute (provided the new value is compatible) and removes any "dimnames" and "names" attributes.
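A minimal sketch of the replacement method dropping the names attribute (the vector is an arbitrary example):

```r
x <- c(a = 1, b = 2, c = 3, d = 4)
names(x)             # "a" "b" "c" "d"

y <- x
dim(y) <- c(2, 2)    # y becomes a 2 x 2 matrix
names(y)             # NULL: the names were removed
dim(y)               # 2 2, stored as integer

z <- y
dim(z) <- NULL       # removing "dim" gives a plain vector again
is.null(dim(z))      # TRUE
```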
Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988) The New S Language. Wadsworth & Brooks/Cole.
x <- 1:12 ; dim(x) <- c(3,4)
x

# simple versions of nrow and ncol could be defined as follows
nrow0 <- function(x) dim(x)[1]
ncol0 <- function(x) dim(x)[2]
Retrieve or set the dimnames of an object.
dimnames(x)
dimnames(x) <- value

provideDimnames(x, sep = "", base = list(LETTERS), unique = TRUE)
x | an R object, for example a matrix, array or data frame. |
value | a possible value for |
sep | a character string, used to separate |
base | a non-empty |
unique | logical indicating that the dimnames constructed are unique within each dimension in the sense of |
The functions dimnames and dimnames<- are generic.
For an array (and hence in particular, for a matrix), they retrieve or set the dimnames attribute (see attributes) of the object. A list value can have names, and these will be used to label the dimensions of the array where appropriate.
The replacement method for arrays/matrices coerces vector and factor elements of value to character, but does not dispatch methods for as.character. It coerces zero-length elements to NULL, and a zero-length list to NULL. If value is a list shorter than the number of dimensions, it is extended with NULLs to the needed length.
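The coercion and padding rules above can be sketched like this (the matrices and labels are illustrative only):

```r
m1 <- matrix(1:6, nrow = 2)
## A list shorter than length(dim(m1)) is padded with NULL.
dimnames(m1) <- list(c("r1", "r2"))
rownames(m1)   # "r1" "r2"
colnames(m1)   # NULL

m2 <- matrix(1:6, nrow = 2)
## Integer and factor components are coerced to character;
## names on the list label the dimensions.
dimnames(m2) <- list(rows = 1:2, cols = factor(c("a", "b", "c")))
dimnames(m2)
```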
Both have methods for data frames. The dimnames of a data frame are its row.names and its names. For the replacement method each component of value will be coerced by as.character.
For a 1D matrix the names are the same thing as the (only) component of the dimnames.
Both are primitive functions.
provideDimnames(x) provides dimnames where “missing”, such that its result has character dimnames for each component. If unique is true as by default, they are unique within each component via make.unique(*, sep=sep).
The dimnames of a matrix or array can be NULL (which is not stored) or a list of the same length as dim(x). If a list, its components are either NULL or a character vector with positive length of the appropriate dimension of x. The list can have names. It is possible that all components are NULL: such dimnames may get converted to NULL.
For the "data.frame" method both dimnames are character vectors, and the rownames must contain no duplicates nor missing values.
provideDimnames(x) returns x, with “NULL-free” dimnames, i.e. each component a character vector of correct length.
Setting components of the dimnames, e.g., dimnames(A)[[1]] <- value is a common paradigm, but note that it will not work if the value assigned is NULL. Use rownames instead, or (as it does) manipulate the whole dimnames list.
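A short sketch of the two recommended ways to clear one component of the dimnames (the matrix is an arbitrary example):

```r
A <- matrix(1:4, nrow = 2, dimnames = list(c("a", "b"), c("x", "y")))

## Drop the row names but keep the column names.
rownames(A) <- NULL
dimnames(A)   # first component NULL, second "x" "y"

## Equivalent: rebuild and assign the whole dimnames list.
B <- matrix(1:4, nrow = 2, dimnames = list(c("a", "b"), c("x", "y")))
dimnames(B) <- list(NULL, colnames(B))
identical(A, B)   # TRUE
```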
Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988) The New S Language. Wadsworth & Brooks/Cole.
rownames, colnames; array, matrix, data.frame.
## simple versions of rownames and colnames
## could be defined as follows
rownames0 <- function(x) dimnames(x)[[1]]
colnames0 <- function(x) dimnames(x)[[2]]

(dn <- dimnames(A <- provideDimnames(N <- array(1:24, dim = 2:4))))
A0 <- A; dimnames(A)[2:3] <- list(NULL)
stopifnot(identical(A0, provideDimnames(A)))
strd <- function(x) utils::str(dimnames(x))
strd(provideDimnames(A, base= list(letters[-(1:9)], tail(LETTERS))))
strd(provideDimnames(N, base= list(letters[-(1:9)], tail(LETTERS)))) # recycling
strd(provideDimnames(A, base= list(c("AA","BB")))) # recycling on both levels
## set "empty dimnames":
provideDimnames(rbind(1, 2:3), base = list(""), unique=FALSE)
do.call constructs and executes a function call from a name or a function and a list of arguments to be passed to it.
do.call(what, args, quote = FALSE, envir = parent.frame())
what | either a function or a non-empty character string naming the function to be called. |
args | a list of arguments to the function call. The |
quote | a logical value indicating whether to quote the arguments. |
envir | an environment within which to evaluate the call. This will be most useful if |
If quote is FALSE, the default, then the arguments are evaluated (in the calling environment, not in envir). If quote is TRUE then each argument is quoted (see quote) so that the effect of argument evaluation is to remove the quotes – leaving the original arguments unevaluated when the call is constructed.
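A minimal sketch of the difference quote makes, using a symbol whose name (no_such_object_here) is hypothetical and deliberately not bound to any object:

```r
sym <- as.name("no_such_object_here")   # hypothetical, undefined name

## With quote = FALSE (the default) the constructed call is
## deparse(no_such_object_here), so evaluating the argument fails.
res <- try(do.call(deparse, list(sym)), silent = TRUE)
inherits(res, "try-error")   # TRUE

## With quote = TRUE the symbol reaches deparse() unevaluated.
out <- do.call(deparse, list(sym), quote = TRUE)
out                          # "no_such_object_here"
```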
The behavior of some functions, such as substitute, will not be the same for functions evaluated using do.call as if they were evaluated from the interpreter. The precise semantics are currently undefined and subject to change.
The result of the (evaluated) function call.
This should not be used to attempt to evade restrictions on the use of .Internal and other non-API calls.
Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988) The New S Language. Wadsworth & Brooks/Cole.
call which creates an unevaluated call.
do.call("complex", list(imaginary = 1:3))

## if we already have a list (e.g., a data frame)
## we need c() to add further arguments
tmp <- expand.grid(letters[1:2], 1:3, c("+", "-"))
do.call("paste", c(tmp, sep = ""))

do.call(paste, list(as.name("A"), as.name("B")), quote = TRUE)

## examples of where objects will be found.
A <- 2
f <- function(x) print(x^2)
env <- new.env()
assign("A", 10, envir = env)
assign("f", f, envir = env)
f <- function(x) print(x)
f(A)                                          # 2
do.call("f", list(A))                         # 2
do.call("f", list(A), envir = env)            # 4
do.call( f,  list(A), envir = env)            # 2
do.call("f", list(quote(A)), envir = env)     # 100
do.call( f,  list(quote(A)), envir = env)     # 10
do.call("f", list(as.name("A")), envir = env) # 100

eval(call("f", A))                            # 2
eval(call("f", quote(A)))                     # 2
eval(call("f", A), envir = env)               # 4
eval(call("f", quote(A)), envir = env)        # 100
The dontCheck function is the same as identity, but is interpreted by R CMD check code analysis as a directive to suppress checking of x. Currently this is only used by checkFF(registration = TRUE) when checking the .NAME argument of foreign function calls.
dontCheck(x)
x | an R object. |
suppressForeignCheck which explains why that and dontCheck are undesirable and should be avoided if at all possible.
..., ..1, etc used in Functions
... and ..1, ..2 etc are used to refer to arguments passed down from a calling function. These (and the following) can only be used inside a function which has ... among its formal arguments.
...elt(n) is a functional way to get ..n and basically the same as eval(paste0("..", n)), just more elegant and efficient. Note that switch(n, ...) is very close, differing by returning NULL invisibly instead of an error when n is zero or too large.
...length() returns the number of expressions in ..., and ...names() the names. These are the same as length(list(...)) or names(list(...)) but without evaluating the expressions in ... (which happens with list(...)).
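The point about evaluation can be sketched as follows; the helper names n_dots, pick and n_list are invented for this illustration:

```r
n_dots <- function(...) ...length()
pick   <- function(n, ...) ...elt(n)
n_list <- function(...) length(list(...))

## The stop() call is counted but never evaluated.
n_dots(stop("not evaluated"), 2, 3)    # 3

## Only the requested element is evaluated.
pick(2, stop("not evaluated"), "b")    # "b"

## list(...) evaluates every element, so the same call fails here.
failed <- inherits(try(n_list(stop("evaluated"), 2), silent = TRUE),
                   "try-error")
failed                                 # TRUE
```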
Evaluating elements of ... with ..1, ..2, ...elt(n), etc. propagates visibility. This is consistent with the evaluation of named arguments which also propagates visibility.
...length()
...names()
...elt(n)
n | a positive integer, not larger than the number of expressions in ..., which is the same as |
... and ..1, ..2 are reserved words in R, see Reserved.
For more, see the Introduction to R manual for usage of these syntactic elements, and dotsMethods for their use in formal (S4) methods.
tst <- function(n, ...) ...elt(n)
tst(1, pi=pi*0:1, 2:4) ## [1] 0.000000 3.141593
tst(2, pi=pi*0:1, 2:4) ## [1] 2 3 4
try(tst(1)) # -> Error about '...' not containing an element.

tst.dl <- function(x, ...) ...length()
tst.dns <- function(x, ...) ...names()
tst.dl(1:10)    # 0 (because the first argument is 'x')
tst.dl(4, 5)    # 1
tst.dl(4, 5, 6) # 2 namely '5, 6'
tst.dl(4, 5, 6, 7, sin(1:10), "foo"/"bar") # 5. Note: no evaluation!
tst.dns(4, foo=5, 6, bar=7, sini = sin(1:10), "foo"/"bar")
## "foo" "" "bar" "sini" ""

## From R 4.1.0 to 4.1.2, ...names() sometimes did not match names(list(...));
## check and show (these examples all would've failed):
chk.n2 <- function(...) stopifnot(identical(print(...names()), names(list(...))))
chk.n2(4, foo=5, 6, bar=7, sini = sin(1:10), "bar")
chk.n2()
chk.n2(1,2)
Create, coerce to or test for a double-precision vector.
double(length = 0)
as.double(x, ...)
is.double(x)

single(length = 0)
as.single(x, ...)
length | a non-negative integer specifying the desired length. Double values will be coerced to integer: supplying an argument of length other than one is an error. |
x | object to be coerced or tested. |
... | further arguments passed to or from other methods. |
double creates a double-precision vector of the specified length. The elements of the vector are all equal to 0. It is identical to numeric.
as.double is a generic function. It is identical to as.numeric. Methods should return an object of base type "double".
is.double is a test of double type.
R has no single precision data type. All real numbers are stored in double precision format. The functions as.single and single are identical to as.double and double except they set the attribute Csingle that is used in the .C and .Fortran interface, and they are intended only to be used in that context.
double creates a double-precision vector of the specified length. The elements of the vector are all equal to 0.
as.double attempts to coerce its argument to be of double type: like as.vector it strips attributes including names. (To ensure that an object is of double type without stripping attributes, use storage.mode.) Character strings containing optional whitespace followed by either a decimal representation or a hexadecimal representation (starting with 0x or 0X) can be converted, as can special values such as "NA", "NaN", "Inf" and "infinity", irrespective of case.
as.double for factors yields the codes underlying the factor levels, not the numeric representation of the labels, see also factor.
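As an illustration of the coercion rules just described (the input strings and factor are arbitrary examples):

```r
## Leading whitespace, hexadecimal and special values are all accepted.
v <- as.double(c(" 1.5", "0x1A", "inf", "NaN", "NA"))
v   # 1.5 26 Inf NaN NA

## For a factor, as.double() gives the integer codes, not the labels.
f <- factor(c("10", "20", "10"))
codes  <- as.double(f)                 # 1 2 1
labels <- as.double(as.character(f))   # 10 20 10
```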
is.double returns TRUE or FALSE depending on whether its argument is of double type or not.
All R platforms are required to work with values conforming to the IEC 60559 (also known as IEEE 754) standard. This basically works with a precision of 53 bits, and represents to that precision a range of absolute values from about 2e-308 to 2e+308. It also has special values NaN (many of them), plus and minus infinity and plus and minus zero (although R acts as if these are the same). There are also denormal(ized) (or subnormal) numbers with values below the range given above but represented to less precision.
See .Machine for precise information on these limits. Note that ultimately how double precision numbers are handled is down to the CPU/FPU and compiler.
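These limits can be inspected directly; a brief sketch, assuming a platform with standard IEC 60559 doubles and no flush-to-zero mode:

```r
.Machine$double.digits   # 53 bits of precision
.Machine$double.xmin     # smallest normalized positive double
.Machine$double.xmax     # largest finite double

## Plus and minus zero compare equal, but can be told apart by division.
0 == -0                  # TRUE
c(1 / 0, 1 / -0)         # Inf -Inf

## Halving the smallest normalized value gives a subnormal, not zero.
sub <- .Machine$double.xmin / 2
sub > 0                  # TRUE
```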
In IEEE 754-2008/IEC60559:2011 this is called ‘binary64’ format.
It is a historical anomaly that R has two names for its floating-point vectors, double and numeric (and formerly had real).
double is the name of the type. numeric is the name of the mode and also of the implicit class. As an S4 formal class, use "numeric".
The potential confusion is that R has used mode "numeric" to mean ‘double or integer’, which conflicts with the S4 usage. Thus is.numeric tests the mode, not the class, but as.numeric (which is identical to as.double) coerces to the class.
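The distinction is easiest to see with an integer value; a small sketch:

```r
i <- 1L
typeof(i)               # "integer"
mode(i)                 # "numeric"
is.numeric(i)           # TRUE: tests the mode
is.double(i)            # FALSE: tests the type
typeof(as.numeric(i))   # "double": coerces to type double
```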
Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988) The New S Language. Wadsworth & Brooks/Cole.
https://en.wikipedia.org/wiki/IEEE_754-1985, https://en.wikipedia.org/wiki/IEEE_754-2008, https://en.wikipedia.org/wiki/IEEE_754-2019, https://en.wikipedia.org/wiki/Double_precision, https://en.wikipedia.org/wiki/Denormal_number.
is.double(1)
all(double(3) == 0)
Writes an ASCII text representation of an R object to a file, the R console, or a connection, or uses one to recreate the object.
dput(x, file = "",
     control = c("keepNA", "keepInteger", "niceNames", "showAttributes"))

dget(file, keep.source = FALSE)
x | an object. |
file | either a character string naming a file or a connection. |
control | character vector (or |
keep.source | logical: should the source formatting be retained whenparsing functions, if possible? |
dput opens file and deparses the object x into that file. The object name is not written (unlike dump). If x is a function the associated environment is stripped. Hence scoping information can be lost.
Deparsing an object is difficult, and not always possible. With the default control, dput() attempts to deparse in a way that is readable, but for more complex or unusual objects (see dump), not likely to be parsed as identical to the original. Use control = "all" for the most complete deparsing; use control = NULL for the simplest deparsing, not even including attributes.
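A rough sketch of how the control settings change the result, using deparse() (which shares the same deparsing options) to check a round trip; the vector is an arbitrary example:

```r
x <- c(a = 1L, b = 2L, c = 3L)

dput(x)                   # default control: names and integer type kept
dput(x, control = NULL)   # simplest form: attributes are not written

## control = "all" is intended to allow an exact round trip.
full <- eval(parse(text = deparse(x, control = "all")))
identical(full, x)        # TRUE

## With control = NULL the names are lost on the way back.
bare <- eval(parse(text = deparse(x, control = NULL)))
names(bare)               # NULL
```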
dput will warn if fewer characters were written to a file than expected, which may indicate a full or corrupt file system.
To display saved source rather than deparsing the internal representation include "useSource" in control. R currently saves source only for function definitions. If you do not care about source representation (e.g., for a data object), for speed set options(keep.source = FALSE) when calling source.
For dput, the first argument invisibly.
For dget, the object created.
This is not a good way to transfer objects between R sessions. dump is better, but the functions save and saveRDS are designed to be used for transporting R data, and will work with R objects that dput does not handle correctly as well as being much faster.
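For comparison, a minimal sketch of transporting an object with saveRDS and readRDS (the data frame is an invented example):

```r
obj <- data.frame(g    = factor(c("u", "v", "u")),
                  when = as.Date("2024-01-01") + 0:2)

rds <- tempfile(fileext = ".rds")
saveRDS(obj, rds)          # binary serialization of a single object
back <- readRDS(rds)
same <- identical(back, obj)
same                       # TRUE
unlink(rds)
```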
To avoid the risk of a source attribute out of sync with the actual function definition, the source attribute of a function will never be written as an attribute.
Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988) The New S Language. Wadsworth & Brooks/Cole.
deparse, .deparseOpts, dump, write.
fil <- tempfile()
## Write an ASCII version of the 'base' function mean() to our temp file, ..
dput(base::mean, fil)
## ... read it back into 'bar' and confirm it is the same
bar <- dget(fil)
stopifnot(all.equal(bar, base::mean, check.environment = FALSE))

## Create a function with comments
baz <- function(x) {
  # Subtract from one
  1-x
}
## and display it
dput(baz)
## and now display the saved source
dput(baz, control = "useSource")

## Numeric values:
xx <- pi^(1:3)
dput(xx)
dput(xx, control = "digits17")
dput(xx, control = "hexNumeric")
dput(xx, fil); dget(fil) - xx # slight rounding on all platforms
dput(xx, fil, control = "digits17")
dget(fil) - xx # slight rounding on some platforms
dput(xx, fil, control = "hexNumeric"); dget(fil) - xx
unlink(fil)

xn <- setNames(xx, paste0("pi^",1:3))
dput(xn) # nicer, now "niceNames" being part of default 'control'
dput(xn, control = "S_compat") # no names
## explicitly asking for output as in R < 3.5.0:
dput(xn, control = c("keepNA", "keepInteger", "showAttributes"))
Delete the dimensions of an array which have only one level.
drop(x)
x | an array (including a matrix). |
If x is an object with a dim attribute (e.g., a matrix or array), then drop returns an object like x, but with any extents of length one removed. Any accompanying dimnames attribute is adjusted and returned with x: if the result is a vector the names are taken from the dimnames (if any). If the result is a length-one vector, the names are taken from the first dimension with a dimname.
Array subsetting ([) performs this reduction unless used with drop = FALSE, but sometimes it is useful to invoke drop directly.
drop1 which is used for dropping terms in models, and droplevels used for dropping unused levels from a factor.
dim(drop(array(1:12, dim = c(1,3,1,1,2,1,2)))) # = 3 2 2
drop(1:3 %*% 2:4) # scalar product
The function droplevels is used to drop unused levels from a factor or, more commonly, from factors in a data frame.
droplevels(x, ...)
## S3 method for class 'factor'
droplevels(x, exclude = if(anyNA(levels(x))) NULL else NA, ...)
## S3 method for class 'data.frame'
droplevels(x, except, exclude, ...)
x | an object from which to drop unused factor levels. |
exclude | passed to |
... | further arguments passed to methods. |
except | indices of columns from which not to drop levels. |
The method for class "factor" is currently equivalent to factor(x, exclude=exclude). For the data frame method, you should rarely specify exclude “globally” for all factor columns; rather the default uses the same factor-specific exclude as the factor method itself.
The except argument follows the usual indexing rules.
droplevels returns an object of the same class as x.
This function was introduced in R 2.12.0. It is primarily intended for cases where one or more factors in a data frame contains only elements from a reduced level set after subsetting. (Notice that subsetting does not in general drop unused levels). By default, levels are dropped from all factors in a data frame, but the except argument allows you to specify columns for which this is not wanted.
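A small sketch of the except argument on a made-up data frame with two factor columns:

```r
df <- data.frame(g = factor(c("a", "b", "c")),
                 h = factor(c("x", "y", "z")))
sub <- df[df$g != "c", ]    # subsetting keeps all three levels

nlevels(sub$g)              # 3

## Drop unused levels everywhere ...
all_dropped <- droplevels(sub)
nlevels(all_dropped$g)      # 2
nlevels(all_dropped$h)      # 2

## ... or leave column 2 ('h') untouched.
kept <- droplevels(sub, except = 2)
nlevels(kept$g)             # 2
nlevels(kept$h)             # 3
```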
subset for subsetting data frames. factor for definition of factors. drop for dropping array dimensions. drop1 for dropping terms from a model. [.factor for subsetting of factors.
aq <- transform(airquality, Month = factor(Month, labels = month.abb[5:9]))
aq <- subset(aq, Month != "Jul")
table(           aq $Month)
table(droplevels(aq)$Month)
This function takes a vector of names of R objects and produces text representations of the objects on a file or connection. A dump file can usually be sourced into another R session.
dump(list, file = "dumpdata.R", append = FALSE, control = "all", envir = parent.frame(), evaluate = TRUE)
list | character vector (or |
file | either a character string naming a file or a connection. |
append | if |
control | character vector (or |
envir | the environment to search for objects. |
evaluate | logical. Should promises be evaluated? |
If some of the objects named do not exist (in scope), they are omitted, with a warning. If file is a file and no objects exist then no file is created.
sourceing may not produce an identical copy of dumped objects. A warning is issued if it is likely that problems will arise, for example when dumping exotic or complex objects (see the Note).
dump will also warn if fewer characters were written to a file than expected, which may indicate a full or corrupt file system.
A dump file can be sourced into another R (or perhaps S) session, but the functions save and saveRDS are designed to be used for transporting R data, and will work with R objects that dump does not handle. For maximal reproducibility use control = "exact".
To produce a more readable representation of an object, use control = NULL. This will skip attributes, and will make other simplifications that make source less likely to produce an identical copy. See .deparseOpts for details.
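A minimal round-trip sketch, using explicit environments (src and dst are names invented here) so that it does not depend on where the code is run:

```r
src <- new.env()
assign("a", 1:3, envir = src)
assign("b", list(u = "x", v = c(1.5, 2)), envir = src)

fil <- tempfile(fileext = ".R")
dump(c("a", "b"), file = fil, envir = src)   # default control = "all"

## Re-create the objects in a fresh environment.
dst <- new.env()
source(fil, local = dst)
ok <- identical(dst$a, src$a) && identical(dst$b, src$b)
ok                                           # TRUE for these simple objects
unlink(fil)
```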
To deparse the internal representation of a function rather than displaying the saved source, use control = c("keepInteger", "warnIncomplete", "keepNA"). This will lose all formatting and comments, but may be useful in those cases where the saved source is no longer correct.
Promises will normally only be encountered by users as a result of lazy-loading (when the default evaluate = TRUE is essential) and after the use of delayedAssign, when evaluate = FALSE might be intended.
An invisible character vector containing the names of the objects which were dumped.
As dump is defined in the base namespace, the base package will be searched before the global environment unless dump is called from the top level prompt or the envir argument is given explicitly.
To avoid the risk of a source attribute becoming out of sync with the actual function definition, the source attribute of a function will never be dumped as an attribute.
Currently environments, external pointers, weak references and objects of type S4 are not deparsed in a way that can be sourced. In addition, language objects are deparsed in a simple way whatever the value of control, and this includes not dumping their attributes (which will result in a warning).
Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988) The New S Language. Wadsworth & Brooks/Cole.
.deparseOpts for available control settings; dput(), dget() and deparse() for related functions using identical internal deparsing functionality.
write, write.table, etc for “dumping” data to (text) files.
save and saveRDS for a more reliable way to save R objects.
x <- 1; y <- 1:10
fil <- tempfile(fileext=".Rdmped")
dump(ls(pattern = '^[xyz]'), fil)
print(.Last.value)
unlink(fil)
duplicated() determines which elements of a vector or data frame are duplicates of elements with smaller subscripts, and returns a logical vector indicating which elements (rows) are duplicates.
anyDuplicated(.) is a “generalized” more efficient version of any(duplicated(.)), returning positive integer indices instead of just TRUE.
duplicated(x, incomparables = FALSE, ...)
## Default S3 method:
duplicated(x, incomparables = FALSE, fromLast = FALSE, nmax = NA, ...)
## S3 method for class 'array'
duplicated(x, incomparables = FALSE, MARGIN = 1, fromLast = FALSE, ...)

anyDuplicated(x, incomparables = FALSE, ...)
## Default S3 method:
anyDuplicated(x, incomparables = FALSE, fromLast = FALSE, ...)
## S3 method for class 'array'
anyDuplicated(x, incomparables = FALSE, MARGIN = 1, fromLast = FALSE, ...)
x | a vector or a data frame or an array or |
incomparables | a vector of values that cannot be compared. |
fromLast | logical indicating if duplication should be considered from the reverse side, i.e., the last (or rightmost) of identical elements would correspond to |
nmax | the maximum number of unique items expected (greater than one). |
... | arguments for particular methods. |
MARGIN | the array margin to be held fixed: see |
These are generic functions with methods for vectors (including lists), data frames and arrays (including matrices).
For the default methods, and whenever there are equivalent method definitions for duplicated and anyDuplicated, anyDuplicated(x, ...) is a “generalized” shortcut for any(duplicated(x, ...)), in the sense that it returns the index i of the first duplicated entry x[i] if there is one, and 0 otherwise. Their behaviours may be different when at least one of duplicated and anyDuplicated has a relevant method.
duplicated(x, fromLast = TRUE) is equivalent to but faster than rev(duplicated(rev(x))).
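The relationships stated above can be checked on a small character vector (an arbitrary example):

```r
x <- c("a", "b", "a", "c", "b")

duplicated(x)                    # FALSE FALSE  TRUE FALSE  TRUE
duplicated(x, fromLast = TRUE)   #  TRUE  TRUE FALSE FALSE FALSE

anyDuplicated(x)                 # 3: index of the first duplicated entry
anyDuplicated(c("a", "b", "c"))  # 0: no duplicates

## fromLast = TRUE matches the reverse-scan formulation.
same <- identical(duplicated(x, fromLast = TRUE),
                  rev(duplicated(rev(x))))
same                             # TRUE
```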
The array method calculates for each element of the sub-array specified by MARGIN if the remaining dimensions are identical to those for an earlier (or later, when fromLast = TRUE) element (in row-major order). This would most commonly be used to find duplicated rows (the default) or columns (with MARGIN = 2). Note that MARGIN = 0 returns an array of the same dimensionality attributes as x.
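A brief sketch of the array method on a small matrix whose third row repeats the first (values chosen only for illustration):

```r
m <- rbind(c(1, 2), c(3, 4), c(1, 2))

rows <- duplicated(m)               # by row (MARGIN = 1)
as.vector(rows)                     # FALSE FALSE TRUE

cols <- duplicated(m, MARGIN = 2)   # by column
as.vector(cols)                     # FALSE FALSE

anyDuplicated(m)                    # 3: the first duplicated row

elementwise <- duplicated(m, MARGIN = 0)
dim(elementwise)                    # 3 2, same shape as 'm'
```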
Missing values ("NA") are regarded as equal, numeric and complex ones differing from NaN; character strings will be compared in a “common encoding”; for details, see match (and unique) which use the same concept.
Values in incomparables will never be marked as duplicated. This is intended to be used for a fairly small set of values and will not be efficient for a very large set.
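A short sketch of how missing values and incomparables interact (the vector is an arbitrary example):

```r
x <- c(NA, 1, NA, 1)

duplicated(x)                      # FALSE FALSE  TRUE  TRUE
duplicated(x, incomparables = NA)  # FALSE FALSE FALSE  TRUE

## NA and NaN are treated as different values.
duplicated(c(NA, NaN))             # FALSE FALSE
```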
Except for factors, logical and raw vectors the default nmax = NA is equivalent to nmax = length(x). Since a hash table of size 8*nmax bytes is allocated, setting nmax suitably can save large amounts of memory. For factors it is automatically set to the smaller of length(x) and the number of levels plus one (for NA). If nmax is set too small there is liable to be an error: nmax = 1 is silently ignored.
Long vectors are supported for the default method of duplicated, but may only be usable if nmax is supplied.
duplicated(): For a vector input, a logical vector of the same length as x. For a data frame, a logical vector with one element for each row. For a matrix or array, and when MARGIN = 0, a logical array with the same dimensions and dimnames.
anyDuplicated(): an integer or real vector of length one with value the 1-based index of the first duplicate if any, otherwise 0.
Using this for lists is potentially slow, especially if the elements are not atomic vectors (see vector) or differ only in their attributes. In the worst case it is O(n^2).
Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988) The New S Language. Wadsworth & Brooks/Cole.
x <- c(9:20, 1:5, 3:7, 0:8)
## extract unique elements
(xu <- x[!duplicated(x)])
## similar, same elements but different order:
(xu2 <- x[!duplicated(x, fromLast = TRUE)])

## xu == unique(x) but unique(x) is more efficient
stopifnot(identical(xu,  unique(x)),
          identical(xu2, unique(x, fromLast = TRUE)))

duplicated(iris)[140:143]

duplicated(iris3, MARGIN = c(1, 3))
anyDuplicated(iris) ## 143

anyDuplicated(x)
anyDuplicated(x, fromLast = TRUE)
Load or unload DLLs (also known as shared objects), and test whether a C function or Fortran subroutine is available.
dyn.load(x, local = TRUE, now = TRUE, ...)
dyn.unload(x)

is.loaded(symbol, PACKAGE = "", type = "")
x | a character string giving the pathname to a DLL, also known as a dynamic shared object. (See ‘Details’ for what these terms mean.) |
local | a logical value controlling whether the symbols in the DLL are stored in their own local table and not shared across DLLs, or added to the global symbol table. Whether this has any effect is system-dependent. |
now | a logical controlling whether all symbols are resolved (and relocated) immediately when the library is loaded or deferred until they are used. This control is useful for developers testing whether a library is complete and has all the necessary symbols, and for users to ignore missing symbols. Whether this has any effect is system-dependent. |
... | other arguments for future expansion. |
symbol | a character string giving a symbol name. |
PACKAGE | if supplied, confine the search for the |
type | the type of symbol to look for: can be any ( |
The objects dyn.load loads are called ‘dynamically loadable libraries’ (abbreviated to ‘DLL’) on all platforms except macOS, which uses the term for a different sort of object. On Unix-alikes they are also called ‘dynamic shared objects’ (‘DSO’), or ‘shared objects’ for short. (The POSIX standards use ‘executable object file’, but no one else does.)
See ‘See Also’ and the ‘Writing R Extensions’ and ‘R Installation and Administration’ manuals for how to create and install a suitable DLL.
Unfortunately some rare platforms (e.g., Compaq Tru64) do not handle the PACKAGE argument correctly, and may incorrectly find symbols linked into R.
The additional arguments to dyn.load mirror the different aspects of the mode argument to the dlopen() routine on POSIX systems. They are available so that users can exercise greater control over the loading process for an individual library. In general, the default values are appropriate and you should override them only if there is good reason and you understand the implications.
The local argument allows one to control whether the symbols in the DLL being attached are visible to other DLLs. While maintaining the symbols in their own namespace is good practice, the ability to share symbols across related ‘chapters’ is useful in many cases. Additionally, on certain platforms and versions of an operating system, certain libraries must have their symbols loaded globally to successfully resolve all symbols.
One should be careful of one potential side-effect of using lazy loading via now = FALSE: if a routine is called that has a missing symbol, the process will terminate immediately. The intended use is for library developers to call this with value TRUE to check that all symbols are actually resolved and for regular users to call it with FALSE so that missing symbols can be ignored and the available ones can be called.
The initial motivation for adding these was to avoid such termination in the _init() routines of the Java virtual machine library. However, symbols loaded locally may not be (read: probably are not) available to other DLLs. Those added to the global table are available to all other elements of the application and so can be shared across two different DLLs.
Some (very old) systems do not provide (explicit) support for local/global and lazy/eager symbol resolution. This can be the source of subtle bugs. One can arrange to have warning messages emitted when unsupported options are used. This is done by setting either of the options verbose or warn to be non-zero via the options function.
There is a short discussion of these additional arguments with some example code available at https://www.stat.ucdavis.edu/~duncan/R/dynload/.
The function dyn.load is used for its side effect which links the specified DLL to the executing R image. Calls to .C, .Call, .Fortran and .External can then be used to execute compiled C functions or Fortran subroutines contained in the library. The return value of dyn.load is an object of class DLLInfo. See getLoadedDLLs for information about this class.
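As a rough illustration of the DLLInfo objects mentioned above, the following sketch inspects the DLLs already linked into the session and then loads a DLL only if the file exists. The file name mylib and its location in tempdir() are hypothetical placeholders chosen for illustration; they are not part of R.

```r
# List the DLLs currently linked into this R session.
dlls <- getLoadedDLLs()
class(dlls)                       # "DLLInfoList"

# Each element is a "DLLInfo" object; its fields are read here with [[ ]].
base_dll <- dlls[["base"]]
class(base_dll)                   # "DLLInfo"
base_dll[["name"]]

# Hypothetical DLL path: 'mylib' is an assumed name, for illustration only.
dll_path <- file.path(tempdir(), paste0("mylib", .Platform$dynlib.ext))
if (file.exists(dll_path)) {
  info <- dyn.load(dll_path)      # returns a "DLLInfo" object
  print(info[["path"]])
  dyn.unload(dll_path)
}
```

Guarding the call with file.exists() keeps the sketch harmless on systems where no such file is present.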
The function dyn.unload unlinks the DLL. Note that unloading a DLL and then re-loading a DLL of the same name may or may not work: on Solaris it used the first version loaded. Note also that some DLLs cannot be safely unloaded at all: unloading a DLL which implements C finalizers but does not unregister them on unload causes R to crash.
is.loaded checks if the symbol name is loaded and searchable and hence available for use as a character string value for argument .NAME in .C, .Fortran, .Call, or .External. It will succeed if any one of the four calling functions would succeed in using the entry point unless type is specified. (See .Fortran for how Fortran symbols are mapped.) Note that symbols in base packages are not searchable, and other packages can be so marked.
Do not use dyn.unload on a DLL loaded by library.dynam: use library.dynam.unload. This is needed for system housekeeping.
is.loaded requires the name you would give to .C etc. It must be a character string and so cannot be an R object as used for registered native symbols (see “Writing R Extensions” section 5.4). Some registered symbols are available by name but most are not, including those in the examples below.
By default, the maximum number of DLLs that can be loaded is now 614 when the OS limit on the number of open files allows or can be increased, but less otherwise (but it will be at least 100). A specific maximum can be requested via the environment variable R_MAX_NUM_DLLS, which has to be set (to a value between 100 and 1000 inclusive) before starting an R session. If the OS limit on the number of open files does not allow using this maximum and cannot be increased, R will fail to start with an error. The maximum is not allowed to be greater than 60% of the OS limit on the number of open files (essentially unlimited on Windows, on Unix typically 1024, but 256 on macOS). The limit can sometimes (including on macOS) be modified using command ulimit -n (sh, bash) or limit descriptors (csh) in the shell used to launch R. Increasing R_MAX_NUM_DLLS comes with some memory overhead, and be aware that many types of connections also use file descriptors.
If the OS limit on the number of open files cannot be determined, the DLL limit is 100 and cannot be changed via R_MAX_NUM_DLLS.
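A small sketch of how one might check, from inside a session, whether R_MAX_NUM_DLLS was set at startup and how many DLLs are currently loaded. It only reads state; as noted above, the variable has to be set before R starts, so changing it here would have no effect.

```r
# Was a specific maximum requested before R was started?
max_dlls <- Sys.getenv("R_MAX_NUM_DLLS", unset = NA)

# How many DLLs are linked into the session right now?
n_loaded <- length(getLoadedDLLs())

if (is.na(max_dlls)) {
  message("R_MAX_NUM_DLLS is not set; ", n_loaded, " DLLs loaded")
} else {
  message("R_MAX_NUM_DLLS = ", max_dlls, "; ", n_loaded, " DLLs loaded")
}
```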
The creation of DLLs and the runtime linking of them into executing programs is very platform dependent. In recent years there has been some simplification in the process because the C subroutine call dlopen has become the POSIX standard for doing this. Under Unix-alikes dyn.load uses the dlopen mechanism and should work on all platforms which support it. On Windows it uses the standard mechanism (LoadLibrary) for loading DLLs.
The original code for loading DLLs in Unix-alikes was provided by Heiner Schwarte.
Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988) The New S Language. Wadsworth & Brooks/Cole.
library.dynam to be used inside a package's .onLoad initialization.
SHLIB for how to create suitable DLLs.
## expect all of these to be false in R >= 3.0.0 as these can only be
## used via registered symbols.
is.loaded("supsmu") # Fortran entry point in stats
is.loaded("supsmu", "stats", "Fortran")
is.loaded("PDF", type = "External") # pdf() device in grDevices
eapply applies FUN to the named values from an environment and returns the results as a list. The user can request that all named objects are used (normally names that begin with a dot are not). The output is not sorted and no enclosing environments are searched.
eapply(env, FUN, ..., all.names = FALSE, USE.NAMES = TRUE)
env | environment to be used. |
FUN | the function to be applied, foundvia |
... | optional arguments to |
all.names | a logical indicating whether to apply the function to all values. |
USE.NAMES | logical indicating whether the resulting list shouldhave |
A named (unless USE.NAMES = FALSE) list. Note that the order of the components is arbitrary for hashed environments.
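The effect of all.names can be seen with a minimal sketch such as the one below. The object names x and .hidden are arbitrary choices for illustration; because the order of the result is arbitrary for hashed environments, the names are compared as sets.

```r
e <- new.env()
assign("x", 1:3, envir = e)
assign(".hidden", 10, envir = e)

# By default, names beginning with a dot are skipped.
visible <- names(eapply(e, length))

# With all.names = TRUE, every binding in the frame is visited.
everything <- names(eapply(e, length, all.names = TRUE))

visible
sort(everything)
```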
require(stats)

env <- new.env(hash = FALSE) # so the order is fixed
env$a <- 1:10
env$beta <- exp(-3:3)
env$logic <- c(TRUE, FALSE, FALSE, TRUE)
# what have we there?
utils::ls.str(env)

# compute the mean for each list element
       eapply(env, mean)
unlist(eapply(env, mean, USE.NAMES = FALSE))

# median and quartiles for each element (making use of "..." passing):
eapply(env, quantile, probs = 1:3/4)
eapply(env, quantile)
Computes eigenvalues and eigenvectors of numeric (double, integer, logical) or complex matrices.
eigen(x, symmetric, only.values = FALSE, EISPACK = FALSE)
x | a numeric or complex matrix whose spectral decomposition is to be computed. Logical matrices are coerced to numeric. |
symmetric | if |
only.values | if |
EISPACK | logical. Defunct and ignored. |
If symmetric is unspecified, isSymmetric(x) determines if the matrix is symmetric up to plausible numerical inaccuracies. It is surer and typically much faster to set the value yourself.
Computing the eigenvectors is the slow part for large matrices.
Computing the eigendecomposition of a matrix is subject to errors on a real-world computer: the definitive analysis is Wilkinson (1965). All you can hope for is a solution to a problem suitably close to x. So even though a real asymmetric x may have an algebraic solution with repeated real eigenvalues, the computed solution may be of a similar matrix with complex conjugate pairs of eigenvalues.
Unsuccessful results from the underlying LAPACK code will result in an error giving a positive error code (most often 1): these can only be interpreted by detailed study of the FORTRAN code.
Missing, NaN or infinite values in x will give an error.
The spectral decomposition of x is returned as a list with components
values | a vector containing the |
vectors | either a Recall that the eigenvectors are only defined up to a constant: even when the length is specified they are still only defined up to a scalar of modulus one (the sign for real matrices). |
When only.values is not true, as by default, the result is of S3 class "eigen".
If r <- eigen(A), and V <- r$vectors; lam <- r$values, then
$$A = V \Lambda V^{-1}$$
(up to numerical fuzz), where $\Lambda =$ diag(lam).
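The identity above can be checked numerically. This is a minimal sketch using an arbitrary 2 x 2 symmetric matrix chosen for illustration; the comparison uses all.equal() because the reconstruction is only exact up to numerical fuzz.

```r
A <- matrix(c(2, 1,
              1, 3), nrow = 2, byrow = TRUE)

r   <- eigen(A)
V   <- r$vectors
lam <- r$values

# Rebuild A from its eigenvectors and eigenvalues: V %*% diag(lam) %*% V^{-1}
A_rebuilt <- V %*% diag(lam) %*% solve(V)

all.equal(A, A_rebuilt)
```

For a real symmetric matrix the eigenvectors returned are orthonormal, so t(V) could be used in place of solve(V).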
eigen uses the LAPACK routines DSYEVR, DGEEV, ZHEEV and ZGEEV.
LAPACK is from https://netlib.org/lapack/ and its guide is listed in the references.
Anderson. E. and ten others (1999) LAPACK Users' Guide. Third Edition. SIAM. Available on-line at https://netlib.org/lapack/lug/lapack_lug.html.
Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988) The New S Language. Wadsworth & Brooks/Cole.
Wilkinson, J. H. (1965) The Algebraic Eigenvalue Problem. Clarendon Press, Oxford.
svd, a generalization of eigen; qr, and chol for related decompositions.
To compute the determinant of a matrix, the qr decomposition is much more efficient: det.
eigen(cbind(c(1,-1), c(-1,1)))
eigen(cbind(c(1,-1), c(-1,1)), symmetric = FALSE)
# same (different algorithm).

eigen(cbind(1, c(1,-1)), only.values = TRUE)
eigen(cbind(-1, 2:1)) # complex values
eigen(print(cbind(c(0, 1i), c(-1i, 0)))) # Hermite ==> real Eigenvalues
## 3 x 3:
eigen(cbind( 1, 3:1, 1:3))
eigen(cbind(-1, c(1:2,0), 0:2)) # complex values
encodeString escapes the strings in a character vector in the same way print.default does, and optionally fits the encoded strings within a field width.
encodeString(x, width = 0, quote = "", na.encode = TRUE, justify = c("left", "right", "centre", "none"))
x | a character vector, or an object that can be coerced to oneby |
width | integer: the minimum field width. If |
quote | character: quoting character, if any. |
na.encode | logical: should |
justify | character: partial matches are allowed. If padding to the minimum field width is needed, how should spaces be inserted? |
This escapes backslash and the control characters ‘\a’ (bell), ‘\b’ (backspace), ‘\f’ (form feed), ‘\n’ (line feed, aka “newline”), ‘\r’ (carriage return), ‘\t’ (tab) and ‘\v’ (vertical tab) as well as any non-printable characters in a single-byte locale, which are printed in octal notation (‘\xyz’ with leading zeroes).
Which characters are non-printable depends on the current locale. Windows' reporting of printable characters is unreliable, so there all other control characters are regarded as non-printable, and all characters with codes 32–255 as printable in a single-byte locale. See print.default for how non-printable characters are handled in multi-byte locales.
If quote is a single or double quote any embedded quote of the same type is escaped. Note that justification is of the quoted string, hence spaces are added outside the quotes.
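The escaping and quoting rules can be sketched as follows. The input strings are arbitrary examples; cat() is used so that the encoded strings are shown exactly as produced, without a further layer of escaping from print().

```r
# A tab is escaped as the two characters backslash and 't'.
plain <- encodeString("tab\there")

# Embedded double quotes are escaped when quote = '"' is requested.
quoted <- encodeString('say "hi"', quote = '"')

# Padding is applied to the quoted string, so spaces go outside the quotes.
padded <- encodeString("a", width = 5, quote = "'", justify = "right")

cat(plain, quoted, padded, sep = "\n")
```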
A character vector of the same length as x, with the same attributes (including names and dimensions) but with no class set.
Marked UTF-8 encodings are preserved.
The default for width is different from format.default, which does similar things for character vectors but without encoding using escapes.
x <- "ab\bc\ndef"
print(x)
cat(x) # interprets escapes
cat(encodeString(x), "\n", sep = "") # similar to print()

factor(x) # makes use of this to print the levels

x <- c("a", "ab", "abcde")
encodeString(x) # width = 0: use as little as possible
encodeString(x, 2) # use two or more (left justified)
encodeString(x, width = NA) # left justification
encodeString(x, width = NA, justify = "c")
encodeString(x, width = NA, justify = "r")
encodeString(x, width = NA, quote = "'", justify = "r")
Read or set the declared encodings for a character vector.
Encoding(x)

Encoding(x) <- value

enc2native(x)
enc2utf8(x)
x | A character vector. |
value | A character vector of positive length. |
Character strings in R can be declared to be encoded in "latin1" or "UTF-8" or as "bytes". These declarations can be read by Encoding, which will return a character vector of values "latin1", "UTF-8", "bytes" or "unknown", or set, when value is recycled as needed and other values are silently treated as "unknown". ASCII strings will never be marked with a declared encoding, since their representation is the same in all supported encodings. Strings marked as "bytes" are intended to be non-ASCII strings which should be manipulated as bytes, and never converted to a character encoding (so writing them to a text file is supported only by writeLines(useBytes = TRUE)).
enc2native and enc2utf8 convert elements of character vectors to the native encoding or UTF-8 respectively, taking any marked encoding into account. They are primitive functions, designed to do minimal copying.
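A brief sketch of declaring an encoding and converting with enc2utf8. The string is an arbitrary example containing the Latin-1 byte 0xE7; the last part illustrates that an ASCII string stays unmarked even when a declaration is attempted.

```r
x <- "fa\xE7ile"            # bytes intended to be read as Latin-1
Encoding(x) <- "latin1"     # declare the encoding; the bytes are unchanged

xu <- enc2utf8(x)           # re-encode, taking the declaration into account
Encoding(c(x, xu))

# The same character occupies one byte in Latin-1 and two in UTF-8.
nchar(x,  type = "bytes")
nchar(xu, type = "bytes")

# ASCII strings are never marked.
ascii <- "abc"
Encoding(ascii) <- "UTF-8"
Encoding(ascii)
```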
There are other ways for character strings to acquire a declared encoding apart from explicitly setting it (and these have changed as R has evolved). The parser marks strings containing ‘\u’ or ‘\U’ escapes. Functions scan, read.table, readLines, and parse have an encoding argument that is used to declare encodings, iconv declares encodings from its to argument, and console input in suitable locales is also declared. intToUtf8 declares its output as "UTF-8", and output text connections (see textConnection) are marked if running in a suitable locale. Under some circumstances (see its help page) source(encoding=) will mark encodings of character strings it outputs.
Most character manipulation functions will set the encoding on output strings if it was declared on the corresponding input. These include chartr, strsplit(useBytes = FALSE), tolower and toupper as well as sub(useBytes = FALSE) and gsub(useBytes = FALSE). Note that such functions do not preserve the encoding, but if they know the input encoding and that the string has been successfully re-encoded (to the current encoding or UTF-8), they mark the output.
substr does preserve the encoding, and chartr, tolower and toupper preserve UTF-8 encoding on systems with Unicode wide characters. With their fixed and perl options, strsplit, sub and gsub will give a marked UTF-8 result if any of the inputs are UTF-8.
paste and sprintf return elements marked as bytes if any of the corresponding inputs is marked as bytes, and otherwise marked as UTF-8 if any of the inputs is marked as UTF-8.
match, pmatch, charmatch, duplicated and unique all match in UTF-8 if any of the elements are marked as UTF-8.
Changing the current encoding from a running R session may lead to confusion (see Sys.setlocale).
There is some ambiguity as to what is meant by a ‘Latin-1’ locale, since some OSes (notably Windows) make use of character positions undefined (or used for control characters) in the ISO 8859-1 character set. How such characters are interpreted is system-dependent but as from R 3.5.0 they are if possible interpreted as per Windows codepage 1252 (which Microsoft calls ‘Windows Latin 1 (ANSI)’) when converting to e.g. UTF-8.
A character vector.
For enc2utf8 encodings are always marked: they are for enc2native in UTF-8 and Latin-1 locales.
## x is intended to be in latin1
x. <- x <- "fran\xE7ais"
Encoding(x.) # "unknown" (UTF-8 loc.) | "latin1" (8859-1/CP-1252 loc.) | ....
Encoding(x) <- "latin1"
x
xx <- iconv(x, "latin1", "UTF-8")
Encoding(c(x., x, xx))
c(x, xx)
xb <- xx; Encoding(xb) <- "bytes"
xb # will be encoded in hex
cat("x = ", x, ", xx = ", xx, ", xb = ", xb, "\n", sep = "")
(Ex <- Encoding(c(x.,x,xx,xb)))
stopifnot(identical(Ex, c(Encoding(x.), Encoding(x),
                          Encoding(xx), Encoding(xb))))
Get, set, test for and create environments.
environment(fun = NULL)
environment(fun) <- value

is.environment(x)

.GlobalEnv
globalenv()
.BaseNamespaceEnv

emptyenv()
baseenv()

new.env(hash = TRUE, parent = parent.frame(), size = 29L)

parent.env(env)
parent.env(env) <- value

environmentName(env)

env.profile(env)
fun | |
value | an environment to associate with the function. |
x | an arbitraryR object. |
hash | a logical, if |
parent | an environment to be used as the enclosure of the environment created. |
env | an environment. |
size | an integer specifying the initial size for a hashedenvironment. An internal default value will be used if |
Environments consist of a frame, or collection of named objects, and a pointer to an enclosing environment. The most common example is the frame of variables local to a function call; its enclosure is the environment where the function was defined (unless changed subsequently). The enclosing environment is distinguished from the parent frame: the latter (returned by parent.frame) refers to the environment of the caller of a function. Since confusion is so easy, it is best never to use ‘parent’ in connection with an environment (despite the presence of the function parent.env).
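The distinction between the enclosing environment and the parent frame can be made concrete with a short sketch. The function names show_envs and caller_fun, and the variable marker, are hypothetical names used only for this illustration.

```r
show_envs <- function() {
  enclosure <- parent.env(environment())  # where show_envs was defined
  caller    <- parent.frame()             # frame of whoever called show_envs
  list(enclosure = enclosure, caller = caller)
}

caller_fun <- function() {
  marker <- "local to caller_fun"
  show_envs()
}

res <- caller_fun()

# The enclosure is the environment attached to the function itself ...
identical(res$enclosure, environment(show_envs))
# ... whereas the parent frame is the evaluation frame of caller_fun.
get("marker", envir = res$caller, inherits = FALSE)
```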
When get or exists search an environment with the default inherits = TRUE, they look for the variable in the frame, then in the enclosing frame, and so on.
The global environment .GlobalEnv, more often known as the user's workspace, is the first item on the search path. It can also be accessed by globalenv(). On the search path, each item's enclosure is the next item.
The object .BaseNamespaceEnv is the namespace environment for the base package. The environment of the base package itself is available as baseenv().
If one follows the chain of enclosures found by repeatedly calling parent.env from any environment, eventually one reaches the empty environment emptyenv(), into which nothing may be assigned.
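Following the chain of enclosures can be sketched with a small helper. The function env_chain is a hypothetical name introduced here for illustration; it collects the name of each environment until the empty environment is reached.

```r
env_chain <- function(env) {
  nms <- character()
  # Walk up the enclosures until the empty environment is reached.
  while (!identical(env, emptyenv())) {
    nms <- c(nms, environmentName(env))
    env <- parent.env(env)
  }
  c(nms, environmentName(emptyenv()))
}

chain <- env_chain(new.env(parent = globalenv()))
chain
```

The first entry is "" because the freshly created environment has no name; the intermediate entries depend on which packages are attached in the session.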
The replacement function parent.env<- is extremely dangerous as it can be used to destructively change environments in ways that violate assumptions made by the internal C code. It may be removed in the near future.
The replacement form of environment, is.environment, baseenv, emptyenv and globalenv are primitive functions.
System environments, such as the base, global and empty environments, have names as do the package and namespace environments and those generated by attach(). Other environments can be named by giving a "name" attribute, but this needs to be done with care as environments have unusual copying semantics.
If fun is a function or a formula then environment(fun) returns the environment associated with that function or formula. If fun is NULL then the current evaluation environment is returned.
The replacement form sets the environment of the function or formula fun to the value given.
is.environment(obj) returns TRUE if and only if obj is an environment.
new.env returns a new (empty) environment with (by default) enclosure the parent frame.
parent.env returns the enclosing environment of its argument.
parent.env<- sets the enclosing environment of its first argument.
environmentName returns a character string, that given when the environment is printed or "" if it is not a named environment.
env.profile returns a list with the following components: size the number of chains that can be stored in the hash table, nchains the number of non-empty chains in the table (as reported by HASHPRI), and counts an integer vector giving the length of each chain (zero for empty chains). This function is intended to assess the performance of hashed environments. When env is a non-hashed environment, NULL is returned.
For the performance implications of hashing or not, see https://en.wikipedia.org/wiki/Hash_table.
The envir argument of eval, get, and exists.
ls may be used to view the objects in an environment, and hence ls.str may be useful for an overview.
sys.source can be used to populate an environment.
f <- function() "top level function"

##-- all three give the same:
environment()
environment(f)
.GlobalEnv

ls(envir = environment(stats::approxfun(1:2, 1:2, method = "const")))

is.environment(.GlobalEnv) # TRUE

e1 <- new.env(parent = baseenv())  # this one has enclosure package:base.
e2 <- new.env(parent = e1)
assign("a", 3, envir = e1)
ls(e1)
ls(e2)
exists("a", envir = e2)   # this succeeds by inheritance
exists("a", envir = e2, inherits = FALSE)
exists("+", envir = e2)   # this succeeds by inheritance

eh <- new.env(hash = TRUE, size = NA)
with(env.profile(eh), stopifnot(size == length(counts)))
Details of some of the environment variables which affect an R session.
It is impossible to list all the environment variables which can affect an R session: some affect the OS system functions which R uses, and others will affect add-on packages. But here are notes on some of the more important ones. Those that set the defaults for options are consulted only at startup (as are some of the others).
The user's ‘home’ directory.
Optional. The language(s) to be used for message translations. This is consulted when needed.
(etc) Optional. Use to set various aspects of the locale – see Sys.getlocale. Consulted at startup.
The path to makeindex. If unset, defaults to a value determined when R was built. Used by the emulation mode of texi2dvi and texi2pdf.
Optional – set in a batch session, that is one started by R CMD BATCH. Most often set to "", so test by something like !is.na(Sys.getenv("R_BATCH", NA)).
The path to the default browser. Used to set the default value of options("browser").
Optional. If set to FALSE, command-line completion is not used. (Not used by the macOS GUI.)
A comma-separated list of packages which are to be attached in every session. See options.
The location of the R ‘doc’ directory. Set by R.
Optional. The path to the site environment file: see Startup. Consulted at startup.
Optional. The path to Ghostscript, used by dev2bitmap, bitmap and embedFonts. Consulted when those functions are invoked. Since it will be treated as if passed to system, spaces and shell metacharacters should be escaped.
Optional. The path of the history file: see Startup. Consulted at startup and when the history is saved.
Optional. The maximum size of the history file, in lines. Exactly how this is used depends on the interface.
for the readline command-line interface it takes effect when the history is saved (by savehistory or at the end of a session).
for Rgui it controls the number of lines saved to the history file: the size of the history used in the session is controlled by the console customization: see Rconsole.
The top-level directory of the R installation: see R.home. Set by R.
The location of the R ‘include’ directory. Set by R.
Optional. Used for initial setting of .libPaths.
Optional. Used for initial setting of .libPaths.
Optional. Used for initial setting of .libPaths.
Optional. Used to set the default for options("papersize"), e.g. used by pdf and postscript.
Optional. Consulted when PCRE's JIT pattern compiler is first used. See grep.
The path to the default PDF viewer. Used by R CMD Rd2pdf.
The platform – a string of the form "cpu-vendor-os", see R.Version.
Optional. The path to the site profile file: see Startup. Consulted at startup.
Options for pdflatex processing of Rd files. Used by R CMD Rd2pdf.
The location of the R ‘share’ directory. Set by R.
The path to texi2dvi. Defaults to the value of TEXI2DVI, and if that is unset to a value determined when R was built.
Only on Unix-alikes:
Consulted at startup to set the default for options("texi2dvi"), used by texi2dvi and texi2pdf in package tools.
The path to HTML tidy. Used by R CMD check if _R_CHECK_RD_VALIDATE_RD2HTML_ is set to a true value (as it is by --as-cran).
The path to unzip. Sets the initial value for options("unzip") on a Unix-alike when namespace utils is loaded.
The path to zip. Used by zip and by R CMD INSTALL --build on Windows.
Consulted (in that order) when setting the temporary directory for the session: see tempdir. TMPDIR is also used by some of the utilities: see the help for build.
Optional. The current time zone. See Sys.timezone for the system-specific formats. Consulted as needed.
Optional. The top-level directory of the time-zone database. See Sys.timezone.
(and more). Optional. Settings for download.file: see its help for further details.
Some variables set on Unix-alikes, and not (in general) on Windows.
Optional: used by X11, Tk (in package tcltk), the data editor and various packages.
The path to the default editor: sets the default for options("editor") when namespace utils is loaded.
The path to the pager with the default setting of options("pager"). The default value is chosen at configuration, usually as the path to less.
Sets the default for options("printcmd"), which sets the default print command to be used by postscript.
logical. Sets the default for the support_old_tars argument of untar. Should be set to TRUE if an old system tar command is used which does not support either xz compression or automagically detecting compression type.
Some Windows-specific variables are
Optional: the path to Ghostscript, used if R_GSCMD is not set.
The user's ‘home’ directory. Set by R. (HOME will be set to the same value if not already set.)
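As an informal sketch of how such variables can be inspected from within a session, the code below reads a few of them with Sys.getenv. It assumes the usual variable names R_HOME, R_BATCH, TMPDIR, TMP and TEMP; the variable MY_DEMO_VAR is a made-up name used only to show setting and unsetting.

```r
# The top-level directory of the R installation, as set by R itself.
r_home <- Sys.getenv("R_HOME")

# R_BATCH is often set to "", so test for presence rather than content.
in_batch <- !is.na(Sys.getenv("R_BATCH", NA))

# Variables consulted when choosing the session's temporary directory.
tmp_vars <- Sys.getenv(c("TMPDIR", "TMP", "TEMP"), unset = NA)

# Setting and removing a (hypothetical) variable for the current session.
Sys.setenv(MY_DEMO_VAR = "1")
demo_value <- Sys.getenv("MY_DEMO_VAR")
Sys.unsetenv("MY_DEMO_VAR")

r_home; in_batch; tmp_vars; demo_value
```

Variables that are consulted only at startup are not affected by later calls to Sys.setenv in the same session.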
Sys.getenv and Sys.setenv to read and set environmental variables in an R session.
gctorture for environment variables controlling garbage collection.
Evaluate an R expression in a specified environment.
eval(expr, envir = parent.frame(),
           enclos = if(is.list(envir) || is.pairlist(envir))
                       parent.frame() else baseenv())
evalq(expr, envir, enclos)
eval.parent(expr, n = 1)
local(expr, envir = new.env())
expr | an object to be evaluated. See ‘Details’. |
envir | the |
enclos | relevant when |
n | number of parent generations to go back. |
eval evaluates the expr argument in the environment specified by envir and returns the computed value. If envir is not specified, then the default is parent.frame() (the environment where the call to eval was made).
Objects to be evaluated can be of types call or expression or name (when the name is looked up in the current scope and its binding is evaluated), a promise or any of the basic types such as vectors, functions and environments (which are returned unchanged).
The evalq form is equivalent to eval(quote(expr), ...). eval evaluates its first argument in the current scope before passing it to the evaluator: evalq avoids this.
eval.parent(expr, n) is a shorthand for eval(expr, parent.frame(n)).
If envir is a list (such as a data frame) or pairlist, it is copied into a temporary environment (with enclosure enclos), and the temporary environment is used for evaluation. So if expr changes any of the components named in the (pair)list, the changes are lost.
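That changes made during evaluation in a list are lost can be seen in a minimal sketch. The data frame df and its column a are arbitrary names chosen for illustration.

```r
df <- data.frame(a = 1:3)

# The assignment happens in a temporary environment built from df ...
doubled <- eval(quote(a <- a * 2), df)

# ... so the value is returned, but df itself is untouched.
doubled
df$a
```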
If envir is NULL it is interpreted as an empty list so no values could be found in envir and look-up goes directly to enclos.
local evaluates an expression in a local environment. It is equivalent to evalq except that its default argument creates a new, empty environment. This is useful to create anonymous recursive functions and as a kind of limited namespace feature since variables defined in the environment are not visible from the outside.
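One common use of local as a limited namespace is to give a function private state. The sketch below is an illustration only; the names counter and n are arbitrary.

```r
counter <- local({
  n <- 0                    # private to the environment created by local()
  function() {
    n <<- n + 1             # updates n in the enclosing local environment
    n
  }
})

first  <- counter()
second <- counter()
c(first, second)

# n lives in the function's enclosure, not in the calling environment.
ls(environment(counter))
```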
The result of evaluating the object: for an expression vector this is the result of evaluating the last element.
Due to the difference in scoping rules, there are some differences between R and S in this area. In particular, the default enclosure in S is the global environment.
When evaluating expressions in a data frame that has been passed as an argument to a function, the relevant enclosure is often the caller's environment, i.e., one needs eval(x, data, parent.frame()).
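The pattern eval(x, data, parent.frame()) might be used as in the following sketch. The helper eval_in_data and the variable threshold are hypothetical names for illustration; the point is that names not found in the data frame are looked up in the caller's environment.

```r
eval_in_data <- function(data, expr) {
  e <- substitute(expr)             # capture the unevaluated expression
  eval(e, data, parent.frame())     # columns first, then the caller's variables
}

df <- data.frame(a = 1:4)

threshold <- 2
res_top <- eval_in_data(df, sum(a[a > threshold]))

wrapper <- function() {
  threshold <- 3                    # a different value, local to wrapper
  eval_in_data(df, sum(a[a > threshold]))
}
res_wrapped <- wrapper()

c(res_top, res_wrapped)
```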
Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988) The New S Language. Wadsworth & Brooks/Cole. (eval only.)
expression, quote, sys.frame, parent.frame, environment.
Further, force to force evaluation, typically of function arguments.
eval(2 ^ 2 ^ 3)
mEx <- expression(2^2^3); mEx; 1 + eval(mEx)
eval({ xx <- pi; xx^2}) ; xx

a <- 3 ; aa <- 4 ; evalq(evalq(a+b+aa, list(a = 1)), list(b = 5)) # == 10
a <- 3 ; aa <- 4 ; evalq(evalq(a+b+aa, -1), list(b = 5))        # == 12

ev <- function() {
   e1 <- parent.frame()
   ## Evaluate a in e1
   aa <- eval(expression(a), e1)
   ## evaluate the expression bound to a in e1
   a <- expression(x+y)
   list(aa = aa, eval = eval(a, e1))
}
tst.ev <- function(a = 7) { x <- pi; y <- 1; ev() }
tst.ev()  #-> aa : 7,  eval : 4.14

a <- list(a = 3, b = 4)
with(a, a <- 5) # alters the copy of a from the list, discarded.

##
## Example of evalq()
##

N <- 3
env <- new.env()
assign("N", 27, envir = env)
## this version changes the visible copy of N only, since the argument
## passed to eval is '4'.
eval(N <- 4, env)
N
get("N", envir = env)
## this version does the assignment in env, and changes N only there.
evalq(N <- 5, env)
N
get("N", envir = env)

##
## Uses of local()
##

# Mutually recursive.
# gg gets value of last assignment, an anonymous version of f.

gg <- local({
    k <- function(y)f(y)
    f <- function(x) if(x) x*k(x-1) else 1
})
gg(10)
sapply(1:5, gg)

# Nesting locals: a is private storage accessible to k
gg <- local({
    k <- local({
        a <- 1
        function(y){print(a <<- a+1);f(y)}
    })
    f <- function(x) if(x) x*k(x-1) else 1
})
sapply(1:5, gg)

ls(envir = environment(gg))
ls(envir = environment(get("k", envir = environment(gg))))
Look for an R object of the given name and possibly return it.
exists(x, where = -1, envir = , frame, mode = "any", inherits = TRUE)
get0(x, envir = pos.to.env(-1L), mode = "any", inherits = TRUE, ifnotfound = NULL)
x | a variable name (given as a character string or a symbol). |
where | where to look for the object (see the details section); if omitted, the function will search as if the name of the object appeared unquoted in an expression. |
envir | an alternative way to specify an environment to look in, but it is usually simpler to just use the |
frame | a frame in the calling list. Equivalent to giving |
mode | the mode or type of object sought: see the ‘Details’ section. |
inherits | should the enclosing frames of the environment be searched? |
ifnotfound | the return value of |
The where argument can specify the environment in which to look for the object in any of several ways: as an integer (the position in the search list); as the character string name of an element in the search list; or as an environment (including using sys.frame to access the currently active function calls). The envir argument is an alternative way to specify an environment, but is primarily there for back compatibility.
This function looks to see if the name x has a value bound to it in the specified environment. If inherits is TRUE and a value is not found for x in the specified environment, the enclosing frames of the environment are searched until the name x is encountered. See environment and the ‘R Language Definition’ manual for details about the structure of environments and their enclosures.
Warning: inherits = TRUE is the default behaviour for R but not for S.
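The effect of inherits can be seen with a small sketch. The environments outer_env and inner_env and the variable name threshold are hypothetical, introduced only for illustration.

```r
# Hypothetical environments, named here for illustration only.
outer_env <- new.env()
inner_env <- new.env(parent = outer_env)
assign("threshold", 0.5, envir = outer_env)

# Found through the enclosure chain (inherits = TRUE is the default).
found_inherited <- exists("threshold", envir = inner_env)

# Not found when only inner_env itself is inspected.
found_local <- exists("threshold", envir = inner_env, inherits = FALSE)

found_inherited
found_local
```

Passing inherits = FALSE is usually the safer choice when the question is whether a specific environment holds the binding.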
If mode is specified then only objects of that type are sought. The mode may specify one of the collections "numeric" and "function" (see mode): any member of the collection will suffice. (This is true even if a member of a collection is specified, so for example mode = "special" will seek any type of function.)
exists():
Logical, true if and only if an object of the correct name and mode is found.
get0():
The object (as from get(x, *)) if exists(x, *) is true, otherwise ifnotfound.
With get0(), instead of the easy to read but somewhat inefficient

  if (exists(myVarName, envir = myEnvir)) {
    r <- get(myVarName, envir = myEnvir)
    ## ... deal with r ...
  }

you now can use the more efficient (and slightly harder to read)

  if (!is.null(r <- get0(myVarName, envir = myEnvir))) {
    ## ... deal with r ...
  }
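As a minimal sketch of the get0() idiom, the following uses ifnotfound to supply a default. The environment settings and the names verbose, colour and logfile are assumptions made up for this example. It also shows one caveat of the is.null() test: a binding whose value is NULL looks the same as a missing binding.

```r
settings <- new.env()                       # hypothetical environment
assign("verbose", TRUE, envir = settings)
assign("logfile", NULL, envir = settings)   # bound, but to NULL

# Existing binding: its value is returned.
verbose <- get0("verbose", envir = settings, inherits = FALSE,
                ifnotfound = FALSE)

# Missing binding: the ifnotfound value is returned instead.
colour <- get0("colour", envir = settings, inherits = FALSE,
               ifnotfound = "black")

# A stored NULL cannot be told apart from "not found" via is.null() ...
stored_null <- get0("logfile", envir = settings, inherits = FALSE)
# ... whereas exists() still reports the binding.
is_bound <- exists("logfile", envir = settings, inherits = FALSE)
```

When NULL is a legitimate stored value, a distinctive ifnotfound value or an explicit exists() call avoids the ambiguity.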
Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988) The New S Language. Wadsworth & Brooks/Cole.
get and hasName. For quite a different kind of “existence” checking, namely if function arguments were specified, missing; and for yet a different kind, namely if a file exists, file.exists.
## Define a substitute function if necessary:
if(!exists("some.fun", mode = "function"))
  some.fun <- function(x) { cat("some.fun(x)\n"); x }
search()
exists("ls", 2) # true even though ls is in pos = 3
exists("ls", 2, inherits = FALSE) # false

## These are true (in most circumstances):
identical(ls, get0("ls"))
identical(NULL, get0(".foo.bar.")) # default ifnotfound = NULL (!)
Create a data frame from all combinations of the supplied vectors or factors. See the description of the return value for precise details of the way this is done.
expand.grid(..., KEEP.OUT.ATTRS = TRUE, stringsAsFactors = TRUE)
... | vectors, factors or a list containing these. |
KEEP.OUT.ATTRS | a logical indicating the |
stringsAsFactors | logical specifying if character vectors are converted to factors. |
A data frame containing one row for each combination of the supplied factors. The first factors vary fastest. The columns are labelled by the factors if these are supplied as named arguments or named components of a list. The row names are ‘automatic’.
Attribute "out.attrs" is a list which gives the dimension and dimnames for use by predict methods.
Conversion to a factor is done with levels in the order they occur in the character vectors (and not alphabetically, as is most common when converting to factors).
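A small sketch of the two points above: the first factor varies fastest, and factor levels follow order of occurrence. The column names size and rep are arbitrary choices for this example.

```r
g <- expand.grid(size = c("small", "large"), rep = 1:3)

dim(g)           # one row per combination: 2 * 3 rows, 2 columns
g[1:3, ]         # 'size' cycles before 'rep' advances
levels(g$size)   # order of occurrence, not alphabetical
```

Because "small" appears first in the input, it becomes the first level, whereas factor() applied to the same strings would sort the levels alphabetically.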
Chambers, J. M. and Hastie, T. J. (1992) Statistical Models in S. Wadsworth & Brooks/Cole.
combn (package utils) for the generation of all combinations of n elements, taken m at a time.
require(utils)

expand.grid(height = seq(60, 80, 5), weight = seq(100, 300, 50),
            sex = c("Male","Female"))

x <- seq(0, 10, length.out = 100)
y <- seq(-1, 1, length.out = 20)
d1 <- expand.grid(x = x, y = y)
d2 <- expand.grid(x = x, y = y, KEEP.OUT.ATTRS = FALSE)
object.size(d1) - object.size(d2)
##-> 5992 or 8832 (on 32- / 64-bit platform)
Creates or tests for objects of mode and class "expression".
expression(...)
is.expression(x)
as.expression(x, ...)
... |
|
x | an arbitrary R object. |
‘Expression’ here is not being used in its colloquial sense, that of mathematical expressions. Those are calls (see call) in R, and an R expression vector is a list of calls, symbols etc, for example as returned by parse.
As an object of mode "expression" is a list, it can be subsetted by [, [[ or $, the latter two extracting individual calls etc. The replacement forms of these operators can be used to replace or delete elements.
expression and is.expression are primitive functions. expression is ‘special’: it does not evaluate its arguments.
expression returns a vector of type "expression" containing its arguments (unevaluated).
is.expression returns TRUE if expr is an expression object and FALSE otherwise.
as.expression attempts to coerce its argument into an expression object. It is generic, and only the default method is described here. (The default method calls as.vector(type = "expression") and so may dispatch methods for as.vector.) NULL, calls, symbols (see as.symbol) and pairlists are returned as the element of a length-one expression vector. Atomic vectors are placed element-by-element into an expression vector (without using any names): lists have their type (typeof) changed to an expression vector (keeping all attributes). Other types are not currently supported.
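The subsetting and coercion rules described above can be sketched as follows. The expressions and the values bound to a and b are arbitrary illustrations.

```r
ex <- expression(a + b, sin(x))

length(ex)       # two unevaluated elements
class(ex[1])     # "[" keeps an expression vector
class(ex[[1]])   # "[[" extracts the call itself

# Evaluate one element with values supplied in a list.
val <- eval(ex[[1]], list(a = 1, b = 2))

# Atomic vectors are coerced element by element; a call becomes one element.
n_atomic <- length(as.expression(1:3))
n_call   <- length(as.expression(quote(x + y)))
```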
Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988) The New S Language. Wadsworth & Brooks/Cole.
call, eval, function. Further, text, legend, and plotmath for plotting mathematical expressions.
length(ex1 <- expression(1 + 0:9)) # 1
ex1
eval(ex1) # 1:10

length(ex3 <- expression(u, 2, u + 0:9)) # 3
mode(ex3 [3])   # expression
mode(ex3[[3]])  # call
## but not all components are 'call's :
sapply(ex3, mode  ) #  name  numeric  call
sapply(ex3, typeof) # symbol double language
rm(ex3)
Operators acting on vectors, matrices, arrays and lists to extract or replace parts.
x[i]
x[i, j, ... , drop = TRUE]
x[[i, exact = TRUE]]
x[[i, j, ..., exact = TRUE]]
x$name
getElement(object, name)

x[i] <- value
x[i, j, ...] <- value
x[[i]] <- value
x$name <- value
x ,object | object from which to extract element(s) or in which to replace element(s). |
i ,j ,... | indices specifying elements to extract or replace. Indices are For When indexing arrays by An index value of |
name | a literal character string or a name (possibly backtick quoted). For extraction, this is normally (see under ‘Environments’) partially matched to the |
drop | relevant for matrices and arrays. If |
exact | controls possible partial matching of |
value | typically an array-like R object of a similar class as |
These operators are generic. You can write methods to handle indexing of specific classes of objects, see InternalMethods as well as [.data.frame and [.factor. The descriptions here apply only to the default methods. Note that separate methods are required for the replacement functions [<-, [[<- and $<- for use when indexing occurs on the assignment side of an expression.
The most important distinction between [, [[ and $ is that the [ can select more than one element whereas the other two select a single element.
Note that x[[]] is always erroneous.
The default methods work somewhat differently for atomic vectors, matrices/arrays and for recursive (list-like, see is.recursive) objects. $ is only valid for recursive objects (and NULL), and is only discussed in the section below on recursive objects.
Subsetting (except by an empty index) will drop all attributes except names, dim and dimnames.
Indexing can occur on the right-hand-side of an expression for extraction, or on the left-hand-side for replacement. When an index expression appears on the left side of an assignment (known as subassignment) then that part of x is set to the value of the right hand side of the assignment. In this case no partial matching of character indices is done, and the left-hand-side is coerced as needed to accept the values. For vectors, the answer will be of the higher of the types of x and value in the hierarchy raw < logical < integer < double < complex < character < list < expression. Attributes are preserved (although names, dim and dimnames will be adjusted suitably). Subassignment is done sequentially, so if an index is specified more than once the latest assigned value for an index will result.
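A brief sketch of the coercion and sequential-assignment rules for subassignment described above. The vectors v and w are arbitrary examples.

```r
v <- 1:3
typeof(v)        # "integer"

v[2] <- 2.5      # value is double, so v is promoted to double
typeof(v)

v[3] <- "c"      # value is character, so v is promoted again
typeof(v)
v

# Repeated indices are assigned in order, so the last value wins.
w <- c(a = 1, b = 2)
w[c(1, 1)] <- c(10, 20)
w
```

Promotion only ever moves up the hierarchy, so assigning a double into a character vector converts the value rather than the vector.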
It is an error to apply any of these operators to an object which is not subsettable (e.g., a function).
The usual form of indexing is [. [[ can be used to select a single element dropping names, whereas [ keeps them, e.g., in c(abc = 123)[1].
The index object i can be numeric, logical, character or empty. Indexing by factors is allowed and is equivalent to indexing by the numeric codes (see factor) and not by the character values which are printed (for which use [as.character(i)]).
An empty index selects all values: this is most often used to replace all the entries but keep the attributes.
Matrices and arrays are vectors with a dimension attribute and so all the vector forms of indexing can be used with a single index. The result will be an unnamed vector unless x is one-dimensional when it will be a one-dimensional array.
The most common form of indexing a k-dimensional array is to specify k indices to [. As for vector indexing, the indices can be numeric, logical, character, empty or even factor. And again, indexing by factors is equivalent to indexing by the numeric codes, see ‘Atomic vectors’ above.
An empty index (a comma separated blank) indicates that all entries in that dimension are selected. The argument drop applies to this form of indexing.
A third form of indexing is via a numeric matrix with one column for each dimension: each row of the index matrix then selects a single element of the array, and the result is a vector. Negative indices are not allowed in the index matrix. NA and zero values are allowed: rows of an index matrix containing a zero are ignored, whereas rows containing an NA produce an NA in the result.
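The treatment of zero and NA rows in a numeric index matrix can be sketched like this. The matrix m and the index matrix idx are made up for the illustration.

```r
m <- matrix(1:6, nrow = 2)   # column-major fill: m[1, ] is 1 3 5, m[2, ] is 2 4 6

# One (row, column) pair per row of the index matrix.
idx <- rbind(c(1, 1),    # selects m[1, 1]
             c(0, 2),    # contains a zero: this row is ignored
             c(NA, 3),   # contains an NA: yields NA in the result
             c(2, 3))    # selects m[2, 3]

picked <- m[idx]
picked
```

Four index rows therefore give a result of length three, because the zero row contributes nothing.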
Indexing via a character matrix with one column per dimension is also supported if the array has dimension names. As with numeric matrix indexing, each row of the index matrix selects a single element of the array. Indices are matched against the appropriate dimension names. NA is allowed and will produce an NA in the result. Unmatched indices as well as the empty string ("") are not allowed and will result in an error.
A vector obtained by matrix indexing will be unnamed unless x is one-dimensional when the row names (if any) will be indexed to provide names for the result.
Indexing by [ is similar to atomic vectors and selects a list of the specified element(s).
Both [[ and $ select a single element of the list. The main difference is that $ does not allow computed indices, whereas [[ does. x$name is equivalent to x[["name", exact = FALSE]]. Also, the partial matching behavior of [[ can be controlled using the exact argument.
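A minimal sketch of the difference in partial matching between $ and [[. The list cfg and its element name tolerance are hypothetical.

```r
cfg <- list(tolerance = 1e-8)   # hypothetical settings list

via_dollar  <- cfg$tol                      # partial match on "tolerance"
via_exact   <- cfg[["tol"]]                 # exact matching by default: NULL
via_partial <- cfg[["tol", exact = FALSE]]  # partial matching requested

via_dollar
via_exact
via_partial
```

Since [[ also accepts a computed index, such as a name held in a variable, it is usually the more predictable choice inside functions.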
getElement(x, name) is a version of x[[name, exact = TRUE]] which for formally classed (S4) objects returns slot(x, name), hence providing access to even more general list-like objects.
[ and [[ are sometimes applied to other recursive objects such as calls and expressions. Pairlists (such as calls) are coerced to lists for extraction by [, but all three operators can be used for replacement.
[[ can be applied recursively to lists, so that if the single index i is a vector of length p, alist[[i]] is equivalent to alist[[i1]]...[[ip]] providing all but the final indexing results in a list.
Note that in all three kinds of replacement, a value of NULL deletes the corresponding item of the list. To set entries to NULL, you need x[i] <- list(NULL).
When $<- is applied to a NULL x, it first coerces x to list(). This is what also happens with [[<- where in R versions less than 4.y.z, a length one value resulted in a length one (atomic) vector.
Both $ and [[ can be applied to environments. Only character indices are allowed and no partial matching is done. The semantics of these operations are those of get(i, env = x, inherits = FALSE). If no match is found then NULL is returned. The replacement versions, $<- and [[<-, can also be used. Again, only character arguments are allowed. The semantics in this case are those of assign(i, value, env = x, inherits = FALSE). Such an assignment will either create a new binding or change the existing binding in x.
When extracting, a numerical, logical or character NA index picks an unknown element and so returns NA in the corresponding element of a logical, integer, numeric, complex or character result, and NULL for a list. (It returns 00 for a raw result.)
When replacing (that is using indexing on the lhs of an assignment) NA does not select any element to be replaced. As there is ambiguity as to whether an element of the rhs should be used or not, this is only allowed if the rhs value is of length one (so the two interpretations would have the same outcome). (The documented behaviour of S was that an NA replacement index ‘goes nowhere’ but uses up an element of value: Becker et al. p. 359. However, that has not been true of other implementations.)
Note that these operations do not match their index arguments in the standard way: argument names are ignored and positional matching only is used. So m[j = 2, i = 1] is equivalent to m[2, 1] and not to m[1, 2].
This may not be true for methods defined for them; for example it is not true for the data.frame methods described in [.data.frame which warn if i or j is named and have undocumented behaviour in that case.
To avoid confusion, do not name index arguments (but drop and exact must be named).
These operators are also implicit S4 generics, but as primitives, S4 methods will be dispatched only on S4 objects x.
The implicit generics for the $ and $<- operators do not have name in their signature because the grammar only allows symbols or string constants for the name argument.
Character indices can in some circumstances be partially matched (see pmatch) to the names or dimnames of the object being subsetted (but never for subassignment). Unlike S (Becker et al. p. 358), R never uses partial matching when extracting by [, and partial matching is not by default used by [[ (see argument exact).
Thus the default behaviour is to use partial matching only when extracting from recursive objects (except environments) by $. Even in that case, warnings can be switched on by options(warnPartialMatchDollar = TRUE).
Neither empty ("") nor NA indices match any names, not even empty nor missing names. If any object has no names or appropriate dimnames, they are taken as all "" and so match nothing.
Attempting to apply a subsetting operation to objects for which this is not possible signals an error of class notSubsettableError. The object component of the error condition contains the non-subsettable object.
Subscript out of bounds errors are signaled as errors of class subscriptOutOfBoundsError. The object component of the error condition contains the object being subsetted. The integer subscript component is zero for vector subscripting, and for multiple subscripts indicates which subscript was out of bounds. The index component contains the erroneous index.
Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988) The New S Language. Wadsworth & Brooks/Cole.
names for details of matching to names, and pmatch for partial matching.
[.data.frame and [.factor for the behaviour when applied to data.frame and factors.
Syntax for operator precedence, and the ‘R Language Definition’ manual about indexing details.
NULL for details of indexing null objects.
x <- 1:12
m <- matrix(1:6, nrow = 2, dimnames = list(c("a", "b"), LETTERS[1:3]))
li <- list(pi = pi, e = exp(1))
x[10]                 # the tenth element of x
x <- x[-1]            # delete the 1st element of x
m[1,]                 # the first row of matrix m
m[1, , drop = FALSE]  # is a 1-row matrix
m[,c(TRUE,FALSE,TRUE)]# logical indexing
m[cbind(c(1,2,1),3:1)]# matrix numeric index
ci <- cbind(c("a", "b", "a"), c("A", "C", "B"))
m[ci]                 # matrix character index
m <- m[,-1]           # delete the first column of m
li[[1]]               # the first element of list li
y <- list(1, 2, a = 4, 5)
y[c(3, 4)]            # a list containing elements 3 and 4 of y
y$a                   # the element of y named a

## non-integer indices are truncated:
(i <- 3.999999999) # "4" is printed
(1:5)[i]  # 3

## named atomic vectors, compare "[" and "[[" :
nx <- c(Abc = 123, pi = pi)
nx[1] ; nx["pi"] # keeps names, whereas "[[" does not:
nx[[1]] ; nx[["pi"]]

## recursive indexing into lists
z <- list(a = list(b = 9, c = "hello"), d = 1:5)
unlist(z)
z[[c(1, 2)]]
z[[c(1, 2, 1)]]  # both "hello"
z[[c("a", "b")]] <- "new"
unlist(z)

## check $ and [[ for environments
e1 <- new.env()
e1$a <- 10
e1[["a"]]
e1[["b"]] <- 20
e1$b
ls(e1)

## partial matching - possibly with warning :
stopifnot(identical(li$p, pi))
op <- options(warnPartialMatchDollar = TRUE)
stopifnot( identical(li$p, pi), #-- a warning
  inherits(tryCatch (li$p, warning = identity), "warning"))
## revert the warning option:
options(op)
Extract or replace subsets of data frames.
## S3 method for class 'data.frame'
x[i, j, drop = ]
## S3 replacement method for class 'data.frame'
x[i, j] <- value
## S3 method for class 'data.frame'
x[[..., exact = TRUE]]
## S3 replacement method for class 'data.frame'
x[[i, j]] <- value
## S3 replacement method for class 'data.frame'
x$name <- value
x | data frame. |
i ,j ,... | elements to extract or replace. For |
name | a literal character string or a name (possibly backtick quoted). |
drop | logical. If |
value | a suitable replacement value: it will be repeated a whole number of times if necessary and it may be coerced: see the Coercion section. If |
exact | logical: see |
Data frames can be indexed in several modes. When [ and [[ are used with a single vector index (x[i] or x[[i]]), they index the data frame as if it were a list. In this usage a drop argument is ignored, with a warning.
There is no data.frame method for $, so x$name uses the default method which treats x as a list (with partial matching of column names if the match is unique, see Extract). The replacement method (for $) checks value for the correct number of rows, and replicates it if necessary.
When [ and [[ are used with two indices (x[i, j] and x[[i, j]]) they act like indexing a matrix: [[ can only be used to select one element. Note that for each selected column, xj say, typically (if it is not matrix-like), the resulting column will be xj[i], and hence rely on the corresponding [ method, see the examples section.
If [ returns a data frame it will have unique (and non-missing) row names, if necessary transforming the row names using make.unique. Similarly, if columns are selected column names will be transformed to be unique if necessary (e.g., if columns are selected more than once, or if more than one column of a given name is selected if the data frame has duplicate column names).
When drop = TRUE, this is applied to the subsetting of any matrices contained in the data frame as well as to the data frame itself.
The replacement methods can be used to add whole column(s) by specifying non-existent column(s), in which case the column(s) are added at the right-hand edge of the data frame and numerical indices must be contiguous to existing indices. On the other hand, rows can be added at any row after the current last row, and the columns will be in-filled with missing values. Missing values in the indices are not allowed for replacement.
For [ the replacement value can be a list: each element of the list is used to replace (part of) one column, recycling the list as necessary. If columns specified by number are created, the names (if any) of the corresponding list elements are used to name the columns. If the replacement is not selecting rows, list values can contain NULL elements which will cause the corresponding columns to be deleted. (See the Examples.)
Matrix indexing (x[i] with a logical or a 2-column integer matrix i) using [ is not recommended. For extraction, x is first coerced to a matrix. For replacement, logical matrix indices must be of the same dimension as x. Replacements are done one column at a time, with multiple type coercions possibly taking place.
Both [ and [[ extraction methods partially match row names. By default neither partially match column names, but [[ will if exact = FALSE (and with a warning if exact = NA). If you want exact matching on row names use match, as in the examples.
For [ a data frame, list or a single column (the latter two only when dimensions have been dropped). If matrix indexing is used for extraction a vector results. If the result would be a data frame an error results if undefined columns are selected (as there is no general concept of a 'missing' column in a data frame). Otherwise if a single column is selected and this is undefined the result is NULL.
For [[ a column of the data frame or NULL (extraction with one index) or a length-one vector (extraction with two indices).
For $, a column of the data frame (or NULL).
For [<-, [[<- and $<-, a data frame.
The story over when replacement values are coerced is a complicated one, and one that has changed during R's development. This section is a guide only.
When [ and [[ are used to add or replace a whole column, no coercion takes place but value will be replicated (by calling the generic function rep) to the right length if an exact number of repeats can be used.
When [ is used with a logical matrix, each value is coerced to the type of the column into which it is to be placed.
When [ and [[ are used with two indices, the column will be coerced as necessary to accommodate the value.
Note that when the replacement value is an array (including a matrix) it is not treated as a series of columns (as data.frame and as.data.frame do) but inserted as a single column.
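The last point can be illustrated with a small sketch, in which the data frames df and wide and their column names are invented for the example. A matrix assigned as a replacement value becomes one matrix-valued column, whereas data.frame() spreads the same matrix over several columns.

```r
df <- data.frame(id = 1:2)
df$m <- matrix(1:4, nrow = 2)   # inserted as a single matrix column

ncol(df)          # 2: 'id' and the matrix column 'm'
is.matrix(df$m)
dim(df$m)

# data.frame() instead treats the matrix as a series of columns.
wide <- data.frame(id = 1:2, matrix(1:4, nrow = 2))
ncol(wide)        # 3
```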
The default behaviour when only one row is left is equivalent to specifying drop = FALSE. To drop from a data frame to a list, drop = TRUE has to be specified explicitly.
Arguments other than drop and exact should not be named: there is a warning if they are and the behaviour differs from the description here.
subset which is often easier for extraction, data.frame, Extract.
sw <- swiss[1:5, 1:4]  # select a manageable subset

sw[1:3]      # select columns
sw[, 1:3]    # same
sw[4:5, 1:3] # select rows and columns
sw[1]        # a one-column data frame
sw[, 1, drop = FALSE]  # the same
sw[, 1]      # a (unnamed) vector
sw[[1]]      # the same
sw$Fert      # the same (possibly w/ warning, see ?Extract)

sw[1,]       # a one-row data frame
sw[1,, drop = TRUE]  # a list

sw["C", ] # partially matches
sw[match("C", row.names(sw)), ] # no exact match
try(sw[, "Ferti"]) # column names must match exactly

sw[sw$Fertility > 90,] # logical indexing, see also ?subset
sw[c(1, 1:2), ]        # duplicate row, unique row names are created

sw[sw <= 6] <- 6  # logical matrix indexing
sw

## adding a column
sw["new1"] <- LETTERS[1:5]   # adds a character column
sw[["new2"]] <- letters[1:5] # ditto
sw[, "new3"] <- LETTERS[1:5] # ditto
sw$new4 <- 1:5
sapply(sw, class)
sw$new  # -> NULL: no unique partial match
sw$new4 <- NULL              # delete the column
sw
sw[6:8] <- list(letters[10:14], NULL, aa = 1:5)
# update col. 6, delete 7, append
sw

## matrices in a data frame
A <- data.frame(x = 1:3, y = I(matrix(4:9, 3, 2)),
                         z = I(matrix(letters[1:9], 3, 3)))
A[1:3, "y"] # a matrix
A[1:3, "z"] # a matrix
A[, "y"]    # a matrix
stopifnot(identical(colnames(A), c("x", "y", "z")), ncol(A) == 3L,
          identical(A[,"y"], A[1:3, "y"]),
          inherits (A[,"y"], "AsIs"))

## keeping special attributes: use a class with a
## "as.data.frame" and "[" method;
## "avector" := vector that keeps attributes.   Could provide a constructor
##  avector <- function(x) { class(x) <- c("avector", class(x)); x }
as.data.frame.avector <- as.data.frame.vector

`[.avector` <- function(x,i,...) {
  r <- NextMethod("[")
  mostattributes(r) <- attributes(x)
  r
}

d <- data.frame(i = 0:7, f = gl(2,4),
                u = structure(11:18, unit = "kg", class = "avector"))
str(d[2:4, -1]) # 'u' keeps its "unit"
Extract or replace subsets of factors.
## S3 method for class 'factor'
x[..., drop = FALSE]
## S3 method for class 'factor'
x[[...]]
## S3 replacement method for class 'factor'
x[...] <- value
## S3 replacement method for class 'factor'
x[[...]] <- value
x | a factor. |
... | a specification of indices – see |
drop | logical. If true, unused levels are dropped. |
value | character: a set of levels. Factor values are coerced to character. |
When unused levels are dropped the ordering of the remaining levels is preserved.
If value is not in levels(x), a missing value is assigned with a warning.
Any contrasts assigned to the factor are preserved unless drop = TRUE.
The [[ method supports argument exact.
A factor with the same set of levels as x unless drop = TRUE.
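A minimal sketch of the behaviour described in this section. The factor f and its levels are hypothetical.

```r
f <- factor(c("lo", "hi", "lo"), levels = c("lo", "mid", "hi"))

kept    <- levels(f[1:2])                # all three levels are retained
dropped <- levels(f[1:2, drop = TRUE])   # unused "mid" is dropped, order kept

# Replacing with a value that is not a level gives NA (with a warning).
g <- f
suppressWarnings(g[1] <- "zzz")
is.na(g[1])
```

Keeping the levels by default means that subsets of a factor stay comparable with the original, for example in tables.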
## following example(factor)
(ff <- factor(substring("statistics", 1:10, 1:10), levels = letters))
ff[, drop = TRUE]
factor(letters[7:10])[2:3, drop = TRUE]
Returns the (regular or parallel) maxima and minima of the input values.
pmax*() and pmin*() take one or more vectors as arguments, recycle them to common length and return a single vector giving the ‘parallel’ maxima (or minima) of the argument vectors.
max(..., na.rm = FALSE)
min(..., na.rm = FALSE)

pmax(..., na.rm = FALSE)
pmin(..., na.rm = FALSE)

pmax.int(..., na.rm = FALSE)
pmin.int(..., na.rm = FALSE)
... | numeric or character arguments (see Note). |
na.rm | a logical indicating whether missing values should be removed. |
max and min return the maximum or minimum of all the values present in their arguments, as integer if all are logical or integer, as double if all are numeric, and character otherwise.
If na.rm is FALSE an NA value in any of the arguments will cause a value of NA to be returned, otherwise NA values are ignored.
The minimum and maximum of a numeric empty set are +Inf and -Inf (in this order!) which ensures transitivity, e.g., min(x1, min(x2)) == min(x1, x2). For numeric x max(x) == -Inf and min(x) == +Inf whenever length(x) == 0 (after removing missing values if requested). However, pmax and pmin return NA if all the parallel elements are NA even for na.rm = TRUE.
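The NA and empty-set rules above can be sketched as follows. The vector x is an arbitrary example, and suppressWarnings() is used because max() and min() warn on empty input.

```r
x <- c(3, NA, 7)

max(x)                 # NA, since na.rm = FALSE by default
max(x, na.rm = TRUE)   # NAs ignored

# Empty numeric input: the identity elements for max and min.
empty_max <- suppressWarnings(max(numeric(0)))
empty_min <- suppressWarnings(min(numeric(0)))

# pmax() with all parallel elements NA stays NA, even with na.rm = TRUE.
all_na <- pmax(NA_real_, NA_real_, na.rm = TRUE)
```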
pmax and pmin take one or more vectors (or matrices) as arguments and return a single vector giving the ‘parallel’ maxima (or minima) of the vectors. The first element of the result is the maximum (minimum) of the first elements of all the arguments, the second element of the result is the maximum (minimum) of the second elements of all the arguments and so on. Shorter inputs (of non-zero length) are recycled if necessary. Attributes (see attributes: such as names or dim) are copied from the first argument (if applicable, e.g., not for an S4 object).
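As an illustration of recycling and attribute copying, this sketch clamps a named vector to the interval [2, 4]. The vector a and its names are made up for the example.

```r
a <- c(first = 1, second = 5, third = 3)

# The scalar bounds are recycled; names come from the first argument.
clipped <- pmin(pmax(a, 2), 4)
clipped
```

Putting the data vector first is what preserves its names: with the bound first, the result would be unnamed.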
pmax.int
andpmin.int
are faster internal versions onlyused when all arguments are atomic vectors and there are no classes:they drop all attributes. (Note that all versions fail for raw andcomplex vectors since these have no ordering.)
max and min are generic functions: methods can be defined for them individually or via the Summary group generic. For this to work properly, the arguments ... should be unnamed, and dispatch is on the first argument.
By definition the min/max of a numeric vector containing an NaN is NaN, except that the min/max of any vector containing an NA is NA even if it also contains an NaN. Note that max(NA, Inf) == NA even though the maximum would be Inf whatever the missing value actually is.
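The NaN and NA rules can be illustrated as follows. This is a sketch with illustrative variable names; is.nan() is used to tell NaN apart from NA, since is.na() is TRUE for both.

```r
with_nan <- max(c(1, NaN))              # NaN propagates
with_both <- max(c(1, NaN, NA))         # NA takes precedence over NaN
na_vs_inf <- max(NA, Inf)               # NA, although Inf would always win
cleaned <- max(c(1, NaN), na.rm = TRUE) # na.rm = TRUE drops NaN as well as NA
```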
Character versions are sorted lexicographically, and this depends on the collating sequence of the locale in use: the help for ‘Comparison’ gives details. The max/min of an empty character vector is defined to be character NA. (One could argue that as "" is the smallest character element, the maximum should be "", but there is no obvious candidate for the minimum.)
For min or max, a length-one vector. For pmin or pmax, a vector of length the longest of the input vectors, or length zero if one of the inputs had zero length.
The type of the result will be that of the highest of the inputs in the hierarchy integer < double < character.
For min and max if there are only numeric inputs and all are empty (after possible removal of NAs), the result is double (Inf or -Inf).
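A small sketch of the result-type hierarchy described above, using typeof() to inspect the results. The variable names are illustrative.

```r
type_logical <- typeof(max(TRUE, FALSE))   # logical inputs give an integer result
type_integer <- typeof(max(1L, 2L))
type_double  <- typeof(max(1L, 2.5))       # integer and double give double
type_char    <- typeof(max(1L, "a"))       # any character input gives character

# An empty integer input still yields a double (-Inf), with a warning.
type_empty <- typeof(suppressWarnings(max(integer(0))))
```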
max and min are part of the S4 Summary group generic. Methods for them must use the signature x, ..., na.rm.
‘Numeric’ arguments are vectors of type integer and numeric, and logical (coerced to integer). For historical reasons, NULL is accepted as equivalent to integer(0).
pmax and pmin will also work on classed S3 or S4 objects with appropriate methods for comparison, is.na and rep (if recycling of arguments is needed).
Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988) The New S Language. Wadsworth & Brooks/Cole.
range (both min and max) and which.min (which.max) for the arg min, i.e., the location where an extreme value occurs.
‘plotmath’ for the use of min in plot annotation.
require(stats); require(graphics)
min(5:1, pi) #-> one number
pmin(5:1, pi) #-> 5 numbers
x <- sort(rnorm(100)); cH <- 1.35
pmin(cH, quantile(x)) # no names
pmin(quantile(x), cH) # has names
plot(x, pmin(cH, pmax(-cH, x)), type = "b", main = "Huber's function")
cut01 <- function(x) pmax(pmin(x, 1), 0)
curve( x^2 - 1/4, -1.4, 1.5, col = 2)
curve(cut01(x^2 - 1/4), col = "blue", add = TRUE, n = 500)
## pmax(), pmin() preserve attributes of *first* argument
D <- diag(x = (3:1)/4) ; n0 <- numeric()
stopifnot(identical(D, cut01(D) ),
          identical(n0, cut01(n0)),
          identical(n0, cut01(NULL)),
          identical(n0, pmax(3:1, n0, 2)),
          identical(n0, pmax(n0, 4)))
Report versions of (external) third-party software used.
extSoftVersion()
This reports the versions of third-party software libraries in use. These are often external but might have been compiled into R when it was installed.
With dynamic linking, these are the versions of the libraries linked to in this session: with static linking, of those compiled in.
A named character vector, currently with components
zlib | The version of |
bzlib | The version of |
xz | The version of |
libdeflate | The version of |
PCRE | The version of |
ICU | The version of |
TRE | The version of |
iconv | The implementation and version of the |
readline | The version of |
BLAS | Name of the binary/executable file with the implementation of |
Note that the values for bzlib and pcre normally contain a date as well as the version number, and that for tre includes several items separated by spaces, the version number being the second.
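Based on the description above, the TRE version number could be pulled out by splitting on spaces and taking the second item. This is only a sketch: the exact layout of the string is build-dependent, and the names v, tre_fields and tre_version are illustrative.

```r
v <- extSoftVersion()

# Split the TRE entry on single spaces; the text above says the version
# number is the second item.
tre_fields <- strsplit(v[["TRE"]], " ", fixed = TRUE)[[1]]
tre_version <- tre_fields[2]
tre_version
```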
For iconv this will give the implementation as well as the version, for example "GNU libiconv 1.14", "glibc 2.18" or "win_iconv" (which has no version number).
The name of the binary/executable file for BLAS can be used as an indication of which implementation is in use. Typically, the R version of BLAS will appear as libR.so (libR.dylib), R or libRblas.so (libRblas.dylib), depending on how R was built. Note that libRblas.so (libRblas.dylib) may also be shown for an external BLAS implementation that had been copied, hard-linked or renamed by the system administrator. For an external BLAS, a shared object file will be given and its path/name may indicate the vendor/version. The detection does not work on Windows nor for some uses of the Accelerate framework on macOS.
libcurlVersion for the version of libCurl.
La_version for the version of LAPACK in use.
La_library for binary/executable file with LAPACK in use.
grSoftVersion for third-party graphics software.
tclVersion in package tcltk for the version of Tcl/Tk.
pcre_config for PCRE configuration options.
extSoftVersion()
## the PCRE version
sub(" .*", "", extSoftVersion()["PCRE"])
The function factor is used to encode a vector as a factor (the terms ‘category’ and ‘enumerated type’ are also used for factors). If argument ordered is TRUE, the factor levels are assumed to be ordered. For compatibility with S there is also a function ordered.
is.factor, is.ordered, as.factor and as.ordered are the membership and coercion functions for these classes.
factor(x = character(), levels, labels = levels,
       exclude = NA, ordered = is.ordered(x), nmax = NA)
ordered(x = character(), ...)
is.factor(x)
is.ordered(x)
as.factor(x)
as.ordered(x)
addNA(x, ifany = FALSE)
.valid.factor(object)
x | a vector of data, usually taking a small number of distinct values. |
levels | an optional vector of the unique values (as character strings) that |
labels | either an optional character vector of labels for the levels (in the same order as |
exclude | a vector of values to be excluded when forming the set of levels. This may be factor with the same level set as |
ordered | logical flag to determine if the levels should be regarded as ordered (in the order given). |
nmax | an upper bound on the number of levels; see ‘Details’. |
... | (in |
ifany | only add an |
object | an R object. |
The type of the vector x is not restricted; it only must have an as.character method and be sortable (by order).
Ordered factors differ from factors only in their class, but methods and model-fitting functions may treat the two classes quite differently, see options("contrasts").
The encoding of the vector happens as follows. First all the values in exclude are removed from levels. If x[i] equals levels[j], then the i-th element of the result is j. If no match is found for x[i] in levels (which will happen for excluded values) then the i-th element of the result is set to NA.
Normally the ‘levels’ used as an attribute of the result are the reduced set of levels after removing those in exclude, but this can be altered by supplying labels. This should either be a set of new labels for the levels, or a character string, in which case the levels are that character string with a sequence number appended.
factor(x, exclude = NULL) applied to a factor without NAs is a no-operation unless there are unused levels: in that case, a factor with the reduced level set is returned. If exclude is used, since R version 3.4.0, excluding non-existing character levels is equivalent to excluding nothing, and when exclude is a character vector, that is applied to the levels of x. Alternatively, exclude can be a factor with the same level set as x and will exclude the levels present in exclude.
The codes of a factor may contain NA. For a numeric x, set exclude = NULL to make NA an extra level (prints as ‘<NA>’); by default, this is the last level.
If NA is a level, the way to set a code to be missing (as opposed to the code of the missing level) is to use is.na on the left-hand-side of an assignment (as in is.na(f)[i] <- TRUE; indexing inside is.na does not work). Under those circumstances missing values are currently printed as ‘<NA>’, i.e., identical to entries of level NA.
is.factor is generic: you can write methods to handle specific classes of objects, see InternalMethods.
Where levels is not supplied, unique is called. Since factors typically have quite a small number of levels, for large vectors x it is helpful to supply nmax as an upper bound on the number of unique values.
When using c to combine a (possibly ordered) factor with other objects, if all objects are (possibly ordered) factors, the result will be a factor with levels the union of the level sets of the elements, in the order the levels occur in the level sets of the elements (which means that if all the elements have the same level set, that is the level set of the result), equivalent to how unlist operates on a list of factor objects.
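A sketch of combining factors with c(), assuming R 4.1.0 or later (where the factor method for c() is available). The names f1, f2 and combined are illustrative.

```r
f1 <- factor(c("b", "a"))   # levels: "a" "b"
f2 <- factor(c("c", "a"))   # levels: "a" "c"

combined <- c(f1, f2)

levels(combined)            # union of the level sets, in order of occurrence
as.character(combined)      # the values themselves are preserved
```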
factor returns an object of class "factor" which has a set of integer codes the length of x with a "levels" attribute of mode character and unique (!anyDuplicated(.)) entries. If argument ordered is true (or ordered() is used) the result has class c("ordered", "factor"). Undocumentedly for a long time, factor(x) loses all attributes(x) but "names", and resets "levels" and "class".
Applying factor to an ordered or unordered factor returns a factor (of the same type) with just the levels which occur: see also [.factor for a more transparent way to achieve this.
is.factor returns TRUE or FALSE depending on whether its argument is of type factor or not. Correspondingly, is.ordered returns TRUE when its argument is an ordered factor and FALSE otherwise.
as.factor coerces its argument to a factor. It is an abbreviated (sometimes faster) form of factor.
as.ordered(x) returns x if this is ordered, and ordered(x) otherwise.
addNA modifies a factor by turning NA into an extra level (so that NA values are counted in tables, for instance).
.valid.factor(object) checks the validity of a factor, currently only levels(object), and returns TRUE if it is valid, otherwise a string describing the validity problem. This function is used for validObject(<factor>).
The interpretation of a factor depends on both the codes and the "levels" attribute. Be careful only to compare factors with the same set of levels (in the same order). In particular, as.numeric applied to a factor is meaningless, and may happen by implicit coercion. To transform a factor f to approximately its original numeric values, as.numeric(levels(f))[f] is recommended and slightly more efficient than as.numeric(as.character(f)).
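The difference between the codes and the original values is easy to see on a small factor. A minimal sketch, with illustrative names:

```r
f <- factor(c("10", "2.5", "10", "7"))

# The internal codes are positions in levels(f), not the original numbers.
codes <- as.numeric(f)

# Recommended: convert the (few) levels once, then index by the factor.
recovered <- as.numeric(levels(f))[f]

# Equivalent, but converts every element individually.
recovered_slow <- as.numeric(as.character(f))
```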
The levels of a factor are by default sorted, but the sort order may well depend on the locale at the time of creation, and should not be assumed to be ASCII.
There are some anomalies associated with factors that have NA as a level. It is suggested to use them sparingly, e.g., only for tabulation purposes.
There are "factor" and "ordered" methods for the group generic Ops which provide methods for the Comparison operators, and for the min, max, and range generics in Summary of "ordered". (The rest of the groups and the Math group generate an error as they are not meaningful for factors.)
Only == and != can be used for factors: a factor can only be compared to another factor with an identical set of levels (not necessarily in the same ordering) or to a character vector. Ordered factors are compared in the same way, but the general dispatch mechanism precludes comparing ordered and unordered factors.
All the comparison operators are available for ordered factors. Collation is done by the levels of the operands: if both operands are ordered factors they must have the same level set.
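A sketch of the comparison rules above, using made-up factors sizes (ordered) and plain (unordered):

```r
sizes <- factor(c("small", "large", "medium"),
                levels = c("small", "medium", "large"), ordered = TRUE)

# Ordered factors support all comparison operators; order follows the levels.
below_large <- sizes < "large"
at_least_medium <- sizes >= sizes[3]   # sizes[3] is "medium", same level set

# Unordered factors only support == and !=.
plain <- factor(c("x", "y"))
is_x <- plain == "x"
```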
In earlier versions of R, storing character data as a factor was more space efficient if there is even a small proportion of repeats. However, identical character strings now share storage, so the difference is small in most cases. (Integer values are stored in 4 bytes whereas each reference to a character string needs a pointer of 4 or 8 bytes.)
Chambers, J. M. and Hastie, T. J. (1992) Statistical Models in S. Wadsworth & Brooks/Cole.
[.factor for subsetting of factors.
gl for construction of balanced factors and C for factors with specified contrasts. levels and nlevels for accessing the levels, and unclass to get integer codes.
(ff <- factor(substring("statistics", 1:10, 1:10), levels = letters))
as.integer(ff) # the internal codes
(f. <- factor(ff)) # drops the levels that do not occur
ff[, drop = TRUE] # the same, more transparently
factor(letters[1:20], labels = "letter")
class(ordered(4:1)) # "ordered", inheriting from "factor"
z <- factor(LETTERS[3:1], ordered = TRUE)
## and "relational" methods work:
stopifnot(sort(z)[c(1,3)] == range(z), min(z) < max(z))
## suppose you want "NA" as a level, and to allow missing values.
(x <- factor(c(1, 2, NA), exclude = NULL))
is.na(x)[2] <- TRUE
x # [1] 1 <NA> <NA>
is.na(x)
# [1] FALSE TRUE FALSE
## More rational, since R 3.4.0 :
factor(c(1:2, NA), exclude = "" ) # keeps <NA> , as
factor(c(1:2, NA), exclude = NULL) # always did
## exclude = <character>
z # ordered levels 'A < B < C'
factor(z, exclude = "C") # does exclude
factor(z, exclude = "B") # ditto
## Now, labels maybe duplicated:
## factor() with duplicated labels allowing to "merge levels"
x <- c("Man", "Male", "Man", "Lady", "Female")
## Map from 4 different values to only two levels:
(xf <- factor(x, levels = c("Male", "Man" , "Lady", "Female"),
                 labels = c("Male", "Male", "Female", "Female")))
#> [1] Male Male Male Female Female
#> Levels: Male Female
## Using addNA()
Month <- airquality$Month
table(addNA(Month))
table(addNA(Month, ifany = TRUE))
Utility function to access information about files on the user's file systems.
file.access(names, mode = 0)
names | character vector containing file names. Tilde-expansion will be done: see |
mode | integer specifying access mode required: see ‘Details’. |
The mode value can be the exclusive or (xor), i.e., a partial sum of the following values, and hence must be in 0:7,
0 | test for existence. |
1 | test for execute permission. |
2 | test for write permission. |
4 | test for read permission. |
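A sketch of combining mode values, run against a temporary file so that nothing in the working directory is touched. The variable names are illustrative, and the results assume a file system where the creating user can read and write its own temporary files.

```r
path <- tempfile()
file.create(path)

exists_ok <- file.access(path, mode = 0)      # 0: existence
read_ok   <- file.access(path, mode = 4)      # 4: read permission
rw_ok     <- file.access(path, mode = 4 + 2)  # 6: read and write together

# A path that was never created fails the existence test.
missing_ok <- file.access(tempfile(), mode = 0)

unlink(path)
```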
Permission will be computed for real user ID and real group ID (rather than the effective IDs).
Please note that it is not a good idea to use this function to test before trying to open a file. On a multi-tasking system, it is possible that the accessibility of a file will change between the time you call file.access() and the time you try to open the file. It is better to wrap file open attempts in try.
An integer vector with values 0 for success and -1 for failure.
This was written as a replacement for the S-PLUS function access, a wrapper for the C function of the same name, which explains the return value encoding. Note that the return value is false for success.
file.info for more details on permissions, Sys.chmod to change permissions, and try for a ‘test it and see’ approach.
file_test for shell-style file tests.
fa <- file.access(dir("."))
table(fa) # count successes & failures
Choose a file interactively.
file.choose(new = FALSE)
new | Logical: choose the style of dialog box presented to the user: at present only new = FALSE is used. |
A character vector of length one giving the file path.
list.files for non-interactive selection.
Utility function to extract information about files on the user's file systems.
file.info(..., extra_cols = TRUE)
file.mode(...)
file.mtime(...)
file.size(...)
... | character vectors containing file paths. Tilde-expansion is done: see |
extra_cols | logical: return all cols rather than just the first six. |
What constitutes a ‘file’ is OS-dependent but includes directories. (However, directory names must not include a trailing backslash or slash on Windows.) See also the section in the help for file.exists on case-insensitive file systems.
The file ‘mode’ follows POSIX conventions, giving three octal digits summarizing the permissions for the file owner, the owner's group and for anyone respectively. Each digit is the logical or of read (4), write (2) and execute/search (1) permissions.
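To make the octal layout concrete, the sketch below decodes a mode into its three digits with bitwise operations. The mode "754" is an arbitrary illustrative value rather than one read from a file, and the variable names are made up.

```r
mode <- as.octmode("754")
bits <- as.integer(mode)

# Each octal digit occupies three bits: owner, group, then everyone else.
owner <- bitwAnd(bitwShiftR(bits, 6L), 7L)   # 7 = read + write + execute
group <- bitwAnd(bitwShiftR(bits, 3L), 7L)   # 5 = read + execute
other <- bitwAnd(bits, 7L)                   # 4 = read only

# Test a single permission: is the write bit (2) set for the group?
group_can_write <- bitwAnd(group, 2L) != 0
```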
See files for how file paths with marked encodings are interpreted.
On most systems symbolic links are followed, so information is given about the file to which the link points rather than about the link.
File modes are probably only useful on NTFS file systems, and it seems all three digits refer to the file's owner. The execute/search bits are set for directories, and for files based on their extensions (e.g., ‘.exe’, ‘.com’, ‘.cmd’ and ‘.bat’ files). file.access will give a more reliable view of read/write access availability to the R process.
UTF-8-encoded file names not valid in the current locale can be used.
Junction points and symbolic links are followed, so information is given about the file/directory to which the link points rather than about the link.
For file.info(), data frame with row names the file names and columns
size | double: File size in bytes. |
isdir | logical: Is the file a directory? |
mode | integer of class |
mtime ,ctime ,atime | object of class |
uid | integer, the user ID of the file's owner. |
gid | integer, the group ID of the file's group. |
uname | character, uid interpreted as a user name. |
grname | character, gid interpreted as a group name. |
Unknown user and group names will be NA.
exe | character indicating the sort of executable. Possible values are "no", "msdos", "win16", "win32", "win64" and "unknown". Note that a file (e.g., a script file) can be executable according to the mode bits but not executable in this sense. |
If extra_cols is false, only the first six columns are returned: as these can all be found from a single C system call this can be faster. (However, properly configured systems will use a ‘name service cache daemon’ to speed up the name lookups.)
Entries for non-existent or non-readable files will be NA.
The uid, gid, uname and grname columns may not be supplied on a non-POSIX Unix-alike system, and will not be on Windows.
What is meant by the three file times depends on the OS and file system. On Windows native file systems ctime is the file creation time (something which is not recorded on most Unix-alike file systems). What is meant by ‘file access’ and hence the ‘last access time’ is system-dependent.
The resolution of the file times depends on both the OS and the type of the file system. Modern file systems typically record times to an accuracy of a microsecond or better: notable exceptions are HFS+ on macOS (recorded in seconds) and modification time on older FAT systems (recorded in increments of 2 seconds). Note that "POSIXct" times are by default printed in whole seconds: to change that see strftime.
file.mode(), file.mtime() and file.size() are fast convenience wrappers returning just one of the columns.
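A sketch comparing the wrappers with the corresponding file.info() columns, using a temporary file. The names path and info are illustrative.

```r
path <- tempfile(fileext = ".txt")
cat("hello", file = path)            # writes 5 bytes, no trailing newline

info <- file.info(path, extra_cols = FALSE)   # only the first six columns

size_matches  <- file.size(path) == info$size
mtime_matches <- file.mtime(path) == info$mtime
mode_matches  <- file.mode(path) == info$mode

# Non-existent files give NA rather than an error.
missing_size <- file.size(tempfile())

unlink(path)
```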
Some (now old) Unix-alike systems allow files of more than 2Gb to be created but not accessed by the stat system call. Such files may show up as non-readable (and very likely not be readable by any of R's input functions).
Sys.readlink to find out about symbolic links, files, file.access, list.files, and DateTimeClasses for the date formats.
Sys.chmod to change permissions.
ncol(finf <- file.info(dir())) # at least six
finf # the whole list
## Those that are more than 100 days old :
finf <- file.info(dir(), extra_cols = FALSE)
finf[difftime(Sys.time(), finf[,"mtime"], units = "days") > 100 , 1:4]
file.info("no-such-file-exists")
## E.g., for R-core, in a R-devel version:
if(Sys.info()[["sysname"]] == "Linux")
  sort(file.mtime(file.path(R.home("bin"), c("", file.path(c("", "exec"), "R"))) ))
Construct the path to a file from components in a platform-independent way.
file.path(..., fsep = .Platform$file.sep)
... | character vectors. Long vectors are not supported. |
fsep | the path separator to use (assumed to be ASCII). |
The implementation is designed to be fast (faster than paste) as this function is used extensively in R itself.
It can also be used for environment paths such as PATH and R_LIBS with fsep = .Platform$path.sep.
Trailing path separators are invalid for Windows file paths apart from ‘/’ and ‘d:/’ (although some functions/utilities do accept them), so a trailing / or \ is removed there.
A character vector of the arguments concatenated term-by-term and separated by fsep if all arguments have positive length; otherwise, an empty character vector (unlike paste).
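A sketch of the return value rules, including the zero-length case where file.path differs from paste. The path components are arbitrary illustrative strings.

```r
single <- file.path("data", "raw", "input.csv")

# Arguments are combined term-by-term, with recycling.
several <- file.path("logs", c("a.txt", "b.txt"))

# A zero-length argument gives an empty result ...
empty <- file.path("data", character(0))

# ... whereas paste() treats it as "".
pasted <- paste("data", character(0), sep = "/")

# Building a search path with the platform's path separator.
search_path <- file.path("/usr/bin", "/usr/local/bin",
                         fsep = .Platform$path.sep)
```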
An element of the result will be marked (see Encoding) as UTF-8 if run in a UTF-8 locale (when marked inputs are converted to UTF-8) or if a component of the result is marked as UTF-8, or as Latin-1 in a non-Latin-1 locale.
The components are by default separated by / (not \) on Windows.
basename, normalizePath, path.expand.
Display one or more (plain) text files, in a platform specific way, typically via a ‘pager’.
file.show(..., header = rep("", nfiles),
          title = "R Information",
          delete.file = FALSE, pager = getOption("pager"),
          encoding = "")
... | one or more character vectors containing the names of the files to be displayed. Paths will have tilde expansion. |
header | character vector (of the same length as the number of files specified in |
title | an overall title for the display. If a single separate window is used for the display, |
delete.file | should the files be deleted after display? Used for temporary files. |
pager | the pager to be used, see ‘Details’. |
encoding | character string giving the encoding to be assumed for the file(s). |
This function provides the core of the R help system, but it can be used for other purposes as well, such as page.
How the pager is implemented is highly system-dependent.
The basic Unix version concatenates the files (using the headers) to a temporary file, and displays it in the pager selected by the pager argument, which is a character vector specifying a system command (a full path or a command found on the PATH) to run on the set of files. The ‘factory-fresh’ default is to use ‘R_HOME/bin/pager’, which is a shell script running the command-line specified by the environment variable PAGER whose default is set at configuration, usually to less. On a Unix-alike more is used if pager is empty.
Most GUI systems will use a separate pager window for each file, and let the user leave it up while R continues running. The selection of such pagers could either be done using special pager names being intercepted by lower-level code (such as "internal" and "console" on Windows), or by letting pager be an R function which will be called with arguments (files, header, title, delete.file) corresponding to the first four arguments of file.show and take care of interfacing to the GUI.
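As a sketch of the function form of pager, the hypothetical console_pager below simply prints each file to the console. It takes the four arguments named above and honours delete.file itself; it is an illustration, not how any particular GUI implements its pager.

```r
# Hypothetical pager: print files to the console instead of opening a window.
console_pager <- function(files, header, title, delete.file) {
  cat("==", title, "==\n")
  for (i in seq_along(files)) {
    if (nzchar(header[i])) cat(header[i], "\n", sep = "")
    writeLines(readLines(files[i], warn = FALSE))
  }
  if (delete.file) unlink(files)   # the pager is responsible for clean-up
  invisible(NULL)
}

tmp <- tempfile(fileext = ".txt")
writeLines(c("first line", "second line"), tmp)

file.show(tmp, header = "demo file", title = "Pager sketch",
          delete.file = TRUE, pager = console_pager)
```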
The R.app GUI on macOS uses its internal pager irrespective of the setting of pager.
Not all implementations will honour delete.file. In particular, using an external pager on Windows does not, as there is no way to know when the external application has finished with the file.
Ross Ihaka, Brian Ripley.
Text-type help and RShowDoc call file.show.
Consider getOption("pdfviewer") and, e.g., system for displaying pdf files.
file.show(file.path(R.home("doc"), "COPYRIGHTS"))
These functions provide a low-level interface to the computer's file system.
file.create(..., showWarnings = TRUE)
file.exists(...)
file.remove(...)
file.rename(from, to)
file.append(file1, file2)
file.copy(from, to, overwrite = recursive, recursive = FALSE,
          copy.mode = TRUE, copy.date = FALSE)
file.symlink(from, to)
file.link(from, to)
... ,file1 ,file2 | character vectors, containing file names or paths. |
from ,to | character vectors, containing file names or paths.For
|
overwrite | logical; should existing destination files be overwritten? |
showWarnings | logical; should the warnings on failure be shown? |
recursive | logical. If |
copy.mode | logical: should file permission bits be copied where possible? |
copy.date | logical: should file dates be preserved wherepossible? See |
The ... arguments are concatenated to form one character string: you can specify the files separately or as one vector. All of these functions expand path names: see path.expand. (file.exists silently reports false for paths that would be too long after expansion: the rest will give a warning.)
file.create creates files with the given names if they do not already exist and truncates them if they do. They are created with the maximal read/write permissions allowed by the ‘umask’ setting (where relevant). By default a warning is given (with the reason) if the operation fails.
file.exists returns a logical vector indicating whether the files named by its argument exist. (Here ‘exists’ is in the sense of the system's stat call: a file will be reported as existing only if you have the permissions needed by stat. Existence can also be checked by file.access, which might use different permissions and so obtain a different result. Note that the existence of a file does not imply that it is readable: for that use file.access.) What constitutes a ‘file’ is system-dependent, but should include directories. (However, directory names must not include a trailing backslash or slash on Windows.) Note that if the file is a symbolic link on a Unix-alike, the result indicates if the link points to an actual file, not just if the link exists. On Windows, the result is unreliable for a broken symbolic link (junction). Lastly, note the different function exists which checks for existence of R objects.
file.remove attempts to remove the files named in its argument. On most Unix platforms ‘file’ includes empty directories, symbolic links, fifos and sockets. On Windows, ‘file’ means a regular file and not, say, an empty directory.
file.rename attempts to rename files (and from and to must be of the same length). Where file permissions allow this will overwrite an existing element of to. This is subject to the limitations of the OS's corresponding system call (see something like man 2 rename on a Unix-alike): in particular in the interpretation of ‘file’: most platforms will not rename files from one file system to another. NB: This means that renaming a file from a temporary directory to the user's filespace or during package installation will often fail. (On Windows, file.rename can rename files but not directories across volumes.) On platforms which allow directories to be renamed, typically neither or both of from and to must be a directory, and if to exists it must be an empty directory.
file.append attempts to append the files named by its second argument to those named by its first. The R subscript recycling rule is used to align names given in vectors of different lengths.
file.copy works in a similar way to file.append but with the arguments in the natural order for copying. Copying to existing destination files is skipped unless overwrite = TRUE. The to argument can specify a single existing directory. If copy.mode = TRUE file read/write/execute permissions are copied where possible, restricted by ‘umask’. (On Windows this applies only to files.) Other security attributes such as ACLs are not copied. On a POSIX filesystem the targets of symbolic links will be copied rather than the links themselves, and hard links are copied separately. Using copy.date = TRUE may or may not copy the timestamp exactly (for example, fractional seconds may be omitted), but is more likely to do so as from R 3.4.0.
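The overwrite rule can be seen with two temporary files. A minimal sketch with illustrative names src and dst:

```r
src <- tempfile()
dst <- tempfile()
writeLines("new", src)
writeLines("old", dst)

# The destination exists, so the copy is skipped and FALSE is returned.
skipped <- file.copy(src, dst)
after_skip <- readLines(dst)

# With overwrite = TRUE the destination is replaced.
copied <- file.copy(src, dst, overwrite = TRUE)
after_copy <- readLines(dst)

unlink(c(src, dst))
```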
file.symlink and file.link make symbolic and hard links on those file systems which support them. For file.symlink the to argument can specify a single existing directory. (Unix and macOS native filesystems support both. Windows has hard links to files on NTFS file systems and concepts related to symbolic links on recent versions: see the section below on the Windows version of this help page. What happens on a FAT or SMB-mounted file system is OS-specific.)
File arguments with a marked encoding (see Encoding) are if possible translated to the native encoding, except on Windows where Unicode file operations are used (so marking as UTF-8 can be used to access file paths not in the native encoding on suitable file systems).
These functions return a logical vector indicating which operation succeeded for each of the files attempted. Using a missing value for a file or path name will always be regarded as a failure.
If showWarnings = TRUE, file.create will give a warning for an unexpected failure.
Case-insensitive file systems are the norm on Windows and macOS, but can be found on all OSes (for example a FAT-formatted USB drive is probably case-insensitive).
These functions will most likely match existing files regardless of case on such file systems: however this is an OS function and it is possible that file names might be mapped to upper or lower case.
Always check the return value of these functions when used in package code. This is especially important for file.rename, which has OS-specific restrictions (and note that the session temporary directory is commonly on a different file system from the working directory): it is only portable to use file.rename to change file name(s) within a single directory.
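One way to act on this advice is to check the result of file.rename and fall back to copy-then-remove when it fails. The helper move_file below is hypothetical (not part of base R) and is only a sketch of the pattern for a single file.

```r
# Hypothetical helper: rename, falling back to copy + remove when the
# rename fails (for example across file systems).
move_file <- function(from, to) {
  renamed <- suppressWarnings(file.rename(from, to))
  if (isTRUE(renamed)) return(TRUE)
  copied <- file.copy(from, to, overwrite = TRUE)
  copied && file.remove(from)
}

src <- tempfile("src_")
dst <- file.path(dirname(src), paste0("moved_", basename(src)))
writeLines("payload", src)

moved <- move_file(src, dst)   # same directory, so the rename should succeed
```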
Ross Ihaka, Brian Ripley
file.info, file.access, file.path, file.show, list.files, unlink, basename, path.expand.
Sys.glob to expand wildcards in file specifications.
file_test, Sys.readlink (for ‘symlink’s).
https://en.wikipedia.org/wiki/Hard_link and https://en.wikipedia.org/wiki/Symbolic_link for the concepts of links and their limitations.
cat("file A\n", file = "A")
cat("file B\n", file = "B")
file.append("A", "B")
file.create("A") # (trashing previous)
file.append("A", rep("B", 10))
if(interactive()) file.show("A") # -> the 10 lines from 'B'
file.copy("A", "C")
dir.create("tmp")
file.copy(c("A", "B"), "tmp")
list.files("tmp") # -> "A" and "B"
setwd("tmp")
file.remove("A") # the tmp/A file
file.symlink(file.path("..", c("A", "B")), ".")
# |--> (TRUE,FALSE) : ok for A but not B as it exists already
setwd("..")
unlink("tmp", recursive = TRUE)
file.remove("A", "B", "C")
These functions provide a low-level interface to the computer's file system.
dir.exists(paths)
dir.create(path, showWarnings = TRUE, recursive = FALSE, mode = "0777")
Sys.chmod(paths, mode = "0777", use_umask = TRUE)
Sys.umask(mode = NA)
path | a character vector containing a single path name. Tilde expansion (see |
paths | character vectors containing file or directory paths. Tilde expansion (see |
showWarnings | logical; should the warnings on failure be shown? |
recursive | logical. Should elements of the path other than the last be created? If true, like the Unix command |
mode | the mode to be used on Unix-alikes: it will be coerced by |
use_umask | logical: should the mode be restricted by the |
dir.exists checks that the paths exist (in the same sense as file.exists) and are directories.
dir.create creates the last element of the path, unless recursive = TRUE. Trailing path separators are discarded.
The mode will be modified by the umask setting in the same way as for the system function mkdir. What modes can be set is OS-dependent, and it is unsafe to assume that more than three octal digits will be used. For more details see your OS's documentation on the system call mkdir, e.g. man 2 mkdir (and not that on the command-line utility of that name).
One of the idiosyncrasies of Windows is that directory creation may report success but create a directory with a different name, for example dir.create("G.S.") creates ‘"G.S"’. This is undocumented, and the precise circumstances are unknown (and might depend on the version of Windows). Also avoid directory names with a trailing space.
Sys.chmod sets the file permissions of one or more files. It may not be supported on a system (when a warning is issued). See the comments for dir.create for how modes are interpreted. Changing mode on a symbolic link is unlikely to work (nor be necessary). For more details see your OS's documentation on the system call chmod, e.g. man 2 chmod (and not that on the command-line utility of that name). Whether this changes the permission of a symbolic link or its target is OS-dependent (although to change the target is more common, and POSIX does not support modes for symbolic links: BSD-based Unixes do, though).
Sys.umask sets the umask and returns the previous value: as a special case mode = NA just returns the current value. It may not be supported (when a warning is issued and "0" is returned). For more details see your OS's documentation on the system call umask, e.g. man 2 umask.
How modes are handled depends on the file system, even on Unix-alikes (although their documentation is often written assuming a POSIX file system). So treat documentation cautiously if you are using, say, a FAT/FAT32 or network-mounted file system.
See files for how file paths with marked encodings are interpreted.
dir.exists returns a logical vector of TRUE or FALSE values (without names).
dir.create and Sys.chmod return invisibly a logical vector indicating if the operation succeeded for each of the files attempted. Using a missing value for a path name will always be regarded as a failure. dir.create indicates failure if the directory already exists. If showWarnings = TRUE, dir.create will give a warning for an unexpected failure (e.g., not for a missing value nor for an already existing component for recursive = TRUE).
Sys.umask returns the previous value of the umask, as a length-one object of class "octmode": the visibility flag is off unless mode is NA.
See also the section in the help for file.exists on case-insensitive file systems for the interpretation of path and paths.
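The return values described above can be seen in a short sketch. The directory names are made up for illustration, and the tree is created under a fresh temporary path so nothing in the working directory is touched.

```r
base <- tempfile("dir-demo-")          # illustrative scratch location
nested <- file.path(base, "a", "b")

# recursive = TRUE creates 'base', 'a' and 'b' in one call.
ok_first <- dir.create(nested, recursive = TRUE)

# A second attempt indicates failure because the directory already exists.
ok_again <- dir.create(nested, showWarnings = FALSE)

# dir.exists() is vectorized over its argument.
found <- dir.exists(c(base, nested, file.path(base, "missing")))

unlink(base, recursive = TRUE)         # clean up
```

Because dir.create returns its value invisibly, assign the result if you need to test it.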
Ross Ihaka, Brian Ripley
file.info, file.exists, file.path, list.files, unlink, basename, path.expand.
## Not run: 
## Fix up maximal allowed permissions in a file tree
Sys.chmod(list.dirs("."), "777")
f <- list.files(".", all.files = TRUE, full.names = TRUE, recursive = TRUE)
Sys.chmod(f, (file.mode(f) | "664"))
## End(Not run)
Find the paths to one or more packages.
find.package(package, lib.loc = NULL, quiet = FALSE,
             verbose = getOption("verbose"))
path.package(package, quiet = FALSE)
packageNotFoundError(package, lib.loc, call = NULL)
package | character vector: the names of packages. |
lib.loc | a character vector describing the location of R library trees to search through, or |
quiet | logical. Should this not give warnings or an error if the package is not found? |
verbose | a logical. If |
call | call expression. |
find.package returns the paths to the locations where the given packages are found. If lib.loc is NULL, then loaded namespaces are searched before the libraries. If a package is found more than once, the first match is used. Unless quiet = TRUE a warning will be given about the named packages which are not found, and an error if none are. If verbose is true, warnings about packages found more than once are given. For a package to be returned it must contain either a ‘Meta’ subdirectory or a ‘DESCRIPTION’ file containing a valid version field, but it need not be installed (it could be a source package if lib.loc was set suitably).
find.package is not usually the right tool to find out if a package is available for use: the only way to do that is to use require to try to load it. It need not be installed for the correct platform, it might have a version requirement not met by the running version of R, there might be dependencies which are not available, ....
path.package returns the paths from which the named packages were loaded, or if none were named, for all currently attached packages. Unless quiet = TRUE it will warn if some of the packages named are not attached, and give an error if none are.
packageNotFoundError creates an error condition object of class packageNotFoundError for signaling errors. The condition object contains the fields package and lib.loc.
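A small sketch of the behaviour described above. The package name "noSuchPackageXyz123" is a made-up name assumed not to be installed, and the condition class is inspected rather than assumed.

```r
p_base <- find.package("base")          # path of an always-available package

# quiet = TRUE: neither a warning nor an error, just an empty result.
p_none <- find.package("noSuchPackageXyz123", quiet = TRUE)

# Without quiet = TRUE an error is signalled when no package is found.
cond <- tryCatch(find.package("noSuchPackageXyz123"),
                 error = function(e) e)
inherits(cond, "packageNotFoundError")  # check the condition class

attached <- path.package()              # paths of all attached packages
```

Catching the condition lets calling code decide how to report a missing package instead of stopping immediately.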
A character vector of paths of package directories.
path.expand and normalizePath for path standardization.
try(find.package("knitr"))
## will not give an error, maybe a warning about *all* locations it is found:
find.package("kitty", quiet=TRUE, verbose=TRUE)
## Find all .libPaths() entries a package is found:
findPkgAll <- function(pkg)
  unlist(lapply(.libPaths(), function(lib)
           find.package(pkg, lib, quiet=TRUE, verbose=FALSE)))
findPkgAll("MASS")
findPkgAll("knitr")
Given a vector of non-decreasing breakpoints in vec, find the interval containing each element of x; i.e., if i <- findInterval(x,v), for each index j in x, $v_{i_j} \le x_j < v_{i_j + 1}$, where $v_0 := -\infty$, $v_{N+1} := +\infty$, and N <- length(v). At the two boundaries, the returned index may differ by 1, depending on the optional arguments rightmost.closed and all.inside.
findInterval(x, vec, rightmost.closed = FALSE, all.inside = FALSE,
             left.open = FALSE)
x | numeric. |
vec | numeric, sorted (weakly) increasingly, of length |
rightmost.closed | logical; if true, the rightmost interval, |
all.inside | logical; if true, the returned indices are coerced into |
left.open | logical; if true all the intervals are open at left and closed at right; in the formulas below, |
The function findInterval finds the index of one vector x in another, vec, where the latter must be non-decreasing. Where this is trivial, equivalent to apply( outer(x, vec, `>=`), 1, sum), as a matter of fact, the internal algorithm uses interval search ensuring $O(n \log N)$ complexity where n <- length(x) (and N <- length(vec)). For (almost) sorted x, it will be even faster, basically $O(n)$.
This is the same computation as for the empirical distribution function, and indeed, findInterval(t, sort(X)) is identical to $n F_n(t; X_1, \dots, X_n)$ where $F_n$ is the empirical distribution function of $X_1, \dots, X_n$.
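The relation to the empirical distribution function can be checked numerically. This is only a sketch: the seed, sample size and evaluation grid are arbitrary choices.

```r
set.seed(1)                          # arbitrary seed, for reproducibility only
X  <- round(stats::rnorm(20), 1)     # a sample, rounded so that ties can occur
tt <- seq(-2, 2, by = 0.5)           # points at which to evaluate

counts <- findInterval(tt, sort(X))  # number of sample values <= each tt
Fn <- stats::ecdf(X)                 # empirical distribution function of X

# n * Fn(tt) agrees with the counts up to floating point rounding.
same <- isTRUE(all.equal(counts, length(X) * Fn(tt)))
```

Here `same` is expected to be TRUE, since both sides count how many sample values are less than or equal to each point.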
When rightmost.closed = TRUE, the result for x[j] = vec[N] ($= \max vec$), is N - 1 as for all other values in the last interval.
left.open = TRUE is occasionally useful, e.g., for survival data. For (anti-)symmetry reasons, it is equivalent to using “mirrored” data, i.e., the following is always true:
identical(
      findInterval( x,  v,      left.open= TRUE, ...) ,
  N - findInterval(-x, -v[N:1], left.open=FALSE, ...) )
where N <- length(vec) as above.
vector of length length(x) with values in 0:N (and NA) where N <- length(vec), or values coerced to 1:(N-1) if and only if all.inside = TRUE (equivalently coercing all x values inside the intervals). Note that NAs are propagated from x, and Inf values are allowed in both x and vec.
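To make the effect of the optional arguments concrete, here is a small sketch with three breakpoints; the data values are chosen only for illustration.

```r
x <- c(1, 5, 7, 10, 15, 20)
v <- c(5, 10, 15)                    # N = 3 breakpoints

i_default <- findInterval(x, v)                           # 0 1 1 2 3 3
i_closed  <- findInterval(x, v, rightmost.closed = TRUE)  # 0 1 1 2 2 3
i_inside  <- findInterval(x, v, all.inside = TRUE)        # 1 1 1 2 2 2
i_left    <- findInterval(x, v, left.open = TRUE)         # 0 0 1 1 2 3

# The "trivial" equivalent mentioned in the details, for comparison.
i_naive <- apply(outer(x, v, `>=`), 1, sum)               # 0 1 1 2 3 3
```

Only the value equal to the last breakpoint (15) changes under rightmost.closed = TRUE, while all.inside = TRUE also pulls the values outside range(v) into 1:(N-1).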
Martin Maechler
approx(*, method = "constant") which is a generalization of findInterval(), ecdf for computing the empirical distribution function which is (up to a factor of $n$) also basically the same as findInterval(.).
x <- 2:18
v <- c(5, 10, 15) # create two bins [5,10) and [10,15)
cbind(x, findInterval(x, v))

N <- 100
X <- sort(round(stats::rt(N, df = 2), 2))
tt <- c(-100, seq(-2, 2, length.out = 201), +100)
it <- findInterval(tt, X)
tt[it < 1 | it >= N] # only first and last are outside range(X)

## 'left.open = TRUE' means "mirroring" :
N <- length(v)
stopifnot(identical(
                  findInterval( x,  v,  left.open=TRUE) ,
              N - findInterval(-x, -v[N:1])))
Forces the evaluation of a function argument.
force(x)
x | a formal argument of the enclosing function. |
force forces the evaluation of a formal argument. This can be useful if the argument will be captured in a closure by the lexical scoping rules and will later be altered by an explicit assignment or an implicit assignment in a loop or an apply function.
This is semantic sugar: just evaluating the symbol will do the same thing (see the examples).
force does not force the evaluation of other promises. (It works by forcing the promise that is created when the actual arguments of a call are matched to the formal arguments of a closure, the mechanism which implements lazy evaluation.)
f <- function(y) function() y
lf <- vector("list", 5)
for (i in seq_along(lf)) lf[[i]] <- f(i)
lf[[1]]()  # returns 5

g <- function(y) { force(y); function() y }
lg <- vector("list", 5)
for (i in seq_along(lg)) lg[[i]] <- g(i)
lg[[1]]()  # returns 1

## This is identical to
g <- function(y) { y; function() y }
Call a function with a specified number of leading arguments forced before the call if the function is a closure.
forceAndCall(n, FUN, ...)
n | number of leading arguments to force. |
FUN | function to call. |
... | arguments to |
forceAndCall calls the function FUN with arguments specified in .... If the value of FUN is a closure then the first n arguments to the function are evaluated (i.e. their delayed evaluation promises are forced) before executing the function body. If the value of FUN is a primitive then the call FUN(...) is evaluated in the usual way.
forceAndCall is intended to help defining higher order functions like apply to behave more reasonably when the result returned by the function applied is a closure that captured its arguments.
Functions to make calls to compiled code that has been loaded into R.
.C(.NAME, ..., NAOK = FALSE, DUP = TRUE, PACKAGE, ENCODING)
.Fortran(.NAME, ..., NAOK = FALSE, DUP = TRUE, PACKAGE, ENCODING)
.NAME | a character string giving the name of a C function or Fortran subroutine, or an object of class |
... | arguments to be passed to the foreign function. Up to 65. |
NAOK | if |
PACKAGE | if supplied, confine the search for a character string This is intended to add safety for packages, which can ensure by using this argument that no other package can override their external symbols, and also speeds up the search (see ‘Note’). |
DUP, ENCODING | For back-compatibility, accepted but ignored. |
These functions can be used to make calls to compiled C and Fortran code. Later interfaces are .Call and .External which are more flexible and have better performance.
These functions are primitive, and .NAME is always matched to the first argument supplied (which should not be named). The other named arguments follow ... and so cannot be abbreviated. For clarity, avoid using names in the arguments passed to ... that match or partially match .NAME.
A list similar to the ... list of arguments passed in (including any names given to the arguments), but reflecting any changes made by the C or Fortran code.
The mapping of the types of R arguments to C or Fortran arguments is
R | C | Fortran |
integer | int * | integer |
numeric | double * | double precision |
-- or -- | float * | real |
complex | Rcomplex * | double complex |
logical | int * | integer |
character | char ** | [see below] |
raw | unsigned char * | not allowed |
list | SEXP * | not allowed |
other | SEXP | not allowed |
Note: The C types corresponding to integer and logical are int, not long as in S. This difference matters on most 64-bit platforms, where int is 32-bit and long is 64-bit (but not on 64-bit Windows).
Note: The Fortran type corresponding to logical is integer, not logical: the difference matters on some Fortran compilers.
Numeric vectors in R will be passed as type double * to C (and as double precision to Fortran) unless the argument has attribute Csingle set to TRUE (use as.single or single). This mechanism is only intended to be used to facilitate the interfacing of existing C and Fortran code.
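As a brief illustration of the Csingle mechanism (no compiled code is called here), the attribute can be inspected directly:

```r
x <- as.single(c(1.5, 2.5))

is.double(x)         # TRUE: the values are still stored as doubles in R
attr(x, "Csingle")   # TRUE: the marker that .C/.Fortran look for
```

The attribute only changes how the vector would be passed to compiled code; arithmetic in R itself is unaffected.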
The C type Rcomplex is defined in ‘Complex.h’ as a typedef struct {double r; double i;}. It may or may not be equivalent to the C99 double complex type, depending on the compiler used.
Logical values are sent as 0 (FALSE), 1 (TRUE) or INT_MIN = -2147483648 (NA, but only if NAOK = TRUE), and the compiled code should return one of these three values: however non-zero values other than INT_MIN are mapped to TRUE.
Missing (NA) string values are passed to .C as the string "NA". As the C char type can represent all possible bit patterns there appears to be no way to distinguish missing strings from the string "NA". If this distinction is important use .Call.
Using a character string with .Fortran is deprecated and will give a warning. It passes the first (only) character string of a character vector as a C character array to Fortran: that may be usable as character*255 if its true length is passed separately. Only up to 255 characters of the string are passed back. (How well this works, and even if it works at all, depends on the C and Fortran compilers and the platform.)
Lists, functions or other R objects can (for historical reasons) be passed to .C, but the .Call interface is much preferred. All inputs apart from atomic vectors should be regarded as read-only, and all apart from vectors (including lists), functions and environments are now deprecated.
All Fortran compilers known to be usable to compile R map symbol names to lower case, and so does .Fortran.
Symbol names containing underscores are not valid Fortran 77 (although they are valid in Fortran 9x). Many Fortran 77 compilers will allow them but may translate them in a different way to names not containing underscores. Such names will often work with .Fortran (since how they are translated is detected when R is built and the information used by .Fortran), but portable code should not use Fortran names containing underscores.
Use .Fortran with care for compiled Fortran 9x code: it may not work if the Fortran 9x compiler used differs from the Fortran compiler used when configuring R, especially if the subroutine name is not lower-case or includes an underscore. The most portable way to call Fortran 9x code from R is to use .C and the Fortran 2003 module iso_c_binding to provide a C interface to the Fortran code.
Character vectors are copied before calling the compiled code and to collect the results. For other atomic vectors the argument is copied before calling the compiled code if it is otherwise used in the calling code.
Non-atomic-vector objects are read-only to the C code and are never copied.
This behaviour can be changed by setting options(CBoundsCheck = TRUE). In that case raw, logical, integer, double and complex vector arguments are copied both before and after calling the compiled code. The first copy made is extended at each end by guard bytes, and on return it is checked that these are unaltered. For .C, each element of a character vector uses guard bytes.
If one of these functions is to be used frequently, do specify PACKAGE (to confine the search to a single DLL) or pass .NAME as one of the native symbol objects. Searching for symbols can take a long time, especially when many namespaces are loaded.
You may see PACKAGE = "base" for symbols linked into R. Do not use this in your own code: such symbols are not part of the API and may be changed without warning.
Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988) The New S Language. Wadsworth & Brooks/Cole.
The ‘Writing R Extensions’ manual.
Get or set the formal arguments of a function.
formals(fun = sys.function(sys.parent()), envir = parent.frame())
formals(fun, envir = environment(fun)) <- value
fun | a |
envir |
|
value |
For the first form, fun can also be a character string naming the function to be manipulated, which is searched for in envir, by default from the parent frame. If it is not specified, the function calling formals is used.
Only closures, i.e., non-primitive functions, have formals, not primitive functions.
Note that formals(args(f)) gives a formal argument list for all functions f, primitive or not.
formals returns the formal argument list of the function specified, as a pairlist, or NULL for a non-function or primitive.
The replacement form sets the formals of a function to the list/pairlist on the right hand side, and (potentially) resets the environment of the function, dropping attributes.
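A short sketch of both forms, using a throwaway function f defined only for this illustration:

```r
f <- function(x, y = 2) x + y
arg_names <- names(formals(f))            # "x" "y"

formals(f)$y <- 10                        # replace the default value of y
f(1)                                      # 11

prim_formals <- formals(sum)              # NULL: sum is a primitive
prim_names <- names(formals(args(sum)))   # "..." "na.rm"
```

Going through args() is what makes the argument list of a primitive such as sum accessible.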
formalArgs (from methods), a shortcut for names(formals(.)). args for a human-readable version, and as intermediary to get formals of a primitive function. alist to construct a typical formals value, see the examples.
The three parts of a (non-primitive) function are its formals, body, and environment.
require(stats)
formals(lm)

## If you just want the names of the arguments, use formalArgs instead.
names(formals(lm))
methods:: formalArgs(lm)     # same

## formals returns a pairlist. Arguments with no default have type symbol (aka name).
str(formals(lm))

## formals returns NULL for primitive functions.  Use it in combination with
## args for this case.
is.primitive(`+`)
formals(`+`)
formals(args(`+`))

## You can overwrite the formal arguments of a function (though this is
## advanced, dangerous coding).
f <- function(x) a + b
formals(f) <- alist(a = , b = 3)
f    # function(a, b = 3) a + b
f(2) # result = 5
Format an R object for pretty printing.
format(x, ...)

## Default S3 method:
format(x, trim = FALSE, digits = NULL, nsmall = 0L,
       justify = c("left", "right", "centre", "none"),
       width = NULL, na.encode = TRUE, scientific = NA,
       big.mark = "", big.interval = 3L,
       small.mark = "", small.interval = 5L,
       decimal.mark = getOption("OutDec"),
       zero.print = NULL, drop0trailing = FALSE, ...)

## S3 method for class 'data.frame'
format(x, ..., justify = "none")

## S3 method for class 'factor'
format(x, ...)

## S3 method for class 'AsIs'
format(x, width = 12, ...)
x | any R object (conceptually); typically numeric. |
trim | logical; if |
digits | a positive integer indicating how many significant digits are to be used for numeric and complex |
nsmall | the minimum number of digits to the right of the decimal point in formatting real/complex numbers in non-scientific formats. Allowed values are |
justify | should a character vector be left-justified (the default), right-justified, centred or left alone. Can be abbreviated. |
width |
|
na.encode | logical: should |
scientific | either a logical specifying whether elements of a real or complex vector should be encoded in scientific format, or an integer penalty (see |
... | further arguments passed to or from other methods. |
big.mark, big.interval, small.mark, small.interval, decimal.mark, zero.print, drop0trailing | used for prettying (longish) numerical and complex sequences. Passed to |
format is a generic function. Apart from the methods described here there are methods for dates (see format.Date), date-times (see format.POSIXct) and for other classes such as format.octmode and format.dist.
format.data.frame formats the data frame column by column, applying the appropriate method of format for each column. Methods for columns are often similar to as.character but offer more control. Matrix and data-frame columns will be converted to separate columns in the result, and character columns (normally all) will be given class "AsIs".
format.factor converts the factor to a character vector and then calls the default method (and so justify applies).
format.AsIs deals with columns of complicated objects that have been extracted from a data frame. Character objects and (atomic) matrices are passed to the default method (and so width does not apply). Otherwise it calls toString to convert the object to character (if a vector or list, element by element) and then right-justifies the result.
Justification for character vectors (and objects converted to character vectors by their methods) is done on display width (see nchar), taking double-width characters and the rendering of special characters (as escape sequences, including escaping backslash but not double quote: see print.default) into account. Thus the width is as displayed by print(quote = FALSE) and not as displayed by cat. Character strings are padded with blanks to the display width of the widest. (If na.encode = FALSE missing character strings are not included in the width computations and are not encoded.)
Numeric vectors are encoded with the minimum number of decimal places needed to display all the elements to at least the digits significant digits. However, if all the elements then have trailing zeroes, the number of decimal places is reduced until at least one element has a non-zero final digit; see also the argument documentation for big.*, small.* etc, above. See the note in print.default about digits >= 16.
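A few calls illustrate the padding and decimal-place rules just described. The results in the comments assume the default options (in particular digits = 7 and OutDec = ".").

```r
format(1:10)                                   # " 1" " 2" ... "10": common width
format(1:10, trim = TRUE)                      # "1" "2" ... "10": no padding
format(13.7, nsmall = 3)                       # "13.700"
format(c(6.0, 13.1), digits = 2)               # " 6" "13"
format(c(6.0, 13.1), digits = 2, nsmall = 1)   # " 6.0" "13.1"
format(c("a", "bbb"))                          # "a  " "bbb": left-justified
format(c("a", "bbb"), justify = "right")       # "  a" "bbb"
format(123456789L, big.mark = ",")             # "123,456,789"
```

Note how all elements of one call share a width and a number of decimal places, which is the main difference from formatting each element separately.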
Raw vectors are converted to their 2-digit hexadecimal representation by as.character.
format.default(x) now provides a “minimal” string when isS4(x) is true.
While the internal code respects the option getOption("OutDec") for the ‘decimal mark’ in general, decimal.mark takes precedence over that option. Similarly, scientific takes precedence over getOption("scipen").
An object of similar structure to x containing character representations of the elements of the first argument x in a common format, and in the current locale's encoding.
For character, numeric, complex or factor x, dims and dimnames are preserved on matrices/arrays and names on vectors: no other attributes are copied.
If x is a list, the result is a character vector obtained by applying format.default(x, ...) to each element of the list (after unlisting elements which are themselves lists), and then collapsing the result for each element with paste(collapse = ", "). The defaults in this case are trim = TRUE, justify = "none" since one does not usually want alignment in the collapsed strings.
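The list case can be sketched with a small example list z, made up for illustration:

```r
z <- list(a = 1:3, b = "x", c = list(1, "y"))
fz <- format(z)       # one string per list element

unname(fz)            # "1, 2, 3" "x" "1, y"
```

Each element is formatted on its own and then collapsed, so no padding is added to align the resulting strings.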
Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988) The New S Language. Wadsworth & Brooks/Cole.
format.info indicates how an atomic vector would be formatted.
formatC, paste, as.character, sprintf, print, prettyNum, toString, encodeString.
format(1:10)
format(1:10, trim = TRUE)

zz <- data.frame("(row names)"= c("aaaaa", "b"), check.names = FALSE)
format(zz)
format(zz, justify = "left")

## use of nsmall
format(13.7)
format(13.7, nsmall = 3)
format(c(6.0, 13.1), digits = 2)
format(c(6.0, 13.1), digits = 2, nsmall = 1)

## use of scientific
format(2^31-1)
format(2^31-1, scientific = TRUE)
## scientific = numeric scipen (= {sci}entific notation {pen}alty) :
x <- c(1e5, 1000, 10, 0.1, .001, .123)
t(sapply(setNames(,-4:1),
         \(sci) sapply(x, format, scientific=sci)))

## a list
z <- list(a = letters[1:3], b = (-pi+0i)^((-2:2)/2), c = c(1,10,100,1000),
          d = c("a", "longer", "character", "string"),
          q = quote( a + b ), e = expression(1+x))
## can you find the "2" small differences?
(f1 <- format(z, digits = 2))
(f2 <- format(z, digits = 2, justify = "left", trim = FALSE))
f1 == f2 ## 2 FALSE, 4 TRUE

## A "minimal" format() for S4 objects without their own format() method:
cc <- methods::getClassDef("standardGeneric")
format(cc) ## "<S4 class ......>"
Information is returned on how format(x, digits, nsmall) would be formatted.
format.info(x, digits = NULL, nsmall = 0)
x | an atomic vector; a potential argument of |
digits | how many significant digits are to be used for numeric and complex |
nsmall | (see |
An integer vector of length 1, 3 or 6, say r.
For logical, integer and character vectors a single element, the width which would be used by format if width = NULL.
For numeric vectors:
r[1] | width (in characters) used by |
r[2] | number of digits after decimal point. |
r[3] | in |
For a complex vector the first three elements refer to the real parts, and there are three further elements corresponding to the imaginary parts.
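The three possible result lengths can be seen in a short sketch; options(digits = 7) is set explicitly so that the numeric results do not depend on the session.

```r
old <- options(digits = 7)     # make the numeric results reproducible

format.info(123L)              # 3: width only, for an integer vector
format.info(c("a", "bbb"))     # 3: width of the widest string
format.info(123)               # 3 0 0: width, digits after ".", fixed format
format.info(pi)                # 8 6 0
length(format.info(1 + 2i))    # 6: three values each for Re and Im

options(old)                   # restore the previous setting
```

The first element always gives the common width, which is what makes format.info useful for laying out columns before formatting.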
format (notably about digits >= 16), formatC.
dd <- options("digits") ; options(digits = 7) #-- for the following
format.info(123)   # 3 0 0
format.info(pi)    # 8 6 0
format.info(1e8)   # 5 0 1 - exponential "1e+08"
format.info(1e222) # 6 0 2 - exponential "1e+222"

x <- pi*10^c(-10,-2,0:2,8,20)
names(x) <- formatC(x, width = 1, digits = 3, format = "g")
cbind(sapply(x, format))
t(sapply(x, format.info))

## using at least 8 digits right of "."
t(sapply(x, format.info, nsmall = 8))

# Reset old options:
options(dd)
format.pval is intended for formatting p-values.
format.pval(pv, digits = max(1, getOption("digits") - 2),
            eps = .Machine$double.eps, na.form = "NA", ...)
pv | a numeric vector. |
digits | how many significant digits are to be used. |
eps | a numerical tolerance: see ‘Details’. |
na.form | character representation of |
... | further arguments to be passed to |
format.pval is mainly an auxiliary function for print.summary.lm etc., and does separate formatting for fixed, floating point and very small values; those less than eps are formatted as "< [eps]" (where ‘[eps]’ stands for format(eps, digits)).
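A minimal sketch of the eps rule, with p-values chosen only for illustration. The exact spacing after "<" can vary, so the comments describe the results rather than quote them.

```r
p <- c(0.1, 0.0001, 1e-27, NA)
out <- format.pval(p)
out[3]    # starts with "<": 1e-27 is below the default eps
out[4]    # "NA", the default na.form

# A larger eps reports small values against that threshold instead.
out2 <- format.pval(c(0.2, 1e-5), eps = 0.001)
out2[2]   # "<" followed by the formatted eps
```

Raising eps is a common way to avoid printing implausibly precise tiny p-values.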
A character vector.
format.pval(c(stats::runif(5), pi^-100, NA))
format.pval(c(0.1, 0.0001, 1e-27))
formatC() formats numbers individually and flexibly using C style format specifications.
prettyNum() is used for “prettifying” (possibly formatted) numbers, also in format.default.
.format.zeros(x), an auxiliary function of prettyNum(), re-formats the zeros in a vector x of formatted numbers.
formatC(x, digits = NULL, width = NULL,
        format = NULL, flag = "", mode = NULL,
        big.mark = "", big.interval = 3L,
        small.mark = "", small.interval = 5L,
        decimal.mark = getOption("OutDec"),
        preserve.width = "individual",
        zero.print = NULL, replace.zero = TRUE,
        drop0trailing = FALSE)

prettyNum(x, big.mark = "", big.interval = 3L,
          small.mark = "", small.interval = 5L,
          decimal.mark = getOption("OutDec"), input.d.mark = decimal.mark,
          preserve.width = c("common", "individual", "none"),
          zero.print = NULL, replace.zero = FALSE,
          drop0trailing = FALSE, is.cmplx = NA,
          ...)

.format.zeros(x, zero.print, nx = suppressWarnings(as.numeric(x)),
              replace = FALSE, warn.non.fitting = TRUE)
x | an atomic numerical or character object, possibly |
digits | the desired number of digits after the decimal point ( Default: 2 for integer, 4 for real numbers. If less than 0, the C default of 6 digits is used. If specified as more than 50, 50 will be used with a warning unless |
width | the total field width; if both |
format | equal to
|
flag | for
There can be more than one of these flags, in any order. Other characters used to have no effect for |
mode |
|
big.mark | character; if not empty used as mark between every |
big.interval | see |
small.mark | character; if not empty used as mark between every |
small.interval | see |
decimal.mark | the character to be used to indicate the numeric decimal point. |
input.d.mark | if |
preserve.width | string specifying if the string widths should be preserved where possible in those cases where marks ( |
zero.print | logical, character string or |
replace.zero ,replace | logical; if This works via |
warn.non.fitting | logical; if it is true, |
drop0trailing | logical, indicating if trailing zeros, i.e., |
is.cmplx | optional logical, to be used when |
... | arguments passed to |
nx | numeric vector of the same length as |
For numbers, formatC() calls prettyNum() when needed which itself calls .format.zeros(*, replace=replace.zero). (“when needed”: when zero.print is not NULL, drop0trailing is true, or one of big.mark, small.mark, or decimal.mark is not at default.)
If you set format it overrides the setting of mode, so formatC(123.45, mode = "double", format = "d") gives 123.
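A minimal sketch of that precedence, using the value from the text; the variable names are chosen for this illustration only.

```r
# 'format = "d"' takes precedence over 'mode = "double"', so the value is
# rendered as an integer.
as_int <- formatC(123.45, mode = "double", format = "d")

# Without 'format', a double is rendered with the default "g" format.
as_dbl <- formatC(123.45, mode = "double")

print(c(as_int, as_dbl))
```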
The rendering of scientific format is platform-dependent: some systems use n.ddde+nnn or n.dddenn rather than n.ddde+nn.
formatC does not necessarily align the numbers on the decimal point, so formatC(c(6.11, 13.1), digits = 2, format = "fg") gives c("6.1", " 13"). If you want common formatting for several numbers, use format.
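The difference between individual and common formatting can be seen by running both functions on the vector from the text. This is only a sketch; exact padding may differ between platforms, so no output is shown.

```r
x <- c(6.11, 13.1)

# formatC() formats each element on its own, so the decimal points
# need not line up.
individual <- formatC(x, digits = 2, format = "fg")

# format() chooses one common format (and width) for the whole vector.
common <- format(x, digits = 2)

print(individual)
print(common)
```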
prettyNum is the utility function for prettifying x. x can be complex (or format(<complex>)), here. If x is not a character, format(x[i], ...) is applied to each element, and then it is left unchanged if all the other arguments are at their defaults. Use the input.d.mark argument for prettyNum(x) when x is a character vector not resulting from something like format(<number>) with a period as decimal mark.
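As a small illustration of the two kinds of input, the sketch below passes one numeric value and one already formatted character string (with a period as decimal mark) to prettyNum. The values are arbitrary.

```r
# Numeric input: format() is applied first, then the marks are inserted.
from_number <- prettyNum(1234567, big.mark = ",")

# Character input that already uses a period as decimal mark: only the
# integer part receives the big mark, the fractional digits are kept as given.
from_string <- prettyNum("76491283764.97430", big.mark = ",")

print(from_number)
print(from_string)
```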
Because gsub is used to insert the big.mark and small.mark, special characters need escaping. In particular, to insert a single backslash, use "\\\\".
The C doubles used for R numerical vectors have signed zeros, which formatC may output as -0, -0.000 ....
There is a warning if big.mark and decimal.mark are the same: that would be confusing to those reading the output.
A character object of same size and attributes as x (after discarding any class), in the current locale's encoding.
Unlike format, each number is formatted individually. Looping over each element of x, the C function sprintf(...) is called for numeric inputs (inside the C function str_signif).
formatC: for character x, do simple (left or right) padding with white space.
The default for decimal.mark in formatC() was changed in R 3.2.0: for use within print methods in packages which might be used with earlier versions: use decimal.mark = getOption("OutDec") explicitly.
formatC was originally written by Bill Dunlap for S-PLUS, later much improved by Martin Maechler.
It was first adapted for R by Friedrich Leisch and since much improved by the R Core team.
Kernighan, B. W. and Ritchie, D. M. (1988) The C Programming Language. Second edition. Prentice Hall.
sprintf for more general C-like formatting.
xx <- pi * 10^(-5:4)
cbind(format(xx, digits = 4), formatC(xx))
cbind(formatC(xx, width = 9, flag = "-"))
cbind(formatC(xx, digits = 5, width = 8, format = "f", flag = "0"))
cbind(format(xx, digits = 4), formatC(xx, digits = 4, format = "fg"))

f <- (-2:4); f <- f*16^f
# Default ("g") format:
formatC(pi*f)
# Fixed ("f") format, more than one flag ('width' partly "enlarged"):
cbind(formatC(pi*f, digits = 3, width=9, format = "f", flag = "0+"))

formatC( c("a", "Abc", "no way"), width = -7) # <=> flag = "-"
formatC(c((-1:1)/0,c(1,100)*pi), width = 8, digits = 1)

## note that some of the results here depend on the implementation
## of long-double arithmetic, which is platform-specific.
xx <- c(1e-12,-3.98765e-10,1.45645e-69,1e-70,pi*1e37,3.44e4)
## 1 2 3 4 5 6
formatC(xx)
formatC(xx, format = "fg") # special "fixed" format.
formatC(xx[1:4], format = "f", digits = 75) #>> even longer strings

formatC(c(3.24, 2.3e-6), format = "f", digits = 11)
formatC(c(3.24, 2.3e-6), format = "f", digits = 11, drop0trailing = TRUE)

r <- c("76491283764.97430", "29.12345678901", "-7.1234", "-100.1","1123")
## American:
prettyNum(r, big.mark = ",")
## Some Europeans:
prettyNum(r, big.mark = "'", decimal.mark = ",")

(dd <- sapply(1:10, function(i) paste((9:0)[1:i], collapse = "")))
prettyNum(dd, big.mark = "'")

## examples of 'small.mark'
pN <- stats::pnorm(1:7, lower.tail = FALSE)
cbind(format (pN, small.mark = " ", digits = 15))
cbind(formatC(pN, small.mark = " ", digits = 17, format = "f"))

cbind(ff <- format(1.2345 + 10^(0:5), width = 11, big.mark = "'"))
## all with same width (one more than the specified minimum)

## individual formatting to common width:
fc <- formatC(1.234 + 10^(0:8), format = "fg", width = 11, big.mark = "'")
cbind(fc)
## Powers of two, stored exactly, formatted individually:
pow.2 <- formatC(2^-(1:32), digits = 24, width = 1, format = "fg")
## nicely printed (the last line showing 5^32 exactly):
noquote(cbind(pow.2))

## complex numbers:
r <- 10.0000001; rv <- (r/10)^(1:10)
(zv <- (rv + 1i*rv))
op <- options(digits = 7) ## (system default)
(pnv <- prettyNum(zv))
stopifnot(pnv == "1+1i", pnv == format(zv),
          pnv == prettyNum(zv, drop0trailing = TRUE))
## more digits change the picture:
options(digits = 8)
head(fv <- format(zv), 3)
prettyNum(fv)
prettyNum(fv, drop0trailing = TRUE) # a bit nicer
options(op)

## The ' flag :
doLC <- FALSE # <= R warns, so change to TRUE manually if you want to see the effect
if(doLC) {
  oldLC <- Sys.getlocale("LC_NUMERIC")
  Sys.setlocale("LC_NUMERIC", "de_CH.UTF-8")
}
formatC(1.234 + 10^(0:4), format = "fg", width = 11, flag = "'")
## --> ..... "      1'001" "     10'001" on supported platforms
if(doLC) ## revert, typically to "C" :
  Sys.setlocale("LC_NUMERIC", oldLC)
Format vectors of items and their descriptions as 2-column tables or LaTeX-style description lists.
formatDL(x, y, style = c("table", "list"), width = 0.9 * getOption("width"), indent = NULL)
x | a vector giving the items to be described, or a list of length 2 or a matrix with 2 columns giving both items and descriptions. |
y | a vector of the same length as |
style | a character string specifying the rendering style of the description information. Can be abbreviated. If |
width | a positive integer giving the target column for wrapping lines in the output. |
indent | a positive integer specifying the indentation of the second column in table style, and the indentation of continuation lines in list style. Must not be greater than |
After extracting the vectors of items and corresponding descriptions from the arguments, both are coerced to character vectors.
In table style, items with more than indent - 3 characters are displayed on a line of their own.
A character vector with the formatted entries.
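The two styles can be compared on a tiny input. The items and descriptions below are invented for this sketch; the exact spacing depends on getOption("width"), so no output is shown.

```r
# Hypothetical items and descriptions, for illustration only.
items <- c("alpha", "beta")
descs <- c("the first entry", "the second entry")

tab <- formatDL(items, descs)                  # default: table style
lst <- formatDL(items, descs, style = "list")  # description-list style

writeLines(tab)
writeLines(lst)
```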
## Provide a nice summary of the numerical characteristics of the
## machine R is running on:
writeLines(formatDL(unlist(.Machine)))
## Inspect Sys.getenv() results in "list" style (by default, these are
## printed in "table" style):
writeLines(formatDL(Sys.getenv(), style = "list"))
These functions provide the base mechanisms for defining new functions in the R language.
function( arglist ) expr
\( arglist ) expr
return(value)
arglist | empty or one or more (comma-separated) ‘name’ or ‘name = expression’ terms and/or the special token |
expr | an expression. |
value | an expression. |
The names in an argument list can be back-quoted non-standard names (see ‘backquote’).
If value is missing, NULL is returned. If it is a single expression, the value of the evaluated expression is returned. (The expression is evaluated as soon as return is called, in the evaluation frame of the function and before any on.exit expression is evaluated.)
If the end of a function is reached without calling return, the value of the last evaluated expression is returned.
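These rules can be sketched with three small functions. All names here (f_explicit, f_null, f_exit, log_env) are invented for this illustration.

```r
f_explicit <- function(x) {
  if (x < 0) return("negative")  # return() leaves the function immediately
  "non-negative"                 # otherwise the last evaluated expression is the value
}

f_null <- function() return()    # 'value' is missing, so NULL is returned

# The return value is evaluated before the on.exit() expression runs.
log_env <- new.env()
f_exit <- function() {
  counter <- 1
  on.exit(assign("seen", "on.exit ran", envir = log_env))
  return(counter + 1)
}

f_explicit(-1)
f_explicit(1)
f_null()
f_exit()
```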
The shorthand form \(x) x + 1 is parsed as function(x) x + 1. It may be helpful in making code containing simple function expressions more readable.
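A short sketch of the equivalence, assuming a version of R whose parser accepts the backslash shorthand; the names are illustrative.

```r
inc_long  <- function(x) x + 1
inc_short <- \(x) x + 1          # parsed as function(x) x + 1

# The shorthand is mostly useful for small anonymous functions:
squares <- sapply(1:3, \(i) i^2)

inc_short(1)
squares
```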
This type of function is not the only type in R: they are called closures (a name with origins in LISP) to distinguish them from primitive functions.
A closure has three components, its formals (its argument list), its body (expr in the ‘Usage’ section) and its environment which provides the enclosure of the evaluation frame when the closure is used.
There is an optional further component if the closure has been byte-compiled. This is not normally user-visible, but is indicated when functions are printed.
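The three components can be inspected directly. The function make_adder below is a made-up example, not part of base R.

```r
# A function that returns a closure; 'n' lives in the closure's environment.
make_adder <- function(n) function(x) x + n
add2 <- make_adder(2)

formals(add2)      # the argument list: a single argument 'x'
body(add2)         # the expression: x + n
environment(add2)  # the enclosure in which 'n' is looked up

add2(3)
```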
Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988) The New S Language. Wadsworth & Brooks/Cole.
args.
formals, body and environment for accessing the component parts of a function.
debug for debugging; using invisible inside return(.) for returning invisibly.
norm <- function(x) sqrt(x%*%x)
norm(1:4)
## An anonymous function:
(function(x, y){ z <- x^2 + y^2; x+y+z })(0:7, 1)
Reduce uses a binary function to successively combine the elements of a given vector and a possibly given initial value. Filter extracts the elements of a vector for which a predicate (logical) function gives true. Find and Position give the first or last such element and its position in the vector, respectively. Map applies a function to the corresponding elements of given vectors. Negate creates the negation of a given function.
Reduce(f, x, init, right = FALSE, accumulate = FALSE, simplify = TRUE)
Filter(f, x)
Find(f, x, right = FALSE, nomatch = NULL)
Map(f, ...)
Negate(f)
Position(f, x, right = FALSE, nomatch = NA_integer_)
f | a function of the appropriate arity (binary for |
x | a vector. |
init | an R object of the same kind as the elements of |
right | a logical indicating whether to proceed from left to right (default) or from right to left. |
accumulate | a logical indicating whether the successive reduce combinations should be accumulated. By default, only the final combination is used. |
simplify | a logical indicating whether accumulated results should be simplified (by unlisting) in case they all are length one. |
nomatch | the value to be returned in the case when “no match” (no element satisfying the predicate) is found. |
... | vectors to which the function is |
If init is given, Reduce logically adds it to the start (when proceeding left to right) or the end of x, respectively. If this possibly augmented vector $v$ has $n > 1$ elements, Reduce successively applies $f$ to the elements of $v$ from left to right or right to left, respectively. I.e., a left reduce computes $l_1 = f(v_1, v_2)$, $l_2 = f(l_1, v_3)$, etc., and returns $l_{n-1} = f(l_{n-2}, v_n)$, and a right reduce does $r_{n-1} = f(v_{n-1}, v_n)$, $r_{n-2} = f(v_{n-2}, r_{n-1})$ and returns $r_1 = f(v_1, r_2)$. (E.g., if $v$ is the sequence (2, 3, 4) and $f$ is division, left and right reduce give $(2 / 3) / 4 = 1/6$ and $2 / (3 / 4) = 8/3$, respectively.) If $v$ has only a single element, this is returned; if there are no elements, NULL is returned. Thus, it is ensured that f is always called with 2 arguments.
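The division example from the text can be checked directly. The sketch below also shows the accumulated steps, an initial value, and the two edge cases; variable names are illustrative.

```r
v <- c(2, 3, 4)

left  <- Reduce(`/`, v)                      # (2 / 3) / 4
right <- Reduce(`/`, v, right = TRUE)        # 2 / (3 / 4)
steps <- Reduce(`/`, v, accumulate = TRUE)   # 2, 2/3, (2/3)/4

# 'init' is logically prepended when reducing from the left:
with_init <- Reduce(`+`, 1:3, 10)            # ((10 + 1) + 2) + 3

# Edge cases: one element is returned as is, no elements give NULL.
single <- Reduce(`/`, 5)
empty  <- Reduce(`/`, numeric(0))

c(left = left, right = right)
```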
The current implementation is non-recursive to ensure stability and scalability.
Reduce is patterned after Common Lisp's reduce. A reduce is also known as a fold (e.g., in Haskell) or an accumulate (e.g., in the C++ Standard Template Library). The accumulative version corresponds to Haskell's scan functions.
Filter applies the unary predicate function f to each element of x, coercing to logical if necessary, and returns the subset of x for which this gives true. Note that possible NA values are currently always taken as false; control over NA handling may be added in the future. Filter corresponds to filter in Haskell or ‘remove-if-not’ in Common Lisp.
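A brief sketch of the NA handling and of the coercion to logical; the input vectors are arbitrary.

```r
x <- c(1, NA, 3, -2)

# The predicate yields NA for the missing element, which counts as false.
positive <- Filter(function(v) v > 0, x)

# A non-logical result is coerced: 1 becomes TRUE, 0 becomes FALSE.
odd <- Filter(function(v) v %% 2, 1:4)

positive
odd
```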
Find and Position are patterned after Common Lisp's ‘find-if’ and ‘position-if’, respectively. If there is an element for which the predicate function gives true, then the first or last such element or its position is returned depending on whether right is false (default) or true, respectively. If there is no such element, the value specified by nomatch is returned. The current implementation is not optimized for performance.
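The effect of right and nomatch can be sketched on a small vector; the predicates is_big and is_huge are made up for this example.

```r
x <- c(1, 5, 2, 7)
is_big  <- function(v) v > 3
is_huge <- function(v) v > 10    # no element satisfies this

first_el  <- Find(is_big, x)                    # first matching element
last_el   <- Find(is_big, x, right = TRUE)      # last matching element
first_pos <- Position(is_big, x)                # index of the first match
last_pos  <- Position(is_big, x, right = TRUE)  # index of the last match

# No match: the 'nomatch' value is returned.
none_el  <- Find(is_huge, x)
none_pos <- Position(is_huge, x)
zero_pos <- Position(is_huge, x, nomatch = 0L)
```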
Map is a simple wrapper to mapply which does not attempt to simplify the result, similar to Common Lisp's mapcar (with arguments being recycled, however). Future versions may allow some control of the result type.
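The lack of simplification, compared with mapply, can be sketched as follows; the inputs are arbitrary.

```r
# Map() always returns a list ...
sums <- Map(function(a, b) a + b, 1:3, 4:6)

# ... whereas mapply() simplifies to a vector by default.
simplified <- mapply(function(a, b) a + b, 1:3, 4:6)

# Shorter arguments are recycled.
recycled <- Map(`+`, 1, 1:3)

# Names of the first vector are carried over to the result.
named <- Map(function(n, v) paste(n, v), c(x = "a", y = "b"), 1:2)

str(sums)
```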
Negate corresponds to Common Lisp's complement. Given a (predicate) function f, it creates a function which returns the logical negation of what f returns.
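A minimal sketch; is_missing and is_present are names invented for this illustration.

```r
is_missing <- function(v) is.na(v)
is_present <- Negate(is_missing)   # returns !is_missing(v)

flags <- is_present(c(1, NA, 3))

# Negate() combines naturally with Filter():
kept <- Filter(Negate(is.null), list(1, NULL, "a"))

flags
```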
Function clusterMap and mcmapply (not Windows) in package parallel provide parallel versions of Map.
## A general-purpose adder:
add <- function(x) Reduce(`+`, x)
add(list(1, 2, 3))
## Like sum(), but can also be used for adding matrices etc., as it will
## use the appropriate '+' method in each reduction step.
## More generally, many generics meant to work on arbitrarily many
## arguments can be defined via reduction:
FOO <- function(...) Reduce(FOO2, list(...))
FOO2 <- function(x, y) UseMethod("FOO2")
## FOO() methods can then be provided via FOO2() methods.

## A general-purpose cumulative adder:
cadd <- function(x) Reduce(`+`, x, accumulate = TRUE)
cadd(seq_len(7))

## A simple function to compute continued fractions:
cfrac <- function(x) Reduce(function(u, v) u + 1 / v, x, right = TRUE)
## Continued fraction approximation for pi:
cfrac(c(3, 7, 15, 1, 292))
## Continued fraction approximation for Euler's number (e):
cfrac(c(2, 1, 2, 1, 1, 4, 1, 1, 6, 1, 1, 8))

## Map() now recycles similar to basic Ops:
Map(`+`, 1, 1 : 3) ; 1 + 1:3
Map(`+`, numeric(), 1 : 3) ; numeric() + 1:3

## Iterative function application:
Funcall <- function(f, ...) f(...)
## Compute log(exp(acos(cos(0))))
Reduce(Funcall, list(log, exp, acos, cos), 0, right = TRUE)
## n-fold iterate of a function, functional style:
Iterate <- function(f, n = 1)
    function(x) Reduce(Funcall, rep.int(list(f), n), x, right = TRUE)
## Continued fraction approximation to the golden ratio:
Iterate(function(x) 1 + 1 / x, 30)(1)
## which is the same as
cfrac(rep.int(1, 31))
## Computing square root approximations for x as fixed points of the
## function t |-> (t + x / t) / 2, as a function of the initial value:
asqrt <- function(x, n) Iterate(function(t) (t + x / t) / 2, n)
asqrt(2, 30)(10) # Starting from a positive value => +sqrt(2)
asqrt(2, 30)(-1) # Starting from a negative value => -sqrt(2)

## A list of all functions in the base environment:
funs <- Filter(is.function, sapply(ls(baseenv()), get, baseenv()))
## Functions in base with more than 10 arguments:
names(Filter(function(f) length(formals(f)) > 10, funs))
## Number of functions in base with a '...' argument:
length(Filter(function(f) any(names(formals(f)) %in% "..."), funs))

## Find all objects in the base environment which are *not* functions:
Filter(Negate(is.function), sapply(ls(baseenv()), get, baseenv()))
A call of gc causes a garbage collection to take place. gcinfo sets a flag so that automatic collection is either silent (verbose = FALSE) or prints memory usage statistics (verbose = TRUE).
gc(verbose = getOption("verbose"), reset = FALSE, full = TRUE)
gcinfo(verbose)
verbose | logical; if |
reset | logical; if |
full | logical; if |
A call of gc causes a garbage collection to take place. This will also take place automatically without user intervention, and the primary purpose of calling gc is for the report on memory usage. For an accurate report full = TRUE should be used.
It can be useful to call gc after a large object has been removed, as this may prompt R to return memory to the operating system.
R allocates space for vectors in multiples of 8 bytes: hence the report of "Vcells", a relic of an earlier allocator (that used a vector heap).
When gcinfo(TRUE) is in force, messages are sent to the message connection at each garbage collection of the form
    Garbage collection 12 = 10+0+2 (level 0) ...
    6.4 Mbytes of cons cells used (58%)
    2.0 Mbytes of vectors used (32%)
Here the last two lines give the current memory usage rounded up to the next 0.1Mb and as a percentage of the current trigger value. The first line gives a breakdown of the number of garbage collections at various levels (for an explanation see the ‘R Internals’ manual).
gc returns a matrix with rows "Ncells" (cons cells), usually 28 bytes each on 32-bit systems and 56 bytes on 64-bit systems, and "Vcells" (vector cells, 8 bytes each), and columns "used" and "gc trigger", each also interpreted in megabytes (rounded up to the next 0.1Mb).
If maxima have been set for either "Ncells" or "Vcells", a fifth column is printed giving the current limits in Mb (with NA denoting no limit).
The final two columns show the maximum space used since the last call to gc(reset = TRUE) (or since R started).
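The returned matrix can be indexed by the row and column names given above. The conversion to megabytes below is only a rough sketch based on the stated 8 bytes per vector cell; the numbers themselves vary from session to session.

```r
usage <- gc()

rownames(usage)   # "Ncells" and "Vcells"
colnames(usage)

# Approximate size of the vector heap in use, assuming 8 bytes per Vcell.
vcells_mb <- usage["Vcells", "used"] * 8 / 1024^2
vcells_mb
```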
gcinfo returns the previous value of the flag.
The ‘R Internals’ manual.
Memory on R's memory management, and gctorture if you are an R developer.
gc.time() reports time used for garbage collection.
reg.finalizer for actions to happen at garbage collection.
gc() #- do it now
gcinfo(TRUE) #-- in the future, show when R does it
## vvvvv use larger to *show* something
x <- integer(100000); for(i in 1:18) x <- c(x, i)
gcinfo(verbose = FALSE) #-- don't show it anymore

gc(TRUE)

gc(reset = TRUE)
This function reports the time spent in garbage collection so far in the R session while GC timing was enabled.
gc.time(on = TRUE)
on | logical; if |
Due to timer resolution this may be an under-estimate.
This is a primitive.
A numerical vector of length 5 giving the user CPU time, the system CPU time, the elapsed time and children's user and system CPU times (normally both zero), of time spent doing garbage collection whilst GC timing was enabled.
Times of child processes are not available on Windows and will always be given as NA.
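A minimal sketch of reading the timing vector. A collection is forced first so that there is something to report; the times depend on the session.

```r
gc.time(TRUE)      # make sure GC timing is enabled
invisible(gc())    # force at least one collection

t_gc <- gc.time()
t_gc               # user, system, elapsed, children's user and system times
```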
gc, proc.time for the timings for the session.
gc.time()
Provokes garbage collection on (nearly) every memory allocation. Intended to ferret out memory protection bugs. Also makes R run very slowly, unfortunately.
gctorture(on = TRUE)
gctorture2(step, wait = step, inhibit_release = FALSE)
on | logical; turning it on/off. |
step | integer; run GC every |
wait | integer; number of allocations to wait before starting GC torture. |
inhibit_release | logical; do not release free objects for re-use: use with caution. |
Calling gctorture(TRUE) instructs the memory manager to force a full GC on every allocation. gctorture2 provides a more refined interface that allows the start of the GC torture to be deferred and also gives the option of running a GC only every step allocations.
The third argument to gctorture2 is only used if R has been configured with a strict write barrier enabled. When this is the case all garbage collections are full collections, and the memory manager marks free nodes and enables checks in many situations that signal an error when a free node is used. This can help greatly in isolating unprotected values in C code. It does not detect the case where a node becomes free and is reallocated. The inhibit_release argument can be used to prevent such reallocation. This will cause memory to grow and should be used with caution and in conjunction with operating system facilities to monitor and limit process memory use.
gctorture2 can also be invoked via environment variables at the start of the R session. R_GCTORTURE corresponds to the step argument, R_GCTORTURE_WAIT to wait, and R_GCTORTURE_INHIBIT_RELEASE to inhibit_release.
Previous value of first argument.
Peter Dalgaard and Luke Tierney
Search by name for an object (get) or zero or more objects (mget).
get(x, pos = -1, envir = as.environment(pos), mode = "any", inherits = TRUE)
mget(x, envir = as.environment(-1), mode = "any", ifnotfound, inherits = FALSE)
dynGet(x, ifnotfound = , minframe = 1L, inherits = FALSE)
x | For |
pos, envir | where to look for the object (see ‘Details’); if omitted search as if the name of the object appeared unquoted in an expression. |
mode | the mode or type of object sought: see the ‘Details’ section. |
inherits | should the enclosing frames of the environment be searched? |
ifnotfound | For |
minframe | integer specifying the minimal frame number to look into. |
The pos argument can specify the environment in which to look for the object in any of several ways: as a positive integer (the position in the search list); as the character string name of an element in the search list; or as an environment (including using sys.frame to access the currently active function calls). The default of -1 indicates the current environment of the call to get. The envir argument is an alternative way to specify an environment.
These functions look to see if each of the name(s) x have a value bound to it in the specified environment. If inherits is TRUE and a value is not found for x in the specified environment, the enclosing frames of the environment are searched until the name x is encountered. See environment and the ‘R Language Definition’ manual for details about the structure of environments and their enclosures.
If mode is specified then only objects of that type are sought. mode here is a mixture of the meanings of typeof and mode: "function" covers primitive functions and operators, "numeric", "integer" and "double" all refer to any numeric type, "symbol" and "name" are equivalent but "language" must be used (and not "call" or "("). Currently, mode = "S4" and mode = "object" are equivalent.
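The interplay of mode and inherits can be sketched with an environment that masks the base function c with a number. The environment e is created only for this illustration.

```r
e <- new.env()
assign("c", 5, envir = e)   # a numeric object that shadows base::c inside 'e'

# Without 'mode', the first object named "c" is returned.
val <- get("c", envir = e)

# With mode = "function", the numeric is skipped and the search continues
# in the enclosing environments until a function is found.
fun <- get("c", envir = e, mode = "function")

# With inherits = FALSE only 'e' is searched, so no function is found.
strict <- tryCatch(
  get("c", envir = e, mode = "function", inherits = FALSE),
  error = function(err) "error"
)
```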
For mget, the values of mode and ifnotfound can be either the same length as x or of length 1. The argument ifnotfound must be a list containing either the value to use if the requested item is not found or a function of one argument which will be called if the item is not found, with argument the name of the item being requested.
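Both forms of ifnotfound can be sketched as follows. The environment and the message text are invented for this example.

```r
e <- new.env()
assign("a", 1, envir = e)

# A length-1 list holding a fallback value, used for every missing name.
res_value <- mget(c("a", "b"), envir = e, ifnotfound = list(NA))

# A length-1 list holding a function, called with the missing name.
res_fun <- mget(c("a", "b"), envir = e,
                ifnotfound = list(function(nm) paste("no object named", nm)))

str(res_value)
str(res_fun)
```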
dynGet() is somewhat experimental and to be used inside another function. It looks for an object in the callers, i.e., the sys.frame()s of the function. Use with caution.
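A minimal sketch of looking up a variable in a caller. The functions inner and outer and the variable marker are hypothetical names; since dynGet is described as experimental, its behaviour may differ between R versions.

```r
# 'inner' does not define 'marker' itself; dynGet() searches the frames
# of the functions that called it.
inner <- function() dynGet("marker", ifnotfound = "not found")

outer <- function() {
  marker <- "set in outer"
  inner()
}

from_caller <- outer()
from_caller
```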
For get, the object found. If no object is found an error results.
For mget, a named list of objects (found or specified via ifnotfound).
The reverse (or “inverse”) of a <- get(nam) is assign(nam, a), assigning a to name nam.
inherits = TRUE is the default for get in R but not for S where it had a different meaning.
Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988) The New S Language. Wadsworth & Brooks/Cole.
exists for checking whether an object exists; get0 for an efficient way of both checking existence and getting an object.
assign, the inverse of get(), see above.
Use getAnywhere for searching for an object anywhere, including in other namespaces, and getFromNamespace to find an object in a specific namespace.
get("%o%")

## test mget
e1 <- new.env()
mget(letters, e1, ifnotfound = as.list(LETTERS))
This function allows us to query the set of routines in a DLL that are registered with R to enhance dynamic lookup, error handling when calling native routines, and potentially security in the future. This function provides a description of each of the registered routines in the DLL for the different interfaces, i.e. .C, .Call, .Fortran and .External.
getDLLRegisteredRoutines(dll, addNames = TRUE)
dll | a character string or The The |
addNames | a logical value. If this is |
This takes the registration information after it has been registered and processed by the R internals. In other words, it uses the extended information.
There is a print method for the class, which prints only the types which have registered routines.
A list of class "DLLRegisteredRoutines" with four elements corresponding to the routines registered for the .C, .Call, .Fortran and .External interfaces. Each is a list (of class "NativeRoutineList") with as many elements as there were routines registered for that interface.
Each element identifies a routine and is an object of class "NativeSymbolInfo". An object of this class has the following fields:
name | the registered name of the routine (not necessarily the name in the C code). |
address | the memory address of the routine as resolved in the loaded DLL. This may be |
dll | an object of class |
numParameters | the number of arguments the native routine is to be called with. |
Duncan Temple Lang [email protected]
‘Writing R Extensions’ manual for symbol registration.
Duncan Temple Lang (2001). “In Search of C/C++ & FORTRAN Routines”. R News, 1(3), 20–23. https://www.r-project.org/doc/Rnews/Rnews_2001-3.pdf.
getLoadedDLLs, getNativeSymbolInfo for information on the entry points listed.
dlls <- getLoadedDLLs()
getDLLRegisteredRoutines(dlls[["base"]])
getDLLRegisteredRoutines("stats")
This function provides a way to get a list of all the DLLs (see dyn.load) that are currently loaded in the R session.
getLoadedDLLs()
This queries the internal table that manages the DLLs.
An object of class "DLLInfoList" which is a list with an element corresponding to each DLL that is currently loaded in the session. Each element is an object of class "DLLInfo" which has the following entries.
name | the abbreviated name. |
path | the fully qualified name of the loaded DLL. |
dynamicLookup | a logical value indicating whether R uses only the registration information to resolve symbols or whether it searches the entire symbol table of the DLL. |
handle | a reference to the C-level data structure that provides access to the contents of the DLL. This is an object of class |
Note that the class DLLInfo has a method for $ which can be used to resolve native symbols within that DLL. Therefore, one must access the R-level elements described above using [[, e.g. x[["name"]] or x[["handle"]].
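A short sketch of accessing the R-level entries with [[ rather than $, using the "base" entry that also appears in the examples of this manual.

```r
dlls <- getLoadedDLLs()
base_dll <- dlls[["base"]]

# Use [[ for the R-level fields; $ is reserved for resolving native symbols.
base_dll[["name"]]
base_dll[["path"]]
base_dll[["dynamicLookup"]]
```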
We are starting to use the handle elements in the DLL object to resolve symbols more directly in R.
Duncan Temple Lang [email protected].
getDLLRegisteredRoutines, getNativeSymbolInfo
getLoadedDLLs()
utils::tail(getLoadedDLLs(), 2) # the last 2 loaded ones, still a DLLInfoList
This finds and returns a description of one or more dynamically loaded or ‘exported’ built-in native symbols. For each name, it returns information about the name of the symbol, the library in which it is located and, if available, the number of arguments it expects and by which interface it should be called (i.e., .Call, .C, .Fortran, or .External). Additionally, it returns the address of the symbol, which can be passed to other C routines. Specifically, this provides a way to explicitly share symbols between different dynamically loaded package libraries. It also provides a way to query where symbols were resolved, and aids in diagnosing strange behavior associated with dynamic resolution.
getNativeSymbolInfo(name, PACKAGE, unlist = TRUE, withRegistrationInfo = FALSE)
name | the name(s) of the native symbol(s). |
PACKAGE | an optional argument that specifies to which DLL to restrict the search for this symbol. If this is |
unlist | a logical value which controls how the result is returned if the function is called with the name of a single symbol. If |
withRegistrationInfo | a logical value indicating whether, if |
This uses the same mechanism for resolving symbols as is used in all the native interfaces (.Call, etc.). If the symbol has been explicitly registered by the DLL in which it is contained, information about the number of arguments and the interface by which it should be called will be returned. Otherwise, a generic native symbol object is returned.
Generally, a list of NativeSymbolInfo elements whose elements can be indexed by the elements of name in the call. Each NativeSymbolInfo object is a list containing the following elements:
name | the name of the symbol, as given by the |
address | if |
dll | a list containing 3 elements:
|
If the routine was explicitly registered by the dynamically loaded library, the list contains a fourth field
numParameters | the number of arguments that should be passed in a call to this routine. |
Additionally, the list will have an additional class, one of CRoutine, CallRoutine, FortranRoutine or ExternalRoutine, corresponding to the R interface by which it should be invoked.
If any of the symbols is not found, an error is raised.
If name contains only one symbol name and unlist is TRUE, then the single NativeSymbolInfo is returned rather than the list containing that one element.
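The error behaviour described above can be checked without depending on any particular DLL. This is only a sketch: "no_such_symbol_xyz" is a hypothetical name that is assumed not to be resolvable in the current session.

```r
## "no_such_symbol_xyz" is a made-up symbol name (an assumption of this sketch).
res <- tryCatch(getNativeSymbolInfo("no_such_symbol_xyz"),
                error = function(e) conditionMessage(e))
res  # the error message, as a character string
```

Wrapping the call in tryCatch() lets calling code decide how to proceed when a symbol is not available.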
The third element of the NativeSymbolInfo objects was renamed from package to dll in R version 3.6.0, for consistency with the names of the NativeSymbolInfo objects returned by getDLLRegisteredRoutines().
One motivation for accessing this reflectance information is to be able to pass native routines to C routines as function pointers in C. This allows us to treat native routines and R functions in a similar manner, such as when passing an R function to C code that makes callbacks to that function at different points in its computation (e.g., nls). Additionally, we can resolve the symbol just once and avoid resolving it repeatedly or using the internal cache.
Duncan Temple Lang
For information about registering native routines, see “In Search of C/C++ & FORTRAN Routines”, R-News, volume 1, number 3, 2001, p20–23 (https://www.r-project.org/doc/Rnews/Rnews_2001-3.pdf).
getDLLRegisteredRoutines, is.loaded, .C, .Fortran, .External, .Call, dyn.load.
Text messages, typically from calls to stop(), warning(), or message(), are translated when Native Language Support (NLS) was enabled in this build of R, as it almost always is; see also the bindtextdomain() example.
The functions documented here are the low level building blocks used explicitly or implicitly in almost all such message producing calls. They attempt to translate character vectors or set where the translations are to be found.
gettext(..., domain = NULL, trim = TRUE)
ngettext(n, msg1, msg2, domain = NULL)
bindtextdomain(domain, dirname = NULL)
Sys.setLanguage(lang, unset = "en")
... | one or more character vectors. |
trim | logical indicating if the white space trimming in |
domain | the ‘domain’ for the translation, a |
n | a non-negative integer. |
msg1 | the message to be used in English for |
msg2 | the message to be used in English for |
dirname | the directory in which to find translated message catalogs for the domain. |
lang | a |
unset | a string, specifying the default language assumed to be current in the case |
If domain is NULL (the default) in gettext or ngettext, the domain is inferred. If gettext or ngettext is called from a function in the namespace of package pkg, including when called via stop(), warning(), or message() from the function, or, say, evaluated as if called from that namespace (see the evalq() example), the domain is set to "R-pkg". Otherwise there is no default domain and messages are not translated.
Setting domain = NA in gettext or ngettext suppresses any translation.
"" does not match any domain. In gettext or ngettext, domain = "" is effectively the same as domain = NA.
If the domain is found, each character string is offered for translation, and replaced by its translation into the current language if one is found.
The language to be used for message translation is determined by your OS default and/or the locale setting at R's startup, see Sys.getlocale(), and notably the LANGUAGE environment variable, and also Sys.setLanguage() here.
Conventionally the domain for R warning/error messages in package pkg is "R-pkg", and that for C-level messages is "pkg".
For gettext, when trim is true as by default, leading and trailing whitespace is ignored (“trimmed”) when looking for the translation.
ngettext is used where the message needs to vary by a single integer. Translating such messages is subject to very specific rules for different languages: see the GNU Gettext Manual. The string will often contain a single instance of %d to be used in sprintf. If English is used, msg1 is returned if n == 1 and msg2 in all other cases.
bindtextdomain is typically a wrapper for the C function of the same name: your system may have a man page for it. With a non-NULL dirname it specifies where to look for message catalogues: with dirname = NULL it returns the current location. If NLS is not enabled, bindtextdomain(*,*) returns NULL. The special case bindtextdomain(NULL) calls C level textdomain(textdomain(NULL)) for the purpose of flushing (i.e., emptying) the cache of already translated strings; it returns TRUE when NLS is enabled.
The utility Sys.setLanguage(lang) combines setting the LANGUAGE environment variable with flushing the translation cache by bindtextdomain(NULL).
For gettext, a character vector, one element per string in .... If translation is not enabled or no domain is found or no translation is found in that domain, the original strings are returned.
For ngettext, a character string.
For bindtextdomain, a character string giving the current base directory, or NULL if setting it failed.
For Sys.setLanguage(), the previous LANGUAGE setting with attribute attr(*, "ok"), a logical indicating success. Note that currently, a non-existing language lang is still set and no translation will happen, without any message.
stop and warning make use of gettext to translate messages.
xgettext (package tools) for extracting translatable strings from R source files.
bindtextdomain("R")  # non-null if and only if NLS is enabled

for(n in 0:3)
    print(sprintf(ngettext(n, "%d variable has missing values",
                              "%d variables have missing values"),
                  n))

## Not run: 
## for translation, those strings should appear in R-pkg.pot as
##   msgid        "%d variable has missing values"
##   msgid_plural "%d variables have missing values"
##   msgstr[0] ""
##   msgstr[1] ""
## End(Not run)

miss <- "One only" # this line, or the next for the ngettext() below
miss <- c("one", "or", "another")
cat(ngettext(length(miss), "variable", "variables"),
    paste(sQuote(miss), collapse = ", "),
    ngettext(length(miss), "contains", "contain"), "missing values\n")

## better for translators would be to use
cat(sprintf(ngettext(length(miss),
                     "variable %s contains missing values\n",
                     "variables %s contain missing values\n"),
            paste(sQuote(miss), collapse = ", ")))

thisLang <- Sys.getenv("LANGUAGE", unset = NA) # so we can reset it
if(is.na(thisLang) || !nzchar(thisLang)) thisLang <- "en" # "factory" default
enT <- "empty model supplied"
Sys.setenv(LANGUAGE = "de") # may not always 'work'
gettext(enT, domain="R-stats")# "leeres Modell angegeben" (if translation works)
tget <- function() gettext(enT)
tget() # not translated as fn tget() is not from "stats" pkg/namespace
evalq(function() gettext(enT), asNamespace("stats"))() # *is* translated

## Sys.setLanguage()  -- typical usage --
Sys.setLanguage("en") -> oldSet # does set LANGUAGE env.var
errMsg <- function(expr) tryCatch(expr, error=conditionMessage)
(errMsg(1 + "2") -> err)
Sys.setLanguage("fr")
errMsg(1 + "2")
Sys.setLanguage("de")
errMsg(1 + "2")
## Usually, you would reset the language to "previous" via
Sys.setLanguage(oldSet)

## A show off of translations -- platform (font etc) dependent:
## The translation languages available for "base" R in this version of R:
if(capabilities("NLS")) withAutoprint({
  langs <- list.files(bindtextdomain("R"),
                      pattern = "^[a-z]{2}(_[A-Z]{2}|@quot)?$")
  langs
  txts <- sapply(setNames(,langs),
                 function(lang) { Sys.setLanguage(lang)
                                  gettext("incompatible dimensions", domain="R-stats") })
  cbind(txts)
  (nTrans <- length(unique(txts)))
  (not_translated <- names(txts[txts == txts[["en"]]]))
})

## Here, we reset to the *original* setting before the full example started:
if(nzchar(thisLang)) { ## reset to previous and check
  Sys.setLanguage(thisLang)
  stopifnot(identical(errMsg(1 + "2"), err))
} # else staying at 'de' ..
getwd returns an absolute filepath representing the current working directory of the R process; setwd(dir) is used to set the working directory to dir.
getwd()
setwd(dir)
dir | A character string: tilde expansion will be done. |
See files for how file paths with marked encodings are interpreted.
getwd returns a character string or NULL if the working directory is not available. On Windows the path returned will use / as the path separator and be encoded in UTF-8. The path will not have a trailing / unless it is the root directory (of a drive or share on Windows).
setwd returns the current directory before the change, invisibly and with the same conventions as getwd. It will give an error if it does not succeed (including if it is not implemented).
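Because setwd returns the previous directory, a temporary change of directory can be undone reliably. The helper below is a sketch only: with_dir is a hypothetical name, not a base R function.

```r
## Hypothetical helper: evaluate 'code' with 'dir' as the working directory.
with_dir <- function(dir, code) {
    old <- setwd(dir)       # setwd() returns the previous directory
    on.exit(setwd(old))     # restore it even if 'code' signals an error
    force(code)             # 'code' is evaluated lazily, so it runs inside 'dir'
}

before <- getwd()
inside <- with_dir(tempdir(), getwd())
after  <- getwd()
```

Comparing paths through normalizePath() is advisable, since the same directory can have more than one representation.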
Note that the return value is said to be an absolute filepath: there can be more than one representation of the path to a directory and on some OSes the value returned can differ after changing directories and changing back to the same directory (for example if symbolic links have been traversed).
list.files for the contents of a directory.
normalizePath for a ‘canonical’ path name.
(WD <- getwd())
if (!is.null(WD)) setwd(WD)
Generate factors by specifying the pattern of their levels.
gl(n, k, length = n*k, labels = seq_len(n), ordered = FALSE)
n | an integer giving the number of levels. |
k | an integer giving the number of replications. |
length | an integer giving the length of the result. |
labels | an optional vector of labels for the resulting factor levels. |
ordered | a logical indicating whether the result should be ordered or not. |
The result has levels from 1 to n with each value replicated in groups of length k out to a total length of length.
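The replication pattern can be seen in a small case where length is not a multiple of n*k and the pattern is cut off part-way through the second cycle (a sketch; the label values are arbitrary).

```r
g <- gl(2, 2, 6)
as.integer(g)   # 1 1 2 2 1 1
levels(g)       # "1" "2"

## Custom labels replace the default levels.
levels(gl(2, 3, labels = c("lo", "hi")))
```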
gl is modelled on the GLIM function of the same name.
The underlying factor().
## First control, then treatment:
gl(2, 8, labels = c("Control", "Treat"))
## 20 alternating 1s and 2s
gl(2, 1, 20)
## alternating pairs of 1s and 2s
gl(2, 2, 20)
grep, grepl, regexpr, gregexpr, regexec and gregexec search for matches to argument pattern within each element of a character vector: they differ in the format of and amount of detail in the results.
sub and gsub perform replacement of the first and all matches respectively.
grep(pattern, x, ignore.case = FALSE, perl = FALSE, value = FALSE,
     fixed = FALSE, useBytes = FALSE, invert = FALSE)

grepl(pattern, x, ignore.case = FALSE, perl = FALSE,
      fixed = FALSE, useBytes = FALSE)

sub(pattern, replacement, x, ignore.case = FALSE, perl = FALSE,
    fixed = FALSE, useBytes = FALSE)

gsub(pattern, replacement, x, ignore.case = FALSE, perl = FALSE,
     fixed = FALSE, useBytes = FALSE)

regexpr(pattern, text, ignore.case = FALSE, perl = FALSE,
        fixed = FALSE, useBytes = FALSE)

gregexpr(pattern, text, ignore.case = FALSE, perl = FALSE,
         fixed = FALSE, useBytes = FALSE)

regexec(pattern, text, ignore.case = FALSE, perl = FALSE,
        fixed = FALSE, useBytes = FALSE)

gregexec(pattern, text, ignore.case = FALSE, perl = FALSE,
         fixed = FALSE, useBytes = FALSE)
pattern | character string containing a regular expression (or character string for |
x, text | a character vector where matches are sought, or an object which can be coerced by |
ignore.case | if |
perl | logical. Should Perl-compatible regexps be used? |
value | if |
fixed | logical. If |
useBytes | logical. If |
invert | logical. If |
replacement | a replacement for matched pattern in |
Arguments which should be character strings or character vectors are coerced to character if possible.
Each of these functions operates in one of three modes:
fixed = TRUE: use exact matching.
perl = TRUE: use Perl-style regular expressions.
fixed = FALSE, perl = FALSE: use POSIX 1003.2 extended regular expressions (the default).
See the help pages on regular expression for details of the different types of regular expressions.
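The three modes can be contrasted on a pattern containing a metacharacter. This is a sketch with made-up inputs; the lookahead in the last call is a construct only the Perl-style mode accepts.

```r
x <- c("abc", "a.c")
posix_hit <- grepl("a.c", x)                   # '.' matches any character
fixed_hit <- grepl("a.c", x, fixed = TRUE)     # '.' is a literal dot
perl_hit  <- grepl("a(?=b)", x, perl = TRUE)   # 'a' followed by 'b'
rbind(posix_hit, fixed_hit, perl_hit)
```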
The two *sub functions differ only in that sub replaces only the first occurrence of a pattern whereas gsub replaces all occurrences. If replacement contains backreferences which are not defined in pattern the result is undefined (but most often the backreference is taken to be "").
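A short sketch of the difference, together with a replacement whose backreferences are all defined in the pattern (the inputs are arbitrary):

```r
s <- "foo boo"
first_only <- sub("o", "0", s)    # "f0o boo"
every_one  <- gsub("o", "0", s)   # "f00 b00"

## \\1 and \\2 refer to the two parenthesized groups of the pattern.
swapped <- gsub("(a)(b)", "\\2\\1", "abab")   # "baba"
```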
For regexpr, gregexpr, regexec and gregexec it is an error for pattern to be NA, otherwise NA is permitted and gives an NA match.
Both grep and grepl take missing values in x as not matching a non-missing pattern.
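The treatment of missing values in x can be sketched as follows (inputs made up for illustration):

```r
x <- c("apple", NA, "pear")
hits <- grepl("a", x)   # TRUE FALSE TRUE: the NA element counts as a non-match
idx  <- grep("a", x)    # 1 3
```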
The main effect of useBytes = TRUE is to avoid errors/warnings about invalid inputs and spurious matches in multibyte locales, but for regexpr it changes the interpretation of the output. It inhibits the conversion of inputs with marked encodings, and is forced if any input is found which is marked as "bytes" (see Encoding).
Caseless matching does not make much sense for bytes in a multibyte locale, and you should expect it only to work for ASCII characters if useBytes = TRUE.
regexpr and gregexpr with perl = TRUE allow Python-style named captures, but not for long vector inputs.
Invalid inputs in the current locale are warned about up to 5 times.
Caseless matching with perl = TRUE for non-ASCII characters depends on the PCRE library being compiled with ‘Unicode property support’, which PCRE2 is by default.
grep(value = FALSE) returns a vector of the indices of the elements of x that yielded a match (or not, for invert = TRUE). This will be an integer vector unless the input is a long vector, when it will be a double vector.
grep(value = TRUE) returns a character vector containing the selected elements of x (after coercion, preserving names but no other attributes).
grepl returns a logical vector (match or not for each element of x).
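The three result forms can be compared side by side on a named vector. This is a sketch; the names and values are arbitrary.

```r
x <- c(first = "apple", second = "kiwi", third = "grape")
idx     <- grep("ap", x)                               # indices of matches
vals    <- grep("ap", x, value = TRUE)                 # matching elements, names kept
others  <- grep("ap", x, value = TRUE, invert = TRUE)  # the non-matching elements
matched <- grepl("ap", x)                              # one logical per element
```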
sub and gsub return a character vector of the same length and with the same attributes as x (after possible coercion to character). Elements of character vectors x which are not substituted will be returned unchanged (including any declared encoding if useBytes = FALSE). If useBytes = FALSE a non-ASCII substituted result will often be in UTF-8 with a marked encoding (e.g., if there is a UTF-8 input, and in a multibyte locale unless fixed = TRUE). Such strings can be re-encoded by enc2native. If any of the inputs is marked as "bytes", elements of character vectors x which are substituted will be returned marked as "bytes", but the encoding flag on elements not substituted is unspecified (it may be the original or "bytes"). If none of the inputs is marked as "bytes", but useBytes = TRUE is given explicitly, the encoding flag is unspecified even on the substituted elements (it may be "bytes" or "unknown", possibly invalid in the current encoding). Mixed use of "bytes" and other marked encodings is discouraged, but if still desired one may use iconv to re-encode the result e.g. to UTF-8 with suitably substituted invalid bytes.
regexpr returns an integer vector of the same length as text giving the starting position of the first match or -1 if there is none, with attribute "match.length", an integer vector giving the length of the matched text (or -1 for no match). The match positions and lengths are in characters unless useBytes = TRUE is used, when they are in bytes (as they are for ASCII-only matching: in either case an attribute useBytes with value TRUE is set on the result). If named capture is used there are further attributes "capture.start", "capture.length" and "capture.names".
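A minimal sketch of reading a regexpr result, using one element that matches and one that does not (inputs made up):

```r
x <- c("abbbc", "xyz")
m <- regexpr("b+", x)
starts  <- as.integer(m)                         # 2 -1
lengths <- as.integer(attr(m, "match.length"))   # 3 -1
found   <- regmatches(x, m)                      # "bbb": non-matches are dropped
```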
gregexpr returns a list of the same length as text each element of which is of the same form as the return value for regexpr, except that the starting positions of every (disjoint) match are given.
regexec returns a list of the same length as text each element of which is either -1 if there is no match, or a sequence of integers with the starting positions of the match and all substrings corresponding to parenthesized subexpressions of pattern, with attribute "match.length" a vector giving the lengths of the matches (or -1 for no match). The interpretation of positions and length and the attributes follows regexpr.
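The structure of a regexec result can be sketched with a pattern that has two parenthesized subexpressions (the inputs are made up):

```r
x <- c("user@example", "no match here")
m <- regexec("([[:alpha:]]+)@([[:alpha:]]+)", x)
pos1  <- as.integer(m[[1]])      # 1 1 6: whole match, then each group
pos2  <- as.integer(m[[2]])      # -1: no match
parts <- regmatches(x, m)[[1]]   # "user@example" "user" "example"
```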
gregexec returns the same as regexec, except that to accommodate multiple matches per element of text, the integer sequences for each match are made into columns of a matrix, with one matrix per element of text with matches.
Where matching failed because of resource limits (especially for perl = TRUE) this is regarded as a non-match, usually with a warning.
The POSIX 1003.2 mode of gsub and gregexpr does not work correctly with repeated word-boundaries (e.g., pattern = "\b"). Use perl = TRUE for such matches (but that may not work as expected with non-ASCII inputs, as the meaning of ‘word’ is system-dependent).
If you are doing a lot of regular expression matching, including on very long strings, you will want to consider the options used. Generally perl = TRUE will be faster than the default regular expression engine, and fixed = TRUE faster still (especially when each pattern is matched only a few times).
If you are working with texts with non-ASCII characters, which can be easily turned into ASCII (e.g. by substituting fancy quotes), doing so is likely to improve performance.
If you are working in a single-byte locale (though not common since R 4.2) and have marked UTF-8 strings that are representable in that locale, convert them first as just one UTF-8 string will force all the matching to be done in Unicode, which attracts a performance penalty for the default POSIX 1003.2 mode.
While useBytes = TRUE will improve performance further, because the strings will not be checked before matching and the actual matching will be faster, it can produce unexpected results so is best avoided. With fixed = TRUE and useBytes = FALSE, optimizations are in place that take advantage of byte-based matching working for such patterns in UTF-8. With useBytes = TRUE, character ranges, wildcards, and other regular expression patterns may produce unexpected results.
PCRE-based matching by default used to put additional effort into ‘studying’ the compiled pattern when x/text has length 10 or more. That study may use the PCRE JIT compiler on platforms where it is available (see pcre_config). As from PCRE2 (PCRE version >= 10.00 as reported by extSoftVersion), there is no study phase, but the patterns are optimized automatically when possible, and PCRE JIT is used when enabled. The details are controlled by options PCRE_study and PCRE_use_JIT. (Some timing comparisons can be seen by running file ‘tests/PCRE.R’ in the R sources (and perhaps installed).) People working with PCRE and very long strings can adjust the maximum size of the JIT stack by setting environment variable R_PCRE_JIT_STACK_MAXSIZE before JIT is used to a value between 1 and 1000 in MB: the default is 64. When JIT is not used with PCRE version < 10.30 (that is with PCRE1 and old versions of PCRE2), it might also be wise to set the option PCRE_limit_recursion.
Aspects will be platform-dependent as well as locale-dependent: for example the implementation of character classes (except [:digit:] and [:xdigit:]). One can expect results to be consistent for ASCII inputs and when working in UTF-8 mode (when most platforms will use Unicode character tables, although those are updated frequently and subject to some degree of interpretation – is a circled capital letter alphabetic or a symbol?). However, results in 8-bit encodings can differ considerably between platforms, modes and from the UTF-8 versions.
The C code for POSIX-style regular expression matching has changed over the years. As from R 2.10.0 (Oct 2009) the TRE library of Ville Laurikari (https://github.com/laurikari/tre) is used. The POSIX standard does give some room for interpretation, especially in the handling of invalid regular expressions and the collation of character ranges, so the results will have changed slightly over the years.
For Perl-style matching PCRE2 or PCRE (https://www.pcre.org) is used: again the results may depend (slightly) on the version of PCRE in use.
Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988) The New S Language. Wadsworth & Brooks/Cole (grep)
regular expression (aka regexp) for the details of the pattern specification.
regmatches for extracting matched substrings based on the results of regexpr, gregexpr and regexec.
glob2rx to turn wildcard matches into regular expressions.
agrep for approximate matching.
charmatch, pmatch for partial matching, match for matching to whole strings, startsWith for matching of initial parts of strings.
tolower, toupper and chartr for character translations.
apropos uses regexps and has more examples.
grepRaw for matching raw vectors.
Options PCRE_limit_recursion, PCRE_study and PCRE_use_JIT.
extSoftVersion for the versions of regex and PCRE libraries in use, pcre_config for more details for PCRE.
grep("[a-z]", letters)

txt <- c("arm","foot","lefroo", "bafoobar")
if(length(i <- grep("foo", txt)))
   cat("'foo' appears at least once in\n\t", txt, "\n")
i # 2 and 4
txt[i]

## Double all 'a' or 'b's;  "\" must be escaped, i.e., 'doubled'
gsub("([ab])", "\\1_\\1_", "abc and ABC")

txt <- c("The", "licenses", "for", "most", "software", "are",
  "designed", "to", "take", "away", "your", "freedom",
  "to", "share", "and", "change", "it.",
  "", "By", "contrast,", "the", "GNU", "General", "Public", "License", "is",
  "intended", "to", "guarantee", "your", "freedom", "to", "share", "and",
  "change", "free", "software", "--", "to", "make", "sure", "the",
  "software", "is", "free", "for", "all", "its", "users")
( i <- grep("[gu]", txt) ) # indices
stopifnot( txt[i] == grep("[gu]", txt, value = TRUE) )

## Note that for some implementations character ranges are
## locale-dependent (but not currently).  Then [b-e] in locales such as
## en_US may include B as the collation order is aAbBcCdDe ...
(ot <- sub("[b-e]",".", txt))
txt[ot != gsub("[b-e]",".", txt)]#- gsub does "global" substitution
## In caseless matching, ranges include both cases:
a <- grep("[b-e]", txt, value = TRUE)
b <- grep("[b-e]", txt, ignore.case = TRUE, value = TRUE)
setdiff(b, a)

txt[gsub("g","#", txt) !=
    gsub("g","#", txt, ignore.case = TRUE)] # the "G" words

regexpr("en", txt)

gregexpr("e", txt)

## Using grepl() for filtering
## Find functions with argument names matching "warn":
findArgs <- function(env, pattern) {
  nms <- ls(envir = as.environment(env))
  nms <- nms[is.na(match(nms, c("F","T")))] # <-- work around "checking hack"
  aa <- sapply(nms, function(.) { o <- get(.)
               if(is.function(o)) names(formals(o)) })
  iw <- sapply(aa, function(a) any(grepl(pattern, a, ignore.case=TRUE)))
  aa[iw]
}
findArgs("package:base", "warn")

## trim trailing white space
str <- "Now is the time "
sub(" +$", "", str)  ## spaces only
## what is considered 'white space' depends on the locale.
sub("[[:space:]]+$", "", str) ## white space, POSIX-style
## what PCRE considered white space changed in version 8.34: see ?regex
sub("\\s+$", "", str, perl = TRUE) ## PCRE-style white space

## capitalizing
txt <- "a test of capitalizing"
gsub("(\\w)(\\w*)", "\\U\\1\\L\\2", txt, perl=TRUE)
gsub("\\b(\\w)",    "\\U\\1",       txt, perl=TRUE)

txt2 <- "useRs may fly into JFK or laGuardia"
gsub("(\\w)(\\w*)(\\w)", "\\U\\1\\E\\2\\U\\3", txt2, perl=TRUE)
 sub("(\\w)(\\w*)(\\w)", "\\U\\1\\E\\2\\U\\3", txt2, perl=TRUE)

## named capture
notables <- c(" Ben Franklin and Jefferson Davis",
              "\tMillard Fillmore")
# name groups 'first' and 'last'
name.rex <- "(?<first>[[:upper:]][[:lower:]]+) (?<last>[[:upper:]][[:lower:]]+)"
(parsed <- regexpr(name.rex, notables, perl = TRUE))
gregexpr(name.rex, notables, perl = TRUE)[[2]]
parse.one <- function(res, result) {
  m <- do.call(rbind, lapply(seq_along(res), function(i) {
    if(result[i] == -1) return("")
    st <- attr(result, "capture.start")[i, ]
    substring(res[i], st, st + attr(result, "capture.length")[i, ] - 1)
  }))
  colnames(m) <- attr(result, "capture.names")
  m
}
parse.one(notables, parsed)

## Decompose a URL into its components.
## Example by LT (http://www.cs.uiowa.edu/~luke/R/regexp.html).
x <- "http://stat.umn.edu:80/xyz"
m <- regexec("^(([^:]+)://)?([^:/]+)(:([0-9]+))?(/.*)", x)
m
regmatches(x, m)
## Element 3 is the protocol, 4 is the host, 6 is the port, and 7
## is the path.  We can use this to make a function for extracting the
## parts of a URL:
URL_parts <- function(x) {
    m <- regexec("^(([^:]+)://)?([^:/]+)(:([0-9]+))?(/.*)", x)
    parts <- do.call(rbind,
                     lapply(regmatches(x, m), `[`, c(3L, 4L, 6L, 7L)))
    colnames(parts) <- c("protocol","host","port","path")
    parts
}
URL_parts(x)

## gregexec() may match multiple times within a single string.
pattern <- "([[:alpha:]]+)([[:digit:]]+)"
s <- "Test: A1 BC23 DEF456"
m <- gregexec(pattern, s)
m
regmatches(s, m)

## Before gregexec() was implemented, one could emulate it by running
## regexec() on the regmatches obtained via gregexpr().  E.g.:
lapply(regmatches(s, gregexpr(pattern, s)),
       function(e) regmatches(e, regexec(pattern, e)))
grepRaw searches for substring pattern matches within a raw vector x.
grepRaw(pattern, x, offset = 1L, ignore.case = FALSE, value = FALSE, fixed = FALSE, all = FALSE, invert = FALSE)
pattern | raw vector containing a regular expression (or fixed pattern for |
x | a raw vector where matches are sought, or an object which can be coerced by |
ignore.case | if |
offset | an integer specifying the offset from which the search should start. Must be positive. The beginning of line is defined to be at that offset so |
value | logical. Determines the return value: see ‘Value’. |
fixed | logical. If |
all | logical. If |
invert | logical. If |
Unlike grep, grepRaw seeks matching patterns within the raw vector x. This has implications especially in the all = TRUE case, e.g., patterns matching empty strings are inherently infinite and thus may lead to unexpected results.
The argument invert is interpreted as asking to return the complement of the match, which is only meaningful for value = TRUE. Argument offset determines the start of the search, not of the complement. Note that invert = TRUE with all = TRUE will split x into pieces delimited by the pattern including leading and trailing empty strings (consequently the use of regular expressions with "^" or "$" in that case may lead to less intuitive results).
Some combinations of arguments such as fixed = TRUE with value = TRUE are supported but are less meaningful.
grepRaw(value = FALSE) returns an integer vector of the offsets at which matches have occurred. If all = FALSE then it will be either of length zero (no match) or length one (first matching position).
grepRaw(value = TRUE, all = FALSE) returns a raw vector which is either empty (no match) or the matched part of x.
grepRaw(value = TRUE, all = TRUE) returns a (potentially empty) list of raw vectors corresponding to the matched parts.
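The return shapes described above can be compared side by side. This is a minimal sketch reusing the string from the examples on this page; the variable names are illustrative only, and the result of the invert = TRUE call is deliberately not annotated because its exact pieces depend on how adjacent matches are split.

```r
x <- "adadfadfdfadadf"

first_pos <- grepRaw("adf", x)                           # integer offset of the first match
all_pos   <- grepRaw("adf", x, all = TRUE, fixed = TRUE) # every offset
first_val <- grepRaw("adf", x, value = TRUE)             # raw vector: the matched bytes
all_val   <- grepRaw("adf", x, value = TRUE, all = TRUE) # list of raw vectors

# Convert the raw matches back to character for display
vapply(all_val, rawToChar, character(1))

# Complement of the matches: the pieces of x between occurrences of the pattern
pieces <- grepRaw("adf", x, value = TRUE, all = TRUE, invert = TRUE)
pieces
```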
The TRE library of Ville Laurikari (https://github.com/laurikari/tre/) is used except for fixed = TRUE.
regular expression (aka regexp) for the details of the pattern specification.
grep for matching character vectors.
grepRaw("no match", "textText") # integer(0): no match
grepRaw("adf", "adadfadfdfadadf") # 3 - the first match
grepRaw("adf", "adadfadfdfadadf", all=TRUE, fixed=TRUE)
## [1] 3 6 13 -- three matches
Group generic methods can be defined for the following pre-specified groups of functions, Math, Ops, matrixOps, Summary and Complex. (There are no objects of these names in base R, but there are in the methods package, not yet for matrixOps.)
A method defined for an individual member of the group takesprecedence over a method defined for the group as a whole.
## S3 methods for group generics have prototypes:
Math(x, ...)
Ops(e1, e2)
Complex(z)
Summary(..., na.rm = FALSE)
matrixOps(x, y)
x, y, z, e1, e2 | objects. |
... | further arguments passed to methods. |
na.rm | logical: should missing values be removed? |
There are five groups for which S3 methods can be written, namely the "Math", "Ops", "Summary", "matrixOps", and "Complex" groups. These are not R objects in base R, but methods can be supplied for them and base R contains factor, data.frame and difftime methods for the first three groups. (There is also an ordered method for Ops, POSIXt and Date methods for Math and Ops, package_version methods for Ops and Summary, as well as a ts method for Ops in package stats.)
Group "Math":
abs, sign, sqrt, floor, ceiling, trunc, round, signif
exp, log, expm1, log1p, cos, sin, tan, cospi, sinpi, tanpi, acos, asin, atan
cosh, sinh, tanh, acosh, asinh, atanh
lgamma, gamma, digamma, trigamma
cumsum, cumprod, cummax, cummin
Members of this group dispatch on x. Most members accept only one argument, but members log, round and signif accept one or two arguments, and trunc accepts one or more.
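A single Math method can cover every member of the group. The sketch below assumes a hypothetical S3 class "len" (not part of base R) and simply applies the generic named in .Generic to the unclassed data before restoring the class; extra arguments such as the digits of round are passed through via the dots.

```r
# Hypothetical class "len": one method serves sqrt, round, cumsum, ...
Math.len <- function(x, ...) {
  # .Generic holds the name of the member that was called, e.g. "round"
  structure(get(.Generic)(unclass(x), ...), class = "len")
}

x <- structure(c(1.234, 5.678, 9), class = "len")

s <- sqrt(x)       # dispatches to Math.len with .Generic == "sqrt"
r <- round(x, 1)   # second argument reaches round() through '...'
k <- cumsum(x)
```

Defining sqrt.len as well would take precedence over Math.len for sqrt only, as noted at the top of this page.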
Group "Ops":
"+", "-", "*", "/", "^", "%%", "%/%"
"&", "|", "!"
"==", "!=", "<", "<=", ">=", ">"
This group contains both binary and unary operators (+, - and !): when a unary operator is encountered the Ops method is called with one argument and e2 is missing.
The classes of both arguments are considered in dispatching any member of this group. For each argument its vector of classes is examined to see if there is a matching specific (preferred) or Ops method. If a method is found for just one argument or the same method is found for both, it is used. If different methods are found, then the generic chooseOpsMethod() is called to pick the appropriate method. (See ?chooseOpsMethod for details.) If chooseOpsMethod() does not resolve the method, then there is a warning about ‘incompatible methods’: in that case or if no method is found for either argument the internal method is used.
Note that the data.frame methods for the comparison ("Compare": ==, <, ...) and logic ("Logic": &, | and !) operators return a logical matrix instead of a data frame, for convenience and back compatibility.
If the members of this group are called as functions, any argument names are removed to ensure that positional matching is always used.
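The unary and binary cases described for the Ops group can be handled in one method. This is a sketch for a hypothetical class "temperature" (an assumption for illustration): arithmetic results keep the class, while comparisons return plain logical vectors.

```r
# Hypothetical class "temperature"
Ops.temperature <- function(e1, e2) {
  if (missing(e2)) {
    # unary call such as -x: only e1 is supplied
    v <- get(.Generic)(unclass(e1))
  } else {
    v <- get(.Generic)(unclass(e1), unclass(e2))
  }
  # keep the class for arithmetic, drop it for comparison and logic
  if (.Generic %in% c("+", "-", "*", "/")) structure(v, class = "temperature") else v
}

x <- structure(c(10, 20), class = "temperature")

shifted <- x + 5    # method found for the left argument only, so it is used
negated <- -x       # unary minus: e2 is missing inside the method
warm    <- x > 15   # plain logical vector
```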
Group "matrixOps":
"%*%"
This group currently contains the matrix multiply %*% binary operator only, where at least crossprod() and tcrossprod() are meant to follow. Members of the group have the same dispatch semantics (using both arguments) as the Ops group.
Group "Summary":
all, any
sum, prod
min, max
range
Members of this group dispatch on the first argument supplied.
Note that the data.frame methods for the "Summary" and "Math" groups require “numeric-alike” columns x, i.e., fulfilling
is.numeric(x) || is.logical(x) || is.complex(x)
Group "Complex":
Arg, Conj, Im, Mod, Re
Members of this group dispatch on z.
Note that a method will be used for one of these groups or one of its members only if it corresponds to a "class" attribute, as the internal code dispatches on oldClass and not on class. This is for efficiency: having to dispatch on, say, Ops.integer would be too slow.
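The distinction between class and oldClass can be checked directly. In this small sketch a deliberately failing Ops.integer method is defined (purely for illustration) to show that plain integers, which have no "class" attribute, never reach it.

```r
# Illustration only: this method would stop() if it were ever dispatched
Ops.integer <- function(e1, e2) stop("Ops.integer was dispatched")

implicit <- class(1L)      # "integer": the implicit class
explicit <- oldClass(1L)   # NULL: there is no "class" attribute

res <- 1L + 2L             # internal method is used, no dispatch occurs

rm(Ops.integer)            # tidy up the illustrative method
```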
The number of arguments supplied for primitive members of the "Math" group generic methods is not checked prior to dispatch.
There is no lazy evaluation of arguments for group-generic functions.
These functions are all primitive and internal generic.
The details of method dispatch and variables such as .Generic are discussed in the help for UseMethod. There are a few small differences:
For the operators of group Ops, the object .Method is a length-two character vector with elements the methods selected for the left and right arguments respectively. (If no method was selected, the corresponding element is "".)
Object .Group records the group used for dispatch (if a specific method is used this is "").
Package methods does contain objects with these names, which it has re-used in confusingly similar (but different) ways. See the help for that package.
Appendix A, Classes and Methods of Chambers, J. M. and Hastie, T. J. eds (1992) Statistical Models in S. Wadsworth & Brooks/Cole.
methods for methods of non-internal generic functions.
S4groupGeneric for group generics for S4 methods.
require(utils)
d.fr <- data.frame(x = 1:9, y = stats::rnorm(9))
class(1 + d.fr) == "data.frame" ##-- add to d.f. ...
methods("Math")
methods("Ops")
methods("Summary")
methods("Complex") # none in base R
grouping returns a permutation which rearranges its first argument such that identical values are adjacent to each other. Also returned as attributes are the group-wise partitioning and the maximum group size.
grouping(...)
... | a sequence of numeric, character or logical vectors, all of the same length, or a classed R object. |
The function partially sorts the elements so that identical values are adjacent. NA values come last. This is guaranteed to be stable, so ties are preserved, and if the data are already grouped/sorted, the grouping is unchanged. This is useful for aggregation and is particularly fast for character vectors.
Under the covers, the "radix" method of order is used, and the same caveats apply, including restrictions on character encodings and lack of support for long vectors (those with 2^31 or more elements). Real-valued numbers are slightly rounded to account for numerical imprecision.
Like order, for a classed R object the grouping is based on the result of xtfrm.
An object of class "grouping", the representation of which should be considered experimental and subject to change. It is an integer vector with two attributes:
ends | subscripts in the result corresponding to the last member of each group |
maxgrpn | the maximum group size |
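The two attributes are enough to recover the index set of each group. The following is a sketch only, since the representation is described above as experimental; the helper names (idx, starts, groups, sizes) are illustrative.

```r
x <- c(3, 1, 3, 2, 1, 3)
g <- grouping(x)

idx    <- as.integer(g)                       # the permutation, attributes dropped
ends   <- attr(g, "ends")                     # last position of each group within idx
starts <- c(1L, ends[-length(ends)] + 1L)     # first position of each group

# Original indices of the members of each group
groups <- Map(function(s, e) idx[s:e], starts, ends)
sizes  <- ends - starts + 1L

x[idx]   # identical values are now adjacent
```

Working from ends avoids a second pass over the data when the groups are needed for aggregation.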
(ii <- grouping(x <- c(1, 1, 3:1, 1:4, 3), y <- c(9, 9:1), z <- c(2, 1:9)))
## 6 5 2 1 7 4 10 8 3 9
rbind(x, y, z)[, ii]
gzcon provides a modified connection that wraps an existing connection, and decompresses reads or compresses writes through that connection. Standard gzip headers are assumed.
gzcon(con, level = 6, allowNonCompressed = TRUE, text = FALSE)
con | a connection. |
level | integer between 0 and 9, the compression level when writing. |
allowNonCompressed | logical. When reading, should non-compressed input be allowed? |
text | logical. Should the connection be text-oriented? This is distinct from the mode of the connection (must always be binary). If |
If con is open then the modified connection is opened. Closing the wrapper connection will also close the underlying connection.
Reading from a connection which does not supply a gzip magic header is equivalent to reading from the original connection if allowNonCompressed is true, otherwise an error.
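The behaviour on compressed and non-compressed input can be tried with two temporary files. This is a minimal sketch: the file names come from tempfile() and the variable names are illustrative, and only the default allowNonCompressed = TRUE case is exercised.

```r
lines_out <- c("first line", "second line")

## Write through gzcon: the wrapped file receives gzip-compressed bytes
gzpath <- tempfile(fileext = ".gz")
zz <- gzcon(file(gzpath, "wb"))
writeLines(lines_out, zz)
close(zz)                       # also closes the underlying file connection

## Read it back through gzcon: reads are decompressed
zz <- gzcon(file(gzpath, "rb"))
lines_in <- readLines(zz)
close(zz)

## A plain file has no gzip magic header; with the default
## allowNonCompressed = TRUE it is read as from the original connection
plainpath <- tempfile(fileext = ".txt")
writeLines(lines_out, plainpath)
zz <- gzcon(file(plainpath, "rb"))
plain_in <- readLines(zz)
close(zz)

unlink(c(gzpath, plainpath))
```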
Compressed output will contain embedded NUL bytes, and so con is not permitted to be a textConnection opened with open = "w". Use a writable rawConnection to compress data into a variable.
The original connection becomes unusable: any object pointing to it will now refer to the modified connection. For this reason, the new connection needs to be closed explicitly.
An object inheriting from class "connection". This is the same connection number as supplied, but with a modified internal structure. It has binary mode.
## Uncompress a data file from a URL
z <- gzcon(url("https://www.stats.ox.ac.uk/pub/datasets/csb/ch12.dat.gz"))
# read.table can only read from a text-mode connection.
raw <- textConnection(readLines(z))
close(z)
dat <- read.table(raw)
close(raw)
dat[1:4, ]
## gzfile and gzcon can inter-work.
## Of course here one would use gzfile, but file() can be replaced by
## any other connection generator.
zzfil <- tempfile(fileext = ".gz")
zz <- gzfile(zzfil, "w")
cat("TITLE extra line", "2 3 5 7", "", "11 13 17", file = zz, sep = "\n")
close(zz)
readLines(zz <- gzcon(file(zzfil, "rb")))
close(zz)
unlink(zzfil)
zzfil2 <- tempfile(fileext = ".gz")
zz <- gzcon(file(zzfil2, "wb"))
cat("TITLE extra line", "2 3 5 7", "", "11 13 17", file = zz, sep = "\n")
close(zz)
readLines(zz <- gzfile(zzfil2))
close(zz)
unlink(zzfil2)
Integers which are displayed in hexadecimal (short ‘hex’) format, with as many digits as are needed to display the largest, using leading zeroes as necessary.
Arithmetic works as for integers, and non-integer valued mathematical functions typically work by truncating the result to integer.
as.hexmode(x)
## S3 method for class 'hexmode'
as.character(x, keepStr = FALSE, ...)
## S3 method for class 'hexmode'
format(x, width = NULL, upper.case = FALSE, ...)
## S3 method for class 'hexmode'
print(x, ...)
x | an object, for the methods inheriting from class |
keepStr | a |
width |
|
upper.case | a logical indicating whether to use upper-case letters or lower-case letters (default). |
... | further arguments passed to or from other methods. |
Class "hexmode" consists of integer vectors with that class attribute, used primarily to ensure that they are printed in hex. Subsetting ([) works too, as do arithmetic or other mathematical operations, albeit truncated to integer.
as.character(x) drops all attributes (unless keepStr = TRUE, in which case it keeps dim, dimnames and names for back compatibility) and converts each entry individually, hence with no leading zeroes, whereas in format(), when width = NULL (the default), the output is padded with leading zeroes to the smallest width needed for all the non-missing elements.
as.hexmode can convert integers (of type "integer" or "double") and character vectors whose elements contain only 0-9, a-f, A-F (or are NA) to class "hexmode".
There is a ! method and methods for | and &: these recycle their arguments to the length of the longer and then apply the operators bitwise to each element.
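The difference between format() and as.character(), and the bitwise operators, can be seen on a small vector. This is a sketch with illustrative variable names; the as.character() behaviour shown in the comment is the one described above for this version of R and may differ in older releases.

```r
x <- as.hexmode(c(1L, 10L, 255L))

padded <- format(x)                                  # common width: "01" "0a" "ff"
wide   <- format(x, width = 4, upper.case = TRUE)    # "0001" "000A" "00FF"
as.character(x)                                      # entries converted individually, no padding

## Bitwise operators work element by element
a <- as.hexmode("ff")
b <- as.hexmode("0f")
both   <- a & b     # bitwise AND
either <- a | b     # bitwise OR

## Going back from a hex string to an integer
back <- strtoi("ff", base = 16L)
```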
octmode, sprintf for other options in converting integers to hex, strtoi to convert hex strings to integers.
i <- as.hexmode("7fffffff")
i; class(i)
identical(as.integer(i), .Machine$integer.max)
hm <- as.hexmode(c(NA, 1)); hm
as.integer(hm)
Xm <- as.hexmode(1:16)
Xm # print()s via format()
stopifnot(nchar(format(Xm)) == 2)
Xm[-16] # *no* leading zeroes!
stopifnot(format(Xm[-16]) == c(1:9, letters[1:6]))
## Integer arithmetic (remaining "hexmode"):
16*Xm
Xm^2-Xm
(fac <- factorial(Xm[1:12])) # !1, !2, !3, !4 .. in hexadecimals
as.integer(fac) # indeed the same as factorial(1:12)
These functions give the obvious hyperbolic functions. They respectively compute the hyperbolic cosine, sine, tangent, and their inverses, arc-cosine, arc-sine, arc-tangent (or ‘area cosine’, etc).
cosh(x)
sinh(x)
tanh(x)
acosh(x)
asinh(x)
atanh(x)
x | a numeric or complex vector |
These are internal generic primitive functions: methods can be defined for them individually or via the Math group generic.
Branch cuts are consistent with the inverse trigonometric functions asin et seq, and agree with those defined in Abramowitz & Stegun, figure 4.7, page 86. The behaviour actually on the cuts follows the C99 standard which requires continuity coming round the endpoint in a counter-clockwise direction.
All are S4 generic functions: methods can be defined for them individually or via the Math group generic.
Abramowitz, M. and Stegun, I. A. (1972) Handbook of Mathematical Functions. New York: Dover. Chapter 4. Elementary Transcendental Functions: Logarithmic, Exponential, Circular and Hyperbolic Functions
The trigonometric functions, cos, sin, tan, and their inverses acos, asin, atan.
The logistic distribution function plogis is a shifted version of tanh() for numeric x.
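The relation to plogis can be checked numerically. The sketch below uses the standard identity plogis(x) = (1 + tanh(x/2))/2 (stated here as a known mathematical fact, not quoted from this page) together with a few round trips through the inverse functions; all.equal is used because the results agree only up to floating-point error.

```r
x <- c(-2, -0.5, 0, 0.5, 2)

## plogis is tanh shifted and rescaled
ok_logis <- all.equal(stats::plogis(x), (1 + tanh(x / 2)) / 2)

## cosh^2 - sinh^2 = 1
ok_ident <- all.equal(cosh(x)^2 - sinh(x)^2, rep(1, length(x)))

## Inverses: acosh returns the non-negative branch
ok_asinh <- all.equal(asinh(sinh(x)), x)
ok_atanh <- all.equal(atanh(tanh(x)), x)
ok_acosh <- all.equal(acosh(cosh(x)), abs(x))
```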
This uses system facilities to convert a character vector between encodings: the ‘i’ stands for ‘internationalization’.
iconv(x, from = "", to = "", sub = NA, mark = TRUE, toRaw = FALSE)
iconvlist()
x | a character vector, or an object to be converted to a character vector by |
from | a character string describing the current encoding. |
to | a character string describing the target encoding. |
sub | character string. If not |
mark | logical, for expert use. Should encodings be marked? |
toRaw | logical. Should a list of raw vectors be returned rather than a character vector? |
The names of encodings and which ones are available are platform-dependent. All R platforms support "" (for the encoding of the current locale), "latin1" and "UTF-8". Generally case is ignored when specifying an encoding.
On most platforms iconvlist provides an alphabetical list of the supported encodings. On others, the information is on the man page for iconv(5) or elsewhere in the man pages (but beware that the system command iconv may not support the same set of encodings as the C functions R calls). Unfortunately, the names are rarely supported across all platforms.
Elements of x which cannot be converted (perhaps because they are invalid or because they cannot be represented in the target encoding) will be returned as NA (or NULL for toRaw = TRUE) unless sub is specified.
Most versions of iconv will allow transliteration by appending ‘//TRANSLIT’ to the to encoding: see the examples.
Encoding "ASCII" is accepted, and on most systems "C" and "POSIX" are synonyms for ASCII. Where "ASCII//TRANSLIT" is unsupported by the OS, "ASCII" is used with sub = "c99" if from UTF-8, else sub = "?". (However, musl's version of "ASCII" substitutes *.)
Elements of x with a declared encoding (UTF-8 or latin1, see Encoding) are converted from that encoding if from = "", otherwise they are taken as being in the encoding specified by from.
Note that implementations of iconv typically do not do much validity checking and will often mis-convert inputs which are invalid in encoding from.
If sub = "Unicode" or sub = "c99" is used for a non-UTF-8 input it is the same as sub = "byte".
If toRaw = FALSE (the default), the value is a character vector of the same length and the same attributes as x (after conversion to a character vector). If conversion fails for an element that element of the result is set to NA_character_. (NB: whether conversion fails is implementation-specific.) NA_character_ inputs give NA_character_ outputs.
If mark = TRUE (the default) the elements of the result have a declared encoding if to is "latin1" or "UTF-8", or if to = "" and the current locale's encoding is detected as Latin-1 (or its superset CP1252 on Windows) or UTF-8.
If toRaw = TRUE, the value is a list of the same length and the same attributes as x whose elements are either NULL (if conversion fails or the input was NA_character_) or a raw vector.
For iconvlist(), a character vector (typically of a few hundred elements) of known encoding names.
There are three main implementations of iconv in use. Linux's most common C runtime, ‘glibc’, contains one. Several platforms supply versions or emulations of GNU ‘libiconv’, including previous versions of macOS and FreeBSD, in some cases with additional encodings. On Windows we use a version of Yukihiro Nakadaira's ‘win_iconv’, which is based on Windows' codepages. (We have added many encoding names for compatibility with other systems.) All three have iconvlist, ignore case in encoding names and support ‘//TRANSLIT’ (but with different results, and for ‘win_iconv’ currently a ‘best fit’ strategy is used except for to = "ASCII").
The macOS 14 implementation is attributed to the ‘Citrus Project’: the Apple headers declare it as ‘compatible’ with GNU ‘libiconv’ 1.11 from 2006. However, it differs in significant ways including using transliteration for conversions which cannot be represented exactly in the target encoding. (It seems this implementation is also used in recent versions of FreeBSD. Earlier versions of macOS used GNU ‘libiconv’ 1.11 and some CRAN builds still do.) For a failing conversion macOS 14 generally translated character(s) to ? but 14.1 gives an error (so an NA result in R).
Most commercial Unixes contain an implementation of iconv but none we have encountered have supported the encoding names we need: the ‘R Installation and Administration’ manual recommended installing GNU ‘libiconv’ on Solaris and AIX.
Some Linux distributions use ‘musl’ as their C runtime. This is less comprehensive than ‘glibc’: it does not support ‘//TRANSLIT’ but does inexact conversions (currently using ‘*’).
There are other implementations, e.g. NetBSD has used one from the Citrus project (which does not support ‘//TRANSLIT’) and there is an older FreeBSD port.
Note that you cannot rely on invalid inputs being detected, especially for to = "ASCII" where some implementations allow 8-bit characters and pass them through unchanged or with transliteration or substitution.
Some of the implementations have interesting extra encodings: for example GNU ‘libiconv’ and macOS 14 allow to = "C99" to use ‘\uxxxx’ escapes (or if needed ‘\Uxxxxxxxx’) for non-ASCII characters.
Byte Order Marks, most commonly known as ‘BOMs’.
Encodings using character units which are more than one byte in size can be written on a file in either big-endian or little-endian order: this applies most commonly to UCS-2, UTF-16 and UTF-32/UCS-4 encodings. Some systems will write the Unicode character U+FEFF at the beginning of a file in these encodings and perhaps also in UTF-8. In that usage the character is known as a BOM, and should be handled during input (see the ‘Encodings’ section under connection: re-encoded connections have some special handling of BOMs). The rest of this section applies when this has not been done so x starts with a BOM.
Implementations will generally interpret a BOM for from given as one of "UCS-2", "UTF-16" and "UTF-32". Implementations differ in how they treat BOMs in x in other from encodings: they may be discarded, returned as character U+FEFF or regarded as invalid.
The most portable name for the ISO 8859-15 encoding, commonly known as ‘Latin 9’, is "iso885915": most platforms support both "latin-9" and "latin9" but GNU ‘libiconv’ does not support the latter. ‘musl’ (as used by Alpine Linux and other lightweight Linux distributions) supports neither, but R remaps there to "iso885915".
Encoding names "utf8", "mac" and "macroman" are not portable. "utf8" is converted to "UTF-8" for from and to by iconv, but not for e.g. fileEncoding arguments. "macintosh" is the official (and most widely supported) name for ‘Mac Roman’ (https://en.wikipedia.org/wiki/Mac_OS_Roman).
Using sub substitutes each non-convertible byte in the input, so when converting from UTF-8 a non-convertible character may be replaced by two or more bytes. Using sub = "c99" or sub = "Unicode" will be clearer.
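The per-byte nature of sub shows up when the input is UTF-8. This sketch converts one string containing a single two-byte character with several sub values; the exact output is platform-dependent (as discussed above, some implementations transliterate or substitute on their own), so the comments describe what one would typically expect with glibc or GNU libiconv rather than a guaranteed result.

```r
x <- "fa\u00e7ade"     # U+00E7 occupies two bytes in UTF-8

## Without sub a failing conversion typically gives NA
iconv(x, "UTF-8", "ASCII")

subs <- c("?", "byte", "Unicode", "c99")
res <- vapply(subs,
              function(s) iconv(x, "UTF-8", "ASCII", sub = s),
              character(1))
res
## With "?" and "byte" each of the two bytes is substituted separately;
## "Unicode" and "c99" describe the character as a whole.
```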
## In principle, as not all systems have iconvlist
try(utils::head(iconvlist(), n = 50))
## Not run:
## convert from Latin-2 to UTF-8: two of the glibc iconv variants.
iconv(x, "ISO_8859-2", "UTF-8")
iconv(x, "LATIN2", "UTF-8")
## End(Not run)
## Both x below are in latin1 and will only display correctly in a
## locale that can represent and display latin1.
x <- "fran\xE7ais"
Encoding(x) <- "latin1"
x
charToRaw(xx <- iconv(x, "latin1", "UTF-8"))
xx
## The results in the comments are those from glibc and GNU libiconv
iconv(x, "latin1", "ASCII") # NA
iconv(x, "latin1", "ASCII", "?") # "fran?ais"
iconv(x, "latin1", "ASCII", "") # "franais"
iconv(x, "latin1", "ASCII", "byte") # "fran<e7>ais"
iconv(xx, "UTF-8", "ASCII", "Unicode") # "fran<U+00E7>ais"
iconv(xx, "UTF-8", "ASCII", "c99") # "fran\\u00e7ais"
## Extracts from old R help files (they are nowadays in UTF-8)
x <- c("Ekstr\xf8m", "J\xf6reskog", "bi\xdfchen Z\xfcrcher")
Encoding(x) <- "latin1"
x
try(iconv(x, "latin1", "ASCII//TRANSLIT")) # platform-dependent
## glibc gives "Ekstroem" "Joreskog" "bisschen Zurcher"
## macOS 14 gives "Ekstrom" "J\"oreskog" "bisschen Z\"urcher"
## musl gives "Ekstr*m" "J*reskog" "bi*chen Z*rcher"
iconv(x, "latin1", "ASCII", sub = "byte")
## and for Windows' 'Unicode'
str(xx <- iconv(x, "latin1", "UTF-16LE", toRaw = TRUE))
iconv(xx, "UTF-16LE", "UTF-8")
emoji <- "\U0001f604"
iconv(emoji,, "latin1", sub = "Unicode") # "<U+1F604>"
iconv(emoji,, "latin1", sub = "c99")
Controls the way collation is done by ICU (an optional part of the R build).
icuSetCollate(...)
icuGetCollate(type = c("actual", "valid"))
... | named arguments, see ‘Details’. |
type | a character string: either the |
Optionally, R can be built to collate character strings by ICU (https://icu.unicode.org/). For such systems, icuSetCollate can be used to tune the way collation is done. On other builds calling this function does nothing, with a warning.
Possible arguments are
locale: A character string such as "da_DK" giving the language and country whose collation rules are to be used. If present, this should be the first argument.
case_first: "upper", "lower" or "default", asking for upper- or lower-case characters to be sorted first. The default is usually lower-case first, but not in all languages (not under the default settings for Danish, for example).
alternate_handling: Controls the handling of ‘variable’ characters (mainly punctuation and symbols). Possible values are "non_ignorable" (primary strength) and "shifted" (quaternary strength).
strength: Which components should be used? Possible values "primary", "secondary", "tertiary" (default), "quaternary" and "identical".
french_collation: In a French locale the way accents affect collation is from right to left, whereas in most other locales it is from left to right. Possible values "on", "off" and "default".
normalization: Should strings be normalized? Possible values are "on" and "off" (default). This affects the collation of composite characters.
case_level: An additional level between secondary and tertiary, used to distinguish large and small Japanese Kana characters. Possible values "on" and "off" (default).
hiragana_quaternary: Possible values "on" (sort Hiragana first at quaternary level) and "off".
Only the first three are likely to be of interest except to those with a detailed understanding of collation and specialized requirements.
Some special values are accepted for locale:
"none": ICU is not used for collation: the OS's collation services are used instead.
"ASCII": ICU is not used for collation: the C function strcmp is used instead, which should sort byte-by-byte in (unsigned) numerical order.
"default": obtains the locale from the OS as is done at the start of the session (except on Windows). If environment variable R_ICU_LOCALE is set to a non-empty value, its value is used rather than consulting the OS, unless environment variable LC_ALL is set to 'C' (or unset but LC_COLLATE is set to 'C').
"", "root": the ‘root’ collation: see https://www.unicode.org/reports/tr35/tr35-collation.html#Root_Collation.
For the specifications of ‘real’ ICU locales, see https://unicode-org.github.io/icu/userguide/locale/. Note that ICU does not report that a locale is not supported, but falls back to its idea of ‘best fit’ (which could be rather different and is reported by icuGetCollate("actual"), often "root"). Most English locales fall back to "root" as although e.g. "en_GB" is a valid locale (at least on some platforms), it contains no special rules for collation. Note that "C" is not a supported ICU locale and hence R_ICU_LOCALE should never be set to "C".
Some examples are case_level = "on", strength = "primary" to ignore accent differences and alternate_handling = "shifted" to ignore space and punctuation characters.
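The two settings just mentioned could be tried as follows. This is a sketch under the assumption that the R build includes ICU (hence the capabilities() guard); the test strings are made up for illustration and no output is shown because the resulting order depends on the ICU version and locale data.

```r
x <- c("role", "r\u00f4le", "Role", "r ole")

if (capabilities("ICU")) {
  ## Ignore accent differences
  icuSetCollate(locale = "root", case_level = "on", strength = "primary")
  res_accent <- sort(x)

  ## Ignore space and punctuation characters
  icuSetCollate(locale = "root", alternate_handling = "shifted")
  res_punct <- sort(x)

  ## Specifying locale resets all customizations
  icuSetCollate(locale = "default")
} else {
  ## Without ICU the calls would do nothing (with a warning), so skip them
  res_accent <- res_punct <- sort(x)
}
```

Passing locale first in every call follows the recommendation in the argument list above.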
Initially ICU will not be used for collation if the OS is set to use the C locale for collation and R_ICU_LOCALE is not set. Once this function is called with a value for locale, ICU will be used until it is called again with locale = "none". ICU will not be used once Sys.setlocale is called with a "C" value for LC_ALL or LC_COLLATE, even if R_ICU_LOCALE is set. ICU will be used again honoring R_ICU_LOCALE once Sys.setlocale is called to set a different collation order. Environment variables LC_ALL (or LC_COLLATE) take precedence over R_ICU_LOCALE if and only if they are set to 'C'. Due to the interaction with other ways of setting the collation order, R_ICU_LOCALE should be used with care and only when needed.
All customizations are reset to the default for the locale if locale is specified: the collation engine is reset if the OS collation locale category is changed by Sys.setlocale.
For icuGetCollate, a character string describing the ICU locale in use (which may be reported as "ICU not in use"). The ‘actual’ locale may be simpler than the requested locale: for example "da" rather than "da_DK": English locales are likely to report "root".
Except on Windows, ICU is used by default wherever it is available. As it works internally in UTF-8, it will be most efficient in UTF-8 locales.
On Windows, R is normally built including ICU, but it will only be used if environment variable R_ICU_LOCALE had been set when R is started or after icuSetCollate is called to select the locale (as ICU and Windows differ in their idea of locale names). Note that icuSetCollate(locale = "default") should work reasonably well, but finds the system default ignoring environment variables such as LC_COLLATE.
capabilities for whether ICU is available; extSoftVersion for its version.
The ICU user guide chapter on collation (https://unicode-org.github.io/icu/userguide/collation/).
## These examples depend on having ICU available, and on the locale.
## As we don't know the current settings, we can only reset to the default.
if(capabilities("ICU")) withAutoprint({
    icuGetCollate()
    icuGetCollate("valid")
    x <- c("Aarhus", "aarhus", "safe", "test", "Zoo")
    sort(x)
    icuSetCollate(case_first = "upper"); sort(x)
    icuSetCollate(case_first = "lower"); sort(x)
    ## Danish collates upper-case-first and with 'aa' as a single letter
    icuSetCollate(locale = "da_DK", case_first = "default"); sort(x)
    ## Estonian collates Z between S and T
    icuSetCollate(locale = "et_EE"); sort(x)
    icuSetCollate(locale = "default"); icuGetCollate("valid")
})
The safe and reliable way to test two objects for being exactly equal. It returns TRUE in this case, FALSE in every other case.
identical(x, y, num.eq = TRUE, single.NA = TRUE, attrib.as.set = TRUE,
          ignore.bytecode = TRUE, ignore.environment = FALSE,
          ignore.srcref = TRUE, extptr.as.ref = FALSE)
x, y | any R objects. |
num.eq | logical indicating if ( |
single.NA | logical indicating if there is conceptually just one numeric |
attrib.as.set | logical indicating if |
ignore.bytecode | logical indicating if byte code should be ignored when comparing closures. |
ignore.environment | logical indicating if their environments should be ignored when comparing closures. |
ignore.srcref | logical indicating if their |
extptr.as.ref | logical indicating whether external pointer objects should be compared as reference objects and considered identical only if they are the same object in memory. By default, external pointers are considered identical if the addresses they contain are identical. |
A call to identical is the way to test exact equality in if and while statements, as well as in logical expressions that use && or ||. In all these applications you need to be assured of getting a single logical value.
Users often use the comparison operators, such as == or !=, in these situations. It looks natural, but it is not what these operators are designed to do in R. They return an object like the arguments. If you expected x and y to be of length 1, but it happened that one of them was not, you will not get a single FALSE. Similarly, if one of the arguments is NA, the result is also NA. In either case, the expression if(x == y).... won't work as expected.
The function all.equal is also sometimes used to test equality this way, but was intended for something different: it allows for small differences in numeric results.
The computations in identical are also reliable and usually fast. There should never be an error. The only known way to kill identical is by having an invalid pointer at the C level, generating a memory fault. It will usually find inequality quickly. Checking equality for two large, complicated objects can take longer if the objects are identical or nearly so, but represent completely independent copies. For most applications, however, the computational cost should be negligible.
If single.NA is true, as by default, identical sees NaN as different from NA_real_, but all NaNs are equal (and all NA of the same type are equal).
Character strings (except those in marked encoding "bytes") are regarded as identical even if they are in different marked encodings but would agree when translated to UTF-8. A character string in marked encoding "bytes" is only regarded as identical to a character string in the same encoding and with the same content.
If attrib.as.set is true, as by default, comparison of attributes views them as a set (and not a vector, so order is not tested).
If ignore.bytecode is true (the default), the compiled bytecode of a function (see cmpfun) will be ignored in the comparison. If it is false, functions will compare equal only if they are copies of the same compiled object (or both are uncompiled). To check whether two different compiles are equal, you should compare the results of disassemble().
You almost never want to use identical on datetimes of class "POSIXlt": not only can different times in the different time zones represent the same time and time zones have multiple names, but several of the components are optional.
Note that the strictest test for equality is
identical(x, y, num.eq = FALSE, single.NA = FALSE, attrib.as.set = FALSE, ignore.bytecode = FALSE, ignore.environment = FALSE, ignore.srcref = FALSE, extptr.as.ref = TRUE)
A single logical value, TRUE or FALSE, never NA and never anything other than a single value.
John Chambers and R Core
Chambers, J. M. (1998) Programming with Data. A Guide to the S Language. Springer.
all.equal for descriptions of how two objects differ; Comparison and Logic for elementwise comparisons.
identical(1, NULL) ## FALSE -- don't try this with ==
identical(1, 1.)   ## TRUE in R (both are stored as doubles)
identical(1, as.integer(1)) ## FALSE, stored as different types

x <- 1.0; y <- 0.99999999999
## how to test for object equality allowing for numeric fuzz :
(E <- all.equal(x, y))
identical(TRUE, E)
isTRUE(E) # alternative test
## If all.equal thinks the objects are different, it returns a
## character string, and the above expression evaluates to FALSE

## even for unusual R objects :
identical(.GlobalEnv, environment())

### ------- Pickyness Flags : -----------------------------

## the infamous example:
identical(0., -0.) # TRUE, i.e. not differentiated
identical(0., -0., num.eq = FALSE)
## similar:
identical(NaN, -NaN) # TRUE
identical(NaN, -NaN, single.NA = FALSE) # differ on bit-level

### For functions ("closure"s): ----------------------------------------------
###     ~~~~~~~~~
f <- function(x) x
f
g <- compiler::cmpfun(f)
g
identical(f, g)                        # TRUE, as bytecode is ignored by default
identical(f, g, ignore.bytecode=FALSE) # FALSE: bytecode differs

## GLM families contain several functions, some of which share an environment:
p1 <- poisson() ; p2 <- poisson()
identical(p1, p2)                          # FALSE
identical(p1, p2, ignore.environment=TRUE) # TRUE

## in interactive use, the 'keep.source' option is typically true:
op <- options(keep.source = TRUE) # and so, these have differing "srcref" :
f1 <- function() {}
f2 <- function() {}
identical(f1,f2)# ignore.srcref= TRUE : TRUE
identical(f1,f2, ignore.srcref=FALSE)# FALSE
options(op) # revert to previous state
A trivial identity function returning its argument.
identity(x)
x | an R object. |
diag creates diagonal matrices, including identity ones.
ifelse returns a value with the same shape as test which is filled with elements selected from either yes or no depending on whether the element of test is TRUE or FALSE.
ifelse(test, yes, no)
test | an object which can be coerced to logical mode. |
yes | return values for true elements of |
no | return values for false elements of |
If yes or no are too short, their elements are recycled. yes will be evaluated if and only if any element of test is true, and analogously for no.
Missing values in test give missing values in the result.
A vector of the same length and attributes (including dimensions and "class") as test and data values from the values of yes or no. The mode of the answer will be coerced from logical to accommodate first any values taken from yes and then any values taken from no.
The mode of the result may depend on the value of test (see the examples), and the class attribute (see oldClass) of the result is taken from test and may be inappropriate for the values selected from yes and no.
Sometimes it is better to use a construction such as (tmp <- yes; tmp[!test] <- no[!test]; tmp), possibly extended to handle missing values in test.
Further note that if(test) yes else no is much more efficient and often much preferable to ifelse(test, yes, no) whenever test is a simple true/false result, i.e., when length(test) == 1.
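A minimal sketch of why the two forms differ for a length-one test; the values are illustrative only. Because the result of ifelse takes its length from test, only the first element of yes is used.

```r
test <- TRUE
yes  <- 1:3

r_if     <- if (test) yes else "none"    # the whole of 'yes'
r_ifelse <- ifelse(test, yes, "none")    # only as long as 'test'

r_if        # 1 2 3
r_ifelse    # 1
```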
The srcref attribute of functions is handled specially: if test is a simple true result and yes evaluates to a function with srcref attribute, ifelse returns yes including its attribute (the same applies to a false test and no argument). This functionality is only for backwards compatibility; the form if(test) yes else no should be used whenever yes and no are functions.
Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988) The New S Language. Wadsworth & Brooks/Cole.
if.
x <- c(6:-4)
sqrt(x)  #- gives warning
sqrt(ifelse(x >= 0, x, NA))  # no warning

## Note: the following also gives the warning !
ifelse(x >= 0, sqrt(x), NA)

## ifelse() strips attributes
## This is important when working with Dates and factors
x <- seq(as.Date("2000-02-29"), as.Date("2004-10-04"), by = "1 month")
## has many "yyyy-mm-29", but a few "yyyy-03-01" in the non-leap years
y <- ifelse(as.POSIXlt(x)$mday == 29, x, NA)
head(y) # not what you expected ... ==> need restore the class attribute:
class(y) <- class(x)
y
## This is a (not atypical) case where it is better *not* to use ifelse(),
## but rather the more efficient and still clear:
y2 <- x
y2[as.POSIXlt(x)$mday != 29] <- NA
## which gives the same as ifelse()+class() hack:
stopifnot(identical(y2, y))

## example of different return modes (and 'test' alone determining length):
yes <- 1:3
no  <- pi^(1:4)
utils::str( ifelse(NA,    yes, no) ) # logical, length 1
utils::str( ifelse(TRUE,  yes, no) ) # integer, length 1
utils::str( ifelse(FALSE, yes, no) ) # double,  length 1
Creates or tests for objects of type "integer".
integer(length = 0)
as.integer(x, ...)
is.integer(x)
length | a non-negative integer specifying the desired length. Double values will be coerced to integer: supplying an argument of length other than one is an error. |
x | object to be coerced or tested. |
... | further arguments passed to or from other methods. |
Integer vectors exist so that data can be passed to C or Fortran code which expects them, and so that (small) integer data can be represented exactly and compactly.
Note that current implementations of R use 32-bit integers for integer vectors, so the range of representable integers is restricted to about +/-2*10^9: doubles can hold much larger integers exactly.
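The limits mentioned above can be inspected directly. This is a small sketch; the variable names are illustrative.

```r
.Machine$integer.max                           # 2147483647, i.e. 2^31 - 1

big <- .Machine$integer.max
overflow <- suppressWarnings(big + 1L)         # integer overflow gives NA (with a warning)
is.na(overflow)                                # TRUE

too_big <- suppressWarnings(as.integer(2^31))  # out of range, so NA
is.na(too_big)                                 # TRUE

# Doubles hold much larger integers exactly
as.numeric(big) + 1                            # 2147483648
(2^52 + 1) - 2^52                              # 1
```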
integer creates an integer vector of the specified length. Each element of the vector is equal to 0.
as.integer attempts to coerce its argument to be of integer type. The answer will be NA unless the coercion succeeds. Real values larger in modulus than the largest integer are coerced to NA (unlike S which gives the most extreme integer of the same sign). Non-integral numeric values are truncated towards zero (i.e., as.integer(x) equals trunc(x) there), and imaginary parts of complex numbers are discarded (with a warning). Character strings containing optional whitespace followed by either a decimal representation or a hexadecimal representation (starting with 0x or 0X) can be converted, as well as any allowed by the platform for real numbers. Like as.vector it strips attributes including names. (To ensure that an object x is of integer type without stripping attributes, use storage.mode(x) <- "integer".)
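The coercion rules above, shown on a few illustrative inputs (the values are chosen for this sketch only):

```r
# Truncation towards zero
as.integer(c(-2.7, 2.7))            # -2  2

# Decimal and hexadecimal strings, with optional leading whitespace
as.integer(" 12")                   # 12
as.integer("0x1A")                  # 26

# as.integer() strips attributes such as names ...
x <- c(a = 1.9, b = -1.9)
names(as.integer(x))                # NULL

# ... whereas changing the storage mode keeps them
storage.mode(x) <- "integer"
x                                   # a = 1, b = -1, still named
```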
is.integer returns TRUE or FALSE depending on whether its argument is of integer type or not, unless it is a factor when it returns FALSE.
is.integer(x) does not test if x contains integer numbers! For that, use round, as in the function is.wholenumber(x) in the examples.
Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988) The New S Language. Wadsworth & Brooks/Cole.
round (and ceiling and floor on that help page) to convert to integral values.
## as.integer() truncates:
x <- pi * c(-1:1, 10)
as.integer(x)

is.integer(1) # is FALSE !

is.wholenumber <-
    function(x, tol = .Machine$double.eps^0.5)  abs(x - round(x)) < tol
is.wholenumber(1) # is TRUE
(x <- seq(1, 5, by = 0.5) )
is.wholenumber( x ) #-->  TRUE FALSE TRUE ...
interaction computes a factor which represents the interaction of the given factors. The result of interaction is always unordered.
interaction(..., drop = FALSE, sep = ".", lex.order = FALSE)
... | the factors for which interaction is to be computed, or a single list giving those factors. |
drop | if |
sep | string to construct the new level labels by joining the constituent ones. |
lex.order | logical indicating if the order of factor concatenation should be lexically ordered. |
A factor which represents the interaction of the given factors. The levels are labelled as the levels of the individual factors joined by sep which is "." by default.
By default, when lex.order = FALSE, the levels are ordered so the level of the first factor varies fastest, then the second and so on. This is the reverse of lexicographic ordering (which you can get by lex.order = TRUE), and differs from the : operator. (It is done this way for compatibility with S.)
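The two level orderings can be compared on a pair of small factors. The factors f and g are made up for this sketch.

```r
f <- factor(c("a", "b"))
g <- factor(c("x", "y"))

# Default: the first factor varies fastest
levels(interaction(f, g))                     # "a.x" "b.x" "a.y" "b.y"

# Lexicographic: the last factor varies fastest
levels(interaction(f, g, lex.order = TRUE))   # "a.x" "a.y" "b.x" "b.y"

# The ':' operator uses the lexicographic order, with ":" as separator
levels(f:g)                                   # "a:x" "a:y" "b:x" "b:y"
```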
Chambers, J. M. and Hastie, T. J. (1992) Statistical Models in S. Wadsworth & Brooks/Cole.
factor; the : operator, where f:g is similar to interaction(f, g, sep = ":") when f and g are factors.
a <- gl(2, 4, 8)
b <- gl(2, 2, 8, labels = c("ctrl", "treat"))
s <- gl(2, 1, 8, labels = c("M", "F"))
interaction(a, b)
interaction(a, b, s, sep = ":")
stopifnot(identical(a:s,
                    interaction(a, s, sep = ":", lex.order = TRUE)),
          identical(a:s:b,
                    interaction(a, s, b, sep = ":", lex.order = TRUE)))
Return TRUE when R is being used interactively and FALSE otherwise.
interactive()
An interactive R session is one in which it is assumed that there is a human operator to interact with, so for example R can prompt for corrections to incorrect input or ask what to do next or if it is OK to move to the next plot.
GUI consoles will arrange to start R in an interactive session. When R is run in a terminal (via Rterm.exe on Windows), it assumes that it is interactive if ‘stdin’ is connected to a (pseudo-)terminal and not if ‘stdin’ is redirected to a file or pipe. Command-line options --interactive (Unix) and --ess (Windows, Rterm.exe) override the default assumption. (On a Unix-alike, whether the readline command-line editor is used is not overridden by --interactive.)
Embedded uses of R can set a session to be interactive or not.
Internally, whether a session is interactive determines
how some errors are handled and reported, e.g. see stop and options("showWarnCalls").
whether one of --save, --no-save or --vanilla is required, and if R ever asks whether to save the workspace.
the choice of default graphics device launched when needed and by dev.new: see options("device")
whether graphics devices ever ask for confirmation of a new page.
In addition, R's own R code makes use of interactive(): for example help, debugger and install.packages do.
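User code can follow the same pattern and only prompt when a human operator is assumed to be present. The function confirm below is a hypothetical helper written for this sketch, not part of base R.

```r
# Hypothetical helper: ask for confirmation only in interactive sessions,
# and fall back to a default answer in batch or script runs
confirm <- function(prompt = "Proceed? [y/n] ", default = TRUE) {
  if (!interactive()) return(default)
  answer <- readline(prompt)
  identical(tolower(substr(answer, 1, 1)), "y")
}

# In a non-interactive run (e.g. under Rscript) this returns 'default'
# without waiting for input:
# confirm()
```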
This is a primitive function.
.First <- function() if(interactive()) x11()
.Internal performs a call to an internal code which is built in to the R interpreter.
Only true R wizards should even consider using this function, and only R developers can add to the list of internal functions.
.Internal(call)
call | a call expression |
.Primitive, .External (the nearest equivalent available to users).
Many R-internal functions are generic and allow methods to be written for them.
The following primitive and internal functions are generic, i.e., you can write methods for them:
length, length<-, lengths, dimnames, dimnames<-, dim, dim<-, names, names<-, levels<-, @, @<-,
as.character, as.complex, as.double, as.integer, as.logical, as.raw, as.vector, as.call, as.environment,
is.array, is.matrix, is.na, anyNA, is.nan, is.finite, is.infinite, is.numeric, nchar, rep, rep.int, rep_len, seq.int (which dispatches methods for "seq"), is.unsorted and xtfrm
In addition, is.name is a synonym for is.symbol and dispatches methods for the latter. Similarly, as.numeric is a synonym for as.double and dispatches methods for the latter, i.e., S3 methods are for as.double, whereas S4 methods are to be written for as.numeric.
Note that all of the group generic functions are also internal/primitive and allow methods to be written for them.
.S3PrimitiveGenerics is a character vector listing the primitives which are internal generic and not group generic (not only for S3 but also S4). Similarly, the .internalGenerics character vector contains the names of the internal (via .Internal(..)) non-primitive functions which are internally generic.
For efficiency, internal dispatch only occurs on objects, that is those for which is.object returns true.
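A minimal sketch of writing an S3 method for an internal generic, here length. The class name "bag" and its layout are invented for this illustration.

```r
# An object of a made-up S3 class "bag" that wraps a list of items
b <- structure(list(items = list(1, "a", TRUE)), class = "bag")

# S3 method for the internal generic length()
length.bag <- function(x) length(x$items)

is.object(b)               # TRUE, so internal dispatch can occur
length(b)                  # 3, via length.bag
length(unclass(b))         # 1, no class attribute and hence no dispatch

"length" %in% .S3PrimitiveGenerics   # TRUE
```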
methods for the methods which are available.
Return a (temporarily) invisible copy of an object.
invisible(x = NULL)
x | an arbitrary R object, by default |
This function can be useful when it is desired to have functions return values which can be assigned, but which do not print when they are not assigned.
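Visibility can be checked programmatically with withVisible. In this sketch the function name quiet is illustrative.

```r
quiet <- function(x) invisible(x)

withVisible(quiet(1))$visible     # FALSE: the result would not auto-print
withVisible((quiet(1)))$visible   # TRUE: wrapping in parentheses makes it visible

y <- quiet(1)                     # the value can still be assigned
y                                 # 1
```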
This is a primitive function.
Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988) The New S Language. Wadsworth & Brooks/Cole.
# These functions both return their argument
f1 <- function(x) x
f2 <- function(x) invisible(x)
f1(1)  # prints
f2(1)  # does not
is.finite and is.infinite return a vector of the same length as x, indicating which elements are finite (not infinite and not missing) or infinite.
Inf and -Inf are positive and negative infinity whereas NaN means ‘Not a Number’. (These apply to numeric values and real and imaginary parts of complex values but not to values of integer vectors.) Inf and NaN (as well as NA) are reserved words in the R language.
is.finite(x)
is.infinite(x)
is.nan(x)
Inf
NaN
x | R object to be tested: the default methods handle atomic vectors. |
is.finite returns a vector of the same length as x the j-th element of which is TRUE if x[j] is finite (i.e., it is not one of the values NA, NaN, Inf or -Inf) and FALSE otherwise. Complex numbers are finite if both the real and imaginary parts are.
is.infinite returns a vector of the same length as x the j-th element of which is TRUE if x[j] is infinite (i.e., equal to one of Inf or -Inf) and FALSE otherwise. This will be false unless x is numeric or complex. Complex numbers are infinite if either the real or the imaginary part is.
is.nan tests if a numeric value is NaN. Do not test equality to NaN, or even use identical, since systems typically have many different NaN values. One of these is used for the numeric missing value NA, and is.nan is false for that value. A complex number is regarded as NaN if either the real or imaginary part is NaN but not NA. All elements of logical, integer and raw vectors are considered not to be NaN.
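The four predicates can be compared side by side on one vector. The vector x below is chosen for illustration.

```r
x <- c(1, NA, NaN, Inf, -Inf)

is.finite(x)      # TRUE  FALSE FALSE FALSE FALSE
is.infinite(x)    # FALSE FALSE FALSE TRUE  TRUE
is.nan(x)         # FALSE FALSE TRUE  FALSE FALSE
is.na(x)          # FALSE TRUE  TRUE  FALSE FALSE

# Equality tests against NaN are not useful: the result is NA
NaN == NaN        # NA

# Integer NA is neither finite nor NaN
is.finite(NA_integer_)   # FALSE
is.nan(NA_integer_)      # FALSE
```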
All three functions accept NULL as input and return a length zero result. The default methods accept character and raw vectors, and return FALSE for all entries. Prior to R version 2.14.0 they accepted all input, returning FALSE for most non-numeric values; cases which are not atomic vectors are now signalled as errors.
All three functions are generic: you can write methods to handle specific classes of objects, see InternalMethods.
A logical vector of the same length as x: dim, dimnames and names attributes are preserved.
In R, basically all mathematical functions (including basic Arithmetic), are supposed to work properly with +/- Inf and NaN as input or output.
The basic rule should be that calls and relations with Infs really are statements with a proper mathematical limit.
Computations involving NaN will return NaN or perhaps NA: which of those two is not guaranteed and may depend on the R platform (since compilers may re-order computations).
The IEC 60559 standard, also known as the ANSI/IEEE 754 Floating-Point Standard.
https://en.wikipedia.org/wiki/NaN.
D. Goldberg (1991). What Every Computer Scientist Should Know about Floating-Point Arithmetic. ACM Computing Surveys, 23(1), 5–48. doi:10.1145/103162.103163.
Also available at https://docs.oracle.com/cd/E19957-01/806-3568/ncg_goldberg.html.
The C99 function isfinite is used for is.finite.
NA, ‘Not Available’ which is not a number as well, however usually used for missing values and applies to many modes, not just numeric and complex.
pi / 0 ## = Inf a non-zero number divided by zero creates infinity
0 / 0  ## =  NaN

1/0 + 1/0 # Inf
1/0 - 1/0 # NaN

stopifnot(
    1/0 == Inf,
    1/Inf == 0
)
sin(Inf)
cos(Inf)
tan(Inf)
Checks whether its argument is a (primitive) function.
is.function(x)
is.primitive(x)
x | an R object. |
is.primitive(x) tests if x is a primitive function, i.e., if typeof(x) is either "builtin" or "special".
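The relation between typeof and is.primitive, shown on three well-known functions (a small sketch):

```r
typeof(sum)         # "builtin"
typeof(`if`)        # "special"
typeof(mean)        # "closure"

is.primitive(sum)   # TRUE
is.primitive(`if`)  # TRUE
is.primitive(mean)  # FALSE, although is.function(mean) is TRUE
```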
TRUE if x is a (primitive) function, and FALSE otherwise.
is.function(1) # FALSE
is.function (is.primitive) # TRUE: it is a function, but ..
is.primitive(is.primitive) # FALSE: it's not a primitive one, whereas
is.primitive(is.function)  # TRUE: that one *is*
is.language returns TRUE if x is a variable name, a call, or an expression.
is.language(x)
x | object to be tested. |
A name is also known as ‘symbol’, from its type (typeof), see is.symbol.
If typeof(x) == "language", then is.language(x) is always true, but the reverse does not hold as expressions or names y also fulfill is.language(y), see the examples.
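A short sketch contrasting typeof with is.language. The list objs and its element names are made up for this illustration.

```r
objs <- list(call = quote(x + 1),
             name = quote(x),
             expr = expression(x + 1),
             num  = 1,
             fun  = function(x) x)

sapply(objs, typeof)
#  "language" "symbol" "expression" "double" "closure"

sapply(objs, is.language)
#  TRUE TRUE TRUE FALSE FALSE
```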
This is a primitive function.
Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988) The New S Language. Wadsworth & Brooks/Cole.
ll <- list(a = expression(x^2 - 2*x + 1), b = as.name("Jim"),
           c = as.expression(exp(1)), d = call("sin", pi))
sapply(ll, typeof)
sapply(ll, mode)
stopifnot(sapply(ll, is.language))
A function mostly for internal use. It returns TRUE if the object x has the R internal OBJECT bit set, and FALSE otherwise. The OBJECT bit is set when a "class" attribute is added and removed when that attribute is removed, so this is a very efficient way to check if an object has a class attribute. (S4 objects always should.)
Note that typical basic (‘atomic’, see is.atomic) R vectors and arrays x are not objects in the above sense as attributes(x) does not contain "class".
is.object(x)
x | object to be tested. |
This is a primitive function.
isS4.
is.object(1) # FALSE
is.object(as.factor(1:3)) # TRUE
is.atomic returns TRUE if x is of an atomic type and FALSE otherwise.
is.recursive returns TRUE if x has a recursive (list-like) structure and FALSE otherwise.
is.atomic(x)
is.recursive(x)
x | object to be tested. |
is.atomic is true for the atomic types ("logical", "integer", "numeric", "complex", "character" and "raw").
Most types of objects are regarded as recursive. Exceptions are the atomic types, NULL, symbols (as given by as.name), S4 objects with slots, external pointers, and (rarely visible from R) weak references and byte code, see typeof.
It is common to call the atomic types ‘atomic vectors’, but note that is.vector imposes further restrictions: an object can be atomic but not a vector (in that sense).
These are primitive functions.
Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988) The New S Language. Wadsworth & Brooks/Cole.
is.list, is.language, etc, and the demo("is.things").
require(stats)
is.a.r <- function(x) c(is.atomic(x), is.recursive(x))

is.a.r(c(a = 1, b = 3)) # TRUE FALSE
is.a.r(list())          # FALSE TRUE - a list is a list
is.a.r(list(2))         # FALSE TRUE
is.a.r(lm)              # FALSE TRUE
is.a.r(y ~ x)           # FALSE TRUE
is.a.r(expression(x+1)) # FALSE TRUE
is.a.r(quote(exp))      # FALSE FALSE
is.a.r(NULL)            # FALSE FALSE
is.single reports an error. There are no single precision values in R.
is.single(x)
x | object to be tested. |
Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988) The New S Language. Wadsworth & Brooks/Cole.
Test if an object is not sorted (in increasing order), without the cost of sorting it.
is.unsorted(x, na.rm = FALSE, strictly = FALSE)
x | an R object with a class or a numeric, complex, character, logical or raw vector. |
na.rm | logical. Should missing values be removed before checking? |
strictly | logical indicating if the check should be for strictly increasing values. |
is.unsorted is generic: you can write methods to handle specific classes of objects, see InternalMethods.
A length-one logical value. All objects of length 0 or 1 are sorted. Otherwise, the result will be NA except for atomic vectors and objects with an S3 class (where the >= or > method is used to compare x[i] with x[i-1] for i in 2:length(x)) or with an S4 class where you have to provide a method for is.unsorted().
This function is designed for objects with one-dimensional indices, as described above. Data frames, matrices and other arrays may give surprising results.
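The effect of the strictly and na.rm arguments can be seen on a few small vectors (values chosen for illustration):

```r
is.unsorted(c(1, 2, 2, 3))                     # FALSE: non-decreasing
is.unsorted(c(1, 2, 2, 3), strictly = TRUE)    # TRUE: the tie breaks strict order
is.unsorted(c(3, 1, 2))                        # TRUE

# Missing values make the answer unknown unless they are removed
is.unsorted(c(1, NA, 3))                       # NA
is.unsorted(c(1, NA, 3), na.rm = TRUE)         # FALSE

# Length 0 or 1 always counts as sorted
is.unsorted(numeric(0))                        # FALSE
```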
Convenience wrappers to create date-times from numeric representations.
ISOdatetime(year, month, day, hour, min, sec, tz = "")
ISOdate(year, month, day, hour = 12, min = 0, sec = 0, tz = "GMT")
year, month, day | numerical values to specify a day. |
hour, min, sec | numerical values for a time within a day. Fractional seconds are allowed. |
tz | a time zone specification to be used for the conversion. |
ISOdatetime and ISOdate are convenience wrappers for strptime that differ only in their defaults and that ISOdate sets UTC as the time zone. For dates without times it would normally be better to use the "Date" class.
The main arguments will be recycled using the usual recycling rules.
Because these make use of strptime, only years in the range 0:9999 are accepted.
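A brief usage sketch of both wrappers; the dates are arbitrary and the time zone is fixed to UTC/GMT so the output does not depend on the local settings.

```r
# ISOdate() defaults to 12:00:00 in GMT
d <- ISOdate(2024, 2, 29)
format(d, "%Y-%m-%d %H:%M:%S", tz = "GMT")    # "2024-02-29 12:00:00"

# ISOdatetime() needs the full time of day
dt <- ISOdatetime(2024, 2, 29, 8, 30, 0, tz = "UTC")
class(dt)                                      # "POSIXct" "POSIXt"
as.numeric(difftime(d, dt, units = "hours"))   # 3.5

# Arguments are recycled, and an impossible date gives NA
length(ISOdate(2024, 1:3, 1))                  # 3
is.na(ISOdate(2023, 2, 29))                    # TRUE
```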
An object of class "POSIXct".
DateTimeClasses for details of the date-time classes; strptime for conversions from character strings.
Tests whether the object is an instance of an S4 class.
isS4(object)
asS4(object, flag = TRUE, complete = TRUE)
asS3(object, flag = TRUE, complete = TRUE)
object | Any R object. |
flag | Optional, logical: indicate direction of conversion. |
complete | Optional, logical: whether conversion to S3 is completed. Not usually needed, but see the details section. |
Note that isS4 does not rely on the methods package, so in particular it can be used to detect the need to require that package.
asS3 uses the value of complete to control whether an attempt is made to transform object into a valid object of the implied S3 class. If complete is TRUE, then an object from an S4 class extending an S3 class will be transformed into an S3 object with the corresponding S3 class (see S3Part). This includes classes extending the pseudo-classes array and matrix: such objects will have their class attribute set to NULL.
isS4 is primitive.
isS4 always returns TRUE or FALSE according to whether the internal flag marking an S4 object has been turned on for this object.
asS4 and asS3 will turn this flag on or off, and asS3 will set the class from the object's .S3Class slot if one exists. Note that asS3 will not turn the object into an S3 object unless there is a valid conversion; that is, an object of type other than "S4" for which the S4 object is an extension, unless argument complete is FALSE.
is.object for a more general test; Introduction for general information on S4; Classes_Details for more on S4 class definitions.
isS4(pi) # FALSE
isS4(getClass("MethodDefinition")) # TRUE
Generic function to test if object is symmetric or not. Currently only a matrix method is implemented, where a complex matrix Z must be “Hermitian” for isSymmetric(Z) to be true.
isSymmetric(object, ...)
## S3 method for class 'matrix'
isSymmetric(object, tol = 100 * .Machine$double.eps, tol1 = 8 * tol, ...)
object | any R object; a |
tol | numeric scalar >= 0. Smaller differences are not considered, see |
tol1 | numeric scalar >= 0. |
... | further arguments passed to methods; the matrix method passes these to |
The matrix method is used inside eigen by default to test symmetry of matrices up to rounding error, using all.equal. It might not be appropriate in all situations.
Note that a matrix m is only symmetric if its rownames and colnames are identical. Consider using unname(m).
logical indicating if object is symmetric or not.
eigen which calls isSymmetric when its symmetric argument is missing.
isSymmetric(D3 <- diag(3)) # -> TRUE

D3[2, 1] <- 1e-100
D3
isSymmetric(D3) # TRUE
isSymmetric(D3, tol = 0) # FALSE for zero-tolerance

## Complex Matrices - Hermitian or not
Z <- sqrt(matrix(-1:2 + 0i, 2)); Z <- t(Conj(Z)) %*% Z
Z
isSymmetric(Z)      # TRUE
isSymmetric(Z + 1)  # TRUE
isSymmetric(Z + 1i) # FALSE -- a Hermitian matrix has a *real* diagonal

colnames(D3) <- c("X", "Y", "Z")
isSymmetric(D3)                         # FALSE (as row and column names differ)
isSymmetric(D3, check.attributes=FALSE) # TRUE  (as names are not checked)
Add a small amount of noise to a numeric vector.
jitter(x, factor = 1, amount = NULL)
x | numeric vector to which jitter should be added. |
factor | numeric. |
amount | numeric; if positive, used as amount (see below), otherwise, if Default ( |
The result, say r, is r <- x + runif(n, -a, a) where n <- length(x) and a is the amount argument (if specified).
Let z <- max(x) - min(x) (assuming the usual case). The amount a to be added is either provided as positive argument amount or otherwise computed from z, as follows:
If amount == 0, we set a <- factor * z/50 (same as S).
If amount is NULL (default), we set a <- factor * d/5 where d is the smallest difference between adjacent unique (apart from fuzz) x values.
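The two rules can be checked numerically with a short sketch. The data vector x is invented for illustration, and d is computed here without the fuzz handling that jitter applies internally, which makes no difference for these values.

```r
set.seed(42)                      # only to make the sketch reproducible
x <- c(0, 0, 0.5, 0.5, 10)

z  <- max(x) - min(x)             # range of the data
a0 <- 1 * z / 50                  # rule for amount == 0, with factor = 1
r0 <- jitter(x, amount = 0)

d  <- min(diff(sort(unique(x))))  # smallest gap between distinct values
aN <- 1 * d / 5                   # rule for amount = NULL, with factor = 1
rN <- jitter(x)

eps <- 1e-12                      # allowance for floating-point rounding
c(within_a0 = all(abs(r0 - x) <= a0 + eps),
  within_aN = all(abs(rN - x) <= aN + eps))
```

In both cases the added noise stays inside the interval [-a, a] given by the corresponding rule.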
jitter(x, ...) returns a numeric of the same length as x, but with an amount of noise added in order to break ties.
Werner Stahel and Martin Maechler, ETH Zurich
Chambers, J. M., Cleveland, W. S., Kleiner, B. and Tukey, P. A. (1983) Graphical Methods for Data Analysis. Wadsworth; figures 2.8, 4.22, 5.4.
Chambers, J. M. and Hastie, T. J. (1992) Statistical Models in S. Wadsworth & Brooks/Cole.
rug which you may want to combine with jitter.
round(jitter(c(rep(1, 3), rep(1.2, 4), rep(3, 3))), 3)
## These two 'fail' with S-plus 3.x:
jitter(rep(0, 7))
jitter(rep(10000, 5))
The condition number of a regular (square) matrix is the product of the norm of the matrix and the norm of its inverse (or pseudo-inverse), and hence depends on the kind of matrix-norm.
kappa() computes by default (an estimate of) the 2-norm condition number of a matrix or of the R matrix of a QR decomposition, perhaps of a linear fit. The 2-norm condition number can be shown to be the ratio of the largest to the smallest non-zero singular value of the matrix.
rcond() computes an approximation of the reciprocal condition number, see the details.
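As a minimal sketch of these definitions, the exact 2-norm condition number can be compared with the singular-value ratio, and the 1-norm version can be formed directly as a product of norms. The matrix A is arbitrary and chosen only for illustration.

```r
A <- matrix(c(4, 1, 2, 3), 2, 2)   # small, well-conditioned example matrix

## 2-norm condition number: ratio of extreme singular values
sv <- svd(A)$d
k2 <- kappa(A, exact = TRUE)
all.equal(k2, max(sv) / min(sv))

## 1-norm condition number from the definition: ||A|| * ||A^{-1}||
k1 <- norm(A, "1") * norm(solve(A), "1")

## rcond() approximates the reciprocal of the 1-norm condition number
rc <- rcond(A)
c(kappa2 = k2, kappa1 = k1, inv_rcond = 1 / rc)
```

Since rcond() returns an approximation, 1 / rc need not agree exactly with k1 in general.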
kappa(z, ...)
## Default S3 method:
kappa(z, exact = FALSE, norm = NULL, method = c("qr", "direct"), inv_z = solve(z), triangular = FALSE, uplo = "U", ...)
## S3 method for class 'lm'
kappa(z, ...)
## S3 method for class 'qr'
kappa(z, ...)
.kappa_tri(z, exact = FALSE, LINPACK = TRUE, norm = NULL, uplo = "U", ...)
rcond(x, norm = c("O","I","1"), triangular = FALSE, uplo = "U", ...)
z, x | a numeric or complex matrix or a result of |
exact | logical. Should the result be exact (up to small rounding error) as opposed to fast (but quite inaccurate)? |
norm | character string, specifying the matrix norm with respect to which the condition number is to be computed, see the function |
method | a partially matched character string specifying the method to be used; |
inv_z | for |
triangular | logical. If true, the matrix used is just the upper or lower triangular part of |
uplo | character string, either |
LINPACK | logical. If true and |
... | further arguments passed to or from other methods; for |
For kappa(), if exact = FALSE (the default) the condition number is estimated by a cheap approximation to the 1-norm of the triangular matrix R of the qr(x) decomposition. However, the exact 2-norm calculation (via svd) is also likely to be quick enough.
Note that the approximate 1- and Inf-norm condition numbers via method = "direct" are much faster to calculate, and rcond() computes these reciprocal condition numbers, also for complex matrices, using standard LAPACK routines. Currently, also the kappa*() functions compute these approximations whenever exact is false, i.e., by default.
kappa and rcond are different interfaces to partly identical functionality.
.kappa_tri is an internal function called by kappa.qr and kappa.default; tri is for triangular and its methods only consider the upper or lower triangular part of the matrix, depending on uplo = "U" or "L", where "U" was internally hardwired before R 4.4.0.
Unsuccessful results from the underlying LAPACK code will result in an error giving a positive error code: these can only be interpreted by detailed study of the FORTRAN code.
The condition number, kappa, or an approximation if exact = FALSE.
The design was inspired by (but differs considerably from) the S function of the same name described in Chambers (1992).
The LAPACK routines DTRCON and ZTRCON and the LINPACK routine DTRCO.
LAPACK and LINPACK are from https://netlib.org/lapack/ and https://netlib.org/linpack/ and their guides are listed in the references.
Anderson, E. and ten others (1999) LAPACK Users' Guide. Third Edition. SIAM. Available on-line at https://netlib.org/lapack/lug/lapack_lug.html.
Chambers, J. M. (1992) Linear models. Chapter 4 of Statistical Models in S, eds J. M. Chambers and T. J. Hastie, Wadsworth & Brooks/Cole.
Dongarra, J. J., Bunch, J. R., Moler, C. B. and Stewart, G. W. (1978) LINPACK Users Guide. Philadelphia: SIAM Publications.
norm; svd for the singular value decomposition and qr for the QR one.
kappa(x1 <- cbind(1, 1:10))  # 15.71
kappa(x1, exact = TRUE)      # 13.68
kappa(x2 <- cbind(x1, 2:11)) # high! [x2 is singular!]
hilbert <- function(n) { i <- 1:n; 1 / outer(i - 1, i, `+`) }
sv9 <- svd(h9 <- hilbert(9))$d
kappa(h9) # pretty high; by default {exact=FALSE, method="qr"} :
kappa(h9) == kappa(qr.R(qr(h9)), norm = "1")
all.equal(kappa(h9, exact = TRUE), # its definition:
          max(sv9) / min(sv9),
          tolerance = 1e-12) ## the same (typically down to 2.22e-16)
kappa(h9, exact = TRUE) / kappa(h9) # 0.677 (i.e., rel.error = 32%)
## Exact kappa for rectangular matrix
## panmagic.6npm1(7) :
pm7 <- rbind(c( 1, 13, 18, 23, 35, 40, 45),
             c(37, 49,  5, 10, 15, 27, 32),
             c(24, 29, 41, 46,  2, 14, 19),
             c(11, 16, 28, 33, 38, 43,  6),
             c(47,  3,  8, 20, 25, 30, 42),
             c(34, 39, 44,  7, 12, 17, 22),
             c(21, 26, 31, 36, 48,  4,  9))
kappa(pm7, exact=TRUE, norm="1") # no problem for square matrix
m76 <- pm7[,1:6]
(m79 <- cbind(pm7, 50:56, 63:57))
## Moore-Penrose inverse { ~= MASS::ginv(); differing tol (value & meaning)}:
## pinv := p(seudo) inv(erse)
pinv <- function(X, s = svd(X), tol = 64*.Machine$double.eps) {
    if (is.complex(X))
        s$u <- Conj(s$u)
    dx <- dim(X)
    ## X = U D V' ==> Result = V {1/D} U'
    pI <- function(u,d,v) tcrossprod(v, u / rep(d, each = dx[1L]))
    pos <- (d <- s$d) > max(tol * max(dx) * d[1L], 0)
    if (all(pos)) pI(s$u, d, s$v)
    else if (!any(pos)) array(0, dx[2L:1L]) # all-zero result with transposed dimensions
    else { # some pos, some not:
        i <- which(pos)
        pI(s$u[, i, drop = FALSE], d[i], s$v[, i, drop = FALSE])
    }
}
## rectangular
kappa(m76, norm="1")
try( kappa(m76, exact=TRUE, norm="1") ) # error in solve().. must be square
## ==> use pseudo-inverse instead of solve() for rectangular {and norm != "2"}:
iZ <- pinv(m76)
kappa(m76, exact=TRUE, norm="1", inv_z = iZ)
kappa(m76, exact=TRUE, norm="M", inv_z = iZ)
kappa(m76, exact=TRUE, norm="I", inv_z = iZ)
iX <- pinv(m79)
kappa(m79, exact=TRUE, norm="1", inv_z = iX)
kappa(m79, exact=TRUE, norm="M", inv_z = iX)
kappa(m79, exact=TRUE, norm="I", inv_z = iX)
Computes the generalised Kronecker product of two arrays, X and Y.
kronecker(X, Y, FUN = "*", make.dimnames = FALSE, ...)
X %x% Y
X | a vector or array. |
Y | a vector or array. |
FUN | a function; it may be a quoted string. |
make.dimnames | logical: provide dimnames that are the product of the dimnames of |
... | optional arguments to be passed to |
If X and Y do not have the same number of dimensions, the smaller array is padded with dimensions of size one. The returned array comprises submatrices constructed by taking X one term at a time and expanding that term as FUN(x, Y, ...).
%x% is an alias for kronecker (where FUN is hardwired to "*").
An array A with dimensions dim(X) * dim(Y).
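A short sketch of the block structure and the resulting dimensions, using two small matrices X and Y invented for the purpose:

```r
X <- matrix(1:4, 2, 2)
Y <- matrix(c(1, 10), 1, 2)

K <- kronecker(X, Y)
dim(K)                       # equals dim(X) * dim(Y)
same_as_alias <- identical(K, X %x% Y)

## The first block of K is X[1, 1] expanded over Y
first_block <- K[1, 1:2]

## With a different FUN the same expansion applies, e.g. addition
S <- kronecker(X, Y, FUN = "+")
first_block_sum <- S[1, 1:2] # X[1, 1] + Y
```

The call with FUN = "+" shows why the product is called generalised: each term of X is combined with the whole of Y by whatever function is supplied.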
Jonathan Rougier
Shayle R. Searle (1982) Matrix Algebra Useful for Statistics. John Wiley and Sons.
outer, on which kronecker is built and %*% for usual matrix multiplication.
# simple scalar multiplication
( M <- matrix(1:6, ncol = 2) )
kronecker(4, M)
# Block diagonal matrix:
kronecker(diag(1, 3), M)
# ask for dimnames
fred <- matrix(1:12, 3, 4, dimnames = list(LETTERS[1:3], LETTERS[4:7]))
bill <- c("happy" = 100, "sad" = 1000)
kronecker(fred, bill, make.dimnames = TRUE)
bill <- outer(bill, c("cat" = 3, "dog" = 4))
kronecker(fred, bill, make.dimnames = TRUE)
Report on localization information.
l10n_info()
‘A Latin-1 locale’ includes supersets (for printable characters) such as Windows codepage 1252 but not Latin-9 (ISO 8859-15).
On Windows (where the resulting list contains codepage and system.codepage components additionally), common codepages are 1252 (Western European), 1250 (Central European), 1251 (Cyrillic), 1253 (Greek), 1254 (Turkish), 1255 (Hebrew), 1256 (Arabic), 1257 (Baltic), 1258 (Vietnamese), 874 (Thai), 932 (Japanese), 936 (Simplified Chinese), 949 (Korean) and 950 (Traditional Chinese). Codepage 28605 is Latin-9 and 65001 is UTF-8 (where supported). R does not allow the C locale, and uses 1252 as the default codepage.
A list with three logical elements and further OS-specific elements:
MBCS | Is a multi-byte character set in use? |
UTF-8 | Is this known to be a UTF-8 locale? |
Latin-1 | Is this known to be a Latin-1 locale? |
Not on Windows:
codeset | character. The encoding name as reported by the OS, possibly |
Only on Windows:
codepage | integer: the Windows codepage corresponding to the locale R is using (and not necessarily that Windows is using). |
system.codepage | integer: the Windows system/ANSI codepage (the codepage Windows is using). Added in R 4.1.0. |
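A minimal sketch of how the returned list might be inspected. Only the three logical elements documented above are relied on; the message texts are illustrative.

```r
info <- l10n_info()

## The three logical elements are always present
core <- info[c("MBCS", "UTF-8", "Latin-1")]
str(core)

## Element names containing '-' need [[ ]] (or backquotes with $)
if (isTRUE(info[["UTF-8"]])) {
  message("session is running in a UTF-8 locale")
} else {
  message("session is not known to be in a UTF-8 locale")
}

## OS-specific extras, if any
setdiff(names(info), names(core))
```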
l10n_info()
Report the name of the shared object file with LAPACK implementation in use.
La_library()
A character vector of length one ("" when the name is not known). The value can be used as an indication of which LAPACK implementation is in use. Typically, the R version of LAPACK will appear as libRlapack.so (libRlapack.dylib), depending on how R was built. Note that libRlapack.so (libRlapack.dylib) may also be shown for an external LAPACK implementation that had been copied, hard-linked or renamed by the system administrator. Otherwise, the shared object file will be given and its path/name may indicate the vendor/version.
The detection does not work on Windows, nor for the Accelerate framework on macOS, nor in the rare (and unsupported) case of a static external library.
It is possible to build R against an enhanced BLAS which contains some but not all LAPACK routines, in which case this function reports the library containing routine ILAVER.
extSoftVersion for versions of other third-party software including BLAS.
La_version for the version of LAPACK in use.
La_library()
Report the version of LAPACK in use.
La_version()
A character vector of length one.
Note that this is the version as reported by the library at runtime. It may differ from the reference (‘netlib’) implementation, for example by having some optimized or patched routines. For the version included with R, the older (not Fortran 90) versions of DLARTG DLASSQ ZLARTG ZLASSQ are used.
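The two LAPACK queries can be combined into a one-line report, as in the following sketch. The fallback text for an unknown library name is an illustrative choice.

```r
ver <- La_version()   # version string reported by the library at runtime
lib <- La_library()   # shared object file name, "" when not known

cat("LAPACK", ver, "from",
    if (nzchar(lib)) lib else "<library name not known>", "\n")
```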
extSoftVersion for versions of other third-party software.
La_library for binary/executable file with LAPACK in use.
La_version()
Find a suitable set of labels from an object for use in printing or plotting, for example. A generic function.
labels(object, ...)
object | any R object: the function is generic. |
... | further arguments passed to or from other methods. |
A character vector or list of such vectors. For a vector the result is the names or seq_along(x) and for a data frame or array it is the dimnames (with NULL expanded to seq_len(d[i])).
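The behaviour of the default method described above can be sketched with three small objects, all invented for illustration:

```r
named   <- labels(c(a = 1, b = 2))       # names of a named vector
unnamed <- labels(1:3)                   # positions, as character strings
mat     <- labels(matrix(1:6, 2, 3))     # one component per dimension

named
unnamed
str(mat)
```

For the matrix without dimnames the result is a list with one character vector per dimension.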
Chambers, J. M. and Hastie, T. J. (1992) Statistical Models in S. Wadsworth & Brooks/Cole.
lapply returns a list of the same length as X, each element of which is the result of applying FUN to the corresponding element of X.
sapply is a user-friendly version and wrapper of lapply by default returning a vector, matrix or, if simplify = "array", an array if appropriate, by applying simplify2array(). sapply(x, f, simplify = FALSE, USE.NAMES = FALSE) is the same as lapply(x, f).
vapply is similar to sapply, but has a pre-specified type of return value, so it can be safer (and sometimes faster) to use.
replicate is a wrapper for the common use of sapply for repeated evaluation of an expression (which will usually involve random number generation).
simplify2array() is the utility called from sapply() when simplify is not false and is similarly called from mapply().
lapply(X, FUN, ...)
sapply(X, FUN, ..., simplify = TRUE, USE.NAMES = TRUE)
vapply(X, FUN, FUN.VALUE, ..., USE.NAMES = TRUE)
replicate(n, expr, simplify = "array")
simplify2array(x, higher = TRUE, except = c(0L, 1L))
X | a vector (atomic or list) or an |
FUN | the function to be applied to each element of |
... | optional arguments to |
simplify | logical or character string; should the result be simplified to a vector, matrix or higher dimensional array if possible? For |
USE.NAMES | logical; if |
FUN.VALUE | a (generalized) vector; a template for the return value from FUN. See ‘Details’. |
n | integer: the number of replications. |
expr | the expression (a language object, usually a call) to evaluate repeatedly. |
x | a list, typically returned from |
higher | logical; if true, |
except | integer vector or |
FUN is found by a call to match.fun and typically is specified as a function or a symbol (e.g., a backquoted name) or a character string specifying a function to be searched for from the environment of the call to lapply.
Function FUN must be able to accept as input any of the elements of X. If the latter is an atomic vector, FUN will always be passed a length-one vector of the same type as X.
Arguments in ... cannot have the same name as any of the other arguments, and care may be needed to avoid partial matching to FUN. In general-purpose code it is good practice to name the first two arguments X and FUN if ... is passed through: this both avoids partial matching to FUN and ensures that a sensible error message is given if arguments named X or FUN are passed through ....
Simplification in sapply is only attempted if X has length greater than zero and if the return values from all elements of X are all of the same (positive) length. If the common length is one the result is a vector, and if greater than one is a matrix with a column corresponding to each element of X.
Simplification is always done in vapply. This function checks that all values of FUN are compatible with the FUN.VALUE, in that they must have the same length and type. (Types may be promoted to a higher type within the ordering logical < integer < double < complex, but not demoted.)
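The checks that vapply performs against FUN.VALUE can be sketched as follows. The helper sq and the error marker string are made up for illustration.

```r
sq <- function(i) i^2

## Results match the template: a double vector of the same length as X
v <- vapply(1:3, sq, FUN.VALUE = numeric(1))

## Integer results are promoted to the double template
p <- vapply(1:3, function(i) i, FUN.VALUE = numeric(1))
typeof(p)

## Results of the wrong length are rejected with an error
wrong_length <- tryCatch(
  vapply(1:3, function(i) c(i, i^2), FUN.VALUE = numeric(1)),
  error = function(e) "rejected")

## Demotion (double result, integer template) is rejected as well
demoted <- tryCatch(
  vapply(1:3, function(i) i + 0.5, FUN.VALUE = integer(1)),
  error = function(e) "rejected")
```

This is the sense in which vapply can be safer than sapply: a mismatch is reported where it arises instead of silently changing the shape or type of the result.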
Users of S4 classes should pass a list to lapply and vapply: the internal coercion is done by the as.list in the base namespace and not one defined by a user (e.g., by setting S4 methods on the base function).
For lapply, sapply(simplify = FALSE) and replicate(simplify = FALSE), a list.
For sapply(simplify = TRUE) and replicate(simplify = TRUE): if X has length zero or n = 0, an empty list. Otherwise an atomic vector or matrix or list of the same length as X (of length n for replicate). If simplification occurs, the output type is determined from the highest type of the return values in the hierarchy NULL < raw < logical < integer < double < complex < character < list < expression, after coercion of pairlists to lists.
vapply returns a vector or array of type matching the FUN.VALUE. If length(FUN.VALUE) == 1 a vector of the same length as X is returned, otherwise an array. If FUN.VALUE is not an array, the result is a matrix with length(FUN.VALUE) rows and length(X) columns, otherwise an array a with dim(a) == c(dim(FUN.VALUE), length(X)).
The (Dim)names of the array value are taken from the FUN.VALUE if it is named, otherwise from the result of the first function call. Column names of the matrix or more generally the names of the last dimension of the array value or names of the vector value are set from X as in sapply.
sapply(*, simplify = FALSE, USE.NAMES = FALSE) is equivalent to lapply(*).
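A few of the return-value rules above can be checked directly. The list x is invented for illustration.

```r
x <- list(a = 1:3, b = 4:6)

## Without simplification sapply gives the same list as lapply
same_list <- identical(sapply(x, range, simplify = FALSE, USE.NAMES = FALSE),
                       lapply(x, range))

## Common length 2: a matrix with one column per element of X
m <- sapply(x, range)
dim(m)
colnames(m)

## Zero-length input: sapply returns an empty list, vapply keeps the type
empty_s <- sapply(list(), length)
empty_v <- vapply(list(), length, FUN.VALUE = integer(1))
```

The zero-length case is one reason to prefer vapply in code that relies on the type of the result.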
For historical reasons, the calls created by lapply are unevaluated, and code has been written (e.g., bquote) that relies on this. This means that the recorded call is always of the form FUN(X[[i]], ...), with i replaced by the current (integer or double) index. This is not normally a problem, but it can be if FUN uses sys.call or match.call or if it is a primitive function that makes use of the call. This means that it is often safer to call primitive functions with a wrapper, so that e.g. lapply(ll, function(x) is.numeric(x)) is required to ensure that method dispatch for is.numeric occurs correctly.
If expr is a function call, be aware of assumptions about where it is evaluated, and in particular what ... might refer to. You can pass additional named arguments to a function call as additional named arguments to replicate: see ‘Examples’.
Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988) The New S Language. Wadsworth & Brooks/Cole.
apply, tapply, mapply for applying a function to multiple arguments, and rapply for a recursive version of lapply(), eapply for applying a function to each entry in an environment.
require(stats); require(graphics)
x <- list(a = 1:10, beta = exp(-3:3), logic = c(TRUE,FALSE,FALSE,TRUE))
# compute the list mean for each list element
lapply(x, mean)
# median and quartiles for each list element
lapply(x, quantile, probs = 1:3/4)
sapply(x, quantile)
i39 <- sapply(3:9, seq) # list of vectors
sapply(i39, fivenum)
vapply(i39, fivenum,
       c(Min. = 0, "1st Qu." = 0, Median = 0, "3rd Qu." = 0, Max. = 0))
## sapply(*, "array") -- artificial example
(v <- structure(10*(5:8), names = LETTERS[1:4]))
f2 <- function(x, y) outer(rep(x, length.out = 3), y)
(a2 <- sapply(v, f2, y = 2*(1:5), simplify = "array"))
a.2 <- vapply(v, f2, outer(1:3, 1:5), y = 2*(1:5))
stopifnot(dim(a2) == c(3,5,4), all.equal(a2, a.2),
          identical(dimnames(a2), list(NULL,NULL,LETTERS[1:4])))
hist(replicate(100, mean(rexp(10))))
## use of replicate() with parameters:
foo <- function(x = 1, y = 2) c(x, y)
# does not work: bar <- function(n, ...) replicate(n, foo(...))
bar <- function(n, x) replicate(n, foo(x = x))
bar(5, x = 3)
The value of the internal evaluation of a top-level R expression is always assigned to .Last.value (in package:base) before further processing (e.g., printing).
.Last.value
The value of a top-level assignment is put in .Last.value, unlike S.
Do not assign to .Last.value in the workspace, because this will always mask the object of the same name in package:base.
## These will not work correctly from example(),
## but they will in make check or if pasted in,
## as example() does not run them at the top level
gamma(1:15)          # think of some intensive calculation...
fac14 <- .Last.value # keep them
library("splines") # returns invisibly
.Last.value    # shows what library(.) above returned
Get or set the length of vectors (including lists) and factors, and of any other R object for which a method has been defined.
length(x)
length(x) <- value
x | an R object. For replacement, a vector or factor. |
value | a non-negative integer or double (which will be rounded down). |
Both functions are generic: you can write methods to handle specific classes of objects, see InternalMethods. length<- has a "factor" method.
The replacement form can be used to reset the length of a vector. If a vector is shortened, extra values are discarded and when a vector is lengthened, it is padded out to its new length with NAs (nul for raw vectors).
Both are primitive functions.
The default method for length currently returns a non-negative integer of length 1, except for vectors of more than 2^31 - 1 elements, when it returns a double.
For vectors (including lists) and factors the length is the number of elements. For an environment it is the number of objects in the environment, and NULL has length 0. For expressions and pairlists (including language objects and dot-dot-dot lists) it is the length of the pairlist chain. All other objects (including functions) have length one: note that for functions this differs from S.
The replacement form removes all the attributes of x except its names, which are adjusted (and if necessary extended by "").
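The effect of the replacement form on values, names and other attributes can be sketched as follows. The vector x and its "note" attribute are invented for illustration.

```r
x <- c(a = 1, b = 2, c = 3)
attr(x, "note") <- "any other attribute"

length(x) <- 2        # shorten: the extra value is discarded
shortened <- x

length(x) <- 4        # lengthen: padded with NA, names extended by ""
x
names(x)
attr(x, "note")       # attributes other than names are removed

r <- as.raw(1:2)
length(r) <- 3        # raw vectors are padded with nul bytes
r
```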
Package authors have written methods that return a result of length other than one (Formula) and that return a vector of type double (Matrix), even with non-integer values (earlier versions of sets). Where a single double value is returned that can be represented as an integer it is returned as a length-one integer vector.
Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988) The New S Language. Wadsworth & Brooks/Cole.
nchar for counting the number of characters in character vectors, lengths for getting the length of every element in a list.
length(diag(4))   # = 16 (4 x 4)
length(options()) # 12 or more
length(y ~ x1 + x2 + x3) # 3
length(expression(x, {y <- x^2; y+2}, x^y)) # 3
## from example(warpbreaks)
require(stats)
fm1 <- lm(breaks ~ wool * tension, data = warpbreaks)
length(fm1$call)      # 3, lm() and two arguments.
length(formula(fm1))  # 3, ~ lhs rhs
Get the length of each element of a list or atomic vector (is.atomic) as an integer or numeric vector.
lengths(x, use.names = TRUE)
x | a |
use.names | logical indicating if the result should inherit the |
This function loops over x and returns a compatible vector containing the length of each element in x. Effectively, length(x[[i]]) is called for all i, so any methods on length are considered.
lengths is generic: you can write methods to handle specific classes of objects, see InternalMethods.
A non-negative integer of length length(x), except when any element has a length of more than 2^31 - 1 elements, when it returns a double vector. When use.names is true, the names are taken from the names on x, if any.
One raison d'être of lengths(x) is its use as a more efficient version of sapply(x, length) and similar *apply calls to length. This is the reason why x may be an atomic vector, even though lengths(x) is trivial in that case.
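A brief sketch of the equivalence with sapply(x, length) and of the trivial atomic case. The list x is invented for illustration.

```r
x <- list(a = 1:3, b = letters, c = NULL)

len <- lengths(x)                     # named integer vector
same_as_sapply <- identical(len, sapply(x, length))

bare <- lengths(x, use.names = FALSE) # names dropped
atomic <- lengths(1:4)                # every element has length one
```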
length for getting the length of any R object.
require(stats)
## summarize by month
l <- split(airquality$Ozone, airquality$Month)
avgOz <- lapply(l, mean, na.rm=TRUE)
## merge result
airquality$avgOz <- rep(unlist(avgOz, use.names=FALSE), lengths(l))
## but this is safer and cleaner, but can be slower
airquality$avgOz <- unsplit(avgOz, airquality$Month)
## should always be true, except when a length does not fit in 32 bits
stopifnot(identical(lengths(l), vapply(l, length, integer(1L))))
## empty lists are not a problem
x <- list()
stopifnot(identical(lengths(x), integer()))
## nor are "list-like" expressions:
lengths(expression(u, v, 1+ 0:9))
## and we should dispatch to length methods
f <- c(rep(1, 3), rep(2, 6), 3)
dates <- split(as.POSIXlt(Sys.time() + 1:10), f)
stopifnot(identical(lengths(dates), vapply(dates, length, integer(1L))))
levels provides access to the levels attribute of a variable. The first form returns the value of the levels of its argument and the second sets the attribute.
levels(x)
levels(x) <- value
x | an object, for example a factor. |
value | a valid value for |
Both the extractor and replacement forms are generic and new methods can be written for them. The most important method for the replacement function is that for factors.
For the factor replacement method, a NA in value causes that level to be removed from the levels and the elements formerly with that level to be replaced by NA.
Note that for a factor, replacing the levels via levels(x) <- value is not the same as (and is preferred to) attr(x, "levels") <- value.
The replacement function is primitive.
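The treatment of NA in the replacement value can be sketched with a three-level factor invented for illustration:

```r
f <- factor(c("a", "b", "c", "b"))

## An NA in 'value' removes that level; its elements become NA
levels(f) <- c("a", NA, "c")

levels(f)
is.na(f)
nlevels(f)
```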
Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988) The New S Language. Wadsworth & Brooks/Cole.
## assign individual levels
x <- gl(2, 4, 8)
levels(x)[1] <- "low"
levels(x)[2] <- "high"
x
## or as a group
y <- gl(2, 4, 8)
levels(y) <- c("low", "high")
y
## combine some levels
z <- gl(3, 2, 12, labels = c("apple", "salad", "orange"))
z
levels(z) <- c("fruit", "veg", "fruit")
z
## same, using a named list
z <- gl(3, 2, 12, labels = c("apple", "salad", "orange"))
z
levels(z) <- list("fruit" = c("apple","orange"),
                  "veg"   = "salad")
z
## we can add levels this way:
f <- factor(c("a","b"))
levels(f) <- c("c", "a", "b")
f
f <- factor(c("a","b"))
levels(f) <- list(C = "C", A = "a", B = "b")
f
Report version of libcurl in use.
libcurlVersion()
A character string, with value the libcurl version in use or "" if none is. If libcurl is available, has attributes
ssl_version | A character string naming the SSL/TLS implementation and version, possibly |
libssh_version | A character string naming the |
protocols | A character vector of the names of supported protocols, also known as ‘schemes’ when part of a URL. |
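A minimal sketch of how the value and its attributes might be inspected. The check for "https" is only an illustration of using the protocols attribute.

```r
v <- libcurlVersion()

if (nzchar(v)) {
  cat("libcurl version:", v, "\n")
  cat("SSL/TLS library:", attr(v, "ssl_version"), "\n")
  ## Is a given scheme supported by this build?
  has_https <- "https" %in% attr(v, "protocols")
  cat("https supported:", has_https, "\n")
} else {
  cat("libcurl is not available in this build of R\n")
}
```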
In late 2017 a libcurl installation was seen divided into two libraries, libcurl and libcurl-feature, and the first had been updated but not the second. As the compiled function recording the version was in the latter, the version reported by libcurlVersion was misleading.
extSoftVersion for versions of other third-party software.
curlGetHeaders, download.file and url for functions which (optionally) use libcurl.
https://curl.se/docs/sslcerts.html and https://curl.se/docs/ssl-compared.html for more details on SSL versions (the current standard being known as TLS). Normally libcurl used with R uses SecureTransport on macOS, OpenSSL on Windows and GnuTLS, NSS or OpenSSL on Unix-alikes. (At the time of writing Debian-based Linuxen use GnuTLS and RedHat-based ones use OpenSSL, having previously used NSS.)
libcurlVersion()
.libPaths gets/sets the library trees within which packages are looked for.
.libPaths(new, include.site = TRUE)
.Library
.Library.site
new | a character vector with the locations of R library trees. Tilde expansion ( |
include.site | a logical value indicating whether the value of |
.Library is a character string giving the location of the default library, the ‘library’ subdirectory of R_HOME.
.Library.site is a (possibly empty) character vector giving the locations of the site libraries.
.libPaths is used for getting or setting the library trees that R knows about and hence uses when looking for packages (the library search path). If called with argument new, by default, the library search path is set to the existing directories in unique(c(new, .Library.site, .Library)) and this is returned. If include.site is FALSE when the new argument is set, .Library.site is not added to the new library search path. If called without the new argument, a character vector with the currently active library trees is returned.
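The setting behaviour can be sketched with a temporary directory. The directory names below are invented for illustration, and the original search path is restored at the end.

```r
old <- .libPaths()                 # currently active library trees

extra <- tempfile("lib")           # hypothetical extra library tree
dir.create(extra)
.libPaths(c(extra, old))
with_extra <- .libPaths()          # 'extra' now comes first

## Directories that do not exist are silently dropped
missing_dir <- file.path(tempdir(), "no-such-library-dir")
.libPaths(c(missing_dir, old))
without_missing <- .libPaths()

.libPaths(old)                     # restore the original search path
unlink(extra, recursive = TRUE)
```

Note that the default library (and, by default, the site libraries) are always appended, so restoring with .libPaths(old) gives back a path containing them.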
How paths innew
with a trailing slash are treated isOS-dependent. On a POSIX filesystem existing directories can usuallybe specified with a trailing slash. On Windows filepaths with atrailing slash (or backslash) are invalid and existing directoriesspecified with a trailing slash may not be added to the library search path.
At startup, the library search path is initialized from the environment variables R_LIBS, R_LIBS_USER and R_LIBS_SITE, which if set should give lists of directories where R library trees are rooted, colon-separated on Unix-alike systems and semicolon-separated on Windows. For the latter two, a value of NULL indicates an empty list of directories. (Note that as from R 4.2.0, both are set by R start-up code if not already set or empty so can be interrogated from an R session to find their defaults: in earlier versions this was true only for R_LIBS_USER.)
First, .Library.site is initialized from R_LIBS_SITE. If this is unset or empty, the ‘site-library’ subdirectory of R_HOME is used. Only directories which exist at the time of initialization are retained. Then, .libPaths() is called with the combination of the directories given by R_LIBS and R_LIBS_USER. By default R_LIBS is unset, and if R_LIBS_USER is unset or empty, it is set to directory ‘R/R.version$platform-library/x.y’ of the home directory on Unix-alike systems (or ‘Library/R/m/x.y/library’ for CRAN macOS builds, with m being Sys.info()["machine"]) and ‘R/win-library/x.y’ subdirectory of LOCALAPPDATA on Windows, for R x.y.z.
Both R_LIBS_USER and R_LIBS_SITE feature possible expansion of specifiers for R-version-specific information as part of the startup process. The possible conversion specifiers all start with a ‘%’ and are followed by a single letter (use ‘%%’ to obtain ‘%’), with currently available conversion specifications as follows:
‘%V’: R version number including the patch level (e.g., ‘2.5.0’).
‘%v’: R version number excluding the patch level (e.g., ‘2.5’).
‘%p’: the platform for which R was built, the value of R.version$platform.
‘%o’: the underlying operating system, the value of R.version$os.
‘%a’: the architecture (CPU) R was built on/for, the value of R.version$arch.
(See version for details on R version information.) In addition, ‘%U’ and ‘%S’ expand to the R defaults for, respectively, R_LIBS_USER and R_LIBS_SITE.
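To make the effect of the specifiers concrete, here is an illustrative sketch that mimics the expansion from within a session. It assumes the letters ‘%V’, ‘%v’, ‘%p’, ‘%o’ and ‘%a’ map to the five descriptions in the order given; expand_lib_spec is a hypothetical helper written for this example, not the startup code R itself uses, and it does not handle ‘%U’ or ‘%S’.

```r
# Hypothetical helper: expand version/platform specifiers in a path template.
expand_lib_spec <- function(spec) {
  full_version  <- paste(R.version$major, R.version$minor, sep = ".")
  short_version <- paste(R.version$major,
                         sub("\\..*$", "", R.version$minor), sep = ".")
  subs <- c("%V" = full_version,
            "%v" = short_version,
            "%p" = R.version$platform,
            "%o" = R.version$os,
            "%a" = R.version$arch)
  # Protect literal percent signs ("%%") with a placeholder character.
  out <- gsub("%%", "\001", spec, fixed = TRUE)
  for (key in names(subs)) {
    out <- gsub(key, subs[[key]], out, fixed = TRUE)
  }
  gsub("\001", "%", out, fixed = TRUE)
}

print(expand_lib_spec("~/R/%p-library/%v"))
print(expand_lib_spec("100%%"))
```

The real expansion happens before the session starts, so this helper is only useful for seeing what a given template would resolve to on the current build.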
Function .libPaths always uses the values of .Library and .Library.site in the base namespace. .Library.site can be set by the site in ‘Rprofile.site’, which should be followed by a call to .libPaths(.libPaths()) to make use of the updated value.
For consistency, the paths are always normalized by normalizePath(winslash = "/").
LOCALAPPDATA (usually C:\Users\username\AppData\Local) on Windows is a hidden directory and may not be viewed by some software. It may be opened by shell.exec(Sys.getenv("LOCALAPPDATA")).
A character vector of file paths.
Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988) The New S Language. Wadsworth & Brooks/Cole.
.libPaths() # all library trees R knows about
library and require load and attach add-on packages.
library(package, help, pos = 2, lib.loc = NULL,
        character.only = FALSE, logical.return = FALSE,
        warn.conflicts, quietly = FALSE,
        verbose = getOption("verbose"),
        mask.ok, exclude, include.only,
        attach.required = missing(include.only))

require(package, lib.loc = NULL, quietly = FALSE,
        warn.conflicts,
        character.only = FALSE,
        mask.ok, exclude, include.only,
        attach.required = missing(include.only))

conflictRules(pkg, mask.ok = NULL, exclude = NULL)
package, help | the name of a package, given as a name or literal character string, or a character string, depending on whether |
pos | the position on the search list at which to attach the loaded namespace. Can also be the name of a position on the current search list as given by |
lib.loc | a character vector describing the location of R library trees to search through, or |
character.only | a logical indicating whether |
logical.return | logical. If it is |
warn.conflicts | logical. If |
verbose | a logical. If |
quietly | a logical. If |
pkg | character string naming a package. |
mask.ok | character vector of names of objects that can mask objects on the search path without signaling an error when strict conflict checking is enabled. |
exclude, include.only | character vector of names of objects to exclude or include in the attached frame. Only one of these arguments may be used in a call to |
attach.required | logical specifying whether required packages listed in the |
library(package) and require(package) both load the namespace of the package with name package and attach it on the search list. require is designed for use inside other functions; it returns FALSE and gives a warning (rather than an error as library() does by default) if the package does not exist. Both functions check and update the list of currently attached packages and do not reload a namespace which is already loaded. (If you want to reload such a package, call detach(unload = TRUE) or unloadNamespace first.) If you want to load a package without attaching it on the search list, see requireNamespace.
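The difference in failure behaviour can be seen with a package name that is assumed not to be installed. In this sketch, "notARealPackage123" is a made-up name chosen for illustration, and the variable names are hypothetical.

```r
missing_pkg <- "notARealPackage123"  # assumed not to exist in any library

# require() warns and returns FALSE, so it can be tested in a condition.
ok <- suppressWarnings(require(missing_pkg, character.only = TRUE))
print(ok)

# library() signals an error by default, which must be caught explicitly.
outcome <- tryCatch({
  library(missing_pkg, character.only = TRUE)
  "attached"
}, error = function(e) "error")
print(outcome)

# requireNamespace() loads without attaching; 'stats' ships with R.
has_stats <- requireNamespace("stats", quietly = TRUE)
print(has_stats)
```

Inside functions, the logical result of require or requireNamespace is usually the more convenient thing to branch on.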
To suppress messages during the loading of packages use suppressPackageStartupMessages: this will suppress all messages from R itself but not necessarily all those from package authors.
If library is called with no package or help argument, it lists all available packages in the libraries specified by lib.loc, and returns the corresponding information in an object of class "libraryIQR". (The structure of this class may change in future versions.) Use .packages(all = TRUE) to obtain just the names of all available packages, and installed.packages() for even more information.
library(help = somename) computes basic information about the package somename, and returns this in an object of class "packageInfo". (The structure of this class may change in future versions.) When used with the default value (NULL) for lib.loc, the attached packages are searched before the libraries.
Normally library returns (invisibly) the list of attached packages, but TRUE or FALSE if logical.return is TRUE. When called as library() it returns an object of class "libraryIQR", and for library(help=), one of class "packageInfo".
require returns (invisibly) a logical indicating whether the required package is available.
Handling of conflicts depends on the setting of the conflicts.policy option. If this option is not set, then conflicts result in warning messages if the argument warn.conflicts is TRUE. If the option is set to the character string "strict", then all unresolved conflicts signal errors. Conflicts can be resolved using the mask.ok, exclude, and include.only arguments to library and require. Defaults for mask.ok and exclude can be specified using conflictRules.
If the conflicts.policy option is set to the string "depends.ok" then conflicts resulting from attaching declared dependencies will not produce errors, but other conflicts will. This is likely to be the best setting for most users wanting some additional protection against unexpected conflicts.
The policy can be tuned further by specifying the conflicts.policy option as a named list with the following fields:
error: logical; if TRUE treat unresolved conflicts as errors.
warn: logical; unless FALSE issue a warning message when conflicts are found.
generics.ok: logical; if TRUE ignore conflicts created by defining S4 generics for functions on the search path.
depends.ok: logical; if TRUE do not treat conflicts with required packages as errors.
can.mask: character vector of names of packages that are allowed to be masked. These would typically be base packages attached by default.
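A policy list of this shape might be set as follows. This is only a sketch: the particular field values and the packages named in can.mask are assumptions chosen for illustration, not recommended settings.

```r
# Set a custom policy, keeping the previous option value for restoration.
old_opts <- options(conflicts.policy = list(
  error       = TRUE,                      # unresolved conflicts become errors
  warn        = FALSE,                     # no separate warning messages
  generics.ok = TRUE,                      # tolerate S4 generic redefinitions
  depends.ok  = TRUE,                      # tolerate conflicts from dependencies
  can.mask    = c("base", "stats", "utils") # illustrative choice of packages
))

policy <- getOption("conflicts.policy")
print(names(policy))

# Restore whatever was set before (possibly nothing).
options(old_opts)
```

Because options() returns the previous values, the final call puts the session back in its earlier state, including the case where the option was unset.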
Some packages have restrictive licenses, and there is a mechanism to allow users to be aware of such licenses. If getOption("checkPackageLicense") == TRUE, then at first use of a namespace of a package with a not-known-to-be-FOSS (see below) license the user is asked to view and accept the license: a list of accepted licenses is stored in file ‘~/.R/licensed’. In a non-interactive session it is an error to use such a package whose license has not already been recorded as accepted.
Free or Open Source Software (FOSS, e.g. https://en.wikipedia.org/wiki/FOSS) packages are determined by the same filters used by available.packages but applied to just the current package, not its dependencies.
There can also be a site-wide file ‘R_HOME/etc/licensed.site’ of packages (one per line).
library takes some further actions when package methods is attached (as it is by default). Packages may define formal generic functions as well as re-defining functions in other packages (notably base) to be generic, and this information is cached whenever such a namespace is loaded after methods and re-defined functions (implicit generics) are excluded from the list of conflicts. The caching and check for conflicts require looking for a pattern of objects; the search may be avoided by defining an object .noGenerics (with any value) in the namespace. Naturally, if the package does have any such methods, this will prevent them from being used.
library and require can only load/attach an installed package, and this is detected by having a ‘DESCRIPTION’ file containing a ‘Built:’ field.
Under Unix-alikes, the code checks that the package was installed under a similar operating system as given by R.version$platform (the canonical name of the platform under which R was compiled), provided it contains compiled code. Packages which do not contain compiled code can be shared between Unix-alikes, but not to other OSes because of potential problems with line endings and OS-specific help files. If sub-architectures are used, the OS similarity is not checked since the OS used to build may differ (e.g. i386-pc-linux-gnu code can be built on an x86_64-unknown-linux-gnu OS).
The package name given to library and require must match the name given in the package's ‘DESCRIPTION’ file exactly, even on case-insensitive file systems such as are common on Windows and macOS.
Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988) The New S Language. Wadsworth & Brooks/Cole.
attach, detach, search, objects, autoload, requireNamespace, library.dynam, data, install.packages and installed.packages; INSTALL, REMOVE.
The initial set of packages attached is set by options(defaultPackages=): see also Startup.
library()                   # list all available packages
library(lib.loc = .Library) # list all packages in the default library
library(help = splines)     # documentation on package 'splines'
library(splines)            # attach package 'splines'
require(splines)            # the same
search()                    # "splines", too
detach("package:splines")

# if the package name is in a character vector, use
pkg <- "splines"
library(pkg, character.only = TRUE)
detach(pos = match(paste("package", pkg, sep = ":"), search()))

require(pkg, character.only = TRUE)
detach(pos = match(paste("package", pkg, sep = ":"), search()))

require(nonexistent)        # FALSE
## Not run:
## if you want to mask as little as possible, use
library(mypkg, pos = "package:base")
## End(Not run)
Load the specified file of compiled code if it has not been loaded already, or unload it.
library.dynam(chname, package, lib.loc,
              verbose = getOption("verbose"),
              file.ext = .Platform$dynlib.ext, ...)

library.dynam.unload(chname, libpath,
                     verbose = getOption("verbose"),
                     file.ext = .Platform$dynlib.ext)

.dynLibs(new)
chname | a character string naming a DLL (also known as a dynamic shared object or library) to load. |
package | a character vector with the name of package. |
lib.loc | a character vector describing the location of R library trees to search through. |
libpath | the path to the loaded package whose DLL is to be unloaded. |
verbose | a logical value indicating whether an announcement is printed on the console before loading the DLL. The default value is taken from the verbose entry in the system |
file.ext | the extension (including ‘.’ if used) to append to the file name to specify the library to be loaded. This defaults to the appropriate value for the operating system. |
... | additional arguments needed by some libraries that are passed to the call to |
new | a list of |
See dyn.load for what sort of objects these functions handle.
library.dynam is designed to be used inside a package rather than at the command line, and should really only be used inside .onLoad. The system-specific extension for DLLs (e.g., ‘.so’ or ‘.sl’ on Unix-alike systems, ‘.dll’ on Windows) should not be added.
library.dynam.unload is designed for use in .onUnload: it unloads the DLL and updates the value of .dynLibs().
.dynLibs is used for getting (with no argument) or setting the DLLs which are currently loaded by packages (using library.dynam).
If chname is not specified, library.dynam returns an object of class "DLLInfoList" corresponding to the DLLs loaded by packages.
If chname is specified, an object of class "DLLInfo" that identifies the DLL and which can be used in future calls is returned invisibly. Note that the class "DLLInfo" has a method for $ which can be used to resolve native symbols within that DLL.
library.dynam.unload invisibly returns an object of class "DLLInfo" identifying the DLL successfully unloaded.
.dynLibs returns an object of class "DLLInfoList" corresponding to its current value.
Do not use dyn.unload on a DLL loaded by library.dynam: use library.dynam.unload to ensure that .dynLibs gets updated. Otherwise a subsequent call to library.dynam will be told the object is already loaded.
Note that whether or not it is possible to unload a DLL and then reload a revised version of the same file is OS-dependent: see the ‘Value’ section of the help for dyn.unload.
Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988) The New S Language. Wadsworth & Brooks/Cole.
getLoadedDLLs for information on "DLLInfo" and "DLLInfoList" objects.
.onLoad, library, dyn.load, .packages, .libPaths
SHLIB for how to create suitable DLLs.
## Which DLLs were dynamically loaded by packages?
library.dynam()

## More on library.dynam.unload() :
require(nlme)
nlme:::.onUnload # shows library.dynam.unload() call
detach("package:nlme") # by default, unload=FALSE , so,
tail(library.dynam(), 2)# nlme still there

## How to unload the DLL ?
## Best is to unload the namespace, unloadNamespace("nlme")
## If we need to do it separately which should be exceptional:
pd.file <- attr(packageDescription("nlme"), "file")
library.dynam.unload("nlme", libpath = sub("/Meta.*", '', pd.file))
tail(library.dynam(), 2)# 'nlme' is gone now
unloadNamespace("nlme") # now gives warning
The license terms under which R is distributed.
license()
licence()
R is distributed under the terms of the GNU GENERAL PUBLIC LICENSE, either Version 2, June 1991 or Version 3, June 2007. A copy of the version 2 license is in file ‘R_HOME/doc/COPYING’ and can be viewed by RShowDoc("COPYING"). Version 3 of the license can be displayed by RShowDoc("GPL-3").
A small number of files (some of the API header files) are distributed under the LESSER GNU GENERAL PUBLIC LICENSE, version 2.1 or later. A copy of this license is in file ‘R_SHARE_DIR/licenses/LGPL-2.1’ and can be viewed by RShowDoc("LGPL-2.1"). Version 3 of the license can be displayed by RShowDoc("LGPL-3").
Functions to construct, coerce and check for both kinds of R lists.
list(...)
pairlist(...)

as.list(x, ...)
## S3 method for class 'environment'
as.list(x, all.names = FALSE, sorted = FALSE, ...)
as.pairlist(x)

is.list(x)
is.pairlist(x)

alist(...)
... | objects, possibly named. |
x | object to be coerced or tested. |
all.names | a logical indicating whether to copy all values or (default) only those whose names do not begin with a dot. |
sorted | a logical indicating whether the |
Almost all lists in R internally are Generic Vectors, whereas traditional dotted pair lists (as in LISP) remain available but rarely seen by users (except as formals of functions).
The arguments to list or pairlist are of the form value or tag = value. The functions return a list or dotted pair list composed of its arguments with each value either tagged or untagged, depending on how the argument was specified.
alist handles its arguments as if they described function arguments. So the values are not evaluated, and tagged arguments with no value are allowed whereas list simply ignores them. alist is most often used in conjunction with formals.
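The contrast between unevaluated and evaluated arguments can be sketched briefly. The names args_spec, evaluated and f below are illustrative only.

```r
# alist() keeps the expressions unevaluated and allows an empty default.
args_spec <- alist(x = , y = 2 + 3)
print(names(args_spec))
print(is.call(args_spec$y))   # the stored value is the call 2 + 3
print(eval(args_spec$y))      # evaluating it yields the number

# list() evaluates its arguments immediately.
evaluated <- list(y = 2 + 3)
print(evaluated$y)

# Typical use: supplying the formal arguments of a function.
f <- function() NULL
formals(f) <- args_spec
print(names(formals(f)))
```

Here x ends up as a formal argument with no default, which is exactly the case that list cannot express.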
as.list attempts to coerce its argument to a list. For functions, this returns the concatenation of the list of formal arguments and the function body. For expressions, the list of constituent elements is returned. as.list is generic, and as the default method calls as.vector(mode = "list") for a non-list, methods for as.vector may be invoked. as.list turns a factor into a list of one-element factors, keeping names. Other attributes may be dropped unless the argument already is a list or expression. (This is inconsistent with functions such as as.character which always drop attributes, and is for efficiency since lists can be expensive to copy.)
is.list returns TRUE if and only if its argument is a list or a pairlist of length > 0.
is.pairlist returns TRUE if and only if the argument is a pairlist or NULL (see below).
The "environment" method for as.list copies the name-value pairs (for names not beginning with a dot) from an environment to a named list. The user can request that all named objects are copied. Unless sorted = TRUE, the list is in no particular order (the order depends on the order of creation of objects and whether the environment is hashed). No enclosing environments are searched. (Objects copied are duplicated so this can be an expensive operation.) Note that there is an inverse operation, the as.environment() method for list objects.
An empty pairlist, pairlist() is the same as NULL. This is different from list(): some but not all operations will promote an empty pairlist to an empty list.
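The relationship between empty pairlists, NULL and empty lists can be checked directly; the variable names in this short sketch are illustrative.

```r
empty_pl_is_null  <- is.null(pairlist())        # an empty pairlist is NULL
null_is_pairlist  <- is.pairlist(NULL)          # NULL counts as a pairlist
empty_list_islist <- is.list(list())            # an empty list is a list
empty_pl_islist   <- is.list(pairlist())        # NULL is not a list
nonempty_pl       <- is.list(pairlist(a = 1))   # non-empty pairlists are

print(c(empty_pl_is_null, null_is_pairlist, empty_list_islist,
        empty_pl_islist, nonempty_pl))
```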
as.pairlist is implemented as as.vector(x, "pairlist"), and hence will dispatch methods for the generic function as.vector. Lists are copied element-by-element into a pairlist and the names of the list used as tags for the pairlist: the return value for other types of argument is undocumented.
list, is.list and is.pairlist are primitive functions.
Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988) The New S Language. Wadsworth & Brooks/Cole.
vector("list", length) for creation of a list with empty components; c, for concatenation; formals. unlist is an approximate inverse to as.list().
‘plotmath’ for the use of list in plot annotation.
require(graphics)

# create a plotting structure
pts <- list(x = cars[,1], y = cars[,2])
plot(pts)

is.pairlist(.Options)  # a user-level pairlist

## "pre-allocate" an empty list of length 5
vector("list", 5)

# Argument lists
f <- function() x
# Note the specification of a "..." argument:
formals(f) <- al <- alist(x = , y = 2+3, ... = )
f
al

## environment->list coercion

e1 <- new.env()
e1$a <- 10
e1$b <- 20
as.list(e1)
These functions produce a character vector of the names of files or directories in the named directory.
list.files(path = ".", pattern = NULL, all.files = FALSE,
           full.names = FALSE, recursive = FALSE,
           ignore.case = FALSE, include.dirs = FALSE, no.. = FALSE)

       dir(path = ".", pattern = NULL, all.files = FALSE,
           full.names = FALSE, recursive = FALSE,
           ignore.case = FALSE, include.dirs = FALSE, no.. = FALSE)

list.dirs(path = ".", full.names = TRUE, recursive = TRUE)
path | a character vector of full path names; the default corresponds to the working directory, |
pattern | an optional regular expression. Only file names which match the regular expression will be returned. |
all.files | a logical value. If |
full.names | a logical value. If |
recursive | logical. Should the listing recurse into directories? |
ignore.case | logical. Should pattern-matching be case-insensitive? |
include.dirs | logical. Should subdirectory names be included in recursive listings? (They always are in non-recursive ones). |
no.. | logical. Should both |
A character vector containing the names of the files in the specified directories (empty if there were no files). If a path does not exist or is not a directory or is unreadable it is skipped.
The files are sorted in alphabetical order, on the full path if full.names = TRUE.
list.dirs implicitly has all.files = TRUE, and if recursive = TRUE, the answer includes path itself (provided it is a readable directory).
dir is an alias for list.files.
File naming conventions are platform dependent. The pattern matching works with the case of file names as returned by the OS.
On a POSIX filesystem recursive listings will follow symbolic links to directories.
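The interaction of pattern, recursive and list.dirs is easiest to see on a small directory tree built for the purpose. This sketch assumes a scratch directory under tempdir(); all file and variable names are hypothetical.

```r
# Build a small scratch tree: two files at the top level, one in 'sub'.
demo_dir <- tempfile("lf-demo-")
dir.create(file.path(demo_dir, "sub"), recursive = TRUE)
invisible(file.create(file.path(demo_dir, c("a.txt", "b.csv")),
                      file.path(demo_dir, "sub", "c.txt")))

# Non-recursive listing: files and subdirectory names together.
top_level <- list.files(demo_dir)
print(top_level)

# Recursive listing restricted by a regular expression.
txt_files <- list.files(demo_dir, pattern = "\\.txt$", recursive = TRUE)
print(txt_files)

# Directories only, relative to demo_dir.
sub_dirs <- list.dirs(demo_dir, full.names = FALSE)
print(sub_dirs)

unlink(demo_dir, recursive = TRUE)
```

Note that pattern is a regular expression, not a shell wildcard, which is why the dot is escaped.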
Ross Ihaka, Brian Ripley
file.info, file.access and files for many more file handling functions and file.choose for interactive selection.
glob2rx to convert wildcards (as used by system file commands and shells) to regular expressions.
Sys.glob for wildcard expansion on file paths. basename and dirname, useful for splitting paths into non-directory (aka ‘filename’) and directory parts.
list.files(R.home())
## Only files starting with a-l or r
## Note that a-l is locale-dependent, but using case-insensitive
## matching makes it unambiguous in English locales
dir("../..", pattern = "^[a-lr]", full.names = TRUE, ignore.case = TRUE)

list.dirs(R.home("doc"))
list.dirs(R.home("doc"), full.names = FALSE)
Create a data frame from a list of variables.
list2DF(x = list(), nrow = 0)
x | A list of same-length variables for the data frame. |
nrow | An integer giving the desired number of rows for the data frame in case |
Note that all list elements are taken “as is”.
A data frame with the given variables.
## Create a data frame holding a list of character vectors and the
## corresponding lengths:
x <- list(character(), "A", c("B", "C"))
n <- lengths(x)
list2DF(list(x = x, n = n))

## Create data frames with no variables and the desired number of rows:
list2DF()
list2DF(nrow = 3L)
From a named list x, create an environment containing all list components as objects, or “multi-assign” from x into a pre-existing environment.
list2env(x, envir = NULL, parent = parent.frame(), hash = (length(x) > 100), size = max(29L, length(x)))
x | |
envir | an |
parent | (for the case |
hash | (for the case |
size | (in the case |
This will be very slow for large inputs unless hashing is used on the environment.
Environments must have uniquely named entries, but named lists need not: where the list has duplicate names it is the last element with the name that is used. Empty names throw an error.
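The rule for duplicate names can be illustrated with a tiny list; dup_env is a name chosen for this sketch.

```r
# 'a' appears twice; the later element is the one kept in the environment.
dup_env <- list2env(list(a = 1, a = 2, b = 3))

print(sort(ls(dup_env)))
print(dup_env$a)
```

If the earlier value is the one wanted, the list can be de-duplicated first, for example by subsetting with !duplicated(names(x)).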
An environment, either newly created (as by new.env) if the envir argument was NULL, otherwise the updated environment envir. Since environments are never duplicated, the argument envir is also changed.
Martin Maechler
environment, new.env, as.environment; further, assign.
The (semantical) “inverse”: as.list.environment.
L <- list(a = 1, b = 2:4, p = pi, ff = gl(3, 4, labels = LETTERS[1:3]))
e <- list2env(L)
ls(e)
stopifnot(ls(e) == sort(names(L)),
          identical(L$b, e$b)) # "$" working for environments as for lists

## consistency, when we do the inverse:
ll <- as.list(e)  # -> dispatching to the as.list.environment() method
rbind(names(L), names(ll)) # not in the same order, typically,
                           # but the same content:
stopifnot(identical(L [sort.list(names(L ))],
                    ll[sort.list(names(ll))]))

## now add to e -- can be seen as a fast "multi-assign":
list2env(list(abc = LETTERS, note = "just an example",
              df = data.frame(x = rnorm(20), y = rbinom(20, 1, prob = 0.2))),
         envir = e)
utils::ls.str(e)
Reload datasets written with the function save.
load(file, envir = parent.frame(), verbose = FALSE)
file | a (readable binary-mode) connection or a character string giving the name of the file to load (when tilde expansion is done). |
envir | the environment where the data should be loaded. |
verbose | should item names be printed during loading? |
load can load R objects saved in the current or any earlier format. It can read a compressed file (see save) directly from a file or from a suitable connection (including a call to url).
A not-open connection will be opened in mode "rb" and closed after use. Any connection other than a gzfile or gzcon connection will be wrapped in gzcon to allow compressed saves to be handled: note that this leaves the connection in an altered state (in particular, binary-only), and that it needs to be closed explicitly (it will not be garbage-collected).
Only R objects saved in the current format (used since R 1.4.0) can be read from a connection. If no input is available on a connection a warning will be given, but any input not in the current format will result in an error.
Loading from an earlier version will give a warning about the ‘magic number’: magic numbers 1971:1977 are from R < 0.99.0, and RD[ABX]1 from R 0.99.0 to R 1.3.1. These are all obsolete, and you are strongly recommended to re-save such files in a current format.
The verbose argument is mainly intended for debugging. If it is TRUE, then as objects from the file are loaded, their names will be printed to the console. If verbose is set to an integer value greater than one, additional names corresponding to attributes and other parts of individual objects will also be printed. Larger values will print names to a greater depth.
Objects can be saved with references to namespaces, usually as part of the environment of a function or formula. Such objects can be loaded even if the namespace is not available: it is replaced by a reference to the global environment with a warning. The warning identifies the first object with such a reference (but there may be more than one).
A character vector of the names of objects created, invisibly.
Saved R objects are binary files, even those saved with ascii = TRUE, so ensure that they are transferred without conversion of end of line markers. load tries to detect such a conversion and gives an informative error message.
load(file) replaces all existing objects with the same names in the current environment (typically your workspace, .GlobalEnv) and hence potentially overwrites important data. It is considerably safer to use envir = to load into a different environment, or to attach(file) which load()s into a new entry in the search path.
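Loading into a dedicated environment keeps existing objects intact. The sketch below assumes a temporary file created with tempfile(); the names rda_file, holder and loaded_names are illustrative.

```r
# Save an object, then change the in-session copy.
rda_file <- tempfile(fileext = ".rda")
x <- 1:3
save(x, file = rda_file)
x <- "changed"

# Load into a separate environment instead of the current one.
holder <- new.env()
loaded_names <- load(rda_file, envir = holder)

print(loaded_names)   # names of the objects that were restored
print(holder$x)       # the saved value
print(x)              # the session's own 'x' is untouched

unlink(rda_file)
```

The returned character vector is also a convenient way to find out what a file contained before deciding which objects to copy into the workspace.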
save, download.file; further attach as wrapper for load().
For other interfaces to the underlying serialization format, see unserialize and readRDS.
## save all data
xx <- pi # to ensure there is some data
save(list = ls(all.names = TRUE), file= "all.rda")
rm(xx)

## restore the saved values to the current environment
local({
   load("all.rda")
   ls()
})

xx <- exp(1:3)
## restore the saved values to the user's workspace
load("all.rda") ## which is here *equivalent* to
## load("all.rda", .GlobalEnv)
## This however annihilates all objects in .GlobalEnv with the same names !
xx # no longer exp(1:3)
rm(xx)
attach("all.rda") # safer and will warn about masked objects w/ same name in .GlobalEnv
ls(pos = 2)
## also typically need to cleanup the search path:
detach("file:all.rda")

## clean up (the example):
unlink("all.rda")

## Not run:
con <- url("http://some.where.net/R/data/example.rda")
## print the value to see what objects were created.
print(load(con))
close(con) # url() always opens the connection
## End(Not run)
Get details of or set aspects of the locale for the R process.
Sys.getlocale(category = "LC_ALL")
Sys.setlocale(category = "LC_ALL", locale = "")

.LC.categories
category | character string. The following categories should always be supported: |
locale | character string. A valid locale name on the system in use. Normally |
The locale describes aspects of the internationalization of a program. Initially most aspects of the locale of R are set to "C" (which is the default for the C language and reflects North-American usage – also known as "POSIX"). R sets "LC_CTYPE" and "LC_COLLATE", which allow the use of a different character set and alphabetic comparisons in that character set (including the use of sort), "LC_MONETARY" (for use by Sys.localeconv) and "LC_TIME" may affect the behaviour of as.POSIXlt and strptime and functions which use them (but not date).
The first seven categories described here are those specified by POSIX. "LC_MESSAGES" will be "C" on systems that do not support message translation, and is not supported on Windows, where you must use the LANGUAGE environment variable for message translation, see below and the Sys.setLanguage() utility. Trying to use an unsupported category is an error for Sys.setlocale.
Note that setting category "LC_ALL" sets only categories "LC_COLLATE", "LC_CTYPE", "LC_MONETARY" and "LC_TIME".
Attempts to set an invalid locale are ignored. There may or may not be a warning, depending on the OS.
Attempts to change the character set (by Sys.setlocale("LC_CTYPE", ), if that implies a different character set) during a session may not work and are likely to lead to some confusion.
Note that the LANGUAGE environment variable has precedence over "LC_MESSAGES" in selecting the language for message translation on most R platforms.
On platforms where ICU is used for collation the locale used for collation can be reset by icuSetCollate. Except on Windows, the initial setting is taken from the "LC_COLLATE" category, and it is reset when this is changed by a call to Sys.setlocale.
A character string of length one describing the locale in use (after setting for Sys.setlocale), or an empty character string if the current locale settings are invalid or NULL if locale information is unavailable.
For category = "LC_ALL" the details of the string are system-specific: it might be a single locale name or a set of locale names separated by "/" (macOS) or ";" (Windows, Linux). For portability, it is best to query categories individually: it is not necessarily the case that the result of foo <- Sys.getlocale() can be used in Sys.setlocale("LC_ALL", locale = foo).
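The advice to query categories individually can be sketched as follows. This is a minimal illustration, not the only pattern; the variable names old_time and month_c are introduced here for the example, and "C" is assumed to be an available locale on the system.

```r
# Save a single category, change it, and restore it afterwards.
old_time <- Sys.getlocale("LC_TIME")          # one category, not "LC_ALL"
Sys.setlocale("LC_TIME", "C")                 # switch date-time formatting to the C locale
month_c <- format(as.Date("2001-01-01"), "%B")
month_c                                       # "January" in the C locale
Sys.setlocale("LC_TIME", old_time)            # restore the previous setting
```

Restoring one category from its own saved value avoids relying on the system-specific format of the "LC_ALL" string.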
On most Unix-alikes the POSIX shell command locale -a will list the ‘available public’ locales. What that means is platform-dependent. On recent Linuxen this may mean ‘available to be installed’ as on some RPM-based systems the locale data is in separate RPMs. On Debian/Ubuntu the set of available locales is managed by OS-specific facilities such as locale-gen and locale -a lists those currently enabled.
For Windows, Microsoft moves its documentation frequently so a Web search is the best way to find current information. From R 4.2, UCRT locale names should be used. The character set should match the system/ANSI codepage (l10n_info()$codepage should be the same as l10n_info()$system.codepage). Setting it to any other value results in a warning and may cause encoding problems. As from R 4.2 on recent Windows the system codepage is 65001 and one should always use locale names ending with ".UTF-8" (except for "C" and ""), otherwise Windows may add a different character set.
Setting "LC_NUMERIC" to any value other than "C" may cause R to function anomalously, so gives a warning. Input conversions in R itself are unaffected, but the reading and writing of ASCII save files will be, as may packages which do their own input/output.
Setting it temporarily on a Unix-alike to produce graphical or text output may work well enough, but options(OutDec) is often preferable.
Almost all the output routines used by R itself under Windows ignore the setting of "LC_NUMERIC" since they make use of the Trio library which is not internationalized.
Changing the values of locale categories whilst R is running ought to be noticed by the OS services, and usually is but exceptions have been seen (usually in collation services).
Do not use the value of Sys.getlocale("LC_CTYPE") to attempt to find the character set: for example UTF-8 locales can have suffix ‘.UTF-8’ or ‘.utf8’ (more common on Linux than ‘UTF-8’) or none (as on macOS) and Latin-9 locales can have suffix ‘ISO8859-15’, ‘iso885915’, ‘iso885915@euro’ or ‘ISO8859-15@euro’. Use l10n_info instead.
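As a small sketch of that recommendation, the encoding can be read from the list returned by l10n_info() rather than parsed out of a locale name. The variable name info is illustrative only.

```r
info <- l10n_info()
info[["UTF-8"]]   # TRUE if the current locale uses UTF-8
info[["MBCS"]]    # TRUE if it is a multi-byte character set
```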
strptime for uses of category = "LC_TIME". Sys.localeconv for details of numerical and monetary representations.
l10n_info gives some summary facts about the locale and its encoding (including if it is UTF-8).
The ‘R Installation and Administration’ manual for background on locales and how to find out locale names on your system.
Sys.getlocale()
## Date-time related :
Sys.getlocale("LC_TIME") -> olcT
then <- as.POSIXlt("2001-01-01 01:01:01", tz = "UTC")
## Not run:
c(m = months(then), wd = weekdays(then)) # locale specific
Sys.setlocale("LC_TIME", "de")           # Solaris: details are OS-dependent
Sys.setlocale("LC_TIME", "de_DE")        # Many Unix-alikes
Sys.setlocale("LC_TIME", "de_DE.UTF-8")  # Linux, macOS, other Unix-alikes
Sys.setlocale("LC_TIME", "de_DE.utf8")   # some Linux versions
Sys.setlocale("LC_TIME", "German.UTF-8") # Windows
Sys.getlocale("LC_TIME") # the last one successfully set above
c(m = months(then), wd = weekdays(then)) # in C_TIME locale 'cT' ; typically German
## End(Not run)
Sys.setlocale("LC_TIME", "C")
c(m = months(then), wd = weekdays(then)) # "standard" (still platform specific ?)
Sys.setlocale("LC_TIME", olcT)           # reset to previous

## Other locales
Sys.getlocale("LC_PAPER")  # may or may not be set
.LC.categories # of length 9 on all platforms

## Not run:
Sys.setlocale("LC_COLLATE", "C")  # turn off locale-specific sorting,
                                  # usually (but not on all platforms)
Sys.setenv("LANGUAGE" = "es") # set the language for error/warning messages
## End(Not run)

## some nice formatting; should work on most platforms,
## macOS does not name the entries.
sep <- switch(Sys.info()[["sysname"]],
              "Darwin"=, "SunOS" = "/",
              "Linux" =, "Windows" = ";")
##' show a "full" Sys.getlocale() nicely:
showL <- function(loc) {
    sl <- strsplit(strsplit(loc, sep)[[1L]], "=")
    if(all(sapply(sl, length) == 2L))
        setNames(sapply(sl, `[[`, 2L), sapply(sl, `[[`, 1L))
    else
        setNames(as.character(sl), .LC.categories[1+seq_along(sl)])
}
print.Dlist(lloc <- showL(Sys.getlocale()))
## R-supported ones (but LC_ALL):
lloc[.LC.categories[-1]]
log computes logarithms, by default natural logarithms, log10 computes common (i.e., base 10) logarithms, and log2 computes binary (i.e., base 2) logarithms. The general form log(x, base) computes logarithms with base base.
log1p(x) computes log(1+x) accurately also for |x| << 1.
exp computes the exponential function.
expm1(x) computes exp(x) - 1 accurately also for |x| << 1.
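The accuracy point can be seen with a very small argument. This is only a sketch; the value 1e-20 is chosen here for illustration because 1 + 1e-20 rounds to exactly 1 in double precision.

```r
x <- 1e-20
naive_log <- log(1 + x)    # 0: the small term is lost before log() is applied
good_log  <- log1p(x)      # approximately 1e-20
naive_exp <- exp(x) - 1    # 0: exp(x) rounds to exactly 1
good_exp  <- expm1(x)      # approximately 1e-20
c(naive_log, good_log, naive_exp, good_exp)
```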
log(x, base = exp(1))
logb(x, base = exp(1))
log10(x)
log2(x)

log1p(x)

exp(x)
expm1(x)
x | a numeric or complex vector. |
base | a positive or complex number: the base with respect to which logarithms are computed. Defaults to |
All except logb are generic functions: methods can be defined for them individually or via the Math group generic.
log10 and log2 are only convenience wrappers, but logs to bases 10 and 2 (whether computed via log or the wrappers) will be computed more efficiently and accurately where supported by the OS. Methods can be set for them individually (and otherwise methods for log will be used).
logb is a wrapper for log for compatibility with S. If (S3 or S4) methods are set for log they will be dispatched. Do not set S4 methods on logb itself.
All except log are primitive functions.
A vector of the same length as x containing the transformed values. log(0) gives -Inf, and log(x) for negative values of x is NaN. exp(-Inf) is 0.
For complex inputs to the log functions, the value is a complex number with imaginary part in the range [-pi, pi]: which end of the range is used might be platform-specific.
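A short sketch of the complex case, using -1 as an illustrative input. The sign of the imaginary part is the platform-specific detail, so only its absolute value is relied on here; the name z is hypothetical.

```r
z <- log(complex(real = -1, imaginary = 0))
Re(z)        # 0, because the modulus of -1 is 1
abs(Im(z))   # pi, the boundary of the range

# Real negative input gives NaN (with a warning) instead:
suppressWarnings(log(-1))
```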
exp, expm1, log, log10, log2 and log1p are S4 generic and are members of the Math group generic.
Note that this means that the S4 generic for log has a signature with only one argument, x, but that base can be passed to methods (but will not be used for method selection). On the other hand, if you only set a method for the Math group generic then the base argument of log will be ignored for your class.
log1p and expm1 may be taken from the operating system, but if not available there then they are based on the Fortran subroutine dlnrel by W. Fullerton of Los Alamos Scientific Laboratory (see https://netlib.org/slatec/fnlib/dlnrel.f) and (for small x) a single Newton step for the solution of log1p(y) = x respectively.
Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988) The New S Language. Wadsworth & Brooks/Cole. (for log, log10 and exp.)
Chambers, J. M. (1998) Programming with Data. A Guide to the S Language. Springer. (for logb.)
log(exp(3))
log10(1e7) # = 7

x <- 10^-(1+2*1:9)
cbind(deparse.level=2, # to get nice column names
      x, log(1+x), log1p(x), exp(x)-1, expm1(x))
These operators act on raw, logical and number-like vectors.
! x
x & y
x && y
x | y
x || y
xor(x, y)

isTRUE (x)
isFALSE(x)
x ,y |
|
! indicates logical negation (NOT).
& and && indicate logical AND and | and || indicate logical OR. The shorter forms perform elementwise comparisons in much the same way as arithmetic operators. The longer forms evaluate left to right, proceeding only until the result is determined. The longer form is appropriate for programming control-flow and typically preferred in if clauses.
Using vectors of more than one element in && or || will give an error.
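A minimal sketch of both behaviours of the longer forms. The names x, ok and res are illustrative; the error for a length-two operand is as described for the R version documented here and may differ in older releases.

```r
x <- NULL
# Left to right evaluation: x > 0 is never evaluated because the
# left-hand side is already FALSE.
ok <- !is.null(x) && x > 0
ok

# A length-two operand: capture the condition rather than stop the script.
res <- tryCatch(c(TRUE, FALSE) && TRUE,
                error = function(e) "error")
res
```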
xor indicates elementwise exclusive OR.
isTRUE(x) is the same as { is.logical(x) && length(x) == 1 && !is.na(x) && x }; isFALSE() is defined analogously. Consequently, if(isTRUE(cond)) may be preferable to if(cond) because of NAs.
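The definition above can be checked on a few inputs. This is a sketch with an illustrative variable cond; the results follow directly from the stated definition.

```r
cond <- NA
r1 <- isTRUE(cond)            # FALSE, whereas if (cond) would signal an error
r2 <- isTRUE(c(val = TRUE))   # TRUE: names do not matter
r3 <- isTRUE(c(TRUE, TRUE))   # FALSE: not of length one
r4 <- isTRUE(1)               # FALSE: not a logical vector
r5 <- isFALSE(FALSE)          # TRUE
c(r1, r2, r3, r4, r5)
```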
In earlier R versions, isTRUE <- function(x) identical(x, TRUE) had the drawback of being false, e.g., for x <- c(val = TRUE).
Numeric and complex vectors will be coerced to logical values, with zero being false and all non-zero values being true. Raw vectors are handled without any coercion for !, &, | and xor, with these operators being applied bitwise (so ! is the 1s-complement).
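A brief sketch of the bitwise behaviour on raw vectors; the two byte values are arbitrary choices for illustration.

```r
a <- as.raw(0x0c)   # bits 0000 1100
b <- as.raw(0x0a)   # bits 0000 1010
a & b               # 08
a | b               # 0e
xor(a, b)           # 06
!a                  # f3, the 1s-complement of 0c

# Numeric input is coerced to logical instead:
!c(0, 2, -1)        # TRUE FALSE FALSE
```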
The operators !, & and | are generic functions: methods can be written for them individually or via the Ops (or S4 Logic, see below) group generic function. (See Ops for how dispatch is computed.)
NA is a valid logical object. Where a component of x or y is NA, the result will be NA if the outcome is ambiguous. In other words NA & TRUE evaluates to NA, but NA & FALSE evaluates to FALSE. See the examples below.
See Syntax for the precedence of these operators: unlike many other languages (including S) the AND and OR operators do not have the same precedence (the AND operators have higher precedence than the OR operators).
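The precedence rule can be illustrated with a pair of expressions that differ only in grouping (a sketch; the names p1 and p2 are illustrative).

```r
p1 <- TRUE || FALSE && FALSE     # parsed as TRUE || (FALSE && FALSE)
p2 <- (TRUE || FALSE) && FALSE   # explicit grouping changes the result
c(p1, p2)                        # TRUE FALSE
```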
For !, a logical or raw vector (for raw x) of the same length as x: names, dims and dimnames are copied from x, and all other attributes (including class) if no coercion is done.
For |, & and xor a logical or raw vector. If involving a zero-length vector the result has length zero. Otherwise, the elements of shorter vectors are recycled as necessary (with a warning when they are recycled only fractionally). The rules for determining the attributes of the result are rather complicated. Most attributes are taken from the longer argument, the first if they are of the same length. Names will be copied from the first if it is the same length as the answer, otherwise from the second if that is. For time series, these operations are allowed only if the series are compatible, when the class and tsp attribute of whichever is a time series (the same, if both are) are used. For arrays (and an array result) the dimensions and dimnames are taken from first argument if it is an array, otherwise the second.
For ||, && and isTRUE, a length-one logical vector.
!, & and | are S4 generics, the latter two part of the Logic group generic (and hence methods need argument names e1, e2).
The elementwise operators are sometimes called as functions as e.g. `&`(x, y): see the description of how argument-matching is done in Ops.
Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988) The New S Language. Wadsworth & Brooks/Cole.
any and all for OR and AND on many scalar arguments.
Syntax for operator precedence.
L %||% R which takes L if it is not NULL, and R otherwise.
bitwAnd for bitwise versions for integer vectors.
y <- 1 + (x <- stats::rpois(50, lambda = 1.5) / 4 - 1)
x[(x > 0) & (x < 1)]    # all x values between 0 and 1
if (any(x == 0) || any(y == 0)) "zero encountered"

## construct truth tables :

x <- c(NA, FALSE, TRUE)
names(x) <- as.character(x)
outer(x, x, `&`) ## AND table
outer(x, x, `|`) ## OR  table
Create or test for objects of type "logical", and the basic logical constants.
TRUE
FALSE
T; F

logical(length = 0)
as.logical(x, ...)
is.logical(x)
length | a non-negative integer specifying the desired length. Double values will be coerced to integer: supplying an argument of length other than one is an error. |
x | object to be coerced or tested. |
... | further arguments passed to or from other methods. |
TRUE and FALSE are reserved words denoting logical constants in the R language, whereas T and F are global variables whose initial values are set to these. All four are logical(1) vectors.
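Because T and F are ordinary variables, they can be masked, which TRUE and FALSE cannot. The following sketch uses a hypothetical function shadow_T to keep the reassignment local.

```r
shadow_T <- function() {
  T <- FALSE      # a local variable now masks the global T
  c(T, TRUE)      # TRUE itself is a reserved word and is unaffected
}
shadow_T()        # FALSE TRUE
```

This is one reason code intended for reuse is usually written with TRUE and FALSE spelled out.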
as.logical is a generic function. Methods should return an object of type "logical".
Logical vectors are coerced to integer vectors in contexts where a numerical value is required, with TRUE being mapped to 1L, FALSE to 0L and NA to NA_integer_.
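A short sketch of that coercion in arithmetic contexts; the vector flags is an illustrative name.

```r
flags <- c(TRUE, FALSE, NA, TRUE)
n_true <- sum(flags, na.rm = TRUE)   # 2L: counts the TRUE values
two    <- TRUE + TRUE                # 2L
ints   <- as.integer(flags)          # 1 0 NA 1
list(n_true, two, ints)
```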
logical creates a logical vector of the specified length. Each element of the vector is equal to FALSE.
as.logical attempts to coerce its argument to be of logical type. In numeric and complex vectors, zeros are FALSE and non-zero values are TRUE. For factors, this uses the levels (labels). Like as.vector it strips attributes including names. Character strings c("T", "TRUE", "True", "true") are regarded as true, c("F", "FALSE", "False", "false") as false, and all others as NA.
is.logical returns TRUE or FALSE depending on whether its argument is of logical type or not.
Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988) The New S Language. Wadsworth & Brooks/Cole.
NA, the other logical constant. Logical operators are documented in Logic.
## non-zero values are TRUE
as.logical(c(pi,0))
if (length(letters)) cat("26 is TRUE\n")

## logical interpretation of particular strings
charvec <- c("FALSE", "F", "False", "false",    "fAlse", "0",
             "TRUE",  "T", "True",  "true",     "tRue",  "1")
as.logical(charvec)

## factors are converted via their levels, so string conversion is used
as.logical(factor(charvec))
as.logical(factor(c(0,1)))  # "0" and "1" give NA
Vectors of 2^31 or more elements were added in R 3.0.0.
Prior to R 3.0.0, all vectors in R were restricted to at most 2^31 - 1 elements and could be indexed by integer vectors.
Currently all atomic (raw, logical, integer, numeric, complex, character) vectors, lists and expressions can be much longer on 64-bit platforms: such vectors are referred to as ‘long vectors’ and have a slightly different internal structure. In theory they can contain up to 2^52 elements, but address space limits of current CPUs and OSes will be much smaller. Such objects will have a length that is expressed as a double, and can be indexed by double vectors.
Arrays (including matrices) can be based on long vectors provided each of their dimensions is at most 2^31 - 1: thus there are no 1-dimensional long arrays.
R code typically only needs minor changes to work with long vectors, maybe only checking that as.integer is not used unnecessarily for e.g. lengths. However, compiled code typically needs quite extensive changes. Note that the .C and .Fortran interfaces do not accept long vectors, so .Call (or similar) has to be used.
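The as.integer caveat can be shown without allocating a long vector. In this sketch n stands in for a length that only a long vector could have; the names n and n_int are illustrative.

```r
n <- 2^31                        # one more than the largest integer value
.Machine$integer.max             # 2147483647, i.e. 2^31 - 1
n_int <- suppressWarnings(as.integer(n))
n_int                            # NA: the value does not fit in an integer
is.double(n)                     # TRUE: such lengths need to stay as doubles
```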
Because of the storage requirements (a minimum of 64 bytes per character string), character vectors are only going to be usable if they have a small number of distinct elements, and even then factors will be more efficient (4 bytes per element rather than 8). So it is expected that most of the usage of long vectors will be integer vectors (including factors) and numeric vectors.
It is now possible to use matrices with more than 2 billion elements. Whether matrix algebra (including %*%, crossprod, svd, qr, solve and eigen) will actually work is somewhat implementation dependent, including the Fortran compiler used and if an external BLAS or LAPACK is used.
An efficient parallel BLAS implementation will often be important to obtain usable performance. For example on one particular platform chol on a 47,000 square matrix took about 5 hours with the internal BLAS, 21 minutes using an optimized BLAS on one core, and 2 minutes using an optimized BLAS on 16 cores.
Returns a matrix of logicals the same size as a given matrix with entries TRUE in the lower or upper triangle.
lower.tri(x, diag = FALSE)
upper.tri(x, diag = FALSE)
x | a matrix or other R object with |
diag | logical. Should the diagonal be included? |
diag, matrix; further row and col on which lower.tri() and upper.tri() are built.
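The relationship to row and col can be sketched directly: an entry lies in the strict lower triangle when its row index exceeds its column index. The matrix m below is illustrative.

```r
m <- matrix(1:20, 4, 5)
same_lower <- identical(lower.tri(m), row(m) > col(m))
# With diag = TRUE the comparison includes equality.
same_upper <- identical(upper.tri(m, diag = TRUE), row(m) <= col(m))
c(same_lower, same_upper)
```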
(m2 <- matrix(1:20, 4, 5))
lower.tri(m2)
m2[lower.tri(m2)] <- NA
m2
ls and objects return a vector of character strings giving the names of the objects in the specified environment. When invoked with no argument at the top level prompt, ls shows what data sets and functions a user has defined. When invoked with no argument inside a function, ls returns the names of the function's local variables: this is useful in conjunction with browser.
ls(name, pos = -1L, envir = as.environment(pos),
   all.names = FALSE, pattern, sorted = TRUE)
objects(name, pos= -1L, envir = as.environment(pos),
        all.names = FALSE, pattern, sorted = TRUE)
name | which environment to use in listing the available objects. Defaults to the current environment. Although called |
pos | an alternative argument to |
envir | an alternative argument to |
all.names | a logical value. If |
pattern | an optional regular expression. Only names matching |
sorted | logical indicating if the resulting |
The name argument can specify the environment from which object names are taken in one of several forms: as an integer (the position in the search list); as the character string name of an element in the search list; or as an explicit environment (including using sys.frame to access the currently active function calls). By default, the environment of the call to ls or objects is used. The pos and envir arguments are an alternative way to specify an environment, but are primarily there for back compatibility.
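A minimal sketch of listing an explicit environment, including the all.names and pattern arguments. The environment e and the object names in it are made up for the example.

```r
e <- new.env()
assign("alpha", 1, envir = e)
assign("beta", 2, envir = e)
assign(".hidden", 3, envir = e)

visible <- ls(e)                      # "alpha" "beta"
all_obj <- ls(e, all.names = TRUE)    # also includes ".hidden"
a_only  <- ls(e, pattern = "^a")      # "alpha"
via_env <- ls(envir = e)              # the same as ls(e)
```

The position of ".hidden" in all_obj depends on the collation of the current locale, so it is safer not to rely on it.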
Note that the order of strings for sorted = TRUE is locale dependent, see Sys.getlocale. If sorted = FALSE the order is arbitrary, depending if the environment is hashed, the order of insertion of objects, ....
Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988) The New S Language. Wadsworth & Brooks/Cole.
glob2rx for converting wildcard patterns to regular expressions.
ls.str for a long listing based on str. apropos (or find) for finding objects in the whole search path; grep for more details on ‘regular expressions’; class, methods, etc., for object-oriented programming.
.Ob <- 1
ls(pattern = "O")
ls(pattern= "O", all.names = TRUE)    # also shows ".[foo]"

# shows an empty list because inside myfunc no variables are defined
myfunc <- function() {ls()}
myfunc()

# define a local variable inside myfunc
myfunc <- function() {y <- 1; ls()}
myfunc()                # shows "y"
Make syntactically valid names out of character vectors.
make.names(names, unique = FALSE, allow_ = TRUE)
names | character vector to be coerced to syntactically valid names. This is coerced to character if necessary. |
unique | logical; if |
allow_ | logical. For compatibility with R prior to 1.9.0. |
A syntactically valid name consists of letters, numbers and the dot or underline characters and starts with a letter or the dot not followed by a number. Names such as ".2way" are not valid, and neither are the reserved words.
The definition of a letter depends on the current locale, but only ASCII digits are considered to be digits.
The character "X" is prepended if necessary. All invalid characters are translated to ".". A missing value is translated to "NA". Names which match R keywords have a dot appended to them. Duplicated values are altered by make.unique.
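The rules above can be sketched on a few inputs chosen to trigger each one: a dot followed by a digit, a reserved word, an invalid character, and a leading digit.

```r
fixed <- make.names(c(".2way", "if", "a b", "1abc"))
fixed   # "X.2way" "if." "a.b" "X1abc"
```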
A character vector of same length as names with each changed to a syntactically valid name, in the current locale's encoding.
Some OSes, notably FreeBSD, report extremely incorrect information about which characters are alphabetic in some locales (typically, all multi-byte locales including UTF-8 locales). However, R provides substitutes on Windows, macOS and AIX.
Prior to R version 1.9.0, underscores were not valid in variable names, and code that relies on them being converted to dots will no longer work. Use allow_ = FALSE for back-compatibility.
allow_ = FALSE is also useful when creating names for export to applications which do not allow underline in names (such as some DBMSes).
make.unique, names, character, data.frame.
make.names(c("a and b", "a-and-b"), unique = TRUE)
# "a.and.b"  "a.and.b.1"
make.names(c("a and b", "a_and_b"), unique = TRUE)
# "a.and.b"  "a_and_b"
make.names(c("a and b", "a_and_b"), unique = TRUE, allow_ = FALSE)
# "a.and.b"  "a.and.b.1"
make.names(c("", "X"), unique = TRUE)
# "X.1" "X" currently; R up to 3.0.2 gave "X" "X.1"

state.name[make.names(state.name) != state.name] # those 10 with a space
Makes the elements of a character vector unique by appending sequence numbers to duplicates.
make.unique(names, sep = ".")
names | a character vector. |
sep | a character string used to separate a duplicate name from its sequence number. |
The algorithm used by make.unique has the property that make.unique(c(A, B)) == make.unique(c(make.unique(A), B)).
In other words, you can append one string at a time to a vector, making it unique each time, and get the same result as applying make.unique to all of the strings at once.
If character vector A is already unique, then make.unique(c(A, B)) preserves A.
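Both properties can be checked on small inputs. This is a sketch; the vectors A, B and U are illustrative, and the expected results follow from the properties stated above.

```r
A <- c("a", "a")
B <- c("a.2", "a")
all_at_once <- make.unique(c(A, B))
incremental <- make.unique(c(make.unique(A), B))
identical(all_at_once, incremental)   # TRUE

U <- c("x", "y")                      # already unique
extended <- make.unique(c(U, "x", "x"))
extended[seq_along(U)]                # "x" "y": U is preserved
```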
A character vector of same length as names with duplicates changed, in the current locale's encoding.
Thomas P. Minka
make.unique(c("a", "a", "a"))
make.unique(c(make.unique(c("a", "a")), "a"))

make.unique(c("a", "a", "a.2", "a"))
make.unique(c(make.unique(c("a", "a")), "a.2", "a"))

## Now show a bit where this is used :
trace(make.unique)
## Applied in data.frame() constructions:
(d1 <- data.frame(x = 1, x = 2, x = 3)) # direct
 d2 <- data.frame(data.frame(x = 1, x = 2), x = 3) # pairwise
stopifnot(identical(d1, d2),
          colnames(d1) == c("x", "x.1", "x.2"))
untrace(make.unique)
mapply is a multivariate version of sapply. mapply applies FUN to the first elements of each ... argument, the second elements, the third elements, and so on. Arguments are recycled if necessary.
.mapply() is a bare-bones version of mapply(), e.g., to be used in other functions.
mapply(FUN, ..., MoreArgs = NULL, SIMPLIFY = TRUE,
       USE.NAMES = TRUE)

.mapply(FUN, dots, MoreArgs)
FUN | function to apply, found via |
... | arguments to vectorize over, will be recycled to common length (zero if one of them is). See also ‘Details’. |
dots |
|
MoreArgs | a list of other arguments to |
SIMPLIFY | logical or character string; attempt to reduce the result to a vector, matrix or higher dimensional array; see the |
USE.NAMES | logical; use the names of the first ... argument, or if that is an unnamed character vector, use that vector as the names. |
mapply calls FUN for the values of ... (re-cycled to the length of the longest, unless any have length zero where recycling to zero length will return list()), followed by the arguments given in MoreArgs. The arguments in the call will be named if ... or MoreArgs are named.
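A small sketch of recycling combined with MoreArgs. The anonymous function and its argument names x, y and z are illustrative; z is the same for every call, while y is recycled to the length of x.

```r
res <- mapply(function(x, y, z) x + y + z,
              1:4,                        # x: 1 2 3 4
              1:2,                        # y: recycled to 1 2 1 2
              MoreArgs = list(z = 100))   # z: passed unchanged to each call
res   # 102 104 104 106
```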
For the arguments in ... (or components in dots) class specific subsetting (such as [) and length methods will be used where applicable.
A list, or for SIMPLIFY = TRUE, a vector, array or list.
sapply, after which mapply() is modelled.
outer, which applies a vectorized function to all combinations of two arguments.
mapply(rep, 1:4, 4:1)

mapply(rep, times = 1:4, x = 4:1)

mapply(rep, times = 1:4, MoreArgs = list(x = 42))

mapply(function(x, y) seq_len(x) + y,
       c(a =  1, b = 2, c = 3),  # names from first
       c(A = 10, B = 0, C = -10))

word <- function(C, k) paste(rep.int(C, k), collapse = "")
## names from the first, too:
utils::str(L <- mapply(word, LETTERS[1:6], 6:1, SIMPLIFY = FALSE))

mapply(word, "A", integer()) # gave Error, now list()
For a contingency table in array form, compute the sum of table entries for a given margin or set of margins.
marginSums(x, margin = NULL)
margin.table(x, margin = NULL)
x | an array, usually a |
margin | a vector giving the margins to compute sums for. E.g., for a matrix |
The relevant marginal table, or just the sum of all entries if margin has length zero. The class of x is copied to the output table if margin is non-NULL.
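Both cases can be sketched on a three-dimensional array. The array a is illustrative, and apply() is used only as an independent way to compute the same sums.

```r
a <- array(1:24, dim = c(2, 3, 4))
total <- marginSums(a)            # margin of length zero: sum of all entries
by13  <- marginSums(a, c(1, 3))   # sums over dimension 2, giving a 2 x 4 array
total                             # 300
dim(by13)                         # 2 4
all(by13 == apply(a, c(1, 3), sum))
```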
margin.table is an earlier name, retained for back-compatibility.
Peter Dalgaard
rowSums and colSums for similar functionality.
m <- matrix(1:4, 2)
marginSums(m, 1)  # = rowSums(m)
marginSums(m, 2)  # = colSums(m)

DF <- as.data.frame(UCBAdmissions)
tbl <- xtabs(Freq ~ Gender + Admit, DF)
tbl
marginSums(tbl, "Gender")  # a 1-dim "table"
rowSums(tbl)               # a numeric vector
mat.or.vec creates an nr by nc zero matrix if nc is greater than 1, and a zero vector of length nr if nc equals 1.
mat.or.vec(nr, nc)
nr, nc | numbers of rows and columns. |
mat.or.vec(3, 1)
mat.or.vec(3, 2)
match returns a vector of the positions of (first) matches of its first argument in its second.
%in% is a more intuitive interface as a binary operator, which returns a logical vector indicating if there is a match or not for its left operand.
match(x, table, nomatch = NA_integer_, incomparables = NULL)

x %in% table
x | vector or |
table | vector or |
nomatch | the value to be returned in the case when no match is found. Note that it is coerced to |
incomparables | a vector of values that cannot be matched. Any value in |
%in% is currently defined as "%in%" <- function(x, table) match(x, table, nomatch = 0) > 0
Factors, raw vectors and lists are converted to character vectors, internally classed objects are transformed via mtfrm, and then x and table are coerced to a common type (the later of the two types in R's ordering, logical < integer < numeric < complex < character) before matching. If incomparables has positive length it is coerced to the common type.
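The coercion rules can be sketched with inputs of mixed type. The names m1 to m4 and the factor f are illustrative.

```r
m1 <- match(2L, c("1", "2", "3"))   # 2: the integer is compared as the string "2"
m2 <- match(c(2, 5), c(5, 2, 2))    # 2 1: only the first match is reported
f  <- factor(c("b", "a"))
m3 <- match(f, c("a", "b"))         # 2 1: factors are matched by their labels
m4 <- c(1, 10) %in% c(1, 2, 3)      # TRUE FALSE
```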
Matching for lists is potentially very slow and best avoided except in simple cases.
Exactly what matches what is to some extent a matter of definition. For all types, NA matches NA and no other value. For real and complex values, NaN values are regarded as matching any other NaN value, but not matching NA, where for complex x, real and imaginary parts must match both (unless containing at least one NA).
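A short sketch of the NA and NaN rules for real values, with ordinary comparison shown for contrast. The names n1 to n4 are illustrative; the last line uses incomparables to prevent NA from matching.

```r
n1 <- match(c(NA, NaN), c(NaN, NA))   # 2 1: NA matches NA, NaN matches NaN
n2 <- NA %in% c(1, 2)                 # FALSE
n3 <- NA == 1                         # NA: == does not treat NA as a value
n4 <- match(c(1, NA), c(NA, 1), incomparables = NA)   # 2 NA
```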
Character strings will be compared as byte sequences if any input is marked as "bytes", and otherwise are regarded as equal if they are in different encodings but would agree when translated to UTF-8 (see Encoding).
That %in% never returns NA makes it particularly useful in if conditions.
A vector of the same length as x.
match: An integer vector giving the position in table of the first match if there is a match, otherwise nomatch.
If x[i] is found to equal table[j] then the value returned in the i-th position of the return value is j, for the smallest possible j. If no match is found, the value is nomatch.
%in%: A logical vector, indicating if a match was located for each element of x: thus the values are TRUE or FALSE and never NA.
Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988) The New S Language. Wadsworth & Brooks/Cole.
pmatch and charmatch for (partial) string matching, match.arg, etc for function argument matching. findInterval similarly returns a vector of positions, but finds numbers within intervals, rather than exact matches.
is.element for an S-compatible equivalent of %in%.
unique (and duplicated) are using the same definitions of “match” or “equality” as match(), and these are less strict than ==, e.g., for NA and NaN in numeric or complex vectors, or for strings with different encodings, see also above.
## The intersection of two sets can be defined via match():
## Simple version:
## intersect <- function(x, y) y[match(x, y, nomatch = 0)]
intersect # the R function in base is slightly more careful
intersect(1:10, 7:20)

1:10 %in% c(1,3,5,9)
sstr <- c("c","ab","B","bba","c",NA,"@","bla","a","Ba","%")
sstr[sstr %in% c(letters, LETTERS)]

"%w/o%" <- function(x, y) x[!x %in% y] #-- x without y
(1:10) %w/o% c(3,7,12)
## Note that setdiff() is very similar and typically makes more sense:
        c(1:6,7:2) %w/o% c(3,7,12)  # -> keeps duplicates
setdiff(c(1:6,7:2), c(3,7,12))      # -> unique values

## Illuminating example about NA matching
r <- c(1, NA, NaN)
zN <- c(complex(real = NA , imaginary =  r ),
        complex(real =  r , imaginary = NA ),
        complex(real =  r , imaginary = NaN),
        complex(real = NaN, imaginary =  r ))
zM <- cbind(Re=Re(zN), Im=Im(zN), match = match(zN, zN))
rownames(zM) <- format(zN)
zM ##--> many "NA's" (= 1) and the four non-NA's (3 different ones, at 7,9,10)

length(zN) # 12
unique(zN) # the "NA" and the 3 different non-NA NaN's
stopifnot(identical(unique(zN), zN[c(1, 7,9,10)]))

## very strict equality would have 4 duplicates (of 12):
symnum(outer(zN, zN, Vectorize(identical,c("x","y")),
             FALSE,FALSE,FALSE,FALSE))
## removing "(very strictly) duplicates",
i <- c(5,8,11,12) # we get 8 pairwise non-identicals :
Ixy <- outer(zN[-i], zN[-i], Vectorize(identical,c("x","y")),
             FALSE,FALSE,FALSE,FALSE)
stopifnot(identical(Ixy, diag(8) == 1))
match.arg matches a character arg against a table of candidate values as specified by choices.
match.arg(arg, choices, several.ok = FALSE)
arg | a character vector (of length one unless |
choices | a character vector of candidate values, often missing, see ‘Details’. |
several.ok | logical specifying if |
In the one-argument form match.arg(arg), the choices are obtained from a default setting for the formal argument arg of the function from which match.arg was called. (Since default argument matching will set arg to choices, this is allowed as an exception to the ‘length one unless several.ok is TRUE’ rule, and returns the first element.)
Matching is done using pmatch, so arg may be abbreviated and the empty string ("") never matches, not even itself, see pmatch.
The unabbreviated version of the exact or unique partial match if there is one; otherwise, an error is signalled if several.ok is false, as per default. When several.ok is true and (at least) one element of arg has a match, all unabbreviated versions of matches are returned.
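The following sketch illustrates the one-argument form and the several.ok behaviour. The function kernel_name() and its choices are hypothetical, chosen only for illustration.

```r
# Hypothetical wrapper: the choices come from the default of `kernel`.
kernel_name <- function(kernel = c("gaussian", "epanechnikov", "rectangular")) {
  match.arg(kernel)
}

default_choice <- kernel_name()       # no argument: first choice is used
abbreviated    <- kernel_name("ep")   # unique partial match is expanded
bad <- tryCatch(kernel_name("tri"), error = function(e) "error signalled")

# several.ok = TRUE returns every unabbreviated match.
several <- match.arg(c("gauss", "rect"),
                     c("gaussian", "epanechnikov", "rectangular"),
                     several.ok = TRUE)
```

The error is caught with tryCatch() rather than inspected, in line with the advice not to rely on the wording of the message.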
The error messages given are liable to change and did so in R 4.2.0. Do not test them in packages.
require(stats)
## Extends the example for 'switch'
center <- function(x, type = c("mean", "median", "trimmed")) {
  type <- match.arg(type)
  switch(type,
         mean = mean(x),
         median = median(x),
         trimmed = mean(x, trim = .1))
}
x <- rcauchy(10)
center(x, "t")       # Works
center(x, "med")     # Works
try(center(x, "m"))  # Error
stopifnot(identical(center(x),       center(x, "mean")),
          identical(center(x, NULL), center(x, "mean")) )

## Allowing more than one 'arg' and hence more than one match:
match.arg(c("gauss", "rect", "ep"),
          c("gaussian", "epanechnikov", "rectangular", "triangular"),
          several.ok = TRUE)
match.arg(c("a", ""),  c("", NA, "bb", "abc"), several.ok=TRUE) # |--> "abc"
match.call returns a call in which all of the specified arguments are specified by their full names.
match.call(definition = sys.function(sys.parent()),
           call = sys.call(sys.parent()),
           expand.dots = TRUE,
           envir = parent.frame(2L))
definition | a function, by default the function from which |
call | an unevaluated call to the function specified by |
expand.dots | logical. Should arguments matching |
envir | an environment, from which the |
‘function’ on this help page means an interpreted function (also known as a ‘closure’): match.call does not support primitive functions (where argument matching is normally positional).
match.call is most commonly used in two circumstances:
To record the call for later re-use: for example most model-fitting functions record the call as element call of the list they return. Here the default expand.dots = TRUE is appropriate.
To pass most of the call to another function, often model.frame. Here the common idiom is that expand.dots = FALSE is used, and the ... element of the matched call is removed. An alternative is to explicitly select the arguments to be passed on, as is done in lm.
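The two uses just described can be sketched as follows. fit_wrapper() is a hypothetical function, and dat and w are placeholder symbols that are never evaluated because match.call() works on the unevaluated call.

```r
# Hypothetical model-fitting wrapper, for illustration only.
fit_wrapper <- function(formula, data, ...) {
  cl <- match.call()                      # use 1: record the full call
  mf <- match.call(expand.dots = FALSE)   # use 2: keep the dots as one element
  mf[["..."]] <- NULL                     # ... and remove that element
  list(call = cl, reduced = mf)
}

out <- fit_wrapper(y ~ x, dat, weights = w)
out$call      # all arguments named in full, dots expanded
out$reduced   # the same call without the arguments matched by ...
```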
Calling match.call outside a function without specifying definition is an error.
An object of class call.
Chambers, J. M. (1998) Programming with Data. A Guide to the S Language. Springer.
sys.call() is similar, but does not expand the argument names; call, pmatch, match.arg, match.fun.
match.call(get, call("get", "abc", i = FALSE, p = 3))
## -> get(x = "abc", pos = 3, inherits = FALSE)
fun <- function(x, lower = 0, upper = 1) {
  structure((x - lower) / (upper - lower), CALL = match.call())
}
fun(4 * atan(1), u = pi)
When called inside functions that take a function as argument, extract the desired function object while avoiding undesired matching to objects of other types.
match.fun(FUN, descend = TRUE)
FUN | item to match as function: a function, symbol or character string. See ‘Details’. |
descend | logical; control whether to search past non-function objects. |
match.fun is not intended to be used at the top level since it will perform matching in the parent of the caller.
If FUN is a function, it is returned. If it is a symbol (for example, enclosed in backquotes) or a character vector of length one, it will be looked up using get in the environment of the parent of the caller. If it is of any other mode, an attempt is first made to get the argument to the caller as a symbol (using substitute twice), and if that fails, an error is declared.
If descend = TRUE, match.fun will look past non-function objects with the given name; otherwise if FUN points to a non-function object then an error is generated.
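A minimal sketch of the intended use inside a function that takes a function argument; apply_twice() is a hypothetical helper, not part of base R.

```r
# Hypothetical helper: FUN may be given as a function or as a string.
apply_twice <- function(x, FUN) {
  FUN <- match.fun(FUN)   # resolve to a function object once, up front
  FUN(FUN(x))
}

by_object <- apply_twice(16, sqrt)     # sqrt(sqrt(16))
by_name   <- apply_twice(16, "sqrt")   # same function, looked up by name
```

Resolving FUN at the top of the function means the body can call it without caring how the caller specified it.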
This is used in base functions such as apply, lapply, outer, and sweep.
A function matching FUN or an error is generated.
The descend argument is a bit of a misnomer and probably not actually needed by anything. It may go away in the future.
It is impossible to fully foolproof this. If one attaches a list or data frame containing a length-one character vector with the same name as a function, it may be used (although namespaces will help).
Peter Dalgaard and Robert Gentleman, based on an earlier version by Jonathan Rougier.
# Same as get("*"):
match.fun("*")
# Overwrite outer with a vector
outer <- 1:5
try(match.fun(outer, descend = FALSE)) #-> Error:  not a function
match.fun(outer) # finds it anyway
is.function(match.fun("outer")) # as well
abs(x) computes the absolute value of x, sqrt(x) computes the (principal) square root of x.
The naming follows the standard for computer languages such as C or Fortran.
abs(x)
sqrt(x)
x | a numeric or |
These are internal generic primitive functions: methods can be defined for them individually or via the Math group generic. For complex arguments (and the default method), z, abs(z) == Mod(z) and sqrt(z) == z^0.5.
abs(x) returns an integer vector when x is integer or logical.
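A brief check of the statements above about return types and complex arguments (the values are arbitrary):

```r
type_int <- typeof(abs(-3L))    # integer input stays integer
type_lgl <- typeof(abs(TRUE))   # logical input gives an integer result
type_dbl <- typeof(abs(-3))     # double input stays double

z <- complex(real = 3, imaginary = -4)
modulus <- abs(z)                     # same value as Mod(z)
root    <- sqrt(as.complex(-1))       # principal square root of -1
```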
Both are S4 generic and members of the Math group generic.
Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988) The New S Language. Wadsworth & Brooks/Cole.
Arithmetic for simple, log for logarithmic, sin for trigonometric, and Special for special mathematical functions.
‘plotmath’ for the use of sqrt in plot annotation.
require(stats) # for spline
require(graphics)
xx <- -9:9
plot(xx, sqrt(abs(xx)),  col = "red")
lines(spline(xx, sqrt(abs(xx)), n=101), col = "pink")
Multiplies two matrices, if they are conformable. If one argument is a vector, it will be promoted to either a row or column matrix to make the two arguments conformable. If both are vectors of the same length, it will return the inner product (as a matrix).
x %*% y
x, y | numeric or complex matrices or vectors. |
When a vector is promoted to a matrix, its names are not promoted to row or column names, unlike as.matrix.
Promotion of a vector to a 1-row or 1-column matrix happens when one of the two choices allows x and y to get conformable dimensions.
This operator is a generic function: methods can be written for it individually or via the matOps group generic function; it dispatches to S3 and S4 methods. Methods need to be written for a function that takes two arguments named x and y.
A double or complex matrix product. Use drop to remove dimensions which have only one level.
The propagation of NaN/Inf values, precision, and performance of matrix products can be controlled by options("matprod").
Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988) The New S Language. Wadsworth & Brooks/Cole.
For matrix cross products, crossprod() and tcrossprod() are typically preferable. matrix, Arithmetic, diag.
x <- 1:4
(z <- x %*% x)    # scalar ("inner") product (1 x 1 matrix)
drop(z)             # as scalar

y <- diag(x)
z <- matrix(1:12, ncol = 3, nrow = 4)
y %*% z
y %*% x
x %*% z
matrix creates a matrix from the given set of values.
as.matrix attempts to turn its argument into a matrix.
is.matrix tests if its argument is a (strict) matrix.
matrix(data = NA, nrow = 1, ncol = 1, byrow = FALSE,
       dimnames = NULL)

as.matrix(x, ...)
## S3 method for class 'data.frame'
as.matrix(x, rownames.force = NA, ...)

is.matrix(x)
data | an optional data vector (including a list or |
nrow | the desired number of rows. |
ncol | the desired number of columns. |
byrow | logical. If |
dimnames | a |
x | an R object. |
... | additional arguments to be passed to or from methods. |
rownames.force | logical indicating if the resulting matrix should have character (rather than |
If one of nrow or ncol is not given, an attempt is made to infer it from the length of data and the other parameter. If neither is given, a one-column matrix is returned.
If there are too few elements in data to fill the matrix, then the elements in data are recycled. If data has length zero, NA of an appropriate type is used for atomic vectors (0 for raw vectors) and NULL for lists.
is.matrix returns TRUE if x is a vector and has a "dim" attribute of length 2 and FALSE otherwise. Note that a data.frame is not a matrix by this test. The function is generic: you can write methods to handle specific classes of objects, see InternalMethods.
as.matrix is a generic function. The method for data frames will return a character matrix if there are only atomic columns and any non-(numeric/logical/complex) column, applying as.vector to factors and format to other non-character columns. Otherwise, the usual coercion hierarchy (logical < integer < double < complex) will be used, e.g., all-logical data frames will be coerced to a logical matrix, mixed logical-integer will give an integer matrix, etc.
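A small sketch of the coercion performed by the data frame method, using two made-up data frames:

```r
# Logical and integer columns: the result follows the coercion hierarchy.
logical_int <- data.frame(flag = c(TRUE, FALSE), n = 1:2)
m_int <- as.matrix(logical_int)

# Any non-numeric column forces a character matrix.
mixed <- data.frame(n = 1:2, label = c("u", "v"))
m_chr <- as.matrix(mixed)

typeof(m_int)                         # "integer"
typeof(m_chr)                         # "character"
is.matrix(mixed); is.matrix(m_chr)    # FALSE for the data frame, TRUE for the matrix
```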
The default method for as.matrix calls as.vector(x), and hence e.g. coerces factors to character vectors.
When coercing a vector, it produces a one-column matrix, and promotes the names (if any) of the vector to the rownames of the matrix.
is.matrix is a primitive function.
The print method for a matrix gives a rectangular layout with dimnames or indices. For a list matrix, the entries of length not one are printed in the form ‘integer,7’ indicating the type and length.
If you just want to convert a vector to a matrix, something like
  dim(x) <- c(nx, ny)
  dimnames(x) <- list(row_names, col_names)
will avoid duplicating x and preserve class(x) which may be useful, e.g., for Date objects.
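Both points, the promotion of names by as.matrix and the preservation of the class by assigning dim, can be sketched as follows (the vectors are made up for illustration):

```r
# as.matrix() on a named vector: one column, names become row names.
v <- c(a = 1, b = 2)
m <- as.matrix(v)
dim(m)        # 2 1
rownames(m)   # "a" "b"

# Assigning dim keeps the class attribute, here "Date".
d <- as.Date("2024-01-01") + 0:5
dim(d) <- c(2, 3)
inherits(d, "Date")
```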
Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988) The New S Language. Wadsworth & Brooks/Cole.
data.matrix, which attempts to convert to a numeric matrix.
A matrix is the special case of a two-dimensional array. inherits(m, "array") is true for a matrix m.
is.matrix(as.matrix(1:10))
!is.matrix(warpbreaks)  # data.frame, NOT matrix!
warpbreaks[1:10,]
as.matrix(warpbreaks[1:10,])  # using as.matrix.data.frame(.) method

## Example of setting row and column names
mdat <- matrix(c(1,2,3, 11,12,13), nrow = 2, ncol = 3, byrow = TRUE,
               dimnames = list(c("row1", "row2"),
                               c("C.1", "C.2", "C.3")))
mdat
Find the position of the maximum for each row of a matrix, by default breaking ties at random.
max.col(m, ties.method = c("random", "first", "last"))
m | a numerical matrix. |
ties.method | a character string specifying how ties are handled, |
When ties.method = "random", as per default, ties are broken at random. In this case, the determination of a tie assumes that the entries are probabilities: there is a relative tolerance of, relative to the largest (in magnitude, omitting infinity) entry in the row.
If ties.method = "first", max.col returns the column number of the first of several maxima in every row, the same as unname(apply(m, 1, which.max)) if m has no missing values.
Correspondingly, ties.method = "last" returns the last of possibly several indices.
index of a maximal value for each row, an integer vector of length nrow(m).
Venables, W. N. and Ripley, B. D. (2002) Modern Applied Statistics with S. New York: Springer (4th ed).
which.max for vectors.
table(mc <- max.col(swiss))  # mostly "1" and "5", 5 x "2" and once "4"
swiss[unique(print(mr <- max.col(t(swiss)))) , ]  # 3 33 45 45 33 6

set.seed(1)  # reproducible example:
(mm <- rbind(x = round(2*stats::runif(12)),
             y = round(5*stats::runif(12)),
             z = round(8*stats::runif(12))))
## Not run: 
##   [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10] [,11] [,12]
## x    1    1    1    2    0    2    2    1    1     0     0     0
## y    3    2    4    2    4    5    2    4    5     1     3     1
## z    2    3    0    3    7    3    4    5    4     1     7     5
## End(Not run)
## column indices of all row maxima :
utils::str(lapply(1:3, function(i) which(mm[i,] == max(mm[i,]))))
max.col(mm) ; max.col(mm) # "random"
max.col(mm, "first") # -> 4 6 5
max.col(mm, "last")  # -> 7 9 11
Generic function for the (trimmed) arithmetic mean.
mean(x, ...)

## Default S3 method:
mean(x, trim = 0, na.rm = FALSE, ...)
x | an R object. Currently there are methods for numeric/logical vectors and date, date-time and time interval objects. Complex vectors are allowed for |
trim | the fraction (0 to 0.5) of observations to be trimmed from each end of |
na.rm | a logical evaluating to |
... | further arguments passed to or from other methods. |
If trim is zero (the default), the arithmetic mean of the values in x is computed, as a numeric or complex vector of length one. If x is not logical (coerced to numeric), numeric (including integer) or complex, NA_real_ is returned, with a warning.
If trim is non-zero, a symmetrically trimmed mean is computed with a fraction of trim observations deleted from each end before the mean is computed.
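A sketch of what the trimmed mean does, reproduced by hand. It assumes that the number of observations dropped from each end is floor(n * trim); this is an assumption about the rounding, which the text above does not spell out.

```r
x <- c(0:10, 50)
n <- length(x)

# Assumed rounding rule: floor(n * trim) observations dropped per end.
k <- floor(n * 0.10)
manual  <- mean(sort(x)[(k + 1):(n - k)])
trimmed <- mean(x, trim = 0.10)
plain   <- mean(x)

c(plain = plain, trimmed = trimmed, manual = manual)
```

Here one observation is dropped from each end, so the outlier 50 no longer influences the result.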
Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988) The New S Language. Wadsworth & Brooks/Cole.
weighted.mean, mean.POSIXct, colMeans for row and column means.
x <- c(0:10, 50)
xm <- mean(x)
c(xm, mean(x, trim = 0.10))
In-memory compression or decompression for raw vectors.
memCompress(from, type = c("gzip", "bzip2", "xz", "none"))

memDecompress(from,
              type = c("unknown", "gzip", "bzip2", "xz", "none"),
              asChar = FALSE)
from | raw vector. For |
type | character string, the type of compression. May beabbreviated to a single letter, defaults to the first of the alternatives. |
asChar | logical: should the result be converted to a character string? NB: character strings have a limit of |
type = "none" passes the input through unchanged, but may be useful if type is a variable.
type = "unknown" attempts to detect the type of compression applied (if any): this will always succeed for bzip2 compression, and will succeed for other forms if there is a suitable header. If no type of compression is detected this is the same as type = "none" but a warning is given.
gzip compression uses whatever is the default compression level of the underlying library (usually 6). This supports the RFC 1950 format, sometimes known as ‘zlib’ format, for compression and decompression and for decompression only RFC 1952, the ‘gzip’ format (which wraps the ‘zlib’ format with a header and footer).
bzip2 compression always adds a header ("BZh"). The underlying library only supports in-memory (de)compression of up to elements. Compression is equivalent to bzip2 -9 (the default).
Compressing with type = "xz" is equivalent to compressing a file with xz -9e (including adding the ‘magic’ header): decompression should cope with the contents of any file compressed by xz version 4.999 and later, as well as by some versions of lzma. There are other versions, in particular ‘raw’ streams, that are not currently handled.
All the types of compression can expand the input: for "gzip" and "bzip2" the maximum expansion is known and so memCompress can always allocate sufficient space. For "xz" it is possible (but extremely unlikely) that compression will fail if the output would have been too large.
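A minimal round-trip sketch on a made-up raw vector, passing type explicitly so that it does not rely on auto-detection:

```r
# Illustrative, highly repetitive input so that compression pays off.
raw_in <- charToRaw(paste(rep("abcabcabc", 100), collapse = ""))

gz   <- memCompress(raw_in, type = "gzip")
back <- memDecompress(gz, type = "gzip")

bz   <- memCompress(raw_in, type = "bzip2")
text <- memDecompress(bz, type = "bzip2", asChar = TRUE)  # character string

same <- memCompress(raw_in, type = "none")   # passed through unchanged

c(original = length(raw_in), gzip = length(gz), bzip2 = length(bz))
```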
A raw vector or a character string (if asChar = TRUE).
libdeflate
Support for the libdeflate library was added for R 4.4.0. It uses different code for the RFC 1950 ‘zlib’ format (and RFC 1952 for decompression), expected to be substantially faster than using the reference (or system) zlib library. It is used for type = "gzip" if available.
The headers and sources can be downloaded from https://github.com/ebiggers/libdeflate and pre-built versions are available for most Linux distributions. It is used for binary Windows distributions.
extSoftVersion for the versions of the zlib or libdeflate, bzip2 and xz libraries in use.
https://en.wikipedia.org/wiki/Data_compression for background on data compression, https://zlib.net/, https://en.wikipedia.org/wiki/Gzip, http://www.bzip.org/, https://en.wikipedia.org/wiki/Bzip2, and https://en.wikipedia.org/wiki/XZ_Utils for references about the particular schemes used.
txt <- readLines(file.path(R.home("doc"), "COPYING"))
sum(nchar(txt))
txt.gz <- memCompress(txt, "g") # "gzip", the default
length(txt.gz)
txt2 <- strsplit(memDecompress(txt.gz, "g", asChar = TRUE), "\n")[[1]]
stopifnot(identical(txt, txt2))
## as from R 4.4.0 this is detected if not specified.
txt2b <- strsplit(memDecompress(txt.gz, asChar = TRUE), "\n")[[1]]
stopifnot(identical(txt2b, txt2))
txt.bz2 <- memCompress(txt, "b")
length(txt.bz2)
## can auto-detect bzip2:
txt3 <- strsplit(memDecompress(txt.bz2, asChar = TRUE), "\n")[[1]]
stopifnot(identical(txt, txt3))

## xz compression is only worthwhile for large objects
txt.xz <- memCompress(txt, "x")
length(txt.xz)
txt3 <- strsplit(memDecompress(txt.xz, asChar = TRUE), "\n")[[1]]
stopifnot(identical(txt, txt3))

## test decompressing a gzip-ed file
tf <- tempfile(fileext = ".gz")
con <- gzfile(tf, "w")
writeLines(txt, con)
close(con)
(nf <- file.size(tf))
# if (nzchar(Sys.which("file"))) system2("file", tf)
foo <- readBin(tf, "raw", n = nf)
unlink(tf)
## will detect the gzip header and choose type = "gzip"
txt3 <- strsplit(memDecompress(foo, asChar = TRUE), "\n")[[1]]
stopifnot(identical(txt, txt3))
Query and set the maximal size of the vector heap and the maximal number of heap nodes for the current R process.
mem.maxVSize(vsize = 0)
mem.maxNSize(nsize = 0)
vsize | numeric; new size limit in Mb. |
nsize | numeric; new maximal node number. |
New limits lower than current usage are ignored. Specifying a size of Inf sets the limit to the maximal possible value for the platform.
The default maximal values are unlimited on most platforms, but can be adjusted using environment variables as described in Memory. On macOS a lower default vector heap limit is used to protect against the R process being killed when macOS over-commits memory.
Adjusting the maximal number of nodes is rarely necessary. Adjusting the vector heap size limit can be useful on macOS in particular but should be done with caution.
The current or new value, in Mb for mem.maxVSize. Inf is returned if the current value is unlimited.
How R manages its workspace.
R has a variable-sized workspace. There are (rarely-used) command-line options to control its minimum size, but no longer any to control the maximum size.
R maintains separate areas for fixed and variable sized objects. The first of these is allocated as an array of cons cells (Lisp programmers will know what they are, others may think of them as the building blocks of the language itself, parse trees, etc.), and the second are thrown on a heap of ‘Vcells’ of 8 bytes each. Each cons cell occupies 28 bytes on a 32-bit build of R, (usually) 56 bytes on a 64-bit build.
The default values are (currently) an initial setting of 350k cons cells and 6Mb of vector heap. Note that the areas are not actually allocated initially: rather these values are the sizes for triggering garbage collection. These values can be set by the command line options --min-nsize and --min-vsize (or if they are not used, the environment variables R_NSIZE and R_VSIZE) when R is started. Thereafter R will grow or shrink the areas depending on usage, never decreasing below the initial values. The maximal vector heap size can be set with the environment variable R_MAX_VSIZE. An attempt to set a lower maximum than the current usage is ignored. Vector heap limits are given in bytes.
How much time R spends in the garbage collector will depend on these initial settings and on the trade-off the memory manager makes, when memory fills up, between collecting garbage to free up unused memory and growing these areas. The strategy used for growth can be specified by setting the environment variable R_GC_MEM_GROW to an integer value between 0 and 3. This variable is read at start-up. Higher values grow the heap more aggressively, thus reducing garbage collection time but using more memory.
You can find out the current memory consumption (the heap and cons cells used as numbers and megabytes) by typing gc() at the R prompt. Note that following gcinfo(TRUE), automatic garbage collection always prints memory use statistics.
The command-line option --max-ppsize controls the maximum size of the pointer protection stack. This defaults to 50000, but can be increased to allow deep recursion or large and complicated calculations to be done. Note that parts of the garbage collection process go through the full reserved pointer protection stack and hence become slower when the size is increased. Currently the maximum value accepted is 500000.
An Introduction to R for more command-line options.
Memory-limits for the design limitations.
gc for information on the garbage collector and total memory usage, object.size(a) for the (approximate) size of R object a. memory.profile for profiling the usage of cons cells.
R holds objects it is using in virtual memory. This help file documents the current design limitations on large objects: these differ between 32-bit and 64-bit builds of R.
Currently R runs on 32- and 64-bit operating systems, and most 64-bit OSes (including Linux, Solaris, Windows and macOS) can run either 32- or 64-bit builds of R. The memory limits depend mainly on the build, but for a 32-bit build of R on Windows they also depend on the underlying OS version.
R holds all objects in virtual memory, and there are limits based on the amount of memory that can be used by all objects:
There may be limits on the size of the heap and the number of cons cells allowed – see Memory – but these are usually not imposed.
There is a limit on the (user) address space of a single process such as the R executable. This is system-specific, and can depend on the executable.
The environment may impose limitations on the resources available to a single process: Windows' versions of R do so directly.
Error messages beginning ‘cannot allocate vector of size’ indicate a failure to obtain memory, either because the size exceeded the address-space limit for a process or, more likely, because the system was unable to provide the memory. Note that on a 32-bit build there may well be enough free memory available, but not a large enough contiguous block of address space into which to map it.
There are also limits on individual objects. The storage space cannot exceed the address limit, and if you try to exceed that limit, the error message begins ‘cannot allocate vector of length’. The number of bytes in a character string is limited to, which is also the limit on each dimension of an array.
The address-space limit is system-specific: 32-bit OSes impose a limit of no more than 4Gb: it is often 3Gb. Running 32-bit executables on a 64-bit OS will have similar limits: 64-bit executables will have an essentially infinite system-specific limit (e.g., 128Tb for Linux on x86_64 CPUs).
See the OS/shell's help on commands such as limit or ulimit for how to impose limitations on the resources available to a single process. For example a bash user could use
  ulimit -t 600 -v 4000000
whereas a csh user might use
  limit cputime 10m
  limit vmemoryuse 4096m
to limit a process to 10 minutes of CPU time and (around) 4Gb of virtual memory. (There are other options to set the RAM in use, but they are not generally honoured.)
The address-space limit is 2Gb under 32-bit Windows unless the OS's default has been changed to allow more (up to 3Gb). See https://docs.microsoft.com/en-gb/windows/desktop/Memory/physical-address-extension and https://docs.microsoft.com/en-gb/windows/desktop/Memory/4-gigabyte-tuning. Under most 64-bit versions of Windows the limit for a 32-bit build of R is 4Gb: for the oldest ones it is 2Gb. The limit for a 64-bit build of R (imposed by the OS) is 8Tb.
It is not normally possible to allocate as much as 2Gb to a single vector in a 32-bit build of R even on 64-bit Windows because of preallocations by Windows in the middle of the address space.
object.size(a) for the (approximate) size of R object a.
Lists the usage of the cons cells by SEXPREC type.
memory.profile()
The current types and their uses are listed in the include file ‘Rinternals.h’.
A vector of counts, named by the types. See typeof for an explanation of types.
gc for the overall usage of cons cells. Rprofmem and tracemem allow memory profiling of specific code or objects, but need to be enabled at compile time.
memory.profile()
Merge two data frames by common columns or row names, or do other versions of database join operations.
merge(x, y, ...)

## Default S3 method:
merge(x, y, ...)

## S3 method for class 'data.frame'
merge(x, y, by = intersect(names(x), names(y)),
      by.x = by, by.y = by, all = FALSE, all.x = all, all.y = all,
      sort = TRUE, suffixes = c(".x",".y"), no.dups = TRUE,
      incomparables = NULL, ...)
x, y | data frames, or objects to be coerced to one. |
by, by.x, by.y | specifications of the columns used for merging. See ‘Details’. |
all | logical; |
all.x | logical; if |
all.y | logical; analogous to |
sort | logical. Should the result be sorted on the |
suffixes | a character vector of length 2 specifying the suffixes to be used for making unique the names of columns in the result which are not used for merging (appearing in |
no.dups | logical indicating that |
incomparables | values which cannot be matched. See |
... | arguments to be passed to or from methods. |
merge is a generic function whose principal method is for data frames: the default method coerces its arguments to data frames and calls the "data.frame" method.
By default the data frames are merged on the columns with names they both have, but separate specifications of the columns can be given by by.x and by.y. The rows in the two data frames that match on the specified columns are extracted, and joined together. If there is more than one match, all possible matches contribute one row each. For the precise meaning of ‘match’, see match.
Columns to merge on can be specified by name, number or by a logical vector: the name "row.names" or the number 0 specifies the row names. If specified by name it must correspond uniquely to a named column in the input.
If by or both by.x and by.y are of length 0 (a length zero vector or NULL), the result, r, is the Cartesian product of x and y, i.e., dim(r) = c(nrow(x)*nrow(y), ncol(x) + ncol(y)).
If all.x is true, all the non matching cases of x are appended to the result as well, with NA filled in the corresponding columns of y; analogously for all.y.
If the columns in the data frames not used in merging have any common names, these have suffixes (".x" and ".y" by default) appended to try to make the names of the result unique. If this is not possible, an error is thrown.
If a by.x column name matches one of y, and if no.dups is true (as by default), the y version gets suffixed as well, avoiding duplicate column names in the result.
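As a small illustration of the suffix rule described above, the following sketch merges two made-up data frames (the names x, y, id and value are assumptions chosen for the example) that share a non-key column name:

```r
# Two toy data frames sharing the key 'id' and the non-key column 'value'
x <- data.frame(id = 1:3, value = c(10, 20, 30))
y <- data.frame(id = 2:4, value = c(200, 300, 400))

# Default suffixes ".x" and ".y" disambiguate the common non-key column
m <- merge(x, y, by = "id")
names(m)

# Custom suffixes can be supplied as a length-2 character vector
m2 <- merge(x, y, by = "id", suffixes = c(".left", ".right"))
names(m2)
```

Only the rows with id 2 and 3 appear in both results, since all = FALSE by default.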
The complexity of the algorithm used is proportional to the length of the answer.
In SQL database terminology, the default value of all = FALSE gives a natural join, a special case of an inner join. Specifying all.x = TRUE gives a left (outer) join, all.y = TRUE a right (outer) join, and both (all = TRUE) a (full) outer join. DBMSes do not match NULL records, equivalent to incomparables = NA in R.
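The correspondence with SQL joins can be sketched on two small, made-up data frames (x, y and the column names k, a, b are assumptions for illustration only):

```r
x <- data.frame(k = 1:3, a = c("x1", "x2", "x3"))
y <- data.frame(k = 2:4, b = c("y2", "y3", "y4"))

inner <- merge(x, y)                 # natural/inner join: keys present in both
left  <- merge(x, y, all.x = TRUE)   # left outer join: every row of x kept
right <- merge(x, y, all.y = TRUE)   # right outer join: every row of y kept
full  <- merge(x, y, all = TRUE)     # full outer join: every key from either side
cross <- merge(x, y, by = NULL)      # no 'by' columns: Cartesian product

inner$k
left$k
right$k
full$k
dim(cross)
```

In the outer joins the columns coming from the side without a match are filled with NA, for example left$b for k = 1.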
A data frame. The rows are by default lexicographically sorted on the common columns, but for sort = FALSE are in an unspecified order. The columns are the common columns followed by the remaining columns in x and then those in y. If the matching involved row names, an extra character column called Row.names is added at the left, and in all cases the result has ‘automatic’ row names.
This is intended to work with data frames with vector-like columns: some aspects work with data frames containing matrices, but not all.
Currently long vectors are not accepted for inputs, which are thus restricted to less than 2^31 rows. That restriction also applies to the result for 32-bit platforms.
dendrogram for a class which has a merge method.
authors <- data.frame(
    ## I(*) : use character columns of names to get sensible sort order
    surname = I(c("Tukey", "Venables", "Tierney", "Ripley", "McNeil")),
    nationality = c("US", "Australia", "US", "UK", "Australia"),
    deceased = c("yes", rep("no", 4)))
authorN <- within(authors, { name <- surname; rm(surname) })
books <- data.frame(
    name = I(c("Tukey", "Venables", "Tierney",
               "Ripley", "Ripley", "McNeil", "R Core")),
    title = c("Exploratory Data Analysis",
              "Modern Applied Statistics ...",
              "LISP-STAT",
              "Spatial Statistics", "Stochastic Simulation",
              "Interactive Data Analysis",
              "An Introduction to R"),
    other.author = c(NA, "Ripley", NA, NA, NA, NA,
                     "Venables & Smith"))
(m0 <- merge(authorN, books))
(m1 <- merge(authors, books, by.x = "surname", by.y = "name"))
 m2 <- merge(books, authors, by.x = "name", by.y = "surname")
stopifnot(exprs = {
   identical(m0, m2[, names(m0)])
   as.character(m1[, 1]) == as.character(m2[, 1])
   all.equal(m1[, -1], m2[, -1][ names(m1)[-1] ])
   identical(dim(merge(m1, m2, by = NULL)),
             c(nrow(m1)*nrow(m2), ncol(m1)+ncol(m2)))
})
## "R core" is missing from authors and appears only here :
merge(authors, books, by.x = "surname", by.y = "name", all = TRUE)
## example of using 'incomparables'
x <- data.frame(k1 = c(NA,NA,3,4,5), k2 = c(1,NA,NA,4,5), data = 1:5)
y <- data.frame(k1 = c(NA,2,NA,4,5), k2 = c(NA,NA,3,4,5), data = 1:5)
merge(x, y, by = c("k1","k2")) # NA's match
merge(x, y, by = "k1") # NA's match, so 6 rows
merge(x, y, by = "k2", incomparables = NA) # 2 rows
Generate a diagnostic message from its arguments.
message(..., domain = NULL, appendLF = TRUE)
suppressMessages(expr, classes = "message")
packageStartupMessage(..., domain = NULL, appendLF = TRUE)
suppressPackageStartupMessages(expr)
.makeMessage(..., domain = NULL, appendLF = FALSE)
... | zero or more objects which can be coerced to character (and which are pasted together with no separator) or (for |
domain | see |
appendLF | logical: should messages given as a character string have a newline appended? |
expr | expression to evaluate. |
classes | character, indicating which classes of messages should be suppressed. |
message is used for generating ‘simple’ diagnostic messages which are neither warnings nor errors, but nevertheless represented as conditions. Unlike warnings and errors, a final newline is regarded as part of the message, and is optional. The default handler sends the message to the stderr() connection.
If a condition object is supplied to message it should be the only argument, and further arguments will be ignored, with a warning.
While the message is being processed, a muffleMessage restart is available.
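Because messages are conditions, they can be intercepted by a calling handler. The sketch below (the variable names msgs and result are assumptions for illustration) collects message text and then invokes the muffleMessage restart so that nothing reaches stderr():

```r
msgs <- character()

result <- withCallingHandlers(
  {
    message("step 1")
    message("step 2")
    42
  },
  message = function(m) {
    # record the text; the trailing newline is part of the message
    msgs <<- c(msgs, conditionMessage(m))
    # stop the default handler from printing to stderr()
    invokeRestart("muffleMessage")
  }
)

msgs
result

# .makeMessage() pastes its arguments together with no separator
.makeMessage("a", 1, "b")
```

Evaluation of the expression continues after each muffled message, so result still holds the value of the last expression.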
suppressMessages evaluates its expression in a context that ignores all ‘simple’ diagnostic messages.
packageStartupMessage is a variant whose messages can be suppressed separately by suppressPackageStartupMessages. (They are still messages, so can be suppressed by suppressMessages.)
.makeMessage is a utility used by message, warning and stop to generate a text message from the ... arguments by possible translation (see gettext) and concatenation (with no separator).
warning and stop for generating warnings and errors; conditions for condition handling and recovery.
gettext for the mechanisms for the automated translation of text.
message("ABC", "DEF")
suppressMessages(message("ABC"))
testit <- function() {
  message("testing package startup messages")
  packageStartupMessage("initializing ...", appendLF = FALSE)
  Sys.sleep(1)
  packageStartupMessage(" done")
}
testit()
suppressPackageStartupMessages(testit())
suppressMessages(testit())
missing can be used to test whether a value was specified as an argument to a function.
missing(x)
x | a formal argument. |
missing(x) is only reliable if x has not been altered since entering the function: in particular it will always be false after x <- match.arg(x).
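The caveat about altering the argument can be seen in a minimal sketch (the function f is hypothetical):

```r
f <- function(x) {
  before <- missing(x)   # tested before x is touched
  x <- 1                 # altering x inside the function ...
  after <- missing(x)    # ... makes missing(x) false from here on
  c(before = before, after = after)
}

f()    # x not supplied
f(2)   # x supplied
```

Testing missing(x) at the top of the function body, before any assignment to x, avoids the problem.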
The example shows how a plotting function can be written to work with either a pair of vectors giving x and y coordinates of points to be plotted or a single vector giving y values to be plotted against their indices.
Currently missing can only be used in the immediate body of the function that defines the argument, not in the body of a nested function or a local call. This may change in the future.
This is a ‘special’ primitive function: it must not evaluate its argument.
Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988) The New S Language. Wadsworth & Brooks/Cole.
Chambers, J. M. (1998) Programming with Data. A Guide to the S Language. Springer.
substitute for argument expression; NA for missing values in data.
myplot <- function(x, y) {
    if(missing(y)) {
        y <- x
        x <- 1:length(y)
    }
    plot(x, y)
}
Get or set the ‘mode’ (a kind of ‘type’), or the storage mode of an R object.
mode(x)
mode(x) <- value
storage.mode(x)
storage.mode(x) <- value
x | any R object. |
value | a character string giving the desired mode or ‘storage mode’ (type) of the object. |
Both mode and storage.mode return a character string giving the (storage) mode of the object, often the same, both relying on the output of typeof(x); see the example below.
mode(x) <- "newmode" changes the mode of object x to newmode. This is only supported if there is an appropriate as.newmode function, for example "logical", "integer", "double", "complex", "raw", "character", "list", "expression", "name", "symbol" and "function". Attributes are preserved (but see below).
storage.mode(x) <- "newmode" is a more efficient primitive version of mode<-, which works for "newmode" which is one of the internal types (see typeof), but not for "single". Attributes are preserved.
As storage mode "single" is only a pseudo-mode in R, it will not be reported by mode or storage.mode: use attr(object, "Csingle") to examine this. However, mode<- can be used to set the mode to "single", which sets the real mode to "double" and the "Csingle" attribute to TRUE. Setting any other mode will remove this attribute.
Note (in the examples below) that some calls have mode "(" which is S compatible.
Modes have the same set of names as types (see typeof) except that
types "integer" and "double" are returned as "numeric".
types "special", "builtin" and "closure" are returned as "function".
type "symbol" is called mode "name".
type "language" is returned as "(" or "call".
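The mapping from types to modes listed above can be checked directly. This is a small sketch; the list objs and its element names are assumptions chosen for illustration:

```r
objs <- list(
  int     = 1L,
  dbl     = 1.5,
  sym     = quote(a),
  call    = quote(f(x)),
  paren   = quote((x)),
  builtin = sum,
  closure = mean,
  special = `if`
)

types <- vapply(objs, typeof, character(1))
modes <- vapply(objs, mode, character(1))

# one row per object: internal type next to the reported mode
cbind(typeof = types, mode = modes)
```

Both integer and double values report mode "numeric", all three function types report "function", and a call whose function is `(` reports mode "(".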
Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988) The New S Language. Wadsworth & Brooks/Cole.
typeof for the R-internal ‘mode’ or ‘type’, type.convert, attributes.
require(stats)
sapply(options(), mode)
cex3 <- c("NULL", "1", "1:1", "1i", "list(1)", "data.frame(x = 1)",
  "pairlist(pi)", "c", "lm", "formals(lm)[[1]]", "formals(lm)[[2]]",
  "y ~ x","expression((1))[[1]]", "(y ~ x)[[1]]",
  "expression(x <- pi)[[1]][[1]]")
lex3 <- sapply(cex3, function(x) eval(str2lang(x)))
mex3 <- t(sapply(lex3,
                 function(x) c(typeof(x), storage.mode(x), mode(x))))
dimnames(mex3) <- list(cex3, c("typeof(.)","storage.mode(.)","mode(.)"))
mex3
## This also makes a local copy of 'pi':
storage.mode(pi) <- "complex"
storage.mode(pi)
rm(pi)
Transform objects for matching via match(), think “match form” -> "mtfrm". base provides the S3 generic and a default plus "POSIXct" and "POSIXlt" methods.
mtfrm(x)
x | an R object |
Matching via match will use mtfrm to transform internally classed objects (see is.object) to a vector representation appropriate for matching. The default method performs as.character if this preserves the length.
Ideally, methods for mtfrm should ensure that comparisons of same-classed objects via match are consistent with those employed by methods for duplicated/unique and ==/!= (where applicable).
A vector of the same length as x.
NA is a logical constant of length 1 which contains a missing value indicator. NA can be coerced to any other vector type except raw. There are also constants NA_integer_, NA_real_, NA_complex_ and NA_character_ of the other atomic vector types which support missing values: all of these are reserved words in the R language.
The generic function is.na indicates which elements are missing.
The generic function is.na<- sets elements to NA.
The generic function anyNA implements any(is.na(x)) in a possibly faster way (especially for atomic vectors).
NA
is.na(x)
anyNA(x, recursive = FALSE)
## S3 method for class 'data.frame'
is.na(x)
is.na(x) <- value
x | an R object to be tested: the default method for |
recursive | logical: should |
value | a suitable index vector for use with |
The NA of character type is distinct from the string "NA". Programmers who need to specify an explicit missing string should use NA_character_ (rather than "NA") or set elements to NA using is.na<-.
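A short sketch of the distinction between a missing string and the literal string "NA" (the vectors x and y are made up for illustration):

```r
x <- c("a", NA, "NA")
is.na(x)                 # only the second element is missing

typeof(NA)               # the plain constant is logical
typeof(NA_character_)    # the typed constant is character

# setting an element to missing with the replacement function
y <- c("a", "b", "c")
is.na(y) <- 2
y
```

The third element of x is an ordinary two-character string and is not treated as missing.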
is.na and anyNA are generic: you can write methods to handle specific classes of objects, see InternalMethods.
Function is.na<- may provide a safer way to set missingness. It behaves differently for factors, for example.
Numerical computations using NA will normally result in NA: a possible exception is where NaN is also involved, in which case either might result (which may depend on the R platform). However, this is not guaranteed and future CPUs and/or compilers may behave differently. Dynamic binary translation may also impact this behavior (with valgrind, computations using NA may result in NaN even when no NaN is involved).
Logical computations treat NA as a missing TRUE/FALSE value, and so may return TRUE or FALSE if the expression does not depend on the NA operand.
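The rule for logical computations can be illustrated with a few one-line cases; the result is only NA when the answer genuinely depends on the unknown value:

```r
NA & FALSE          # FALSE whatever the missing value is
NA | TRUE           # TRUE whatever the missing value is
NA & TRUE           # NA: depends on the missing value
NA | FALSE          # NA: depends on the missing value

any(c(NA, TRUE))    # TRUE: one TRUE is enough
all(c(NA, FALSE))   # FALSE: one FALSE is enough
all(c(NA, TRUE))    # NA: undetermined
```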
The default method for anyNA handles atomic vectors without a class and NULL. It calls any(is.na(x)) on objects with classes and for recursive = FALSE, on lists and pairlists.
The default method for is.na applied to an atomic vector returns a logical vector of the same length as its argument x, containing TRUE for those elements marked NA or, for numeric or complex vectors, NaN, and FALSE otherwise. (A complex value is regarded as NA if either its real or imaginary part is NA or NaN.) dim, dimnames and names attributes are copied to the result.
The default methods also work for lists and pairlists:
For is.na, elementwise the result is false unless that element is a length-one atomic vector and the single element of that vector is regarded as NA or NaN (note that any is.na method for the class of the element is ignored). anyNA(recursive = FALSE) works the same way as is.na; anyNA(recursive = TRUE) applies anyNA (with method dispatch) to each element.
The data frame method for is.na returns a logical matrix with the same dimensions as the data frame, and with dimnames taken from the row and column names of the data frame.
anyNA(NULL) is false; is.na(NULL) is logical(0) (no longer warning since R version 3.5.0).
Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988) The New S Language. Wadsworth & Brooks/Cole.
Chambers, J. M. (1998) Programming with Data. A Guide to the S Language. Springer.
NaN, is.nan, etc., and the utility function complete.cases.
na.action, na.omit, na.fail on how methods can be tuned to deal with missing values.
is.na(c(1, NA))        #> FALSE  TRUE
is.na(paste(c(1, NA))) #> FALSE FALSE
(xx <- c(0:4))
is.na(xx) <- c(2, 4)
xx                     #> 0 NA  2 NA  4
anyNA(xx) # TRUE
# Some logical operations do not return NA
c(TRUE, FALSE) & NA
c(TRUE, FALSE) | NA
## Measure speed difference in a favourable case:
## the difference depends on the platform, on most ca 3x.
x <- 1:10000; x[5000] <- NaN  # coerces x to be double
if(require("microbenchmark")) { # does not work reliably on all platforms
  print(microbenchmark(any(is.na(x)), anyNA(x)))
} else {
  nSim <- 2^13
  print(rbind(is.na = system.time(replicate(nSim, any(is.na(x)))),
              anyNA = system.time(replicate(nSim, anyNA(x)))))
}
## anyNA() can work recursively with list()s:
LL <- list(1:5, c(NA, 5:8), c("A","NA"), c("a", NA_character_))
L2 <- LL[c(1,3)]
sapply(LL, anyNA); c(anyNA(LL), anyNA(LL, TRUE))
sapply(L2, anyNA); c(anyNA(L2), anyNA(L2, TRUE))
## ... lists, and hence data frames, too:
dN <- dd <- USJudgeRatings; dN[3,6] <- NA
anyNA(dd) # FALSE
anyNA(dN) # TRUE
A ‘name’ (also known as a ‘symbol’) is a way to refer to R objects by name (rather than the value of the object, if any, bound to that name).
as.name and as.symbol are identical: they attempt to coerce the argument to a name.
is.symbol and the identical is.name return TRUE or FALSE depending on whether the argument is a name or not.
as.symbol(x)
is.symbol(x)
as.name(x)
is.name(x)
x | object to be coerced or tested. |
Names are limited to 10,000 bytes (and were limited to 256 bytes in versions of R before 2.13.0).
as.name first coerces its argument internally to a character vector (so methods for as.character are not used). It then takes the first element and provided it is not "", returns a symbol of that name (and if the element is NA_character_, the name is `NA`).
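The coercion rules just described can be sketched as follows (the names s and empty are assumptions for illustration; the empty-string case is wrapped in tryCatch because it signals an error):

```r
# as.name and as.symbol give the same symbol as quote()
identical(as.name("x"), quote(x))
identical(as.symbol("x"), as.name("x"))

# only the first element of a longer vector is used
identical(as.name(c("a", "b")), quote(a))

# names need not be syntactically valid
s <- as.name("my var")
typeof(s)

# a missing string becomes the symbol `NA`
as.character(as.name(NA_character_))

# the empty string cannot be turned into a symbol
empty <- tryCatch(as.name(""), error = function(e) "error")
empty
```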
as.name is implemented as as.vector(x, "symbol"), and hence will dispatch methods for the generic function as.vector.
is.name and is.symbol are primitive functions.
For as.name and as.symbol, an R object of type "symbol" (see typeof).
For is.name and is.symbol, a length-one logical vector with value TRUE or FALSE.
The term ‘symbol’ is from the LISP background of R, whereas ‘name’ has been the standard S term for this.
Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988) The New S Language. Wadsworth & Brooks/Cole.
call, is.language. For the internal object mode, typeof.
plotmath for another use of ‘symbol’.
an <- as.name("arrg")
is.name(an) # TRUE
mode(an)    # name
typeof(an)  # symbol
Functions to get or set the names of an object.
names(x)
names(x) <- value
x | an R object. |
value | a character vector of up to the same length as |
names is a generic accessor function, and names<- is a generic replacement function. The default methods get and set the "names" attribute of a vector (including a list) or pairlist.
For an environment env, names(env) gives the names of the corresponding list, i.e., names(as.list(env, all.names = TRUE)) which are also given by ls(env, all.names = TRUE, sorted = FALSE). If the environment is used as a hash table, names(env) are its “keys”.
If value is shorter than x, it is extended by character NAs to the length of x.
It is possible to update just part of the names attribute via the general rules: see the examples. This works because the expression there is evaluated as z <- "names<-"(z, "[<-"(names(z), 3, "c2")).
The name "" is special: it is used to indicate that there is no name associated with an element of a (atomic or generic) vector. Subscripting by "" will match nothing (not even elements which have no name).
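A brief sketch of the padding rule, the special empty name, and the way a partial update is evaluated (x, y, z, z1 and z2 are made-up objects for illustration):

```r
# a shorter 'value' is padded with character NA
x <- 1:3
names(x) <- c("a", "b")
names(x)

# unnamed elements get the name "", which never matches when subscripting
y <- c(a = 1, 2, c = 3)
names(y)
y[""]

# a partial update is a nested call of the two replacement functions
z <- list(a = 1, b = "c", c = 1:3)
z1 <- z
names(z1)[3] <- "c2"
z2 <- `names<-`(z, `[<-`(names(z), 3, "c2"))
identical(z1, z2)
```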
A name can be character NA, but such a name will never be matched and is likely to lead to confusion.
Both are primitive functions.
For names, NULL or a character vector of the same length as x. (NULL is given if the object has no names, including for objects of types which cannot have names.) For an environment, the length is the number of objects in the environment but the order of the names is arbitrary.
For names<-, the updated object. (Note that the value of names(x) <- value is that of the assignment, value, not the return value from the left-hand side.)
For vectors, the names are one of the attributes with restrictions on the possible values. For pairlists, the names are the tags and converted to and from a character vector.
For a one-dimensional array the names attribute really is dimnames[[1]].
Formally classed aka “S4” objects typically have slotNames() (and no names()).
Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988) The New S Language. Wadsworth & Brooks/Cole.
# print the names attribute of the islands data set
names(islands)
# remove the names attribute
names(islands) <- NULL
islands
rm(islands) # remove the copy made
z <- list(a = 1, b = "c", c = 1:3)
names(z)
# change just the name of the third element.
names(z)[3] <- "c2"
z
z <- 1:3
names(z)
## assign just one name
names(z)[2] <- "b"
z
When used inside a function body, nargs returns the number of arguments supplied to that function, including positional arguments left blank.
nargs()
The count includes empty (missing) arguments, so that foo(x,,z) will be considered to have three arguments (see ‘Examples’). This can occur in rather indirect ways, so for example x[] might dispatch a call to `[.some_method`(x, ) which is considered to have two arguments.
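The indirect case can be sketched with a toy S3 class (the class name "counted" and its subsetting method are hypothetical, defined only for this illustration) whose `[` method simply reports nargs():

```r
# hypothetical method: returns the argument count instead of subsetting
`[.counted` <- function(x, i, j) nargs()

obj <- structure(1:3, class = "counted")

n_one  <- obj[1]     # called as `[.counted`(obj, 1)
n_two  <- obj[1, ]   # the empty second index still counts
n_none <- obj[]      # dispatched with one empty index
c(n_one, n_two, n_none)
```

A method can use this count to tell x[i] apart from x[i, ], even though j is missing in both calls.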
This is a primitive function.
Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988) The New S Language. Wadsworth & Brooks/Cole.
tst <- function(a, b = 3, ...) {nargs()}
tst() # 0
tst(clicketyclack) # 1 (even non-existing)
tst(c1, a2, rr3) # 3
foo <- function(x, y, z, w) {
   cat("call was ", deparse(match.call()), "\n", sep = "")
   nargs()
}
foo()      # 0
foo(, , 3) # 3
foo(z = 3) # 1, even though this is the same call
nargs()  # not really meaningful
nchar takes a character vector as an argument and returns a vector whose elements contain the sizes of the corresponding elements of x. Internally, it is a generic, for which methods can be defined (see InternalMethods).
nzchar is a fast way to find out if elements of a character vector are non-empty strings.
nchar(x, type = "chars", allowNA = FALSE, keepNA = NA)
nzchar(x, keepNA = FALSE)
x | character vector, or a vector to be coerced to a character vector. Giving a factor is an error. |
type | character string: partial matching to one of |
allowNA | logical: should |
keepNA | logical: should |
The ‘size’ of a character string can be measured in one of three ways (corresponding to the type argument):
bytes
The number of bytes needed to store the string (plus in C a final terminator which is not counted).
chars
The number of characters.
width
The number of columns cat will use to print the string in a monospaced font. The same as chars if this cannot be calculated.
These will often be the same, and usually will be in single-byte locales (but note how type determines the default for keepNA). There will be differences between the first two with multibyte character sequences, e.g. in UTF-8 locales.
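The difference between counting characters and counting bytes can be sketched with two UTF-8 strings (the variable names s and comb are assumptions; enc2utf8 is used so that the byte counts do not depend on the session's locale):

```r
# "naive" with U+00EF, which occupies two bytes in UTF-8
s <- enc2utf8("na\u00efve")
nchar(s, type = "chars")
nchar(s, type = "bytes")

# a base letter followed by a combining breve (U+0306)
comb <- enc2utf8("y\u0306")
nchar(comb, type = "chars")
nchar(comb, type = "bytes")
```

The "width" type is left out of this sketch because its result for such characters may depend on the locale and platform.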
The internal equivalent of the default method of as.character is performed on x (so there is no method dispatch). If you want to operate on non-vector objects passing them through deparse first will be required.
For nchar, an integer vector giving the sizes of each element. For missing values (i.e., NA, i.e., NA_character_), nchar() returns NA_integer_ if keepNA is true, and 2, the number of printing characters, if false.
type = "width" gives (an approximation to) the number of columns used in printing each element in a terminal font, taking into account double-width, zero-width and ‘composing’ characters. The approximation is likely to be poor when there are unassigned or non-printing characters.
If allowNA = TRUE and an element is detected as invalid in a multi-byte character set such as UTF-8, its number of characters and the width will be NA. Otherwise the number of characters will be non-negative, so !is.na(nchar(x, "chars", TRUE)) is a test of validity.
A character string marked with "bytes" encoding (see Encoding) has a number of bytes, but neither a known number of characters nor a width, so the latter two types are NA if allowNA = TRUE, otherwise an error.
Names, dims and dimnames are copied from the input.
For nzchar, a logical vector of the same length as x, true if and only if the element has non-zero size; if the element is NA, nzchar() is true when keepNA is false (the default) or NA, and NA otherwise.
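The effect of keepNA on both functions can be seen on a small character vector (x here is a made-up example) when the argument is given explicitly:

```r
x <- c("", "a", NA)

nchar(x, keepNA = TRUE)     # missing string reported as NA
nchar(x, keepNA = FALSE)    # missing string counted as the 2 characters of "NA"

nzchar(x)                   # default keepNA = FALSE: NA counts as non-empty
nzchar(x, keepNA = TRUE)    # missing string reported as NA
```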
This does not by default give the number of characters that will be used to print() the string. Use encodeString to find that.
Where character strings have been marked as UTF-8, the number of characters and widths will be computed in UTF-8, even though printing may use escapes such as ‘<U+2642>’ in a non-UTF-8 locale.
The concept of ‘width’ is a slippery one even in a monospaced font. Some human languages have the concept of combining characters, in which two or more characters are rendered together: an example would be "y\u306", which is two characters of width one: combining characters are given width zero, and there are other zero-width characters such as the zero-width space "\u200b".
Some East Asian languages have ‘wide’ characters, ideographs which are conventionally printed across two columns when mixed with ASCII and other ‘narrow’ characters in those languages. The problem is that whether a computer prints wide characters over two or one columns depends on the font, with it not being uncommon to use two columns in a font intended for East Asian users and a single column in a ‘Western’ font. Unicode has encodings for ‘fullwidth’ versions of ASCII characters and ‘halfwidth’ versions of Katakana (Japanese) and Hangul (Korean) characters. Then there is the ‘East Asian Ambiguous class’ (Greek, Cyrillic, signs, some accented Latin chars, etc), for which the historical practice was to use two columns in East Asia and one elsewhere. The width quoted by nchar for characters in that class (and some others) depends on the locale, being one except in some East Asian locales on some OSes (notably Windows).
Control characters are usually given width zero: this includes CR and LF. Computing the width of a string containing control characters should be avoided (and may depend on the OS and R version).
Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988) The New S Language. Wadsworth & Brooks/Cole.
Unicode Standard Annex #11: East Asian Width. https://www.unicode.org/reports/tr11/
strwidth giving width of strings for plotting; paste, substr, strsplit
x <- c("asfef", "qwerty", "yuiop[", "b", "stuff.blah.yech")
nchar(x)
# 5  6  6  1 15
nchar(deparse(mean))
# 18 17  <-- unless mean differs from base::mean
## NA behaviour as function of keepNA=* :
logi <- setNames(, c(FALSE, NA, TRUE))
sapply(logi, \(k) data.frame(nchar =  nchar (NA, keepNA=k),
                             nzchar = nzchar(NA, keepNA=k)))
x[3] <- NA; x
nchar(x, keepNA= TRUE) #  5  6 NA  1 15
nchar(x, keepNA=FALSE) #  5  6  2  1 15
stopifnot(identical(nchar(x     ), nchar(x, keepNA= TRUE)),
          identical(nchar(x, "w"), nchar(x, keepNA=FALSE)),
          identical(is.na(x), is.na(nchar(x))))
##' nchar() for all three types :
nchars <- function(x, ...)
   vapply(c("chars", "bytes", "width"),
          function(tp) nchar(x, tp, ...), integer(length(x)))
nchars("\u200b") # in R versions (>= 2015-09-xx):
## chars bytes width
##     1     3     0
data.frame(x, nchars(x)) ## all three types : same unless for NA
## force the same by forcing 'keepNA':
(ncT <- nchars(x, keepNA = TRUE))  ## .... NA NA NA ....
(ncF <- nchars(x, keepNA = FALSE)) ## ....  2  2  2 ....
stopifnot(apply(ncT, 1, function(.) length(unique(.))) == 1,
          apply(ncF, 1, function(.) length(unique(.))) == 1)
Return the number of levels which its argument has.
nlevels(x)
x | an object, usually a factor. |
This is usually applied to a factor, but other objects can have levels.
The actual factor levels (if they exist) can be obtained with the levels function.
The length of levels(x), which is zero if x has no levels.
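A minimal sketch of the return value (the factor f is made up for illustration); note that unused levels are still counted:

```r
f <- factor(c("a", "b", "a"), levels = c("a", "b", "c"))

nlevels(f)               # counts all declared levels, used or not
nlevels(droplevels(f))   # after dropping the unused level "c"
nlevels(1:3)             # an object without levels gives zero
```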
nlevels(gl(3, 7)) # = 3
Print character strings without quotes.
noquote(obj, right = FALSE)
## S3 method for class 'noquote'
print(x, quote = FALSE, right = FALSE, ...)
## S3 method for class 'noquote'
c(..., recursive = FALSE)
obj | any R object, typically a vector of |
right | optional |
x | an object of class |
quote ,... | further options passed to next methods, such as |
recursive | for compatibility with the generic |
noquote returns its argument as an object of class "noquote". There is a method for c() and a subscript method ("[.noquote") which ensures that the class is not lost by subsetting. The print method (print.noquote) prints character strings without quotes ("...." is printed as ....).
If right is specified in a call print(x, right=*), it takes precedence over a possible right setting of x, e.g., created by x <- noquote(*, right=TRUE).
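The precedence rule for right can be checked with a small sketch. The matrix m and the out_* names are made up for illustration; capture.output() is used only to compare the printed text.

```r
# A character matrix whose stored setting is right = TRUE
m <- noquote(matrix(c("a", "bbb", "cc", "d"), nrow = 2), right = TRUE)

out_stored <- capture.output(print(m))                 # uses the stored right = TRUE
out_right  <- capture.output(print(m, right = TRUE))   # same alignment, given explicitly
out_left   <- capture.output(print(m, right = FALSE))  # the call-time argument wins

identical(out_stored, out_right)
identical(out_stored, out_left)
```

The first comparison should hold because both print right-aligned; the second should not, since the explicit right = FALSE overrides the setting stored in m.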
These functions exist both as utilities and as an example of using (S3) class and object orientation.
Martin Maechler
letters
nql <- noquote(letters)
nql
nql[1:4] <- "oh"
nql[1:12]

cmp.logical <- function(log.v)
{
    ## Purpose: compact printing of logicals
    log.v <- as.logical(log.v)
    noquote(if(length(log.v) == 0)"()" else c(".","|")[1 + log.v])
}
cmp.logical(stats::runif(20) > 0.8)

chmat <- as.matrix(format(stackloss)) # a "typical" character matrix
## noquote(*, right=TRUE) so it prints exactly like a data frame
chmat <- noquote(chmat, right = TRUE)
chmat
Computes a matrix norm of x using LAPACK. The norm can be the one ("O") norm, the infinity ("I") norm, the Frobenius ("F") norm, the maximum modulus ("M") among elements of a matrix, or the “spectral” or "2"-norm, as determined by the value of type.
norm(x, type = c("O", "I", "F", "M", "2"))
x | numeric matrix; note that packages such as Matrix define more |
type | character string, specifying the type of matrix norm to be computed. A character indicating the type of norm desired.
The default is |
The base method of norm() calls the LAPACK function dlange.
Note that the 1-, Inf- and "M" norms are faster to calculate than the Frobenius one.
Unsuccessful results from the underlying LAPACK code will result in an error giving a positive error code: these can only be interpreted by detailed study of the FORTRAN code.
The matrix norm, a non-negative number. Zero for a 0-extent (empty) matrix.
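As a sketch of what each type computes, the norms can be compared against their textbook definitions written in plain R. The matrix x and the helper variable names below are made up for illustration.

```r
x <- matrix(c(1, -2, 3, 4, -5, 6), nrow = 2)

one_norm <- max(colSums(abs(x)))  # "O": largest absolute column sum
inf_norm <- max(rowSums(abs(x)))  # "I": largest absolute row sum
frob     <- sqrt(sum(x^2))        # "F": square root of the sum of squares
max_mod  <- max(abs(x))           # "M": largest absolute element
spectral <- max(svd(x)$d)         # "2": largest singular value

stopifnot(
  all.equal(norm(x, "O"), one_norm),
  all.equal(norm(x, "I"), inf_norm),
  all.equal(norm(x, "F"), frob),
  all.equal(norm(x, "M"), max_mod),
  all.equal(norm(x, "2"), spectral)
)
```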
Except for type = "2", the LAPACK routine DLANGE.
LAPACK is from https://netlib.org/lapack/.
Anderson, E., et al. (1994). LAPACK User's Guide, 2nd edition, SIAM, Philadelphia.
rcond for the (reciprocal) condition number.
(x1 <- cbind(1, 1:10))
norm(x1)
norm(x1, "I")
norm(x1, "M")
stopifnot(all.equal(norm(x1, "F"),
                    sqrt(sum(x1^2))))

hilbert <- function(n) { i <- 1:n; 1 / outer(i - 1, i, `+`) }
h9 <- hilbert(9)
## all 5 (4 different) types of norm:
(nTyp <- eval(formals(base::norm)$type))
sapply(nTyp, norm, x = h9)
stopifnot(exprs = { # 0-extent matrices:
    sapply(nTyp, norm, x = matrix(, 1,0)) == 0
    sapply(nTyp, norm, x = matrix(, 0,0)) == 0
})
Convert file paths to canonical form for the platform, to display them in a user-understandable form and so that relative and absolute paths can be compared.
normalizePath(path, winslash = "\\", mustWork = NA)
path | character vector of file paths. |
winslash | the separator to be used on Windows – ignored elsewhere. Must be one of |
mustWork | logical: if |
Tilde-expansion (see path.expand) is first done on paths.
Where the Unix-alike platform supports it, this attempts to turn paths into absolute paths in their canonical form (no ‘./’, ‘../’ nor symbolic links). It relies on the POSIX system function realpath: if the platform does not have that (we know of no current example) then the result will be an absolute path but might not be canonical. Even where realpath is used the canonical path need not be unique, for example via hard links or multiple mounts.
On Windows it converts relative paths to absolute paths, resolves symbolic links, converts short names for path elements to long names and ensures the separator is that specified by winslash. It will match each path element case-insensitively or case-sensitively as during the usual name lookup and return the canonical case. It relies on Windows API function GetFinalPathNameByHandle and in case of an error (such as insufficient permissions) it currently falls back to the R 3.6 (and older) implementation, which relies on GetFullPathName and GetLongPathName with limitations described in the Notes section. An attempt is made not to introduce UNC paths in presence of mapped drives or symbolic links: if GetFinalPathNameByHandle returns a UNC path, but GetLongPathName returns a path starting with a drive letter, R falls back to the R 3.6 (and older) implementation. UTF-8-encoded paths not valid in the current locale can be used.
mustWork = FALSE is useful for expressing paths for use in messages.
A character vector.
If an input is not a real path the result is system-dependent (unless mustWork = TRUE, when this should be an error). It will be either the corresponding input element or a transformation of it into an absolute path.
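A minimal sketch of the mustWork behaviour, using a temporary file. The file and directory names (f, indirect, "no-such-dir-for-example") are made up for illustration.

```r
f <- tempfile(fileext = ".txt")
file.create(f)

# The same file reached through a redundant "./" component
indirect <- file.path(dirname(f), ".", basename(f))
same <- identical(normalizePath(indirect, mustWork = TRUE),
                  normalizePath(f, mustWork = TRUE))

# A path that does not exist: silent with mustWork = FALSE,
# an error with mustWork = TRUE
missing_path <- file.path(tempdir(), "no-such-dir-for-example", "x.txt")
quiet  <- normalizePath(missing_path, mustWork = FALSE)
failed <- tryCatch(normalizePath(missing_path, mustWork = TRUE),
                   error = function(e) "error")

unlink(f)
```

Here same should be TRUE because both spellings normalize to one canonical path, while the exact value of quiet is system-dependent, as described above.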
Converting to an absolute file path can fail for a large number of reasons. The most common are
One or more components of the file path does not exist.
A component before the last is not a directory, or there is insufficient permission to read the directory.
For a relative path, the current directory cannot be determined.
A symbolic link points to a non-existent place or links form a loop.
The canonicalized path would exceed the maximum supported length of a file path.
The canonical form of paths may not be what you expect. For example, on macOS absolute paths such as ‘/tmp’ and ‘/var’ are symbolic links. On Linux, a path produced by bash process substitution is a symbolic link (such as ‘/proc/fd/63’) to a pipe and there is no canonical form of such path. In R 3.6 and older on Windows, symlinks will not be resolved and the long names for path elements will be returned with the case in which they are in path, which may not be canonical in case-insensitive folders.
cat(normalizePath(c(R.home(), tempdir())), sep = "\n")
In order to pinpoint missing functionality, the R core team uses these functions for missing R functions and not yet used arguments of existing R functions (which are typically there for compatibility purposes).
You are very welcome to contribute your code ...
.NotYetImplemented()
.NotYetUsed(arg, error = TRUE)
arg | an argument of a function that is not yet used. |
error | a logical. If |
the contrary, Deprecated and Defunct for outdated code.
require(graphics)
barplot(1:5, inside = TRUE) # 'inside' is not yet used
nrow and ncol return the number of rows or columns present in x. NCOL and NROW do the same treating a vector as 1-column matrix, even a 0-length vector, compatibly with as.matrix() or cbind(), see the example.
nrow(x)
ncol(x)
NCOL(x)
NROW(x)
x | a vector, array, data frame, or |
an integer of length 1 or NULL, the latter only for ncol and nrow.
Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988) The New S Language. Wadsworth & Brooks/Cole (ncol and nrow.)
dim which returns all dimensions, and length which gives a number (a ‘count’) also in cases where dim() is NULL, and hence nrow() and ncol() return NULL; array, matrix.
ma <- matrix(1:12, 3, 4)
nrow(ma)   # 3
ncol(ma)   # 4

ncol(array(1:24, dim = 2:4)) # 3, the second dimension
NCOL(1:12) # 1
NROW(1:12) # 12, the length() of the vector

## as.matrix() produces 1-column matrices from 0-length vectors,
## and so does cbind() :
dim(as.matrix(numeric())) # 0 1
dim(    cbind(numeric())) # ditto
NCOL(numeric()) # 1
## However, as.matrix(NULL) fails and cbind(NULL) gives NULL, hence for
## consistency:
NCOL(NULL)      # 0
## (This gave 1 in R < 4.4.0.)
Accessing exported and internal variables, i.e. R objects (including lazy loaded data sets) in a namespace.
pkg::name
pkg:::name
pkg | package name: symbol or literal character string. |
name | variable name: symbol or literal character string. |
For a package pkg, pkg::name returns the value of the exported variable name in namespace pkg, whereas pkg:::name returns the value of the internal variable name. The package namespace will be loaded if it was not loaded before the call, but the package will not be attached to the search path.
Specifying a variable or package that does not exist is an error.
Note that pkg::name does not access the objects in the environment package:pkg (which does not exist until the package's namespace is attached): the latter may contain objects not exported from the namespace. It can access datasets made available by lazy-loading.
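The load-without-attach behaviour can be seen in a short sketch. It uses the tools package, which ships with R; the object name no_such_object_for_example is deliberately non-existent.

```r
before <- search()

# Using '::' loads the 'tools' namespace if needed ...
ext    <- tools::file_ext("report.txt")
loaded <- isNamespaceLoaded("tools")

# ... but leaves the search path as it was
unchanged <- identical(search(), before)

# Asking for a variable that is not exported is an error
err <- tryCatch(tools::no_such_object_for_example,
                error = function(e) "error")
```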
It is typically a design mistake to use ::: in your code since the corresponding object has probably been kept internal for a good reason. Consider contacting the package maintainer if you feel the need to access the object for anything but mere inspection.
get to access an object masked by another of the same name. loadNamespace, asNamespace for more about namespaces.
base::log
base::"+"

## Beware -- use ':::' at your own risk! (see "Details")
stats:::coef.default
Packages can supply functions to be called when loaded, attached, detached or unloaded.
.onLoad(libname, pkgname)
.onAttach(libname, pkgname)
.onUnload(libpath)
.onDetach(libpath)
.Last.lib(libpath)
libname | a character string giving the library directory where the package defining the namespace was found. |
pkgname | a character string giving the name of the package. |
libpath | a character string giving the complete path to the package. |
After loading, loadNamespace looks for a hook function named .onLoad and calls it (with two unnamed arguments) before sealing the namespace and processing exports.
When the package is attached (via library or attachNamespace), the hook function .onAttach is looked for and if found is called (with two unnamed arguments) before the package environment is sealed.
If a function .onDetach is in the namespace or .Last.lib is exported from the package, it will be called (with a single argument) when the package is detached. Beware that it might be called if .onAttach has failed, so it should be written defensively. (It is called within tryCatch, so errors will not stop the package being detached.)
If a namespace is unloaded (via unloadNamespace), a hook function .onUnload is run (with a single argument) before final unloading.
Note that the code in .onLoad and .onUnload should not assume any package except the base package is on the search path. Objects in the current package will be visible (unless this is circumvented), but objects from other packages should be imported or the double colon operator should be used.
.onLoad, .onUnload, .onAttach and .onDetach are looked for as internal objects in the namespace and should not be exported (whereas .Last.lib should be).
Note that packages are not detached nor namespaces unloaded at the end of an R session unless the user arranges to do so (e.g., via .Last).
Anything needed for the functioning of the namespace should be handled at load/unload times by the .onLoad and .onUnload hooks. For example, DLLs can be loaded (unless done by a useDynLib directive in the ‘NAMESPACE’ file) and initialized in .onLoad and unloaded in .onUnload. Use .onAttach only for actions that are needed only when the package becomes visible to the user (for example a start-up message) or need to be run after the package environment has been created.
Loading a namespace should where possible be silent, with startup messages given by .onAttach. These messages (and any essential ones from .onLoad) should use packageStartupMessage so they can be silenced where they would be a distraction.
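The division of labour between the hooks might look like the following sketch for a hypothetical package called mypkg (the package name, the option mypkg.verbose and the message text are all made up for illustration). In a real package these definitions would sit in one of its R source files and would not be exported.

```r
.onLoad <- function(libname, pkgname) {
  # Load-time setup only: set a default option unless the user already has one
  if (is.null(getOption("mypkg.verbose"))) options(mypkg.verbose = FALSE)
  invisible()
}

.onAttach <- function(libname, pkgname) {
  # User-facing message, emitted so that it can be silenced
  packageStartupMessage("Welcome to ", pkgname)
}

.onUnload <- function(libpath) {
  # Undo load-time setup; compiled code, if any, would be unloaded here
  options(mypkg.verbose = NULL)
  invisible()
}

# Calling the hooks by hand, as loadNamespace() and library() would:
.onLoad("some/library", "mypkg")
suppressPackageStartupMessages(.onAttach("some/library", "mypkg"))  # prints nothing
```

Because the message is raised with packageStartupMessage, a caller can suppress it with suppressPackageStartupMessages() without hiding other messages.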
There should be no calls to library nor require in these hooks. The way for a package to load other packages is via the ‘Depends’ field in the ‘DESCRIPTION’ file: this ensures that the dependence is documented and packages are loaded in the correct order. Loading a namespace should not change the search path, so rather than attach a package, dependence of a namespace on another package should be achieved by (selectively) importing from the other package's namespace.
Uses of library with argument help to display basic information about the package should use format on the computed package information object and pass this to packageStartupMessage.
There should be no calls to installed.packages in startup code: it is potentially very slow and may fail in versions of R before 2.14.2 if package installation is going on in parallel. See its help page for alternatives.
Compiled code should be loaded (e.g., via library.dynam) in .onLoad or a useDynLib directive in the ‘NAMESPACE’ file, and not in .onAttach. Similarly, compiled code should not be unloaded (e.g., via library.dynam.unload) in .Last.lib nor .onDetach, only in .onUnload.
setHook shows how users can set hooks on the same events, and lists the sequence of events involving all of the hooks.
reg.finalizer for hooks to be run at the end of a session.
loadNamespace for more about namespaces.
Functions to load and unload name spaces.
attachNamespace(ns, pos = 2L, depends = NULL, exclude, include.only)
loadNamespace(package, lib.loc = NULL,
              keep.source = getOption("keep.source.pkgs"),
              partial = FALSE, versionCheck = NULL,
              keep.parse.data = getOption("keep.parse.data.pkgs"))
requireNamespace(package, ..., quietly = FALSE)
loadedNamespaces()
unloadNamespace(ns)
isNamespaceLoaded(name)
ns | string or name space object. |
pos | integer specifying position to attach. |
depends |
|
package | string naming the package/name space to load. |
lib.loc | character vector specifying library search path (the location of R library trees to search through. |
keep.source | now ignored except during package installation. |
keep.parse.data | ignored except during package installation. |
partial | logical; if true, stop just after loading code. |
versionCheck |
|
quietly | logical: should progress and error messages be suppressed? |
name | string or ‘name’, see |
exclude, include.only | character vectors; see |
... | further arguments to be passed to |
The functions loadNamespace and attachNamespace are usually called implicitly when library is used to load a name space and any imports needed. However it may be useful at times to call these functions directly.
loadNamespace loads the specified name space and registers it in an internal data base. A request to load a name space when one of that name is already loaded has no effect. The arguments have the same meaning as the corresponding arguments to library, whose help page explains the details of how a particular installed package comes to be chosen. After loading, loadNamespace looks for a hook function named .onLoad as an internal variable in the name space (it should not be exported). Partial loading is used to support installation with lazy-loading.
Optionally the package licence is checked during loading: see section ‘Licenses’ in the help for library.
loadNamespace does not attach the name space it loads to the search path. attachNamespace can be used to attach a frame containing the exported values of a name space to the search path (but this is almost always done via library). The hook function .onAttach is run after the name space exports are attached.
requireNamespace is a wrapper for loadNamespace analogous to require that returns a logical value.
loadedNamespaces returns a character vector of the names of the loaded name spaces.
isNamespaceLoaded(pkg) is equivalent to but more efficient than pkg %in% loadedNamespaces().
unloadNamespace can be used to attempt to force a name space to be unloaded. If the name space is attached, it is first detached, thereby running a .onDetach or .Last.lib function in the name space if one is exported. An error is signaled and the name space is not unloaded if the name space is imported by other loaded name spaces. If defined, a hook function .onUnload is run before removing the name space from the internal registry.
See the comments in the help for detach about some issues with unloading and reloading name spaces.
attachNamespace returns invisibly the package environment it adds to the search path.
loadNamespace returns the name space environment, either one already loaded or the one the function causes to be loaded.
requireNamespace returns TRUE if it succeeds or FALSE.
loadedNamespaces returns a character vector.
unloadNamespace returns NULL, invisibly.
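A common use of the logical return value is to guard optional functionality. The sketch below assumes a helper named use_optional (made up for illustration) and a deliberately non-existent package name.

```r
use_optional <- function(pkg) {
  # requireNamespace() loads the name space if it can and reports success
  if (requireNamespace(pkg, quietly = TRUE)) {
    sprintf("using %s", pkg)
  } else {
    sprintf("%s not available; falling back", pkg)
  }
}

have_stats <- use_optional("stats")
no_pkg     <- use_optional("noSuchPackageForExample")

# loadNamespace() on an already loaded name space just returns its environment
ns <- loadNamespace("stats")
```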
As from R 4.1.0 the operation of loadNamespace can be traced, which can help track down the causes of unexpected messages (including which package(s) they come from since loadNamespace is called in many ways including from itself and by :: and can be called by load). Setting the environment variable _R_TRACE_LOADNAMESPACE_ to a numerical value will generate additional messages on progress. Non-zero values, e.g. 1, report which namespace is being loaded and when loading completes: values 2 to 4 report in increasing detail. Negative values are reserved for tracing specific features and their current meanings are documented in source-code comments.
Loading standard packages is never traced.
Luke Tierney and R-core
The ‘Writing R Extensions’ manual, section “Package namespaces”.
getNamespace, asNamespace, topenv, .onLoad (etc); further environment.
(lns <- loadedNamespaces())
statL <- isNamespaceLoaded("stats")
stopifnot( identical(statL, "stats" %in% lns) )

## The string "foo" and the symbol 'foo' can be used interchangeably here:
stopifnot( identical(isNamespaceLoaded(  "foo"   ), FALSE),
           identical(isNamespaceLoaded(quote(foo)), FALSE),
           identical(isNamespaceLoaded(quote(stats)), statL))

hasS <- isNamespaceLoaded("splines") # (to restore if needed)
Sns <- asNamespace("splines") # loads it if not already
stopifnot(   isNamespaceLoaded("splines"))
if (is.null(try(unloadNamespace(Sns)))) # try unloading the NS 'object'
  stopifnot( ! isNamespaceLoaded("splines"))
if (hasS) loadNamespace("splines") # (restoring previous state)
Finding the top level environment from an environment envir and its enclosing environments.
topenv(envir = parent.frame(), matchThisEnv = getOption("topLevelEnvironment"))
envir | environment. |
matchThisEnv | return this environment, if it matches before any other criterion is satisfied. The default, the option ‘topLevelEnvironment’, is set by |
topenv returns the first top level environment found when searching envir and its enclosing environments. If no top level environment is found, .GlobalEnv is returned. An environment is considered top level if it is the internal environment of a namespace, a package environment in the search path, or .GlobalEnv.
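The search through enclosing environments can be sketched with a small hand-built chain. The environments e1, e2 and detached are made up for illustration.

```r
# e2 -> e1 -> namespace:stats
e1 <- new.env(parent = asNamespace("stats"))
e2 <- new.env(parent = e1)

# Walking up from e2, the first top level environment is the stats namespace
found <- topenv(e2)

# matchThisEnv stops the search early if that environment is met first
early <- topenv(e2, matchThisEnv = e1)

# A chain that ends in the empty environment has no top level environment,
# so .GlobalEnv is returned
detached <- new.env(parent = emptyenv())
fallback <- topenv(detached)
```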
environment, notably parent.env() on “enclosing environments”; loadNamespace for more on namespaces.
topenv(.GlobalEnv)
topenv(new.env()) # also global env
topenv(environment(ls))# namespace:base
topenv(environment(lm))# namespace:stats
NULL represents the null object in R: it is a reserved word. NULL is often returned by expressions and functions whose value is undefined.
NULL
as.null(x, ...)
is.null(x)
x | an object to be tested or coerced. |
... | ignored. |
NULL can be indexed (see Extract) in just about any syntactically legal way: apart from NULL[[]] which is an error, the result is always NULL. Objects with value NULL can be changed by replacement operators and will be coerced to the type of the right-hand side.
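A brief sketch of both points: indexing NULL gives NULL, and replacement turns a NULL object into a vector of the type being assigned. The variable names are made up for illustration.

```r
# Indexing NULL in various ways gives NULL
i1 <- NULL[1]
i2 <- NULL[["a"]]
i3 <- NULL$a

# Replacement coerces to the type of the right-hand side
x <- NULL
x[3] <- 5L        # integer vector: NA NA 5

z <- NULL
z[2] <- "b"       # character vector: NA "b"
```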
NULL is also used as the empty pairlist: see the examples. Because pairlists are often promoted to lists, you may encounter NULL being promoted to an empty list.
Objects with value NULL cannot have attributes as there is only one null object: attempts to assign them are either an error (attr) or promote the object to an empty list with attribute(s) (attributes and structure).
as.null ignores its argument and returns NULL.
is.null returns TRUE if its argument's value is NULL and FALSE otherwise.
is.null is a primitive function.
Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988) The New S Language. Wadsworth & Brooks/Cole.
%||%: L %||% R is equivalent to if(!is.null(L)) L else R
is.null(list())     # FALSE (on purpose!)
is.null(pairlist()) # TRUE
is.null(integer(0)) # FALSE
is.null(logical(0)) # FALSE
as.null(list(a = 1, b = "c"))
Creates or coerces objects of type "numeric". is.numeric is a more general test of an object being interpretable as numbers.
numeric(length = 0)
as.numeric(x, ...)
is.numeric(x)
length | a non-negative integer specifying the desired length. Double values will be coerced to integer: supplying an argument of length other than one is an error. |
x | object to be coerced or tested. |
... | further arguments passed to or from other methods. |
numeric is identical to double. It creates a double-precision vector of the specified length with each element equal to 0.
as.numeric is a generic function, but S3 methods must be written for as.double. It is identical to as.double.
is.numeric is an internal generic primitive function: you can write methods to handle specific classes of objects, see InternalMethods. It is not the same as is.double. Factors are handled by the default method, and there are methods for classes "Date", "POSIXt" and "difftime" (all of which return false). Methods for is.numeric should only return true if the base type of the class is double or integer and values can reasonably be regarded as numeric (e.g., arithmetic on them makes sense, and comparison should be done via the base type).
For numeric and as.numeric see double.
The default method for is.numeric returns TRUE if its argument is of mode "numeric" (type "double" or type "integer") and not a factor, and FALSE otherwise. That is, is.integer(x) || is.double(x), or (mode(x) == "numeric") && !is.factor(x).
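The rule can be illustrated with a few values. The list name checks and its element names are made up for illustration.

```r
checks <- list(
  integer  = is.numeric(1L),                           # TRUE: type "integer"
  double   = is.numeric(2.5),                          # TRUE: type "double"
  factor   = is.numeric(factor(1:3)),                  # FALSE: factors are excluded
  date     = is.numeric(Sys.Date()),                   # FALSE: "Date" method
  difftime = is.numeric(as.difftime(5, units = "mins")), # FALSE: "difftime" method
  string   = is.numeric("1"),                          # FALSE: character
  complex  = is.numeric(1i)                            # FALSE: complex
)

# Integers and doubles share the mode "numeric"
modes <- c(mode(1L), mode(2.5))
```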
If x is a factor, as.numeric will return the underlying numeric (integer) representation, which is often meaningless as it may not correspond to the factor levels, see the ‘Warning’ section in factor (and the 2nd example below).
as.numeric and is.numeric are internally S4 generic and so methods can be set for them via setMethod.
To ensure that as.numeric and as.double remain identical, S4 methods can only be set for as.numeric.
It is a historical anomaly that R has two names for its floating-point vectors, double and numeric (and formerly had real).
double is the name of the type. numeric is the name of the mode and also of the implicit class. As an S4 formal class, use "numeric".
The potential confusion is that R has used mode "numeric" to mean ‘double or integer’, which conflicts with the S4 usage. Thus is.numeric tests the mode, not the class, but as.numeric (which is identical to as.double) coerces to the class.
Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988) The New S Language. Wadsworth & Brooks/Cole.
## Conversion does trim whitespace; non-numeric strings give NA + warning
as.numeric(c("-.1"," 2.7 ","B"))

## Numeric values are sometimes accidentally converted to factors.
## Converting them back to numeric is trickier than you'd expect.
f <- factor(5:10)
as.numeric(f) # not what you might expect, probably not what you want
## what you typically meant and want:
as.numeric(as.character(f))
## the same, considerably more efficient (for long vectors):
as.numeric(levels(f))[f]
A simple S3 class for representing numeric versions including package versions, and associated methods.
numeric_version(x, strict = TRUE)
package_version(x, strict = TRUE)
R_system_version(x, strict = TRUE)
getRversion()

as.numeric_version(x)
as.package_version(x)
is.numeric_version(x)
is.package_version(x)
x | for the creators, a character vector with suitable numeric version strings (see ‘Details’); for |
strict | a logical indicating whether invalid numeric versions should result in an error (default) or not. |
Numeric versions are sequences of one or more non-negative integers, usually (e.g., in package ‘DESCRIPTION’ files) represented as character strings with the elements of the sequence concatenated and separated by single ‘.’ or ‘-’ characters. R package versions consist of at least two such integers, an R system version of exactly three (major, minor and patch level).
Functions numeric_version, package_version and R_system_version create a representation from such strings (if suitable) which allows for coercion and testing, combination, comparison, summaries (min/max), inclusion in data frames, subscripting, and printing. The classes can hold a vector of such representations.
getRversion returns the version of the running R as an R system version object.
The [[ operator extracts or replaces a single version. To access the integers of a version use two indices: see the examples.
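A short sketch of why the class is useful: versions compare component by component as integers, not as text. The variable names are made up for illustration.

```r
v <- numeric_version(c("1.9", "1.10"))

# Integer-wise comparison: 1.10 is newer than 1.9
newer <- v[[2]] > v[[1]]

# '.' and '-' are interchangeable separators
same <- package_version("1.2-3") == package_version("1.2.3")

# With strict = FALSE an invalid string gives a missing version, not an error
bad <- numeric_version("1.x", strict = FALSE)
bad_is_na <- is.na(bad)

# Two indices pick out one integer of one version
pv <- package_version("3.4.5")
minor_part <- pv[[1, 2]]
```

Comparing the same strings as plain character values would go character by character, which is why "1.10" and "1.9" generally sort the other way round as text.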
compareVersion; packageVersion for the version of a specific R package. R.version etc for the version of R (and the information underlying getRversion()).
x <- package_version(c("1.2-4", "1.2-3", "2.1"))
x < "1.4-2.3"
c(min(x), max(x))
x[2, 2]
x$major
x$minor

if(getRversion() <= "2.5.0") { ## work around missing feature
  cat("Your version of R, ", as.character(getRversion()),
      ", is outdated.\n",
      "Now trying to work around that ...\n", sep = "")
}

x[[1]]
x[[c(1, 3)]]  # '4' as a numeric version
x[1, 3]       # same
x[[1, 3]]     # 4 as an integer
x[[2, 3]] <- 0    # zero the patchlevel
x[[c(2, 3)]] <- 0 # same
x
x[[3]] <- "2.2.3"
x
x <- c(x, package_version("0.0"))
is.na(x)[4] <- TRUE
stopifnot(identical(is.na(x), c(rep(FALSE,3), TRUE)),
          anyNA(x))
How R parses numeric constants.
R parses numeric constants in its input in a very similar way to C99 floating-point constants.
Inf and NaN are numeric constants (with typeof(.) "double"). In text input (e.g., in scan and as.double), these are recognized ignoring case as is infinity as an alternative to Inf. NA_real_ and NA_integer_ are constants of types "double" and "integer" representing missing values. All other numeric constants start with a digit or period and are either a decimal or hexadecimal constant optionally followed by L.
Hexadecimal constants start with 0x or 0X followed by a non-empty sequence from 0-9 a-f A-F . which is interpreted as a hexadecimal number, optionally followed by a binary exponent. A binary exponent consists of a P or p followed by an optional plus or minus sign followed by a non-empty sequence of (decimal) digits, and indicates multiplication by a power of two. Thus 0x123p456 is 291 × 2^456.
Decimal constants consist of a non-empty sequence of digits possibly containing a period (the decimal point), optionally followed by a decimal exponent. A decimal exponent consists of an E or e followed by an optional plus or minus sign followed by a non-empty sequence of digits, and indicates multiplication by a power of ten.
Values which are too large or too small to be representable will overflow to Inf or underflow to 0.0.
A numeric constant immediately followed by i is regarded as an imaginary complex number.
A numeric constant immediately followed by L is regarded as an integer number when possible (and with a warning if it contains a ".").
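The rules for hexadecimal constants, exponents and the L suffix can be tried out in a small sketch. The variable names are made up for illustration.

```r
hex_plain  <- 0x10    # hexadecimal 10 is sixteen, stored as a double
hex_scaled <- 0x10p2  # binary exponent: 16 * 2^2
int_const  <- 1e3L    # decimal exponent with the L suffix gives an integer
hex_int    <- 0xFL    # hexadecimal F with the L suffix gives an integer

# A leading minus is not part of the constant: the parser produces a call
# to the unary minus operator applied to the constant 1
neg <- quote(-1)
```

Inspecting neg with is.call() or as.list() shows the operator and the constant as separate elements.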
Only the ASCII digits 0–9 are recognized as digits, even in languages which have other representations of digits. The ‘decimal separator’ is always a period and never a comma.
Note that a leading plus or minus is not regarded by the parser as part of a numeric constant but as a unary operator applied to the constant.
When a string is parsed to input a numeric constant, the number may or may not be representable exactly in the C double type used. If not one of the nearest representable numbers will be returned.
R's own C code is used to convert constants to binary numbers, so the effect can be expected to be the same on all platforms implementing full IEC 60559 arithmetic (the most likely area of difference being the handling of numbers less than .Machine$double.xmin). The same code is used by scan.
Syntax. For complex numbers, see complex. Quotes for the parsing of character constants, Reserved for the “reserved words” in R.
## You can create numbers using fixed or scientific formatting.
2.1
2.1e10
-2.1E-10

## The resulting objects have class numeric and type double.
class(2.1)
typeof(2.1)

## This holds even if what you typed looked like an integer.
class(2)
typeof(2)

## If you actually wanted integers, use an "L" suffix.
class(2L)
typeof(2L)

## These are equal but not identical
2 == 2L
identical(2, 2L)

## You can write numbers between 0 and 1 without a leading "0"
## (but typically this makes code harder to read)
.1234

sqrt(1i) # remember elementary math?
utils::str(0xA0)
identical(1L, as.integer(1))

## You can combine the "0x" prefix with the "L" suffix :
identical(0xFL, as.integer(15))
Integers which are displayed in octal (base-8 number system) format, with as many digits as are needed to display the largest, using leading zeroes as necessary.
Arithmetic works as for integers, and non-integer valued mathematical functions typically work by truncating the result to integer.
as.octmode(x)
## S3 method for class 'octmode'
as.character(x, keepStr = FALSE, ...)
## S3 method for class 'octmode'
format(x, width = NULL, ...)
## S3 method for class 'octmode'
print(x, ...)
x | an object, for the methods inheriting from class |
keepStr | a |
width |
|
... | further arguments passed to or from other methods. |
"octmode" objects are integer vectors with that class attribute, used primarily to ensure that they are printed in octal notation, specifically for Unix-like file permissions such as 755. Subsetting ([) works too, as do arithmetic or other mathematical operations, albeit truncated to integer.
as.character(x) drops all attributes (unless keepStr = TRUE, in which case it keeps dim, dimnames and names for back compatibility) and converts each entry individually, hence with no leading zeroes, whereas in format(), when width = NULL (the default), the output is padded with leading zeroes to the smallest width needed for all the non-missing elements.
as.octmode can convert integers (of type "integer" or "double") and character vectors whose elements contain only digits 0-7 (or are NA) to class "octmode".
There is a ! method and methods for | and &: these recycle their arguments to the length of the longer and then apply the operators bitwise to each element.
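As a brief sketch of the behaviour described above (an added illustration; the permission values are chosen arbitrarily), the bitwise methods can be used to set or mask permission bits, and as.character() and format() differ in their padding:

```r
mode <- as.octmode("644")        # rw-r--r--
as.integer(mode)                 # the underlying integer, 420 in decimal
format(mode | "111")             # set the execute bits: "755"
format(mode & "700")             # keep only the owner bits: "600"

## as.character() converts element-wise; format() pads to a common width.
om <- as.octmode(c(1, 8, 64))
as.character(om)                 # "1" "10" "100"
format(om)                       # "001" "010" "100"
```

Character operands such as "111" are converted by as.octmode before the bitwise operation is applied.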
These are auxiliary functions for file.info.
hexmode, sprintf for other options in converting integers to octal, strtoi to convert octal strings to integers.
(on <- as.octmode(c(16, 32, 127:129))) # "020" "040" "177" "200" "201"
unclass(on[3:4]) # subsetting

## manipulate file modes
fmode <- as.octmode("170")
(fmode | "644") & "755"
(umask <- Sys.umask()) # depends on platform
c(fmode, "666", "755") & !umask

om <- as.octmode(1:12)
om # print()s via format()
stopifnot(nchar(format(om)) == 2)
om[1:7] # *no* leading zeroes!
stopifnot(format(om[1:7]) == as.character(1:7))
om2 <- as.octmode(c(1:10, 60:70))
om2 # prints via format() -> with 3 octals
stopifnot(nchar(format(om2)) == 3)
as.character(om2) # strings of length 1, 2, 3

## Integer arithmetic (remaining "octmode"):
om^2
om * 64
-om
(fac <- factorial(om)) # 1!, 2!, 3!, 4! .. in octal
as.integer(fac) # indeed the same as factorial(1:12)
on.exit records the expression given as its argument as needing to be executed when the current function exits (either naturally or as the result of an error). This is useful for resetting graphical parameters or performing other cleanup actions.
If no expression is provided, i.e., the call is on.exit(), then the current on.exit code is removed.
on.exit(expr = NULL, add = FALSE, after = TRUE)
expr | an expression to be executed. |
add | if TRUE, add |
after | if |
The expr argument passed to on.exit is recorded without evaluation. If it is not subsequently removed/replaced by another on.exit call in the same function, it is evaluated in the evaluation frame of the function when it exits (including during standard error handling). Thus any functions or variables in the expression will be looked for in the function and its environment at the time of exit: to capture the current value in expr use substitute or similar.
If multiple on.exit expressions are set using add = TRUE then all expressions will be run even if one signals an error.
This is a ‘special’ primitive function: it only evaluates the arguments add and after.
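The interaction of add and after, and the fact that exit expressions still run on error, can be sketched as follows. This is an added illustration: exit_log and f are hypothetical names used only to record the order of execution.

```r
exit_log <- character()   # hypothetical log, used only for this illustration

f <- function() {
  on.exit(exit_log <<- c(exit_log, "first"))
  on.exit(exit_log <<- c(exit_log, "second"), add = TRUE)
  ## add = TRUE with after = FALSE registers the expression before the existing ones
  on.exit(exit_log <<- c(exit_log, "zeroth"), add = TRUE, after = FALSE)
  stop("something went wrong")   # the exit expressions still run on error
}

try(f(), silent = TRUE)
exit_log   # "zeroth" "first" "second"
```

Registering with after = FALSE gives a last-in, first-out order, which is convenient when resources must be released in the reverse order of their acquisition.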
Invisible NULL.
Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988) The New S Language. Wadsworth & Brooks/Cole.
sys.on.exit which returns the expression stored for use by on.exit() in the function in which sys.on.exit() is evaluated.
require(graphics)
opar <- par(mai = c(1,1,1,1))
on.exit(par(opar))
Operators for the "Date" class.
There is an Ops method and specific methods for + and - for the Date class.
date + x
x + date
date - x
date1 lop date2
date | an object of class |
date1 ,date2 | date objects or character vectors. (Charactervectors are converted by |
x | a numeric vector (in days)or an object of class |
lop | one of |
x does not need to be integer if specified as a numeric vector, but see the comments about fractional days in the help for Dates.
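A short sketch of these operators with fixed dates (added for illustration; the dates are arbitrary):

```r
d <- as.Date("2024-01-31")
d + 1                                   # "2024-02-01"
7 + d                                   # numeric + Date works as well
d - as.difftime(2, units = "weeks")     # a "difftime" can be subtracted: "2024-01-17"
d - as.Date("2024-01-01")               # Date - Date gives a "difftime" of 30 days
d > "2024-01-15"                        # the character string is converted to a date first
```

Using fixed dates rather than Sys.Date() keeps the results reproducible.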
(z <- Sys.Date())
z + 10
z < c("2009-06-01", "2010-01-01", "2015-01-01")
Allow the user to set and examine a variety of global options which affect the way in which R computes and displays its results.
options(...)
getOption(x, default = NULL)
.Options
... | any options can be defined, using Options can also be passed by giving a single unnamed argument whichis a named list. |
x | a character string holding an option name. |
default | if the specified option is not set in the options list, this value is returned. This facilitates retrieving an option and checking whether it is set and setting it separately if not. |
Invoking options() with no arguments returns a list with the current values of the options. Note that not all options listed below are set initially. To access the value of a single option, one should use, e.g., getOption("width") rather than options("width") which is a list of length one.
For getOption, the current value set for option x, or default (which defaults to NULL) if the option is unset.
For options(), a list of all set options sorted by name. For options(name), a list of length one containing the set value, or NULL if it is unset. For uses setting one or more options, a list with the previous values of the options changed (returned invisibly).
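The return values described above support a common save-and-restore idiom. The following sketch is an added illustration; my.pkg.verbose is a made-up option name used only to show the default argument of getOption.

```r
## Setting options returns the previous values (invisibly), which makes restoring easy.
old <- options(digits = 4)
utils::str(old)                  # a named list holding the *previous* value of digits
getOption("digits")              # 4
options(old)                     # restore the previous setting

## options(name) returns a list of length one; getOption(name) returns the value itself.
is.list(options("width"))        # TRUE
getOption("width")

## 'my.pkg.verbose' is a hypothetical option that is not set, so the default is returned.
getOption("my.pkg.verbose", default = FALSE)
```

Inside a function the restore step is usually registered with on.exit(options(old)) so that it also happens if the function fails.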
add.smooth: typically logical, defaulting to TRUE. Could also be set to an integer for specifying how many (simulated) smooths should be added. This is currently only used by plot.lm.
askYesNo: a function (typically set by a front-end) to ask the user binary response questions in a consistent way, or a vector of strings used by askYesNo to use as default responses for such questions.
browserNLdisabled: logical: whether newline is disabled as a synonym for "n" in the browser.
catch.script.errors: logical, false by default. If true and interactive() is false, e.g., when an R script is run by R CMD BATCH <script>.R, then errors do not stop execution of the script. Rather evaluation continues after printing the error (and jumping to top level). Also, traceback() would provide info about the error. Do use with care!
checkPackageLicense: logical, not set by default. If true, loadNamespace asks a user to accept any non-standard license at first load of the package.
check.bounds: logical, defaulting to FALSE. If true, a warning is produced whenever a vector (atomic or list) is extended, by something like x <- 1:3; x[5] <- 6.
CBoundsCheck: logical, controlling whether .C and .Fortran make copies to check for array over-runs on the atomic vector arguments. Initially set from value of the environment variable R_C_BOUNDS_CHECK (set to yes to enable).
conflicts.policy: character string or list controlling handling of conflicts found in calls to library or require. See library for details.
continue: a non-empty string setting the prompt used for lines which continue over one line.
defaultPackages: the packages that are attached by default when R starts up. Initially set from the value of the environment variable R_DEFAULT_PACKAGES, or if that is unset to c("datasets", "utils", "grDevices", "graphics", "stats", "methods"). (Set R_DEFAULT_PACKAGES to NULL or a comma-separated list of package names.) This option can be changed in a ‘.Rprofile’ file, but it will not work to exclude the methods package at this stage, as the value is screened for methods before that file is read.
deparse.cutoff: integer value controlling the printing of language constructs which are deparsed. Default 60.
deparse.max.lines: controls the number of lines used when deparsing in browser, upon entry to a function whose debugging flag is set, and if option traceback.max.lines is unset, of traceback(). Initially unset, and only used if set to a positive integer.
traceback.max.lines: controls the number of lines used when deparsing in traceback, if set. Initially unset, and only used if set to a positive integer.
digits: controls the number of significant (see signif) digits to print when printing numeric values. It is a suggestion only. Valid values are 1...22 with default 7. See the note in print.default about values greater than 15.
digits.secs: controls the maximum number of digits to print when formatting time values in seconds. Valid values are 0...6 with default 0 (equivalent to NULL which is used when it is undefined as on vanilla startup). See strftime.
download.file.extra: Extra command-line argument(s) for non-default methods: see download.file.
download.file.method: Method to be used for download.file. Currently download methods "internal", "wininet" (Windows only), "libcurl", "wget" and "curl" are available. If not set, method = "auto" is chosen: see download.file.
echo: logical. Only used in non-interactive mode, when it controls whether input is echoed. Command-line option --no-echo sets this to FALSE, but otherwise it starts the session as TRUE.
encoding: The name of an encoding, default "native.enc". See connections.
error: either a function or an expression governing the handling of non-catastrophic errors such as those generated by stop as well as by signals and internally detected errors. If the option is a function, a call to that function, with no arguments, is generated as the expression. By default the option is not set: see stop for the behaviour in that case. The functions dump.frames and recover provide alternatives that allow post-mortem debugging. Note that these need to be specified as e.g. options(error = utils::recover) in startup files such as ‘.Rprofile’.
expressions: sets a limit on the number of nested expressions that will be evaluated. Valid values are 25...500000 with default 5000. If you increase it, you may also want to start R with a larger protection stack; see --max-ppsize in Memory. Note too that you may cause a segfault from overflow of the C stack, and on OSes where it is possible you may want to increase that. Once the limit is reached an error is thrown. The current number under evaluation can be found by calling Cstack_info.
interrupt: a function taking no arguments to be called on a user interrupt if the interrupt condition is not otherwise handled.
keep.parse.data: When internally storing source code (keep.source is TRUE), also store parse data. Parse data can then be retrieved with getParseData() and used e.g. for spell checking of string constants or syntax highlighting. The value has effect only when internally storing source code (see keep.source). The default is TRUE.
keep.parse.data.pkgs: As for keep.parse.data, used only when packages are installed. Defaults to FALSE unless the environment variable R_KEEP_PKG_PARSE_DATA is set to yes. The space overhead of parse data can be substantial even after compression and it causes performance overhead when loading packages.
keep.source: When TRUE, the source code for functions (newly defined or loaded) is stored internally allowing comments to be kept in the right places. Retrieve the source by printing or using deparse(fn, control = "useSource"). The default is interactive(), i.e., TRUE for interactive use.
keep.source.pkgs: As for keep.source, used only when packages are installed. Defaults to FALSE unless the environment variable R_KEEP_PKG_SOURCE is set to yes.
matprod: a string selecting the implementation of the matrix products %*%, crossprod, and tcrossprod for double and complex vectors:
"internal": uses an unoptimized 3-loop algorithm which correctly propagates NaN and Inf values and is consistent in precision with other summation algorithms inside R like sum or colSums (which now means that it uses a long double accumulator for summation if available and enabled, see capabilities).
"default": uses BLAS to speed up computation, but to ensure correct propagation of NaN and Inf values it uses an unoptimized 3-loop algorithm for inputs that may contain NaN or Inf values. When deemed beneficial for performance, "default" may call the 3-loop algorithm unconditionally, i.e., without checking the input for NaN/Inf values. The 3-loop algorithm uses (only) a double accumulator for summation, which is consistent with the reference BLAS implementation.
"blas": uses BLAS unconditionally without any checks and should be used with extreme caution. BLAS libraries do not propagate NaN or Inf values correctly and for inputs with NaN/Inf values the results may be undefined.
"default.simd": is experimental and will likely be removed in future versions of R. It provides the same behavior as "default", but the check whether the input contains NaN/Inf values is faster on some SIMD hardware. On older systems it will run correctly, but may be much slower than "default".
max.print: integer, defaulting to 99999. print or show methods can make use of this option, to limit the amount of information that is printed, to something in the order of (and typically slightly less than) max.print entries.
OutDec: character string containing a single character. The preferred character to be used as the decimal point in output conversions, that is in printing, plotting, format, formatC and as.character but not when deparsing nor by sprintf (which is sometimes used prior to printing).
pager: the command used for displaying text files by file.show, details depending on the platform:
On a Unix-alike: defaults to ‘R_HOME/bin/pager’, which is a shell script running the command-line specified by the environment variable PAGER whose default is set at configuration, usually to less.
On Windows: defaults to "internal", which uses a pager similar to the GUI console. Another possibility is "console" to use the console itself.
Can be a character string or an R function, in which case it needs to accept the arguments (files, header, title, delete.file) corresponding to the first four arguments of file.show.
papersize: the default paper format used by postscript; set by environment variable R_PAPERSIZE when R is started: if that is unset or invalid it defaults platform dependently
On a Unix-alike: to a value derived from the locale category LC_PAPER, or if that is unavailable to a default set when R was built.
On Windows: to "a4", or "letter" in US and Canadian locales.
PCRE_limit_recursion: Logical: should grep(perl = TRUE) and similar limit the maximal recursion allowed when matching? Only relevant for PCRE1 and PCRE2 <= 10.23.
PCRE can be built not to use a recursion stack (see pcre_config), but it uses recursion by default with a recursion limit of 10000000 which potentially needs a very large C stack: see the discussion at https://www.pcre.org/original/doc/html/pcrestack.html. If true, the limit is reduced using R's estimate of the C stack size available (if known), otherwise 10000. If NA, the limit is imposed only if any input string has 1000 or more bytes. The limit has no effect when PCRE's Just-in-Time compiler is used.
PCRE_study: Logical or integer: should grep(perl = TRUE) and similar ‘study’ the patterns? Either logical or a numerical threshold for the minimum number of strings to be matched for the pattern to be studied (the default is 10). Missing values and negative numbers are treated as false. This option is ignored with PCRE2 (PCRE version >= 10.00) which does not have a separate study phase and patterns are automatically optimized when possible.
PCRE_use_JIT: Logical: should grep(perl = TRUE), strsplit(perl = TRUE) and similar make use of PCRE's Just-In-Time compiler if available? (This applies only to studied patterns with PCRE1.) Default: true. Missing values are treated as false.
pdfviewer: default PDF viewer. The default is set from the environment variable R_PDFVIEWER, the default value of which
On a Unix-alike: is set when R is configured, and
On Windows: is the full path to open.exe, a utility supplied with R.
printcmd: the command used by postscript for printing; set by environment variable R_PRINTCMD when R is started. This should be a command that expects either input to be piped to ‘stdin’ or to be given a single filename argument. Usually set to "lpr" on a Unix-alike.
prompt: a non-empty string to be used for R's prompt; should usually end in a blank (" ").
rl_word_breaks: (Unix only:) Used for the readline-based terminal interface. Default value " \t\n\"\\'`><=%;,|&{()}".
This is the set of characters used to break the input line into tokens for object- and file-name completion. Those who do not use spaces around operators may prefer " \t\n\"\\'`><=+-*%;,|&{()}".
save.defaults, save.image.defaults: see save.
scipen: integer. A penalty to be applied when deciding to print numeric values in fixed or exponential notation. Positive values bias towards fixed and negative towards scientific notation: fixed notation will be preferred unless it is more than scipen digits wider.
setWidthOnResize: a logical. If set and TRUE, R run in a terminal using a recent readline library will set the width option when the terminal is resized.
showWarnCalls, showErrorCalls: a logical. Should warning and error messages produced by the default handlers show a summary of the call stack? By default error call stacks are shown in non-interactive sessions. When warning or stop are called on a condition object the call stacks are only shown if the value returned by conditionCall for the condition object is not NULL.
showNCalls: integer. Controls how long the sequence of calls must be (in bytes) before ellipses are used. Defaults to 50 and should be at least 30 and no more than 500.
show.error.locations: Should source locations of errors be printed? If set to TRUE or "top", the source location that is highest on the stack (the most recent call) will be printed. "bottom" will print the location of the earliest call found on the stack.
Integer values can select other entries. The value 0 corresponds to "top" and positive values count down the stack from there. The value -1 corresponds to "bottom" and negative values count up from there.
show.error.messages: a logical. Should error messages be printed? Intended for use with try or a user-installed error handler.
texi2dvi: used by functions texi2dvi and texi2pdf in package tools. Set at startup from the environment variable R_TEXI2DVICMD, which defaults first to the value of environment variable TEXI2DVI, and then to a value set when R was installed (the full path to a texi2dvi script if one was found). If necessary, that environment variable can be set to "emulation".
timeout: positive integer. The timeout for some Internet operations, in seconds. Default 60 (seconds) but can be set from environment variable R_DEFAULT_INTERNET_TIMEOUT. (Invalid values of the option or the variable are silently ignored: non-integer numeric values will be truncated.) See download.file and connections.
topLevelEnvironment: see topenv and sys.source.
url.method: character string: the default method for url. Normally unset, which is equivalent to "default", which is "internal" except on Windows.
useFancyQuotes: controls the use of directional quotes in sQuote, dQuote and in rendering text help (see Rd2txt in package tools). Can be TRUE, FALSE, "TeX" or "UTF-8".
verbose: logical. Should R report extra information on progress? Set to TRUE by the command-line option --verbose.
warn: integer value to set the handling of warning messages by the default warning handler. If warn is negative all warnings are ignored. If warn is zero (the default) warnings are stored until the top-level function returns. If 10 or fewer warnings were signalled they will be printed, otherwise a message saying how many were signalled. An object called last.warning is created and can be printed through the function warnings. If warn is one, warnings are printed as they occur. If warn is two (or larger, coercible to integer), all warnings are turned into errors. While sometimes useful for debugging, turning warnings into errors may trigger bugs and resource leaks that would not have been triggered otherwise.
warnPartialMatchArgs: logical. If true, warns if partial matching is used in argument matching.
warnPartialMatchAttr: logical. If true, warns if partial matching is used in extracting attributes via attr.
warnPartialMatchDollar: logical. If true, warns if partial matching is used for extraction by $.
warning.expression: an R code expression to be called if a warning is generated, replacing the standard message. If non-null it is called irrespective of the value of option warn.
warning.length: sets the truncation limit in bytes for error and warning messages. A non-negative integer, with allowed values 100...8170, default 1000.
nwarnings: the limit for the number of warnings kept when warn = 0, default 50. This will discard messages if called whilst they are being collected. If you increase this limit, be aware that the current implementation pre-allocates the equivalent of a named list for them, i.e., do not increase it to more than say a million.
width: controls the maximum number of columns on a line used in printing vectors, matrices and arrays, and when filling by cat.
Columns are normally the same as characters except in East Asian languages.
You may want to change this if you re-size the window that R is running in. Valid values are 10...10000 with default normally 80. (The limits on valid values are in file ‘Print.h’ and can be changed by re-compiling R.) Some R consoles automatically change the value when they are resized.
See the examples on Startup for one way to set this automatically from the terminal width when R is started.
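Several of the printing-related options described above (digits, scipen, OutDec and width) can be tried out together. The following is a sketch added for illustration; it first sets the factory-fresh values so that the results do not depend on the current session, and restores the previous settings at the end.

```r
## Start from the factory-fresh values, remembering the current ones.
old <- options(digits = 7, scipen = 0, OutDec = ".", width = 80)

options(digits = 4)
format(pi)                       # "3.142"

format(1e5)                      # "1e+05": fixed notation would be one character wider
options(scipen = 3)
format(1e5)                      # "100000": the penalty now favours fixed notation

options(OutDec = ",")
format(1.5)                      # "1,5"
sprintf("%.1f", 1.5)             # "1.5": sprintf() does not use OutDec

options(width = 40)
out <- utils::capture.output(print(1:30))
max(nchar(out))                  # no printed line is wider than 40 characters

options(old)                     # restore the previous settings
```

The same effects can often be obtained for a single call without touching the global options, e.g. via the digits argument of print() or format().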
The ‘factory-fresh’ default settings of some of these options are
add.smooth | TRUE |
check.bounds | FALSE |
continue | "+ " |
digits | 7 |
echo | TRUE |
encoding | "native.enc" |
error | NULL |
expressions | 5000 |
keep.source | interactive() |
keep.source.pkgs | FALSE |
max.print | 99999 |
OutDec | "." |
prompt | "> " |
scipen | 0 |
show.error.messages | TRUE |
timeout | 60 |
verbose | FALSE |
warn | 0 |
warning.length | 1000 |
width | 80 |
Others are set from environment variables or are platform-dependent.
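Among the options described above, warn and warnPartialMatchDollar change how warnings are raised. The following sketch (added for illustration; lst is a hypothetical list) captures the resulting conditions with tryCatch so that the effect is visible without stopping a script.

```r
## warn = 2 turns warnings into errors, so the error handler is triggered.
old <- options(warn = 2)
res <- tryCatch(as.integer("x"), error = function(e) conditionMessage(e))
res                              # an error message that mentions the original warning
options(old)

## warnPartialMatchDollar = TRUE makes partial matching by $ visible.
old <- options(warnPartialMatchDollar = TRUE)
lst <- list(value = 1)           # hypothetical list, for illustration only
msg <- tryCatch(lst$val, warning = function(w) conditionMessage(w))
msg                              # a warning message about the partial match
options(old)
```

Without warn = 2 the first call would return NA with a warning, and without warnPartialMatchDollar the second would silently return 1.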
These will be set when package grDevices (or its namespace) is loaded if not already set.
bitmapType: (Unix only, incl. macOS) character. The default type for the bitmap devices such as png. Defaults to "cairo" on systems where that is available, or to "quartz" on macOS where that is available.
device: a character string giving the name of a function, or the function object itself, which when called creates a new graphics device of the default type for that session. The value of this option defaults to the normal screen device (e.g., X11, windows or quartz) for an interactive session, and pdf in batch use or if a screen is not available. If set to the name of a device, the device is looked for first from the global environment (that is down the usual search path) and then in the grDevices namespace.
The default values in interactive and non-interactive sessions are configurable via environment variables R_INTERACTIVE_DEVICE and R_DEFAULT_DEVICE respectively.
The search logic for ‘the normal screen device’ is that this is windows on Windows, and quartz if available on macOS (running at the console, and compiled into the build). Otherwise X11 is used if environment variable DISPLAY is set.
device.ask.default: logical. The default for devAskNewPage("ask") when a device is opened.
locatorBell: logical. Should selection in locator and identify be confirmed by a bell? Default TRUE. Honoured at least on X11 and windows devices.
windowsTimeout: (Windows-only) integer vector of length 2 representing two times in milliseconds. These control the double-buffering of windows devices when that is enabled: the first is the delay after plotting finishes (default 100) and the second is the update interval during continuous plotting (default 500). The values at the time the device is opened are used.
max.contour.segments: positive integer, defaulting to 25000 if not set. A limit on the number of segments in a single contour line in contour or contourLines.
These will be set when package stats (or its namespace) is loaded if not already set.
contrasts: the default contrasts used in model fitting such as with aov or lm. A character vector of length two, the first giving the function to be used with unordered factors and the second the function to be used with ordered factors. By default the elements are named c("unordered", "ordered"), but the names are unused.
na.action: the name of a function for treating missing values (NA's) for certain situations, see na.action and na.pass.
show.coef.Pvalues: logical, affecting whether P values are printed in summary tables of coefficients. See printCoefmat.
show.nls.convergence: logical, should nls convergence messages be printed for successful fits?
show.signif.stars: logical, should stars be printed on summary tables of coefficients? See printCoefmat.
ts.eps: the relative tolerance for certain time series (ts) computations. Default 1e-05.
ts.S.compat: logical. Used to select S compatibility for plotting time-series spectra. See the description of argument log in plot.spec.
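The effect of the contrasts option described above can be seen directly on a small factor. This is a sketch added for illustration; the factor f is hypothetical.

```r
library(stats)                   # ensures the stats options have been set

f <- factor(c("a", "b", "c"))    # hypothetical unordered factor

old <- options(contrasts = c("contr.sum", "contr.poly"))
cm <- contrasts(f)               # sum-to-zero coding for unordered factors
cm                               # the row for "c" is -1 -1
options(old)                     # restore the previous contrasts
```

Model-fitting functions such as lm pick up the option in the same way when a factor has no contrasts set explicitly.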
These will be set (apart from Ncpus) when package utils (or its namespace) is loaded if not already set.
BioC_mirror: The URL of a Bioconductor mirror for use by setRepositories, e.g. the default ‘"https://bioconductor.org"’ or the European mirror ‘"https://bioconductor.statistik.tu-dortmund.de"’. Can be set by chooseBioCmirror.
browser: The HTML browser to be used by browseURL. This sets the default browser on UNIX or a non-default browser on Windows. Alternatively, an R function that is called with a URL as its argument. See browseURL for further details.
ccaddress: default Cc: address used by create.post (and hence bug.report and help.request). Can be FALSE or "".
citation.bibtex.max: default 1; the maximal number of bibentries (bibentry) in a citation for which the BibTeX version is printed in addition to the text one.
de.cellwidth: integer: the cell widths (number of characters) to be used in the data editor dataentry. If this is unset (the default), 0, negative or NA, variable cell widths are used.
demo.ask: default for the ask argument of demo.
editor: a non-empty character string or an R function that sets the default text editor, e.g., for edit and file.edit. Set from the environment variable EDITOR on UNIX, or if unset VISUAL or vi. As a string it should specify the name of or path to an external command.
example.ask: default for the ask argument of example.
help.ports: optional integer vector for setting ports of the internal HTTP server, see startDynamicHelp.
help.search.types: default types of documentation to be searched by help.search and ??.
help.try.all.packages: default for an argument of help.
help_type: default for an argument of help, used also as the help type by ?.
help.htmlmath: default for the texmath argument of Rd2HTML, controlling how LaTeX-like mathematical equations are displayed in R help pages (if enabled). Useful values are "katex" (equivalent to NULL, the default) and "mathjax"; for all other values basic substitutions are used.
help.htmltoc: default for the toc argument of Rd2HTML, controlling whether a table of contents should be included.
HTTPUserAgent: string used as the ‘user agent’ in HTTP(S) requests by download.file, url and curlGetHeaders, or NULL when requests will be made without a user agent header. The default is "R (version platform arch os)" except when ‘libcurl’ is used when it is "libcurl/version" for the ‘libcurl’ version in use.
install.lock: logical: should per-directory package locking be used by install.packages? Most useful for binary installs on macOS and Windows, but can be used in a startup file for source installs via R CMD INSTALL. For binary installs, can also be the character string "pkglock".
internet.info: The minimum level of information to be printed on URL downloads etc, using the "internal" and "libcurl" methods. Default is 2, for failure causes. Set to 1 or 0 to get more detailed information (for the "internal" method 0 provides more information than 1).
install.packages.check.source: Used by install.packages (and indirectly update.packages) on platforms which support binary packages. Possible values "yes" and "no", with unset being equivalent to "yes".
install.packages.compile.from.source: Used by install.packages(type = "both") (and indirectly update.packages) on platforms which support binary packages. Possible values are "never", "interactive" (which means ask in interactive use and "never" in batch use) and "always". The default is taken from environment variable R_COMPILE_AND_INSTALL_PACKAGES, with default "interactive" if unset. However, install.packages uses "never" unless a make program is found, consulting the environment variable MAKE.
mailer: default emailing method used by create.post and hence bug.report and help.request.
menu.graphics: Logical: should graphical menus be used if available? Defaults to TRUE. Currently applies to select.list, chooseCRANmirror, setRepositories and to select from multiple (text) help files in help.
Ncpus: an integer, used in install.packages as default for the number of CPUs to use in a potentially parallel installation, as Ncpus = getOption("Ncpus", 1L), i.e., when unset is equivalent to a setting of 1.
pkgType: The default type of packages to be downloaded and installed; see install.packages. Possible values are platform dependently
On Windows: "win.binary", "source" and "both" (the default).
On a Unix-alike: "source" (the default except under a CRAN macOS build), "mac.binary" and "both" (the default for CRAN macOS builds). ("mac.binary.el-capitan", "mac.binary.mavericks", "mac.binary.leopard" and "mac.binary.universal" are no longer in use.)
Value "binary" is a synonym for the native binary type (if there is one); "both" is used by install.packages to choose between source and binary installs.
repos: character vector of repository URLs for use by available.packages and related functions. Initially set from entries marked as default in the ‘repositories’ file, whose path is configurable via environment variable R_REPOSITORIES (set this to NULL to skip initialization at startup). The ‘factory-fresh’ setting from the file in R.home("etc") is c(CRAN="@CRAN@"), a value that causes some utilities to prompt for a CRAN mirror. To avoid this do set the CRAN mirror, by something like
local({
  r <- getOption("repos")
  r["CRAN"] <- "https://my.local.cran"
  options(repos = r)
})
in your ‘.Rprofile’, or use a personal ‘repositories’ file.
Note that you can add more repositories (Bioconductor, R-Forge, RForge.net, ...) for the current session using setRepositories.
str: a list of options controlling the default str display. Defaults to strOptions().
str.dendrogram.last: see str.dendrogram.
SweaveHooks, SweaveSyntax: see Sweave.
unzip: a character string used by unzip: the path of the external program unzip or "internal". Defaults (platform dependently)
On a Unix-alike: to the value of R_UNZIPCMD, which is set in ‘etc/Renviron’ to the path of the unzip command found during configuration and otherwise to "".
On Windows: to "internal" when the internal unzip code is used.
These will be set when packageparallel (or its namespace)is loaded if not already set.
mc.cores
:an integer giving the maximum allowed numberofadditionalR processes allowed to be run in parallel tothe currentR process. Defaults to the setting of theenvironment variableMC_CORES if set. Most applicationswhich use this assume a limit of2
if it is unset.
dvipscmd
:character string giving a command to be used inthe (deprecated) off-line printing of help pagesviaPostScript. Defaults to"dvips"
.
warn.FPU
:logical, by default undefined. If true,awarning is produced wheneverdyn.load repairs thecontrol word damaged by a buggy DLL.
For compatibility with S there is a visible object .Options whose value is a pairlist containing the current options() (in no particular order). Assigning to it will make a local copy and not change the original. (Using it however is faster than calling options()).
An option set to NULL is indistinguishable from a non existing option.
Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988) The New S Language. Wadsworth & Brooks/Cole.
op <- options(); utils::str(op) # op is a named list

getOption("width") == options()$width # the latter needs more memory
options(digits = 15)
pi

# set the editor, and save previous value
old.o <- options(editor = "nedit")
old.o

options(check.bounds = TRUE, warn = 1)
x <- NULL; x[4] <- "yes" # gives a warning

options(digits = 5)
print(1e5)
options(scipen = 3); print(1e5)

options(op)     # reset (all) initial options
options("digits")

## Not run: 
## set contrast handling to be like S
options(contrasts = c("contr.helmert", "contr.poly"))
## End(Not run)

## Not run: 
## on error, terminate the R session with error status 66
options(error = quote(q("no", status = 66, runLast = FALSE)))
stop("test it")
## End(Not run)

## Not run: 
## Set error actions for debugging:
## enter browser on error, see ?recover:
options(error = recover)
## allows to call debugger() afterwards, see ?debugger:
options(error = dump.frames)
## A possible setting for non-interactive sessions
options(error = quote({dump.frames(to.file = TRUE); q()}))
## End(Not run)

# Compare the two ways to get an option and use it
# accounting for the possibility it might not be set.
if(as.logical(getOption("performCleanup", TRUE)))
   cat("do cleanup\n")

## Not run: 
# a clumsier way of expressing the above w/o the default.
tmp <- getOption("performCleanup")
if(is.null(tmp))
  tmp <- TRUE
if(tmp)
   cat("do cleanup\n")
## End(Not run)
order returns a permutation which rearranges its first argument into ascending or descending order, breaking ties by further arguments. sort.list does the same, using only one argument.
See the examples for how to use these functions to sort data frames, etc.
order(..., na.last = TRUE, decreasing = FALSE,
      method = c("auto", "shell", "radix"))
sort.list(x, partial = NULL, na.last = TRUE, decreasing = FALSE,
          method = c("auto", "shell", "quick", "radix"))
... | a sequence of numeric, complex, character or logical vectors, all of the same length, or a classed R object. |
x | an atomic vector for |
partial | vector of indices for partial sorting. (Non- |
decreasing | logical. Should the sort order be increasing or decreasing? For the |
na.last | for controlling the treatment of |
method | the method to be used: partial matches are allowed. The default ( |
In the case of ties in the first vector, values in the second are used to break the ties. If the values are still tied, values in the later arguments are used to break the tie (see the first example). The sort used is stable (except for method = "quick"), so any unresolved ties will be left in their original ordering.
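A minimal sketch of tie-breaking and stability, using two small made-up vectors:

```r
x <- c(2, 1, 2, 1)
y <- c("b", "b", "a", "a")

# Ties in x are left in their original order when no further key is given
# (stable sort): the two 1s keep positions 2, 4 and the two 2s keep 1, 3.
o_x <- order(x)

# With y as a second key, ties in x are broken by y instead.
o_xy <- order(x, y)
```

Here o_x is 2 4 1 3, while o_xy is 4 2 3 1 because "a" sorts before "b" within each group of tied x values.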
Complex values are sorted first by the real part, then the imaginary part.
Except for method "radix", the sort order for character vectors will depend on the collating sequence of the locale in use: see Comparison.
The "shell" method is generally the safest bet and is the default method, except for short factors, numeric vectors, integer vectors and logical vectors, where "radix" is assumed. Method "radix" stably sorts logical, numeric and character vectors in linear time. It outperforms the other methods, although there are drawbacks, especially for character vectors (see sort). Method "quick" for sort.list is only supported for numeric x with na.last = NA, is not stable, and is slower than "radix".
partial = NULL is supported for compatibility with other implementations of S, but no other values are accepted and ordering is always complete.
For a classed R object, the sort order is taken from xtfrm: as its help page notes, this can be slow unless a suitable method has been defined or is.numeric(x) is true. For factors, this sorts on the internal codes, which is particularly appropriate for ordered factors.
An integer vector unless any of the inputs has 2^31 or more elements, when it is a double vector.
In programmatic use it is unsafe to name the ... arguments, as the names could match current or future control arguments such as decreasing. A sometimes-encountered unsafe practice is to call do.call('order', df_obj) where df_obj might be a data frame: copy df_obj and remove any names, for example using unname.
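A sketch of the safer pattern, assuming a made-up data frame whose first column happens to be called decreasing (so that passing the columns by name would clash with the control argument of order):

```r
# Hypothetical data frame: the column name 'decreasing' is chosen on purpose.
df_obj <- data.frame(decreasing = c(2, 1, 3), value = c("b", "a", "c"))

# Drop the names before handing the columns to order(), so that every
# column is matched positionally as a sort key.
keys <- unname(as.list(df_obj))
o <- do.call(order, keys)
sorted <- df_obj[o, ]
```

Converting to a plain list first is only one way to strip the names; the point is that order() never sees an argument called decreasing coming from the data.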
sort.list can get called by mistake as a method for sort with a list argument: it gives a suitable error message for list x.
There is a historical difference in behaviour for na.last = NA: sort.list removes the NAs and then computes the order amongst the remaining elements: order computes the order amongst the non-NA elements of the original vector. Thus
x[order(x, na.last = NA)]
zz <- x[!is.na(x)]; zz[sort.list(x, na.last = NA)]
both sort the non-NA values of x.
Prior to R 3.3.0 method = "radix" was only supported for integers of range less than 100,000.
Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988) The New S Language. Wadsworth & Brooks/Cole.
Knuth, D. E. (1998) The Art of Computer Programming, Volume 3: Sorting and Searching. 2nd ed. Addison-Wesley.
require(stats)

(ii <- order(x <- c(1,1,3:1,1:4,3), y <- c(9,9:1), z <- c(2,1:9)))
## 6  5  2  1  7  4 10  8  3  9
rbind(x, y, z)[,ii] # shows the reordering (ties via 2nd & 3rd arg)

## Suppose we wanted descending order on y.
## A simple solution for numeric 'y' is
rbind(x, y, z)[, order(x, -y, z)]
## More generally we can make use of xtfrm
cy <- as.character(y)
rbind(x, y, z)[, order(x, -xtfrm(cy), z)]
## The radix sort supports multiple 'decreasing' values:
rbind(x, y, z)[, order(x, cy, z, decreasing = c(FALSE, TRUE, FALSE),
                       method="radix")]

## Sorting data frames:
dd <- transform(data.frame(x, y, z),
                z = factor(z, labels = LETTERS[9:1]))
## Either as above {for factor 'z' : using internal coding}:
dd[ order(x, -y, z), ]
## or along 1st column, ties along 2nd, ... *arbitrary* no.{columns}:
dd[ do.call(order, dd), ]

set.seed(1)  # reproducible example:
d4 <- data.frame(x = round(   rnorm(100)), y = round(10*runif(100)),
                 z = round( 8*rnorm(100)), u = round(50*runif(100)))
(d4s <- d4[ do.call(order, d4), ])
(i <- which(diff(d4s[, 3]) == 0))
#   in 2 places, needed 3 cols to break ties:
d4s[ rbind(i, i+1), ]

## rearrange matched vectors so that the first is in ascending order
x <- c(5:1, 6:8, 12:9)
y <- (x - 5)^2
o <- order(x)
rbind(x[o], y[o])

## tests of na.last
a <- c(4, 3, 2, NA, 1)
b <- c(4, NA, 2, 7, 1)
z <- cbind(a, b)
(o <- order(a, b)); z[o, ]
(o <- order(a, b, na.last = FALSE)); z[o, ]
(o <- order(a, b, na.last = NA)); z[o, ]

##  speed examples on an average laptop for long vectors:
##  factor/small-valued integers:
x <- factor(sample(letters, 1e7, replace = TRUE))
system.time(o <- sort.list(x, method = "quick", na.last = NA)) # 0.1 sec
stopifnot(!is.unsorted(x[o]))
system.time(o <- sort.list(x, method = "radix")) # 0.05 sec, 2X faster
stopifnot(!is.unsorted(x[o]))
##  large-valued integers:
xx <- sample(1:200000, 1e7, replace = TRUE)
system.time(o <- sort.list(xx, method = "quick", na.last = NA)) # 0.3 sec
system.time(o <- sort.list(xx, method = "radix")) # 0.2 sec
##  character vectors:
xx <- sample(state.name, 1e6, replace = TRUE)
system.time(o <- sort.list(xx, method = "shell")) # 2 sec
system.time(o <- sort.list(xx, method = "radix")) # 0.007 sec, 300X faster
##  double vectors:
xx <- rnorm(1e6)
system.time(o <- sort.list(xx, method = "shell")) # 0.4 sec
system.time(o <- sort.list(xx, method = "quick", na.last = NA)) # 0.1 sec
system.time(o <- sort.list(xx, method = "radix")) # 0.05 sec, 2X faster
The outer product of the arrays X and Y is the array A with dimension c(dim(X), dim(Y)) where element A[c(arrayindex.x, arrayindex.y)] = FUN(X[arrayindex.x], Y[arrayindex.y], ...).
outer(X, Y, FUN = "*", ...)
X %o% Y
X, Y | first and second arguments for function |
FUN | a function to use on the outer products, found via |
... | optional arguments to be passed to |
X and Y must be suitable arguments for FUN. Each will be extended by rep to length the products of the lengths of X and Y before FUN is called.
FUN is called with these two extended vectors as arguments (plus any arguments in ...). It must be a vectorized function (or the name of one) expecting at least two arguments and returning a value with the same length as the first (and the second).
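The vectorization requirement can be sketched as follows. The function clipped_diff is invented for this illustration; it is deliberately not vectorized, so it is wrapped with Vectorize before being passed to outer. The two rep() calls only mimic, in effect, how the arguments are extended.

```r
# clipped_diff() is a made-up scalar function: 'if' inspects a single value,
# so calling it directly on the extended vectors would not work as intended.
clipped_diff <- function(a, b) if (a > b) a - b else 0

res <- outer(1:3, 1:2, FUN = Vectorize(clipped_diff))

# In effect, FUN receives two vectors of length 3 * 2 = 6 in a single call:
X_ext <- rep(1:3, times = 2)   # 1 2 3 1 2 3
Y_ext <- rep(1:2, each  = 3)   # 1 1 1 2 2 2
```

res is a 3 x 2 matrix whose element [i, j] equals clipped_diff(i, j).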
Where they exist, the [dim]names of X and Y will be copied to the answer, and a dimension assigned which is the concatenation of the dimensions of X and Y (or lengths if dimensions do not exist).
FUN = "*" is handled as a special case via as.vector(X) %*% t(as.vector(Y)), and is intended only for numeric vectors and arrays.
%o% is a binary operator providing a wrapper for outer(x, y, "*").
Jonathan Rougier
Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988) The New S Language. Wadsworth & Brooks/Cole.
%*% for usual (inner) matrix vector multiplication; kronecker which is based on outer; Vectorize for vectorizing a non-vectorized function.
x <- 1:9; names(x) <- x
# Multiplication & Power Tables
x %o% x
y <- 2:8; names(y) <- paste(y,":", sep = "")
outer(y, x, `^`)

outer(month.abb, 1999:2003, FUN = paste)

## three way multiplication table:
x %o% x %o% y[1:3]
Open parenthesis, (, and open brace, {, are .Primitive functions in R.
Effectively, ( is semantically equivalent to the identity function(x) x, whereas { is slightly more interesting, see examples.
( ... )
{ ... }
For (, the result of evaluating the argument. This has visibility set, so will auto-print if used at top-level.
For {, the result of the last expression evaluated. This has the visibility of the last evaluation.
Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988) The New S Language. Wadsworth & Brooks/Cole.
if, return, etc for other objects used in the R language itself.
Syntax for operator precedence.
f <- get("(")
e <- expression(3 + 2 * 4)
identical(f(e), e)

do <- get("{")
do(x <- 3, y <- 2*x-3, 6-x-y); x; y

## note the differences
(2+3)
{2+3; 4+5}
(invisible(2+3))
{invisible(2+3)}
parse() returns the parsed but unevaluated expressions in an expression, a “list” of calls.
str2expression(s) and str2lang(s) return special versions of parse(text=s, keep.source=FALSE) and can therefore be regarded as transforming character strings s to expressions, calls, etc.
parse(file = "", n = NULL, text = NULL, prompt = "?",
      keep.source = getOption("keep.source"), srcfile,
      encoding = "unknown")

str2lang(s)
str2expression(text)
file | a connection, or a character string giving the name of a file or a URL to read the expressions from. If |
n | integer (or coerced to integer). The maximum number of expressions to parse. If |
text | character vector. The text to parse. Elements are treated as if they were lines of a file. Other R objects will be coerced to character if possible. |
prompt | the prompt to print when parsing from the keyboard. |
keep.source | a logical value; if |
srcfile |
|
encoding | encoding to be assumed for input strings. If thevalue is |
s | a |
parse(....): If text has length greater than zero (after coercion) it is used in preference to file.
All versions of R accept input from a connection with end of line marked by LF (as used on Unix), CRLF (as used on DOS/Windows) or CR (as used on classic Mac OS). The final line can be incomplete, that is missing the final EOL marker.
When input is taken from the console, n = NULL is equivalent to n = 1, and n < 0 will read until an EOF character is read. (The EOF character is Ctrl-Z for the Windows front-ends.) The line-length limit is 4095 bytes when reading from the console (which may impose a lower limit: see ‘An Introduction to R’).
The default for srcfile is set as follows. If keep.source is not TRUE, srcfile defaults to a character string, either "<text>" or one derived from file. When keep.source is TRUE, if text is used, srcfile will be set to a srcfilecopy containing the text. If a character string is used for file, a srcfile object referring to that file will be used.
When srcfile is a character string, error messages will include the name, but source reference information will not be added to the result. When srcfile is a srcfile object, source reference information will be retained.
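The effect of keep.source on the default srcfile can be observed through the attributes of the result. The sketch below parses the same one-line text twice; the variable names are chosen for illustration only.

```r
ex_src   <- parse(text = "x <- 1 + 2", keep.source = TRUE)
ex_nosrc <- parse(text = "x <- 1 + 2", keep.source = FALSE)

# With keep.source = TRUE and 'text' supplied, srcfile is a srcfilecopy,
# so source references are attached to the result.
has_srcref    <- !is.null(attr(ex_src, "srcref"))
is_srcfilecpy <- inherits(attr(ex_src, "srcfile"), "srcfilecopy")

# With keep.source = FALSE, srcfile is just a character string and
# no source references are added.
no_srcref <- is.null(attr(ex_nosrc, "srcref"))
```

Both results still contain the same single call; only the attached source information differs.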
str2expression(s): for a character vector s, str2expression(s) corresponds to parse(text = s, keep.source=FALSE), which is always of type (typeof) and class expression.
str2lang(s): for a character string s, str2lang(s) corresponds to parse(text = s, keep.source=FALSE)[[1]] (plus a check that both s and the parse(*) result are of length one) which is typically a call but may also be a symbol aka name, NULL or an atomic constant such as 2, 1L, or TRUE. Put differently, the value of str2lang(.) is a call or one of its parts, in short “a call or simpler”.
Currently, encoding is not handled in str2lang() and str2expression().
parse() and str2expression() return an object of type "expression", for parse() with up to n elements if specified as a non-negative integer.
str2lang(s), s a string, returns “a call or simpler”, see the ‘Details:’ section.
When srcfile is non-NULL, a "srcref" attribute will be attached to the result containing a list of srcref records corresponding to each element, a "srcfile" attribute will be attached containing a copy of srcfile, and a "wholeSrcref" attribute will be attached containing a srcref record corresponding to all of the parsed text. Detailed parse information will be stored in the "srcfile" attribute, to be retrieved by getParseData.
A syntax error (including an incomplete expression) will throw an error.
Character strings in the result will have a declared encoding if encoding is "latin1" or "UTF-8", or if text is supplied with every element of known encoding in a Latin-1 or UTF-8 locale.
When a syntax error occurs during parsing, parse signals an error. The partial parse data will be stored in the srcfile argument if it is a srcfile object and the text argument was used to supply the text. In other cases it will be lost when the error is triggered.
The partial parse data can be retrieved using getParseData applied to the srcfile object. Because parsing was incomplete, it will typically include references to "parent" entries that are not present.
Using parse(text = *, ..) or its simplified and hence more efficient versions str2lang() or str2expression() is at least an order of magnitude less efficient than call(..) or as.call().
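As a sketch of what this means in practice, the same call can be built from text or constructed directly. The variable fn_name stands for a function name that is only known at run time and is an assumption of this example.

```r
fn_name <- "sum"   # hypothetical: a function name supplied at run time

# Going through the parser:
via_text <- str2lang(paste0(fn_name, "(1, 2, 3)"))

# Constructing the call directly, without any parsing:
via_call    <- call(fn_name, 1, 2, 3)
via_as_call <- as.call(list(as.name(fn_name), 1, 2, 3))

result <- eval(via_call)
```

All three objects represent the call sum(1, 2, 3); the direct constructions avoid pasting and re-parsing text.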
Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988) The New S Language. Wadsworth & Brooks/Cole.
Murdoch, D. (2010). “Source References”. The R Journal, 2(2), 16–19. doi:10.32614/RJ-2010-010.
The source reference information can be used for debugging (see e.g. setBreakpoint) and profiling (see Rprof). It can be examined by getSrcref and related functions. More detailed information is available through getParseData.
fil <- tempfile(fileext = ".Rdmped")
cat("x <- c(1, 4)\n  x ^ 3 -10 ; outer(1:7, 5:9)\n", file = fil)
# parse 3 statements from our temp file
parse(file = fil, n = 3)
unlink(fil)

## str2lang(<string>)  || str2expression(<character>) :
stopifnot(exprs = {
  identical( str2lang("x[3] <- 1+4"), quote(x[3] <- 1+4))
  identical( str2lang("log(y)"),      quote(log(y)) )
  identical( str2lang("abc"   ),      quote(abc) -> qa)
  is.symbol(qa) & !is.call(qa)           # a symbol/name, not a call
  identical( str2lang("1.375" ), 1.375)  # just a number, not a call
  identical( str2expression(c("# a comment", "", "42")), expression(42) )
})

# A partial parse with a syntax error
txt <- "
x <- 1
an error
"
sf <- srcfile("txt")
tryCatch(parse(text = txt, srcfile = sf), error = function(e) "Syntax error.")
getParseData(sf)
Concatenate vectors after converting to character. Concatenation happens in two basically different ways, determined by collapse being a string or not.
paste (..., sep = " ", collapse = NULL, recycle0 = FALSE)
paste0(...,            collapse = NULL, recycle0 = FALSE)
... | one or more R objects, to be converted to character vectors. |
sep | a character string to separate the terms. Not |
collapse | an optional character string to separate the results. Not |
recycle0 |
|
paste converts its arguments (via as.character) to character strings, and concatenates them (separating them by the string given by sep).
If the arguments are vectors, they are concatenated term-by-term to give a character vector result. Vector arguments are recycled as needed. Zero-length arguments are recycled as "" unless recycle0 is TRUE and collapse is NULL.
Note that paste() coerces NA_character_, the character missing value, to "NA" which may seem undesirable, e.g., when pasting two character vectors, or very desirable, e.g. in paste("the value of p is ", p).
paste0(..., collapse) is equivalent to paste(..., sep = "", collapse), slightly more efficiently.
If a value is specified for collapse, the values in the result are then concatenated into a single string, with the elements being separated by the value of collapse.
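The two phases described above, term-by-term pasting with sep and then collapsing with collapse, can be seen side by side in a short sketch with made-up inputs:

```r
# Phase 1: term-by-term concatenation, one result per element.
parts <- paste(c("a", "b"), 1:2, sep = "-")

# Phase 2: the elements of that result are joined into a single string.
joined <- paste(c("a", "b"), 1:2, sep = "-", collapse = "+")

# The character missing value is pasted as the string "NA".
with_na <- paste("x", NA_character_)

# paste0() is the sep = "" special case.
no_sep <- paste0("id", 1:3)
```

Here parts is a vector of length two, while joined is the single string "a-1+b-2".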
A character vector of the concatenated values. This will be of length zero if all the objects are, unless collapse is non-NULL, in which case it is "" (a single empty string).
If any input into an element of the result is in UTF-8 (and none are declared with encoding "bytes", see Encoding), that element will be in UTF-8, otherwise in the current encoding in which case the encoding of the element is declared if the current locale is either Latin-1 or UTF-8, at least one of the corresponding inputs (including separators) had a declared encoding and all inputs were either ASCII or declared.
If an input into an element is declared with encoding "bytes", no translation will be done of any of the elements and the resulting element will have encoding "bytes". If collapse is non-NULL, this applies also to the second, collapsing, phase, but some translation may have been done in pasting objects together in the first phase.
Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988) The New S Language. Wadsworth & Brooks/Cole.
toString typically calls paste(*, collapse=", "). String manipulation with as.character, substr, nchar, strsplit; further, cat which concatenates and writes to a file, and sprintf for C like string construction.
‘plotmath’ for the use of paste in plot annotation.
## When passing a single vector, paste0 and paste work like as.character.
paste0(1:12)
paste(1:12)        # same
as.character(1:12) # same

## If you pass several vectors to paste0, they are concatenated in a
## vectorized way.
(nth <- paste0(1:12, c("st", "nd", "rd", rep("th", 9))))

## paste works the same, but separates each input with a space.
## Notice that the recycling rules make every input as long as the longest input.
paste(month.abb, "is the", nth, "month of the year.")
paste(month.abb, letters)

## You can change the separator by passing a sep argument
## which can be multiple characters.
paste(month.abb, "is the", nth, "month of the year.", sep = "_*_")

## To collapse the output into a single string, pass a collapse argument.
paste0(nth, collapse = ", ")

## For inputs of length 1, use the sep argument rather than collapse
paste("1st", "2nd", "3rd", collapse = ", ") # probably not what you wanted
paste("1st", "2nd", "3rd", sep = ", ")

## You can combine the sep and collapse arguments together.
paste(month.abb, nth, sep = ": ", collapse = "; ")

## Using paste() in combination with strwrap() can be useful
## for dealing with long strings.
(title <- paste(strwrap(
    "Stopping distance of cars (ft) vs. speed (mph) from Ezekiel (1930)",
    width = 30), collapse = "\n"))

plot(dist ~ speed, cars, main = title)

## zero length arguments recycled as `""` -- NB: `{}` <==> character(0) here
paste({}, 1:2)

## 'recycle0 = TRUE' allows standard vectorized behaviour, i.e., zero-length
##                   recycling resulting in zero-length result character(0):
valid <- FALSE
val <- pi
paste("The value is", val[valid], "-- not so good!") # -> ".. value is  -- not .."
paste("The value is", val[valid], "-- good: empty!", recycle0=TRUE) # -> character(0)

## When  'collapse = <string>',  result is (length 1) string in all cases
paste("foo", {}, "bar", collapse = "|")           # |-->  "foo  bar"
paste("foo", {}, collapse = "|", recycle0 = TRUE) # |-->  ""
## If all arguments are empty (and collapse a string), "" results always
paste(    collapse = "|")
paste(    collapse = "|", recycle0 = TRUE)
paste({}, collapse = "|")
paste({}, collapse = "|", recycle0 = TRUE)
Expand a path name, for example by replacing a leading tilde by the user's home directory (if defined on that platform).
path.expand(path)
path | character vector containing one or more path names. |
On most builds of R a leading ~user will expand to the home directory of user.
On Unix-alikes, there are possibly different concepts of ‘home directory’: that usually used is the setting of the environment variable HOME. The ‘path names’ need not exist nor be valid path names but they do need to be representable in the session encoding.
On Windows, the definition of the ‘home’ directory is in the ‘rw-FAQ’ Q2.14: it is taken from the R_USER environment variable when path.expand is first called in a session. The ‘path names’ need not exist nor be valid path names.
A character vector of possibly expanded path names: where the home directory is unknown or none is specified the path is returned unchanged.
If the expansion would exceed the maximum path length the result may be truncated or the path may be returned unchanged.
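A short sketch of the vectorized behaviour, using made-up path names that need not exist. The expansion of the first element depends on the platform and the user's home directory, so no particular result is shown for it.

```r
# Hypothetical path names; none of them has to exist on disk.
paths <- c("~/foo", "relative/file.txt", "/tmp/bar")
expanded <- path.expand(paths)

# Only a leading tilde is subject to expansion; the other
# elements are returned as they were given.
unchanged <- expanded[2:3]
```

The result always has one element per input path.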
basename, normalizePath, file.path.
path.expand("~/foo")
Report some of the configuration options of the version of PCRE in use in this R session.
pcre_config()
A named logical vector, currently with elements
UTF-8 | Support for UTF-8 inputs. Required. |
Unicode properties | Support for ‘\p{xx}’ and ‘\P{xx}’ in regular expressions. Desirable and used by some CRAN packages. As of PCRE2, always present with support for UTF-8. |
JIT | Support for just-in-time compilation. Desirable for speed (but only available as a compile-time option on certain architectures, and may be unused as unreliable on some of those, e.g. |
stack | Does match recursion use a stack ( |
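A minimal sketch of how the returned vector might be inspected, using only the element names listed above:

```r
cfg <- pcre_config()

# The result is a named logical vector, so elements are looked up by name.
utf8_ok <- cfg[["UTF-8"]]
jit_ok  <- cfg[["JIT"]]

if (!isTRUE(jit_ok))
    message("PCRE was built without JIT support, or JIT is not in use")
```

Whether JIT is reported as available depends on how PCRE was built for the platform, so code should not assume a particular value.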
extSoftVersion for the PCRE version.
pcre_config()
Pipe a value into a call expression or a function expression.
lhs |> rhs
lhs | expression producing a value. |
rhs | a call expression. |
A pipe expression passes, or ‘pipes’, the result of the left-hand-side expression lhs to the right-hand-side expression rhs.
The lhs is inserted as the first argument in the call. So x |> f(y) is interpreted as f(x, y).
To avoid ambiguities, functions in rhs calls may not be syntactically special, such as + or if.
It is also possible to use a named argument with the placeholder _ in the rhs call to specify where the lhs is to be inserted. The placeholder can only appear once on the rhs.
The placeholder can also be used as the first argument in an extraction call, such as _$coef. More generally, it can be used as the head of a chain of extractions, such as _$coef[[2]], using a sequence of the extraction functions $, [, [[, or @.
Pipe notation allows a nested sequence of calls to be written in a way that may make the sequence of processing steps easier to follow.
Currently, pipe operations are implemented as syntax transformations. So an expression written as x |> f(y) is parsed as f(x, y). It is worth emphasizing that while the code in a pipeline is written sequentially, regular R semantics for evaluation apply and so piped expressions will be evaluated only when first used in the rhs expression.
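Because the transformation happens at parse time, it can be inspected on quoted code without evaluating anything. The sketch below assumes a version of R that supports the _ placeholder; the symbols x, y, f and d are placeholders that are never evaluated.

```r
# The pipe is gone after parsing: both sides are the same call object.
same_call <- identical(quote(x |> f(y)), quote(f(x, y)))

# With the placeholder, the lhs lands in the named argument instead
# of the first position.
placeholder_call <- quote(d |> lm(mpg ~ disp, data = _))
same_placeholder <- identical(placeholder_call,
                              quote(lm(mpg ~ disp, data = d)))
```

Printing placeholder_call shows an ordinary lm() call with data = d.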
Returns the result of evaluating the transformed expression.
The forward pipe operator is motivated by the pipe introduced in the magrittr package, but is more streamlined. It is similar to the pipe or pipeline operators introduced in other languages, including F#, Julia, and JavaScript.
This was introduced in R 4.1.0. Code using it will not be parsed as intended (probably with an error) in earlier versions of R.
# simple uses:mtcars |> head() # same as head(mtcars)mtcars |> head(2) # same as head(mtcars, 2)mtcars |> subset(cyl == 4) |> nrow() # same as nrow(subset(mtcars, cyl == 4))# to pass the lhs into an argument other than the first, either# use the _ placeholder with a named argument:mtcars |> subset(cyl == 4) |> lm(mpg ~ disp, data = _)# or use an anonymous function:mtcars |> subset(cyl == 4) |> (function(d) lm(mpg ~ disp, data = d))()mtcars |> subset(cyl == 4) |> (\(d) lm(mpg ~ disp, data = d))()# or explicitly name the argument(s) before the "one":mtcars |> subset(cyl == 4) |> lm(formula = mpg ~ disp)# using the placeholder as the head of an extraction chain:mtcars |> subset(cyl == 4) |> lm(formula = mpg ~ disp) |> _$coef[[2]]# the pipe operator is implemented as a syntax transformation:quote(mtcars |> subset(cyl == 4) |> nrow())# regular R evaluation semantics applystop() |> (function(...) {})() # stop() is not used on RHS so is not evaluated
Generic function for plotting of R objects.
For simple scatter plots, plot.default will be used. However, there are plot methods for many R objects, including functions, data.frames, density objects, etc. Use methods(plot) and the documentation for these. Most of these methods are implemented using traditional graphics (the graphics package), but this is not mandatory.
For more details about graphical parameter arguments used by traditional graphics, see par.
plot(x, y, ...)
x | the coordinates of points in the plot. Alternatively, a single plotting structure, function or any R object with a |
y | the y coordinates of points in the plot, optional if |
... | arguments to be passed to methods, such as graphical parameters (see |
The two step types differ in their x-y preference: Going from $(x_1, y_1)$ to $(x_2, y_2)$ with $x_1 < x_2$, type = "s" moves first horizontal, then vertical, whereas type = "S" moves the other way around.
The plot generic was moved from the graphics package to the base package in R 4.0.0. It is currently re-exported from the graphics namespace to allow packages importing it from there to continue working, but this may change in future versions of R.
plot.default, plot.formula and other methods; points, lines, par. For thousands of points, consider using smoothScatter() instead of plot().
For X-Y-Z plotting see contour, persp and image.
require(stats) # for lowess, rpois, rnorm
require(graphics) # for plot methods
plot(cars)
lines(lowess(cars))
plot(sin, -pi, 2*pi) # see ?plot.function
## Discrete Distribution Plot:
plot(table(rpois(100, 5)), type = "h", col = "red", lwd = 10, main = "rpois(100, lambda = 5)")
## Simple quantiles/ECDF, see ecdf() {library(stats)} for a better one:
plot(x <- sort(rnorm(47)), type = "s", main = "plot(x, type = \"s\")")
points(x, cex = .5, col = "dark red")
pmatch seeks matches for the elements of its first argument among those of its second.
pmatch(x, table, nomatch = NA_integer_, duplicates.ok = FALSE)
x | the values to be matched: converted to a character vector by |
table | the values to be matched against: converted to a character vector. Long vectors are not supported. |
nomatch | the value to be returned at non-matching or multiply partially matching positions. Note that it is coerced to |
duplicates.ok | should elements in |
The behaviour differs by the value of duplicates.ok. Consider first the case if this is true. First exact matches are considered, and the positions of the first exact matches are recorded. Then unique partial matches are considered, and if found recorded. (A partial match occurs if the whole of the element of x matches the beginning of the element of table.) Finally, all remaining elements of x are regarded as unmatched. In addition, an empty string can match nothing, not even an exact match to an empty string. This is the appropriate behaviour for partial matching of character indices, for example.
If duplicates.ok is FALSE, values of table once matched are excluded from the search for subsequent matches. This behaviour is equivalent to the R algorithm for argument matching, except for the consideration of empty strings (which in argument matching are matched after exact and partial matching to any remaining arguments).
charmatch is similar to pmatch with duplicates.ok true, the differences being that it differentiates between no match and an ambiguous partial match, it does match empty strings, and it does not allow multiple exact matches.
NA values are treated as if they were the string constant "NA".
An integer vector (possibly including NA if nomatch = NA) of the same length as x, giving the indices of the elements in table which matched, or nomatch.
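The matching rules above can be tried out on a small table. This is an illustrative sketch only; the vector `tbl` and the result names `once` and `again` are made up for the example.

```r
tbl <- c("mean", "median", "mode")

pmatch("mode", tbl)                # exact match
pmatch("mo", tbl)                  # unique partial match
pmatch("me", tbl)                  # ambiguous partial match: nomatch (NA)
pmatch("", "")                     # an empty string matches nothing
pmatch("zz", tbl, nomatch = 0L)    # custom value for non-matches

# With duplicates.ok = FALSE a table entry is used at most once, so the
# second "ab" falls back to the remaining partial match "abc" ...
once  <- pmatch(c("ab", "ab"), c("abc", "ab"), duplicates.ok = FALSE)
# ... whereas with duplicates.ok = TRUE both take the exact match.
again <- pmatch(c("ab", "ab"), c("abc", "ab"), duplicates.ok = TRUE)
```

Exact matches are resolved before partial ones, which is why the first "ab" takes position 2 in both calls.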
Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988) The New S Language. Wadsworth & Brooks/Cole.
Chambers, J. M. (1998) Programming with Data. A Guide to the S Language. Springer.
match, charmatch and match.arg, match.fun, match.call, for function argument matching etc., startsWith for particular checking of initial matches; grep etc for more general (regexp) matching of strings.
pmatch("", "") # returns NA
pmatch("m", c("mean", "median", "mode")) # returns NA
pmatch("med", c("mean", "median", "mode")) # returns 2
pmatch(c("", "ab", "ab"), c("abc", "ab"), duplicates.ok = FALSE)
pmatch(c("", "ab", "ab"), c("abc", "ab"), duplicates.ok = TRUE)
## compare
charmatch(c("", "ab", "ab"), c("abc", "ab"))
Find zeros of a real or complex polynomial.
polyroot(z)
z | the vector of polynomial coefficients in increasing order. |
A polynomial of degree $n - 1$, $p(x) = z_1 + z_2 x + \cdots + z_n x^{n-1}$, is given by its coefficient vector z[1:n]. polyroot returns the $n - 1$ complex zeros of $p(x)$ using the Jenkins-Traub algorithm.
If the coefficient vector z has zeroes for the highest powers, these are discarded.
There is no maximum degree, but numerical stability may be an issue for all but low-degree polynomials.
A complex vector of length $n - 1$, where $n$ is the position of the largest non-zero element of z.
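A quick way to see the coefficient convention (increasing order of powers) is to factor a polynomial whose roots are known and substitute the computed roots back in. This is a sketch under that assumption; the helper `p` is defined here only for the check.

```r
# Coefficients in increasing order: p(x) = 6 - 5x + x^2 = (x - 2)(x - 3)
z <- c(6, -5, 1)
roots <- polyroot(z)             # complex vector of length 2

# Evaluate p at each computed root; the values should be close to zero.
p <- function(x) sum(z * x^(seq_along(z) - 1))
residuals <- vapply(roots, p, complex(1))
Mod(residuals)

# Zero coefficients for the highest powers are discarded, so the
# padded vector describes the same polynomial.
same_roots <- polyroot(c(z, 0, 0))
```

The roots come back as complex numbers even when they are real, so use Re() after checking that the imaginary parts are negligible.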
C translation by Ross Ihaka of Fortran code in the reference, with modifications by the R Core Team.
Jenkins, M. A. and Traub, J. F. (1972). Algorithm 419: zeros of a complex polynomial. Communications of the ACM, 15(2), 97–99. doi:10.1145/361254.361262.
uniroot for numerical root finding of arbitrary functions; complex and the zero example in the demos directory.
polyroot(c(1, 2, 1))
round(polyroot(choose(8, 0:8)), 11) # guess what!
for (n1 in 1:4) print(polyroot(1:n1), digits = 4)
polyroot(c(1, 2, 1, 0, 0)) # same as the first
Returns the environment at a specified position in the search path.
pos.to.env(x)
x | an integer between |
Several R functions for manipulating objects in environments (such as get and ls) allow specifying environments via corresponding positions in the search path. pos.to.env is a convenience function for programmers which converts these positions to corresponding environments; users will typically have no need for it. It is primitive.
-1 is interpreted as the environment the function is called from.
This is a primitive function.
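The correspondence between search-path positions and environments can be confirmed with a short check. This sketch assumes an ordinary session in which the base package is the last entry of search().

```r
global <- pos.to.env(1)                 # position 1: the global environment
last   <- pos.to.env(length(search()))  # last position: the base package

identical(global, globalenv())
environmentName(last)
```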
pos.to.env(1) # R_GlobalEnv
# the next returns the base environment
pos.to.env(length(search()))
Compute a sequence of about n+1 equally spaced ‘round’ values which cover the range of the values in x. The values are chosen so that they are 1, 2 or 5 times a power of 10.
pretty(x, ...)
## Default S3 method:
pretty(x, n = 5, min.n = n %/% 3, shrink.sml = 0.75, high.u.bias = 1.5, u5.bias = .5 + 1.5*high.u.bias, eps.correct = 0, f.min = 2^-20, ...)
.pretty(x, n = 5L, min.n = n %/% 3, shrink.sml = 0.75, high.u.bias = 1.5, u5.bias = .5 + 1.5*high.u.bias, eps.correct = 0L, f.min = 2^-20, bounds = TRUE)
x | an object coercible to numeric by |
n | integer giving the desired number of intervals. Non-integer values are rounded down. |
min.n | nonnegative integer giving the minimal number of intervals. If |
shrink.sml | positive number, a factor (smaller than one) by which a default scale is shrunk in the case when |
high.u.bias | non-negative numeric, typically |
u5.bias | non-negative numeric multiplier favoring factor 5 over 2. Default and ‘optimal’: |
eps.correct | integer code, one of {0,1,2}. If non-0, an epsilon correction is made at the boundaries such that the result boundaries will be outside |
f.min | positive factor multiplied by |
bounds | a |
... | further arguments for methods. |
pretty ignores non-finite values in x.
Let d <- max(x) - min(x) $\ge 0$. If d is not (very close) to 0, we let c <- d/n, otherwise more or less c <- max(abs(range(x)))*shrink.sml / min.n. Then, the 10 base b is $10^{\lfloor \log_{10}(c) \rfloor}$ such that $b \le c < 10b$.
Now determine the basic unit $u$ as one of $\{1, 2, 5, 10\}\, b$, depending on $c/b \in [1, 10)$ and the two ‘bias’ coefficients, $h =$ high.u.bias and $f =$ u5.bias.
.........
pretty() returns a numeric vector of approximately n increasing numbers which are “pretty” in decimal notation. (in extreme range cases, the numbers can no longer be “pretty” given the other constraints; e.g., for pretty(..)
For ease of investigating the underlying C R_pretty() function, .pretty() returns a named list. By default, when bounds=TRUE, the entries are l, u, and n, whereas for bounds=FALSE, they are ns, nu, n, and (a “pretty”) unit where the n*'s are integer valued (but only n is of class integer). Programmers may use this to create pretty sequence (iterator) objects.
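The relationship between pretty() and the list returned by .pretty() can be explored as follows. This is a minimal sketch; the input range is arbitrary, and it assumes a version of R that provides .pretty() with the bounds argument shown in the usage.

```r
x <- c(0.3, 17.2)

p <- pretty(x)        # default n = 5 desired intervals
steps <- diff(p)      # constant spacing: the chosen unit
range(p)              # covers the range of x

# .pretty() exposes the quantities computed by the underlying C code.
b <- .pretty(x)                   # bounds = TRUE: components l, u, n
u <- .pretty(x, bounds = FALSE)   # components ns, nu, n, unit
names(b)
names(u)
```

Inspecting `u$unit` next to `steps` shows which multiple of a power of 10 was selected for this range.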
Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988) The New S Language. Wadsworth & Brooks/Cole.
axTicks for the computation of pretty axis tick locations in plots, particularly on the log scale.
pretty(1:15) # 0 2 4 6 8 10 12 14 16
pretty(1:15, high.u.bias = 2) # 0 5 10 15
pretty(1:15, n = 4) # 0 5 10 15
pretty(1:15 * 2) # 0 5 10 15 20 25 30
pretty(1:20) # 0 5 10 15 20
pretty(1:20, n = 2) # 0 10 20
pretty(1:20, n = 10) # 0 2 4 ... 20
for(k in 5:11) {
  cat("k=", k, ": ")
  print(diff(range(pretty(100 + c(0, pi*10^-k)))))
}
##-- more bizarre, when min(x) == max(x):
pretty(pi)
add.names <- function(v) { names(v) <- paste(v); v}
utils::str(lapply(add.names(-10:20), pretty))
## min.n = 0 returns a length-1 vector "if pretty":
utils::str(lapply(add.names(0:20), pretty, min.n = 0))
sapply( add.names(0:20), pretty, min.n = 4)
pretty(1.234e100)
pretty(1001.1001)
pretty(1001.1001, shrink.sml = 0.2)
for(k in -7:3)
  cat("shrink=", formatC(2^k, width = 9),":", formatC(pretty(1001.1001, shrink.sml = 2^k), width = 6),"\n")
.Primitive looks up by name a ‘primitive’ (internally implemented) function.
.Primitive(name)
name | name of the R function. |
The advantage of .Primitive over .Internal functions is the potential efficiency of argument passing, and that positional matching can be used where desirable, e.g. in switch. For more details, see the ‘R Internals’ manual.
All primitive functions are in the base namespace.
This function is almost never used: `name` or, more carefully, get(name, envir = baseenv()) work equally well and do not depend on knowing which functions are primitive (which does change as R evolves).
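The two lookup routes mentioned above can be compared side by side. This is a small sketch; the names `mysqrt` and `viaget` are chosen for the example only.

```r
mysqrt <- .Primitive("sqrt")               # look up by name
viaget <- get("sqrt", envir = baseenv())   # the more careful alternative

is.primitive(mysqrt)
typeof(mysqrt)     # "builtin": arguments are evaluated before the call
typeof(`if`)       # "special": arguments are passed unevaluated
mysqrt(16)
viaget(16)
```

The get() form keeps working even if a function stops being primitive in a later version of R, which is why the text recommends it.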
is.primitive showing that primitive functions come in two types (typeof), .Internal.
mysqrt <- .Primitive("sqrt")
c
.Internal # this one *must* be primitive!
`if` # need backticks
print prints its argument and returns it invisibly (via invisible(x)). It is a generic function which means that new printing methods can be easily added for new classes.
print(x, ...)
## S3 method for class 'factor'
print(x, quote = FALSE, max.levels = NULL, width = getOption("width"), ...)
## S3 method for class 'table'
print(x, digits = getOption("digits"), quote = FALSE, na.print = "", zero.print = "0", right = is.numeric(x) || is.complex(x), justify = "none", ...)
## S3 method for class 'function'
print(x, useSource = TRUE, ...)
x | an object used to select a method. |
... | further arguments passed to or from other methods. |
quote | logical, indicating whether or not strings should be printed with surrounding quotes. |
max.levels | integer, indicating how many levels should be printed for a factor; if |
width | only used when |
digits | minimal number of significant digits, see |
na.print | character string (or |
zero.print | character specifying how zeros ( |
right | logical, indicating whether or not strings should be right aligned. |
justify | character indicating if strings should be left- or right-justified or left alone, passed to |
useSource | logical indicating if internally stored source should be used for printing when present, e.g., if |
The default method, print.default has its own help page. Use methods("print") to get all the methods for the print generic.
print.factor allows some customization and is used for printing ordered factors as well.
print.table for printing tables allows other customization. As of R 3.0.0, it only prints a description in case of a table with 0-extents (this can happen if a classifier has no valid data).
See noquote as an example of a class whose main purpose is a specific print method.
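Adding a print method for a new class needs only a function named print.<class>. The sketch below uses a hypothetical S3 class "temperature" that does not exist in base R; it follows the convention of returning the argument invisibly.

```r
# Hypothetical class "temperature": a number of degrees Celsius.
temp <- structure(21.5, class = "temperature")

print.temperature <- function(x, ...) {
  cat("<temperature> ", format(unclass(x), nsmall = 1), " degrees C\n", sep = "")
  invisible(x)   # print methods conventionally return x invisibly
}

print(temp)      # dispatches to print.temperature
```

Auto-printing at the console (typing `temp` on its own) goes through the same method.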
Chambers, J. M. and Hastie, T. J. (1992) Statistical Models in S. Wadsworth & Brooks/Cole.
The default method print.default, and help for the methods above; further options, noquote.
For more customizable (but cumbersome) printing, see cat, format or also write. For a simple prototypical print method, see .print.via.format in package tools.
require(stats)
ts(1:20) #-- print is the "Default function" --> print.ts(.) is called
for(i in 1:3) print(1:i)
## Printing of factors
attenu$station ## 117 levels -> 'max.levels' depending on width
## ordered factors: levels "l1 < l2 < .."
esoph$agegp[1:12]
esoph$alcgp[1:12]
## Printing of sparse (contingency) tables
set.seed(521)
t1 <- round(abs(rt(200, df = 1.8)))
t2 <- round(abs(rt(200, df = 1.4)))
table(t1, t2) # simple
print(table(t1, t2), zero.print = ".") # nicer to read
## same for non-integer "table":
T <- table(t2,t1)
T <- T * (1+round(rlnorm(length(T)))/4)
print(T, zero.print = ".") # quite nicer,
print.table(T[,2:8] * 1e9, digits=3, zero.print = ".")
## still slightly inferior to Matrix::Matrix(T) for larger T
## Corner cases with empty extents:
table(1, NA) # < table of extent 1 x 0 >
Print a data frame.
## S3 method for class 'data.frame'
print(x, ..., digits = NULL, quote = FALSE, right = TRUE, row.names = TRUE, max = NULL)
x | object of class |
... | optional arguments to |
digits | the minimum number of significant digits to be used: see |
quote | logical, indicating whether or not entries should be printed with surrounding quotes. |
right | logical, indicating whether or not strings should be right-aligned. The default is right-alignment. |
row.names | logical (or character vector), indicating whether (or what) row names should be printed. |
max | numeric or |
This calls format which formats the data frame column-by-column, then converts to a character matrix and dispatches to the print method for matrices.
When quote = TRUE only the entries are quoted not the row names nor the column names.
(dd <- data.frame(x = 1:8, f = gl(2,4), ch = I(letters[1:8]))) # print() with defaults
print(dd, quote = TRUE, row.names = FALSE) # suppresses row.names and quotes all entries
print.default is the default method of the generic print function which prints its argument.
## Default S3 method:
print(x, digits = NULL, quote = TRUE, na.print = NULL, print.gap = NULL, right = FALSE, max = NULL, width = NULL, useSource = TRUE, ...)
x | the object to be printed. |
digits | a non-null value for |
quote | logical, indicating whether or not strings ( |
na.print | a character string which is used to indicate |
print.gap | a non-negative integer |
right | logical, indicating whether or not strings should be right aligned. The default is left alignment. |
max | a non-null value for |
width | controls the maximum number of columns on a line used in printing vectors, matrices, etc. The default, |
useSource | logical, indicating whether to use source references or copies rather than deparsing language objects. The default is to use the original source if it is available. |
... | further arguments to be passed to or from other methods. They are ignored in this function. |
The default for printing NAs is to print NA (without quotes) unless this is a character NA and quote = FALSE, when ‘<NA>’ is printed.
The same number of decimal places is used throughout a vector. This means that digits specifies the minimum number of significant digits to be used, and that at least one entry will be encoded with that minimum number. However, if all the encoded elements then have trailing zeroes, the number of decimal places is reduced until at least one element has a non-zero final digit. Decimal points are only included if at least one decimal place is selected.
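The rules for significant digits and shared decimal places are easiest to see on small vectors. The values below are arbitrary and chosen only to trigger each rule in turn.

```r
print(pi, digits = 3)                    # [1] 3.14

# Every element gets the decimals that the most demanding one needs:
# 1.123456 needs two decimals for 3 significant digits, so 10 gets them too.
print(c(1.123456, 10), digits = 3)

# Trailing zeroes shared by all elements are dropped.
print(c(1.5, 2.5), digits = 7)           # [1] 1.5 2.5

# A character NA is shown as <NA> when quotes are suppressed.
print(NA_character_, quote = FALSE)      # [1] <NA>
```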
You can suppress “exponential” / scientific notation in printing of numbers (atomic vectors x), via format(., scientific=FALSE), see the prI() example below, or also by increasing global option scipen, e.g., options(scipen = 12).
Attributes are printed respecting their class(es), using the values of digits to print.default, but using the default values (for the methods called) of the other arguments.
Option width controls the printing of vectors, matrices and arrays, and option deparse.cutoff controls the printing of language objects such as calls and formulae.
When the methods package is attached, print will call show for R objects with formal classes (‘S4’) if called with no optional arguments.
Note that for large values of digits, currently for digits >= 16, the calculation of the number of significant digits will depend on the platform's internal (C library) implementation of ‘sprintf()’ functionality.
If a non-printable character is encountered during output, it is represented as one of the ANSI escape sequences (‘\a’, ‘\b’, ‘\f’, ‘\n’, ‘\r’, ‘\t’, ‘\v’, ‘\\’ and ‘\0’: see Quotes), or failing that as a 3-digit octal code: for example the UK currency pound sign in the C locale (if implemented correctly) is printed as ‘\243’. Which characters are non-printable depends on the locale. (Because some versions of Windows get this wrong, all bytes with the upper bit set are regarded as printable on Windows in a single-byte locale.)
In all locales, the characters in the ASCII range (‘0x00’ to ‘0x7f’) are printed in the same way, as-is if printable, otherwise via ANSI escape sequences or 3-digit octal escapes as described for single-byte locales. Whether a character is printable depends on the current locale and the operating system (C library).
Multi-byte non-printing characters are printed as an escape sequence of the form ‘\uxxxx’ or ‘\Uxxxxxxxx’ (in hexadecimal). This is the internal code for the wide-character representation of the character. If this is not known to be Unicode code points, a warning is issued. The only known exceptions are certain Japanese ISO 2022 locales on commercial Unixes, which use a concatenation of the bytes: it is unlikely that R compiles on such a system.
It is possible to have a character string in a character vector that is not valid in the current locale. If a byte is encountered that is not part of a valid character it is printed in hex in the form ‘\xab’ and this is repeated until the start of a valid character. (This will rapidly recover from minor errors in UTF-8.)
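The escaping of control characters can be inspected without printing by using encodeString, which encodes a string the way print would show it. This is a small sketch using only ASCII control characters, whose treatment does not depend on the locale.

```r
s <- "tab\there\nnewline"

encoded <- encodeString(s)               # escapes but no surrounding quotes
quoted  <- encodeString(s, quote = '"')  # as print() displays it

print(s)               # shows the escapes \t and \n
cat(encoded, "\n")     # the escaped form as literal characters
nchar(s)
nchar(encoded)         # two characters longer: each escape takes two
```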
The generic print, options. The "noquote" class and print method.
encodeString, which encodes a character vector the way it would be printed.
pi
print(pi, digits = 16)
LETTERS[1:16]
print(LETTERS, quote = FALSE)
M <- cbind(I = 1, matrix(1:10000, ncol = 10, dimnames = list(NULL, LETTERS[1:10])))
utils::head(M) # makes more sense than
print(M, max = 1000) # prints 90 rows and a message about omitting 910
(x <- 2^seq(-8, 30, by=1/4)) # auto-prints; by default all in "exponential" format
prI <- function(x) noquote(format(x, scientific = FALSE))
prI(x) # prints more "nicely" (using a bit more space)
An earlier method for printing matrices, provided for S compatibility.
prmatrix(x, rowlab =, collab =, quote = TRUE, right = FALSE, na.print = NULL, ...)
x | numeric or character matrix. |
rowlab, collab | (optional) character vectors giving row or column names respectively. By default, these are taken from |
quote | logical; if |
right | if |
na.print | how |
... | arguments for |
prmatrix is an earlier form of print.matrix, and is very similar to the S function of the same name.
Invisibly returns its argument, x.
Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988) The New S Language. Wadsworth & Brooks/Cole.
print.default, and other print methods.
prmatrix(m6 <- diag(6), rowlab = rep("", 6), collab = rep("", 6))
chm <- matrix(scan(system.file("help", "AnIndex", package = "splines"), what = ""), , 2, byrow = TRUE)
chm # uses print.matrix()
prmatrix(chm, collab = paste("Column", 1:3), right = TRUE, quote = FALSE)
proc.time determines how much real and CPU time (in seconds) the currently running R process has already taken.
proc.time()
proc.time returns five elements for backwards compatibility, but its print method prints a named vector of length 3. The first two entries are the total user and system CPU times of the current R process and any child processes on which it has waited, and the third entry is the ‘real’ elapsed time since the process was started.
An object of class "proc_time" which is a numeric vector of length 5, containing the user, system, and total elapsed times for the currently running R process, and the cumulative sum of user and system times of any child processes spawned by it on which it has waited. (The print method uses the summary method to combine the child times with those of the main process.)
The definition of ‘user’ and ‘system’ times is from your OS. Typically it is something like
The ‘user time’ is the CPU time charged for the execution of user instructions of the calling process. The ‘system time’ is the CPU time charged for execution by the system on behalf of the calling process.
Times of child processes are not available on Windows and will always be given as NA.
The resolution of the times will be system-specific and on Unix-alikes times are rounded down to milliseconds. On modern systems they will be that accurate, but on older systems they might be accurate to 1/100 or 1/60 sec. They are typically available to 10ms on Windows.
This is a primitive function.
Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988) The New S Language. Wadsworth & Brooks/Cole.
system.time for timing an R expression, gc.time for how much of the time was spent in garbage collection.
setTimeLimit to limit the CPU or elapsed time for the session or an expression.
## a way to time an R expression: system.time is preferred
ptm <- proc.time()
for (i in 1:50) mad(stats::runif(500))
proc.time() - ptm
prod returns the product of all the values present in its arguments.
prod(..., na.rm = FALSE)
... | numeric or complex or logical vectors. |
na.rm | logical. Should missing values be removed? |
If na.rm is FALSE an NA value in any of the arguments will cause a value of NA to be returned, otherwise NA values are ignored.
This is a generic function: methods can be defined for it directly or via the Summary group generic. For this to work properly, the arguments ... should be unnamed, and dispatch is on the first argument.
Logical true values are regarded as one, false values as zero. For historical reasons, NULL is accepted and treated as if it were numeric(0).
The product, a numeric (of type "double") or complex vector of length one. NB: the product of an empty set is one, by definition.
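The edge cases listed above (missing values, logicals, empty input, NULL) can each be checked in one line. The inputs are arbitrary illustrations.

```r
prod(2, 3, 4)                      # 24
prod(c(2, NA, 3))                  # NA
prod(c(2, NA, 3), na.rm = TRUE)    # 6
prod(TRUE, TRUE, FALSE)            # 0: logicals count as 1 and 0
prod(numeric(0))                   # 1: the empty product
prod(NULL)                         # 1: NULL is treated as numeric(0)
typeof(prod(1:3))                  # "double", even for integer input
```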
This is part of the S4 Summary group generic. Methods for it must use the signature x, ..., na.rm.
Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988) The New S Language. Wadsworth & Brooks/Cole.
‘plotmath’ for the use of prod in plot annotation.
print(prod(1:7)) == print(gamma(8))
Returns conditional proportions given margins, i.e., entries of x, divided by the appropriate marginal sums.
proportions(x, margin = NULL)
prop.table(x, margin = NULL)
x | an array, usually a |
margin | a vector giving the margins to split by. E.g., for a matrix |
A table or array like x, expressed relative to margin.
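What "expressed relative to margin" means can be verified against sweep, which the See Also entry names as the more general mechanism. This is a sketch on a small made-up matrix; the result names are chosen for the example.

```r
m <- matrix(1:4, 2)

by_row  <- proportions(m, 1)   # each entry divided by its row sum
overall <- proportions(m)      # margin = NULL: divided by sum(m)

# The same row-wise result computed by hand with sweep().
manual <- sweep(m, 1, rowSums(m), "/")

rowSums(by_row)                # every row sums to 1
sum(overall)                   # the whole table sums to 1
```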
prop.table is an earlier name, retained for back-compatibility.
Peter Dalgaard
apply and sweep are more general mechanisms for sweeping out marginal statistics.
m <- matrix(1:4, 2)
m
proportions(m, 1)
DF <- as.data.frame(UCBAdmissions)
tbl <- xtabs(Freq ~ Gender + Admit, DF)
tbl
proportions(tbl, "Gender")
Functions to push back text lines onto a connection, and to enquire how many lines are currently pushed back.
pushBack(data, connection, newLine = TRUE, encoding = c("", "bytes", "UTF-8"))
pushBackLength(connection)
clearPushBack(connection)
data | a character vector. |
connection | |
newLine | logical. If true, a newline is appended to each string pushed back. |
encoding | character string, partially matched. See details. |
Several character strings can be pushed back on one or more occasions. The occasions form a stack, so the first line to be retrieved will be the first string from the last call to pushBack. Lines which are pushed back are read prior to the normal input from the connection, by the normal text-reading functions such as readLines and scan.
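The stack behaviour across several calls can be seen by pushing back twice before reading. This is a minimal sketch using a text connection over made-up lines.

```r
zz <- textConnection(c("line1", "line2"))

pushBack(c("a", "b"), zz)     # first occasion
pushBack("c", zz)             # second occasion goes on top of the stack
n_pushed <- pushBackLength(zz)

# Pushed-back lines come first (last call first, each call in its own
# order), followed by the normal input from the connection.
got <- readLines(zz, 5)
close(zz)

n_pushed
got
```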
Pushback is only allowed for readable connections in text mode.
Not all uses of connections respect pushbacks, in particular the input connection is still wired directly, so for example parsing commands from the console and scan("") ignore pushbacks on stdin.
When character strings with a marked encoding (see Encoding) are pushed back they are converted to the current encoding if encoding = "". This may involve representing characters as ‘<U+xxxx>’ if they cannot be converted. They will be converted to UTF-8 if encoding = "UTF-8" or left as-is if encoding = "bytes".
pushBack and clearPushBack() return nothing, invisibly.
pushBackLength returns the number of lines currently pushed back.
zz <- textConnection(LETTERS)
readLines(zz, 2)
pushBack(c("aa", "bb"), zz)
pushBackLength(zz)
readLines(zz, 1)
pushBackLength(zz)
readLines(zz, 1)
readLines(zz, 1)
close(zz)
qr computes the QR decomposition of a matrix.
qr(x, ...)
## Default S3 method:
qr(x, tol = 1e-07 , LAPACK = FALSE, ...)
qr.coef(qr, y)
qr.qy(qr, y)
qr.qty(qr, y)
qr.resid(qr, y)
qr.fitted(qr, y, k = qr$rank)
qr.solve(a, b, tol = 1e-7)
## S3 method for class 'qr'
solve(a, b, ...)
is.qr(x)
as.qr(x)
x | a numeric or complex matrix whose QR decomposition is to be computed. Logical matrices are coerced to numeric. |
tol | the tolerance for detecting linear dependencies in the columns of |
qr | a QR decomposition of the type computed by |
y ,b | a vector or matrix of right-hand sides of equations. |
a | a QR decomposition or ( |
k | effective rank. |
LAPACK | logical. For real |
... | further arguments passed to or from other methods. |
The QR decomposition plays an important role in many statistical techniques. In particular it can be used to solve the equation $Ax = b$ for given matrix $A$, and vector $b$. It is useful for computing regression coefficients and in applying the Newton-Raphson algorithm.
The functions qr.coef, qr.resid, and qr.fitted return the coefficients, residuals and fitted values obtained when fitting y to the matrix with QR decomposition qr. (If pivoting is used, some of the coefficients will be NA.) qr.qy and qr.qty return Q %*% y and t(Q) %*% y, where Q is the (complete) Q matrix.
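The relationships described above can be checked numerically. The following is a minimal sketch on simulated data; the names X, y and qx are purely illustrative and not part of the documented interface.

```r
## Simulated full-rank regression problem (illustrative data only)
set.seed(1)
X <- cbind(1, matrix(rnorm(20), 10, 2))   # 10 x 3 design matrix
y <- rnorm(10)

qx   <- qr(X)               # default (LINPACK-based) decomposition
beta <- qr.coef(qx, y)      # least-squares coefficients
fit  <- qr.fitted(qx, y)    # fitted values
res  <- qr.resid(qx, y)     # residuals

## fitted values plus residuals recover y, and X %*% beta gives the fit
ok_sum <- isTRUE(all.equal(fit + res, y))
ok_fit <- isTRUE(all.equal(drop(X %*% beta), fit))

## qr.qy() and qr.qty() multiply by the complete Q and t(Q)
Q <- qr.Q(qx, complete = TRUE)
ok_qty <- isTRUE(all.equal(qr.qty(qx, y), drop(crossprod(Q, y))))
ok_qy  <- isTRUE(all.equal(qr.qy(qx, y),  drop(Q %*% y)))

print(c(ok_sum, ok_fit, ok_qty, ok_qy))
```

Working through qr.qy and qr.qty avoids forming Q explicitly, which is why they are usually preferred over qr.Q when only products with Q are needed.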
All the above functions keep dimnames (and names) of x and y if there are any.
solve.qr is the method for solve for qr objects. qr.solve solves systems of equations via the QR decomposition: if a is a QR decomposition it is the same as solve.qr, but if a is a rectangular matrix the QR decomposition is computed first. Either will handle over- and under-determined systems, providing a least-squares fit if appropriate.
is.qr returns TRUE if x is a list and inherits from "qr".
It is not possible to coerce objects to mode "qr". Objects either are QR decompositions or they are not.
The LINPACK interface is restricted to matrices x with less than 2^31 elements.
qr.fitted and qr.resid only support the LINPACK interface.
Unsuccessful results from the underlying LAPACK code will result in an error giving a positive error code: these can only be interpreted by detailed study of the FORTRAN code.
The QR decomposition of the matrix as computed by LINPACK(*) or LAPACK. The components in the returned value correspond directly to the values returned by DQRDC(2)/DGEQP3/ZGEQP3.
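qr | a matrix with the same dimensions as x. The upper triangle contains the R of the decomposition and the lower triangle contains information on the Q of the decomposition (stored in compact form). |
qraux | a vector of length ncol(x) which contains additional information on Q. |
rank | the rank of x as computed by the decomposition(*): always full rank in the LAPACK case. |
pivot | information on the pivoting strategy used during the decomposition. |
Non-complex QR objects computed by LAPACK have the attribute "useLAPACK" with value TRUE.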
*) dqrdc2 instead of LINPACK's DQRDC
In the (default) LINPACK case (LAPACK = FALSE), qr() uses a modified version of LINPACK's DQRDC, called ‘dqrdc2’. It differs by using the tolerance tol for a pivoting strategy which moves columns with near-zero 2-norm to the right-hand edge of the x matrix. This strategy means that sequential one degree-of-freedom effects can be computed in a natural way.
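To make the pivoting strategy concrete, here is a small sketch with a deliberately collinear column. The matrix and its column names are made up for illustration; the exact coefficient values depend on the random data.

```r
## Design matrix whose third column is an exact multiple of the second
set.seed(42)
x <- rnorm(8)
z <- rnorm(8)
X <- cbind(int = 1, x = x, x2 = 2 * x, z = z)

qx <- qr(X)                    # default LINPACK-based dqrdc2
print(qx$rank)                 # one column is detected as dependent
print(qx$pivot)                # the dependent column is moved to the end

cf <- qr.coef(qx, rnorm(8))    # coefficient of the pivoted-out column is NA
print(cf)

## With LAPACK no rank detection is attempted
rank_lapack <- qr(X, LAPACK = TRUE)$rank
print(rank_lapack)
```

Because columns are only moved when they become numerically dependent on earlier ones, the earlier of two aliased columns keeps its coefficient, which is the behaviour users see in lm() output.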
To compute the determinant of a matrix (do you really need it?), the QR decomposition is much more efficient than using eigenvalues (eigen). See det.
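As a rough illustration of why the QR decomposition suffices here: since Q is orthogonal, the absolute value of the determinant equals the absolute product of the diagonal of R. The sketch below only checks the absolute value (the sign depends on Q and on any pivoting) and uses an arbitrary random matrix M.

```r
set.seed(7)
M <- matrix(rnorm(16), 4, 4)          # arbitrary square matrix

R <- qr.R(qr(M))                      # upper-triangular factor
absdet_qr <- abs(prod(diag(R)))       # |det(M)| = |prod(diag(R))|

same <- isTRUE(all.equal(absdet_qr, abs(det(M))))
print(c(absdet_qr = absdet_qr, absdet = abs(det(M))))
```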
Using LAPACK (including in the complex case) uses column pivoting and does not attempt to detect rank-deficient matrices.
For qr, the LINPACK routine DQRDC (but modified to dqrdc2(*)) and the LAPACK routines DGEQP3 and ZGEQP3. Further LINPACK and LAPACK routines are used for qr.coef, qr.qy and qr.qty.
LAPACK and LINPACK are from https://netlib.org/lapack/ and https://netlib.org/linpack/ and their guides are listed in the references.
Anderson, E. and ten others (1999) LAPACK Users' Guide. Third Edition. SIAM.
Available on-line at https://netlib.org/lapack/lug/lapack_lug.html.
Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988) The New S Language. Wadsworth & Brooks/Cole.
Dongarra, J. J., Bunch, J. R., Moler, C. B. and Stewart, G. W. (1978) LINPACK Users Guide. Philadelphia: SIAM Publications.
qr.Q, qr.R, qr.X for reconstruction of the matrices. lm.fit, lsfit, eigen, svd.
det (using qr) to compute the determinant of a matrix.
hilbert <- function(n) { i <- 1:n; 1 / outer(i - 1, i, `+`) }
h9 <- hilbert(9); h9
qr(h9)$rank           #--> only 7
qrh9 <- qr(h9, tol = 1e-10)
qrh9$rank             #--> 9
##-- Solve linear equation system  H %*% x = y :
y <- 1:9/10
x <- qr.solve(h9, y, tol = 1e-10) # or equivalently :
x <- qr.coef(qrh9, y) #-- is == but much better than
                      #-- solve(h9) %*% y
h9 %*% x              # = y

## overdetermined system
A <- matrix(runif(12), 4)
b <- 1:4
qr.solve(A, b) # or solve(qr(A), b)
solve(qr(A, LAPACK = TRUE), b)
# this is a least-squares solution, cf. lm(b ~ 0 + A)

## underdetermined system
A <- matrix(runif(12), 3)
b <- 1:3
qr.solve(A, b)
solve(qr(A, LAPACK = TRUE), b)
# solutions will have one zero, not necessarily the same one
Returns the original matrix from which the object was constructed or the components of the decomposition.
qr.X(qr, complete = FALSE, ncol =)
qr.Q(qr, complete = FALSE, Dvec =)
qr.R(qr, complete = FALSE)
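qr | object representing a QR decomposition. This will typically have come from a previous call to qr or lsfit. |
complete | logical expression of length 1. Indicates whether an arbitrary orthogonal completion of the Q or X matrices is to be made, or whether the R matrix is to be completed by binding zero-value rows beneath the square upper triangle. |
ncol | integer in the range 1:nrow(qr$qr). The number of columns to be in the reconstructed X. |
Dvec | vector (not matrix) of diagonal values. Each column of the returned Q will be multiplied by the corresponding diagonal value. |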
qr.X returns X, the original matrix from which the qr object was constructed, provided ncol(X) <= nrow(X). If complete is TRUE or the argument ncol is greater than ncol(X), additional columns from an arbitrary orthogonal (unitary) completion of X are returned.
qr.Q returns part or all of Q, the orthogonal (unitary) transformation of order nrow(X) represented by qr. If complete is TRUE, Q has nrow(X) columns. If complete is FALSE, Q has ncol(X) columns. When Dvec is specified, each column of Q is multiplied by the corresponding value in Dvec.
Note that qr.Q(qr, *) is a special case of qr.qy(qr, y) (with a “diagonal” y), and qr.X(qr, *) is basically qr.qy(qr, R) (apart from pivoting and dimnames setting).
qr.R returns R. This may be pivoted, e.g., if a <- qr(x) then x[, a$pivot] = QR. The number of rows of R is either nrow(X) or ncol(X) (and may depend on whether complete is TRUE or FALSE).
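The dimension rules for qr.Q and qr.R, and the effect of Dvec, can be seen on a small tall matrix. This is only a sketch with random data; X and qx are illustrative names, and no pivoting is expected because the matrix is (almost surely) of full rank.

```r
set.seed(3)
X  <- matrix(rnorm(15), nrow = 5, ncol = 3)   # nrow(X) > ncol(X)
qx <- qr(X)

Q  <- qr.Q(qx);  Qc <- qr.Q(qx, complete = TRUE)
R  <- qr.R(qx);  Rc <- qr.R(qx, complete = TRUE)

dims <- rbind(Q = dim(Q), Qc = dim(Qc), R = dim(R), Rc = dim(Rc))
print(dims)

## Columns of Q are orthonormal, and both forms reproduce X
orthonormal <- isTRUE(all.equal(crossprod(Q), diag(3)))
thin_ok     <- isTRUE(all.equal(Q %*% R, X))
complete_ok <- isTRUE(all.equal(Qc %*% Rc, X))
x_ok        <- isTRUE(all.equal(qr.X(qx), X))

## Dvec rescales the columns of the returned Q
Q2 <- qr.Q(qx, Dvec = c(2, 1, 1))
scaled_ok <- isTRUE(all.equal(Q2[, 1], 2 * Q[, 1]))
print(c(orthonormal, thin_ok, complete_ok, x_ok, scaled_ok))
```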
p <- ncol(x <- LifeCycleSavings[, -1]) # not the 'sr'
qrstr <- qr(x)   # dim(x) == c(n,p)
qrstr $ rank # = 4 = p
Q <- qr.Q(qrstr) # dim(Q) == dim(x)
R <- qr.R(qrstr) # dim(R) == ncol(x)
X <- qr.X(qrstr) # X == x
range(X - as.matrix(x)) # ~ < 6e-12
## X == Q %*% R if there has been no pivoting, as here:
all.equal(unname(X),
          unname(Q %*% R))

# example of pivoting
x <- cbind(int = 1,
           b1 = rep(1:0, each = 3), b2 = rep(0:1, each = 3),
           c1 = rep(c(1,0,0), 2), c2 = rep(c(0,1,0), 2), c3 = rep(c(0,0,1),2))
x # is singular, columns "b2" and "c3" are "extra"
a <- qr(x)
zapsmall(qr.R(a)) # columns are int b1 c1 c2 b2 c3
a$pivot
pivI <- sort.list(a$pivot) # the inverse permutation
all.equal (x,            qr.Q(a) %*% qr.R(a)) # no, no
stopifnot(
 all.equal(x[, a$pivot], qr.Q(a) %*% qr.R(a)),          # TRUE
 all.equal(x           , qr.Q(a) %*% qr.R(a)[, pivI]))  # TRUE too!
The function quit or its alias q terminate the current R session.
quit(save = "default", status = 0, runLast = TRUE)
   q(save = "default", status = 0, runLast = TRUE)
save | a character string indicating whether the environment (workspace) should be saved, one of "no", "yes", "ask" or "default". |
status | the (numerical) error status to be returned to the operating system, where relevant. Conventionally 0 indicates successful completion. |
runLast | should .Last() be executed? |
save must be one of "no", "yes", "ask" or "default". In the first case the workspace is not saved, in the second it is saved and in the third the user is prompted and can also decide not to quit. The default is to ask in interactive use but may be overridden by command-line arguments (which must be supplied in non-interactive use).
Immediately before normal termination, .Last() is executed if the function .Last exists and runLast is true. If in interactive use there are errors in the .Last function, control will be returned to the command prompt, so do test the function thoroughly. There is a system analogue, .Last.sys(), which is run after .Last() if runLast is true.
Exactly what happens at termination of an R session depends on the platform and GUI interface in use. A typical sequence is to run .Last() and .Last.sys() (unless runLast is false), to save the workspace if requested (and in most cases also to save the session history: see savehistory), then run any finalizers (see reg.finalizer) that have been set to be run on exit, close all open graphics devices, remove the session temporary directory and print any remaining warnings (e.g., from .Last() and device closure).
Some error status values are used by R itself. The default error handler for non-interactive use effectively calls q("no", 1, FALSE) and returns error status 1. Error status 2 is used for R ‘suicide’, that is a catastrophic failure, and other small numbers are used by specific ports for initialization failures. It is recommended that users choose statuses of 10 or more.
Valid values of status are system-dependent, but 0:255 are normally valid. (Many OSes will report the last byte of the value, that is report the value modulo 256. But not all.)
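One way to observe the exit status without ending the current session is to start a child R process and inspect what the operating system reports. The sketch below assumes that an Rscript executable is available under R.home("bin") and that the platform passes small status values through unchanged; the status value 10 is an arbitrary choice following the recommendation above.

```r
## Path to the Rscript front end of the running R installation (assumed present)
rscript <- file.path(R.home("bin"), "Rscript")

## Run a child process that quits with status 10 and capture its exit code
status <- system2(rscript,
                  c("--vanilla", "-e",
                    shQuote('quit(save = "no", status = 10)')))
print(status)
```

A calling shell script or scheduler can branch on this value in the same way.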
The value of .Last is for the end user to control: as it can be replaced later in the session, it cannot safely be used programmatically, e.g. by a package. The other way to set code to be run at the end of the session is to use a finalizer: see reg.finalizer.
The R.app GUI on macOS has its own version of these functions with slightly different behaviour for the save argument (the GUI's ‘Startup’ preferences for this action are taken into account).
Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988) The New S Language. Wadsworth & Brooks/Cole.
.First for setting things on startup.
## Not run: 
## Unix-flavour example
.Last <- function() {
  graphics.off() # close devices before printing
  cat("Now sending PDF graphics to the printer:\n")
  system("lpr Rplots.pdf")
  cat("bye bye...\n")
}
quit("yes")
## End(Not run)
Descriptions of the various uses of quoting in R.
Three types of quotes are part of the syntax of R: single and double quotation marks and the backtick (or back quote, ‘`’). In addition, backslash is used to escape the following character inside character constants.
Single and double quotes delimit character constants. They can be used interchangeably but double quotes are preferred (and character constants are printed using double quotes), so single quotes are normally only used to delimit character constants containing double quotes.
Backslash is used to start an escape sequence inside character constants. Escaping a character not in the following table is an error.
Single quotes need to be escaped by backslash in single-quoted strings, and double quotes in double-quoted strings.
‘\n’ | newline (aka ‘line feed’) |
‘\r’ | carriage return |
‘\t’ | tab |
‘\b’ | backspace |
‘\a’ | alert (bell) |
‘\f’ | form feed |
‘\v’ | vertical tab |
‘\\’ | backslash ‘\’ |
‘\'’ | ASCII apostrophe ‘'’ |
‘\"’ | ASCII quotation mark ‘"’ |
‘\`’ | ASCII grave accent (backtick) ‘`’ |
‘\nnn’ | character with given octal code (1, 2 or 3 digits) |
‘\xnn’ | character with given hex code (1 or 2 hex digits) |
‘\unnnn’ | Unicode character with given code (1–4 hex digits) |
‘\Unnnnnnnn’ | Unicode character with given code (1–8 hex digits) |
Alternative forms for the last two are ‘\u{nnnn}’ and ‘\U{nnnnnnnn}’. All except the Unicode escape sequences are also supported when reading character strings by scan and read.table if allowEscapes = TRUE. Unicode escapes can be used to enter Unicode characters not in the current locale's charset (when the string will be stored internally in UTF-8). The maximum allowed value for ‘\nnn’ is ‘\377’ (the same character as ‘\xff’).
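A few of the escape sequences from the table can be compared directly. This is an illustrative sketch only; each string uses a single escape style, since mixing octal/hex and Unicode escapes within one string is not allowed.

```r
checks <- c(
  octal     = identical("\101", "A"),          # octal 101 is decimal 65
  hex       = identical("\x41", "A"),          # hex 41 is decimal 65
  unicode   = identical("\u0041", "A"),        # same character via \u
  braces    = identical("\u{e9}", "\u00e9"),   # brace form of a Unicode escape
  newline   = utf8ToInt("\n") == 10L,          # line feed
  tab       = utf8ToInt("\t") == 9L,           # horizontal tab
  backslash = nchar("\\") == 1L,               # an escaped backslash is one character
  quote     = identical('\'', "'")             # escaped quote inside single quotes
)
print(checks)
```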
As from R 4.1.0 the largest allowed ‘\U’ value is ‘\U10FFFF’, the maximum Unicode point.
The parser does not allow the use of both octal/hex and Unicode escapes in a single string.
These forms will also be used by print.default when outputting non-printable characters (including backslash).
Embedded NULs are not allowed in character strings, so using escapes (such as ‘\0’) for a NUL will result in the string being truncated at that point (usually with a warning).
Raw character constants are also available using a syntax similar to the one used in C++: r"(...)" with ... any character sequence, except that it must not contain the closing sequence ‘)"’. The delimiter pairs [] and {} can also be used, and R can be used in place of r. For additional flexibility, a number of dashes can be placed between the opening quote and the opening delimiter, as long as the same number of dashes appear between the closing delimiter and the closing quote.
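The following sketch compares raw character constants with their escaped equivalents. It assumes a version of R that supports raw strings (4.0.0 or later); the path and text used are made-up examples.

```r
## Backslashes need no doubling inside a raw string
p_raw <- r"(C:\Users\me)"
p_esc <- "C:\\Users\\me"
same_path <- identical(p_raw, p_esc)

## Alternative delimiters and the upper-case prefix; \n stays two characters
no_escape <- identical(R"[a\nb]", "a\\nb")

## Dashes allow the sequence )" to appear inside the constant
with_dash <- identical(r"-(say )" here)-", "say )\" here")

print(c(same_path, no_escape, with_dash))
writeLines(p_raw)
```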
Identifiers consist of a sequence of letters, digits, the period (.) and the underscore. They must not start with a digit nor underscore, nor with a period followed by a digit. Reserved words are not valid identifiers.
The definition of a letter depends on the current locale, but only ASCII digits are considered to be digits.
Such identifiers are also known as syntactic names and may be used directly in R code. Almost always, other names can be used provided they are quoted. The preferred quote is the backtick (‘`’), and deparse will normally use it, but under many circumstances single or double quotes can be used (as a character constant will often be converted to a name). One place where backticks may be essential is to delimit variable names in formulae: see formula.
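As an informal check of these rules, make.names() can be used to see which candidate names are already syntactic: a name is left unchanged only if it is a valid identifier. The candidate names below are invented for illustration.

```r
nms <- c("valid_name", "1st column", "if", ".2way", "_tmp")
syntactic <- make.names(nms)           # nearest syntactic names
is_syntactic <- nms == syntactic       # TRUE only where no change was needed
print(data.frame(name = nms, syntactic = syntactic, is_syntactic = is_syntactic))

## A non-syntactic name can still be used when quoted with backticks
`1st column` <- 1:3
total <- sum(`1st column`)
print(total)
```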
UTF-16 surrogate pairs in ‘\unnnn\uoooo’ form will be converted to a single Unicode point, so for example ‘\uD834\uDD1E’ gives the single character ‘\U1D11E’. However, unpaired values in the surrogate range such as in the string "abc\uD834de" will be converted to a non-standard-conformant UTF-8 string (as is done by most other software): this may change in future.
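The surrogate-pair conversion described above can be sketched as follows, assuming a recent R version that performs it at parse time. Whether the character can be displayed depends on the platform and font.

```r
clef <- "\uD834\uDD1E"                     # surrogate pair form
same_char  <- identical(clef, "\U1D11E")   # single code point form
code_point <- utf8ToInt(clef)              # one code point, not two
print(c(same_char = same_char, code_point = code_point))
```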
Syntax for other aspects of the syntax.
sQuote for quoting English text.
shQuote for quoting OS commands.
The ‘R Language Definition’ manual.
'single quotes can be used more-or-less interchangeably'
"with double quotes to create character vectors"

## Single quotes inside single-quoted strings need backslash-escaping.
## Ditto double quotes inside double-quoted strings.
##
identical('"It\'s alive!", he screamed.',
          "\"It's alive!\", he screamed.") # same

## Backslashes need doubling, or they have a special meaning.
x <- "In ALGOL, you could do logical AND with /\\."
print(x)      # shows it as above ("input-like")
writeLines(x) # shows it as you like it ;-)

## Single backslashes followed by a letter are used to denote
## special characters like tab(ulator)s and newlines:
x <- "long\tlines can be\nbroken with newlines"
writeLines(x) # see also ?strwrap

## Backticks are used for non-standard variable names.
## (See make.names and ?Reserved for what counts as
## non-standard.)
`x y` <- 1:5
`x y`
d <- data.frame(`1st column` = rchisq(5, 2), check.names = FALSE)
d$`1st column`

## Backslashes followed by up to three numbers are interpreted as
## octal notation for ASCII characters.
"\110\145\154\154\157\40\127\157\162\154\144\41"

## \x followed by up to two numbers is interpreted as
## hexadecimal notation for ASCII characters.
(hw1 <- "\x48\x65\x6c\x6c\x6f\x20\x57\x6f\x72\x6c\x64\x21")

## Mixing octal and hexadecimal in the same string is OK
(hw2 <- "\110\x65\154\x6c\157\x20\127\x6f\162\x6c\144\x21")

## \u is also hexadecimal, but supports up to 4 digits,
## using Unicode specification.  In the previous example,
## you can simply replace \x with \u.
(hw3 <- "\u48\u65\u6c\u6c\u6f\u20\u57\u6f\u72\u6c\u64\u21")

## The last three are all identical to
hw <- "Hello World!"
stopifnot(identical(hw, hw1), identical(hw1, hw2), identical(hw2, hw3))

## Using Unicode makes more sense for non-latin characters.
(nn <- "\u0126\u0119\u1114\u022d\u2001\u03e2\u0954\u0f3f\u13d3\u147b\u203c")

## Mixing \x and \u throws a _parse_ error (which is not catchable!)
## Not run: 
  "\x48\u65\x6c\u6c\x6f\u20\x57\u6f\x72\u6c\x64\u21"
## End(Not run)
##   -->   Error: mixing Unicode and octal/hex escapes .....

## \U works like \u, but supports up to six hex digits.
## So we can replace \u with \U in the previous example.
n2 <- "\U0126\U0119\U1114\U022d\U2001\U03e2\U0954\U0f3f\U13d3\U147b\U203c"
stopifnot(identical(nn, n2))

## Under systems supporting multi-byte locales (and not Windows),
## \U also supports the rarer characters outside the usual 16^4 range.
## See the R language manual,
## https://cran.r-project.org/doc/manuals/r-release/R-lang.html#Literal-constants
## and bug 16098 https://bugs.r-project.org/show_bug.cgi?id=16098
## This character may or not be printable (the platform decides)
## and if it is, may not have a glyph in the font used.
"\U1d4d7" # On Windows this used to give the incorrect value of "\Ud4d7"

## nul characters (for terminating strings in C) are not allowed (parse errors)
## Not run: 
  "foo\0bar"     # Error: nul character not allowed (line 1)
  "foo\u0000bar" # same error
## End(Not run)

## A Windows path written as a raw string constant:
r"(c:\Program files\R)"

## More raw strings:
r"{(\1\2)}"
r"(use both "double" and 'single' quotes)"
r"---(\1--)-)---"
R.Version() provides detailed information about the version of R running.
R.version is a variable (a list) holding this information (and version is a copy of it for S compatibility).
R.Version()
R.version
R.version.string
version
R_compiled_by()
This gives details of the OS under which R was built, not the one under which it is currently running (for which see Sys.info).
Note that OS names might not be what you expect: for example macOS Mavericks 10.9.4 identifies itself as ‘darwin13.3.0’, Linux usually as ‘linux-gnu’, Solaris 10 as ‘solaris2.10’ and Windows as ‘mingw32’.
R.version$crt is supported on Windows since R 4.2.0 and returns "ucrt" to denote the Universal C Runtime. It would return "msvcrt" for the older Microsoft Visual C++ Runtime (but R does not use that runtime since 4.2.0).
R.Version returns a list with character-string components
platform | the platform for which R was built. A triplet of the form CPU-VENDOR-OS, as determined by the configure script. E.g., "i686-unknown-linux-gnu" or "i386-pc-mingw32". |
arch | the architecture (CPU) R was built on/for. |
os | the underlying operating system. |
crt | the C runtime on Windows. |
system | CPU and OS, separated by a comma. |
status | the status of the version (e.g., "alpha"). |
major | the major version number. |
minor | the minor version number, including the patch level. |
year | the year the version was released. |
month | the month the version was released. |
day | the day the version was released. |
svn rev | the Subversion revision number, which should be either "unknown" or a single number. |
language | always "R". |
version.string | a character string concatenating some of the info above, useful for plotting, etc. |
R.version and version are lists of class "simple.list" which has a print method.
R_compiled_by returns a two-element character vector giving details of the C and Fortran compilers used to build R. (Empty strings if no information is available.)
Do not use R.version$os to test the platform the code is running on: use .Platform$OS.type instead. Slightly different versions of the OS may report different values of R.version$os, as may different versions of R. Alternatively, osVersion typically contains more details about the platform R is running on.
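A short sketch of how these pieces of information are typically combined in code: the version components form the same version number that getRversion() reports, and platform tests go through .Platform$OS.type. The variable names are illustrative only.

```r
v <- R.Version()

## major and minor together give the usual dotted version number
full <- paste(v$major, v$minor, sep = ".")
same_version <- identical(full, as.character(getRversion()))

## R.version.string duplicates one component of R.version
same_string <- identical(R.version.string, R.version$version.string)

## Platform test as recommended above, rather than via R.version$os
os_type <- .Platform$OS.type
print(c(version = full, os_type = os_type))

## Version-dependent code is best written with getRversion()
if (getRversion() >= "4.0.0") message("raw string constants are available")
```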
R.version.string is a copy of R.version$version.string for simplicity and backwards compatibility.
sessionInfo which provides additional information; getRversion typically used inside R code, osVersion, .Platform, Sys.info.
require(graphics)

R.version$os # to check how lucky you are ...
plot(0) # any plot
mtext(R.version.string, side = 1, line = 4, adj = 1) # a useful bottom-right note

## a good way to detect macOS:
if(grepl("^darwin", R.version$os)) message("running on macOS")

## Short R version string, ("space free", useful in file/directory names;
##                          also fine for unreleased versions of R):
shortRversion <- function() {
   rvs <- R.version.string
   if(grepl("devel", (st <- R.version$status)))
       rvs <- sub(paste0(" ",st," "), "-devel_", rvs, fixed=TRUE)
   gsub("[()]", "", gsub(" ", "_", sub(" version ", "-", rvs)))
}
shortRversion()
.Random.seed is an integer vector, containing the random number generator (RNG) state for random number generation in R. It can be saved and restored, but should not be altered by the user.
RNGkind is a more friendly interface to query or set the kind of RNG in use.
RNGversion can be used to set the random generators as they were in an earlier R version (for reproducibility).
set.seed is the recommended way to specify seeds.
.Random.seed <- c(rng.kind, n1, n2, ...)
RNGkind(kind = NULL, normal.kind = NULL, sample.kind = NULL)
RNGversion(vstr)
set.seed(seed, kind = NULL, normal.kind = NULL, sample.kind = NULL)
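kind | character or NULL. If kind is a character string, set R's RNG to the kind desired. Use "default" to return to the R default. See ‘Details’ for the interpretation of NULL. |
normal.kind | character string or NULL. If it is a character string, set the method of Normal generation. Use "default" to return to the R default. NULL makes no change. |
sample.kind | character string or NULL. If it is a character string, set the method of discrete uniform generation (used in sample, for instance). Use "default" to return to the R default. NULL makes no change. |
seed | a single value, interpreted as an integer, or NULL (see ‘Details’). |
vstr | a character string containing a version number, e.g., "1.6.2". |
rng.kind | integer code in 0:k for the above kind. |
n1, n2, ... | integers. See the details for how many are required (which depends on rng.kind). |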
The currently available RNG kinds are given below. kind is partially matched to this list. The default is "Mersenne-Twister".
"Wichmann-Hill":
The seed, .Random.seed[-1] == r[1:3] is an integer vector of length 3, where each r[i] is in 1:(p[i] - 1), where p is the length 3 vector of primes, p = (30269, 30307, 30323). The Wichmann–Hill generator has a cycle length of 6.9536e12 (= prod(p-1)/4, see Applied Statistics (1984) 33, 123 which corrects the original article). It exhibits 12 clear failures in the TestU01 Crush suite and 22 in the BigCrush suite (L'Ecuyer, 2007).
"Marsaglia-Multicarry":
A multiply-with-carry RNG is used, as recommended by George Marsaglia in his post to the mailing list ‘sci.stat.math’. It has a period of more than 2^60.
It exhibits 40 clear failures in L'Ecuyer's TestU01 Crush suite. Combined with Ahrens-Dieter or Kinderman-Ramage it exhibits deviations from normality even for univariate distribution generation. See PR#18168 for a discussion.
The seed is two integers (all values allowed).
"Super-Duper":
Marsaglia's famous Super-Duper from the 70's. This is the original version which does not pass the MTUPLE test of the Diehard battery. It has a period of about 4.6e18 for most initial seeds. The seed is two integers (all values allowed for the first seed: the second must be odd).
We use the implementation by Reeds et al. (1982–84).
The two seeds are the Tausworthe and congruence long integers, respectively.
It exhibits 25 clear failures in the TestU01 Crush suite (L'Ecuyer, 2007).
"Mersenne-Twister":
From Matsumoto and Nishimura (1998); code updated in 2002. A twisted GFSR with period 2^19937 - 1 and equidistribution in 623 consecutive dimensions (over the whole period). The ‘seed’ is a 624-dimensional set of 32-bit integers plus a current position in that set.
R uses its own initialization method due to B. D. Ripley and is not affected by the initialization issue in the 1998 code of Matsumoto and Nishimura addressed in a 2002 update.
It exhibits 2 clear failures in each of the TestU01 Crush and the BigCrush suite (L'Ecuyer, 2007).
"Knuth-TAOCP-2002":
A 32-bit integer GFSR using lagged Fibonacci sequences with subtraction. That is, the recurrence used is
X[j] = (X[j-100] - X[j-37]) mod 2^30
and the ‘seed’ is the set of the 100 last numbers (actually recorded as 101 numbers, the last being a cyclic shift of the buffer). The period is around 2^129.
"Knuth-TAOCP":
An earlier version from Knuth (1997).
The 2002 version was not backwards compatible with the earlier version: the initialization of the GFSR from the seed was altered. R did not allow you to choose consecutive seeds, the reported ‘weakness’, and already scrambled the seeds. Otherwise, the algorithm is identical to Knuth-TAOCP-2002, with the same lagged Fibonacci recurrence formula.
Initialization of this generator is done in interpreted R code and so takes a short but noticeable time.
It exhibits 3 clear failures in the TestU01 Crush suite and 4 clear failures in the BigCrush suite (L'Ecuyer, 2007).
"L'Ecuyer-CMRG":
A ‘combined multiple-recursive generator’ from L'Ecuyer (1999), each element of which is a feedback multiplicative generator with three integer elements: thus the seed is a (signed) integer vector of length 6. The period is around 2^191.
The 6 elements of the seed are internally regarded as 32-bit unsigned integers. Neither the first three nor the last three should be all zero, and they are limited to less than 4294967087 and 4294944443 respectively.
This is not particularly interesting of itself, but provides the basis for the multiple streams used in package parallel.
It exhibits 6 clear failures in each of the TestU01 Crush and the BigCrush suite (L'Ecuyer, 2007).
"user-supplied":
Use a user-supplied generator. See Random.user for details.
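The seed sizes described for the built-in generators can be inspected by switching kinds and looking at the length of .Random.seed, which holds one coding element plus the generator's state. The sketch below also recomputes the Wichmann-Hill cycle length from its primes. It changes the RNG kind temporarily and restores the previous kinds at the end; the variable names are illustrative.

```r
old <- RNGkind()                       # remember the current kinds

kinds <- c("Wichmann-Hill", "Marsaglia-Multicarry", "Super-Duper",
           "Mersenne-Twister", "Knuth-TAOCP", "Knuth-TAOCP-2002",
           "L'Ecuyer-CMRG")
seed_len <- vapply(kinds, function(k) {
  set.seed(1, kind = k)                # switch generator and seed it
  length(.Random.seed)                 # 1 coding element + generator state
}, integer(1))
print(seed_len)

## Wichmann-Hill: three seeds, each in 1:(p[i] - 1)
p <- c(30269, 30307, 30323)
set.seed(1, kind = "Wichmann-Hill")
r <- .Random.seed[-1]
wh_ok <- all(r >= 1 & r <= p - 1)
cycle <- prod(p - 1) / 4               # cycle length quoted in the text
print(cycle)

RNGkind(old[1], old[2], old[3])        # restore the previous kinds
```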
normal.kind can be "Kinderman-Ramage", "Buggy Kinderman-Ramage" (not for set.seed), "Ahrens-Dieter", "Box-Muller", "Inversion" (the default), or "user-supplied". (For inversion, see the reference in qnorm.) The Kinderman-Ramage generator used in versions prior to 1.7.0 (now called "Buggy") had several approximation errors and should only be used for reproduction of old results. The "Box-Muller" generator is stateful as pairs of normals are generated and returned sequentially. The state is reset whenever it is selected (even if it is the current normal generator) and when kind is changed.
sample.kind can be "Rounding" or "Rejection", or partial matches to these. The former was the default in versions prior to 3.6.0: it made sample noticeably non-uniform on large populations, and should only be used for reproduction of old results. See PR#17494 for a discussion.
set.seed uses a single integer argument to set as many seeds as are required. It is intended as a simple way to get quite different seeds by specifying small integer arguments, and also as a way to get valid seed sets for the more complicated methods (especially "Mersenne-Twister" and "Knuth-TAOCP"). There is no guarantee that different values of seed will seed the RNG differently, although any exceptions would be extremely rare. If called with seed = NULL it re-initializes (see ‘Note’) as if no seed had yet been set.
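A minimal sketch of the reproducibility that set.seed provides: the same seed under the same generator settings yields the same stream. The seed value 2024 is arbitrary.

```r
set.seed(2024)
a <- runif(3)

set.seed(2024)            # same seed: the stream restarts from the same state
b <- runif(3)

reproducible <- identical(a, b)
print(reproducible)

set.seed(NULL)            # re-initialize as if no seed had been set
```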
The use of kind = NULL, normal.kind = NULL or sample.kind = NULL in RNGkind or set.seed selects the currently-used generator (including that used in the previous session if the workspace has been restored): if no generator has been used it selects "default".
.Random.seed is an integer vector whose first element codes the kind of RNG and normal generator. The lowest two decimal digits are in 0:(k-1) where k is the number of available RNGs. The hundreds represent the type of normal generator (starting at 0), and the ten thousands represent the type of discrete uniform sampler.
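The coding of the first element can be unpacked with integer arithmetic. The sketch below selects the current default kinds explicitly and decodes the result; the specific values shown in the comments are what a standard build of recent R versions is expected to give, and the previous kinds are restored afterwards.

```r
old <- RNGkind()
set.seed(1, kind = "Mersenne-Twister",
         normal.kind = "Inversion", sample.kind = "Rejection")

code    <- .Random.seed[1]
rng     <- code %% 100L               # lowest two digits: uniform RNG
normal  <- (code %/% 100L) %% 100L    # hundreds: normal generator
sampler <- code %/% 10000L            # ten thousands: discrete uniform sampler
print(c(code = code, rng = rng, normal = normal, sampler = sampler))
# expected on a standard build: code 10403, i.e. rng 3, normal 3, sampler 1

RNGkind(old[1], old[2], old[3])       # restore the previous kinds
```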
In the underlying C, .Random.seed[-1] is unsigned; therefore in R .Random.seed[-1] can be negative, due to the representation of an unsigned integer by a signed integer.
RNGkind returns a three-element character vector of the RNG, normal and sample kinds selected before the call, invisibly if either argument is not NULL. A type starts a session as the default, and is selected either by a call to RNGkind or by setting .Random.seed in the workspace. (NB: prior to R 3.6.0 the first two kinds were returned in a two-element character vector.)
RNGversion returns the same information as RNGkind about the defaults in a specific R version.
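The return values of RNGkind and the effect of RNGversion can be sketched as follows. The example temporarily switches generators and restores the original kinds at the end; the warning that R issues when the old "Rounding" sampler is selected is suppressed here only to keep the output short.

```r
old <- RNGkind()                       # query only: three-element vector
print(old)

prev <- RNGkind("Wichmann-Hill")       # returns the kinds in effect before the call
now  <- RNGkind()[1]
print(now)

## Settings of an R version before 3.6.0, which used the "Rounding" sampler
suppressWarnings(RNGversion("3.5.0"))
legacy <- RNGkind()
print(legacy)

RNGkind(old[1], old[2], old[3])        # restore the original kinds
```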
set.seed
returnsNULL
, invisibly.
Initially, there is no seed; a new one is created from the currenttime and the process ID when one is required. Hence differentsessions will give different simulation results, by default. However,the seed might be restored from a previous session if a previouslysaved workspace is restored.
.Random.seed saves the seed set for the uniform random-number generator, at least for the system generators. It does not necessarily save the state of other generators, and in particular does not save the state of the Box–Muller normal generator. If you want to reproduce work later, call set.seed (preferably with explicit values for kind and normal.kind) rather than set .Random.seed.
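The advice above can be sketched as follows. The seed value 2024 and the variable names are arbitrary choices for illustration; the kinds named are the ones R currently uses by default, spelled out so the result does not depend on session defaults.

```r
## Sketch: reproducible draws via set.seed() with explicit kinds
old <- RNGkind()                     # remember the kinds in use

set.seed(2024, kind = "Mersenne-Twister", normal.kind = "Inversion",
         sample.kind = "Rejection")
a <- rnorm(5)

set.seed(2024, kind = "Mersenne-Twister", normal.kind = "Inversion",
         sample.kind = "Rejection")
b <- rnorm(5)

identical(a, b)                      # same seed and kinds, same draws

## restore the previously selected kinds
RNGkind(old[1], old[2], old[3])
```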
The object .Random.seed is only looked for in the user's workspace.
Do not rely on randomness of low-order bits from RNGs. Most of the supplied uniform generators return 32-bit integer values that are converted to doubles, so they take at most 2^32 distinct values and long runs will return duplicated values (Wichmann-Hill is the exception, and all give at least 30 varying bits.)
of RNGkind: Martin Maechler. Current implementation, B. D. Ripley with modifications by Duncan Murdoch.
Ahrens, J. H. and Dieter, U. (1973). Extensions of Forsythe's method for random sampling from the normal distribution. Mathematics of Computation, 27, 927–937.
Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988). The New S Language. Wadsworth & Brooks/Cole. (set.seed, storing in .Random.seed.)
Box, G. E. P. and Muller, M. E. (1958). A note on the generation of normal random deviates. Annals of Mathematical Statistics, 29, 610–611. doi:10.1214/aoms/1177706645.
De Matteis, A. and Pagnutti, S. (1993). Long-range Correlation Analysis of the Wichmann-Hill Random Number Generator. Statistics and Computing, 3, 67–70. doi:10.1007/BF00153065.
Kinderman, A. J. and Ramage, J. G. (1976). Computer generation of normal random variables. Journal of the American Statistical Association, 71, 893–896. doi:10.2307/2286857.
Knuth, D. E. (1997). The Art of Computer Programming. Volume 2, third edition. Source code at https://www-cs-faculty.stanford.edu/~knuth/taocp.html.
Knuth, D. E. (2002). The Art of Computer Programming. Volume 2, third edition, ninth printing.
L'Ecuyer, P. (1999). Good parameters and implementations for combined multiple recursive random number generators. Operations Research, 47, 159–164. doi:10.1287/opre.47.1.159.
L'Ecuyer, P. and Simard, R. (2007). TestU01: A C Library for Empirical Testing of Random Number Generators. ACM Transactions on Mathematical Software, 33, Article 22. doi:10.1145/1268776.1268777. The TestU01 C library is available from http://simul.iro.umontreal.ca/testu01/tu01.html or also https://github.com/umontreal-simul/TestU01-2009.
Marsaglia, G. (1997). A random number generator for C. Discussion paper, posting on Usenet newsgroup sci.stat.math on September 29, 1997.
Marsaglia, G. and Zaman, A. (1994). Some portable very-long-period random number generators. Computers in Physics, 8, 117–121. doi:10.1063/1.168514.
Matsumoto, M. and Nishimura, T. (1998). Mersenne Twister: A 623-dimensionally equidistributed uniform pseudo-random number generator, ACM Transactions on Modeling and Computer Simulation, 8, 3–30. Source code formerly at http://www.math.keio.ac.jp/~matumoto/emt.html. Now see http://www.math.sci.hiroshima-u.ac.jp/~m-mat/MT/VERSIONS/C-LANG/c-lang.html.
Reeds, J., Hubert, S. and Abrahams, M. (1982–4). C implementation of SuperDuper, University of California at Berkeley. (Personal communication from Jim Reeds to Ross Ihaka.)
Wichmann, B. A. and Hill, I. D. (1982). Algorithm AS 183: An Efficient and Portable Pseudo-random Number Generator. Applied Statistics, 31, 188–190; Remarks: 34, 198 and 35, 89. doi:10.2307/2347988.
sample for random sampling with and without replacement.
Distributions for functions for random-variate generation from standard distributions.
require(stats)

## Seed the current RNG, i.e., set the RNG status
set.seed(42); u1 <- runif(30)
set.seed(42); u2 <- runif(30) # the same because of identical RNG status:
stopifnot(identical(u1, u2))

## the default random seed is 626 integers, so only print a few
runif(1); .Random.seed[1:6]; runif(1); .Random.seed[1:6]
## If there is no seed, a "random" new one is created:
rm(.Random.seed); runif(1); .Random.seed[1:6]

ok <- RNGkind()
RNGkind("Wich") # (partial string matching on 'kind')

## This shows how 'runif(.)' works for Wichmann-Hill,
## using only R functions:

p.WH <- c(30269, 30307, 30323)
a.WH <- c(  171,   172,   170)
next.WHseed <- function(i.seed = .Random.seed[-1])
  { (a.WH * i.seed) %% p.WH }
my.runif1 <- function(i.seed = .Random.seed)
  { ns <- next.WHseed(i.seed[-1]); sum(ns / p.WH) %% 1 }
set.seed(1998-12-04)# (when the next lines were added to the souRce)
rs <- .Random.seed
(WHs <- next.WHseed(rs[-1]))
u <- runif(1)
stopifnot(
  next.WHseed(rs[-1]) == .Random.seed[-1],
  all.equal(u, my.runif1(rs))
)

## ----
.Random.seed
RNGkind("Super") # matches "Super-Duper"
RNGkind()
.Random.seed # new, corresponding to Super-Duper

## Reset:
RNGkind(ok[1])

RNGversion(getRversion()) # the default version for this R version

## ----
sum(duplicated(runif(1e6))) # around 110 for default generator
## and we would expect about almost sure duplicates beyond about
qbirthday(1 - 1e-6, classes = 2e9) # 235,000
Function RNGkind allows user-coded uniform and normal random number generators to be supplied. The details are given here.
A user-specified uniform RNG is called from entry points in dynamically-loaded compiled code. The user must supply the entry point user_unif_rand, which takes no arguments and returns a pointer to a double. The example below will show the general pattern. The generator should have at least 25 bits of precision.
Optionally, the user can supply the entry point user_unif_init, which is called with an unsigned int argument when RNGkind (or set.seed) is called, and is intended to be used to initialize the user's RNG code. The argument is intended to be used to set the ‘seeds’; it is the seed argument to set.seed or an essentially random seed if RNGkind is called.
If only these functions are supplied, no information about the generator's state is recorded in .Random.seed. Optionally, functions user_unif_nseed and user_unif_seedloc can be supplied which are called with no arguments and should return pointers to the number of seeds and to an integer (specifically, ‘Int32’) array of seeds. Calls to GetRNGstate and PutRNGstate will then copy this array to and from .Random.seed.
A user-specified normal RNG is specified by a single entry point user_norm_rand, which takes no arguments and returns a pointer to a double.
As with all compiled code, mis-specifying these functions can crash R. Do include the ‘R_ext/Random.h’ header file for type checking.
## Not run:
## Marsaglia's congruential PRNG
#include <R_ext/Random.h>

static Int32 seed;
static double res;
static int nseed = 1;

double * user_unif_rand(void)
{
    seed = 69069 * seed + 1;
    res = seed * 2.32830643653869e-10;
    return &res;
}

void user_unif_init(Int32 seed_in) { seed = seed_in; }
int * user_unif_nseed(void) { return &nseed; }
int * user_unif_seedloc(void) { return (int *) &seed; }

/* ratio-of-uniforms for normal */
#include <math.h>
static double x;

double * user_norm_rand(void)
{
    double u, v, z;
    do {
        u = unif_rand();
        v = 0.857764 * (2. * unif_rand() - 1);
        x = v/u; z = 0.25 * x * x;
        if (z < 1. - u) break;
        if (z > 0.259/u + 0.35) continue;
    } while (z > -log(u));
    return &x;
}

## Use under Unix:
R CMD SHLIB urand.c
R
> dyn.load("urand.so")
> RNGkind("user")
> runif(10)
> .Random.seed
> RNGkind(, "user")
> rnorm(10)
> RNGkind()
[1] "user-supplied" "user-supplied"

## End(Not run)
range returns a vector containing the minimum and maximum of all the given arguments.
range(..., na.rm = FALSE)
## Default S3 method:
range(..., na.rm = FALSE, finite = FALSE)
## same for classes 'Date' and 'POSIXct'

.rangeNum(..., na.rm, finite, isNumeric)
... | any |
na.rm | logical, indicating if |
finite | logical, indicating if all non-finite elements should be omitted. |
isNumeric | a |
range is a generic function: methods can be defined for it directly or via the Summary group generic. For this to work properly, the arguments ... should be unnamed, and dispatch is on the first argument.
If na.rm is FALSE, NA and NaN values in any of the arguments will cause NA values to be returned, otherwise NA values are ignored.
If finite is TRUE, the minimum and maximum of all finite values is computed, i.e., finite = TRUE includes na.rm = TRUE.
A special situation occurs when there is no (after omission of NAs) nonempty argument left, see min.
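The interaction of na.rm and finite, and the empty-input case, can be seen in a small sketch. The vector x is made up for illustration, and suppressWarnings() is used only to keep the empty case quiet.

```r
x <- c(3, NA, -Inf, 7, Inf, 1)

range(x)                 # NA present and na.rm = FALSE: both values are NA
range(x, na.rm = TRUE)   # NA dropped, infinities kept
range(x, finite = TRUE)  # NA and infinite values dropped

## nothing left to summarise: behaves like min()/max() on empty input
r0 <- suppressWarnings(range(numeric(0)))
r0
```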
This is part of the S4 Summary group generic. Methods for it must use the signature x, ..., na.rm.
Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988) The New S Language. Wadsworth & Brooks/Cole.
The extendrange() utility in package grDevices.
(r.x <- range(stats::rnorm(100)))
diff(r.x) # the SAMPLE range

x <- c(NA, 1:3, -1:1/0); x
range(x)
range(x, na.rm = TRUE)
range(x, finite = TRUE)
Returns the sample ranks of the values in a vector. Ties (i.e., equal values) and missing values can be handled in several ways.
rank(x, na.last = TRUE,
     ties.method = c("average", "first", "last", "random", "max", "min"))
x | a numeric, complex, character or logical vector. |
na.last | a logical or character string controlling the treatment of |
ties.method | a character string specifying how ties are treated, see ‘Details’; can be abbreviated. |
If all components are different (and no NAs), the ranks are well defined, with values in seq_along(x). With some values equal (called ‘ties’), the argument ties.method determines the result at the corresponding indices. The "first" method results in a permutation with increasing values at each index set of ties, and analogously "last" with decreasing values. The "random" method puts these in random order whereas the default, "average", replaces them by their mean, and "max" and "min" replaces them by their maximum and minimum respectively, the latter being the typical sports ranking.
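A compact way to compare the deterministic tie rules side by side is to apply each to the same small vector. The vector and the name ties are illustrative; "random" is left out because its result changes from call to call.

```r
x <- c(10, 20, 10, 30)   # the two 10s are tied for ranks 1 and 2

ties <- c("average", "first", "last", "min", "max")
sapply(ties, function(m) rank(x, ties.method = m))
```

Each column of the resulting matrix holds the ranks under one method, so the rows for the tied elements (the first and third) show where the methods differ.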
NA values are never considered to be equal: for na.last = TRUE and na.last = FALSE they are given distinct ranks in the order in which they occur in x.
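The treatment of missing values can be sketched on a short vector. The settings "keep" and NA for na.last are taken from the full argument description of rank rather than from the paragraph above; the vector is illustrative.

```r
x <- c(2, NA, 1, NA)

rank(x)                     # na.last = TRUE: NAs ranked last, in order of occurrence
rank(x, na.last = FALSE)    # NAs ranked first, in order of occurrence
rank(x, na.last = "keep")   # NAs keep rank NA
rank(x, na.last = NA)       # NAs removed from the result
```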
NB: rank is not itself generic but xtfrm is, and rank(xtfrm(x), ....) will have the desired result if there is a xtfrm method. Otherwise, rank will make use of ==, >, is.na and extraction methods for classed objects, possibly rather slowly.
A numeric vector of the same length as x with names copied from x (unless na.last = NA, when missing values are removed). The vector is of integer type unless x is a long vector or ties.method = "average" when it is of double type (whether or not there are any ties).
Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988) The New S Language. Wadsworth & Brooks/Cole.
order and sort; xtfrm, see above.
(r1 <- rank(x1 <- c(3, 1, 4, 15, 92)))
x2 <- c(3, 1, 4, 1, 5, 9, 2, 6, 5, 3, 5)
names(x2) <- letters[1:11]
(r2 <- rank(x2)) # ties are averaged

## rank() is "idempotent": rank(rank(x)) == rank(x) :
stopifnot(rank(r1) == r1, rank(r2) == r2)

## ranks without averaging
rank(x2, ties.method= "first")  # first occurrence wins
rank(x2, ties.method= "last")   # last occurrence wins
rank(x2, ties.method= "random") # ties broken at random
rank(x2, ties.method= "random") # and again

## keep ties ties, no average
(rma <- rank(x2, ties.method= "max"))  # as used classically
(rmi <- rank(x2, ties.method= "min"))  # as in Sports
stopifnot(rma + rmi == round(r2 + r2))

## Comparing all tie.methods:
tMeth <- eval(formals(rank)$ties.method)
rx2 <- sapply(tMeth, function(M) rank(x2, ties.method=M))
cbind(x2, rx2)
## ties.method's does not matter w/o ties:
x <- sample(47)
rx <- sapply(tMeth, function(MM) rank(x, ties.method=MM))
stopifnot(all(rx[,1] == rx))
rapply is a recursive version of lapply with flexibility in how the result is structured (how = "..").
rapply(object, f, classes = "ANY", deflt = NULL, how = c("unlist", "replace", "list"), ...)
object | a |
f | a |
classes | character vector of |
deflt | the default result (not used if |
how | character string partially matching the three possibilities given: see ‘Details’. |
... | additional arguments passed to the call to |
This function has two basic modes. If how = "replace", each element of object which is not itself list-like and has a class included in classes is replaced by the result of applying f to the element.
Otherwise, with mode how = "list" or how = "unlist", conceptually object is copied, all non-list elements which have a class included in classes are replaced by the result of applying f to the element and all others are replaced by deflt. Finally, if how = "unlist", unlist(recursive = TRUE) is called on the result.
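The difference between the modes can be sketched on one small nested list. The list X and the result names are illustrative; only double and character leaves are used so that the class matching is unambiguous.

```r
X <- list(a = c(1.5, 2.5), b = list(c = "xyz", d = 4))

## "replace": matching leaves are transformed, the rest are left untouched
replaced <- rapply(X, function(v) v * 10, classes = "numeric", how = "replace")
str(replaced)

## "list": same shape, but non-matching leaves become 'deflt'
as_list <- rapply(X, nchar, classes = "character",
                  deflt = NA_integer_, how = "list")
str(as_list)

## "unlist": as "list", then flattened by unlist(recursive = TRUE)
flat <- rapply(X, nchar, classes = "character",
               deflt = NA_integer_, how = "unlist")
flat
```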
The semantics differ in detail from lapply: in particular the arguments are evaluated before calling the C code.
In R 3.5.x and earlier, object was required to be a list, which was not the case for its list-like components.
If how = "unlist", a vector, otherwise “list-like” of similar structure as object.
Chambers, J. A. (1998) Programming with Data. Springer. (rapply is only described briefly there.)
X <- list(list(a = pi, b = list(c = 1L)), d = "a test")
# the "identity operation":
rapply(X, function(x) x, how = "replace") -> X.; stopifnot(identical(X, X.))
rapply(X, sqrt, classes = "numeric", how = "replace")
rapply(X, deparse, control = "all") # passing extras. argument of deparse()
rapply(X, nchar, classes = "character", deflt = NA_integer_, how = "list")
rapply(X, nchar, classes = "character", deflt = NA_integer_, how = "unlist")
rapply(X, nchar, classes = "character", how = "unlist")
rapply(X, log, classes = "numeric", how = "replace", base = 2)

## with expression() / list():
E  <- expression(list(a = pi, b = expression(c = C1 * C2)), d = "a test")
LE <- list(expression(a = pi, b = expression(c = C1 * C2)), d = "a test")
rapply(E, nchar, how="replace") # "expression(c = C1 * C2)" are 23 chars
rapply(E, nchar, classes = "character", deflt = NA_integer_, how = "unlist")
rapply(LE, as.character) # a "pi" | b1 "expression" | b2 "C1 * C2" ..
rapply(LE, nchar)        # (see above)
stopifnot(exprs = {
  identical(E , rapply(E , identity, how = "replace"))
  identical(LE, rapply(LE, identity, how = "replace"))
})
Creates or tests for objects of type "raw".
raw(length = 0)
as.raw(x)
is.raw(x)
length | desired length. |
x | object to be coerced. |
The raw type is intended to hold raw bytes. It is possible to extract subsequences of bytes, and to replace elements (but only by elements of a raw vector). The relational operators (see Comparison, using the numerical order of the byte representation) work, as do the logical operators (see Logic) with a bitwise interpretation.
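A brief sketch of these operations on single bytes follows. The values 12 and 10 are arbitrary, chosen so the bit patterns are easy to follow by eye.

```r
a <- as.raw(12)          # 0x0c, bits 00001100
b <- as.raw(10)          # 0x0a, bits 00001010

a & b                    # bitwise AND
a | b                    # bitwise OR
xor(a, b)                # bitwise exclusive OR
!a                       # bitwise complement
a > b                    # comparison by numeric byte value

## replacement must itself supply raw values
v <- raw(3)
v[2] <- as.raw(255)
v
```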
A raw vector is printed with each byte separately represented as a pair of hex digits. If you want to see a character representation (with escape sequences for non-printing characters) use rawToChar.
Coercion to raw treats the input values as representing small (decimal) integers, so the input is first coerced to integer, and then values which are outside the range [0 ... 255] or are NA are set to 0 (the nul byte).
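The coercion rule can be checked directly. The input values are illustrative, and suppressWarnings() is used here only because out-of-range and NA inputs produce a warning.

```r
vals <- c(0, 65, 255, 256, -1, NA)
r <- suppressWarnings(as.raw(vals))   # out-of-range and NA values warn
r                                     # printed as pairs of hex digits
as.integer(r)                         # 256, -1 and NA have become 0
```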
as.raw and is.raw are primitive functions.
raw creates a raw vector of the specified length. Each element of the vector is equal to 0. Raw vectors are used to store fixed-length sequences of bytes.
as.raw attempts to coerce its argument to be of raw type. The (elementwise) answer will be 0 unless the coercion succeeds (or if the original value successfully coerces to 0).
is.raw returns true if and only if typeof(x) == "raw".
& for bitwise operations on raw vectors.
xx <- raw(2)
xx[1] <- as.raw(40)     # NB, not just 40.
xx[2] <- charToRaw("A")
xx       ## 28 41 -- raw prints hexadecimals
dput(xx) ## as.raw(c(0x28, 0x41))
as.integer(xx) ## 40 65

x <- "A test string"
(y <- charToRaw(x))
is.vector(y) # TRUE
rawToChar(y)
is.raw(x)
is.raw(y)
stopifnot( charToRaw("\xa3") == as.raw(0xa3) )

isASCII <- function(txt) all(charToRaw(txt) <= as.raw(127))
isASCII(x) # true
isASCII("\xa325.63") # false (in Latin-1, this is an amount in UK pounds)
Input and output raw connections.
rawConnection(object, open = "r")
rawConnectionValue(con)
object | character or raw vector. A description of the connection. For an input this is an R raw vector object, and for an output connection the name for the connection. |
open | character. Any of the standard connection open modes. |
con | an output raw connection. |
An input raw connection is opened and the raw vector is copied at the time the connection object is created, and close destroys the copy.
An output raw connection is opened and creates an R raw vector internally. The raw vector can be retrieved via rawConnectionValue.
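A minimal round trip through an output and then an input raw connection might look like this. The names out, inp, bytes and vals are illustrative; the modes "w" and "r" are assumed to be sufficient because raw connections are always binary.

```r
out <- rawConnection(raw(0), open = "w")    # output connection, starts empty
writeBin(c(1L, 2L, 3L), out, size = 1)      # one byte per integer
bytes <- rawConnectionValue(out)            # retrieve the accumulated raw vector
close(out)

inp <- rawConnection(bytes, open = "r")     # input connection over a copy of 'bytes'
vals <- readBin(inp, what = "integer", n = 3, size = 1)
close(inp)

bytes
vals
```

Note that rawConnectionValue must be called before close, since closing the output connection discards its internal vector.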
If a connection is open for both input and output the initial raw vector supplied is copied when the connection is opened.
For rawConnection, a connection object of class "rawConnection" which inherits from class "connection".
For rawConnectionValue, a raw vector.
As output raw connections keep the internal raw vector up to date call-by-call, they are relatively expensive to use (although over-allocation is used), and it may be better to use an anonymous file() connection to collect output.
On (rare) platforms where vsnprintf does not return the needed length of output there is a 100,000 character limit on the length of line for output connections: longer lines will be truncated with a warning.
zz <- rawConnection(raw(0), "r+") # start with empty raw vector
writeBin(LETTERS, zz)
seek(zz, 0)
readLines(zz) # raw vector has embedded nuls
seek(zz, 0)
writeBin(letters[1:3], zz)
rawConnectionValue(zz)
close(zz)
Conversion to and from and manipulation of objects of type "raw", both used as bits or “packed” 8 bits.
charToRaw(x)
rawToChar(x, multiple = FALSE)
rawShift(x, n)
rawToBits(x)
intToBits(x)
packBits(x, type = c("raw", "integer", "double"))

numToInts(x)
numToBits(x)
x | object to be converted or shifted. |
multiple | logical: should the conversion be to a single character string or multiple individual characters? |
n | the number of bits to shift. Positive numbers shift right and negative numbers shift left: allowed values are |
type | the result type, partially matched. |
packBits accepts raw, integer or logical inputs, the last two without any NAs.
numToBits(.) and packBits(., type="double") are inverse functions of each other, see also the examples.
Note that ‘bytes’ are not necessarily the same as characters, e.g. in UTF-8 locales.
charToRaw converts a length-one character string to raw bytes. It does so without taking into account any declared encoding (see Encoding).
rawToChar converts raw bytes either to a single character string or a character vector of single bytes (with "" for 0). (Note that a single character string could contain embedded NULs; only trailing nulls are allowed and will be removed.) In either case it is possible to create a result which is invalid in a multibyte locale, e.g. one using UTF-8. Long vectors are allowed if multiple is true.
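The two conversions, and the distinction between bytes and characters, can be sketched as follows. The strings are illustrative, and the euro sign is written with a \u escape so that the example does not depend on the encoding of the source file.

```r
s <- "R raw"
b <- charToRaw(s)
b                              # one byte per ASCII character
rawToChar(b)                   # back to a single string
rawToChar(b, multiple = TRUE)  # one string per byte

## in UTF-8, one character may occupy several bytes
euro <- "\u20ac"               # the euro sign
n_bytes <- length(charToRaw(enc2utf8(euro)))
n_bytes
```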
rawShift(x, n) shifts the bits in x by n positions to the right, see the argument n, above.
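Because the direction wording is easy to misread, a small sketch may help. Here "right" refers to the least-significant-bit-first order in which rawToBits displays a byte, so a positive n increases the numeric value of the byte. The value 5 is arbitrary.

```r
z <- as.raw(5)            # 0x05
rawShift(z,  1)           # 0x0a: every bit moves to the next higher position
rawShift(z, -1)           # 0x02: the lowest bit is dropped
rawShift(z, -3)           # 0x00: all set bits shifted off

## compare in the least-significant-first display
rawToBits(z)
rawToBits(rawShift(z, 1))
```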
rawToBits returns a raw vector of 8 times the length of a raw vector with entries 0 or 1. intToBits returns a raw vector of 32 times the length of an integer vector with entries 0 or 1. (Non-integral numeric values are truncated to integers.) In both cases the unpacking is least-significant bit first.
packBits packs its input (using only the lowest bit for raw or integer vectors) least-significant bit first to a raw, integer or double (“numeric”) vector.
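Unpacking and repacking can be sketched on a single byte and a single integer. The value 6 is arbitrary, and the names bits and ib are illustrative.

```r
bits <- rawToBits(as.raw(6))      # 6 = 2 + 4
as.integer(bits)                  # least-significant bit first
packBits(bits)                    # back to the original byte

ib <- intToBits(6L)
length(ib)                        # 32 bits per integer
packBits(ib, type = "integer")    # back to the original integer
```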
numToInts() and numToBits() split double precision numeric vectors either into two integers each or into 64 bits each, stored as raw. In both cases the unpacking is least-significant element first.
x <- "A test string"
(y <- charToRaw(x))
is.vector(y) # TRUE

rawToChar(y)
rawToChar(y, multiple = TRUE)
(xx <- c(y, charToRaw("&"), charToRaw(" more")))
rawToChar(xx)

rawShift(y, 1)
rawShift(y,-2)

rawToBits(y)

showBits <- function(r) stats::symnum(as.logical(rawToBits(r)))

z <- as.raw(5)
z ; showBits(z)
showBits(rawShift(z, 1)) # shift to right
showBits(rawShift(z, 2))
showBits(z)
showBits(rawShift(z, -1)) # shift to left
showBits(rawShift(z, -2)) # ..
showBits(rawShift(z, -3)) # shifted off entirely

packBits(as.raw(0:31))
i <- -2:3
stopifnot(exprs = {
  identical(i, packBits(intToBits(i), "integer"))
  identical(packBits(       0:31) ,
            packBits(as.raw(0:31)))
})
str(pBi <- packBits(intToBits(i)))
data.frame(B = matrix(pBi, nrow=6, byrow=TRUE),
           hex = format(as.hexmode(i)), i)

## Look at internal bit representation of ...

## ... of integers :
bitI <- function(x) vapply(as.integer(x), function(x) {
            b <- substr(as.character(rev(intToBits(x))), 2L, 2L)
            paste0(c(b[1L], " ", b[2:32]), collapse = "")
          }, "")
print(bitI(-8:8), width = 35, quote = FALSE)

## ... of double precision numbers in format 'sign exp | mantissa'
## where 1 bit sign 1 <==> "-";
##      11 bit exp is the base-2 exponent biased by 2^10 - 1 (1023)
##      52 bit mantissa is without the implicit leading '1'
#
## Bit representation [ sign | exponent | mantissa ] of double prec numbers :

bitC <- function(x) noquote(vapply(as.double(x), function(x) { # split one double
    b <- substr(as.character(rev(numToBits(x))), 2L, 2L)
    paste0(c(b[1L], " ", b[2:12], " | ", b[13:64]), collapse = "")
  }, ""))
bitC(17)
bitC(c(-1,0,1))
bitC(2^(-2:5))
bitC(1+2^-(1:53))# from 0.5 converge to 1

### numToBits(.) <==> intToBits(numToInts(.)) :
d2bI <- function(x) vapply(as.double(x), function(x) intToBits(numToInts(x)), raw(64L))
d2b  <- function(x) vapply(as.double(x), function(x) numToBits(x) , raw(64L))
set.seed(1)
x <- c(sort(rt(2048, df=1.5)), 2^(-10:10), 1+2^-(1:53))
str(bx <- d2b(x)) # a 64 x 2122 raw matrix
stopifnot( identical(bx, d2bI(x)) )

## Show that packBits(*, "double") is the inverse of numToBits() :
packBits(numToBits(pi), type="double")
bitC(2050)
b <- numToBits(2050)
identical(b, numToBits(packBits(b, type="double")))
pbx <- apply(bx, 2, packBits, type="double")
stopifnot( identical(pbx, x))
Utilities for converting files in R documentation (Rd) format to other formats or create indices from them, and for converting documentation in other formats to Rd format.
R CMD Rdconv [options] file
R CMD Rd2pdf [options] files
file | the path to a file to be processed. |
files | a list of file names specifying the R documentation sources to use, by either giving the paths to the files, or the path to a directory with the sources of a package. |
options | further options to control the processing, or for obtaining information about usage and version of the utility. |
R CMD Rdconv converts Rd format to plain text, HTML or LaTeX formats: it can also extract the examples.
R CMD Rd2pdf is the user-level program for producing PDF output from Rd sources. It will make use of the environment variables R_PAPERSIZE (set by R CMD, with a default set when R was installed: values for R_PAPERSIZE are a4, letter, legal and executive) and R_PDFVIEWER (the PDF previewer). Also, RD2PDF_INPUTENC can be set to inputenx to make use of the LaTeX package of that name rather than inputenc: this might be needed for better support of the UTF-8 encoding.
R CMD Rd2pdf calls tools::texi2pdf to produce its PDF file: see its help for the possibilities for the texi2dvi command which that function uses (and which can be overridden by setting environment variable R_TEXI2DVICMD).
Use R CMD foo --help to obtain usage information on utility foo.
The section ‘Processing documentation files’ in the ‘Writing R Extensions’ manual: RShowDoc("R-exts").
Read binary data from or write binary data to a connection or raw vector.
readBin(con, what, n = 1L, size = NA_integer_, signed = TRUE,
        endian = .Platform$endian)

writeBin(object, con, size = NA_integer_,
         endian = .Platform$endian, useBytes = FALSE)
con | A connection object or a character string naming a file or a raw vector. |
what | Either an object whose mode will give the mode of the vector to be read, or a character vector of length one describing the mode: one of |
n | numeric. The (maximal) number of records to be read. You can use an over-estimate here, but not too large as storage is reserved for |
size | integer. The number of bytes per element in the byte stream. The default, |
signed | logical. Only used for integers of sizes 1 and 2, when it determines if the quantity on file should be regarded as a signed or unsigned integer. |
endian | The endianness ( |
object | An R object to be written to the connection. |
useBytes | See |
These functions can only be used with binary-mode connections. If con is a character string, the functions call file to obtain a binary-mode file connection which is opened for the duration of the function call.
If the connection is open it is read/written from its current position. If it is not open, it is opened for the duration of the call in an appropriate mode (binary read or write) and then closed again. An open connection must be in binary mode.
If readBin is called with con a raw vector, the data in the vector is used as input. If writeBin is called with con a raw vector, it is just an indication that a raw vector should be returned.
If size is specified and not the natural size of the object, each element of the vector is coerced to an appropriate type before being written or as it is read. Possible sizes are 1, 2, 4 and possibly 8 for integer or logical vectors, and 4, 8 and possibly 12/16 for numeric vectors. (Note that coercion occurs as signed types except if signed = FALSE when reading integers of sizes 1 and 2.) Changing sizes is unlikely to preserve NAs, and the extended precision sizes are unlikely to be portable across platforms.
readBin
andwriteBin
read and write C-stylezero-terminated character strings. Input strings are limited to 10000characters.readChar
andwriteChar
canbe used to read and write fixed-length strings. No check is made thatthe string is valid in the current locale's encoding.
HandlingR's missing and special (Inf
,-Inf
andNaN
) values is discussed in the ‘R Data Import/Export’ manual.
Only 2^31 - 1 bytes can be written in a single call (and that is the maximum capacity of a raw vector on 32-bit platforms).
‘Endian-ness’ is relevant for size > 1, and should always be set for portable code (the default is only appropriate when writing and then reading files on the same platform).
For readBin, a vector of appropriate mode and length the number of items read (which might be less than n).
For writeBin, a raw vector (if con is a raw vector) or invisibly NULL.
Integer read/writes of size 8 will be available if either C type long is of size 8 bytes or C type long long exists and is of size 8 bytes.
Real read/writes of size sizeof(long double) (usually 12 or 16 bytes) will be available only if that type is available and different from double.
If readBin(what = character()) is used incorrectly on a file which does not contain C-style character strings, warnings (usually many) are given. From a file or connection, the input will be broken into pieces of length 10000 with any final part being discarded.
The ‘R Data Import/Export’ manual.
readChar to read/write fixed-length strings.
connections, readLines, writeLines.
.Machine for the sizes of long, long long and long double.
zzfil <- tempfile("testbin")
zz <- file(zzfil, "wb")
writeBin(1:10, zz)
writeBin(pi, zz, endian = "swap")
writeBin(pi, zz, size = 4)
writeBin(pi^2, zz, size = 4, endian = "swap")
writeBin(pi+3i, zz)
writeBin("A test of a connection", zz)
z <- paste("A very long string", 1:100, collapse = " + ")
writeBin(z, zz)
if(.Machine$sizeof.long == 8 || .Machine$sizeof.longlong == 8)
    writeBin(as.integer(5^(1:10)), zz, size = 8)
if((s <- .Machine$sizeof.longdouble) > 8)
    writeBin((pi/3)^(1:10), zz, size = s)
close(zz)
zz <- file(zzfil, "rb")
readBin(zz, integer(), 4)
readBin(zz, integer(), 6)
readBin(zz, numeric(), 1, endian = "swap")
readBin(zz, numeric(), size = 4)
readBin(zz, numeric(), size = 4, endian = "swap")
readBin(zz, complex(), 1)
readBin(zz, character(), 1)
z2 <- readBin(zz, character(), 1)
if(.Machine$sizeof.long == 8 || .Machine$sizeof.longlong == 8)
    readBin(zz, integer(), 10, size = 8)
if((s <- .Machine$sizeof.longdouble) > 8)
    readBin(zz, numeric(), 10, size = s)
close(zz)
unlink(zzfil)
stopifnot(z2 == z)
## signed vs unsigned ints
zzfil <- tempfile("testbin")
zz <- file(zzfil, "wb")
x <- as.integer(seq(0, 255, 32))
writeBin(x, zz, size = 1)
writeBin(x, zz, size = 1)
x <- as.integer(seq(0, 60000, 10000))
writeBin(x, zz, size = 2)
writeBin(x, zz, size = 2)
close(zz)
zz <- file(zzfil, "rb")
readBin(zz, integer(), 8, size = 1)
readBin(zz, integer(), 8, size = 1, signed = FALSE)
readBin(zz, integer(), 7, size = 2)
readBin(zz, integer(), 7, size = 2, signed = FALSE)
close(zz)
unlink(zzfil)
## use of raw
z <- writeBin(pi^{1:5}, raw(), size = 4)
readBin(z, numeric(), 5, size = 4)
z <- writeBin(c("a", "test", "of", "character"), raw())
readBin(z, character(), 4)
Transfer character strings to and from connections, without assuming they are null-terminated on the connection.
readChar(con, nchars, useBytes = FALSE)
writeChar(object, con, nchars = nchar(object, type = "chars"), eos = "", useBytes = FALSE)
con | a connection object, or a character string naming a file, or a raw vector. |
nchars | integer vector, giving the lengths in characters of (unterminated) character strings to be read or written. Elements must be >= 0 and not |
useBytes | logical: For |
object | a character vector to be written to the connection, at least as long as |
eos | ‘end of string’: character string. The terminator to be written after each string, followed by an ASCII |
These functions complement readBin and writeBin which read and write C-style zero-terminated character strings. They are for strings of known length, and can optionally write an end-of-string mark. They are intended only for character strings valid in the current locale.
These functions are intended to be used with binary-mode connections. If con is a character string, the functions call file to obtain a binary-mode file connection which is opened for the duration of the function call.
If the connection is open it is read/written from its current position. If it is not open, it is opened for the duration of the call in an appropriate mode (binary read or write) and then closed again. An open connection must be in binary mode.
If readChar is called with con a raw vector, the data in the vector is used as input. If writeChar is called with con a raw vector, it is just an indication that a raw vector should be returned.
Character strings containing ASCII nul(s) will be read correctly by readChar but truncated at the first nul with a warning.
If the character length requested for readChar is longer than the data available on the connection, what is available is returned. For writeChar if too many characters are requested the output is zero-padded, with a warning.
Missing strings are written as NA.
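A short sketch of fixed-length string transfer through a raw vector, which avoids touching the file system. The strings are arbitrary; the variable names are illustrative only.

```r
# Write two unterminated strings into a raw vector (eos = NULL: no terminator).
bytes <- writeChar(c("abc", "de"), raw(), eos = NULL)

# With the default eos = "", an ASCII nul follows each string.
bytes_nul <- writeChar(c("abc", "de"), raw())

# Read back fixed-length fields; nchars gives the length of each string.
fields <- readChar(bytes, c(3, 2))

# Requesting more characters than remain returns what is available.
short <- readChar(bytes, c(3, 10))
```

The unterminated form occupies five bytes and the nul-terminated form seven; both readChar calls return the two original strings.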
For readChar, a character vector of length the number of items read (which might be less than length(nchars)).
For writeChar, a raw vector (if con is a raw vector) or invisibly NULL.
Earlier versions of R allowed embedded NUL bytes within character strings, but not R >= 2.8.0. readChar was commonly used to read fixed-size zero-padded byte fields for which readBin was unsuitable. readChar can still be used for such fields if there are no embedded NULs: otherwise readBin(what = "raw") provides an alternative.
nchars will be interpreted in bytes not characters in a non-UTF-8 multi-byte locale, with a warning.
There is little validity checking of UTF-8 reads.
Using these functions on a text-mode connection may work but should not be mixed with text-mode access to the connection, especially if the connection was opened with an encoding argument.
The ‘R Data Import/Export’ manual.
connections, readLines, writeLines, readBin
## test fixed-length strings
zzfil <- tempfile("testchar")
zz <- file(zzfil, "wb")
x <- c("a", "this will be truncated", "abc")
nc <- c(3, 10, 3)
writeChar(x, zz, nc, eos = NULL)
writeChar(x, zz, eos = "\r\n")
close(zz)
zz <- file(zzfil, "rb")
readChar(zz, nc)
readChar(zz, nchar(x)+3) # need to read the terminator explicitly
close(zz)
unlink(zzfil)
readline reads a line from the terminal (in interactive use).
readline(prompt = "")
prompt | the string printed when prompting the user for input. Should usually end with a space |
The prompt string will be truncated to a maximum allowed length, normally 256 chars (but can be changed in the source code).
This can only be used in an interactive session.
A character vector of length one. Both leading and trailing spaces and tabs are stripped from the result.
In non-interactive use the result is as if the response was RETURN and the value is "".
readLines for reading text lines from connections, including files.
fun <- function() {
  ANSWER <- readline("Are you a satisfied R user? ")
  ## a better version would check the answer less cursorily, and
  ## perhaps re-prompt
  if (substr(ANSWER, 1, 1) == "n")
    cat("This is impossible. YOU LIED!\n")
  else
    cat("I knew it.\n")
}
if(interactive()) fun()
Read some or all text lines from a connection.
readLines(con = stdin(), n = -1L, ok = TRUE, warn = TRUE, encoding = "unknown", skipNul = FALSE)
con | a connection object or a character string. |
n | integer. The (maximal) number of lines to read. Negative values indicate that one should read up to the end of input on the connection. |
ok | logical. Is it OK to reach the end of the connection before |
warn | logical. Warn if a text file is missing a final EOL or if there are embedded NULs in the file. |
encoding | encoding to be assumed for input strings. It is used to mark character strings as known to be in Latin-1, UTF-8 or to be bytes: it is not used to re-encode the input. To do the latter, specify the encoding as part of the connection |
skipNul | logical: should NULs be skipped? |
If the con is a character string, the function calls file to obtain a file connection which is opened for the duration of the function call. This can be a compressed file. (Tilde expansion of the file path is done by file.)
If the connection is open it is read from its current position. If it is not open, it is opened in "rt" mode for the duration of the call and then closed (but not destroyed; one must call close to do that).
If the final line is incomplete (no final EOL marker) the behaviour depends on whether the connection is blocking or not. For a non-blocking text-mode connection the incomplete line is pushed back, silently. For all other connections the line will be accepted, with a warning.
Whatever mode the connection is opened in, any of LF, CRLF or CR will be accepted as the EOL marker for a line.
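The handling of mixed line endings can be illustrated with a small sketch. The bytes are written with writeBin so that no end-of-line translation happens on output; the file name and contents are illustrative assumptions.

```r
path <- tempfile("eol-demo")

# One file containing LF, CRLF and CR line endings.
writeBin(charToRaw("one\ntwo\r\nthree\rfour\n"), path)

# Each of the three markers ends a line.
lines <- readLines(path)

unlink(path)
```

The result has four elements, one per line, with no end-of-line characters retained.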
If con is a not-already-open connection with a non-default encoding argument, the text is converted to UTF-8 and declared as such (and the encoding argument to readLines is ignored). See the examples.
A character vector of length the number of lines read.
The elements of the result have a declared encoding if encoding is "latin1" or "UTF-8".
The default connection, stdin, may be different from con = "stdin": see file.
connections, writeLines, readBin, scan
fil <- tempfile(fileext = ".data")
cat("TITLE extra line", "2 3 5 7", "", "11 13 17", file = fil, sep = "\n")
readLines(fil, n = -1)
unlink(fil) # tidy up
## difference in blocking
fil <- tempfile("test")
cat("123\nabc", file = fil)
readLines(fil) # line with a warning
con <- file(fil, "r", blocking = FALSE)
readLines(con) # "123"
cat(" def\n", file = fil, append = TRUE)
readLines(con) # gets both
close(con)
unlink(fil) # tidy up
## Not run:
# read a 'Windows Unicode' file
A <- readLines(con <- file("Unicode.txt", encoding = "UCS-2LE"))
close(con)
unique(Encoding(A)) # will most likely be UTF-8
## End(Not run)
Functions to write a single R object to a file, and to restore it.
saveRDS(object, file = "", ascii = FALSE, version = NULL, compress = TRUE, refhook = NULL)
readRDS(file, refhook = NULL)
infoRDS(file)
object | R object to serialize. |
file | a connection or the name of the file where the R object is saved to or read from. |
ascii | a logical. If |
version | the workspace format version to use. |
compress | a logical specifying whether saving to a named file is to use |
refhook | a hook function for handling reference objects. |
saveRDS and readRDS provide the means to save a single R object to a connection (typically a file) and to restore the object, quite possibly under a different name. This differs from save and load, which save and restore one or more named objects into an environment. They are widely used by R itself, for example to store metadata for a package and to store the help.search databases: the ".rds" file extension is most often used.
Functions serialize and unserialize provide a slightly lower-level interface to serialization: objects serialized to a connection by serialize can be read back by readRDS and conversely.
Function infoRDS retrieves meta-data about serialization produced by saveRDS or serialize. infoRDS cannot be used to detect whether a file is a serialization nor whether it is valid.
All of these interfaces use the same serialization format, but save writes a single line header (typically "RDXs\n") before the serialization of a single object (a pairlist of all the objects to be saved).
If file is a file name, it is opened by gzfile except for save(compress = FALSE) which uses file. Only for the exception are marked encodings of file which cannot be translated to the native encoding handled on Windows.
Compression is handled by the connection opened when file is a file name, so is only possible when file is a connection if handled by the connection. So e.g. url connections will need to be wrapped in a call to gzcon.
If a connection is supplied it will be opened (in binary mode) for the duration of the function if not already open: if it is already open it must be in binary mode for saveRDS(ascii = FALSE) or to read non-ASCII saves.
For readRDS, an R object.
For saveRDS, NULL invisibly.
For infoRDS, an R list with elements version (version number, currently 2 or 3), writer_version (version of R that produced the serialization), min_reader_version (minimum version of R that can read the serialization), format (data representation) and native_encoding (native encoding of the session that produced the serialization, available since version 3). The data representation is given as "xdr" for big-endian binary representation, "ascii" for ASCII representation (produced via ascii = TRUE or ascii = NA) or "binary" (binary representation with native ‘endianness’ which can be produced by serialize).
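A brief sketch of inspecting serialization meta-data with infoRDS. The object and file names are illustrative assumptions; only the list elements named above are used.

```r
obj <- list(id = 1:3, label = c("a", "b", "c"))  # illustrative object

bin_file <- tempfile(fileext = ".rds")
txt_file <- tempfile(fileext = ".rds")

saveRDS(obj, bin_file)                # default: binary representation
saveRDS(obj, txt_file, ascii = TRUE)  # ASCII representation

bin_info <- infoRDS(bin_file)
txt_info <- infoRDS(txt_file)

# The format element distinguishes the two representations.
bin_format <- bin_info$format
txt_format <- txt_info$format

# Either file restores the same object.
restored <- readRDS(txt_file)

unlink(c(bin_file, txt_file))
```

Here bin_format is "xdr" and txt_format is "ascii", and restored is identical to obj.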
Files produced by saveRDS (or serialize to a file connection) are not suitable as an interchange format between machines, for example to download from a website. The files produced by save have a header identifying the file type and so are better protected against erroneous use.
The ‘R Internals’ manual for details of the format used.
fil <- tempfile("women", fileext = ".rds")
## save a single object to file
saveRDS(women, fil)
## restore it under a different name
women2 <- readRDS(fil)
identical(women, women2)
## or examine the object via a connection, which will be opened as needed.
con <- gzfile(fil)
readRDS(con)
close(con)
## Less convenient ways to restore the object
## which demonstrate compatibility with unserialize()
con <- gzfile(fil, "rb")
identical(unserialize(con), women)
close(con)
con <- gzfile(fil, "rb")
wm <- readBin(con, "raw", n = 1e4) # size is a guess
close(con)
identical(unserialize(wm), women)
## Format compatibility with serialize():
fil2 <- tempfile("women")
con <- file(fil2, "w")
serialize(women, con) # ASCII, uncompressed
close(con)
identical(women, readRDS(fil2))
fil3 <- tempfile("women")
con <- bzfile(fil3, "w")
serialize(women, con) # binary, bzip2-compressed
close(con)
identical(women, readRDS(fil3))
unlink(c(fil, fil2, fil3))
Read a file such as ‘.Renviron’ or ‘Renviron.site’ in the format described in the help for Startup, and set environment variables as defined in the file.
readRenviron(path)
path | A length-one character vector giving the path to the file. Tilde-expansion is performed where supported. |
Scalar logical indicating if the file was read successfully. Returned invisibly. If the file cannot be opened for reading, a warning is given.
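A self-contained sketch that does not touch the user's own startup files: a temporary file in NAME=value form is read and the variable is then queried. The file name and the variable DEMO_GREETING are made up for this illustration.

```r
env_file <- tempfile("Renviron-demo")

# DEMO_GREETING is a hypothetical variable used only for this sketch.
writeLines("DEMO_GREETING=hello", env_file)

# The return value is invisible, so capture it to inspect it.
ok <- readRenviron(env_file)
value <- Sys.getenv("DEMO_GREETING")

# Tidy up the environment and the temporary file.
Sys.unsetenv("DEMO_GREETING")
unlink(env_file)
```

After the call, ok is TRUE and value is "hello".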
Startup for the file format.
## Not run:
## re-read a startup file (or read it in a vanilla session)
readRenviron("~/.Renviron")
## End(Not run)
Recall is used as a placeholder for the name of the function in which it is called. It allows the definition of recursive functions which still work after being renamed, see example below.
Recall(...)
... | all the arguments to be passed. |
Recall will not work correctly when passed as a function argument, e.g. to the apply family of functions.
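Because Recall stands in for the enclosing function, it can also be used where the function has no name at all. A minimal sketch, with an illustrative result name:

```r
# An anonymous recursive function: there is no name to call, so Recall is used.
fact5 <- (function(n) if (n <= 1) 1 else n * Recall(n - 1))(5)
```

This computes 5 factorial, so fact5 is 120.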
local for another way to write anonymous recursive functions.
## A trivial (but inefficient!) example:
fib <- function(n)
    if(n<=2) { if(n>=0) 1 else 0 } else Recall(n-1) + Recall(n-2)
fibonacci <- fib; rm(fib)
## renaming wouldn't work without Recall
fibonacci(10) # 55
Registers an R function to be called upon garbage collection of object or (optionally) at the end of an R session.
reg.finalizer(e, f, onexit = FALSE)
e | object to finalize. Must be an environment or an external pointer. |
f | function to call on finalization. Must accept a single argument, which will be the object to finalize. |
onexit | logical: should the finalizer be run if the object is still uncollected at the end of the R session? |
The main purpose of this function is to allow objects that refer to external items (a temporary file, say) to perform cleanup actions when they are no longer referenced from within R. This only makes sense for objects that are never copied on assignment, hence the restriction to environments and external pointers.
Inter alia, it provides a way to program code to be run at the end of an R session without manipulating .Last. For use in a package, it is often a good idea to set a finalizer on an object in the namespace: then it will be called at the end of the session, or soon after the namespace is unloaded if that is done during the session.
NULL.
R's interpreter is not re-entrant and the finalizer could be run in the middle of a computation. So there are many functions which it is potentially unsafe to call from f: one example which caused trouble is options. Finalizers are scheduled at garbage collection but only run at a relatively safe time thereafter.
gc and Memory for garbage collection and memory management.
f <- function(e) print("cleaning....")
g <- function(x){ e <- environment(); reg.finalizer(e, f) }
g()
invisible(gc()) # trigger cleanup
This help page documents the regular expression patterns supported by grep and related functions grepl, regexpr, gregexpr, sub and gsub, as well as by strsplit and optionally by agrep and agrepl.
A ‘regular expression’ is a pattern that describes a set of strings. Two types of regular expressions are used in R, extended regular expressions (the default) and Perl-like regular expressions used by perl = TRUE. There is also fixed = TRUE which can be considered to use a literal regular expression.
Other functions which use regular expressions (often via the use of grep) include apropos, browseEnv, help.search, list.files and ls. These will all use extended regular expressions.
Patterns are described here as they would be printed by cat: (do remember that backslashes need to be doubled when entering R character strings, e.g. from the keyboard).
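A small sketch of the backslash doubling: the pattern ‘\.’ (as printed by cat) is entered as "\\." in R source. The test strings are illustrative.

```r
pattern <- "\\."              # two characters: a backslash and a dot
pattern_length <- nchar(pattern)

# The escaped dot matches only a literal period.
escaped <- grepl(pattern, c("a.b", "ab"))

# An unescaped dot matches any single character.
unescaped <- grepl(".", c("a.b", "ab"))

# Raw string constants (R >= 4.0.0) avoid the doubling.
same_pattern <- identical(r"(\.)", pattern)
```

Here escaped is TRUE, FALSE whereas unescaped is TRUE, TRUE.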
Long regular expression patterns may or may not be accepted: the POSIX standard only requires up to 256 bytes.
This section covers the regular expressions allowed in the default mode of grep, grepl, regexpr, gregexpr, sub, gsub, regexec and strsplit. They use an implementation of the POSIX 1003.2 standard: that allows some scope for interpretation and the interpretations here are those currently used by R. The implementation supports some extensions to the standard.
Regular expressions are constructed analogously to arithmetic expressions, by using various operators to combine smaller expressions. The whole expression matches zero or more characters (read ‘character’ as ‘byte’ if useBytes = TRUE).
The fundamental building blocks are the regular expressions that match a single character. Most characters, including all letters and digits, are regular expressions that match themselves. Any metacharacter with special meaning may be quoted by preceding it with a backslash. The metacharacters in extended regular expressions are ‘. \ | ( ) [ { ^ $ * + ?’, but note that whether these have a special meaning depends on the context.
Escaping non-metacharacters with a backslash is implementation-dependent. The current implementation interprets ‘\a’ as ‘BEL’, ‘\e’ as ‘ESC’, ‘\f’ as ‘FF’, ‘\n’ as ‘LF’, ‘\r’ as ‘CR’ and ‘\t’ as ‘TAB’. (Note that these will be interpreted by R's parser in literal character strings.)
A character class is a list of characters enclosed between ‘[’ and ‘]’ which matches any single character in that list; unless the first character of the list is the caret ‘^’, when it matches any character not in the list. For example, the regular expression ‘[0123456789]’ matches any single digit, and ‘[^abc]’ matches anything except the characters ‘a’, ‘b’ or ‘c’. A range of characters may be specified by giving the first and last characters, separated by a hyphen. (Because their interpretation is locale- and implementation-dependent, character ranges are best avoided. Some but not all implementations include both cases in ranges when doing caseless matching.) The only portable way to specify all ASCII letters is to list them all as the character class ‘[ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz]’. (The current implementation uses numerical order of the encoding, normally a single-byte encoding or Unicode points.)
Certain named classes of characters are predefined. Their interpretation depends on the locale (see locales); the interpretation below is that of the POSIX locale.
‘[:alnum:]’ | Alphanumeric characters: ‘[:alpha:]’ and ‘[:digit:]’. |
‘[:alpha:]’ | Alphabetic characters: ‘[:lower:]’ and ‘[:upper:]’. |
‘[:blank:]’ | Blank characters: space and tab, and possibly other locale-dependent characters, but on most platforms not including non-breaking space. |
‘[:cntrl:]’ | Control characters. In ASCII, these characters have octal codes 000 through 037, and 177 (DEL). In another character set, these are the equivalent characters, if any. |
‘[:digit:]’ | Digits: ‘0 1 2 3 4 5 6 7 8 9’. |
‘[:graph:]’ | Graphical characters: ‘[:alnum:]’ and ‘[:punct:]’. |
‘[:lower:]’ | Lower-case letters in the current locale. |
‘[:print:]’ | Printable characters: ‘[:alnum:]’, ‘[:punct:]’ and space. |
‘[:punct:]’ | Punctuation characters: ‘! " # $ % & ' ( ) * + , - . / : ; < = > ? @ [ \ ] ^ _ ` { | } ~’. |
‘[:space:]’ | Space characters: tab, newline, vertical tab, form feed, carriage return, space and possibly other locale-dependent characters – on most platforms this does not include non-breaking spaces. |
‘[:upper:]’ | Upper-case letters in the current locale. |
‘[:xdigit:]’ | Hexadecimal digits: ‘0 1 2 3 4 5 6 7 8 9 A B C D E F a b c d e f’. |
For example, ‘[[:alnum:]]’ means ‘[0-9A-Za-z]’, except the latter depends upon the locale and the character encoding, whereas the former is independent of locale and character set. (Note that the brackets in these class names are part of the symbolic names, and must be included in addition to the brackets delimiting the bracket list.) Most metacharacters lose their special meaning inside a character class. To include a literal ‘]’, place it first in the list. Similarly, to include a literal ‘^’, place it anywhere but first. Finally, to include a literal ‘-’, place it first or last (or, for perl = TRUE only, precede it by a backslash). (Only ‘^ - \ ]’ are special inside character classes.)
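The placement rules for named classes and for literal ‘]’, ‘^’ and ‘-’ can be sketched as follows, using the default extended regular expressions. The input strings are illustrative only.

```r
# A named class needs both pairs of brackets.
squeezed <- gsub("[[:space:]]+", " ", "a \t\n b")

all_digits <- grepl("^[[:digit:]]+$", c("123", "12a"))

# Literal ']' first, '^' not first, '-' last.
specials <- grepl("[]^-]", c("a]b", "a^b", "a-b", "abc"))
```

Here squeezed is "a b", all_digits is TRUE, FALSE, and specials is TRUE for the first three inputs only.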
The period ‘.’ matches any single character. The symbol ‘\w’ matches a ‘word’ character (a synonym for ‘[[:alnum:]_]’, an extension) and ‘\W’ is its negation (‘[^[:alnum:]_]’). Symbols ‘\d’, ‘\s’, ‘\D’ and ‘\S’ denote the digit and space classes and their negations (these are all extensions).
The caret ‘^’ and the dollar sign ‘$’ are metacharacters that respectively match the empty string at the beginning and end of a line. The symbols ‘\<’ and ‘\>’ match the empty string at the beginning and end of a word. The symbol ‘\b’ matches the empty string at either edge of a word, and ‘\B’ matches the empty string provided it is not at an edge of a word. (The interpretation of ‘word’ depends on the locale and implementation: these are all extensions.)
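A short sketch of anchors, word boundaries and the ‘\w’ class in the default mode, with illustrative inputs:

```r
# '^' anchors the match at the start of the string.
starts_abc <- grepl("^abc", c("abcdef", "xabc"))

# '\b' restricts the match to whole words.
whole_words <- gsub("\\bcat\\b", "dog", "cat concat cats cat")

# '\w' covers letters, digits and the underscore.
single_token <- grepl("^\\w+$", c("snake_case", "two words"))
```

Only the standalone occurrences of cat are replaced, giving "dog concat cats dog".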
A regular expression may be followed by one of several repetition quantifiers:
‘?’ | The preceding item is optional and will be matched at most once. |
‘*’ | The preceding item will be matched zero or more times. |
‘+’ | The preceding item will be matched one or more times. |
‘{n}’ | The preceding item is matched exactly n times. |
‘{n,}’ | The preceding item is matched n or more times. |
‘{n,m}’ | The preceding item is matched at least n times, but not more than m times. |
By default repetition is greedy, so the maximal possible number of repeats is used. This can be changed to ‘minimal’ by appending ? to the quantifier. (There are further quantifiers that allow approximate matching: see the TRE documentation.)
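The difference between greedy and minimal repetition, and the interval quantifiers, can be sketched as follows (default mode; the inputs are illustrative):

```r
x <- "<a><b>"

# Greedy: '.*' extends to the last '>'.
greedy <- sub("<.*>", "", x)

# Minimal: '.*?' stops at the first '>'.
minimal <- sub("<.*?>", "", x)

# Interval quantifiers.
exactly_two <- grepl("^a{2}$", c("a", "aa", "aaa"))
two_to_three <- grepl("^a{2,3}$", c("a", "aa", "aaa", "aaaa"))
```

The greedy pattern removes the whole string, whereas the minimal one leaves "<b>".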
Regular expressions may be concatenated; the resulting regular expression matches any string formed by concatenating the substrings that match the concatenated subexpressions.
Two regular expressions may be joined by the infix operator ‘|’; the resulting regular expression matches any string matching either subexpression. For example, ‘abba|cde’ matches either the string abba or the string cde. Note that alternation does not work inside character classes, where ‘|’ has its literal meaning.
Repetition takes precedence over concatenation, which in turn takes precedence over alternation. A whole subexpression may be enclosed in parentheses to override these precedence rules.
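A sketch of how these precedence rules affect matching, and of parentheses overriding them. The inputs are illustrative.

```r
x <- c("abxx", "xxcd", "ab", "cd")

# Alternation binds loosest: this reads as (^ab) or (cd$).
loose <- grepl("^ab|cd$", x)

# Parentheses make the anchors apply to both alternatives.
strict <- grepl("^(ab|cd)$", x)

# Repetition binds tighter than concatenation: 'ab+' is 'a' then 'b+'.
b_repeated <- grepl("^ab+$", c("abb", "abab"))
ab_repeated <- grepl("^(ab)+$", c("abb", "abab"))

# Inside a character class '|' is literal.
pipe_literal <- grepl("[a|b]", "|")
```

Here loose is TRUE for all four inputs, whereas strict is TRUE only for "ab" and "cd".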
The backreference ‘\N’, where ‘N = 1 ... 9’, matches the substring previously matched by the Nth parenthesized subexpression of the regular expression. (This is an extension for extended regular expressions: POSIX defines them only for basic ones.)
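A minimal sketch of backreferences, both inside a pattern and in the replacement argument of sub. The inputs are illustrative.

```r
# '(.)\1' matches any character immediately followed by itself.
doubled <- grepl("(.)\\1", c("hello", "world"))

# In sub, '\1' and '\2' in the replacement refer to the captured groups.
swapped <- sub("(\\w+) (\\w+)", "\\2 \\1", "hello world")
```

Here doubled is TRUE, FALSE (only "hello" has a repeated letter) and swapped is "world hello".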
The perl = TRUE argument to grep, regexpr, gregexpr, sub, gsub and strsplit switches to the PCRE library that implements regular expression pattern matching using the same syntax and semantics as Perl 5.x, with just a few differences.
For complete details please consult the man pages for PCRE, especially man pcrepattern and man pcreapi, on your system or from the sources at https://www.pcre.org. (The version in use can be found by calling extSoftVersion. It need not be the version described in the system's man page. PCRE1 (reported as version < 10.00 by extSoftVersion) has been feature-frozen for some time (essentially 2012), the man pages at https://www.pcre.org/original/doc/html/ should be a good match. PCRE2 (PCRE version >= 10.00) has man pages at https://www.pcre.org/current/doc/html/).
Perl regular expressions can be computed byte-by-byte or (UTF-8) character-by-character: the latter is used in all multibyte locales and if any of the inputs are marked as UTF-8 (see Encoding), or as Latin-1 except in a Latin-1 locale.
All the regular expressions described for extended regular expressions are accepted except ‘\<’ and ‘\>’: in Perl all backslashed metacharacters are alphanumeric and backslashed symbols always are interpreted as a literal character. ‘{’ is not special if it would be the start of an invalid interval specification. There can be more than 9 backreferences (but the replacement in sub can only refer to the first 9).
Character ranges are interpreted in the numerical order of the characters, either as bytes in a single-byte locale or as Unicode code points in UTF-8 mode. So in either case ‘[A-Za-z]’ specifies the set of ASCII letters.
In UTF-8 mode the named character classes only match ASCII characters: see ‘\p’ below for an alternative.
The construct ‘(?...)’ is used for Perl extensions in a variety of ways depending on what immediately follows the ‘?’.
Perl-like matching can work in several modes, set by the options ‘(?i)’ (caseless, equivalent to Perl's ‘/i’), ‘(?m)’ (multiline, equivalent to Perl's ‘/m’), ‘(?s)’ (single line, so a dot matches all characters, even new lines: equivalent to Perl's ‘/s’) and ‘(?x)’ (extended, whitespace data characters are ignored unless escaped and comments are allowed: equivalent to Perl's ‘/x’). These can be concatenated, so for example, ‘(?im)’ sets caseless multiline matching. It is also possible to unset these options by preceding the letter with a hyphen, and to combine setting and unsetting such as ‘(?im-sx)’. These settings can be applied within patterns, and then apply to the remainder of the pattern. Additional options not in Perl include ‘(?U)’ to set ‘ungreedy’ mode (so matching is minimal unless ‘?’ is used as part of the repetition quantifier, when it is greedy). Initially none of these options are set.
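The effect of these inline options can be sketched with perl = TRUE; the subject strings are illustrative.

```r
# (?i): caseless matching.
caseless <- grepl("(?i)hello", "HELLO", perl = TRUE)

x <- "first\nsecond"

# (?m): '^' and '$' also match at internal line boundaries.
multiline <- grepl("(?m)^second$", x, perl = TRUE)
not_multiline <- grepl("^second$", x, perl = TRUE)

# (?s): the dot also matches a newline.
dot_all <- grepl("(?s)first.second", x, perl = TRUE)
not_dot_all <- grepl("first.second", x, perl = TRUE)

# (?U): quantifiers become minimal by default.
ungreedy <- sub("(?U)<.*>", "", "<a><b>", perl = TRUE)
```

Each option turns a non-match into a match in the pairs above, and the ungreedy substitution leaves "<b>".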
If you want to remove the special meaning from a sequence of characters, you can do so by putting them between ‘\Q’ and ‘\E’. This is different from Perl in that ‘$’ and ‘@’ are handled as literals in ‘\Q...\E’ sequences in PCRE, whereas in Perl, ‘$’ and ‘@’ cause variable interpolation.
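A brief sketch contrasting a quoted and an unquoted pattern under perl = TRUE; the strings are illustrative.

```r
x <- c("1+1=2", "11=2")

# Between \Q and \E the '+' is an ordinary character.
quoted <- grepl("\\Q1+1=2\\E", x, perl = TRUE)

# Unquoted, '1+' means one or more '1'.
unquoted <- grepl("1+1=2", x, perl = TRUE)
```

Here quoted is TRUE, FALSE and unquoted is FALSE, TRUE.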
The escape sequences ‘\d’, ‘\s’ and ‘\w’ represent any decimal digit, space character and ‘word’ character (letter, digit or underscore in the current locale: in UTF-8 mode only ASCII letters and digits are considered) respectively, and their upper-case versions represent their negation. Vertical tab was not regarded as a space character in a C locale before PCRE 8.34. Sequences ‘\h’, ‘\v’, ‘\H’ and ‘\V’ match horizontal and vertical space or the negation. (In UTF-8 mode, these do match non-ASCII Unicode code points.)
There are additional escape sequences: ‘\cx’ is ‘cntrl-x’ for any ‘x’, ‘\ddd’ is the octal character (for up to three digits unless interpretable as a backreference, as ‘\1’ to ‘\7’ always are), and ‘\xhh’ specifies a character by two hex digits. In a UTF-8 locale, ‘\x{h...}’ specifies a Unicode code point by one or more hex digits. (Note that some of these will be interpreted by R's parser in literal character strings.)
Outside a character class, ‘\A’ matches at the start of a subject (even in multiline mode, unlike ‘^’), ‘\Z’ matches at the end of a subject or before a newline at the end, ‘\z’ matches only at the end of a subject, and ‘\G’ matches at the first matching position in a subject (which is subtly different from Perl's end of the previous match). ‘\C’ matches a single byte, including a newline, but its use is warned against. In UTF-8 mode, ‘\R’ matches any Unicode newline character (not just CR), and ‘\X’ matches any number of Unicode characters that form an extended Unicode sequence. ‘\X’, ‘\R’ and ‘\B’ cannot be used inside a character class (with PCRE1, they are treated as characters ‘X’, ‘R’ and ‘B’; with PCRE2 they cause an error).
A hyphen (minus) inside a character class is treated as a range, unless it is the first or last character in the class definition. It can be quoted to represent the hyphen literal (‘\-’). PCRE1 allows an unquoted hyphen at some other locations inside a character class where it cannot represent a valid range, but PCRE2 reports an error in such cases.
In UTF-8 mode, some Unicode properties may be supported via ‘\p{xx}’ and ‘\P{xx}’ which match characters with and without property ‘xx’ respectively. For a list of supported properties see the PCRE documentation, but for example ‘Lu’ is ‘upper case letter’ and ‘Sc’ is ‘currency symbol’. Note that properties such as ‘\w’, ‘\W’, ‘\d’, ‘\D’, ‘\s’, ‘\S’, ‘\b’ and ‘\B’ by default do not refer to full Unicode, but one can override this by starting a pattern with ‘(*UCP)’ (which comes with a performance penalty). (This support depends on the PCRE library being compiled with ‘Unicode property support’ which can be checked via pcre_config. PCRE2, when compiled with Unicode support, always also supports Unicode properties.)
The sequence ‘(?#’ marks the start of a comment which continues up to the next closing parenthesis. Nested parentheses are not permitted. The characters that make up a comment play no part at all in the pattern matching.
If the extended option is set, an unescaped ‘#’ character outside a character class introduces a comment that continues up to the next newline character in the pattern.
The pattern ‘(?:...)’ groups characters just as parentheses do but does not make a backreference.
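The difference between a capturing and a non-capturing group can be seen in the sub-matches reported by `regexec`. This is a sketch on an assumed sample string:

```r
## Illustrative string (assumed for this sketch)
x <- "ababc"

## Two capturing groups: full match plus two sub-matches
m_cap   <- regmatches(x, regexec("(ab)+(c)", x, perl = TRUE))[[1]]
## (?:ab) groups for repetition but records no sub-match
m_nocap <- regmatches(x, regexec("(?:ab)+(c)", x, perl = TRUE))[[1]]

length(m_cap)    # 3: full match, last "ab", "c"
length(m_nocap)  # 2: full match, "c"
```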
Patterns ‘(?=...)’ and ‘(?!...)’ are zero-width positive and negative lookahead assertions: they match if an attempt to match the ... forward from the current position would succeed (or not), but use up no characters in the string being processed. Patterns ‘(?<=...)’ and ‘(?<!...)’ are the lookbehind equivalents: they do not allow repetition quantifiers nor ‘\C’ in ....
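The assertions above can be sketched as follows; the strings and variable names are assumptions for illustration only:

```r
## Illustrative strings (assumed for this sketch)
x <- c("price: 100 USD", "price: 100 EUR", "USD 100")

## Positive lookahead: digits followed by " USD"; the " USD" is not consumed
ahead <- regmatches(x, regexpr("[0-9]+(?= USD)", x, perl = TRUE))
## Negative lookahead: "100" not followed by " USD"
not_ahead <- grepl("100(?! USD)", x, perl = TRUE)
## Positive lookbehind: digits preceded by "USD "
behind <- grepl("(?<=USD )[0-9]+", x, perl = TRUE)
```

Because the assertions are zero-width, `ahead` contains only the digits, not the currency code that follows them.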
regexpr and gregexpr support ‘named capture’. If groups are named, e.g., "(?<first>[A-Z][a-z]+)" then the positions of the matches are also returned by name. (Named backreferences are not supported by sub.)
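A minimal sketch of named capture, assuming two sample names and a second group called `last` (both invented for this example). With `perl = TRUE`, `regexpr` attaches `"capture.start"` and `"capture.length"` attributes whose columns carry the group names:

```r
## Illustrative strings (assumed for this sketch)
x <- c("Ada Lovelace", "Alan Turing")
m <- regexpr("(?<first>[A-Z][a-z]+) (?<last>[A-Z][a-z]+)", x, perl = TRUE)

st  <- attr(m, "capture.start")    # matrix, one column per named group
len <- attr(m, "capture.length")

## Extract each group by its name
first <- substring(x, st[, "first"], st[, "first"] + len[, "first"] - 1)
last  <- substring(x, st[, "last"],  st[, "last"]  + len[, "last"]  - 1)
```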
Atomic grouping, possessive quantifiers and conditional and recursive patterns are not covered here.
This help page is based on the TRE documentation and the POSIX standard, and the pcre2pattern man page from PCRE2 10.35.
grep, apropos, browseEnv, glob2rx, help.search, list.files, ls, strsplit and agrep.
The POSIX 1003.2 standard at https://pubs.opengroup.org/onlinepubs/9699919799/basedefs/V1_chap09.html.
The pcre2pattern or pcrepattern man page (found as part of https://www.pcre.org/original/pcre.txt), and details of Perl's own implementation at https://perldoc.perl.org/perlre.
Extract or replace matched substrings from match data obtained by regexpr, gregexpr, regexec or gregexec.
regmatches(x, m, invert = FALSE)
regmatches(x, m, invert = FALSE) <- value
x | a character vector. |
m | an object with match data. |
invert | a logical: if |
value | an object with suitable replacement values for thematched or non-matched substrings (see |
If invert is FALSE (default), regmatches extracts the matched substrings as specified by the match data. For vector match data (as obtained from regexpr), empty matches are dropped; for list match data, empty matches give empty components (zero-length character vectors).
If invert is TRUE, regmatches extracts the non-matched substrings, i.e., the strings are split according to the matches similar to strsplit (for vector match data, at most a single split is performed).
If invert is NA, regmatches extracts both non-matched and matched substrings, always starting and ending with a non-match (empty if the match occurred at the beginning or the end, respectively).
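As a hedged sketch of the three settings of invert (the input string is an assumption for illustration), the pieces returned for `invert = NA` alternate between non-matches and matches, so pasting them back together should recover the input:

```r
## Illustrative string (assumed for this sketch)
x <- "a1b22c"
m <- gregexpr("[0-9]+", x)

matched     <- regmatches(x, m)[[1]]                  # the digit runs
non_matched <- regmatches(x, m, invert = TRUE)[[1]]   # the text between them
both        <- regmatches(x, m, invert = NA)[[1]]     # interleaved pieces

## The interleaved pieces reassemble to the original string
paste(both, collapse = "")
```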
Note that the match data can be obtained from regular expression matching on a modified version of x with the same numbers of characters.
The replacement function can be used for replacing the matched or non-matched substrings. For vector match data, if invert is FALSE, value should be a character vector with length the number of matched elements in m. Otherwise, it should be a list of character vectors with the same length as m, each as long as the number of replacements needed. Replacement coerces values to character or list and generously recycles values as needed. Missing replacement values are not allowed.
For regmatches, a character vector with the matched substrings if m is a vector and invert is FALSE. Otherwise, a list with the matched or/and non-matched substrings.
For regmatches<-, the updated character vector.
x <- c("A and B", "A, B and C", "A, B, C and D", "foobar")
pattern <- "[[:space:]]*(,|and)[[:space:]]"
## Match data from regexpr()
m <- regexpr(pattern, x)
regmatches(x, m)
regmatches(x, m, invert = TRUE)
## Match data from gregexpr()
m <- gregexpr(pattern, x)
regmatches(x, m)
regmatches(x, m, invert = TRUE)
## Consider
x <- "John (fishing, hunting), Paul (hiking, biking)"
## Suppose we want to split at the comma (plus spaces) between the
## persons, but not at the commas in the parenthesized hobby lists.
## One idea is to "blank out" the parenthesized parts to match the
## parts to be used for splitting, and extract the persons as the
## non-matched parts.
## First, match the parenthesized hobby lists.
m <- gregexpr("\\([^)]*\\)", x)
## Create blank strings with given numbers of characters.
blanks <- function(n) strrep(" ", n)
## Create a copy of x with the parenthesized parts blanked out.
s <- x
regmatches(s, m) <- Map(blanks, lapply(regmatches(s, m), nchar))
s
## Compute the positions of the split matches (note that we cannot call
## strsplit() on x with match data from s).
m <- gregexpr(", *", s)
## And finally extract the non-matched parts.
regmatches(x, m, invert = TRUE)
## regexec() and gregexec() return overlapping ranges because the
## first match is the full match. This conflicts with regmatches()<-
## and regmatches(..., invert=TRUE). We can work-around by dropping
## the first match.
drop_first <- function(x) {
    if(!anyNA(x) && all(x > 0)) {
        ml <- attr(x, 'match.length')
        if(is.matrix(x)) x <- x[-1,] else x <- x[-1]
        attr(x, 'match.length') <- if(is.matrix(ml)) ml[-1,] else ml[-1]
    }
    x
}
m <- gregexec("(\\w+) \\(((?:\\w+(?:, )?)+)\\)", x)
regmatches(x, m)
try(regmatches(x, m, invert=TRUE))
regmatches(x, lapply(m, drop_first))
## invert=TRUE loses matrix structure because we are retrieving what
## is in between every sub-match
regmatches(x, lapply(m, drop_first), invert=TRUE)
y <- z <- x
## Notice **list**(...) on the RHS
regmatches(y, lapply(m, drop_first)) <- list(c("<NAME>", "<HOBBY-LIST>"))
y
regmatches(z, lapply(m, drop_first), invert=TRUE) <- list(sprintf("<%d>", 1:5))
z
## With `perl = TRUE` and `invert = FALSE` capture group names
## are preserved. Collect functions and arguments in calls:
NEWS <- head(readLines(file.path(R.home(), 'doc', 'NEWS.2')), 100)
m <- gregexec("(?<fun>\\w+)\\((?<args>[^)]*)\\)", NEWS, perl = TRUE)
y <- regmatches(NEWS, m)
y[[16]]
## Make tabular, adding original line numbers
mdat <- as.data.frame(t(do.call(cbind, y)))
mdat <- cbind(mdat, line=rep(seq_along(y), lengths(y) / ncol(mdat)))
head(mdat)
NEWS[head(mdat[['line']])]
remove and rm are identical R functions that can be used to remove objects. These can be specified successively as character strings, or in the character vector list, or through a combination of both. All objects thus specified will be removed.
If envir is NULL then the currently active environment is searched first.
If inherits is TRUE then parents of the supplied environment are searched until a variable with the given name is encountered. A warning is printed for each variable that is not found.
remove(..., list = character(), pos = -1, envir = as.environment(pos), inherits = FALSE)
rm(..., list = character(), pos = -1, envir = as.environment(pos), inherits = FALSE)
... | the objects to be removed, as names (unquoted) or character strings (quoted). |
list | a character vector (or |
pos | where to do the removal. By default, uses the current environment. See ‘details’ for other possibilities. |
envir | the |
inherits | should the enclosing frames of the environment be inspected? |
The pos argument can specify the environment from which to remove the objects in any of several ways: as an integer (the position in the search list); as the character string name of an element in the search list; or as an environment (including using sys.frame to access the currently active function calls). The envir argument is an alternative way to specify an environment, but is primarily there for back compatibility.
It is not allowed to remove variables from the base environment and base namespace, nor from any environment which is locked (see lockEnvironment).
Earlier versions of R incorrectly claimed that supplying a character vector in ... removed the objects named in the character vector, but it removed the character vector. Use the list argument to specify objects via a character vector.
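A small sketch of removing objects named in a character vector through the list argument. It works in a scratch environment `e` (an assumption of this example) so that nothing in the workspace is touched:

```r
## Scratch environment with three objects (names assumed for this sketch)
e <- new.env()
assign("a", 1, envir = e)
assign("b", 2, envir = e)
assign("keep", 3, envir = e)

## Names of the objects to remove, held in a character vector
doomed <- c("a", "b")
rm(list = doomed, envir = e)

remaining <- ls(e)   # only "keep" is left
```

The vector `doomed` itself still exists afterwards; only the objects it names were removed from `e`.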
Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988) The New S Language. Wadsworth & Brooks/Cole.
tmp <- 1:4
## work with tmp and cleanup
rm(tmp)
## Not run:
## remove (almost) everything in the working environment.
## You will get no warning, so don't do this unless you are really sure.
rm(list = ls())
## End(Not run)
rep replicates the values in x. It is a generic function, and the (internal) default method is described here.
rep.int and rep_len are faster simplified versions for two common cases. Internally, they are generic, so methods can be defined for them (see InternalMethods).
rep(x, ...)
rep.int(x, times)
rep_len(x, length.out)
x | a vector (of any mode including a |
... | further arguments to be passed to or from other methods.For the internal default method these can include:
|
times ,length.out | see |
The default behaviour is as if the call was rep(x, times = 1, length.out = NA, each = 1). Normally just one of the additional arguments is specified, but if each is specified with either of the other two, its replication is performed first, and then that implied by times or length.out.
If times consists of a single integer, the result consists of the whole input repeated this many times. If times is a vector of the same length as x (after replication by each), the result consists of x[1] repeated times[1] times, x[2] repeated times[2] times and so on.
length.out may be given in place of times, in which case x is repeated as many times as is necessary to create a vector of this length. If both are given, length.out takes priority and times is ignored.
Non-integer values of times will be truncated towards zero. If times is a computed quantity it is prudent to add a small fuzz or use round. And analogously for each.
If x has length zero and length.out is supplied and is positive, the values are filled in using the extraction rules, that is by an NA of the appropriate class for an atomic vector (0 for raw vectors) and NULL for a list.
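The zero-length rule above can be sketched as follows; the variable names are assumptions for illustration:

```r
## Zero-length inputs padded out to a positive length.out
r_int  <- rep(integer(0), length.out = 3)   # integer NAs
r_raw  <- rep(raw(0), length.out = 2)       # raw zeros
r_list <- rep(list(), length.out = 2)       # list of NULLs

r_int
r_raw
r_list
```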
An object of the same type as x.
rep.int and rep_len return no attributes (except the class if returning a factor).
The default method of rep gives the result names (which will almost always contain duplicates) if x had names, but retains no other attributes.
Function rep.int is a simple case which was provided as a separate function partly for S compatibility and partly for speed (especially when names can be dropped). The performance of rep has been improved since, but rep.int is still at least twice as fast when x has names.
The name rep.int long precedes making rep generic.
Function rep is a primitive, but (partial) matching of argument names is performed as for normal functions.
For historical reasons rep (only) works on NULL: the result is always NULL even when length.out is positive.
Although it has never been documented, these functions have always worked on expression vectors.
Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988) The New S Language. Wadsworth & Brooks/Cole.
rep(1:4, 2)
rep(1:4, each = 2)       # not the same.
rep(1:4, c(2,2,2,2))     # same as second.
rep(1:4, c(2,1,2,1))
rep(1:4, each = 2, length.out = 4)    # first 4 only.
rep(1:4, each = 2, length.out = 10)   # 8 integers plus two recycled 1's.
rep(1:4, each = 2, times = 3)         # length 24, 3 complete replications
rep(1, 40*(1-.8))        # length 7 on most platforms
rep(1, 40*(1-.8)+1e-7)   # better
## replicate a list
fred <- list(happy = 1:10, name = "squash")
rep(fred, 5)
# date-time objects
x <- .leap.seconds[1:3]
rep(x, 2)
rep(as.POSIXlt(x), rep(2, 3))
## named factor
x <- factor(LETTERS[1:4]); names(x) <- letters[1:4]
x
rep(x, 2)
rep(x, each = 2)
rep.int(x, 2)  # no names
rep_len(x, 10)
replace replaces the values in x with indices given in list by those given in values. If necessary, the values in values are recycled.
replace(x, list, values)
x | a vector. |
list | an index vector. |
values | replacement values. |
A vector with the values replaced.
x is unchanged: remember to assign the result.
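A short sketch of replace, using an assumed numeric vector, showing recycling of values and that x itself is left untouched:

```r
## Illustrative vector (assumed for this sketch)
x <- c(10, 20, 30, 40, 50)

## Replace positions 2 and 4; the single value 0 is recycled
y <- replace(x, c(2, 4), 0)
## A logical index vector works too
z <- replace(x, x > 25, NA)

x   # unchanged
y
z
```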
Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988) The New S Language. Wadsworth & Brooks/Cole.
The reserved words in R's parser are
if else repeat while function for in next break
TRUE FALSE NULL Inf NaN NA NA_integer_ NA_real_ NA_complex_ NA_character_
... and ..1, ..2 etc, which are used to refer to arguments passed down from a calling function, see ....
Reserved words outside quotes are always parsed to be references to the objects linked to in the ‘Description’, and hence they are not allowed as syntactic names (see make.names). They are allowed as non-syntactic names, e.g. inside backtick quotes.
rev provides a reversed version of its argument. It is a generic function with a default method for vectors and one for dendrograms.
Note that this is no longer needed (nor efficient) for obtaining vectors sorted into descending order, since that is now rather more directly achievable by sort(x, decreasing = TRUE).
rev(x)
x | a vector or another object for which reversal is defined. |
Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988) The New S Language. Wadsworth & Brooks/Cole.
x <- c(1:5, 5:3)
## sort into descending order; first more efficiently:
stopifnot(sort(x, decreasing = TRUE) == rev(sort(x)))
stopifnot(rev(1:7) == 7:1) #- don't need 'rev' here
Return the R home directory, or the full path to a component of the R installation.
R.home(component = "home")
component |
|
The R home directory is the top-level directory of the R installation being run.
The R home directory is often referred to as R_HOME, and is the value of an environment variable of that name in an R session. It can be found outside an R session by R RHOME.
The paths to components often are subdirectories of R_HOME but need not be: "doc", "include" and "share" are not for some Linux binary installations of R.
A character string giving the R home directory or path to a particular component. Normally the components are all subdirectories of the R home directory, but this need not be the case in a Unix-like installation.
The value for "modules" and on Windows "bin" is a sub-architecture-specific location. (This is not so for "etc", which may have sub-architecture-specific files as well as common ones.)
On a Unix-alike, the constructed paths are based on the current values of the environment variables R_HOME and where set R_SHARE_DIR, R_DOC_DIR and R_INCLUDE_DIR (these are set on startup and should not be altered).
On Windows the values of R.home() and R_HOME are switched to the 8.3 short form of path elements if required and if the Windows service to do that is enabled. The value of R_HOME is set to use forward slashes (since many package maintainers pass it unquoted to shells, for example in ‘Makefile’s).
commandArgs()[1] may provide related information.
## These result quite platform-dependently :
rbind(home = R.home(), bin = R.home("bin")) # often the 'bin' sub directory of 'home'
                                            # but not always ...
list.files(R.home("bin"))
Compute the lengths and values of runs of equal values in a vector, or the reverse operation.
rle(x)
inverse.rle(x, ...)
## S3 method for class 'rle'
print(x, digits = getOption("digits"), prefix = "", ...)
x | a vector (atomic, not a list) for |
... | further arguments; ignored here. |
digits | number of significant digits for printing, see |
prefix | character string, prepended to each printed line. |
‘vector’ is used in the sense of is.vector.
Missing values are regarded as unequal to the previous value, even if that is also missing.
inverse.rle() is the inverse function of rle(), reconstructing x from the runs.
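The treatment of missing values can be sketched on an assumed input vector: consecutive NAs each form their own run of length one, and inverse.rle() still reconstructs the input.

```r
## Illustrative vector with two adjacent NAs (assumed for this sketch)
x <- c(1, 1, NA, NA, 2)
r <- rle(x)

r$lengths   # the two NAs are separate runs
r$values

## Round trip back to the original vector
back <- inverse.rle(r)
```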
rle() returns an object of class "rle" which is a list with components:
lengths | an integer vector containing the length of each run. |
values | a vector of the same length as |
inverse.rle() returns an atomic vector.
x <- rev(rep(6:10, 1:5))
rle(x)
## lengths [1:5]  5 4 3 2 1
## values  [1:5] 10 9 8 7 6
z <- c(TRUE, TRUE, FALSE, FALSE, TRUE, FALSE, TRUE, TRUE, TRUE)
rle(z)
rle(as.character(z))
print(rle(z), prefix = "..| ")
N <- integer(0)
stopifnot(x == inverse.rle(rle(x)),
          identical(N, inverse.rle(rle(N))),
          z == inverse.rle(rle(z)))
ceiling takes a single numeric argument x and returns a numeric vector containing the smallest integers not less than the corresponding elements of x.
floor takes a single numeric argument x and returns a numeric vector containing the largest integers not greater than the corresponding elements of x.
trunc takes a single numeric argument x and returns a numeric vector containing the integers formed by truncating the values in x toward 0.
round rounds the values in its first argument to the specified number of decimal places (default 0). See ‘Details’ about “round to even” when rounding off a 5.
signif rounds the values in its first argument to the specified number of significant digits. Hence, for numeric x, signif(x, dig) is the same as round(x, dig - ceiling(log10(abs(x)))).
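The stated relation between signif and round can be checked on a few assumed values. This is a sketch: it rounds element by element and compares with all.equal rather than exact equality.

```r
## Illustrative values (assumed for this sketch)
x   <- c(123456, 0.0012345, 3.14159)
dig <- 3

via_signif <- signif(x, dig)

## Number of decimal places equivalent to 'dig' significant digits
d <- dig - ceiling(log10(abs(x)))
via_round <- vapply(seq_along(x),
                    function(i) round(x[i], d[i]),
                    numeric(1))

same <- isTRUE(all.equal(via_signif, via_round))
```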
ceiling(x)
floor(x)
trunc(x, ...)
round(x, digits = 0, ...)
signif(x, digits = 6)
x | a numeric vector. Or, for |
digits | integer indicating the number of decimal places( |
... | arguments to be passed to methods. |
These are generic functions: methods can be defined for them individually or via the Math group generic.
Rounding to a negative number of digits means rounding to a power of ten, so for example round(x, digits = -2) rounds to the nearest hundred.
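For instance, with some assumed values chosen away from exact halves:

```r
## Illustrative values (assumed for this sketch)
x <- c(149, 151, 1234.567)

hundreds  <- round(x, digits = -2)   # nearest hundred
thousands <- round(x, digits = -3)   # nearest thousand

hundreds
thousands
```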
For signif the recognized values of digits are 1...22, and non-missing values are rounded to the nearest integer in that range. Each element of the vector is rounded individually, unlike printing.
These are all primitive functions.
These are all (internally) S4 generic.
ceiling, floor and trunc are members of the Math group generic. As an S4 generic, trunc has only one argument.
round and signif are members of the Math2 group generic.
The realities of computer arithmetic can cause unexpected results, especially with floor and ceiling. For example, we ‘know’ that floor(log(x, base = 8)) for x = 8 is 1, but 0 has been seen on an R platform. It is normally necessary to use a tolerance.
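One way to apply a tolerance is sketched below. The tolerance value `tol` is an assumption of this example and should be chosen to suit the magnitude of the quantities involved:

```r
x   <- 8
k   <- log(x, base = 8)   # mathematically 1, but may be slightly less
tol <- 1e-9               # assumed tolerance for this sketch

floor(k)                  # may be 1 or, on some platforms, 0
safe <- floor(k + tol)    # robust against a result just below 1
```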
Rounding to decimal digits in binary arithmetic is non-trivial (when digits != 0) and may be surprising. Be aware that most decimal fractions are not exactly representable in binary double precision. In R 4.0.0, the algorithm for round(x, d), for d > 0, has been improved to measure and round “to nearest even”, contrary to earlier versions of R (or also to sprintf() or format() based rounding).
Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988) The New S Language. Wadsworth & Brooks/Cole.
The ISO/IEC/IEEE 60559:2011 standard is available for money from https://www.iso.org.
The IEEE 754:2008 standard is more openly documented, e.g., at https://en.wikipedia.org/wiki/IEEE_754.
as.integer. Package round's roundX() for several versions or implementations of rounding, including some previous and the current R version (as version = "3d.C").
round(.5 + -2:4) # IEEE / IEC rounding: -2  0  0  2  2  4  4
## (this is *good* behaviour -- do *NOT* report it as bug !)
( x1 <- seq(-2, 4, by = .5) )
round(x1) #-- IEEE / IEC rounding !
x1[trunc(x1) != floor(x1)]
x1[round(x1) != floor(x1 + .5)]
(non.int <- ceiling(x1) != floor(x1))
x2 <- pi * 100^(-1:3)
round(x2, 3)
signif(x2, 3)
Round or truncate date-time objects.
## S3 method for class 'POSIXt'
round(x, units = c("secs", "mins", "hours", "days", "months", "years"))
## S3 method for class 'POSIXt'
trunc(x, units = c("secs", "mins", "hours", "days", "months", "years"), ...)
## S3 method for class 'Date'
round(x, ...)
## S3 method for class 'Date'
trunc(x, units = c("secs", "mins", "hours", "days", "months", "years"), ...)
x | |
units | one of the units listed, a string. Can be abbreviated. |
... | arguments to be passed to or from other methods, notably |
The time is rounded or truncated to the second, minute, hour, day, month or year. Time zones are only relevant to days or more, when midnight in the current time zone is used.
For units arguments besides “months” and “years”, the methods for class "Date" are of little use except to remove fractional days.
An object of class "POSIXlt" or "Date".
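A sketch with a fixed, assumed timestamp in UTC, so that the results do not depend on the current time or time zone. Explicit format strings are used to make the output unambiguous:

```r
## Illustrative fixed time and date (assumed for this sketch)
t0 <- as.POSIXct("2024-03-15 13:45:40", tz = "UTC")
d0 <- as.Date("2024-03-15")

fmt <- "%Y-%m-%d %H:%M:%S"
t_hour <- format(trunc(t0, "hours"), fmt)   # drop minutes and seconds
t_min  <- format(round(t0, "mins"), fmt)    # 40 seconds rounds up
t_day  <- format(trunc(t0, "days"), fmt)    # midnight in UTC
d_mon  <- trunc(d0, "months")               # first day of the month
```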
round for the generic function and default methods.
round(.leap.seconds + 1000, "hour")
trunc(Sys.time(), "day")
(timM <- trunc(Sys.time() -> St, "months")) # shows timezone
(datM <- trunc(Sys.Date() -> Sd, "months"))
(timY <- trunc(St, "years")) # + timezone
(datY <- trunc(Sd, "years"))
stopifnot(inherits(datM, "Date"), inherits(timM, "POSIXt"),
          substring(format(datM), 9,10) == "01", # first of month
          substring(format(datY), 6,10) == "01-01", # Jan 1
          identical(format(datM), format(timM)),
          identical(format(datY), format(timY)))
Returns a matrix of integers indicating their row number in amatrix-like object, or a factor indicating the row labels.
row(x, as.factor = FALSE)
.row(dim)
x | a matrix-like object, that is one with a two-dimensional |
dim | a matrix dimension, i.e., an integer valued numeric vector of length two (with non-negative entries). |
as.factor | a logical value indicating whether the value should be returned as a factor of row labels (created if necessary) rather than as numbers. |
An integer (or factor) matrix with the same dimensions as x and whose ij-th element is equal to i (or the i-th row label).
Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988) The New S Language. Wadsworth & Brooks/Cole.
col to get columns; slice.index for a general way to get slice indices in an array.
x <- matrix(1:12, 3, 4)
# extract the diagonal of a matrix - more slowly than diag(x)
dx <- x[row(x) == col(x)]
dx
# create an identity 5-by-5 matrix more slowly than diag(n = 5):
x <- matrix(0, nrow = 5, ncol = 5)
x[row(x) == col(x)] <- 1
x
(i34 <- .row(3:4))
stopifnot(identical(i34, .row(c(3,4)))) # 'dim' maybe "double"
All data frames have row names, a character vector of length the number of rows with no duplicates nor missing values.
There are generic functions for getting and setting row names, with default methods for arrays. The description here is for the data.frame method.
`.rowNamesDF<-` is a (non-generic replacement) function to set row names for data frames, with extra argument make.names. This function only exists as a workaround as we cannot easily change the row.names<- generic without breaking legacy code in existing packages.
row.names(x)
row.names(x) <- value
.rowNamesDF(x, make.names=FALSE) <- value
x | object of class |
make.names |
|
value | an object to be coerced to character unless an integervector. It should have (after coercion) the same length as thenumber of rows of |
A data frame has (by definition) a vector of row names which has length the number of rows in the data frame, and contains neither missing nor duplicated values. Where a row names sequence has been added by the software to meet this requirement, they are regarded as ‘automatic’.
Row names are currently allowed to be integer or character, but for backwards compatibility (with R <= 2.4.0) row.names will always return a character vector. (Use attr(x, "row.names") if you need to retrieve an integer-valued set of row names.)
Using NULL for the value resets the row names to seq_len(nrow(x)), regarded as ‘automatic’.
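The last two points can be sketched on a small assumed data frame: row.names() returns character values, attr() exposes the integer form, and assigning NULL restores the default sequence.

```r
## Illustrative data frame (assumed for this sketch)
df <- data.frame(a = c(10, 20, 30))

rn_chr <- row.names(df)           # always character
rn_int <- attr(df, "row.names")   # integer-valued row names

## Set explicit names, then reset them to the default sequence
row.names(df) <- c("x", "y", "z")
row.names(df) <- NULL
rn_reset <- row.names(df)
```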
row.names returns a character vector.
row.names<- returns a data frame with the row names changed.
row.names is similar to rownames for arrays, and it has a method that calls rownames for an array argument.
Row names of the form 1:n for n > 2 are stored internally in a compact form, which might be seen from C code or by deparsing but never via row.names or attr(x, "row.names"). Additionally, some names of this sort are marked as ‘automatic’ and handled differently by as.matrix and data.matrix (and potentially other functions). (All zero-row data frames are regarded as having automatic row names.)
Chambers, J. M. (1992) Data for models. Chapter 3 of Statistical Models in S, eds J. M. Chambers and T. J. Hastie, Wadsworth & Brooks/Cole.
.row_names_info for the internal representations.
## To illustrate the note:
df <- data.frame(x = c(TRUE, FALSE, NA, NA), y = c(12, 34, 56, 78))
row.names(df) <- 1 : 4
attr(df, "row.names") #> 1:4
deparse(df) # or dput(df)
##--> c(NA, 4L) : Compact storage, *not* regarded as automatic.
row.names(df) <- NULL
attr(df, "row.names") #> 1:4
deparse(df) # or dput(df) -- shows
##--> c(NA, -4L) : Compact storage, regarded as automatic.
Retrieve or set the row or column names of a matrix-like object.
    rownames(x, do.NULL = TRUE, prefix = "row")
    rownames(x) <- value
    colnames(x, do.NULL = TRUE, prefix = "col")
    colnames(x) <- value
x | a matrix-like R object, with at least two dimensions for |
do.NULL | logical. If |
prefix | for created names. |
value | a valid value for that component of |
The extractor functions try to do something sensible for any matrix-like object x. If the object has dimnames the first component is used as the row names, and the second component (if any) is used for the column names. For a data frame, rownames and colnames eventually call row.names and names respectively, but the latter are preferred.
If do.NULL is FALSE, a character vector (of length NROW(x) or NCOL(x)) is returned in any case, prepending prefix to simple numbers, if there are no dimnames or the corresponding component of the dimnames is NULL.
The replacement methods for arrays/matrices coerce vector and factor values of value to character, but do not dispatch methods for as.character.
For a data frame, value for rownames should be a character vector of non-duplicated and non-missing names (this is enforced), and for colnames a character vector of (preferably) unique syntactically-valid names. In both cases, value will be coerced by as.character, and setting colnames will convert the row names to character.
If the replacement versions are called on a matrix without any existing dimnames, they will add suitable dimnames. But constructions such as
    rownames(x)[3] <- "c"
may not work unless x already has dimnames, since this will create a length-3 value from the NULL value of rownames(x).
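The pitfall can be reproduced with a small sketch (the 4 x 2 matrix m is a made-up example). The sub-assignment fails while m has no dimnames and works once names have been created with do.NULL = FALSE:

```r
# Hypothetical 4 x 2 matrix without dimnames
m <- matrix(1:8, nrow = 4)

# rownames(m) is NULL, so the sub-assignment builds a length-3 value
# and the replacement fails for this 4-row matrix
res <- tryCatch({ rownames(m)[3] <- "c"; "ok" },
                error = function(e) e)

# Creating the names first makes the sub-assignment well defined
rownames(m) <- rownames(m, do.NULL = FALSE)
rownames(m)[3] <- "c"
rownames(m)
```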
dimnames, case.names, variable.names.
    m0 <- matrix(NA, 4, 0)
    rownames(m0)
    m2 <- cbind(1, 1:4)
    colnames(m2, do.NULL = FALSE)
    colnames(m2) <- c("x","Y")
    rownames(m2) <- rownames(m2, do.NULL = FALSE, prefix = "Obs.")
    m2
Compute column sums across rows of a numeric matrix-like object for each level of a grouping variable. rowsum is generic, with a method for data frames and a default method for vectors and matrices.
    rowsum(x, group, reorder = TRUE, ...)
    ## S3 method for class 'data.frame'
    rowsum(x, group, reorder = TRUE, na.rm = FALSE, ...)
    ## Default S3 method:
    rowsum(x, group, reorder = TRUE, na.rm = FALSE, ...)
x | a matrix, data frame or vector of numeric data. Missing values are allowed. A numeric vector will be treated as a column vector. |
group | a vector or factor giving the grouping, with one element per row of |
reorder | if |
na.rm | logical ( |
... | other arguments to be passed to or from methods. |
The default is to reorder the rows to agree with tapply as in the example below. Reordering should not add noticeably to the time except when there are very many distinct values of group and x has few columns.
The original function was written by Terry Therneau, but this is a new implementation using hashing that is much faster for large matrices.
To sum over all the rows of a matrix (i.e., a single group) use colSums, which should be even faster.
For integer arguments, over/underflow in forming the sum results in NA.
A matrix or data frame containing the sums. There will be one row per unique value of group.
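A small deterministic sketch of the grouping behaviour (the matrix and group labels are invented for illustration). The reorder = FALSE call relies on the documented behaviour that rows then appear in the order the groups were first encountered:

```r
# Hypothetical data: three rows in two groups
x <- matrix(1:6, ncol = 2)        # rows are (1,4), (2,5), (3,6)
group <- c("b", "a", "b")

rs <- rowsum(x, group)                    # rows sorted by group: "a", "b"
ru <- rowsum(x, group, reorder = FALSE)   # rows in order of first appearance
```

Here the "b" row holds the column sums of rows 1 and 3, and the "a" row is just row 2.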
    require(stats)
    x <- matrix(runif(100), ncol = 5)
    group <- sample(1:8, 20, TRUE)
    (xsum <- rowsum(x, group))
    ## Slower versions
    tapply(x, list(group[row(x)], col(x)), sum)
    t(sapply(split(as.data.frame(x), group), colSums))
    aggregate(x, list(group), sum)[-1]
Register S3 methods in R scripts.
.S3method(generic, class, method)
generic | a character string naming an S3 generic function. |
class | a character string naming an S3 class. |
method | a character string or function giving the S3 method to be registered. If not given, the function named |
This function should only be used in R scripts: for package code, one should use the corresponding ‘S3method’ ‘NAMESPACE’ directive.
    ## Create a generic function and register a method for objects
    ## inheriting from class 'cls':
    gen <- function(x) UseMethod("gen")
    met <- function(x) writeLines("Hello world.")
    .S3method("gen", "cls", met)
    ## Create an object inheriting from class 'cls', and call the
    ## generic on it:
    x <- structure(123, class = "cls")
    gen(x)
sample takes a sample of the specified size from the elements of x, either with or without replacement.
    sample(x, size, replace = FALSE, prob = NULL)
    sample.int(n, size = n, replace = FALSE, prob = NULL,
               useHash = (n > 1e+07 && !replace && is.null(prob) && size <= n/2))
x | either a vector of one or more elements from which to choose, or a positive integer. See ‘Details.’ |
n | a positive number, the number of items to choose from. See ‘Details.’ |
size | a non-negative integer giving the number of items to choose. |
replace | should sampling be with replacement? |
prob | a vector of probability weights for obtaining the elements of the vector being sampled. |
useHash |
|
If x has length 1, is numeric (in the sense of is.numeric) and x >= 1, sampling via sample takes place from 1:x. Note that this convenience feature may lead to undesired behaviour when x is of varying length in calls such as sample(x). See the examples.
Otherwise x can be any R object for which length and subsetting by integers make sense: S3 or S4 methods for these operations will be dispatched as appropriate.
For sample the default for size is the number of items inferred from the first argument, so that sample(x) generates a random permutation of the elements of x (or 1:x).
It is allowed to ask for size = 0 samples with n = 0 or a length-zero x, but otherwise n > 0 or positive length(x) is required.
Non-integer positive numerical values of n or x will be truncated to the next smallest integer, which has to be no larger than .Machine$integer.max.
The optional prob argument can be used to give a vector of weights for obtaining the elements of the vector being sampled. They need not sum to one, but they should be non-negative and not all zero. If replace is true, Walker's alias method (Ripley, 1987) is used when there are more than 200 reasonably probable values: this gives results incompatible with those from R < 2.2.0.
If replace is false, these probabilities are applied sequentially, that is the probability of choosing the next item is proportional to the weights amongst the remaining items. The number of nonzero weights must be at least size in this case.
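A minimal sketch of weighted sampling without replacement, using made-up items and weights. It illustrates that weights need not sum to one, and that asking for more items than there are nonzero weights is an error:

```r
set.seed(1)  # arbitrary seed, only to make the sketch repeatable

items <- c("a", "b", "c")
w <- c(5, 1, 0)   # hypothetical weights; "c" can never be drawn

# Two draws without replacement: both items with nonzero weight are used
draw <- sample(items, size = 2, prob = w)

# Asking for three items fails, since only two weights are nonzero
res <- tryCatch(sample(items, size = 3, prob = w),
                error = function(e) e)
```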
sample.int is a bare interface in which both n and size must be supplied as integers.
Argument n can be larger than the largest integer of type integer, up to the largest representable integer in type double. Only uniform sampling is supported. Two random numbers are used to ensure uniform sampling of large integers.
For sample a vector of length size with elements drawn from either x or from the integers 1:x.
For sample.int, an integer vector of length size with elements from 1:n, or a double vector if n is larger than the largest integer of type integer.
Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988) The New S Language. Wadsworth & Brooks/Cole.
Ripley, B. D. (1987) Stochastic Simulation. Wiley.
RNGkind(sample.kind = ..) about random number generation, notably the change of sample() results with R version 3.6.0.
CRAN package sampling for other methods of weighted sampling without replacement.
    x <- 1:12
    # a random permutation
    sample(x)
    # bootstrap resampling -- only if length(x) > 1 !
    sample(x, replace = TRUE)
    # 100 Bernoulli trials
    sample(c(0,1), 100, replace = TRUE)
    ## More careful bootstrapping -- Consider this when using sample()
    ## programmatically (i.e., in your function or simulation)!
    # sample()'s surprise -- example
    x <- 1:10
    sample(x[x > 8]) # length 2
    sample(x[x > 9]) # oops -- length 10!
    sample(x[x > 10]) # length 0
    ## safer version:
    resample <- function(x, ...) x[sample.int(length(x), ...)]
    resample(x[x > 8]) # length 2
    resample(x[x > 9]) # length 1
    resample(x[x > 10]) # length 0
    ## R 3.0.0 and later
    sample.int(1e10, 12, replace = TRUE)
    sample.int(1e10, 12) # not that there is much chance of duplicates
save writes an external representation of R objects to the specified file. The objects can be read back from the file at a later date by using the function load or attach (or data in some cases).
save.image() is just a short-cut for ‘save my current workspace’, i.e., save(list = ls(all.names = TRUE), file = ".RData", envir = .GlobalEnv). It is also what happens with q("yes").
    save(..., list = character(),
         file = stop("'file' must be specified"),
         ascii = FALSE, version = NULL, envir = parent.frame(),
         compress = isTRUE(!ascii), compression_level,
         eval.promises = TRUE, precheck = TRUE)
    save.image(file = ".RData", version = NULL, ascii = FALSE,
               compress = !ascii, safe = TRUE)
... | the names of the objects to be saved (as symbols or character strings). |
list | a character vector (or |
file | a (writable binary-mode) connection or the name of the file where the data will be saved (when tilde expansion is done). Must be a file name for |
ascii | if |
version | the workspace format version to use. |
envir | environment to search for objects to be saved. |
compress | logical or character string specifying whether saving to a named file is to use compression. |
compression_level | integer: the level of compression to be used. Defaults to |
eval.promises | logical: should objects which are promises be forced before saving? |
precheck | logical: should the existence of the objects be checked before starting to save (and in particular before opening the file/connection)? Does not apply to version 1 saves. |
safe | logical. If |
The names of the objects specified either as symbols (or character strings) in ... or as a character vector in list are used to look up the objects from environment envir. By default promises are evaluated, but if eval.promises = FALSE promises are saved (together with their evaluation environments). (Promises embedded in objects are always saved unevaluated.)
All R platforms use the XDR (big-endian) representation of C ints and doubles in binary save-d files, and these are portable across all R platforms.
ASCII saves used to be useful for moving data between platforms but are now mainly of historical interest. They can be more compact than binary saves where compression is not used, but are almost always slower to both read and write: binary saves compress much better than ASCII ones. Further, decimal ASCII saves may not restore double/complex values exactly, and what value is restored may depend on the R platform.
Default values for the ascii, compress, safe and version arguments can be modified with the "save.defaults" option (used both by save and save.image), see also the ‘Examples’ section. If a "save.image.defaults" option is set it is used in preference to "save.defaults" for function save.image (which allows this to have different defaults). In addition, compression_level can be part of the "save.defaults" option.
A connection that is not already open will be opened in mode "wb". Supplying a connection which is open and not in binary mode gives an error.
Large files can be reduced considerably in size by compression. A particular 46MB R object was saved as 35MB without compression in 2 seconds, 22MB with gzip compression in 8 secs, 19MB with bzip2 compression in 13 secs and 9.4MB with xz compression in 40 secs. The load times were 1.3, 2.8, 5.5 and 5.7 seconds respectively. These results are indicative, but the relative performances do depend on the actual file: xz compressed unusually well here.
It is possible to compress later (with gzip, bzip2 or xz) a file saved with compress = FALSE: the effect is the same as saving with compression. Also, a saved file can be uncompressed and re-compressed under a different compression scheme (and see resaveRdaFiles for a way to do so from within R).
That file can be a connection can be exploited to make use of an external parallel compression utility such as pigz (https://zlib.net/pigz/) or pbzip2 (https://launchpad.net/pbzip2) via a pipe connection. For example, using 8 threads,
    con <- pipe("pigz -p8 > fname.gz", "wb")
    save(myObj, file = con); close(con)
    con <- pipe("pbzip2 -p8 -9 > fname.bz2", "wb")
    save(myObj, file = con); close(con)
    con <- pipe("xz -T8 -6 -e > fname.xz", "wb")
    save(myObj, file = con); close(con)
where the last requires xz 5.1.1 or later built with support for multiple threads (and parallel compression is only effective for large objects: at level 6 it will compress in serialized chunks of 12MB).
The ... arguments only give the names of the objects to be saved: they are searched for in the environment given by the envir argument, and the actual objects given as arguments need not be those found.
Saved R objects are binary files, even those saved with ascii = TRUE, so ensure that they are transferred without conversion of end-of-line markers and of 8-bit characters. The lines are delimited by LF on all platforms.
Although the default version was not changed between R 1.4.0 and R 3.4.4 nor since R 3.5.0, this does not mean that saved files are necessarily backwards compatible. You will be able to load a saved image into an earlier version of R which supports its version unless use is made of later additions (for example for version 2, raw vectors, external pointers and some S4 objects).
One such ‘later addition’ was long vectors, introduced in R 3.0.0 and loadable only on 64-bit platforms.
Loading files saved with ascii = NA requires a C99-compliant C function sscanf: this is a problem on Windows, first worked around in R 3.1.2: version-2 files in that format should be readable in earlier versions of R on all other platforms.
For saving single R objects, saveRDS() is mostly preferable to save(), notably because of the functional nature of readRDS(), as opposed to load().
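The difference between the two interfaces can be sketched as follows. The object obj and the temporary file names are made up for illustration; load restores objects under their saved names as a side effect, whereas readRDS returns the value:

```r
obj <- list(a = 1, b = TRUE, c = "oops")   # hypothetical object

# save()/load(): the object comes back under its original name,
# created as a side effect in the target environment
f_rda <- tempfile(fileext = ".RData")
save(obj, file = f_rda)
e <- new.env()
loaded_names <- load(f_rda, envir = e)   # returns the names, here "obj"

# saveRDS()/readRDS(): the value is returned and can be bound to any name
f_rds <- tempfile(fileext = ".rds")
saveRDS(obj, f_rds)
copy <- readRDS(f_rds)

unlink(c(f_rda, f_rds))   # tidy up
```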
The most common reason for failure is lack of write permission in the current directory. For save.image and for saving at the end of a session this will be shown by messages like
    Error in gzfile(file, "wb") : unable to open connection
    In addition: Warning message:
    In gzfile(file, "wb") :
      cannot open compressed file '.RDataTmp', probable reason 'Permission denied'
For other interfaces to the underlying serialization format, see serialize and saveRDS.
    x <- stats::runif(20)
    y <- list(a = 1, b = TRUE, c = "oops")
    save(x, y, file = "xy.RData")
    save.image() # creating ".RData" in current working directory
    unlink("xy.RData")
    # set save defaults using option:
    options(save.defaults = list(ascii = TRUE, safe = FALSE))
    save.image() # creating ".RData"
    if(interactive()) withAutoprint({
       file.info(".RData")
       readLines(".RData", n = 7) # first 7 lines; first starts w/ "RDA"..
    })
    unlink(".RData")
scale is a generic function whose default method centers and/or scales the columns of a numeric matrix.
scale(x, center = TRUE, scale = TRUE)
x | a numeric matrix(like object). |
center | either a logical value or numeric-alike vector of length equal to the number of columns of |
scale | either a logical value or a numeric-alike vector of length equal to the number of columns of |
The value of center determines how column centering is performed. If center is a numeric-alike vector with length equal to the number of columns of x, then each column of x has the corresponding value from center subtracted from it. If center is TRUE then centering is done by subtracting the column means (omitting NAs) of x from their corresponding columns, and if center is FALSE, no centering is done.
The value of scale determines how column scaling is performed (after centering). If scale is a numeric-alike vector with length equal to the number of columns of x, then each column of x is divided by the corresponding value from scale. If scale is TRUE then scaling is done by dividing the (centered) columns of x by their standard deviations if center is TRUE, and the root mean square otherwise. If scale is FALSE, no scaling is done.
The root-mean-square for a (possibly centered) column is defined as $\sqrt{\sum x^2 / (n-1)}$, where $x$ is a vector of the non-missing values and $n$ is the number of non-missing values. In the case center = TRUE, this is the same as the standard deviation, but in general it is not. (To scale by the standard deviations without centering, use scale(x, center = FALSE, scale = apply(x, 2, sd, na.rm = TRUE)).)
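The distinction between the two scalings can be checked with a short sketch on a made-up matrix without missing values. The names s_rms, rms, s_sd and sds are introduced here for illustration only:

```r
x <- matrix(c(1, 2, 3, 4, 5, 6), ncol = 2)   # hypothetical data, no NAs

# Without centering, scale = TRUE divides by the root-mean-square
s_rms <- scale(x, center = FALSE)
rms <- sqrt(colSums(x^2) / (nrow(x) - 1))

# With centering, the same formula applied to the centered columns
# gives the standard deviation
s_sd <- scale(x)
sds <- apply(x, 2, sd)
```

The values actually used are available from the "scaled:scale" attribute of each result and can be compared with rms and sds.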
For scale.default, the centered, scaled matrix. The numeric centering and scalings used (if any) are returned as attributes "scaled:center" and "scaled:scale".
Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988) The New S Language. Wadsworth & Brooks/Cole.
sweep which allows centering (and scaling) with arbitrary statistics.
For working with the scale of a plot, see par.
    require(stats)
    x <- matrix(1:10, ncol = 2)
    (centered.x <- scale(x, scale = FALSE))
    cov(centered.scaled.x <- scale(x)) # all 1
Read data into a vector or list from the console or file.
    scan(file = "", what = double(), nmax = -1, n = -1, sep = "",
         quote = if(identical(sep, "\n")) "" else "'\"", dec = ".",
         skip = 0, nlines = 0, na.strings = "NA",
         flush = FALSE, fill = FALSE, strip.white = FALSE,
         quiet = FALSE, blank.lines.skip = TRUE, multi.line = TRUE,
         comment.char = "", allowEscapes = FALSE,
         fileEncoding = "", encoding = "unknown", text, skipNul = FALSE)
file | the name of a file to read data values from. If the specified file is Otherwise, the file name is interpreted relative to the current working directory (given by This can be a compressed file (see Alternatively, To read a data file not in the current encoding (for example a Latin-1 file in a UTF-8 locale or conversely) use a |
what | the type of |
nmax | the maximum number of data values to be read, or if |
n | integer: the maximum number of data values to be read, defaulting to no limit. Invalid values will be ignored. |
sep | by default, scan expects to read ‘white-space’ delimited input fields. Alternatively, If specified this should be the empty character string (the default) or |
quote | the set of quoting characters as a single character string or |
dec | decimal point character. This should be a character string containing just one single-byte character. ( |
skip | the number of lines of the input file to skip before beginning to read data values. |
nlines | if positive, the maximum number of lines of data to be read. |
na.strings | character vector. Elements of this vector are to be interpreted as missing ( |
flush | logical: if |
fill | logical: if |
strip.white | vector of logical value(s) corresponding to items in the If |
quiet | logical: if |
blank.lines.skip | logical: if |
multi.line | logical. Only used if |
comment.char | character: a character vector of length one containing a single character or an empty string. Use |
allowEscapes | logical. Should C-style escapes such as ‘\n’ be processed or read verbatim (the default)? Note that if not within quotes these could be interpreted as a delimiter (but not as a comment character). The escapes which are interpreted are the control characters ‘\a, \b, \f, \n, \r, \t, \v’ and octal and hexadecimal representations like ‘\040’ and ‘\0x2A’. Any other escaped character is treated as itself, including backslash. Note that Unicode escapes (starting ‘\u’ or ‘\U’: see Quotes) are never processed. |
fileEncoding | character string: if non-empty declares the encoding used on a file (not a connection nor the keyboard) so the character data can be re-encoded. See the ‘Encoding’ section of the help for |
encoding | encoding to be assumed for input strings. If the value is |
text | character string: if |
skipNul | logical: should NULs be skipped when reading character fields? |
The value of what can be a list of types, in which case scan returns a list of vectors with the types given by the types of the elements in what. This provides a way of reading columnar data. If any of the types is NULL, the corresponding field is skipped (but a NULL component appears in the result).
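A minimal sketch of reading columnar data with a list-valued what, using the text argument and made-up input. The component names name and age are chosen here for illustration:

```r
txt <- "alice 30 x\nbob 25 y"   # hypothetical three-column input

# One list component per field; NULL skips the third field
res <- scan(text = txt,
            what = list(name = "", age = 0, NULL),
            quiet = TRUE)
```

The result is a list with a character vector, a numeric vector and a NULL placeholder for the skipped field.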
The type of what or its components can be one of the six atomic vector types or NULL (see is.atomic).
‘White space’ is defined for the purposes of this function as one or more contiguous characters from the set space, horizontal tab, carriage return and line feed (aka “newline”, "\n"). It does not include form feed nor vertical tab, but in Latin-1 and Windows 8-bit locales (but not UTF-8) 'space' includes the non-breaking space ‘"\xa0"’.
Empty numeric fields are always regarded as missing values. Empty character fields are scanned as empty character vectors, unless na.strings contains "" when they are regarded as missing values.
The allowed input for a numeric field is optional whitespace, followed by either NA or an optional sign followed by a decimal or hexadecimal constant (see NumericConstants), or NaN, Inf or infinity (ignoring case). Out-of-range values are recorded as Inf, -Inf or 0.
For an integer field the allowed input is optional whitespace, followed by either NA or an optional sign and one or more digits (‘0-9’): all out-of-range values are converted to NA_integer_.
If sep is the default (""), the character ‘\’ in a quoted string escapes the following character, so quotes may be included in the string by escaping them.
If sep is non-default, the fields may be quoted in the style of ‘.csv’ files where separators inside quotes ('' or "") are ignored and quotes may be put inside strings by doubling them. However, if sep = "\n" it is assumed by default that one wants to read entire lines verbatim.
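The two quoting regimes can be sketched with inline text (the input strings are invented for illustration):

```r
# A separator inside quotes is not treated as a field boundary
fields <- scan(text = 'a,"b,c",d', what = "", sep = ",", quiet = TRUE)

# With sep = "\n" the default quote set is empty, so lines are read verbatim
lines <- scan(text = 'say "hi"\nbye', what = "", sep = "\n", quiet = TRUE)
```

In the first call the quoted comma stays inside the second field; in the second call the quote characters are kept as part of the line.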
Quoting is only interpreted in character fields and in NULL fields (which might be skipping character fields).
Note that since sep is a separator and not a terminator, reading a file by scan("foo", sep = "\n", blank.lines.skip = FALSE) will give an empty final line if the file ends in a line feed ("\n") and not if it does not. This might not be what you expected; see also readLines.
If comment.char occurs (except inside a quoted character field), it signals that the rest of the line should be regarded as a comment and be discarded. Lines beginning with a comment character (possibly after white space with the default separator) are treated as blank lines.
There is a line-length limit of 4095 bytes when reading from the console (which may impose a lower limit: see ‘An Introduction to R’).
There is a check for a user interrupt every 1000 lines if what is a list, otherwise every 10000 items.
If file is a character string and fileEncoding is non-default, or if it is a not-already-open connection with a non-default encoding argument, the text is converted to UTF-8 and declared as such (and the encoding argument to scan is ignored). See the examples of readLines.
If what is a list, a list of the same length and same names (if any) as what.
Otherwise, a vector of the type of what.
Character strings in the result will have a declared encoding if encoding is "latin1" or "UTF-8".
The default for multi.line differs from S. To read one record per line, use flush = TRUE and multi.line = FALSE. (Note that quoted character strings can still include embedded newlines.)
If the number of items is not specified, the internal mechanism re-allocates memory in powers of two and so could use up to three times as much memory as needed. (It needs both old and new copies.) If you can, specify either n or nmax whenever inputting a large vector, and nmax or nlines when inputting a large list.
Using scan on an open connection to read partial lines can lose chars: use an explicit separator to avoid this.
Having nul bytes in fields (including ‘\0’ if allowEscapes = TRUE) may lead to interpretation of the field being terminated at the nul. They are not normally present in text files – see readBin.
Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988) The New S Language. Wadsworth & Brooks/Cole.
read.table for more user-friendly reading of data matrices; readLines to read a file a line at a time. write.
Quotes for the details of C-style escape sequences.
readChar and readBin to read fixed or variable length character strings or binary representations of numbers a few at a time from a connection.
    cat("TITLE extra line", "2 3 5 7", "11 13 17", file = "ex.data", sep = "\n")
    pp <- scan("ex.data", skip = 1, quiet = TRUE)
    scan("ex.data", skip = 1)
    scan("ex.data", skip = 1, nlines = 1) # only 1 line after the skipped one
    scan("ex.data", what = list("","","")) # flush is F -> read "7"
    scan("ex.data", what = list("","",""), flush = TRUE)
    unlink("ex.data") # tidy up
    ## "inline" usage
    scan(text = "1 2 3")
Gives a list of attached packages (see library), and R objects, usually data.frames.
    search()
    searchpaths()
A character vector, starting with ".GlobalEnv", and ending with "package:base" which is R's base package required always.
searchpaths gives a similar character vector, with the entries for packages being the path to the package used to load the code.
Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988) The New S Language. Wadsworth & Brooks/Cole. (search.)
Chambers, J. M. (1998) Programming with Data. A Guide to the S Language. Springer. (searchpaths.)
.packages to list just the packages on search path.
loadedNamespaces to list loaded namespaces.
attach and detach to change the search path, objects to find R objects in there.
    search()
    searchpaths()
Functions to re-position connections.
    seek(con, ...)
    ## S3 method for class 'connection'
    seek(con, where = NA, origin = "start", rw = "", ...)
    isSeekable(con)
    truncate(con, ...)
con | |
where | numeric. A file position (relative to the origin specified by |
rw | character string. Empty or |
origin | character string. One of |
... | further arguments passed to or from other methods. |
seek with where = NA returns the current byte offset of a connection (from the beginning), and with a non-missing where argument the connection is re-positioned (if possible) to the specified position. isSeekable returns whether the connection in principle supports seek: currently only (possibly gz-compressed) file connections do.
where is stored as a real but should represent an integer: non-integer values are likely to be truncated. Note that the possible values can exceed the largest representable number in an R integer on 64-bit builds, and on some 32-bit builds.
File connections can be open for both reading and writing/appending, in which case R keeps separate positions for reading and writing. Which seek refers to can be set by its rw argument: the default is the last mode (reading or writing) which was used. Most files are only opened for reading or writing and so default to that state. If a file is open for both reading and writing but has not been used, the default is to give the reading position (0).
The initial file position for reading is always at the beginning. The initial position for writing is at the beginning of the file for modes "r+" and "r+b", otherwise at the end of the file. Some platforms only allow writing at the end of the file in the append modes. (The reported write position for a file opened in an append mode will typically be unreliable until the file has been written to.)
gzfile connections support seek with a number of limitations, using the file position of the uncompressed file. They do not support origin = "end". When writing, seeking is only possible forwards: when reading seeking backwards is supported by rewinding the file and re-reading from its start.
If seek is called with a non-NA value of where, any pushback on a text-mode connection is discarded.
truncate truncates a file opened for writing at its current position. It works only for file connections, and is not implemented on all platforms: on others (including Windows) it will not work for large (> 2Gb) files.
None of these should be expected to work on text-mode connections with re-encoding selected.
seek returns the current position (before any move), as a (numeric) byte offset from the origin, if relevant, or 0 if not. Note that the position can exceed the largest representable number in an R integer on 64-bit builds, and on some 32-bit builds.
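The "position before any move" convention can be sketched on a small binary file (a temporary 10-byte file created just for this illustration):

```r
tf <- tempfile()
writeBin(as.raw(1:10), tf)        # hypothetical 10-byte file

con <- file(tf, "rb")
invisible(readBin(con, "raw", n = 4))   # consume 4 bytes
pos <- seek(con)                  # current read position
old <- seek(con, where = 0)       # rewind; returns the position before the move
first <- readBin(con, "raw", n = 1)
close(con)
unlink(tf)
```

After four bytes have been read both pos and old report offset 4, and the read following the rewind returns the first byte of the file again.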
truncate returns NULL: it stops with an error if it fails (or is not implemented).
isSeekable returns a logical value, whether the connection supports seek.
Use of seek on Windows is discouraged. We have found so many errors in the Windows implementation of file positioning that users are advised to use it only at their own risk, and asked not to waste the R developers' time with bug reports on Windows' deficiencies.
Generate regular sequences. seq is a standard generic with a default method. seq.int is a primitive which can be much faster but has a few restrictions. seq_along and seq_len are very fast primitives for two common cases.
    seq(...)
    ## Default S3 method:
    seq(from = 1, to = 1, by = ((to - from)/(length.out - 1)),
        length.out = NULL, along.with = NULL, ...)
    seq.int(from, to, by, length.out, along.with, ...)
    seq_along(along.with)
    seq_len(length.out)
... | arguments passed to or from methods. |
from, to | the starting and (maximal) end values of the sequence. Of length |
by | number: increment of the sequence. |
length.out | desired length of the sequence. A non-negative number, which for |
along.with | take the length from the length of this argument. |
Numerical inputs should all be finite (that is, not infinite, NaN or NA).
The interpretation of the unnamed arguments of seq and seq.int is not standard, and it is recommended always to name the arguments when programming.
seq is generic, and only the default method is described here. Note that it dispatches on the class of the first argument irrespective of argument names. This can have unintended consequences if it is called with just one argument intending this to be taken as along.with: it is much better to use seq_along in that case.
seq.int is an internal generic which dispatches on methods for "seq" based on the class of the first supplied argument (before argument matching).
Typical usages are
seq(from, to)
seq(from, to, by= )
seq(from, to, length.out= )
seq(along.with= )
seq(from)
seq(length.out= )
The first form generates the sequence from, from+/-1, ..., to (identical to from:to).
The second form generates from, from+by, ..., up to the sequence value less than or equal to to. Specifying to - from and by of opposite signs is an error. Note that the computed final value can go just beyond to to allow for rounding error, but is truncated to to. (‘Just beyond’ is by up to 1e-10 times abs(from - to).)
The third generates a sequence of length.out equally spaced values from from to to. (length.out is usually abbreviated to length or len, and seq_len is much faster.)
The fourth form generates the integer sequence 1, 2, ..., length(along.with). (along.with is usually abbreviated to along, and seq_along is much faster.)
The fifth form generates the sequence 1, 2, ..., length(from) (as if argument along.with had been specified), unless the argument is numeric of length 1 when it is interpreted as 1:from (even for seq(0) for compatibility with S). Using either seq_along or seq_len is much preferred (unless strict S compatibility is essential).
The final form generates the integer sequence 1, 2, ..., length.out unless length.out = 0, when it generates integer(0).
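The one-argument pitfall described in the fifth form can be sketched as follows. The vectors x and y and the count n are made-up illustrations; the point is only that seq() changes meaning when its single argument happens to be a length-one number, whereas seq_along() and seq_len() do not.

```r
x <- c(10, 20, 30)
y <- 5                     # a length-one numeric

along_x <- seq(x)          # fifth form: 1, 2, 3
along_y <- seq(y)          # interpreted as 1:5, not as 1
safe_y  <- seq_along(y)    # always 1, ..., length(y), here just 1

n <- 0
empty <- seq_len(n)        # integer(0): a loop over it runs zero times
colon <- 1:n               # counts down from 1 to 0, usually not what is wanted
```

In loops, for (i in seq_len(n)) and for (i in seq_along(x)) are therefore the safer idioms when n or length(x) may be zero.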
Very small sequences (with from - to of the order of 1e-14 times the larger of the ends) will return from.
For seq (only), up to two of from, to and by can be supplied as complex values provided length.out or along.with is specified. More generally, the default method of seq will handle classed objects with methods for the Math, Ops and Summary group generics.
seq.int, seq_along and seq_len are primitive.
seq.int and the default method of seq for numeric arguments return a vector of type "integer" or "double": programmers should not rely on which.
seq_along and seq_len return an integer vector, unless it is a long vector when it will be double.
Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988) The New S Language. Wadsworth & Brooks/Cole.
The methods seq.Date and seq.POSIXt.
seq(0, 1, length.out = 11)
seq(stats::rnorm(20)) # effectively 'along'
seq(1, 9, by = 2)     # matches 'end'
seq(1, 9, by = pi)    # stays below 'end'
seq(1, 6, by = 3)
seq(1.575, 5.125, by = 0.05)
seq(17) # same as 1:17, or even better seq_len(17)
The method for seq for objects of class "Date" representing calendar dates.
## S3 method for class 'Date'
seq(from, to, by, length.out = NULL, along.with = NULL, ...)
from | starting date. Required. |
to | end date. Optional. |
by | increment of the sequence. Optional. See ‘Details’. |
length.out | integer, optional. Desired length of the sequence. |
along.with | take the length from the length of this argument. |
... | arguments passed to or from other methods. |
by can be specified in several ways.
A number, taken to be in days.
An object of class difftime.
A character string, containing one of "day", "week", "month", "quarter" or "year". This can optionally be preceded by a (positive or negative) integer and a space, or followed by "s".
See seq.POSIXt for the details of "month".
A vector of class "Date".
## first days of years
seq(as.Date("1910/1/1"), as.Date("1999/1/1"), "years")
## by month
seq(as.Date("2000/1/1"), by = "month", length.out = 12)
## quarters
seq(as.Date("2000/1/1"), as.Date("2003/1/1"), by = "quarter")
## find all 7th of the month between two dates, the last being a 7th.
st <- as.Date("1998-12-17")
en <- as.Date("2000-1-7")
ll <- seq(en, st, by = "-1 month")
rev(ll[ll > st & ll < en])
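As a rough illustration of the different ways of specifying by for dates, the sketch below builds the same weekly sequence three ways and then uses a multiplier and a negative step. The start date is arbitrary and the variable names are hypothetical.

```r
start <- as.Date("2024-01-01")   # arbitrary illustrative start date

by_number   <- seq(start, by = 7, length.out = 3)                              # a number of days
by_difftime <- seq(start, by = as.difftime(1, units = "weeks"), length.out = 3)
by_string   <- seq(start, by = "week", length.out = 3)

by_multiple <- seq(start, by = "2 weeks", length.out = 3)   # integer, space, unit, optional "s"
backwards   <- seq(start, by = "-1 month", length.out = 3)  # a negative step counts back
```

The first three results contain the same dates; which spelling to use is mostly a matter of readability.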
The method for seq for date-time classes.
## S3 method for class 'POSIXt'
seq(from, to, by, length.out = NULL, along.with = NULL, ...)
from | starting date. Required. |
to | end date. Optional. |
by | increment of the sequence. Optional. See ‘Details’. |
length.out | integer, optional. Desired length of the sequence. |
along.with | take the length from the length of this argument. |
... | arguments passed to or from other methods. |
by can be specified in several ways.
A number, taken to be in seconds.
An object of class difftime.
A character string, containing one of "sec", "min", "hour", "day", "DSTday", "week", "month", "quarter" or "year". This can optionally be preceded by a (positive or negative) integer and a space, or followed by "s".
The difference between "day" and "DSTday" is that the former ignores changes to/from daylight savings time and the latter takes the same clock time each day. "week" ignores DST (it is a period of 168 hours), but "7 DSTdays" can be used as an alternative. "month" and "year" allow for DST.
The time zone of the result is taken from from: remember that GMT means UTC (and not the time zone of Greenwich, England) and so does not have daylight savings time.
Using "month" first advances the month without changing the day: if this results in an invalid day of the month, it is counted forward into the next month: see the examples.
A vector of class "POSIXct".
## first days of years
seq(ISOdate(1910,1,1), ISOdate(1999,1,1), "years")
## by month
seq(ISOdate(2000,1,1), by = "month", length.out = 12)
seq(ISOdate(2000,1,31), by = "month", length.out = 4)
## quarters
seq(ISOdate(1990,1,1), ISOdate(2000,1,1), by = "quarter") # or "3 months"
## days vs DSTdays: use c() to lose the time zone.
seq(c(ISOdate(2000,3,20)), by = "day", length.out = 10)
seq(c(ISOdate(2000,3,20)), by = "DSTday", length.out = 10)
seq(c(ISOdate(2000,3,20)), by = "7 DSTdays", length.out = 4)
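Two of the points from the details can be checked with a small sketch: the month step counting forward past an invalid day, and the fixed length of a "week" step. It assumes times in GMT (as produced by ISOdate), so no daylight saving changes are involved; the variable names are illustrative.

```r
# ISOdate() gives 12:00 GMT on the requested day.
month_end <- seq(ISOdate(2000, 1, 31), by = "month", length.out = 4)
# 31 February and 31 April do not exist, so those steps are counted
# forward into the following month.
month_days <- format(month_end, "%Y-%m-%d", tz = "GMT")

weekly <- seq(ISOdate(2000, 3, 20), by = "week", length.out = 3)
step_hours <- as.numeric(diff(weekly), units = "hours")   # length of each step in hours
```

Callers who need "last day of each month" semantics would have to compute that separately; the month step shown here does not do it.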
The default method for sequence generates the sequence seq(from[i], by = by[i], length.out = nvec[i]) for each element i in the parallel (and recycled) vectors from, by and nvec. It then returns the result of concatenating those sequences.
sequence(nvec, ...)
## Default S3 method:
sequence(nvec, from = 1L, by = 1L, ...)
nvec | coerced to a non-negative integer vector each element of which specifies the length of a sequence. |
from | coerced to an integer vector each element of which specifies the first element of a sequence. |
by | coerced to an integer vector each element of which specifies the step size between elements of a sequence. |
... | additional arguments passed to methods. |
Negative values are supported for from and by. sequence(nvec, from, by=0L) is equivalent to rep(from, each=nvec).
This function was originally implemented in R with fewer features, but it has since become more flexible, and the default method is implemented in C for speed.
Of the current version, Michael Lawrence based on code from the S4Vectors Bioconductor package.
sequence(c(3, 2)) # the concatenated sequences 1:3 and 1:2.
#> [1] 1 2 3 1 2
sequence(c(3, 2), from=2L)
#> [1] 2 3 4 2 3
sequence(c(3, 2), from=2L, by=2L)
#> [1] 2 4 6 2 4
sequence(c(3, 2), by=c(-1L, 1L))
#> [1] 1 0 -1 1 2
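The definition given in the description can be mimicked in plain R, which may help when reading calls with vector-valued from and by. This is only a sketch of the documented semantics, not the C implementation; the input vectors are made up, and it assumes an R version in which sequence accepts from and by.

```r
nvec <- c(3L, 2L)
from <- c(2L, 10L)
by   <- c(2L, -1L)

builtin <- sequence(nvec, from = from, by = by)

# Build seq(from[i], by = by[i], length.out = nvec[i]) for each i and concatenate.
manual <- unlist(Map(function(f, b, n) seq(f, by = b, length.out = n),
                     from, by, nvec))
```

Here both give 2, 4, 6 followed by 10, 9.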
A simple low-level interface for serializing to connections.
serialize(object, connection, ascii, xdr = TRUE,
          version = NULL, refhook = NULL)
unserialize(connection, refhook = NULL)
object | R object to serialize. |
connection | an open connection or (for |
ascii | a logical. If |
xdr | a logical: if a binary representation is used, should a big-endian one (XDR) be used? |
version | the workspace format version to use. |
refhook | a hook function for handling reference objects. |
The function serialize serializes object to the specified connection. If connection is NULL then object is serialized to a raw vector, which is returned as the result of serialize.
Sharing of reference objects is preserved within the object but not across separate calls to serialize.
unserialize reads an object (as written by serialize) from connection or a raw vector.
The refhook functions can be used to customize handling of non-system reference objects (all external pointers and weak references, and all environments other than namespace and package environments and .GlobalEnv). The hook function for serialize should return a character vector for references it wants to handle; otherwise it should return NULL. The hook for unserialize will be called with character vectors supplied to serialize and should return an appropriate object.
For a text-mode connection, the default value of ascii is set to TRUE: only ASCII representations can be written to text-mode connections and attempting to use ascii = FALSE will throw an error.
The format consists of a single line followed by the data: the first line contains a single character: X for binary serialization and A for ASCII serialization, followed by a new line. (The format used is identical to that used by readRDS.)
As almost all systems in current use are little-endian, xdr = FALSE can be used to avoid byte-shuffling at both ends when transferring data from one little-endian machine to another (or between processes on the same machine). Depending on the system, this can speed up serialization and unserialization by a factor of up to 3x.
For serialize, NULL unless connection = NULL, when the result is returned in a raw vector.
For unserialize an R object.
These functions have provided a stable interface since R 2.4.0 (when the storage of serialized objects was changed from character to raw vectors). However, the serialization format may change in future versions of R, so this interface should not be used for long-term storage of R objects.
On 32-bit platforms a raw vector is limited to 2^31 - 1 bytes, but R objects can exceed this and their serializations will normally be larger than the objects.
saveRDS for a more convenient interface to serialize an object to a file or connection.
save and load to serialize and restore one or more named objects.
The ‘R Internals’ manual for details of the format used.
x <- serialize(list(1,2,3), NULL)
unserialize(x)
## see also the examples for saveRDS
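A small sketch of serializing to a raw vector, showing the one-line header described in the details and that the different representations all round-trip. The object and variable names are illustrative only.

```r
obj <- list(a = 1:3, b = "text")   # any R object would do

raw_xdr    <- serialize(obj, connection = NULL)                 # binary, big-endian (XDR)
raw_native <- serialize(obj, connection = NULL, xdr = FALSE)    # binary, native byte order
raw_ascii  <- serialize(obj, connection = NULL, ascii = TRUE)   # ASCII representation

# The first line of the format is a single character and a newline.
header_xdr   <- rawToChar(raw_xdr[1:2])
header_ascii <- rawToChar(raw_ascii[1:2])

round_trips <- c(
  identical(unserialize(raw_xdr), obj),
  identical(unserialize(raw_native), obj),
  identical(unserialize(raw_ascii), obj)
)
```

Whether xdr = FALSE is noticeably faster depends on the system and on the object, so it is worth timing before relying on it.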
Performs set union, intersection, (asymmetric!) difference, equality and membership on two vectors.
union(x, y)
intersect(x, y)
setdiff(x, y)
setequal(x, y)
is.element(el, set)
x, y, el, set | vectors (of the same mode) containing a sequence of items (conceptually) with no duplicated values. |
Each of union, intersect, setdiff and setequal will discard any duplicated values in the arguments, and they apply as.vector to their arguments (and so in particular coerce factors to character vectors).
is.element(x, y) is identical to x %in% y.
For union, a vector of a common mode.
For intersect, a vector of a common mode, or NULL if x or y is NULL.
For setdiff, a vector of the same mode as x.
A logical scalar for setequal and a logical of the same length as x for is.element.
‘plotmath’ for the use of union and intersect in plot annotation.
(x <- c(sort(sample(1:20, 9)), NA))
(y <- c(sort(sample(3:23, 7)), NA))
union(x, y)
intersect(x, y)
setdiff(x, y)
setdiff(y, x)
setequal(x, y)
## True for all possible x & y :
setequal( union(x, y),
          c(setdiff(x, y), intersect(x, y), setdiff(y, x)))
is.element(x, y) # length 10
is.element(y, x) # length 8
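Because the examples above use random samples, a deterministic sketch may make the handling of duplicates and the asymmetry of setdiff easier to see. The input vectors are made up for illustration.

```r
x <- c(1, 2, 2, 3)
y <- c(3, 4, 4)

u   <- union(x, y)       # duplicates are dropped: 1 2 3 4
i   <- intersect(x, y)   # 3
dxy <- setdiff(x, y)     # in x but not in y: 1 2
dyx <- setdiff(y, x)     # in y but not in x: 4
eq  <- setequal(c(1, 1, 2), c(2, 1))   # TRUE: order and repeats are ignored
el  <- is.element(c(2, 5), x)          # same as c(2, 5) %in% x

# Factors are coerced to character by as.vector().
f <- union(factor(c("a", "b")), factor("c"))
```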
Functions to set CPU and/or elapsed time limits for top-level computations or the current session.
setTimeLimit(cpu = Inf, elapsed = Inf, transient = FALSE)
setSessionTimeLimit(cpu = Inf, elapsed = Inf)
cpu, elapsed | double (of length one). Set a limit on the total or elapsed CPU time in seconds, respectively. |
transient | logical. If |
setTimeLimit sets limits which apply to each top-level computation, that is a command line (including any continuation lines) entered at the console or from a file. If it is called from within a computation the limits apply to the rest of the computation and (unless transient = TRUE) to subsequent top-level computations.
setSessionTimeLimit sets limits for the rest of the session. Once a session limit is reached it is reset to Inf.
Setting any limit has a small overhead, well under 1% on the systems measured.
Time limits are checked whenever a user interrupt could occur. This will happen frequently in R code and during Sys.sleep, but only at points in compiled C and Fortran code identified by the code author.
‘Total CPU time’ includes that used by child processes where the latter is reported.
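One common way to use a transient limit is to set it inside the computation it should guard and catch the resulting error. The following is a sketch under the assumption that the limit is checked during Sys.sleep, as stated above; the 0.2 second limit and the 3 second sleep are arbitrary illustrative values.

```r
result <- tryCatch({
  setTimeLimit(elapsed = 0.2, transient = TRUE)  # applies to the rest of this computation
  Sys.sleep(3)                                   # the limit is checked while sleeping
  "finished"
}, error = function(e) {
  paste("stopped:", conditionMessage(e))
})

# Make sure no limit is left in place for later computations.
setTimeLimit(cpu = Inf, elapsed = Inf)
```

Long-running compiled code that never checks for interrupts would not be stopped this way.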
Display aspects of connections.
showConnections(all = FALSE)
getConnection(what)
closeAllConnections()
stdin()
stdout()
stderr()
nullfile()
isatty(con)
getAllConnections()
all | logical: if true all connections, including closed ones and the standard ones are displayed. If false only open user-created connections are included. |
what | integer: a row number of the table given by |
con | a connection. |
stdin(), stdout() and stderr() are standard connections corresponding to input, output and error on the console respectively (and not necessarily to file streams). They are text-mode connections of class "terminal" which cannot be opened or closed, and are read-only, write-only and write-only respectively. The stdout() and stderr() connections can be re-directed by sink (and in some circumstances the output from stdout() can be split: see the help page).
The encoding for stdin() when redirected can be set by the command-line flag --encoding.
nullfile() returns the filename of the null device ("/dev/null" on Unix, "nul:" on Windows).
showConnections returns a matrix of information. If a connection object has been lost or forgotten, getConnection will take a row number from the table and return a connection object for that connection, which can be used to close the connection, for example. However, if there is no R level object referring to the connection it will be closed automatically at the next garbage collection (except for gzcon connections).
closeAllConnections closes (and destroys) all user connections, restoring all sink diversions as it does so.
isatty returns true if the connection is one of the class "terminal" connections and it is apparently connected to a terminal, otherwise false. This may not be reliable in embedded applications, including GUI consoles.
getAllConnections returns a sequence of integer connection descriptors for use with getConnection, corresponding to the row names of the table returned by showConnections(all = TRUE).
stdin(), stdout() and stderr() return connection objects.
showConnections returns a character matrix of information with a row for each connection, by default only for open non-standard connections.
getConnection returns a connection object, or NULL.
stdin() refers to the ‘console’ and not to the C-level ‘stdin’ of the process. The distinction matters in GUI consoles (which may not have an active ‘stdin’, and if they do it may not be connected to console input), and also in embedded applications. If you want access to the C-level file stream ‘stdin’, use file("stdin").
When R is reading a script from a file, the file is the ‘console’: this is traditional usage to allow in-line data (see ‘An Introduction to R’ for an example).
showConnections(all = TRUE)
## Not run: 
textConnection(letters)
# oops, I forgot to record that one
showConnections()
#  class     description      mode text   isopen   can read can write
#3 "letters" "textConnection" "r"  "text" "opened" "yes"    "no"
mycon <- getConnection(3)
## End(Not run)
c(isatty(stdin()), isatty(stdout()), isatty(stderr()))
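A short sketch of how the table changes as a user connection is opened and closed, and of the descriptors reported for the standard connections. The text connection and its contents are illustrative only.

```r
before <- nrow(showConnections())          # open user-created connections only

tc <- textConnection(c("alpha", "beta"))   # a throwaway connection
during  <- nrow(showConnections())
all_ids <- getAllConnections()             # also lists the standard connections 0, 1, 2

close(tc)
after <- nrow(showConnections())
```

Keeping the connection object (tc here) and closing it explicitly avoids having to recover it later with getConnection.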
Quote a string to be passed to an operating system shell.
shQuote(string, type = c("sh", "csh", "cmd", "cmd2"))
string | a character vector, usually of length one. |
type | character: the type of shell quoting. Partial matching is supported. |
The default type of quoting supported under Unix-alikes is that for the Bourne shell sh. If the string does not contain single quotes, we can just surround it with single quotes. Otherwise, the string is surrounded in double quotes, which suppresses all special meanings of metacharacters except dollar, backquote and backslash, so these (and of course double quote) are preceded by backslash. This type of quoting is also appropriate for bash, ksh and zsh.
The other type of quoting is for the C-shell (csh and tcsh). Once again, if the string does not contain single quotes, we can just surround it with single quotes. If it does contain single quotes, we can use double quotes provided it does not contain dollar or backquote (and we need to escape backslash, exclamation mark and double quote). As a last resort, we need to split the string into pieces not containing single quotes (some may be empty) and surround each with single quotes, and the single quotes with double quotes.
In Windows, command line interpretation is done by the application as well as the shell. It may depend on the compiler used: Microsoft's rules for the C run-time are given at https://learn.microsoft.com/en-us/cpp/c-language/parsing-c-command-line-arguments?view=msvc-160. It may depend on the whim of the programmer of the application: check its documentation. The type = "cmd" prepares the string for parsing as an argument by Microsoft's rules and makes shQuote safe for use with many applications when used with system or system2. It surrounds the string by double quotes and escapes internal double quotes by a backslash. Any trailing backslashes and backslashes that were originally before double quotes are doubled.
The Windows cmd.exe shell (used by default with shell) uses type = "cmd2" quoting: special characters are prefixed with "^". In some cases, two types of quoting should be used: first for the application, and then type = "cmd2" for cmd.exe. See the examples below.
A character vector of the same length as string.
Loukides, M. et al (2002) Unix Power Tools, Third Edition. O'Reilly. Section 27.12.
Discussion in PR#16636.
Quotes for quoting R code.
sQuote for quoting English text.
test <- "abc$def`gh`i\\j"
cat(shQuote(test), "\n")
## Not run: system(paste("echo", shQuote(test)))
test <- "don't do it!"
cat(shQuote(test), "\n")
tryit <- paste("use the", sQuote("-c"), "switch\nlike this")
cat(shQuote(tryit), "\n")
## Not run: system(paste("echo", shQuote(tryit)))
cat(shQuote(tryit, type = "csh"), "\n")
## Windows-only example, assuming cmd.exe:
perlcmd <- 'print "Hello World\\n";'
## Not run: 
shell(shQuote(paste("perl -e", shQuote(perlcmd, type = "cmd")), type = "cmd2"))
## End(Not run)
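The quoting rules described in the details can be seen on a few short strings. This sketch names the type explicitly in every call so that the result does not depend on the platform; the input strings are made up for illustration.

```r
plain   <- shQuote("report.txt", type = "sh")   # no single quote: wrapped in single quotes
apostr  <- shQuote("it's $HOME", type = "sh")   # has a single quote: double quotes, $ escaped
windows <- shQuote('say "hi"', type = "cmd")    # inner double quotes escaped by backslash

writeLines(c(plain, apostr, windows))
```

Printing with writeLines (rather than print) shows the strings as the shell would receive them, without R's own escaping.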
sign returns a vector with the signs of the corresponding elements of x (the sign of a real number is 1, 0, or -1 if the number is positive, zero, or negative, respectively).
Note that sign does not operate on complex vectors.
sign(x)
x | a numeric vector |
This is an internal generic primitive function: methods can be defined for it directly or via the Math group generic.
sign(pi)    # == 1
sign(-2:3)  # -1 -1 0 1 1 1
On receiving SIGUSR1 R will save the workspace and quit. SIGUSR2 has the same result except that the .Last function and on.exit expressions will not be called.
kill -USR1 pid
kill -USR2 pid
pid | The process ID of the R process. |
The commands history will also be saved if it would be at normal termination.
This is not available on Windows, and possibly on other OSes which do not support these signals.
It is possible that one or more R objects will be undergoing modification at the time the signal is sent. These objects could be saved in a corrupted form.
Sys.getpid to report the process ID for future use.
sink diverts R output to a connection (and stops such diversions).
sink.number() reports how many diversions are in use.
sink.number(type = "message") reports the number of the connection currently being used for error messages.
sink(file = NULL, append = FALSE, type = c("output", "message"),
     split = FALSE)
sink.number(type = c("output", "message"))
file | a writable connection or a character string naming the file to write to, or |
append | logical. If |
type | character string. Either the output stream or the messagesstream. The name will be partially matched so can be abbreviated. |
split | logical: if |
sink diverts R output to a connection (and must be used again to finish such a diversion, see below!). If file is a character string, a file connection with that name will be established for the duration of the diversion.
Normal R output (to connection stdout) is diverted by the default type = "output". Only prompts and (most) messages continue to appear on the console. Messages sent to stderr() (including those from message, warning and stop) can be diverted by sink(type = "message") (see below).
sink() or sink(file = NULL) ends the last diversion (of the specified type). There is a stack of diversions for normal output, so output reverts to the previous diversion (if there was one). The stack is of up to 21 connections (20 diversions).
If file is a connection it will be opened if necessary (in "wt" mode) and closed once it is removed from the stack of diversions.
split = TRUE only splits R output (via Rvprintf) and the default output from writeLines: it does not split all output that might be sent to stdout().
Sink-ing the messages stream should be done only with great care. For that stream file must be an already open connection, and there is no stack of connections.
If file is a character string, the file will be opened using the current encoding. If you want a different encoding (e.g., to represent strings which have been stored in UTF-8), use a file connection, but some ways to produce R output will already have converted such strings to the current encoding.
sink returns NULL.
For sink.number() the number (0, 1, 2, ...) of diversions of output in place.
For sink.number("message") the connection number used for messages, 2 if no diversion has been used.
Do not use a connection that is open for sink for any other purpose. The software will stop you closing one such inadvertently.
Do not sink the messages stream unless you understand the source code implementing it and hence the pitfalls.
Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988) The New S Language. Wadsworth & Brooks/Cole.
Chambers, J. M. (1998) Programming with Data. A Guide to the S Language. Springer.
sink("sink-examp.txt")
i <- 1:10
outer(i, i)
sink()
## capture all the output to a file.
zz <- file("all.Rout", open = "wt")
sink(zz)
sink(zz, type = "message")
try(log("a"))
## revert output back to the console -- only then access the file!
sink(type = "message")
sink()
file.show("all.Rout", delete.file = TRUE)
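The stack behaviour of output diversions can be observed with sink.number(). The sketch below writes to a temporary file (an assumption made so that nothing is left in the working directory) and counts diversions relative to whatever was already in place.

```r
tmp <- tempfile(fileext = ".txt")

depth_before <- sink.number()   # diversions already in place, often 0
sink(tmp)                       # push one diversion onto the stack
depth_during <- sink.number()
print(1:3)                      # goes to the file, not the console
sink()                          # pop it again
depth_after <- sink.number()

captured <- readLines(tmp)
unlink(tmp)
```

For capturing the printed output of a single expression, capture.output is usually more convenient than managing sink calls by hand.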
Returns a matrix of integers indicating the number of their slice in a given array.
slice.index(x, MARGIN)
x | an array. If |
MARGIN | an integer vector giving the dimension numbers to slice by. |
If MARGIN gives a single dimension, then all elements of slice number i with respect to this have value i. In general, slice numbers are obtained by numbering all combinations of indices in the dimensions given by MARGIN in column-major order. I.e., with $m_1$, ..., $m_k$ the dimension numbers (elements of MARGIN) sliced by and $d_{m_1}$, ..., $d_{m_k}$ the corresponding extents, and $n_1 = 1$, $n_2 = d_{m_1}$, ..., $n_k = d_{m_1} \cdots d_{m_{k-1}}$, the number of the slice where dimension $m_1$ has value $i_1$, ..., dimension $m_k$ has value $i_k$ is $1 + n_1 (i_1 - 1) + \cdots + n_k (i_k - 1)$.
An integer array y with dimensions corresponding to those of x.
row and col for determining row and column indexes; in fact, these are special cases of slice.index corresponding to MARGIN equal to 1 and 2, respectively when x is a matrix.
x <- array(1 : 24, c(2, 3, 4))
slice.index(x, 2)
slice.index(x, c(1, 3))
## When slicing by dimensions 1 and 3, slice index 5 is obtained for
## dimension 1 has value 1 and dimension 3 has value 3 (see above):
which(slice.index(x, c(1, 3)) == 5, arr.ind = TRUE)
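The column-major numbering of slices can be checked against a direct computation. This sketch assumes the same 2 x 3 x 4 array as the examples and slices by dimensions 1 and 3, so the multipliers are 1 and the extent of dimension 1 (here 2); the helper names idx, n and expected are illustrative.

```r
x <- array(1:24, dim = c(2, 3, 4))
s <- slice.index(x, MARGIN = c(1, 3))

# Index of every element in each dimension, in storage order.
idx <- arrayInd(seq_along(x), dim(x))

# Multipliers: 1 for dimension 1, extent of dimension 1 for dimension 3.
n <- c(1, dim(x)[1])
expected <- 1 + n[1] * (idx[, 1] - 1) + n[2] * (idx[, 3] - 1)

agree <- all(as.vector(s) == expected)
```

For instance, s[1, 2, 3] is 1 + 1 * (1 - 1) + 2 * (3 - 1) = 5, matching the comment in the examples.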
Extract or replace the contents of a slot or property of an object.
object@name
object@name <- value
object | An object from a formally defined (S4) class, or an object with a class for which '@' or '@<-' S3 methods are defined. |
name | The name of the slot or property, supplied as a character string or unquoted symbol. If |
value | A suitable replacement value for the slot or property. For an S4 object this must be from a class compatible with the class defined for this slot in the definition of the class of |
If object is not an S4 object, then a suitable S3 method for '@' or '@<-' is searched for. If no method is found, then an error is signaled.
If object is an S4 object, then these operators are for slot access, and are enabled only when package methods is loaded (as per default). The slot must be formally defined. (There is an exception for the name .Data, intended for internal use only.) The replacement operator checks that the slot already exists on the object (which it should if the object is really from the class it claims to be). See slot for further details, in particular for the differences between slot() and the @ operator.
These are internal generic operators: see InternalMethods.
The current contents of the slot.
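For the S4 case, slot access and replacement might look like the following sketch. The class Point and its slots are hypothetical and defined here only for illustration.

```r
library(methods)

# A hypothetical S4 class with two numeric slots.
setClass("Point", representation(x = "numeric", y = "numeric"))
p <- new("Point", x = 1, y = 2)

x_value <- p@x     # extract the contents of a slot
p@y <- 5           # replace with a value of a compatible class

# A value that is not compatible with the declared slot class is rejected.
bad <- try(p@x <- "not a number", silent = TRUE)
```

After the failed assignment p is unchanged, since the error occurs before the object is modified.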
Waits for the first of several socket connections and server sockets to become available.
socketSelect(socklist, write = FALSE, timeout = NULL)
socklist | list of open socket connections and server sockets. |
write | logical. If |
timeout | numeric or |
The values in write are recycled if necessary to make up a logical vector the same length as socklist. Socket connections can appear more than once in socklist; this can be useful if you want to determine whether a socket is available for reading or writing.
Logical the same length as socklist indicating whether the corresponding socket connection is available for output or input, depending on the corresponding value of write. Server sockets can only become available for input.
## Not run: 
## test whether socket connection s is available for writing or reading
socketSelect(list(s, s), c(TRUE, FALSE), timeout = 0)
## End(Not run)
This generic function solves the equation a %*% x = b for x, where b can be either a vector or a matrix.
solve(a, b, ...)
## Default S3 method:
solve(a, b, tol, LINPACK = FALSE, ...)
a | a square numeric or complex matrix containing the coefficients of the linear system. Logical matrices are coerced to numeric. |
b | a numeric or complex vector or matrix giving the right-hand side(s) of the linear system. If missing, |
tol | the tolerance for detecting linear dependencies in the columns of |
LINPACK | logical. Defunct and an error. |
... | further arguments passed to or from other methods. |
a or b can be complex, but this uses double complex arithmetic which might not be available on all platforms.
The row and column names of the result are taken from the column names of a and of b respectively. If b is missing the column names of the result are the row names of a. No check is made that the column names of a match the row names of b.
For back-compatibility a can be a (real) QR decomposition, although qr.solve should be called in that case. qr.solve can handle non-square systems.
Unsuccessful results from the underlying LAPACK code will result in an error giving a positive error code: these can only be interpreted by detailed study of the FORTRAN code.
What happens if a and/or b contain missing, NaN or infinite values is platform-dependent, including on the version of LAPACK that is in use.
tol is a tolerance for the (estimated 1-norm) ‘reciprocal condition number’: the check is skipped if tol <= 0.
For historical reasons, the default method accepts a as an object of class "qr" (with a warning) and passes it on to solve.qr.
The default method is an interface to the LAPACK routines DGESV and ZGESV.
LAPACK is from https://netlib.org/lapack/.
Anderson, E. and ten others (1999) LAPACK Users' Guide. Third Edition. SIAM.
Available on-line at https://netlib.org/lapack/lug/lapack_lug.html.
Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988) The New S Language. Wadsworth & Brooks/Cole.
solve.qr for the qr method, chol2inv for inverting from the Cholesky factor, backsolve, qr.solve.
hilbert <- function(n) { i <- 1:n; 1 / outer(i - 1, i, `+`) }
h8 <- hilbert(8); h8
sh8 <- solve(h8)
round(sh8 %*% h8, 3)

A <- hilbert(4)
A[] <- as.complex(A)
## might not be supported on all platforms
try(solve(A))
Sort (or order) a vector or factor (partially) into ascending or descending order. For ordering along more than one variable, e.g., for sorting data frames, see order.
sort(x, decreasing = FALSE, ...)

## Default S3 method:
sort(x, decreasing = FALSE, na.last = NA, ...)

sort.int(x, partial = NULL, na.last = NA, decreasing = FALSE,
         method = c("auto", "shell", "quick", "radix"), index.return = FALSE)
x | for |
decreasing | logical. Should the sort be increasing or decreasing?Not available for partial sorting. |
... | arguments to be passed to or from methods or (for thedefault methods and objects without a class) to |
na.last | for controlling the treatment of |
partial |
|
method | character string specifying the algorithm used. Notavailable for partial sorting. Can be abbreviated. |
index.return | logical indicating if the ordering index vector shouldbe returned as well. Supported by |
sort is a generic function for which methods can be written, and sort.int is the internal method which is compatible with S if only the first three arguments are used.
The default sort method makes use of order for classed objects, which in turn makes use of the generic function xtfrm (and can be slow unless a xtfrm method has been defined or is.numeric(x) is true).
Complex values are sorted first by the real part, then the imaginary part.
The "auto" method selects "radix" for short (less than 2^31 elements) numeric vectors, integer vectors, logical vectors and factors; otherwise, "shell".
Except for method "radix", the sort order for character vectors will depend on the collating sequence of the locale in use: see Comparison. The sort order for factors is the order of their levels (which is particularly appropriate for ordered factors).
If partial is not NULL, it is taken to contain indices of elements of the result which are to be placed in their correct positions in the sorted array by partial sorting. For each of the result values in a specified position, any values smaller than that one are guaranteed to have a smaller index in the sorted array and any values which are greater are guaranteed to have a bigger index in the sorted array. (This is included for efficiency, and many of the options are not available for partial sorting. It is only substantially more efficient if partial has a handful of elements, and a full sort is done (a Quicksort if possible) if there are more than 10.) Names are discarded for partial sorting.
Method "shell" uses Shellsort (a variant from Sedgewick (1986)). If x has names a stable modification is used, so ties are not reordered. (This only matters if names are present.)
Method "quick" uses Singleton (1969)'s implementation of Hoare's Quicksort method and is only available when x is numeric (double or integer) and partial is NULL. (For other types of x Shellsort is used, silently.) It is normally somewhat faster than Shellsort (perhaps 50% faster on vectors of length a million and twice as fast at a billion) but has poor performance in the rare worst case. (Peto's modification using a pseudo-random midpoint is used to make the worst case rarer.) This is not a stable sort, and ties may be reordered.
Method "radix" relies on simple hashing to scale time linearly with the input size, i.e., its asymptotic time complexity is O(n). The specific variant and its implementation originated from the data.table package and are due to Matt Dowle and Arun Srinivasan. For small inputs (< 200), the implementation uses an insertion sort (O(n^2)) that operates in-place to avoid the allocation overhead of the radix sort. For integer vectors of range less than 100,000, it switches to a simpler and faster linear time counting sort. In all cases, the sort is stable; the order of ties is preserved. It is the default method for integer vectors and factors.
The "radix" method generally outperforms the other methods, especially for small integers. Compared to quick sort, it is slightly faster for vectors with large integer or real values (but unlike quick sort, radix is stable and supports all na.last options). The implementation is orders of magnitude faster than shell sort for character vectors, but collation does not respect the locale and so gives incorrect answers even in English locales.
However, there are some caveats for the radix sort:
If x is a character vector, all elements must share the same encoding. Only UTF-8 (including ASCII) and Latin-1 encodings are supported. Collation follows that with LC_COLLATE=C, that is lexicographically byte-by-byte using numerical ordering of bytes.
Long vectors (with 2^31 or more elements) and complex vectors are not supported.
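A short sketch of the collation caveat, using a made-up character vector: with method = "radix" the strings are compared byte by byte, so all upper-case ASCII letters sort before lower-case ones. The result of the default call depends on the locale and is therefore not shown.

```r
words <- c("banana", "Apple", "cherry", "apple", "Banana")

radix_sorted <- sort(words, method = "radix")  # C-locale (byte) ordering
radix_sorted
## [1] "Apple"  "Banana" "apple"  "banana" "cherry"

locale_sorted <- sort(words)                   # follows the current locale
```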
For sort, the result depends on the S3 method which is dispatched. If x does not have a class sort.int is used and its description applies. For classed objects which do not have a specific method the default method will be used and is equivalent to x[order(x, ...)]: this depends on the class having a suitable method for [ (and also that order will work, which requires a xtfrm method).
For sort.int the value is the sorted vector unless index.return is true, when the result is a list with components named x and ix containing the sorted numbers and the ordering index vector. In the latter case, if method == "quick" ties may be reversed in the ordering (unlike sort.list) as quicksort is not stable. For method == "radix", index.return is supported for all na.last modes. The other methods only support index.return when na.last is NA. The index vector refers to element numbers after removal of NAs: see order if you want the original element numbers.
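The difference between the two index conventions can be sketched as follows; the vectors are illustrative, and the behaviour shown relies on the description above (indices counted after NA removal).

```r
x <- c(30, 10, 20)
s <- sort.int(x, index.return = TRUE)
s$x                      # sorted values
s$ix                     # positions in 'x' of the sorted values

x_na <- c(30, NA, 10, 20)
s_na <- sort.int(x_na, index.return = TRUE)  # na.last = NA: NA is removed
kept <- x_na[!is.na(x_na)]
kept[s_na$ix]            # 'ix' indexes the vector *after* NA removal

o <- order(x_na, na.last = NA)               # original element numbers
x_na[o]
```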
All attributes are removed from the return value (see Becker et al., 1988, p.146) except names, which are sorted. (If partial is specified even the names are removed.) Note that this means that the returned value has no class, except for factors and ordered factors (which are treated specially and whose result is transformed back to the original class).
Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988). The New S Language. Wadsworth & Brooks/Cole.
Knuth, D. E. (1998). The Art of Computer Programming, Volume 3: Sorting and Searching, 2nd ed. Addison-Wesley.
Sedgewick, R. (1986). A new upper bound for Shellsort. Journal of Algorithms, 7, 159–173. doi:10.1016/0196-6774(86)90001-5.
Singleton, R. C. (1969). Algorithm 347: an efficient algorithm for sorting with minimal storage. Communications of the ACM, 12, 185–186. doi:10.1145/362875.362901.
‘Comparison’ for how character strings are collated.
order for sorting on or reordering multiple variables.
require(stats)

x <- swiss$Education[1:25]
x; sort(x); sort(x, partial = c(10, 15))

## illustrate 'stable' sorting (of ties):
sort(c(10:3, 2:12), method = "shell", index.return = TRUE) # is stable
## $x : 2  3  3  4  4  5  5  6  6  7  7  8  8  9  9 10 10 11 12
## $ix: 9  8 10  7 11  6 12  5 13  4 14  3 15  2 16  1 17 18 19
sort(c(10:3, 2:12), method = "quick", index.return = TRUE) # is not
## $x : 2  3  3  4  4  5  5  6  6  7  7  8  8  9  9 10 10 11 12
## $ix: 9 10  8  7 11  6 12  5 13  4 14  3 15 16  2 17  1 18 19

x <- c(1:3, 3:5, 10)
is.unsorted(x)                  # FALSE: is sorted
is.unsorted(x, strictly = TRUE) # TRUE : is not (and cannot be)
                                # sorted strictly

## Not run:
## Small speed comparison simulation:
N <- 2000
Sim <- 20
rep <- 1000 # << adjust to your CPU
c1 <- c2 <- numeric(Sim)
for(is in seq_len(Sim)){
  x <- rnorm(N)
  c1[is] <- system.time(for(i in 1:rep) sort(x, method = "shell"))[1]
  c2[is] <- system.time(for(i in 1:rep) sort(x, method = "quick"))[1]
  stopifnot(sort(x, method = "shell") == sort(x, method = "quick"))
}
rbind(ShellSort = c1, QuickSort = c2)
cat("Speedup factor of quick sort():\n")
summary({qq <- c1 / c2; qq[is.finite(qq)]})

## A larger test
x <- rnorm(1e7)
system.time(x1 <- sort(x, method = "shell"))
system.time(x2 <- sort(x, method = "quick"))
system.time(x3 <- sort(x, method = "radix"))
stopifnot(identical(x1, x2))
stopifnot(identical(x1, x3))
## End(Not run)
Generic function to sort an object in the order determined by one or more other objects, typically vectors. A method is defined for data frames to sort its rows (typically by one or more columns), and the default method handles vector-like objects.
sort_by(x, y, ...)

## Default S3 method:
sort_by(x, y, ...)

## S3 method for class 'data.frame'
sort_by(x, y, ...)
x | An object to be sorted, typically a vector or data frame. |
y | Variables to sort by. For the default method, this can be a vector, or more generally anyobject that has a For the |
... | Additional arguments, typically passed on to |
A sorted version of x. If x is a data frame, this means that the rows of x have been reordered to sort the variables specified in y.
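A small sketch with a made-up data frame (df and its columns g and v are illustrative names), assuming an R version that provides sort_by (4.4.0 or later):

```r
df <- data.frame(g = c("b", "a", "b", "a"), v = c(2, 4, 1, 3))

by_formula <- sort_by(df, ~ g + v)           # sort rows by g, then v
by_list    <- sort_by(df, list(df$g, df$v))  # same ordering, given as a list
by_formula$v

## default method: reorder a vector by the order of another
vec <- sort_by(c(10, 20, 30), c(3, 1, 2))
```

Row names travel with the rows, so the original positions remain visible in the result.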
mtcars$am
mtcars$mpg
with(mtcars, sort_by(mpg, am)) # group mpg by am

## data.frame method
sort_by(mtcars, runif(nrow(mtcars))) # random row permutation
sort_by(mtcars, list(mtcars$am, mtcars$mpg))

# formula interface
sort_by(mtcars, ~ am + mpg) |> subset(select = c(am, mpg))
sort_by.data.frame(mtcars, ~ list(am, -mpg)) |> subset(select = c(am, mpg))
source causes R to accept its input from the named file or URL or connection or expressions directly. Input is read and parsed from that file until the end of the file is reached, then the parsed expressions are evaluated sequentially in the chosen environment.
withAutoprint(exprs) is a wrapper for source(exprs = exprs, ..) with different defaults. Its main purpose is to evaluate and auto-print expressions as if in a toplevel context, e.g., as in the R console.
source(file, local = FALSE, echo = verbose, print.eval = echo,
       exprs, spaced = use_file,
       verbose = getOption("verbose"),
       prompt.echo = getOption("prompt"),
       max.deparse.length = 150, width.cutoff = 60L,
       deparseCtrl = "showAttributes",
       chdir = FALSE,
       catch.aborts = FALSE,
       encoding = getOption("encoding"),
       continue.echo = getOption("continue"),
       skip.echo = 0, keep.source = getOption("keep.source"))

withAutoprint(exprs, evaluated = FALSE, local = parent.frame(),
              print. = TRUE, echo = TRUE, max.deparse.length = Inf,
              width.cutoff = max(20, getOption("width")),
              deparseCtrl = c("keepInteger", "showAttributes", "keepNA"),
              skip.echo = 0, ...)
file | aconnection or a character string giving thepathname of the file or URL to read from. The |
local |
|
echo | logical; if |
print.eval ,print. | logical; if |
exprs | for for |
evaluated | logical indicating that |
spaced | logical indicating if newline (hence empty line) shouldbe printed before each expression (when |
verbose | if |
prompt.echo | character; gives the prompt to be used if |
max.deparse.length | integer; is used only if |
width.cutoff | integer, passed to |
deparseCtrl |
|
chdir | logical; if |
catch.aborts | logical indicating that “abort”ing errorsshould be caught. |
encoding | character vector. The encoding(s) to be assumed when |
continue.echo | character; gives the prompt to use oncontinuation lines if |
skip.echo | integer; how many comment lines at the start of thefile to skip if |
keep.source | logical: should the source formatting be retainedwhen echoing expressions, if possible? |
... | (for |
Note that running code via source differs in a few respects from entering it at the R command line. Since expressions are not executed at the top level, auto-printing is not done. So you will need to include explicit print calls for things you want to be printed (and remember that this includes plotting by lattice, FAQ Q7.22). Since the complete file is parsed before any of it is run, syntax errors result in none of the code being run. If an error occurs in running a syntactically correct script, anything assigned into the workspace by code that has been run will be kept (just as from the command line), but diagnostic information such as traceback() will contain additional calls to withVisible.
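The auto-printing difference can be sketched with a two-line temporary script. The file contents and the environment env are made up for illustration; capture.output is used only to make the printed output inspectable.

```r
script <- tempfile(fileext = ".R")
writeLines(c("x <- 2", "x * 21"), script)
env <- new.env()

## Default: nothing is auto-printed, but the last value is returned invisibly
quiet <- capture.output(res <- source(script, local = env))
res$value

## print.eval = TRUE prints visible results as the console would
shown <- capture.output(source(script, local = env, print.eval = TRUE))
shown

unlink(script)
```

Evaluating in a dedicated environment via local keeps the sourced assignment to x out of the global environment.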
All versions of R accept input from a connection with end of line marked by LF (as used on Unix), CRLF (as used on DOS/Windows) or CR (as used on classic Mac OS) and map this to newline. The final line can be incomplete, that is missing the final end-of-line marker.
If keep.source is true (the default in interactive use), the source of functions is kept so they can be listed exactly as input.
Unlike input from a console, lines in the file or on a connection can contain an unlimited number of characters.
When skip.echo > 0, that many comment lines at the start of the file will not be echoed. This does not affect the execution of the code at all. If there are executable lines within the first skip.echo lines, echoing will start with the first of them.
If echo is true and a deparsed expression exceeds max.deparse.length, that many characters are output followed by .... [TRUNCATED].
By default the input is read and parsed in the current encoding of the R session. This is usually what is required, but occasionally re-encoding is needed, e.g. if a file from a UTF-8-using system is to be read on Windows (or vice versa).
The rest of this paragraph applies if file is an actual filename or URL (and not a connection). If encoding = "unknown", an attempt is made to guess the encoding: the result of localeToCharset() is used as a guide. If encoding has two or more elements, they are tried in turn until the file/URL can be read without error in the trial encoding. If an actual encoding is specified (rather than the default or "unknown") in a Latin-1 or UTF-8 locale then character strings in the result will be translated to the current encoding and marked as such (see Encoding).
If file is a connection, it is not possible to re-encode the input inside source, and so the encoding argument is just used to mark character strings in the parsed input in Latin-1 and UTF-8 locales: see parse.
Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988) The New S Language. Wadsworth & Brooks/Cole.
demo which uses source; eval, parse and scan; options("keep.source").
sys.source which is a streamlined version to source a file into an environment.
‘The R Language Definition’ for a discussion of source directives.
someCond <- 7 > 6
## want an if-clause to behave "as top level" wrt auto-printing :
## (all should look "as if on top level", e.g. non-assignments should print:)
if(someCond) withAutoprint({
   x <- 1:12
   x-1
   (y <- (x-5)^2)
   z <- y
   z - 10
})

## If you want to source() a bunch of files, something like
## the following may be useful:
sourceDir <- function(path, trace = TRUE, ...) {
   op <- options(); on.exit(options(op)) # to reset after each
   for (nm in list.files(path, pattern = "[.][RrSsQq]$")) {
      if(trace) cat(nm,":")
      source(file.path(path, nm), ...)
      if(trace) cat("\n")
      options(op)
   }
}

suppressWarnings( rm(x,y) ) # remove 'x' or 'y' from global env
withAutoprint({ x <- 1:2; cat("x=",x, "\n"); y <- x^2 })
## x and y now exist:
stopifnot(identical(x, 1:2), identical(y, x^2))

withAutoprint({ formals(sourceDir); body(sourceDir) },
              max.deparse.length = 20, verbose = TRUE)

## Continuing after (catchable) errors:
tc <- textConnection('1:3
 2 + "3"
 cat(" .. in spite of error: happily continuing! ..\n")
 6*7')
r <- source(tc, catch.aborts = TRUE)
## Error in 2 + "3" ....
## .. in spite of error: happily continuing! ..
stopifnot(identical(r, list(value = 42, visible=TRUE)))
Special mathematical functions related to the beta and gamma functions.
beta(a, b)
lbeta(a, b)

gamma(x)
lgamma(x)
psigamma(x, deriv = 0)
digamma(x)
trigamma(x)

choose(n, k)
lchoose(n, k)
factorial(x)
lfactorial(x)
a ,b | non-negative numeric vectors. |
x ,n | numeric vectors. |
k ,deriv | integer vectors. |
The functions beta and lbeta return the beta function and the natural logarithm of the beta function,
$B(a, b) = \frac{\Gamma(a)\,\Gamma(b)}{\Gamma(a + b)}.$
The formal definition is
$B(a, b) = \int_0^1 t^{a-1} (1 - t)^{b-1} \, dt$
(Abramowitz and Stegun section 6.2.1, page 258). Note that it is only defined in R for non-negative a and b, and is infinite if either is zero.
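As a numerical sanity check of the two expressions for the beta function (a sketch; the values a = 2 and b = 3 are arbitrary, and stats::integrate is used only for verification):

```r
a <- 2; b <- 3

via_beta  <- beta(a, b)
via_gamma <- gamma(a) * gamma(b) / gamma(a + b)        # = 1 * 2 / 24
via_int   <- stats::integrate(function(t) t^(a - 1) * (1 - t)^(b - 1),
                              lower = 0, upper = 1)$value

c(via_beta, via_gamma, via_int)   # all approximately 1/12
lbeta(a, b) - log(beta(a, b))     # approximately 0
beta(0, 1)                        # infinite when either argument is zero
```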
The functions gamma and lgamma return the gamma function $\Gamma(x)$ and the natural logarithm of the absolute value of the gamma function. The gamma function is defined by (Abramowitz and Stegun section 6.1.1, page 255)
$\Gamma(x) = \int_0^\infty t^{x-1} e^{-t} \, dt$
for all real x except zero and negative integers (when NaN is returned). There will be a warning on possible loss of precision for values which are too close (within about $10^{-8}$) to a negative integer less than ‘-10’.
factorial(x) ($x!$ for non-negative integer x) is defined to be gamma(x+1) and lfactorial to be lgamma(x+1).
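A brief sketch of these identities, and of why the logarithmic form is useful once the factorial itself overflows double precision:

```r
factorial(5)                     # 120
gamma(6)                         # the same value, by definition
lfactorial(5) - lgamma(6)        # 0

big <- suppressWarnings(factorial(171))  # exceeds the largest double
big                              # Inf
lfactorial(171)                  # the logarithm is still finite
```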
The functions digamma and trigamma return the first and second derivatives of the logarithm of the gamma function. psigamma(x, deriv) (deriv >= 0) computes the deriv-th derivative of $\psi(x)$.
$\mathrm{digamma}(x) = \psi(x) = \frac{d}{dx} \ln \Gamma(x) = \frac{\Gamma'(x)}{\Gamma(x)}$
$\psi$ and its derivatives, the psigamma() functions, are often called the ‘polygamma’ functions, e.g. in Abramowitz and Stegun (section 6.4.1, page 260); and higher derivatives (deriv = 2:4) have occasionally been called ‘tetragamma’, ‘pentagamma’, and ‘hexagamma’.
The functions choose and lchoose return binomial coefficients and the logarithms of their absolute values. Note that choose(n, k) is defined for all real numbers $n$ and integer $k$. For $k \ge 1$ it is defined as $n(n-1)\cdots(n-k+1) / k!$, as $1$ for $k = 0$ and as $0$ for negative $k$. Non-integer values of k are rounded to an integer, with a warning. choose(*, k) uses direct arithmetic (instead of [l]gamma calls) for small k, for speed and accuracy reasons. Note the function combn (package utils) for enumeration of all possible combinations.
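The three cases of the definition can be worked through directly; the arguments below are arbitrary illustrations.

```r
choose(5, 2)      # 5 * 4 / 2!              = 10
choose(-2, 3)     # (-2)(-3)(-4) / 3!       = -4
choose(0.5, 2)    # (0.5)(-0.5) / 2!        = -0.125
choose(5, 0)      # k = 0 gives 1
choose(5, -1)     # negative k gives 0
lchoose(5, 2)     # log(10)
```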
The gamma, lgamma, digamma and trigamma functions are internal generic primitive functions: methods can be defined for them individually or via the Math group generic.
gamma, lgamma, beta and lbeta are based on C translations of Fortran subroutines by W. Fullerton of Los Alamos Scientific Laboratory (now available as part of SLATEC).
digamma, trigamma and psigamma for x >= 0 are based on
Amos, D. E. (1983). A portable Fortran subroutine for derivatives of the psi function, Algorithm 610, ACM Transactions on Mathematical Software 9(4), 494–502.
For x < 0 and deriv <= 5, the reflection formula (6.4.7) of Abramowitz and Stegun is used.
Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988) The New S Language. Wadsworth & Brooks/Cole. (For gamma and lgamma.)
Abramowitz, M. and Stegun, I. A. (1972) Handbook of Mathematical Functions. New York: Dover. https://en.wikipedia.org/wiki/Abramowitz_and_Stegun provides links to the full text which is in public domain.
Chapter 6: Gamma and Related Functions.
Arithmetic for simple, sqrt for miscellaneous mathematical functions and Bessel for the real Bessel functions.
For the incomplete gamma function see pgamma.
require(graphics)

choose(5, 2)
for (n in 0:10) print(choose(n, k = 0:n))

factorial(100)
lfactorial(10000)

## gamma has 1st order poles at 0, -1, -2, ...
## this will generate loss of precision warnings, so turn off
op <- options("warn")
options(warn = -1)
x <- sort(c(seq(-3, 4, length.out = 201), outer(0:-3, (-1:1)*1e-6, `+`)))
plot(x, gamma(x), ylim = c(-20,20), col = "red", type = "l", lwd = 2,
     main = expression(Gamma(x)))
abline(h = 0, v = -3:0, lty = 3, col = "midnightblue")
options(op)

x <- seq(0.1, 4, length.out = 201); dx <- diff(x)[1]
par(mfrow = c(2, 3))
for (ch in c("", "l","di","tri","tetra","penta")) {
  is.deriv <- nchar(ch) >= 2
  nm <- paste0(ch, "gamma")
  if (is.deriv) {
    dy <- diff(y) / dx # finite difference
    der <- which(ch == c("di","tri","tetra","penta")) - 1
    nm2 <- paste0("psigamma(*, deriv = ", der,")")
    nm <- if(der >= 2) nm2 else paste(nm, nm2, sep = " ==\n")
    y <- psigamma(x, deriv = der)
  } else {
    y <- get(nm)(x)
  }
  plot(x, y, type = "l", main = nm, col = "red")
  abline(h = 0, col = "lightgray")
  if (is.deriv) lines(x[-1], dy, col = "blue", lty = 2)
}
par(mfrow = c(1, 1))

## "Extended" Pascal triangle:
fN <- function(n) formatC(n, width=2)
for (n in -4:10) {
    cat(fN(n),":", fN(choose(n, k = -2:max(3, n+2))))
    cat("\n")
}

## R code version of choose() [simplistic; warning for k < 0]:
mychoose <- function(r, k)
    ifelse(k <= 0, (k == 0),
           sapply(k, function(k) prod(r:(r-k+1))) / factorial(k))
k <- -1:6
cbind(k = k, choose(1/2, k), mychoose(1/2, k))

## Binomial theorem for n = 1/2 ;
## sqrt(1+x) = (1+x)^(1/2) = sum_{k=0}^Inf choose(1/2, k) * x^k :
k <- 0:10 # 10 is sufficient for ~ 9 digit precision:
sqrt(1.25)
sum(choose(1/2, k)* .25^k)
split divides the data in the vector x into the groups defined by f. The replacement forms replace values corresponding to such a division. unsplit reverses the effect of split.
split(x, f, drop = FALSE, ...)

## Default S3 method:
split(x, f, drop = FALSE, sep = ".", lex.order = FALSE, ...)

split(x, f, drop = FALSE, ...) <- value
unsplit(value, f, drop = FALSE)
x | vector or data frame containing values to be divided into groups. |
f | a ‘factor’ in the sense that |
drop | logical indicating if levels that do not occur should be dropped(if |
value | a list of vectors or data frames compatible with asplitting of |
sep | character string, passed to |
lex.order | logical, passed to |
... | further potential arguments passed to methods. |
split and split<- are generic functions with default and data.frame methods. The data frame method can also be used to split a matrix into a list of matrices, and the replacement form likewise, provided they are invoked explicitly.
unsplit works with lists of vectors or data frames (assumed to have compatible structure, as if created by split). It puts elements or rows back in the positions given by f. In the data frame case, row names are obtained by unsplitting the row name vectors from the elements of value.
f is recycled as necessary and if the length of x is not a multiple of the length of f a warning is printed.
Any missing values in f are dropped together with the corresponding values of x.
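Recycling of f and the dropping of missing groups can be seen in a small sketch; the inputs are illustrative.

```r
## f = 1:2 is recycled to length 6, so odd and even positions are separated
odd_even <- split(1:6, 1:2)
odd_even

## the element whose group is NA (the second one) is dropped
by_letter <- split(1:4, c("a", NA, "b", "a"))
by_letter
```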
The default method calls interaction when f is a list. If the levels of the factors contain ‘.’ the factors may not be split as expected, unless sep is set to a string not present in the factor levels.
The value returned from split is a list of vectors containing the values for the groups. The components of the list are named by the levels of f (after converting to a factor, or if already a factor and drop = TRUE, dropping unused levels).
The replacement forms return their right hand side. unsplit returns a vector or data frame for which split(x, f) equals value.
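A minimal round-trip sketch, with made-up data, showing unsplit undoing split and the replacement form modifying values group by group:

```r
x <- c(5, 3, 8, 1)
f <- c("a", "b", "a", "b")

parts <- split(x, f)            # list(a = c(5, 8), b = c(3, 1))
restored <- unsplit(parts, f)   # elements return to their original positions

## replacement form: centre each group on its own mean
centred <- x
split(centred, f) <- lapply(split(x, f), function(v) v - mean(v))
centred
```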
Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988) The New S Language. Wadsworth & Brooks/Cole.
cut to categorize numeric values.
strsplit to split strings.
require(stats); require(graphics)
n <- 10; nn <- 100
g <- factor(round(n * runif(n * nn)))
x <- rnorm(n * nn) + sqrt(as.numeric(g))
xg <- split(x, g)
boxplot(xg, col = "lavender", notch = TRUE, varwidth = TRUE)
sapply(xg, length)
sapply(xg, mean)

### Calculate 'z-scores' by group (standardize to mean zero, variance one)
z <- unsplit(lapply(split(x, g), scale), g)

# or

zz <- x
split(zz, g) <- lapply(split(x, g), scale)

# and check that the within-group std dev is indeed one
tapply(z, g, sd)
tapply(zz, g, sd)

### data frame variation

## Notice that assignment form is not used since a variable is being added

g <- airquality$Month
l <- split(airquality, g)

## Alternative using a formula
identical(l, split(airquality, ~ Month))

l <- lapply(l, transform, Oz.Z = scale(Ozone))
aq2 <- unsplit(l, g)
head(aq2)
with(aq2, tapply(Oz.Z, Month, sd, na.rm = TRUE))

### Split a matrix into a list by columns
ma <- cbind(x = 1:10, y = (-4:5)^2)
split(ma, col(ma))

split(1:10, 1:2)
A wrapper for the C function sprintf, that returns a character vector containing a formatted combination of text and variable values.
sprintf(fmt, ...)
gettextf(fmt, ..., domain = NULL, trim = TRUE)
fmt | a character vector of format strings, each of up to 8192 bytes. |
... | values to be passed into |
trim ,domain | see |
sprintf is a wrapper for the system sprintf C-library function. Attempts are made to check that the mode of the values passed match the format supplied, and R's special values (NA, Inf, -Inf and NaN) are handled correctly.
gettextf is a convenience function which provides C-style string formatting with possible translation of the format string.
The arguments (including fmt) are recycled if possible a whole number of times to the length of the longest, and then the formatting is done in parallel. Zero-length arguments are allowed and will give a zero-length result. All arguments are evaluated even if unused, and hence some types (e.g., "symbol" or "language", see typeof) are not allowed. Arguments unused by fmt result in a warning. (The format %.0s can be used to “skip” an argument.)
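The recycling rules can be sketched with a few calls; the strings and values are illustrative.

```r
## arguments are formatted in parallel
pairs <- sprintf("%s has %d", c("a", "b"), 1:2)
pairs                                   # "a has 1" "b has 2"

## a length-one fmt is recycled along the longest argument
items <- sprintf("item %d", 1:3)
items                                   # "item 1" "item 2" "item 3"

## a zero-length argument gives a zero-length result
empty <- sprintf("%d", integer(0))
length(empty)                           # 0

## %.0s consumes an argument without printing it
skipped <- sprintf("%.0s%s", "ignored", "kept")
skipped                                 # "kept"
```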
The following is abstracted from Kernighan and Ritchie (1988): however the actual implementation will follow the C99 standard and fine details (especially the behaviour under user error) may depend on the platform. References to numbered arguments come from POSIX.
The string fmt contains normal characters, which are passed through to the output string, and also conversion specifications which operate on the arguments provided through .... The allowed conversion specifications start with a % and end with one of the letters in the set aAdifeEgGosxX%. These letters denote the following types:
d, i, o, x, X
Integer value, o being octal, x and X being hexadecimal (using the same case for a-f as the code). Numeric variables with exactly integer values will be coerced to integer. Formats d and i can also be used for logical variables, which will be converted to 0, 1 or NA.
f
Double precision value, in “fixed point” decimal notation of the form "[-]mmm.ddd". The number of decimal places ("d") is specified by the precision: the default is 6; a precision of 0 suppresses the decimal point. Non-finite values are converted to NA, NaN or (perhaps a sign followed by) Inf.
e, E
Double precision value, in “exponential” decimal notation of the form [-]m.ddde[+-]xx or [-]m.dddE[+-]xx.
g, G
Double precision value, in %e or %E format if the exponent is less than -4 or greater than or equal to the precision, and %f format otherwise. (The precision (default 6) specifies the number of significant digits here, whereas in %f, %e, it is the number of digits after the decimal point.)
a, A
Double precision value, in binary notation of the form [-]0xh.hhhp[+-]d. This is a binary fraction expressed in hex multiplied by a (decimal) power of 2. The number of hex digits after the decimal point is specified by the precision: the default is enough digits to represent exactly the internal binary representation. Non-finite values are converted to NA, NaN or (perhaps a sign followed by) Inf. Format %a uses lower-case for x, p and the hex values: format %A uses upper-case.
This should be supported on all platforms as it is a feature of C99. The format is not uniquely defined: although it would be possible to make the leading h always zero or one, this is not always done. Most systems will suppress trailing zeros, but a few do not. On a well-written platform, for normal numbers there will be a leading one before the decimal point plus (by default) 13 hexadecimal digits, hence 53 bits. The treatment of denormalized (aka ‘subnormal’) numbers is very platform-dependent.
s
Character string. Character NAs are converted to "NA".
%
Literal % (none of the extra formatting characters given below are permitted in this case).
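The conversion letters listed above can be tried one at a time. The following is a minimal sketch; the expected strings in the comments assume a C99-conforming platform with two-digit exponents, and %a is shown without an expected value because its output is platform-dependent.

```r
int_dec <- sprintf("%d", 42L)               # "42"
int_oct <- sprintf("%o", 8L)                # "10"
int_hex <- sprintf("%x %X", 255L, 255L)     # "ff FF"
fixed   <- sprintf("%f", 1.5)               # "1.500000" (default precision 6)
expo    <- sprintf("%e", 12345.678)         # "1.234568e+04"
g_small <- sprintf("%g", c(1e-4, 1e-5))     # "0.0001" "1e-05" (exponent < -4 switches to %e)
g_large <- sprintf("%g", c(123456, 1e6))    # "123456" "1e+06" (exponent >= precision switches to %e)
hexfrac <- sprintf("%a", 1)                 # binary fraction in hex; exact form varies by platform
na_str  <- sprintf("%s", NA_character_)     # "NA"
percent <- sprintf("100%%")                 # "100%"
```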
Conversion by as.character is used for non-character arguments with s and by as.double for non-double arguments with f, e, E, g, G. NB: the length is determined before conversion, so do not rely on the internal coercion if this would change the length. The coercion is done only once, so if length(fmt) > 1 then all elements must expect the same types of arguments.
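The coercion rules can be seen in a short sketch (variable names are illustrative only):

```r
# an integer-valued double is accepted by %d
whole   <- sprintf("%d", 3)                    # "3"
# logical values are converted to 0, 1 or NA
logic   <- sprintf("%d", c(TRUE, FALSE, NA))   # "1" "0" "NA"
# an integer is passed through as.double() for %f
as_dbl  <- sprintf("%.2f", 2L)                 # "2.00"
# a number is passed through as.character() for %s
as_chr  <- sprintf("%s", 1.5)                  # "1.5"
# a non-integer double is an error for %d
bad     <- tryCatch(sprintf("%d", 7.1), error = function(e) "error")
```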
In addition, between the initial % and the terminating conversion character there may be, in any order:
m.n
Two numbers separated by a period, denoting the field width (m) and the precision (n).
-
Left adjustment of converted argument in its field.
+
Always print number with sign: by default only negative numbers are printed with a sign.
a space
Prefix a space if the first character is not a sign.
0
For numbers, pad to the field width with leading zeros. For characters, this zero-pads on some platforms and is ignored on others.
#
Specifies “alternate output” for numbers, its action depending on the type: for x or X, 0x or 0X will be prefixed to a non-zero result. For e, E, f, g and G, the output will always have a decimal point; for g and G, trailing zeros will not be removed.
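A minimal sketch of the flags above, applied to numbers only (zero-padding of strings is platform-dependent and therefore left out):

```r
width_prec <- sprintf("[%8.3f]", pi)   # "[   3.142]"  width 8, precision 3
left       <- sprintf("[%-6d]", 42L)   # "[42    ]"    left-adjusted
plus       <- sprintf("[%+d]", 42L)    # "[+42]"       sign always printed
space      <- sprintf("[% d]", 42L)    # "[ 42]"       space where the sign would go
zero_pad   <- sprintf("[%06.2f]", pi)  # "[003.14]"    padded with leading zeros
alt_hex    <- sprintf("%#x", 255L)     # "0xff"        alternate output for x
alt_point  <- sprintf("%#.0f", 3)      # "3."          decimal point kept
alt_g      <- sprintf("%#g", 1.5)      # "1.50000"     trailing zeros kept
```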
Further, immediately after % may come 1$ to 99$ to refer to a numbered argument: this allows arguments to be referenced out of order and is mainly intended for translators of error messages. If this is done it is best if all formats are numbered: if not the unnumbered ones process the arguments in order. See the examples. This notation allows arguments to be used more than once, in which case they must be used as the same type (integer, double or character).
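For instance, a translated message may need its arguments in a different order from the call. A minimal sketch with made-up message text:

```r
# arguments referenced out of order
reordered <- sprintf("%2$s has %1$d items", 3L, "cart")   # "cart has 3 items"

# one argument used twice, both times as a character string
repeated <- sprintf("%1$s-%1$s", "ab")                    # "ab-ab"
```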
A field width or precision (but not both) may be indicated by an asterisk *: in this case an argument specifies the desired number. A negative field width is taken as a '-' flag followed by a positive field width. A negative precision is treated as if the precision were omitted. The argument should be integer, but a double argument will be coerced to integer.
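A short sketch of the asterisk notation; the argument that supplies the width or precision comes immediately before the value being formatted:

```r
star_width <- sprintf("%*d", 5L, 42L)    # "   42"  width taken from the first argument
star_left  <- sprintf("%*d", -5L, 42L)   # "42   "  negative width acts as the '-' flag
star_prec  <- sprintf("%.*f", 2L, pi)    # "3.14"   precision taken from the first argument
```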
There is a limit of 8192 bytes on elements of fmt, and on strings included from a single % letter conversion specification.
Field widths and precisions of %s conversions are interpreted as bytes, not characters, as described in the C standard.
The C doubles used for R numerical vectors have signed zeros, which sprintf may output as -0, -0.000 ....
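The signed zero can be made visible as follows. This is a sketch assuming IEEE 754 arithmetic, which is what R uses on current platforms; the exact text produced for negative zero is left to the C library.

```r
neg_zero <- -0
# a negative zero is distinguishable by the sign of its reciprocal
is_negative_zero <- identical(1 / neg_zero, -Inf)

formatted <- sprintf("%.3f", neg_zero)       # typically "-0.000"
# adding a positive zero yields +0, which removes the sign
normalised <- sprintf("%.3f", neg_zero + 0)  # "0.000"
```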
A character vector of length that of the longest input. If any element of fmt or any character argument is declared as UTF-8, the element of the result will be in UTF-8 and have the encoding declared as UTF-8. Otherwise it will be in the current locale's encoding.
The format string is passed down the OS's sprintf function, and incorrect formats can cause the latter to crash the R process. R does perform sanity checks on the format, but not all possible user errors on all platforms have been tested, and some might be terminal.
The behaviour on inputs not documented here is ‘undefined’, which means it is allowed to differ by platform.
Original code by Jonathan Rougier.
Kernighan, B. W. and Ritchie, D. M. (1988) The C Programming Language. Second edition, Prentice Hall. Describes the format options in table B-1 in the Appendix.
The C Standards, especially ISO/IEC 9899:1999 for ‘C99’. Links can be found at https://developer.r-project.org/Portability.html.
https://pubs.opengroup.org/onlinepubs/9699919799/functions/snprintf.html for POSIX extensions such as numbered arguments.
man sprintf on a Unix-alike system.
formatC for a way of formatting vectors of numbers in a similar fashion.
paste for another way of creating a vector combining text and values.
gettext for the mechanisms for the automated translation of text.
## be careful with the format: most things in R are floats
## only integer-valued reals get coerced to integer.
sprintf("%s is %f feet tall\n", "Sven", 7.1)      # OK
try(sprintf("%s is %i feet tall\n", "Sven", 7.1)) # not OK
sprintf("%s is %i feet tall\n", "Sven", 7)        # OK

## use a literal % :
sprintf("%.0f%% said yes (out of a sample of size %.0f)", 66.666, 3)

## various formats of pi :
sprintf("%f", pi)
sprintf("%.3f", pi)
sprintf("%1.0f", pi)
sprintf("%5.1f", pi)
sprintf("%05.1f", pi)
sprintf("%+f", pi)
sprintf("% f", pi)
sprintf("%-10f", pi) # left justified
sprintf("%e", pi)
sprintf("%E", pi)
sprintf("%g", pi)
sprintf("%g", 1e6 * pi)   # -> exponential
sprintf("%.9g", 1e6 * pi) # -> "fixed"
sprintf("%G", 1e-6 * pi)

## no truncation:
sprintf("%1.f", 101)

## re-use one argument three times, show difference between %x and %X
xx <- sprintf("%1$d %1$x %1$X", 0:15)
xx <- matrix(xx, dimnames = list(rep("", 16), "%d%x%X"))
noquote(format(xx, justify = "right"))

## More sophisticated:
sprintf("min 10-char string '%10s'",
        c("a", "ABC", "and an even longer one"))

n <- 1:18
sprintf(paste0("e with %2d digits = %.", n, "g"), n, exp(1))

## Platform-dependent bad example: may pad with spaces or zeroes
sprintf("%09s", month.name)

## Using arguments out of order
sprintf("second %2$1.0f, first %1$5.2f, third %3$1.0f", pi, 2, 3)

## Using asterisk for width or precision
sprintf("precision %.*f, width '%*.3f'", 3, pi, 8, pi)

## Asterisk and argument re-use, 'e' example reiterated:
sprintf("e with %1$2d digits = %2$.*1$g", n, exp(1))

## re-cycle arguments
sprintf("%s %d", "test", 1:3)

## binary output showing rounding/representation errors
x <- seq(0, 1.0, 0.1); y <- c(0,.1,.2,.3,.4,.5,.6,.7,.8,.9,1)
cbind(x, sprintf("%a", x), sprintf("%a", y))
Single or double quote text by combining with appropriate single or double left and right quotation marks.
sQuote(x, q = getOption("useFancyQuotes"))
dQuote(x, q = getOption("useFancyQuotes"))
x | an R object, to be coerced to a character vector. |
q | the kind of quotes to be used, see ‘Details’. |
The purpose of the functions is to provide a simple means of markup for quoting text to be used in the R output, e.g., in warnings or error messages.
The choice of the appropriate quotation marks depends on both the locale and the available character sets. Older Unix/X11 fonts displayed the grave accent (ASCII code 0x60) and the apostrophe (0x27) in a way that they could also be used as matching open and close single quotation marks. Using modern fonts, or non-Unix systems, these characters no longer produce matching glyphs. Unicode provides left and right single quotation mark characters (U+2018 and U+2019); if Unicode markup cannot be assumed to be available, it seems good practice to use the apostrophe as a non-directional single quotation mark.
Similarly, Unicode has left and right double quotation mark characters (U+201C and U+201D); if only ASCII's typewriter characteristics can be employed, then the ASCII quotation mark (0x22) should be used as both the left and right double quotation mark.
Some other locales also have the directional quotation marks, notably on Windows. TeX uses grave and apostrophe for the directional single quotation marks, and doubled grave and doubled apostrophe for the directional double quotation marks.
What rendering is used depends on q which by default depends on the options setting for useFancyQuotes. If this is FALSE then the undirectional ASCII quotation style is used. If this is TRUE (the default), Unicode directional quotes are used where available (currently, UTF-8 locales on Unix-alikes and all Windows locales except C): if set to "UTF-8" UTF-8 markup is used (whatever the current locale). If set to "TeX", TeX-style markup is used. Finally, if this is set to a character vector of length four, the first two entries are used for beginning and ending single quotes and the second two for beginning and ending double quotes: this can be used to implement non-English quoting conventions such as the use of guillemets.
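Passing q explicitly makes the result independent of the session's useFancyQuotes option. A minimal sketch; the four-element vector uses ASCII stand-ins for guillemets so that it does not depend on the locale:

```r
plain_single <- sQuote("x", q = FALSE)    # 'x'   ASCII apostrophes
plain_double <- dQuote("x", q = FALSE)    # "x"   ASCII quotation marks
tex_single   <- sQuote("x", q = "TeX")    # `x'
tex_double   <- dQuote("x", q = "TeX")    # ``x''

# elements 1-2 are the single quotes, elements 3-4 the double quotes
custom <- c("<", ">", "<<", ">>")
custom_single <- sQuote("x", q = custom)  # <x>
custom_double <- dQuote("x", q = custom)  # <<x>>
```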
Where fancy quotes are used, you should be aware that they may not be rendered correctly as not all fonts include the requisite glyphs: for example some have directional single quotes but not directional double quotes.
A character vector of the same length as x (after any coercion) in the current locale's encoding.
Markus Kuhn, “ASCII and Unicode quotation marks”. https://www.cl.cam.ac.uk/~mgk25/ucs/quotes.html
Quotes for quoting R code.
shQuote for quoting OS commands.
op <- options("useFancyQuotes")
paste("argument", sQuote("x"), "must be non-zero")
options(useFancyQuotes = FALSE)
cat("\ndistinguish plain", sQuote("single"), "and",
    dQuote("double"), "quotes\n")
options(useFancyQuotes = TRUE)
cat("\ndistinguish fancy", sQuote("single"), "and",
    dQuote("double"), "quotes\n")
options(useFancyQuotes = "TeX")
cat("\ndistinguish TeX", sQuote("single"), "and",
    dQuote("double"), "quotes\n")
if(l10n_info()$`Latin-1`) {
    options(useFancyQuotes = c("\xab", "\xbb", "\xbf", "?"))
    cat("\n", sQuote("guillemet"), "and",
        dQuote("Spanish question"), "styles\n")
} else if(l10n_info()$`UTF-8`) {
    options(useFancyQuotes = c("\xc2\xab", "\xc2\xbb", "\xc2\xbf", "?"))
    cat("\n", sQuote("guillemet"), "and",
        dQuote("Spanish question"), "styles\n")
}
options(op)
These functions are for working with source files and more generally with “source references” ("srcref"), i.e., references to source code. The resulting data is used for printing and source level debugging, and is typically available in interactive R sessions, namely when options(keep.source = TRUE).
srcfile(filename, encoding = getOption("encoding"), Enc = "unknown")
srcfilecopy(filename, lines, timestamp = Sys.time(), isFile = FALSE)
srcfilealias(filename, srcfile)
getSrcLines(srcfile, first, last)
srcref(srcfile, lloc)
## S3 method for class 'srcfile'
print(x, ...)
## S3 method for class 'srcfile'
summary(object, ...)
## S3 method for class 'srcfile'
open(con, line, ...)
## S3 method for class 'srcfile'
close(con, ...)
## S3 method for class 'srcref'
print(x, useSource = TRUE, ...)
## S3 method for class 'srcref'
summary(object, useSource = FALSE, ...)
## S3 method for class 'srcref'
as.character(x, useSource = TRUE, to = x, ...)
.isOpen(srcfile)
filename | The name of a file. |
encoding | The character encoding to assume for the file. |
Enc | The encoding with which to make strings: see the |
lines | A character vector of source lines. Other R objects will be coerced to character. |
timestamp | The timestamp to use on a copy of a file. |
isFile | Is this |
srcfile | A |
first ,last ,line | Line numbers. |
lloc | A vector of four, six or eight values giving a source location; see ‘Details’. |
x ,object ,con | An object of the appropriate class. |
useSource | Whether to read the |
to | An optional second |
... | Additional arguments to the methods; these will be ignored. |
These functions and classes handle source code references.
The srcfile function produces an object of class srcfile, which contains the name and directory of a source code file, along with its timestamp, for use in source level debugging (not yet implemented) and source echoing. The encoding of the file is saved; see file for a discussion of encodings, and iconvlist for a list of allowable encodings on your platform.
The srcfilecopy function produces an object of the descendant class srcfilecopy, which saves the source lines in a character vector. It copies the value of the isFile argument, to help debuggers identify whether this text comes from a real file in the file system.
The srcfilealias function produces an object of the descendant class srcfilealias, which gives an alternate name to another srcfile. This is produced by the parser when a #line directive is used.
The getSrcLines function reads the specified lines from srcfile.
The srcref function produces an object of class srcref, which describes a range of characters in a srcfile. The lloc value gives the following values:
c(first_line, first_byte, last_line, last_byte, first_column, last_column, first_parsed, last_parsed)
Bytes (elements 2, 4) and columns (elements 5, 6) may be different due to multibyte characters. If only four values are given, the columns and bytes are assumed to match. Lines (elements 1, 3) and parsed lines (elements 7, 8) may differ if a #line directive is used in code: the former will respect the directive, the latter will just count lines. If only 4 or 6 elements are given, the parsed lines will be assumed to match the lines.
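The lloc layout can be tried without a file on disk by building a srcfilecopy from in-memory lines. This is a minimal sketch; the file name "demo.R" and the two source lines are made up for the example.

```r
# two lines of (hypothetical) source held in memory
src <- srcfilecopy("demo.R", c("x <- 1", "y <- x + 1"))

# four values: first_line, first_byte, last_line, last_byte
whole_line <- srcref(src, c(2L, 1L, 2L, 10L))
part_line  <- srcref(src, c(2L, 6L, 2L, 10L))

text_whole <- as.character(whole_line)   # "y <- x + 1"
text_part  <- as.character(part_line)    # "x + 1"

# the four values are expanded to eight: columns copy the bytes,
# parsed lines copy the lines
expanded <- as.integer(part_line)
```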
Methods are defined for print, summary, open, and close for classes srcfile and srcfilecopy. The open method opens its internal file connection at a particular line; if it was already open, it will be repositioned to that line.
Methods are defined for print, summary and as.character for class srcref. The as.character method will read the associated source file to obtain the text corresponding to the reference. If the to argument is given, it should be a second srcref that follows the first, in the same file; they will be treated as one reference to the whole range. The exact behaviour depends on the class of the source file. If the source file inherits from class srcfilecopy, the lines are taken from the saved copy using the “parsed” line counts. If not, an attempt is made to read the file, and the original line numbers of the srcref record (i.e., elements 1 and 3) are used. If an error occurs (e.g., the file no longer exists), text like ‘<srcref: "file" chars 1:1 to 2:10>’ will be returned instead, indicating the line:column ranges of the first and last character. The summary method defaults to this type of display.
Lists of srcref objects may be attached to expressions as the "srcref" attribute. (The list of srcref objects should be the same length as the expression.) By default, expressions are printed by print.default using the associated srcref. To see deparsed code instead, call print with argument useSource = FALSE. If a srcref object is printed with useSource = FALSE, the ‘<srcref: ....>’ record will be printed.
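The "srcref" attribute can be inspected on the result of parse(). A minimal sketch, using keep.source = TRUE explicitly so that it does not depend on the session's keep.source option:

```r
exprs <- parse(text = "a <- 1\nb <- a + 1", keep.source = TRUE)

# one srcref per top-level expression
refs <- attr(exprs, "srcref")
same_length <- length(refs) == length(exprs)

# the text each reference points at
second_text <- as.character(refs[[2]])   # "b <- a + 1"
```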
.isOpen is intended for internal use: it checks whether the connection associated with a srcfile object is open.
srcfile returns a srcfile object.
srcfilecopy returns a srcfilecopy object.
getSrcLines returns a character vector of source code lines.
srcref returns a srcref object.
Duncan Murdoch
getSrcFilename for extracting information from a source reference, or removeSource to remove it from a (non-primitive) function (aka ‘closure’).
src <- srcfile(system.file("DESCRIPTION", package = "base"))
summary(src)
getSrcLines(src, 1, 4)
ref <- srcref(src, c(1, 1, 2, 1000))
ref
print(ref, useSource = FALSE)
Errors signaled by R when stacks used in evaluation overflow.
R uses several stacks in evaluating expressions: the C stack, the pointer protection stack, and the node stack used by the byte code engine. In addition, the number of nested R expressions currently under evaluation is limited by the value set as options("expressions"). Overflowing these stacks or limits signals an error that inherits from classes stackOverflowError, error, and condition.
The specific classes signaled are:
CStackOverflowError: Signaled when the C stack overflows. The usage field of the error object contains the current stack usage.
protectStackOverflowError: Signaled when the pointer protection stack overflows.
nodeStackOverflowError: Signaled when the node stack used by the byte code engine overflows.
expressionStackOverflowError: Signaled when the evaluation depth, the number of nested R expressions currently under evaluation, exceeds the limit set by options("expressions").
Stack overflow errors can be caught and handled by exiting handlers established with tryCatch(). Calling handlers established by withCallingHandlers() may fail since there may not be enough stack space to run the handler. In this case the next available exiting handler will be run, or error handling will fall back to the default handler. Default handlers set by tryCatch("error") may also fail to run in a stack overflow situation.
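An exiting handler for the common parent class can be sketched as follows. The function name recurse is made up, and the expression limit is lowered only so that the overflow is reached quickly; which specific subclass is signaled may depend on the platform and on which limit is hit first.

```r
old <- options(expressions = 100)      # small evaluation-depth limit for the demo
recurse <- function(n) recurse(n + 1)  # unbounded recursion

caught <- tryCatch(
  recurse(1),
  stackOverflowError = function(e) class(e)  # exiting handler: returns the condition classes
)
options(old)                           # restore the previous limit
```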
Cstack_info for information on the environment and the evaluation depth limit.
Memory and options for information on the protection stack.
The function standardGeneric initiates dispatch of S4 methods: see the references and the documentation of the methods package. Usually, calls to this function are generated automatically and not explicitly by the programmer.
standardGeneric(f, fdef)
f | The name of the generic. |
fdef | The generic function definition. Never passed when defining a new generic. |
standardGeneric dispatches the method defined for a generic function named f, using the actual arguments in the frame from which it is called.
The argument fdef is inserted (automatically) when dispatching methods for a primitive function. If present, it must always be the function definition for the corresponding generic. Don't insert this argument by hand, as there is no validity checking and misspecifying the function definition will cause certain failure.
For more, use the methods package, and see the documentation in GenericFunctions.
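In ordinary use the call to standardGeneric appears only as the body of a generic created with setGeneric from the methods package. A minimal sketch; the generic name describe and its methods are hypothetical:

```r
library(methods)

# the generic's body is just the dispatching call
setGeneric("describe", function(obj) standardGeneric("describe"))

setMethod("describe", "numeric",   function(obj) "a number")
setMethod("describe", "character", function(obj) "a string")

num_result <- describe(1)     # dispatches on class "numeric"
chr_result <- describe("a")   # dispatches on class "character"
```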
John Chambers
Chambers, John M. (2008) Software for Data Analysis: Programming with R. Springer. (For the R version.)
Chambers, John M. (1998) Programming with Data. Springer. (For the original S4 version.)
Determines if entries of x start or end with string (entries of) prefix or suffix respectively, where strings are recycled to common lengths.
startsWith(x, prefix)
endsWith(x, suffix)
x |
|
prefix ,suffix |
|
startsWith() is equivalent to but much faster than
substring(x, 1, nchar(prefix)) == prefix
or also
grepl("^<prefix>", x)
where prefix is not to contain special regular expression characters (and for grepl, x does not contain missing values, see below).
The code has an optimized branch for the most common usage in which prefix or suffix is of length one, and is further optimized in a UTF-8 or 8-bit locale if that is an ASCII string.
A logical vector, of “common length” of x and prefix (or suffix), i.e., of the longer of the two lengths unless one of them is zero when the result is also of zero length. A shorter input is recycled to the output length.
grepl, substring; the partial string matching functions charmatch and pmatch solve a different task.
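The “common length” rule can be checked directly. A minimal sketch with made-up strings:

```r
# a length-one prefix is recycled across x
by_prefix <- startsWith(c("apple", "banana"), "a")          # TRUE FALSE

# a length-one x is recycled across several suffixes
by_suffix <- endsWith("report.csv", c(".csv", ".txt"))      # TRUE FALSE

# a zero-length input gives a zero-length result
none <- startsWith(character(0), "a")                       # logical(0)

# a missing value in x gives NA
with_na <- startsWith(c("all", NA_character_), "a")         # TRUE NA
```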
startsWith(search(), "package:") # typically at least two FALSE, nowadays often three

x1 <- c("Foobar", "bla bla", "something", "another", "blu", "brown",
        "blau blüht der Enzian") # non-ASCII
x2 <- cbind(
      startsWith(x1, "b"),
      startsWith(x1, "bl"),
      startsWith(x1, "bla"),
        endsWith(x1, "n"),
        endsWith(x1, "an"))
rownames(x2) <- x1; colnames(x2) <- c("b", "bl", "bla", "n", "an")
x2

## Non-equivalence in case of missing values in 'x', see Details:
x <- c("all", "but", NA_character_)
cbind(startsWith(x, "a"),
      substring(x, 1L, 1L) == "a",
      grepl("^a", x))
In R, the startup mechanism is as follows.
Unless --no-environ was given on the command line, R searches for site and user files to process for setting environment variables. The name of the site file is the one pointed to by the environment variable R_ENVIRON; if this is unset, ‘R_HOME/etc/Renviron.site’ is used (if it exists, which it does not in a ‘factory-fresh’ installation). The name of the user file can be specified by the R_ENVIRON_USER environment variable; if this is unset, the files searched for are ‘.Renviron’ in the current or in the user's home directory (in that order). See ‘Details’ for how the files are read.
Then R searches for the site-wide startup profile file of R code unless the command line option --no-site-file was given. The path of this file is taken from the value of the R_PROFILE environment variable (after tilde expansion). If this variable is unset, the default is ‘R_HOME/etc/Rprofile.site’, which is used if it exists (which it does not in a ‘factory-fresh’ installation).
This code is sourced into the workspace (global environment). Users need to be careful not to unintentionally create objects in the workspace, and it is normally advisable to use local if code needs to be executed: see the examples. .Library.site may be assigned to and the assignment will effectively modify the value of the variable in the base namespace where .libPaths() finds it. One may also assign to .First and .Last, but assigning to other variables in the execution environment is not recommended and does not work in some older versions of R.
Then, unless --no-init-file was given, R searches for a user profile, a file of R code. The path of this file can be specified by the R_PROFILE_USER environment variable (and tilde expansion will be performed). If this is unset, a file called ‘.Rprofile’ is searched for in the current directory or in the user's home directory (in that order). The user profile file is sourced into the workspace.
Note that when the site and user profile files are sourced only the base package is loaded, so objects in other packages need to be referred to by e.g. utils::dump.frames or after explicitly loading the package concerned.
R then loads a saved image of the user workspace from ‘.RData’ in the current directory if there is one (unless --no-restore-data or --no-restore was specified on the command line).
Next, if a function .First is found on the search path, it is executed as .First(). Finally, function .First.sys() in the base package is run. This calls require to attach the default packages specified by options("defaultPackages"). If the methods package is included, this will have been attached earlier (by function .OptRequireMethods()) so that namespace initializations such as those from the user workspace will proceed correctly.
A function .First (and .Last) can be defined in appropriate ‘.Rprofile’ or ‘Rprofile.site’ files or have been saved in ‘.RData’. If you want a different set of packages than the default ones when you start, insert a call to options in the ‘.Rprofile’ or ‘Rprofile.site’ file. For example, options(defaultPackages = character()) will attach no extra packages on startup (only the base package) (or set R_DEFAULT_PACKAGES=NULL as an environment variable before running R). Using options(defaultPackages = "") or R_DEFAULT_PACKAGES="" enforces the R system default.
On front-ends which support it, the commands history is read from the file specified by the environment variable R_HISTFILE (default ‘.Rhistory’ in the current directory) unless --no-restore-history or --no-restore was specified.
The command-line option --vanilla implies --no-site-file, --no-init-file, --no-environ and (except for R CMD) --no-restore.
Note that there are two sorts of files used in startup: environment files which contain lists of environment variables to be set, and profile files which contain R code.
Lines in a site or user environment file should be either comment lines starting with #, or lines of the form name=value. The latter sets the environmental variable name to value, overriding an existing value. If value contains an expression of the form ${foo-bar}, the value is that of the environmental variable foo if that is set, otherwise bar. For ${foo:-bar}, the value is that of foo if that is set to a non-empty value, otherwise bar. (If it is of the form ${foo}, the default is "".) This construction can be nested, so bar can be of the same form (as in ${foo-${bar-blah}}). Note that the braces are essential: for example $HOME will not be interpreted.
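The expansion rules can be tried from a running session by writing a small environment file and reading it with readRenviron. This is a minimal sketch; the GR_DEMO_* variable names and the temporary file are made up for the illustration.

```r
# set up one variable that exists and one that does not
Sys.unsetenv("GR_DEMO_UNSET")
Sys.setenv(GR_DEMO_SET = "fromenv")

env_file <- tempfile(fileext = ".Renviron")
writeLines(c(
  "# comment lines start with #",
  "GR_DEMO_A=${GR_DEMO_UNSET-fallback}",            # unset, so the default is used
  "GR_DEMO_B=${GR_DEMO_SET-fallback}",              # set, so its value is used
  "GR_DEMO_C=${GR_DEMO_UNSET-${GR_DEMO_SET-blah}}"  # nested default
), env_file)

readRenviron(env_file)
expanded <- Sys.getenv(c("GR_DEMO_A", "GR_DEMO_B", "GR_DEMO_C"))
unlink(env_file)
```

Here expanded should hold "fallback", "fromenv" and "fromenv" respectively, following the rules stated above.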
Leading and trailing white space in value are stripped. value is then processed in a similar way to a Unix shell: in particular (single or double) quotes not preceded by backslash are removed and backslashes are removed except inside such quotes.
For readability and future compatibility it is recommended to only use constructs that have the same behavior as in a Unix shell. Hence, expansions of variables should be in double quotes (e.g. "${HOME}", in case they may contain a backslash) and literals including a backslash should be in single quotes. If a variable value may end in a backslash, such as PATH on Windows, it may be necessary to protect the following quote from it, e.g. "${PATH}/". It is recommended to use forward slashes instead of backslashes. It is ok to mix text in single and double quotes, see examples below.
On systems with sub-architectures (mainly Windows), the files ‘Renviron.site’ and ‘Rprofile.site’ are looked for first in architecture-specific directories, e.g. ‘R_HOME/etc/i386/Renviron.site’. And e.g. ‘.Renviron.i386’ will be used in preference to ‘.Renviron’.
There is a 100,000 byte limit on the length of a line (after expansions) in environment files.
It is not intended that there be interaction with the user during startup code. Attempting to do so can crash the R process.
On Unix versions of R there is also a file ‘R_HOME/etc/Renviron’ which is read very early in the start-up processing. It contains environment variables set by R in the configure process. Values in that file can be overridden in site or user environment files: do not change ‘R_HOME/etc/Renviron’ itself. Note that this is distinct from ‘R_HOME/etc/Renviron.site’.
Command-line options may well not apply to alternative front-ends: they do not apply to R.app on macOS.
R CMD check and R CMD build do not always read the standard startup files, but they do always read specific ‘Renviron’ files. The location of these can be controlled by the environment variables R_CHECK_ENVIRON and R_BUILD_ENVIRON. If these are set their value is used as the path for the ‘Renviron’ file; otherwise, files ‘~/.R/check.Renviron’ or ‘~/.R/build.Renviron’ or sub-architecture-specific versions are employed.
If you want ‘~/.Renviron’ or ‘~/.Rprofile’ to be ignored by child R processes (such as those run by R CMD check and R CMD build), set the appropriate environment variable R_ENVIRON_USER or R_PROFILE_USER to (if possible, which it is not on Windows) "" or to the name of a non-existent file.
For the definition of the ‘home’ directory on Windows see the ‘rw-FAQ’ Q2.14. It can be found from a running R by Sys.getenv("R_USER").
.Last for final actions at the close of an R session. commandArgs for accessing the command line arguments.
There are examples of using startup files to set defaults for graphicsdevices in the help for
An Introduction to R for more command-line options: those affecting memory management are covered in the help file for Memory.
readRenviron to read ‘.Renviron’ files.
For profiling code, see Rprof.
## Not run: ## Example ~/.Renviron on UnixR_LIBS=~/R/libraryPAGER=/usr/local/bin/less## Example .Renviron on WindowsR_LIBS=C:/R/libraryMY_TCLTK="c:/Program Files/Tcl/bin"# Variable expansion in double quotes, string literals with backslashes in# single quotes.R_LIBS_USER="${APPDATA}"'\R-library'## Example of setting R_DEFAULT_PACKAGES (from R CMD check)R_DEFAULT_PACKAGES='utils,grDevices,graphics,stats'# this loads the packages in the order given, so they appear on# the search path in reverse order.## Example of .Rprofileoptions(width=65, digits=5)options(show.signif.stars=FALSE)setHook(packageEvent("grDevices", "onLoad"), function(...) grDevices::ps.options(horizontal=FALSE))set.seed(1234).First <- function() cat("\n Welcome to R!\n\n").Last <- function() cat("\n Goodbye!\n\n")## Example of Rprofile.sitelocal({ # add MASS to the default packages, set a CRAN mirror old <- getOption("defaultPackages"); r <- getOption("repos") r["CRAN"] <- "http://my.local.cran" options(defaultPackages = c(old, "MASS"), repos = r) ## (for Unix terminal users) set the width from COLUMNS if set cols <- Sys.getenv("COLUMNS") if(nzchar(cols)) options(width = as.integer(cols)) # interactive sessions get a fortune cookie (needs fortunes package) if (interactive()) fortunes::fortune()})## if .Renviron containsFOOBAR="coo\bar"doh\ex"abc\"def'"## then we get# > cat(Sys.getenv("FOOBAR"), "\n")# coo\bardoh\exabc"def'## End(Not run)
stop stops execution of the current expression and executes an error action.
geterrmessage gives the last error message.
stop(..., call. = TRUE, domain = NULL)
geterrmessage()
... | zero or more objects which can be coerced to character (and which are pasted together with no separator) or a single condition object. |
call. | logical, indicating if the call should become part of the error message. |
domain | see |
The error action is controlled by error handlers established within the executing code and by the current default error handler set by options(error=). The error is first signaled as if using signalCondition(). If there are no handlers or if all handlers return, then the error message is printed (if options("show.error.messages") is true) and the default error handler is used. The default behaviour (the NULL error-handler) in interactive use is to return to the top level prompt or the top level browser, and in non-interactive use to (effectively) call q("no", status = 1, runLast = FALSE) unless getOption("catch.script.errors") is true.
The default handler stores the error message in a buffer; it can be retrieved by geterrmessage(). It also stores a trace of the call stack that can be retrieved by traceback().
Errors will be truncated to getOption("warning.length") characters, default 1000.
If a condition object is supplied it should be the only argument, and further arguments will be ignored, with a warning.
geterrmessage gives the last error message, as a character string ending in "\n".
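The behaviour described above can be sketched as follows. This is a minimal illustration, not part of the original help page: the function name fail_quietly and the message texts are made up for the example.

```r
# A hypothetical function that signals an error without recording the call.
fail_quietly <- function() stop("input file ", "not found", call. = FALSE)

# tryCatch() returns the condition object so that it can be inspected.
cond <- tryCatch(fail_quietly(), error = function(e) e)
msg  <- conditionMessage(cond)                  # pieces of ... pasted, no separator
call_recorded <- !is.null(conditionCall(cond))  # FALSE because call. = FALSE

# try() stores the message in the buffer read by geterrmessage().
try(stop("something failed"), silent = TRUE)
last_msg <- geterrmessage()                     # character string ending in "\n"
```

Inspecting the condition object with conditionMessage() and conditionCall() is usually more robust than parsing the text returned by geterrmessage().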
Use domain = NA whenever ... contains a result from gettextf() as that is translated already.
Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988) The New S Language. Wadsworth & Brooks/Cole.
warning, try to catch errors and retry, and options for setting error handlers. stopifnot for validity testing. tryCatch and withCallingHandlers can be used to establish custom handlers while executing an expression.
gettext for the mechanisms for the automated translation of messages.
iter <- 12
try(if(iter > 10) stop("too many iterations"))

tst1 <- function(...) stop("dummy error")
try(tst1(1:10, long, calling, expression))

tst2 <- function(...) stop("dummy error", call. = FALSE)
try(tst2(1:10, longcalling, expression, but.not.seen.in.Error))
If any of the expressions (in ... or exprs) are not all TRUE, stop is called, producing an error message indicating the first expression which was not (all) true.
stopifnot(..., exprs, exprObject, local = TRUE)
..., exprs | any number of R expressions, which should each evaluate to (a logical vector of all) { expr1 expr2 ....} Note that e.g., positive numbers are not If names are provided to |
exprObject | alternative to |
local | (only when |
This function is intended for use in regression tests or also argument checking of functions, in particular to make them easier to read.
stopifnot(A, B) or equivalently stopifnot(exprs= {A ; B}) are conceptually equivalent to
 { if(any(is.na(A)) || !all(A)) stop(...);
   if(any(is.na(B)) || !all(B)) stop(...) }
Since R version 3.6.0, stopifnot() no longer handles potential errors or warnings (by tryCatch() etc) for each single expression and may use sys.call(n) to get a meaningful and short error message in case an expression did not evaluate to all TRUE. This provides considerably less overhead.
Since R version 3.5.0, expressions are evaluated sequentially, and hence evaluation stops as soon as there is a “non-TRUE”, as indicated by the above conceptual equivalence statement.
Also, since R version 3.5.0, stopifnot(exprs = { ... }) can be used alternatively and may be preferable in the case of several expressions, as they are more conveniently evaluated interactively (“no extraneous ,”).
Since R version 3.4.0, when an expression (from ...) is not true and is a call to all.equal, the error message will report the (first part of the) differences reported by all.equal(*); since R 4.3.0, this happens for all calls where "all.equal" pmatch()es the function called, e.g., when that is called all.equalShow, see the example in all.equal.
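A common use is argument checking at the top of a function. The sketch below assumes a hypothetical helper rescale01 (not part of base R) and uses named arguments to supply custom error messages; the checks are evaluated in order, so the first failing one determines the message.

```r
# Hypothetical helper: rescale a numeric vector to the range [0, 1].
rescale01 <- function(x) {
  stopifnot(
    "x must be numeric"      = is.numeric(x),
    "x must not contain NAs" = !anyNA(x),
    length(x) >= 2
  )
  (x - min(x)) / (max(x) - min(x))
}

scaled <- rescale01(c(2, 4, 6))

# The first failing check determines the message; later checks are not run.
msg <- tryCatch(rescale01("a"), error = function(e) conditionMessage(e))
```

Naming only the checks whose default message would be unclear keeps the call compact while still giving users a readable error.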
(NULL if all statements in ... are TRUE.)
Trying to use the stopifnot(exprs = ..) version via a shortcut, say,
 assertWRONG <- function(exprs) stopifnot(exprs = exprs)
is delicate and the above is not a good idea. Contrary to stopifnot() which takes care to evaluate the parts of exprs one by one and stop at the first non-TRUE, the above short cut would typically evaluate all parts of exprs and pass the result, i.e., typically of the last entry of exprs to stopifnot().
However, a more careful version,
 assert <- function(exprs) eval.parent(substitute(stopifnot(exprs = exprs)))
may be a nice short cut for stopifnot(exprs = *) calls using the more commonly known verb as function name.
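The following sketch exercises the more careful assert() wrapper quoted above. The flag variable reached is an assumption introduced only to show that evaluation stops at the first expression that is not TRUE.

```r
assert <- function(exprs) eval.parent(substitute(stopifnot(exprs = exprs)))

reached <- FALSE
msg <- tryCatch(
  assert({
    1 == 1
    2 < 1                 # first non-TRUE expression: evaluation stops here
    (reached <- TRUE)     # not evaluated, so 'reached' stays FALSE
  }),
  error = function(e) conditionMessage(e)
)
```

Because the expressions are substituted rather than evaluated in the wrapper, the error message refers to the failing expression itself.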
stop, warning; assertCondition in package tools complements stopifnot() for testing warnings and errors.
## NB: Some of these examples are expected to produce an error. To
## prevent them from terminating a run with example() they are
## piped into a call to try().

stopifnot(1 == 1, all.equal(pi, 3.14159265), 1 < 2) # all TRUE

m <- matrix(c(1,3,3,1), 2, 2)
stopifnot(m == t(m), diag(m) == rep(1, 2)) # all(.) |=> TRUE

stopifnot(length(10)) |> try() # gives an error: '1' is *not* TRUE
## even when if(1) "ok" works

stopifnot(all.equal(pi, 3.141593), 2 < 2, (1:10 < 12), "a" < "b") |> try()
## More convenient for interactive "line by line" evaluation:
stopifnot(exprs = {
  all.equal(pi, 3.1415927)
  2 < 2
  1:10 < 12
  "a" < "b"
}) |> try()

eObj <- expression(2 < 3, 3 <= 3:6, 1:10 < 2)
stopifnot(exprObject = eObj) |> try()
stopifnot(exprObject = quote(3 == 3))
stopifnot(exprObject = TRUE)

# long all.equal() error messages are abbreviated:
stopifnot(all.equal(rep(list(pi),4), list(3.1, 3.14, 3.141, 3.1415))) |> try()

# The default error message can be overridden to be more informative:
m[1,2] <- 12
stopifnot("m must be symmetric"= m == t(m)) |> try()
#=> Error: m must be symmetric

##' warnifnot(): a "only-warning" version of stopifnot()
##' {Yes, learn how to use do.call(substitute, ...) in a powerful manner !!}
warnifnot <- stopifnot ; N <- length(bdy <- body(warnifnot))
bdy <- do.call(substitute, list(bdy, list(stopifnot = quote(warnifnot))))
bdy[[N-1]] <- do.call(substitute, list(bdy[[N-1]], list(stop = quote(warning))))
body(warnifnot) <- bdy
warnifnot(1 == 1, 1 < 2, 2 < 2) # => warns " 2 < 2 is not TRUE "
warnifnot(exprs = {
    1 == 1
    3 < 3 # => warns "3 < 3 is not TRUE"
})
Functions to convert between character representations and objects of classes "POSIXlt" and "POSIXct" representing calendar dates and times.
## S3 method for class 'POSIXct'
format(x, format = "", tz = "", usetz = FALSE, ...)
## S3 method for class 'POSIXlt'
format(x, format = "", usetz = FALSE, digits = getOption("digits.secs"), ...)
## S3 method for class 'POSIXt'
as.character(x, digits = if(inherits(x, "POSIXlt")) 14L else 6L, OutDec = ".", ...)
strftime(x, format = "", tz = "", usetz = FALSE, ...)
strptime(x, format, tz = "")
x | an object to be converted: a character vector for |
tz | a character string specifying the time zone to be used for the conversion. System-specific (see |
format | a character string. The default for the |
... | further arguments to be passed from or to other methods. |
usetz | logical. Should the time zone abbreviation be appended to the output? This is used in printing times, and more reliable than using |
digits | integer determining the |
OutDec | a 1-character string specifying the decimal point to be used; the default is not |
The format and as.character methods and strftime convert objects from the classes "POSIXlt" and "POSIXct" to character vectors.
strptime converts character vectors to class "POSIXlt": its input x is first converted by as.character. Each input string is processed as far as necessary for the format specified: any trailing characters are ignored.
strftime is a wrapper for format.POSIXlt, and it and format.POSIXct first convert to class "POSIXlt" by calling as.POSIXlt (so they also work for class "Date"). Note that only that conversion depends on the time zone. Since R version 4.2.0, as.POSIXlt() conversion now treats the non-finite numeric -Inf, Inf, NA and NaN differently (where previously all were treated as NA). Also the format() method for POSIXlt now treats these different non-finite times and dates analogously to type double.
The usual vector re-cycling rules are applied to x and format so the answer will be of length of the longer of these vectors.
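As a small sketch of the recycling rule, a single date-time can be formatted with several format strings at once. The date used here is arbitrary, and only numeric conversion specifications are used so that the result does not depend on the locale.

```r
x <- as.POSIXct("2024-03-15 14:05:09", tz = "UTC")

# One time, three formats: x is recycled to the length of 'format'.
parts <- format(x, c("%Y", "%m", "%d"))
```

The result has one element per format string.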
Locale-specific conversions to and from character strings are used where appropriate and available. This affects the names of the days and months, the AM/PM indicator (if used) and the separators in output formats such as %x and %X, via the setting of the LC_TIME locale category. The ‘current locale’ of the descriptions might mean the locale in use at the start of the R session or when these functions are first used. (For input, the locale-specific conversions can be changed by calling Sys.setlocale with category LC_TIME (or LC_ALL). For output, what happens depends on the OS but usually works.)
The details of the formats are platform-specific, but the following are likely to be widely available: most are defined by the POSIX standard. A conversion specification is introduced by %, usually followed by a single letter or O or E and then a single letter. Any character in the format string not part of a conversion specification is interpreted literally (and %% gives %). Widely implemented conversion specifications include
%a
Abbreviated weekday name in the current locale on this platform. (Also matches full name on input: in some locales there are no abbreviations of names.)
%A
Full weekday name in the current locale. (Also matches abbreviated name on input.)
%b
Abbreviated month name in the current locale on this platform. (Also matches full name on input: in some locales there are no abbreviations of names.)
%B
Full month name in the current locale. (Also matches abbreviated name on input.)
%c
Date and time. Locale-specific on output, "%a %b %e %H:%M:%S %Y" on input.
%C
Century (00–99): the integer part of the year divided by 100.
%d
Day of the month as decimal number (01–31).
%D
Date format such as %m/%d/%y: the C99 standard says it should be that exact format (but not all OSes comply).
%e
Day of the month as decimal number (1–31), with a leading space for a single-digit number.
%F
Equivalent to %Y-%m-%d (the ISO 8601 date format).
%g
The last two digits of the week-based year (see %V). (Accepted but ignored on input.)
%G
The week-based year (see %V) as a decimal number. (Accepted but ignored on input.)
%h
Equivalent to %b.
%H
Hours as decimal number (00–23). As a special exception strings such as ‘24:00:00’ are accepted for input, since ISO 8601 allows these.
%I
Hours as decimal number (01–12).
%j
Day of year as decimal number (001–366): for input, 366 is only valid in a leap year.
%m
Month as decimal number (01–12).
%M
Minute as decimal number (00–59).
%n
Newline on output, arbitrary whitespace on input.
%p
AM/PM indicator in the locale. Used in conjunction with %I and not with %H. An empty string in some locales (for example on some OSes, non-English European locales including Russia). The behaviour is undefined if used for input in such a locale.
Some platforms accept %P for output, which uses a lower-case version (%p may also use lower case): others will output P.
%r
For output, the 12-hour clock time (using the locale's AM or PM): only defined in some locales, and on some OSes misleading in locales which do not define an AM/PM indicator. For input, equivalent to %I:%M:%S %p.
%R
Equivalent to %H:%M.
%S
Second as integer (00–61), allowing for up to two leap-seconds (but POSIX-compliant implementations will ignore leap seconds).
%t
Tab on output, arbitrary whitespace on input.
%T
Equivalent to %H:%M:%S.
%u
Weekday as a decimal number (1–7, Monday is 1).
%U
Week of the year as decimal number (00–53) using Sunday as the first day of the week (and typically with the first Sunday of the year as day 1 of week 1). The US convention.
%V
Week of the year as decimal number (01–53) as defined in ISO 8601. If the week (starting on Monday) containing 1 January has four or more days in the new year, then it is considered week 1. Otherwise, it is the last week of the previous year, and the next week is week 1. See %G (%g) for the year corresponding to the week given by %V. (Accepted but ignored on input.)
%w
Weekday as decimal number (0–6, Sunday is 0).
%W
Week of the year as decimal number (00–53) using Monday as the first day of the week (and typically with the first Monday of the year as day 1 of week 1). The UK convention.
%x
Date. Locale-specific on output, "%y/%m/%d" on input.
%X
Time. Locale-specific on output, "%H:%M:%S" on input.
%y
Year without century (00–99). On input, values 00 to 68 are prefixed by 20 and 69 to 99 by 19 – that is the behaviour specified by the 2018 POSIX standard, but it does also say ‘it is expected that in a future version the default century inferred from a 2-digit year will change’.
%Y
Year with century. Note that whereas there was no zero in the original Gregorian calendar, ISO 8601:2004 defines it to be valid (interpreted as 1BC): see https://en.wikipedia.org/wiki/0_(year). However, the standards also say that years before 1582 in its calendar should only be used with agreement of the parties involved.
For input, only years 0:9999 are accepted.
%z
Signed offset in hours and minutes from UTC, so -0800 is 8 hours behind UTC. (Standard only for output. For input R currently supports it on all platforms – values from -1400 to +1400 are accepted.)
%Z
(Output only.) Time zone abbreviation as a character string (empty if not available). This may not be reliable when a time zone has changed abbreviations over the years.
Where leading zeros are shown they will be used on output but are optional on input. Names are matched case-insensitively on input: whether they are capitalized on output depends on the platform and the locale. Note that abbreviated names are platform-specific (although the standards specify that in the ‘C’ locale they must be the first three letters of the capitalized English name: this convention is widely used in English-language locales but for example the French month abbreviations are not the same on any two of Linux, macOS, Solaris and Windows). Knowing what the abbreviations are is essential if you wish to use %a, %b or %h as part of an input format: see the examples for how to check.
When %z or %Z is used for output with an object with an assigned time zone an attempt is made to use the values for that time zone, but it is not guaranteed to succeed.
The definition of ‘whitespace’ for %n and %t is platform-dependent: for most it does not include non-breaking spaces.
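A short sketch of some of the numeric conversion specifications listed above. The date is arbitrary (15 March 2024 was a Friday in a leap year), and only locale-independent specifications are used so that the results should not vary with the LC_TIME setting.

```r
x <- strptime("2024-03-15 14:05:09", "%Y-%m-%d %H:%M:%S", tz = "UTC")

iso_date    <- format(x, "%F")        # same as %Y-%m-%d
day_of_year <- format(x, "%j")        # 31 + 29 + 15 = day 75, zero-padded
weekday_num <- format(x, "%u")        # Monday is 1, so Friday is 5
clock_12h   <- format(x, "%I:%M")     # 12-hour clock, without AM/PM
us_style    <- format(x, "%m/%d/%y")  # the format %D is meant to produce
```

Specifications that produce names (%a, %A, %b, %B, %p) are deliberately left out here because their output depends on the locale and platform.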
Not in the standards and less widely implemented are
%k
The 24-hour clock time with single digits preceded by a blank.
%l
The 12-hour clock time with single digits preceded by a blank.
%s
(Output only.) The number of seconds since the epoch.
%+
(Output only.) Similar to %c, often "%a %b %e %H:%M:%S %Z %Y". May depend on the locale.
For output there are also %O[dHImMUVwWy] which may emit numbers in an alternative locale-dependent format (e.g., roman numerals), and %E[cCyYxX] which can use an alternative ‘era’ (e.g., a different religious calendar). Which of these are supported is OS-dependent. These are accepted for input, but with the standard interpretation.
Specific to R is %OSn, which for output gives the seconds truncated to 0 <= n <= 6 decimal places (and if %OS is not followed by a digit, it uses the setting of getOption("digits.secs"), or if that is unset, n = 0). Further, for strptime %OS will input seconds including fractional seconds. Note that %S does not read fractional parts on output.
The behaviour of other conversion specifications (and even if other character sequences commencing with % are conversion specifications) is system-specific. Some systems document that the use of multi-byte characters in format is unsupported: UTF-8 locales are unlikely to cause a problem.
The format methods and strftime return character vectors representing the time. NA times are returned as NA_character_.
strptime turns character representations into an object of class "POSIXlt". The time zone is used to set the isdst component and to set the "tzone" attribute if tz != "". If the specified time is invalid (for example ‘"2010-02-30 08:00"’) all the components of the result are NA. (NB: this does mean exactly what it says – if it is an invalid time, not just a time that does not exist in some time zone.)
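The handling of invalid and over-long input can be sketched as follows. The input strings are illustrative only; the first reuses the invalid date mentioned above.

```r
# 30 February does not exist, so the result is NA.
bad <- strptime("2010-02-30 08:00", "%Y-%m-%d %H:%M", tz = "UTC")
is_bad_na <- is.na(bad)

# Characters after the part matched by the format are ignored.
good <- strptime("2010-02-28 08:00 trailing text", "%Y-%m-%d %H:%M", tz = "UTC")
good_txt <- format(good, "%Y-%m-%d %H:%M")

# NA times are formatted as NA_character_.
na_txt <- format(as.POSIXct(NA))
```

Checking is.na() on the result of strptime is a simple way to detect input that did not match the format or did not describe a valid time.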
Everyone agrees that years from 1000 to 9999 should be printed with 4 digits, but the standards do not define what is to be done outside that range. For years 0 to 999 most OSes pad with zeros or spaces to 4 characters, but Linux/glibc outputs just the number.
OS facilities will probably not print years before 1 CE (aka 1 AD) ‘correctly’ (they tend to assume the existence of a year 0: see https://en.wikipedia.org/wiki/0_(year), and some OSes get them completely wrong). Common formats are -45 and -045.
Years after 9999 and before -999 are normally printed with five or more characters.
Some platforms support modifiers from POSIX 2008 (and others). On Linux/glibc the format "%04Y" assures a minimum of four characters and zero-padding (the default is no padding). The internal code (as used on Windows and by default on macOS) uses zero-padding by default (this can be controlled by environment variable R_PAD_YEARS_BY_ZERO). On those platforms, formats %04Y, %_4Y and %_Y can be used for zero, space and no padding respectively. (On macOS, the native code (not the default) supports none of these and uses zero-padding to 4 digits.)
Offsets from GMT (also known as UTC) are part of the conversion between timezones and to/from class "POSIXct", but cause difficulties as they are often computed incorrectly.
They conventionally have the opposite sign from time-zone specifications (see Sys.timezone): positive values are East of the meridian. Although there have been time zones with offsets like +00:09:21 (Paris in 1900), and -00:44:30 (Liberia until 1972), offsets are usually treated as whole numbers of minutes, and are most often seen in RFC 5322 email headers in forms like -0800 (e.g., used on the Pacific coast of the USA in winter).
Format %z can be used for input or output: it is a character string, conventionally plus or minus followed by two digits for hours and two for minutes: the standards say that an empty string should be output if the offset is undetermined, but some systems use +0000 or the offsets for the time zone in use for the current year. (On some platforms this works better after conversion to "POSIXct". Some platforms only recognize hour or half-hour offsets for output.)
Using %z for input makes most sense with tz = "UTC".
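As a sketch of %z on input with tz = "UTC", the time below is given four hours behind UTC and is converted to UTC when parsed. The timestamp is the one from the RFC 5322 example in this help page, rewritten in a purely numeric form to avoid locale-dependent day and month names.

```r
z <- strptime("2010-03-23 14:36:38 -0400", "%Y-%m-%d %H:%M:%S %z", tz = "UTC")

# 14:36:38 at offset -0400 corresponds to 18:36:38 UTC.
utc_txt <- format(z, "%Y-%m-%d %H:%M:%S")
```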
Input uses the POSIX function strptime and output the C99 function strftime.
However, not all OSes (notably Windows) provided strptime and many issues were found for those which did, so since 2000 R has used a fork of code from ‘glibc’. The forked code uses the system's strftime to find the locale-specific day and month names and any AM/PM indicator.
On some platforms (including Windows and by default on macOS) the system's strftime is replaced (along with most of the rest of the C-level datetime code) by code modified from IANA's ‘tzcode’ distribution (https://www.iana.org/time-zones).
Note that as strftime is used for output (and not wcsftime), argument format is translated if necessary to the session encoding.
The default formats follow the rules of the ISO 8601 international standard which expresses a day as "2001-02-28" and a time as "14:01:02" using leading zeroes as here. (The ISO form uses no space, possibly ‘T’, to separate dates and times: R uses a space by default.)
For strptime the input string need not specify the date completely: it is assumed that unspecified seconds, minutes or hours are zero, and an unspecified year, month or day is the current one. (However, if a month is specified, the day of that month has to be specified by %d or %e since the current day of the month need not be valid for the specified month.) Some components may be returned as NA (but an unknown tzone component is represented by an empty string).
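A small sketch of incompletely specified input. The strings are illustrative; the date filled in for a time-only input is whatever the current date is, so it is not shown as a fixed value.

```r
# Date only: hours, minutes and seconds default to zero.
d <- strptime("2024-03-15", "%Y-%m-%d", tz = "UTC")
time_parts <- c(d$hour, d$min, d$sec)

# Time only: the year, month and day are taken from the current date.
t_only <- strptime("09:30", "%H:%M", tz = "UTC")
has_date <- !is.na(t_only$year)
```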
If the time zone specified is invalid on your system, what happens is system-specific but it will probably be ignored.
Remember that in most time zones some times do not occur and some occur twice because of transitions to/from ‘daylight saving’ (also known as ‘summer’) time. strptime does not validate such times (it does not assume a specific time zone), but conversion by as.POSIXct will do so. Conversion by strftime and formatting/printing uses OS facilities and may return nonsensical results for non-existent times at DST transitions.
In a C locale %c is required to be "%a %b %e %H:%M:%S %Y". As Windows does not comply (and uses a date format not understood outside N. America), that format is used by R on Windows in all locales.
There is a limit of 2048 bytes on each string produced by strftime and the format methods. As from R 4.3.0 attempting to exceed this is an error (previous versions silently truncated at 255 bytes).
International Organization for Standardization (2004, 2000, ...) ‘ISO 8601. Data elements and interchange formats – Information interchange – Representation of dates and times.’, slightly updated to International Organization for Standardization (2019) ‘ISO 8601-1:2019. Date and time – Representations for information interchange – Part 1: Basic rules’, and further amended in 2022. For links to versions available on-line see (at the time of writing) https://dotat.at/tmp/ISO_8601-2004_E.pdf and https://www.qsl.net/g1smd/isopdf.htm; for information on the current official version, see https://www.iso.org/iso/iso8601 and https://en.wikipedia.org/wiki/ISO_8601.
The POSIX 1003.1 standard, which is in some respects stricter than ISO 8601.
DateTimeClasses for details of the date-time classes; locales to query or set a locale.
Your system's help page on strftime to see how to specify their formats. (On some systems, including Windows, strftime is replaced by more comprehensive internal code.)
## locale-specific version of date()
format(Sys.time(), "%a %b %d %X %Y %Z")

## time to sub-second accuracy (if supported by the OS)
format(Sys.time(), "%H:%M:%OS3")

## read in date info in format 'ddmmmyyyy'
## This will give NA(s) in some non-English locales; setting the C locale
## as in the commented lines will overcome this on most systems.
## lct <- Sys.getlocale("LC_TIME"); Sys.setlocale("LC_TIME", "C")
x <- c("1jan1960", "2jan1960", "31mar1960", "30jul1960")
z <- strptime(x, "%d%b%Y")
## Sys.setlocale("LC_TIME", lct)
z
(chz <- as.character(z)) # same w/o TZ
## *here* (but not in general), the same as format():
stopifnot(exprs = {
    identical(chz, format(z))
    grepl("^1960-0[137]-[03][012]$", chz[!is.na(z)])
})

## read in date/time info in format 'm/d/y h:m:s'
dates <- c("02/27/92", "02/27/92", "01/14/92", "02/28/92", "02/01/92")
times <- c("23:03:20", "22:29:56", "01:03:30", "18:21:03", "16:56:26")
x <- paste(dates, times)
z2 <- strptime(x, "%m/%d/%y %H:%M:%S")
z2
## *here* (but not in general), the same as format():
stopifnot(identical(format(z2), as.character(z2)))

## time with fractional seconds
z3 <- strptime("20/2/06 11:16:16.683", "%d/%m/%y %H:%M:%OS")
z3 # prints without fractional seconds by default, digits.sec = NULL ("= 0")
op <- options(digits.secs = 3)
z3 # shows the 3 extra digits
as.character(z3) # ditto
options(op)

## time zone names are not portable, but 'EST5EDT' comes pretty close.
## (but its interpretation may not be universal: see ?timezones)
z4 <- strptime(c("2006-01-08 10:07:52", "2006-08-07 19:33:02"),
               "%Y-%m-%d %H:%M:%S", tz = "EST5EDT")
z4
attr(z4, "tzone")
as.character(z4)

z4$sec[2] <- pi # "very" fractional seconds
as.character(z4) # shows full precision
format(z4) # no fractional sec
format(z4, digits=8) # shows only 6 (hard-wired maximum)
format(z4, digits=4)

## An RFC 5322 header (Eastern Canada, during DST)
## In a non-English locale the commented lines may be needed.
## prev <- Sys.getlocale("LC_TIME"); Sys.setlocale("LC_TIME", "C")
strptime("Tue, 23 Mar 2010 14:36:38 -0400", "%a, %d %b %Y %H:%M:%S %z")
## Sys.setlocale("LC_TIME", prev)

## Make sure you know what the abbreviated names are for you if you wish
## to use them for input (they are matched case-insensitively):
format(s1 <- seq.Date(as.Date('1978-01-01'), by = 'day', len = 7), "%a")
format(s2 <- seq.Date(as.Date('2000-01-01'), by = 'month', len = 12), "%b")

## Non-finite date-times :
format(as.POSIXct(Inf)) # "Inf" (was NA in R <= 4.1.x)
format(as.POSIXlt(c(-Inf,Inf,NaN,NA))) # were all NA
Repeat the character strings in a character vector a given number of times (i.e., concatenate the respective numbers of copies of the strings).
strrep(x, times)
x | a character vector, or an object which can be coerced to a character vector using |
times | an integer vector giving the (non-negative) numbers of times to repeat the respective elements of |
The elements of x and times will be recycled as necessary (if one has no elements, an empty character vector is returned). Missing elements in x or times result in missing elements of the return value.
A character vector with the elements of the given character vector repeated the given numbers of times.
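The recycling and missing-value rules can be sketched with a few small calls; the input values are arbitrary.

```r
rule    <- strrep("-", 10)              # a ten-character separator line
stairs  <- strrep("ab", 0:2)            # x is recycled along times
paired  <- strrep(c("x", "y"), c(3, 1)) # element-wise repetition
with_na <- strrep(c("a", NA), 2)        # NA in x gives NA in the result
empty   <- strrep(character(0), 2)      # zero-length input, zero-length output
```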
strrep("ABC", 2)
strrep(c("A", "B", "C"), 1 : 3)
## Create vectors with the given numbers of spaces:
strrep(" ", 1 : 5)
Split the elements of a character vector x into substrings according to the matches to substring split within them.
strsplit(x, split, fixed = FALSE, perl = FALSE, useBytes = FALSE)
x | character vector, each element of which is to be split. Other inputs, including a factor, will give an error. |
split | character vector (or object which can be coerced to such) containing regular expression(s) (unless |
fixed | logical. If |
perl | logical. Should Perl-compatible regexps be used? |
useBytes | logical. If |
Argument split will be coerced to character, so you will see uses with split = NULL to mean split = character(0), including in the examples below.
Note that splitting into single characters can be done via split = character(0) or split = ""; the two are equivalent. The definition of ‘character’ here depends on the locale: in a single-byte locale it is a byte, and in a multi-byte locale it is the unit represented by a ‘wide character’ (almost always a Unicode code point).
A missing value of split does not split the corresponding element(s) of x at all.
The algorithm applied to each input string is
    repeat {
        if the string is empty
            break.
        if there is a match
            add the string to the left of the match to the output.
            remove the match and all to the left of it.
        else
            add the string to the output.
            break.
    }
Note that this means that if there is a match at the beginning of a (non-empty) string, the first element of the output is `""`, but if there is a match at the end of the string, the output is the same as with the match removed.
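A minimal sketch of these two consequences of the algorithm, using `fixed = TRUE` and a made-up separator so that no regular-expression rules come into play:

```r
# A match at the start of a non-empty string yields a leading "".
lead <- strsplit("#a#b", "#", fixed = TRUE)[[1]]
print(lead)    # "" "a" "b"

# A match at the end yields nothing extra: the same result as for "a#b".
trail <- strsplit("a#b#", "#", fixed = TRUE)[[1]]
print(trail)   # "a" "b"

# Adjacent matches leave an empty string between them.
middle <- strsplit("a##b", "#", fixed = TRUE)[[1]]
print(middle)  # "a" "" "b"
```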
Note also that if there is an empty match at the beginning of a non-empty string, the first character is returned and the algorithm continues with the rest of the string. This needs to be kept in mind when designing the regular expressions. For example, when looking for a word boundary followed by a letter (`"[[:<:]]"` with `perl = TRUE`), one can disallow a match at the beginning of a string (via `"(?!^)[[:<:]]"`).
Invalid inputs in the current locale are warned about up to 5 times.
A list of the same length as `x`, the `i`-th element of which contains the vector of splits of `x[i]`.
If any element of `x` or `split` is declared to be in UTF-8 (see `Encoding`), all non-ASCII character strings in the result will be in UTF-8 and have their encoding declared as UTF-8. (This also holds if any element is declared to be Latin-1 except in a Latin-1 locale.) For `perl = TRUE, useBytes = FALSE` all non-ASCII strings in a multibyte locale are translated to UTF-8.
If any element of `x` or `split` is marked as `"bytes"` (see `Encoding`), all non-ASCII character strings created by the splitting in the result will be marked as `"bytes"`, but the encoding of the resulting character strings not split is unspecified (may be `"bytes"` or the original). If no element of `x` or `split` is marked as `"bytes"`, but `useBytes = TRUE`, even the encoding of the resulting character strings created by splitting is unspecified (may be `"bytes"` or `"unknown"`, possibly invalid in the current encoding). Mixed use of `"bytes"` and other marked encodings is discouraged, but if still desired one may use `iconv` to re-encode the result e.g. to UTF-8 with suitably substituted invalid bytes.
`paste` for the reverse, `grep` and `sub` for string search and manipulation; also `nchar`, `substr`.
‘regular expression’ for the details of the pattern specification.
Option `PCRE_use_JIT` controls the details when `perl = TRUE`.
noquote(strsplit("A text I want to display with spaces", NULL)[[1]])

x <- c(as = "asfef", qu = "qwerty", "yuiop[", "b", "stuff.blah.yech")
# split x on the letter e
strsplit(x, "e")

unlist(strsplit("a.b.c", "."))
## [1] "" "" "" "" ""
## Note that 'split' is a regexp!
## If you really want to split on '.', use
unlist(strsplit("a.b.c", "[.]"))
## [1] "a" "b" "c"
## or
unlist(strsplit("a.b.c", ".", fixed = TRUE))

## a useful function: rev() for strings
strReverse <- function(x)
    sapply(lapply(strsplit(x, NULL), rev), paste, collapse = "")
strReverse(c("abc", "Statistics"))

## get the first names of the members of R-core
a <- readLines(file.path(R.home("doc"), "AUTHORS"))[-(1:8)]
a <- a[(0:2) - length(a)]
(a <- sub(" .*", "", a))
# and reverse them
strReverse(a)

## Note that final empty strings are not produced:
strsplit(paste(c("", "a", ""), collapse = "#"), split = "#")[[1]]
# [1] "" "a"
## and also an empty string is only produced before a definite match:
strsplit("", " ")[[1]]    # character(0)
strsplit(" ", " ")[[1]]   # [1] ""
Convert strings to integers according to the given base using the C function `strtol`, or choose a suitable base following the C rules.
strtoi(x, base = 0L)
x | a character vector, or something coercible to this by |
base | an integer which is between 2 and 36 inclusive, or zero (default). |
Conversion is based on the C library function `strtol`.
For the default `base = 0L`, the base is chosen from the string representation of that element of `x`, so different elements can have different bases (see the first example). The standard C rules for choosing the base are that octal constants (prefix `0` not followed by `x` or `X`) and hexadecimal constants (prefix `0x` or `0X`) are interpreted as base `8` and `16`; all other strings are interpreted as base `10`.
For a base greater than `10`, letters `a` to `z` (or `A` to `Z`) are used to represent `10` to `35`.
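The base-selection rules above can be illustrated with a few made-up inputs. This is only a sketch; the variable names are not part of the original documentation.

```r
# base = 0L (the default): the base is chosen separately for each element.
auto <- strtoi(c("0xff", "077", "123"))
print(auto)      # 255 63 123

# The same digits read with an explicit base.
fixed8  <- strtoi("077", 8L)    # 63
fixed10 <- strtoi("077", 10L)   # 77

# For bases above 10, letters stand for the digits 10 to 35.
big <- strtoi("zz", 36L)        # 35 * 36 + 35 = 1295

# A string that is not a valid number in the given base gives NA.
bad <- strtoi("9", 8L)
print(c(fixed8, fixed10, big, bad))
```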
An integer vector of the same length as `x`. Values which cannot be interpreted as integers or would overflow are returned as `NA_integer_`.
For decimal strings `as.integer` is equally useful.
strtoi(c("0xff", "077", "123"))
strtoi(c("ffff", "FFFF"), 16L)
strtoi(c("177", "377"), 8L)
Trim character strings to specified display widths.
strtrim(x, width)
x | a character vector, or an object which can be coerced to a character vector by |
width | positive integer values: recycled to the length of |
‘Width’ is interpreted as the display width in a monospaced font. What happens with non-printable characters (such as backspace, tab) is implementation-dependent and may depend on the locale (e.g., they may be included in the count or they may be omitted).
Using this function rather than `substr` is important when there might be double-width (e.g., Chinese/Japanese/Korean) characters in the character vector.
A character vector of the same length and with the same attributes as `x` (after possible coercion).
Elements of the result will have the encoding declared as that of the current locale (see `Encoding`) if the corresponding input had a declared encoding and the current locale is either Latin-1 or UTF-8.
strtrim(c("abcdef", "abcdef", "abcdef"), c(1,5,10))
`structure` returns the given object with further attributes set.
structure(.Data, ...)
.Data | an object which will have various attributes attached to it. |
... | attributes, specified in |
Adding a class `"factor"` will ensure that numeric codes are given integer storage mode.
For historical reasons (these names are used when deparsing), attributes `".Dim"`, `".Dimnames"`, `".Names"`, `".Tsp"` and `".Label"` are renamed to `"dim"`, `"dimnames"`, `"names"`, `"tsp"` and `"levels"`.
It is possible to give the same tag more than once, in which case the last value assigned wins. As with other ways of assigning attributes, using `tag = NULL` removes attribute `tag` from `.Data` if it is present.
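The three behaviours described above can be sketched as follows. The attribute name `note` is made up for illustration and has no special meaning to R.

```r
# The historical tag '.Names' is stored under the name 'names'.
v <- structure(1:3, .Names = c("a", "b", "c"))
print(names(v))

# Double codes get integer storage when the class "factor" is added.
f <- structure(c(1, 2, 1), levels = c("lo", "hi"), class = "factor")
print(typeof(f))

# tag = NULL removes an attribute that is already present.
w <- structure(structure(1:3, note = "temporary"), note = NULL)
print(attributes(w))
```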
Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988) The New S Language. Wadsworth & Brooks/Cole.
structure(1:6, dim = 2:3)
Each character string in the input is first split into paragraphs (or lines containing whitespace only). The paragraphs are then formatted by breaking lines at word boundaries. The target columns for wrapping lines and the indentation of the first and all subsequent lines of a paragraph can be controlled independently.
strwrap(x, width = 0.9 * getOption("width"), indent = 0, exdent = 0, prefix = "", simplify = TRUE, initial = prefix)
x | a character vector, or an object which can be converted to a character vector by |
width | a positive integer giving the target column for wrapping lines in the output. |
indent | a non-negative integer giving the indentation of the first line in a paragraph. |
exdent | a non-negative integer specifying the indentation of subsequent lines in paragraphs. |
prefix, initial | a character string to be used as prefix for each line except the first, for which |
simplify | a logical. If |
Whitespace (space, tab or newline characters) in the input is destroyed. Double spaces after periods, question and exclamation marks (thought of as representing sentence ends) are preserved. Currently, possible sentence ends at line breaks are not considered specially.
Indentation is relative to the number of characters in the prefix string.
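A minimal sketch of how `width`, `indent`, `exdent` and `prefix` interact, using a made-up sentence. The exact line breaks depend on the text, so only the structure of the result is of interest here.

```r
txt <- "The quick brown fox jumps over the lazy dog. It then takes a nap."

# Plain wrapping at a target column.
plain <- strwrap(txt, width = 30)

# First line indented by 2, subsequent lines by 4 (a hanging indent).
hanging <- strwrap(txt, width = 30, indent = 2, exdent = 4)

# Every line carries the prefix, as in quoted e-mail text.
quoted <- strwrap(txt, width = 30, prefix = "> ")

writeLines(plain)
writeLines(hanging)
writeLines(quoted)
```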
A character vector (if `simplify` is `TRUE`), or a list of such character vectors, with declared input encodings preserved.
## Read in file 'THANKS'.
x <- paste(readLines(file.path(R.home("doc"), "THANKS")), collapse = "\n")
## Split into paragraphs and remove the first three ones
x <- unlist(strsplit(x, "\n[ \t\n]*\n"))[-(1:3)]
## Join the rest
x <- paste(x, collapse = "\n\n")
## Now for some fun:
writeLines(strwrap(x, width = 60))
writeLines(strwrap(x, width = 60, indent = 5))
writeLines(strwrap(x, width = 60, exdent = 5))
writeLines(strwrap(x, prefix = "THANKS> "))

## Note that messages are wrapped AT the target column indicated by
## 'width' (and not beyond it).
## From an R-devel posting by J. Hosking <[email protected]>.
x <- paste(sapply(sample(10, 100, replace = TRUE),
                  function(x) substring("aaaaaaaaaa", 1, x)),
           collapse = " ")
sapply(10:40,
       function(m) c(target = m, actual = max(nchar(strwrap(x, m)))))
Return subsets of vectors, matrices or data frames which meet conditions.
subset(x, ...)

## Default S3 method:
subset(x, subset, ...)

## S3 method for class 'matrix'
subset(x, subset, select, drop = FALSE, ...)

## S3 method for class 'data.frame'
subset(x, subset, select, drop = FALSE, ...)
x | object to be subsetted. |
subset | logical expression indicating elements or rows to keep: missing values are taken as false. |
select | expression, indicating columns to select from a data frame. |
drop | passed on to |
... | further arguments to be passed to or from other methods. |
This is a generic function, with methods supplied for matrices, data frames and vectors (including lists). Packages and users can add further methods.
For ordinary vectors, the result is simply `x[subset & !is.na(subset)]`.
For data frames, the `subset` argument works on the rows. Note that `subset` will be evaluated in the data frame, so columns can be referred to (by name) as variables in the expression (see the examples).
The `select` argument exists only for the methods for data frames and matrices. It works by first replacing column names in the selection expression with the corresponding column numbers in the data frame and then using the resulting integer vector to index the columns. This allows the use of the standard indexing conventions so that for example ranges of columns can be specified easily, or single columns can be dropped (see the examples).
The `drop` argument is passed on to the indexing method for matrices and data frames: note that the default for matrices is different from that for indexing.
Factors may have empty levels after subsetting; unused levels are not automatically removed. See `droplevels` for a way to drop all unused levels from a data frame.
An object similar to `x` containing just the selected elements (for a vector), rows and columns (for a matrix or data frame), and so on.
This is a convenience function intended for use interactively. For programming it is better to use the standard subsetting functions like `[`, and in particular the non-standard evaluation of argument `subset` can have unanticipated consequences.
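As a sketch of the advice above, the same selection can be written with `[` and an explicit logical index, which avoids non-standard evaluation. The data frame is made up for illustration.

```r
df <- data.frame(g = c("x", "y", "x"), v = c(1, NA, 3))

# Interactive style: 'v' is looked up inside the data frame.
interactive_way <- subset(df, v > 1, select = v)

# Programming style: build the index explicitly, treating NA as FALSE.
keep <- !is.na(df$v) & df$v > 1
programmatic_way <- df[keep, "v", drop = FALSE]

print(interactive_way)
print(programmatic_way)
```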
Peter Dalgaard and Brian Ripley
subset(airquality, Temp > 80, select = c(Ozone, Temp))
subset(airquality, Day == 1, select = -Temp)
subset(airquality, select = Ozone:Wind)

with(airquality, subset(Ozone, Temp > 80))

## sometimes requiring a logical 'subset' argument is a nuisance
nm <- rownames(state.x77)
start_with_M <- nm %in% grep("^M", nm, value = TRUE)
subset(state.x77, start_with_M, Illiteracy:Murder)
# but in recent versions of R this can simply be
subset(state.x77, grepl("^M", nm), Illiteracy:Murder)
`substitute` returns the parse tree for the (unevaluated) expression `expr`, substituting any variables bound in `env`.
`quote` simply returns its argument. The argument is not evaluated and can be any R expression.
`enquote` is a simple one-line utility which transforms a call of the form `Foo(....)` into the call `quote(Foo(....))`. This is typically used to protect a `call` from early evaluation.
substitute(expr, env)
quote(expr)
enquote(cl)
expr | any syntactically valid R expression. |
cl | |
env | an environment or a list object. Defaults to the current evaluation environment. |
The typical use of `substitute` is to create informative labels for data sets and plots. The `myplot` example below shows a simple use of this facility. It uses the functions `deparse` and `substitute` to create labels for a plot which are character string versions of the actual arguments to the function `myplot`.
Substitution takes place by examining each component of the parse tree as follows: If it is not a bound symbol in `env`, it is unchanged. If it is a promise object, i.e., a formal argument to a function or explicitly created using `delayedAssign()`, the expression slot of the promise replaces the symbol. If it is an ordinary variable, its value is substituted, unless `env` is `.GlobalEnv` in which case the symbol is left unchanged.
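The substitution rules can be sketched with three small cases. The helper functions `arg_expr` and `local_val` are hypothetical names introduced only for this illustration.

```r
# Promise (a formal argument): the argument's expression replaces the symbol.
arg_expr <- function(x) substitute(x)
e1 <- arg_expr(a + b)
print(e1)   # a + b

# Binding supplied in a list: the value replaces the symbol;
# 'b' is not bound in the list and is left unchanged.
e2 <- substitute(a + b, list(a = 1))
print(e2)   # 1 + b

# Ordinary variable in a function's environment: its value is substituted.
local_val <- function() { n <- 2; substitute(n * k) }
e3 <- local_val()
print(e3)   # 2 * k
```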
Both `quote` and `substitute` are ‘special’ primitive functions which do not evaluate their arguments.
The `mode` of the result is generally `"call"` but may in principle be any type. In particular, single-variable expressions have mode `"name"` and constants have the appropriate base mode.
`substitute` works on a purely lexical basis. There is no guarantee that the resulting expression makes any sense.
Substituting and quoting often cause confusion when the argument is `expression(...)`. The result is a call to the `expression` constructor function and needs to be evaluated with `eval` to give the actual expression object.
Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988) The New S Language. Wadsworth & Brooks/Cole.
`missing` for argument ‘missingness’, `bquote` for partial substitution, `sQuote` and `dQuote` for adding quotation marks to strings. `Quotes` about forward, back, and double quotes ‘'’, ‘`’, and ‘"’.
`all.names` to retrieve the symbol names from an expression or call.
require(graphics)
(s.e <- substitute(expression(a + b), list(a = 1)))  #> expression(1 + b)
(s.s <- substitute( a + b,            list(a = 1)))  #> 1 + b
c(mode(s.e), typeof(s.e)) #  "call", "language"
c(mode(s.s), typeof(s.s)) #   (the same)
# but:
(e.s.e <- eval(s.e))          #>  expression(1 + b)
c(mode(e.s.e), typeof(e.s.e)) #  "expression", "expression"

substitute(x <- x + 1, list(x = 1)) # nonsense

myplot <- function(x, y)
    plot(x, y, xlab = deparse1(substitute(x)),
               ylab = deparse1(substitute(y)))

## Simple examples about lazy evaluation, etc:

f1 <- function(x, y = x)             { x <- x + 1; y }
s1 <- function(x, y = substitute(x)) { x <- x + 1; y }
s2 <- function(x, y) { if(missing(y)) y <- substitute(x); x <- x + 1; y }
a <- 10
f1(a)  # 11
s1(a)  # 11
s2(a)  # a
typeof(s2(a))  # "symbol"
Extract or replace substrings in a character vector.
substr(x, start, stop)
substring(text, first, last = 1000000L)

substr(x, start, stop) <- value
substring(text, first, last = 1000000L) <- value
x, text | a character vector. |
start, first | integer. The first element to be extracted or replaced. |
stop, last | integer. The last element to be extracted or replaced. |
value | a character vector, recycled if necessary. |
`substring` is compatible with S, with `first` and `last` instead of `start` and `stop`. For vector arguments, it expands the arguments cyclically to the length of the longest provided none are of zero length.
When extracting, if `start` is larger than the string length then `""` is returned.
For the extraction functions, `x` or `text` will be converted to a character vector by `as.character` if it is not already one.
For the replacement functions, if `start` is larger than the string length then no replacement is done. If the portion to be replaced is longer than the replacement string, then only the portion the length of the string is replaced.
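A short sketch of the extraction and replacement rules above, on made-up strings. Replacement never changes the length of the string.

```r
# Replacement shorter than the portion 2..4: only two characters change.
x <- "abcdef"
substr(x, 2, 4) <- "ZZ"
print(x)        # "aZZdef"

# Replacement longer than the portion 2..3: the excess is ignored.
y <- "abcdef"
substr(y, 2, 3) <- "WXYZ"
print(y)        # "aWXdef"

# 'start' beyond the end of the string: nothing is replaced or extracted.
z <- "abc"
substr(z, 5, 6) <- "Q"
beyond <- substr("abc", 5, 6)
print(c(z, beyond))   # "abc" ""

# substring() recycles its arguments to the longest one.
pieces <- substring("abcdef", 1:3, 3:5)
print(pieces)   # "abc" "bcd" "cde"
```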
If any argument is an `NA` element, the corresponding element of the answer is `NA`.
Elements of the result will have the encoding declared as that of the current locale (see `Encoding`) if the corresponding input had a declared Latin-1 or UTF-8 encoding and the current locale is either Latin-1 or UTF-8.
If an input element has declared `"bytes"` encoding (see `Encoding`), the subsetting is done in units of bytes not characters.
For `substr`, a character vector of the same length and with the same attributes as `x` (after possible coercion).
For `substring`, a character vector of length the longest of the arguments. This will have names taken from `x` (if it has any after coercion, repeated as needed), and other attributes copied from `x` if it is the longest of the arguments.
For the replacement functions, a character vector of the same length as `x` or `text`, with `attributes` such as `names` preserved.
Elements of `x` or `text` with a declared encoding (see `Encoding`) will be returned with the same encoding.
The S version of `substring<-` ignores `last`; this version does not.
These functions are often used with `nchar` to truncate a display. That does not really work (you want to limit the width, not the number of characters, so it would be better to use `strtrim`), but at least make sure you use the default `nchar(type = "chars")`.
Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988) The New S Language. Wadsworth & Brooks/Cole. (`substring`.)
substr("abcdef", 2, 4)
substring("abcdef", 1:6, 1:6)
## strsplit() is more efficient ...

substr(rep("abcdef", 4), 1:4, 4:5)
x <- c("asfef", "qwerty", "yuiop[", "b", "stuff.blah.yech")
substr(x, 2, 5)
substring(x, 2, 4:6)

X <- x
names(X) <- LETTERS[seq_along(x)]
comment(X) <- noquote("is a named vector")
str(aX <- attributes(X))
substring(x, 2) <- c("..", "+++")
substring(X, 2) <- c("..", "+++")
X
stopifnot(x == X, identical(aX, attributes(X)), nzchar(comment(X)))
`sum` returns the sum of all the values present in its arguments.
sum(..., na.rm = FALSE)
... | numeric or complex or logical vectors. |
na.rm | logical. Should missing values (including |
This is a generic function: methods can be defined for it directly or via the `Summary` group generic. For this to work properly, the arguments `...` should be unnamed, and dispatch is on the first argument.
If `na.rm` is `FALSE` an `NA` or `NaN` value in any of the arguments will cause a value of `NA` or `NaN` to be returned, otherwise `NA` and `NaN` values are ignored.
Logical true values are regarded as one, false values as zero. For historical reasons, `NULL` is accepted and treated as if it were `integer(0)`.
Loss of accuracy can occur when summing values of different signs: this can even occur for sufficiently long integer inputs if the partial sums would cause integer overflow. Where possible extended-precision accumulators are used, typically well supported with C99 and newer, but possibly platform-dependent.
The sum. If all of the `...` arguments are of type integer or logical, then the sum is `integer` when possible and is `double` otherwise. Integer overflow should no longer happen since R version 3.5.0. For other argument types it is a length-one numeric (`double`) or complex vector.
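The type of the result and the treatment of logical and `NULL` inputs can be checked directly. This is a small sketch with made-up inputs.

```r
# Integer and logical inputs give an integer sum.
t_int <- typeof(sum(1:5))
t_lgl <- typeof(sum(TRUE, FALSE, TRUE))

# Any double argument makes the result double.
t_dbl <- typeof(sum(1:5, 0.5))
print(c(t_int, t_lgl, t_dbl))   # "integer" "integer" "double"

# TRUE counts as one; NA is dropped when na.rm = TRUE.
n_true <- sum(c(TRUE, NA, TRUE), na.rm = TRUE)

# NULL is treated like integer(0), and the empty sum is zero.
empty <- sum(NULL)
print(c(n_true, empty))         # 2 0
```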
NB: the sum of an empty set is zero, by definition.
This is part of the S4 `Summary` group generic. Methods for it must use the signature `x, ..., na.rm`.
‘plotmath’ for the use of `sum` in plot annotation.
Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988) The New S Language. Wadsworth & Brooks/Cole.
`colSums` for row and column sums.
## Pass a vector to sum, and it will add the elements together.
sum(1:5)

## Pass several numbers to sum, and it also adds the elements.
sum(1, 2, 3, 4, 5)

## In fact, you can pass vectors into several arguments, and everything gets added.
sum(1:2, 3:5)

## If there are missing values, the sum is unknown, i.e., also missing, ....
sum(1:5, NA)
## ... unless we exclude missing values explicitly:
sum(1:5, NA, na.rm = TRUE)
`summary` is a generic function used to produce result summaries of the results of various model fitting functions. The function invokes particular `methods` which depend on the `class` of the first argument.
summary(object, ...)

## Default S3 method:
summary(object, ..., digits, quantile.type = 7)

## S3 method for class 'data.frame'
summary(object, maxsum = 7,
        digits = max(3, getOption("digits")-3), ...)

## S3 method for class 'factor'
summary(object, maxsum = 100, ...)

## S3 method for class 'matrix'
summary(object, ...)

## S3 method for class 'summaryDefault'
format(x, digits = max(3L, getOption("digits") - 3L), ...)

## S3 method for class 'summaryDefault'
print(x, digits = max(3L, getOption("digits") - 3L), ...)
object | an object for which a summary is desired. |
x | a result of the default method of |
maxsum | integer, indicating how many levels should be shown for |
digits | integer, used for number formatting with |
quantile.type | integer code used in |
... | additional arguments affecting the summary produced. |
For `factor`s, the frequency of the first `maxsum - 1` most frequent levels is shown, and the less frequent levels are summarized in `"(Others)"` (resulting in at most `maxsum` frequencies).
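A minimal sketch of the effect of `maxsum` on a made-up factor with four levels: with `maxsum = 3` the two most frequent levels are kept and the remaining ones are pooled into a final catch-all entry.

```r
f <- factor(c("a", "a", "a", "b", "b", "c", "d"))

# All four levels fit within the default maxsum.
full <- summary(f)
print(full)

# At most three frequencies: 'a', 'b', and one pooled count for 'c' and 'd'.
lumped <- summary(f, maxsum = 3)
print(lumped)
```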
The functions `summary.lm` and `summary.glm` are examples of particular methods which summarize the results produced by `lm` and `glm`.
The form of the value returned by `summary` depends on the class of its argument. See the documentation of the particular methods for details of what is produced by that method.
The default method returns an object of class `c("summaryDefault", "table")` which has specialized `format` and `print` methods. The `factor` method returns an integer vector.
The matrix and data frame methods return a matrix of class `"table"`, obtained by applying `summary` to each column and collating the results.
Chambers, J. M. and Hastie, T. J. (1992) Statistical Models in S. Wadsworth & Brooks/Cole.
summary(attenu, digits = 4) #-> summary.data.frame(...), default precision
summary(attenu $ station, maxsum = 20) #-> summary.factor(...)

lst <- unclass(attenu$station) > 20 # logical with NAs
## summary.default() for logicals -- different from *.factor:
summary(lst)
summary(as.factor(lst))
Compute the singular-value decomposition of a rectangular matrix.
svd(x, nu = min(n, p), nv = min(n, p), LINPACK = FALSE)

La.svd(x, nu = min(n, p), nv = min(n, p))
x | a numeric or complex matrix whose SVD decomposition is to be computed. Logical matrices are coerced to numeric. |
nu | the number of left singular vectors to be computed. This must be between |
nv | the number of right singular vectors to be computed. This must be between |
LINPACK | logical. Defunct and an error. |
The singular value decomposition plays an important role in many statistical techniques. `svd` and `La.svd` provide two interfaces which differ in their return values.
Computing the singular vectors is the slow part for large matrices. The computation will be more efficient if both `nu <= min(n, p)` and `nv <= min(n, p)`, and even more so if both are zero.
Unsuccessful results from the underlying LAPACK code will result in an error giving a positive error code (most often `1`): these can only be interpreted by detailed study of the FORTRAN code but mean that the algorithm failed to converge.
Missing, `NaN` or infinite values in `x` will give an error.
The SVD decomposition of the matrix as computed by LAPACK, $$X = U D V',$$ where $U$ and $V$ are orthogonal, $V'$ means V transposed (and conjugated for complex input), and $D$ is a diagonal matrix with the (non-negative) singular values $D_{ii}$ in decreasing order. Equivalently, $D = U' X V$, which is verified in the examples.
The returned value is a list with components
d | a vector containing the singular values of |
u | a matrix whose columns contain the left singular vectors of |
v | a matrix whose columns contain the right singular vectors of |
Recall that the singular vectors are only defined up to sign (a constant of modulus one in the complex case). If a left singular vector has its sign changed, changing the sign of the corresponding right vector gives an equivalent decomposition.
For `La.svd` the return value replaces `v` by `vt`, the (conjugated if complex) transpose of `v`.
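The difference between the two interfaces can be sketched on a small made-up matrix: both reconstruct `X`, but `La.svd` already returns the transposed right singular vectors.

```r
X <- matrix(c(2, 0, 1, 1, 3, 0), nrow = 3)   # illustrative 3 x 2 matrix

s <- svd(X)      # components d, u, v
l <- La.svd(X)   # components d, u, vt

# X = U D V' with svd(): v has to be transposed explicitly.
from_svd <- s$u %*% diag(s$d) %*% t(s$v)

# With La.svd() the transpose is already available as vt.
from_La <- l$u %*% diag(l$d) %*% l$vt

print(all.equal(from_svd, X))
print(all.equal(from_La, X))
```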
The main functions used are the LAPACK routines `DGESDD` and `ZGESDD`.
LAPACK is from https://netlib.org/lapack/ and its guide is listed in the references.
Anderson, E. and ten others (1999) LAPACK Users' Guide. Third Edition. SIAM.
Available on-line at https://netlib.org/lapack/lug/lapack_lug.html.
The ‘Singular-value decomposition’ Wikipedia article.
Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988) The New S Language. Wadsworth & Brooks/Cole.
hilbert <- function(n) { i <- 1:n; 1 / outer(i - 1, i, `+`) }
X <- hilbert(9)[, 1:6]
(s <- svd(X))
D <- diag(s$d)
s$u %*% D %*% t(s$v) #  X = U D V'
t(s$u) %*% X %*% s$v #  D = U' X V
Return an array obtained from an input array by sweeping out a summary statistic.
sweep(x, MARGIN, STATS, FUN = "-", check.margin = TRUE, ...)
x | an array, including a matrix. |
MARGIN | a vector of indices giving the extent(s) of |
STATS | the summary statistic which is to be swept out. |
FUN | the function to be used to carry out the sweep. |
check.margin | logical. If |
... | optional arguments to |
`FUN` is found by a call to `match.fun`. As in the default, binary operators can be supplied if quoted or backquoted.
`FUN` should be a function of two arguments: it will be called with arguments `x` and an array of the same dimensions generated from `STATS` by `aperm`.
The consistency check among `STATS`, `MARGIN` and `x` is stricter if `STATS` is an array than if it is a vector. In the vector case, some kinds of recycling are allowed without a warning. Use `sweep(x, MARGIN, as.array(STATS))` if `STATS` is a vector and you want to be warned if any recycling occurs.
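A minimal sketch of sweeping with the default `"-"` and with another binary operator, on a made-up 2 x 3 matrix:

```r
m <- matrix(1:6, nrow = 2)

# Subtract each column's mean from that column (MARGIN = 2, default FUN "-").
centred <- sweep(m, 2, colMeans(m))
print(centred)

# Multiply each row by its own factor (MARGIN = 1, FUN given as a quoted operator).
scaled <- sweep(m, 1, c(1, 10), FUN = "*")
print(scaled)
```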
An array with the same shape as `x`, but with the summary statistics swept out.
Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988) The New S Language. Wadsworth & Brooks/Cole.
`apply` on which `sweep` used to be based; `scale` for centering and scaling.
require(stats) # for median
med.att <- apply(attitude, 2, median)
sweep(data.matrix(attitude), 2, med.att)  # subtract the column medians

## More sweeping:
A <- array(1:24, dim = 4:2)

## no warnings in normal use
sweep(A, 1, 5)
(A.min <- apply(A, 1, min))  # == 1:4
sweep(A, 1, A.min)
sweep(A, 1:2, apply(A, 1:2, median))

## warnings when mismatch
sweep(A, 1, 1:3)  # STATS does not recycle
sweep(A, 1, 6:1)  # STATS is longer

## exact recycling:
sweep(A, 1, 1:2)  # no warning
sweep(A, 1, as.array(1:2))  # warning

## Using named dimnames
dimnames(A) <- list(fee = 1:4, fie = 1:3, fum = 1:2)
mn_fum_fie <- apply(A, c("fum", "fie"), mean)
mn_fum_fie
sweep(A, c("fum", "fie"), mn_fum_fie)
`switch` evaluates `EXPR` and accordingly chooses one of the further arguments (in `...`).
switch(EXPR, ...)
EXPR | an expression evaluating to a number or a character string. |
... | the list of alternatives. If it is intended that |
switch works in two distinct ways depending whether the first argument evaluates to a character string or a number.
If the value of EXPR is not a character string it is coerced to integer. Note that this also happens for factors, with a warning, as typically the character level is meant. If the integer is between 1 and nargs()-1 then the corresponding element of ... is evaluated and the result returned: thus if the first argument is 3 then the fourth argument is evaluated and returned.
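The numeric form can be sketched as follows. The function name `pick` and its alternatives are made up for illustration.

```r
# Numeric EXPR selects an alternative by position.
pick <- function(i) switch(i, "first", "second", "third")

second       <- pick(2)    # the second alternative
out_of_range <- pick(4)    # no such alternative, so NULL
from_double  <- pick(2.9)  # a non-integer value is coerced (truncated) to 2
```

Because there is no default in the numeric form, callers that may pass an out-of-range index usually need an explicit is.null() check on the result.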
If EXPR evaluates to a character string then that string is matched (exactly) to the names of the elements in .... If there is a match then that element is evaluated unless it is missing, in which case the next non-missing element is evaluated, so for example switch("cc", a = 1, cc =, cd =, d = 2) evaluates to 2. If there is more than one match, the first matching element is used. In the case of no match, if there is an unnamed element of ... its value is returned. (If there is more than one such argument an error is signaled.)
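A short sketch of the character-string form, showing fall-through over missing elements and a single unnamed default. The classifier `file_kind` and its categories are hypothetical.

```r
# Hypothetical classifier: several names share one value via fall-through,
# and the single unnamed alternative acts as the default.
file_kind <- function(ext) {
  switch(ext,
         jpg = ,
         jpeg = ,
         png = "image",
         csv = "table",
         "unknown")
}

kinds <- vapply(c("jpg", "png", "csv", "xyz"), file_kind, character(1))
```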
The first argument is always taken to be EXPR: if it is named its name must (partially) match.
A warning is signaled if no alternatives are provided, as this is usually a coding error.
This is implemented as a primitive function that only evaluates its first argument and one other if one is selected.
The value of one of the elements of ..., or NULL, invisibly (whenever no element is selected).
The result has the visibility (see invisible) of the element evaluated.
It is possible to write calls to switch that can be confusing and may not work in the same way in earlier versions of R. For compatibility (and clarity), always have EXPR as the first argument, naming it if partial matching is a possibility. For the character-string form, have a single unnamed argument as the default after the named values.
Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988) The New S Language. Wadsworth & Brooks/Cole.
require(stats)
centre <- function(x, type) {
  switch(type,
         mean = mean(x),
         median = median(x),
         trimmed = mean(x, trim = .1))
}
x <- rcauchy(10)
centre(x, "mean")
centre(x, "median")
centre(x, "trimmed")

ccc <- c("b","QQ","a","A","bb")
# note: cat() produces no output for NULL
for(ch in ccc)
    cat(ch,":", switch(EXPR = ch, a = 1, b = 2:3), "\n")
for(ch in ccc)
    cat(ch,":", switch(EXPR = ch, a =, A = 1, b = 2:3, "Otherwise: last"),"\n")

## switch(f, *) with a factor f
ff <- gl(3,1, labels=LETTERS[3:1])
ff[1] # C
## so one might expect " is C" here, but
switch(ff[1], A = "I am A", B="Bb..", C=" is C")# -> "I am A"
## so we give a warning

## Numeric EXPR does not allow a default value to be specified
## -- it is always NULL
for(i in c(-1:3, 9))
    print(switch(i, 1, 2 , 3, 4))

## visibility
switch(1, invisible(pi), pi)
switch(2, invisible(pi), pi)
Outlines R syntax and gives the precedence of operators.
The following unary and binary operators are defined. They are listed in precedence groups, from highest to lowest.
:: ::: | access variables in a namespace |
$ @ | component / slot extraction |
[ [[ | indexing |
^ | exponentiation (right to left) |
- + | unary minus and plus |
: | sequence operator |
%any% |> | special operators (including %% and %/%) |
* / | multiply, divide |
+ - | (binary) add, subtract |
< > <= >= == != | ordering and comparison |
! | negation |
& && | and |
| || | or |
~ | as in formulae |
-> ->> | rightwards assignment |
<- <<- | assignment (right to left) |
= | assignment (right to left) |
? | help (unary and binary) |
Within an expression operators of equal precedence are evaluated from left to right except where indicated. (Note that = is not necessarily an operator.)
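A few expressions that illustrate the precedence groups and grouping rules in the table above; the variable names are only labels for this sketch.

```r
p1 <- -2^2        # ^ binds tighter than unary minus: -(2^2)
p2 <- 2^3^2       # ^ groups right to left: 2^(3^2)
p3 <- -1:3        # unary minus binds tighter than ':' : (-1):3
p4 <- 1:3 * 2     # ':' binds tighter than '*': (1:3) * 2
p5 <- 1:2^2       # ^ binds tighter than ':' : 1:(2^2)
p6 <- !1 == 2     # comparison binds tighter than '!': !(1 == 2)
p7 <- 10 - 4 - 3  # equal precedence, left to right: (10 - 4) - 3
```

When in doubt, explicit parentheses cost nothing and make the intended grouping visible to readers.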
The binary operators ::, :::, $ and @ require names or string constants on the right hand side, and the first two also require them on the left.
The links in the See Also section cover most other aspects of the basic syntax.
There are substantial precedence differences between R and S. In particular, in S ? has the same precedence as (binary) + - and & && | || have equal precedence.
Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988) The New S Language. Wadsworth & Brooks/Cole.
Arithmetic, Comparison, Control, Extract, Logic, NumericConstants, Paren, Quotes, Reserved.
The ‘R Language Definition’ manual.
## Logical AND ("&&") has higher precedence than OR ("||"):
TRUE || TRUE && FALSE   # is the same as
TRUE || (TRUE && FALSE) # and different from
(TRUE || TRUE) && FALSE

## Special operators have higher precedence than "!" (logical NOT).
## You can use this for %in% :
! 1:10 %in% c(2, 3, 5, 7) # same as !(1:10 %in% c(2, 3, 5, 7))
## but we strongly advise to use the "!( ... )" form in this case!

## '=' has lower precedence than '<-' ... so you should not mix them
##     (and '<-' is considered better style anyway):
## Not run: 
## Consequently, this gives a ("non-catchable") error
 x <- y = 5  #->  Error in (x <- y) = 5 : ....
## End(Not run)
Sys.getenv obtains the values of the environment variables.
Sys.getenv(x = NULL, unset = "", names = NA)
x | a character vector, or |
unset | a character string. |
names | logical: should the result be named? If |
Both arguments will be coerced to character if necessary.
Setting unset = NA will enable unset variables and those set to the value "" to be distinguished, if the OS does. POSIX requires the OS to distinguish, and all known current R platforms do.
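The effect of the unset argument can be sketched as follows. The variable name GR_DEMO_UNSET_VAR and the helper `is_set` are assumptions made for this example.

```r
nm <- "GR_DEMO_UNSET_VAR"   # hypothetical variable name
Sys.unsetenv(nm)            # make sure it is not set

default_value <- Sys.getenv(nm)                      # "" by default
na_value      <- Sys.getenv(nm, unset = NA)          # NA marks "not set"
fallback      <- Sys.getenv(nm, unset = "fallback")  # custom placeholder

# Hypothetical helper: TRUE only if the variable exists in the environment.
is_set <- function(name) !is.na(Sys.getenv(name, unset = NA))
```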
A vector of the same length as x, with (if names == TRUE) the variable names as its names attribute. Each element holds the value of the environment variable named by the corresponding component of x (or the value of unset if no environment variable with that name was found).
On most platforms Sys.getenv() will return a named vector giving the values of all the environment variables, sorted in the current locale. It may be confused by names containing = which some platforms allow but POSIX does not. (Windows is such a platform: there names including = are truncated just before the first =.)
When x is missing and names is not false, the result is of class "Dlist" in order to get a nice print method.
Sys.setenv, Sys.getlocale for the locale in use, getwd for the working directory.
The help for ‘environment variables’ lists many of the environment variables used by R.
## whether HOST is set will be shell-dependent e.g. Solaris' csh did not.
Sys.getenv(c("R_HOME", "R_PAPERSIZE", "R_PRINTCMD", "HOST"))

s <- Sys.getenv() # *all* environment variables
op <- options(width=111) # (nice printing)
names(s)    # all settings (the values could be very long)
head(s, 12) # using the Dlist print() method

## Language and Locale settings -- but rather use Sys.getlocale()
s[grep("^L(C|ANG)", names(s))]
## typically R-related:
s[grep("^_?R_", names(s))]
options(op)# reset
Get the process ID of the R Session. It is guaranteed by the operating system that two R sessions running simultaneously will have different IDs, but it is possible that R sessions running at different times will have the same ID.
Sys.getpid()
An integer, often between 1 and 32767 under Unix-alikes (but for example FreeBSD and macOS use IDs up to 99999) and a positive integer (up to 32767) under Windows.
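One possible use, sketched under the assumption that several R processes write scratch files side by side: tag each file name with the process ID. The naming scheme is purely illustrative.

```r
pid <- Sys.getpid()

# Hypothetical naming scheme: a per-process log file in the session temp dir.
log_file <- file.path(tempdir(), sprintf("worker-%d.log", pid))
```

Since IDs can be reused by later sessions, such names are distinct only among processes running at the same time.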
Sys.getpid()

## Show files opened from this R process
if(.Platform$OS.type == "unix") ## on Unix-alikes such as Linux, macOS, FreeBSD:
   system(paste("lsof -p", Sys.getpid()))
Function to do wildcard expansion (also known as ‘globbing’) on file paths.
Sys.glob(paths, dirmark = FALSE)
paths | character vector of patterns for relative or absolute filepaths. Missing values will be ignored. |
dirmark | logical: should matches to directories from patterns that do not already end in have a slash appended? May not be supported on all platforms. |
This expands tilde (see tilde expansion) and wildcards in file paths. For precise details of wildcards expansion, see your system's documentation on the glob system call. There is a POSIX 1003.2 standard (see https://pubs.opengroup.org/onlinepubs/9699919799/functions/glob.html) but some OSes will go beyond this.
All systems should interpret * (match zero or more characters), ? (match a single character) and (probably) [ (begin a character class or range). The handling of paths ending with a separator is system-dependent. On a POSIX-2008 compliant OS they will match directories (only), but as they are not valid filepaths on Windows, they match nothing there. (Earlier POSIX standards allowed them to match files.)
The rest of these details are indicative (and based on the POSIX standard).
If a filename starts with . this may need to be matched explicitly: for example Sys.glob("*.RData") may or may not match ‘.RData’ but will not usually match ‘.aa.RData’. Note that this is platform-dependent: e.g. on Solaris Sys.glob("*.*") matches ‘.’ and ‘..’.
[ begins a character class. If the first character in [...] is not !, this is a character class which matches a single character against any of the characters specified. The class cannot be empty, so ] can be included provided it is first. If the first character is !, the character class matches a single character which is none of the specified characters. Whether . in a character class matches a leading . in the filename is OS-dependent.
Character classes can include ranges such as [A-Z]: include - as a character by having it first or last in a class. (The interpretation of ranges should be locale-specific, so the example is not a good idea in an Estonian locale.)
One can remove the special meaning of ?, * and [ by preceding them by a backslash (except within a character class).
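The three basic wildcards can be tried out in a scratch directory, as in the sketch below. The directory and file names are invented for the example, and the results are sorted because the order returned is system-specific.

```r
d <- tempfile("globdemo")   # hypothetical scratch directory
dir.create(d)
invisible(file.create(file.path(d, c("a1.txt", "a2.txt", "b1.csv"))))

star  <- sort(basename(Sys.glob(file.path(d, "*.txt"))))    # any run of characters
quest <- sort(basename(Sys.glob(file.path(d, "a?.txt"))))   # exactly one character
cls   <- sort(basename(Sys.glob(file.path(d, "[ab]1.*"))))  # one of 'a' or 'b'
none  <- Sys.glob(file.path(d, "*.rds"))                    # nothing matches

unlink(d, recursive = TRUE)
```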
A character vector of matched file paths. The order is system-specific (but in the order of the elements of paths): it is normally collated in either the current locale or in byte (ASCII) order; however, on Windows collation is in the order of Unicode points.
Directory errors are normally ignored, so the matches are to accessible file paths (but not necessarily accessible files).
Quotes for handling backslashes in character strings.
Sys.glob(file.path(R.home(), "library", "*", "R", "*.rdx"))
Reports system and user information.
Sys.info()
This uses POSIX or Windows system calls. Note that OS names (sysname) might not be what you expect: for example macOS identifies itself as ‘Darwin’ and Solaris as ‘SunOS’.
Sys.info() returns details of the platform R is running on, whereas R.version gives details of the platform R was built on: the release and version may well be different.
A character vector with fields
sysname | The operating system name. |
release | The OS release. |
version | The OS version. |
nodename | A name by which the machine is known on the network (if any). |
machine | A concise description of the hardware, often the CPU type. |
login | The user's login name, or |
user | The name of the real user ID, or |
effective_user | The name of the effective user ID, or |
The first five fields come from the uname(2) system call. The login name comes from getlogin(2), and the user names from getpwuid(getuid()) and getpwuid(geteuid()).
The last three fields give the same value.
The meaning of release and version is system-dependent: on a Unix-alike they normally refer to the kernel. There, usually release contains a numeric version and version gives additional information. Examples for release:
"4.17.11-200.fc28.x86_64" # Linux (Fedora)
"3.16.0-5-amd64"          # Linux (Debian)
"17.7.0"                  # macOS 10.13.6
"5.11"                    # Solaris
There is no guarantee that the node or login or user names will be what you might reasonably expect. (In particular on some Linux distributions the login name is unknown from sessions with re-directed inputs.)
The use of alternatives such as system("whoami") is not portable: the POSIX command system("id") is much more portable on Unix-alikes, provided only the POSIX options -[Ggu][nr] are used (and not the many BSD and GNU extensions). whoami is equivalent to id -un (on Solaris, /usr/xpg4/bin/id -un).
Windows may report unexpected versions: there, see the help for
.Platform, and R.version. sessionInfo() gives a synopsis of both your system and the R session (and gives the OS version in a human-readable form).
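A small sketch of how the fields might be used, assuming Sys.info() is available on the platform. The flag names and the summary format are illustrative choices only.

```r
si <- Sys.info()
os_name <- si[["sysname"]]

# Hypothetical flags built from the reported system name; note that macOS
# reports itself as "Darwin" and Solaris as "SunOS".
on_macos   <- identical(os_name, "Darwin")
on_solaris <- identical(os_name, "SunOS")

summary_line <- sprintf("%s %s (%s)", os_name, si[["release"]], si[["machine"]])
```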
Sys.info()
## An alternative (and probably better) way to get the login name on Unix
Sys.getenv("LOGNAME")
Get details of the numerical and monetary representations in the current locale.
Sys.localeconv()
Normally R is run without looking at the value of LC_NUMERIC, so the decimal point remains '.'. So the first three of these components will only be useful if you have set the locale category LC_NUMERIC using Sys.setlocale in the current R session (when R may not work correctly).
The monetary components will only be set to non-default values (see the ‘Examples’ section) if the LC_MONETARY category is set. It often is not set: see the examples for how to trigger setting it.
A character vector with 18 named components. See your ISO C documentation for details of the meaning.
It is possible to compile R without support for locales, in which case the value will be NULL.
Sys.setlocale for ways to set locales.
Sys.localeconv()
## The results in the C locale are
##    decimal_point     thousands_sep          grouping   int_curr_symbol
##              "."                ""                ""                ""
##  currency_symbol mon_decimal_point mon_thousands_sep      mon_grouping
##               ""                ""                ""                ""
##    positive_sign     negative_sign   int_frac_digits       frac_digits
##               ""                ""             "127"             "127"
##    p_cs_precedes    p_sep_by_space     n_cs_precedes    n_sep_by_space
##            "127"             "127"             "127"             "127"
##      p_sign_posn       n_sign_posn
##            "127"             "127"

## Now try your default locale (which might be "C").
old <- Sys.getlocale()
## The category may not be set:
## the following may do so, but it might not be supported.
Sys.setlocale("LC_MONETARY", locale = "")
Sys.localeconv()
## or set an appropriate value yourself, e.g.
Sys.setlocale("LC_MONETARY", "de_AT")
Sys.localeconv()
Sys.setlocale(locale = old)

## Not run: read.table("foo", dec=Sys.localeconv()["decimal_point"])
These functions provide access to environments (‘frames’ in S terminology) associated with functions further up the calling stack.
sys.call(which = 0)
sys.frame(which = 0)
sys.nframe()
sys.function(which = 0)
sys.parent(n = 1)

sys.calls()
sys.frames()
sys.parents()
sys.on.exit()
sys.status()
parent.frame(n = 1)
which | the frame number if non-negative, the number of frames to go back if negative. |
n | the number of generations to go back. (See the ‘Details’ section.) |
.GlobalEnv is given number 0 in the list of frames. Each subsequent function evaluation increases the frame stack by 1. The call, function definition and the environment for evaluation of that function are returned by sys.call, sys.function and sys.frame with the appropriate index.
sys.call, sys.function and sys.frame accept integer values for the argument which. Non-negative values of which are frame numbers starting from .GlobalEnv whereas negative values are counted back from the frame number of the current evaluation.
The parent frame of a function evaluation is the environment in which the function was called. It is not necessarily numbered one less than the frame number of the current evaluation, nor is it the environment within which the function was defined. sys.parent returns the number of the parent frame if n is 1 (the default), the grandparent if n is 2, and so on. See also the ‘Note’.
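The distinction between the frame a function was called from and the environment it was defined in can be sketched as follows. All function and variable names here are hypothetical.

```r
# inner_fn() is defined at top level but looks in its caller's frame.
inner_fn <- function() {
  get("local_var", envir = parent.frame())
}
outer_fn <- function() {
  local_var <- "set in outer_fn"
  inner_fn()
}
from_caller <- outer_fn()

# sys.call() and sys.function() describe the current evaluation itself.
describe <- function(...) list(call = sys.call(), fun = sys.function())
info <- describe(1, b = 2)
call_text <- deparse(info$call)
```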
sys.nframe returns an integer, the number of the current frame as described in the first paragraph.
sys.calls and sys.frames give a pairlist of all the active calls and frames, respectively, and sys.parents returns an integer vector of indices of the parent frames of each of those frames.
Notice that even though the sys.xxx functions (except sys.status) are interpreted, their contexts are not counted nor are they reported. There is no access to them.
sys.status() returns a list with components sys.calls, sys.parents and sys.frames, the results of calls to those three functions (which will include the call to sys.status: see the first example).
sys.on.exit() returns the expression stored for use by on.exit in the function currently being evaluated. (Note that this differs from S, which returns a list of expressions for the current frame and its parents.)
parent.frame(n) is a convenient shorthand for sys.frame(sys.parent(n)) (implemented slightly more efficiently).
sys.call returns a call, sys.function a function definition, and sys.frame and parent.frame return an environment.
For the other functions, see the ‘Details’ section.
Strictly, sys.parent and parent.frame refer to the context of the parent interpreted function. So internal functions (which may or may not set contexts and so may or may not appear on the call stack) may not be counted, and S3 methods can also do surprising things.
As an effect of lazy evaluation, these functions look at the call stack at the time they are evaluated, not at the time they are called. Passing calls to them as function arguments is unlikely to be a good idea, but these functions still look at the call stack and count frames from the frame of the function evaluation from which they were called.
Hence, when these functions are called to provide default values for function arguments, they are evaluated in the evaluation of the called function and they count frames accordingly (see e.g. the envir argument of eval).
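The default-argument case can be sketched like this. The helper `caller_vars` is an assumption for illustration, modelled loosely on how an envir default is typically written.

```r
# The default for 'env' is evaluated inside caller_vars(), so parent.frame()
# there means "the frame that called caller_vars()".
caller_vars <- function(env = parent.frame()) ls(env)

g <- function() {
  zz <- 1
  caller_vars()   # lists the variables local to g()
}
seen <- g()
```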
Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988) The New S Language. Wadsworth & Brooks/Cole. (Not parent.frame.)
eval for a usage of sys.frame and parent.frame.
require(utils)

## Note: the first two examples will give different results
## if run by example().
ff <- function(x) gg(x)
gg <- function(y) sys.status()
str(ff(1))

gg <- function(y) {
    ggg <- function() {
        cat("current frame is", sys.nframe(), "\n")
        cat("parents are", sys.parents(), "\n")
        print(sys.function(0)) # ggg
        print(sys.function(2)) # gg
    }
    if(y > 0) gg(y-1) else ggg()
}
gg(3)

t1 <- function() {
  aa <- "here"
  t2 <- function() {
    ## in frame 2 here
    cat("current frame is", sys.nframe(), "\n")
    str(sys.calls()) ## list with two components t1() and t2()
    cat("parents are frame numbers", sys.parents(), "\n") ## 0 1
    print(ls(envir = sys.frame(-1))) ## [1] "aa" "t2"
    invisible()
  }
  t2()
}
t1()

test.sys.on.exit <- function() {
  on.exit(print(1))
  ex <- sys.on.exit()
  str(ex)
  cat("exiting...\n")
}
test.sys.on.exit()
## gives 'language print(1)', prints 1 on exit

## An example where the parent is not the next frame up the stack
## since method dispatch uses a frame.
as.double.foo <- function(x)
{
    str(sys.calls())
    print(sys.frames())
    print(sys.parents())
    print(sys.frame(-1)); print(parent.frame())
    x
}
t2 <- function(x) as.double(x)
a <- structure(pi, class = "foo")
t2(a)
Find out if a file path is a symbolic link, and if so what it is linked to, via the system call readlink.
Symbolic links are a POSIX concept, not implemented on Windows but for most filesystems on Unix-alikes.
Sys.readlink(paths)
paths | character vector of file paths. Tilde expansion is done: see |
A character vector of the same length as paths. The entries are the path of the file linked to, "" if the path is not a symbolic link, and NA if there is an error (e.g., the path does not exist or cannot be converted to the native encoding).
On platforms without the readlink system call, all elements are "".
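A sketch of the two non-error outcomes, restricted to Unix-alikes where symbolic links can normally be created. The scratch file names are invented, and the example assumes the temporary directory is on a file system that supports links.

```r
if (.Platform$OS.type == "unix") {
  target <- tempfile("target")   # hypothetical scratch files
  link   <- tempfile("link")
  invisible(file.create(target))
  made <- file.symlink(target, link)

  # First entry: the path the link points to; second entry: "" (not a link).
  links <- Sys.readlink(c(link, target))

  unlink(c(link, target))
}
```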
file.symlink for the creation of symbolic links (and their Windows analogues), file.info
##' To check if files (incl. directories) are symbolic links:
is.symlink <- function(paths) isTRUE(nzchar(Sys.readlink(paths), keepNA=TRUE))
## will return all FALSE when the platform has no `readlink` system call.
is.symlink("/foo/bar")
Sys.setenv sets environment variables (for other processes called from within R or future calls to Sys.getenv from this R process).
Sys.unsetenv removes environment variables.
Sys.setenv(...)

Sys.unsetenv(x)
... | named arguments with values coercible to a character string. |
x | a character vector, or an object coercible to character. |
Non-standard R names must be quoted in Sys.setenv: see the examples. Most platforms (and POSIX) do not allow names containing "=". Windows does, but the facilities provided by R may not handle these correctly so they should be avoided. Most platforms allow setting an environment variable to "", but Windows does not and there Sys.setenv(FOO = "") unsets FOO.
There may be system-specific limits on the maximum length of the values of individual environment variables or of names+values of all environment variables.
Recent versions of Windows have a maximum length of 32,767 characters for an environment variable; however cmd.exe has a limit of 8192 characters for a command line, hence set can only set 8188.
A logical vector, with elements being true if (un)setting the corresponding variable succeeded. (For Sys.unsetenv this includes attempting to remove a non-existent variable.)
On Unix-alikes, if Sys.unsetenv is not supported, it will at least try to set the value of the environment variable to "", with a warning.
Sys.getenv, Startup for ways to set environment variables for the R session.
setwd for the working directory.
Sys.setlocale to set (and get) language locale variables, and notably Sys.setLanguage to set the LANGUAGE environment variable which is used for conditionMessage translations.
The help for ‘environment variables’ lists many of the environment variables used by R.
print(Sys.setenv(R_TEST = "testit", "A+C" = 123))  # `A+C` could also be used
Sys.getenv("R_TEST")
Sys.unsetenv("R_TEST") # on Unix-alike may warn and not succeed
Sys.getenv("R_TEST", unset = NA)
Uses system calls to set the times on a file or directory.
Sys.setFileTime(path, time)
path | A character vector containing file or directory paths. |
time | A date-time of class |
This attempts to set the file time to the value specified.
On a Unix-alike it uses the system call utimensat if that is available, otherwise utimes or utime. On a POSIX file system it sets both the last-access and modification times. Fractional seconds will be set as from R 3.4.0 on OSes with the requisite system calls and suitable filesystems.
On Windows it uses the system call SetFileTime to set the ‘last write time’. Some Windows file systems only record the time at a resolution of two seconds.
Sys.setFileTime has been vectorized in R 3.6.0. Earlier versions of R required path and time to be vectors of length one.
A logical vector indicating if the operation succeeded for each of the files and directories attempted, returned invisibly.
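A minimal sketch of setting and reading back a modification time on a scratch file. The file name and timestamp are arbitrary, and the comparison allows for file systems with coarse time resolution.

```r
f <- tempfile("stamp")   # hypothetical scratch file
invisible(file.create(f))

stamp <- as.POSIXct("2020-01-02 03:04:05", tz = "UTC")
ok <- Sys.setFileTime(f, stamp)   # returned invisibly

# Difference in seconds between the requested and the recorded time.
drift <- abs(as.numeric(file.mtime(f)) - as.numeric(stamp))

unlink(f)
```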
Suspend execution of R expressions for a specified time interval.
Sys.sleep(time)
time | The time interval to suspend execution for, in seconds. |
Using this function allows R to temporarily be given very low priority and hence not to interfere with more important foreground tasks. A typical use is to allow a process launched from R to set itself up and read its input files before R execution is resumed.
The intention is that this function suspends execution of R expressions but wakes the process up often enough to respond to GUI events, typically every half second. It can be interrupted (e.g. by ‘Ctrl-C’ or ‘Esc’ at the R console).
There is no guarantee that the process will sleep for the whole of the specified interval (sleep might be interrupted), and it may well take slightly longer in real time to resume execution.
time must be non-negative (and not NA nor NaN): Inf is allowed (and might be appropriate if the intention is to wait indefinitely for an interrupt). The resolution of the time interval is system-dependent, but will normally be 20ms or better. (On modern Unix-alikes it will be better than 1ms.)
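The "wait for another process to get ready" use can be sketched as a polling loop with a timeout. The helper `wait_until` and the marker file are assumptions for the example, with the marker standing in for whatever signal the other process provides.

```r
# Hypothetical helper: poll 'ready()' until it is TRUE or time runs out.
wait_until <- function(ready, timeout = 5, interval = 0.1) {
  deadline <- Sys.time() + timeout
  repeat {
    if (isTRUE(ready())) return(TRUE)
    if (Sys.time() >= deadline) return(FALSE)
    Sys.sleep(interval)   # pause between checks instead of spinning
  }
}

marker <- tempfile("ready")   # hypothetical marker file
first_try <- wait_until(function() file.exists(marker),
                        timeout = 0.3, interval = 0.05)
invisible(file.create(marker))
found <- wait_until(function() file.exists(marker))
unlink(marker)
```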
Invisible NULL.
Despite its name, this is not currently implemented using the sleep system call (although on Windows it does make use of Sleep).
testit <- function(x)
{
    p1 <- proc.time()
    Sys.sleep(x)
    proc.time() - p1 # The cpu usage should be negligible
}
testit(3.7)
Parses expressions in the given file, and then successively evaluates them in the specified environment.
sys.source(file, envir = baseenv(), chdir = FALSE,
           keep.source = getOption("keep.source.pkgs"),
           keep.parse.data = getOption("keep.parse.data.pkgs"),
           toplevel.env = as.environment(envir))
file | a character string naming the file to be read from. |
envir | an R object specifying the environment in which the expressions are to be evaluated. May also be a list or an integer. The default |
chdir | logical; if |
keep.source | logical. If |
keep.parse.data | logical. If |
toplevel.env | an R environment to be used as top level while evaluating the expressions. This argument is useful for frameworks running package tests; the default should be used in other cases. |
For large files, keep.source = FALSE may save quite a bit of memory. Disabling only parse data via keep.parse.data = FALSE can already save a lot.
envir
In order for the code being evaluated to use the correct environment (for example, in global assignments), source code in packages should call topenv(), which will return the namespace, if any, the environment set up by sys.source, or the global environment if a saved image is being used.
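As a sketch of evaluating a file in an environment of your own choosing, the following sources a two-line script into a fresh environment. The script contents and variable names are made up for the example.

```r
script <- tempfile(fileext = ".R")   # hypothetical script file
writeLines(c("x <- 1", "f <- function() x + 1"), script)

env <- new.env()
sys.source(script, envir = env)
unlink(script)

defined <- sort(ls(env))   # objects created by the script
result  <- env$f()         # f() finds x in 'env'
```

Keeping the sourced objects in their own environment avoids overwriting objects of the same name in the workspace.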
source, and loadNamespace which is called from library(.) and uses sys.source(.).
## a simple way to put some objects in an environment
## high on the search path
tmp <- tempfile()
writeLines("aaa <- pi", tmp)
env <- attach(NULL, name = "myenv")
sys.source(tmp, env)
unlink(tmp)
search()
aaa
detach("myenv")
Sys.time and Sys.Date return the system's idea of the current date with and without time.
Sys.time()
Sys.Date()
Sys.time returns an absolute date-time value which can be converted to various time zones and may return different days.
Sys.Date returns the current day in the current time zone.
Sys.time returns an object of class "POSIXct" (see DateTimeClasses). On almost all systems it will have sub-second accuracy, possibly microseconds or better. On Windows it increments in clock ticks (usually 1/60 of a second) reported to millisecond accuracy.
Sys.Date returns an object of class "Date" (see Date).
Sys.time may return fractional seconds, but they are ignored by the default conversions (e.g., printing) for class "POSIXct". See the examples and format.POSIXct for ways to reveal them.
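One way to reveal the fractional seconds without changing options is the %OSn format specification, sketched below together with a simple elapsed-time measurement. The variable names are illustrative.

```r
now <- Sys.time()

plain   <- format(now, "%H:%M:%S")     # whole seconds only
precise <- format(now, "%H:%M:%OS3")   # seconds with three decimal places

# Elapsed wall-clock time between two calls, as a 'difftime' in seconds.
t0 <- Sys.time()
Sys.sleep(0.05)
elapsed <- difftime(Sys.time(), t0, units = "secs")
```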
date for the system time in a fixed-format character string.
system.time for measuring elapsed/CPU time of expressions.
Sys.time()
## print with possibly greater accuracy:
op <- options(digits.secs = 6)
Sys.time()
options(op)

## locale-specific version of date()
format(Sys.time(), "%a %b %d %X %Y")

Sys.Date()
This is an interface to the system command which, or to an emulation on Windows.
Sys.which(names)
names | Character vector of names or paths of possible executables. |
The system command which reports on the full path names of an executable (including an executable script) as would be executed by a shell, accepting either absolute paths or looking on the path.

On Windows an ‘executable’ is a file with extension ‘.exe’, ‘.com’, ‘.cmd’ or ‘.bat’. Such files need not actually be executable, but they are what system tries.

On a Unix-alike the full path to which (usually ‘/usr/bin/which’) is found when R is installed.
A character vector of the same length as names, named by names. The elements are either the full path to the executable or some indication that no executable of that name was found. Typically the indication is "", but this does depend on the OS (and the known exceptions are changed to ""). Missing values in names have missing return values.
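The return value can be turned into a simple availability check. The sketch below is only one way to do this; has_executable is a hypothetical helper name, and it assumes that "not found" is reported as "" as described above.

```r
# Hypothetical helper: TRUE for each name that Sys.which() resolves to a path.
# NA results (from missing values in the input) are reported as FALSE.
has_executable <- function(names) {
  paths <- Sys.which(names)
  !is.na(paths) & nzchar(paths)
}

cmds <- c("ping", "this-does-not-exist")
found <- has_executable(cmds)
found
```

The result keeps the names of the input, so individual commands can be looked up by name.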
On Windows the paths will be short paths (8+3 components, no spaces) with \ as the path delimiter.
Except on Windows this calls the system command which: since that is not part of e.g. the POSIX standards, exactly what it does is OS-dependent. It will usually do tilde-expansion and it may make use of csh aliases.
## the first two are likely to exist everywhere
## texi2dvi exists on most Unix-alikes and under MiKTeX
Sys.which(c("ftp", "ping", "texi2dvi", "this-does-not-exist"))
system invokes the OS command specified by command.
system(command, intern = FALSE,
       ignore.stdout = FALSE, ignore.stderr = FALSE,
       wait = TRUE, input = NULL, show.output.on.console = TRUE,
       minimized = FALSE, invisible = TRUE, timeout = 0,
       receive.console.signals = wait)
command | the system command to be invoked, as a character string. |
intern | a logical (not |
ignore.stdout, ignore.stderr | a logical (not |
wait | a logical (not |
input | if a character vector is supplied, this is copied one string per line to a temporary file, and the standard input of |
timeout | timeout in seconds, ignored if 0. This is a limit for the elapsed time running |
receive.console.signals | a logical (not |
show.output.on.console, minimized, invisible | arguments that are accepted on Windows but ignored on this platform, with a warning. |
This interface has become rather complicated over the years: see system2 for a more portable and flexible interface which is recommended for new code.
command is parsed as a command plus arguments separated by spaces. So if the path to the command (or a single argument such as a file path) contains spaces, it must be quoted e.g. by shQuote.
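As a minimal sketch of the quoting step, the following builds a command line around a path that contains spaces. The path is hypothetical and the command line is only constructed, not run; type = "sh" is given explicitly so that the result does not depend on the platform default.

```r
# A hypothetical file path containing spaces.
path <- "my data/report 2024.txt"

# Quote the path so that it is seen as one argument; type = "sh" is chosen
# explicitly here so that the result is the same on every platform.
quoted <- shQuote(path, type = "sh")
cmd <- paste("ls -l", quoted)
cmd
```

The resulting string could then be passed as the command argument of system on a Unix-alike.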
Unix-alikes pass the command line to a shell (normally ‘/bin/sh’, and POSIX requires that shell), so command can be anything the shell regards as executable, including shell scripts, and it can contain multiple commands separated by ;.

On Windows, system does not use a shell and there is a separate function shell which passes command lines to a shell.
If intern is TRUE then popen is used to invoke the command and the output collected, line by line, into an R character vector. If intern is FALSE then the C function system is used to invoke the command.

wait = FALSE is implemented by appending & to the command: this is in principle shell-dependent, but required by POSIX and so widely supported.
When timeout is non-zero, the command is terminated after the given number of seconds. The termination works for typical commands, but is not guaranteed: it is possible to write a program that would keep running after the time is out. Timeouts can only be set with wait = TRUE.

Timeouts cannot be used with interactive commands: the command is run with standard input redirected from ‘/dev/null’ and it must not modify terminal settings. As long as the tty tostop option is disabled, which it usually is by default, the executed command may write to standard output and standard error. One cannot rely on the execution time of the child processes being included in the user.child and sys.child elements of the proc_time object returned by proc.time. For the time to be included, all child processes have to be waited for by their parents, which has to be implemented in the parent applications.
The ordering of arguments after the first two has changed from time to time: it is recommended to name all arguments after the first.

There are many pitfalls in using system to ascertain if a command can be run; Sys.which is more suitable.

receive.console.signals = TRUE is useful when running asynchronous processes (using wait = FALSE) to implement a synchronous operation. In all other cases it is recommended to use the default.
If intern = TRUE, a character vector giving the output of the command, one line per character string. (Output lines of more than 8095 bytes will be split on some systems.) If the command could not be run an R error is generated.

If command runs but gives a non-zero exit status this will be reported with a warning and in the attribute "status" of the result: an attribute "errmsg" may also be available.

If intern = FALSE, the return value is an error code (0 for success), given the invisible attribute (so needs to be printed explicitly). If the command could not be run for any reason, the value is 127 and a warning is issued (as from R 3.5.0). Otherwise if wait = TRUE the value is the exit status returned by the command, and if wait = FALSE it is 0 (the conventional success value).

If the command times out, a warning is reported and the exit status is 124.
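The two kinds of return value can be compared side by side. This is a sketch that assumes a Unix-alike, where command is passed to a POSIX shell so that the shell built-ins echo and exit are available; the exit code 3 is arbitrary.

```r
# Assumes a Unix-alike where the shell built-ins 'echo' and 'exit' are available.
if (.Platform$OS.type == "unix") {
  # intern = FALSE: the exit status is returned invisibly.
  status <- system("exit 3")
  print(status)

  # intern = TRUE: the output is returned; a non-zero status is attached as
  # attribute "status" (and signalled by a warning, suppressed here).
  out <- suppressWarnings(system("echo hello; exit 3", intern = TRUE))
  print(out)
  print(attr(out, "status"))
}
```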
For command-line R, error messages written to ‘stderr’ will be sent to the terminal unless ignore.stderr = TRUE. They can be captured (in the most likely shells) by

system("some command 2>&1", intern = TRUE)

For GUIs, what happens to output sent to ‘stdout’ or ‘stderr’ if intern = FALSE is interface-specific, and it is unsafe to assume that such messages will appear on a GUI console (they do on the macOS GUI's console, but not on some others).
How processes are launched differs fundamentally between Windows and Unix-alike operating systems, as do the higher-level OS functions on which this R function is built. So it should not be surprising that there are many differences between OSes in how system behaves. For the benefit of programmers, the more important ones are summarized in this section.

The most important difference is that on a Unix-alike system launches a shell which then runs command. On Windows the command is run directly: use shell for an interface which runs command via a shell (by default the Windows shell cmd.exe, which has many differences from a POSIX shell).

This means that it cannot be assumed that redirection or piping will work in system (redirection sometimes does, but we have seen cases where it stopped working after a Windows security patch), and system2 (or shell) must be used on Windows.
What happens to stdout and stderr when not captured depends on how R is running: Windows batch commands behave like a Unix-alike, but from the Windows GUI they are generally lost. system(intern = TRUE) captures ‘stderr’ when run from the Windows GUI console unless ignore.stderr = TRUE.

The behaviour on error is different in subtle ways (and has differed between R versions).

The quoting conventions for command differ, but shQuote is a portable interface.

Arguments show.output.on.console, minimized, invisible only do something on Windows (and are most relevant to Rgui there).
man system and man sh for how this is implemented on the OS in use.

.Platform for platform-specific variables.

pipe to set up a pipe connection.
# list all files in the current directory using the -F flag
## Not run: system("ls -F")

# t1 is a character vector, each element giving a line of output from who
# (if the platform has who)
t1 <- try(system("who", intern = TRUE))

try(system("ls fizzlipuzzli", intern = TRUE, ignore.stderr = TRUE))
# zero-length result since file does not exist, and will give warning.
Finds the full file names of files in packages etc.
system.file(..., package = "base", lib.loc = NULL, mustWork = FALSE)
... | character vectors, specifying subdirectory and file(s) within some package. The default, none, returns the root of the package. Wildcards are not supported. |
package | a character string with the name of a single package. An error occurs if more than one package name is given. |
lib.loc | a character vector with path names of R libraries. See ‘Details’ for the meaning of the default value of |
mustWork | logical. If |
This checks the existence of the specified files with file.exists. So file paths are only returned if there are sufficient permissions to establish their existence.

The unnamed arguments in ... are usually character strings, but if character vectors they are recycled to the same length.

This uses find.package to find the package, and hence with the default lib.loc = NULL looks first for attached packages then in each library listed in .libPaths(). Note that if a namespace is loaded but the package is not attached, this will look only on .libPaths().
A character vector of positive length, containing the file paths that matched ..., or the empty string, "", if none matched (unless mustWork = TRUE).

If matching the root of a package, there is no trailing separator.

system.file() with no arguments gives the root of the base package.
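A short sketch of the three outcomes described above: a match, the empty string for no match, and an error when mustWork = TRUE. The file name "no-such-file-here" is a made-up name assumed not to exist in package stats.

```r
# A file that ships with every installed package.
desc <- system.file("DESCRIPTION", package = "stats")
nzchar(desc)

# A hypothetical file name that does not exist: the result is "".
missing_path <- system.file("no-such-file-here", package = "stats")
identical(missing_path, "")

# With mustWork = TRUE the same lookup signals an error instead.
res <- tryCatch(
  system.file("no-such-file-here", package = "stats", mustWork = TRUE),
  error = function(e) "error signalled"
)
res
```

Testing the result with nzchar is usually simpler than comparing against "" when several paths are looked up at once.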
R.home for the root directory of the R installation, list.files.

Sys.glob to find paths via wildcards.
system.file()                  # The root of the 'base' package
system.file(package = "stats") # The root of package 'stats'
system.file("INDEX")
system.file("help", "AnIndex", package = "splines")
Return CPU (and other) times that expr used.
system.time(expr, gcFirst = TRUE)
expr | Valid R expression to be timed. |
gcFirst | Logical - should a garbage collection be performed immediately before the timing? Default is |
system.time calls the function proc.time, evaluates expr, and then calls proc.time once more, returning the difference between the two proc.time calls.

unix.time was an alias of system.time, for compatibility with S; it was deprecated in 2016 and finally became defunct in 2022.

Timings of evaluations of the same expression can vary considerably depending on whether the evaluation triggers a garbage collection. When gcFirst is TRUE a garbage collection (gc) will be performed immediately before the evaluation of expr. This will usually produce more consistent timings.
An object of class "proc_time": see proc.time for details.
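The mechanism described in the details can be written out by hand. The sketch below omits the optional garbage collection and any error handling, so it illustrates the idea rather than reproducing system.time exactly; the timed expression is arbitrary.

```r
# What system.time() does, written out by hand (without the optional gc()).
start <- proc.time()
x <- sum(sqrt(seq_len(1e5)))
manual <- proc.time() - start

# The same measurement via system.time().
timed <- system.time(sum(sqrt(seq_len(1e5))))

class(timed)
names(timed)
timed[["elapsed"]]   # wall-clock seconds for the expression
```

Individual components such as "elapsed" can be extracted by name for logging or comparison.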
proc.time, time which is for time series.

setTimeLimit to limit the (CPU/elapsed) time R is allowed to use.

Sys.time to get the current date & time.
require(stats)
system.time(for(i in 1:100) mad(runif(1000)))
## Not run: 
exT <- function(n = 10000) {
  # Purpose: Test if system.time works ok;   n: loop size
  system.time(for(i in 1:n) x <- mean(rt(1000, df = 4)))
}
#-- Try to interrupt one of the following (using Ctrl-C / Escape):
exT()                 #- about 4 secs on a 2.5GHz Xeon
system.time(exT())    #~ +/- same
## End(Not run)
system2 invokes the OS command specified by command.
system2(command, args = character(),
        stdout = "", stderr = "", stdin = "", input = NULL,
        env = character(), wait = TRUE,
        minimized = FALSE, invisible = TRUE, timeout = 0,
        receive.console.signals = wait)
command | the system command to be invoked, as a character string. |
args | a character vector of arguments to |
stdout, stderr | where output to ‘stdout’ or ‘stderr’ should be sent. Possible values are |
stdin | should input be diverted? |
input | if a character vector is supplied, this is copied one string per line to a temporary file, and the standard input of |
env | character vector of name=value strings to set environment variables. |
wait | a logical (not |
timeout | timeout in seconds, ignored if 0. This is a limit for the elapsed time running |
receive.console.signals | a logical (not |
minimized, invisible | arguments that are accepted on Windows but ignored on this platform, with a warning. |
Unlike system, command is always quoted by shQuote, so it must be a single command without arguments.

For details of how command is found see system.

On Windows, env is only supported for commands such as R and make which accept environment variables on their command line.

Some Unix commands (such as some implementations of ls) change their output if they consider it to be piped or redirected: stdout = TRUE uses a pipe whereas stdout = "some_file_name" uses redirection.

Because of the way it is implemented, on a Unix-alike stderr = TRUE implies stdout = TRUE: a warning is given if this is not what was specified.
When timeout is non-zero, the command is terminated after the given number of seconds. The termination works for typical commands, but is not guaranteed: it is possible to write a program that would keep running after the time is out. Timeouts can only be set with wait = TRUE.

Timeouts cannot be used with interactive commands: the command is run with standard input redirected from ‘/dev/null’ and it must not modify terminal settings. As long as the tty tostop option is disabled, which it usually is by default, the executed command may write to standard output and standard error.

receive.console.signals = TRUE is useful when running asynchronous processes (using wait = FALSE) to implement a synchronous operation. In all other cases it is recommended to use the default.
If stdout = TRUE or stderr = TRUE, a character vector giving the output of the command, one line per character string. (Output lines of more than 8095 bytes will be split.) If the command could not be run an R error is generated. If command runs but gives a non-zero exit status this will be reported with a warning and in the attribute "status" of the result: an attribute "errmsg" may also be available.

In other cases, the return value is an error code (0 for success), given the invisible attribute (so needs to be printed explicitly). If the command could not be run for any reason, the value is 127 and a warning is issued (as from R 3.5.0). Otherwise if wait = TRUE the value is the exit status returned by the command, and if wait = FALSE it is 0 (the conventional success value).

If the command times out, a warning is issued and the exit status is 124.
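As a portable sketch of capturing output, the following runs the Rscript executable that belongs to the running R installation. It assumes that Rscript is installed in R.home("bin"), which is the usual layout; the evaluated expression is arbitrary.

```r
# Locate the Rscript executable of the running R installation.
rscript <- file.path(R.home("bin"), "Rscript")

# Each argument is quoted individually; command itself is quoted by system2().
out <- system2(rscript,
               args = c("--vanilla", "-e", shQuote("cat(1 + 1, fill = TRUE)")),
               stdout = TRUE)
out

# The "status" attribute is only attached when the exit status is non-zero.
attr(out, "status")
```

Passing the arguments as a character vector, with shQuote applied where needed, avoids building a single command line by hand.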
system2 is a more portable and flexible interface than system. It allows redirection of output without needing to invoke a shell on Windows, a portable way to set environment variables for the execution of command, and finer control over the redirection of stdout and stderr. Conversely, system (and shell on Windows) allows the invocation of arbitrary command lines.

There is no guarantee that if stdout and stderr are both TRUE or the same file that the two streams will be interleaved in order. This depends on both the buffering used by the command and the OS.
Given a matrix or data.frame x, t returns the transpose of x.
t(x)
x | a matrix or data frame, typically. |
This is a generic function for which methods can be written. The description here applies to the default and "data.frame" methods.

A data frame is first coerced to a matrix: see as.matrix. When x is a vector, it is treated as a column, i.e., the result is a 1-row matrix.

A matrix, with dim and dimnames constructed appropriately from those of x, and other attributes except names copied across.
The conjugate transpose of a complex matrix $A$, denoted $A^H$ or $A^*$, is computed as Conj(t(A)).
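A small worked example of the conjugate transpose and of the vector case mentioned in the details; the matrix entries are chosen only for illustration.

```r
# A small complex matrix (values chosen only for illustration).
A <- matrix(c(1+2i, 3-1i, 0+1i, 2+0i), nrow = 2)

# Conjugate transpose: transpose, then conjugate every entry.
AH <- Conj(t(A))
AH

# A %*% AH is Hermitian, i.e. equal to its own conjugate transpose.
H <- A %*% AH
all.equal(H, Conj(t(H)))

# A plain vector is treated as a column, so t() gives a 1-row matrix.
dim(t(1:3))
```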
Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988) The New S Language. Wadsworth & Brooks/Cole.

aperm for permuting the dimensions of arrays.
a <- matrix(1:30, 5, 6)
ta <- t(a) ##-- i.e., a[i, j] == ta[j, i] for all i,j :
for(j in seq(ncol(a)))
  if(! all(a[, j] == ta[j, ])) stop("wrong transpose")
table uses cross-classifying factors to build a contingency table of the counts at each combination of factor levels.
table(...,
      exclude = if (useNA == "no") c(NA, NaN),
      useNA = c("no", "ifany", "always"),
      dnn = list.names(...), deparse.level = 1)

as.table(x, ...)
is.table(x)

## S3 method for class 'table'
as.data.frame(x, row.names = NULL, ...,
              responseName = "Freq", stringsAsFactors = TRUE,
              sep = "", base = list(LETTERS))
... | one or more objects which can be interpreted as factors (including numbers or character strings), or a |
exclude | levels to remove for all factors in |
useNA | whether to include |
dnn | the names to be given to the dimensions in the result (the dimnames names). |
deparse.level | controls how the default |
x | an arbitrary R object, or an object inheriting from class |
row.names | a character vector giving the row names for the data frame. |
responseName | the name to be used for the column of table entries, usually counts. |
stringsAsFactors | logical: should the classifying factors be returned as factors (the default) or character vectors? |
sep, base | passed to |
If the argument dnn is not supplied, the internal function list.names is called to compute the ‘dimname names’ as follows: If ... is one list with its own names(), these names are used. Otherwise, if the arguments in ... are named, those names are used. For the remaining arguments, deparse.level = 0 gives an empty name, deparse.level = 1 uses the supplied argument if it is a symbol, and deparse.level = 2 will deparse the argument.

Only when exclude is specified (i.e., not by default) and non-empty, will table potentially drop levels of factor arguments.
useNA controls if the table includes counts of NA values: the allowed values correspond to never ("no"), only if the count is positive ("ifany") and even for zero counts ("always"). Note the somewhat “pathological” case of two different kinds of NAs which are treated differently, depending on both useNA and exclude, see d.patho in the ‘Examples:’ below.

Both exclude and useNA operate on an “all or none” basis. If you want to control the dimensions of a multiway table separately, modify each argument using factor or addNA.

Non-factor arguments a are coerced via factor(a, exclude=exclude). Since R 3.4.0, care is taken not to count the excluded values (where they were included in the NA count, previously).
The summary method for class "table" (used for objects created by table or xtabs) gives basic information and performs a chi-squared test for independence of factors (note that the function chisq.test currently only handles 2-d tables).
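A brief sketch of the summary method on a two-way table built from the warpbreaks data set (from package datasets). The component names n.cases and p.value are those used by current versions of R and are stated here as an assumption, since they are not listed in this help page.

```r
# Two-way table of the built-in warpbreaks data (from package datasets).
tab <- with(warpbreaks, table(wool, tension))

# summary() reports the number of cases and factors and a chi-squared test.
s <- summary(tab)
s

# Components of the result, as named in current versions of R (an assumption).
s$n.cases
s$p.value
```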
table() returns a contingency table, an object of class "table", an array of integer values. Note that unlike S the result is always an array, a 1D array if one factor is given.

as.table and is.table coerce to and test for contingency table, respectively.

The as.data.frame method for objects inheriting from class "table" can be used to convert the array-based representation of a contingency table to a data frame containing the classifying factors and the corresponding entries (the latter as component named by responseName). This is the inverse of xtabs.
Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988) The New S Language. Wadsworth & Brooks/Cole.

tabulate is the underlying function and allows finer control.

Use ftable for printing (and more) of multidimensional tables. margin.table, prop.table, addmargins.

addNA for constructing factors with NA as a level.

xtabs for cross tabulation of data frames with a formula interface.
require(stats) # for rpois and xtabs
## Simple frequency distribution
table(rpois(100, 5))
## Check the design:
with(warpbreaks, table(wool, tension))
table(state.division, state.region)

# simple two-way contingency table
with(airquality, table(cut(Temp, quantile(Temp)), Month))

a <- letters[1:3]
table(a, sample(a))                    # dnn is c("a", "")
table(a, sample(a), dnn = NULL)        # dimnames() have no names
table(a, sample(a), deparse.level = 0) # dnn is c("", "")
table(a, sample(a), deparse.level = 2) # dnn is c("a", "sample(a)")

## xtabs() <-> as.data.frame.table() :
UCBAdmissions ## already a contingency table
DF <- as.data.frame(UCBAdmissions)
class(tab <- xtabs(Freq ~ ., DF)) # xtabs & table
## tab *is* "the same" as the original table:
all(tab == UCBAdmissions)
all.equal(dimnames(tab), dimnames(UCBAdmissions))

a <- rep(c(NA, 1/0:3), 10)
table(a)                 # does not report NA's
table(a, exclude = NULL) # reports NA's
b <- factor(rep(c("A","B","C"), 10))
table(b)
table(b, exclude = "B")
d <- factor(rep(c("A","B","C"), 10), levels = c("A","B","C","D","E"))
table(d, exclude = "B")
print(table(b, d), zero.print = ".")

## NA counting:
is.na(d) <- 3:4
d. <- addNA(d)
d.[1:7]
table(d.) # ", exclude = NULL" is not needed
## i.e., if you want to count the NA's of 'd', use
table(d, useNA = "ifany")

## "pathological" case:
d.patho <- addNA(c(1,NA,1:2,1:3))[-7]; is.na(d.patho) <- 3:4
d.patho
## just 3 consecutive NA's ? --- well, have *two* kinds of NAs here :
as.integer(d.patho) # 1 4 NA NA 1 2
##
## In R >= 3.4.0, table() allows to differentiate:
table(d.patho)                   # counts the "unusual" NA
table(d.patho, useNA = "ifany")  # counts all three
table(d.patho, exclude = NULL)   #  (ditto)
table(d.patho, exclude = NA)     # counts none

## Two-way tables with NA counts. The 3rd variant is absurd, but shows
## something that cannot be done using exclude or useNA.
with(airquality, table(OzHi = Ozone > 80, Month, useNA = "ifany"))
with(airquality, table(OzHi = Ozone > 80, Month, useNA = "always"))
with(airquality, table(OzHi = Ozone > 80, addNA(Month)))
tabulate takes the integer-valued vector bin and counts the number of times each integer occurs in it.
tabulate(bin, nbins = max(1, bin, na.rm = TRUE))
bin | a numeric vector (of positive integers), or a factor. Long vectors are supported. |
nbins | the number of bins to be used. |
tabulate is the workhorse for the table function.

If bin is a factor, its internal integer representation is tabulated.

If the elements of bin are numeric but not integers, they are truncated by as.integer.

An integer-valued integer or double vector (without names). There is a bin for each of the values 1, ..., nbins; values outside that range and NAs are (silently) ignored.

On 64-bit platforms bin can have $2^{31}$ or more elements (i.e., length(bin) > .Machine$integer.max), and hence a count could exceed the maximum integer. For this reason, the return value is of type double for such long bin vectors.
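The truncation and the silent dropping of values can be seen in a few small calls; the input values are chosen only for illustration.

```r
# Non-integer values are truncated by as.integer(): 1.9 -> 1, 2.1 -> 2, 2.9 -> 2.
truncated <- tabulate(c(1.9, 2.1, 2.9))
truncated

# NA and values outside 1, ..., nbins are silently dropped.
counts <- tabulate(c(1, NA, 3, 7), nbins = 4)
counts

# For a factor the integer codes are tabulated, so the counts match those of
# table() apart from the missing names.
f <- factor(c("b", "a", "b", "c"))
same_counts <- identical(tabulate(f, nbins = nlevels(f)), as.vector(table(f)))
same_counts
```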
tabulate(c(2,3,5))
tabulate(c(2,3,3,5), nbins = 10)
tabulate(c(-2,0,2,3,3,5))  # -2 and 0 are ignored
tabulate(c(-2,0,2,3,3,5), nbins = 3)
tabulate(factor(letters[1:10]))
Tailcall and Exec

Tailcall and Exec allow writing more stack-space-efficient recursive functions in R.
Tailcall(FUN, ...)
Exec(expr, envir)
FUN | a function or a non-empty character string naming the function to be called. |
... | all the arguments to be passed. |
expr | a call expression. |
envir | environment for evaluating |
Tailcall evaluates a call to FUN with arguments ... in the current environment, and Exec evaluates the call expr in environment envir. If a Tailcall or Exec expression appears in tail position in an R function, and if there are no on.exit expressions set, then the evaluation context of the new calls replaces the currently executing call context with a new one. If the requirements for context re-use are not met, then evaluation proceeds in the standard way adding another context to the stack.

Using Tailcall it is possible to define tail-recursive functions that do not grow the evaluation stack. Exec can be used to simplify the call stack for functions that create and then evaluate an expression.

Because of lazy evaluation of arguments in R it may be necessary to force evaluation of some arguments to avoid accumulating deferred evaluations.

This tail call optimization has the advantage of not growing the call stack and permitting arbitrarily deep tail recursions. It does also mean that stack traces produced by traceback or sys.calls will only show the call specified by Tailcall or Exec, not the previous call whose stack entry has been replaced.

Tailcall and Exec are experimental and may be changed or dropped in future released versions of R.
## tail-recursive log10-factorial
lfact <- function(n) {
    lfact_iter <- function(val, n) {
        if (n <= 0)
            val
        else {
            val <- val + log10(n) # forces val
            Tailcall(lfact_iter, val, n - 1)
        }
    }
    lfact_iter(0, n)
}
10 ^ lfact(3)
lfact(100000)

## simplified variant of do.call using Exec:
docall <- function (what, args, quote = FALSE) {
    if (!is.list(args))
        stop("second argument must be a list")
    if (quote)
        args <- lapply(args, enquote)
    Exec(as.call(c(list(substitute(what)), args)), parent.frame())
}
## the call stack does not contain the call to docall:
docall(function() sys.calls(), list()) |>
    Find(function(x) identical(x[[1]], quote(docall)), x = _)
## contrast to do.call:
do.call(function(x) sys.calls(), list()) |>
    Find(function(x) identical(x[[1]], quote(do.call)), x = _)
Apply a function to each cell of a ragged array, that is to each (non-empty) group of values or data rows given by a unique combination of the levels of certain factors.
tapply(X, INDEX, FUN = NULL, ..., default = NA, simplify = TRUE)
X | an R object for which a |
INDEX | a |
FUN | a function (or name of a function) to be applied, or |
... | optional arguments to |
default | (only in the case of simplification to an array) the value with which the array is initialized as |
simplify | logical; if |
If FUN is not NULL, it is passed to match.fun, and hence it can be a function or a symbol or character string naming a function.

When FUN is present, tapply calls FUN for each cell that has any data in it. If FUN returns a single atomic value for each such cell (e.g., functions mean or var) and when simplify is TRUE, tapply returns a multi-way array containing the values, and NA for the empty cells. The array has the same number of dimensions as INDEX has components; the number of levels in a dimension is the number of levels (nlevels()) in the corresponding component of INDEX. Note that if the return value has a class (e.g., an object of class "Date") the class is discarded.

simplify = TRUE always returns an array, possibly 1-dimensional.

If FUN does not return a single atomic value, tapply returns an array of mode list whose components are the values of the individual calls to FUN, i.e., the result is a list with a dim attribute.

When there is an array answer, its dimnames are named by the names of INDEX and are based on the levels of the grouping factors (possibly after coercion).

For a list result, the elements corresponding to empty cells are NULL.
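The loss of the class on simplification can be seen with dates. The data below are made up for illustration, and re-applying the class afterwards is just one possible workaround, not something tapply does itself.

```r
# Illustrative data: three dates in two groups.
d <- as.Date(c("2024-01-01", "2024-03-01", "2024-02-01"))
g <- c("a", "a", "b")

# The latest date per group: the "Date" class is lost in the simplified array.
latest <- tapply(d, g, max)
class(latest)

# One way to get dates back: convert the plain numbers and restore the names.
latest_dates <- as.Date(as.vector(latest), origin = "1970-01-01")
names(latest_dates) <- names(latest)
latest_dates
```

Using simplify = FALSE instead keeps each result as a "Date" object inside a list array.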
The array2DF function can be used to convert the array returned by tapply into a data frame, which may be more convenient for further analysis.

Optional arguments to FUN supplied by the ... argument are not divided into cells. It is therefore inappropriate for FUN to expect additional arguments with the same length as X.
Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988) The New S Language. Wadsworth & Brooks/Cole.

the convenience functions by and aggregate (using tapply); apply, lapply with its versions sapply and mapply.

array2DF to convert the result into a data frame.
require(stats)groups <- as.factor(rbinom(32, n = 5, prob = 0.4))tapply(groups, groups, length) #- is almost the same astable(groups)## contingency table from data.frame : array with named dimnamestapply(warpbreaks$breaks, warpbreaks[,-1], sum)tapply(warpbreaks$breaks, warpbreaks[, 3, drop = FALSE], sum)n <- 17; fac <- factor(rep_len(1:3, n), levels = 1:5)table(fac)tapply(1:n, fac, sum)tapply(1:n, fac, sum, default = 0) # maybe more desirabletapply(1:n, fac, sum, simplify = FALSE)tapply(1:n, fac, range)tapply(1:n, fac, quantile)tapply(1:n, fac, length) ## NA'stapply(1:n, fac, length, default = 0) # == table(fac)## example of ... argument: find quarterly meanstapply(presidents, cycle(presidents), mean, na.rm = TRUE)ind <- list(c(1, 2, 2), c("A", "A", "B"))table(ind)tapply(1:3, ind) #-> the split vectortapply(1:3, ind, sum)## Some assertions (not held by all patch propsals):nq <- names(quantile(1:5))stopifnot( identical(tapply(1:3, ind), c(1L, 2L, 4L)), identical(tapply(1:3, ind, sum), matrix(c(1L, 2L, NA, 3L), 2, dimnames = list(c("1", "2"), c("A", "B")))), identical(tapply(1:n, fac, quantile)[-1], array(list(`2` = structure(c(2, 5.75, 9.5, 13.25, 17), names = nq), `3` = structure(c(3, 6, 9, 12, 15), names = nq), `4` = NULL, `5` = NULL), dim=4, dimnames=list(as.character(2:5)))))
addTaskCallback registers an R function that is to be called each time a top-level task is completed.
removeTaskCallback un-registers a function that was registered earlier via addTaskCallback.
These provide low-level access to the internal/native mechanism for managing task-completion actions. One can use taskCallbackManager at the R-language level to manage R functions that are called at the completion of each task. This is easier and more direct.
addTaskCallback(f, data = NULL, name = character())
removeTaskCallback(id)
f | the function that is to be invoked each time a top-level task is successfully completed. This is called with 5 or 4 arguments depending on whether |
data | if specified, this is the 5-th argument in the call to the callback function |
id | a string or an integer identifying the element in the internal callback list to be removed. Integer indices are 1-based, i.e., the first element is 1. The names of currently registered handlers are available using |
name | character: names to be used. |
Top-level tasks are individual expressions rather than entire lines of input. Thus an input line of the form expression1 ; expression2 will give rise to 2 top-level tasks.
A top-level task callback is called with the expression for the top-level task, the result of the top-level task, a logical value indicating whether it was successfully completed or not (always TRUE at present), and a logical value indicating whether the result was printed or not. If the data argument was specified in the call to addTaskCallback, that value is given as the fifth argument.
The callback function should return a logical value. If the value is FALSE, the callback is removed from the task list and will not be called again by this mechanism. If the function returns TRUE, it is kept in the list and will be called on the completion of the next top-level task.
addTaskCallback returns an integer value giving the position in the list of task callbacks that this new callback occupies. This is only the current position of the callback. It can be used to remove the entry as long as no other values are removed from earlier positions in the list first.
removeTaskCallback returns a logical value indicating whether the specified element was removed. This can fail (i.e., return FALSE) if an incorrect name or index is given that does not correspond to the name or position of an element in the list.
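Because the integer position can become stale, removing a callback by name is often more robust. The following is a minimal sketch of that pattern; the function report_task and the name "reportTask" are made up for illustration. Whether the callback actually fires depends on how the code is run, since it is invoked only on completion of top-level tasks.

```r
# A hypothetical callback that reports each completed top-level task.
# It accepts the four standard arguments and returns TRUE to stay registered.
report_task <- function(expr, value, ok, visible) {
  cat("finished:", deparse(expr)[1], "\n")
  TRUE
}

# Register under an illustrative name and check that the name is listed.
id <- addTaskCallback(report_task, name = "reportTask")
registered <- "reportTask" %in% getTaskCallbackNames()

# Removing by name does not depend on the callback's current position.
removed <- removeTaskCallback("reportTask")

# A second attempt finds nothing under that name and so reports failure.
removed_again <- removeTaskCallback("reportTask")
```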
There is also C-level access to top-level task callbacks to allow C routines rather than R functions to be used.
getTaskCallbackNames
taskCallbackManager
https://developer.r-project.org/TaskHandlers.pdf
times <- function(total = 3, str = "Task a") {
  ctr <- 0
  function(expr, value, ok, visible) {
    ctr <<- ctr + 1
    cat(str, ctr, "\n")
    keep.me <- (ctr < total)
    if (!keep.me)
      cat("handler removing itself\n")
    # return
    keep.me
  }
}

# add the callback that will work for
# 4 top-level tasks and then remove itself.
n <- addTaskCallback(times(4))

# now remove it, assuming it is still first in the list.
removeTaskCallback(n)

## See how the handler is called every time till "self destruction":
addTaskCallback(times(4)) # counts as once already
sum(1:10) ; mean(1:3) # two more
sinpi(1)              # 4th - and "done"
cospi(1)
tanpi(1)
This provides an entirely R-language mechanism for managing callbacks or actions that are invoked at the conclusion of each top-level task. Essentially, we register a single R function from this manager with the underlying, native task-callback mechanism and this function handles invoking the other R callbacks under the control of the manager. The manager consists of a collection of functions that access shared variables to manage the list of user-level callbacks.
taskCallbackManager(handlers = list(), registered = FALSE, verbose = FALSE)
handlers | this can be a list of callbacks in which each elementis a list with an element named |
registered | a logical value indicating whether the |
verbose | a logical value, which if |
A list containing 6 functions:
add() | register a callback with this manager, giving the function, an optional 5-th argument, an optional name by which the callback is stored in the list, and a |
remove() | remove an element from the manager's collection of callbacks, either by name or position/index. |
evaluate() | the ‘real’ callback function that is registered with the C-level dispatch mechanism and which invokes each of the R-level callbacks within this manager's control. |
suspend() | a function to set the suspend state of the manager. If it is suspended, none of the callbacks will be invoked when a task is completed. One sets the state by specifying a logical value for the |
register() | a function to register the |
callbacks() | returns the list of callbacks being maintained by this manager. |
Duncan Temple Lang (2001) Top-level Task Callbacks in R, https://developer.r-project.org/TaskHandlers.pdf
addTaskCallback, removeTaskCallback, getTaskCallbackNames and the reference.
# create the manager
h <- taskCallbackManager()

# add a callback
h$add(function(expr, value, ok, visible) {
        cat("In handler\n")
        return(TRUE)
      }, name = "simpleHandler")

# look at the internal callbacks.
getTaskCallbackNames()

# look at the R-level callbacks
names(h$callbacks())

removeTaskCallback("R-taskCallbackManager")
This provides a way to get the names (or identifiers) for the currently registered task callbacks that are invoked at the conclusion of each top-level task. These identifiers can be used to remove a callback.
getTaskCallbackNames()
A character vector giving the name for each of the registered callbacks which are invoked when a top-level task is completed successfully. Each name is the one used when registering the callback in the call to addTaskCallback.
One can use taskCallbackManager to manage user-level task callbacks, i.e., S-language functions, entirely within the S language and access the names more directly.
addTaskCallback, removeTaskCallback, taskCallbackManager
https://developer.r-project.org/TaskHandlers.pdf
n <- addTaskCallback(function(expr, value, ok, visible) {
                       cat("In handler\n")
                       return(TRUE)
                     }, name = "simpleHandler")

getTaskCallbackNames()

# now remove it by name
removeTaskCallback("simpleHandler")

h <- taskCallbackManager()
h$add(function(expr, value, ok, visible) {
        cat("In handler\n")
        return(TRUE)
      }, name = "simpleHandler")
getTaskCallbackNames()
removeTaskCallback("R-taskCallbackManager")
tempfile returns a vector of character strings which can be used as names for temporary files.
tempfile(pattern = "file", tmpdir = tempdir(), fileext = "")
tempdir(check = FALSE)
pattern | a non-empty character vector giving the initial partof the name. |
tmpdir | a non-empty character vector giving the directory name. |
fileext | a non-empty character vector giving the file extension. |
check |
|
The length of the result is the maximum of the lengths of the three arguments; values of shorter arguments are recycled.
The names are very likely to be unique among calls to tempfile in an R session and across simultaneous R sessions (unless tmpdir is specified). The filenames are guaranteed not to be currently in use.
The file name is made by concatenating the path given by tmpdir, the pattern string, a random string in hex and a suffix of fileext.
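The construction and recycling rules described above can be checked directly. This is a small sketch; the pattern "report_" and the extensions are arbitrary illustrative choices.

```r
# Build one name from an illustrative pattern and extension.
f <- tempfile(pattern = "report_", fileext = ".csv")

# The final path component begins with the pattern and ends with the extension;
# the random hex string sits in between.
starts_ok <- startsWith(basename(f), "report_")
ends_ok   <- endsWith(f, ".csv")

# Arguments are recycled to the longest length: two extensions give two names.
both <- tempfile("plot", fileext = c(".ps", ".pdf"))
```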
By default, tmpdir will be the directory given by tempdir(). This will be a subdirectory of the per-session temporary directory found by the following rule when the R session is started. The environment variables TMPDIR, TMP and TEMP are checked in turn and the first found which points to a writable directory is used: if none succeeds ‘/tmp’ is used. The path must not contain spaces.
Note that setting any of these environment variables in the R session has no effect on tempdir(): the per-session temporary directory is created before the interpreter is started.
For tempfile a character vector giving the names of possible (temporary) files. Note that no files are generated by tempfile.
For tempdir, the path of the per-session temporary directory.
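Since tempfile only proposes a name, the caller creates the file and is responsible for removing it. A minimal sketch of that life cycle, using made-up file contents:

```r
path <- tempfile(fileext = ".txt")
existed_before <- file.exists(path)   # tempfile() has not created anything

# The file comes into existence only when something is written to it.
writeLines(c("first line", "second line"), path)
contents <- readLines(path)

# Remove the file once it is no longer needed.
unlink(path)
exists_after <- file.exists(path)
```

Inside a function, wrapping the unlink() call in on.exit() is a common way to make sure the cleanup also happens when an error occurs.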
On Windows, both will use a backslash as the path separator.
On a Unix-alike, the value will be an absolute path (unless tmpdir is set to a relative path), but it need not be canonical (see normalizePath) and on macOS it often is not.
R processes forked by functions such as mclapply and makeForkCluster in package parallel share a per-session temporary directory. Further, the ‘guaranteed not to be currently in use’ applies only at the time of asking, and two children could ask simultaneously. This is circumvented by ensuring that tempfile calls in different children try different names.
The final component of tempdir() is created by the POSIX system call mkdtemp, or if this is not available (e.g. on Windows) a version derived from the source code of GNU glibc.
It will be of the form ‘RtmpXXXXXX’ where the last 6 characters are replaced in a platform-specific way. POSIX only requires that the replacements be ASCII, which allows . (so the value may appear to have a file extension) and regexp metacharacters such as +. Most commonly the replacements are from the regexp pattern [A-Za-z0-9], but . has been seen.
Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988) The New S Language. Wadsworth & Brooks/Cole.
unlink for deleting files.
tempfile(c("ab", "a b c"))   # give file name with spaces in!

tempfile("plot", fileext = c(".ps", ".pdf"))

tempdir() # works on all platforms with a platform-dependent result

## Show how 'check' is working on some platforms:
if(exists("I'm brave") && `I'm brave` &&
   identical(.Platform$OS.type, "unix") && grepl("^/tmp/", tempdir())) {
  cat("Current tempdir(): ", tempdir(), "\n")
  cat("Removing it :", file.remove(tempdir()),
      "; dir.exists(tempdir()):", dir.exists(tempdir()), "\n")
  cat("and now tempdir(check = TRUE) :", tempdir(check = TRUE),"\n")
}
Input and output text connections.
textConnection(object, open = "r", local = FALSE,
               name = deparse1(substitute(object)),
               encoding = c("", "bytes", "UTF-8"))

textConnectionValue(con)
object | character. A description of the connection. For an input this is an R character vector object, and for an output connection the name for the R character vector to receive the output, or |
open | character string. Either |
local | logical. Used only for output connections. If |
name | a |
encoding | character string, partially matched. Used only for input connections. How marked strings in |
con | an output text connection. |
An input text connection is opened and the character vector is copied at the time the connection object is created, and close destroys the copy. object should be the name of a character vector: however, short expressions will be accepted provided they deparse to less than 60 bytes.
An output text connection is opened and creates an R character vector of the given name in the user's workspace or in the calling environment, depending on the value of the local argument. This object will at all times hold the completed lines of output to the connection, and isIncomplete will indicate if there is an incomplete final line. Closing the connection will output the final line, complete or not. (A line is complete once it has been terminated by end-of-line, represented by "\n" in R.) The output character vector has locked bindings (see lockBinding) until close is called on the connection. The character vector can also be retrieved via textConnectionValue, which is the only way to do so if object = NULL. If the current locale is detected as Latin-1 or UTF-8, non-ASCII elements of the character vector will be marked accordingly (see Encoding).
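The object = NULL case and the notion of an incomplete final line can be sketched as follows. The strings written are arbitrary; the calls used are those described above.

```r
# With object = NULL no named character vector is created.
tc <- textConnection(NULL, open = "w")

writeLines(c("line one", "line two"), tc)
cat("partial", file = tc)            # no end-of-line yet

so_far  <- textConnectionValue(tc)   # holds the completed lines only
pending <- isIncomplete(tc)          # an unterminated final line is pending

cat(" line\n", file = tc)            # terminate the pending line
final <- textConnectionValue(tc)

close(tc)
```

The value has to be retrieved before close() is called, since the connection is destroyed at that point.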
Opening a text connection with open = "a" will attempt to append to an existing character vector with the given name in the user's workspace or the calling environment. If none is found (even if an object exists of the right name but the wrong type) a new character vector will be created, with a warning.
You cannot seek on a text connection, and seek will always return zero as the position.
Text connections have slightly unusual semantics: they are always open, and throwing away an input text connection without closing it (so it gets garbage-collected) does not give a warning.
For textConnection, a connection object of class "textConnection" which inherits from class "connection".
For textConnectionValue, a character vector.
As output text connections keep the character vector up to date line-by-line, they are relatively expensive to use, and it is often better to use an anonymous file() connection to collect output.
On (rare) platforms where vsnprintf does not return the needed length of output there is a 100,000 character limit on the length of line for output connections: longer lines will be truncated with a warning.
Chambers, J. M. (1998) Programming with Data. A Guide to the S Language. Springer.
[S has input text connections only.]
connections, showConnections, pushBack, capture.output.
zz <- textConnection(LETTERS)
readLines(zz, 2)
scan(zz, "", 4)
pushBack(c("aa", "bb"), zz)
scan(zz, "", 4)
close(zz)

zz <- textConnection("foo", "w")
writeLines(c("testit1", "testit2"), zz)
cat("testit3 ", file = zz)
isIncomplete(zz)
cat("testit4\n", file = zz)
isIncomplete(zz)
close(zz)
foo

# capture R output: use part of example from help(lm)
zz <- textConnection("foo", "w")
ctl <- c(4.17, 5.58, 5.18, 6.11, 4.5, 4.61, 5.17, 4.53, 5.33, 5.14)
trt <- c(4.81, 4.17, 4.41, 3.59, 5.87, 3.83, 6.03, 4.89, 4.32, 4.69)
group <- gl(2, 10, 20, labels = c("Ctl", "Trt"))
weight <- c(ctl, trt)
sink(zz)
anova(lm.D9 <- lm(weight ~ group))
cat("\nSummary of Residuals:\n\n")
summary(resid(lm.D9))
sink()
close(zz)
cat(foo, sep = "\n")
Tilde is used to separate the left- and right-hand sides in a model formula.
y ~ model
y, model | symbolic expressions. |
The left-hand side is optional, and one-sided formulae are used in some contexts.
A formula has mode call. It can be subsetted by [[: the components are ~, the left-hand side (if present) and the right-hand side in that order. (Thus one-sided formulae have two components.)
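The component structure can be inspected directly. In this short sketch the symbols y, x and z are placeholders only; a formula is unevaluated, so they need not exist as objects.

```r
f <- y ~ x + z

f_mode <- mode(f)   # a formula is a call to `~`
op  <- f[[1]]       # the symbol ~
lhs <- f[[2]]       # left-hand side: y
rhs <- f[[3]]       # right-hand side: x + z

# A one-sided formula has only two components: ~ and the right-hand side.
g <- ~ x
n_components <- length(g)
</gr_insert_placeholder>
```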
Chambers, J. M. and Hastie, T. J. (1992) Statistical models. Chapter 2 of Statistical Models in S, eds J. M. Chambers and T. J. Hastie, Wadsworth & Brooks/Cole.
Information about time zones in R. Sys.timezone returns the name of the current time zone.
Sys.timezone(location = TRUE)

OlsonNames(tzdir = NULL)
location | logical. Defunct, with a warning if |
tzdir | the time-zone database to be used: the default is to try known locations until one is found. |
Time zones are a system-specific topic, but these days almost all R platforms use similar underlying code, used by Linux, macOS, Solaris, AIX and FreeBSD, and installed with R on Windows. (Unfortunately there are many system-specific errors in the implementations.) It is possible to use the R sources' version of the code on Unix-alikes as well as on Windows: this is the default on macOS.
It should be possible to set the current time zone via the environment variable TZ: see the section on ‘Time zone names’ for suitable values. Sys.timezone() will return the value of TZ if set initially (and on some OSes it is always set), otherwise it will try to retrieve from the OS a value which if set for TZ would give the initial time zone. (‘Initially’ means before any time-zone functions are used: if TZ is being set to override the OS setting or if the ‘try’ does not get this right, it should be set before the R process is started or (probably early enough) in file .Rprofile.)
If TZ is set but invalid, most platforms default to ‘UTC’, the time zone colloquially known as ‘GMT’ (see https://en.wikipedia.org/wiki/Coordinated_Universal_Time). (Some but not all platforms will give a warning for invalid values.) If it is unset or empty the system time zone is used (the one returned by Sys.timezone).
Time zones did not come into use until the middle of the nineteenth century and were not widely adopted until the twentieth, and daylight saving time (DST, also known as summer time) was first introduced in the early twentieth century, most widely in 1916. Over the last 100 years places have changed their affiliation between major time zones, have opted out of (or in to) DST in various years or adopted DST rule changes late or not at all. (For example, the UK experimented with DST throughout 1971, only.) In a few countries (one is the Irish Republic) it is the summer time which is the ‘standard’ time and a different name is used in winter. And there can be multiple changes during a year, for example for Ramadan.
A quite common system implementation of POSIXct was as signed 32-bit integers and so only went back to the end of 1901: on such systems R assumes that dates prior to that are in the same time zone as they were in 1902. Most of the world had not adopted time zones by 1902 (so used local ‘mean time’ based on longitude) but for a few places there had been time-zone changes before then. 64-bit representations are becoming by far the most common; unfortunately on some 64-bit OSes the database information is 32-bit and so only available for the range 1901–2038, and incompletely for the end years.
When a time zone location is first found in a session its value is cached in object .sys.timezone in the base environment.
Sys.timezone returns an OS-specific character string, possibly NA or an empty string (which on some OSes means ‘UTC’). This will be a location such as "Europe/London" if one can be ascertained.
A time zone region may be known by several names: for example ‘"Europe/London"’ may also be known as ‘GB’, ‘GB-Eire’, ‘Europe/Belfast’, ‘Europe/Guernsey’, ‘Europe/Isle_of_Man’ and ‘Europe/Jersey’. A few regions are also known by a summary of their time zone, e.g. ‘PST8PDT’ is (on most but not all systems) an alias for ‘America/Los_Angeles’.
OlsonNames returns a character vector, see the examples for typical cases. It may have an attribute "Version", something like ‘"2023a"’. (It does on systems using --with-internal-tzcode and those like Fedora distributing file ‘tzdata.zi’.)
Names "UTC" and its synonym "GMT" are accepted on all platforms.
Where OSes describe their valid time zones can be obscure. The help for the C function tzset can be helpful, but it can also be inaccurate. There is a cumbersome POSIX specification (listed under environment variable TZ at https://pubs.opengroup.org/onlinepubs/9699919799/basedefs/V1_chap08.html#tag_08), which is often at least partially supported, but there are other more user-friendly ways to specify time zones.
Almost all R platforms make use of a time-zone database originally compiled by Arthur David Olson and now managed by IANA, in which the preferred way to refer to a time zone is by a location (typically of a city), e.g., Europe/London, America/Los_Angeles, Pacific/Easter within a ‘time zone region’. Some traditional designations are also allowed such as EST5EDT or GB. (Beware that some of these designations may not be what you expect: in particular EST is a time zone used in Canada without daylight saving time, and not EST5EDT nor (Australian) Eastern Standard Time.) The designation can also be an optional colon prepended to the path to a file giving compiled zone information (and the examples above are all files in a system-specific location). See https://data.iana.org/time-zones/tz-link.html for more details and references. By convention, regions with a unique time-zone history since 1970 have specific names in the database, but those with different earlier histories may not. Each time zone has one or two (the second for ‘summer’) abbreviations used when formatting times.
Increasingly OSes are (optionally or always) not including ‘legacy’ names such as US/Eastern: only names of the forms Continent/City and Etc/... are fully portable.
The abbreviations used have changed over the years: for example France used ‘PMT’ (‘Paris Mean Time’) from 1891 to 1911 then ‘WET/WEST’ up to 1940 and ‘CET/CEST’ from 1946. (In almost all time zones the abbreviations have been stable since 1970.) The POSIX standard allows only one or two abbreviations per time zone, so you may see the current abbreviation(s) used for older times.
For some time zones abbreviations are like ‘-03’ and ‘+0845’: this is done when there is no official abbreviation. (Negative values are behind (West of) UTC, as for the "%z" format for strftime.)
The function OlsonNames returns the time-zone names known to the currently selected Olson/IANA database. The system-specific location in the file system varies, e.g. ‘/usr/share/zoneinfo’ (Linux, macOS, FreeBSD), ‘/usr/share/lib/zoneinfo’ (Solaris, AIX), .... It is likely that there is a file named something like ‘zone1970.tab’ or (older) ‘zone.tab’ under that directory listing the locations known as time-zone names (but not for example EST5EDT). See also https://en.wikipedia.org/wiki/Zone.tab.
Where R was configured with option --with-internal-tzcode (the default on Windows), the database at file.path(R.home("share"), "zoneinfo") is used by default: file ‘VERSION’ in that directory states the version. That option is also the default on macOS but there whichever is more recent of the system database at ‘/var/db/timezone/zoneinfo’ and that distributed with R is used by default. Environment variable TZDIR can be used to give the full path to a different ‘zoneinfo’ database: value "internal" indicates the database from the R sources and "macOS" indicates the system database. (Setting either of those values would not be recognized by other software using TZDIR.)
Setting TZDIR is also supported by the native services on some OSes, e.g. Linux using glibc except in secure modes.
Time zones given by name (via environment variable TZ, in tz arguments to functions such as as.POSIXlt and perhaps the system time zone) are loaded from the currently selected ‘zoneinfo’ database.
On Windows only: An attempt is made (once only per session) to map Windows' idea of the current time zone to a location, following a version of http://unicode.org/repos/cldr/trunk/common/supplemental/windowsZones.xml with additional values deduced from the Windows Registry and documentation. It can be overridden by setting the TZ environment variable before any date-times are used in the session.
Most platforms support time zones of the form ‘Etc/GMT+n’ and ‘Etc/GMT-n’ (possibly also without prefix ‘Etc/’), which assume a fixed offset from UTC (hence no DST). Contrary to some expectations (but consistent with names such as ‘PST8PDT’), negative offsets are times ahead of (East of) UTC, positive offsets are times behind (West of) UTC.
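The sign convention is easy to get backwards, so a small sketch may help. It assumes the platform supports the ‘Etc/GMT+5’ and ‘Etc/GMT-5’ names, which is the common case but is not guaranteed everywhere.

```r
# A fixed instant, expressed in UTC.
noon_utc <- as.POSIXct("2024-01-01 12:00:00", tz = "UTC")

# "+5" in the zone name means five hours BEHIND (West of) UTC ...
west <- format(noon_utc, "%H:%M %z", tz = "Etc/GMT+5")

# ... and "-5" means five hours AHEAD of (East of) UTC.
east <- format(noon_utc, "%H:%M %z", tz = "Etc/GMT-5")
```

Note that the "%z" output uses the opposite sign to the zone name: the zone ‘Etc/GMT+5’ is reported with a negative numeric offset.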
Immediately prior to the advent of legislated time zones, most people used time based on their longitude (or that of a nearby town), known as ‘Local Mean Time’ and abbreviated as ‘LMT’ in the databases: in many countries that was codified with a specific name before the switch to a standard time. For example, Paris codified its LMT as ‘Paris Mean Time’ in 1891 (to be used throughout mainland France) and switched to ‘GMT+0’ in 1911.
Some systems (notably Linux) have a tzselect command which allows the interactive selection of a supported time zone name. On systems using systemd (notably Linux), the OS command timedatectl list-timezones will list all available time zone names.
There is a system-specific upper limit on the number of bytes in (abbreviated) time-zone names which can be as low as 6 (as required by POSIX). Some OSes allow the setting of time zones with names which exceed their limit, and that can crash the R session.
Information about future times is speculative (‘proleptic’): the database provides the best-known information based on current rules set by civil authorities. For the period 1900–1970 those rules (and which of any authority's rules were enacted) are often obscure, and the databases do get corrected frequently.
OlsonNames tries to find an Olson database in known locations. It might not succeed (when it returns an empty vector with a warning) and even if it does it might not locate the database used by the date-time code linked into R. Fortunately names are added rarely and most databases are pretty complete. On the other hand, many names which duplicate other named timezones have been moved to the ‘backward’ list – these are regarded as optional and omitted on minimal installations. Similarly, there are timezones named in file ‘backzone’ which differ only from those in the main lists prior to 1970 – these are usually included but may not be in minimalist systems.
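Given these caveats, code that validates a user-supplied zone name against OlsonNames() should allow for an empty result. The helper is_known_tz below is hypothetical (it is not part of base R) and is only a sketch of one way to do this.

```r
# Hypothetical helper: is each element of 'tz' one of the names in 'known'?
is_known_tz <- function(tz, known = OlsonNames()) {
  !is.na(tz) & tz %in% known
}

# OlsonNames() may warn and return an empty vector if no database is found.
known <- suppressWarnings(OlsonNames())

if (length(known) == 0L) {
  message("no Olson database found; zone names cannot be validated")
} else {
  # "Mars/Olympus_Mons" is a deliberately invalid, made-up name.
  print(is_known_tz(c("Europe/London", "Mars/Olympus_Mons"), known))
}
```

A name that passes this check is known to the database that OlsonNames() located, which, as noted above, is not necessarily the one used by the date-time code itself.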
For many years, the legacy names EST5EDT and PST8PDT were portable, but musl (the C runtime used by Alpine Linux) does not use DST with those names.
This section is of background interest for users of a Unix-alike, but may help if an NA value is returned unexpectedly.
Commercial Unixen such as Solaris and AIX set TZ, so the value when R is started is used.
All other common platforms (Linux, macOS, *BSD) use similar schemes, either derived from tzcode (currently distributed from https://www.iana.org/time-zones) or independently coded (glibc, musl-libc). Such systems read the time-zone information from a file ‘localtime’, usually under ‘/etc’ (but possibly under ‘/usr/local/etc’ or ‘/usr/local/etc/zoneinfo’). As the usual Linux manual page for localtime says
‘Because the time zone identifier is extracted from the symlink target name of ‘/etc/localtime’, this file may not be a normal file or hardlink.’
Nevertheless, some Linux distributions (including the one from which that quote was taken) or sysadmins have chosen to copy a time-zone file to ‘localtime’. For a non-symlink, the ultimate fallback is to compare that file to all files in the time-zone database.
Some Linux platforms provide two other mechanisms which are tried in turn before looking at ‘/etc/localtime’.
‘Modern’ Linux systems use systemd which provides mechanisms to set and retrieve the time zone (amongst other things). There is a command timedatectl to give details. (Unfortunately RHEL/Centos 6.x were not ‘modern’.)
Debian-derived systems since ca 2007 have supplied a file ‘/etc/timezone’. Its format is undocumented but empirically it contains a single line of text naming the time zone.
In each case a sanity check is performed that the time-zone name is the name of a file in the time-zone database. (The systems probably use the time-zone file (symlinked to) ‘/etc/localtime’, but the Sys.timezone code does not check that is the same as the named file in the database. This is deliberate as they may be from different dates.)
Since 2007 there has been considerable disruption over changes to the timings of the DST transitions; these often have short notice and time-zone databases may not be up to date. (Morocco in 2013 announced a change to the end of DST at a day's notice. In 2023 there was chaos in Lebanon as the authorities changed their minds repeatedly and some changes were not widely implemented.)
There have also been changes to the ‘standard’ time with little notice (Kazakhstan switched to a single time zone in Mar 2024 with six weeks' notice), and to whether ‘summer’ or ‘winter’ time is regarded as ‘standard’ (and hence to abbreviations).
On platforms with case-insensitive file systems, time zone names will be case-insensitive. They may or may not be on other platforms and so, for example, "gmt" is valid on some platforms and not on others.
Note that except where replaced, the operation of time zones is an OS service, and even where replaced a third-party database is used and can be updated (see the section on ‘Time zone names’). Incorrect results will never be an R issue, so please ensure that you have the courtesy not to blame R for them.
https://en.wikipedia.org/wiki/Time_zone and https://data.iana.org/time-zones/tz-link.html for extensive sets of links.
https://data.iana.org/time-zones/theory.html for the ‘rules’ of the Olson/IANA database.
Sys.timezone()

str(OlsonNames()) ## typically around six hundred names,
## typically some acronyms/aliases such as "UTC", "NZ", "MET", "Eire", ..., but
## mostly pairs (and triplets) such as "Pacific/Auckland"
table(sl <- grepl("/", OlsonNames()))
OlsonNames()[ !sl ] # the simple ones
head(Osl <- strsplit(OlsonNames()[sl], "/"))
(tOS1 <- table(vapply(Osl, `[[`, "", 1))) # Continents, countries, ...
table(lengths(Osl)) # most are pairs, some triplets
str(Osl[lengths(Osl) >= 3]) # "America" South and North ...
This is a helper function for format to produce a single character string describing an R object.
toString(x, ...)

## Default S3 method:
toString(x, width = NULL, ...)
x | The object to be converted. |
width | Suggestion for the maximum field width. Values of |
... | Optional arguments passed to or from methods. |
This is a generic function for which methods can be written: only the default method is described here. Most methods should honor the width argument to specify the maximum display width (as measured by nchar(type = "width")) of the result.
The default method first converts x to character and then concatenates the elements separated by ", ". If width is supplied and is not NULL, the default method returns the first width - 4 characters of the result with .... appended, if the full result would use more than width characters.
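The truncation rule of the default method can be seen in a minimal sketch; the input 1:10 and the width of 12 are arbitrary values chosen for illustration.

```r
# Default method: elements are coerced to character and joined by ", ".
full <- toString(1:10)

# With a width, a result that would be too long is cut to (width - 4)
# characters and "...." is appended.
short <- toString(1:10, width = 12)

print(full)
print(short)
print(nchar(short))
```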
A character vector of length 1 is returned.
Robert Gentleman
x <- c("a", "b", "aaaaaaaaaaa")
toString(x)
toString(x, width = 8)
A call to trace allows you to insert debugging code (e.g., a call to browser or recover) at chosen places in any function. A call to untrace cancels the tracing. Specified methods can be traced the same way, without tracing all calls to the generic function. Trace code (tracer) can be any R expression. Tracing can be temporarily turned on or off globally by calling tracingState.
trace(what, tracer, exit, at, print, signature,
      where = topenv(parent.frame()), edit = FALSE)
untrace(what, signature = NULL, where = topenv(parent.frame()))

tracingState(on = NULL)
.doTrace(expr, msg)
returnValue(default = NULL)
what | the name, possibly |
tracer | either a function or an unevaluated expression. The function will be called or the expression will be evaluated either at the beginning of the call, or before those steps in the call specified by the argument |
exit | either a |
at | optional numeric vector or list. If supplied, |
print | if |
signature | an optional signature for a method for function |
edit | For complicated tracing, such as tracing within a loop inside the function, you will need to insert the desired calls by editing the body of the function. If so, supply the |
where | where to look for the function to be traced; by default, the top-level environment of the call to An important use of this argument is to trace functions from a package which are “hidden” or called from another package. The namespace mechanism imports the functions to be called (with the exception of functions in the base package). The functions being called are not the same objects seen from the top-level (in general, the imported packages may not even be attached). Therefore, you must ensure that the correct versions are being traced. The way to do this is to set argument |
on | logical; a call to the support function |
expr, msg | arguments to the support function |
default | if |
The trace function operates by constructing a revised version of the function (or of the method, if signature is supplied), and assigning the new object back where the original was found. If only the what argument is given, a line of trace printing is produced for each call to the function (back compatible with the earlier version of trace).
The object constructed by trace is from a class that extends "function" and which contains the original, untraced version. A call to untrace re-assigns this version.
If the argument tracer or exit is the name of a function, the tracing expression will be a call to that function, with no arguments. This is the easiest and most common case, with the functions browser and recover the likeliest candidates; the former browses in the frame of the function being traced, and the latter allows browsing in any of the currently active calls. The arguments tracer and exit are evaluated to see whether they are functions, but only their names are used in the tracing expressions. The lookup is done again when the traced function executes, so it may not be tracer or exit that will be called while tracing.
The tracer or exit argument can also be an unevaluated expression (such as returned by a call to quote or substitute). This expression itself is inserted in the traced function, so it will typically involve arguments or local objects in the traced function. An expression of this form is useful if you only want to interact when certain conditions apply (and in this case you probably want to supply print = FALSE in the call to trace also).
When the at argument is supplied, it can be a vector of integers referring to the substeps of the body of the function (this only works if the body of the function is enclosed in { ...}). In this case tracer is not called on entry, but instead just before evaluating each of the steps listed in at. (Hint: you don't want to try to count the steps in the printed version of a function; instead, look at as.list(body(f)) to get the numbers associated with the steps in function f.)
The at argument can also be a list of integer vectors. In this case, each vector refers to a step nested within another step of the function. For example, at = list(c(3,4)) will call the tracer just before the fourth step of the third step of the function. See the example below.
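The step numbers used by at can be found without tracing anything. The following sketch uses a hypothetical function f, shaped like the one in the examples for this page, and only inspects its body.

```r
# Hypothetical function, mirroring the shape used in the examples of this page.
f <- function(x, y) {
    y <- pmax(y, 0.001)
    if (x > 0) x ^ y else stop("x must be positive")
}

# Top-level steps: [[1]] is the `{` symbol, [[2]] the assignment, [[3]] the if().
steps <- as.list(body(f))
print(steps)

# Sub-steps of step 3: `if`, the condition, the 'yes' branch, the 'no' branch.
sub_steps <- as.list(body(f)[[3]])
print(sub_steps[[4]])   # the stop(...) call, addressed by at = list(c(3, 4))
```

Counting this way avoids miscounting in the printed source, where one step may span several lines.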
Using setBreakpoint (from package utils) may be an alternative, calling trace(...., at, ...).
The exit argument is called during on.exit processing. In an on.exit expression, the experimental returnValue() function may be called to obtain the value about to be returned by the function. Calling this function in other circumstances will give undefined results.
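As a minimal sketch of returnValue() inside an on.exit expression, independent of trace: the function double_it and the variable seen are hypothetical names introduced only for this illustration.

```r
seen <- NULL   # will hold the value observed at exit time

# Hypothetical function: records its own return value during on.exit processing.
double_it <- function(x) {
    on.exit(seen <<- returnValue())
    x * 2
}

res <- double_it(21)
print(res)
print(seen)
```

An exit expression passed to trace runs in the same on.exit context, so it can call returnValue() in the same way.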
An intrinsic limitation in the exit argument is that it won't work if the function itself uses on.exit with add = FALSE (the default), since the existing calls will override the one supplied by trace.
Tracing does not nest. Any call to trace replaces previously traced versions of that function or method (except for edited versions as discussed below), and untrace always restores an untraced version. (Allowing nested tracing has too many potentials for confusion and for accidentally leaving traced versions behind.)
When the edit argument is used repeatedly with no call to untrace on the same function or method in between, the previously edited version is retained. If you want to throw away all the previous tracing and then edit, call untrace before the next call to trace. Editing may be combined with automatic tracing; just supply the other arguments such as tracer, and the edit argument as well. The edit = TRUE argument uses the default editor (see edit).
Tracing primitive functions (builtins and specials) from the base package works, but only by a special mechanism and not very informatively. Tracing a primitive causes the primitive to be replaced by a function with argument ... (only). You can get a bit of information out, but not much. A warning message is issued when trace is used on a primitive.
The practice of saving the traced version of the function back where the function came from means that tracing carries over from one session to another, if the traced function is saved in the session image. (In the next session, untrace will remove the tracing.) On the other hand, functions that were in a package, not in the global environment, are not saved in the image, so tracing expires with the session for such functions.
Tracing an S4 method is basically just like tracing a function, with the exception that the traced version is stored by a call to setMethod rather than by direct assignment, and so is the untraced version after a call to untrace.
The version of trace described here is largely compatible with the version in S-Plus, although the two work by entirely different mechanisms. The S-Plus trace uses the session frame, with the result that tracing never carries over from one session to another (R does not have a session frame). Another relevant distinction has nothing directly to do with trace: The browser in S-Plus allows changes to be made to the frame being browsed, and the changes will persist after exiting the browser. The R browser allows changes, but they disappear when the browser exits. This may be relevant in that the S-Plus version allows you to experiment with code changes interactively, but the R version does not. (A future revision may include a ‘destructive’ browser for R.)
In the simple version (just the first argument), trace returns an invisible NULL. Otherwise, it returns the name(s) of the traced function(s). The relevant consequence is the assignment that takes place.
untrace returns the function name invisibly.
tracingState returns the current global tracing state, and possibly changes it.
When called during on.exit processing, returnValue returns the value about to be returned by the exiting function. Behaviour in other circumstances is undefined.
Using trace() is conceptually a generalization of debug, implemented differently, namely by calling browser via its tracer or exit argument.
The version of function tracing that includes any of the arguments except for the function name requires the methods package (because it uses special classes of objects to store and restore versions of the traced functions).
If methods dispatch is not currently on, trace will load the methods namespace, but will not put the methods package on the search list.
Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988) The New S Language. Wadsworth & Brooks/Cole.
browser and recover, the likeliest tracing functions; also, quote and substitute for constructing general expressions.
require(stats)

## Very simple use
trace(sum)
hist(rnorm(100)) # shows about 3-4 calls to sum()
untrace(sum)

## Show how pt() is called from inside power.t.test():
if(FALSE) trace(pt) ## would show ~20 calls, but we want to see more:
trace(pt, tracer = quote(cat(sprintf("tracing pt(*, ncp = %.15g)\n", ncp))),
      print = FALSE) # <- not showing typical extra
power.t.test(20, 1, power=0.8, sd=NULL) ##--> showing the ncp root finding:
untrace(pt)

f <- function(x, y) {
    y <- pmax(y, 0.001)
    if (x > 0) x ^ y else stop("x must be positive")
}

## arrange to call the browser on entering and exiting
## function f
trace("f", quote(browser(skipCalls = 4)),
      exit = quote(browser(skipCalls = 4)))

## instead, conditionally assign some data, and then browse
## on exit, but only then. Don't bother me otherwise
trace("f", quote(if(any(y < 0)) yOrig <- y),
      exit = quote(if(exists("yOrig")) browser(skipCalls = 4)),
      print = FALSE)

## Enter the browser just before stop() is called. First, find
## the step numbers
untrace(f) # (as it has changed f's body !)
as.list(body(f))
as.list(body(f)[[3]]) # -> stop(..) is [[4]]

## Now call the browser there
trace("f", quote(browser(skipCalls = 4)), at = list(c(3,4)))
## Not run:
f(-1,2) # --> enters browser just before stop(..)
## End(Not run)

## trace a utility function, with recover so we
## can browse in the calling functions as well.
trace("as.matrix", recover)

## turn off the tracing (that happened above)
untrace(c("f", "as.matrix"))

## Not run:
## Useful to find how system2() is called in a higher-up function:
trace(base::system2, quote(print(ls.str())))
## End(Not run)

##-------- Tracing hidden functions : need 'where = *'
##
## 'where' can be a function whose environment is meant:
trace(quote(ar.yw.default), where = ar)
a <- ar(rnorm(100)) # "Tracing ..."
untrace(quote(ar.yw.default), where = ar)

## trace() more than one function simultaneously:
## expression(E1, E2, ...) here is equivalent to
## c(quote(E1), quote(E2), quote(.*), ..)
trace(expression(ar.yw, ar.yw.default), where = ar)
a <- ar(rnorm(100)) # --> 2 x "Tracing ..."
# and turn it off:
untrace(expression(ar.yw, ar.yw.default), where = ar)

## Not run:
## trace calls to the function lm() that come from
## the nlme package.
trace("lm", where = asNamespace("nlme"))
lm (len ~ log(dose) * supp, ToothGrowth) -> fit1 # NOT traced
nlme::lmList(len ~ log(dose) | supp, ToothGrowth) -> fit2 # traced
untrace("lm", where = asNamespace("nlme"))
## End(Not run)
By default traceback() prints the call stack of the last uncaught error, i.e., the sequence of calls that led to the error. This is useful when an error occurs with an unidentifiable error message. It can also be used to print the current stack or arbitrary lists of calls.
.traceback() now returns the above call stack (and traceback(x, *) can be regarded as a convenience function for printing the result of .traceback(x)).
traceback(x = NULL, max.lines = getOption("traceback.max.lines",
                                          getOption("deparse.max.lines", -1L)))
.traceback(x = NULL, max.lines = getOption("traceback.max.lines",
                                           getOption("deparse.max.lines", -1L)))
x |
|
max.lines | a number, the maximum number of lines to be printed per call. The default is unlimited. Applies only when |
The default display is of the stack of the last uncaught error as stored as a list of calls in .Traceback, which traceback prints in a user-friendly format. The stack of calls always contains all function calls and all foreign function calls (such as .Call): if profiling is in progress it will include calls to some primitive functions. (Calls to builtins are included, but not to specials.)
Errors which are caught via try or tryCatch do not generate a traceback, so what is printed is the call sequence for the last uncaught error, and not necessarily for the last error.
If x is numeric, then the current stack is printed, skipping x entries at the top of the stack. For example, options(error = function() traceback(3)) will print the stack at the time of the error, skipping the call to traceback() and .traceback() and the error function that called it.
Otherwise, x is assumed to be a list or pairlist of calls or deparsed calls and will be displayed in the same way.
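The list form of x can be tried without provoking an error. In this sketch the call stack is built by hand; foo and bar are hypothetical function names that are never evaluated.

```r
# Hypothetical call stack, deepest call first, built by hand with quote().
calls <- list(quote(bar(2)), quote(foo(2)))

# traceback() prints the calls with numbered labels and returns
# the deparsed calls invisibly.
res <- traceback(calls)
str(res)
```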
.traceback() and by extension traceback() may trigger deparsing of calls. This is an expensive operation for large calls so it may be advisable to set max.lines to a reasonable value when such calls are on the call stack.
.traceback() returns the deparsed call stack deepest call first as a list or pairlist. The number of lines deparsed from the call can be limited via max.lines. Calls for which max.lines results in truncated output will gain a "truncated" attribute.
traceback() formats, prints, and returns the call stack produced by .traceback() invisibly.
It is undocumented where .Traceback is stored nor that it is visible, and this is subject to change. Currently .Traceback contains the calls as language objects.
Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988) The New S Language. Wadsworth & Brooks/Cole.
foo <- function(x) { print(1); bar(2) }
bar <- function(x) { x + a.variable.which.does.not.exist }
## Not run:
foo(2) # gives a strange error
traceback()
## End(Not run)
## 2: bar(2)
## 1: foo(2)
bar
## Ah, this is the culprit ...

## This will print the stack trace at the time of the error.
options(error = function() traceback(3))
This function marks an object so that a message is printed whenever the internal code copies the object. Such copying is a major cause of hard-to-predict memory use in R.
tracemem(x)
untracemem(x)
retracemem(x, previous = NULL)
x | An R object, not a function or environment or |
previous | A value as returned by |
This functionality is optional, determined at compilation, because it makes R run a little more slowly even when no objects are being traced. tracemem and untracemem give errors when R is not compiled with memory profiling; retracemem does not (so it can be left in code during development).
It is enabled in the CRAN macOS and Windows builds of R.
When an object is traced any copying of the object by the C function duplicate produces a message to standard output, as does type coercion and copying when passing arguments to .C or .Fortran.
The message consists of the string tracemem, the identifying strings for the object being copied and the new object being created, and a stack trace showing where the duplication occurred. retracemem() is used to indicate that a variable should be considered a copy of a previous variable (e.g., after subscripting).
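Because tracemem gives an error on builds without memory profiling, portable code may want to guard the call. The sketch below assumes that checking capabilities("profmem") first is an acceptable guard; the variable names are hypothetical.

```r
a <- c(1, 2, 3)

# Only trace when this build of R supports memory profiling.
traced <- capabilities("profmem")
if (traced) tracemem(a)

b <- a        # no copy yet: b and a share memory
b[1] <- 10    # the modification forces a duplicate, which is reported if traced

if (traced) untracemem(a)
print(a)
print(b)
```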
The messages can be turned off with tracingState.
It is not possible to trace functions, as this would conflict with trace and it is not useful to trace NULL, environments, promises, weak references, or external pointer objects, as these are not duplicated.
These functions are primitive.
A character string for identifying the object in the trace output (an address in hex enclosed in angle brackets), or NULL (invisibly).
capabilities("profmem") to see if this was enabled for this build of R.
https://developer.r-project.org/memory-profiling.html
## Not run:
a <- 1:10
tracemem(a)
## b and a share memory
b <- a
b[1] <- 1
untracemem(a)

## copying in lm: less than R <= 2.15.0
d <- stats::rnorm(10)
tracemem(d)
lm(d ~ a+log(b))
## f is not a copy and is not traced
f <- d[-1]
f+1
## indicate that f should be traced as a copy of d
retracemem(f, retracemem(d))
f+1
## End(Not run)
transform is a generic function, which (at least currently) only does anything useful with data frames. transform.default converts its first argument to a data frame if possible and calls transform.data.frame.
transform(`_data`, ...)
_data | The object to be transformed |
... | Further arguments of the form |
The ... arguments to transform.data.frame are tagged vector expressions, which are evaluated in the data frame _data. The tags are matched against names(_data), and for those that match, the value replaces the corresponding variable in _data, and the others are appended to _data.
The modified value of _data.
This is a convenience function intended for use interactively. For programming it is better to use the standard subsetting arithmetic functions, and in particular the non-standard evaluation of argument transform can have unanticipated consequences.
If some of the values are not vectors of the appropriate length, you deserve whatever you get!
Peter Dalgaard
within for a more flexible approach, subset, list, data.frame
transform(airquality, Ozone = -Ozone)
transform(airquality, new = -Ozone, Temp = (Temp-32)/1.8)

attach(airquality)
transform(Ozone, logOzone = log(Ozone)) # marginally interesting ...
detach(airquality)
These functions give the obvious trigonometric functions. They respectively compute the cosine, sine, tangent, arc-cosine, arc-sine, arc-tangent, and the two-argument arc-tangent.
cospi(x), sinpi(x), and tanpi(x), compute cos(pi*x), sin(pi*x), and tan(pi*x).
cos(x)
sin(x)
tan(x)

acos(x)
asin(x)
atan(x)
atan2(y, x)

cospi(x)
sinpi(x)
tanpi(x)
x, y | numeric or complex vectors. |
The arc-tangent of two arguments atan2(y, x) returns the angle between the x-axis and the vector from the origin to (x, y), i.e., for positive arguments atan2(y, x) == atan(y/x).
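A short sketch of how atan2 relates to atan; the input values are arbitrary and chosen so the angles are easy to recognise.

```r
# For positive arguments atan2(y, x) agrees with atan(y / x).
a1 <- atan2(1, 1)      # pi/4
a2 <- atan(1 / 1)

# With a negative x the two differ: atan2() uses the signs of both
# arguments to pick the quadrant, which atan(y / x) cannot do.
b1 <- atan2(1, -1)     # 3*pi/4
b2 <- atan(1 / -1)     # -pi/4

print(c(a1, a2, b1, b2) / pi)
```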
Angles are in radians, not degrees, for the standard versions (i.e., a right angle is π/2), and in ‘half-rotations’ for cospi etc.
cospi(x), sinpi(x), and tanpi(x) are accurate for x values which are multiples of a half.
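The accuracy claim can be checked directly. This sketch assumes the usual IEEE double-precision arithmetic, where pi is not exactly representable.

```r
# cos(pi * x) and sin(pi * x) pick up rounding error from the value of pi,
# while cospi(x) and sinpi(x) work in half-rotations and do not.
print(cos(pi * 0.5))   # tiny non-zero value
print(cospi(0.5))      # exactly 0

print(sin(pi * 1))     # tiny non-zero value
print(sinpi(1))        # exactly 0

print(cospi(1))        # one half-rotation: exactly -1
```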
All except atan2 are internal generic primitive functions: methods can be defined for them individually or via the Math group generic.
These are all wrappers to system calls of the same name (with prefix c for complex arguments) where available. (cospi, sinpi, and tanpi are part of a C11 extension and provided by e.g. macOS and Solaris: where not yet available calls to cos etc. are used, with special cases for multiples of a half.)
tanpi(0.5) is NaN. Similarly for other inputs with fractional part 0.5.
For the inverse trigonometric functions, branch cuts are defined as in Abramowitz and Stegun, figure 4.4, page 79.
For asin and acos, there are two cuts, both along the real axis: (-∞, -1] and [1, ∞).
For atan there are two cuts, both along the pure imaginary axis: (-∞i, -1i] and [1i, ∞i).
The behaviour actually on the cuts follows the C99 standard which requires continuity coming round the endpoint in a counter-clockwise direction.
Complex arguments for cospi, sinpi, and tanpi are not yet implemented, and they are a ‘future direction’ of ISO/IEC TS 18661-4.
All except atan2 are S4 generic functions: methods can be defined for them individually or via the Math group generic.
Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988) The New S Language. Wadsworth & Brooks/Cole.
Abramowitz, M. and Stegun, I. A. (1972). Handbook of Mathematical Functions. New York: Dover. Chapter 4. Elementary Transcendental Functions: Logarithmic, Exponential, Circular and Hyperbolic Functions
For cospi, sinpi, and tanpi the C11 extension ISO/IEC TS 18661-4:2015 (draft at https://www.open-std.org/jtc1/sc22/wg14/www/docs/n1950.pdf).
x <- seq(-3, 7, by = 1/8)
tx <- cbind(x, cos(pi*x), cospi(x), sin(pi*x), sinpi(x),
               tan(pi*x), tanpi(x), deparse.level=2)
op <- options(digits = 4, width = 90) # for nice formatting
head(tx)
tx[ (x %% 1) %in% c(0, 0.5) ,]
options(op)
Remove leading and/or trailing whitespace from character strings.
trimws(x, which = c("both", "left", "right"), whitespace = "[ \t\r\n]")
x | a character vector. |
which | a character string specifying whether to remove both leading and trailing whitespace (default), or only leading( |
whitespace | a string specifying a regular expression to match (one character of) “white space”, see Details for alternatives to the default. |
Internally, sub(re, "", *, perl = TRUE), i.e., PCRE library regular expressions are used. For portability, the default ‘whitespace’ is the character class [ \t\r\n] (space, horizontal tab, carriage return, newline). Alternatively, [\h\v] is a good (PCRE) generalization to match all Unicode horizontal and vertical white space characters, see also https://www.pcre.org.
x <- " Some text. "
x
trimws(x)
trimws(x, "l")
trimws(x, "r")

## Unicode --> need "stronger" 'whitespace' to match all :
tt <- "text with unicode 'non breakable space'."
xu <- paste(" \t\v", tt, "\u00a0 \n\r")
(tu <- trimws(xu, whitespace = "[\\h\\v]"))
stopifnot(identical(tu, tt))
try is a wrapper to run an expression that might fail and allow the user's code to handle error-recovery.
try(expr, silent = FALSE, outFile = getOption("try.outFile", default = stderr()))
expr | an R expression to try. |
silent | logical: should the report of error messages be suppressed? |
outFile | a connection, or a character string naming the file to print to (via |
try evaluates an expression and traps any errors that occur during the evaluation. If an error occurs then the error message is printed to the stderr connection unless options("show.error.messages") is false or the call includes silent = TRUE. The error message is also stored in a buffer where it can be retrieved by geterrmessage. (This should not be needed as the value returned in case of an error contains the error message.)
try is implemented using tryCatch; for programming, instead of try(expr, silent = TRUE), something like tryCatch(expr, error = function(e) e) (or other simple error handler functions) may be more efficient and flexible.
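A minimal sketch of the tryCatch alternative; safe_log is a hypothetical helper, and returning NA on failure is just one possible handler.

```r
# Hypothetical helper: returns NA instead of failing.
safe_log <- function(v) {
    tryCatch(log(v), error = function(e) NA_real_)
}

ok  <- safe_log(10)    # ordinary result
bad <- safe_log("a")   # log() of a string is an error, so NA is returned

# Returning the condition object itself keeps the message and the call.
cond <- tryCatch(log("a"), error = function(e) e)
print(class(cond))
```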
It may be useful to set the default for outFile to stdout(), i.e.,
options(try.outFile = stdout())
instead of the default stderr(), notably when try() is used inside a Sweave code chunk and the error message should appear in the resulting document.
The value of the expression if expr is evaluated without error: otherwise an invisible object inheriting from class "try-error" containing the error message with the error condition as the "condition" attribute.
Do not test
if (class(res) == "try-error")
as if there is no error, the result might (now or in future) have a class of length > 1. Use if(inherits(res, "try-error")) instead.
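The recommended test, together with the "condition" attribute described under Value, might be used like this; the variable names are hypothetical.

```r
res <- try(log("a"), silent = TRUE)

if (inherits(res, "try-error")) {
    # The original condition object travels along as an attribute.
    cond <- attr(res, "condition")
    msg  <- conditionMessage(cond)
    message("caught: ", msg)
}
```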
options for setting error handlers and suppressing the printing of error messages; geterrmessage for retrieving the last error message. The underlying tryCatch provides more flexible means of catching and handling errors.
assertCondition in package tools is related and useful for testing.
## this example will not work correctly in example(try), but
## it does work correctly if pasted in
options(show.error.messages = FALSE)
try(log("a"))
print(.Last.value)
options(show.error.messages = TRUE)

## alternatively,
print(try(log("a"), TRUE))

## run a simulation, keep only the results that worked.
set.seed(123)
x <- stats::rnorm(50)
doit <- function(x)
{
    x <- sample(x, replace = TRUE)
    if(length(unique(x)) > 30) mean(x)
    else stop("too few unique points")
}
## alternative 1
res <- lapply(1:100, function(i) try(doit(x), TRUE))
## alternative 2
## Not run:
res <- vector("list", 100)
for(i in 1:100) res[[i]] <- try(doit(x), TRUE)
## End(Not run)
unlist(res[sapply(res, function(x) !inherits(x, "try-error"))])
typeof determines the (R internal) type or storage mode of any object
typeof(x)
x | any R object. |
A character string. The possible values are listed in the structure TypeTable in ‘src/main/util.c’. Current values are the vector types "logical", "integer", "double", "complex", "character", "raw" and "list", "NULL", "closure" (function), "special" and "builtin" (basic functions and operators), "environment", "S4" (some S4 objects) and others that are unlikely to be seen at user level ("symbol", "pairlist", "promise", "object", "language", "char", "...", "any", "expression", "externalptr", "bytecode" and "weakref").
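A few of the values listed above can be produced with everyday objects. The labels in the named vector are hypothetical and only serve as a legend.

```r
types <- c(
    int     = typeof(1L),
    dbl     = typeof(1),
    chr     = typeof("a"),
    lst     = typeof(list()),
    null    = typeof(NULL),
    closure = typeof(mean),           # ordinary R function
    builtin = typeof(sum),            # primitive that evaluates its arguments
    special = typeof(`if`),           # primitive that does not
    symbol  = typeof(quote(x)),
    call    = typeof(quote(x + 1))    # unevaluated call
)
print(types)
```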
isS4 to determine if an object has an S4 class.
typeof(2)
mode(2)
## for a table of examples, see ?mode / examples(mode)
unique returns a vector, data frame or array like x but with duplicate elements/rows removed.
unique(x, incomparables = FALSE, ...)

## Default S3 method:
unique(x, incomparables = FALSE, fromLast = FALSE,
       nmax = NA, ...)

## S3 method for class 'matrix'
unique(x, incomparables = FALSE, MARGIN = 1,
       fromLast = FALSE, ...)

## S3 method for class 'array'
unique(x, incomparables = FALSE, MARGIN = 1,
       fromLast = FALSE, ...)
x | a vector or a data frame or an array or |
incomparables | a vector of values that cannot be compared. |
fromLast | logical indicating if duplication should be considered from the last, i.e., the last (or rightmost) of identical elements will be kept. This only matters for |
nmax | the maximum number of unique items expected (greater than one). See |
... | arguments for particular methods. |
MARGIN | the array margin to be held fixed: a single integer. |
This is a generic function with methods for vectors, data frames and arrays (including matrices).
The array method calculates for each element of the dimension specified by MARGIN if the remaining dimensions are identical to those for an earlier element (in row-major order). This would most commonly be used for matrices to find unique rows (the default) or columns (with MARGIN = 2).
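As a small illustration of the array method, the sketch below removes duplicated rows and then duplicated columns from two made-up matrices (`m` and `m2` are illustrative names, not part of the original examples).

```r
# Rows 1 and 3 of m are identical (matrix() fills column by column).
m <- matrix(c(1, 2, 1,
              3, 4, 3), nrow = 3)
unique_rows <- unique(m)              # default MARGIN = 1 compares rows

# Columns "a" and "c" of m2 are identical.
m2 <- cbind(a = 1:2, b = 3:4, c = 1:2)
unique_cols <- unique(m2, MARGIN = 2) # MARGIN = 2 compares columns
```

Because the result is obtained by subsetting with drop = FALSE, both results are still matrices and keep their dimnames.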
Note that unlike the Unix command uniq this omits duplicated and not just repeated elements/rows. That is, an element is omitted if it is equal to any previous element and not just if it is equal to the immediately previous one. (For the latter, see rle.)
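The difference between duplicated and merely repeated elements can be seen in a short sketch (the vector `x` is made up for illustration):

```r
x <- c(1, 1, 2, 1, 1)

# unique() drops every later occurrence of a value, wherever it appears.
u <- unique(x)

# rle() only collapses consecutive runs, like the Unix `uniq` command.
runs <- rle(x)$values
```

Here `u` holds two values while `runs` holds three, because the second run of 1s is not adjacent to the first.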
Missing values ("NA") are regarded as equal, numeric and complex ones differing from NaN; character strings will be compared in a “common encoding”; for details, see match (and duplicated) which use the same concept.
Values in incomparables will never be marked as duplicated. This is intended to be used for a fairly small set of values and will not be efficient for a very large set.
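A minimal sketch of the effect of incomparables, using a made-up numeric vector: the value 2 is declared incomparable, so its repeats are all kept.

```r
x <- c(1, 1, 2, 2, 3)

plain  <- unique(x)                    # every value once
keep_2 <- unique(x, incomparables = 2) # repeats of 2 are never treated as duplicates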
When used on a data frame with more than one column, or an array or matrix when comparing dimensions of length greater than one, this tests for identity of character representations. This will catch people who unwisely rely on exact equality of floating-point numbers!
For a vector, an object of the same type as x, but with only one copy of each duplicated element. No attributes are copied (so the result has no names).
For a data frame, a data frame is returned with the same columns but possibly fewer rows (and with row names from the first occurrences of the unique rows).
A matrix or array is subsetted by [, drop = FALSE], so dimensions and dimnames are copied appropriately, and the result always has the same number of dimensions as x.
Using this for lists is potentially slow, especially if the elements are not atomic vectors (see vector) or differ only in their attributes. In the worst case it is O(n^2).
Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988) The New S Language. Wadsworth & Brooks/Cole.
duplicated which gives the indices of duplicated elements.
rle which is the equivalent of the Unix uniq -c command.
x <- c(3:5, 11:8, 8 + 0:5)
(ux <- unique(x))
(u2 <- unique(x, fromLast = TRUE)) # different order
stopifnot(identical(sort(ux), sort(u2)))

length(unique(sample(100, 100, replace = TRUE)))
## approximately 100(1 - 1/e) = 63.21

unique(iris)
unlink deletes the file(s) or directories specified by x.
unlink(x, recursive = FALSE, force = FALSE, expand = TRUE)
x | a character vector with the names of the file(s) or directories to be deleted. |
recursive | logical. Should directories be deleted recursively? |
force | logical. Should permissions be changed (if possible) to allow the file or directory to be removed? |
expand | logical. Should wildcards (see ‘Details’ below) and tilde (see |
If recursive = FALSE directories are not deleted, not even empty ones.
On most platforms ‘file’ includes symbolic links, fifos and sockets. unlink(x, recursive = TRUE) deletes just the symbolic link if the target of such a link is a directory.
Wildcard expansion (normally ‘*’ and ‘?’ are allowed) is done by the internal code of Sys.glob. Wildcards never match a leading ‘.’ in the filename, and files ‘.’, ‘..’ and ‘~’ will never be considered for deletion. Wildcards will only be expanded if the system supports it. Most systems will support not only ‘*’ and ‘?’ but also character classes such as ‘[a-z]’ (see the man pages for the system call glob on your OS). The metacharacters * ? [ can occur in Unix filenames, and this makes it difficult to use unlink to delete such files (see file.remove), although escaping the metacharacters by backslashes usually works. If a metacharacter matches nothing it is considered as a literal character.
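The wildcard behaviour can be tried safely inside a temporary directory. The following is a sketch only; the directory and file names are made up for illustration.

```r
# Work in a scratch directory so nothing important can be deleted.
d <- tempfile("unlink-demo-")
dir.create(d)
file.create(file.path(d, c("a.txt", "b.txt", "c.log")))

# The wildcard is expanded (expand = TRUE is the default): only the .txt files go.
unlink(file.path(d, "*.txt"))
remaining <- list.files(d)

# Clean up the scratch directory itself.
unlink(d, recursive = TRUE)
```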
recursive = TRUE might not be supported on all platforms, in which case it will be ignored, with a warning: however there are no known current examples.
0 for success, 1 for failure, invisibly. Not deleting a non-existent file is not a failure, nor is being unable to delete a directory if recursive = FALSE. However, missing values in x are regarded as failures.
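A short sketch of how the return value and the recursive argument interact, using temporary files only (all variable names are illustrative):

```r
f <- tempfile()
file.create(f)
status_file    <- unlink(f)   # the file existed and was deleted
status_missing <- unlink(f)   # the file is already gone: not a failure

d <- tempfile()
dir.create(d)
unlink(d)                     # recursive = FALSE: the directory is left alone
still_there <- dir.exists(d)
unlink(d, recursive = TRUE)   # now the directory is removed
```

Since the status is returned invisibly, it has to be assigned or wrapped in print() to be inspected.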
Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988) The New S Language. Wadsworth & Brooks/Cole.
Given a list structure x, unlist simplifies it to produce a vector which contains all the atomic components which occur in x.
unlist(x, recursive = TRUE, use.names = TRUE)
x | an R object, typically a list or vector. |
recursive | logical. Should unlisting be applied to list components of |
use.names | logical. Should names be preserved? |
unlist is generic: you can write methods to handle specific classes of objects, see InternalMethods, and note, e.g., relist with the unlist method for relistable objects.
If recursive = FALSE, the function will not recurse beyond the first level items in x.
Factors are treated specially. If all non-list elements of x are factor (or ordered factor) objects then the result will be a factor with levels the union of the level sets of the elements, in the order the levels occur in the level sets of the elements (which means that if all the elements have the same level set, that is the level set of the result).
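The treatment of factors can be illustrated with two small made-up factors whose level sets overlap:

```r
f1 <- factor(c("b", "a"))   # levels are sorted: "a", "b"
f2 <- factor(c("c", "a"))   # levels: "a", "c"

# All elements are factors, so the result is a factor whose levels are
# the union of the level sets, in order of first appearance.
u <- unlist(list(f1, f2))
```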
x can be an atomic vector, but then unlist does nothing useful, not even drop names.
By default, unlist tries to retain the naming information present in x. If use.names = FALSE all naming information is dropped.
Where possible the list elements are coerced to a common mode during the unlisting, and so the result often ends up as a character vector. Vectors will be coerced to the highest type of the components in the hierarchy NULL < raw < logical < integer < double < complex < character < list < expression: pairlists are treated as lists.
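The coercion hierarchy and the effect of recursive = FALSE can be checked with a few made-up lists. This is a sketch for illustration, not an exhaustive test of the hierarchy.

```r
# The result type is the highest type among the components.
t1 <- typeof(unlist(list(TRUE, 2L)))    # logical < integer
t2 <- typeof(unlist(list(2L, 3.5)))     # integer < double
t3 <- typeof(unlist(list(3.5, "a")))    # double  < character

# With recursive = FALSE only the first level is flattened,
# so the result is still a list here.
nested    <- list(a = 1, b = list(c = 2, d = "x"))
one_level <- unlist(nested, recursive = FALSE)
all_flat  <- unlist(nested)
```

Names of nested components are combined with a dot, as in "b.c".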
A list is a (generic) vector, and the simplified vector might still be a list (and might be unchanged). Non-vector elements of the list (for example language elements such as names, formulas and calls) are not coerced, and so a list containing one or more of these remains a list. (The effect of unlisting an lm fit is a list which has individual residuals as components.) Note that unlist(x) now returns x unchanged also for non-vector x, instead of signalling an error in that case.
NULL or an expression or a vector of an appropriate mode to hold the list components.
The output type is determined from the highest type of the components in the hierarchy NULL < raw < logical < integer < double < complex < character < list < expression, after coercion of pairlists to lists.
Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988) The New S Language. Wadsworth & Brooks/Cole.
unlist(options())
unlist(options(), use.names = FALSE)

l.ex <- list(a = list(1:5, LETTERS[1:5]), b = "Z", c = NA)
unlist(l.ex, recursive = FALSE)
unlist(l.ex, recursive = TRUE)

l1 <- list(a = "a", b = 2, c = pi+2i)
unlist(l1) # a character vector
l2 <- list(a = "a", b = as.name("b"), c = pi+2i)
unlist(l2) # remains a list

ll <- list(as.name("sinc"), quote( a + b ), 1:10, letters, expression(1+x))
utils::str(ll)
for(x in ll)
  stopifnot(identical(x, unlist(x)))
names or dimnames
Remove the names or dimnames attribute of an R object.
unname(obj, force = FALSE)
obj | an R object. |
force | logical; if true, the |
Object as obj but without names or dimnames.
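A minimal sketch of the effect on a named vector and on a matrix with dimnames (both objects are made up for illustration):

```r
v <- c(a = 1, b = 2)
m <- matrix(1:4, nrow = 2,
            dimnames = list(c("r1", "r2"), c("c1", "c2")))

v_bare <- unname(v)   # names attribute removed
m_bare <- unname(m)   # dimnames attribute removed, dim is kept
```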
require(graphics); require(stats)

## Answering a question on R-help (14 Oct 1999):
col3 <- 750+ 100*rt(1500, df = 3)
breaks <- factor(cut(col3, breaks = 360+5*(0:155)))
z <- table(breaks)
z[1:5] # The names are larger than the data ...
barplot(unname(z), axes = FALSE)
Use packages in R scripts by loading their namespace and attaching a package environment including (a subset of) their exports to the search path.
use(package, include.only)
package | a character string giving the name of a package. |
include.only | character vector of names of objects to include in the attached environment frame. If missing, all exports are included. |
This is a simple wrapper around library which always uses attach.required = FALSE, so that packages listed in the Depends clause of the DESCRIPTION file of the package to be used never get attached automatically to the search path.
This therefore allows R scripts to be written with full control over what gets found on the search path. In addition, such scripts can easily be integrated as package code, replacing the calls to use by the corresponding importFrom directives in ‘NAMESPACE’ files.
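As a hedged sketch of how a script might attach a single export, the code below uses the tools package and its file_ext function purely as an illustration. Because use is experimental and absent from older R versions, the call is guarded, with a fully qualified fallback.

```r
# `use()` may not exist in older R versions, so check before calling it.
if (exists("use", envir = baseenv(), inherits = FALSE)) {
  # Attach only `file_ext` from the tools package to the search path.
  ok  <- use("tools", include.only = "file_ext")
  ext <- file_ext("report.final.csv")
} else {
  # Fallback: call the function through its namespace instead.
  ok  <- NA
  ext <- tools::file_ext("report.final.csv")
}
```

In package code the same dependency would be declared with an importFrom(tools, file_ext) directive rather than a call to use.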
(invisibly) a logical indicating whether the package to be used is available.
This functionality is still experimental: interfaces may change in future versions.
R possesses a simple generic function mechanism which can be used for an object-oriented style of programming. Method dispatch takes place based on the class(es) of the first argument to the generic function or of the object supplied as an argument to UseMethod or NextMethod.
UseMethod(generic, object)

NextMethod(generic = NULL, object = NULL, ...)
generic | a character string naming a function (and not a built-in operator). Required for |
object | for |
... | further arguments to be passed to the next method. |
An R object is a data object which has a class attribute (and this can be tested by is.object). A class attribute is a character vector giving the names of the classes from which the object inherits.
If the object does not have a class attribute, it has an implicit class. Matrices and arrays have class "matrix" or "array" followed by the class of the underlying vector. Most vectors have class the result of mode(x), except that integer vectors have class c("integer", "numeric") and real vectors have class c("double", "numeric"). Function .class2(x) (since R 4.0.x) returns the full implicit (or explicit) class vector of x.
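The implicit class vectors described above can be inspected directly. The sketch below assumes an R version that provides .class2 (R 4.0.0 or later).

```r
cls_int <- .class2(1L)                 # implicit class of an integer vector
cls_dbl <- .class2(1.5)                # implicit class of a double vector
cls_mat <- .class2(matrix(1:4, 2))     # matrix/array, then the vector classes

# class() reports only the leading part of the implicit class.
cls_short <- class(matrix(1:4, 2))
```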
When a function calling UseMethod("fun") is applied to an object with class vector c("first", "second"), the system searches for a function called fun.first and, if it finds it, applies it to the object. If no such function is found a function called fun.second is tried. If no class name produces a suitable function, the function fun.default is used, if it exists, or an error results.
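The search order can be sketched with a made-up generic; `describe` and the class names below are hypothetical and exist only for this illustration.

```r
# A generic and three methods.
describe         <- function(x, ...) UseMethod("describe")
describe.first   <- function(x, ...) "first method"
describe.second  <- function(x, ...) "second method"
describe.default <- function(x, ...) "default method"

a <- structure(list(), class = c("first", "second"))
b <- structure(list(), class = c("unknown", "second"))

r_a <- describe(a)   # describe.first is found immediately
r_b <- describe(b)   # no describe.unknown, so describe.second is tried
r_c <- describe(42)  # no method for the implicit class: describe.default
```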
Function methods can be used to find out about the methods for a particular generic function or class.
UseMethod is a primitive function but uses standard argument matching. It is not the only means of dispatch of methods, for there are internal generic and group generic functions. UseMethod currently dispatches on the implicit class even for arguments that are not objects, but the other means of dispatch do not.
NextMethod invokes the next method (determined by the class vector, either of the object supplied to the generic, or of the first argument to the function containing NextMethod if a method was invoked directly). Normally NextMethod is used with only one argument, generic, but if further arguments are supplied these modify the call to the next method.
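A small sketch of chaining methods with NextMethod; the generic `greet` and the classes "loud" and "polite" are hypothetical names chosen for the example.

```r
greet         <- function(x, ...) UseMethod("greet")
greet.default <- function(x, ...) "hello"
# Each method delegates to the next class in the class vector.
greet.polite  <- function(x, ...) paste(NextMethod(), "please")
greet.loud    <- function(x, ...) toupper(NextMethod())

obj <- structure(list(), class = c("loud", "polite"))
res <- greet(obj)   # loud -> polite -> default
```

The call unwinds from the default method outwards, so the result of greet.default is first extended by greet.polite and then upper-cased by greet.loud.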
NextMethod should not be called except in methods called by UseMethod or from internal generics (see InternalGenerics). In particular it will not work inside anonymous calling functions (e.g., get("print.ts")(AirPassengers)).
Namespaces can register methods for generic functions. To support this, UseMethod and NextMethod search for methods in two places: in the environment in which the generic function is called, and in the registration data base for the environment in which the generic is defined (typically a namespace). So methods for a generic function need to be available in the environment of the call to the generic, or they must be registered. (It does not matter whether they are visible in the environment in which the generic is defined.) As from R 3.5.0, the registration data base is searched after the top level environment (see topenv) of the calling environment (but before the parents of the top level environment).
Now for some obscure details that need to appear somewhere. These comments will be slightly different than those in Chambers (1992). (See also the draft ‘R Language Definition’.) UseMethod creates a new function call with arguments matched as they came in to the generic. [Previously local variables defined before the call to UseMethod were retained; as of R 4.4.0 this is no longer the case.] Any statements after the call to UseMethod will not be evaluated as UseMethod does not return. UseMethod can be called with more than two arguments: a warning will be given and additional arguments ignored. (They are not completely ignored in S.) If it is called with just one argument, the class of the first argument of the enclosing function is used as object: unlike S this is the first actual argument passed and not the current value of the object of that name.
NextMethod works by creating a special call frame for the next method. If no new arguments are supplied, the arguments will be the same in number, order and name as those to the current method but their values will be promises to evaluate their name in the current method and environment. Any named arguments matched to ... are handled specially: they either replace existing arguments of the same name or are appended to the argument list. They are passed on as the promise that was supplied as an argument to the current environment. (S does this differently!) If they have been evaluated in the current (or a previous environment) they remain evaluated. (This is a complex area, and subject to change: see the draft ‘R Language Definition’.)
The search for methods for NextMethod is slightly different from that for UseMethod. Finding no fun.default is not necessarily an error, as the search continues to the generic itself. This is to pick up an internal generic like [ which has no separate default method, and succeeds only if the generic is a primitive function or a wrapper for a .Internal function of the same name. (When a primitive is called as the default method, argument matching may not work as described above due to the different semantics of primitives.)
You will see objects such as .Generic, .Method, and .Class used in methods. These are set in the environment within which the method is evaluated by the dispatch mechanism, which is as follows:
Find the context for the calling function (the generic): this gives us the unevaluated arguments for the original call.
Evaluate the object (usually an argument) to be used for dispatch, and find a method (possibly the default method) or throw an error.
Create an environment for evaluating the method and insert special variables (see below) into that environment. Also copy any variables in the environment of the generic that are not formal (or actual) arguments.
Fix up the argument list to be the arguments of the call matched to the formals of the method.
.Generic is a length-one character vector naming the generic function.
.Method is a character vector (normally of length one) naming the method function. (For functions in the group generic Ops it is of length two.)
.Class is a character vector of classes used to find the next method. NextMethod adds an attribute "previous" to .Class giving the .Class last used for dispatch, and shifts .Class along to that used for dispatch.
.GenericCallEnv and .GenericDefEnv are the environments of the call to the generic and defining the generic respectively. (The latter is used to find methods registered for the generic.)
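As a minimal sketch, a method can read the special variable .Generic that the dispatch mechanism places in its evaluation environment. The generic name `whoami` is hypothetical.

```r
whoami <- function(x, ...) UseMethod("whoami")

# .Generic is only defined while a method is being evaluated via dispatch.
whoami.default <- function(x, ...) .Generic

generic_name <- whoami(1)
```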
Note that .Class is set when the generic is called, and is unchanged if the class of the dispatching argument is changed in a method. It is possible to change the method that NextMethod would dispatch by manipulating .Class, but ‘this is not recommended unless you understand the inheritance mechanism thoroughly’ (Chambers & Hastie, 1992, p. 469).
This scheme is called S3 (S version 3). For new projects, it is recommended to use the more flexible and robust S4 scheme provided in the methods package.
Chambers, J. M. (1992) Classes and methods: object-oriented programming in S. Appendix A of Statistical Models in S, eds J. M. Chambers and T. J. Hastie, Wadsworth & Brooks/Cole.
The draft ‘R Language Definition’.
methods, class incl. .class2(); getS3method, is.object.
These functions allow users to set actions to be taken before packages are attached/detached and namespaces are (un)loaded.
getHook(hookName)
setHook(hookName, value, action = c("append", "prepend", "replace"))

packageEvent(pkgname, event = c("onLoad", "attach", "detach", "onUnload"))
hookName | character string: the hook name. |
pkgname | character string: the package/namespace name. |
event | character string: an event for the package. Can be abbreviated. |
value | a function or a list of functions, or for |
action | the action to be taken. Can be abbreviated. |
setHook provides a general mechanism for users to register hooks, a list of functions to be called from system (or user) functions. The initial set of hooks was associated with events on packages/namespaces: these hooks are named via calls to packageEvent.
To remove a hook completely, call setHook(hookName, NULL, "replace").
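The registration mechanism can be tried out with a user-level hook. In this sketch the hook name "demo.hook" is hypothetical and is removed again at the end.

```r
hook_name <- "demo.hook"              # hypothetical hook name for illustration
setHook(hook_name, NULL, "replace")   # start from a clean slate

setHook(hook_name, function(...) "first")                      # default: append
setHook(hook_name, function(...) "second", action = "prepend")

hooks   <- getHook(hook_name)         # a list of functions, in run order
results <- vapply(hooks, function(h) h(), character(1))

setHook(hook_name, NULL, "replace")   # remove the hook completely
remaining <- getHook(hook_name)

# Package hooks use names derived by packageEvent().
event_name <- packageEvent("stats", "onLoad")
```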
When an R package is attached by library or loaded by other means, it can call initialization code. See .onLoad for a description of the package hook functions called during initialization. Users can add their own initialization code via the hooks provided by setHook(), functions which will be called as funname(pkgname, pkgpath) inside a try call.
The sequence of events depends on which hooks are defined, and whether a package is attached or just loaded. In the case where all hooks are defined and a package is attached, the order of initialization events is as follows:
The package namespace is loaded.
The package's .onLoad function is run.
If S4 methods dispatch is on, any actions set by setLoadAction are run.
The namespace is sealed.
The user's "onLoad" hook is run.
The package is added to the search path.
The package's .onAttach function is run.
The package environment is sealed.
The user's "attach" hook is run.
A similar sequence (but in reverse) is run when a package is detached and its namespace unloaded:
The user's "detach" hook is run.
The package's .Last.lib function is run.
The package is removed from the search path.
The user's "onUnload" hook is run.
The package's .onUnload function is run.
The package namespace is unloaded.
Note that when an R session is finished, packages are not detached and namespaces are not unloaded, so the corresponding hooks will not be run.
Also note that some of the user hooks are run without the package being on the search path, so in those hooks objects in the package need to be referred to using the double (or triple) colon operator, as in the example.
If multiple hooks are added, they are normally run in the order shown by getHook, but the "detach" and "onUnload" hooks are run in reverse order so the default for package events is to add hooks ‘inside’ existing ones.
The hooks are stored in the environment .userHooksEnv in the base package, with ‘mangled’ names.
For getHook function, a list of functions (possibly empty). For setHook function, no return value. For packageEvent, the derived hook name (a character string).
Hooks need to be set before the event they modify: for standard packages this can be problematic as methods is loaded and attached early in the startup sequence. The usual place to set hooks such as the example below is in the ‘.Rprofile’ file, but that will not work for methods.
See :: for a discussion of the double and triple colon operators.
Other hooks may be added later: functions plot.new and persp already have them.
setHook(packageEvent("grDevices", "onLoad"),
        function(...) grDevices::ps.options(horizontal = FALSE))
Conversion of UTF-8 encoded character vectors to and from integer vectors representing a UTF-32 encoding.
utf8ToInt(x)
intToUtf8(x, multiple = FALSE, allow_surrogate_pairs = FALSE)
x | object to be converted. |
multiple | logical: should the conversion be to a single character string or multiple individual characters? |
allow_surrogate_pairs | logical: should interpretation of surrogate pairs be attempted? (See ‘Details’.) Only supported for |
These will work in any locale, including on platforms that do not otherwise support multi-byte character sets.
Unicode defines a name and a number for all of the glyphs it encompasses: the numbers are called code points: since RFC 3629 they run from 0 to 0x10FFFF (with about 5% being assigned by version 13.0 of the Unicode standard and 7% reserved for ‘private use’).
intToUtf8 does not by default handle surrogate pairs: inputs in the surrogate ranges are mapped to NA. They might occur if a UTF-16 byte stream has been read as 2-byte integers (in the correct byte order), in which case allow_surrogate_pairs = TRUE will try to interpret them (with unmatched surrogate values still treated as NA).
utf8ToInt converts a length-one character string encoded in UTF-8 to an integer vector of Unicode code points.
intToUtf8 converts a numeric vector of Unicode code points either (default) to a single character string or a character vector of single characters. Non-integral numeric values are truncated to integers. For output to a single character string 0 is silently omitted: otherwise 0 is mapped to "". The Encoding of a non-NA return value is declared as "UTF-8".
Invalid and NA inputs are mapped to NA output.
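A short round-trip sketch using ASCII code points, so that the results do not depend on the locale or font; the variable names are illustrative.

```r
codes <- utf8ToInt("abc")                       # integer code points

one  <- intToUtf8(c(72, 105))                   # a single string
many <- intToUtf8(c(72, 105), multiple = TRUE)  # one string per code point

# A zero is dropped from a single string but becomes "" with multiple = TRUE.
zero_single <- intToUtf8(c(72, 0, 105))
zero_multi  <- intToUtf8(c(72, 0, 105), multiple = TRUE)

# A lone surrogate value is invalid by default and maps to NA.
surrogate <- intToUtf8(0xD800)
```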
Which code points are regarded as valid has changed over the lifetime of UTF-8. Originally all 32-bit unsigned integers were potentially valid and could be converted to up to 6 bytes in UTF-8. Since 2003 it has been stated that there will never be valid code points larger than 0x10FFFF, and so valid UTF-8 encodings are never more than 4 bytes.
The code points in the surrogate-pair range 0xD800 to 0xDFFF are prohibited in UTF-8 and so are regarded as invalid by utf8ToInt and by default by intToUtf8.
The position of ‘noncharacters’ (notably 0xFFFE and 0xFFFF) was clarified by ‘Corrigendum 9’ in 2013. These are valid but will never be given an official interpretation. (In some earlier versions of R utf8ToInt treated them as invalid.)
https://www.rfc-editor.org/rfc/rfc3629, the current standard for UTF-8.
https://www.unicode.org/versions/corrigendum9.html for non-characters.
## will only display in some locales and fonts
intToUtf8(0x03B2L) # Greek beta

utf8ToInt("bi\u00dfchen")
utf8ToInt("\xfa\xb4\xbf\xbf\x9f")

## A valid UTF-16 surrogate pair (for U+10437)
x <- c(0xD801, 0xDC37)
intToUtf8(x)
intToUtf8(x, TRUE)
(xx <- intToUtf8(x, , TRUE)) # will only display in some locales and fonts
charToRaw(xx)

## An example of how surrogate pairs might occur
x <- "\U10437"
charToRaw(x)
foo <- tempfile()
writeLines(x, file(foo, encoding = "UTF-16LE"))
## next two are OS-specific, but are mandated by POSIX
system(paste("od -x", foo)) # 2-byte units, correct on little-endian platforms
system(paste("od -t x1", foo)) # single bytes as hex
y <- readBin(foo, "integer", 2, 2, FALSE, endian = "little")
sprintf("%X", y)
intToUtf8(y, , TRUE)
Most modern file systems store file-path components (names of directories and files) in a character encoding of wide scope: usually UTF-8 on a Unix-alike and UCS-2/UTF-16 on Windows. However, this was not true when R was first developed and there are still exceptions amongst file systems, e.g. FAT32.
This was not something anticipated by the C and POSIX standards which only provide means to access files via file paths encoded in the current locale, for example those specified in Latin-1 in a Latin-1 locale.
Everything here apart from the specific section on Windows is about Unix-alikes.
It is possible to mark character strings (elements of character vectors) as being in UTF-8 or Latin-1 (see Encoding). This allows file paths not in the native encoding to be expressed in R character vectors but there is almost no way to use them unless they can be translated to the native encoding. That is of course not a problem if that is UTF-8, so these details are really only relevant to the use of a non-UTF-8 locale (including a C locale) on a Unix-alike.
Functions to open a file such as file, fifo, pipe, gzfile, bzfile, xzfile and unz give an error for non-native filepaths. Where functions look at existence such as file.exists, dir.exists, unlink, file.info and list.files, non-native filepaths are treated as non-existent.
Many other functions use file or gzfile to open their files.
file.path allows non-native file paths to be combined, marking them as UTF-8 if needed.
path.expand only handles paths in the native encoding.
Windows provides proprietary entry points to access its file systems, and these gained ‘wide’ versions in Windows NT that allowed file paths in UCS-2/UTF-16 to be accessed from any locale.
Some R functions use these entry points when file paths are marked as Latin-1 or UTF-8 to allow access to paths not in the current encoding. These include file, file.access, file.append, file.copy, file.create, file.exists, file.info, file.link, file.remove, file.rename, file.symlink and dir.create, dir.exists, normalizePath, path.expand, pipe, Sys.glob, Sys.junction, unlink but not gzfile, bzfile, xzfile nor unz.
For functions using gzfile (including load, readRDS, read.dcf and tar), it is often possible to use a gzcon connection wrapping a file connection.
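The workaround can be sketched as follows. The example uses an ordinary temporary file, so it only demonstrates the mechanics of wrapping a file connection in gzcon, not an actual non-native path.

```r
obj <- list(a = 1:3, b = "text")
f <- tempfile(fileext = ".rds")
saveRDS(obj, f)                 # gzip-compressed by default

# Open the file with file() (which can handle marked paths on Windows)
# and let gzcon() do the decompression instead of gzfile().
con <- gzcon(file(f, "rb"))
restored <- readRDS(con)
close(con)

unlink(f)
```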
Other notable exceptions are list.files, list.dirs, system and file-path inputs for graphics devices.
Before R 4.0.0, file paths marked as being in Latin-1 or UTF-8 were silently translated to the native encoding using escapes such as ‘<e7>’ or ‘<U+00e7>’. This created valid file names but maybe not those intended.
This document is still a work-in-progress.
Check if each element of a character vector is valid in its implied encoding.
validUTF8(x)
validEnc(x)
x | a character vector. |
These use similar checks to those used by functions such as grep.
validUTF8 ignores any marked encoding (see Encoding) and so looks directly if the bytes in each string are valid UTF-8. (For the validity of ‘noncharacters’ see the help for intToUtf8.)
validEnc regards character strings as validly encoded unless their encodings are marked as UTF-8 or they are unmarked and the R session is in a UTF-8 or other multi-byte locale. (The checks in other multi-byte locales depend on the OS and as with iconv not all invalid inputs may be detected.)
A logical vector of the same length as x. NA elements are regarded as validly encoded.
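A compact sketch of validUTF8 on a made-up vector containing plain ASCII, a valid two-byte sequence, an invalid byte and a missing value:

```r
x <- c("plain", "caf\xc3\xa9", "\xff", NA)

# Only the raw bytes are inspected; any marked encoding is ignored.
ok <- validUTF8(x)
```

The single byte 0xff can never occur in UTF-8, so only the third element is reported as invalid.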
It would be possible to check for the validity of character strings in a Latin-1 encoding, but extensions such as CP1252 are widely accepted as ‘Latin-1’ and 8-bit encodings rarely need to be checked for validity.
x <-
  ## from example(text)
  c("Jetz", "no", "chli", "z\xc3\xbcrit\xc3\xbc\xc3\xbctsch:",
    "(noch", "ein", "bi\xc3\x9fchen", "Z\xc3\xbc", "deutsch)",
    ## from a CRAN check log
    "\xfa\xb4\xbf\xbf\x9f")
validUTF8(x)
validEnc(x) # depends on the locale
Encoding(x) <-"UTF-8"
validEnc(x) # typically the last, x[10], is invalid

## Maybe advantageous to declare it "unknown":
G <- x ; Encoding(G[!validEnc(G)]) <- "unknown"
try( substr(x, 1,1) ) # gives 'invalid multibyte string' error in a UTF-8 locale
try( substr(G, 1,1) ) # works in a UTF-8 locale
nchar(G) # fine, too
## but it is not "more valid" typically:
all.equal(validEnc(x), validEnc(G)) # typically TRUE
A vector in R is either an atomic vector i.e., one of the atomic types, see ‘Details’, or of type (typeof) or mode list or expression.
vector produces a ‘simple’ vector of the given length and mode, where a ‘simple’ vector has no attribute, i.e., fulfills is.null(attributes(.)).
as.vector, a generic, attempts to coerce its argument into a vector of mode mode (the default is to coerce to whichever vector mode is most convenient): if the result is atomic (is.atomic), all attributes are removed. For mode="any", see ‘Details’.
is.vector(x) returns TRUE if x is a vector of the specified mode having no attributes other than names. For mode="any", see ‘Details’.
vector(mode = "logical", length = 0)
as.vector(x, mode = "any")
is.vector(x, mode = "any")
mode | character string naming an atomic mode or |
length | a non-negative integer specifying the desired length. For a long vector, i.e., |
x | an R object. |
The atomic modes are "logical", "integer", "numeric" (synonym "double"), "complex", "character" and "raw".
If mode = "any", is.vector may return TRUE for the atomic modes, list and expression. For any mode, it will return FALSE if x has any attributes except names. (This is incompatible with S.) On the other hand, as.vector removes all attributes including names for results of atomic mode.
For mode = "any", and atomic vectors x, as.vector(x) strips all attributes (including names), returning a simple atomic vector.
However, when x is of type "list" or "expression", as.vector(x) currently returns the argument x unchanged, unless there is an as.vector method for class(x).
Note that factors are not vectors; is.vector returns FALSE and as.vector converts a factor to a character vector for mode = "any".
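The rules above about attributes, lists and factors can be checked with a few made-up objects (all names are illustrative):

```r
x <- c(a = 1, b = 2)
named_ok <- is.vector(x)          # names alone are allowed

attr(x, "unit") <- "cm"
with_attr <- is.vector(x)         # any other attribute makes this FALSE
stripped  <- as.vector(x)         # atomic result: all attributes removed

l <- structure(list(1), note = "kept")
l_same <- identical(as.vector(l), l)   # lists are returned unchanged

f <- factor(c("u", "v"))
f_is_vec <- is.vector(f)          # factors are not vectors
f_chr    <- as.vector(f)          # converted to a character vector
```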
For vector, a vector of the given length and mode. Logical vector elements are initialized to FALSE, numeric vector elements to 0, character vector elements to "", raw vector elements to nul bytes and list/expression elements to NULL.
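The initial values can be confirmed with a short sketch that creates one vector of each kind:

```r
v_lgl  <- vector("logical", 2)     # FALSE elements
v_num  <- vector("numeric", 2)     # zeros
v_chr  <- vector("character", 2)   # empty strings
v_raw  <- vector("raw", 1)         # nul byte
v_list <- vector("list", 2)        # NULL elements
```

Pre-allocating a list with vector("list", n) is a common way to reserve space before filling it in a loop.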
For as.vector, a vector (atomic or of type list or expression). All attributes are removed from the result if it is of an atomic mode, but not in general for a list or expression result. The default method handles 24 input types and 12 values of type: the details of most coercions are undocumented and subject to change.
For is.vector, TRUE or FALSE. is.vector(x, mode = "numeric") can be true for vectors of types "integer" or "double" whereas is.vector(x, mode = "double") can only be true for those of type "double".
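The distinction between mode "numeric" and mode "double" can be seen with an integer and a double value:

```r
int_as_numeric <- is.vector(1L,  mode = "numeric")  # integers count as numeric
int_as_double  <- is.vector(1L,  mode = "double")   # but they are not double
dbl_as_double  <- is.vector(1.5, mode = "double")
dbl_as_numeric <- is.vector(1.5, mode = "numeric")
```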
as.vector()
Writers of methods for as.vector need to take care to follow the conventions of the default method. In particular
Argument mode can be "any", any of the atomic modes, "list", "expression", "symbol", "pairlist" or one of the aliases "double" and "name".
The return value should be of the appropriate mode. For mode = "any" this means an atomic vector or list or expression.
Attributes should be treated appropriately: in particular when the result is an atomic vector there should be no attributes, not even names.
is.vector(as.vector(x, m), m) should be true for any mode m, including the default "any".
Currently this is not fulfilled in R when m == "any" and x is of type list or expression with attributes in addition to names, which is typically the case for (S3 or S4) objects (see is.object) which are lists internally.
as.vector and is.vector are quite distinct from the meaning of the formal class "vector" in the methods package, and hence as(x, "vector") and is(x, "vector").
Note that as.vector(x) is not necessarily a null operation if is.vector(x) is true: any names will be removed from an atomic vector.
Non-vector modes "symbol" (synonym "name") and "pairlist" are accepted but have long been undocumented: they are used to implement as.name and as.pairlist, and those functions should preferably be used directly. None of the description here applies to those modes: see the help for the preferred forms.
Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988) The New S Language. Wadsworth & Brooks/Cole.
c, is.numeric, is.list, etc.
df <- data.frame(x = 1:3, y = 5:7)
## Error:
try(as.vector(data.frame(x = 1:3, y = 5:7), mode = "numeric"))
x <- c(a = 1, b = 2)
is.vector(x)
as.vector(x)
all.equal(x, as.vector(x)) ## FALSE
###-- All the following are TRUE:
is.list(df)
! is.vector(df)
! is.vector(df, mode = "list")
is.vector(list(), mode = "list")
Vectorize creates a function wrapper that vectorizes the action of its argument FUN.
Vectorize(FUN, vectorize.args = arg.names, SIMPLIFY = TRUE, USE.NAMES = TRUE)
FUN | function to apply, found via |
vectorize.args | a character vector of arguments which should be vectorized. Defaults to all arguments of |
SIMPLIFY | logical or character string; attempt to reduce the result to a vector, matrix or higher dimensional array; see the |
USE.NAMES | logical; use names if the first ... argument has names, or if it is a character vector, use that character vector as the names. |
The arguments named in the vectorize.args argument to Vectorize are the arguments passed in the ... list to mapply. Only those that are actually passed will be vectorized; default values will not. See the examples.
Vectorize cannot be used with primitive functions as they do not have a value for formals.
It also cannot be used with functions that have arguments named FUN, vectorize.args, SIMPLIFY or USE.NAMES, as they will interfere with the Vectorize arguments. See the combn example below for a workaround.
A function with the same arguments as FUN, wrapping a call to mapply.
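A minimal sketch of why such a wrapper can be useful: a function whose body uses a scalar if() cannot take a vector directly, but its vectorized wrapper can. The function describe and its arguments are hypothetical names used only for this illustration.

```r
# Hypothetical scalar function: if() needs a length-one condition.
describe <- function(n, unit) {
    if (n == 1) paste(n, unit) else paste0(n, " ", unit, "s")
}

# Vectorize only over 'n'; 'unit' is passed through unchanged.
vdescribe <- Vectorize(describe, vectorize.args = "n")
out <- vdescribe(1:3, "item")
```

Restricting vectorize.args to "n" keeps unit out of the ... list passed to mapply, so it is supplied whole to each call.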
# We use rep.int as rep is primitive
vrep <- Vectorize(rep.int)
vrep(1:4, 4:1)
vrep(times = 1:4, x = 4:1)

vrep <- Vectorize(rep.int, "times")
vrep(times = 1:4, x = 42)

f <- function(x = 1:3, y) c(x, y)
vf <- Vectorize(f, SIMPLIFY = FALSE)
f(1:3, 1:3)
vf(1:3, 1:3)
vf(y = 1:3) # Only vectorizes y, not x

# Nonlinear regression contour plot, based on nls() example
require(graphics)
SS <- function(Vm, K, resp, conc) {
    pred <- (Vm * conc)/(K + conc)
    sum((resp - pred)^2 / pred)
}
vSS <- Vectorize(SS, c("Vm", "K"))
Treated <- subset(Puromycin, state == "treated")

Vm <- seq(140, 310, length.out = 50)
K <- seq(0, 0.15, length.out = 40)
SSvals <- outer(Vm, K, vSS, Treated$rate, Treated$conc)
contour(Vm, K, SSvals, levels = (1:10)^2, xlab = "Vm", ylab = "K")

# combn() has an argument named FUN
combnV <- Vectorize(function(x, m, FUNV = NULL) combn(x, m, FUN = FUNV),
                    vectorize.args = c("x", "m"))
combnV(4, 1:4)
combnV(4, 1:4, sum)
Generates a warning message that corresponds to its argument(s) and (optionally) the expression or function from which it was called.
warning(..., call. = TRUE, immediate. = FALSE, noBreaks. = FALSE, domain = NULL)
suppressWarnings(expr, classes = "warning")
... | either zero or more objects which can be coerced to character (and which are pasted together with no separator) or a single condition object. |
call. | logical, indicating if the call should become part of the warning message. |
immediate. | logical, indicating if the warning should be output immediately, even if |
noBreaks. | logical, indicating as far as possible the message should be output as a single line when |
expr | expression to evaluate. |
domain | see |
classes | character, indicating which classes of warnings should be suppressed. |
The result depends on the value of options("warn") and on handlers established in the executing code.
If a condition object is supplied it should be the only argument, and further arguments will be ignored, with a message. options(warn = 1) can be used to request an immediate report.
warning signals a warning condition by (effectively) calling signalCondition. If there are no handlers or if all handlers return, then the value of warn = getOption("warn") is used to determine the appropriate action. If warn is negative warnings are ignored; if it is zero they are stored and printed after the top-level function has completed; if it is one they are printed as they occur and if it is 2 (or larger) warnings are turned into errors. Calling warning(immediate. = TRUE) turns warn <= 0 into warn = 1 for this call only.
If warn is zero (the default), a read-only variable last.warning is created. It contains the warnings which can be printed via a call to warnings.
Warnings will be truncated to getOption("warning.length") characters, default 1000, indicated by [... truncated].
While the warning is being processed, a muffleWarning restart is available. If this restart is invoked with invokeRestart, then warning returns immediately.
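One way the muffleWarning restart might be used is sketched below: a calling handler records each warning message and then invokes the restart, so the evaluated code continues after the warning call. The helper name collect_warnings is hypothetical and not part of base R.

```r
# Hypothetical helper: evaluate 'expr', collecting warning messages
# instead of letting them be reported.
collect_warnings <- function(expr) {
    msgs <- character()
    value <- withCallingHandlers(
        expr,
        warning = function(w) {
            msgs <<- c(msgs, conditionMessage(w))
            invokeRestart("muffleWarning")   # warning() returns; evaluation resumes
        }
    )
    list(value = value, warnings = msgs)
}

res <- collect_warnings({
    warning("first problem")
    warning("second problem")
    "done"
})
```

Because the handler muffles rather than unwinds, both warnings are recorded and the final value of the expression is still returned.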
An attempt is made to coerce other types of inputs to warning to character vectors.
suppressWarnings evaluates its expression in a context that ignores all warnings.
The warning message as character string, invisibly.
Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988) The New S Language. Wadsworth & Brooks/Cole.
stop for fatal errors, message for diagnostic messages, warnings, and options with argument warn=.
gettext for the mechanisms for the automated translation of messages.
testit <- function() warning("testit")
testit() ## shows call
testit <- function() warning("problem in testit", call. = FALSE)
testit() ## no call
suppressWarnings(warning("testit"))
warnings and its print method print the variable last.warning in a pleasing form.
warnings(...)
## S3 method for class 'warnings'
summary(object, ...)
## S3 method for class 'warnings'
print(x, tags, header = ngettext(n, "Warning message:\n", "Warning messages:\n"), ...)
## S3 method for class 'summary.warnings'
print(x, ...)
... | arguments to be passed to |
object | a |
x | a |
tags | if not |
header | a character string |
See the description of options("warn") for the circumstances under which there is a last.warning object and warnings() is used. In essence this is if options(warn = 0) and warning has been called at least once.
Note that the length(last.warning) is maximally getOption("nwarnings") (at the time the warnings are generated) which is 50 by default. To increase, use something like
options(nwarnings = 10000)
It is possible that last.warning refers to the last recorded warning and not to the last warning, for example if options(warn) has been changed or if a catastrophic error occurred.
warnings() returns an object of S3 class "warnings", basically a named list. In R versions before 4.4.0, it returned NULL when there were no warnings, contrary to the above documentation.
summary(<warnings>) returns a "summary.warnings" object which is basically the list of unique warnings (unique(object)) with a "counts" attribute, somewhat experimentally.
It is undocumented where last.warning is stored nor that it is visible, and this is subject to change.
Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988) The New S Language. Wadsworth & Brooks/Cole.
## NB this example is intended to be pasted in,
##    rather than run by example()
ow <- options("warn")
for(w in -1:1) {
   options(warn = w); cat("\n warn =", w, "\n")
   for(i in 1:3) { cat(i,"..\n"); m <- matrix(1:7, 3,4) }
   cat("--=--=--\n")
}
## at the end prints all three warnings, from the 'option(warn = 0)' above
options(ow) # reset to previous, typically 'warn = 0'
tail(warnings(), 2) # see the last two warnings only (via '[' method)

## Often the most useful way to look at many warnings:
summary(warnings())

op <- options(nwarnings = 10000) ## <- get "full statistics"
x <- 1:36; for(n in 1:13) for(m in 1:12) A <- matrix(x, n,m) # There were 105 warnings ...
summary(warnings())
options(op) # revert to previous (keeping 50 messages by default)
Extract the weekday, month or quarter, or the Julian time (days since some origin). These are generic functions: the methods for the internal date-time classes are documented here.
weekdays(x, abbreviate)
## S3 method for class 'POSIXt'
weekdays(x, abbreviate = FALSE)
## S3 method for class 'Date'
weekdays(x, abbreviate = FALSE)
months(x, abbreviate)
## S3 method for class 'POSIXt'
months(x, abbreviate = FALSE)
## S3 method for class 'Date'
months(x, abbreviate = FALSE)
quarters(x, abbreviate)
## S3 method for class 'POSIXt'
quarters(x, ...)
## S3 method for class 'Date'
quarters(x, ...)
julian(x, ...)
## S3 method for class 'POSIXt'
julian(x, origin = as.POSIXct("1970-01-01", tz = "GMT"), ...)
## S3 method for class 'Date'
julian(x, origin = as.Date("1970-01-01"), ...)
x | an object inheriting from class |
abbreviate | logical vector (possibly recycled). Should the names be abbreviated? |
origin | a length-one object inheriting from class |
... | arguments for other methods. |
weekdays and months return a character vector of names in the locale in use, i.e., Sys.getlocale("LC_TIME").
quarters returns a character vector of "Q1" to "Q4".
julian returns the number of days (possibly fractional) since the origin, with the origin as an "origin" attribute. All time calculations in R are done ignoring leap-seconds.
Other components such as the day of the month or the year are very easy to compute: just use as.POSIXlt and extract the relevant component. Alternatively (especially if the components are desired as character strings), use strftime.
DateTimeClasses, Date; Sys.getlocale("LC_TIME") crucially for months() and weekdays().
## first two are locale dependent:
weekdays(.leap.seconds)
months  (.leap.seconds)
quarters(.leap.seconds)

## Show how easily you get month, day, year, day (of {month, week, yr}), ... :
## (remember to count from 0 (!): mon = 0..11, wday = 0..6, etc !!)

##' Transform (Time-)Date vector to convenient data frame :
dt2df <- function(dt, dName = deparse(substitute(dt))) {
    DF <- as.data.frame(unclass(as.POSIXlt( dt )))
    `names<-`(cbind(dt, DF, deparse.level=0L), c(dName, names(DF)))
}
## e.g.,
dt2df(.leap.seconds)    # date+time
dt2df(Sys.Date() + 0:9) # date

##' Even simpler: Date -> Matrix - dropping time info {sec,min,hour, isdst}
d2mat <- function(x) simplify2array(unclass(as.POSIXlt(x))[4:7])
## e.g.,
d2mat(seq(as.Date("2000-02-02"), by=1, length.out=30)) # has R 1.0.0's release date

## Julian Day Number (JDN, https://en.wikipedia.org/wiki/Julian_day)
## is the number of days since noon UTC on the first day of 4713 BCE
## in the proleptic Julian calendar. More recently, it is defined in
## 'Terrestrial Time' which differs from UTC by a few seconds.
## See https://en.wikipedia.org/wiki/Terrestrial_Time
julian(Sys.Date(), -2440588) # from a day
floor(as.numeric(julian(Sys.time())) + 2440587.5) # from a date-time
Give the TRUE indices of a logical object, allowing for array indices.
which(x, arr.ind = FALSE, useNames = TRUE)
arrayInd(ind, .dim, .dimnames = NULL, useNames = FALSE)
x | a |
arr.ind | logical; should array indices be returned when |
ind | integer-valued index vector, as resulting from |
.dim |
|
.dimnames | optional list of character |
useNames | logical indicating if the value of |
If arr.ind == FALSE (the default), an integer vector, or a double vector if x is a long vector, with length equal to sum(x), i.e., to the number of TRUEs in x.
Basically, the result is (1:length(x))[x] in typical cases; more generally, including when x has NA's, which(x) is seq_along(x)[!is.na(x) & x] plus names when x has.
If arr.ind == TRUE and x is an array (has a dim attribute), the result is arrayInd(which(x), dim(x), dimnames(x)), namely a matrix whose rows each are the indices of one element of x; see Examples below.
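The equivalences stated above can be checked on small inputs. The following is only a sketch with made-up data; the variable names are not from the original page.

```r
# A named logical vector containing an NA.
x <- c(a = TRUE, b = NA, c = FALSE, d = TRUE)
idx    <- which(x)                        # NA is treated as not TRUE; names are kept
manual <- seq_along(x)[!is.na(x) & x]     # the documented equivalent, without names

# A 2 x 3 logical matrix, filled column by column.
m  <- matrix(c(TRUE, FALSE, FALSE, TRUE, TRUE, FALSE), nrow = 2)
ai <- which(m, arr.ind = TRUE)            # one row of (row, col) indices per TRUE
```

Each row of ai locates one TRUE element, here at positions (1, 1), (2, 2) and (1, 3).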
Unlike most other base R functions this does not coerce x to logical: only arguments with typeof logical are accepted and others give an error.
Werner Stahel and Peter Holzer (ETH Zurich) proposed the arr.ind option.
Logic, which.min for the index of the minimum or maximum, and match for the first index of an element in a vector, i.e., for a scalar a, match(a, x) is equivalent to min(which(x == a)) but much more efficient.
which(LETTERS == "R")
which(ll <- c(TRUE, FALSE, TRUE, NA, FALSE, FALSE, TRUE)) #> 1 3 7
names(ll) <- letters[seq(ll)]
which(ll)
which((1:12)%%2 == 0) # which are even?
which(1:10 > 3, arr.ind = TRUE)

( m <- matrix(1:12, 3, 4) )
div.3 <- m %% 3 == 0
which(div.3)
which(div.3, arr.ind = TRUE)
rownames(m) <- paste("Case", 1:3, sep = "_")
which(m %% 5 == 0, arr.ind = TRUE)

dim(m) <- c(2, 2, 3); m
which(div.3, arr.ind = FALSE)
which(div.3, arr.ind = TRUE)

vm <- c(m)
dim(vm) <- length(vm) #-- funny thing with length(dim(...)) == 1
which(div.3, arr.ind = TRUE)
Determines the location, i.e., index of the (first) minimum or maximum of a numeric (or logical) vector.
which.min(x)
which.max(x)
x | numeric (logical, integer or double) vector or an R object for which the internal coercion to |
Missing and NaN values are discarded.
an integer or on 64-bit platforms, if length(x) =: n an integer valued double of length 1 or 0 (iff x has no non-NAs), giving the index of the first minimum or maximum respectively of x.
If this extremum is unique (or empty), the results are the same as (but more efficient than) which(x == min(x, na.rm = TRUE)) or which(x == max(x, na.rm = TRUE)) respectively.
x – First TRUE or FALSE
For a logical vector x with both FALSE and TRUE values, which.min(x) and which.max(x) return the index of the first FALSE or TRUE, respectively, as FALSE < TRUE. However, match(FALSE, x) or match(TRUE, x) are typically preferred, as they do indicate mismatches.
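The difference between which.max and match on logical input shows up when no TRUE is present. A minimal sketch, with vectors invented for this illustration:

```r
x <- c(FALSE, FALSE, TRUE, TRUE)
first_true  <- which.max(x)    # index of the first TRUE
first_false <- which.min(x)    # index of the first FALSE

# With no TRUE at all, which.max() still returns an index
# (that of the first maximum, which is FALSE), whereas match() reports NA.
all_false    <- c(FALSE, FALSE)
by_which_max <- which.max(all_false)
by_match     <- match(TRUE, all_false)
```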
Martin Maechler
Use arrayInd(), if you need array/matrix indices instead of 1D vector ones.
which.is.max in package nnet differs in breaking ties at random (and having a ‘fuzz’ in the definition of ties).
x <- c(1:4, 0:5, 11)
which.min(x)
which.max(x)

## it *does* work with NA's present, by discarding them:
presidents[1:30]
range(presidents, na.rm = TRUE)
which.min(presidents) # 28
which.max(presidents) # 2

## Find the first occurrence, i.e. the first TRUE, if there is at least one:
x <- rpois(10000, lambda = 10); x[sample.int(50, 20)] <- NA
## where is the first value >= 20 ?
which.max(x >= 20)

## Also works for lists (which can be coerced to numeric vectors):
which.min(list(A = 7, pi = pi)) ## -> c(pi = 2L)
Evaluate an R expression in an environment constructed from data, possibly modifying (a copy of) the original data.
with(data, expr, ...)
within(data, expr, ...)
## S3 method for class 'list'
within(data, expr, keepAttrs = TRUE, ...)
data | data to use for constructing an environment. For the default |
expr | expression to evaluate; particularly for { a <- somefun() b <- otherfun() ..... rm(unused1, temp) } |
keepAttrs | for the |
... | arguments to be passed to (future) methods. |
with is a generic function that evaluates expr in a local environment constructed from data. The environment has the caller's environment as its parent. This is useful for simplifying calls to modeling functions. (Note: if data is already an environment then this is used with its existing parent.)
Note that assignments within expr take place in the constructed environment and not in the user's workspace.
within is similar, except that it examines the environment after the evaluation of expr and makes the corresponding modifications to a copy of data (this may fail in the data frame case if objects are created which cannot be stored in a data frame), and returns it. within can be used as an alternative to transform.
For with, the value of the evaluated expr. For within, the modified object.
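The contrast between the two functions can be sketched on a tiny data frame: with returns the value of the expression and discards assignments, whereas within returns a modified copy. The data frame and variable names below are invented for illustration.

```r
df <- data.frame(a = 1:3, b = c(10, 20, 30))

# with(): 'total' is created in the constructed environment only;
# just the value of the expression comes back.
grand_total <- with(df, { total <- a + b; sum(total) })

# within(): assignments and removals are applied to a copy of 'df'.
df2 <- within(df, { total <- a + b; rm(b) })
```

The original df is unchanged in both cases; df2 gains the new column total and loses b.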
For interactive use this is very effective and nice to read. For programming however, i.e., in one's functions, more care is needed, and typically one should refrain from using with(), as, e.g., variables in data may accidentally override local variables, see the reference.
Further, when using modeling or graphics functions with an explicit data argument (and typically using formulas), it is typically preferred to use the data argument of that function rather than to use with(data, ...).
Thomas Lumley (2003) Standard nonstandard evaluation rules. https://developer.r-project.org/nonstandard-eval.pdf
evalq, attach, assign, transform.
with(mtcars, mpg[cyl == 8 & disp > 350])
    # is the same as, but nicer than
mtcars$mpg[mtcars$cyl == 8 & mtcars$disp > 350]

require(stats); require(graphics)

# examples from glm:
with(data.frame(u = c(5,10,15,20,30,40,60,80,100),
                lot1 = c(118,58,42,35,27,25,21,19,18),
                lot2 = c(69,35,26,21,18,16,13,12,12)),
    list(summary(glm(lot1 ~ log(u), family = Gamma)),
         summary(glm(lot2 ~ log(u), family = Gamma))))

aq <- within(airquality, {     # Notice that multiple vars can be changed
    lOzone <- log(Ozone)
    Month <- factor(month.abb[Month])
    cTemp <- round((Temp - 32) * 5/9, 1) # From Fahrenheit to Celsius
    S.cT <- Solar.R / cTemp  # using the newly created variable
    rm(Day, Temp)
})
head(aq)

# example from boxplot:
with(ToothGrowth, {
    boxplot(len ~ dose, boxwex = 0.25, at = 1:3 - 0.2,
            subset = (supp == "VC"), col = "yellow",
            main = "Guinea Pigs' Tooth Growth",
            xlab = "Vitamin C dose mg",
            ylab = "tooth length", ylim = c(0, 35))
    boxplot(len ~ dose, add = TRUE, boxwex = 0.25, at = 1:3 + 0.2,
            subset = supp == "OJ", col = "orange")
    legend(2, 9, c("Ascorbic acid", "Orange juice"),
           fill = c("yellow", "orange"))
})

# alternate form that avoids subset argument:
with(subset(ToothGrowth, supp == "VC"),
     boxplot(len ~ dose, boxwex = 0.25, at = 1:3 - 0.2,
             col = "yellow", main = "Guinea Pigs' Tooth Growth",
             xlab = "Vitamin C dose mg",
             ylab = "tooth length", ylim = c(0, 35)))
with(subset(ToothGrowth, supp == "OJ"),
     boxplot(len ~ dose, add = TRUE, boxwex = 0.25, at = 1:3 + 0.2,
             col = "orange"))
legend(2, 9, c("Ascorbic acid", "Orange juice"),
       fill = c("yellow", "orange"))
This function evaluates an expression, returning it in a two-element list containing its value and a flag showing whether it would automatically print.
withVisible(x)
x | an expression to be evaluated. |
The argument, not an expression object, rather an (unevaluated function) call, is evaluated in the caller's context.
This is a primitive function.
value | The value of |
visible | logical; whether the value would auto-print. |
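A brief sketch of the two components, using a hypothetical function quiet that returns its value invisibly:

```r
# Hypothetical function whose result would not auto-print.
quiet <- function() invisible(42)

res_quiet <- withVisible(quiet())   # value is captured, visibility flag is FALSE
res_loud  <- withVisible(21 * 2)    # an ordinary expression would auto-print
```

Both results carry the same value; only the visible flag differs.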
invisible, eval; withAutoprint() calls source() which itself uses withVisible() in order to correctly “auto print”.
x <- 1
withVisible(x <- 1) # *$visible is FALSE
x
withVisible(x)      # *$visible is TRUE

# Wrap the call in evalq() for special handling
df <- data.frame(a = 1:5, b = 1:5)
evalq(withVisible(a + b), envir = df)
Write data x to a file or other connection.
As it simply calls cat(), less formatting happens than with print()ing. If x is a matrix you need to transpose it (and typically set ncolumns) to get the columns in file the same as those in the internal representation.
Whereas atomic vectors (numeric, character, etc, including matrices) are written plainly, i.e., without any names, less simple vector-like objects such as "factor", "Date", or "POSIXt" may be formatted to character before writing.
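The need to transpose a matrix can be seen in a short sketch that writes a small matrix to a temporary file and reads the lines back. The matrix and file prefix are arbitrary choices for this illustration.

```r
m <- matrix(1:6, nrow = 2)      # rows are (1, 3, 5) and (2, 4, 6)

tmp <- tempfile("matrix")
# Matrices are stored column by column, so transpose first and
# write one matrix row per line by setting ncolumns to ncol(m).
write(t(m), tmp, ncolumns = ncol(m))
written <- readLines(tmp)
unlink(tmp)                     # tidy up
```

Without the transpose, the values would be written in storage order (1 2 3, then 4 5 6) and the file layout would not match the matrix rows.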
write(x, file = "data", ncolumns = if(is.character(x)) 1 else 5, append = FALSE, sep = " ")
x | the data to be written out. |
file | a When |
ncolumns | the number of columns to write the data in. |
append | if |
sep | a string used to separate columns. Using |
Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988) The New S Language. Wadsworth & Brooks/Cole.
write is a wrapper for cat, which gives further details on the format used.
write.table for matrix and data frame objects, writeLines for lines of text, and scan for reading data.
saveRDS and save are often preferable (for writing any R objects).
# Demonstrate default ncolumns, writing to the console
write(month.abb,  "") # 1 element  per line for "character"
write(stack.loss, "") # 5 elements per line for "numeric"

# Build a file with sequential calls
fil <- tempfile("data")
write("# Model settings", fil)
write(month.abb, fil, ncolumns = 6, append = TRUE)
write("\n# Initial parameter values", fil, append = TRUE)
write(sqrt(stack.loss), fil, append = TRUE)
if(interactive()) file.show(fil)
unlink(fil) # tidy up
Write text lines to a connection.
writeLines(text, con = stdout(), sep = "\n", useBytes = FALSE)
text | a character vector. |
con | a connection object or a character string. |
sep | character string. A string to be written to the connectionafter each line of text. |
useBytes | logical. See ‘Details’. |
If the con is a character string, the function calls file to obtain a file connection which is opened for the duration of the function call. (tilde expansion of the file path is done by file.)
If the connection is open it is written from its current position. If it is not open, it is opened for the duration of the call in "wt" mode and then closed again.
Normally writeLines is used with a text-mode connection, and the default separator is converted to the normal separator for that platform (LF on Unix/Linux, CRLF on Windows). For more control, open a binary connection and specify the precise value you want written to the file in sep. For even more control, use writeChar on a binary connection.
useBytes is for expert use. Normally (when false) character strings with marked encodings are converted to the current encoding before being passed to the connection (which might do further re-encoding). useBytes = TRUE suppresses the re-encoding of marked strings so they are passed byte-by-byte to the connection: this can be useful when strings have already been re-encoded by e.g. iconv. (It is invoked automatically for strings with marked encoding "bytes".)
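As a sketch of the binary-connection approach, the code below forces CRLF line endings regardless of platform and then inspects the raw bytes. The file prefix and text are arbitrary.

```r
tmp <- tempfile("lines")

con <- file(tmp, open = "wb")    # binary mode: 'sep' is written exactly as given
writeLines(c("alpha", "beta"), con, sep = "\r\n")
close(con)

bytes <- readBin(tmp, what = "raw", n = file.size(tmp))
unlink(tmp)                      # tidy up
```

Reading the file back as raw bytes avoids any line-ending translation on input, so the separator actually written can be verified.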
connections, writeChar, writeBin, readLines, cat
A generic auxiliary function that produces a numeric vector which will sort in the same order as x.
xtfrm(x)
x | an R object. |
This is a special case of ranking, but as a less general function than rank is more suitable to be made generic. The default method is similar to rank(x, ties.method = "min", na.last = "keep"), so NA values are given rank NA and all tied values are given equal integer rank.
The factor method extracts the codes.
The default method will unclass the object if is.numeric(x) is true but otherwise make use of == and > methods for the class of x[i] (for integers i), and the is.na method for the class of x, but might be rather slow when doing so.
This is an internal generic primitive, so S3 or S4 methods can be written for it. Differently to other internal generics, the default method is called explicitly when no other dispatch has happened.
A numeric (usually integer) vector of the same length as x.
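The factor and default methods can be compared on small inputs. This is only a sketch with invented data:

```r
# Factor method: the integer codes, following the level order.
f <- factor(c("lo", "hi", "mid", "hi"), levels = c("lo", "mid", "hi"))
codes <- xtfrm(f)

# Default method on a character vector: ranks with ties sharing
# the minimum rank and NA kept as NA.
s <- c("pear", "apple", NA, "apple")
ranks <- xtfrm(s)
```

In both cases, ordering the result gives the same permutation as ordering the original object.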
zapsmall determines a digits argument dr for calling round(x, digits = dr) such that values close to zero (compared with the maximal absolute value in the vector) are ‘zapped’, i.e., replaced by 0.
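A minimal sketch of the effect on a vector that mixes ordinary values with floating-point noise; the input is made up, and digits is passed explicitly so the result does not depend on the session's options.

```r
x <- c(1, 1e-20, -2.5, 3e-18)   # two entries are negligible relative to 2.5
z <- zapsmall(x, digits = 7)    # negligible entries are rounded to exactly 0
```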
zapsmall(x, digits = getOption("digits"), mFUN = function(x, ina) max(abs(x[!ina])), min.d = 0L)
x | a numeric or complex vector or any R number-like object which has a |
digits | integer indicating the precision to be used. |
mFUN | a |
min.d | an integer specifying the minimal number of digits to use in the resulting |
Chambers, J. M. (1998) Programming with Data. A Guide to the S Language. Springer.
x2 <- pi * 100^(-2:2)/10
   print(  x2, digits = 4)
zapsmall(  x2) # automatical digits
zapsmall(  x2, digits = 4)
zapsmall(c(x2, Inf)) # round()s to integer ..
zapsmall(c(x2, Inf), min.d=-Inf) # everything is small wrt Inf

(z <- exp(1i*0:4*pi/2))
zapsmall(z)

zapShow <- function(x, ...) rbind(orig = x, zapped = zapsmall(x, ...))
zapShow(x2)

## using a *robust* mFUN
mF_rob <- function(x, ina) boxplot.stats(x, do.conf=FALSE)$stats[5]
## with robust mFUN(), 'Inf' is no longer distorting the picture:
zapShow(c(x2, Inf), mFUN = mF_rob)
zapShow(c(x2, Inf), mFUN = mF_rob, min.d = -5) # the same
zapShow(c(x2, 999), mFUN = mF_rob) # same *rounding* as w/ Inf
zapShow(c(x2, 999), mFUN = mF_rob, min.d = 3) # the same
zapShow(c(x2, 999), mFUN = mF_rob, min.d = 8) # small diff
.packages returns information about package availability.
.packages(all.available = FALSE, lib.loc = NULL)
all.available | logical; if |
lib.loc | a character vector describing the location of R library trees to search through, or |
.packages() returns the names of the currently attached packages invisibly whereas .packages(all.available = TRUE) gives (visibly) all packages available in the library location path lib.loc.
For a package to be regarded as being ‘available’ it must have valid metadata (and hence be an installed package). However, this will report a package as available if the metadata does not match the directory name: use find.package to confirm that the metadata match or installed.packages for a much slower but more comprehensive check of ‘available’ packages.
A character vector of package base names, invisible unless all.available = TRUE.
.packages(all.available = TRUE) is not a way to find out if a small number of packages are available for use: not only is it expensive when thousands of packages are installed, it is an incomplete test. See the help for find.package for why require should be used.
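A sketch of the recommended pattern: inspect the attached packages with .packages(), and test individual packages with require rather than scanning everything installed. The name notARealPackage12345 is a made-up placeholder, assumed not to be installed.

```r
# The invisible return value can still be assigned.
attached <- .packages()
base_attached <- "base" %in% attached

# Testing a single package by name; require() returns FALSE (with a
# warning, suppressed here) when the package cannot be loaded.
has_stats <- require("stats", quietly = TRUE, character.only = TRUE)
has_fake  <- suppressWarnings(
    require("notARealPackage12345", quietly = TRUE, character.only = TRUE)
)
```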
R core; Guido Masarotto for the all.available = TRUE part of .packages.
library, .libPaths, installed.packages.
(.packages())               # maybe just "base"
.packages(all.available = TRUE) # return all available as character vector
require(splines)
(.packages())               # "splines", too
detach("package:splines")
Miscellaneous internal/programming utilities.
.standard_regexps()
.standard_regexps returns a list of ‘standard’ regexps, including elements named valid_package_name and valid_package_version with the obvious meanings. The regexps are not anchored.
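Because the regexps are not anchored, matching a whole string requires adding anchors explicitly. The sketch below shows one way this might be done; the candidate strings are invented, and the results assume that a valid package name must begin with a letter.

```r
re <- .standard_regexps()$valid_package_name
anchored <- paste0("^", re, "$")     # require the whole string to match

candidates <- c("stats", "2bad")     # the second starts with a digit
unanchored_hits <- grepl(re, candidates)        # a matching substring is enough
anchored_hits   <- grepl(anchored, candidates)  # the entire string must qualify
```

Unanchored, "2bad" is accepted because its tail "bad" matches; with anchors it is rejected.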