Package 'base'

Title: The R Base Package
Description: Base R functions.
Authors: R Core Team and contributors worldwide
Maintainer: R Core Team <[email protected]>
License: Part of R 4.4.1
Version: 4.4.1
Built: 2024-06-15 17:27:47 UTC
Source: base

Help Index


The R Base Package

Description

Base R functions

Details

This package contains the basic functions which let R function as a language: arithmetic, input/output, basic programming support, etc. Its contents are available through inheritance from any environment.

For a complete list of functions, use library(help = "base").


Bin a Numeric Vector

Description

Bin a numeric vector and return integer codes for the binning.

Usage

.bincode(x, breaks, right=TRUE, include.lowest=FALSE)

Arguments

x

a numeric vector which is to be converted to integer codes by binning.

breaks

a numeric vector of two or more cut points, sorted in increasing order.

right

logical, indicating if the intervals should be closed on the right (and open on the left) or vice versa.

include.lowest

logical, indicating if an ‘x[i]’ equal to the lowest (or highest, for right = FALSE) ‘breaks’ value should be included in the first (or last) bin.

Details

This is a ‘barebones’ version of cut.default(labels = FALSE) intended for use in other functions which have checked the arguments passed. (Note the different order of the arguments they have in common.)

Unlike cut, the breaks do not need to be unique. An input can only fall into a zero-length interval if it is closed at both ends, so only if include.lowest = TRUE and it is the first (or last for right = FALSE) interval.

Value

An integer vector of the same length as x indicating which bin each element falls into (the leftmost bin being bin 1). NaN and NA elements of x are mapped to NA codes, as are values outside the range of breaks.
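As a rough illustration of the return value, the sketch below bins a small vector and compares the codes with those from cut(labels = FALSE). The values of x and breaks are made up for this example.

```r
# Illustrative data: values below, on, inside and above the break range
x <- c(-1, 0, 0.5, 1, 2, NA)
breaks <- c(0, 0.5, 1)

# Default: intervals (0, 0.5] and (0.5, 1]; 0 lies on the open left end
codes <- .bincode(x, breaks)

# include.lowest = TRUE closes the first interval on the left: [0, 0.5]
codes_lowest <- .bincode(x, breaks, right = TRUE, include.lowest = TRUE)

# cut() with labels = FALSE should give the same integer codes
codes_cut <- cut(x, breaks, labels = FALSE)
```

Out-of-range and missing inputs both come back as NA, so callers that need to tell them apart have to check x themselves.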

See Also

cut, tabulate

Examples

## An example with non-unique breaks:
x <- c(0, 0.01, 0.5, 0.99, 1)
b <- c(0, 0, 1, 1)
.bincode(x, b, TRUE)
.bincode(x, b, FALSE)
.bincode(x, b, TRUE, TRUE)
.bincode(x, b, FALSE, TRUE)

Lists of Open/Active Graphics Devices

Description

A pairlist of the names of open graphics devices is stored in .Devices. The name of the active device (see dev.cur) is stored in .Device. Both are symbols and so appear in the base namespace.

Usage

.Device
.Devices

Details

.Device is a length-one character vector.

.Devices is a pairlist of length-one character vectors. The first entry is always "null device", and there are as many entries as the maximal number of graphics devices which have been simultaneously active. If a device has been removed, its entry will be "" until the device number is reused.

Devices may add attributes to the character vector: for example devices which write to a file may record its path in attribute "filepath".
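A minimal sketch of how these two symbols change when a device is opened. It assumes the grDevices package is available and uses pdf(file = NULL), which opens a PDF device without writing a file; the variable names are illustrative only.

```r
# The first entry of .Devices is documented to be "null device"
first_entry <- .Devices[[1]]

grDevices::pdf(file = NULL)        # open a PDF device that creates no file
active_name <- .Device             # name of the now-active device
n_entries   <- length(.Devices)    # null device plus devices opened so far
invisible(grDevices::dev.off())    # close the device again
```

In ordinary code dev.cur() and dev.list() are the usual way to query devices; the symbols are shown here only to make the description concrete.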


Numerical Characteristics of the Machine

Description

.Machine is a variable holding information on the numerical characteristics of the machine R is running on, such as the largest double or integer and the machine's precision.

Usage

.Machine

Details

The algorithm is based on Cody's (1988) subroutine MACHAR. As all current implementations of R use 32-bit integers and use IEC 60559 floating-point (double precision) arithmetic, the "integer" and "double" related values are the same for almost all R builds.

Note that on most platforms smaller positive values than .Machine$double.xmin can occur. On a typical R platform the smallest positive double is about 5e-324.
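The point about values below double.xmin can be checked directly. This is a sketch that assumes IEC 60559 doubles with gradual underflow, which holds on typical platforms but is not guaranteed everywhere.

```r
xmin <- .Machine$double.xmin   # smallest normalized positive double

# Halving xmin gives a subnormal (denormalized) number rather than zero
half_xmin <- xmin / 2

# 2^-1074 is about 4.9e-324, the smallest positive double on such platforms;
# halving it once more underflows to zero
tiny <- 2^-1074
underflowed <- tiny / 2
```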

Value

A list with components

double.eps

the smallest positive floating-point number x such that 1 + x != 1. It equals double.base ^ double.ulp.digits if either double.base is 2 or double.rounding is 0; otherwise, it is (double.base ^ double.ulp.digits) / 2. Normally 2.220446e-16.

double.neg.eps

a small positive floating-point number x such that 1 - x != 1. It equals double.base ^ double.neg.ulp.digits if double.base is 2 or double.rounding is 0; otherwise, it is (double.base ^ double.neg.ulp.digits) / 2. Normally 1.110223e-16. As double.neg.ulp.digits is bounded below by -(double.digits + 3), double.neg.eps may not be the smallest number that can alter 1 by subtraction.

double.xmin

the smallest non-zero normalized floating-point number, a power of the radix, i.e., double.base ^ double.min.exp. Normally 2.225074e-308.

double.xmax

the largest normalized floating-point number. Typically, it is equal to (1 - double.neg.eps) * double.base ^ double.max.exp, but on some machines it is only the second or third largest such number, being too small by 1 or 2 units in the last digit of the significand. Normally 1.797693e+308. Note that larger unnormalized numbers can occur.

double.base

the radix for the floating-point representation: normally 2.

double.digits

the number of base digits in the floating-point significand: normally 53.

double.rounding

the rounding action, one of
0 if floating-point addition chops;
1 if floating-point addition rounds, but not in the IEEE style;
2 if floating-point addition rounds in the IEEE style;
3 if floating-point addition chops, and there is partial underflow;
4 if floating-point addition rounds, but not in the IEEE style, and there is partial underflow;
5 if floating-point addition rounds in the IEEE style, and there is partial underflow.
Normally 5.

double.guard

the number of guard digits for multiplication with truncating arithmetic. It is 1 if floating-point arithmetic truncates and more than double.digits base-double.base digits participate in the post-normalization shift of the floating-point significand in multiplication, and 0 otherwise.
Normally 0.

double.ulp.digits

the largest negative integer i such that 1 + double.base ^ i != 1, except that it is bounded below by -(double.digits + 3). Normally -52.

double.neg.ulp.digits

the largest negative integer i such that 1 - double.base ^ i != 1, except that it is bounded below by -(double.digits + 3). Normally -53.

double.exponent

the number of bits (decimal places if double.base is 10) reserved for the representation of the exponent (including the bias or sign) of a floating-point number. Normally 11.

double.min.exp

the largest in magnitude negative integer i such that double.base ^ i is positive and normalized. Normally -1022.

double.max.exp

the smallest positive power of double.base that overflows. Normally 1024.

integer.max

the largest integer which can be represented. Always 2^31 - 1 = 2147483647.

sizeof.long

the number of bytes in a C ‘long’ type: 4 or 8 (most 64-bit systems, but not Windows).

sizeof.longlong

the number of bytes in a C ‘long long’ type. Will be zero if there is no such type, otherwise usually 8.

sizeof.longdouble

the number of bytes in a C ‘long double’ type. Will be zero if there is no such type (or its use was disabled when R was built), otherwise possibly 12 (most 32-bit builds), 16 (most 64-bit builds) or 8 (CPUs such as ARM where for most compilers ‘long double’ is identical to double).

sizeof.pointer

the number of bytes in the C SEXP type. Will be 4 on 32-bit builds and 8 on 64-bit builds of R.

sizeof.time_t

the number of bytes in the C time_t type: a 64-bit time_t (value 8) is much preferred these days. Note that this is the type used by code in R itself, not necessarily the system type if R was configured with --with-internal-tzcode as also used on Windows.

longdouble.eps, longdouble.neg.eps, longdouble.digits, ...

introduced in R 4.0.0. When capabilities("long.double") is true, there are 10 such "longdouble.kind" values, specifying the ‘long double’ property corresponding to its "double.*" counterpart. See also ‘Note’.
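A few of the components above can be sanity-checked against their definitions. The sketch assumes the usual IEC 60559 double arithmetic with round-to-nearest-even and double.base equal to 2; the variable names are illustrative.

```r
eps <- .Machine$double.eps

# Adding eps to 1 is detectable; adding half of it is rounded away
detectable   <- (1 + eps) != 1
rounded_away <- (1 + eps / 2) == 1

# With double.base 2, eps should equal double.base ^ double.ulp.digits
eps_from_digits <- .Machine$double.base ^ .Machine$double.ulp.digits

# Integer arithmetic beyond integer.max yields NA (with a warning)
overflow <- suppressWarnings(.Machine$integer.max + 1L)
```

This is also why the default tolerance of all.equal is expressed as sqrt(.Machine$double.eps) rather than as a hard-coded constant.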

Note

In the (typical) case where capabilities("long.double") is true, R uses the ‘long double’ C type in quite a few places internally for accumulators in e.g. sum, reading non-integer numeric constants into (binary) double precision numbers, or arithmetic such as x %% y; also, ‘long double’ can be read by readBin.
For this reason, in that case, .Machine contains ten further components, longdouble.eps, *.neg.eps, *.digits, *.rounding, *.guard, *.ulp.digits, *.neg.ulp.digits, *.exponent, *.min.exp, and *.max.exp, computed entirely analogously to their double.* counterparts, see there.

sizeof.longdouble only tells you the amount of storage allocated for a long double. Often what is stored is the 80-bit extended double type of IEC 60559, padded to the double alignment used on the platform; this seems to be the case for the common R platforms using ix86 and x86_64 chips. There are other implementations of long double, usually in software, for example on Sparc Solaris and AIX.

Note that it is legal for a platform to have a ‘long double’ C type which is identical to the ‘double’ type; this happens on ARM CPUs. In that case capabilities("long.double") will be false but on versions of R prior to 4.0.4, .Machine may contain "longdouble.kind" elements.

Source

Uses a C translation of Fortran code in the reference, modified by the R Core Team to defeat over-optimization in modern compilers.

References

Cody, W. J. (1988). MACHAR: A subroutine to dynamically determine machine parameters. Transactions on Mathematical Software, 14(4), 303–311. doi:10.1145/50063.51907.

See Also

.Platform for details of the platform.

Examples

.Machine
## or for a neat printout
noquote(unlist(format(.Machine)))

Platform Specific Variables

Description

.Platform is a list with some details of the platform under which R was built. This provides means to write OS-portable R code.

Usage

.Platform

Value

A list with at least the following components:

OS.type

character string, giving the Operating System (family) of the computer. One of "unix" or "windows".

file.sep

character string, giving the file separator used on your platform: "/" on both Unix-alikes and on Windows (but not on the former port to Classic Mac OS).

dynlib.ext

character string, giving the file name extension of dynamically loadable libraries, e.g., ".dll" on Windows and ".so" or ".sl" on Unix-alikes. (Note for macOS users: these are shared objects as loaded by dyn.load and not dylibs: see dyn.load.)

GUI

character string, giving the type of GUI in use, or "unknown" if no GUI can be assumed. Possible values are for Unix-alikes the values given via the -g command-line flag ("X11", "Tk"), "AQUA" (running under R.app on macOS), "Rgui" and "RTerm" (Windows) and perhaps others under alternative front-ends or embedded R.

endian

character string, "big" or "little", giving the ‘endianness’ of the processor in use. This is relevant when it is necessary to know the order to read/write bytes of e.g. an integer or double from/to a connection: see readBin.

pkgType

character string, the preferred setting for options("pkgType"). Values "source", "mac.binary" and "win.binary" are currently in use.

This should not be used to identify the OS.

path.sep

character string, giving the path separator used on your platform, e.g., ":" on Unix-alikes and ";" on Windows. Used to separate paths in environment variables such as PATH and TEXINPUTS.

r_arch

character string, possibly "". The name of an architecture-specific directory used in this build of R.
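The components above are typically used as follows. This is a sketch only; the file name and variable names are placeholders invented for the example.

```r
# Branch on the OS family rather than on pkgType
is_windows <- .Platform$OS.type == "windows"

# Split PATH into its directories using the platform's path separator
path_dirs <- strsplit(Sys.getenv("PATH"), .Platform$path.sep, fixed = TRUE)[[1]]

# file.path() joins components with "/" by default, which matches
# .Platform$file.sep; the path below is purely illustrative
example_path <- file.path("data", "raw", "input.csv")

# Byte order, e.g. to pass as the 'endian' argument of readBin()
byte_order <- .Platform$endian
```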

AQUA

.Platform$GUI is set to "AQUA" under the macOS GUI, R.app. This has a number of consequences:

  • ‘/usr/local/bin’ is appended to the PATH environment variable.

  • the default graphics device is set to quartz.

  • selects native (rather than Tk) widgets for the graphics = TRUE options of menu and select.list.

  • HTML help is displayed in the internal browser.

  • the spreadsheet-like data editor/viewer uses a Quartz version rather than the X11 one.

See Also

R.version and Sys.info give more details about the OS. In particular, R.version$platform is the canonical name of the platform under which R was compiled. osVersion may give more details about the platform R is running on.

.Machine for details of the arithmetic used, and system for invoking platform-specific system commands.

capabilities and extSoftVersion (and links there) for availability of capabilities partly external to R but used from R functions.

Examples

## Note: this can be done in a system-independent way by dir.exists()
if(.Platform$OS.type == "unix") {
   system.test <- function(...) system(paste("test", ...)) == 0L
   dir.exists2 <- function(dir)
       sapply(dir, function(d) system.test("-d", d))
   dir.exists2(c(R.home(), "/tmp", "~", "/NO")) # > T T T F
}

Abbreviate Strings

Description

Abbreviate strings to at least minlength characters, such that they remain unique (if they were), unless strict = TRUE.

Usage

abbreviate(names.arg, minlength = 4, use.classes = TRUE,
           dot = FALSE, strict = FALSE,
           method = c("left.kept", "both.sides"), named = TRUE)

Arguments

names.arg

a character vector of names to be abbreviated, or an object to be coerced to a character vector by as.character.

minlength

the minimum length of the abbreviations.

use.classes

logical: should lowercase characters be removed first?

dot

logical: should a dot (".") be appended?

strict

logical: should minlength be observed strictly? Note that setting strict = TRUE may return non-unique strings.

method

a character string specifying the method used with default "left.kept", see ‘Details’ below. Partial matches allowed.

named

logical: should names (with original vector) be returned?

Details

The default algorithm (method = "left.kept") used is similar to that of S. For a single string it works as follows. First spaces at the ends of the string are stripped. Then (if necessary) any other spaces are stripped. Next, lower case vowels are removed followed by lower case consonants. Finally if the abbreviation is still longer than minlength upper case letters and symbols are stripped.

Characters are always stripped from the end of the strings first. If an element of names.arg contains more than one word (words are separated by spaces) then at least one letter from each word will be retained.

Missing (NA) values are unaltered.

If use.classes is FALSE then the only distinction is to be between letters and space.
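To make the interplay of minlength, strict and named concrete, here is a small sketch on three made-up strings (the same ones used in the Examples section).

```r
x <- c("abcd", "efgh", "abce")

# Default: minlength is raised where needed so results stay unique
ab <- abbreviate(x, 2)

# strict = TRUE holds every result to minlength, so duplicates may appear
ab_strict <- abbreviate(x, 2, strict = TRUE)

# named = FALSE drops the names attribute holding the original strings
ab_plain <- abbreviate(x, 2, named = FALSE)
```

When the abbreviations are used as keys or labels, the default (non-strict) behaviour is usually the safer choice because uniqueness is preserved.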

Value

A character vector containing abbreviations for the character strings in its first argument. Duplicates in the original names.arg will be given identical abbreviations. If any non-duplicated elements have the same minlength abbreviations then, if method = "both.sides" the basic internal abbreviate() algorithm is applied to the characterwise reversed strings; if there are still duplicated abbreviations and if strict = FALSE as by default, minlength is incremented by one and new abbreviations are found for those elements only. This process is repeated until all unique elements of names.arg have unique abbreviations.

If named is true, the character version of names.arg is attached to the returned value as a names attribute: no other attributes are retained.

If an input element contains non-ASCII characters, the corresponding value will be in UTF-8 and marked as such (see Encoding).

Warning

If use.classes is true (the default), this is really only suitable for English, and prior to R 3.3.0 did not work correctly with non-ASCII characters in multibyte locales. It will warn if used with non-ASCII characters (and required to reduce the length). It is unlikely to work well with inputs not in the Unicode Basic Multilingual Plane nor on (rare) platforms where wide characters are not encoded in Unicode.

As from R 3.3.0 the concept of ‘vowel’ is extended from English vowels by including characters which are accented versions of lower-case English vowels (including ‘o with stroke’). Of course, there are languages (even Western European languages such as Welsh) with other vowels.

See Also

substr.

Examples

x <- c("abcd", "efgh", "abce")
abbreviate(x, 2)
abbreviate(x, 2, strict = TRUE) # >> 1st and 3rd are == "ab"

(st.abb <- abbreviate(state.name, 2))
stopifnot(identical(unname(st.abb),
           abbreviate(state.name, 2, named = FALSE)))
table(nchar(st.abb)) # out of 50, 3 need 4 letters :
as <- abbreviate(state.name, 3, strict = TRUE)
as[which(as == "Mss")]

## and without distinguishing vowels:
st.abb2 <- abbreviate(state.name, 2, FALSE)
cbind(st.abb, st.abb2)[st.abb2 != st.abb, ]

## method = "both.sides" helps:  no 4-letters, and only 4 3-letters:
st.ab2 <- abbreviate(state.name, 2, method = "both")
table(nchar(st.ab2))
## Compare the two methods:
cbind(st.abb, st.ab2)

Approximate String Matching (Fuzzy Matching)

Description

Searches for approximate matches to pattern (the first argument) within each element of the string x (the second argument) using the generalized Levenshtein edit distance (the minimal possibly weighted number of insertions, deletions and substitutions needed to transform one string into another).

Usage

agrep(pattern, x, max.distance = 0.1, costs = NULL,
      ignore.case = FALSE, value = FALSE, fixed = TRUE,
      useBytes = FALSE)

agrepl(pattern, x, max.distance = 0.1, costs = NULL,
       ignore.case = FALSE, fixed = TRUE, useBytes = FALSE)

Arguments

pattern

a non-empty character string to be matched. For fixed = FALSE this should contain an extended regular expression. Coerced by as.character to a string if possible.

x

character vector where matches are sought. Coerced by as.character to a character vector if possible.

max.distance

maximum distance allowed for a match. Expressed either as integer, or as a fraction of the pattern length times the maximal transformation cost (will be replaced by the smallest integer not less than the corresponding fraction), or a list with possible components

cost:

maximum number/fraction of match cost (generalized Levenshtein distance)

all:

maximal number/fraction of all transformations (insertions, deletions and substitutions)

insertions:

maximum number/fraction of insertions

deletions:

maximum number/fraction of deletions

substitutions:

maximum number/fraction of substitutions

If cost is not given, all defaults to 10%, and the other transformation number bounds default to all. The component names can be abbreviated.

costs

a numeric vector or list with names partially matching ‘insertions’, ‘deletions’ and ‘substitutions’ giving the respective costs for computing the generalized Levenshtein distance, or NULL (default) indicating using unit cost for all three possible transformations. Coerced to integer via as.integer if possible.

ignore.case

if FALSE, the pattern matching is case sensitive and if TRUE, case is ignored during matching.

value

if FALSE, a vector containing the (integer) indices of the matches determined is returned and if TRUE, a vector containing the matching elements themselves is returned.

fixed

logical. If TRUE (default), the pattern is matched literally (as is). Otherwise, it is matched as a regular expression.

useBytes

logical. If TRUE the matching is done byte-by-byte rather than character-by-character. See ‘Details’.

Details

The Levenshtein edit distance is used as measure of approximateness: it is the (possibly cost-weighted) total number of insertions, deletions and substitutions required to transform one string into another.

This uses the tre code by Ville Laurikari (https://github.com/laurikari/tre), which supports MBCS character matching.

The main effect of useBytes = TRUE is to avoid errors/warnings about invalid inputs and spurious matches in multibyte locales. It inhibits the conversion of inputs with marked encodings, and is forced if any input is found which is marked as "bytes" (see Encoding).

Value

agrep returns a vector giving the indices of the elements that yielded a match, or, if value is TRUE, the matched elements (after coercion, preserving names but no other attributes).

agrepl returns a logical vector.

Note

Since someone who read the description carelessly even filed a bug report on it, do note that this matches substrings of each element of x (just as grep does) and not whole elements. See also adist in package utils, which optionally returns the offsets of the matched substrings.
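The substring behaviour and the list form of max.distance can be seen in a short sketch. The input strings are invented for the example.

```r
x <- c("1 lazy 2", "1 lasy 2", "unrelated")

# Default max.distance = 0.1: for the 4-character pattern this is rounded
# up to one edit, so the substring "lazy" (one substitution away) matches
# even though the whole element is much longer than the pattern
hits_default <- agrepl("lasy", x)

# Forbidding substitutions leaves only the element containing "lasy" itself
hits_no_sub <- agrep("lasy", x, max.distance = list(sub = 0))

# value = TRUE returns the matching elements rather than their indices
vals <- agrep("lasy", x, value = TRUE)
```

If whole-element distances are needed instead, adist from package utils is the more direct tool.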

Author(s)

Original version in R < 2.10.0 by David Meyer. Current version by Brian Ripley and Kurt Hornik.

See Also

grep, adist. A different interface to approximate string matching is provided by aregexec().

Examples

agrep("lasy", "1 lazy 2")
agrep("lasy", c(" 1 lazy 2", "1 lasy 2"), max.distance = list(sub = 0))
agrep("laysy", c("1 lazy", "1", "1 LAZY"), max.distance = 2)
agrep("laysy", c("1 lazy", "1", "1 LAZY"), max.distance = 2, value = TRUE)
agrep("laysy", c("1 lazy", "1", "1 LAZY"), max.distance = 2, ignore.case = TRUE)

Are All Values True?

Description

Given a set of logical vectors, are all of the values true?

Usage

all(..., na.rm=FALSE)

Arguments

...

zero or more logical vectors. Other objects of zero length are ignored, and the rest are coerced to logical ignoring any class.

na.rm

logical. If true NA values are removed before the result is computed.

Details

This is a generic function: methods can be defined for it directly or via the Summary group generic. For this to work properly, the arguments ... should be unnamed, and dispatch is on the first argument.

Coercion of types other than integer (raw, double, complex, character, list) gives a warning as this is often unintentional.

This is a primitive function.

Value

The value is a logical vector of length one.

Let x denote the concatenation of all the logical vectors in ... (after coercion), after removing NAs if requested by na.rm = TRUE.

The value returned is TRUE if all of the values in x are TRUE (including if there are no values), and FALSE if at least one of the values in x is FALSE. Otherwise the value is NA (which can only occur if na.rm = FALSE and ... contains no FALSE values and at least one NA value).
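The three possible results can be tabulated with a few one-liners; the variable names are illustrative.

```r
all_true     <- all(c(TRUE, TRUE))              # every value is TRUE
with_false   <- all(c(TRUE, NA, FALSE))         # a FALSE decides the result
undetermined <- all(c(TRUE, NA))                # no FALSE, but an NA is present
na_removed   <- all(c(TRUE, NA), na.rm = TRUE)  # the NA is dropped first
empty        <- all(logical(0))                 # no values at all
```

Here with_false is FALSE, undetermined is NA, and the remaining three are TRUE.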

S4 methods

This is part of the S4 Summary group generic. Methods for it must use the signature x, ..., na.rm.

Note

That all(logical(0)) is true is a useful convention: it ensures that

all(all(x), all(y)) == all(x, y)

even if x has length zero.

References

Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988) The New S Language. Wadsworth & Brooks/Cole.

See Also

any, the ‘complement’ of all, and stopifnot(*) which is an all(*) ‘insurance’.

Examples

range(x <- sort(round(stats::rnorm(10) - 1.2, 1)))
if(all(x < 0)) cat("all x values are negative\n")

all(logical(0))  # true, as all zero of the elements are true.

Test if Two Objects are (Nearly) Equal

Description

all.equal(x, y) is a utility to compare R objects x and y testing ‘near equality’. If they are different, comparison is still made to some extent, and a report of the differences is returned. Do not use all.equal directly in if expressions: either use isTRUE(all.equal(....)) or identical if appropriate.

Usage

all.equal(target, current, ...)

## Default S3 method:
all.equal(target, current, ..., check.class = TRUE)

## S3 method for class 'numeric'
all.equal(target, current,
          tolerance = sqrt(.Machine$double.eps), scale = NULL,
          countEQ = FALSE,
          formatFUN = function(err, what) format(err),
          ..., check.attributes = TRUE, check.class = TRUE, giveErr = FALSE)

## S3 method for class 'list'
all.equal(target, current, ...,
          check.attributes = TRUE, use.names = TRUE)

## S3 method for class 'environment'
all.equal(target, current, all.names = TRUE,
          evaluate = TRUE, ...)

## S3 method for class 'function'
all.equal(target, current, check.environment = TRUE, ...)

## S3 method for class 'POSIXt'
all.equal(target, current, ..., tolerance = 1e-3, scale,
          check.tzone = TRUE)

attr.all.equal(target, current, ...,
               check.attributes = TRUE, check.names = TRUE)

Arguments

target

R object.

current

other R object, to be compared with target.

...

further arguments for different methods, notably the following two, for numerical comparison:

tolerance

numeric ≥ 0. Differences smaller than tolerance are not reported. The default value is close to 1.5e-8.

scale

NULL or numeric > 0, typically of length 1 or length(target). See ‘Details’.

countEQ

logical indicating if the target == current cases should be counted when computing the mean (absolute or relative) differences. The default, FALSE may seem misleading in cases where target and current only differ in a few places; see the extensive example.

formatFUN

a function of two arguments, err, the relative, absolute or scaled error, and what, a character string indicating the kind of error; may be used, e.g., to format relative and absolute errors differently.

check.attributes

logical indicating if the attributes of target and current (other than the names) should be compared.

check.class

logical indicating if the data.class() of target and current should be compared.

giveErr

logical indicating if the result should contain the numerical error as an "err" attribute.

use.names

logical indicating if list comparison should report differing components by name (if matching) instead of integer index. Note that this comes after ... and so must be specified by its full name.

all.names

logical passed to ls indicating if “hidden” objects should also be considered in the environments.

evaluate

for the environment method: logical indicating if “promises should be forced”, i.e., typically formal function arguments be evaluated for comparison. If false, only the names of the objects in the two environments are checked for equality.

check.environment

logical requiring that the environment()s of functions should be compared, too. You may need to set check.environment=FALSE in unexpected cases, such as when comparing two nls() fits.

check.tzone

logical indicating if the "tzone" attributes of target and current should be compared.

check.names

logical indicating if the names(.) of target and current should be compared.

Details

all.equal is a generic function, dispatching methods on the target argument. To see the available methods, use methods("all.equal"), but note that the default method also does some dispatching, e.g. using the raw method for logical targets.

Remember that arguments which follow ... must be specified by (unabbreviated) name. It is inadvisable to pass unnamed arguments in ... as these will match different arguments in different methods.

Numerical comparisons for scale = NULL (the default) are typically on a relative difference scale unless the target values are close to zero or infinite. Specifically, the scale is computed as the mean absolute value of target. If this scale is finite and exceeds tolerance, differences are expressed relative to it; otherwise, absolute differences are used. Note that this scale and all further steps are computed only for those vector elements where target is not NA and differs from current. If countEQ is true, the equal and NA cases are counted in determining the “sample” size.

If scale is numeric (and positive), absolute comparisons are made after scaling (dividing) by scale. Note that if all of scale is close to 1 (specifically, within 1e-7), the difference is still reported as being on an absolute scale.
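The switch between relative and absolute comparison can be illustrated with a few scalar cases. This is a sketch using the default tolerance (about 1.5e-8); the numbers are chosen only to land on either side of it.

```r
# Target well above the tolerance: the difference is judged relatively
rel_ok  <- all.equal(1e10, 1e10 + 1)   # relative difference about 1e-10
rel_bad <- all.equal(1, 1 + 1e-7)      # relative difference about 1e-7

# Target mean below the tolerance: absolute differences are used instead
abs_ok <- all.equal(1e-10, 2e-10)      # absolute difference 1e-10

# A looser tolerance, or an explicit scale, changes the verdict
loose  <- all.equal(1, 1 + 1e-7, tolerance = 1e-6)
scaled <- all.equal(100, 101, scale = 1)  # reported on an absolute scale
```

Note that abs_ok is TRUE even though the two values differ by a factor of two, which is a consequence of the fallback to absolute differences for targets near zero.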

For complex target, the modulus (Mod) of the difference is used: all.equal.numeric is called so arguments tolerance and scale are available.

The list method compares components of target and current recursively, passing all other arguments, as long as both are “list-like”, i.e., fulfill either is.vector or is.list.

The environment method works via the list method, and is also used for reference classes (unless a specific all.equal method is defined).

The method for date-time objects uses all.equal.numeric to compare times (in "POSIXct" representation) with a default tolerance of 0.001 seconds, ignoring scale. A time zone mismatch between target and current is reported unless check.tzone = FALSE.

attr.all.equal is used for comparing attributes, returning NULL or a character vector.

Value

Either TRUE (NULL for attr.all.equal) or a vector of mode "character" describing the differences between target and current.
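Because the result is either TRUE or a character vector (never FALSE), conditions should wrap the call in isTRUE(). A minimal sketch, with illustrative variable names:

```r
a <- 0.1 + 0.2
b <- 0.3

near   <- all.equal(a, b)     # TRUE: within the default tolerance
differ <- all.equal(1, 1.1)   # a character description, not FALSE

# Safe pattern for conditions: isTRUE() maps any report to FALSE
if (isTRUE(all.equal(a, b))) {
  status <- "equal within tolerance"
} else {
  status <- "different"
}
```

Using differ directly in if() would raise an error, since a character string is not a valid condition.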

References

Chambers, J. M. (1998) Programming with Data. A Guide to the S Language. Springer (for =).

See Also

identical, isTRUE, ==, and all for exact equality testing.

Examples

all.equal(pi, 355/113)
# not precise enough (default tol) > relative error

quarts <- 1/4 + 1:10 # exact
d45 <- pi*quarts ; one <- rep(1, 10)
tan(d45) == one # mostly FALSE, as typically exact; embarrassingly,
tanpi(quarts) == one # (is always FALSE (Fedora 34; gcc 11.2.1))
stopifnot(all.equal(
          tan(d45), one)) # TRUE, but not if we are picky:
all.equal(tan(d45), one, tolerance = 0) # to see difference
all.equal(tan(d45), one, tolerance = 0, scale = 1) # "absolute diff.."
all.equal(tan(d45), one, tolerance = 0, scale = 1+(-2:2)/1e9) # "absolute"
all.equal(tan(d45), one, tolerance = 0, scale = 1+(-2:2)/1e6) # "scaled"

## advanced: equality of environments
ae <- all.equal(as.environment("package:stats"),
                asNamespace("stats"))
stopifnot(is.character(ae), length(ae) > 10,
          ## were incorrectly "considered equal" in R <= 3.1.1
          all.equal(asNamespace("stats"), asNamespace("stats")))

## A situation where  'countEQ = TRUE' makes sense:
x1 <- x2 <- (1:100)/10;  x2[2] <- 1.1*x1[2]
## 99 out of 100 pairs (x1[i], x2[i]) are equal:
plot(x1, x2, main = "all.equal.numeric() -- not counting equal parts")
all.equal(x1, x2) ## "Mean relative difference: 0.1"
mtext(paste("all.equal(x1,x2) :", all.equal(x1, x2)), line = -2)
##' extract the 'Mean relative difference' as number:
all.eqNum <- function(...) as.numeric(sub(".*:", '', all.equal(...)))
set.seed(17)
## When x2 is jittered, typically all pairs (x1[i],x2[i]) do differ:
summary(r <- replicate(100, all.eqNum(x1, x2*(1+rnorm(x1)*1e-7))))
mtext(paste("mean(all.equal(x1, x2*(1 + eps_k))) {100 x} Mean rel.diff.=",
            signif(mean(r), 3)), line = -4, adj = 0)
## With argument  countEQ=TRUE, get "the same" (w/o need for jittering):
mtext(paste("all.equal(x1,x2, countEQ=TRUE) :",
          signif(all.eqNum(x1, x2, countEQ = TRUE), 3)), line = -6, col = 2)

## Using giveErr=TRUE :
x1. <- x1 * (1 + 1e-9*rnorm(x1))
str(all.equal(x1, x1., giveErr = TRUE))
## logi TRUE
## - attr(*,  "err")= num 8.66e-10
## - attr(*, "what")= chr "relative"

## Used with stopifnot(), still *showing* diff:
all.equalShow <- function(...) {
   r <- all.equal(..., giveErr = TRUE)
   cat(attr(r, "what"), "err:", attr(r, "err"), "\n")
   c(r) # can drop attributes, as not used anymore
}
# checks, showing error in any case:
stopifnot(all.equalShow(x1, x1.)) # -> relative err: 8.66002e-10
tryCatch(error = identity, stopifnot(all.equalShow(x1, 2*x1))) -> eAe
stopifnot(inherits(eAe, "error"))
# stopifnot(all.equal....()) giving smart msg:
cat(conditionMessage(eAe), "\n")

two <- structure(2, foo = 1, class = "bar")
all.equal(two^20, 2^20) # lots of diff
all.equal(two^20, 2^20, check.attributes = FALSE) # "target is bar, current is numeric"
all.equal(two^20, 2^20, check.attributes = FALSE, check.class = FALSE) # TRUE

## comparison of date-time objects
now <- Sys.time()
stopifnot(
  all.equal(now, now + 1e-4)  # TRUE (default tolerance = 0.001 seconds)
)
all.equal(now, now + 0.2)
all.equal(now, as.POSIXlt(now, "UTC"))
stopifnot(
  all.equal(now, as.POSIXlt(now, "UTC"), check.tzone = FALSE)  # TRUE
)

Find All Names in an Expression

Description

Return a character vector containing all the names which occur in an expression or call.

Usage

all.names(expr, functions = TRUE, max.names = -1L, unique = FALSE)

all.vars(expr, functions = FALSE, max.names = -1L, unique = TRUE)

Arguments

expr

an expression or call from which the names are to be extracted.

functions

a logical value indicating whether function names should be included in the result.

max.names

the maximum number of names to be returned. -1 indicates no limit (other than vector size limits).

unique

a logical value which indicates whether duplicate names should be removed from the value.

Details

These functions differ only in the default values for their arguments.
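The effect of the differing defaults can be seen on a single call. The expression below is made up for the example.

```r
e <- quote(sin(x + y) + x)

nm_all  <- all.names(e)                     # functions included, duplicates kept
nm_vars <- all.names(e, functions = FALSE)  # variables only, duplicates kept
vars    <- all.vars(e)                      # variables only, duplicates removed

# all.vars() is all.names() with functions = FALSE and unique = TRUE
same <- identical(all.names(e, functions = FALSE, unique = TRUE), vars)
```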

Value

A character vector with the extracted names.

See Also

substitute to replace symbols with values in an expression.

Examples

all.names(expression(sin(x+y)))
all.names(quote(sin(x+y))) # or a call
all.vars(expression(sin(x+y)))

Are Some Values True?

Description

Given a set of logical vectors, is at least one of the values true?

Usage

any(..., na.rm=FALSE)

Arguments

...

zero or more logical vectors. Other objects of zero length are ignored, and the rest are coerced to logical ignoring any class.

na.rm

logical. If true NA values are removed before the result is computed.

Details

This is a generic function: methods can be defined for it directly or via the Summary group generic. For this to work properly, the arguments ... should be unnamed, and dispatch is on the first argument.

Coercion of types other than integer (raw, double, complex, character, list) gives a warning as this is often unintentional.

This is aprimitive function.

Value

The value is a logical vector of length one.

Letx denote the concatenation of all the logical vectors in... (after coercion), after removingNAs if requested byna.rm = TRUE.

The value returned isTRUE if at least one of the values inx isTRUE, andFALSE if all of the values inx areFALSE (including if there are no values). Otherwisethe value isNA (which can only occur ifna.rm = FALSEand... contains noTRUE values and at least oneNA value).
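The three possible results described above can be checked directly. This is a small sketch using made-up logical vectors; the variable names are illustrative only.

```r
# At least one TRUE: the NA does not matter
has_true   <- any(c(FALSE, NA, TRUE))

# No TRUE but an NA present: the result is unknown
unknown    <- any(c(FALSE, NA))

# Removing the NA leaves only FALSE values
after_narm <- any(c(FALSE, NA), na.rm = TRUE)

# No values at all counts as "all FALSE"
empty      <- any(logical(0))

print(c(has_true, unknown, after_narm, empty))
```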

S4 methods

This is part of the S4 Summary group generic. Methods for it must use the signature x, ..., na.rm.

References

Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988) The New S Language. Wadsworth & Brooks/Cole.

See Also

all, the ‘complement’ of any.

Examples

range(x <- sort(round(stats::rnorm(10) - 1.2, 1)))
if(any(x < 0)) cat("x contains negative values\n")

Array Transposition

Description

Transpose an array by permuting its dimensions and optionally resizing it.

Usage

aperm(a, perm, ...)
## Default S3 method:
aperm(a, perm = NULL, resize = TRUE, ...)
## S3 method for class 'table'
aperm(a, perm = NULL, resize = TRUE, keep.class = TRUE, ...)

Arguments

a

the array to be transposed.

perm

the subscript permutation vector, usually a permutation of the integers 1:n, where n is the number of dimensions of a. When a has named dimnames, it can be a character vector of length n giving a permutation of those names. The default (used whenever perm has zero length) is to reverse the order of the dimensions.

resize

a flag indicating whether the vector should be resized as well as having its elements reordered (default TRUE).

keep.class

logical indicating if the result should be of the same class as a.

...

potential further arguments of methods.

Value

A transposed version of array a, with subscripts permuted as indicated by the array perm. If resize is TRUE, the array is reshaped as well as having its elements permuted, and the dimnames are also permuted; if resize = FALSE then the returned object has the same dimensions as a, and the dimnames are dropped. In each case other attributes are copied from a.

The function t provides a faster and more convenient way of transposing matrices.
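The effect of perm and resize on the dimensions can be seen on a small array. The sketch below uses an arbitrary 2 x 3 x 4 array; all object names are illustrative.

```r
a <- array(1:24, dim = c(2, 3, 4))

# Swap the first two subscripts: element [i, j, k] moves to [j, i, k]
b <- aperm(a, c(2, 1, 3))

# Default perm reverses the order of the dimensions
rev_a <- aperm(a)

# resize = FALSE reorders the elements but keeps the dimensions of 'a'
r <- aperm(a, c(2, 1, 3), resize = FALSE)

print(dim(b))
print(dim(rev_a))
print(dim(r))
```

With resize = FALSE the underlying data are permuted exactly as for `b`; only the dim attribute differs.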

Author(s)

Jonathan Rougier did the faster C implementation.

References

Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988) The New S Language. Wadsworth & Brooks/Cole.

See Also

t, to transpose matrices.

Examples

# interchange the first two subscripts on a 3-way array x
x <- array(1:24, 2:4)
xt <- aperm(x, c(2,1,3))
stopifnot(t(xt[,,2]) == x[,,2],
          t(xt[,,3]) == x[,,3],
          t(xt[,,4]) == x[,,4])

UCB <- aperm(UCBAdmissions, c(2,1,3))
UCB[1,,]
summary(UCB) # UCB is still a contingency table

Vector Merging

Description

Add elements to a vector.

Usage

append(x, values, after = length(x))

Arguments

x

the vector the values are to be appended to.

values

to be included in the modified vector.

after

a subscript, after which the values are to be appended.

Value

A vector containing the values in x with the elements of values appended after the specified element of x.
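The role of after is easiest to see at the two ends and in the middle of a vector. A brief sketch with illustrative integer vectors:

```r
x <- 1:5

# Default: values go after the last element
at_end   <- append(x, 0:1)

# after = 3: values are inserted between the 3rd and 4th elements
in_mid   <- append(x, 0:1, after = 3)

# after = 0: values are placed before the first element
at_start <- append(x, 0:1, after = 0)

print(at_end)
print(in_mid)
print(at_start)
```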

References

Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988) The New S Language. Wadsworth & Brooks/Cole.

Examples

append(1:5, 0:1, after = 3)

Apply Functions Over Array Margins

Description

Returns a vector or array or list of values obtained by applying a function to margins of an array or matrix.

Usage

apply(X, MARGIN, FUN, ..., simplify = TRUE)

Arguments

X

an array, including a matrix.

MARGIN

a vector giving the subscripts which the function will be applied over. E.g., for a matrix 1 indicates rows, 2 indicates columns, c(1, 2) indicates rows and columns. Where X has named dimnames, it can be a character vector selecting dimension names.

FUN

the function to be applied: see ‘Details’. In the case of functions like +, %*%, etc., the function name must be backquoted or quoted.

...

optional arguments to FUN.

simplify

a logical indicating whether results should be simplified if possible.

Details

If X is not an array but an object of a class with a non-null dim value (such as a data frame), apply attempts to coerce it to an array via as.matrix if it is two-dimensional (e.g., a data frame) or via as.array.

FUN is found by a call to match.fun and typically is either a function or a symbol (e.g., a backquoted name) or a character string specifying a function to be searched for from the environment of the call to apply.

Arguments in ... cannot have the same name as any of the other arguments, and care may be needed to avoid partial matching to MARGIN or FUN. In general-purpose code it is good practice to name the first three arguments if ... is passed through: this both avoids partial matching to MARGIN or FUN and ensures that a sensible error message is given if arguments named X, MARGIN or FUN are passed through ....

Value

If each call to FUN returns a vector of length n, and simplify is TRUE, then apply returns an array of dimension c(n, dim(X)[MARGIN]) if n > 1. If n equals 1, apply returns a vector if MARGIN has length 1 and an array of dimension dim(X)[MARGIN] otherwise. If n is 0, the result has length 0 but not necessarily the ‘correct’ dimension.

If the calls to FUN return vectors of different lengths, or if simplify is FALSE, apply returns a list of length prod(dim(X)[MARGIN]) with dim set to MARGIN if this has length greater than one.

In all cases the result is coerced by as.vector to one of the basic vector types before the dimensions are set, so that (for example) factor results will be coerced to a character array.
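The shape rules above can be illustrated on a small matrix. This is a sketch with an arbitrary 2 x 3 matrix; it assumes an R version in which apply has the simplify argument shown in the Usage section.

```r
m <- matrix(1:6, nrow = 2)          # 2 rows, 3 columns

# n = 1 and MARGIN of length 1: a plain vector, one value per row
row_sums <- apply(m, 1, sum)

# n = 2: an array of dimension c(n, dim(m)[MARGIN]) = c(2, 2);
# each column of the result belongs to one row of m
row_rng  <- apply(m, 1, range)

# Same function over columns: dimension c(2, 3)
col_rng  <- apply(m, 2, range)

# simplify = FALSE: a list with one component per row
row_list <- apply(m, 1, range, simplify = FALSE)

print(row_sums)
print(dim(row_rng))
print(dim(col_rng))
print(length(row_list))
```

Note that the per-row results end up as columns of `row_rng`, which often calls for a t() when a row-wise layout is wanted.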

References

Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988) The New S Language. Wadsworth & Brooks/Cole.

See Also

lapply and there, simplify2array; tapply, and convenience functions sweep and aggregate.

Examples

## Compute row and column sums for a matrix:
x <- cbind(x1 = 3, x2 = c(4:1, 2:5))
dimnames(x)[[1]] <- letters[1:8]
apply(x, 2, mean, trim = .2)
col.sums <- apply(x, 2, sum)
row.sums <- apply(x, 1, sum)
rbind(cbind(x, Rtot = row.sums), Ctot = c(col.sums, sum(col.sums)))

stopifnot( apply(x, 2, is.vector))

## Sort the columns of a matrix
apply(x, 2, sort)

## keeping named dimnames
names(dimnames(x)) <- c("row", "col")
x3 <- array(x, dim = c(dim(x), 3),
    dimnames = c(dimnames(x), list(C = paste0("cop.", 1:3))))
identical(x,  apply( x, 2,  identity))
identical(x3, apply(x3, 2:3, identity))

##- function with extra args:
cave <- function(x, c1, c2) c(mean(x[c1]), mean(x[c2]))
apply(x, 1, cave,  c1 = "x1", c2 = c("x1", "x2"))

ma <- matrix(c(1:4, 1, 6:8), nrow = 2)
ma
apply(ma, 1, table) #--> a list of length 2
apply(ma, 1, stats::quantile) # 5 x n matrix with rownames

stopifnot(dim(ma) == dim(apply(ma, 1:2, sum)))

## Example with different lengths for each call
z <- array(1:24, dim = 2:4)
zseq <- apply(z, 1:2, function(x) seq_len(max(x)))
zseq         ## a 2 x 3 matrix
typeof(zseq) ## list
dim(zseq)    ## 2 3
zseq[1,]
apply(z, 3, function(x) seq_len(max(x)))
# a list without a dim attribute

Argument List of a Function

Description

Displays the argument names and corresponding default values of a (non-primitive or primitive) function.

Usage

args(name)

Arguments

name

a function (a primitive or a closure, i.e., “non-primitive”). If name is a character string then the function with that name is found and used.

Details

This function is mainly used interactively to print the argument list of a function. For programming, consider using formals instead.

Value

For a closure, a closure with identical formal argument list but an empty (NULL) body.

For a primitive (function), a closure with the documented usage and NULL body. Note that some primitives do not make use of named arguments and match by position rather than name.

NULL in case of a non-function.
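The three kinds of return value can be verified on a throwaway function. In this sketch `f` is a hypothetical function defined only for the illustration.

```r
# A hypothetical closure used only for illustration
f <- function(x, y = 2, ...) x + y

a <- args(f)

# args() returns a function with the same formals and a NULL body
a_is_function <- is.function(a)
a_body_null   <- is.null(body(a))
a_formals     <- names(formals(a))

# A non-function argument gives NULL
non_fun <- args(42)

print(a_formals)
print(non_fun)
```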

References

Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988) The New S Language. Wadsworth & Brooks/Cole.

See Also

formals, help; str also prints the argument list of a function.

Examples

## "regular" (non-primitive) functions "print their arguments"
## (by returning another function with NULL body which you also see):
args(ls)
args(graphics::plot.default)
utils::str(ls) # (just "prints": does not show a NULL)

## You can also pass a string naming a function.
args("scan")
## ...but :: package specification doesn't work in this case.
tryCatch(args("graphics::plot.default"), error = print)

## As explained above, args() gives a function with empty body:
list(is.f = is.function(args(scan)), body = body(args(scan)))

## Primitive functions mostly behave like non-primitive functions.
args(c)
args(`+`)
## primitive functions without well-defined argument list return NULL:
args(`if`)

Arithmetic Operators

Description

These unary and binary operators perform arithmetic on numeric or complex vectors (or objects which can be coerced to them).

Usage

+ x
- x
x + y
x - y
x * y
x / y
x ^ y
x %% y
x %/% y

Arguments

x, y

numeric or complex vectors or objects which can be coerced to such, or other objects for which methods have been written.

Details

The unary and binary arithmetic operators are generic functions: methods can be written for them individually or via the Ops group generic function. (See Ops for how dispatch is computed.)

If applied to arrays the result will be an array if this is sensible (for example it will not if the recycling rule has been invoked).

Logical vectors will be coerced to integer or numeric vectors, FALSE having value zero and TRUE having value one.

1 ^ y and y ^ 0 are 1, always. x ^ y should also give the proper limit result when either (numeric) argument is infinite (one of Inf or -Inf).

Objects such as arrays or time-series can be operated on this way provided they are conformable.

For double arguments, %% can be subject to catastrophic loss of accuracy if x is much larger than y, and a warning is given if this is detected.

%% and x %/% y can be used for non-integer y, e.g. 1 %/% 0.2, but the results are subject to representation error and so may be platform-dependent. Mathematically, the answer to 1 %/% 0.2 should be 5, but because the IEC 60559 representation of 0.2 is a binary fraction slightly larger than 0.2 most platforms give 4.

Users are sometimes surprised by the value returned, for example why (-8)^(1/3) is NaN. For double inputs, R makes use of IEC 60559 arithmetic on all platforms, together with the C system function ‘pow’ for the ^ operator. The relevant standards define the result in many corner cases. In particular, the result in the example above is mandated by the C99 standard. On many Unix-alike systems the command man pow gives details of the values in a large number of corner cases.
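The corner cases mentioned above can be tried directly. The sketch below also shows one common workaround for a real cube root; `real_cbrt` is a hypothetical helper written for this illustration, not a base R function.

```r
# 1 ^ y and y ^ 0 are 1 even for missing or infinite operands
one_pow_na <- 1 ^ NA
na_pow_0   <- NA ^ 0
inf_pow_0  <- Inf ^ 0

# A negative base with a non-integer exponent gives NaN for doubles
neg_root <- (-8)^(1/3)

# Hypothetical helper: real cube root via sign and absolute value
real_cbrt <- function(x) sign(x) * abs(x)^(1/3)
real_root <- real_cbrt(-8)

# Complex arithmetic returns the principal (non-real) cube root instead
cplx_root <- as.complex(-8)^(1/3)

print(c(one_pow_na, na_pow_0, inf_pow_0))
print(neg_root)
print(real_root)
print(cplx_root)
```

Which of the two roots is wanted depends on the application; the NaN result simply reflects that ‘pow’ works on real doubles.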

Arithmetic on type double in R is supposed to be done in ‘round to nearest, ties to even’ mode, but this does depend on the compiler and FPU being set up correctly.

Value

Unary + and unary - return a numeric or complex vector. All attributes (including class) are preserved if there is no coercion: logical x is coerced to integer and names, dims and dimnames are preserved.

The binary operators return vectors containing the result of the element by element operations. If involving a zero-length vector the result has length zero. Otherwise, the elements of shorter vectors are recycled as necessary (with a warning when they are recycled only fractionally). The operators are + for addition, - for subtraction, * for multiplication, / for division and ^ for exponentiation.

%% indicates x mod y (“x modulo y”), i.e., computes the ‘remainder’ r <- x %% y, and %/% indicates integer division, where R uses “floored” integer division, i.e., q <- x %/% y := floor(x/y), as promoted by Donald Knuth, see the Wikipedia page on ‘Modulo operation’, and hence sign(r) == sign(y). It is guaranteed that

x == (x %% y) + y * (x %/% y)

(up to rounding error)

unless y == 0 where the result of %% is NA_integer_ or NaN (depending on the typeof of the arguments) or for some non-finite arguments, e.g., when the RHS of the identity above amounts to Inf - Inf.
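The floored-division convention and the identity above can be checked on all four sign combinations. A minimal sketch with illustrative values:

```r
x <- c(-7, 7, -7, 7)
y <- c( 3, 3, -3, -3)

r <- x %% y    # remainder, takes the sign of y
q <- x %/% y   # floored quotient, floor(x / y)

# The documented identity x == r + y * q
identity_holds <- all(x == r + y * q)

# Division by zero: NA for integers, NaN for doubles
int_mod_zero <- 5L %% 0L
dbl_mod_zero <- 5 %% 0

print(rbind(x, y, q, r))
print(identity_holds)
```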

If either argument is complex the result will be complex, otherwise if one or both arguments are numeric, the result will be numeric. If both arguments are of type integer, the type of the result of / and ^ is numeric and for the other operators it is integer (with overflow, which occurs at ±(2^31 - 1), returned as NA_integer_ with a warning).
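The result types for integer operands, and the overflow behaviour, can be inspected with typeof. A short sketch; the warning from the overflowing addition is suppressed only to keep the output quiet.

```r
# Integer operands: most operators stay integer ...
t_add  <- typeof(1L + 2L)
t_idiv <- typeof(7L %/% 2L)
t_mod  <- typeof(7L %% 2L)

# ... but / and ^ always give double
t_div  <- typeof(6L / 3L)
t_pow  <- typeof(2L ^ 2L)

# A complex operand makes the result complex
t_cplx <- typeof(1L + 1i)

# Integer overflow yields NA_integer_ (normally with a warning)
overflow <- suppressWarnings(.Machine$integer.max + 1L)

print(c(t_add, t_idiv, t_mod, t_div, t_pow, t_cplx))
print(overflow)
```

Adding a double instead (for example `.Machine$integer.max + 1`) avoids the overflow because the computation is then done in double precision.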

The rules for determining the attributes of the result are rather complicated. Most attributes are taken from the longer argument. Names will be copied from the first if it is the same length as the answer, otherwise from the second if that is. If the arguments are the same length, attributes will be copied from both, with those of the first argument taking precedence when the same attribute is present in both arguments. For time series, these operations are allowed only if the series are compatible, when the class and tsp attribute of whichever is a time series (the same, if both are) are used. For arrays (and an array result) the dimensions and dimnames are taken from first argument if it is an array, otherwise the second.

S4 methods

These operators are members of the S4 Arith group generic, and so methods can be written for them individually as well as for the group generic (or the Ops group generic), with arguments c(e1, e2) (with e2 missing for a unary operator).

Implementation limits

R is dependent on OS services (and they on FPUs) for floating-point arithmetic. On all current R platforms IEC 60559 (also known as IEEE 754) arithmetic is used, but some things in those standards are optional. In particular, the support for denormal aka subnormal numbers (those outside the range given by .Machine) may differ between platforms and even between calculations on a single platform.

Another potential issue is signed zeroes: on IEC 60559 platforms there are two zeroes with internal representations differing by sign. Where possible R treats them as the same, but for example direct output from C code often does not do so and may output ‘-0.0’ (and on Windows whether it does so or not depends on the version of Windows). One place in R where the difference might be seen is in division by zero: 1/x is Inf or -Inf depending on the sign of zero x. Another place is identical(0, -0, num.eq = FALSE).
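The two places where signed zeroes become visible can be demonstrated in a few lines. A small sketch with illustrative variable names:

```r
pos_zero <- 0
neg_zero <- -0

# Ordinary comparison treats the two zeroes as equal
equal_cmp <- pos_zero == neg_zero

# Division by zero exposes the sign
inv_pos <- 1 / pos_zero
inv_neg <- 1 / neg_zero

# identical() treats them as the same by default, but not with num.eq = FALSE
same_default <- identical(pos_zero, neg_zero)
same_bitwise <- identical(pos_zero, neg_zero, num.eq = FALSE)

print(c(inv_pos, inv_neg))
print(c(equal_cmp, same_default, same_bitwise))
```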

Note

All logical operations involving a zero-length vector have azero-length result.

The binary operators are sometimes called as functions as e.g. `&`(x, y): see the description of how argument-matching is done in Ops.

** is translated in the parser to ^, but this was undocumented for many years. It appears as an index entry in Becker et al. (1988), pointing to the help for Deprecated but is not actually mentioned on that page. Even though it had been deprecated in S for 20 years, it was still accepted in R in 2008.

References

Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988) The New S Language. Wadsworth & Brooks/Cole.

D. Goldberg (1991). What Every Computer Scientist Should Know about Floating-Point Arithmetic. ACM Computing Surveys, 23(1), 5–48. doi:10.1145/103162.103163.
Also available at https://docs.oracle.com/cd/E19957-01/806-3568/ncg_goldberg.html.

For the IEC 60559 (aka IEEE 754) standard: https://www.iso.org/standard/57469.html and https://en.wikipedia.org/wiki/IEEE_754.

On the integer division and remainder (modulo) computations, %% and %/%: https://en.wikipedia.org/wiki/Modulo_operation, and Donald Knuth (1972) The Art of Computer Programming, Vol.1.

See Also

sqrt for miscellaneous and Special for special mathematical functions.

Syntax for operator precedence.

%*% for matrix multiplication.

Examples

x <- -1:12
x + 1
2 * x + 3
x %% 3 # is periodic  2 0  1  2 0  1 ...
x %% -3 #  (ditto)    -1 0 -2 -1 0 -2 ...
x %/% 5
x %% Inf # now is defined by limit (gave NaN in earlier versions of R)
## Illustrating PR#18677, see above
1 %/% print(0.2, digits = 19)

Multi-way Arrays

Description

Creates or tests for arrays.

Usage

array(data = NA, dim = length(data), dimnames = NULL)
as.array(x, ...)
is.array(x)

Arguments

data

a vector (including a list or expression vector) giving data to fill the array. Non-atomic classed objects are coerced by as.vector.

dim

the dim attribute for the array to be created, that is an integer vector of length one or more giving the maximal indices in each dimension.

dimnames

either NULL or the names for the dimensions. This must be a list (or it will be ignored) with one component for each dimension, either NULL or a character vector of the length given by dim for that dimension. The list can be named, and the list names will be used as names for the dimensions. If the list is shorter than the number of dimensions, it is extended by NULLs to the length required.

x

an R object.

...

additional arguments to be passed to or from methods.

Details

An array in R can have one, two or more dimensions. It is simply a vector which is stored with additional attributes giving the dimensions (attribute "dim") and optionally names for those dimensions (attribute "dimnames").

A two-dimensional array is the same thing as a matrix.

One-dimensional arrays often look like vectors, but may be handled differently by some functions: str does distinguish them in recent versions of R.

The "dim" attribute is an integer vector of length one or more containing non-negative values: the product of the values must match the length of the array.

The "dimnames" attribute is optional: if present it is a list with one component for each dimension, either NULL or a character vector of the length given by the element of the "dim" attribute for that dimension.

is.array is a primitive function.

For a list array, the print method prints entries of length not one in the form ‘integer,7’ indicating the type and length.

Value

array returns an array with the extents specified in dim and naming information in dimnames. The values in data are taken to be those in the array with the leftmost subscript moving fastest. If there are too few elements in data to fill the array, then the elements in data are recycled. If data has length zero, NA of an appropriate type is used for atomic vectors (0 for raw vectors) and NULL for lists.
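The fill order, the recycling rule and the zero-length case can each be seen on a small array. A sketch with illustrative data:

```r
# Leftmost subscript moves fastest: 1, 2 fill a[1,1,1], a[2,1,1], then a[1,2,1], ...
a <- array(1:24, dim = c(2, 3, 4))
first_dim_step  <- a[2, 1, 1]
second_dim_step <- a[1, 2, 1]
third_dim_step  <- a[1, 1, 2]

# Too few elements: data are recycled to fill the 2 x 4 array
rec <- array(1:3, dim = c(2, 4))

# Zero-length data: filled with NA of the matching type
empty <- array(integer(0), dim = c(2, 2))

print(c(first_dim_step, second_dim_step, third_dim_step))
print(rec)
print(empty)
```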

Unlike matrix, array does not currently remove any attributes left by as.vector from a classed list data, so can return a list array with a class attribute.

as.array is a generic function for coercing to arrays. The default method does so by attaching a dim attribute to it. It also attaches dimnames if x has names. The sole purpose of this is to make it possible to access the dim[names] attribute at a later time.

is.array returns TRUE or FALSE depending on whether its argument is an array (i.e., has a dim attribute of positive length) or not. It is generic: you can write methods to handle specific classes of objects, see InternalMethods.
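The difference between a plain vector and its one-dimensional array counterpart is only the added attributes. A brief sketch using a small named vector (names chosen for illustration):

```r
v <- c(a = 1, b = 2, c = 3)

# A plain named vector has no dim attribute, so it is not an array
v_is_array <- is.array(v)

# as.array() attaches dim, and dimnames taken from the names
av <- as.array(v)
av_is_array <- is.array(av)
av_dim      <- dim(av)
av_dimnames <- dimnames(av)[[1]]

# A matrix is a two-dimensional array
m_is_array <- is.array(matrix(1:4, nrow = 2))

print(c(v_is_array, av_is_array, m_is_array))
print(av_dim)
print(av_dimnames)
```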

Note

is.array is a primitive function.

References

Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988) The New S Language. Wadsworth & Brooks/Cole.

See Also

aperm, matrix, dim, dimnames.

Examples

dim(as.array(letters))
array(1:3, c(2,4)) # recycle 1:3 "2 2/3 times"
#     [,1] [,2] [,3] [,4]
#[1,]    1    3    2    1
#[2,]    2    1    3    2

Convert array to data frame

Description

array2DF converts an array, including list arrays commonly returned by tapply, into data frames for use in further analysis or plotting functions.

Usage

array2DF(x, responseName = "Value",
         sep = "", base = list(LETTERS),
         simplify = TRUE, allowLong = TRUE)

Arguments

x

an array object.

responseName

character string, used for creating column name(s) in the result, if required.

sep

character string, used as separator when creating new names, if required.

base

character vector, giving an initial set of names to create dimnames of x, if missing.

simplify

logical, whether to attempt simplification of the result.

allowLong

logical, specifying whether a long format data frame should be returned if x is a list array and all elements of x are unnamed atomic vectors. Ignored unless simplify = TRUE.

Details

The main use of array2DF is to convert an array, as typically returned by tapply, into a data frame.

When simplify = FALSE, this is similar to as.data.frame.table, except that it works for list arrays as well as atomic arrays. Specifically, the resulting data frame has one row for each element of the array, with one column for each dimension of the array giving the corresponding dimnames. The contents of the array are placed in a column whose name is given by the responseName argument. The mode of this column is the same as that of x, usually an atomic vector or a list.

If x does not have dimnames, they are automatically created using base and sep.

In the default case, when simplify = TRUE, some common cases are handled specially.

If all components of x are data frames with identical column names (with possibly different numbers of rows), they are rbind-ed to form the response. The additional columns giving dimnames are repeated according to the number of rows, and responseName is ignored in this case.

If all components of x are unnamed atomic vectors and allowLong = TRUE, each component is treated as a single-column data frame with column name given by responseName, and processed as above.

In all other cases, an attempt to simplify is made by simplify2array. If this results in multiple unnamed columns, names are constructed using responseName and sep.
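The simplest case described above, simplify = FALSE on an atomic array, can be sketched as follows. The array `a` and the column name "count" are made up for the illustration, and the code assumes an R version that provides array2DF (it is part of the 4.4.1 release documented here).

```r
# A small atomic array with named dimnames (illustrative data)
a <- array(1:6, dim = c(2, 3),
           dimnames = list(row = c("r1", "r2"),
                           col = c("c1", "c2", "c3")))

# One row per array element, one column per dimension, plus the response
df <- array2DF(a, responseName = "count", simplify = FALSE)

str(df)
```

The column order and the ordering of rows follow from the dimnames; the sketch only relies on the row and column counts stated in the text.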

Value

A data frame with at least length(dim(x)) + 1 columns. The first length(dim(x)) columns each represent one dimension of x and give the corresponding values of dimnames, which are implicitly created if necessary. The remaining columns contain the contents of x, after attempted simplification if requested.

See Also

tapply, as.data.frame.table, split, aggregate.

Examples

s1 <- with(ToothGrowth,
           tapply(len, list(dose, supp), mean, simplify = TRUE))
s2 <- with(ToothGrowth,
           tapply(len, list(dose, supp), mean, simplify = FALSE))
str(s1) # atomic array
str(s2) # list array

str(array2DF(s1, simplify = FALSE)) # Value column is vector
str(array2DF(s2, simplify = FALSE)) # Value column is list
str(array2DF(s2, simplify = TRUE))  # simplified to vector

### The remaining examples use the default 'simplify = TRUE'

## List array with list components: columns are lists (no simplification)
with(ToothGrowth,
     tapply(len, list(dose, supp),
            function(x) t.test(x)[c("p.value", "alternative")])) |>
  array2DF() |> str()

## List array with data frame components: columns are atomic (simplified)
with(ToothGrowth,
     tapply(len, list(dose, supp),
            function(x) with(t.test(x), data.frame(p.value, alternative)))) |>
  array2DF() |> str()

## named vectors
with(ToothGrowth,
     tapply(len, list(dose, supp),
            quantile)) |> array2DF()

## unnamed vectors: long format
with(ToothGrowth,
     tapply(len, list(dose, supp),
            sample, size = 5)) |> array2DF()

## unnamed vectors: wide format
with(ToothGrowth,
     tapply(len, list(dose, supp),
            sample, size = 5)) |> array2DF(allowLong = FALSE)

## unnamed vectors of unequal length
with(ToothGrowth[-1, ],
     tapply(len, list(dose, supp),
            sample, replace = TRUE)) |>
  array2DF(allowLong = FALSE)

## unnamed vectors of unequal length with allowLong = TRUE
## (within-group bootstrap)
with(ToothGrowth[-1, ],
     tapply(len, list(dose, supp), sample, replace = TRUE)) |>
  array2DF() |> str()

## data frame input
tapply(ToothGrowth, ~ dose + supp, FUN = with,
       data.frame(n = length(len), mean = mean(len), sd = sd(len))) |>
  array2DF()

Coerce to a Data Frame

Description

Functions to check if an object is a data frame, or coerce it if possible.

Usage

as.data.frame(x, row.names = NULL, optional = FALSE, ...)

## S3 method for class 'character'
as.data.frame(x, ...,
              stringsAsFactors = FALSE)

## S3 method for class 'list'
as.data.frame(x, row.names = NULL, optional = FALSE, ...,
              cut.names = FALSE, col.names = names(x), fix.empty.names = TRUE,
              check.names = !optional,
              stringsAsFactors = FALSE)

## S3 method for class 'matrix'
as.data.frame(x, row.names = NULL, optional = FALSE,
              make.names = TRUE, ...,
              stringsAsFactors = FALSE)

as.data.frame.vector(x, row.names = NULL, optional = FALSE, ...,
                     nm = deparse1(substitute(x)))

is.data.frame(x)

Arguments

x

any R object.

row.names

NULL or a character vector giving the row names for the data frame. Missing values are not allowed.

optional

logical. If TRUE, setting row names and converting column names (to syntactic names: see make.names) is optional. Note that all of R's base package as.data.frame() methods use optional only for column names treatment, basically with the meaning of data.frame(*, check.names = !optional). See also the make.names argument of the matrix method.

...

additional arguments to be passed to or from methods.

stringsAsFactors

logical: should the character vector be converted to a factor?

cut.names

logical or integer; indicating if column names with more than 256 (or cut.names if that is numeric) characters should be shortened (and the last 6 characters replaced by " ...").

col.names

(optional) character vector of column names.

fix.empty.names

logical indicating if empty column names, i.e., "" should be fixed up (in data.frame) or not.

check.names

logical; passed to the data.frame() call.

make.names

a logical, i.e., one of FALSE, NA, TRUE, indicating what should happen if the row names (of the matrix x) are invalid. If they are invalid, the default, TRUE, calls make.names(*, unique=TRUE); make.names=NA will use “automatic” row names and a FALSE value will signal an error for invalid row names.

nm

a character string to be used as column name.

Details

as.data.frame is a generic function with many methods, and users and packages can supply further methods. For classes that act as vectors, often a copy of as.data.frame.vector will work as the method.

Since R 4.3.0, the default method will call as.data.frame.vector for atomic (as by is.atomic) x.

Direct calls of as.data.frame.class are still possible (base package!) for 12 atomic base classes, but are deprecated; calling as.data.frame.vector instead is recommended.

If a list is supplied, each element is converted to a column in the data frame. Similarly, each column of a matrix is converted separately. This can be overridden if the object has a class which has a method for as.data.frame: two examples are matrices of class "model.matrix" (which are included as a single column) and list objects of class "POSIXlt" which are coerced to class "POSIXct".

Arrays can be converted to data frames. One-dimensional arrays are treated like vectors and two-dimensional arrays like matrices. Arrays with more than two dimensions are converted to matrices by ‘flattening’ all dimensions after the first and creating suitable column labels.

Character variables are converted to factor columns only when stringsAsFactors = TRUE is given (the default shown under Usage is FALSE), and not when protected by I.

If a data frame is supplied, all classes preceding "data.frame" are stripped, and the row names are changed if that argument is supplied.

If row.names = NULL, row names are constructed from the names or dimnames of x if it has them; otherwise they are the integer sequence starting at one. Few of the methods check for duplicated row names. Names are removed from vector columns unless protected by I.
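The list, matrix and vector cases described above behave as follows in a minimal sketch. All data and object names are made up for the illustration.

```r
# A list: each element becomes a column
l  <- list(id = 1:3, label = c("a", "b", "c"))
dl <- as.data.frame(l)

# A matrix: each column is converted separately; unnamed columns get V1, V2, ...
m  <- matrix(1:6, nrow = 2)
dm <- as.data.frame(m)

# A named vector: row names are taken from the names
named <- c(a = 1, b = 2)
dv <- as.data.frame(named)

str(dl)
print(names(dm))
print(rownames(dv))
```

With the default stringsAsFactors = FALSE the `label` column stays a character vector.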

Value

as.data.frame returns a data frame, normally with all row names "" if optional = TRUE.

is.data.frame returns TRUE if its argument is a data frame (that is, has "data.frame" amongst its classes) and FALSE otherwise.

References

Chambers, J. M. (1992) Data for models. Chapter 3 of Statistical Models in S, eds J. M. Chambers and T. J. Hastie, Wadsworth & Brooks/Cole.

See Also

data.frame, as.data.frame.table for the table method (which has additional arguments if called directly).


Date Conversion Functions to and from Character

Description

Functions to convert between character representations and objects of class "Date" representing calendar dates.

Usage

as.Date(x, ...)
## S3 method for class 'character'
as.Date(x, format, tryFormats = c("%Y-%m-%d", "%Y/%m/%d"),
        optional = FALSE, ...)
## S3 method for class 'numeric'
as.Date(x, origin, ...)
## S3 method for class 'POSIXct'
as.Date(x, tz = "UTC", ...)

## S3 method for class 'Date'
format(x, format = "%Y-%m-%d", ...)
## S3 method for class 'Date'
as.character(x, ...)

Arguments

x

an object to be converted.

format

a character string. If not specified when converting from a character representation, it will try tryFormats one by one on the first non-NA element, and give an error if none works. Otherwise, the processing is via strptime() whose help page describes available conversion specifications.

tryFormats

character vector of format strings to try if format is not specified.

optional

logical indicating to return NA (instead of signalling an error) if the format guessing does not succeed.

origin

a Date object, or something which can be coerced by as.Date(origin, ...) to such an object or missing. In that case, "1970-01-01" is used.

tz

a time zone name.

...

further arguments to be passed from or to other methods.

Details

The usual vector re-cycling rules are applied to x and format so the answer will be of length that of the longer of the vectors.

Locale-specific conversions to and from character strings are used where appropriate and available. This affects the names of the days and months.

The as.Date methods accept character strings, factors, logical NA and objects of classes "POSIXlt" and "POSIXct". (The last is converted to days by ignoring the time after midnight in the representation of the time in specified time zone, default UTC.) Also objects of class "date" (from package date) and "dates" (from package chron). Character strings are processed as far as necessary for the format specified: any trailing characters are ignored.

as.Date will accept numeric data (the number of days since an epoch), since R 4.3.0 also when origin is not supplied.

The format and as.character methods ignore any fractional part of the date.
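The main conversion paths described above can be tried with locale-independent inputs. This is a small sketch; the date strings are arbitrary and only numeric format specifiers are used so that the result should not depend on the locale.

```r
# Both default tryFormats are recognised
d_dash  <- as.Date("2001-02-03")
d_slash <- as.Date("2001/02/03")

# Trailing characters beyond the format are ignored
d_trail <- as.Date("2001-02-03 10:30", format = "%Y-%m-%d")

# Numeric input is a number of days since the origin
d_num <- as.Date(1, origin = "1970-01-01")

# optional = TRUE returns NA instead of an error when no format matches
d_bad <- as.Date("not a date", optional = TRUE)

# format() converts back to character
s <- format(d_dash, "%d/%m/%Y")

print(c(d_dash, d_slash, d_trail, d_num, d_bad))
print(s)
```

Supplying format explicitly is usually safer than relying on tryFormats, since guessing is based only on the first non-NA element.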

Value

The format and as.character methods return a character vector representing the date. NA dates are returned as NA_character_.

The as.Date methods return an object of class "Date".

Conversion from other Systems

Most systems record dates internally as the number of days since some origin, but this is fraught with problems, including

  • Is the origin day 0 or day 1? As the ‘Examples’ show, Excel manages to use both choices for its two date systems.

  • If the origin is far enough back, the designers may show their ignorance of calendar systems. For example, Excel's designer thought 1900 was a leap year (claiming to copy the error from earlier DOS spreadsheets), and Matlab's designer chose the non-existent date of ‘January 0, 0000’ (there is no such day), not specifying the calendar. (There is such a year in the ‘Gregorian’ calendar as used in ISO 8601:2004, but that does say that it is only to be used for years before 1582 with the agreement of the parties in information exchange.)

The only safe procedure is to check the other system's values for known dates: reports on the Internet (including R-help) are more often wrong than right.

Note

The default formats follow the rules of the ISO 8601 international standard which expresses a day as "2001-02-03".

If the date string does not specify the date completely, the returned answer may be system-specific. The most common behaviour is to assume that a missing year, month or day is the current one. If it specifies a date incorrectly, reliable implementations will give an error and the date is reported as NA. Unfortunately some common implementations (such as ‘glibc’) are unreliable and guess at the intended meaning.

Years before 1 CE (aka 1 AD) will probably not be handled correctly.

References

International Organization for Standardization (2004, 1988, 1997, ...) ISO 8601. Data elements and interchange formats – Information interchange – Representation of dates and times. For links to versions available on-line see (at the time of writing) https://www.qsl.net/g1smd/isopdf.htm.

See Also

Date for details of the date class; locales to query or set a locale.

Your system's help pages on strftime and strptime to see how to specify their formats. Windows users will find no help page for strptime: code based on ‘glibc’ is used (with corrections), so all the format specifiers described here are supported, but with no alternative number representation nor era available in any locale.

Examples

## locale-specific version of the date
format(Sys.Date(), "%a %b %d")

## read in date info in format 'ddmmmyyyy'
## This will give NA(s) in some locales; setting the C locale
## as in the commented lines will overcome this on most systems.
## lct <- Sys.getlocale("LC_TIME"); Sys.setlocale("LC_TIME", "C")
x <- c("1jan1960", "2jan1960", "31mar1960", "30jul1960")
z <- as.Date(x, "%d%b%Y")
## Sys.setlocale("LC_TIME", lct)
z

## read in date/time info in format 'm/d/y'
dates <- c("02/27/92", "02/27/92", "01/14/92", "02/28/92", "02/01/92")
as.Date(dates, "%m/%d/%y")

## date given as number of days since 1900-01-01 (a date in 1989)
as.Date(32768, origin = "1900-01-01")
## Excel is said to use 1900-01-01 as day 1 (Windows default) or
## 1904-01-01 as day 0 (Mac default), but this is complicated by Excel
## incorrectly treating 1900 as a leap year.
## So for dates (post-1901) from Windows Excel
as.Date(35981, origin = "1899-12-30") # 1998-07-05
## and Mac Excel
as.Date(34519, origin = "1904-01-01") # 1998-07-05
## (these values come from http://support.microsoft.com/kb/214330)

## Experiment shows that Matlab's origin is 719529 days before ours,
## (it takes the non-existent 0000-01-01 as day 1)
## so Matlab day 734373 can be imported as
as.Date(734373) - 719529 # 2010-08-23
## (value from
## http://www.mathworks.de/de/help/matlab/matlab_prog/represent-date-and-times-in-MATLAB.html)

## Time zone effect
z <- ISOdate(2010, 04, 13, c(0,12)) # midnight and midday UTC
as.Date(z) # in UTC
## these time zone names are common
as.Date(z, tz = "NZ")
as.Date(z, tz = "HST") # Hawaii

Coerce to an Environment Object

Description

A generic function coercing an R object to an environment. A number or a character string is converted to the corresponding environment on the search path.

Usage

as.environment(x)

Arguments

x

an R object to convert. If it is already an environment, just return it. If it is a positive number, return the environment corresponding to that position on the search list. If it is -1, the environment it is called from. If it is a character string, match the string to the names on the search list.

If it is a list, the equivalent of list2env(x, parent = emptyenv()) is returned.

If is.object(x) is true and it has a class for which an as.environment method is found, that is used.

Details

This is a primitive generic function: you can write methods to handle specific classes of objects, see InternalMethods.

Value

The corresponding environment object.
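
As a minimal sketch of the list method described under Arguments (the names cfg, threshold and label are hypothetical and serve only as illustration), converting a list gives an environment whose parent is the empty environment, so lookups do not fall through to the search path:

```r
# Convert a list to an environment; for a list this is the equivalent of
# list2env(x, parent = emptyenv()).
cfg <- as.environment(list(threshold = 0.5, label = "demo"))

sort(ls(cfg))                               # names of the objects in cfg
identical(parent.env(cfg), emptyenv())      # the parent is the empty environment
exists("sum", envir = cfg)                  # base functions are not visible from cfg
identical(as.environment(1), globalenv())   # position 1 on the search list
```

Because the parent is empty, such an environment is probably best treated as a plain container rather than as a place to evaluate code.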

Author(s)

John Chambers

See Also

environment for creation and manipulation, search; list2env.

Examples

as.environment(1) ## the global environment
identical(globalenv(), as.environment(1)) ## is TRUE
try( ## <<- stats need not be attached
    as.environment("package:stats"))
ee <- as.environment(list(a = "A", b = pi, ch = letters[1:8]))
ls(ee) # names of objects in ee
utils::ls.str(ee)

Convert Object to Function

Description

as.function is a generic function which is used to convert objects to functions.

as.function.default works on a list x, which should contain the concatenation of a formal argument list and an expression or an object of mode "call" which will become the function body. The function will be defined in a specified environment, by default that of the caller.

Usage

as.function(x, ...)

## Default S3 method:
as.function(x, envir = parent.frame(), ...)

Arguments

x

object to convert, a list for the default method.

...

additional arguments to be passed to or from methods.

envir

environment in which the function should be defined.

Value

The desired function.
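
The following sketch illustrates the default method under stated assumptions: the names add_ab, shift, env and offset are hypothetical. It shows the argument list followed by the body, and how envir determines where free variables in the body are looked up.

```r
# Build a function from an argument list followed by a body.
# alist() keeps `a = ` as an argument without a default.
add_ab <- as.function(alist(a = , b = 2, a + b))

r1 <- add_ab(3)           # uses the default b = 2
r2 <- add_ab(3, b = 10)   # overrides the default

# Define the function in a chosen environment: the free variable
# `offset` in the body is then found in `env`.
env <- new.env()
assign("offset", 100, envir = env)
shift <- as.function(alist(x = , x + offset), envir = env)
r3 <- shift(1)

c(r1, r2, r3)
```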

Author(s)

Peter Dalgaard

See Also

function; alist which is handy for the construction of argument lists, etc.

Examples

as.function(alist(a = , b = 2, a+b))
as.function(alist(a = , b = 2, a+b))(3)

Date-time Conversion Functions

Description

Functions to manipulate objects of classes "POSIXlt" and "POSIXct" representing calendar dates and times.

Usage

as.POSIXct(x, tz = "", ...)
as.POSIXlt(x, tz = "", ...)

## S3 method for class 'character'
as.POSIXlt(x, tz = "", format,
           tryFormats = c("%Y-%m-%d %H:%M:%OS",
                          "%Y/%m/%d %H:%M:%OS",
                          "%Y-%m-%d %H:%M",
                          "%Y/%m/%d %H:%M",
                          "%Y-%m-%d",
                          "%Y/%m/%d"),
           optional = FALSE, ...)
## Default S3 method:
as.POSIXlt(x, tz = "",
           optional = FALSE, ...)
## S3 method for class 'numeric'
as.POSIXlt(x, tz = "", origin, ...)

## S3 method for class 'Date'
as.POSIXct(x, tz = "UTC", ...)
## S3 method for class 'Date'
as.POSIXlt(x, tz = "UTC", ...)
## S3 method for class 'numeric'
as.POSIXct(x, tz = "", origin, ...)

## S3 method for class 'POSIXlt'
as.double(x, ...)

Arguments

x

R object to be converted.

tz

a character string. The time zone specification to be used for the conversion, if one is required. System-specific (see time zones), but "" is the current time zone, and "GMT" is UTC (Universal Time, Coordinated). Invalid values are most commonly treated as UTC, on some platforms with a warning.

...

further arguments to be passed to or from other methods.

format

character string giving a date-time format as used by strptime.

tryFormats

character vector of format strings to try if format is not specified.

optional

logical indicating to return NA (instead of signalling an error) if the format guessing does not succeed.

origin

a date-time object, or something which can be coerced by as.POSIXct(tz = "GMT") to such an object. Optional since R 4.3.0, where the equivalent of "1970-01-01" is used.

Details

The as.POSIX* functions convert an object to one of the two classes used to represent date/times (calendar dates plus time to the nearest second). They can convert objects of the other class and of class "Date" to these classes. Dates without times are treated as being at midnight UTC.

They can also convert character strings of the formats "2001-02-03" and "2001/02/03" optionally followed by white space and a time in the format "14:52" or "14:52:03". (Formats such as "01/02/03" are ambiguous but can be converted via a format specification by strptime.) Fractional seconds are allowed. Alternatively, format can be specified for character vectors or factors: if it is not specified and no standard format works for all non-NA inputs an error is thrown.

If format is specified, remember that some of the format specifications are locale-specific, and you may need to set the LC_TIME category appropriately via Sys.setlocale. This most often affects the use of %a, %A (weekday names), %b, %B (month names) and %p (AM/PM).

Logical NAs can be converted to either of the classes, but no other logical vectors can be.

If you are given a numeric time as the number of seconds since an epoch, see the examples.

Character input is first converted to class "POSIXlt" by strptime: numeric input is first converted to "POSIXct". Any conversion that needs to go between the two date-time classes requires a time zone: conversion from "POSIXlt" to "POSIXct" will validate times in the selected time zone. One issue is what happens at transitions to and from DST, for example in the UK

as.POSIXct(strptime("2011-03-27 01:30:00", "%Y-%m-%d %H:%M:%S"))
as.POSIXct(strptime("2010-10-31 01:30:00", "%Y-%m-%d %H:%M:%S"))

are respectively invalid (the clocks went forward at 1:00 GMT to 2:00 BST) and ambiguous (the clocks went back at 2:00 BST to 1:00 GMT). What happens in such cases is OS-specific: one should expect the first to be NA, but the second could be interpreted as either BST or GMT (and common OSes give both possible values). Note too (see strftime) that OS facilities may not format invalid times correctly.

Value

as.POSIXct and as.POSIXlt return an object of the appropriate class. If tz was specified, as.POSIXlt will give an appropriate "tzone" attribute. Date-times known to be invalid will be returned as NA.

Note

Some of the concepts used have to be extended backwards in time (the usage is said to be ‘proleptic’). For example, the origin of time for the "POSIXct" class, ‘1970-01-01 00:00.00 UTC’, is before UTC was defined. More importantly, conversion is done assuming the Gregorian calendar which was introduced in 1582 and not used near-universally until the 20th century. One of the re-interpretations assumed by ISO 8601:2004 is that there was a year zero, even though current year numbering (and zero) is a much later concept (525 CE for year numbers from 1 CE).

Conversions between "POSIXlt" and "POSIXct" of future times are speculative except in UTC. The main uncertainty is in the use of and transitions to/from DST (most systems will assume the continuation of current rules but these can be changed at short notice).

If you want to extract specific aspects of a time (such as the day of the week) just convert it to class "POSIXlt" and extract the relevant component(s) of the list, or if you want a character representation (such as a named day of the week) use the format method.
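
As a small sketch of extracting components (the variable names lt and ct are hypothetical, and the time zone is fixed to UTC so that the result should not depend on the local settings):

```r
# A fixed instant in UTC.
lt <- as.POSIXlt("2001-02-03 14:52:03", tz = "UTC")

lt$hour          # hours, 0-23
lt$min           # minutes
lt$mon + 1       # months are stored as 0-11
lt$year + 1900   # years are stored as years since 1900
lt$wday          # day of the week, 0 = Sunday
format(lt, "%Y-%m-%d %H:%M")   # character representation, numeric fields only

# Numeric input is taken as seconds since the origin.
ct <- as.POSIXct(86400, origin = "1970-01-01", tz = "UTC")
format(ct, "%Y-%m-%d")
```

Formats such as %A or %b would give locale-dependent names, which is why only numeric fields are used here.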

If a time zone is needed and that specified is invalid on your system, what happens is system-specific but attempts to set it will probably be ignored.

Conversion from character needs to find a suitable format unless one is supplied (by trying common formats in turn): this can be slow for long inputs.

See Also

DateTimeClasses for details of the classes; strptime for conversion to and from character representations.

Sys.timezone for details of the (system-specific) naming of time zones.

locales for locale-specific aspects.

Examples

(z <- Sys.time())               # the current datetime, as class "POSIXct"
unclass(z)                      # a large integer
floor(unclass(z)/86400)         # the number of days since 1970-01-01 (UTC)
(now <- as.POSIXlt(Sys.time())) # the current datetime, as class "POSIXlt"
str(unclass(now))               # the internal list ; use now$hour, etc :
now$year + 1900                 # see ?DateTimeClasses
months(now); weekdays(now)      # see ?months; using LC_TIME locale

## suppose we have a time in seconds since 1960-01-01 00:00:00 GMT
## (the origin used by SAS)
z <- 1472562988
# ways to convert this
as.POSIXct(z, origin = "1960-01-01")              # local
as.POSIXct(z, origin = "1960-01-01", tz = "GMT")  # in UTC

## SPSS dates (R-help 2006-02-16)
z <- c(10485849600, 10477641600, 10561104000, 10562745600)
as.Date(as.POSIXct(z, origin = "1582-10-14", tz = "GMT"))

## Stata date-times: milliseconds since 1960-01-01 00:00:00 GMT
## format %tc excludes leap-seconds, assumed here
## For format %tC including leap seconds, see foreign::read.dta()
z <- 1579598122120
op <- options(digits.secs = 3)
# avoid rounding down: milliseconds are not exactly representable
as.POSIXct((z+0.1)/1000, origin = "1960-01-01")
options(op)

## Matlab 'serial day number' (days and fractional days)
z <- 7.343736909722223e5 # 2010-08-23 16:35:00
as.POSIXct((z - 719529)*86400, origin = "1970-01-01", tz = "UTC")

as.POSIXlt(Sys.time(), "GMT") # the current time in UTC

## These may not be correct names on your system
as.POSIXlt(Sys.time(), "America/New_York")  # in New York
as.POSIXlt(Sys.time(), "EST5EDT")           # alternative.
as.POSIXlt(Sys.time(), "EST")   # somewhere in Eastern Canada
as.POSIXlt(Sys.time(), "HST")   # in Hawaii
as.POSIXlt(Sys.time(), "Australia/Darwin")

tab <- file.path(R.home("share"), "zoneinfo", "zone1970.tab")
if(file.exists(tab)) { # typically on Windows; *not* on Linux
  cols <- c("code", "coordinates", "TZ", "comments")
  tmp <- read.delim(tab,
                    header = FALSE, comment.char = "#", col.names = cols)
  if(interactive()) View(tmp)
  head(tmp, 10)
}

Inhibit Interpretation/Conversion of Objects

Description

Change the class of an object to indicate that it should be treated ‘as is’.

Usage

I(x)

Arguments

x

an object

Details

Function I has two main uses.

  • In function data.frame. Protecting an object by enclosing it in I() in a call to data.frame inhibits the conversion of character vectors to factors and the dropping of names, and ensures that matrices are inserted as single columns. I can also be used to protect objects which are to be added to a data frame, or converted to a data frame via as.data.frame.

    It achieves this by prepending the class "AsIs" to the object's classes. Class "AsIs" has a few of its own methods, including for [, as.data.frame, print and format.

  • In function formula. There it is used to inhibit the interpretation of operators such as "+", "-", "*" and "^" as formula operators, so they are used as arithmetical operators. This is interpreted as a symbol by terms.formula.

Value

A copy of the object with class "AsIs" prepended to the class(es).
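
The two main uses of I can be sketched as follows. The data and the names df, payload and fit are hypothetical, chosen only for illustration.

```r
# Use in data.frame(): with I(), a list is stored as one list column
# of class "AsIs" instead of being converted.
df <- data.frame(id = 1:2, payload = I(list(1:3, letters[1:2])))
class(df$payload)
ncol(df)

# Use in a formula: I() makes ^ ordinary arithmetic rather than a
# formula operator.
x <- 1:10
y <- 2 * x^2
fit <- stats::lm(y ~ I(x^2))
stats::coef(fit)   # the slope on I(x^2) should be close to 2
```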

References

Chambers, J. M. (1992) Linear models. Chapter 4 of Statistical Models in S eds J. M. Chambers and T. J. Hastie, Wadsworth & Brooks/Cole.

See Also

data.frame, formula


Split Array/Matrix By Its Margins

Description

Split an array or matrix by its margins.

Usage

asplit(x, MARGIN)

Arguments

x

an array, including a matrix.

MARGIN

a vector giving the margins to split by. E.g., for a matrix 1 indicates rows, 2 indicates columns, c(1, 2) indicates rows and columns. Where x has named dimnames, it can be a character vector selecting dimension names.

Details

Since R 4.1.0, one can also obtain the splits (less efficiently) using apply(x, MARGIN, identity, simplify = FALSE). The values of the splits can also be obtained (less efficiently) by split(x, slice.index(x, MARGIN)).

Value

A “list array” with dimension dv and each element an array of dimension de and dimnames preserved as available, where dv and de are, respectively, the dimensions of x included and not included in MARGIN.
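
A brief sketch of the shape of the result for a small matrix (the names m, rows, cols and rows2 are hypothetical; the apply() call assumes R 4.1.0 or later, as noted in Details):

```r
m <- matrix(1:6, nrow = 2, ncol = 3)

rows <- asplit(m, 1)   # list array of length 2; each element holds 3 values
cols <- asplit(m, 2)   # list array of length 3; each element holds 2 values

length(rows); length(cols)
as.vector(rows[[1]])   # values of the first row of m
as.vector(cols[[3]])   # values of the third column of m

# The same values via apply(), per the Details section
rows2 <- apply(m, 1, identity, simplify = FALSE)
as.vector(rows2[[2]])
```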

Examples

## A 3-dimensional array of dimension 2 x 3 x 4:
d <- 2 : 4
x <- array(seq_len(prod(d)), d)
x
## Splitting by margin 2 gives a 1-d list array of length 3
## consisting of 2 x 4 arrays:
asplit(x, 2)
## Splitting by margins 1 and 2 gives a 2 x 3 list array
## consisting of 1-d arrays of length 4:
asplit(x, c(1, 2))
## Compare to
split(x, slice.index(x, c(1, 2)))

## A 2 x 3 matrix:
(x <- matrix(1 : 6, 2, 3))
## To split x by its rows, one can use
asplit(x, 1)
## or less efficiently
split(x, slice.index(x, 1))
split(x, row(x))

Assign a Value to a Name

Description

Assign a value to a name in an environment.

Usage

assign(x, value, pos = -1, envir = as.environment(pos),
       inherits = FALSE, immediate = TRUE)

Arguments

x

a variable name, given as a character string. No coercion is done, and the first element of a character vector of length greater than one will be used, with a warning.

value

a value to be assigned to x.

pos

where to do the assignment. By default, assigns into the current environment. See ‘Details’ for other possibilities.

envir

the environment to use. See ‘Details’.

inherits

should the enclosing frames of the environment be inspected?

immediate

an ignored compatibility feature.

Details

There are no restrictions on the name given as x: it can be a non-syntactic name (see make.names).

The pos argument can specify the environment in which to assign the object in any of several ways: as -1 (the default), as a positive integer (the position in the search list); as the character string name of an element in the search list; or as an environment (including using sys.frame to access the currently active function calls). The envir argument is an alternative way to specify an environment, but is primarily for back compatibility.

assign does not dispatch assignment methods, so it cannot be used to set elements of vectors, names, attributes, etc.

Note that assignment to an attached list or data frame changes the attached copy and not the original object: see attach and with.

Value

This function is invoked for its side effect, which is assigning value to the variable x. If no envir is specified, then the assignment takes place in the currently active environment.

If inherits is TRUE, enclosing environments of the supplied environment are searched until the variable x is encountered. The value is then assigned in the environment in which the variable is encountered (provided that the binding is not locked: see lockBinding: if it is, an error is signaled). If the symbol is not encountered then assignment takes place in the user's workspace (the global environment).

If inherits is FALSE, assignment takes place in the initial frame of envir, unless an existing binding is locked or there is no existing binding and the environment is locked (when an error is signaled).
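
The effect of inherits can be sketched with two nested environments (the names parent, child and "my var" are hypothetical):

```r
parent <- new.env()
child  <- new.env(parent = parent)
assign("x", 1, envir = parent)

# inherits = TRUE: x is found in the enclosing environment `parent`,
# so the value is changed there and no binding is created in `child`.
assign("x", 2, envir = child, inherits = TRUE)
in_parent    <- get("x", envir = parent, inherits = FALSE)
child_has_x  <- exists("x", envir = child, inherits = FALSE)

# inherits = FALSE (the default): a new binding is created in `child`.
assign("x", 3, envir = child)
in_child <- get("x", envir = child, inherits = FALSE)

# Non-syntactic names are allowed.
assign("my var", "ok", envir = child)
get("my var", envir = child)
```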

References

Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988) The New S Language. Wadsworth & Brooks/Cole.

See Also

<-, get, the inverse of assign(), exists, environment.

Examples

for(i in 1:6) { #-- Create objects  'r.1', 'r.2', ... 'r.6' --
    nam <- paste("r", i, sep = ".")
    assign(nam, 1:i)
}
ls(pattern = "^r..$")

##-- Global assignment within a function:
myf <- function(x) {
    innerf <- function(x) assign("Global.res", x^2, envir = .GlobalEnv)
    innerf(x+1)
}
myf(3)
Global.res # 16

a <- 1:4
assign("a[1]", 2)
a[1] == 2          # FALSE
get("a[1]") == 2   # TRUE

Assignment Operators

Description

Assign a value to a name.

Usage

x <- value
x <<- value
value -> x
value ->> x

x = value

Arguments

x

a variable name (possibly quoted).

value

a value to be assigned to x.

Details

There are three different assignment operators: two of them have leftwards and rightwards forms.

The operators <- and = assign into the environment in which they are evaluated. The operator <- can be used anywhere, whereas the operator = is only allowed at the top level (e.g., in the complete expression typed at the command prompt) or as one of the subexpressions in a braced list of expressions.

The operators <<- and ->> are normally only used in functions, and cause a search to be made through parent environments for an existing definition of the variable being assigned. If such a variable is found (and its binding is not locked) then its value is redefined, otherwise assignment takes place in the global environment. Note that their semantics differ from that in the S language, but are useful in conjunction with the scoping rules of R. See ‘The R Language Definition’ manual for further details and examples.
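
A common illustration of <<- together with R's scoping rules is a counter held in a closure. This is only a sketch; make_counter, counter and count are hypothetical names.

```r
make_counter <- function() {
  count <- 0               # lives in the environment of this call
  function() {
    count <<- count + 1    # finds `count` in the enclosing environment
    count
  }
}

counter <- make_counter()
first  <- counter()
second <- counter()
c(first, second)
```

Because an existing `count` is found in the enclosing function environment, the assignment does not reach the global environment.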

In all the assignment operator expressions, x can be a name or an expression defining a part of an object to be replaced (e.g., z[[1]]). A syntactic name does not need to be quoted, though it can be (preferably by backticks).

The leftwards forms of assignment <- = <<- group right to left, the other from left to right.

Value

value. Thus one can use a <- b <- c <- 6.

References

Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988) The New S Language. Wadsworth & Brooks/Cole.

Chambers, J. M. (1998) Programming with Data. A Guide to the S Language. Springer (for =).

See Also

assign (and its inverse get), for “subassignment” such as x[i] <- v, see [<-; further, environment.


Attach Set of R Objects to Search Path

Description

The database is attached to the R search path. This means that the database is searched by R when evaluating a variable, so objects in the database can be accessed by simply giving their names.

Usage

attach(what, pos = 2L, name = deparse1(substitute(what), backtick = FALSE),
       warn.conflicts = TRUE)

Arguments

what

‘database’. This can be a data.frame or a list or an R data file created with save or NULL or an environment. See also ‘Details’.

pos

integer specifying position in search() where to attach.

name

name to use for the attached database. Names starting with package: are reserved for library.

warn.conflicts

logical. If TRUE, message()s are printed about conflicts from attaching the database, unless that database contains an object .conflicts.OK. A conflict is a function masking a function, or a non-function masking a non-function.

NB: Even though the name is warn.conflicts for historical reasons, the messages about conflicts are not warning()s but message()s.

Details

When evaluating a variable or function name R searches for that name in the databases listed by search. The first name of the appropriate type is used.

By attaching a data frame (or list) to the search path it is possible to refer to the variables in the data frame by their names alone, rather than as components of the data frame (e.g., in the example below, height rather than women$height).

By default the database is attached in position 2 in the search path, immediately after the user's workspace and before all previously attached packages and previously attached databases. This can be altered to attach later in the search path with the pos option, but you cannot attach at pos = 1.

The database is not actually attached. Rather, a new environment is created on the search path and the elements of a list (including columns of a data frame) or objects in a save file or an environment are copied into the new environment. If you use <<- or assign to assign to an attached database, you only alter the attached copy, not the original object. (Normal assignment will place a modified version in the user's workspace: see the examples.) For this reason attach can lead to confusion.

One useful ‘trick’ is to use what = NULL (or equivalently a length-zero list) to create a new environment on the search path into which objects can be assigned by assign or load or sys.source.

Names starting "package:" are reserved for library and should not be used by end users. Attached files are by default given the name file:what. The name argument given for the attached environment will be used by search and can be used as the argument to as.environment.

Value

The environment is returned invisibly with a "name" attribute.

Good practice

attach has the side effect of altering the search path and this can easily lead to the wrong object of a particular name being found. People do often forget to detach databases.

In interactive use, with is usually preferable to the use of attach/detach, unless what is a save()-produced file in which case attach() is a (safety) wrapper for load().

In programming, functions should not change the search path unless that is their purpose. Often with can be used within a function. If not, good practice is to

  • Always use a distinctive name argument, and

  • To immediately follow the attach call by an on.exit call to detach using the distinctive name.

This ensures that the search path is left unchanged even if the function is interrupted or if code after the attach call changes the search path.
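
The two points of good practice can be sketched as follows. The function mean_height, the name "tmp_mean_height_db" and the data are hypothetical, and the sketch assumes no object called height exists in the workspace.

```r
mean_height <- function(df) {
  # Distinctive name, so that exactly this entry is detached later.
  attach(df, name = "tmp_mean_height_db", warn.conflicts = FALSE)
  # Registered immediately after attach(): runs even if later code fails.
  on.exit(detach("tmp_mean_height_db", character.only = TRUE))
  mean(height)   # found via the search path
}

before <- search()
res    <- mean_height(data.frame(height = c(150, 160, 170)))
after  <- search()

res
identical(before, after)   # the search path is restored
```

In most cases with(df, mean(height)) would be the simpler choice; the sketch only covers the situation where attach cannot be avoided.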

References

Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988) The New S Language. Wadsworth & Brooks/Cole.

See Also

library, detach, search, objects, environment, with.

Examples

require(utils)

summary(women$height)   # refers to variable 'height' in the data frame
attach(women)
summary(height)         # The same variable now available by name
height <- height*2.54   # Don't do this. It creates a new variable
                        # in the user's workspace
find("height")
summary(height)         # The new variable in the workspace
rm(height)
summary(height)         # The original variable.
height <<- height*25.4  # Change the copy in the attached environment
find("height")
summary(height)         # The changed copy
detach("women")
summary(women$height)   # unchanged

## Not run: ## create an environment on the search path and populate it
sys.source("myfuns.R", envir = attach(NULL, name = "myfuns"))

## End(Not run)

Object Attributes

Description

Get or set specific attributes of an object.

Usage

attr(x, which, exact = FALSE)
attr(x, which) <- value

Arguments

x

an object whose attributes are to be accessed.

which

a non-empty character string specifying which attribute is to be accessed.

exact

logical: should which be matched exactly?

value

an object, the new value of the attribute, or NULL to remove the attribute.

Details

These functions provide access to a single attribute of an object. The replacement form causes the named attribute to take the value specified (or create a new attribute with the value given).

The extraction function first looks for an exact match to which amongst the attributes of x, then (unless exact = TRUE) a unique partial match. (Setting options(warnPartialMatchAttr = TRUE) causes partial matches to give warnings.)

The replacement function only uses exact matches.

Note that some attributes (namely class, comment, dim, dimnames, names, row.names and tsp) are treated specially and have restrictions on the values which can be set. (Note that this is not true of levels which should be set for factors via the levels replacement function.)

The extractor function allows (and does not match) empty and missing values of which: the replacement function does not.

NULL objects cannot have attributes and attempting to assign one by attr gives an error.

Both are primitive functions.

Value

For the extractor, the value of the attribute matched, or NULL if no exact match is found and no or more than one partial match is found.
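
Exact and partial matching can be sketched with a user-defined attribute (the attribute name "units" and the variable names are hypothetical):

```r
x <- 1:10
attr(x, "units") <- "cm"                 # create a new attribute

exact_hit   <- attr(x, "units")          # exact match
partial_hit <- attr(x, "un")             # unique partial match, allowed by default
strict_miss <- attr(x, "un", exact = TRUE)   # partial matching disabled

attr(x, "units") <- NULL                 # assigning NULL removes the attribute
is.null(attributes(x))
```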

References

Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988) The New S Language. Wadsworth & Brooks/Cole.

See Also

attributes

Examples

# create a 2 by 5 matrix
x <- 1:10
attr(x, "dim") <- c(2, 5)

Object Attribute Lists

Description

These functions access an object's attributes. The first form below returns the object's attribute list. The replacement forms use the list on the right-hand side of the assignment as the object's attributes (if appropriate).

Usage

attributes(x)
attributes(x) <- value
mostattributes(x) <- value

Arguments

x

any R object.

value

an appropriate named list of attributes, or NULL.

Details

Unlike attr it is not an error to set attributes on a NULL object: it will first be coerced to an empty list.

Note that some attributes (namely class, comment, dim, dimnames, names, row.names and tsp) are treated specially and have restrictions on the values which can be set. (Note that this is not true of levels which should be set for factors via the levels replacement function.)

Attributes are not stored internally as a list and should be thought of as a set and not a vector, i.e., the order of the elements of attributes() does not matter. This is also reflected by identical()'s behaviour with the default argument attrib.as.set = TRUE. Attributes must have unique names (and NA is taken as "NA", not a missing value).

Assigning attributes first removes all attributes, then sets any dim attribute and then the remaining attributes in the order given: this ensures that setting a dim attribute always precedes the dimnames attribute.

The mostattributes assignment takes special care for the dim, names and dimnames attributes, and assigns them only when known to be valid whereas an attributes assignment would give an error if any are not. It is principally intended for arrays, and should be used with care on classed objects. For example, it does not check that row.names are assigned correctly for data frames.

The names of a pairlist are not stored as attributes, but are reported as if they were (and can be set by the replacement form of attributes).

NULL objects cannot have attributes and attempts to assign them will promote the object to an empty list.
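
Two of the points above, the ordering of dim before dimnames and the promotion of NULL, can be sketched as follows (the attribute name "note" and the variable names are hypothetical):

```r
x <- 1:6
# dimnames is listed before dim, but dim is set first, so this is valid.
attributes(x) <- list(dimnames = list(c("a", "b"), NULL), dim = 2:3)
dim(x)
rownames(x)

# Setting attributes on NULL promotes it to an empty list.
y <- NULL
attributes(y) <- list(note = "hypothetical attribute")
is.list(y); length(y)
attr(y, "note")
```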

Both assignment and replacement forms of attributes are primitive functions.

References

Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988) The New S Language. Wadsworth & Brooks/Cole.

See Also

attr, structure.

Examples

x <- cbind(a = 1:3, pi = pi) # simple matrix with dimnames
attributes(x)

## strip an object's attributes:
attributes(x) <- NULL
x # now just a vector of length 6

mostattributes(x) <- list(mycomment = "really special", dim = 3:2,
   dimnames = list(LETTERS[1:3], letters[1:5]), names = paste(1:6))
x # dim(), but not {dim}names

On-demand Loading of Packages

Description

autoload creates a promise-to-evaluate autoloader and stores it with name name in .AutoloadEnv environment. When R attempts to evaluate name, autoloader is run, the package is loaded and name is re-evaluated in the new package's environment. The result is that R behaves as if package was loaded but it does not occupy memory.

.Autoloaded contains the names of the packages for which autoloading has been promised.

Usage

autoload(name, package, reset = FALSE, ...)
autoloader(name, package, ...)

.AutoloadEnv
.Autoloaded

Arguments

name

string giving the name of an object.

package

string giving the name of a package containing the object.

reset

logical: for internal use by autoloader.

...

other arguments to library.

Value

This function is invoked for its side-effect. It has no return value.

See Also

delayedAssign, library

Examples

require(stats)
autoload("interpSpline", "splines")
search()
ls("Autoloads")
.Autoloaded

x <- sort(stats::rnorm(12))
y <- x^2
is <- interpSpline(x, y)
search() ## now has splines
detach("package:splines")
search()
is2 <- interpSpline(x, y+x)
search() ## and again
detach("package:splines")

Solve an Upper or Lower Triangular System

Description

Solves a triangular system of linear equations.

Usage

backsolve(r, x, k = ncol(r), upper.tri = TRUE,
             transpose = FALSE)
forwardsolve(l, x, k = ncol(l), upper.tri = FALSE,
             transpose = FALSE)

Arguments

r, l

an upper (or lower) triangular matrix giving the coefficients for the system to be solved. Values below (above) the diagonal are ignored.

x

a matrix whose columns give the right-hand sides for the equations.

k

the number of columns of r and rows of x to use.

upper.tri

logical; if TRUE (default), the upper triangular part of r is used. Otherwise, the lower one.

transpose

logical; if TRUE, solve $r' y = x$ for $y$, i.e., t(r) %*% y == x.

Details

Solves a system of linear equations where the coefficient matrix is upper (or ‘right’, ‘R’) or lower (‘left’, ‘L’) triangular.

x <- backsolve (R, b) solves $R x = b$, and
x <- forwardsolve(L, b) solves $L x = b$, respectively.

The r/l must have at least k rows and columns, and x must have at least k rows.

This is a wrapper for the level-3 BLAS routine dtrsm.

Value

The solution of the triangular system. The result will be a vector if x is a vector and a matrix if x is a matrix.
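
As a small check of the relations stated in Details, the following sketch solves one system and verifies the result; it also compares the transposed solve with forwardsolve on t(r). The names b, y, y_t and y_fw are hypothetical.

```r
r <- rbind(c(1, 2, 3),
           c(0, 1, 1),
           c(0, 0, 2))
b <- c(8, 4, 2)

y <- backsolve(r, b)                 # solves r %*% y == b
ok_solve <- all.equal(drop(r %*% y), b)

# t(r) is lower triangular, so the transposed solve and
# forwardsolve() on t(r) should agree.
y_t  <- backsolve(r, b, transpose = TRUE)
y_fw <- forwardsolve(t(r), b)
ok_transpose <- all.equal(y_t, y_fw)

y; y_t
```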

References

Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988) The New S Language. Wadsworth & Brooks/Cole.

Dongarra, J. J., Bunch, J. R., Moler, C. B. and Stewart, G. W. (1978) LINPACK Users Guide. Philadelphia: SIAM Publications.

See Also

chol, qr, solve.

Examples

## upper triangular matrix 'r':
r <- rbind(c(1,2,3),
           c(0,1,1),
           c(0,0,2))
( y <- backsolve(r, x <- c(8,4,2)) ) # -1 3 1
r %*% y # == x = (8,4,2)
backsolve(r, x, transpose = TRUE) # 8 -12 -5

Balancing “Ragged” and Out-of-range POSIXlt Date-Times

Description

Utilities to ‘balance’ objects of class "POSIXlt".

unCfillPOSIXlt(x) is a fast primitive version of balancePOSIXlt(x, fill.only=TRUE, classed=FALSE) or equivalently, unclass(balancePOSIXlt(x, fill.only=TRUE)) from where it is named.

Usage

balancePOSIXlt(x, fill.only = FALSE, classed = TRUE)
unCfillPOSIXlt(x)

Arguments

x

an R object inheriting from "POSIXlt", see POSIXlt.

fill.only

a logical specifying if balancePOSIXlt(x, ..) should only “fill up” by recycling, but not re-check validity nor recompute, e.g., x$wday and x$yday.

classed

a logical specifying if the result should be classed, true by default. Using balancePOSIXlt(x, classed = FALSE) is equivalent to but faster than unclass(balancePOSIXlt(x)).

“Ragged” and Out-of-range vs “Balanced” POSIXlt

Note that "POSIXlt" objects x may have their (9 to 11) list components of different lengths, by simply recycling them to full length. Prior to R 4.3.0, this has worked in printing, formatting, and conversion to "POSIXct", but often not for length(), conversion to "Date" or indexing, i.e., subsetting, [, or subassigning, [<-.

Relatedly, components sec, min, hour, mday and mon could have been out of their designated range (say, 0–23 for hours) and still work correctly, e.g. in conversions and printing. This is supported as well, since R 4.3.0, at least when the values are not extreme.

Function balancePOSIXlt(x) will now return a version of the "POSIXlt" object x which by default is balanced in both ways: All the internal list components are of full length, and their values are inside their ranges as specified in as.POSIXlt's ‘Details on POSIXlt’. Setting fill.only = TRUE will only recycle the list components to full length, but not check them at all. This is particularly faster when all components of x are already of full length.

Experimentally, balancePOSIXlt() and other functions returning POSIXlt objects now set a logical attribute "balanced" with NA meaning “filled-in”, i.e., not “ragged” and TRUE means (fully) balanced.

See Also

For more details about many aspects of valid POSIXlt objects, notably their internal list components, see ‘DateTimeClasses’, e.g., as.POSIXlt, notably the section ‘Details on POSIXlt’.

Examples

## FIXME: this should also work for regular (non-UTC) time zones.
TZ <- "UTC"
# Could be
# d1 <- as.POSIXlt("2000-01-02 3:45", tz = TZ)
# on systems (almost all) which have tm_zone.
oldTZ <- Sys.getenv('TZ', unset = "unset")
Sys.setenv(TZ = "UTC")
d1 <- as.POSIXlt("2000-01-02 3:45")
d1$min <- d1$min + (0:16)*20L
(f1 <- format(d1))
str(unclass(d1))      # only $min is of length > 1
df <- balancePOSIXlt(d1, fill.only = TRUE) # a "POSIXlt" object
str(unclass(df))      # all of length 17; 'min' unchanged
db <- balancePOSIXlt(d1, classed = FALSE)  # a list
stopifnot(identical(
    unCfillPOSIXlt(d1),
    balancePOSIXlt(d1, fill.only = TRUE, classed = FALSE)))
str(db) # of length 17 *and* in range

if(oldTZ == "unset") Sys.unsetenv('TZ') else Sys.setenv(TZ = oldTZ)

Manipulate File Paths

Description

basename removes all of the path up to and including the last path separator (if any).

dirname returns the part of the path up to but excluding the last path separator, or "." if there is no path separator.

Usage

basename(path)
dirname(path)

Arguments

path

character vector, containing path names.

Details

Tilde expansion of the path will be performed.

Trailing path separators are removed before dissecting the path, and for dirname any trailing file separators are removed from the result.

Value

A character vector of the same length as path. A zero-length input will give a zero-length output with no error.

Paths not containing any separators are taken to be in the current directory, so dirname returns ".".

If an element of path is NA, so is the result.

"" is not a valid pathname, but is returned unchanged.

Behaviour on Windows

On Windows this will accept either\ or/ as the pathseparator, butdirname will return a path using/(except if on a network share, when the leading\\ will bepreserved). Expect these only to be able to handle completepaths, and not for example just a network share or a drive.

UTF-8-encoded path names not valid in the current locale can be used.

Note

These are not wrappers for the POSIX system functions of the same names: in particular they do not have the special handling of the path "/" and of returning "." for empty strings.
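The rules in the ‘Value’ section can be seen side by side in a small sketch. The paths are made up for illustration and do not need to exist on disk.

```r
# Illustrative paths only; none of them needs to exist.
p <- c("/p1/p2/file.txt",  # ordinary path
       "file.txt",         # no separator at all
       "/p1/p2/",          # trailing separator
       NA_character_,      # missing value
       "")                 # empty string

base_part <- basename(p)
dir_part  <- dirname(p)

# Show both results next to the inputs.
data.frame(path = p, basename = base_part, dirname = dir_part)
```

The trailing separator in the third element is dropped before the path is split, so that element is treated like "/p1/p2".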

See Also

file.path, path.expand.

Examples

basename(file.path("", "p1", "p2", "p3", c("file1", "file2")))
dirname (file.path("", "p1", "p2", "p3", "filename"))

Bessel Functions

Description

Bessel Functions of integer and fractional order, of first and second kind, $J_{\nu}$ and $Y_{\nu}$, and Modified Bessel functions (of first and third kind), $I_{\nu}$ and $K_{\nu}$.

Usage

besselI(x, nu, expon.scaled = FALSE)
besselK(x, nu, expon.scaled = FALSE)
besselJ(x, nu)
besselY(x, nu)

Arguments

x

numeric, $\ge 0$.

nu

numeric; the order (maybe fractional and negative) of the corresponding Bessel function.

expon.scaled

logical; if TRUE, the results are exponentially scaled in order to avoid overflow ($I_{\nu}$) or underflow ($K_{\nu}$), respectively.

Details

If expon.scaled = TRUE, $e^{-x} I_{\nu}(x)$ or $e^{x} K_{\nu}(x)$ are returned.

For $\nu < 0$, formulae 9.1.2 and 9.6.2 from Abramowitz & Stegun are applied (which is probably suboptimal), except for besselK which is symmetric in nu.

The current algorithms will give warnings about accuracy loss for large arguments. In some cases, these warnings are exaggerated, and the precision is perfect. For large nu, say in the order of millions, the current algorithms are rarely useful.
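The scaling and symmetry statements above can be checked numerically. This is a minimal sketch; the values of x and nu are arbitrary choices for illustration.

```r
x  <- c(0.5, 1, 5, 20)   # arbitrary positive arguments
nu <- 1.5                # arbitrary fractional order

# Scaled results equal the unscaled ones multiplied by exp(-x) or exp(x).
scaled_I <- besselI(x, nu, expon.scaled = TRUE)
scaled_K <- besselK(x, nu, expon.scaled = TRUE)
ok_I <- all.equal(scaled_I, exp(-x) * besselI(x, nu))
ok_K <- all.equal(scaled_K, exp(x)  * besselK(x, nu))

# besselK is symmetric in nu.
ok_sym <- all.equal(besselK(x, -nu), besselK(x, nu))

c(ok_I = isTRUE(ok_I), ok_K = isTRUE(ok_K), ok_sym = isTRUE(ok_sym))
```

For large x the scaled forms are the practical ones to use, since the unscaled $I_{\nu}(x)$ overflows and the unscaled $K_{\nu}(x)$ underflows.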

Value

Numeric vector with the (scaled, if expon.scaled = TRUE) values of the corresponding Bessel function.

The length of the result is the maximum of the lengths of the parameters. All parameters are recycled to that length.

Author(s)

Original Fortran code: W. J. Cody, Argonne National Laboratory
Translation to C and adaptation to R: Martin Maechler.

Source

The C code is a translation of Fortran routines from https://netlib.org/specfun/ribesl, ‘../rjbesl’, etc. The four source code files for bessel[IJKY] each contain a paragraph “Acknowledgement” and “References”, a short summary of which is

besselI

based on (code) by David J. Sookne, see Sookne (1973) ... Modifications ... An earlier version was published in Cody (1983).

besselJ

as besselI

besselK

based on (code) by J. B. Campbell (1980) ... Modifications ...

besselY

draws heavily on Temme's Algol program for $Y$ ... and on Campbell's programs for $Y_{\nu}(x)$ ... heavily modified.

References

Abramowitz, M. and Stegun, I. A. (1972). Handbook of Mathematical Functions. Dover, New York; Chapter 9: Bessel Functions of Integer Order.

In order of “Source” citation above:

Sookne, David J. (1973). Bessel Functions of Real Argument and Integer Order. Journal of Research of the National Bureau of Standards, 77B, 125–132. doi:10.6028/jres.077B.012.

Cody, William J. (1983). Algorithm 597: Sequence of modified Bessel functions of the first kind. ACM Transactions on Mathematical Software, 9(2), 242–245. doi:10.1145/357456.357462.

Campbell, J.B. (1980). On Temme's algorithm for the modified Bessel function of the third kind. ACM Transactions on Mathematical Software, 6(4), 581–586. doi:10.1145/355921.355928.

Campbell, J.B. (1979). Bessel functions J_nu(x) and Y_nu(x) of float order and float argument. Computer Physics Communications, 18, 133–142. doi:10.1016/0010-4655(79)90030-4.

Temme, Nico M. (1976). On the numerical evaluation of the ordinary Bessel function of the second kind. Journal of Computational Physics, 21, 343–350. doi:10.1016/0021-9991(76)90032-2.

See Also

Other special mathematical functions, such as gamma, $\Gamma(x)$, and beta, $B(x)$.

Examples

require(graphics)

nus <- c(0:5, 10, 20)

x <- seq(0, 4, length.out = 501)
plot(x, x, ylim = c(0, 6), ylab = "", type = "n",
     main = "Bessel Functions  I_nu(x)")
for(nu in nus) lines(x, besselI(x, nu = nu), col = nu + 2)
legend(0, 6, legend = paste("nu=", nus), col = nus + 2, lwd = 1)

x <- seq(0, 40, length.out = 801); yl <- c(-.5, 1)
plot(x, x, ylim = yl, ylab = "", type = "n",
     main = "Bessel Functions  J_nu(x)")
abline(h = 0, v = 0, lty = 3)
for(nu in nus) lines(x, besselJ(x, nu = nu), col = nu + 2)
legend("topright", legend = paste("nu=", nus), col = nus + 2, lwd = 1, bty = "n")

## Negative nu's --------------------------------------------------
xx <- 2:7
nu <- seq(-10, 9, length.out = 2001)
## --- I() --- --- --- ---
matplot(nu, t(outer(xx, nu, besselI)), type = "l", ylim = c(-50, 200),
        main = expression(paste("Bessel ", I[nu](x), " for fixed ", x,
                                ",  as ", f(nu))),
        xlab = expression(nu))
abline(v = 0, col = "light gray", lty = 3)
legend(5, 200, legend = paste("x=", xx), col = seq(xx), lty = 1:5)

## --- J() --- --- --- ---
bJ <- t(outer(xx, nu, besselJ))
matplot(nu, bJ, type = "l", ylim = c(-500, 200),
        xlab = quote(nu), ylab = quote(J[nu](x)),
        main = expression(paste("Bessel ", J[nu](x), " for fixed ", x)))
abline(v = 0, col = "light gray", lty = 3)
legend("topright", legend = paste("x=", xx), col = seq(xx), lty = 1:5)

## ZOOM into right part:
matplot(nu[nu > -2], bJ[nu > -2,], type = "l",
        xlab = quote(nu), ylab = quote(J[nu](x)),
        main = expression(paste("Bessel ", J[nu](x), " for fixed ", x)))
abline(h = 0, v = 0, col = "gray60", lty = 3)
legend("topright", legend = paste("x=", xx), col = seq(xx), lty = 1:5)

##---------------  x --> 0  -----------------------------
x0 <- 2^seq(-16, 5, length.out = 256)
plot(range(x0), c(1e-40, 1), log = "xy", xlab = "x", ylab = "", type = "n",
     main = "Bessel Functions  J_nu(x)  near 0\n log - log  scale"); axis(2, at = 1)
for(nu in sort(c(nus, nus + 0.5)))
    lines(x0, besselJ(x0, nu = nu), col = nu + 2, lty = 1 + (nu %% 1 > 0))
legend("right", legend = paste("nu=", paste(nus, nus + 0.5, sep = ", ")),
       col = nus + 2, lwd = 1, bty = "n")

x0 <- 2^seq(-10, 8, length.out = 256)
plot(range(x0), 10^c(-100, 80), log = "xy", xlab = "x", ylab = "", type = "n",
     main = "Bessel Functions  K_nu(x)  near 0\n log - log  scale"); axis(2, at = 1)
for(nu in sort(c(nus, nus + 0.5)))
    lines(x0, besselK(x0, nu = nu), col = nu + 2, lty = 1 + (nu %% 1 > 0))
legend("topright", legend = paste("nu=", paste(nus, nus + 0.5, sep = ", ")),
       col = nus + 2, lwd = 1, bty = "n")

x <- x[x > 0]
plot(x, x, ylim = c(1e-18, 1e11), log = "y", ylab = "", type = "n",
     main = "Bessel Functions  K_nu(x)"); axis(2, at = 1)
for(nu in nus) lines(x, besselK(x, nu = nu), col = nu + 2)
legend(0, 1e-5, legend = paste("nu=", nus), col = nus + 2, lwd = 1)

yl <- c(-1.6, .6)
plot(x, x, ylim = yl, ylab = "", type = "n",
     main = "Bessel Functions  Y_nu(x)")
for(nu in nus){
    xx <- x[x > .6*nu]
    lines(xx, besselY(xx, nu = nu), col = nu + 2)
}
legend(25, -.5, legend = paste("nu=", nus), col = nus + 2, lwd = 1)

## negative nu in bessel_Y -- was bogus for a long time
curve(besselY(x, -0.1), 0, 10, ylim = c(-3, 1), ylab = "")
for(nu in c(seq(-0.2, -2, by = -0.1)))
  curve(besselY(x, nu), add = TRUE)
title(expression(besselY(x, nu) * "   " *
                 {nu == list(-0.1, -0.2, ..., -2)}))

Binding and Environment Locking, Active Bindings

Description

These functions represent an interface for adjustments to environments and bindings within environments. They allow for locking environments as well as individual bindings, and for linking a variable to a function.

Usage

lockEnvironment(env, bindings = FALSE)
environmentIsLocked(env)
lockBinding(sym, env)
unlockBinding(sym, env)
bindingIsLocked(sym, env)

makeActiveBinding(sym, fun, env)
bindingIsActive(sym, env)
activeBindingFunction(sym, env)

Arguments

env

an environment.

bindings

logical specifying whether bindings should be locked.

sym

a name object or character string.

fun

a function taking zero or one arguments.

Details

The function lockEnvironment locks its environment argument. Locking the environment prevents adding or removing variable bindings from the environment. Changing the value of a variable is still possible unless the binding has been locked. The namespace environments of packages with namespaces are locked when loaded.

lockBinding locks individual bindings in the specified environment. The value of a locked binding cannot be changed. Locked bindings may be removed from an environment unless the environment is locked.
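The difference between the two kinds of lock can be sketched as follows. The environment cfg and its variables are hypothetical names chosen for illustration.

```r
cfg <- new.env()                    # hypothetical settings environment
assign("level", 1, envir = cfg)
assign("mode", "fast", envir = cfg)

lockEnvironment(cfg)                # no bindings can be added or removed
lockBinding("level", cfg)           # the value of 'level' is now fixed

# Changing an unlocked binding in a locked environment is still allowed.
assign("mode", "slow", envir = cfg)

# Adding a binding, or changing a locked one, signals an error.
add_try <- try(assign("extra", TRUE, envir = cfg), silent = TRUE)
set_try <- try(assign("level", 2, envir = cfg), silent = TRUE)

environmentIsLocked(cfg)
bindingIsLocked("level", cfg)
bindingIsLocked("mode", cfg)
```

Calling lockEnvironment(cfg, bindings = TRUE) instead would lock every existing binding in one step.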

makeActiveBinding installs fun in environment env so that getting the value of sym calls fun with no arguments, and assigning to sym calls fun with one argument, the value to be assigned. This allows the implementation of things like C variables linked to R variables and variables linked to databases, and is used to implement setRefClass. It may also be useful for making thread-safe versions of some system globals. Currently active bindings are not preserved during package installation, but they can be created in .onLoad.
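As a minimal sketch of the get/set protocol, the following installs a read-only access counter. The names e, ticks and counter are hypothetical; the binding is created in a fresh environment so that the global environment is left untouched.

```r
e <- new.env()

# 'counter' is called with no argument on every read of 'ticks',
# and with one argument on every attempted assignment.
counter <- local({
  n <- 0L
  function(v) {
    if (!missing(v)) stop("'ticks' is read-only")
    n <<- n + 1L
    n
  }
})

makeActiveBinding("ticks", counter, e)

first  <- get("ticks", envir = e)   # each read calls counter()
second <- get("ticks", envir = e)
write_try <- try(assign("ticks", 10L, envir = e), silent = TRUE)

c(first = first, second = second)
bindingIsActive("ticks", e)
```

Keeping the state inside local() means it can only be reached through the binding function.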

Value

The bindingIsLocked and environmentIsLocked functions return a length-one logical vector. The remaining functions return NULL, invisibly.

Author(s)

Luke Tierney

Examples

# locking environments
e <- new.env()
assign("x", 1, envir = e)
get("x", envir = e)
lockEnvironment(e)
get("x", envir = e)
assign("x", 2, envir = e)
try(assign("y", 2, envir = e)) # error

# locking bindings
e <- new.env()
assign("x", 1, envir = e)
get("x", envir = e)
lockBinding("x", e)
try(assign("x", 2, envir = e)) # error
unlockBinding("x", e)
assign("x", 2, envir = e)
get("x", envir = e)

# active bindings
f <- local({
    x <- 1
    function(v) {
       if (missing(v))
           cat("get\n")
       else {
           cat("set\n")
           x <<- v
       }
       x
    }
})
makeActiveBinding("fred", f, .GlobalEnv)
bindingIsActive("fred", .GlobalEnv)
fred
fred <- 2
fred

Bitwise Logical Operations

Description

Logical operations on integer vectors with elements viewed as sets of bits.

Usage

bitwNot(a)
bitwAnd(a, b)
bitwOr(a, b)
bitwXor(a, b)

bitwShiftL(a, n)
bitwShiftR(a, n)

Arguments

a, b

integer vectors; numeric vectors are coerced to integer vectors.

n

non-negative integer vector of values up to 31.

Details

Each element of an integer vector has 32 bits.

Pairwise operations can result in integer NA.

Shifting is done assuming the values represent unsigned integers.
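A common use of these functions is packing flags into a single integer. The sketch below uses hypothetical flag names (READ, WRITE, EXEC) and also shows the unsigned interpretation used when shifting.

```r
# Hypothetical permission flags, one bit each.
READ  <- bitwShiftL(1L, 0L)   # bit 0
WRITE <- bitwShiftL(1L, 1L)   # bit 1
EXEC  <- bitwShiftL(1L, 2L)   # bit 2

perms <- bitwOr(READ, EXEC)   # set two flags

# A flag is set if masking with it leaves a non-zero value.
can_write <- bitwAnd(perms, WRITE) != 0L
can_exec  <- bitwAnd(perms, EXEC)  != 0L

# Clear a flag by AND-ing with the complement of its mask.
no_exec <- bitwAnd(perms, bitwNot(EXEC))

# -1L has all 32 bits set, so shifting right by 28 leaves the top four bits.
top_nibble <- bitwShiftR(-1L, 28L)
```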

Value

An integer vector of length the longer of the arguments, or zero length if one is zero-length.

The output element is NA if an input is NA (after coercion) or an invalid shift.

See Also

The logical operators, !, &, |, xor. Notably these do work bitwise for raw arguments.

The classes "octmode" and "hexmode" whose implementation of the standard logical operators is based on these functions.

Package bitops has similar functions for numeric vectors which differ in the way they treat integers $2^{31}$ or larger.

Examples

bitwNot(0:12) # -1 -2  ... -13
bitwAnd(15L, 7L) #  7
bitwOr (15L, 7L) # 15
bitwXor(15L, 7L) #  8
bitwXor(-1L, 1L) # -2

## The "same" for 'raw' instead of integer :
rr12 <- as.raw(0:12); rbind(rr12, !rr12)
c(r15 <- as.raw(15), r7 <- as.raw(7)) #  0f 07
r15 & r7      # 07
r15 | r7      # 0f
xor(r15, r7)  # 08

bitwShiftR(-1, 1:31) # shifts of 2^32-1 = 4294967295

Access to and Manipulation of the Body of a Function

Description

Get or set the body of a function which is basically all of the function definition but its formal arguments (formals), see the ‘Details’.

Usage

body(fun = sys.function(sys.parent()))
body(fun, envir = environment(fun)) <- value

Arguments

fun

a function object, or see ‘Details’.

envir

environment in which the function should be defined.

value

an object, usually a language object: see section ‘Value’.

Details

For the first form, fun can be a character string naming the function to be manipulated, which is searched for from the parent frame. If it is not specified, the function calling body is used.

The bodies of all but the simplest functions are braced expressions, that is calls to {: see the ‘Examples’ section for how to create such a call.

Value

body returns the body of the function specified. This is normally a language object, most often a call to {, but it can also be a symbol such as pi or a constant (e.g., 3 or "R") to be the return value of the function.
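The three kinds of body described above can be told apart by inspection. In this sketch the functions g, h and k are hypothetical and exist only for illustration.

```r
g <- function() pi            # body is a symbol
h <- function() 3             # body is a constant
k <- function(x) { x + 1 }    # body is a call to `{`

body_kinds <- c(g = class(body(g)),
                h = class(body(h)),
                k = class(body(k)))
body_kinds

# The replacement form swaps in a new body and keeps the formals.
body(k) <- quote(x * 2)
k(4)
```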

The replacement form sets the body of a function to the object on the right hand side, and (potentially) resets the environment of the function, and drops attributes. If value is of class "expression" the first element is used as the body: any additional elements are ignored, with a warning.

See Also

The three parts of a (non-primitive) function are its formals, body, and environment.

Further, see alist, args, function.

Examples

body(body)
f <- function(x) x^5
body(f) <- quote(5^x)
## or equivalently  body(f) <- expression(5^x)
f(3) # = 125
body(f)

## creating a multi-expression body
e <- expression(y <- x^2, return(y)) # or a list
body(f) <- as.call(c(as.name("{"), e))
f
f(8)

## Using substitute() may be simpler than 'as.call(c(as.name("{",..)))':
stopifnot(identical(body(f), substitute({ y <- x^2; return(y) })))

Partial substitution in expressions

Description

An analogue of the LISP backquote macro. bquote quotes its argument except that terms wrapped in .() are evaluated in the specified where environment. If splice = TRUE then terms wrapped in ..() are evaluated and spliced into a call.
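The two forms of substitution can be sketched as follows. The variable names n and terms are hypothetical; the spliced terms are held in an expression vector.

```r
n <- 3

# .() substitutes the *value* of n into the quoted expression.
power_call <- bquote(x^.(n))
power_call
power_value <- eval(power_call, list(x = 2))

# ..() splices several terms in as separate arguments of a call.
terms <- expression(1, 2, 3)
sum_call <- bquote(sum(..(terms)), splice = TRUE)
sum_call
sum_value <- eval(sum_call)
```

Without splice = TRUE, ..() is not treated specially and is left in the result as an ordinary (unevaluated) call.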

Usage

bquote(expr, where = parent.frame(), splice = FALSE)

Arguments

expr

A language object.

where

An environment.

splice

Logical; if TRUE splicing is enabled.

Value

A language object.

See Also

quote, substitute

Examples

require(graphics)

a <- 2

bquote(a == a)
quote(a == a)

bquote(a == .(a))
substitute(a == A, list(A = a))

plot(1:10, a*(1:10), main = bquote(a == .(a)))

## to set a function default arg
default <- 1
bquote(function(x, y = .(default)) x+y)

exprs <- expression(x <- 1, y <- 2, x + y)
bquote(function() {..(exprs)}, splice = TRUE)

Environment Browser

Description

Interrupt the execution of an expression and allow the inspection of the environment where browser was called from.

Usage

browser(text = "", condition = NULL, expr = TRUE, skipCalls = 0L)

Arguments

text

a text string that can be retrieved once the browser is invoked.

condition

a condition that can be retrieved once the browser is invoked.

expr

a “condition”. By default, and whenever not false after being coerced to logical, the debugger will be invoked, otherwise control is returned directly.

skipCalls

how many previous calls to skip when reporting the calling context.

Details

A call to browser can be included in the body of a function. When reached, this causes a pause in the execution of the current expression and allows access to the R interpreter.

The purpose of the text and condition arguments is to allow helper programs (e.g., external debuggers) to insert specific values here, so that the specific call to browser (perhaps its location in a source file) can be identified and special processing can be achieved. The values can be retrieved by calling browserText and browserCondition.

The purpose of the expr argument is to allow for the illusion of conditional debugging. It is an illusion, because execution is always paused at the call to browser, but control is only passed to the evaluator described below if expr is not FALSE after coercion to logical. In most cases it is going to be more efficient to use an if statement in the calling program, but in some cases using this argument will be simpler.
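A minimal sketch of the expr argument follows. The function scale_values and its debug flag are hypothetical; with debug = FALSE the condition is false, so the call returns without entering the browser and the example can also be run non-interactively.

```r
# Hypothetical helper: enter the browser only when debugging is
# requested *and* the input contains non-finite values.
scale_values <- function(x, debug = FALSE) {
  browser(text = "scale_values: non-finite input",
          expr = debug && any(!is.finite(x)))
  x / max(x, na.rm = TRUE)
}

scaled <- scale_values(c(1, 2, 4))            # no browser prompt
scaled

# In an interactive session the following would stop at Browse[1]> :
# scale_values(c(1, NA, 4), debug = TRUE)
```

As noted above, wrapping the call in an ordinary if statement has the same effect and is usually more efficient.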

The skipCalls argument should be used when the browser() call is nested within another debugging function: it will look further up the call stack to report its location.

At the browser prompt the user can enter commands or R expressions, followed by a newline. The commands are

c

exit the browser and continue execution at the next statement.

cont

synonym for c.

f

finish execution of the current loop or function.

help

print this list of commands.

n

evaluate the next statement, stepping over function calls. For byte compiled functions interrupted by browser calls, n is equivalent to c.

s

evaluate the next statement, stepping into function calls. Again, byte compiled functions make s equivalent to c.

where

print a stack trace of all active function calls.

r

invoke a "resume" restart if one is available; interpreted as an R expression otherwise. Typically "resume" restarts are established for continuing from user interrupts.

Q

exit the browser and the current evaluation and return to the top-level prompt.

Leading and trailing whitespace is ignored, except for an empty line. Handling of empty lines depends on the "browserNLdisabled" option; if it is TRUE, empty lines are ignored. If not, an empty line is the same as n (or s, if it was used most recently).

Anything else entered at the browser prompt is interpreted as an R expression to be evaluated in the calling environment: in particular typing an object name will cause the object to be printed, and ls() lists the objects in the calling frame. (If you want to look at an object with a name such as n, print it explicitly, or use autoprint via (n).)

The number of lines printed for the deparsed call can be limited by setting options(deparse.max.lines).

The browser prompt is of the form Browse[n]>: here n indicates the ‘browser level’. The browser can be called when browsing (and often is when debug is in use), and each recursive call increases the number. (The actual number is the number of ‘contexts’ on the context stack: this is usually 2 for the outer level of browsing and 1 when examining dumps in debugger.)

This is a primitive function but does argument matching in the standard way.

Interaction with Condition Handling

Because the browser prompt is implemented using the restart and condition handling mechanism, it prevents error handlers set up before the breakpoint from being called or invoked. The implementation follows this model:

repeat withRestarts(
    withCallingHandlers(
        readEvalPrint(),
        error = function(cnd) {
            cat("Error:", conditionMessage(cnd), "\n")
            invokeRestart("browser")
        }
    ),
    browser = function(...) NULL
)

readEvalPrint <- function(env = parent.frame()) {
    print(eval(parse(prompt = "Browse[n]> "), env))
}

The restart invocation interrupts the lookup for condition handlers and transfers control to the next iteration of the debugger REPL.

Note that condition handlers for other classes (such as "warning") are still called and may cause a non-local transfer of control out of the debugger.

References

Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988) The New S Language. Wadsworth & Brooks/Cole.

Chambers, J. M. (1998) Programming with Data. A Guide to the S Language. Springer.

See Also

debug, and traceback for the stack on error. browserText for how to retrieve the text and condition.


Functions to Retrieve Values Supplied by Calls to the Browser

Description

A call to browser can provide context by supplying either a textargument or a condition argument. These functions can be used toretrieve either of these arguments.

Usage

browserText(n = 1)
browserCondition(n = 1)
browserSetDebug(n = 1)

Arguments

n

The number of contexts to skip over; it must be non-negative.

Details

Each call to browser can supply either a text string or a condition. The functions browserText and browserCondition provide ways to retrieve those values. Since there can be multiple browser contexts active at any time we also support retrieving values from the different contexts. The innermost (most recently initiated) browser context is numbered 1: other contexts are numbered sequentially.

browserSetDebug provides a mechanism for initiating the browser in one of the calling functions. See sys.frame for a more complete discussion of the calling stack. To use browserSetDebug you select some calling function, determine how far back it is in the call stack and call browserSetDebug with n set to that value. Then, by typing c at the browser prompt you will cause evaluation to continue, and provided there are no intervening calls to browser or other interrupts, control will halt again once evaluation has returned to the closure specified. This is similar to the up functionality in GDB or the "step out" functionality in other debuggers.

Value

browserText returns the text, while browserCondition returns the condition from the specified browser context.

browserSetDebug returns NULL, invisibly.

Note

It may be of interest to allow for querying further up the set of browser contexts and this functionality may be added at a later date.

Author(s)

R. Gentleman

See Also

browser


Returns the Names of All Built-in Objects

Description

Return the names of all the built-in objects. These are fetched directly from the symbol table of the R interpreter.

Usage

builtins(internal = FALSE)

Arguments

internal

a logical indicating whether only ‘internal’ functions (which can be called via .Internal) should be returned.

Details

builtins() returns an unsorted list of the objects in the symbol table, that is all the objects in the base environment. These are the built-in objects plus any that have been added subsequently when the base package was loaded. It is less confusing to use ls(baseenv(), all.names = TRUE).

builtins(TRUE) returns an unsorted list of the names of internal functions, that is those which can be accessed as .Internal(foo(args ...)) for foo in the list.
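A short sketch of the two calls next to the recommended ls() alternative. The exact names and counts depend on the version of R, so no particular output is assumed.

```r
all_names      <- builtins()                  # symbol table, unsorted
internal_names <- builtins(internal = TRUE)   # usable via .Internal()
base_names     <- ls(baseenv(), all.names = TRUE)

# The results are unsorted, so sort before inspecting them.
head(sort(all_names))
head(sort(internal_names))

# Compare the sizes of the three listings.
c(builtins = length(all_names),
  internal = length(internal_names),
  baseenv  = length(base_names))
```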

Value

A character vector.


Apply a Function to a Data Frame Split by Factors

Description

Function by is an object-oriented wrapper for tapply applied to data frames.

Usage

by(data, INDICES, FUN, ..., simplify = TRUE)

Arguments

data

an R object, normally a data frame, possibly a matrix.

INDICES

a factor or a list of factors, each of length nrow(data). For the data frame method, INDICES can also be a formula as in the f argument of the split method for data frames.

FUN

a function to be applied to (usually data-frame) subsets of data.

...

further arguments to FUN.

simplify

logical: see tapply.

Details

A data frame is split by row into data frames subsetted by the values of one or more factors, and function FUN is applied to each subset in turn.
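A minimal sketch of the split-and-apply step on a small, made-up data frame; scores and its columns are hypothetical names.

```r
# Hypothetical data: five observations in two groups.
scores <- data.frame(group = c("a", "a", "b", "b", "b"),
                     value = c(1, 3, 10, 20, 30))

# FUN receives one data frame per group and returns a single number.
group_means <- by(scores, scores$group, function(d) mean(d$value))
group_means

class(group_means)
mean_values <- as.numeric(group_means)   # plain numbers, in group order
```

Because FUN sees whole data-frame subsets, it can use several columns at once, which a call such as tapply(scores$value, scores$group, mean) cannot.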

For the default method, an object with dimensions (e.g., a matrix) is coerced to a data frame and the data frame method applied. Other objects are also coerced to a data frame, but FUN is applied separately to (subsets of) each column of the data frame.

Value

An object of class "by", giving the results for each subset. This is always a list if simplify is false, otherwise a list or array (see tapply).

See Also

tapply, simplify2array. array2DF to convert result to a data frame. ave also applies a function block-wise.

Examples

require(stats)
by(warpbreaks[, 1:2], warpbreaks[,"tension"], summary)
by(warpbreaks[, 1],   warpbreaks[, -1],       summary)
by(warpbreaks, warpbreaks[,"tension"],
   function(x) lm(breaks ~ wool, data = x))

## now suppose we want to extract the coefficients by group
tmp1 <- with(warpbreaks,
             by(warpbreaks, tension,
                function(x) lm(breaks ~ wool, data = x)))
sapply(tmp1, coef)

## another way
tmp2 <- by(warpbreaks, ~ tension,
           with, coef(lm(breaks ~ wool)))
array2DF(tmp2, simplify = TRUE)

Combine Values into a Vector or List

Description

This is a generic function which combines its arguments.

The default method combines its arguments to form a vector. All arguments are coerced to a common type which is the type of the returned value, and all attributes except names are removed.

Usage

## S3 Generic function
c(...)

## Default S3 method:
c(..., recursive = FALSE, use.names = TRUE)

Arguments

...

objects to be concatenated. All NULL entries are dropped before method dispatch unless at the very beginning of the argument list.

recursive

logical. If recursive = TRUE, the function recursively descends through lists (and pairlists) combining all their elements into a vector.

use.names

logical indicating if names should be preserved.

Details

The output type is determined from the highest type of the components in the hierarchy NULL < raw < logical < integer < double < complex < character < list < expression. Pairlists are treated as lists, whereas non-vector components (such as names / symbols and calls) are treated as one-element lists which cannot be unlisted even if recursive = TRUE.
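The hierarchy can be observed by combining values of two different types and checking the type of the result. This is a small sketch; the particular values are arbitrary.

```r
# Each call combines two types; the result takes the higher one.
result_types <- c(
  raw_logical    = typeof(c(as.raw(1), TRUE)),
  logical_int    = typeof(c(TRUE, 1L)),
  int_double     = typeof(c(1L, 2.5)),
  double_complex = typeof(c(2.5, 1i)),
  complex_char   = typeof(c(1i, "a")),
  char_list      = typeof(c("a", list(1)))
)
result_types

# A symbol is not a vector, so it is wrapped as a one-element list.
with_symbol <- c(1, quote(x))
typeof(with_symbol)
length(with_symbol)
```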

If the output type is complex, logical, integer, and double NAs keep their imaginary parts zero when coerced, and hence will not become NA_complex_ (with imaginary part NA).

There is a c.factor method which combines factors into a factor.

c is sometimes used for its side effect of removing attributes except names, for example to turn an array into a vector. as.vector is a more intuitive way to do this, but also drops names. Note that c methods other than the default are not required to remove attributes (and they will almost certainly preserve a class attribute).

This is a primitive function.

Value

NULL or an expression or a vector of an appropriate mode. (With no arguments the value is NULL.)

S4 methods

This function is S4 generic, but with argument list (x, ...).

References

Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988) The New S Language. Wadsworth & Brooks/Cole.

See Also

unlist and as.vector to produce attribute-free vectors.

Examples

c(1,7:9)
c(1:5, 10.5, "next")

## uses with a single argument to drop attributes
x <- 1:4
names(x) <- letters[1:4]
x
c(x)          # has names
as.vector(x)  # no names
dim(x) <- c(2,2)
x
c(x)
as.vector(x)

## append to a list:
ll <- list(A = 1, c = "C")
## do *not* use
c(ll, d = 1:3) # which is == c(ll, as.list(c(d = 1:3)))
## but rather
c(ll, d = list(1:3))  # c() combining two lists

## descend through lists:
c(list(A = c(B = 1)), recursive = TRUE)

c(list(A = c(B = 1, C = 2), B = c(E = 7)), recursive = TRUE)

Function Calls

Description

Create or test for objects of mode "call" (or "(", see Details).

Usage

call(name, ...)
is.call(x)
as.call(x)

Arguments

name

a non-empty character string naming the function to be called.

...

arguments to be part of the call.

x

an arbitrary R object.

Details

call

returns an unevaluated function call, that is, an unevaluated expression which consists of the named function applied to the given arguments (name must be a string which gives the name of a function to be called). Note that although the call is unevaluated, the arguments ... are evaluated.

call is a primitive, so the first argument is taken as name and the remaining arguments as arguments for the constructed call: if the first argument is named the name must partially match name.

is.call

is used to determine whether x is a call (i.e., of mode "call" or "("). Note that

  • is.call(x) is strictly equivalent to typeof(x) == "language".

  • is.language() is also true for calls (but also for symbols and expressions where is.call() is false).

  • When is.call(cl) is true, class(cl) typically returns "call", except when cl is a call to one of if, for, while, (, {, <-, =, which each has its own class(cl) (equal to the “function” name), see the ‘Special calls’ example.

as.call(x):

Objects of mode "list" can be coerced to mode "call". The first element of the list becomes the function part of the call, so should be a function or the name of one (as a symbol; a character string will not do).

If you think of using as.call(string), consider using str2lang(string) which is an efficient version of parse(text=string). Note that call() and as.call(), when applicable, are much preferable to these parse() based approaches.
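The three construction routes mentioned above can be compared directly. This is a sketch with an arbitrary function name and arguments.

```r
fn_name <- "sum"

# call(): the function is given as a character string.
cl1 <- call(fn_name, 1, 2, 3)

# as.call(): the function part must be a symbol (or a function),
# so the string is converted with as.name() first.
cl2 <- as.call(list(as.name(fn_name), 1, 2, 3))

# str2lang(): parse the whole call from text.
cl3 <- str2lang("sum(1, 2, 3)")

same_call <- identical(cl1, cl2)
values <- c(eval(cl1), eval(cl2), eval(cl3))
```

The first two routes work on language objects directly and never go through the parser, which is why they are preferred when the pieces of the call are already available as R objects.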

All three are primitive functions.

as.call is generic: you can write methods to handle specific classes of objects, see InternalMethods.

Warning

call should not be used to attempt to evade restrictions on the use of .Internal and other non-API calls.

References

Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988) The New S Language. Wadsworth & Brooks/Cole.

See Also

do.call for calling a function by name and argument list; Recall for recursive calling of functions; further is.language, expression, function.

Producing calls etc from character: str2lang and parse.

Examples

is.call(call) #-> FALSE: Functions are NOT calls

## set up a function call to round with argument 10.5
cl <- call("round", 10.5)
is.call(cl) # TRUE
cl
identical(quote(round(10.5)), # <- less functional, but the same
          cl) # TRUE
## such a call can also be evaluated.
eval(cl) # [1] 10

class(cl)  # "call"
typeof(cl) # "language"
is.call(cl) && is.language(cl) # always TRUE for "call"s

A <- 10.5
call("round", A)        # round(10.5)
call("round", quote(A)) # round(A)
f <- "round"
call(f, quote(A))       # round(A)
## if we want to supply a function we need to use as.call or similar
f <- round
## Not run: call(f, quote(A))  # error: first arg must be character
(g <- as.call(list(f, quote(A))))
eval(g)
## alternatively but less transparently
g <- list(f, quote(A))
mode(g) <- "call"
g
eval(g)

## Special calls (and some regular ones):
L <- as.list(E <- setNames(, c("if", "for", "while", "repeat", "function",
                               "(", "{", "[", "<-", "<<-", "->", "=")))
for(i in seq_along(L)) L[[i]] <- call(E[[i]]) # instead of lapply(E, call) ..
list_ <- function(...) `names<-`(list(...), vapply(sys.call()[-1L], as.character, ""))
(Tab <- noquote(sapply(list_(is.call, typeof, class, mode), \(F) sapply(L, F))))
## The 7 exceptions:
Tab[ Tab[,"class"] != "call", c(3:4, 1:2)]
## see also the examples in the help for do.call

Call With Current Continuation

Description

A downward-only version of Scheme's call with current continuation.

Usage

callCC(fun)

Arguments

fun

function of one argument, the exit procedure.

Details

callCC provides a non-local exit mechanism that can be useful for early termination of a computation. callCC calls fun with one argument, an exit function. The exit function takes a single argument, the intended return value. If the body of fun calls the exit function then the call to callCC immediately returns, with the value supplied to the exit function as the value returned by callCC.
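As a sketch of early termination, the hypothetical helper below stops a traversal as soon as a negative value is seen, even though the exit function is called from inside a nested anonymous function.

```r
# Hypothetical helper: return the first negative element of xs,
# or NULL if there is none.
first_negative <- function(xs) {
  callCC(function(exit) {
    # exit() is called from within lapply()'s inner function; the
    # whole callCC() call returns immediately with that value.
    lapply(xs, function(x) if (x < 0) exit(x))
    NULL
  })
}

found   <- first_negative(c(3, 1, -4, -1))
missing <- first_negative(c(3, 1, 4))
```

The remaining elements are not visited once exit() has been called, which is the point of using a non-local exit rather than filtering the whole vector.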

Author(s)

Luke Tierney

Examples

# The following all return the value 1
callCC(function(k) 1)
callCC(function(k) k(1))
callCC(function(k) {k(1); 2})
callCC(function(k) repeat k(1))

Modern Interfaces to C/C++ code

Description

Functions to pass R objects to compiled C/C++ code that has been loaded into R.

Usage

.Call(.NAME, ..., PACKAGE)
.External(.NAME, ..., PACKAGE)

Arguments

.NAME

a character string giving the name of a C function, or an object of class "NativeSymbolInfo", "RegisteredNativeSymbol" or "NativeSymbol" referring to such a name.

...

arguments to be passed to the compiled code. Up to 65 for .Call.

PACKAGE

if supplied, confine the search for a character string .NAME to the DLL given by this argument (plus the conventional extension, ‘.so’, ‘.dll’, ...).

This argument follows ... and so its name cannot be abbreviated.

This is intended to add safety for packages, which can ensure by using this argument that no other package can override their external symbols, and also speeds up the search (see ‘Note’).

Details

The functions are used to call compiled code which makes use of internal R objects, passing the arguments to the code as a sequence of R objects. They assume C calling conventions, so can usually also be used for C++ code.

For details about how to write code to use with these functions see the chapter on ‘System and foreign language interfaces’ in the ‘Writing R Extensions’ manual. They differ in the way the arguments are passed to the C code: .External allows for a variable or unlimited number of arguments.

These functions are primitive, and .NAME is always matched to the first argument supplied (which should not be named). For clarity, avoid using names in the arguments passed to ... that match or partially match .NAME.

Value

An R object constructed in the compiled code.

Header files for external code

Writing code for use with these functions will need to use internal R structures defined in ‘Rinternals.h’ and/or the macros in ‘Rdefines.h’.

Note

If one of these functions is to be used frequently, do specify PACKAGE (to confine the search to a single DLL) or pass .NAME as one of the native symbol objects. Searching for symbols can take a long time, especially when many namespaces are loaded.

You may see PACKAGE = "base" for symbols linked into R. Do not use this in your own code: such symbols are not part of the API and may be changed without warning.

PACKAGE = "" used to be accepted (but was undocumented): it is now an error.

References

Chambers, J. M. (1998) Programming with Data. A Guide to the S Language. Springer. (.Call.)

See Also

dyn.load, .C, .Fortran.

The ‘Writing R Extensions’ manual.


Report Capabilities of this Build of R

Description

Report on the optional features which have been compiled into this build of R.

Usage

capabilities(what = NULL,
             Xchk = any(nas %in% c("X11", "jpeg", "png", "tiff")))

Arguments

what

character vector or NULL, specifying required components. NULL implies that all are required.

Xchk

logical with a smart default, indicating if X11-related capabilities should be fully checked, notably on macOS. If set to false, may avoid a warning “No protocol specified” and e.g., the "X11" capability may be returned as NA.
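A sketch of querying only selected components, which can then guard optional code paths. The components requested here are not X11-related, so no display should need to be contacted; the branch taken depends on the build.

```r
# Ask only for the components of interest.
caps <- capabilities(c("iconv", "fifo", "NLS"))
caps

# Use a capability to choose between code paths.
fifo_note <- if (isTRUE(caps[["fifo"]])) {
  "FIFO connections are available"
} else {
  "FIFO connections are not available in this build"
}
fifo_note
```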

Value

A named logical vector. Current components are

jpeg

is the jpeg function operational?

png

is the png function operational?

tiff

is the tiff function operational?

tcltk

is the tcltk package operational? Note that to make use of Tk you will almost always need to check that "X11" is also available.

X11

are the X11 graphics device and the X11-based data editor available? This loads the X11 module if not already loaded, and checks that the default display can be contacted unless an X11 device has already been used.

aqua

is the quartz function operational? Only on some macOS builds, including CRAN binary distributions of R.

Note that this is distinct from .Platform$GUI == "AQUA", which is true only when using the Mac R.app GUI console.

http/ftp

does the default method for url and download.file support ‘http://’ and ‘ftp://’ URLs? Always TRUE as from R 3.3.0. However, in recent versions the default method is "libcurl" which depends on an external library and it is conceivable that library might not support ‘ftp://’ in future.

sockets

are make.socket and related functions available? Always TRUE as from R 3.3.0.

libxml

is there support for integrating libxml with the R event loop? TRUE as from R 3.3.0, FALSE as from R 4.2.0.

fifo

are FIFO connections supported?

cledit

is command-line editing available in the current R session? This is false in non-interactive sessions. It will be true for the command-line interface if readline support has been compiled in and --no-readline was not used when R was invoked. (If --interactive was used, command-line editing will not actually be available.)

iconv

is internationalization conversion via iconv supported? Always true in current R.

NLS

is there Natural Language Support (for message translations)?

Rprof

is there support for Rprof() profiling? This is true if R was configured (before compilation) with default settings which include --enable-R-profiling.

profmem

is there support for memory profiling? See tracemem.

cairo

is there support for the svg, cairo_pdf and cairo_ps devices, and for type = "cairo" in the bmp, jpeg, png and tiff devices? Prior to R 4.1.0 this also indicated Cairo support in the X11 device, but it is now possible to build R with Cairo support for the bitmap devices without support for the X11 device (usually when that is not supported at all).

ICU

is ICU available for collation? See the help on Comparison and icuSetCollate: it is never used for a C locale.

long.double

does this build use a C long double type which is longer than double? Some platforms do not have such a type, and on others its use can be suppressed by the configure option --disable-long-double.

Although not guaranteed, it is a reasonable assumption that if present long doubles will have at least as much range and accuracy as the ISO/IEC 60559 80-bit ‘extended precision’ format. Since R 4.0.0 .Machine gives information on the long-double type (if present).

libcurl

is libcurl available in this build? Used by function curlGetHeaders and optionally by download.file and url. As from R 3.3.0 always true for Unix-alikes, and as from R 4.2.0 true on Windows.
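
As an illustration of how the components listed above might be consulted before relying on an optional feature, here is a minimal sketch. The helper name has_cap is hypothetical (it is not part of base R); it only wraps capabilities() and treats an NA result, which is possible for "X11", as "not available".

```r
# Hypothetical helper (not part of base R): TRUE only when the named
# capability is reported as TRUE; an NA result counts as FALSE.
has_cap <- function(what) {
  isTRUE(unname(capabilities(what)))
}

# Query only the components of interest instead of the full vector.
wanted <- c("cairo", "libcurl", "ICU")
status <- vapply(wanted, has_cap, logical(1))
print(status)

# Guard an optional code path on the result.
if (!status[["ICU"]]) {
  message("ICU collation is not available in this build")
}
```

Asking for specific components by name avoids the X11 display check that a bare capabilities() call may trigger.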

Note to macOS users

Capabilities "jpeg", "png" and "tiff" refer to the X11-based versions of these devices. If capabilities("aqua") is true, then these devices with type = "quartz" will be available, and out-of-the-box will be the default type. Thus for example the tiff device will be available if capabilities("aqua") || capabilities("tiff") if the defaults are unchanged.

See Also

.Platform, extSoftVersion, and grSoftVersion (and links there) for availability of capabilities external to R but used from R functions.

Examples

capabilities()

if(!capabilities("ICU"))
   warning("ICU is not available")

## Does not call the internal X11-checking function:
capabilities(Xchk = FALSE)

## See also the examples for 'connections'.

Concatenate and Print

Description

Outputs the objects, concatenating the representations. cat performs much less conversion than print.

Usage

cat(..., file = "", sep = " ", fill = FALSE, labels = NULL,
    append = FALSE)

Arguments

...

R objects (see ‘Details’ for the types of objects allowed).

file

a connection, or a character string naming the file to print to. If "" (the default), cat prints to the standard output connection, the console unless redirected by sink. If it is "|cmd", the output is piped to the command given by ‘cmd’, by opening a pipe connection.

sep

a character vector of strings to append after each element.

fill

a logical or (positive) numeric controlling how the output is broken into successive lines. If FALSE (default), only newlines created explicitly by ‘"\n"’ are printed. Otherwise, the output is broken into lines with print width equal to the option width if fill is TRUE, or the value of fill if this is numeric. Linefeeds are only inserted between elements, strings wider than fill are not wrapped. Non-positive fill values are ignored, with a warning.

labels

character vector of labels for the lines printed. Ignored if fill is FALSE.

append

logical. Only used if the argument file is the name of a file (and not a connection or "|cmd"). If TRUE output will be appended to file; otherwise, it will overwrite the contents of file.

Details

cat is useful for producing output in user-defined functions. It converts its arguments to character vectors, concatenates them to a single character vector, appends the given sep = string(s) to each element and then outputs them.

No line feeds (aka “newline”s) are output unless explicitly requested by ‘"\n"’ or if generated by filling (if argument fill is TRUE or numeric).

If file is a connection and open for writing it is written from its current position. If it is not open, it is opened for the duration of the call in "wt" mode and then closed again.

Currently only atomic vectors and names are handled, together with NULL and other zero-length objects (which produce no output). Character strings are output ‘as is’ (unlike print.default which escapes non-printable characters and backslash; use encodeString if you want to output encoded strings using cat). Other types of R object should be converted (e.g., by as.character or format) before being passed to cat. That includes factors, which are output as integer vectors.

cat converts numeric/complex elements in the same way as print (and not in the same way as as.character which is used by the S equivalent), so options "digits" and "scipen" are relevant. However, it uses the minimum field width necessary for each element, rather than the same field width for all elements.
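
The conversion rules described above can be checked with a short sketch. It uses capture.output() only to make the printed text available as a string; the variable names are illustrative.

```r
# A factor is written as its integer codes, so convert it first
# if the labels are wanted.
f <- factor(c("b", "a", "b"))
out_codes  <- capture.output(cat(f))                # codes
out_labels <- capture.output(cat(as.character(f)))  # labels

# Numbers follow the "digits" option, as with print().
old <- options(digits = 4)
out_pi <- capture.output(cat(pi))
options(old)                                        # restore the setting

print(c(out_codes, out_labels, out_pi))
```

With the default level ordering the codes are 2 1 2, the labels are b a b, and pi is written with four significant digits.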

Value

None (invisible NULL).

Note

If any element of sep contains a newline character, it is treated as a vector of terminators rather than separators, an element being output after every vector element and a newline after the last. Entries are recycled as needed.

References

Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988) The New S Language. Wadsworth & Brooks/Cole.

See Also

print, format, and paste which concatenates into a string.

Examples

iter <- stats::rpois(1, lambda = 10)
## print an informative message
cat("iteration = ", iter <- iter + 1, "\n")

## 'fill' and label lines:
cat(paste(letters, 100 * 1:26), fill = TRUE, labels = paste0("{", 1:10, "}:"))

Combine R Objects by Rows or Columns

Description

Take a sequence of vector, matrix or data-frame arguments and combine by columns or rows, respectively. These are generic functions with methods for other R classes.

Usage

cbind(..., deparse.level = 1)
rbind(..., deparse.level = 1)
## S3 method for class 'data.frame'
rbind(..., deparse.level = 1, make.row.names = TRUE,
      stringsAsFactors = FALSE, factor.exclude = TRUE)

Arguments

...

(generalized) vectors or matrices. These can be given as named arguments. Other R objects may be coerced as appropriate, or S4 methods may be used: see sections ‘Details’ and ‘Value’. (For the "data.frame" method of cbind these can be further arguments to data.frame such as stringsAsFactors.)

deparse.level

integer controlling the construction of labels in the case of non-matrix-like arguments (for the default method):
deparse.level = 0 constructs no labels;
the default deparse.level = 1 typically and deparse.level = 2 always construct labels from the argument names, see the ‘Value’ section below.

make.row.names

(only for data frame method:) logical indicating if unique and valid row.names should be constructed from the arguments.

stringsAsFactors

logical, passed to as.data.frame; only has an effect when the ... arguments contain a (non-data.frame) character.

factor.exclude

if the data frames contain factors, the default TRUE ensures that NA levels of factors are kept, see PR#17562 and the ‘Data frame methods’. In R versions up to 3.6.x, factor.exclude = NA has been implicitly hardcoded (R <= 3.6.0) or the default (R = 3.6.x, x >= 1).

Details

The functions cbind and rbind are S3 generic, with methods for data frames. The data frame method will be used if at least one argument is a data frame and the rest are vectors or matrices. There can be other methods; in particular, there is one for time series objects. See the section on ‘Dispatch’ for how the method to be used is selected. If some of the arguments are of an S4 class, i.e., isS4(.) is true, S4 methods are sought also, and the hidden cbind / rbind functions from package methods may be called, which in turn build on cbind2 or rbind2, respectively. In that case, deparse.level is obeyed, similarly to the default method.

In the default method, all the vectors/matrices must be atomic (see vector) or lists. Expressions are not allowed. Language objects (such as formulae and calls) and pairlists will be coerced to lists: other objects (such as names and external pointers) will be included as elements in a list result. Any classes the inputs might have are discarded (in particular, factors are replaced by their internal codes).

If there are several matrix arguments, they must all have the same number of columns (or rows) and this will be the number of columns (or rows) of the result. If all the arguments are vectors, the number of columns (rows) in the result is equal to the length of the longest vector. Values in shorter arguments are recycled to achieve this length (with a warning if they are recycled only fractionally).

When the arguments consist of a mix of matrices and vectors the number of columns (rows) of the result is determined by the number of columns (rows) of the matrix arguments. Any vectors have their values recycled or subsetted to achieve this length.

For cbind (rbind), vectors of zero length (including NULL) are ignored unless the result would have zero rows (columns), for S compatibility. (Zero-extent matrices do not occur in S3 and are not ignored in R.)

Matrices are restricted to less than 2^{31} rows and columns even on 64-bit systems. So input vectors have the same length restriction: as from R 3.2.0 input matrices with more elements (but meeting the row and column restrictions) are allowed.

Value

For the default method, a matrix combining the ... arguments column-wise or row-wise. (Exception: if there are no inputs or all the inputs are NULL, the value is NULL.)

The type of a matrix result is determined from the highest type of any of the inputs in the hierarchy raw < logical < integer < double < complex < character < list.
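
The type hierarchy can be observed directly with typeof(). The following sketch combines pairs of adjacent types and records the type of each result; the vector name types is illustrative.

```r
# Each call mixes two types; the result takes the higher one.
types <- c(
  lgl_int = typeof(cbind(TRUE, 2L)),        # logical  < integer
  int_dbl = typeof(cbind(2L, 3.5)),         # integer  < double
  dbl_cpl = typeof(cbind(3.5, 1i)),         # double   < complex
  cpl_chr = typeof(cbind(1i, "a")),         # complex  < character
  chr_lst = typeof(cbind("a", list(1)))     # character < list
)
print(types)

# Classes are discarded: a factor contributes its integer codes.
codes <- cbind(factor(c("x", "y")), 1:2)
print(typeof(codes))
```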

For cbind (rbind) the column (row) names are taken from the colnames (rownames) of the arguments if these are matrix-like. Otherwise from the names of the arguments or where those are not supplied and deparse.level > 0, by deparsing the expressions given, for deparse.level = 1 only if that gives a sensible name (a ‘symbol’, see is.symbol).

For cbind row names are taken from the first argument with appropriate names: rownames for a matrix, or names for a vector of length the number of rows of the result.

For rbind column names are taken from the first argument with appropriate names: colnames for a matrix, or names for a vector of length the number of columns of the result.

Data frame methods

The cbind data frame method is just a wrapper for data.frame(..., check.names = FALSE). This means that it will split matrix columns in data frame arguments, and convert character columns to factors unless stringsAsFactors = FALSE is specified.

The rbind data frame method first drops all zero-column and zero-row arguments. (If that leaves none, it returns the first argument with columns otherwise a zero-column zero-row data frame.) It then takes the classes of the columns from the first data frame, and matches columns by name (rather than by position). Factors have their levels expanded as necessary (in the order of the levels of the level sets of the factors encountered) and the result is an ordered factor if and only if all the components were ordered factors. Old-style categories (integer vectors with levels) are promoted to factors.
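
A small sketch of the rbind data frame behaviour just described: columns are matched by name, and factor levels are extended in the order in which they are encountered. The data frames are made up for illustration.

```r
# Same columns, different order: matching is by name, not position.
df1 <- data.frame(a = 1:2, b = c("x", "y"))
df2 <- data.frame(b = "z", a = 3L)
res <- rbind(df1, df2)
print(res)

# Factor levels are expanded as new levels are met.
f1 <- data.frame(f = factor("lo"))
f2 <- data.frame(f = factor("hi"))
fac <- rbind(f1, f2)
print(levels(fac$f))
```

The column order and classes of the result follow the first data frame.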

Note that for result column j, factor(., exclude = X(j)) is applied, where

  X(j) := if(isTRUE(factor.exclude)) {
             if(!NA.lev[j]) NA # else NULL
          } else factor.exclude

where NA.lev[j] is true iff any contributing data frame has had a factor in column j with an explicit NA level.

Dispatch

The method dispatching is not done via UseMethod(), but by C-internal dispatching. Therefore there is no need for, e.g., rbind.default.

The dispatch algorithm is described in the source file (‘.../src/main/bind.c’) as

  1. For each argument we get the list of possible class memberships from the class attribute.

  2. We inspect each class in turn to see if there is an applicable method.

  3. If we find a method, we use it. Otherwise, if there was an S4 object among the arguments, we try S4 dispatch; otherwise, we use the default code.

If you want to combine other objects with data frames, it may be necessary to coerce them to data frames first. (Note that this algorithm can result in calling the data frame method if all the arguments are either data frames or vectors, and this will result in the coercion of character vectors to factors.)

References

Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988) The New S Language. Wadsworth & Brooks/Cole.

See Also

c to combine vectors (and lists) as vectors, data.frame to combine vectors and matrices as a data frame.

Examples

m <- cbind(1, 1:7) # the '1' (= shorter vector) is recycled
m
m <- cbind(m, 8:14)[, c(1, 3, 2)] # insert a column
m
cbind(1:7, diag(3)) # vector is subset -> warning

cbind(0, rbind(1, 1:3))
cbind(I = 0, X = rbind(a = 1, b = 1:3))  # use some names
xx <- data.frame(I = rep(0, 2))
cbind(xx, X = rbind(a = 1, b = 1:3))   # named differently

cbind(0, matrix(1, nrow = 0, ncol = 4)) #> Warning (making sense)
dim(cbind(0, matrix(1, nrow = 2, ncol = 0))) #-> 2 x 1

## deparse.level
dd <- 10
rbind(1:4, c = 2, "a++" = 10, dd, deparse.level = 0) # middle 2 rownames
rbind(1:4, c = 2, "a++" = 10, dd, deparse.level = 1) # 3 rownames (default)
rbind(1:4, c = 2, "a++" = 10, dd, deparse.level = 2) # 4 rownames

## cheap row names:
b0 <- gl(3, 4, labels = letters[1:3])
bf <- setNames(b0, paste0("o", seq_along(b0)))
df  <- data.frame(a = 1, B = b0, f = gl(4, 3))
df. <- data.frame(a = 1, B = bf, f = gl(4, 3))
new <- data.frame(a = 8, B = "B", f = "1")
(df1  <- rbind(df , new))
(df.1 <- rbind(df., new))
stopifnot(identical(df1, rbind(df,  new, make.row.names = FALSE)),
          identical(df1, rbind(df., new, make.row.names = FALSE)))

Expand a String with Respect to a Target Table

Description

Seeks a unique match of its first argument among the elements of its second. If successful, it returns this element; otherwise, it performs an action specified by the third argument.

Usage

char.expand(input, target, nomatch = stop("no match"))

Arguments

input

a character string to be expanded.

target

a character vector with the values to be matched against.

nomatch

an R expression to be evaluated in case expansion was not possible.

Details

This function is particularly useful when abbreviations are allowed in function arguments, and need to be uniquely expanded with respect to a target table of possible values.

Value

A length-one character vector, one of the elements of target (unless nomatch is changed to be a non-error, when it can be a zero-length character string).
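
A minimal sketch of the two outcomes with the default nomatch: a unique abbreviation is expanded, while an ambiguous one triggers the error. The variable names are illustrative, and tryCatch() is used only to capture the error message.

```r
opts <- c("mean", "median", "mode")

# "med" abbreviates exactly one element of the table.
hit <- char.expand("med", opts)

# "me" is ambiguous, so the default nomatch = stop("no match") is evaluated.
miss <- tryCatch(char.expand("me", opts),
                 error = function(e) conditionMessage(e))

print(hit)
print(miss)
```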

See Also

charmatch and pmatch for performing partial string matching.

Examples

locPars <- c("mean", "median", "mode")
char.expand("me", locPars, warning("Could not expand!"))
char.expand("mo", locPars)

Character Vectors

Description

Create or test for objects of type "character".

Usage

character(length = 0)
as.character(x, ...)
is.character(x)

Arguments

length

a non-negative integer specifying the desired length. Double values will be coerced to integer: supplying an argument of length other than one is an error.

x

object to be coerced or tested.

...

further arguments passed to or from other methods.

Details

as.character and is.character are generic: you can write methods to handle specific classes of objects, see InternalMethods. Further, for as.character the default method calls as.vector, so, only if(is.object(x)) is true, dispatch is first on methods for as.character and then for methods for as.vector.

as.character represents real and complex numbers to 15 significant digits (technically the compiler's setting of the ISO C constant DBL_DIG, which will be 15 on machines supporting IEC 60559 arithmetic according to the C99 standard). This ensures that all the digits in the result will be reliable (and not the result of representation error), but does mean that conversion to character and back to numeric may change the number. If you want to convert numbers to character with the maximum possible precision, use format.
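
The round-trip caveat can be illustrated with a value that has no finite decimal expansion. This is a sketch assuming IEC 60559 doubles; the choice of digits = 17 is an assumption made here because 17 significant digits are enough to identify a double uniquely.

```r
x <- 1/3

s15 <- as.character(x)          # at most 15 significant digits
s17 <- format(x, digits = 17)   # enough digits to recover x exactly

print(s15)
print(s17)

# Converting back shows that only the longer string reproduces x.
same15 <- as.numeric(s15) == x
same17 <- as.numeric(s17) == x
print(c(same15 = same15, same17 = same17))
```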

Value

character creates a character vector of the specified length. The elements of the vector are all equal to "".

as.character attempts to coerce its argument to character type; like as.vector it strips attributes including names. For lists and pairlists (including language objects such as calls) it deparses the elements individually, except that it extracts the first element of length-one character vectors, see the Abc example.

is.character returns TRUE or FALSE depending on whether its argument is of character type or not.

Note

as.character breaks lines in language objects at 500 characters, and inserts newlines. Prior to 2.15.0 lines were truncated.

References

Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988) The New S Language. Wadsworth & Brooks/Cole.

See Also

options: options scipen and OutDec affect the conversion of numbers.

paste, substr and strsplit for character concatenation and splitting, chartr for character translation and case folding (e.g., upper to lower case) and sub, grep etc for string matching and substitutions. Note that help.search(keyword = "character") gives even more links.

deparse, which is normally preferable to as.character for language objects.

Quotes on how to specify character / string constants, including raw ones.

Examples

form <- y ~ a + b + c
as.character(form)  ## length 3
deparse(form)       ## like the input

a0 <- 11/999          # has a repeating decimal representation
(a1 <- as.character(a0))
format(a0, digits = 16) # shows 1 to 2 more digit(s)
a2 <- as.numeric(a1)
a2 - a0               # normally around -1e-17
as.character(a2)      # possibly different from a1
print(c(a0, a2), digits = 16)

as.character(list(A = "Abc", xy = c("x", "y"))) # "Abc"  "c(\"x\", \"y\")"
## i.e., "Abc" directly instead of deparsing to "\"Abc\""

Partial String Matching

Description

charmatch seeks matches for the elements of its first argument among those of its second.

Usage

charmatch(x, table, nomatch = NA_integer_)

Arguments

x

the values to be matched: converted to a character vector by as.character. Long vectors are supported.

table

the values to be matched against: converted to a character vector. Long vectors are not supported.

nomatch

the (integer) value to be returned at non-matching positions.

Details

Exact matches are preferred to partial matches (those where the value to be matched has an exact match to the initial part of the target, but the target is longer).

If there is a single exact match or no exact match and a unique partial match then the index of the matching value is returned; if multiple exact or multiple partial matches are found then 0 is returned and if no match is found then nomatch is returned.

NA values are treated as the string constant"NA".
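
The four possible outcomes described above (exact match, unique partial match, ambiguous match, no match) can be seen in one call. The table below is illustrative.

```r
tab <- c("mean", "median", "mode")

res <- charmatch(c("mean", "med", "m", "x"), tab)
# "mean": single exact match      -> index 1
# "med" : unique partial match    -> index 2
# "m"   : several partial matches -> 0
# "x"   : no match                -> nomatch (NA by default)
print(res)
```

Note that 0 and NA signal different conditions, so callers that need to distinguish ambiguity from absence should test for both.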

Value

An integer vector of the same length as x, giving the indices of the elements in table which matched, or nomatch.

Author(s)

This function is based on a C function written by Terry Therneau.

See Also

pmatch, match.

startsWith for another matching of initial parts of strings; grep or regexpr for more general (regexp) matching of strings.

Examples

charmatch("", "")                             # returns 1
charmatch("m",   c("mean", "median", "mode")) # returns 0
charmatch("med", c("mean", "median", "mode")) # returns 2

Character Translation and Case Folding

Description

Translate characters in character vectors, in particular from upper to lower case or vice versa.

Usage

chartr(old, new, x)
tolower(x)
toupper(x)
casefold(x, upper = FALSE)

Arguments

x

a character vector, or an object that can be coerced to character by as.character.

old

a character string specifying the characters to be translated. If a character vector of length 2 or more is supplied, the first element is used with a warning.

new

a character string specifying the translations. If a character vector of length 2 or more is supplied, the first element is used with a warning.

upper

logical: translate to upper or lower case?

Details

chartr translates each character in x that is specified in old to the corresponding character specified in new. Ranges are supported in the specifications, but character classes and repeated characters are not. If old contains more characters than new, an error is signaled; if it contains fewer characters, the extra characters at the end of new are ignored.
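
The rules for old and new can be tried on short strings. This is a sketch with made-up inputs; tryCatch() is used only to show that the too-long old case is an error.

```r
# One-to-one translation of individual characters.
plain <- chartr("abc", "xyz", "aabbcc")

# Ranges expand to the characters they cover.
ranged <- chartr("a-c", "A-C", "abcdef")

# Extra characters at the end of 'new' are ignored.
extra <- chartr("ab", "xyz", "abc")

# 'old' longer than 'new' is an error.
too_long <- tryCatch(chartr("abc", "x", "abc"),
                     error = function(e) "error")

print(c(plain, ranged, extra, too_long))
```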

tolower and toupper convert upper-case characters in a character vector to lower-case, or vice versa. Non-alphabetic characters are left unchanged. More than one character can be mapped to a single upper-case character.

casefold is a wrapper for tolower and toupper originally written for compatibility with S-PLUS.

Value

A character vector of the same length and with the same attributes as x (after possible coercion).

Elements of the result will have the encoding declared as that of the current locale (see Encoding) if the corresponding input had a declared encoding and the current locale is either Latin-1 or UTF-8. The result will be in the current locale's encoding unless the corresponding input was in UTF-8 or Latin-1, when it will be in UTF-8.

Note

These functions are platform-dependent, usually using OS services. The latter can be quite deficient, for example only covering ASCII characters in 8-bit locales. The definition of ‘alphabetic’ is platform-dependent and liable to change over time as most platforms are based on the frequently-updated Unicode tables.

See Also

sub and gsub for other substitutions in strings.

Examples

x <- "MiXeD cAsE 123"
chartr("iXs", "why", x)
chartr("a-cX", "D-Fw", x)
tolower(x)
toupper(x)

## "Mixed Case" Capitalizing - toupper( every first letter of a word ) :

.simpleCap <- function(x) {
    s <- strsplit(x, " ")[[1]]
    paste(toupper(substring(s, 1, 1)), substring(s, 2),
          sep = "", collapse = " ")
}
.simpleCap("the quick red fox jumps over the lazy brown dog")
## ->  [1] "The Quick Red Fox Jumps Over The Lazy Brown Dog"

## and the better, more sophisticated version:
capwords <- function(s, strict = FALSE) {
    cap <- function(s) paste(toupper(substring(s, 1, 1)),
                  {s <- substring(s, 2); if(strict) tolower(s) else s},
                             sep = "", collapse = " ")
    sapply(strsplit(s, split = " "), cap, USE.NAMES = !is.null(names(s)))
}
capwords(c("using AIC for model selection"))
## ->  [1] "Using AIC For Model Selection"
capwords(c("using AIC", "for MODEL selection"), strict = TRUE)
## ->  [1] "Using Aic"  "For Model Selection"
##                ^^^        ^^^^^
##               'bad'       'good'

## -- Very simple insecure crypto --
rot <- function(ch, k = 13) {
   p0 <- function(...) paste(c(...), collapse = "")
   A <- c(letters, LETTERS, " '")
   I <- seq_len(k); chartr(p0(A), p0(c(A[-I], A[I])), ch)
}

pw <- "my secret pass phrase"
(crypw <- rot(pw, 13)) #-> you can send this off

## now ``decrypt'' :
rot(crypw, 54 - 13) # -> the original:
stopifnot(identical(pw, rot(crypw, 54 - 13)))

Warn About Extraneous Arguments in the "..." of Its Caller

Description

Warn about extraneous arguments in the ... of its caller. A utility to be used e.g., in S3 methods which need a formal ... argument but do not make any use of it. This helps catching user errors in calling the function in question (which is the caller of chkDots()).

Usage

chkDots(..., which.call = -1, allowed = character(0))

Arguments

...

“the dots”, as passed from the caller.

which.call

passed to sys.call(). A caller may use -2 if the message should mention its caller.

allowed

not yet implemented: character vector of named elements in ... which are “allowed” and hence not warned about.
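
A minimal sketch of the intended use: a function that must accept ... but ignores it calls chkDots(...) so that stray arguments are reported. The function name count_items is hypothetical and exists only for this illustration.

```r
# Hypothetical function: accepts '...' for method compatibility
# but does not use it, so extra arguments are flagged.
count_items <- function(x, ...) {
  chkDots(...)
  length(x)
}

# No extra arguments: no warning, normal result.
ok <- count_items(1:3)

# A stray argument produces a warning; capture its text here.
msg <- tryCatch(count_items(1:3, na.rm = TRUE),
                warning = function(w) conditionMessage(w))

print(ok)
print(msg)
```

The exact wording of the warning depends on the message translations in use, so code should not rely on its text.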

Author(s)

Martin Maechler, first version outside base, June 2012.

See Also

warning, ....

Examples

seq.default ## <- you will see  ' chkDots(...) '

seq(1, 5, foo = "bar") # gives warning via chkDots()

## warning with more than one ...-entry:
density.f <- function(x, ...) NextMethod("density")
x <- density(structure(rnorm(10), class = "f"), bar = TRUE, baz = TRUE)

The Cholesky Decomposition

Description

Compute the Cholesky factorization of a real symmetric positive-definite square matrix.

Usage

chol(x, ...)

## Default S3 method:
chol(x, pivot = FALSE,  LINPACK = FALSE, tol = -1, ...)

Arguments

x

an object for which a method exists. The default method applies to numeric (or logical) symmetric, positive-definite matrices.

...

arguments to be passed to or from methods.

pivot

logical: should pivoting be used?

LINPACK

logical. Defunct and gives an error.

tol

a numeric tolerance for use with pivot = TRUE.

Details

chol is generic: the description here applies to the default method.

Note that only the upper triangular part of x is used, so that R'R = x when x is symmetric.

If pivot = FALSE and x is not non-negative definite an error occurs. If x is positive semi-definite (i.e., some zero eigenvalues) an error will also occur as a numerical tolerance is used.

If pivot = TRUE, then the Cholesky decomposition of a positive semi-definite x can be computed. The rank of x is returned as attr(Q, "rank"), subject to numerical errors. The pivot is returned as attr(Q, "pivot"). It is no longer the case that t(Q) %*% Q equals x. However, setting pivot <- attr(Q, "pivot") and oo <- order(pivot), it is true that t(Q[, oo]) %*% Q[, oo] equals x, or, alternatively, t(Q) %*% Q equals x[pivot, pivot]. See the examples.
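
The identities stated above can be verified numerically. The sketch below uses a small positive-definite matrix chosen for illustration, so that both the unpivoted and the pivoted factorization succeed without warnings; comparisons use all.equal() because the results are floating point.

```r
m <- matrix(c(2, 1, 1,
              1, 3, 1,
              1, 1, 4), 3, 3)

# Without pivoting: R is upper triangular and R'R = m.
R <- chol(m)
plain_ok <- isTRUE(all.equal(crossprod(R), m))
upper_ok <- all(R[lower.tri(R)] == 0)

# With pivoting: undo the column permutation, or compare with the
# permuted matrix.
Q <- chol(m, pivot = TRUE)
pivot <- attr(Q, "pivot")
oo <- order(pivot)
unpivot_ok  <- isTRUE(all.equal(crossprod(Q[, oo]), m))
permuted_ok <- isTRUE(all.equal(crossprod(Q), m[pivot, pivot],
                                check.attributes = FALSE))

print(c(plain_ok, upper_ok, unpivot_ok, permuted_ok))
print(attr(Q, "rank"))
```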

The value of tol is passed to LAPACK, with negative values selecting the default tolerance of (usually) nrow(x) * .Machine$double.neg.eps * max(diag(x)). The algorithm terminates once the pivot is less than tol.

Unsuccessful results from the underlying LAPACK code will result in an error giving a positive error code: these can only be interpreted by detailed study of the FORTRAN code.

Value

The upper triangular factor of the Cholesky decomposition, i.e., the matrix R such that R'R = x (see example).

If pivoting is used, then two additional attributes "pivot" and "rank" are also returned.

Warning

The code does not check for symmetry.

If pivot = TRUE and x is not non-negative definite then there will be a warning message but a meaningless result will occur. So only use pivot = TRUE when x is non-negative definite by construction.

Source

This is an interface to the LAPACK routines DPOTRF and DPSTRF.

LAPACK is from https://netlib.org/lapack/ and its guide is listed in the references.

References

Anderson, E. and ten others (1999) LAPACK Users' Guide. Third Edition. SIAM.
Available on-line at https://netlib.org/lapack/lug/lapack_lug.html.

Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988) The New S Language. Wadsworth & Brooks/Cole.

See Also

chol2inv for its inverse (without pivoting), backsolve for solving linear systems with upper triangular left sides.

qr, svd for related matrix factorizations.

Examples

( m <- matrix(c(5,1,1,3), 2, 2) )
( cm <- chol(m) )
t(cm) %*% cm  #-- = 'm'
crossprod(cm)  #-- = 'm'

# now for something positive semi-definite
x <- matrix(c(1:5, (1:5)^2), 5, 2)
x <- cbind(x, x[, 1] + 3*x[, 2])
colnames(x) <- letters[20:22]
m <- crossprod(x)
qr(m)$rank # is 2, as it should be

# chol() may fail, depending on numerical rounding:
# chol() unlike qr() does not use a tolerance.
try(chol(m))

(Q <- chol(m, pivot = TRUE))
## we can use this by
pivot <- attr(Q, "pivot")
crossprod(Q[, order(pivot)]) # recover m

## now for a non-positive-definite matrix
( m <- matrix(c(5,-5,-5,3), 2, 2) )
try(chol(m))  # fails
(Q <- chol(m, pivot = TRUE)) # warning
crossprod(Q)  # not equal to m

Inverse from Cholesky (or QR) Decomposition

Description

Invert a symmetric, positive definite square matrix from its Cholesky decomposition. Equivalently, compute (X'X)^{-1} from the (R part) of the QR decomposition of X.

Usage

chol2inv(x, size = NCOL(x), LINPACK = FALSE)

Arguments

x

a matrix. The first size columns of the upper triangle contain the Cholesky decomposition of the matrix to be inverted.

size

the number of columns of x containing the Cholesky decomposition.

LINPACK

logical. Defunct and gives an error.

Value

The inverse of the matrix whose Cholesky decomposition was given.
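
As a quick check of the returned value, the result can be compared with solve(). The matrix below is an arbitrary positive-definite example chosen for illustration.

```r
m <- matrix(c(2, 1, 1,
              1, 3, 1,
              1, 1, 4), 3, 3)

# Invert via the Cholesky factor rather than directly.
inv <- chol2inv(chol(m))

# Agreement with solve() and with the identity, up to rounding.
same_as_solve <- isTRUE(all.equal(inv, solve(m)))
gives_identity <- isTRUE(all.equal(m %*% inv, diag(3)))
print(c(same_as_solve, gives_identity))
```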

Unsuccessful results from the underlying LAPACK code will result in an error giving a positive error code: these can only be interpreted by detailed study of the FORTRAN code.

Source

This is an interface to the LAPACK routine DPOTRI. LAPACK is from https://netlib.org/lapack/ and its guide is listed in the references.

References

Anderson, E. and ten others (1999) LAPACK Users' Guide. Third Edition. SIAM. Available on-line at https://netlib.org/lapack/lug/lapack_lug.html.

Dongarra, J. J., Bunch, J. R., Moler, C. B. and Stewart, G. W. (1978) LINPACK Users Guide. Philadelphia: SIAM Publications.

See Also

chol, solve.

Examples

cma <- chol(ma <- cbind(1, 1:3, c(1,3,7)))
ma %*% chol2inv(cma)

Choose the Appropriate Method for Ops

Description

chooseOpsMethod is a function called by the Ops Group Generic when two suitable methods are found for a given call. It determines which method to use for the operation based on the objects being dispatched.

The function is first called with reverse = FALSE, where x corresponds to the first argument and y to the second argument of the group generic call. If chooseOpsMethod() returns FALSE for x, then chooseOpsMethod is called again, with x and y swapped, mx and my swapped, and reverse = TRUE.

Usage

chooseOpsMethod(x, y, mx, my, cl, reverse)

Arguments

x, y

the objects being dispatched on by the group generic.

mx, my

the methods found for objects x and y.

cl

the call to the group generic.

reverse

logical value indicating whether x and y are reversed from the way they were supplied to the generic.

Value

This function must return either TRUE or FALSE. A value of TRUE indicates that method mx should be used.

See Also

Ops

Examples

# Create two objects with custom Ops methods
foo_obj <- structure(1, class = "foo")
bar_obj <- structure(1, class = "bar")
`+.foo` <- function(e1, e2) "foo"
Ops.bar <- function(e1, e2) "bar"

invisible(foo_obj + bar_obj) # Warning: Incompatible methods

chooseOpsMethod.bar <- function(x, y, mx, my, cl, reverse) TRUE

stopifnot(exprs = {
  identical(foo_obj + bar_obj, "bar")
  identical(bar_obj + foo_obj, "bar")
})

# cleanup
rm(foo_obj, bar_obj, `+.foo`, Ops.bar, chooseOpsMethod.bar)

Object Classes

Description

R possesses a simple generic function mechanism which can be used for an object-oriented style of programming. Method dispatch takes place based on the class of the first argument to the generic function.

Usage

class(x)
class(x) <- value
unclass(x)
inherits(x, what, which = FALSE)
nameOfClass(x)
isa(x, what)

oldClass(x)
oldClass(x) <- value
.class2(x)

Arguments

x

an R object.

what, value

a character vector naming classes. value can also be NULL. what can also be a non-character R object with a nameOfClass() method.

which

logical affecting return value: see ‘Details’.

Details

Here, we describe the so called “S3” classes (and methods). For “S4” classes (and methods), see ‘Formal classes’ below.

Many R objects have a class attribute, a character vector giving the names of the classes from which the object inherits. (Functions oldClass and oldClass<- get and set the attribute, which can also be done directly.)

If the object does not have a class attribute, it has an implicit class, notably "matrix", "array", "function" or "numeric" or the result of typeof(x) (which is similar to mode(x)), but for type "language" and mode "call", where the following extra classes exist for the corresponding function calls: if, for, while, (, {, <-, =.

Note that for objects x of an implicit (or an S4) class, when a (S3) generic function foo(x) is called, method dispatch may use more classes than are returned by class(x), e.g., for a numeric matrix, the foo.numeric() method may apply. The exact full character vector of the classes which UseMethod() uses, is available as .class2(x) since R version 4.0.0. (This also applies to S4 objects when S3 dispatch is considered, see below.)

Beware that using.class2() for other reasons than didactical,diagnostical or for debugging may rather be a misuse than smart.

NULL objects (of implicit class"NULL") cannot haveattributes (hence noclass attribute) and attempting to assign aclass is an error.

When a generic function fun is applied to an object with class attribute c("first", "second"), the system searches for a function called fun.first and, if it finds it, applies it to the object. If no such function is found, a function called fun.second is tried. If no class name produces a suitable function, the function fun.default is used (if it exists). If there is no class attribute, the implicit class is tried, then the default method.
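The search order described above can be sketched as follows. The generic fun and its methods are hypothetical names taken from the paragraph, not functions that exist in base R.

```r
# Hypothetical generic and methods, named after the text above.
fun <- function(x, ...) UseMethod("fun")
fun.second  <- function(x, ...) "second method"
fun.default <- function(x, ...) "default method"

obj <- structure(list(), class = c("first", "second"))

# No fun.first exists yet, so dispatch falls through to fun.second.
res_second <- fun(obj)

# Once fun.first is defined, it is found first.
fun.first <- function(x, ...) "first method"
res_first <- fun(obj)

# 1:3 has no class attribute; its implicit classes have no method here,
# so the default method is used.
res_default <- fun(1:3)
```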

The function class returns the vector of names of classes an object inherits from. Correspondingly, class<- sets the classes an object inherits from. Assigning an empty character vector or NULL removes the class attribute, as for oldClass<- or direct attribute setting. Whereas it is clearer to explicitly assign NULL to remove the class, using an empty vector is more natural in e.g., class(x) <- setdiff(class(x), "ts").

unclass returns (a copy of) its argument with its class attribute removed. (It is not allowed for objects which cannot be copied, namely environments and external pointers.)

inherits indicates whether its first argument inherits from any of the classes specified in the what argument. If which is TRUE then an integer vector of the same length as what is returned. Each element indicates the position in the class(x) matched by the element of what; zero indicates no match. If which is FALSE then TRUE is returned by inherits if any of the names in what match with any class.

nameOfClass is an S3 generic. It is called by inherits to get the class name for what, allowing for what to be values other than a character vector. nameOfClass methods are expected to return a character vector of length 1.

isa tests whether x is an object of class(es) as given in what by using is if x is an S4 object, and otherwise giving TRUE iff all elements of class(x) are contained in what.

All but inherits and isa are primitive functions.

Formal classes

An additional mechanism of formal classes, nicknamed “S4”, is available in package methods which is attached by default. For objects which have a formal class, its name is returned by class as a character vector of length one and method dispatch can happen on several arguments, instead of only the first. However, S3 method selection attempts to treat objects from an S4 class as if they had the appropriate S3 class attribute, as does inherits. Therefore, S3 methods can be defined for S4 classes. See the ‘Introduction’ and ‘Methods_for_S3’ help pages for basic information on S4 methods and for the relation between these and S3 methods.

The replacement version of the function sets the class to the value provided. For classes that have a formal definition, directly replacing the class this way is strongly deprecated. The expression as(object, value) is the way to coerce an object to a particular class.

The analogue of inherits for formal classes is is. The two functions behave consistently with one exception: S4 classes can have conditional inheritance, with an explicit test. In this case, is will test the condition, but inherits ignores all conditional superclasses.

Note

UseMethod dispatches on the class as returned by class (with some interpolated classes: see the link) rather than oldClass. However, group generics dispatch on the oldClass for efficiency, and internal generics only dispatch on objects for which is.object is true.

See Also

UseMethod, NextMethod, ‘group generic’, ‘internal generic’

Examples

x <- 10
class(x) # "numeric"
oldClass(x) # NULL
inherits(x, "a") # FALSE
class(x) <- c("a", "b")
inherits(x, "a") # TRUE
inherits(x, "a", TRUE) # 1
inherits(x, c("a", "b", "c"), TRUE) # 1 2 0

class(quote(pi))           # "name"
## regular calls
class(quote(sin(pi*x)))    # "call"
## special calls
class(quote(x <- 1))       # "<-"
class(quote((1 < 2)))      # "("
class(quote(if(8 < 3) pi)) # "if"

.class2(pi)                   # "double" "numeric"
.class2(matrix(1:6, 2, 3))    # "matrix" "array" "integer" "numeric"

Column Indexes

Description

Returns a matrix of integers indicating their column number in a matrix-like object, or a factor of column labels.

Usage

col(x, as.factor = FALSE)
.col(dim)

Arguments

x

a matrix-like object, that is one with a two-dimensional dim.

dim

a matrix dimension, i.e., an integer valued numeric vector of length two (with non-negative entries).

as.factor

a logical value indicating whether the value should be returned as a factor of column labels (created if necessary) rather than as numbers.

Value

An integer (or factor) matrix with the same dimensions as x and whose ij-th element is equal to j (or the j-th column label).

References

Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988) The New S Language. Wadsworth & Brooks/Cole.

See Also

row to get rows; slice.index for a general way to get slice indices in an array.

Examples

# extract an off-diagonal of a matrix
ma <- matrix(1:12, 3, 4)
ma[row(ma) == col(ma) + 1]

# create an identity 5-by-5 matrix more slowly than diag(n = 5):
x <- matrix(0, nrow = 5, ncol = 5)
x[row(x) == col(x)] <- 1

(i34 <- .col(3:4))
stopifnot(identical(i34, .col(c(3, 4)))) # 'dim' may be "double"

Colon Operator

Description

Generate regular sequences.

Usage

from:to
   a:b

Arguments

from

starting value of sequence.

to

(maximal) end value of the sequence.

a,b

factors of the same length.

Details

The binary operator : has two meanings: for factors a:b is equivalent to interaction(a, b) (but the levels are ordered and labelled differently).

For other arguments from:to is equivalent to seq(from, to), and generates a sequence from from to to in steps of 1 or -1. Value to will be included if it differs from from by an integer up to a numeric fuzz of about 1e-7. Non-numeric arguments are coerced internally (hence without dispatching methods) to numeric; complex values will have their imaginary parts discarded with a warning.
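A short illustration of the numeric case, offered as a sketch rather than a specification. The variable names are arbitrary, and the value 1e-10 is simply chosen to lie well inside the fuzz of about 1e-7 mentioned above.

```r
up   <- 2:5            # ascending, step 1
down <- 5:2            # descending, step -1
part <- 1:3.5          # 'to' acts as an upper bound and is not reached
frac <- 0.5:3          # non-integer 'from': 0.5, 1.5, 2.5

# 'to' is a hair below 3, but within the fuzz, so 3 is still included.
near <- 1:(3 - 1e-10)

type_part <- typeof(part)  # 'from' is integer-valued
type_frac <- typeof(frac)  # 'from' is not integer-valued
```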

Value

For numeric arguments, a numeric vector. This will be of type integer if from is integer-valued and the result is representable in the R integer type, otherwise of type "double" (aka mode "numeric").

For factors, an unordered factor with levels labelled as la:lb and ordered lexicographically (that is, lb varies fastest).

References

Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988) The New S Language. Wadsworth & Brooks/Cole.
(for numeric arguments: S does not have : for factors.)

See Also

seq (a generalization of from:to).

As an alternative to using : for factors, interaction.

For : used in the formal representation of an interaction, see formula.

Examples

1:4
pi:6 # real
6:pi # integer

f1 <- gl(2, 3); f1
f2 <- gl(3, 2); f2
f1:f2 # a factor, the "cross"  f1 x f2

Form Row and Column Sums and Means

Description

Form row and column sums and means for numeric arrays (or data frames).

Usage

colSums(x, na.rm = FALSE, dims = 1)
rowSums(x, na.rm = FALSE, dims = 1)
colMeans(x, na.rm = FALSE, dims = 1)
rowMeans(x, na.rm = FALSE, dims = 1)

.colSums(x, m, n, na.rm = FALSE)
.rowSums(x, m, n, na.rm = FALSE)
.colMeans(x, m, n, na.rm = FALSE)
.rowMeans(x, m, n, na.rm = FALSE)

Arguments

x

an array of two or more dimensions, containing numeric, complex, integer or logical values, or a numeric data frame. For .colSums() etc, a numeric, integer or logical matrix (or vector of length m * n).

na.rm

logical. Should missing values (including NaN) be omitted from the calculations?

dims

integer: Which dimensions are regarded as ‘rows’ or ‘columns’ to sum over. For row*, the sum or mean is over dimensions dims+1, ...; for col* it is over dimensions 1:dims.

m,n

the dimensions of the matrix x for .colSums() etc.

Details

These functions are equivalent to use of apply with FUN = mean or FUN = sum with appropriate margins, but are a lot faster. As they are written for speed, they blur over some of the subtleties of NaN and NA. If na.rm = FALSE and either NaN or NA appears in a sum, the result will be one of NaN or NA, but which might be platform-dependent.

Notice that omission of missing values is done on a per-column or per-row basis, so column means may not be over the same set of rows, and vice versa. To use only complete rows or columns, first select them with na.omit or complete.cases (possibly on the transpose of x).

The versions with an initial dot in the name (.colSums() etc) are ‘bare-bones’ versions for use in programming: they apply only to numeric (like) matrices and do not name the result.
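The effect of dims and the stated equivalence with apply can be checked on a small array. This is a sketch with arbitrary example data; the object names are illustrative only.

```r
a <- array(1:24, dim = c(2, 3, 4))

# rowSums with dims = 2 sums over dimension 3 and keeps a 2 x 3 result.
r2 <- rowSums(a, dims = 2)
# colSums with dims = 2 sums over dimensions 1:2 and keeps a length-4 result.
c2 <- colSums(a, dims = 2)

# The same results via apply() with the matching margins.
same_rows <- isTRUE(all.equal(r2, apply(a, c(1, 2), sum)))
same_cols <- isTRUE(all.equal(c2, apply(a, 3, sum)))

# The dotted variant takes the matrix dimensions explicitly
# and returns an unnamed result.
m <- matrix(1:6, nrow = 2, ncol = 3, dimnames = list(c("a", "b"), NULL))
named   <- rowSums(m)
unnamed <- .rowSums(m, 2, 3)
```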

Value

A numeric or complex array of suitable size, or a vector if the result is one-dimensional. For the first four functions the dimnames (or names for a vector result) are taken from the original array.

If there are no values in a range to be summed over (after removing missing values with na.rm = TRUE), that component of the output is set to 0 (*Sums) or NaN (*Means), consistent with sum and mean.
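For instance, a row that consists only of missing values behaves as described once na.rm = TRUE removes them. The matrix below is made-up illustration data.

```r
x <- rbind(c(1, 2), c(NA, NA))

sums  <- rowSums(x, na.rm = TRUE)   # second row has nothing left to sum
means <- rowMeans(x, na.rm = TRUE)  # second row has nothing left to average
```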

See Also

apply, rowsum

Examples

## Compute row and column sums for a matrix:
x <- cbind(x1 = 3, x2 = c(4:1, 2:5))
rowSums(x); colSums(x)
dimnames(x)[[1]] <- letters[1:8]
rowSums(x); colSums(x); rowMeans(x); colMeans(x)
x[] <- as.integer(x)
rowSums(x); colSums(x)
x[] <- x < 3
rowSums(x); colSums(x)
x <- cbind(x1 = 3, x2 = c(4:1, 2:5))
x[3, ] <- NA; x[4, 2] <- NA
rowSums(x); colSums(x); rowMeans(x); colMeans(x)
rowSums(x, na.rm = TRUE); colSums(x, na.rm = TRUE)
rowMeans(x, na.rm = TRUE); colMeans(x, na.rm = TRUE)

## an array
dim(UCBAdmissions)
rowSums(UCBAdmissions); rowSums(UCBAdmissions, dims = 2)
colSums(UCBAdmissions); colSums(UCBAdmissions, dims = 2)

## complex case
x <- cbind(x1 = 3 + 2i, x2 = c(4:1, 2:5) - 5i)
x[3, ] <- NA; x[4, 2] <- NA
rowSums(x); colSums(x); rowMeans(x); colMeans(x)
rowSums(x, na.rm = TRUE); colSums(x, na.rm = TRUE)
rowMeans(x, na.rm = TRUE); colMeans(x, na.rm = TRUE)

Extract Command Line Arguments

Description

Provides access to a copy of the command line arguments supplied when this R session was invoked.

Usage

commandArgs(trailingOnly=FALSE)

Arguments

trailingOnly

logical. Should only arguments after --args be returned?

Details

These arguments are captured before the standard R command line processing takes place. This means that they are the unmodified values. This is especially useful with the --args command-line flag to R, as all of the command line after that flag is skipped.

Value

A character vector containing the name of the executable and the user-supplied command line arguments. The first element is the name of the executable by which R was invoked. The exact form of this element is platform dependent: it may be the fully qualified name, or simply the last component (or basename) of the application, or for an embedded R it can be anything the programmer supplied.

If trailingOnly = TRUE, a character vector of those arguments (if any) supplied after --args.
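A typical use of trailingOnly = TRUE is reading script arguments. The sketch below assumes a hypothetical script saved as myscript.R and started with Rscript, which passes everything after the script name on to R after --args. The file name and the meaning of the two arguments are illustrative assumptions.

```r
# Hypothetical script 'myscript.R', run for example as:
#   Rscript myscript.R input.csv 10
args <- commandArgs(trailingOnly = TRUE)

if (length(args) >= 2L) {
  infile <- args[[1L]]               # first user-supplied argument
  n      <- as.integer(args[[2L]])   # second one, converted to integer
  cat("file:", infile, "n:", n, "\n")
} else {
  cat("usage: Rscript myscript.R <file> <n>\n")
}
```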

See Also

R.home(), Startup and BATCH

Examples

commandArgs()
## Spawn a copy of this application as it was invoked,
## subject to shell quoting issues
## system(paste(commandArgs(), collapse = " "))

Query or Set a "comment" Attribute

Description

These functions set and query a comment attribute for any R objects. This is typically useful for data.frames or model fits.

Contrary to other attributes, the comment is not printed (by print or print.default).

Assigning NULL or a zero-length character vector removes the comment.

Usage

comment(x)
comment(x) <- value

Arguments

x

any R object.

value

a character vector, or NULL.

See Also

attributes and attr for other attributes.

Examples

x <- matrix(1:12, 3, 4)
comment(x) <- c("This is my very important data from experiment #0234",
                "Jun 5, 1998")
x
comment(x)

Relational Operators

Description

Binary operators which allow the comparison of values in atomic vectors.

Usage

x < y
x > y
x <= y
x >= y
x == y
x != y

Arguments

x,y

atomic vectors, symbols, calls, or other objects for which methods have been written.

Details

The binary comparison operators are generic functions: methods can be written for them individually or via the Ops group generic function. (See Ops for how dispatch is computed.)

Comparison of strings in character vectors is lexicographic within the strings using the collating sequence of the locale in use: see locales. The collating sequence of locales such as ‘en_US’ is normally different from ‘C’ (which should use ASCII) and can be surprising. Beware of making any assumptions about the collation order: e.g. in Estonian Z comes between S and T. Collation is also not necessarily character-by-character: in Danish aa sorts as a single letter, after z. In Welsh ng may or may not be a single sorting unit: if it is it follows g. Some platforms may not respect the locale and always sort in numerical order of the bytes in an 8-bit locale, or in Unicode code-point order for a UTF-8 locale (and may not sort in the same order for the same language in different character sets). Collation of non-letters (spaces, punctuation signs, hyphens, fractions and so on) is even more problematic.

Character strings can be compared with different marked encodings (see Encoding): they are translated to UTF-8 before comparison.

Raw vectors should not really be considered to have an order, but the numeric order of the byte representation is used.

At least one of x and y must be an atomic vector, but if the other is a list R attempts to coerce it to the type of the atomic vector: this will succeed if the list is made up of elements of length one that can be coerced to the correct type.

If the two arguments are atomic vectors of different types, one is coerced to the type of the other, the (decreasing) order of precedence being character, complex, numeric, integer, logical and raw.
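A few comparisons across types show the coercion order at work. This is a small sketch; the result names are arbitrary.

```r
int_vs_dbl <- 1L == 1.0      # integer is compared as numeric
lgl_vs_num <- TRUE == 1      # logical is coerced to numeric
chr_vs_num <- "1" == 1       # 1 is coerced to the string "1"
chr_padded <- "1.0" == 1     # compares "1.0" with "1", so not equal
chr_vs_lgl <- "TRUE" == TRUE # TRUE is coerced to the string "TRUE"
```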

Missing values (NA) and NaN values are regarded as non-comparable even to themselves, so comparisons involving them will always result in NA. Missing values can also result when character strings are compared and one is not valid in the current collation locale.
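In practice this means that == cannot be used to find missing values; is.na is the usual tool. A minimal sketch with made-up data:

```r
na_self  <- NA == NA      # NA, not TRUE
nan_self <- NaN == NaN    # NA as well

x <- c(1, NA, 3)
cmp     <- x == NA        # NA for every element, so never useful as a test
missing <- is.na(x)       # the reliable way to locate missing values
```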

Language objects such as symbols and calls can only be used as operands for == and !=; the other comparisons signal an error when one of the operands is a language object. Currently language objects are deparsed to character strings before comparison. This can be inefficient and may not be what is really wanted. For equality comparisons identical is usually a better choice.

Value

A logical vector indicating the result of the element by element comparison. The elements of shorter vectors are recycled as necessary.

Objects such as arrays or time-series can be compared this way provided they are conformable.

S4 methods

These operators are members of the S4 Compare group generic, and so methods can be written for them individually as well as for the group generic (or the Ops group generic), with arguments c(e1, e2).

Note

Do not use == and != for tests, such as in if expressions, where you must get a single TRUE or FALSE. Unless you are absolutely sure that nothing unusual can happen, you should use the identical function instead.
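One "unusual" case is a zero-length input, where == returns a zero-length result that if cannot use. The sketch below uses an empty character vector as an assumed example input.

```r
x <- character(0)

cmp <- x == "a"            # logical(0): not a single TRUE or FALSE
ok  <- identical(x, "a")   # always a single TRUE or FALSE

# Using the == result directly in if() signals an error for this input.
res <- tryCatch(
  if (x == "a") "yes" else "no",
  error = function(e) "error"
)
```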

For numerical and complex values, remember == and != do not allow for the finite representation of fractions, nor for rounding error. Using all.equal with identical or isTRUE is almost always preferable; see the examples. (This also applies to the other comparison operators.)

These operators are sometimes called as functions as e.g. `<`(x, y): see the description of how argument-matching is done in Ops.

References

Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988) The New S Language. Wadsworth & Brooks/Cole.

Collation of character strings is a complex topic. For an introduction see https://en.wikipedia.org/wiki/Collating_sequence. The Unicode Collation Algorithm (https://unicode.org/reports/tr10/) is likely to be increasingly influential. Where available R by default makes use of ICU (https://icu.unicode.org/) for collation (except in a C locale).

See Also

Logic on how to combine results of comparisons, i.e., logical vectors.

factor for the behaviour with factor arguments.

Syntax for operator precedence.

capabilities for whether ICU is available, and icuSetCollate to tune the string collation algorithm when it is.

Examples

x <- stats::rnorm(20)
x < 1
x[x > 0]

x1 <- 0.5 - 0.3
x2 <- 0.3 - 0.1
x1 == x2                   # FALSE on most machines
isTRUE(all.equal(x1, x2))  # TRUE everywhere

# range of most 8-bit charsets, as well as of Latin-1 in Unicode
z <- c(32:126, 160:255)
x <- if(l10n_info()$MBCS) {
    intToUtf8(z, multiple = TRUE)
} else rawToChar(as.raw(z), multiple = TRUE)
## by number
writeLines(strwrap(paste(x, collapse = " "), width = 60))
## by locale collation
writeLines(strwrap(paste(sort(x), collapse = " "), width = 60))

Complex Numbers and Basic Functionality

Description

Basic functions which support complex arithmetic in R, in addition to the arithmetic operators +, -, *, /, and ^.

Usage

complex(length.out = 0, real = numeric(), imaginary = numeric(),
        modulus = 1, argument = 0)
as.complex(x, ...)
is.complex(x)

Re(z)
Im(z)
Mod(z)
Arg(z)
Conj(z)

Arguments

length.out

numeric. Desired length of the output vector, inputs being recycled as needed.

real

numeric vector.

imaginary

numeric vector.

modulus

numeric vector.

argument

numeric vector.

x

an object, probably of mode complex.

z

an object of mode complex, or one of a class for which a method has been defined.

...

further arguments passed to or from other methods.

Details

Complex vectors can be created with complex. The vector can be specified either by giving its length, its real and imaginary parts, or modulus and argument. (Giving just the length generates a vector of complex zeroes.)

as.complex attempts to coerce its argument to be of complex type: like as.vector it strips attributes including names.

Since R version 4.4.0, as.complex(x) for “number-like” x, i.e., types "logical", "integer", and "double", will always keep imaginary part zero, now also for NAs.

Up to R versions 3.2.x, all forms of NA and NaN were coerced to a complex NA, i.e., the NA_complex_ constant, for which both the real and imaginary parts are NA. Since R 3.3.0, typically only objects which are NA in parts are coerced to complex NA, but others with NaN parts are not. As a consequence, complex arithmetic where only NaNs (but no NAs) are involved typically will not give complex NA but complex numbers with real or imaginary parts of NaN. All of these many different complex numbers fulfill is.na(.) but only one of them is identical to NA_complex_.

Note that is.complex and is.numeric are never both TRUE.

The functions Re, Im, Mod, Arg and Conj have their usual interpretation as returning the real part, imaginary part, modulus, argument and complex conjugate for complex values. The modulus and argument are also called the polar coordinates. If $z = x + i y$ with real $x$ and $y$, then for $r = Mod(z) = \sqrt{x^2 + y^2}$ and $\phi = Arg(z)$, $x = r \cos(\phi)$ and $y = r \sin(\phi)$. They are all internal generic primitive functions: methods can be defined for them individually or via the Complex group generic.
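The polar relations can be verified numerically. The value 3+4i is an arbitrary example, and comparisons use all.equal to allow for rounding.

```r
z <- complex(real = 3, imaginary = 4)

r   <- Mod(z)   # modulus, sqrt(3^2 + 4^2)
phi <- Arg(z)   # argument (angle in radians)

# Rebuild z from its polar coordinates.
z_polar <- complex(modulus = r, argument = phi)

re_ok <- isTRUE(all.equal(Re(z), r * cos(phi)))
im_ok <- isTRUE(all.equal(Im(z), r * sin(phi)))

zc <- Conj(z)   # same real part, negated imaginary part
```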

In addition to the arithmetic operators (see Arithmetic) +, -, *, /, and ^, the elementary trigonometric, logarithmic, exponential, square root and hyperbolic functions are implemented for complex values.

Matrix multiplications (%*%, crossprod, tcrossprod) are also defined for complex matrices (matrix), and so are solve, eigen or svd.

Internally, complex numbers are stored as a pair of double precision numbers, either or both of which can be NaN (including NA, see NA_complex_ and above) or plus or minus infinity.

S4 methods

as.complex is primitive and can have S4 methods set.

Re, Im, Mod, Arg and Conj constitute the S4 group generic Complex and so S4 methods can be set for them individually or via the group generic.

Note

Operations and functions involving complex NaN mostly rely on the C library's handling of ‘double complex’ arithmetic, which typically returns complex(re=NaN, im=NaN) (but we have not seen a guarantee for that). For + and -, R's own handling works strictly “coordinate wise”.

Operations involving complex NA, i.e., NA_complex_, return NA_complex_.

Only since R version 4.4.0 does as.complex("1i") give 1i; previously it returned NA_complex_ with a warning.

References

Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988) The New S Language. Wadsworth & Brooks/Cole.

See Also

Arithmetic; polyroot finds all n complex roots of a polynomial of degree n.

Examples

require(graphics)

0i ^ (-3:3)

matrix(1i^ (-6:5), nrow = 4) #- all columns are the same
0 ^ 1i # a complex NaN

## create a complex normal vector
z <- complex(real = stats::rnorm(100), imaginary = stats::rnorm(100))
## or also (less efficiently):
z2 <- 1:2 + 1i*(8:9)

## The Arg(.) is an angle:
zz <- (rep(1:4, length.out = 9) + 1i*(9:1))/10
zz.shift <- complex(modulus = Mod(zz), argument = Arg(zz) + pi)
plot(zz, xlim = c(-1,1), ylim = c(-1,1), col = "red", asp = 1,
     main = expression(paste("Rotation by "," ", pi == 180^o)))
abline(h = 0, v = 0, col = "blue", lty = 3)
points(zz.shift, col = "orange")

## as.complex(<some NA>): numbers keep Im = 0:
stopifnot(identical(as.complex(NA_real_), NA_real_ + 0i)) # has always been true
NAs <- vapply(list(NA, NA_integer_, NA_real_, NA_character_, NA_complex_),
              as.complex, 0+0i)
stopifnot(is.na(NAs), is.na(Re(NAs))) # has always been true
showC <- function(z) noquote(paste0("(", Re(z), ",", Im(z), ")"))
showC(NAs)
Im(NAs) # [0 0 0 NA NA]  \ in R <= 4.3.x was [NA NA 0 NA NA]
stopifnot(Im(NAs)[1:3] == 0)

## The exact result of this *depends* on the platform, compiler, math-library:
(NpNA <- NaN + NA_complex_); str(NpNA) # *behaves* as 'cplx NA' ..
stopifnot(is.na(NpNA), is.na(NA_complex_), is.na(Re(NA_complex_)), is.na(Im(NA_complex_)))
showC(NpNA) # but does not always show '(NaN,NA)'
## and this is not TRUE everywhere:
identical(NpNA, NA_complex_)
showC(NA_complex_) # always == (NA,NA)

Condition Handling and Recovery

Description

These functions provide a mechanism for handling unusual conditions, including errors and warnings.

Usage

tryCatch(expr, ..., finally)
withCallingHandlers(expr, ...)
globalCallingHandlers(...)

signalCondition(cond)

simpleCondition(message, call = NULL)
simpleError(message, call = NULL)
simpleWarning(message, call = NULL)
simpleMessage(message, call = NULL)

errorCondition(message, ..., class = NULL, call = NULL)
warningCondition(message, ..., class = NULL, call = NULL)

## S3 method for class 'condition'
as.character(x, ...)
## S3 method for class 'error'
as.character(x, ...)
## S3 method for class 'condition'
print(x, ...)
## S3 method for class 'restart'
print(x, ...)

conditionCall(c)
## S3 method for class 'condition'
conditionCall(c)
conditionMessage(c)
## S3 method for class 'condition'
conditionMessage(c)

withRestarts(expr, ...)

computeRestarts(cond = NULL)
findRestart(name, cond = NULL)
invokeRestart(r, ...)
tryInvokeRestart(r, ...)
invokeRestartInteractively(r)

isRestart(x)
restartDescription(r)
restartFormals(r)

suspendInterrupts(expr)
allowInterrupts(expr)

.signalSimpleWarning(msg, call)
.handleSimpleError(h, msg, call)
.tryResumeInterrupt()

Arguments

c

a condition object.

call

call expression.

cond

a condition object.

expr

expression to be evaluated.

finally

expression to be evaluated before returning or exiting.

h

function.

message

character string.

msg

character string.

name

character string naming a restart.

r

restart object.

x

object.

class

character string naming a condition class.

...

additional arguments; see details below.

Details

The condition system provides a mechanism for signaling and handling unusual conditions, including errors and warnings. Conditions are represented as objects that contain information about the condition that occurred, such as a message and the call in which the condition occurred. Currently conditions are S3-style objects, though this may eventually change.

Conditions are objects inheriting from the abstract class condition. Errors and warnings are objects inheriting from the abstract subclasses error and warning. The class simpleError is the class used by stop and all internal error signals. Similarly, simpleWarning is used by warning, and simpleMessage is used by message. The constructors by the same names take a string describing the condition as argument and an optional call. The functions conditionMessage and conditionCall are generic functions that return the message and call of a condition.

The function errorCondition can be used to construct error conditions of a particular class with additional fields specified as the ... argument. warningCondition is analogous for warnings.
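As a sketch of how a classed condition with extra fields might be built and handled: the class name quotaError and the fields used and limit are hypothetical, chosen only for illustration.

```r
# Build an error condition with a custom class and two extra fields.
cond <- errorCondition("disk quota exceeded",
                       used = 120, limit = 100,   # hypothetical fields
                       class = "quotaError")

cls <- class(cond)   # custom class first, then "error" and "condition"

# Signal it with stop() and handle it by its custom class;
# the extra fields are available on the condition object.
over <- tryCatch(
  stop(cond),
  quotaError = function(e) e$used - e$limit
)
```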

Conditions are signaled by signalCondition. In addition, the stop and warning functions have been modified to also accept condition arguments.

The function tryCatch evaluates its expression argument in a context where the handlers provided in the ... argument are available. The finally expression is then evaluated in the context in which tryCatch was called; that is, the handlers supplied to the current tryCatch call are not active when the finally expression is evaluated.

Handlers provided in the ... argument to tryCatch are established for the duration of the evaluation of expr. If no condition is signaled when evaluating expr then tryCatch returns the value of the expression.

If a condition is signaled while evaluating expr then established handlers are checked, starting with the most recently established ones, for one matching the class of the condition. When several handlers are supplied in a single tryCatch then the first one is considered more recent than the second. If a handler is found then control is transferred to the tryCatch call that established the handler, the handler found and all more recent handlers are disestablished, the handler is called with the condition as its argument, and the result returned by the handler is returned as the value of the tryCatch call.
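The following sketch shows these rules in a small case: the handler's return value becomes the value of tryCatch, the rest of expr is abandoned, and finally still runs in the calling context. The variable events is an illustrative name.

```r
events <- character()

res <- tryCatch(
  {
    warning("first problem")
    "not reached"          # abandoned once the warning handler takes over
  },
  warning = function(w) paste("handled:", conditionMessage(w)),
  error   = function(e) "error handler",
  finally = events <- c(events, "cleanup ran")  # evaluated in the caller
)
```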

Calling handlers are established by withCallingHandlers. If a condition is signaled and the applicable handler is a calling handler, then the handler is called by signalCondition in the context where the condition was signaled but with the available handlers restricted to those below the handler called in the handler stack. If the handler returns, then the next handler is tried; once the last handler has been tried, signalCondition returns NULL.
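Unlike a tryCatch handler, a calling handler does not unwind the computation. In this sketch the handler records the warning and then muffles it, so the expression carries on. The variable seen is an illustrative name, and muffling via the "muffleWarning" restart is one common choice rather than the only one.

```r
seen <- character()

res <- withCallingHandlers(
  {
    warning("A")
    1 + 2                  # still evaluated after the handler has run
  },
  warning = function(w) {
    seen <<- c(seen, conditionMessage(w))  # record the message
    tryInvokeRestart("muffleWarning")      # stop further warning handling
  }
)
```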

globalCallingHandlers establishes calling handlers globally. These handlers are only called as a last resort, after the other handlers dynamically registered with withCallingHandlers have been invoked. They are called before the error global option (which is the legacy interface for global handling of errors). Registering the same handler multiple times moves that handler on top of the stack, which ensures that it is called first. Global handlers are a good place to define a general purpose logger (for instance saving the last error object in the global workspace) or a general recovery strategy (e.g. installing missing packages via the retry_loadNamespace restart).

Like withCallingHandlers and tryCatch, globalCallingHandlers takes named handlers. Unlike these functions, it also has an options-like interface: you can establish handlers by passing a single list of named handlers. To unregister all global handlers, supply a single 'NULL'. The list of deleted handlers is returned invisibly. Finally, calling globalCallingHandlers without arguments returns the list of currently established handlers, visibly.

User interrupts signal a condition of class interrupt that inherits directly from class condition before executing the default interrupt action.

Restarts are used for establishing recovery protocols. They can be established using withRestarts. One pre-established restart is an abort restart that represents a jump to top level.

findRestart and computeRestarts find the available restarts. findRestart returns the most recently established restart of the specified name. computeRestarts returns a list of all restarts. Both can be given a condition argument and will then ignore restarts that do not apply to the condition.

invokeRestart transfers control to the point where the specified restart was established and calls the restart's handler with the arguments, if any, given as additional arguments to invokeRestart. The restart argument to invokeRestart can be a character string, in which case findRestart is used to find the restart. If no restart is found, an error is thrown.

tryInvokeRestart is a variant of invokeRestart that returns silently when the restart cannot be found with findRestart. Because a condition of a given class might be signalled with arbitrary protocols (error, warning, etc), it is recommended to use this permissive variant whenever you are handling conditions signalled from a foreign context. For instance, invocation of a "muffleWarning" restart should be optional because the warning might have been signalled by the user or from a different package with the stop or message protocols. Only use invokeRestart when you have control of the signalling context, or when it is a logical error if the restart is not available.

New restarts for withRestarts can be specified in several ways. The simplest is in name = function form where the function is the handler to call when the restart is invoked. Another simple variant is as name = string where the string is stored in the description field of the restart object returned by findRestart; in this case the handler ignores its arguments and returns NULL. The most flexible form of a restart specification is as a list that can include several fields, including handler, description, and test. The test field should contain a function of one argument, a condition, that returns TRUE if the restart applies to the condition and FALSE if it does not; the default function returns TRUE for all conditions.
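A restart in name = function form can be combined with a calling handler to separate "how to recover" from "when to recover". In the sketch below, parse_number, the restart name use_value and the condition class parseError are all hypothetical names chosen for illustration.

```r
# Low-level code: signals a classed error and offers a recovery point.
parse_number <- function(x) {
  withRestarts(
    {
      n <- suppressWarnings(as.numeric(x))
      if (is.na(n))
        stop(errorCondition(paste("not a number:", x), class = "parseError"))
      n
    },
    use_value = function(value) value   # restart handler: return 'value'
  )
}

# High-level code: decides to recover by substituting NA.
result <- withCallingHandlers(
  vapply(c("1", "oops", "3"), parse_number, numeric(1)),
  parseError = function(e) invokeRestart("use_value", NA_real_)
)
```

Because the handler is a calling handler, the restart is invoked while parse_number is still active, so the loop over the remaining inputs continues after the substitution.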

One additional field that can be specified for a restart is interactive. This should be a function of no arguments that returns a list of arguments to pass to the restart handler. The list could be obtained by interacting with the user if necessary. The function invokeRestartInteractively calls this function to obtain the arguments to use when invoking the restart. The default interactive method queries the user for values for the formal arguments of the handler function.

Interrupts can be suspended while evaluating an expression using suspendInterrupts. Subexpressions can be evaluated with interrupts enabled using allowInterrupts. These functions can be used to make sure cleanup handlers cannot be interrupted.

.signalSimpleWarning, .handleSimpleError, and .tryResumeInterrupt are used internally and should not be called directly.

References

The tryCatch mechanism is similar to Java error handling. Calling handlers are based on Common Lisp and Dylan. Restarts are based on the Common Lisp restart mechanism.

See Also

stop and warning signal conditions, and try is essentially a simplified version of tryCatch. assertCondition in package tools tests that conditions are signalled and works with several of the above handlers.

Examples

tryCatch(1, finally = print("Hello"))
e <- simpleError("test error")
## Not run: 
 stop(e)
 tryCatch(stop(e), finally = print("Hello"))
 tryCatch(stop("fred"), finally = print("Hello"))

## End(Not run)
tryCatch(stop(e), error = function(e) e, finally = print("Hello"))
tryCatch(stop("fred"), error = function(e) e, finally = print("Hello"))
withCallingHandlers({ warning("A"); 1 + 2 }, warning = function(w) {})
## Not run: 
 { withRestarts(stop("A"), abort = function() {}); 1 }

## End(Not run)
withRestarts(invokeRestart("foo", 1, 2), foo = function(x, y) {x + y})

##--> More examples are part of
##-->   demo(error.catching)

Search for Masked Objects on the Search Path

Description

conflicts reports on objects that exist with the same name in two or more places on the search path, usually because an object in the user's workspace or a package is masking a system object of the same name. This helps discover unintentional masking.

Usage

conflicts(where = search(), detail = FALSE)

Arguments

where

A subset of the search path, by default the whole search path.

detail

If TRUE, give the masked or masking functions for all members of the search path.

Value

If detail = FALSE, a character vector of masked objects. If detail = TRUE, a list of character vectors giving the masked or masking objects in that member of the search path. Empty vectors are omitted.
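The two return shapes can be illustrated with a small sketch that attaches a throwaway environment masking base's sum. The names masking_env and "demo_mask" are hypothetical and chosen only for this example.

```r
# Hypothetical environment holding an object that masks base::sum.
masking_env <- new.env()
assign("sum", function(...) "masked", envir = masking_env)
attach(masking_env, name = "demo_mask", warn.conflicts = FALSE)

flat   <- conflicts()               # character vector of masked names
nested <- conflicts(detail = TRUE)  # list, one entry per search-path member

# Remove the demonstration entry from the search path again.
detach("demo_mask", character.only = TRUE)
```

Here nested[["demo_mask"]] and nested[["package:base"]] both list "sum", since one masks the other.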

Examples

lm <- 1:3
conflicts(, TRUE)
## gives something like
# $.GlobalEnv
# [1] "lm"
#
# $package:base
# [1] "lm"

## Remove things from your "workspace" that mask others:
remove(list = conflicts(detail = TRUE)$.GlobalEnv)

Functions to Manipulate Connections (Files, URLs, ...)

Description

Functions to create, open and close connections, i.e., “generalized files”, such as possibly compressed files, URLs, pipes, etc.

Usage

file(description = "", open = "", blocking = TRUE,
     encoding = getOption("encoding"), raw = FALSE,
     method = getOption("url.method", "default"))

url(description, open = "", blocking = TRUE,
    encoding = getOption("encoding"),
    method = getOption("url.method", "default"),
    headers = NULL)

gzfile(description, open = "", encoding = getOption("encoding"),
       compression = 6)

bzfile(description, open = "", encoding = getOption("encoding"),
       compression = 9)

xzfile(description, open = "", encoding = getOption("encoding"),
       compression = 6)

unz(description, filename, open = "", encoding = getOption("encoding"))

pipe(description, open = "", encoding = getOption("encoding"))

fifo(description, open = "", blocking = FALSE,
     encoding = getOption("encoding"))

socketConnection(host = "localhost", port, server = FALSE,
                 blocking = FALSE, open = "a+",
                 encoding = getOption("encoding"),
                 timeout = getOption("timeout"),
                 options = getOption("socketOptions"))

serverSocket(port)

socketAccept(socket, blocking = FALSE, open = "a+",
             encoding = getOption("encoding"),
             timeout = getOption("timeout"),
             options = getOption("socketOptions"))

open(con, ...)
## S3 method for class 'connection'
open(con, open = "r", blocking = TRUE, ...)

close(con, ...)
## S3 method for class 'connection'
close(con, type = "rw", ...)

flush(con)

isOpen(con, rw = "")
isIncomplete(con)

socketTimeout(socket, timeout = -1)

Arguments

description

character string. A description of the connection: see ‘Details’.

open

character string. A description of how to open the connection (if it should be opened initially). See section ‘Modes’ for possible values.

blocking

logical. See the ‘Blocking’ section.

encoding

the name of the encoding to be assumed. See the ‘Encoding’ section.

raw

logical. If true, a ‘raw’ interface is used which will be more suitable for arguments which are not regular files, e.g. character devices. This suppresses the check for a compressed file when opening for text-mode reading, and asserts that the ‘file’ may not be seekable.

method

character string, partially matched to c("default", "internal", "wininet", "libcurl"): see ‘Details’.

headers

named character vector of HTTP headers to use in HTTP requests. It is ignored for non-HTTP URLs. The User-Agent header, coming from the HTTPUserAgent option (see options) is used as the first header, automatically.

compression

integer in 0–9. The amount of compression to be applied when writing, from none to maximal available. For xzfile can also be negative: see the ‘Compression’ section.

timeout

numeric: the timeout (in seconds) to be used for this connection. Beware that some OSes may treat very large values as zero: however the POSIX standard requires values up to 31 days to be supported.

options

optional character vector with options. Currently only "no-delay" is supported on TCP sockets.

filename

a filename within a zip file.

host

character string. Host name for the port.

port

integer. The TCP port number.

server

logical. Should the socket be a client or a server?

socket

a server socket listening for connections.

con

a connection.

type

character string. Currently ignored.

rw

character string. Empty or "read" or "write", partial matches allowed.

...

arguments passed to or from other methods.

Details

The first eleven functions create connections. By default the connection is not opened (except for a socket connection created by socketConnection or socketAccept and for server socket connection created by serverSocket), but may be opened by setting a non-empty value of argument open.

For file the description is a path to the file to be opened (when tilde expansion is done) or a complete URL (when it is the same as calling url), or "" (the default) or "clipboard" (see the ‘Clipboard’ section). Use "stdin" to refer to the C-level ‘standard input’ of the process (which need not be connected to anything in a console or embedded version of R, and is not in RGui on Windows). See also stdin() for the subtly different R-level concept of stdin. See nullfile() for a platform-independent way to get filename of the null device.

For url the description is a complete URL including scheme (such as ‘http://’, ‘https://’, ‘ftp://’ or ‘file://’). Method "internal" is that available since connections were introduced but now mainly defunct. Method "wininet" is only available on Windows (it uses the WinINet functions of that OS) and method "libcurl" (using the library of that name: https://curl.se/libcurl/) is nowadays required but was optional on Windows before R 4.2.0. Method "default" currently uses method "internal" for ‘file://’ URLs and "libcurl" for all others. Which methods support which schemes has varied by R version – currently "internal" supports only ‘file://’; "wininet" supports ‘file://’, ‘http://’ and ‘https://’. Proxies can be specified: see download.file.

For gzfile the description is the path to a file compressed by gzip: it can also open for reading uncompressed files and those compressed by bzip2, xz or lzma.

For bzfile the description is the path to a file compressed by bzip2.

For xzfile the description is the path to a file compressed by xz (https://en.wikipedia.org/wiki/Xz) or (for reading only) lzma (https://en.wikipedia.org/wiki/LZMA).

unz reads (only) single files within zip files, in binary mode. The description is the full path to the zip file, with ‘.zip’ extension if required.

For pipe the description is the command line to be piped to or from. This is run in a shell, on Windows that specified by the COMSPEC environment variable.
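A minimal sketch of reading from a pipe follows. It assumes the shell provides an echo command (true of typical Unix shells and of the Windows command interpreter); the variable names are illustrative.

```r
# The description is the command line; its output becomes the input stream.
con <- pipe("echo hello")
out <- readLines(con)   # opens the pipe, reads, and closes it again
close(con)              # destroy the connection object
```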

For fifo the description is the path of the fifo. (Support for fifo connections is optional but they are available on most Unix platforms and on Windows.)

The intention is that file and gzfile can be used generally for text input (from files, ‘http://’ and ‘https://’ URLs) and binary input respectively.

open, close and seek are generic functions: the following applies to the methods relevant to connections.

open opens a connection. In general functions using connections will open them if they are not open, but then close them again, so to leave a connection open call open explicitly.
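The difference can be seen in a short sketch: on an unopened connection each readLines call starts again from the beginning of the file, whereas an explicitly opened connection keeps its position. The variable names are illustrative only.

```r
path <- tempfile()
writeLines(c("first", "second", "third"), path)

con <- file(path)             # created, but not opened
a <- readLines(con, n = 1)    # opened, read, then closed again
b <- readLines(con, n = 1)    # re-opened: starts from the top again

open(con, "r")                # now the connection stays open
c1 <- readLines(con, n = 1)
c2 <- readLines(con, n = 1)   # continues where the previous read stopped
close(con)
unlink(path)
```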

close closes and destroys a connection. This will happen automatically in due course (with a warning) if there is no longer an R object referring to the connection.

flush flushes the output stream of a connection open for write/append (where implemented, currently for file and clipboard connections, stdout and stderr).

If for a file or (on most platforms) a fifo connection the description is "", the file/fifo is immediately opened (in "w+" mode unless open = "w+b" is specified) and unlinked from the file system. This provides a temporary file/fifo to write to and then read from.

socketConnection(server = TRUE) creates a new temporary server socket listening on the given port. As soon as a new socket connection is accepted on that port, the server socket is automatically closed. serverSocket creates a listening server socket which can be used for accepting multiple socket connections by socketAccept. To stop listening for new connections, a server socket needs to be closed explicitly by close.

socketConnection and socketAccept support setting of socket-specific options. Currently only "no-delay" is implemented which enables the TCP_NODELAY socket option, causing the socket to flush send buffers immediately (instead of waiting to collect all output before sending). This option is useful for protocols that need fast request/response turn-around times.

socketTimeout sets connection timeout of a socket connection. A negative timeout can be given to query the old value.

Value

file, pipe, fifo, url, gzfile, bzfile, xzfile, unz, socketConnection, socketAccept and serverSocket return a connection object which inherits from class "connection" and has a first more specific class.

open and flush return NULL, invisibly.

close returns either NULL or an integer status, invisibly. The status is from when the connection was last closed and is available only for some types of connections (e.g., pipes, files and fifos): typically zero values indicate success. Negative values will result in a warning; if writing, these may indicate write failures and should not be ignored. Connections should be closed explicitly when finished with to avoid wasting resources and to reduce the risk that some buffered data in output connections would be lost (see on.exit() for how to run code also in case of error).
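One common way to follow this advice is to register the close call with on.exit as soon as the connection is created, so that it runs even if a later step fails. The helper first_line below is a hypothetical function written for this sketch, not part of base R.

```r
# Hypothetical helper: read the first line of a file, always closing
# the connection, whether or not readLines() signals an error.
first_line <- function(path) {
  con <- file(path, open = "r")
  on.exit(close(con), add = TRUE)
  readLines(con, n = 1)
}

tmp <- tempfile()
writeLines(c("alpha", "beta"), tmp)
first <- first_line(tmp)
unlink(tmp)
```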

isOpen returns a logical value, whether the connection is currently open.

isIncomplete returns a logical value, whether the last read attempt from a non-blocking connection provided no data (currently no data from a socket or an unterminated line in readLines), or for an output text connection whether there is unflushed output. See example below.

socketTimeout returns the old timeout value of a socket connection.

URLs

url and file support URL schemes ‘file://’, ‘http://’, ‘https://’ and ‘ftp://’.

method = "libcurl" allows more schemes: exactly which schemes is platform-dependent (see libcurlVersion), but all platforms will support ‘https://’ and most platforms will support ‘ftps://’.

Support for the ‘ftp://’ scheme by the "internal" method was deprecated in R 4.1.1 and removed in R 4.2.0.

Most methods do not percent-encode special characters such as spaces in ‘http://’ URLs (see URLencode), but it seems the "wininet" method does.

A note on ‘file://’ URLs (which are handled by the same internal code irrespective of argument method). The most general form (from RFC1738) is ‘file://host/path/to/file’, but R only accepts the form with an empty host field referring to the local machine.

On a Unix-alike, this is then ‘file:///path/to/file’, where ‘path/to/file’ is relative to ‘/’. So although the third slash is strictly part of the specification not part of the path, this can be regarded as a way to specify the file ‘/path/to/file’. It is not possible to specify a relative path using a file URL.

In this form the path is relative to the root of the filesystem, not a Windows concept. The standard form on Windows is ‘file:///d:/R/repos’: for compatibility with earlier versions of R and Unix versions, any other form is parsed by R as ‘file://’ plus path_to_file. Also, backslashes are accepted within the path even though RFC1738 does not allow them.

No attempt is made to decode a percent-encoded ‘file:’ URL: call URLdecode if necessary.

All the methods attempt to follow redirected HTTP and HTTPS URLs.

Server-side cached data is always accepted.

Function download.file and several contributed packages provide more comprehensive facilities to download from URLs.

Modes

Possible values for the argument open are

"r" or "rt"

Open for reading in text mode.

"w" or "wt"

Open for writing in text mode.

"a" or "at"

Open for appending in text mode.

"rb"

Open for reading in binary mode.

"wb"

Open for writing in binary mode.

"ab"

Open for appending in binary mode.

"r+", "r+b"

Open for reading and writing.

"w+", "w+b"

Open for reading and writing, truncating file initially.

"a+", "a+b"

Open for reading and appending.

Not all modes are applicable to all connections: for example URLs can only be opened for reading. Only file and socket connections can be opened for both reading and writing. An unsupported mode is usually silently substituted.

If a file or fifo is created on a Unix-alike, its permissions will be the maximal allowed by the current setting of umask (see Sys.umask).

For many connections there is little or no difference between text and binary modes. For file-like connections on Windows, translation of line endings (between LF and CRLF) is done in text mode only (but text read operations on connections such as readLines, scan and source work for any form of line ending). Various R operations are possible in only one of the modes: for example pushBack is text-oriented and is only allowed on connections open for reading in text mode, and binary operations such as readBin, load and save can only be done on binary-mode connections.
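A small sketch of a binary round trip, using the "wb" and "rb" modes that writeBin and readBin need; the variable names are illustrative.

```r
path <- tempfile()

con <- file(path, "wb")                      # binary mode for writeBin()
writeBin(1:3, con)
close(con)

con <- file(path, "rb")                      # binary mode for readBin()
values <- readBin(con, what = integer(), n = 3)
close(con)
unlink(path)
```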

The mode of a connection is determined when actually opened, which is deferred if open = "" is given (the default for all but socket connections). An explicit call to open can specify the mode, but otherwise the mode will be "r". (gzfile, bzfile and xzfile connections are exceptions, as the compressed file always has to be opened in binary mode and no conversion of line-endings is done even on Windows, so the default mode is interpreted as "rb".) Most operations that need write access or text-only or binary-only mode will override the default mode of a non-yet-open connection.

Append modes need to be considered carefully for compressed-file connections. They do not produce a single compressed stream on the file, but rather append a new compressed stream to the file. Readers may or may not read beyond end of the first stream: currently R does so for gzfile, bzfile and xzfile connections.
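The following sketch appends a second compressed stream to a gzip file and reads the result back with R itself, which (as stated above) currently reads past the end of the first stream. Other readers of the same file may behave differently.

```r
path <- tempfile(fileext = ".gz")

con <- gzfile(path, "w")    # first compressed stream
writeLines("first stream", con)
close(con)

con <- gzfile(path, "a")    # appends a second, separate stream
writeLines("second stream", con)
close(con)

con <- gzfile(path)
both <- readLines(con)      # R reads through both streams
close(con)
unlink(path)
```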

Compression

R supports gzip, bzip2 and xz compression (also read-only support for its precursor, lzma compression).

For reading, the type of compression (if any) can be determined from the first few bytes of the file. Thus for file(raw = FALSE) connections, if open is "", "r" or "rt" the connection can read any of the compressed file types as well as uncompressed files. (Using "rb" will allow compressed files to be read byte-by-byte.) Similarly, gzfile connections can read any of the forms of compression and uncompressed files in any read mode.
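As an illustration, a file written through gzfile can be read back as text through a plain file connection, while opening it with "rb" exposes the raw compressed bytes (a gzip file starts with the bytes 1f 8b). The variable names are illustrative.

```r
path <- tempfile()
con <- gzfile(path, "w")
writeLines(c("a", "b"), con)
close(con)

# Text-mode read via file(): compression is detected automatically.
txt <- readLines(path)

# Binary-mode read via file(): the compressed bytes are returned as is.
con <- file(path, "rb")
magic <- readBin(con, what = "raw", n = 2)
close(con)
unlink(path)
```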

(The type of compression is determined when the connection is created if open is unspecified and a file of that name exists. If the intention is to open the connection to write a file with a different form of compression under that name, specify open = "w" when the connection is created or unlink the file before creating the connection.)

For write-mode connections, compression specifies how hard the compressor works to minimize the file size, and higher values need more CPU time and more working memory (up to ca 800Mb for xzfile(compression = 9)). For xzfile negative values of compression correspond to adding the xz argument -e: this takes more time (double?) to compress but may achieve (slightly) better compression. The default (6) has good compression and modest (100Mb memory) usage: but if you are using xz compression you are probably looking for high compression.

Choosing the type of compression involves tradeoffs: gzip, bzip2 and xz are successively less widely supported, need more resources for both compression and decompression, and achieve more compression (although individual files may buck the general trend). Typical experience is that bzip2 compression is 15% better on text files than gzip compression, and xz with maximal compression 30% better. The experience with R save files is similar, but on some large ‘.rda’ files xz compression is much better than the other two. With current computers decompression times even with compress = 9 are typically modest and reading compressed files is usually faster than uncompressed ones because of the reduction in disc activity.

Encoding

The encoding of the input/output stream of a connection can be specified by name in the same way as it would be given to iconv: see that help page for how to find out what encoding names are recognized on your platform. Additionally, "" and "native.enc" both mean the ‘native’ encoding, that is the internal encoding of the current locale and hence no translation is done.

When writing to a text connection, the connections code always assumes its input is in native encoding, so e.g. writeLines has to convert text to native encoding. The native encoding is UTF-8 on most systems (since R 4.2 also on recent Windows) and can represent all characters. writeLines does not do the conversion when useBytes = TRUE (for expert use only, only useful on systems with native encoding other than UTF-8), but the connections code still behaves as if the text was in native encoding, so any attempt to convert encoding (encoding argument other than "" and "native.enc") in connections will produce incorrect results.

When reading from a text connection, the connections code re-encodes the input to native encoding (from the encoding given by the encoding argument). On systems where UTF-8 is not the native encoding, one can read text not representable in the native encoding using readLines and scan by providing them with an unopened connection that has been created with the encoding argument specifying the input encoding. readLines and scan would then instruct the connections code to convert the text to UTF-8 (instead of native encoding) and they will return it marked (aka declared, see Encoding) as "UTF-8". Finally and for expert use only, one may disable re-encoding of input by specifying "" or "native.enc" as encoding for the connection, but then mark the text as being "UTF-8" or "latin1" via the encoding argument of readLines and scan.
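A sketch of re-encoding on input: the bytes of "café" in Latin-1 are written to a temporary file and read back through an unopened connection created with encoding = "latin1". It assumes the platform's iconv recognizes the name "latin1"; the variable names are illustrative.

```r
path <- tempfile()
# "café" followed by a newline, encoded in Latin-1 (é is the byte 0xe9).
writeBin(as.raw(c(0x63, 0x61, 0x66, 0xe9, 0x0a)), path)

con <- file(path, encoding = "latin1")   # unopened connection
x <- readLines(con)                      # re-encoded while reading
close(con)
unlink(path)

# Code points of the result, independent of the native encoding.
code_points <- utf8ToInt(enc2utf8(x))
```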

Re-encoding only works for connections in text mode: reading from a connection with re-encoding specified in binary mode will read the stream of bytes, but mixing text and binary mode reads (e.g., mixing calls to readLines and readChar) is likely to lead to incorrect results.

The encodings "UCS-2LE" and "UTF-16LE" are treated specially, as they are appropriate values for Windows ‘Unicode’ text files. If the first two bytes are the Byte Order Mark 0xFEFF then these are removed as some implementations of iconv do not accept BOMs. Note that whereas most implementations will handle BOMs using encoding "UCS-2" and choose the appropriate byte order, some (including earlier versions of glibc) will not. There is a subtle distinction between "UTF-16" and "UCS-2" (see https://en.wikipedia.org/wiki/UTF-16): the use of characters in the ‘Supplementary Planes’ which need surrogate pairs is very rare so "UCS-2LE" is an appropriate first choice (as it is more widely implemented).

The encoding "UTF-8-BOM" is accepted for reading and will remove a Byte Order Mark if present (which it often is for files and webpages generated by Microsoft applications). If a BOM is required (it is not recommended) when writing it should be written explicitly, e.g. by writeChar("\ufeff", con, eos = NULL) or writeBin(as.raw(c(0xef, 0xbb, 0xbf)), binary_con).
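A brief sketch of the reading side: a file that starts with the three-byte UTF-8 BOM is read through a connection using encoding = "UTF-8-BOM", so the mark does not end up in the first string. The file contents are made up for the illustration.

```r
path <- tempfile()
# UTF-8 BOM (ef bb bf) followed by an ASCII header line.
writeBin(c(as.raw(c(0xef, 0xbb, 0xbf)), charToRaw("id,value\n")), path)

con <- file(path, encoding = "UTF-8-BOM")
header <- readLines(con, n = 1)   # BOM is stripped on input
close(con)
unlink(path)
```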

Encoding names "utf8", "mac" and "macroman" are not portable, and not supported on all current R platforms. "UTF-8" is portable and "macintosh" is the official (and most widely supported) name for ‘Mac Roman’. (R maps "utf8" to "UTF-8" internally.)

Requesting a conversion that is not supported is an error, reported when the connection is opened. Exactly what happens when the requested translation cannot be done for invalid input is in general undocumented. On output the result is likely to be that up to the error, with a warning. On input, it will most likely be all or some of the input up to the error.

It may be possible to deduce the current native encoding from Sys.getlocale("LC_CTYPE"), but not all OSes record it.

Blocking

Whether or not the connection blocks can be specified for file, url (default yes), fifo and socket connections (default not).

In blocking mode, functions using the connection do not return to the R evaluator until the read/write is complete. In non-blocking mode, operations return as soon as possible, so on input they will return with whatever input is available (possibly none) and for output they will return whether or not the write succeeded.

The function readLines behaves differently in respect of incomplete last lines in the two modes: see its help page.

Even when a connection is in blocking mode, attempts are made to ensure that it does not block the event loop and hence the operation of GUI parts of R. These do not always succeed, and the whole R process will be blocked during a DNS lookup on Unix, for example.

Most blocking operations on HTTP/FTP URLs and on sockets are subject to the timeout set by options("timeout"). Note that this is a timeout for no response, not for the whole operation. The timeout is set at the time the connection is opened (more precisely, when the last connection of that type – ‘http:’, ‘ftp:’ or socket – was opened).

Fifos

Fifos default to non-blocking. That follows S version 4 and is probably most natural, but it does have some implications. In particular, opening a non-blocking fifo connection for writing (only) will fail unless some other process is reading on the fifo.

Opening a fifo for both reading and writing (in any mode: one can only append to fifos) connects both sides of the fifo to the R process, and provides a similar facility to file().

Clipboard

file can be used with description = "clipboard" in mode "r" only. This reads the X11 primary selection (see https://specifications.freedesktop.org/clipboards-spec/clipboards-latest.txt), which can also be specified as "X11_primary" and the secondary selection as "X11_secondary". On most systems the clipboard selection (that used by ‘Copy’ from an ‘Edit’ menu) can be specified as "X11_clipboard".

When a clipboard is opened for reading, the contents are immediately copied to internal storage in the connection.

Unix users wishing to write to one of the X11 selections may be able to do so via xclip (https://github.com/astrand/xclip) or xsel (https://www.vergenet.net/~conrad/software/xsel/), for example by pipe("xclip -i", "w") for the primary selection.

macOS users can use pipe("pbpaste") and pipe("pbcopy", "w") to read from and write to that system's clipboard.

File paths

In most cases these are translated to the native encoding.

The exceptions are file and pipe on Windows, where a description which is marked as being in UTF-8 is passed to Windows as a ‘wide’ character string. This allows files with names not in the native encoding to be opened on file systems which use Unicode file names (such as NTFS but not FAT32).

‘ftp://’ URLs

Most modern browsers do not support such URLs, and ‘https://’ ones are much preferred for use in R.

It is intended that R will continue to allow such URLs for as long as libcurl does, but as they become rarer this is increasingly untested. What ‘protocols’ the version of libcurl being used supports can be seen by calling libcurlVersion().

Number of connections

There is a limit on the number of connections which can be allocated (not necessarily open) at any one time. It is good practice to close connections when finished with, but if necessary garbage-collection will be invoked to close those connections without any R object referring to them.

The default limit is 128 (including the three terminal connections, stdin, stdout and stderr). This can be increased when R is started using the option --max-connections=N, where the maximum allowed value is 4096.

However, many types of connections use other resources which are themselves limited. Notably on Unix, ‘file descriptors’ which by default are per-process limited: this limits the number of connections using files, pipes and fifos. (The default limit is 256 on macOS (and Solaris) but 1024 on Linux. The limit can be raised in the shell used to launch R, for example by ulimit -n.) File descriptors are used for many other purposes including dynamically loading DSO/DLLs (see dyn.load) which may use up to 60% of the limit.

Windows has a default limit of 512 open C file streams: these are used by at least file, gzfile, bzfile, xzfile, pipe, url and unz connections applied to files (rather than URLs).

Package parallel's makeCluster uses socket connections to communicate with the worker processes, one per worker.

Note

R's connections are modelled on those in S version 4 (see Chambers, 1998). However R goes well beyond the S model, for example in output text connections and URL, compressed and socket connections. The default open mode in R is "r" except for socket connections. This differs from S, where it is the equivalent of "r+", known as "*".

On (historic) platforms where vsnprintf does not return the needed length of output there is a 100,000 byte output limit on the length of a line for text output on fifo, gzfile, bzfile and xzfile connections: longer lines will be truncated with a warning.

References

Chambers, J. M. (1998) Programming with Data. A Guide to the S Language. Springer.

Ripley, B. D. (2001). “Connections.” R News, 1(1), 16–7. https://www.r-project.org/doc/Rnews/Rnews_2001-1.pdf.

See Also

textConnection, seek, showConnections, pushBack.

Functions making direct use of connections are (text-mode) readLines, writeLines, cat, sink, scan, parse, read.dcf, dput, dump and (binary-mode) readBin, readChar, writeBin, writeChar, load and save.

capabilities to see if fifo connections are supported by this build of R.

gzcon to wrap gzip (de)compression around a connection.

options HTTPUserAgent, internet.info and timeout are used by some of the methods for URL connections.

memCompress for more ways to (de)compress and references on data compression.

extSoftVersion for the versions of the zlib (for gzfile), bzip2 and xz libraries in use.

To flush output to the Windows and macOS consoles, see flush.console.

Examples

zzfil <- tempfile(fileext = ".data")
zz <- file(zzfil, "w")  # open an output file connection
cat("TITLE extra line", "2 3 5 7", "", "11 13 17", file = zz, sep = "\n")
cat("One more line\n", file = zz)
close(zz)
readLines(zzfil)
unlink(zzfil)

zzfil <- tempfile(fileext = ".gz")
zz <- gzfile(zzfil, "w")  # compressed file
cat("TITLE extra line", "2 3 5 7", "", "11 13 17", file = zz, sep = "\n")
close(zz)
readLines(zz <- gzfile(zzfil))
close(zz)
unlink(zzfil)
zz # an invalid connection

zzfil <- tempfile(fileext = ".bz2")
zz <- bzfile(zzfil, "w")  # bzip2-ed file
cat("TITLE extra line", "2 3 5 7", "", "11 13 17", file = zz, sep = "\n")
close(zz)
zz # print() method: invalid connection
print(readLines(zz <- bzfile(zzfil)))
close(zz)
unlink(zzfil)

## An example of a file open for reading and writing
Tpath <- tempfile("test")
Tfile <- file(Tpath, "w+")
c(isOpen(Tfile, "r"), isOpen(Tfile, "w")) # both TRUE
cat("abc\ndef\n", file = Tfile)
readLines(Tfile)
seek(Tfile, 0, rw = "r") # reset to beginning
readLines(Tfile)
cat("ghi\n", file = Tfile)
readLines(Tfile)
Tfile # -> print() :  "valid" connection
close(Tfile)
Tfile # -> print() :  "invalid" connection
unlink(Tpath)

## We can do the same thing with an anonymous file.
Tfile <- file()
cat("abc\ndef\n", file = Tfile)
readLines(Tfile)
close(Tfile)

## Not run: ## fifo example -- may hang even with OS support for fifos
if(capabilities("fifo")) {
  zzfil <- tempfile(fileext = "-fifo")
  zz <- fifo(zzfil, "w+")
  writeLines("abc", zz)
  print(readLines(zz))
  close(zz)
  unlink(zzfil)
}
## End(Not run)

## Unix examples of use of pipes

# read listing of current directory
readLines(pipe("ls -1"))

# remove trailing commas.  Suppose
## Not run:
% cat data2_
450, 390, 467, 654, 30, 542, 334, 432, 421,
357, 497, 493, 550, 549, 467, 575, 578, 342,
446, 547, 534, 495, 979, 479
## End(Not run)
# Then read this by
scan(pipe("sed -e s/,$// data2_"), sep = ",")

# convert decimal point to comma in output: see also write.table
# both R strings and (probably) the shell need \ doubled
zzfil <- tempfile("outfile")
zz <- pipe(paste("sed s/\\\\./,/ >", zzfil), "w")
cat(format(round(stats::rnorm(48), 4)), fill = 70, file = zz)
close(zz)
file.show(zzfil, delete.file = TRUE)

## Not run:
## example for a machine running a finger daemon
con <- socketConnection(port = 79, blocking = TRUE)
writeLines(paste0(system("whoami", intern = TRUE), "\r"), con)
gsub(" *$", "", readLines(con))
close(con)
## End(Not run)

## Not run:
## Two R processes communicating via non-blocking sockets
# R process 1
con1 <- socketConnection(port = 6011, server = TRUE)
writeLines(LETTERS, con1)
close(con1)

# R process 2
con2 <- socketConnection(Sys.info()["nodename"], port = 6011)
# as non-blocking, may need to loop for input
readLines(con2)
while(isIncomplete(con2)) {
   Sys.sleep(1)
   z <- readLines(con2)
   if(length(z)) print(z)
}
close(con2)

## examples of use of encodings
# write a file in UTF-8
cat(x, file = (con <- file("foo", "w", encoding = "UTF-8"))); close(con)
# read a 'Windows Unicode' file
A <- read.table(con <- file("students", encoding = "UCS-2LE")); close(con)
## End(Not run)

Built-in Constants

Description

Constants built into R.

Usage

LETTERS
letters
month.abb
month.name
pi

Details

R has a small number of built-in constants.

The following constants are available:

  • LETTERS: the 26 upper-case letters of the Roman alphabet;

  • letters: the 26 lower-case letters of the Roman alphabet;

  • month.abb: the three-letter abbreviations for the English month names;

  • month.name: the English names for the months of the year;

  • pi: the ratio of the circumference of a circle to its diameter.

These are implemented as variables in the base namespace taking appropriate values.
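Because the constants are ordinary variables, they can be indexed like any vector and can also be masked by an assignment in the workspace; base::pi still reaches the original. A short sketch, with illustrative variable names:

```r
first_three <- letters[1:3]
last_three  <- LETTERS[24:26]

# month.abb holds the first three letters of each entry of month.name.
abbrev_match <- all(substr(month.name, 1, 3) == month.abb)

# Masking: a workspace variable named pi hides, but does not change,
# the constant in the base namespace.
pi <- 3
masked_value   <- pi
original_value <- base::pi
rm(pi)
```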

References

Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988) The New S Language. Wadsworth & Brooks/Cole.

See Also

data, DateTimeClasses.

Quotes for the parsing of character constants, NumericConstants for numeric constants.

Examples

## John Machin (ca 1706) computed pi to over 100 decimal places
## using the Taylor series expansion of the second term of
pi - 4*(4*atan(1/5) - atan(1/239))

## months in English
month.name
## months in your current locale
format(ISOdate(2000, 1:12, 1), "%B")
format(ISOdate(2000, 1:12, 1), "%b")

R Project Contributors

Description

The R Who-is-who, describing who made significant contributions to the development of R.

Usage

contributors()

Control Flow

Description

These are the basic control-flow constructs of the R language. They function in much the same way as control statements in any Algol-like language. They are all reserved words.

Usage

if(cond) expr
if(cond) cons.expr  else  alt.expr

for(var in seq) expr
while(cond) expr
repeat expr
break
next

x %||% y

Arguments

cond

A length-one logical vector that is not NA. Other types are coerced to logical if possible, ignoring any class. (Conditions of length greater than one are an error.)

var

A syntactical name for a variable.

seq

An expression evaluating to a vector (including a list and an expression) or to a pairlist or NULL. A factor value will be coerced to a character vector. This can be a long vector.

expr, cons.expr, alt.expr, x, y

An expression in a formal sense. This is either a simple expression or a so-called compound expression, usually of the form { expr1 ; expr2 }.

Details

break breaks out of a for, while or repeat loop; control is transferred to the first statement outside the inner-most loop. next halts the processing of the current iteration and advances the looping index. Both break and next apply only to the innermost of nested loops.
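The scoping of break and next to the innermost loop can be seen in a small nested-loop sketch; the variable names are illustrative.

```r
visited <- character()
for (i in 1:3) {
  for (j in 1:3) {
    if (j == 2) next    # skip this inner iteration only
    if (i == 2) break   # leave the inner loop; the outer loop continues
    visited <- c(visited, paste0(i, "-", j))
  }
}
# visited is now "1-1" "1-3" "3-1" "3-3"
```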

Note that it is a common mistake to forget to put braces ({ .. }) around your statements, e.g., after if(..) or for(....). In particular, you should not have a newline between } and else to avoid a syntax error in entering an if ... else construct at the keyboard or via source. For that reason, one (somewhat extreme) attitude of defensive programming is to always use braces, e.g., for if clauses.

The seq in a for loop is evaluated at the start of the loop; changing it subsequently does not affect the loop. If seq has length zero the body of the loop is skipped. Otherwise the variable var is assigned in turn the value of each element of seq. You can assign to var within the body of the loop, but this will not affect the next iteration. When the loop terminates, var remains as a variable containing its latest value.
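These rules can be checked with a short sketch in which both the sequence and the loop variable are modified inside the body; the names s and seen are illustrative.

```r
s <- 1:3
seen <- integer()
for (i in s) {
  s <- 10:20            # changing the sequence does not affect the loop
  seen <- c(seen, i)
  i <- 100L             # nor does assigning to the loop variable
}
# seen is 1 2 3; i keeps the value assigned last, here 100
```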

The null coalescing operator %||% is a simple 1-line function: x %||% y is an idiomatic way to call

    if (is.null(x)) y else x
    # or equivalently, of course,
    if(!is.null(x)) x else y

Inspired by Ruby, it was first proposed by Hadley Wickham.
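A brief usage sketch follows. The guard at the top is an assumption added so that the example also runs on versions of R whose base package does not yet provide %||%; it defines the operator exactly as described above.

```r
# Assumed fallback for R versions without %||% in base.
if (!exists("%||%", envir = baseenv(), inherits = FALSE)) {
  `%||%` <- function(x, y) if (is.null(x)) y else x
}

a <- NULL %||% "default"   # left side is NULL, so the right side is used
b <- 0 %||% "default"      # 0 is not NULL, so it is kept
# The right-hand side is only evaluated when needed:
d <- "set" %||% stop("never evaluated")
```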

Value

if returns the value of the expression evaluated, or NULL invisibly if none was (which may happen if there is no else).

for, while and repeat return NULL invisibly. for sets var to the last used element of seq, or to NULL if it was of length zero.

break and next do not return a value as they transfer control within the loop.
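A short sketch of these return values follows; the variable names (r_if, r_else, r_for) are made up for illustration.

```r
# 'if' without 'else' yields NULL (invisibly) when the condition is FALSE.
r_if <- if (FALSE) "taken"
is.null(r_if)             # TRUE

# Otherwise 'if' yields the value of the branch that was evaluated.
r_else <- if (2 > 1) "yes" else "no"
r_else                    # "yes"

# Loops return NULL, so results are usually collected in a variable.
r_for <- for (k in 1:3) k^2
is.null(r_for)            # TRUE
k                         # 3, the last used element of 'seq'
```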

References

Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988) The New S Language. Wadsworth & Brooks/Cole.

See Also

Syntax for the basic R syntax and operators, Paren for parentheses and braces.

ifelse, switch for other ways to control flow.

Examples

for(i in 1:5) print(1:i)
for(n in c(2,5,10,20,50)) {
   x <- stats::rnorm(n)
   cat(n, ": ", sum(x^2), "\n", sep = "")
}
f <- factor(sample(letters[1:5], 10, replace = TRUE))
for(i in unique(f)) print(i)

res <- {}
res %||% "alternative result"
x <- head(x) %||% stop("parsed, but *not* evaluated..")

res <- if(sum(x) > 7.5) mean(x) # may be NULL
res %||% "sum(x) <= 7.5"

Copyrights of Files Used to Build R

Description

R is released under the ‘GNU Public License’: see license for details. The license describes your right to use R. Copyright is concerned with ownership of intellectual rights, and some of the software used has conditions that the copyright must be explicitly stated: see the ‘Details’ section. We are grateful to these people and other contributors (see contributors) for the ability to use their work.

Details

The file ‘R_HOME/COPYRIGHTS’ lists the copyrights in full detail.


Matrix Cross-Product

Description

Given matrices x and y as arguments, return a matrix cross-product. This is formally equivalent to (but faster than) the call t(x) %*% y (crossprod) or x %*% t(y) (tcrossprod).

These are generic functions since R 4.4.0: methods can be written individually or via the matOps group generic function; it dispatches to S3 and S4 methods.

Usage

crossprod(x, y = NULL, ...)
tcrossprod(x, y = NULL, ...)

Arguments

x, y

numeric or complex matrices (or vectors): y = NULL is taken to be the same matrix as x. Vectors are promoted to single-column or single-row matrices, depending on the context.

...

potential further arguments for methods.

Value

A double or complex matrix, with appropriate dimnames taken from x and y.

Note

When x or y are not matrices, they are treated as column or row matrices, but their names are usually not promoted to dimnames. Hence, currently, the last example has empty dimnames.

In the same situation, these matrix products (also %*%) are more flexible in promotion of vectors to row or column matrices, such that more cases are allowed, since R 3.2.0.

The propagation of NaN/Inf values, precision, and performance of matrix products can be controlled by options("matprod").
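As a minimal sketch of the equivalences stated in the Description, the following compares crossprod and tcrossprod against the explicit products. The matrices x and y and the vector v are arbitrary illustrative values.

```r
x <- matrix(1:6, nrow = 3)    # 3 x 2
y <- matrix(1:12, nrow = 3)   # 3 x 4

# crossprod(x, y) corresponds to t(x) %*% y
all.equal(crossprod(x, y), t(x) %*% y)     # TRUE
# tcrossprod(x) corresponds to x %*% t(x)
all.equal(tcrossprod(x), x %*% t(x))       # TRUE

# A plain vector is treated as a column matrix by crossprod(),
# so the result is a 1 x 1 matrix holding the sum of squares.
v <- c(1, 2, 3)
dim(crossprod(v))     # 1 1
drop(crossprod(v))    # 14
```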

References

Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988) The New S Language. Wadsworth & Brooks/Cole.

See Also

%*% and outer product %o%.

Examples

(z <- crossprod(1:4))    # = sum(1 + 2^2 + 3^2 + 4^2)
drop(z)                  # scalar
x <- 1:4; names(x) <- letters[1:4]; x
tcrossprod(as.matrix(x)) # is
identical(tcrossprod(as.matrix(x)),
          crossprod(t(x)))
tcrossprod(x)            # no dimnames

m <- matrix(1:6, 2,3) ; v <- 1:3; v2 <- 2:1
stopifnot(identical(tcrossprod(v, m), v %*% t(m)),
          identical(tcrossprod(v, m), crossprod(v, t(m))),
          identical(crossprod(m, v2), t(m) %*% v2))

Report Information on C Stack Size and Usage

Description

Report information on the C stack size and usage (if available).

Usage

Cstack_info()

Details

On most platforms, C stack information is recorded when R is initialized and used for stack-checking. If this information is unavailable, the size will be returned as NA, and stack-checking is not performed.

The information on the stack base address is thought to be accurate on Windows, Linux (using glibc), macOS and FreeBSD but a heuristic is used on other platforms. Because this might be slightly inaccurate, the current usage could be estimated as negative. (The heuristic is not used on embedded uses of R on platforms where the stack base information is not thought to be accurate.)

The ‘evaluation depth’ is the number of nested R expressions currently under evaluation: this has a limit controlled by options("expressions").

Value

An integer vector. This has named elements

size

The size of the stack (in bytes), or NA if unknown.

current

The estimated current usage (in bytes), possibly NA.

direction

1 (stack grows down, the usual case) or -1 (stack grows up).

eval_depth

The current evaluation depth (including two calls for the call to Cstack_info).

Examples

Cstack_info()

Cumulative Sums, Products, and Extremes

Description

Returns a vector whose elements are the cumulative sums, products, minima or maxima of the elements of the argument.

Usage

cumsum(x)
cumprod(x)
cummax(x)
cummin(x)

Arguments

x

a numeric or complex (not cummin or cummax) object, or an object that can be coerced to one of these.

Details

These are generic functions: methods can be defined for them individually or via the Math group generic.

Value

A vector of the same length and type as x (after coercion), except that cumprod returns a numeric vector for integer input (for consistency with *). Names are preserved.

An NA value in x causes the corresponding and following elements of the return value to be NA, as does integer overflow in cumsum (with a warning). In the complex case with NAs, these NA elements may have finite real or imaginary parts, notably for cumsum(), fulfilling the identity Im(cumsum(x)) ≡ cumsum(Im(x)).
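The points about NA propagation, names and result type can be checked with a small sketch; the input vectors are illustrative values only.

```r
x <- c(a = 1, b = 2, c = NA, d = 4)
cs <- cumsum(x)
cs              # a = 1, b = 3, then NA for 'c' and everything after it
names(cs)       # "a" "b" "c" "d": names are preserved

typeof(cumsum(1:5))    # "integer"
typeof(cumprod(1:5))   # "double"

cummax(c(1, 3, 2, 5, 4))   # 1 3 3 5 5
cummin(c(3, 1, 2, 0, 4))   # 3 1 1 0 0
```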

S4 methods

cumsum and cumprod are S4 generic functions: methods can be defined for them individually or via the Math group generic. cummax and cummin are individually S4 generic functions.

References

Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988) The New S Language. Wadsworth & Brooks/Cole. (cumsum only.)

Examples

cumsum(1:10)
cumprod(1:10)
cummin(c(3:1, 2:0, 4:2))
cummax(c(3:1, 2:0, 4:2))

Retrieve Headers from URLs

Description

Retrieve the headers for a URL for a supported protocol such as ‘http://’, ‘ftp://’, ‘https://’ and ‘ftps://’.

Usage

curlGetHeaders(url, redirect = TRUE, verify = TRUE,
               timeout = 0L, TLS = "")

Arguments

url

character string specifying the URL.

redirect

logical: should redirections be followed?

verify

logical: should certificates be verified as valid and applying to that host?

timeout

integer: the maximum time in seconds the request is allowed to take. Non-positive and invalid values are ignored (including the default). (Added in R 4.1.0.)

TLS

character: the minimum version of the TLS protocol to be used for ‘https://’ URLs: the default ("") is no restriction beyond that of the underlying libcurl (usually 1.0). Other valid values are "1.1", "1.2" (both for libcurl 7.34.0 and later) and "1.3" (7.52.0 and later), if supported by the underlying version of libcurl and the SSL library it uses.

Details

This reports what curl -I -L or curl -I would report. For a ‘ftp://’ URL the ‘headers’ are a record of the conversation between client and server before data transfer.

Only 500 header lines will be reported: there is a limit of 20 redirections so this should suffice (and even 20 would indicate problems).

If argument timeout is not set to a positive integer this uses getOption("timeout") which defaults to 60 seconds. As the request cannot be interrupted you may want to consider a shorter value.

To see all the details of the interaction with the server(s) set options(internet.info = 1).

HTTP[S] servers are allowed to refuse requests to read the headers and some do: this will result in a status of 405.

For possible issues with secure URLs (especially on Windows) see download.file.

There is a security risk in not verifying certificates, but as only the headers are captured it is slight. Usually looking at the URL in a browser will reveal what the problem is (and it may well be machine-specific).

Value

A character vector with integer attribute "status" (the last-received ‘status’ code). If redirection occurs this will include the headers for all the URLs visited.

For the interpretation of ‘status’ codes see https://en.wikipedia.org/wiki/List_of_HTTP_status_codes and https://en.wikipedia.org/wiki/List_of_FTP_server_return_codes. A successful FTP connection will usually have status 250, 257 or 350.

See Also

capabilities("libcurl") to see if this is supported. libcurlVersion for the version of libcurl in use.

options HTTPUserAgent and timeout are used.

Examples

## needs Internet access, results vary
curlGetHeaders("http://bugs.r-project.org")   ## this redirects to https://
## 2023-04: replaces slow and unreliable https://httpbin.org/status/404
curlGetHeaders("https://developer.R-project.org/inet-tests/not-found")  ## returns status

Convert Numeric to Factor

Description

cut divides the range of x into intervals and codes the values in x according to which interval they fall. The leftmost interval corresponds to level one, the next leftmost to level two and so on.

Usage

cut(x, ...)

## Default S3 method:
cut(x, breaks, labels = NULL,
    include.lowest = FALSE, right = TRUE, dig.lab = 3,
    ordered_result = FALSE, ...)

Arguments

x

a numeric vector which is to be converted to a factor by cutting.

breaks

either a numeric vector of two or more unique cut points or a single number (greater than or equal to 2) giving the number of intervals into which x is to be cut.

labels

labels for the levels of the resulting category. By default, labels are constructed using "(a,b]" interval notation. If labels = FALSE, simple integer codes are returned instead of a factor.

include.lowest

logical, indicating if an ‘x[i]’ equal to the lowest (or highest, for right = FALSE) ‘breaks’ value should be included.

right

logical, indicating if the intervals should be closed on the right (and open on the left) or vice versa.

dig.lab

integer which is used when labels are not given. It determines the number of digits used in formatting the break numbers.

ordered_result

logical: should the result be an ordered factor?

...

further arguments passed to or from other methods.

Details

When breaks is specified as a single number, the range of the data is divided into breaks pieces of equal length, and then the outer limits are moved away by 0.1% of the range to ensure that the extreme values both fall within the break intervals. (If x is a constant vector, equal-length intervals are created, one of which includes the single value.)

If a labels parameter is specified, its values are used to name the factor levels. If none is specified, the factor level labels are constructed as "(b1, b2]", "(b2, b3]" etc. for right = TRUE and as "[b1, b2)", ... if right = FALSE. In this case, dig.lab indicates the minimum number of digits that should be used in formatting the numbers b1, b2, .... A larger value (up to 12) will be used if needed to distinguish between any pair of endpoints: if this fails labels such as "Range3" will be used. Formatting is done by formatC.

The default method will sort a numeric vector of breaks, but other methods are not required to and labels will correspond to the intervals after sorting.

As from R 3.2.0, getOption("OutDec") is consulted when labels are constructed for labels = NULL.

Value

A factor is returned, unless labels = FALSE which results in an integer vector of level codes.

Values which fall outside the range of breaks are coded as NA, as are NaN and NA values.
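How right, include.lowest and labels = FALSE interact is easiest to see on a tiny input. This is a sketch with made-up values x and b, chosen so that some points sit exactly on a break and some fall outside the range.

```r
x <- c(0, 1, 5, 10, 11)
b <- c(0, 5, 10)

f <- cut(x, breaks = b)          # intervals (0,5] and (5,10]
levels(f)                        # "(0,5]" "(5,10]"
as.integer(f)                    # NA 1 1 2 NA

# include.lowest = TRUE closes the first interval on the left
as.integer(cut(x, breaks = b, include.lowest = TRUE))   # 1 1 1 2 NA

# right = FALSE gives [0,5) and [5,10)
as.integer(cut(x, breaks = b, right = FALSE))           # 1 1 2 NA NA

# labels = FALSE returns the integer codes directly
cut(x, breaks = b, labels = FALSE)                      # NA 1 1 2 NA
```

The value 0 is NA by default because the first interval is open on the left, and 11 is always NA because it lies beyond the last break.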

Note

Instead of table(cut(x, br)), hist(x, br, plot = FALSE) is more efficient and less memory hungry. Instead of cut(*, labels = FALSE), findInterval() is more efficient.

References

Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988) The New S Language. Wadsworth & Brooks/Cole.

See Also

split for splitting a variable according to a group factor; factor, tabulate, table, findInterval.

quantile for ways of choosing breaks of roughly equal content (rather than length).

.bincode for a bare-bones version.

Examples

Z <- stats::rnorm(10000)
table(cut(Z, breaks = -6:6))
sum(table(cut(Z, breaks = -6:6, labels = FALSE)))
sum(graphics::hist(Z, breaks = -6:6, plot = FALSE)$counts)

cut(rep(1,5), 4) #-- dummy
tx0 <- c(9, 4, 6, 5, 3, 10, 5, 3, 5)
x <- rep(0:8, tx0)
stopifnot(table(x) == tx0)

table( cut(x, breaks = 8))
table( cut(x, breaks = 3*(-2:5)))
table( cut(x, breaks = 3*(-2:5), right = FALSE))

##--- some values OUTSIDE the breaks :
table(cx  <- cut(x, breaks = 2*(0:4)))
table(cxl <- cut(x, breaks = 2*(0:4), right = FALSE))
which(is.na(cx));  x[is.na(cx)]  #-- the first 9  values  0
which(is.na(cxl)); x[is.na(cxl)] #-- the last  5  values  8

## Label construction:
y <- stats::rnorm(100)
table(cut(y, breaks = pi/3*(-3:3)))
table(cut(y, breaks = pi/3*(-3:3), dig.lab = 4))

table(cut(y, breaks = 1*(-3:3), dig.lab = 4))
# extra digits don't "harm" here
table(cut(y, breaks = 1*(-3:3), right = FALSE))
#- the same, since no exact INT!

## sometimes the default dig.lab is not enough to avoid confusion:
aaa <- c(1,2,3,4,5,2,3,4,5,6,7)
cut(aaa, 3)
cut(aaa, 3, dig.lab = 4, ordered_result = TRUE)

## one way to extract the breakpoints
labs <- levels(cut(aaa, 3))
cbind(lower = as.numeric( sub("\\((.+),.*", "\\1", labs) ),
      upper = as.numeric( sub("[^,]*,([^]]*)\\]", "\\1", labs) ))

Convert a Date or Date-Time Object to a Factor

Description

Method for cut applied to date-time objects.

Usage

## S3 method for class 'POSIXt'
cut(x, breaks, labels = NULL, start.on.monday = TRUE,
    right = FALSE, ...)

## S3 method for class 'Date'
cut(x, breaks, labels = NULL, start.on.monday = TRUE,
    right = FALSE, ...)

Arguments

x

an object inheriting from class "POSIXt" or "Date".

breaks

a vector of cut points or number giving the number of intervals which x is to be cut into or an interval specification, one of "sec", "min", "hour", "day", "DSTday", "week", "month", "quarter" or "year", optionally preceded by an integer and a space, or followed by "s". (For "Date" objects only interval specifications using "day", "week", "month", "quarter" and "year" are allowed.)

labels

labels for the levels of the resulting category. By default, labels are constructed from the left-hand end of the intervals (which are included for the default value of right). If labels = FALSE, simple integer codes are returned instead of a factor.

start.on.monday

logical. If breaks = "weeks", should the week start on Mondays or Sundays?

right, ...

arguments to be passed to or from other methods.

Details

Note that the default for right differs from the default method. Using include.lowest = TRUE will include both ends of the range of dates.

Using breaks = "quarter" will create intervals of 3 calendar months, with the intervals beginning on January 1, April 1, July 1 or October 1 (based upon min(x)) as appropriate.

A vector of breaks will be sorted before use: labels should correspond to the sorted vector.

Value

A factor is returned, unless labels = FALSE which returns the integer level codes.

Values which fall outside the range of breaks are coded as NA, as are NA values.
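A brief sketch of interval specifications and explicit break points for "Date" input follows. The dates in d and br are arbitrary illustrative values.

```r
d <- as.Date(c("2024-01-15", "2024-02-03", "2024-02-28", "2024-04-10"))

# Labels are taken from the left-hand end of each interval.
by_month <- cut(d, breaks = "month")
as.character(by_month)
# "2024-01-01" "2024-02-01" "2024-02-01" "2024-04-01"

by_quarter <- cut(d, breaks = "quarter")
as.character(by_quarter)
# "2024-01-01" "2024-01-01" "2024-01-01" "2024-04-01"

# Explicit break points: right = FALSE is the default for this method,
# so each interval includes its left end.
br <- as.Date(c("2024-01-01", "2024-02-03", "2024-05-01"))
cut(d, breaks = br, labels = FALSE)   # 1 2 2 2
```

Note that 2024-02-03 lands in the second interval because it coincides with a break and intervals are closed on the left here.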

See Also

seq.POSIXt, seq.Date, cut

Examples

## random dates in a 10-week period
cut(ISOdate(2001, 1, 1) + 70*86400*stats::runif(100), "weeks")
cut(as.Date("2001/1/1") + 70*stats::runif(100), "weeks")

# The standards all have midnight as the start of the day, but some
# people incorrectly interpret it at the end of the previous day ...
tm <- seq(as.POSIXct("2012-06-01 06:00"), by = "6 hours", length.out = 24)
aggregate(1:24, list(day = cut(tm, "days")), mean)
# and a version with midnight included in the previous day:
aggregate(1:24, list(day = cut(tm, "days", right = TRUE)), mean)

Object Classes

Description

Determine the class of an arbitrary R object.

Usage

data.class(x)

Arguments

x

an R object.

Value

character string giving the class of x.

The class is the (first element) of the class attribute if this is non-NULL, or inferred from the object's dim attribute if this is non-NULL, or mode(x).

Simply speaking, data.class(x) returns what is typically useful for method dispatching. (Or, what the basic creator functions already and maybe eventually all will attach as a class attribute.)

Note

For compatibility reasons, there is one exception to the rule above: When x is integer, the result of data.class(x) is "numeric" even when x is classed.

See Also

class

Examples

x <- LETTERS
data.class(factor(x))                 # has a class attribute
data.class(matrix(x, ncol = 13))      # has a dim attribute
data.class(list(x))                   # the same as mode(x)
data.class(x)                         # the same as mode(x)

stopifnot(data.class(1:2) == "numeric") # compatibility "rule"

Data Frames

Description

The function data.frame() creates data frames, tightly coupled collections of variables which share many of the properties of matrices and of lists, used as the fundamental data structure by most of R's modeling software.

Usage

data.frame(..., row.names = NULL, check.rows = FALSE,
           check.names = TRUE, fix.empty.names = TRUE,
           stringsAsFactors = FALSE)

Arguments

...

these arguments are of either the form value or tag = value. Component names are created based on the tag (if present) or the deparsed argument itself.

row.names

NULL or a single integer or character string specifying a column to be used as row names, or a character or integer vector giving the row names for the data frame.

check.rows

if TRUE then the rows are checked for consistency of length and names.

check.names

logical. If TRUE then the names of the variables in the data frame are checked to ensure that they are syntactically valid variable names and are not duplicated. If necessary they are adjusted (by make.names) so that they are.

fix.empty.names

logical indicating if arguments which are “unnamed” (in the sense of not being formally called as someName = arg) get an automatically constructed name or rather name "". Needs to be set to FALSE even when check.names is false if "" names should be kept.

stringsAsFactors

logical: should character vectors be converted to factors? The ‘factory-fresh’ default has been TRUE previously but has been changed to FALSE for R 4.0.0.

Details

A data frame is a list of variables of the same number of rows with unique row names, given class "data.frame". If no variables are included, the row names determine the number of rows.

The column names should be non-empty, and attempts to use empty names will have unsupported results. Duplicate column names are allowed, but you need to use check.names = FALSE for data.frame to generate such a data frame. However, not all operations on data frames will preserve duplicated column names: for example matrix-like subsetting will force column names in the result to be unique.

data.frame converts each of its arguments to a data frame by calling as.data.frame(optional = TRUE). As that is a generic function, methods can be written to change the behaviour of arguments according to their classes: R comes with many such methods. Character variables passed to data.frame are converted to factor columns if not protected by I and argument stringsAsFactors is true. If a list or data frame or matrix is passed to data.frame it is as if each component or column had been passed as a separate argument (except for matrices protected by I).

Objects passed to data.frame should have the same number of rows, but atomic vectors (see is.vector), factors and character vectors protected by I will be recycled a whole number of times if necessary (including as elements of list arguments).

If row names are not supplied in the call to data.frame, the row names are taken from the first component that has suitable names, for example a named vector or a matrix with rownames or a data frame. (If that component is subsequently recycled, the names are discarded with a warning.) If row.names was supplied as NULL or no suitable component was found the row names are the integer sequence starting at one (and such row names are considered to be ‘automatic’, and not preserved by as.matrix).

If row names are supplied of length one and the data frame has a single row, the row.names is taken to specify the row names and not a column (by name or number).

Names are removed from vector inputs not protected by I.
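The recycling, row-name and name-checking rules above can be seen in a short sketch. The column names and values (id, grp, flag, and the backquoted name `a b`) are illustrative only.

```r
# Shorter atomic vectors are recycled a whole number of times.
df <- data.frame(id = 1:4, grp = c("a", "b"), flag = TRUE,
                 stringsAsFactors = FALSE)
dim(df)       # 4 3
df$grp        # "a" "b" "a" "b"

# Row names come from the first component with suitable names ...
named <- data.frame(x = c(a = 1, b = 2, c = 3))
row.names(named)   # "a" "b" "c"
# ... while the names are removed from the column itself.
names(named$x)     # NULL

# Without such a component the row names are 'automatic'.
row.names(df)      # "1" "2" "3" "4"

# check.names = TRUE (the default) makes names syntactically valid.
names(data.frame(`a b` = 1))                        # "a.b"
names(data.frame(`a b` = 1, check.names = FALSE))   # "a b"
```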

Value

A data frame, a matrix-like structure whose columns may be of differing types (numeric, logical, factor and character and so on).

How the names of the data frame are created is complex, and the rest of this paragraph is only the basic story. If the arguments are all named and simple objects (not lists, matrices or data frames) then the argument names give the column names. For an unnamed simple argument, a deparsed version of the argument is used as the name (with an enclosing I(...) removed). For a named matrix/list/data frame argument with more than one named column, the names of the columns are the name of the argument followed by a dot and the column name inside the argument: if the argument is unnamed, the argument's column names are used. For a named or unnamed matrix/list/data frame argument that contains a single column, the column name in the result is the column name in the argument. Finally, the names are adjusted to be unique and syntactically valid unless check.names = FALSE.

Note

In versions of R prior to 2.4.0 row.names had to be character: to ensure compatibility with such versions of R, supply a character vector as the row.names argument.

References

Chambers, J. M. (1992) Data for models. Chapter 3 of Statistical Models in S eds J. M. Chambers and T. J. Hastie, Wadsworth & Brooks/Cole.

See Also

I, plot.data.frame, print.data.frame, row.names, names (for the column names), [.data.frame for subsetting methods and I(matrix(..)) examples; Math.data.frame etc, about Group methods for data.frames; read.table, make.names, list2DF for creating data frames from lists of variables.

Examples

L3 <- LETTERS[1:3]
char <- sample(L3, 10, replace = TRUE)
(d <- data.frame(x = 1, y = 1:10, char = char))
## The "same" with automatic column names:
data.frame(1, 1:10, sample(L3, 10, replace = TRUE))

is.data.frame(d)

## enable automatic conversion of character arguments to factor columns:
(dd <- data.frame(d, fac = letters[1:10], stringsAsFactors = TRUE))
rbind(class = sapply(dd, class), mode = sapply(dd, mode))

stopifnot(1:10 == row.names(d))  # {coercion}

(d0  <- d[, FALSE])   # data frame with 0 columns and 10 rows
(d.0 <- d[FALSE, ])   # <0 rows> data frame  (3 named cols)
(d00 <- d0[FALSE, ])  # data frame with 0 columns and 0 rows

Convert a Data Frame to a Numeric Matrix

Description

Return the matrix obtained by converting all the variables in a data frame to numeric mode and then binding them together as the columns of a matrix. Factors and ordered factors are replaced by their internal codes.

Usage

data.matrix(frame, rownames.force=NA)

Arguments

frame

a data frame whose components are logical vectors, factors or numeric or character vectors.

rownames.force

logical indicating if the resulting matrix should have character (rather than NULL) rownames. The default, NA, uses NULL rownames if the data frame has ‘automatic’ row.names or for a zero-row data frame.

Details

Logical and factor columns are converted to integers. Character columns are first converted to factors and then to integers. Any other column which is not numeric (according to is.numeric) is converted by as.numeric or, for S4 objects, as(, "numeric"). If all columns are integer (after conversion) the result is an integer matrix, otherwise a numeric (double) matrix.
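The conversion rules can be illustrated with one column of each kind. The data frame DF and its column names below are invented for this sketch.

```r
DF <- data.frame(
  num = c(1.5, 2.5),
  fac = factor(c("lo", "hi"), levels = c("lo", "hi")),
  lgl = c(TRUE, FALSE),
  chr = c("b", "a"),
  stringsAsFactors = FALSE
)
m <- data.matrix(DF)
m
typeof(m)      # "double", because 'num' is double
m[, "fac"]     # 1 2  (internal factor codes)
m[, "lgl"]     # 1 0
m[, "chr"]     # 2 1  (codes after conversion to a factor with sorted levels)

# If every column is integer after conversion, so is the matrix.
typeof(data.matrix(DF[c("fac", "lgl")]))   # "integer"
```

The character column shows why the codes can be surprising: "b" maps to 2 and "a" to 1 because the intermediate factor sorts its levels.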

Value

If frame inherits from class "data.frame", an integer or numeric matrix of the same dimensions as frame, with dimnames taken from the row.names (or NULL, depending on rownames.force) and names.

Otherwise, the result of as.matrix.

Note

The default behaviour for data frames differs from R < 2.5.0 which always gave the result character rownames.

References

Chambers, J. M. (1992) Data for models. Chapter 3 of Statistical Models in S eds J. M. Chambers and T. J. Hastie, Wadsworth & Brooks/Cole.

See Also

as.matrix, data.frame, matrix.

Examples

DF <- data.frame(a = 1:3, b = letters[10:12],
                 c = seq(as.Date("2004-01-01"), by = "week", length.out = 3),
                 stringsAsFactors = TRUE)
data.matrix(DF[1:2])
data.matrix(DF)

System Date and Time

Description

Returns a character string of the current system date and time.

Usage

date()

Value

The string has the form "Fri Aug 20 11:11:00 1999", i.e., length 24, since it relies on POSIX's ctime ensuring the above fixed format. Timezone and Daylight Saving Time are taken account of, but not indicated in the result.

The day and month abbreviations are always in English, irrespective of locale.

References

Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988) The New S Language. Wadsworth & Brooks/Cole.

See Also

Sys.Date and Sys.time; Date and DateTimeClasses for objects representing date and time.

Examples

(d <- date())
nchar(d) == 24

## something similar in the current locale
##   depending on ctime; e.g. %e could be %d:
format(Sys.time(), "%a %b %e %H:%M:%S %Y")

Date Class

Description

Description of the class "Date" representing calendar dates.

Usage

## S3 method for class 'Date'
summary(object, digits = 12, ...)

## S3 method for class 'Date'
print(x, max = NULL, ...)

Arguments

object, x

a Date object to be summarized or printed.

digits

number of significant digits for the computations.

max

numeric or NULL, specifying the maximal number of entries to be printed. By default, when NULL, getOption("max.print") is used.

...

further arguments to be passed from or to other methods.

Details

Dates are represented as the number of days since 1970-01-01, with negative values for earlier dates. They are always printed following the rules of the current Gregorian calendar, even though that calendar was not in use long ago (it was adopted in 1752 in Great Britain and its colonies). When printing there is assumed to be a year zero.

It is intended that the date should be an integer value, but this is not enforced in the internal representation. Fractional days will be ignored when printing. It is possible to produce fractional days via the mean method or by adding or subtracting (see Ops.Date).

When a date is converted to a date-time (for example by as.POSIXct or as.POSIXlt) its time is taken as midnight in UTC.

Printing dates involves conversion to class "POSIXlt" which treats dates of more than about 780 million years from present as NA.

For the many methods see methods(class = "Date"). Several are documented separately, see below.
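The internal representation described above can be inspected directly. The following is a minimal sketch with arbitrary example dates; origin is passed explicitly to as.Date so that it does not depend on the R version in use.

```r
d <- as.Date("1970-01-10")
unclass(d)                       # 9 days since 1970-01-01

# Negative values are dates before the origin.
format(as.Date(-1, origin = "1970-01-01"))   # "1969-12-31"

# Arithmetic can produce fractional days; printing ignores the fraction.
frac <- d + 0.75
unclass(frac)                    # 9.75
format(frac)                     # "1970-01-10"

# Conversion to a date-time takes the time as midnight in UTC.
as.numeric(as.POSIXct(as.Date("1970-01-02")))   # 86400
```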

See Also

Sys.Date for the current date.

weekdays for convenience extraction functions.

Methods with extra arguments and documentation:

Ops.Date for operators on "Date" objects.

format.Date for conversion to and from character strings.

axis.Date and hist.Date for plotting.

seq.Date, cut.Date, and round.Date for utility operations.

DateTimeClasses for date-time classes.

Examples

(today <- Sys.Date())
format(today, "%d %b %Y")  # with month as a word
(tenweeks <- seq(today, length.out = 10, by = "1 week")) # next ten weeks
weekdays(today)
months(tenweeks)

(Dls <- as.Date(.leap.seconds))

## Show use of year zero:
(z <- as.Date("01-01-01")) # how it is printed depends on the OS
z - 365 # so year zero was a leap year.
as.Date("00-02-29")
# if you want a different format, consider something like (if supported)
## Not run: 
format(z, "%04Y-%m-%d") # "0001-01-01"
format(z, "%_4Y-%m-%d") # "   1-01-01"
format(z, "%_Y-%m-%d")  # "1-01-01"

## End(Not run)

##  length(<Date>) <- n   now works
ls <- Dls; length(ls) <- 12
l2 <- Dls; length(l2) <- 5 + length(Dls)
stopifnot(exprs = {
  ## length(.) <- * is compatible to subsetting/indexing:
  identical(ls, Dls[seq_along(ls)])
  identical(l2, Dls[seq_along(l2)])
  ## has filled with NA's
  is.na(l2[(length(Dls)+1):length(l2)])
})

Date-Time Classes

Description

Description of the classes "POSIXlt" and "POSIXct" representing calendar dates and times.

Usage

## S3 method for class 'POSIXct'
print(x, tz = "", usetz = TRUE, max = NULL, ...)

## S3 method for class 'POSIXct'
summary(object, digits = 15, ...)

time + z
z + time
time - z
time1 lop time2

Arguments

x, object

an object to be printed or summarized from one of the date-time classes.

tz, usetz

for timezone formatting, passed to format.POSIXct.

max

numeric or NULL, specifying the maximal number of entries to be printed. By default, when NULL, getOption("max.print") is used.

digits

number of significant digits for the computations: should be high enough to represent the least important time unit exactly.

...

further arguments to be passed from or to other methods.

time

date-time objects.

time1, time2

date-time objects or character vectors. (Character vectors are converted by as.POSIXct.)

z

a numeric vector (in seconds).

lop

one of ==, !=, <, <=, > or >=.

Details

There are two basic classes of date/times. Class "POSIXct" represents the (signed) number of seconds since the beginning of 1970 (in the UTC time zone) as a numeric vector. Class "POSIXlt" is internally a list of vectors with components named sec, min, hour for the time, mday, mon, and year, for the date, wday, yday for the day of the week and day of the year, isdst, a Daylight Saving Time flag, and sometimes (both optional) zone, a string for the time zone, and gmtoff, offset in seconds from GMT, see the section ‘Details on POSIXlt’ below for more details.

The classes correspond to the POSIX/C99 constructs of ‘calendar time’ (the time_t data type, “ct”), and ‘local time’ (or broken-down time, the ‘struct tm’ data type, “lt”), from which they also inherit their names.

"POSIXct" is more convenient for including in data frames, and "POSIXlt" is closer to human-readable forms. A virtual class "POSIXt" exists from which both of the classes inherit: it is used to allow operations such as subtraction to mix the two classes.

Logical comparisons and some arithmetic operations are available for both classes. One can add or subtract a number of seconds from a date-time object, but not add two date-time objects. Subtraction of two date-time objects is equivalent to using difftime. Be aware that "POSIXlt" objects will be interpreted as being in the current time zone for these operations unless a time zone has been specified.
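The arithmetic and comparison operations can be sketched as follows. The time stamp is an arbitrary example, and tz = "UTC" is set explicitly so the result does not depend on the current time zone.

```r
t1 <- as.POSIXct("2024-03-01 12:00:00", tz = "UTC")
t2 <- t1 + 3600                  # add one hour, expressed in seconds

inherits(t1, "POSIXt")           # TRUE: the common virtual class
t2 > t1                          # TRUE

dt <- t2 - t1                    # equivalent to difftime(t2, t1)
as.numeric(dt, units = "secs")   # 3600

# The two classes can be mixed, via "POSIXt".
lt <- as.POSIXlt(t2)
as.numeric(lt - t1, units = "mins")   # 60
```

Asking for explicit units when converting a difference to a number avoids relying on the units that difftime chooses automatically.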

Both classes may have an attribute "tzone", specifying the time zone. Note however that their meanings differ, see the section ‘Time Zones’ below for more details.

Unfortunately, the conversion is complicated by the operation of time zones and leap seconds (according to this version of R's data, 27 days have been 86401 seconds long so far, the last being on (actually, immediately before) 2017-01-01: the times of the extra seconds are in the object .leap.seconds). The details of this are entrusted to the OS services where possible. It seems that some rare systems used to use leap seconds, but all known current platforms ignore them (as required by POSIX). This is detected and corrected for at build time, so "POSIXct" times used by R do not include leap seconds on any platform.

Using c on "POSIXlt" objects converts them to the current time zone, and on "POSIXct" objects drops "tzone" attributes if they are not all the same.

A few times have specific issues. First, the leap seconds are ignored, and real times such as "2005-12-31 23:59:60" are (probably) treated as the next second. However, they will never be generated by R, and are unlikely to arise as input. Second, on some OSes there is a problem in the POSIX/C99 standard with "1969-12-31 23:59:59 UTC", which is -1 in calendar time and that value is on those OSes also used as an error code. Thus as.POSIXct("1969-12-31 23:59:59", format = "%Y-%m-%d %H:%M:%S", tz = "UTC") may give NA, and hence as.POSIXct("1969-12-31 23:59:59", tz = "UTC") will give "1969-12-31 23:59:00". Other OSes (including the code used by R on Windows) report errors separately and so are able to handle that time as valid.

The print methods respect options("max.print").

Time zones

"POSIXlt" objects will often have an attribute "tzone", a character vector of length 3 giving the time zone name (from the TZ environment variable or argument tz of functions creating "POSIXlt" objects; "" marks the current time zone) and the names of the base time zone and the alternate (daylight-saving) time zone. Sometimes this may just be of length one, giving the time zone name.

"POSIXct" objects may also have an attribute "tzone", a character vector of length one. If set to a non-empty value, it will determine how the object is converted to class "POSIXlt" and in particular how it is printed. This is usually desirable, but if you want to specify an object in a particular time zone but to be printed in the current time zone you may want to remove the "tzone" attribute.

Details on POSIXlt

Class "POSIXlt" is internally a named list of vectors representing date-times, with the following list components

sec

0–61: seconds, allowing for leap seconds.

min

0–59: minutes.

hour

0–23: hours.

mday

1–31: day of the month.

mon

0–11: months after the first of the year.

year

years since 1900.

wday

0–6: day of the week, starting on Sunday.

yday

0–365: day of the year (365 only in leap years).

isdst

Daylight Saving Time flag. Positive if in force, zero if not, negative if unknown.

zone

(Optional.) The abbreviation for the time zone in force at that time: "" if unknown (but "" might also be used for UTC).

gmtoff

(Optional.) The offset in seconds from GMT: positive values are East of the meridian. Usually NA if unknown, but 0 could mean unknown.

The components must be in this order: that was only minimally checked prior to R 4.3.0. All objects created in R 4.3.0 have the optional components. From earlier versions of R, the last two components will not be present for times in UTC and are platform-dependent. Currently gmtoff is set on almost all current platforms: those based on BSD or glibc (including Linux and macOS) and those using the tzcode implementation shipped with R (including Windows and by default macOS).

Note that the internal list structure is somewhat hidden, as many methods (including length(x), print() and str()) apply to the abstract date-time vector, as for "POSIXct". One can extract and replace single components via [ indexing with two indices (see the examples).
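As a small illustration of the component conventions listed above (zero-based months, years counted from 1900, weeks starting on Sunday), the following sketch inspects one date-time. The date and the name lt are arbitrary choices for the example.

```r
# A single date-time in UTC, stored in "POSIXlt" (list) form.
lt <- as.POSIXlt("2024-03-10 08:05:30", tz = "UTC")

lt$year + 1900   # calendar year: 'year' counts from 1900
lt$mon + 1       # calendar month: 'mon' is zero-based
lt$mday          # day of the month
lt$wday          # 0 means Sunday
lt$yday          # zero-based day of the year

length(lt)       # length of the date-time vector, not of the list
lt[1, "hour"]    # two-index form: component "hour" of element 1
```

Extraction with $ works on the underlying list, whereas length() and the two-index form treat the object as a vector of date-times.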

The components of "POSIXlt" are integer vectors, except sec (double) and zone (character). However most users will coerce numeric values for the first to real and the rest bar zone to integer.

Components wday and yday are for information, and are not used in the conversion to calendar time nor for printing, format(), or in as.character().

However, component isdst is needed to distinguish times at the end of DST: typically 1am to 2am occurs twice, first in DST and then in standard time. At all other times isdst can be deduced from the first six values, but the behaviour if it is set incorrectly is platform-dependent. For example Linux/glibc when checked fixed up incorrect values in time zones which support DST but gave an error on value 1 in those without DST.

For “ragged” and out-of-range vs “balanced” "POSIXlt" objects, see balancePOSIXlt().

Sub-second Accuracy

Classes "POSIXct" and "POSIXlt" are able to express fractions of a second where the latter allows for higher accuracy. Consequently, conversion of fractions between the two forms may not be exact, but will have better than microsecond accuracy.

Fractional seconds are printed only if options("digits.secs") is set: see strftime.
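A brief sketch of how fractional seconds can be displayed, assuming the default state in which options("digits.secs") is unset. The value .25 is chosen because it is exactly representable in binary floating point; x and op are illustrative names.

```r
# A time with a quarter-second fraction.
x <- as.POSIXct("2024-01-15 12:00:00.25", tz = "UTC")

format(x)                 # fraction not shown while digits.secs is unset
format(x, "%H:%M:%OS3")   # "12:00:00.250": explicit 3-digit seconds

# Temporarily request fractional seconds in default printing.
op <- options(digits.secs = 3)
format(x)
options(op)               # restore the previous setting
```

Using an explicit %OSn format affects a single call, while options(digits.secs = ) changes the default for the session.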

Valid ranges for times

The "POSIXlt" class can represent a very wide range of times (up to billions of years), but such times can only be interpreted with reference to a time zone.

The concept of time zones was first adopted in the nineteenth century, and the Gregorian calendar was introduced in 1582 but not universally adopted until 1927. OS services almost invariably assume the Gregorian calendar and may assume that the time zone that was first enacted for the location was in force before that date. (The earliest legislated time zone seems to have been London on 1847-12-01.) Some OSes assume the previous use of ‘local time’ based on the longitude of a location within the time zone.

Most operating systems represent POSIXct times as C type long. This means that on 32-bit OSes this covers the period 1902 to 2037. On all known 64-bit platforms and for the code we use on 32-bit Windows, the range of representable times is billions of years: however, not all can convert correctly times before 1902 or after 2037. A few benighted OSes used an unsigned type and so cannot represent times before 1970.

Where possible the platform limits are detected, and outside the limits we use our own C code. This uses the offset from GMT in use either for 1902 (when there was no DST) or that predicted for one of 2030 to 2037 (chosen so that the likely DST transition days are Sundays), and uses the alternate (daylight-saving) time zone only if isdst is positive or (if -1) if DST was predicted to be in operation in the 2030s on that day.

Note that there are places (e.g., Rome) whose offset from UTC varied in the years prior to 1902, and these will be handled correctly only where there is OS support.

There is no reason to assume that the DST rules will remain the same in the future: the US legislated in 2005 to change its rules as from 2007, with a possible future reversion. So conversions for times more than a year or two ahead are speculative. Other countries have changed their rules (and indeed, if DST is used at all) at a few days' notice. So representations and conversion of future dates are tentative. This also applies to dates after the in-use version of the time-zone database – not all platforms keep it up to date, which includes that shipped with older versions of R where used (which it is by default on Windows and macOS).

Warnings

Some Unix-like systems (especially Linux ones) do not have environment variable TZ set, yet have internal code that expects it (as does POSIX). We have tried to work around this, but if you get unexpected results try setting TZ. See Sys.timezone for valid settings.

Great care is needed when comparing objects of class "POSIXlt". Not only are components and attributes optional; several components may have values meaning ‘not yet determined’ and the same time represented in different time zones will look quite different.

The order of the list components of "POSIXlt" objects must not be changed, as several C-based conversion methods rely on the order for efficiency.

References

Ripley, B. D. and Hornik, K. (2001). “Date-time classes.” R News, 1(2), 8–11. https://www.r-project.org/doc/Rnews/Rnews_2001-2.pdf.

See Also

Dates for dates without times.

as.POSIXct and as.POSIXlt for conversion between the classes.

strptime for conversion to and from character representations.

Sys.time for clock time as a "POSIXct" object.

difftime for time intervals.

balancePOSIXlt() for balancing or filling “ragged” POSIXlt objects.

cut.POSIXt, seq.POSIXt, round.POSIXt and trunc.POSIXt for methods for these classes.

weekdays for convenience extraction functions.

Examples

(z <- Sys.time())             # the current date, as class "POSIXct"

Sys.time() - 3600             # an hour ago

as.POSIXlt(Sys.time(), "GMT") # the current time in GMT
format(.leap.seconds)         # the leap seconds in your time zone
print(.leap.seconds, tz = "America/Los_Angeles")  # and in Seattle's

## look at *internal* representation of "POSIXlt" :
leapS <- as.POSIXlt(.leap.seconds)
names(unclass(leapS)); is.list(leapS)
## str() on inner structure needs unclass(.):
utils::str(unclass(leapS), vec.len = 7)
## show all (apart from "tzone" attr):
data.frame(unclass(leapS))

## Extracting *single* components of POSIXlt objects:
leapS[1:5, "year"]
leapS[17:22, "mon"]

##  length(.) <- n   now works for "POSIXct" and "POSIXlt" :
for(lpS in list(.leap.seconds, leapS)) {
    ls <- lpS; length(ls) <- 12
    l2 <- lpS; length(l2) <- 5 + length(lpS)
    stopifnot(exprs = {
      ## length(.) <- * is compatible to subsetting/indexing:
      identical(ls, lpS[seq_along(ls)])
      identical(l2, lpS[seq_along(l2)])
      ## has filled with NA's
      is.na(l2[(length(lpS)+1):length(l2)])
    })
}

Read and Write Data in DCF Format

Description

Reads or writes an R object from/to a file in Debian Control File format.

Usage

read.dcf(file, fields = NULL, all = FALSE, keep.white = NULL)

write.dcf(x, file = "", append = FALSE, useBytes = FALSE,
          indent = 0.1 * getOption("width"),
          width = 0.9 * getOption("width"),
          keep.white = NULL)

Arguments

file

either a character string naming a file or a connection. "" indicates output to the console. For read.dcf this can name a compressed file (see gzfile).

fields

a character vector with the names of the fields to read from the DCF file. Default is to read all fields.

all

a logical indicating whether in case of multiple occurrences of a field in a record, all these should be gathered. If all is false (default), only the last such occurrence is used.

keep.white

a character vector with the names of the fields for which whitespace should be kept as is, or NULL (default) indicating that there are no such fields. Coerced to character if possible. For fields where whitespace is not to be kept as is, read.dcf removes leading and trailing whitespace, and write.dcf folds using strwrap.

x

the object to be written, typically a data frame. If not, it is attempted to coerce x to a data frame.

append

logical. If TRUE, the output is appended to the file. If FALSE, any existing file of the name is destroyed.

useBytes

logical to be passed to writeLines(), see there: “for expert use”.

indent

a positive integer specifying the indentation for continuation lines in output entries.

width

a positive integer giving the target column for wrapping lines in the output.

Details

DCF is a simple format for storing databases in plain text files that can easily be directly read and written by humans. DCF is used in various places to store R system information, like descriptions and contents of packages.

The DCF rules as implemented in R are:

  1. A database consists of one or more records, each with one or more named fields. Not every record must contain each field. Fields may appear more than once in a record.

  2. Regular lines start with a non-whitespace character.

  3. Regular lines are of form tag:value, i.e., have a name tag and a value for the field, separated by : (only the first : counts). The value can be empty (i.e., whitespace only).

  4. Lines starting with whitespace are continuation lines (to the preceding field) if at least one character in the line is non-whitespace. Continuation lines where the only non-whitespace character is a ‘.’ are taken as blank lines (allowing for multi-paragraph field values).

  5. Records are separated by one or more empty (i.e., whitespace only) lines.

  6. Individual lines may not be arbitrarily long; prior to R 3.0.2 the length limit was approximately 8191 bytes per line.
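The rules above can be tried out on a small hand-written file. This is only a sketch: the field names and values are invented, and the file is written to a temporary location.

```r
# Two records separated by an empty line; the first has a continuation line.
dcf_text <- c(
  "Package: alpha",
  "Version: 1.0",
  "Description: First line of the description",
  "  continued on a second line.",
  "",
  "Package: beta",
  "Version: 2.1"
)
tmp <- tempfile(fileext = ".dcf")
writeLines(dcf_text, tmp)

m <- read.dcf(tmp)           # character matrix: one row per record
dim(m)                       # 2 records, 3 fields
m[, "Package"]
m[1, "Description"]          # continuation line joined to the field value
is.na(m[2, "Description"])   # TRUE: field absent from the second record

unlink(tmp)
```

A field missing from a record shows up as NA in that record's row, which is how records with different sets of fields fit into one matrix.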

Note that read.dcf(all = FALSE) reads the file byte-by-byte. This allows a ‘DESCRIPTION’ file to be read and only its ASCII fields used, or its ‘Encoding’ field used to re-encode the remaining fields.

write.dcf does not write NA fields.

Value

The default read.dcf(all = FALSE) returns a character matrix with one row per record and one column per field. Leading and trailing whitespace of field values is ignored unless a field is listed in keep.white. If a tag name is specified in the file, but the corresponding value is empty, then an empty string is returned. If the tag name of a field is specified in fields but never used in a record, then the corresponding value is NA. If fields are repeated within a record, the last one encountered is returned. Malformed lines lead to an error.

For read.dcf(all = TRUE) a data frame is returned, again with one row per record and one column per field. The columns are lists of character vectors for fields with multiple occurrences, and character vectors otherwise.
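The difference between the two return forms can be seen with a record that repeats a field. The following is a sketch on invented data; the field name Depends is used purely as an example of a repeated tag.

```r
# The first record repeats the 'Depends' field; the second omits it.
dcf_text <- c(
  "Package: alpha",
  "Depends: a",
  "Depends: b",
  "",
  "Package: beta"
)
tmp <- tempfile(fileext = ".dcf")
writeLines(dcf_text, tmp)

m <- read.dcf(tmp)               # matrix: last occurrence wins
d <- read.dcf(tmp, all = TRUE)   # data frame: all occurrences gathered

m[1, "Depends"]    # "b"
d$Depends[[1]]     # both values, as a character vector
is.list(d$Depends) # TRUE: a list column, since the field is repeated
d$Package          # plain character column

unlink(tmp)
```

With all = TRUE no occurrence is lost, at the cost of a list column that usually needs unlist() or lengths() before further processing.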

Note that an empty file is a valid DCF file, and read.dcf will return a zero-row matrix or data frame.

For write.dcf, invisible NULL.

Note

As from R 3.4.0, ‘whitespace’ in all cases includes newlines.

References

https://www.debian.org/doc/debian-policy/ch-controlfields.html.

Note that R does not require encoding in UTF-8, which is a recent Debian requirement. Nor does it use the Debian-specific sub-format which allows comment lines starting with ‘#’.

See Also

write.table.

available.packages, which uses read.dcf to read the indices of package repositories.

Examples

## Create a reduced version of the DESCRIPTION file in package 'splines'
x <- read.dcf(file = system.file("DESCRIPTION", package = "splines"),
              fields = c("Package", "Version", "Title"))
write.dcf(x)

## An online DCF file with multiple records
con <- url("https://cran.r-project.org/src/contrib/PACKAGES")
y <- read.dcf(con, all = TRUE)
close(con)
utils::str(y)

Debug a Function

Description

Set, unset or query the debugging flag on a function. The text and condition arguments are the same as those that can be supplied via a call to browser. They can be retrieved by the user once the browser has been entered, and provide a mechanism to allow users to identify which breakpoint has been activated.

Usage

debug(fun, text = "", condition = NULL, signature = NULL)
debugonce(fun, text = "", condition = NULL, signature = NULL)
undebug(fun, signature = NULL)
isdebugged(fun, signature = NULL)
debuggingState(on = NULL)

Arguments

fun

any interpreted R function.

text

a text string that can be retrieved when the browser is entered.

condition

a condition that can be retrieved when the browser is entered.

signature

an optional method signature. If specified, the method is debugged, rather than its generic.

on

logical; a call to the support function debuggingState returns TRUE if debugging is globally turned on, FALSE otherwise. An argument of one or the other of those values sets the state. If the debugging state is FALSE, none of the debugging actions will occur (but explicit browser calls in functions will continue to work).

Details

When a function flagged for debugging is entered, normal execution is suspended and the body of function is executed one statement at a time. A new browser context is initiated for each step (and the previous one destroyed).

At the debug prompt the user can enter commands or R expressions, followed by a newline. The commands are described in the browser help topic.

To debug a function which is defined inside another function, single-step through to the end of its definition, and then call debug on its name.

If you want to debug a function not starting at the very beginning, use trace(..., at = *) or setBreakpoint.

Using debug is persistent, and unless debugging is turned off the debugger will be entered on every invocation (note that if the function is removed and replaced the debug state is not preserved). Use debugonce() to enter the debugger only the next time the function is invoked.
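The persistence of the flag, and its loss when a function is replaced, can be checked without entering the browser. This is a minimal sketch; f and the three logical variables are names invented for the example, and f is never called while flagged.

```r
f <- function(x) x + 1

debug(f)
flagged <- isdebugged(f)        # TRUE: the flag stays until undebug()

undebug(f)
cleared <- isdebugged(f)        # FALSE: flag removed

debug(f)
f <- function(x) x + 1          # replacing the function creates a new object
after_replace <- isdebugged(f)  # FALSE: the flag did not carry over
```

Because the flag belongs to the function object rather than to its name, re-sourcing a file that defines the function silently turns debugging off for it.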

To debug an S4 method by explicit signature, use signature. When specified, signature indicates the method of fun to be debugged. Note that debugging is implemented slightly differently for this case, as it uses the trace machinery, rather than the debugging bit. As such, text and condition cannot be specified in combination with a non-null signature. For methods which implement the .local rematching mechanism, the .local closure itself is the one that will be ultimately debugged (see isRematched).

isdebugged returns TRUE if a) signature is NULL and the closure fun has been debugged, or b) signature is not NULL, fun is an S4 generic, and the method of fun for that signature has been debugged. In all other cases, it returns FALSE.

The number of lines printed for the deparsed call when a function is entered for debugging can be limited by setting options(deparse.max.lines).

When debugging is enabled on a byte compiled function then the interpreted version of the function will be used until debugging is disabled.

Value

debug and undebug invisibly return NULL.

isdebugged returns TRUE if the function or method is marked for debugging, and FALSE otherwise.

See Also

debugcall for conveniently debugging methods, browser notably for its ‘commands’, trace; traceback to see the stack after an Error: ... message; recover for another debugging approach.

Examples

## Not run: 
debug(library)
library(methods)

## End(Not run)
## Not run: 
debugonce(sample)
## only the first call will be debugged
sample(10, 1)
sample(10, 1)

## End(Not run)

Declarations

Description

A framework for specifying information about R code for use by theinterpreter, compiler, and code analysis tools.

Usage

declare(...)

Arguments

...

declaration expressions.

Details

A syntax for declaration expressions is still being developed.

Value

Evaluating a declare() call ignores the arguments and returns NULL invisibly.


Marking Objects as Defunct

Description

When a function is removed from R it should be replaced by a function which calls .Defunct.

Usage

.Defunct(new, package=NULL, msg)

Arguments

new

character string: A suggestion for a replacement function.

package

character string: The package to be used when suggesting where the defunct function might be listed.

msg

character string: A message to be printed, if missing a default message is used.

Details

.Defunct is called from defunct functions. Functions should be listed in help("pkg-defunct") for an appropriate pkg, including base (with the alias added to the respective Rd file).

.Defunct signals an error of class defunctError with fields old, new, and package.
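Because the error has its own class, callers can handle it separately from other errors. The sketch below uses hypothetical names throughout: old_fun, new_fun and mypkg do not refer to real functions or packages.

```r
# A hypothetical function that has been made defunct.
old_fun <- function() .Defunct("new_fun", package = "mypkg")

# Catch only errors of class "defunctError" and keep the condition object.
res <- tryCatch(old_fun(), defunctError = function(e) e)

inherits(res, "defunctError")   # TRUE
conditionMessage(res)           # message naming the suggested replacement
res$new                         # the 'new' field of the condition
```

A handler keyed on the class can, for instance, redirect to the replacement function while letting unrelated errors propagate.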

See Also

Deprecated.

base-defunct and so on which list the defunct functions in the packages.


Delay Evaluation and Promises

Description

delayedAssign creates a promise to evaluate the given expression if its value is requested. This provides direct access to the lazy evaluation mechanism used by R for the evaluation of (interpreted) functions.

Usage

delayedAssign(x, value, eval.env = parent.frame(1),
              assign.env = parent.frame(1))

Arguments

x

a variable name (given as a quoted string in the function call)

value

an expression to be assigned to x

eval.env

an environment in which to evaluate value

assign.env

an environment in which to assign x

Details

Both eval.env and assign.env default to the currently active environment.

The expression assigned to a promise by delayedAssign will not be evaluated until it is eventually ‘forced’. This happens when the variable is first accessed.

When the promise is eventually forced, it is evaluated within the environment specified by eval.env (whose contents may have changed in the meantime). After that, the value is fixed and the expression will not be evaluated again, where the promise still keeps its expression.
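That the expression is evaluated at most once can be checked with a counter that the promise increments as a side effect. This is a small sketch; counter, lazy, before, first and second are names chosen for illustration.

```r
counter <- 0

# The braces are not evaluated here; they are stored in a promise.
delayedAssign("lazy", {
    counter <- counter + 1   # side effect, to count evaluations
    counter * 10
})

before <- counter   # still 0: nothing has been forced yet
first  <- lazy      # first access forces the promise
second <- lazy      # later accesses reuse the stored value
counter             # 1: the expression ran exactly once
```

The side effect lands in the calling environment because eval.env defaults to it.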

Value

This function is invoked for its side effect, which is assigning a promise to evaluate value to the variable x.

See Also

substitute, to see the expression associated with a promise, if assign.env is not the .GlobalEnv.

Examples

msg <- "old"
delayedAssign("x", msg)
substitute(x) # shows only 'x', as it is in the global env.
msg <- "new!"
x # new!

delayedAssign("x", {
    for(i in 1:3)
        cat("yippee!\n")
    10
})

x^2 #- yippee
x^2 #- simple number

ne <- new.env()
delayedAssign("x", pi + 2, assign.env = ne)
## See the promise {without "forcing" (i.e. evaluating) it}:
substitute(x, ne) #  'pi + 2'

### Promises in an environment [for advanced users]:  ---------------------

e <- (function(x, y = 1, z) environment())(cos, "y", {cat(" HO!\n"); pi+2})
## How can we look at all promises in an env (w/o forcing them)?
gete <- function(e_) {
   ne <- names(e_)
   names(ne) <- ne
   lapply(lapply(ne, as.name),
          function(n) eval(substitute(substitute(X, e_), list(X=n))))
}
(exps <- gete(e))
sapply(exps, typeof)

(le <- as.list(e)) # evaluates ("force"s) the promises
stopifnot(identical(le, lapply(exps, eval))) # and another "Ho!"

Expression Deparsing

Description

Turn unevaluated expressions into character strings.

Usage

deparse(expr, width.cutoff = 60L,
        backtick = mode(expr) %in%
            c("call", "expression", "(", "function"),
        control = c("keepNA", "keepInteger", "niceNames", "showAttributes"),
        nlines = -1L)

deparse1(expr, collapse = " ", width.cutoff = 500L, ...)

Arguments

expr

any R expression.

width.cutoff

integer in [20, 500] determining the cutoff (in bytes) at which line-breaking is tried.

backtick

logical indicating whether symbolic names should be enclosed in backticks if they do not follow the standard syntax.

control

character vector (or NULL) of deparsing options. control = "all" is thorough, see .deparseOpts.

nlines

integer: the maximum number of lines to produce. Negative values indicate no limit.

collapse

a string, passed to paste().

...

further arguments passed to deparse().

Details

These functions turn unevaluated expressions (where ‘expression’ is taken in a wider sense than the strict concept of a vector of mode and type (typeof) "expression" used in expression) into character strings (a kind of inverse to parse).

A typical use of this is to create informative labels for data sets and plots. The example shows a simple use of this facility. It uses the functions deparse and substitute to create labels for a plot which are character string versions of the actual arguments to the function myplot.

The default for the backtick option is not to quote single symbols but only composite expressions. This is a compromise to avoid breaking existing code.

width.cutoff is a lower bound for the line lengths: deparsing a line proceeds until at least width.cutoff bytes have been output and e.g. arg = value expressions will not be split across lines.

deparse1() is a simple utility added in R 4.0.0 to ensure a string result (character vector of length one), typically used in name construction, as deparse1(substitute(.)).
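The difference between the two functions shows up as soon as an expression is longer than width.cutoff. The following sketch uses an invented call (some_function is never evaluated) and a hypothetical helper label_of.

```r
# An unevaluated call that is too long for one line at a small cutoff.
long_call <- quote(some_function(alpha = 1, beta = 2, gamma = 3,
                                 delta = 4, epsilon = 5, zeta = 6))

parts <- deparse(long_call, width.cutoff = 20L)  # several strings
one   <- deparse1(long_call)                     # always a single string
length(parts)
length(one)

# Typical use: turn an argument expression into a label.
label_of <- function(arg) deparse1(substitute(arg))
label_of(log(height) + 1)   # "log(height) + 1"
```

Code that pastes the result of deparse() into a label or name should therefore either use deparse1() or collapse the result itself.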

Note

To avoid the risk of a source attribute out of sync with the actual function definition, the source attribute of a function will never be deparsed as an attribute.

Deparsing internal structures may not be accurate: for example the graphics display list recorded by recordPlot is not intended to be deparsed and .Internal calls will be shown as primitive calls.

References

Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988) The New S Language. Wadsworth & Brooks/Cole.

See Also

.deparseOpts for available control settings; dput() and dump() for related functions using identical internal deparsing functionality.

substitute, parse, expression.

Quotes for quoting conventions, including backticks.

Examples

require(stats); require(graphics)

deparse(args(lm))
deparse(args(lm), width.cutoff = 500)

myplot <- function(x, y) {
    plot(x, y, xlab = deparse1(substitute(x)),
               ylab = deparse1(substitute(y)))
}

e <- quote(`foo bar`)
deparse(e)
deparse(e, backtick = TRUE)
e <- quote(`foo bar`+1)
deparse(e)
deparse(e, control = "all") # wraps it w/ quote( . )

Options for Expression Deparsing

Description

Process the deparsing options for deparse, dput and dump.

Usage

.deparseOpts(control)

..deparseOpts

Arguments

control

character vector of deparsing options.

Details

..deparseOpts is the character vector of possible deparsing options used by .deparseOpts().

.deparseOpts() is called by deparse, dput and dump to process their control argument.

The control argument is a vector containing zero or more of the following strings (exactly those in ..deparseOpts). Partial string matching is used.

"keepInteger":

Either surround integer vectors by as.integer() or use suffix L, so they are not converted to type double when parsed. This includes making sure that integer NAs are preserved (via NA_integer_ if there are no non-NA values in the vector, unless "S_compatible" is set).

"quoteExpressions":

Surround unevaluated expressions, but not formulas, with quote(), so they are not evaluated when re-parsed.

"showAttributes":

If the object has attributes (other than a source attribute, see srcref), use structure() to display them as well as the object value unless the only such attribute is names and the "niceNames" option is set. This ("showAttributes") is the default for deparse and dput.

"useSource":

If the object has a source attribute (srcref), display that instead of deparsing the object. Currently only applies to function definitions.

"warnIncomplete":

Some exotic objects such as environments, external pointers, etc. can not be deparsed properly. This option causes a warning to be issued if the deparser recognizes one of these situations.

Also, the parser in R < 2.7.0 would only accept strings of up to 8192 bytes, and this option gives a warning for longer strings.

"keepNA":

Integer, real and character NAs are surrounded by coercion functions where necessary to ensure that they are parsed to the same type. Since e.g. NA_real_ can be output in R, this is mainly used in connection with S_compatible.

"niceNames":

If true, lists and atomic vectors with non-NA names (see names) are deparsed as e.g., c(A = 1) instead of structure(1, names = "A"), independently of the "showAttributes" setting.

"all":

An abbreviated way to specify all of the options listed above plus "digits17". This is the default for dump, and, without "digits17", the options used by edit (which are fixed).

"delayPromises":

Deparse promises in the form <promise: expression> rather than evaluating them. The value and the environment of the promise will not be shown and the deparsed code cannot be sourced.

"S_compatible":

Make deparsing as far as possible compatible with S and R < 2.5.0. For compatibility with S, integer values of double vectors are deparsed with a trailing decimal point. Backticks are not used.

"hexNumeric":

Real and finite complex numbers are output in ‘"%a"’ format as binary fractions (coded as hexadecimal: see sprintf) with maximal opportunity to be recorded exactly to full precision. Complex numbers with one or both non-finite components are output as if this option were not set.

(This relies on that format being correctly supported: known problems on Windows are worked around as from R 3.1.2.)

"digits17":

Real and finite complex numbers are output using format ‘"%.17g"’ which may give more precision than the default (but the output will depend on the platform and there may be loss of precision when read back). Complex numbers with one or both non-finite components are output as if this option were not set.

"exact":

An abbreviated way to specify control = c("all", "hexNumeric") which is guaranteed to be exact for numbers, see also below.

For the most readable (but perhaps incomplete) display, use control = NULL. This displays the object's value, but not its attributes. The default in deparse is to display the attributes as well, but not to use any of the other options to make the result parseable. (dump uses more default options via control = "all", and printing of functions without sources uses c("keepInteger", "keepNA") to which one may add "warnIncomplete".)

Using control = "exact" (short for control = c("all", "hexNumeric")) comes closest to making deparse() an inverse of parse() (but we have not yet seen an example where "all", now including "digits17", would not have been as good). However, not all objects are deparse-able even with these options, and a warning will be issued if the function recognizes that it is being asked to do the impossible.

Only one of "hexNumeric" and "digits17" can be specified.
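The practical effect of the control settings is easiest to see as a round trip through deparse() and parse(). The sketch below defines a hypothetical helper, roundtrip, for that purpose; x and y are arbitrary test values.

```r
# Hypothetical helper: deparse with the given options, then parse and evaluate.
roundtrip <- function(obj, ...) eval(parse(text = deparse(obj, ...)))

x <- c(a = 1L, b = 2L)     # named integer vector
y <- 0.1 + 0.2             # a double that is not exactly 0.3

deparse(x, control = NULL) # most readable: value only, names dropped
deparse(x)                 # default options: names and integer type kept

identical(roundtrip(x), x)                     # TRUE
identical(roundtrip(x, control = NULL), x)     # FALSE: names were lost

identical(roundtrip(y), y)                     # FALSE: default digits
identical(roundtrip(y, control = "exact"), y)  # TRUE: hexadecimal output
```

Readable output and exact reproduction pull in different directions, which is why the default for interactive display differs from the options used by dump.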

Value

An integer value corresponding to the control options selected.

Examples

stopifnot(.deparseOpts("exact") == .deparseOpts(c("all", "hexNumeric")))
(iOpt.all <- .deparseOpts("all")) # a four digit integer

## one integer --> vector binary bits
int2bits <- function(x, base = 2L,
                     ndigits = 1 + floor(1e-9 + log(max(x,1), base))) {
    r <- numeric(ndigits)
    for (i in ndigits:1) {
        r[i] <- x %% base
        if (i > 1L)
            x <- x %/% base
    }
    rev(r) # smallest bit at left
}
int2bits(iOpt.all)
## What options does  "all" contain ? =========
(depO.indiv <- setdiff(..deparseOpts, c("all", "exact")))
(oa <- depO.indiv[int2bits(iOpt.all) == 1]) # 8 strings
stopifnot(identical(iOpt.all, .deparseOpts(oa)))

## ditto for "exact" instead of "all":
(iOpt.X <- .deparseOpts("exact"))
data.frame(opts = depO.indiv,
           all = int2bits(iOpt.all),
           exact = int2bits(iOpt.X))
(oX <- depO.indiv[int2bits(iOpt.X) == 1]) # 8 strings, too
diffXall <- oa != oX
stopifnot(identical(iOpt.X, .deparseOpts(oX)),
          identical(oX[diffXall], "hexNumeric"),
          identical(oa[diffXall], "digits17"))

Marking Objects as Deprecated

Description

When an object is about to be removed from R it is first deprecated and should include a call to .Deprecated.

Usage

.Deprecated(new, package = NULL, msg,
            old = as.character(sys.call(sys.parent()))[1L])

Arguments

new

character string: A suggestion for a replacement function.

package

character string: The package to be used when suggesting where the deprecated function might be listed.

msg

character string: A message to be printed, if missing a default message is used.

old

character string specifying the function (default) or usage which is being deprecated.

Details

.Deprecated("new name") is called from deprecated functions. The original help page for these functions is often available at help("old-deprecated") (note the quotes). Deprecated functions should be listed in help("pkg-deprecated") for an appropriate pkg, including base.

.Deprecated signals a warning of class "deprecatedWarning" with fields old, new, and package.
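Since the warning carries its own class, it can be intercepted without touching other warnings. In this sketch old_mean is a hypothetical deprecated wrapper around mean, and seen is simply a variable used to keep the condition object.

```r
# A hypothetical deprecated function that still does its job.
old_mean <- function(x) {
    .Deprecated("mean")
    mean(x)
}

seen <- NULL
res <- withCallingHandlers(
    old_mean(c(1, 2, 3, 4)),
    deprecatedWarning = function(w) {
        seen <<- w                        # keep the condition for inspection
        invokeRestart("muffleWarning")    # silence only this warning
    }
)

res         # 2.5: the call still returns its value
seen$new    # the suggested replacement
```

A calling handler is used rather than tryCatch so that the deprecated function runs to completion after the warning is handled.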

See Also

Defunct

help("base-deprecated") and so on which list the deprecated functions in the packages.


Calculate the Determinant of a Matrix

Description

det calculates the determinant of a matrix. determinant is a generic function that returns separately the modulus of the determinant, optionally on the logarithm scale, and the sign of the determinant.

Usage

det(x, ...)
determinant(x, logarithm = TRUE, ...)

Arguments

x

numeric matrix: logical matrices are coerced to numeric.

logarithm

logical; if TRUE (default) return the logarithm of the modulus of the determinant.

...

optional arguments, currently unused.

Details

The determinant function uses an LU decomposition and the det function is simply a wrapper around a call to determinant.

Often, computing the determinant is not what you should be doing to solve a given problem.

Value

For det, the determinant of x. For determinant, a list with components

modulus

a numeric value. The modulus (absolute value) of the determinant if logarithm is FALSE; otherwise the logarithm of the modulus.

sign

integer; either +1 or -1 according to whether the determinant is positive or negative.
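The two components can be recombined into the determinant, and the logarithm scale remains usable where the determinant itself overflows. A minimal sketch with arbitrary example matrices (m and big are illustrative names):

```r
m <- matrix(c(2, 1, 1, 3), nrow = 2)   # determinant is 2*3 - 1*1 = 5
d <- determinant(m, logarithm = TRUE)

# Recombine sign and log-modulus; as.numeric() drops the attributes.
recombined <- as.numeric(d$sign * exp(d$modulus))
all.equal(recombined, det(m))

# A matrix whose determinant (1e400) is too large for a double.
big <- diag(1e200, 2)
det(big)                            # Inf
as.numeric(determinant(big)$modulus) # finite: 2 * log(1e200)
```

Working with the log-modulus and sign separately is therefore the safer choice when determinants of large or badly scaled matrices are needed, for instance in log-likelihood computations.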

Examples

(x <- matrix(1:4, ncol = 2))
unlist(determinant(x))
det(x)

det(print(cbind(1, 1:3, c(2,0,1))))

Detach Objects from the Search Path

Description

Detach a database, i.e., remove it from the search() path of available R objects. Usually this is either a data.frame which has been attached or a package which was attached by library.

Usage

detach(name, pos = 2L, unload = FALSE, character.only = FALSE,
       force = FALSE)

Arguments

name

the object to detach. Defaults to search()[pos]. This can be an unquoted name or a character string but not a character vector. If a number is supplied this is taken as pos.

pos

index position in search() of the database to detach. When name is a number, pos = name is used.

unload

a logical value indicating whether or not to attempt to unload the namespace when a package is being detached. If the package has a namespace and unload is TRUE, then detach will attempt to unload the namespace via unloadNamespace: if the namespace is imported by another namespace or unload is FALSE, no unloading will occur.

character.only

a logical indicating whether name can be assumed to be a character string.

force

logical: should a package be detached even though other attached packages depend on it?

Details

This is most commonly used with a single number argument referring to a position on the search list, and can also be used with an unquoted or quoted name of an item on the search list such as package:tools.

If a package has a namespace, detaching it does not by default unload the namespace (and may not even with unload = TRUE), and detaching will not in general unload any dynamically loaded compiled code (DLLs); see getLoadedDLLs and library.dynam.unload. Further, registered S3 methods from the namespace will not be removed, and because S3 methods are not tagged to their source on registration, it is in general not possible to safely un-register the methods associated with a given package. If you use library on a package whose namespace is loaded, it attaches the exports of the already loaded namespace. So detaching and re-attaching a package may not refresh some or all components of the package, and is inadvisable. The most reliable way to completely detach a package is to restart R.
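The distinction between being attached and being loaded can be observed directly. This sketch uses the splines package, as the examples on this page do, and assumes it is not needed by any other attached package in the session.

```r
library(splines)
attached_before <- "package:splines" %in% search()   # TRUE

detach("package:splines")   # detach by name, without unload = TRUE

attached_after <- "package:splines" %in% search()    # FALSE
still_loaded   <- isNamespaceLoaded("splines")       # TRUE
```

After the detach, splines::bs() would still work and a later library(splines) re-attaches the namespace that is already loaded rather than loading the package afresh.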

Value

The return value is invisible. It is NULL when a package is detached, otherwise the environment which was returned by attach when the object was attached (incorporating any changes since it was attached).

Good practice

detach() without an argument removes the first item on the search path after the workspace. It is all too easy to call it too many or too few times, or to not notice that the search path has changed since an attach call.

Use of attach/detach is best avoided in functions (see the help for attach) and in interactive use and scripts it is prudent to detach by name.

Note

You cannot detach either the workspace (position 1) or the base package (the last item in the search list), and attempting to do so will throw an error.

Unloading some namespaces has undesirable side effects: e.g. unloading grid closes all graphics devices, and on some systems tcltk cannot be reloaded once it has been unloaded and may crash R if this is attempted.

References

Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988) The New S Language. Wadsworth & Brooks/Cole.

See Also

attach, library, search, objects, unloadNamespace, library.dynam.unload.

Examples

require(splines) # package
detach(package:splines)
## or also
library(splines)
pkg <- "package:splines"
detach(pkg, character.only = TRUE)

## careful: do not do this unless 'splines' is not already attached.
library(splines)
detach(2) # 'pos' used for 'name'

## an example of the name argument to attach
## and of detaching a database named by a character vector
attach_and_detach <- function(db, pos = 2)
{
   name <- deparse1(substitute(db))
   attach(db, pos = pos, name = name)
   print(search()[pos])
   detach(name, character.only = TRUE)
}
attach_and_detach(women, pos = 3)

Matrix Diagonals

Description

Extract or replace the diagonal of a matrix, or construct a diagonal matrix.

Usage

diag(x = 1, nrow, ncol, names = TRUE)
diag(x) <- value

Arguments

x

a matrix, vector or 1D array, or missing.

nrow, ncol

optional dimensions for the result when x is not a matrix.

names

(when x is a matrix) logical indicating if the resulting vector, the diagonal of x, should inherit names from dimnames(x) if available.

value

either a single value or a vector of length equal to that of the current diagonal. Should be of a mode which can be coerced to that of x.

Details

diag has four distinct usages:

  1. x is a matrix, when it extracts the diagonal.

  2. x is missing and nrow is specified, it returns an identity matrix.

  3. x is a scalar (length-one vector) and the only argument, it returns a square identity matrix of size given by the scalar.

  4. x is a ‘numeric’ (complex, numeric, integer, logical, or raw) vector, either of length at least 2 or there were further arguments. This returns a matrix with the given diagonal and zero off-diagonal entries.

It is an error to specify nrow or ncol in the first case.

Value

If x is a matrix then diag(x) returns the diagonal of x. The resulting vector will have names if names is true and if the matrix x has matching column and rownames.

The replacement form sets the diagonal of the matrix x to the given value(s).

In all other cases the value is a diagonal matrix with nrow rows and ncol columns (if ncol is not given the matrix is square). Here nrow is taken from the argument if specified, otherwise inferred from x: if that is a vector (or 1D array) of length two or more, then its length is the number of rows, but if it is of length one and neither nrow nor ncol is specified, nrow = as.integer(x).

When a diagonal matrix is returned, the diagonal elements are one except in the fourth case, when x gives the diagonal elements: it will be recycled or truncated as needed, but fractional recycling and truncation will give a warning.

Note

Using diag(x) can have unexpected effects if x is a vector that could be of length one. Use diag(x, nrow = length(x)) for consistent behaviour.
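The pitfall in this note is easiest to see with a concrete length-one vector. The sketch below uses made-up values; the names one, id and dg are purely illustrative.

```r
x <- c(2.5, 4)
one <- x[1]                           # a length-one vector holding 2.5

# Third usage: a lone scalar is read as a size, so as.integer(2.5) = 2
# and the result is a 2 x 2 identity matrix.
id <- diag(one)

# Supplying nrow forces the fourth usage: 'one' is the diagonal itself.
dg <- diag(one, nrow = length(one))

print(dim(id))
print(dg)
```

Writing diag(x, nrow = length(x)) therefore behaves the same way whether x has one element or many.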

References

Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988) The New S Language. Wadsworth & Brooks/Cole.

See Also

upper.tri, lower.tri, matrix.

Examples

dim(diag(3))
diag(10, 3, 4) # guess what?
all(diag(1:3) == {m <- matrix(0,3,3); diag(m) <- 1:3; m})

## other "numeric"-like diagonal matrices :
diag(c(1i,2i))    # complex
diag(TRUE, 3)     # logical
diag(as.raw(1:3)) # raw
(D2 <- diag(2:1, 4)); typeof(D2) # "integer"

require(stats)
## diag(<var-cov-matrix>) = variances
diag(var(M <- cbind(X = 1:5, Y = rnorm(5))))
#-> vector with names "X" and "Y"

rownames(M) <- c(colnames(M), rep("", 3))
M; diag(M) #  named as well
diag(M, names = FALSE) # w/o names

Lagged Differences

Description

Returns suitably lagged and iterated differences.

Usage

diff(x, ...)

## Default S3 method:
diff(x, lag = 1, differences = 1, ...)

## S3 method for class 'POSIXt'
diff(x, lag = 1, differences = 1, ...)

## S3 method for class 'Date'
diff(x, lag = 1, differences = 1, ...)

Arguments

x

a numeric vector or matrix containing the values to be differenced.

lag

an integer indicating which lag to use.

differences

an integer indicating the order of the difference.

...

further arguments to be passed to or from methods.

Details

diff is a generic function with a default method and ones for classes "ts", "POSIXt" and "Date".

NA's propagate.

Value

If x is a vector of length n and differences = 1, then the computed result is equal to the successive differences x[(1+lag):n] - x[1:(n-lag)].

If differences is larger than one this algorithm is applied recursively to x. Note that the returned value is a vector which is shorter than x.

If x is a matrix then the difference operations are carried out on each column separately.
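The rules in this section can be verified by hand on a small vector. The following sketch uses illustrative data (the first six squares) and hypothetical variable names; it only calls diff and basic indexing.

```r
x <- c(1, 4, 9, 16, 25, 36)
n <- length(x)
lag <- 2

# differences = 1: the successive lagged differences, written out by hand
manual <- x[(1 + lag):n] - x[1:(n - lag)]
by_diff <- diff(x, lag = lag)

# differences = 2: the same step applied twice
twice <- diff(x, differences = 2)
nested <- diff(diff(x))

# Each pass shortens the result by 'lag' elements
len2 <- length(diff(x, lag = 2, differences = 2))

# Matrix input: each column is differenced separately
m <- cbind(a = x, b = 2 * x)
col_a <- diff(m)[, "a"]

print(by_diff)
print(twice)
```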

References

Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988) The New S Language. Wadsworth & Brooks/Cole.

See Also

diff.ts, diffinv.

Examples

diff(1:10, 2)
diff(1:10, 2, 2)
x <- cumsum(cumsum(1:10))
diff(x, lag = 2)
diff(x, differences = 2)

diff(.leap.seconds)

## allows to pass units via ... to difftime()
diff(.leap.seconds, units = "weeks")
diff(as.Date(.leap.seconds), units = "weeks")

Time Intervals / Differences

Description

Time intervals creation, printing, and some arithmetic. The print() method calls these “time differences”.

Usage

time1 - time2

difftime(time1, time2, tz,
         units = c("auto", "secs", "mins", "hours",
                   "days", "weeks"))

as.difftime(tim, format = "%X", units = "auto", tz = "UTC")

## S3 method for class 'difftime'
format(x, ...)
## S3 method for class 'difftime'
units(x)
## S3 replacement method for class 'difftime'
units(x) <- value
## S3 method for class 'difftime'
as.double(x, units = "auto", ...)

## Group methods, notably for round(), signif(), floor(),
## ceiling(), trunc(), abs(); called directly, *not* as Math():
## S3 method for class 'difftime'
Math(x, ...)

Arguments

time1, time2

date-time or date objects.

tz

an optional time zone specification to be used for the conversion, mainly for "POSIXlt" objects.

units

character string. Units in which the results are desired. Can be abbreviated.

value

character string. Like units, except that abbreviations are not allowed.

tim

character string or numeric value specifying a time interval.

format

character specifying the format of tim: see strptime. The default is a locale-specific time format.

x

an object inheriting from class "difftime".

...

arguments to be passed to or from other methods.

Details

Function difftime calculates a difference of two date/time objects and returns an object of class "difftime" with an attribute indicating the units. The Math group method provides round, signif, floor, ceiling, trunc, abs, and sign methods for objects of this class, and there are methods for the group-generic (see Ops) logical and arithmetic operations.

If units = "auto", a suitable set of units is chosen, the largest possible (excluding "weeks") in which all the absolute differences are greater than one.

Subtraction of date-time objects gives an object of this class, by calling difftime with units = "auto". Alternatively, as.difftime() works on character-coded or numeric time intervals; in the latter case, units must be specified, and format has no effect.

Limited arithmetic is available on "difftime" objects: they can be added or subtracted, and multiplied or divided by a numeric vector. In addition, adding or subtracting a numeric vector by a "difftime" object implicitly converts the numeric vector to a "difftime" object with the same units as the "difftime" object. There are methods for mean and sum (via the Summary group generic), and diff via diff.default building on the "difftime" method for arithmetic, notably -.

The units of a "difftime" object can be extracted by the units function, which also has a replacement form. If the units are changed, the numerical value is scaled accordingly. The replacement version keeps attributes such as names and dimensions.
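To make the rescaling concrete, here is a small sketch with made-up durations. It relies only on as.difftime, units and as.numeric; the variable names are illustrative.

```r
z <- as.difftime(c(30, 90), units = "mins")

# Read the value in other units without modifying the object
in_secs <- as.numeric(z, units = "secs")

# Changing the units rescales the stored numbers accordingly
units(z) <- "hours"
in_hours <- as.numeric(z)   # units = "auto" means the object's own units

print(in_secs)
print(z)
```

The interval itself is unchanged: only its numeric representation follows the units attribute.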

Note that units = "days" means a period of 24 hours, hence takes no account of Daylight Savings Time. Differences in objects of class "Date" are computed as if in the UTC time zone.

The as.double method returns the numeric value expressed in the specified units. Using units = "auto" means the units of the object.

The format method simply formats the numeric value and appends the units as a text string.

Warning

Because R follows POSIX (and almost all computer clocks) in ignoring leap seconds, so do time differences. So in a UTC time zone

    z <- as.POSIXct(c("2016-12-31 23:59:59", "2017-01-01 00:00:01"))
    z[2] - z[1]

reports ‘Time difference of 2 secs’ but 3 seconds elapsed while the computer clock advanced by 2 seconds.

If you want the elapsed time interval, you need to add in any leap seconds for yourself.

Note

Units such as "months" are not possible as they are not of constant length. To create intervals of months, quarters or years use seq.Date or seq.POSIXt.

See Also

DateTimeClasses.

Examples

(z <- Sys.time() - 3600)
Sys.time() - z                # just over 3600 seconds.

## time interval between release days of R 1.2.2 and 1.2.3.
ISOdate(2001, 4, 26) - ISOdate(2001, 2, 26)

as.difftime(c("0:3:20", "11:23:15"))
as.difftime(c("3:20", "23:15", "2:"), format = "%H:%M") # 3rd gives NA
(z <- as.difftime(c(0,30,60), units = "mins"))
as.numeric(z, units = "secs")
as.numeric(z, units = "hours")
format(z)

Dimensions of an Object

Description

Retrieve or set the dimension of an object.

Usage

dim(x)
dim(x) <- value

Arguments

x

an R object, for example a matrix, array or data frame.

value

for the default method, either NULL or a numeric vector, which is coerced to integer (by truncation).

Details

The functions dim and dim<- are internal generic primitive functions.

dim has a method for data.frames, which returns the lengths of the row.names attribute of x and of x (as the numbers of rows and columns respectively).

Value

For an array (and hence in particular, for a matrix) dim retrieves the dim attribute of the object. It is NULL or a vector of mode integer.

The replacement method changes the "dim" attribute (provided the new value is compatible) and removes any "dimnames" and "names" attributes.
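The side effect on names is easy to overlook, so a short illustration may help. The vector and data frame below are invented for the purpose; nothing beyond dim, names and data.frame is used.

```r
x <- c(a = 1, b = 2, c = 3, d = 4)
dim(x) <- c(2, 2)      # the numeric value c(2, 2) is coerced to integer

nm <- names(x)         # the names attribute has been removed
d <- dim(x)

# For a data frame, dim() reports rows first, then columns
df <- data.frame(u = 1:3, v = letters[1:3])
ddf <- dim(df)

print(x)
print(ddf)
```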

References

Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988) The New S Language. Wadsworth & Brooks/Cole.

See Also

ncol, nrow and dimnames.

Examples

x <- 1:12 ; dim(x) <- c(3,4)
x

# simple versions of nrow and ncol could be defined as follows
nrow0 <- function(x) dim(x)[1]
ncol0 <- function(x) dim(x)[2]

Dimnames of an Object

Description

Retrieve or set the dimnames of an object.

Usage

dimnames(x)
dimnames(x) <- value

provideDimnames(x, sep = "", base = list(LETTERS), unique = TRUE)

Arguments

x

an R object, for example a matrix, array or data frame.

value

a possible value for dimnames(x): see the ‘Value’ section.

sep

a character string, used to separate base symbols and digits in the constructed dimnames.

base

a non-empty list of character vectors. The list components are used in turn (and recycled when needed) to construct replacements for empty dimnames components. See also the examples.

unique

logical indicating that the dimnames constructed are unique within each dimension in the sense of make.unique.

Details

The functions dimnames and dimnames<- are generic.

For an array (and hence in particular, for a matrix), they retrieve or set the dimnames attribute (see attributes) of the object. A list value can have names, and these will be used to label the dimensions of the array where appropriate.

The replacement method for arrays/matrices coerces vector and factor elements of value to character, but does not dispatch methods for as.character. It coerces zero-length elements to NULL, and a zero-length list to NULL. If value is a list shorter than the number of dimensions, it is extended with NULLs to the needed length.

Both have methods for data frames. The dimnames of a data frame are its row.names and its names. For the replacement method each component of value will be coerced by as.character.

For a 1D matrix the names are the same thing as the (only) component of the dimnames.

Both are primitive functions.

provideDimnames(x) provides dimnames where “missing”, such that its result has character dimnames for each component. If unique is true as by default, they are unique within each component via make.unique(*, sep=sep).

Value

The dimnames of a matrix or array can be NULL (which is not stored) or a list of the same length as dim(x). If a list, its components are either NULL or a character vector with positive length of the appropriate dimension of x. The list can have names. It is possible that all components are NULL: such dimnames may get converted to NULL.

For the "data.frame" method both dimnames are character vectors, and the rownames must contain no duplicates nor missing values.

provideDimnames(x) returns x, with “NULL-free” dimnames, i.e. each component a character vector of correct length.

Note

Setting components of the dimnames, e.g., dimnames(A)[[1]] <- value is a common paradigm, but note that it will not work if the value assigned is NULL. Use rownames instead, or (as it does) manipulate the whole dimnames list.
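The two approaches recommended in this note might look as follows. This is a sketch on an invented 2 x 3 matrix; the object names A, B, C and dn are illustrative.

```r
A <- matrix(1:6, nrow = 2,
            dimnames = list(c("r1", "r2"), c("c1", "c2", "c3")))

# Route 1: let rownames<- handle the NULL case
B <- A
rownames(B) <- NULL

# Route 2: edit the whole dimnames list, then assign it back.
# Single-bracket assignment of list(NULL) keeps the first slot in place.
C <- A
dn <- dimnames(C)
dn[1] <- list(NULL)
dimnames(C) <- dn

print(dimnames(B))
```

Both routes leave the column names untouched, whereas assigning NULL with [[1]] would delete the first list component and shift the remaining one.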

References

Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988) The New S Language. Wadsworth & Brooks/Cole.

See Also

rownames, colnames; array, matrix, data.frame.

Examples

## simple versions of rownames and colnames
## could be defined as follows
rownames0 <- function(x) dimnames(x)[[1]]
colnames0 <- function(x) dimnames(x)[[2]]

(dn <- dimnames(A <- provideDimnames(N <- array(1:24, dim = 2:4))))
A0 <- A; dimnames(A)[2:3] <- list(NULL)
stopifnot(identical(A0, provideDimnames(A)))
strd <- function(x) utils::str(dimnames(x))
strd(provideDimnames(A, base = list(letters[-(1:9)], tail(LETTERS))))
strd(provideDimnames(N, base = list(letters[-(1:9)], tail(LETTERS)))) # recycling
strd(provideDimnames(A, base = list(c("AA","BB")))) # recycling on both levels
## set "empty dimnames":
provideDimnames(rbind(1, 2:3), base = list(""), unique = FALSE)

Execute a Function Call

Description

do.call constructs and executes a function call from a name or a function and a list of arguments to be passed to it.

Usage

do.call(what, args, quote = FALSE, envir = parent.frame())

Arguments

what

either a function or a non-empty character string naming the function to be called.

args

a list of arguments to the function call. The names attribute of args gives the argument names.

quote

a logical value indicating whether to quote the arguments.

envir

an environment within which to evaluate the call. This will be most useful if what is a character string and the arguments are symbols or quoted expressions.

Details

If quote is FALSE, the default, then the arguments are evaluated (in the calling environment, not in envir). If quote is TRUE then each argument is quoted (see quote) so that the effect of argument evaluation is to remove the quotes – leaving the original arguments unevaluated when the call is constructed.
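A compact illustration of how the names of args become argument names, and of what quote = TRUE changes, is sketched below. The inputs are invented; only mean, paste and as.name from base R are called.

```r
# Named list components become named arguments of the call
args <- list(c(1, 2, NA, 4), na.rm = TRUE)
m <- do.call(mean, args)            # same as mean(c(1, 2, NA, 4), na.rm = TRUE)

# 'what' may also be given as a character string
s <- do.call("paste", list("a", "b", sep = "-"))

# With quote = TRUE the symbols A and B are passed on unevaluated,
# so paste() sees the symbols themselves rather than their values
q <- do.call(paste, list(as.name("A"), as.name("B")), quote = TRUE)

print(m); print(s); print(q)
```

Without quote = TRUE the last call would look up objects named A and B, and fail if they do not exist.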

The behavior of some functions, such as substitute, will not be the same for functions evaluated using do.call as if they were evaluated from the interpreter. The precise semantics are currently undefined and subject to change.

Value

The result of the (evaluated) function call.

Warning

This should not be used to attempt to evade restrictions on the use of .Internal and other non-API calls.

References

Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988) The New S Language. Wadsworth & Brooks/Cole.

See Also

call which creates an unevaluated call.

Examples

do.call("complex", list(imaginary = 1:3))

## if we already have a list (e.g., a data frame)
## we need c() to add further arguments
tmp <- expand.grid(letters[1:2], 1:3, c("+", "-"))
do.call("paste", c(tmp, sep = ""))

do.call(paste, list(as.name("A"), as.name("B")), quote = TRUE)

## examples of where objects will be found.
A <- 2
f <- function(x) print(x^2)
env <- new.env()
assign("A", 10, envir = env)
assign("f", f, envir = env)
f <- function(x) print(x)
f(A)                                          # 2
do.call("f", list(A))                         # 2
do.call("f", list(A), envir = env)            # 4
do.call( f,  list(A), envir = env)            # 2
do.call("f", list(quote(A)), envir = env)     # 100
do.call( f,  list(quote(A)), envir = env)     # 10
do.call("f", list(as.name("A")), envir = env) # 100

eval(call("f", A))                            # 2
eval(call("f", quote(A)))                     # 2
eval(call("f", A), envir = env)               # 4
eval(call("f", quote(A)), envir = env)        # 100

Identity Function to Suppress Checking

Description

The dontCheck function is the same as identity, but is interpreted by R CMD check code analysis as a directive to suppress checking of x. Currently this is only used by checkFF(registration = TRUE) when checking the .NAME argument of foreign function calls.

Usage

dontCheck(x)

Arguments

x

an R object.

See Also

suppressForeignCheck which explains why that and dontCheck are undesirable and should be avoided if at all possible.


..., ..1, etc used in Functions

Description

... and ..1, ..2 etc are used to refer to arguments passed down from a calling function. These (and the following) can only be used inside a function which has ... among its formal arguments.

...elt(n) is a functional way to get ..n and basically the same as eval(paste0("..", n)), just more elegant and efficient. Note that switch(n, ...) is very close, differing by returning NULL invisibly instead of an error when n is zero or too large.

...length() returns the number of expressions in ..., and ...names() the names. These are the same as length(list(...)) or names(list(...)) but without evaluating the expressions in ... (which happens with list(...)).

Evaluating elements of ... with ..1, ..2, ...elt(n), etc. propagates visibility. This is consistent with the evaluation of named arguments which also propagates visibility.
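The claim that these helpers avoid evaluating the expressions in ... can be demonstrated with an argument that would fail if it were ever forced. A minimal sketch, with hypothetical function names count_args and second:

```r
count_args <- function(...) ...length()
second <- function(...) ...elt(2)

# The first argument would raise an error if evaluated;
# counting the arguments never forces it.
n <- count_args(stop("never evaluated"), 1:3)

# ...elt(2) forces only the second element.
s <- second(stop("never evaluated"), "second")

print(n)
print(s)
```

By contrast, length(list(...)) in the same position would evaluate every argument and stop with the error.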

Usage

...length()
...names()
...elt(n)

Arguments

n

a positive integer, not larger than the number of expressions in ..., which is the same as ...length() which is the same as length(list(...)), but the latter evaluates all expressions in ....

See Also

... and ..1, ..2 are reserved words in R, see Reserved.

For more, see the Introduction to R manual for usage of these syntactic elements, and dotsMethods for their use in formal (S4) methods.

Examples

tst <- function(n, ...) ...elt(n)
tst(1, pi=pi*0:1, 2:4) ## [1] 0.000000 3.141593
tst(2, pi=pi*0:1, 2:4) ## [1] 2 3 4
try(tst(1)) # -> Error about '...' not containing an element.

tst.dl <- function(x, ...) ...length()
tst.dns <- function(x, ...) ...names()
tst.dl(1:10)    # 0  (because the first argument is 'x')
tst.dl(4, 5)    # 1
tst.dl(4, 5, 6) # 2  namely '5, 6'
tst.dl(4, 5, 6, 7, sin(1:10), "foo"/"bar") # 5.    Note: no evaluation!
tst.dns(4, foo=5, 6, bar=7, sini = sin(1:10), "foo"/"bar")
##        "foo"  "" "bar"  "sini"               ""

## From R 4.1.0 to 4.1.2, ...names() sometimes did not match names(list(...));
## check and show (these examples all would've failed):
chk.n2 <- function(...) stopifnot(identical(print(...names()), names(list(...))))
chk.n2(4, foo=5, 6, bar=7, sini = sin(1:10), "bar")
chk.n2()
chk.n2(1,2)

Double-Precision Vectors

Description

Create, coerce to or test for a double-precision vector.

Usage

double(length = 0)
as.double(x, ...)
is.double(x)

single(length = 0)
as.single(x, ...)

Arguments

length

a non-negative integer specifying the desired length. Double values will be coerced to integer: supplying an argument of length other than one is an error.

x

object to be coerced or tested.

...

further arguments passed to or from other methods.

Details

double creates a double-precision vector of the specified length. The elements of the vector are all equal to 0. It is identical to numeric.

as.double is a generic function. It is identical to as.numeric. Methods should return an object of base type "double".

is.double is a test of double type.

R has no single precision data type. All real numbers are stored in double precision format. The functions as.single and single are identical to as.double and double except they set the attribute Csingle that is used in the .C and .Fortran interface, and they are intended only to be used in that context.

Value

double creates a double-precision vector of the specified length. The elements of the vector are all equal to 0.

as.double attempts to coerce its argument to be of double type: like as.vector it strips attributes including names. (To ensure that an object is of double type without stripping attributes, use storage.mode.) Character strings containing optional whitespace followed by either a decimal representation or a hexadecimal representation (starting with 0x or 0X) can be converted, as can special values such as "NA", "NaN", "Inf" and "infinity", irrespective of case.

as.double for factors yields the codes underlying the factor levels, not the numeric representation of the labels, see also factor.
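The factor case is a frequent source of silent errors, so a short illustration follows. The factor is invented for the example, and the conversion via as.character is one common workaround rather than the only one.

```r
f <- factor(c("10", "20", "10"))

codes <- as.double(f)                  # integer codes of the levels: 1 2 1
values <- as.double(as.character(f))   # numbers spelled out by the labels

# Hexadecimal strings are accepted, as described above
hex <- as.double("0x1A")

print(codes)
print(values)
print(hex)
```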

is.double returns TRUE or FALSE depending on whether its argument is of double type or not.

Double-precision values

All R platforms are required to work with values conforming to the IEC 60559 (also known as IEEE 754) standard. This basically works with a precision of 53 bits, and represents to that precision a range of absolute values from about $2 \times 10^{-308}$ to $2 \times 10^{308}$. It also has special values NaN (many of them), plus and minus infinity and plus and minus zero (although R acts as if these are the same). There are also denormal(ized) (or subnormal) numbers with values below the range given above but represented to less precision.

See .Machine for precise information on these limits. Note that ultimately how double precision numbers are handled is down to the CPU/FPU and compiler.

In IEEE 754-2008/IEC60559:2011 this is called ‘binary64’ format.

Note on names

It is a historical anomaly that R has two names for its floating-point vectors, double and numeric (and formerly had real).

double is the name of the type. numeric is the name of the mode and also of the implicit class. As an S4 formal class, use "numeric".

The potential confusion is that R has used mode "numeric" to mean ‘double or integer’, which conflicts with the S4 usage. Thus is.numeric tests the mode, not the class, but as.numeric (which is identical to as.double) coerces to the class.
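The distinction between type, mode and class drawn here can be checked interactively. The sketch below inspects one double and one integer value; the variable names are illustrative.

```r
d <- 1      # a double
i <- 1L     # an integer

types <- c(typeof(d), typeof(i))    # "double"  "integer"
modes <- c(mode(d), mode(i))        # both "numeric"
classes <- c(class(d), class(i))    # "numeric" "integer"

# is.numeric() tests the mode, so it accepts integers;
# is.double() tests the type, so it does not.
num_i <- is.numeric(i)
dbl_i <- is.double(i)

# as.numeric() is as.double(): the result is always of type double
coerced <- typeof(as.numeric(i))

print(rbind(types, modes, classes))
```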

References

Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988) The New S Language. Wadsworth & Brooks/Cole.

https://en.wikipedia.org/wiki/IEEE_754-1985, https://en.wikipedia.org/wiki/IEEE_754-2008, https://en.wikipedia.org/wiki/IEEE_754-2019, https://en.wikipedia.org/wiki/Double_precision, https://en.wikipedia.org/wiki/Denormal_number.

See Also

integer, numeric, storage.mode.

Examples

is.double(1)
all(double(3) == 0)

Write an Object to a File or Recreate it

Description

Writes an ASCII text representation of an R object to a file, the R console, or a connection, or uses one to recreate the object.

Usage

dput(x, file = "",
     control = c("keepNA", "keepInteger", "niceNames", "showAttributes"))

dget(file, keep.source = FALSE)

Arguments

x

an object.

file

either a character string naming a file or a connection. "" indicates output to the console.

control

character vector (or NULL) of deparsing options. control = "all" is thorough, see .deparseOpts.

keep.source

logical: should the source formatting be retained when parsing functions, if possible?

Details

dput opens file and deparses the object x into that file. The object name is not written (unlike dump). If x is a function the associated environment is stripped. Hence scoping information can be lost.

Deparsing an object is difficult, and not always possible. With the default control, dput() attempts to deparse in a way that is readable, but for more complex or unusual objects (see dump), not likely to be parsed as identical to the original. Use control = "all" for the most complete deparsing; use control = NULL for the simplest deparsing, not even including attributes.
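For simple objects a round trip through dput and dget does recover the original. The sketch below writes to a temporary file; the object is invented, and "hexNumeric" is one of the documented deparse options, chosen here because it represents doubles exactly.

```r
obj <- list(a = 1:3, b = c(x = 1.5, y = 2))

fil <- tempfile()
dput(obj, fil)            # default control keeps integers, names and attributes
back <- dget(fil)

# Doubles such as pi may lose digits with the default control;
# a hexadecimal representation avoids the rounding.
dput(pi, fil, control = "hexNumeric")
pi_back <- dget(fil)
unlink(fil)

print(identical(back, obj))
print(identical(pi_back, pi))
```

This is a convenience for small objects only; for transporting data between sessions the Note below recommends save or saveRDS.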

dput will warn if fewer characters were written to a file than expected, which may indicate a full or corrupt file system.

To display saved source rather than deparsing the internal representation include "useSource" in control. R currently saves source only for function definitions. If you do not care about source representation (e.g., for a data object), for speed set options(keep.source = FALSE) when calling source.

Value

For dput, the first argument invisibly.

For dget, the object created.

Note

This is not a good way to transfer objects between R sessions. dump is better, but the functions save and saveRDS are designed to be used for transporting R data, and will work with R objects that dput does not handle correctly as well as being much faster.

To avoid the risk of a source attribute out of sync with the actual function definition, the source attribute of a function will never be written as an attribute.

References

Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988) The New S Language. Wadsworth & Brooks/Cole.

See Also

deparse, .deparseOpts, dump, write.

Examples

fil <- tempfile()
## Write an ASCII version of the 'base' function mean() to our temp file, ..
dput(base::mean, fil)
## ... read it back into 'bar' and confirm it is the same
bar <- dget(fil)
stopifnot(all.equal(bar, base::mean, check.environment = FALSE))

## Create a function with comments
baz <- function(x) {
  # Subtract from one
  1-x
}
## and display it
dput(baz)
## and now display the saved source
dput(baz, control = "useSource")

## Numeric values:
xx <- pi^(1:3)
dput(xx)
dput(xx, control = "digits17")
dput(xx, control = "hexNumeric")
dput(xx, fil); dget(fil) - xx # slight rounding on all platforms
dput(xx, fil, control = "digits17")
dget(fil) - xx # slight rounding on some platforms
dput(xx, fil, control = "hexNumeric"); dget(fil) - xx
unlink(fil)

xn <- setNames(xx, paste0("pi^",1:3))
dput(xn) # nicer, now "niceNames" being part of default 'control'
dput(xn, control = "S_compat") # no names
## explicitly asking for output as in R < 3.5.0:
dput(xn, control = c("keepNA", "keepInteger", "showAttributes"))

Drop Redundant Extent Information

Description

Delete the dimensions of an array which have only one level.

Usage

drop(x)

Arguments

x

an array (including a matrix).

Value

If x is an object with a dim attribute (e.g., a matrix or array), then drop returns an object like x, but with any extents of length one removed. Any accompanying dimnames attribute is adjusted and returned with x: if the result is a vector the names are taken from the dimnames (if any). If the result is a length-one vector, the names are taken from the first dimension with a dimname.
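How dimnames carry over when extents are dropped is perhaps clearest on small inputs. The array and matrix below are invented for illustration.

```r
# A 1 x 3 x 2 array: only the first extent has length one
a <- array(1:6, dim = c(1, 3, 2),
           dimnames = list("r", c("a", "b", "c"), NULL))
da <- drop(a)                 # becomes a 3 x 2 matrix

# A 1 x 3 matrix collapses to a vector; names come from the dimnames
m <- matrix(1:3, nrow = 1, dimnames = list("row", c("a", "b", "c")))
v <- drop(m)

print(dim(da))
print(v)
```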

Array subsetting ([) performs this reduction unless used with drop = FALSE, but sometimes it is useful to invoke drop directly.

See Also

drop1 which is used for dropping terms in models, and droplevels used for dropping unused levels from a factor.

Examples

dim(drop(array(1:12, dim = c(1,3,1,1,2,1,2)))) # = 3 2 2
drop(1:3 %*% 2:4) # scalar product

Drop Unused Levels from Factors

Description

The function droplevels is used to drop unused levels from a factor or, more commonly, from factors in a data frame.

Usage

droplevels(x, ...)
## S3 method for class 'factor'
droplevels(x, exclude = if(anyNA(levels(x))) NULL else NA, ...)
## S3 method for class 'data.frame'
droplevels(x, except, exclude, ...)

Arguments

x

an object from which to drop unused factor levels.

exclude

passed to factor(); factor levels which should be excluded from the result even if present. Note that this was implicitly NA in R <= 3.3.1 which did drop NA levels even when present in x, contrary to the documentation. The current default is compatible with x[ , drop=TRUE].

...

further arguments passed to methods.

except

indices of columns from which not to drop levels.

Details

The method for class "factor" is currently equivalent to factor(x, exclude=exclude). For the data frame method, you should rarely specify exclude “globally” for all factor columns; rather the default uses the same factor-specific exclude as the factor method itself.

The except argument follows the usual indexing rules.

Value

droplevels returns an object of the same class as x.

Note

This function was introduced in R 2.12.0. It is primarily intended for cases where one or more factors in a data frame contains only elements from a reduced level set after subsetting. (Notice that subsetting does not in general drop unused levels). By default, levels are dropped from all factors in a data frame, but the except argument allows you to specify columns for which this is not wanted.
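The behaviour described in this note can be reproduced on a three-level factor. This is a minimal sketch with invented data; the column names u and v are illustrative.

```r
f <- factor(c("a", "b", "c"))
g <- f[f != "b"]                  # subsetting keeps the unused level "b"

lev_before <- levels(g)
lev_after <- levels(droplevels(g))

# Data frame method: skip the second column via 'except'
df <- data.frame(u = g, v = g)
out <- droplevels(df, except = 2)

print(lev_before)
print(lev_after)
print(sapply(out, nlevels))
```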

See Also

subset for subsetting data frames. factor for definition of factors. drop for dropping array dimensions. drop1 for dropping terms from a model. [.factor for subsetting of factors.

Examples

aq <- transform(airquality, Month = factor(Month, labels = month.abb[5:9]))
aq <- subset(aq, Month != "Jul")
table(           aq$Month)
table(droplevels(aq)$Month)

Text Representations of R Objects

Description

This function takes a vector of names of R objects and produces text representations of the objects on a file or connection. A dump file can usually be sourced into another R session.

Usage

dump(list, file = "dumpdata.R", append = FALSE,
     control = "all", envir = parent.frame(), evaluate = TRUE)

Arguments

list

character vector (or NULL). The names of R objects to be dumped.

file

either a character string naming a file or a connection. "" indicates output to the console.

append

if TRUE and file is a character string, output will be appended to file; otherwise, it will overwrite the contents of file.

control

character vector (or NULL) indicating deparsing options. See .deparseOpts for their description.

envir

the environment to search for objects.

evaluate

logical. Should promises be evaluated?

Details

If some of the objects named do not exist (in scope), they are omitted, with a warning. If file is a file and no objects exist then no file is created.

sourceing may not produce an identical copy of dumped objects. A warning is issued if it is likely that problems will arise, for example when dumping exotic or complex objects (see the Note).

dump will also warn if fewer characters were written to a file than expected, which may indicate a full or corrupt file system.

A dump file can be sourced into another R (or perhaps S) session, but the functions save and saveRDS are designed to be used for transporting R data, and will work with R objects that dump does not handle. For maximal reproducibility use control = "exact".

To produce a more readable representation of an object, use control = NULL. This will skip attributes, and will make other simplifications that make source less likely to produce an identical copy. See .deparseOpts for details.
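A dump followed by source can be sketched as below. Passing envir explicitly sidesteps the search-order caveat mentioned in the Note; the environments src and dst and the objects x and y are hypothetical names used only for this illustration.

```r
# Keep the objects in their own environment and name it explicitly
src <- new.env()
assign("x", 1:5, envir = src)
assign("y", c(a = 1.5, b = 2), envir = src)

fil <- tempfile(fileext = ".R")
dumped <- dump(c("x", "y"), file = fil, envir = src)   # default control = "all"

# Recreate the objects in a separate environment
dst <- new.env()
source(fil, local = dst)
unlink(fil)

print(dumped)
print(ls(dst))
```

For these simple vectors the recreated objects are identical to the originals; with control = NULL the names on y would not be written.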

To deparse the internal representation of a function rather than displaying the saved source, use control = c("keepInteger", "warnIncomplete", "keepNA"). This will lose all formatting and comments, but may be useful in those cases where the saved source is no longer correct.

Promises will normally only be encountered by users as a result of lazy-loading (when the default evaluate = TRUE is essential) and after the use of delayedAssign, when evaluate = FALSE might be intended.

Value

An invisible character vector containing the names of the objects which were dumped.

Note

As dump is defined in the base namespace, the base package will be searched before the global environment unless dump is called from the top level prompt or the envir argument is given explicitly.

To avoid the risk of a source attribute becoming out of sync with the actual function definition, the source attribute of a function will never be dumped as an attribute.

Currently environments, external pointers, weak references and objects of type S4 are not deparsed in a way that can be sourced. In addition, language objects are deparsed in a simple way whatever the value of control, and this includes not dumping their attributes (which will result in a warning).

References

Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988) The New S Language. Wadsworth & Brooks/Cole.

See Also

.deparseOpts for available control settings; dput(), dget() and deparse() for related functions using identical internal deparsing functionality.

write, write.table, etc for “dumping” data to (text) files.

save and saveRDS for a more reliable way to save R objects.

Examples

x <- 1; y <- 1:10
fil <- tempfile(fileext=".Rdmped")
dump(ls(pattern = '^[xyz]'), fil)
print(.Last.value)
unlink(fil)

Determine Duplicate Elements

Description

duplicated() determines which elements of a vector or dataframe are duplicatesof elements with smaller subscripts, and returns a logical vectorindicating which elements (rows) are duplicates.

anyDuplicated(.) is a “generalized” more efficient version of any(duplicated(.)), returning positive integer indices instead of just TRUE.

Usage

duplicated(x, incomparables = FALSE, ...)

## Default S3 method:
duplicated(x, incomparables = FALSE,
           fromLast = FALSE, nmax = NA, ...)

## S3 method for class 'array'
duplicated(x, incomparables = FALSE, MARGIN = 1,
           fromLast = FALSE, ...)

anyDuplicated(x, incomparables = FALSE, ...)

## Default S3 method:
anyDuplicated(x, incomparables = FALSE,
           fromLast = FALSE, ...)

## S3 method for class 'array'
anyDuplicated(x, incomparables = FALSE,
           MARGIN = 1, fromLast = FALSE, ...)

Arguments

x

a vector or a data frame or an array or NULL.

incomparables

a vector of values that cannot be compared. FALSE is a special value, meaning that all values can be compared, and may be the only value accepted for methods other than the default. It will be coerced internally to the same type as x.

fromLast

logical indicating if duplication should be considered from the reverse side, i.e., the last (or rightmost) of identical elements would correspond to duplicated = FALSE.

nmax

the maximum number of unique items expected (greater than one).

...

arguments for particular methods.

MARGIN

the array margin to be held fixed: see apply, and note that MARGIN = 0 may be useful.

Details

These are generic functions with methods for vectors (including lists), data frames and arrays (including matrices).

For the default methods, and whenever there are equivalent method definitions for duplicated and anyDuplicated, anyDuplicated(x, ...) is a “generalized” shortcut for any(duplicated(x, ...)), in the sense that it returns the index i of the first duplicated entry x[i] if there is one, and 0 otherwise. Their behaviours may be different when at least one of duplicated and anyDuplicated has a relevant method.

duplicated(x, fromLast = TRUE) is equivalent to but faster than rev(duplicated(rev(x))).
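
The relationships described above can be checked on a small vector. This is only a sketch for the default method; the vector x is made up for the purpose.

```r
x <- c(1, 2, 2, 3, 1)

duplicated(x)               # FALSE FALSE  TRUE FALSE  TRUE
anyDuplicated(x)            # 3, the index of the first duplicated entry
anyDuplicated(c(1, 2, 3))   # 0, no duplicates

# With fromLast = TRUE the last occurrence of each value is left unmarked.
from_last <- duplicated(x, fromLast = TRUE)
from_last                   #  TRUE  TRUE FALSE FALSE FALSE
identical(from_last, rev(duplicated(rev(x))))   # TRUE
```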

The array method calculates for each element of the sub-array specified by MARGIN if the remaining dimensions are identical to those for an earlier (or later, when fromLast = TRUE) element (in row-major order). This would most commonly be used to find duplicated rows (the default) or columns (with MARGIN = 2). Note that MARGIN = 0 returns an array of the same dimensionality attributes as x.
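
A small matrix makes the role of MARGIN concrete. The matrix m below is invented for the example; as.vector() is used only to strip any dimension attributes from the printed result.

```r
# Columns "a" and "c" are identical; the two rows differ.
m <- cbind(a = c(1, 2), b = c(3, 4), c = c(1, 2))

as.vector(duplicated(m))               # rows (MARGIN = 1):    FALSE FALSE
as.vector(duplicated(m, MARGIN = 2))   # columns (MARGIN = 2): FALSE FALSE TRUE

# MARGIN = 0 compares individual elements and keeps the shape of m.
el <- duplicated(m, MARGIN = 0)
dim(el)                                # 2 3
```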

Missing values ("NA") are regarded as equal, numeric and complex ones differing from NaN; character strings will be compared in a “common encoding”; for details, see match (and unique) which use the same concept.

Values in incomparables will never be marked as duplicated. This is intended to be used for a fairly small set of values and will not be efficient for a very large set.
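
The treatment of NA and of incomparables can be seen side by side in the following sketch (default method only; the data are made up).

```r
x <- c(NA, 1, NA, 1, 2, 2)

d_default <- duplicated(x)                             # NAs are regarded as equal
d_na      <- duplicated(x, incomparables = NA)         # NA is never marked
d_na_2    <- duplicated(x, incomparables = c(NA, 2))   # neither NA nor 2 is marked

rbind(d_default, d_na, d_na_2)
```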

Except for factors, logical and raw vectors the default nmax = NA is equivalent to nmax = length(x). Since a hash table of size 8*nmax bytes is allocated, setting nmax suitably can save large amounts of memory. For factors it is automatically set to the smaller of length(x) and the number of levels plus one (for NA). If nmax is set too small there is liable to be an error: nmax = 1 is silently ignored.

Long vectors are supported for the default method of duplicated, but may only be usable if nmax is supplied.

Value

duplicated(): For a vector input, a logical vector of the same length as x. For a data frame, a logical vector with one element for each row. For a matrix or array, and when MARGIN = 0, a logical array with the same dimensions and dimnames.

anyDuplicated(): an integer or real vector of length one with value the 1-based index of the first duplicate if any, otherwise 0.

Warning

Using this for lists is potentially slow, especially if the elements are not atomic vectors (see vector) or differ only in their attributes. In the worst case it is $O(n^2)$.

References

Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988) The New S Language. Wadsworth & Brooks/Cole.

See Also

unique.

Examples

x <- c(9:20, 1:5, 3:7, 0:8)
## extract unique elements
(xu <- x[!duplicated(x)])
## similar, same elements but different order:
(xu2 <- x[!duplicated(x, fromLast = TRUE)])

## xu == unique(x) but unique(x) is more efficient
stopifnot(identical(xu,  unique(x)),
          identical(xu2, unique(x, fromLast = TRUE)))

duplicated(iris)[140:143]

duplicated(iris3, MARGIN = c(1, 3))
anyDuplicated(iris) ## 143

anyDuplicated(x)
anyDuplicated(x, fromLast = TRUE)

Foreign Function Interface

Description

Load or unload DLLs (also known as shared objects), and test whether a C function or Fortran subroutine is available.

Usage

dyn.load(x, local = TRUE, now = TRUE, ...)
dyn.unload(x)

is.loaded(symbol, PACKAGE = "", type = "")

Arguments

x

a character string giving the pathname to a DLL, also known as a dynamic shared object. (See ‘Details’ for what these terms mean.)

local

a logical value controlling whether the symbols in the DLL are stored in their own local table and not shared across DLLs, or added to the global symbol table. Whether this has any effect is system-dependent.

now

a logical controlling whether all symbols are resolved (and relocated) immediately when the library is loaded or deferred until they are used. This control is useful for developers testing whether a library is complete and has all the necessary symbols, and for users to ignore missing symbols. Whether this has any effect is system-dependent.

...

other arguments for future expansion.

symbol

a character string giving a symbol name.

PACKAGE

if supplied, confine the search for the name to the DLL given by this argument (plus the conventional extension, ‘.so’, ‘.sl’, ‘.dll’, ...). This is intended to add safety for packages, which can ensure by using this argument that no other package can override their external symbols. This is used in the same way as in the .C, .Call, .Fortran and .External functions.

type

the type of symbol to look for: can be any ("", the default), "Fortran", "Call" or "External".

Details

The objects dyn.load loads are called ‘dynamically loadable libraries’ (abbreviated to ‘DLL’) on all platforms except macOS, which uses the term for a different sort of object. On Unix-alikes they are also called ‘dynamic shared objects’ (‘DSO’), or ‘shared objects’ for short. (The POSIX standards use ‘executable object file’, but no one else does.)

See ‘See Also’ and the ‘Writing R Extensions’ and ‘R Installation and Administration’ manuals for how to create and install a suitable DLL.

Unfortunately some rare platforms (e.g., Compaq Tru64) do not handle the PACKAGE argument correctly, and may incorrectly find symbols linked into R.

The additional arguments to dyn.load mirror the different aspects of the mode argument to the dlopen() routine on POSIX systems. They are available so that users can exercise greater control over the loading process for an individual library. In general, the default values are appropriate and you should override them only if there is good reason and you understand the implications.

The local argument allows one to control whether the symbols in the DLL being attached are visible to other DLLs. While maintaining the symbols in their own namespace is good practice, the ability to share symbols across related ‘chapters’ is useful in many cases. Additionally, on certain platforms and versions of an operating system, certain libraries must have their symbols loaded globally to successfully resolve all symbols.

One should be careful of one potential side-effect of using lazy loading via now = FALSE: if a routine is called that has a missing symbol, the process will terminate immediately. The intended use is for library developers to call this with value TRUE to check that all symbols are actually resolved and for regular users to call it with FALSE so that missing symbols can be ignored and the available ones can be called.

The initial motivation for adding these was to avoid such termination in the _init() routines of the Java virtual machine library. However, symbols loaded locally may not be (read: probably are not) available to other DLLs. Those added to the global table are available to all other elements of the application and so can be shared across two different DLLs.

Some (very old) systems do not provide (explicit) support for local/global and lazy/eager symbol resolution. This can be the source of subtle bugs. One can arrange to have warning messages emitted when unsupported options are used. This is done by setting either of the options verbose or warn to be non-zero via the options function.

There is a short discussion of these additional arguments with some example code available at https://www.stat.ucdavis.edu/~duncan/R/dynload/.

Value

The function dyn.load is used for its side effect which links the specified DLL to the executing R image. Calls to .C, .Call, .Fortran and .External can then be used to execute compiled C functions or Fortran subroutines contained in the library. The return value of dyn.load is an object of class DLLInfo. See getLoadedDLLs for information about this class.
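
Loading a DLL needs a compiled file, so the sketch below only inspects DLLs that are already loaded in the session, using getLoadedDLLs() to show what a DLLInfo object looks like. Which DLLs are listed depends on the session; the variable names are arbitrary.

```r
dlls <- getLoadedDLLs()
head(names(dlls))            # names of the DLLs currently loaded

info <- dlls[[1]]            # one entry; which DLL comes first is not assumed
inherits(info, "DLLInfo")    # TRUE

# Use [[ ]] rather than $ here: $ on a DLLInfo object looks up a native symbol.
info[["name"]]
info[["path"]]
```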

The function dyn.unload unlinks the DLL. Note that unloading a DLL and then re-loading a DLL of the same name may or may not work: on Solaris it used the first version loaded. Note also that some DLLs cannot be safely unloaded at all: unloading a DLL which implements C finalizers but does not unregister them on unload causes R to crash.

is.loaded checks if the symbol name is loaded and searchable and hence available for use as a character string value for argument .NAME in .C, .Fortran, .Call, or .External. It will succeed if any one of the four calling functions would succeed in using the entry point unless type is specified. (See .Fortran for how Fortran symbols are mapped.) Note that symbols in base packages are not searchable, and other packages can be so marked.

Warning

Do not use dyn.unload on a DLL loaded by library.dynam: use library.dynam.unload. This is needed for system housekeeping.

Note

is.loaded requires the name you would give to .C etc. It must be a character string and so cannot be an R object as used for registered native symbols (see “Writing R Extensions” section 5.4). Some registered symbols are available by name but most are not, including those in the examples below.

By default, the maximum number of DLLs that can be loaded is now 614 when the OS limit on the number of open files allows or can be increased, but less otherwise (but it will be at least 100). A specific maximum can be requested via the environment variable R_MAX_NUM_DLLS, which has to be set (to a value between 100 and 1000 inclusive) before starting an R session. If the OS limit on the number of open files does not allow using this maximum and cannot be increased, R will fail to start with an error. The maximum is not allowed to be greater than 60% of the OS limit on the number of open files (essentially unlimited on Windows, on Unix typically 1024, but 256 on macOS). The limit can sometimes (including on macOS) be modified using command ulimit -n (sh, bash) or limit descriptors (csh) in the shell used to launch R. Increasing R_MAX_NUM_DLLS comes with some memory overhead, and be aware that many types of connections also use file descriptors.

If the OS limit on the number of open files cannot be determined, the DLL limit is 100 and cannot be changed via R_MAX_NUM_DLLS.

The creation of DLLs and the runtime linking of them into executing programs is very platform dependent. In recent years there has been some simplification in the process because the C subroutine call dlopen has become the POSIX standard for doing this. Under Unix-alikes dyn.load uses the dlopen mechanism and should work on all platforms which support it. On Windows it uses the standard mechanism (LoadLibrary) for loading DLLs.

The original code for loading DLLs in Unix-alikes was provided by Heiner Schwarte.

References

Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988) The New S Language. Wadsworth & Brooks/Cole.

See Also

library.dynam to be used inside a package's .onLoad initialization.

SHLIB for how to create suitable DLLs.

.C, .Fortran, .External, .Call.

Examples

## expect all of these to be false in R >= 3.0.0 as these can only be
## used via registered symbols.
is.loaded("supsmu") # Fortran entry point in stats
is.loaded("supsmu", "stats", "Fortran")
is.loaded("PDF", type = "External") # pdf() device in grDevices

Apply a Function Over Values in an Environment

Description

eapply applies FUN to the named values from an environment and returns the results as a list. The user can request that all named objects are used (normally names that begin with a dot are not). The output is not sorted and no enclosing environments are searched.

Usage

eapply(env, FUN, ..., all.names = FALSE, USE.NAMES = TRUE)

Arguments

env

environment to be used.

FUN

the function to be applied, found via match.fun. In the case of functions like +, %*%, etc., the function name must be backquoted or quoted.

...

optional arguments to FUN.

all.names

a logical indicating whether to apply the function to all values.

USE.NAMES

logical indicating whether the resulting list should have names.

Value

A named (unless USE.NAMES = FALSE) list. Note that the order of the components is arbitrary for hashed environments.
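
The effect of all.names and USE.NAMES, and the need to order results explicitly, can be sketched as follows. The environment contents are invented for the example.

```r
env <- new.env()        # hashed by default, so component order is arbitrary
env$a <- 1:4
env$b <- c(2.5, 3.5)
env$.hidden <- 100      # the name begins with a dot

res <- eapply(env, sum)
res[order(names(res))]  # impose an order explicitly when it matters

with_hidden <- eapply(env, sum, all.names = TRUE)
names(with_hidden)      # now also contains ".hidden"

unnamed <- eapply(env, sum, USE.NAMES = FALSE)
names(unnamed)          # NULL
```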

See Also

environment, lapply.

Examples

require(stats)

env <- new.env(hash = FALSE) # so the order is fixed
env$a <- 1:10
env$beta <- exp(-3:3)
env$logic <- c(TRUE, FALSE, FALSE, TRUE)
# what have we there?
utils::ls.str(env)

# compute the mean for each list element
       eapply(env, mean)
unlist(eapply(env, mean, USE.NAMES = FALSE))

# median and quartiles for each element (making use of "..." passing):
eapply(env, quantile, probs = 1:3/4)
eapply(env, quantile)

Spectral Decomposition of a Matrix

Description

Computes eigenvalues and eigenvectors of numeric (double, integer, logical) or complex matrices.

Usage

eigen(x, symmetric, only.values = FALSE, EISPACK = FALSE)

Arguments

x

a numeric or complex matrix whose spectral decomposition is to be computed. Logical matrices are coerced to numeric.

symmetric

if TRUE, the matrix is assumed to be symmetric (or Hermitian if complex) and only its lower triangle (diagonal included) is used. If symmetric is not specified, isSymmetric(x) is used.

only.values

if TRUE, only the eigenvalues are computed and returned, otherwise both eigenvalues and eigenvectors are returned.

EISPACK

logical. Defunct and ignored.

Details

If symmetric is unspecified, isSymmetric(x) determines if the matrix is symmetric up to plausible numerical inaccuracies. It is surer and typically much faster to set the value yourself.

Computing the eigenvectors is the slow part for large matrices.

Computing the eigendecomposition of a matrix is subject to errors on a real-world computer: the definitive analysis is Wilkinson (1965). All you can hope for is a solution to a problem suitably close to x. So even though a real asymmetric x may have an algebraic solution with repeated real eigenvalues, the computed solution may be of a similar matrix with complex conjugate pairs of eigenvalues.

Unsuccessful results from the underlying LAPACK code will result in an error giving a positive error code (most often 1): these can only be interpreted by detailed study of the FORTRAN code.

Missing, NaN or infinite values in x will give an error.

Value

The spectral decomposition ofx is returned as a list with components

values

a vector containing the $p$ eigenvalues of x, sorted in decreasing order, according to Mod(values) in the asymmetric case when they might be complex (even for real matrices). For real asymmetric matrices the vector will be complex only if complex conjugate pairs of eigenvalues are detected.

vectors

either a $p \times p$ matrix whose columns contain the eigenvectors of x, or NULL if only.values is TRUE. The vectors are normalized to unit length.

Recall that the eigenvectors are only defined up to a constant: even when the length is specified they are still only defined up to a scalar of modulus one (the sign for real matrices).

When only.values is not true, as by default, the result is of S3 class "eigen".

If r <- eigen(A), and V <- r$vectors; lam <- r$values, then

$$A = V \Lambda V^{-1}$$

(up to numerical fuzz), where $\Lambda$ = diag(lam).
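
The identity can be checked numerically. The matrices A and B below are made up for the illustration; all.equal() is used because the reconstruction is exact only up to rounding error.

```r
# A small symmetric matrix.
A <- matrix(c(2, 1,
              1, 3), nrow = 2, byrow = TRUE)
r <- eigen(A)
V <- r$vectors; lam <- r$values

all.equal(A, V %*% diag(lam) %*% solve(V))   # TRUE
# For a real symmetric matrix V is orthogonal, so t(V) can stand in for solve(V).
all.equal(A, V %*% diag(lam) %*% t(V))       # TRUE

# A real asymmetric matrix (a rotation) with a complex conjugate pair of eigenvalues.
B <- matrix(c(0, -1,
              1,  0), nrow = 2, byrow = TRUE)
rb <- eigen(B)
is.complex(rb$values)                        # TRUE
all.equal(B + 0i, rb$vectors %*% diag(rb$values) %*% solve(rb$vectors))   # TRUE
```

Note that diag(lam) builds a diagonal matrix only when lam has length greater than one; for a 1 x 1 problem diag(lam, nrow = 1) would be needed.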

Source

eigen uses the LAPACK routines DSYEVR, DGEEV, ZHEEV and ZGEEV.

LAPACK is from https://netlib.org/lapack/ and its guide is listed in the references.

References

Anderson, E. and ten others (1999) LAPACK Users' Guide. Third Edition. SIAM.
Available on-line at https://netlib.org/lapack/lug/lapack_lug.html.

Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988) The New S Language. Wadsworth & Brooks/Cole.

Wilkinson, J. H. (1965) The Algebraic Eigenvalue Problem. Clarendon Press, Oxford.

See Also

svd, a generalization of eigen; qr, and chol for related decompositions.

To compute the determinant of a matrix, the qr decomposition is much more efficient: det.

Examples

eigen(cbind(c(1,-1), c(-1,1)))
eigen(cbind(c(1,-1), c(-1,1)), symmetric = FALSE)
# same (different algorithm).

eigen(cbind(1, c(1,-1)), only.values = TRUE)
eigen(cbind(-1, 2:1)) # complex values
eigen(print(cbind(c(0, 1i), c(-1i, 0)))) # Hermite ==> real Eigenvalues
## 3 x 3:
eigen(cbind( 1, 3:1, 1:3))
eigen(cbind(-1, c(1:2,0), 0:2)) # complex values

Encode Character Vector as for Printing

Description

encodeString escapes the strings in a character vector in the same way print.default does, and optionally fits the encoded strings within a field width.

Usage

encodeString(x, width = 0, quote = "", na.encode = TRUE,
             justify = c("left", "right", "centre", "none"))

Arguments

x

a character vector, or an object that can be coerced to one by as.character.

width

integer: the minimum field width. If NULL or NA, this is taken to be the largest field width needed for any element of x.

quote

character: quoting character, if any.

na.encode

logical: should NA strings be encoded?

justify

character: partial matches are allowed. If padding to the minimum field width is needed, how should spaces be inserted? justify == "none" is equivalent to width = 0, for consistency with format.default.

Details

This escapes backslash and the control characters ‘\a’ (bell), ‘\b’ (backspace), ‘\f’ (form feed), ‘\n’ (line feed, aka “newline”), ‘\r’ (carriage return), ‘\t’ (tab) and ‘\v’ (vertical tab) as well as any non-printable characters in a single-byte locale, which are printed in octal notation (‘\xyz’ with leading zeroes).

Which characters are non-printable depends on the current locale. Windows' reporting of printable characters is unreliable, so there all other control characters are regarded as non-printable, and all characters with codes 32–255 as printable in a single-byte locale. See print.default for how non-printable characters are handled in multi-byte locales.

If quote is a single or double quote any embedded quote of the same type is escaped. Note that justification is of the quoted string, hence spaces are added outside the quotes.
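
A few small calls illustrate the escaping, quoting and padding rules described above. The strings are made up; writeLines() is used so that the encoded result is shown as it would appear, without a second layer of escaping from print().

```r
s <- "tab\there"
enc <- encodeString(s)
writeLines(enc)          # tab\there  (the tab is now a backslash followed by t)
nchar(s); nchar(enc)     # 8 and 9

# An embedded quote of the same type is escaped before the string is wrapped.
q <- encodeString("it's", quote = "'")
writeLines(q)            # 'it\'s'

# width = NA pads to the widest element; padding goes outside the quotes.
padded <- encodeString(c("a", "abc"), width = NA)
right  <- encodeString("a", width = 5, quote = '"', justify = "right")
writeLines(paste0("[", c(padded, right), "]"))
```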

Value

A character vector of the same length as x, with the same attributes (including names and dimensions) but with no class set.

Marked UTF-8 encodings are preserved.

Note

The default for width is different from format.default, which does similar things for character vectors but without encoding using escapes.

See Also

print.default

Examples

x <- "ab\bc\ndef"
print(x)
cat(x) # interprets escapes
cat(encodeString(x), "\n", sep = "") # similar to print()

factor(x) # makes use of this to print the levels

x <- c("a", "ab", "abcde")
encodeString(x) # width = 0: use as little as possible
encodeString(x, 2) # use two or more (left justified)
encodeString(x, width = NA) # left justification
encodeString(x, width = NA, justify = "c")
encodeString(x, width = NA, justify = "r")
encodeString(x, width = NA, quote = "'", justify = "r")

Read or Set the Declared Encodings for a Character Vector

Description

Read or set the declared encodings for a character vector.

Usage

Encoding(x)
Encoding(x) <- value

enc2native(x)
enc2utf8(x)

Arguments

x

A character vector.

value

A character vector of positive length.

Details

Character strings in R can be declared to be encoded in "latin1" or "UTF-8" or as "bytes". These declarations can be read by Encoding, which will return a character vector of values "latin1", "UTF-8", "bytes" or "unknown", or set, when value is recycled as needed and other values are silently treated as "unknown". ASCII strings will never be marked with a declared encoding, since their representation is the same in all supported encodings. Strings marked as "bytes" are intended to be non-ASCII strings which should be manipulated as bytes, and never converted to a character encoding (so writing them to a text file is supported only by writeLines(useBytes = TRUE)).

enc2native and enc2utf8 convert elements of character vectors to the native encoding or UTF-8 respectively, taking any marked encoding into account. They are primitive functions, designed to do minimal copying.
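
The following sketch shows declaring an encoding, converting with enc2utf8, and the special treatment of ASCII strings. The example strings are arbitrary; the bytes shown for the converted string are those of the UTF-8 form of the Latin-1 input.

```r
x <- "fran\xE7ais"           # the byte E7 is c-cedilla in Latin-1
Encoding(x) <- "latin1"      # declare how the bytes are to be read
Encoding(x)                  # "latin1"

xx <- enc2utf8(x)            # converts, taking the declared encoding into account
Encoding(xx)                 # "UTF-8"
charToRaw(xx)                # E7 has become the two bytes c3 a7

a <- "plain ascii"
Encoding(a) <- "UTF-8"
Encoding(a)                  # "unknown": ASCII strings are never marked
```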

There are other ways for character strings to acquire a declared encoding apart from explicitly setting it (and these have changed as R has evolved). The parser marks strings containing ‘\u’ or ‘\U’ escapes. Functions scan, read.table, readLines, and parse have an encoding argument that is used to declare encodings, iconv declares encodings from its to argument, and console input in suitable locales is also declared. intToUtf8 declares its output as "UTF-8", and output text connections (see textConnection) are marked if running in a suitable locale. Under some circumstances (see its help page) source(encoding=) will mark encodings of character strings it outputs.

Most character manipulation functions will set the encoding on output strings if it was declared on the corresponding input. These include chartr, strsplit(useBytes = FALSE), tolower and toupper as well as sub(useBytes = FALSE) and gsub(useBytes = FALSE). Note that such functions do not preserve the encoding, but if they know the input encoding and that the string has been successfully re-encoded (to the current encoding or UTF-8), they mark the output.

substr does preserve the encoding, and chartr, tolower and toupper preserve UTF-8 encoding on systems with Unicode wide characters. With their fixed and perl options, strsplit, sub and gsub will give a marked UTF-8 result if any of the inputs are UTF-8.

paste and sprintf return elements marked as bytes if any of the corresponding inputs is marked as bytes, and otherwise marked as UTF-8 if any of the inputs is marked as UTF-8.

match, pmatch, charmatch, duplicated and unique all match in UTF-8 if any of the elements are marked as UTF-8.

Changing the current encoding from a running R session may lead to confusion (see Sys.setlocale).

There is some ambiguity as to what is meant by a ‘Latin-1’ locale, since some OSes (notably Windows) make use of character positions undefined (or used for control characters) in the ISO 8859-1 character set. How such characters are interpreted is system-dependent but as from R 3.5.0 they are if possible interpreted as per Windows codepage 1252 (which Microsoft calls ‘Windows Latin 1 (ANSI)’) when converting to e.g. UTF-8.

Value

A character vector.

For enc2utf8 encodings are always marked: they are for enc2native in UTF-8 and Latin-1 locales.

Examples

## x is intended to be in latin1
x. <- x <- "fran\xE7ais"
Encoding(x.) # "unknown" (UTF-8 loc.) | "latin1" (8859-1/CP-1252 loc.) | ....
Encoding(x) <- "latin1"
x
xx <- iconv(x, "latin1", "UTF-8")
Encoding(c(x., x, xx))
c(x, xx)
xb <- xx; Encoding(xb) <- "bytes"
xb # will be encoded in hex
cat("x = ", x, ", xx = ", xx, ", xb = ", xb, "\n", sep = "")
(Ex <- Encoding(c(x.,x,xx,xb)))
stopifnot(identical(Ex, c(Encoding(x.), Encoding(x),
                          Encoding(xx), Encoding(xb))))

Environment Access

Description

Get, set, test for and create environments.

Usage

environment(fun = NULL)
environment(fun) <- value

is.environment(x)

.GlobalEnv
globalenv()
.BaseNamespaceEnv

emptyenv()
baseenv()

new.env(hash = TRUE, parent = parent.frame(), size = 29L)

parent.env(env)
parent.env(env) <- value

environmentName(env)

env.profile(env)

Arguments

fun

a function, a formula, or NULL, which is the default.

value

an environment to associate with the function.

x

an arbitrary R object.

hash

a logical, if TRUE the environment will use a hash table.

parent

an environment to be used as the enclosure of the environment created.

env

an environment.

size

an integer specifying the initial size for a hashed environment. An internal default value will be used if size is NA or zero. This argument is ignored if hash is FALSE.

Details

Environments consist of a frame, or collection of named objects, and a pointer to an enclosing environment. The most common example is the frame of variables local to a function call; its enclosure is the environment where the function was defined (unless changed subsequently). The enclosing environment is distinguished from the parent frame: the latter (returned by parent.frame) refers to the environment of the caller of a function. Since confusion is so easy, it is best never to use ‘parent’ in connection with an environment (despite the presence of the function parent.env).
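
The difference between the enclosure and the parent frame can be made concrete with a function that is defined in one place and called from another. All function and variable names below are invented for the illustration.

```r
make_fn <- function() {
  local_var <- "visible through the enclosure"
  function() {
    list(enclosure = parent.env(environment()),  # where the function was defined
         caller    = parent.frame())             # the frame it was called from
  }
}
fn <- make_fn()

call_it <- function() {
  res <- fn()
  # Is the reported caller this very function call?
  res$called_from_here <- identical(res$caller, environment())
  res
}
out <- call_it()

out$called_from_here                                          # TRUE
identical(out$enclosure, environment(fn))                     # TRUE
exists("local_var", envir = out$enclosure, inherits = FALSE)  # TRUE
exists("local_var", envir = out$caller,    inherits = FALSE)  # FALSE
```

The enclosure is fixed when fn is created, whereas the parent frame changes with every call site.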

When get or exists search an environment with the default inherits = TRUE, they look for the variable in the frame, then in the enclosing frame, and so on.

The global environment .GlobalEnv, more often known as the user's workspace, is the first item on the search path. It can also be accessed by globalenv(). On the search path, each item's enclosure is the next item.

The object .BaseNamespaceEnv is the namespace environment for the base package. The environment of the base package itself is available as baseenv().

If one follows the chain of enclosures found by repeatedly calling parent.env from any environment, eventually one reaches the empty environment emptyenv(), into which nothing may be assigned.
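
Walking the chain of enclosures can be sketched with a small helper. enclosure_chain is a hypothetical function written for this illustration, not part of base R.

```r
# Collect the names of all environments from env up to, but excluding,
# the empty environment.
enclosure_chain <- function(env) {
  out <- character()
  while (!identical(env, emptyenv())) {
    out <- c(out, environmentName(env))
    env <- parent.env(env)
  }
  out
}

e1 <- new.env(parent = baseenv())
e2 <- new.env(parent = e1)
enclosure_chain(e2)                     # "" "" "base": unnamed environments give ""

# Starting from the global environment the walk follows the search path.
tail(enclosure_chain(globalenv()), 1)   # "base"

# Assignment into the empty environment fails.
res <- try(assign("x", 1, envir = emptyenv()), silent = TRUE)
inherits(res, "try-error")              # TRUE
```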

The replacement function parent.env<- is extremely dangerous as it can be used to destructively change environments in ways that violate assumptions made by the internal C code. It may be removed in the near future.

The replacement form of environment, is.environment, baseenv, emptyenv and globalenv are primitive functions.

System environments, such as the base, global and empty environments, have names as do the package and namespace environments and those generated by attach(). Other environments can be named by giving a "name" attribute, but this needs to be done with care as environments have unusual copying semantics.

Value

If fun is a function or a formula then environment(fun) returns the environment associated with that function or formula. If fun is NULL then the current evaluation environment is returned.

The replacement form sets the environment of the function or formula fun to the value given.

is.environment(obj) returns TRUE if and only if obj is an environment.

new.env returns a new (empty) environment with (by default) enclosure the parent frame.

parent.env returns the enclosing environment of its argument.

parent.env<- sets the enclosing environment of its first argument.

environmentName returns a character string, that given when the environment is printed or "" if it is not a named environment.

env.profile returns a list with the following components: size the number of chains that can be stored in the hash table, nchains the number of non-empty chains in the table (as reported by HASHPRI), and counts an integer vector giving the length of each chain (zero for empty chains). This function is intended to assess the performance of hashed environments. When env is a non-hashed environment, NULL is returned.
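
A brief sketch of env.profile on a small hashed environment follows; the requested size and the two stored objects are arbitrary choices for the example.

```r
eh <- new.env(hash = TRUE, size = 10L)
assign("a", 1, envir = eh)
assign("b", 2, envir = eh)

prof <- env.profile(eh)
names(prof)                          # "size" "nchains" "counts"
prof$size == length(prof$counts)     # TRUE: one count per chain
sum(prof$counts)                     # 2: total number of stored objects
prof$nchains                         # at most 2; it is 1 if both names hash to one chain

env.profile(new.env(hash = FALSE))   # NULL for a non-hashed environment
```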

See Also

For the performance implications of hashing or not, see https://en.wikipedia.org/wiki/Hash_table.

The envir argument of eval, get, and exists.

ls may be used to view the objects in an environment, and hence ls.str may be useful for an overview.

sys.source can be used to populate an environment.

Examples

f <- function() "top level function"

##-- all three give the same:
environment()
environment(f)
.GlobalEnv

ls(envir = environment(stats::approxfun(1:2, 1:2, method = "const")))

is.environment(.GlobalEnv) # TRUE

e1 <- new.env(parent = baseenv())  # this one has enclosure package:base.
e2 <- new.env(parent = e1)
assign("a", 3, envir = e1)
ls(e1)
ls(e2)
exists("a", envir = e2)   # this succeeds by inheritance
exists("a", envir = e2, inherits = FALSE)
exists("+", envir = e2)   # this succeeds by inheritance

eh <- new.env(hash = TRUE, size = NA)
with(env.profile(eh), stopifnot(size == length(counts)))

Environment Variables

Description

Details of some of the environment variables which affect an R session.

Details

It is impossible to list all the environment variables which can affect an R session: some affect the OS system functions which R uses, and others will affect add-on packages. But here are notes on some of the more important ones. Those that set the defaults for options are consulted only at startup (as are some of the others).

HOME:

The user's ‘home’ directory.

LANGUAGE:

Optional. The language(s) to be used for message translations. This is consulted when needed.

LC_ALL:

(etc) Optional. Use to set various aspects of the locale – see Sys.getlocale. Consulted at startup.

MAKEINDEX:

The path to makeindex. If unset, defaults to a value determined when R was built. Used by the emulation mode of texi2dvi and texi2pdf.

R_BATCH:

Optional – set in a batch session, that is one started by R CMD BATCH. Most often set to "", so test by something like !is.na(Sys.getenv("R_BATCH", NA)).

R_BROWSER:

The path to the default browser. Used to set the default value of options("browser").

R_COMPLETION:

Optional. If set to FALSE, command-line completion is not used. (Not used by the macOS GUI.)

R_DEFAULT_PACKAGES:

A comma-separated list of packages which are to be attached in every session. See options.

R_DOC_DIR:

The location of the R ‘doc’ directory. Set by R.

R_ENVIRON:

Optional. The path to the site environment file: see Startup. Consulted at startup.

R_GSCMD:

Optional. The path to Ghostscript, used by dev2bitmap, bitmap and embedFonts. Consulted when those functions are invoked. Since it will be treated as if passed to system, spaces and shell metacharacters should be escaped.

R_HISTFILE:

Optional. The path of the history file: see Startup. Consulted at startup and when the history is saved.

R_HISTSIZE:

Optional. The maximum size of the history file, in lines. Exactly how this is used depends on the interface.

On Unix-alikes,

for the readline command-line interface it takes effect when the history is saved (by savehistory or at the end of a session).

On Windows,

for Rgui it controls the number of lines saved to the history file: the size of the history used in the session is controlled by the console customization: see Rconsole.

R_HOME:

The top-level directory of the R installation: see R.home. Set by R.

R_INCLUDE_DIR:

The location of the R ‘include’ directory. Set by R.

R_LIBS:

Optional. Used for initial setting of .libPaths.

R_LIBS_SITE:

Optional. Used for initial setting of .libPaths.

R_LIBS_USER:

Optional. Used for initial setting of .libPaths.

R_PAPERSIZE:

Optional. Used to set the default for options("papersize"), e.g. used by pdf and postscript.

R_PCRE_JIT_STACK_MAXSIZE:

Optional. Consulted when PCRE's JIT pattern compiler is first used. See grep.

R_PDFVIEWER:

The path to the default PDF viewer. Used by R CMD Rd2pdf.

R_PLATFORM:

The platform – a string of the form "cpu-vendor-os", see R.Version.

R_PROFILE:

Optional. The path to the site profile file: see Startup. Consulted at startup.

R_RD4PDF:

Options for pdflatex processing of Rd files. Used by R CMD Rd2pdf.

R_SHARE_DIR:

The location of the R ‘share’ directory. Set by R.

R_TEXI2DVICMD:

The path to texi2dvi. Defaults to the value of TEXI2DVI, and if that is unset to a value determined when R was built.

Only on Unix-alikes:
Consulted at startup to set the default for options("texi2dvi"), used by texi2dvi and texi2pdf in package tools.

R_TIDYCMD:

The path to HTML tidy. Used by R CMD check if _R_CHECK_RD_VALIDATE_RD2HTML_ is set to a true value (as it is by --as-cran).

R_UNZIPCMD:

The path to unzip. Sets the initial value for options("unzip") on a Unix-alike when namespace utils is loaded.

R_ZIPCMD:

The path to zip. Used by zip and by R CMD INSTALL --build on Windows.

TMPDIR, TMP, TEMP:

Consulted (in that order) when setting the temporary directory for the session: see tempdir. TMPDIR is also used by some of the utilities: see the help for build.

TZ:

Optional. The current time zone. See Sys.timezone for the system-specific formats. Consulted as needed.

TZDIR:

Optional. The top-level directory of the time-zone database. See Sys.timezone.

no_proxy, http_proxy, ftp_proxy:

(and more). Optional. Settings for download.file: see its help for further details.

Unix-specific

Some variables set on Unix-alikes, and not (in general) on Windows.

DISPLAY:

Optional: used by X11, Tk (in package tcltk), the data editor and various packages.

EDITOR:

The path to the default editor: sets the default for options("editor") when namespace utils is loaded.

PAGER:

The path to the pager with the default setting of options("pager"). The default value is chosen at configuration, usually as the path to less.

R_PRINTCMD:

Sets the default for options("printcmd"), which sets the default print command to be used by postscript.

R_SUPPORT_OLD_TARS:

logical. Sets the default for the support_old_tars argument of untar. Should be set to TRUE if an old system tar command is used which does not support either xz compression or automagically detecting compression type.

Windows-specific

Some Windows-specific variables are

GSC:

Optional: the path to Ghostscript, used if R_GSCMD is not set.

R_USER:

The user's ‘home’ directory. Set by R. (HOME will be set to the same value if not already set.)

See Also

Sys.getenv and Sys.setenv to read and set environmental variables in an R session.

gctorture for environment variables controlling garbage collection.
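
The variables listed in this topic can be inspected from within a session. The following is a minimal sketch, not a definitive recipe; the variable name MY_DEMO_VAR is hypothetical and used only for illustration.

```r
## MY_DEMO_VAR is a made-up name, chosen only for this illustration.
Sys.setenv(MY_DEMO_VAR = "hello")
demo_value <- Sys.getenv("MY_DEMO_VAR")            # "hello"

## After unsetting, the 'unset' argument controls what is returned.
Sys.unsetenv("MY_DEMO_VAR")
demo_unset <- Sys.getenv("MY_DEMO_VAR", unset = NA)  # NA

## Variables set by R itself are read the same way.
r_home_set <- nzchar(Sys.getenv("R_HOME"))         # TRUE in a running session
```

Changes made with Sys.setenv affect the current session and processes started from it; they do not alter the startup files.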


Evaluate an (Unevaluated) Expression

Description

Evaluate an R expression in a specified environment.

Usage

eval(expr, envir = parent.frame(),
           enclos = if(is.list(envir) || is.pairlist(envir))
                       parent.frame() else baseenv())
evalq(expr, envir, enclos)
eval.parent(expr, n = 1)
local(expr, envir = new.env())

Arguments

expr

an object to be evaluated. See ‘Details’.

envir

the environment in which expr is to be evaluated. May also be NULL, a list, a data frame, a pairlist or an integer as specified to sys.call.

enclos

relevant when envir is a (pair)list or a data frame. Specifies the enclosure, i.e., where R looks for objects not found in envir. This can be NULL (interpreted as the base package environment, baseenv()) or an environment.

n

number of parent generations to go back.

Details

eval evaluates the expr argument in the environment specified by envir and returns the computed value. If envir is not specified, then the default is parent.frame() (the environment where the call to eval was made).

Objects to be evaluated can be of types call or expression or name (when the name is looked up in the current scope and its binding is evaluated), a promise or any of the basic types such as vectors, functions and environments (which are returned unchanged).

The evalq form is equivalent to eval(quote(expr), ...). eval evaluates its first argument in the current scope before passing it to the evaluator: evalq avoids this.

eval.parent(expr, n) is a shorthand for eval(expr, parent.frame(n)).

If envir is a list (such as a data frame) or pairlist, it is copied into a temporary environment (with enclosure enclos), and the temporary environment is used for evaluation. So if expr changes any of the components named in the (pair)list, the changes are lost.

If envir is NULL it is interpreted as an empty list so no values could be found in envir and look-up goes directly to enclos.

local evaluates an expression in a local environment. It is equivalent to evalq except that its default argument creates a new, empty environment. This is useful to create anonymous recursive functions and as a kind of limited namespace feature since variables defined in the environment are not visible from the outside.
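
The points above about list copies, the difference between eval and evalq, and local can be sketched as follows. All object names (x, d, e, counter) are illustrative only.

```r
x <- 10                                  # visible in the calling environment
d <- list(a = 1, b = 2)

## Names are looked up in the list first, then in the enclosure.
r1 <- eval(quote(a + b + x), d)          # 1 + 2 + 10

## An assignment acts on the temporary copy of the list only.
eval(quote(a <- 100), d)
a_after <- d$a                           # still 1

## eval() evaluates its first argument before use; evalq() quotes it.
e <- new.env()
assign("x", 1, envir = e)
r2 <- eval(quote(x), e)                  # the symbol x is looked up in e
r3 <- eval(x, e)                         # x is evaluated to 10 before eval() sees it
r4 <- evalq(x, e)                        # same as eval(quote(x), e)

## local() keeps 'n' private to the function it returns.
counter <- local({
  n <- 0
  function() { n <<- n + 1; n }
})
c1 <- counter()
c2 <- counter()
```

Here r1 is 13, r2 and r4 are 1, r3 is 10, and the two counter calls give 1 and then 2.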

Value

The result of evaluating the object: for an expression vector this is the result of evaluating the last element.

Note

Due to the difference in scoping rules, there are some differences between R and S in this area. In particular, the default enclosure in S is the global environment.

When evaluating expressions in a data frame that has been passed as an argument to a function, the relevant enclosure is often the caller's environment, i.e., one needs eval(x, data, parent.frame()).
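
A minimal sketch of the eval(x, data, parent.frame()) idiom follows. The helper eval_in and the function caller are hypothetical names introduced for this illustration; they are not part of base R.

```r
## 'eval_in' is a hypothetical helper, not a base R function.
eval_in <- function(x, data) {
  ## x is an unevaluated call; names not found in 'data' are looked up
  ## in the environment from which eval_in() was called.
  eval(x, data, parent.frame())
}

caller <- function() {
  scale <- 2                             # exists only inside caller()
  df <- data.frame(v = 1:3)
  eval_in(quote(v * scale), df)
}

res <- caller()                          # 2 4 6
```

Without the explicit enclosure, the default would be the frame of eval_in itself, where scale is not defined.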

References

Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988) The New S Language. Wadsworth & Brooks/Cole. (eval only.)

See Also

expression, quote, sys.frame, parent.frame, environment.

Further, force to force evaluation, typically of function arguments.

Examples

eval(2 ^ 2 ^ 3)
mEx <- expression(2^2^3); mEx; 1 + eval(mEx)
eval({ xx <- pi; xx^2}) ; xx

a <- 3 ; aa <- 4 ; evalq(evalq(a+b+aa, list(a = 1)), list(b = 5)) # == 10
a <- 3 ; aa <- 4 ; evalq(evalq(a+b+aa, -1), list(b = 5))          # == 12

ev <- function() {
   e1 <- parent.frame()
   ## Evaluate a in e1
   aa <- eval(expression(a), e1)
   ## evaluate the expression bound to a in e1
   a <- expression(x+y)
   list(aa = aa, eval = eval(a, e1))
}
tst.ev <- function(a = 7) { x <- pi; y <- 1; ev() }
tst.ev()  #-> aa : 7,  eval : 4.14

a <- list(a = 3, b = 4)
with(a, a <- 5) # alters the copy of a from the list, discarded.

##
## Example of evalq()
##

N <- 3
env <- new.env()
assign("N", 27, envir = env)
## this version changes the visible copy of N only, since the argument
## passed to eval is '4'.
eval(N <- 4, env)
N
get("N", envir = env)
## this version does the assignment in env, and changes N only there.
evalq(N <- 5, env)
N
get("N", envir = env)

##
## Uses of local()
##

# Mutually recursive.
# gg gets value of last assignment, an anonymous version of f.

gg <- local({
    k <- function(y) f(y)
    f <- function(x) if(x) x*k(x-1) else 1
})
gg(10)
sapply(1:5, gg)

# Nesting locals: a is private storage accessible to k
gg <- local({
    k <- local({
        a <- 1
        function(y) { print(a <<- a+1); f(y) }
    })
    f <- function(x) if(x) x*k(x-1) else 1
})
sapply(1:5, gg)

ls(envir = environment(gg))
ls(envir = environment(get("k", envir = environment(gg))))

Is an Object Defined?

Description

Look for an R object of the given name and possibly return it.

Usage

exists(x, where = -1, envir = , frame, mode = "any",
       inherits = TRUE)

get0(x, envir = pos.to.env(-1L), mode = "any", inherits = TRUE,
     ifnotfound = NULL)

Arguments

x

a variable name (given as a character string or a symbol).

where

where to look for the object (see the details section); if omitted, the function will search as if the name of the object appeared unquoted in an expression.

envir

an alternative way to specify an environment to look in, but it is usually simpler to just use the where argument.

frame

a frame in the calling list. Equivalent to giving where as sys.frame(frame).

mode

the mode or type of object sought: see the ‘Details’ section.

inherits

should the enclosing frames of the environment be searched?

ifnotfound

the return value of get0(x, *) when x does not exist.

Details

The where argument can specify the environment in which to look for the object in any of several ways: as an integer (the position in the search list); as the character string name of an element in the search list; or as an environment (including using sys.frame to access the currently active function calls). The envir argument is an alternative way to specify an environment, but is primarily there for back compatibility.

This function looks to see if the name x has a value bound to it in the specified environment. If inherits is TRUE and a value is not found for x in the specified environment, the enclosing frames of the environment are searched until the name x is encountered. See environment and the ‘R Language Definition’ manual for details about the structure of environments and their enclosures.

Warning: inherits = TRUE is the default behaviour for R but not for S.

If mode is specified then only objects of that type are sought. The mode may specify one of the collections "numeric" and "function" (see mode): any member of the collection will suffice. (This is true even if a member of a collection is specified, so for example mode = "special" will seek any type of function.)

Value

exists(): Logical, true if and only if an object of the correct name and mode is found.

get0(): The object (as from get(x, *)) if exists(x, *) is true, otherwise ifnotfound.

Note

With get0(), instead of the easy to read but somewhat inefficient

    if (exists(myVarName, envir = myEnvir)) {
      r <- get(myVarName, envir = myEnvir)
      ## ... deal with r ...
    }

you now can use the more efficient (and slightly harder to read)

    if (!is.null(r <- get0(myVarName, envir = myEnvir))) {
      ## ... deal with r ...
    }
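
One caveat of the is.null(get0(...)) idiom is that a binding whose value is NULL looks the same as a missing one. A small sketch, using illustrative names (demo_present, demo_empty, demo_absent) and inherits = FALSE so that only the environment e is consulted:

```r
e <- new.env()
assign("demo_present", 42, envir = e)
assign("demo_empty", NULL, envir = e)

found   <- get0("demo_present", envir = e, inherits = FALSE)   # 42
missing <- get0("demo_absent", envir = e, inherits = FALSE)    # NULL
custom  <- get0("demo_absent", envir = e, inherits = FALSE,
                ifnotfound = "none")                           # "none"

## A NULL-valued binding is indistinguishable via is.null() ...
null_value <- is.null(get0("demo_empty", envir = e, inherits = FALSE))
## ... but exists() still reports that the name is bound.
is_bound <- exists("demo_empty", envir = e, inherits = FALSE)

## 'mode' restricts the search to objects of that kind.
as_function <- exists("demo_present", envir = e, mode = "function",
                      inherits = FALSE)                        # FALSE
```

When NULL is a legitimate value, a distinct ifnotfound sentinel or a call to exists() avoids the ambiguity.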

References

Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988) The New S Language. Wadsworth & Brooks/Cole.

See Also

get and hasName. For quite a different kind of “existence” checking, namely if function arguments were specified, missing; and for yet a different kind, namely if a file exists, file.exists.

Examples

##  Define a substitute function if necessary:
if(!exists("some.fun", mode = "function"))
  some.fun <- function(x) { cat("some.fun(x)\n"); x }
search()
exists("ls", 2) # true even though ls is in pos = 3
exists("ls", 2, inherits = FALSE) # false

## These are true (in most circumstances):
identical(ls,   get0("ls"))
identical(NULL, get0(".foo.bar.")) # default ifnotfound = NULL (!)

Create a Data Frame from All Combinations of Factor Variables

Description

Create a data frame from all combinations of the supplied vectors or factors. See the description of the return value for precise details of the way this is done.

Usage

expand.grid(..., KEEP.OUT.ATTRS = TRUE, stringsAsFactors = TRUE)

Arguments

...

vectors, factors or a list containing these.

KEEP.OUT.ATTRS

a logical indicating the "out.attrs" attribute (see below) should be computed and returned.

stringsAsFactors

logical specifying if character vectors are converted to factors.

Value

A data frame containing one row for each combination of the supplied factors. The first factors vary fastest. The columns are labelled by the factors if these are supplied as named arguments or named components of a list. The row names are ‘automatic’.

Attribute "out.attrs" is a list which gives the dimension and dimnames for use by predict methods.

Note

Conversion to a factor is done with levels in the order they occur in the character vectors (and not alphabetically, as is most common when converting to factors).
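
The ordering rules described in the Value and Note sections can be checked with a small sketch; the column names size and n are illustrative only.

```r
g <- expand.grid(size = c("small", "large"), n = 1:3)

n_rows     <- nrow(g)                    # 2 * 3 combinations
size_cycle <- as.character(g$size)       # the first variable varies fastest
size_lev   <- levels(g$size)             # order of occurrence: "small" "large"
usual_lev  <- levels(factor(c("small", "large")))  # sorted: "large" "small"

attr_names <- names(attr(g, "out.attrs"))          # "dim" "dimnames"

## Character vectors can be kept as they are.
g2 <- expand.grid(size = c("small", "large"), n = 1:3,
                  stringsAsFactors = FALSE)
kept_character <- is.character(g2$size)
```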

References

Chambers, J. M. and Hastie, T. J. (1992) Statistical Models in S. Wadsworth & Brooks/Cole.

See Also

combn (package utils) for the generation of all combinations of n elements, taken m at a time.

Examples

require(utils)

expand.grid(height = seq(60, 80, 5), weight = seq(100, 300, 50),
            sex = c("Male","Female"))

x <- seq(0, 10, length.out = 100)
y <- seq(-1, 1, length.out = 20)
d1 <- expand.grid(x = x, y = y)
d2 <- expand.grid(x = x, y = y, KEEP.OUT.ATTRS = FALSE)
object.size(d1) - object.size(d2)
##-> 5992 or 8832 (on 32- / 64-bit platform)

Unevaluated Expressions

Description

Creates or tests for objects of mode and class"expression".

Usage

expression(...)

is.expression(x)
as.expression(x, ...)

Arguments

...

expression: R objects, typically calls, symbols or constants.
as.expression: arguments to be passed to methods.

x

an arbitrary R object.

Details

‘Expression’ here is not being used in its colloquial sense, that of mathematical expressions. Those are calls (see call) in R, and an R expression vector is a list of calls, symbols etc, for example as returned by parse.

As an object of mode "expression" is a list, it can be subsetted by [, [[ or $, the latter two extracting individual calls etc. The replacement forms of these operators can be used to replace or delete elements.
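
As a rough illustration of subsetting and replacement on an expression vector (the object name ex is arbitrary):

```r
ex <- expression(a + b, sin(pi), z)

n_before <- length(ex)                   # 3
cls_sub  <- class(ex[1])                 # "expression": [ keeps the vector type
cls_call <- class(ex[[1]])               # "call": [[ extracts the element itself
cls_name <- class(ex[[3]])               # "name"

ex[[2]] <- quote(cos(0))                 # replace the second element
ex[[3]] <- NULL                          # delete the third element
n_after <- length(ex)                    # 2
value   <- eval(ex[[2]])                 # cos(0)
```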

expression and is.expression are primitive functions. expression is ‘special’: it does not evaluate its arguments.

Value

expression returns a vector of type "expression" containing its arguments (unevaluated).

is.expression returns TRUE if x is an expression object and FALSE otherwise.

as.expression attempts to coerce its argument into an expression object. It is generic, and only the default method is described here. (The default method calls as.vector(type = "expression") and so may dispatch methods for as.vector.) NULL, calls, symbols (see as.symbol) and pairlists are returned as the element of a length-one expression vector. Atomic vectors are placed element-by-element into an expression vector (without using any names): lists have their type (typeof) changed to an expression vector (keeping all attributes). Other types are not currently supported.

References

Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988) The New S Language. Wadsworth & Brooks/Cole.

See Also

call, eval, function. Further, text, legend, and plotmath for plotting mathematical expressions.

Examples

length(ex1 <- expression(1 + 0:9)) # 1
ex1
eval(ex1) # 1:10

length(ex3 <- expression(u, 2, u + 0:9)) # 3
mode(ex3[3])    # expression
mode(ex3[[3]])  # call
## but not all components are 'call's :
sapply(ex3, mode)   #  name  numeric  call
sapply(ex3, typeof) # symbol  double  language
rm(ex3)

Extract or Replace Parts of an Object

Description

Operators acting on vectors, matrices, arrays and lists to extract or replace parts.

Usage

x[i]
x[i, j, ..., drop = TRUE]
x[[i, exact = TRUE]]
x[[i, j, ..., exact = TRUE]]
x$name
getElement(object, name)

x[i] <- value
x[i, j, ...] <- value
x[[i]] <- value
x$name <- value

Arguments

x, object

object from which to extract element(s) or in which to replace element(s).

i, j, ...

indices specifying elements to extract or replace. Indices are numeric or character vectors or empty (missing) or NULL. Numeric values are coerced to integer or whole numbers as by as.integer or for large values by trunc (and hence truncated towards zero). Character vectors will be matched to the names of the object (or for matrices/arrays, the dimnames): see ‘Character indices’ below for further details.

For [-indexing only: i, j, ... can be logical vectors, indicating elements/slices to select. Such vectors are recycled if necessary to match the corresponding extent. i, j, ... can also be negative integers, indicating elements/slices to leave out of the selection.

When indexing arrays by [ a single argument i can be a matrix with as many columns as there are dimensions of x; the result is then a vector with elements corresponding to the sets of indices in each row of i.

An index value of NULL is treated as if it were integer(0).

name

a literal character string or a name (possibly backtick quoted). For extraction, this is normally (see under ‘Environments’) partially matched to the names of the object.

drop

relevant for matrices and arrays. If TRUE the result is coerced to the lowest possible dimension (see the examples). This only works for extracting elements, not for the replacement. See drop for further details.

exact

controls possible partial matching of [[ when extracting by a character vector (for most objects, but see under ‘Environments’). The default is no partial matching. Value NA allows partial matching but issues a warning when it occurs. Value FALSE allows partial matching without any warning.

value

typically an array-like R object of a similar class as x.

Details

These operators are generic. You can write methods to handle indexing of specific classes of objects, see InternalMethods as well as [.data.frame and [.factor. The descriptions here apply only to the default methods. Note that separate methods are required for the replacement functions [<-, [[<- and $<- for use when indexing occurs on the assignment side of an expression.

The most important distinction between [, [[ and $ is that the [ can select more than one element whereas the other two select a single element.

Note that x[[]] is always erroneous.

The default methods work somewhat differently for atomic vectors, matrices/arrays and for recursive (list-like, see is.recursive) objects. $ is only valid for recursive objects (and NULL), and is only discussed in the section below on recursive objects.

Subsetting (except by an empty index) will drop all attributes except names, dim and dimnames.

Indexing can occur on the right-hand-side of an expression for extraction, or on the left-hand-side for replacement. When an index expression appears on the left side of an assignment (known as subassignment) then that part of x is set to the value of the right hand side of the assignment. In this case no partial matching of character indices is done, and the left-hand-side is coerced as needed to accept the values. For vectors, the answer will be of the higher of the types of x and value in the hierarchy raw < logical < integer < double < complex < character < list < expression. Attributes are preserved (although names, dim and dimnames will be adjusted suitably). Subassignment is done sequentially, so if an index is specified more than once the latest assigned value for an index will result.
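
The coercion hierarchy, the sequential rule for repeated indices and the treatment of attributes can be sketched as follows (object names are arbitrary; the "unit" attribute is made up for the illustration):

```r
x <- 1:3
t0 <- typeof(x)                          # "integer"
x[2] <- 2.5
t1 <- typeof(x)                          # "double"
x[3] <- "c"
t2 <- typeof(x)                          # "character"

## Repeated index: assignments happen in order, so the last one wins.
y <- c(0, 0, 0)
y[c(1, 1)] <- c(10, 20)

## Attributes other than names/dim/dimnames survive replacement ...
z <- structure(1:3, unit = "kg")
z[1] <- 9L
unit_kept <- attr(z, "unit")
## ... but are dropped by extraction with a non-empty index.
attrs_after_subset <- attributes(z[1:2])
```

After these steps x is c("1", "2.5", "c"), y is c(20, 0, 0), unit_kept is "kg" and attrs_after_subset is NULL.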

It is an error to apply any of these operators to an object which is not subsettable (e.g., a function).

Atomic vectors

The usual form of indexing is [. [[ can be used to select a single element dropping names, whereas [ keeps them, e.g., in c(abc = 123)[1].

The index object i can be numeric, logical, character or empty. Indexing by factors is allowed and is equivalent to indexing by the numeric codes (see factor) and not by the character values which are printed (for which use [as.character(i)]).

An empty index selects all values: this is most often used to replace all the entries but keep the attributes.
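
A short sketch of the three points above: indexing by a factor uses its integer codes, [[ drops names, and an empty index keeps attributes on replacement. The vector v and factor f are illustrative.

```r
v <- c(a = 10, b = 20, c = 30)
f <- factor(c("c", "a"))                 # levels "a" "c", so the codes are 2 1

by_code  <- v[f]                         # same as v[c(2, 1)]
by_label <- v[as.character(f)]           # elements named "c" and "a"

single_named   <- v[2]                   # keeps the name "b"
single_unnamed <- v[[2]]                 # name dropped

v[] <- 0                                 # empty index: all entries, names kept
```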

Matrices and arrays

Matrices and arrays are vectors with a dimension attribute and so all the vector forms of indexing can be used with a single index. The result will be an unnamed vector unless x is one-dimensional when it will be a one-dimensional array.

The most common form of indexing a k-dimensional array is to specify k indices to [. As for vector indexing, the indices can be numeric, logical, character, empty or even factor. And again, indexing by factors is equivalent to indexing by the numeric codes, see ‘Atomic vectors’ above.

An empty index (a comma separated blank) indicates that all entries in that dimension are selected. The argument drop applies to this form of indexing.

A third form of indexing is via a numeric matrix with one column for each dimension: each row of the index matrix then selects a single element of the array, and the result is a vector. Negative indices are not allowed in the index matrix. NA and zero values are allowed: rows of an index matrix containing a zero are ignored, whereas rows containing an NA produce an NA in the result.

Indexing via a character matrix with one column per dimension is also supported if the array has dimension names. As with numeric matrix indexing, each row of the index matrix selects a single element of the array. Indices are matched against the appropriate dimension names. NA is allowed and will produce an NA in the result. Unmatched indices as well as the empty string ("") are not allowed and will result in an error.

A vector obtained by matrix indexing will be unnamed unless x is one-dimensional when the row names (if any) will be indexed to provide names for the result.
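
The forms of matrix indexing described in this section can be compared side by side in a small sketch (the matrix m and its dimnames are illustrative):

```r
m <- matrix(1:6, nrow = 2, dimnames = list(c("a", "b"), c("A", "B", "C")))

as_vector <- m[5]                        # single index, column-major order
row_b     <- m[2, ]                      # named vector: dimension dropped
row_b_mat <- m[2, , drop = FALSE]        # still a 1 x 3 matrix

## Numeric index matrix: one row per element to select.
idx <- rbind(c(1, 1), c(2, 3), c(0, 2), c(NA, 1))
by_num <- m[idx]                         # zero row ignored, NA row gives NA

## Character index matrix: matched against the dimnames.
by_chr <- m[cbind(c("a", "b"), c("C", "A"))]
```

Here by_num is c(1, 6, NA) and by_chr is c(5, 2), both as integer vectors.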

Recursive (list-like) objects

Indexing by [ is similar to atomic vectors and selects a list of the specified element(s).

Both [[ and $ select a single element of the list. The main difference is that $ does not allow computed indices, whereas [[ does. x$name is equivalent to x[["name", exact = FALSE]]. Also, the partial matching behavior of [[ can be controlled using the exact argument.

getElement(x, name) is a version of x[[name, exact = TRUE]] which for formally classed (S4) objects returns slot(x, name), hence providing access to even more general list-like objects.

[ and [[ are sometimes applied to other recursive objects such as calls and expressions. Pairlists (such as calls) are coerced to lists for extraction by [, but all three operators can be used for replacement.

[[ can be applied recursively to lists, so that if the single index i is a vector of length p, alist[[i]] is equivalent to alist[[i1]]...[[ip]] providing all but the final indexing results in a list.

Note that in all three kinds of replacement, a value of NULL deletes the corresponding item of the list. To set entries to NULL, you need x[i] <- list(NULL).
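
A minimal sketch of deletion versus assignment of NULL, partial matching with $ and [[, and recursive [[ indexing; all list names are illustrative.

```r
l <- list(a = 1, b = "two", c = 3)
l["b"] <- list(NULL)                     # keeps the entry, value becomes NULL
len_kept <- length(l)                    # 3
l$c <- NULL                              # deletes the entry
l[["a"]] <- NULL                         # deletes the entry
names_left <- names(l)                   # "b"

## $ matches partially; [[ is exact unless told otherwise.
p <- list(value = 1)
dollar_partial <- p$val                  # 1
bracket_exact  <- p[["val"]]             # NULL
bracket_loose  <- p[["val", exact = FALSE]]  # 1

## A vector index to [[ descends one level per element.
nested <- list(a = list(b = list(c = "deep")))
deep <- nested[[c("a", "b", "c")]]       # same as nested[["a"]][["b"]][["c"]]
```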

When $<- is applied to a NULL x, it first coerces x to list(). This is what also happens with [[<- where in R versions less than 4.y.z, a length one value resulted in a length one (atomic) vector.

Environments

Both $ and [[ can be applied to environments. Only character indices are allowed and no partial matching is done. The semantics of these operations are those of get(i, env = x, inherits = FALSE). If no match is found then NULL is returned. The replacement versions, $<- and [[<-, can also be used. Again, only character arguments are allowed. The semantics in this case are those of assign(i, value, env = x, inherits = FALSE). Such an assignment will either create a new binding or change the existing binding in x.
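
For environments the behaviour can be sketched as below; the names alpha, beta and gamma are arbitrary.

```r
e <- new.env()
e$alpha <- 1                             # creates a binding
e[["beta"]] <- 2                         # likewise
e$alpha <- 10                            # changes the existing binding

bound      <- sort(ls(e))                # "alpha" "beta"
no_partial <- e$al                       # NULL: no partial matching
no_binding <- e[["gamma"]]               # NULL: nothing bound to that name
```

Because environments are not copied on assignment, these replacements modify e itself rather than a copy.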

NAs in indexing

When extracting, a numerical, logical or character NA index picks an unknown element and so returns NA in the corresponding element of a logical, integer, numeric, complex or character result, and NULL for a list. (It returns 00 for a raw result.)

When replacing (that is using indexing on the lhs of an assignment) NA does not select any element to be replaced. As there is ambiguity as to whether an element of the rhs should be used or not, this is only allowed if the rhs value is of length one (so the two interpretations would have the same outcome). (The documented behaviour of S was that an NA replacement index ‘goes nowhere’ but uses up an element of value: Becker et al. p. 359. However, that has not been true of other implementations.)
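
The rules for NA indices can be illustrated with a short sketch (vectors x and y are arbitrary):

```r
x <- c(10, 20, 30)

one_na   <- x[c(1, NA)]                  # 10 NA
recycled <- x[NA]                        # logical NA is recycled: NA NA NA
from_list <- list(1, 2)[NA_integer_]     # a list holding NULL
from_raw  <- as.raw(1:3)[NA_integer_]    # 00

## Replacement with an NA index is allowed for a length-one value ...
y <- x
y[c(1, NA)] <- 0                         # only element 1 is replaced
## ... and is an error otherwise.
res <- tryCatch({ y[c(1, NA)] <- c(1, 2); "ok" },
                error = function(e) "error")
```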

Argument matching

Note that these operations do not match their index arguments in the standard way: argument names are ignored and positional matching only is used. So m[j = 2, i = 1] is equivalent to m[2, 1] and not to m[1, 2].

This may not be true for methods defined for them; for example it is not true for the data.frame methods described in [.data.frame which warn if i or j is named and have undocumented behaviour in that case.

To avoid confusion, do not name index arguments (but drop and exact must be named).
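
A two-line check of the positional rule on a plain matrix (m is an arbitrary example):

```r
m <- matrix(1:6, nrow = 2)

named_args <- m[j = 2, i = 1]            # names ignored: this is m[2, 1]
positional <- m[2, 1]
swapped    <- m[1, 2]                    # a different element

one_row <- m[1, , drop = FALSE]          # drop has to be given by name
```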

S4 methods

These operators are also implicit S4 generics, but as primitives, S4 methods will be dispatched only on S4 objects x.

The implicit generics for the $ and $<- operators do not have name in their signature because the grammar only allows symbols or string constants for the name argument.

Character indices

Character indices can in some circumstances be partially matched (see pmatch) to the names or dimnames of the object being subsetted (but never for subassignment). Unlike S (Becker et al. p. 358), R never uses partial matching when extracting by [, and partial matching is not by default used by [[ (see argument exact).

Thus the default behaviour is to use partial matching only when extracting from recursive objects (except environments) by $. Even in that case, warnings can be switched on by options(warnPartialMatchDollar = TRUE).

Neither empty ("") nor NA indices match any names, not even empty nor missing names. If any object has no names or appropriate dimnames, they are taken as all "" and so match nothing.

Error conditions

Attempting to apply a subsetting operation to objects for which this is not possible signals an error of class notSubsettableError. The object component of the error condition contains the non-subsettable object.

Subscript out of bounds errors are signaled as errors of class subscriptOutOfBoundsError. The object component of the error condition contains the object being subsetted. The integer subscript component is zero for vector subscripting, and for multiple subscripts indicates which subscript was out of bounds. The index component contains the erroneous index.
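
One way to look at these conditions is to catch the error and inspect its class vector. This is only a sketch: it uses a generic error handler, so it also runs on R versions that do not signal the more specific classes named above.

```r
## Subsetting a function is not possible.
not_subsettable <- tryCatch(sum[1], error = function(e) class(e))

## Asking for the third element of a length-one list is out of bounds.
out_of_bounds <- tryCatch(list(1)[[3]], error = function(e) class(e))

## On the R version documented here, the class vectors are expected to
## include "notSubsettableError" and "subscriptOutOfBoundsError" respectively.
not_subsettable
out_of_bounds
```

A handler can also be registered for the specific class, for example tryCatch(expr, subscriptOutOfBoundsError = function(e) ...), when only that kind of failure should be intercepted.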

References

Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988) The New S Language. Wadsworth & Brooks/Cole.

See Also

names for details of matching to names, and pmatch for partial matching.

list, array, matrix.

[.data.frame and [.factor for the behaviour when applied to data.frame and factors.

Syntax for operator precedence, and the ‘R Language Definition’ manual about indexing details.

NULL for details of indexing null objects.

Examples

x <- 1:12
m <- matrix(1:6, nrow = 2, dimnames = list(c("a", "b"), LETTERS[1:3]))
li <- list(pi = pi, e = exp(1))
x[10]                  # the tenth element of x
x <- x[-1]             # delete the 1st element of x
m[1,]                  # the first row of matrix m
m[1, , drop = FALSE]   # is a 1-row matrix
m[,c(TRUE,FALSE,TRUE)] # logical indexing
m[cbind(c(1,2,1),3:1)] # matrix numeric index
ci <- cbind(c("a", "b", "a"), c("A", "C", "B"))
m[ci]                  # matrix character index
m <- m[,-1]            # delete the first column of m
li[[1]]                # the first element of list li
y <- list(1, 2, a = 4, 5)
y[c(3, 4)]             # a list containing elements 3 and 4 of y
y$a                    # the element of y named a

## non-integer indices are truncated:
(i <- 3.999999999) # "4" is printed
(1:5)[i]  # 3

## named atomic vectors, compare "[" and "[[" :
nx <- c(Abc = 123, pi = pi)
nx[1] ; nx["pi"] # keeps names, whereas "[[" does not:
nx[[1]] ; nx[["pi"]]

## recursive indexing into lists
z <- list(a = list(b = 9, c = "hello"), d = 1:5)
unlist(z)
z[[c(1, 2)]]
z[[c(1, 2, 1)]]  # both "hello"
z[[c("a", "b")]] <- "new"
unlist(z)

## check $ and [[ for environments
e1 <- new.env()
e1$a <- 10
e1[["a"]]
e1[["b"]] <- 20
e1$b
ls(e1)

## partial matching - possibly with warning :
stopifnot(identical(li$p, pi))
op <- options(warnPartialMatchDollar = TRUE)
stopifnot( identical(li$p, pi), #-- a warning
  inherits(tryCatch(li$p, warning = identity), "warning"))
## revert the warning option:
options(op)

Extract or Replace Parts of a Data Frame

Description

Extract or replace subsets of data frames.

Usage

## S3 method for class 'data.frame'
x[i, j, drop = ]
## S3 replacement method for class 'data.frame'
x[i, j] <- value
## S3 method for class 'data.frame'
x[[..., exact = TRUE]]
## S3 replacement method for class 'data.frame'
x[[i, j]] <- value
## S3 replacement method for class 'data.frame'
x$name <- value

Arguments

x

data frame.

i, j, ...

elements to extract or replace. For [ and [[, these are numeric or character or, for [ only, empty or logical. Numeric values are coerced to integer as if by as.integer. For replacement by [, a logical matrix is allowed.

name

a literal character string or a name (possibly backtick quoted).

drop

logical. If TRUE the result is coerced to the lowest possible dimension. The default is to drop if only one column is left, but not to drop if only one row is left.

value

a suitable replacement value: it will be repeated a whole number of times if necessary and it may be coerced: see the Coercion section. If NULL, deletes the column if a single column is selected.

exact

logical: see [, and applies to column names.

Details

Data frames can be indexed in several modes. When [ and [[ are used with a single vector index (x[i] or x[[i]]), they index the data frame as if it were a list. In this usage a drop argument is ignored, with a warning.

There is no data.frame method for $, so x$name uses the default method which treats x as a list (with partial matching of column names if the match is unique, see Extract). The replacement method (for $) checks value for the correct number of rows, and replicates it if necessary.

When [ and [[ are used with two indices (x[i, j] and x[[i, j]]) they act like indexing a matrix: [[ can only be used to select one element. Note that for each selected column, xj say, typically (if it is not matrix-like), the resulting column will be xj[i], and hence rely on the corresponding [ method, see the examples section.

If [ returns a data frame it will have unique (and non-missing) row names, if necessary transforming the row names using make.unique. Similarly, if columns are selected column names will be transformed to be unique if necessary (e.g., if columns are selected more than once, or if more than one column of a given name is selected if the data frame has duplicate column names).

When drop = TRUE, this is applied to the subsetting of any matrices contained in the data frame as well as to the data frame itself.

The replacement methods can be used to add whole column(s) by specifying non-existent column(s), in which case the column(s) are added at the right-hand edge of the data frame and numerical indices must be contiguous to existing indices. On the other hand, rows can be added at any row after the current last row, and the columns will be in-filled with missing values. Missing values in the indices are not allowed for replacement.

For [ the replacement value can be a list: each element of the list is used to replace (part of) one column, recycling the list as necessary. If columns specified by number are created, the names (if any) of the corresponding list elements are used to name the columns. If the replacement is not selecting rows, list values can contain NULL elements which will cause the corresponding columns to be deleted. (See the Examples.)

Matrix indexing (x[i] with a logical or a 2-column integer matrix i) using [ is not recommended. For extraction, x is first coerced to a matrix. For replacement, logical matrix indices must be of the same dimension as x. Replacements are done one column at a time, with multiple type coercions possibly taking place.

Both [ and [[ extraction methods partially match row names. By default neither partially match column names, but [[ will if exact = FALSE (and with a warning if exact = NA). If you want exact matching on row names use match, as in the examples.
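
The difference between partial matching of row names and of column names can be seen in a small sketch; the data frames df and df2 and their names are made up for the illustration.

```r
df <- data.frame(x = 1:3, y = c("a", "b", "c"),
                 row.names = c("apple", "banana", "cherry"))

partial_row <- df["app", ]                        # partially matches "apple"
exact_row   <- df[match("app", row.names(df)), ]  # no exact match: a row of NAs

df2 <- data.frame(value = 1:2)
col_exact   <- df2[["val"]]                       # NULL: [[ is exact by default
col_partial <- df2[["val", exact = FALSE]]        # the 'value' column
```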

Value

For [ a data frame, list or a single column (the latter two only when dimensions have been dropped). If matrix indexing is used for extraction a vector results. If the result would be a data frame an error results if undefined columns are selected (as there is no general concept of a 'missing' column in a data frame). Otherwise if a single column is selected and this is undefined the result is NULL.

For [[ a column of the data frame or NULL (extraction with one index) or a length-one vector (extraction with two indices).

For $, a column of the data frame (or NULL).

For [<-, [[<- and $<-, a data frame.

Coercion

The story over when replacement values are coerced is a complicated one, and one that has changed during R's development. This section is a guide only.

When [ and [[ are used to add or replace a whole column, no coercion takes place but value will be replicated (by calling the generic function rep) to the right length if an exact number of repeats can be used.

When [ is used with a logical matrix, each value is coerced to the type of the column into which it is to be placed.

When [ and [[ are used with two indices, the column will be coerced as necessary to accommodate the value.

Note that when the replacement value is an array (including a matrix) it is not treated as a series of columns (as data.frame and as.data.frame do) but inserted as a single column.

Warning

The default behaviour when only one row is left is equivalent to specifying drop = FALSE. To drop from a data frame to a list, drop = TRUE has to be specified explicitly.
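
The asymmetry between rows and columns can be checked directly. A brief sketch with an illustrative two-column data frame:

```r
df <- data.frame(a = 1:3, b = letters[1:3])

one_row         <- class(df[1, ])               # "data.frame": not dropped
one_row_dropped <- class(df[1, , drop = TRUE])  # "list"
one_col         <- class(df[, 1])               # "integer": dropped by default
one_col_kept    <- class(df[, 1, drop = FALSE]) # "data.frame"
list_style      <- class(df[1])                 # "data.frame": never drops
```

Code that must always receive a data frame from df[i, j] should therefore pass drop = FALSE explicitly.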

Arguments other than drop and exact should not be named: there is a warning if they are and the behaviour differs from the description here.

See Also

subset which is often easier for extraction, data.frame, Extract.

Examples

sw <- swiss[1:5, 1:4]  # select a manageable subset

sw[1:3]      # select columns
sw[, 1:3]    # same
sw[4:5, 1:3] # select rows and columns
sw[1]        # a one-column data frame
sw[, 1, drop = FALSE]  # the same
sw[, 1]      # a (unnamed) vector
sw[[1]]      # the same
sw$Fert      # the same (possibly w/ warning, see ?Extract)

sw[1,]       # a one-row data frame
sw[1,, drop = TRUE]  # a list

sw["C", ] # partially matches
sw[match("C", row.names(sw)), ] # no exact match
try(sw[, "Ferti"]) # column names must match exactly

sw[sw$Fertility > 90,] # logical indexing, see also ?subset
sw[c(1, 1:2), ]        # duplicate row, unique row names are created

sw[sw <= 6] <- 6  # logical matrix indexing
sw

## adding a column
sw["new1"] <- LETTERS[1:5]   # adds a character column
sw[["new2"]] <- letters[1:5] # ditto
sw[, "new3"] <- LETTERS[1:5] # ditto
sw$new4 <- 1:5
sapply(sw, class)
sw$new  # -> NULL: no unique partial match
sw$new4 <- NULL              # delete the column
sw
sw[6:8] <- list(letters[10:14], NULL, aa = 1:5)
# update col. 6, delete 7, append
sw

## matrices in a data frame
A <- data.frame(x = 1:3, y = I(matrix(4:9, 3, 2)),
                         z = I(matrix(letters[1:9], 3, 3)))
A[1:3, "y"] # a matrix
A[1:3, "z"] # a matrix
A[, "y"]    # a matrix
stopifnot(identical(colnames(A), c("x", "y", "z")), ncol(A) == 3L,
          identical(A[,"y"], A[1:3,"y"]),
          inherits(A[,"y"], "AsIs"))

## keeping special attributes: use a class with a
## "as.data.frame" and "[" method;
## "avector" := vector that keeps attributes.   Could provide a constructor
##  avector <- function(x) { class(x) <- c("avector", class(x)); x }
as.data.frame.avector <- as.data.frame.vector

`[.avector` <- function(x, i, ...) {
  r <- NextMethod("[")
  mostattributes(r) <- attributes(x)
  r
}

d <- data.frame(i = 0:7, f = gl(2,4),
                u = structure(11:18, unit = "kg", class = "avector"))
str(d[2:4, -1]) # 'u' keeps its "unit"

Extract or Replace Parts of a Factor

Description

Extract or replace subsets of factors.

Usage

## S3 method for class 'factor'
x[..., drop = FALSE]
## S3 method for class 'factor'
x[[...]]
## S3 replacement method for class 'factor'
x[...] <- value
## S3 replacement method for class 'factor'
x[[...]] <- value

Arguments

x

a factor.

...

a specification of indices – see Extract.

drop

logical. If true, unused levels are dropped.

value

character: a set of levels. Factor values are coerced to character.

Details

When unused levels are dropped the ordering of the remaining levels is preserved.

If value is not in levels(x), a missing value is assigned with a warning.

Any contrasts assigned to the factor are preserved unless drop = TRUE.

The [[ method supports argument exact.
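As a small illustration of the points above, the following sketch (the factor f and the replacement value "z" are made up for the example) shows that subsetting keeps unused levels unless drop = TRUE is given, and that assigning a value that is not a level yields a missing value:

```r
f <- factor(c("a", "b", "a", "c"))

# Subsetting keeps the full level set by default ...
kept <- f[f != "c"]
levels(kept)

# ... and drops unused levels only when asked to.
dropped <- f[f != "c", drop = TRUE]
levels(dropped)

# Assigning a value that is not in levels(f) gives NA (and a warning,
# suppressed here); the level set itself is unchanged.
g <- f
suppressWarnings(g[2] <- "z")
is.na(g[2])
levels(g)
```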

Value

A factor with the same set of levels as x unless drop = TRUE.

See Also

factor, Extract.

Examples

## following example(factor)
(ff <- factor(substring("statistics", 1:10, 1:10), levels = letters))
ff[, drop = TRUE]
factor(letters[7:10])[2:3, drop = TRUE]

Maxima and Minima

Description

Returns the (regular or parallel) maxima and minima of the input values.

pmax*() and pmin*() take one or more vectors as arguments, recycle them to common length and return a single vector giving the ‘parallel’ maxima (or minima) of the argument vectors.

Usage

max(..., na.rm = FALSE)
min(..., na.rm = FALSE)

pmax(..., na.rm = FALSE)
pmin(..., na.rm = FALSE)

pmax.int(..., na.rm = FALSE)
pmin.int(..., na.rm = FALSE)

Arguments

...

numeric or character arguments (see Note).

na.rm

a logical indicating whether missing values should be removed.

Details

max and min return the maximum or minimum of all the values present in their arguments, as integer if all are logical or integer, as double if all are numeric, and character otherwise.

If na.rm is FALSE an NA value in any of the arguments will cause a value of NA to be returned, otherwise NA values are ignored.

The minimum and maximum of a numeric empty set are +Inf and -Inf (in this order!) which ensures transitivity, e.g., min(x1, min(x2)) == min(x1, x2). For numeric x max(x) == -Inf and min(x) == +Inf whenever length(x) == 0 (after removing missing values if requested). However, pmax and pmin return NA if all the parallel elements are NA even for na.rm = TRUE.
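The empty-set convention can be checked directly. This is a minimal sketch with made-up vectors x1 and x2; the calls on empty input also emit a warning, which is suppressed here:

```r
x1 <- c(3, 1, 2)
x2 <- numeric(0)

# Empty numeric input: min() gives +Inf and max() gives -Inf.
empty_min <- suppressWarnings(min(x2))
empty_max <- suppressWarnings(max(x2))

# Transitivity: nesting min() over an empty piece changes nothing.
nested <- min(x1, suppressWarnings(min(x2)))
flat   <- min(x1, x2)

# The same applies when na.rm = TRUE removes every element.
all_na_max <- suppressWarnings(max(NA_real_, na.rm = TRUE))

# pmax() differs: an all-NA position stays NA even with na.rm = TRUE.
par_na <- pmax(NA_real_, NA_real_, na.rm = TRUE)
```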

pmax and pmin take one or more vectors (or matrices) as arguments and return a single vector giving the ‘parallel’ maxima (or minima) of the vectors. The first element of the result is the maximum (minimum) of the first elements of all the arguments, the second element of the result is the maximum (minimum) of the second elements of all the arguments and so on. Shorter inputs (of non-zero length) are recycled if necessary. Attributes (see attributes: such as names or dim) are copied from the first argument (if applicable, e.g., not for an S4 object).

pmax.int and pmin.int are faster internal versions only used when all arguments are atomic vectors and there are no classes: they drop all attributes. (Note that all versions fail for raw and complex vectors since these have no ordering.)

max and min are generic functions: methods can be defined for them individually or via the Summary group generic. For this to work properly, the arguments ... should be unnamed, and dispatch is on the first argument.

By definition the min/max of a numeric vector containing an NaN is NaN, except that the min/max of any vector containing an NA is NA even if it also contains an NaN. Note that max(NA, Inf) == NA even though the maximum would be Inf whatever the missing value actually is.
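A short sketch of the NaN and NA rules just described (the input values are arbitrary):

```r
# An NaN in the input makes the result NaN ...
with_nan <- max(c(1, NaN))
is.nan(with_nan)

# ... but an NA takes precedence over an NaN.
with_both <- max(c(NaN, NA_real_))
is.na(with_both) && !is.nan(with_both)

# NA wins even when the answer would be Inf for any actual value.
is.na(max(NA, Inf))

# With na.rm = TRUE the missing value is ignored.
max(c(1, NA, 3), na.rm = TRUE)
```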

Character versions are sorted lexicographically, and this depends on the collating sequence of the locale in use: the help for ‘Comparison’ gives details. The max/min of an empty character vector is defined to be character NA. (One could argue that as "" is the smallest character element, the maximum should be "", but there is no obvious candidate for the minimum.)

Value

For min or max, a length-one vector. For pmin or pmax, a vector of length the longest of the input vectors, or length zero if one of the inputs had zero length.

The type of the result will be that of the highest of the inputs in the hierarchy integer < double < character.
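The type hierarchy can be seen with typeof(). The inputs below are arbitrary and only meant as a sketch:

```r
# Logical inputs are treated as integer.
t_lgl <- typeof(max(TRUE, FALSE))

# Mixing integer and double gives double.
t_dbl <- typeof(max(1L, 2.5))

# Any character input makes the result character.
t_chr <- typeof(max(1L, "a"))

c(t_lgl, t_dbl, t_chr)
```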

For min and max if there are only numeric inputs and all are empty (after possible removal of NAs), the result is double (Inf or -Inf).

S4 methods

max and min are part of the S4 Summary group generic. Methods for them must use the signature x, ..., na.rm.

Note

‘Numeric’ arguments are vectors of type integer and numeric, and logical (coerced to integer). For historical reasons, NULL is accepted as equivalent to integer(0).

pmax and pmin will also work on classed S3 or S4 objects with appropriate methods for comparison, is.na and rep (if recycling of arguments is needed).

References

Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988) The New S Language. Wadsworth & Brooks/Cole.

See Also

range (both min and max) and which.min (which.max) for the arg min, i.e., the location where an extreme value occurs.

‘plotmath’ for the use of min in plot annotation.

Examples

require(stats); require(graphics)
 min(5:1, pi) #-> one number
pmin(5:1, pi) #->  5  numbers

x <- sort(rnorm(100));  cH <- 1.35
pmin(cH, quantile(x)) # no names
pmin(quantile(x), cH) # has names
plot(x, pmin(cH, pmax(-cH, x)), type = "b", main = "Huber's function")

cut01 <- function(x) pmax(pmin(x, 1), 0)
curve(      x^2 - 1/4, -1.4, 1.5, col = 2)
curve(cut01(x^2 - 1/4), col = "blue", add = TRUE, n = 500)
## pmax(), pmin() preserve attributes of *first* argument
D <- diag(x = (3:1)/4); n0 <- numeric()
stopifnot(identical(D,  cut01(D)),
          identical(n0, cut01(n0)),
          identical(n0, cut01(NULL)),
          identical(n0, pmax(3:1, n0, 2)),
          identical(n0, pmax(n0, 4)))

Report Versions of Third-Party Software

Description

Report versions of (external) third-party software used.

Usage

extSoftVersion()

Details

This reports the versions of third-party software libraries in use. These are often external but might have been compiled into R when it was installed.

With dynamic linking, these are the versions of the libraries linked to in this session: with static linking, of those compiled in.

Value

A named character vector, currently with components

zlib

The version of zlib in use.

bzlib

The version of bzlib (from bzip2) in use.

xz

The version of liblzma (from xz) in use.

libdeflate

The version of libdeflate (if any, otherwise "") used when R was built.

PCRE

The version of PCRE in use. PCRE1 has versions < 10.00, PCRE2 has versions >= 10.00.

ICU

The version of ICU in use (if any, otherwise "").

TRE

The version of libtre in use.

iconv

The implementation and version of the iconv library in use (if known).

readline

The version of readline in use (if any, otherwise ""). If using the emulation by libedit aka editline this will be "EditLine wrapper" preceded by the readline version it emulates: that is most likely to be seen on macOS.

BLAS

Name of the binary/executable file with the implementation of BLAS in use (if known, otherwise "").

Note that the values for bzlib and PCRE normally contain a date as well as the version number, and that for TRE includes several items separated by spaces, the version number being the second.

For iconv this will give the implementation as well as the version, for example "GNU libiconv 1.14", "glibc 2.18" or "win_iconv" (which has no version number).

The name of the binary/executable file for BLAS can be used as an indication of which implementation is in use. Typically, the R version of BLAS will appear as libR.so (libR.dylib), R or libRblas.so (libRblas.dylib), depending on how R was built. Note that libRblas.so (libRblas.dylib) may also be shown for an external BLAS implementation that had been copied, hard-linked or renamed by the system administrator. For an external BLAS, a shared object file will be given and its path/name may indicate the vendor/version. The detection does not work on Windows nor for some uses of the Accelerate framework on macOS.

See Also

libcurlVersion for the version of libCurl.

La_version for the version of LAPACK in use.

La_library for binary/executable file with LAPACK in use.

grSoftVersion for third-party graphics software.

tclVersion in package tcltk for the version of Tcl/Tk.

pcre_config for PCRE configuration options.

Examples

extSoftVersion()
## the PCRE version
sub(" .*", "", extSoftVersion()["PCRE"])

Factors

Description

The function factor is used to encode a vector as a factor (the terms ‘category’ and ‘enumerated type’ are also used for factors). If argument ordered is TRUE, the factor levels are assumed to be ordered. For compatibility with S there is also a function ordered.

is.factor, is.ordered, as.factor and as.ordered are the membership and coercion functions for these classes.

Usage

factor(x = character(), levels, labels = levels,
       exclude = NA, ordered = is.ordered(x), nmax = NA)

ordered(x = character(), ...)

is.factor(x)
is.ordered(x)

as.factor(x)
as.ordered(x)

addNA(x, ifany = FALSE)

.valid.factor(object)

Arguments

x

a vector of data, usually taking a small number of distinct values.

levels

an optional vector of the unique values (as character strings) that x might have taken. The default is the unique set of values taken by as.character(x), sorted into increasing order of x. Note that this set can be specified as smaller than sort(unique(x)).

labels

either an optional character vector of labels for the levels (in the same order as levels after removing those in exclude), or a character string of length 1. Duplicated values in labels can be used to map different values of x to the same factor level.

exclude

a vector of values to be excluded when forming the set of levels. This may be a factor with the same level set as x or should be a character vector.

ordered

logical flag to determine if the levels should be regarded as ordered (in the order given).

nmax

an upper bound on the number of levels; see ‘Details’.

...

(in ordered(.)): any of the above, apart from ordered itself.

ifany

only add an NA level if it is used, i.e. if any(is.na(x)).

object

an R object.

Details

The type of the vector x is not restricted; it only must have an as.character method and be sortable (by order).

Ordered factors differ from factors only in their class, but methods and model-fitting functions may treat the two classes quite differently, see options("contrasts").

The encoding of the vector happens as follows. First all the values in exclude are removed from levels. If x[i] equals levels[j], then the i-th element of the result is j. If no match is found for x[i] in levels (which will happen for excluded values) then the i-th element of the result is set to NA.
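The encoding rule can be traced on a tiny input. In this sketch the vector x and the chosen levels are made up for illustration:

```r
x <- c("a", "b", "c", "a")

# Codes are positions in 'levels'; "b" matches no level and becomes NA.
f <- factor(x, levels = c("c", "a"))
codes <- as.integer(f)

# Excluding a value removes it from the levels, so it is coded as NA too.
g <- factor(x, exclude = "a")
excl_levels <- levels(g)
excl_codes  <- as.integer(g)
```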

Normally the ‘levels’ used as an attribute of the result are the reduced set of levels after removing those in exclude, but this can be altered by supplying labels. This should either be a set of new labels for the levels, or a character string, in which case the levels are that character string with a sequence number appended.

factor(x, exclude = NULL) applied to a factor without NAs is a no-operation unless there are unused levels: in that case, a factor with the reduced level set is returned. If exclude is used, since R version 3.4.0, excluding non-existing character levels is equivalent to excluding nothing, and when exclude is a character vector, that is applied to the levels of x. Alternatively, exclude can be a factor with the same level set as x and will exclude the levels present in exclude.

The codes of a factor may contain NA. For a numeric x, set exclude = NULL to make NA an extra level (prints as ‘<NA>’); by default, this is the last level.

If NA is a level, the way to set a code to be missing (as opposed to the code of the missing level) is to use is.na on the left-hand-side of an assignment (as in is.na(f)[i] <- TRUE; indexing inside is.na does not work). Under those circumstances missing values are currently printed as ‘<NA>’, i.e., identical to entries of level NA.

is.factor is generic: you can write methods to handle specific classes of objects, see InternalMethods.

Where levels is not supplied, unique is called. Since factors typically have quite a small number of levels, for large vectors x it is helpful to supply nmax as an upper bound on the number of unique values.

When using c to combine a (possibly ordered) factor with other objects, if all objects are (possibly ordered) factors, the result will be a factor with levels the union of the level sets of the elements, in the order the levels occur in the level sets of the elements (which means that if all the elements have the same level set, that is the level set of the result), equivalent to how unlist operates on a list of factor objects.
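A minimal sketch of combining factors with c(), assuming an R version in which c() has this factor behaviour (R 4.1.0 or later, to the best of our knowledge). The factors f1 and f2 are made up:

```r
f1 <- factor(c("hi", "lo"), levels = c("lo", "hi"))
f2 <- factor(c("mid", "lo"), levels = c("lo", "mid", "hi"))

# Levels are the union of both level sets, in order of first occurrence.
f12 <- c(f1, f2)
levels(f12)
as.character(f12)

# unlist() on a list of factors is described as behaving the same way.
identical(f12, unlist(list(f1, f2)))
```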

Value

factor returns an object of class "factor" which has a set of integer codes the length of x with a "levels" attribute of mode character and unique (!anyDuplicated(.)) entries. If argument ordered is true (or ordered() is used) the result has class c("ordered", "factor"). Undocumentedly for a long time, factor(x) loses all attributes(x) but "names", and resets "levels" and "class".

Applying factor to an ordered or unordered factor returns a factor (of the same type) with just the levels which occur: see also [.factor for a more transparent way to achieve this.

is.factor returns TRUE or FALSE depending on whether its argument is of type factor or not. Correspondingly, is.ordered returns TRUE when its argument is an ordered factor and FALSE otherwise.

as.factor coerces its argument to a factor. It is an abbreviated (sometimes faster) form of factor.

as.ordered(x) returns x if this is ordered, and ordered(x) otherwise.

addNA modifies a factor by turning NA into an extra level (so that NA values are counted in tables, for instance).

.valid.factor(object) checks the validity of a factor, currently only levels(object), and returns TRUE if it is valid, otherwise a string describing the validity problem. This function is used for validObject(<factor>).

Warning

The interpretation of a factor depends on both the codes and the "levels" attribute. Be careful only to compare factors with the same set of levels (in the same order). In particular, as.numeric applied to a factor is meaningless, and may happen by implicit coercion. To transform a factor f to approximately its original numeric values, as.numeric(levels(f))[f] is recommended and slightly more efficient than as.numeric(as.character(f)).
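The pitfall and the recommended idiom can be compared side by side. This sketch uses a made-up numeric vector:

```r
f <- factor(c(10, 2.5, 10))

# as.numeric() returns the internal codes, not the original values.
codes <- as.numeric(f)

# Recommended: convert the levels once, then index by the factor.
values <- as.numeric(levels(f))[f]

# Equivalent, but converts every element rather than every level.
values2 <- as.numeric(as.character(f))
```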

The levels of a factor are by default sorted, but the sort order may well depend on the locale at the time of creation, and should not be assumed to be ASCII.

There are some anomalies associated with factors that have NA as a level. It is suggested to use them sparingly, e.g., only for tabulation purposes.

Comparison operators and group generic methods

There are "factor" and "ordered" methods for the group generic Ops which provide methods for the Comparison operators, and for the min, max, and range generics in Summary of "ordered". (The rest of the groups and the Math group generate an error as they are not meaningful for factors.)

Only == and != can be used for factors: a factor can only be compared to another factor with an identical set of levels (not necessarily in the same ordering) or to a character vector. Ordered factors are compared in the same way, but the general dispatch mechanism precludes comparing ordered and unordered factors.

All the comparison operators are available for ordered factors. Collation is done by the levels of the operands: if both operands are ordered factors they must have the same level set.
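A brief sketch of these comparison rules, using made-up factors sizes (ordered) and u (unordered):

```r
sizes <- factor(c("small", "large", "medium"),
                levels = c("small", "medium", "large"), ordered = TRUE)

# Ordered factors compare by level position, not alphabetically.
below_large <- sizes < "large"
largest <- max(sizes)

# Unordered factors support only == and !=.
u <- factor(c("x", "y"))
is_x <- u == "x"

# Other comparisons on an unordered factor give NA (with a warning).
not_meaningful <- suppressWarnings(u < "y")
```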

Note

In earlier versions of R, storing character data as a factor was more space efficient if there was even a small proportion of repeats. However, identical character strings now share storage, so the difference is small in most cases. (Integer values are stored in 4 bytes whereas each reference to a character string needs a pointer of 4 or 8 bytes.)

References

Chambers, J. M. and Hastie, T. J. (1992) Statistical Models in S. Wadsworth & Brooks/Cole.

See Also

[.factor for subsetting of factors.

gl for construction of balanced factors and C for factors with specified contrasts. levels and nlevels for accessing the levels, and unclass to get integer codes.

Examples

(ff <- factor(substring("statistics", 1:10, 1:10), levels = letters))
as.integer(ff)      # the internal codes
(f. <- factor(ff))  # drops the levels that do not occur
ff[, drop = TRUE]   # the same, more transparently

factor(letters[1:20], labels = "letter")

class(ordered(4:1)) # "ordered", inheriting from "factor"
z <- factor(LETTERS[3:1], ordered = TRUE)
## and "relational" methods work:
stopifnot(sort(z)[c(1, 3)] == range(z), min(z) < max(z))

## suppose you want "NA" as a level, and to allow missing values.
(x <- factor(c(1, 2, NA), exclude = NULL))
is.na(x)[2] <- TRUE
x  # [1] 1    <NA> <NA>
is.na(x)
# [1] FALSE  TRUE FALSE

## More rational, since R 3.4.0 :
factor(c(1:2, NA), exclude = "")   # keeps <NA> , as
factor(c(1:2, NA), exclude = NULL) # always did
## exclude = <character>
z # ordered levels 'A < B < C'
factor(z, exclude = "C") # does exclude
factor(z, exclude = "B") # ditto

## Now, labels maybe duplicated:
## factor() with duplicated labels allowing to "merge levels"
x <- c("Man", "Male", "Man", "Lady", "Female")
## Map from 4 different values to only two levels:
(xf <- factor(x, levels = c("Male", "Man", "Lady", "Female"),
                 labels = c("Male", "Male", "Female", "Female")))
#> [1] Male   Male   Male   Female Female
#> Levels: Male Female

## Using addNA()
Month <- airquality$Month
table(addNA(Month))
table(addNA(Month, ifany = TRUE))

Ascertain File Accessibility

Description

Utility function to access information about files on the user's file systems.

Usage

file.access(names, mode=0)

Arguments

names

character vector containing file names. Tilde-expansion will be done: see path.expand.

mode

integer specifying access mode required: see ‘Details’.

Details

The mode value can be the exclusive or (xor), i.e., a partial sum of the following values, and hence must be in 0:7,

0

test for existence.

1

test for execute permission.

2

test for write permission.

4

test for read permission.

Permission will be computed for real user ID and real group ID (rather than the effective IDs).

Please note that it is not a good idea to use this function to test before trying to open a file. On a multi-tasking system, it is possible that the accessibility of a file will change between the time you call file.access() and the time you try to open the file. It is better to wrap file open attempts in try.
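One way to follow this advice is to attempt the open and handle failure, rather than testing first. The helper below, read_first_line(), is a hypothetical name introduced only for this sketch, and it uses tryCatch instead of try:

```r
# Hypothetical helper: returns the first line of a file, or NA if the
# file cannot be opened for reading.
read_first_line <- function(path) {
  con <- tryCatch(suppressWarnings(file(path, open = "r")),
                  error = function(e) NULL)
  if (is.null(con)) return(NA_character_)
  on.exit(close(con))
  readLines(con, n = 1L)
}

f <- tempfile()
writeLines(c("alpha", "beta"), f)
ok_line  <- read_first_line(f)
bad_line <- read_first_line(tempfile())  # file does not exist
```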

Value

An integer vector with values 0 for success and -1 for failure.

Note

This was written as a replacement for the S-PLUS function access, a wrapper for the C function of the same name, which explains the return value encoding. Note that the return value is false for success.
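Because success is encoded as 0, the result should be compared with 0 rather than used directly as a condition. A minimal sketch with temporary files:

```r
f <- tempfile()
invisible(file.create(f))

# Compare with 0: the raw value would read as FALSE on success.
exists_ok   <- all(file.access(f, mode = 0) == 0)
readable_ok <- all(file.access(f, mode = 4) == 0)

# A file that was never created fails the existence test (-1).
missing_ok <- all(file.access(tempfile(), mode = 0) == 0)
```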

See Also

file.info for more details on permissions, Sys.chmod to change permissions, and try for a ‘test it and see’ approach.

file_test for shell-style file tests.

Examples

fa <- file.access(dir("."))
table(fa) # count successes & failures

Choose a File Interactively

Description

Choose a file interactively.

Usage

file.choose(new=FALSE)

Arguments

new

Logical: choose the style of dialog box presented to the user: at present only new = FALSE is used.

Value

A character vector of length one giving the file path.

See Also

list.files for non-interactive selection.


Extract File Information

Description

Utility function to extract information about files on the user's file systems.

Usage

file.info(..., extra_cols = TRUE)

file.mode(...)
file.mtime(...)
file.size(...)

Arguments

...

character vectors containing file paths. Tilde-expansion is done: see path.expand.

extra_cols

logical: return all cols rather than just the first six.

Details

What constitutes a ‘file’ is OS-dependent but includes directories. (However, directory names must not include a trailing backslash or slash on Windows.) See also the section in the help for file.exists on case-insensitive file systems.

The file ‘mode’ follows POSIX conventions, giving three octal digits summarizing the permissions for the file owner, the owner's group and for anyone respectively. Each digit is the logical or of read (4), write (2) and execute/search (1) permissions.
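The three octal digits can be separated with integer arithmetic. This sketch decodes a made-up mode of 644 rather than one read from a real file:

```r
mode <- as.octmode("644")   # rw-r--r--
m <- as.integer(mode)       # decimal value of octal 644

# Each octal digit is one permission triple.
owner <- (m %/% 64L) %% 8L
group <- (m %/% 8L) %% 8L
other <- m %% 8L

# Within a digit: read = 4, write = 2, execute/search = 1.
owner_can_write <- bitwAnd(owner, 2L) != 0L
other_can_write <- bitwAnd(other, 2L) != 0L
```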

See files for how file paths with marked encodings are interpreted.

On unix alikes:

On most systems symbolic links are followed, so information is given about the file to which the link points rather than about the link.

On Windows:

File modes are probably only useful on NTFS file systems, and it seems all three digits refer to the file's owner. The execute/search bits are set for directories, and for files based on their extensions (e.g., ‘.exe’, ‘.com’, ‘.cmd’ and ‘.bat’ files). file.access will give a more reliable view of read/write access availability to the R process.

UTF-8-encoded file names not valid in the current locale can be used.

Junction points and symbolic links are followed, so information is given about the file/directory to which the link points rather than about the link.

Value

For file.info(), data frame with row names the file names and columns

size

double: File size in bytes.

isdir

logical: Is the file a directory?

mode

integer of class "octmode". The file permissions, printed in octal, for example 644.

mtime, ctime, atime

object of class "POSIXct": file modification, ‘last status change’ and last access times.

On unix alikes:
uid:

integer, the user ID of the file's owner.

gid:

integer, the group ID of the file's group.

uname:

character, uid interpreted as a user name.

grname:

character, gid interpreted as a group name.

Unknown user and group names will be NA.

On Windows only:
exe:

character indicating the sort of executable. Possible values are "no", "msdos", "win16", "win32", "win64" and "unknown". Note that a file (e.g., a script file) can be executable according to the mode bits but not executable in this sense.

If extra_cols is false, only the first six columns are returned: as these can all be found from a single C system call this can be faster. (However, properly configured systems will use a ‘name service cache daemon’ to speed up the name lookups.)

Entries for non-existent or non-readable files will be NA.

The uid, gid, uname and grname columns may not be supplied on a non-POSIX Unix-alike system, and will not be on Windows.

What is meant by the three file times depends on the OS and file system. On Windows native file systems ctime is the file creation time (something which is not recorded on most Unix-alike file systems). What is meant by ‘file access’ and hence the ‘last access time’ is system-dependent.

The resolution of the file times depends on both the OS and the type of the file system. Modern file systems typically record times to an accuracy of a microsecond or better: notable exceptions are HFS+ on macOS (recorded in seconds) and modification time on older FAT systems (recorded in increments of 2 seconds). Note that "POSIXct" times are by default printed in whole seconds: to change that see strftime.

file.mode(), file.mtime() and file.size() are fast convenience wrappers returning just one of the columns.
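A small sketch relating the wrappers to the corresponding file.info() columns, using a temporary file of known size:

```r
f <- tempfile()
writeBin(as.raw(1:10), f)   # exactly 10 bytes

info <- file.info(f)

# Each wrapper returns one column of the full result.
size_matches  <- file.size(f) == info$size
mtime_matches <- file.mtime(f) == info$mtime

# Directories count as files; non-existent paths give NA.
dir_flag     <- file.info(tempdir())$isdir
missing_size <- file.size(tempfile())
```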

Note

Some (now old) unix alike systems allow files of more than 2Gb to be created but not accessed by the stat system call. Such files may show up as non-readable (and very likely not be readable by any of R's input functions).

See Also

Sys.readlink to find out about symbolic links, files, file.access, list.files, and DateTimeClasses for the date formats.

Sys.chmod to change permissions.

Examples

ncol(finf <- file.info(dir()))  # at least six
finf # the whole list
## Those that are more than 100 days old :
finf <- file.info(dir(), extra_cols = FALSE)
finf[difftime(Sys.time(), finf[, "mtime"], units = "days") > 100, 1:4]

file.info("no-such-file-exists")

## E.g., for R-core, in a R-devel version:
if(Sys.info()[["sysname"]] == "Linux")
     sort(file.mtime(file.path(R.home("bin"),
                               c("",
                                 file.path(c("", "exec"), "R")))))

Construct Path to File

Description

Construct the path to a file from components in a platform-independent way.

Usage

file.path(..., fsep = .Platform$file.sep)

Arguments

...

character vectors. Long vectors are not supported.

fsep

the path separator to use (assumed to be ASCII).

Details

The implementation is designed to be fast (faster than paste) as this function is used extensively in R itself.

It can also be used for environment paths such as PATH and R_LIBS with fsep = .Platform$path.sep.

Trailing path separators are invalid for Windows file paths apart from ‘/’ and ‘d:/’ (although some functions/utilities do accept them), so a trailing / or \ is removed there.

Value

A character vector of the arguments concatenated term-by-term and separated by fsep if all arguments have positive length; otherwise, an empty character vector (unlike paste).
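The following sketch illustrates the term-by-term behaviour and the difference from paste for zero-length input. The directory and file names are made up:

```r
# Components are joined term-by-term, recycling shorter arguments.
p1 <- file.path("data", c("a.csv", "b.csv"))

# A zero-length argument gives a zero-length result ...
p2 <- file.path("data", character(0))

# ... whereas paste() treats it as an empty string.
p3 <- paste("data", character(0), sep = "/")

# Building a PATH-style value with the platform's path separator.
p4 <- file.path("/usr/local/bin", "/usr/bin", fsep = .Platform$path.sep)
```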

An element of the result will be marked (see Encoding) as UTF-8 if run in a UTF-8 locale (when marked inputs are converted to UTF-8) or if a component of the result is marked as UTF-8, or as Latin-1 in a non-Latin-1 locale.

Note

The components are by default separated by / (not \) on Windows.

See Also

basename, normalizePath, path.expand.


Display One or More Text Files

Description

Display one or more (plain) text files, in a platform specific way, typically via a ‘pager’.

Usage

file.show(..., header = rep("", nfiles),
          title = "R Information",
          delete.file = FALSE, pager = getOption("pager"),
          encoding = "")

Arguments

...

one or more character vectors containing the names of the files to be displayed. Paths will have tilde expansion.

header

character vector (of the same length as the number of files specified in ...) giving a header for each file being displayed. Defaults to empty strings.

title

an overall title for the display. If a single separate window is used for the display, title will be used as the window title. If multiple windows are used, their titles should combine the title and the file-specific header.

delete.file

should the files be deleted after display? Used for temporary files.

pager

the pager to be used, see ‘Details’.

encoding

character string giving the encoding to be assumed for the file(s).

Details

This function provides the core of the R help system, but it can be used for other purposes as well, such as page.

How the pager is implemented is highly system-dependent.

The basic Unix version concatenates the files (using the headers) to a temporary file, and displays it in the pager selected by the pager argument, which is a character vector specifying a system command (a full path or a command found on the PATH) to run on the set of files. The ‘factory-fresh’ default is to use ‘R_HOME/bin/pager’, which is a shell script running the command-line specified by the environment variable PAGER whose default is set at configuration, usually to less. On a Unix-alike more is used if pager is empty.

Most GUI systems will use a separate pager window for each file, and let the user leave it up while R continues running. The selection of such pagers could either be done using special pager names being intercepted by lower-level code (such as "internal" and "console" on Windows), or by letting pager be an R function which will be called with arguments (files, header, title, delete.file) corresponding to the first four arguments of file.show and take care of interfacing to the GUI.
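As a sketch of the function form of pager, the hypothetical collect_pager() below simply reads the files into a variable instead of displaying them. It assumes the four-argument calling convention described above, and the name and behaviour are illustrative only:

```r
shown <- NULL

# Hypothetical pager: stores the text rather than opening a viewer.
collect_pager <- function(files, header, title, delete.file) {
  shown <<- unlist(lapply(files, readLines))
  if (delete.file) unlink(files)
  invisible(NULL)
}

f <- tempfile(fileext = ".txt")
writeLines(c("first line", "second line"), f)
file.show(f, pager = collect_pager)
shown
```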

The R.app GUI on macOS uses its internal pager irrespective of the setting of pager.

Not all implementations will honour delete.file. In particular, using an external pager on Windows does not, as there is no way to know when the external application has finished with the file.

Author(s)

Ross Ihaka, Brian Ripley.

See Also

file.exists, list.files.

Text-type help and RShowDoc call file.show.

Consider getOption("pdfviewer") and, e.g., system for displaying pdf files.

file.edit.

Examples

file.show(file.path(R.home("doc"), "COPYRIGHTS"))

File Manipulation

Description

These functions provide a low-level interface to the computer's file system.

Usage

file.create(..., showWarnings = TRUE)
file.exists(...)
file.remove(...)
file.rename(from, to)
file.append(file1, file2)
file.copy(from, to, overwrite = recursive, recursive = FALSE,
          copy.mode = TRUE, copy.date = FALSE)
file.symlink(from, to)
file.link(from, to)

Arguments

..., file1, file2

character vectors, containing file names or paths.

from, to

character vectors, containing file names or paths. For file.copy and file.symlink, to can alternatively be the path to a single existing directory.

overwrite

logical; should existing destination files be overwritten?

showWarnings

logical; should the warnings on failure be shown?

recursive

logical. If to is a directory, should directories in from be copied (and their contents)? (Like cp -R on POSIX OSes.)

copy.mode

logical: should file permission bits be copied where possible?

copy.date

logical: should file dates be preserved where possible? See Sys.setFileTime.

Details

The ... arguments are concatenated to form one character vector: you can specify the files separately or as one vector. All of these functions expand path names: see path.expand. (file.exists silently reports false for paths that would be too long after expansion: the rest will give a warning.)

file.create creates files with the given names if they do not already exist and truncates them if they do. They are created with the maximal read/write permissions allowed by the ‘umask’ setting (where relevant). By default a warning is given (with the reason) if the operation fails.

file.exists returns a logical vector indicating whether the files named by its argument exist. (Here ‘exists’ is in the sense of the system's stat call: a file will be reported as existing only if you have the permissions needed by stat. Existence can also be checked by file.access, which might use different permissions and so obtain a different result. Note that the existence of a file does not imply that it is readable: for that use file.access.) What constitutes a ‘file’ is system-dependent, but should include directories. (However, directory names must not include a trailing backslash or slash on Windows.) Note that if the file is a symbolic link on a Unix-alike, the result indicates if the link points to an actual file, not just if the link exists. On Windows, the result is unreliable for a broken symbolic link (junction). Lastly, note the different function exists which checks for existence of R objects.

file.remove attempts to remove the files named in its argument. On most Unix platforms ‘file’ includes empty directories, symbolic links, fifos and sockets. On Windows, ‘file’ means a regular file and not, say, an empty directory.

file.rename attempts to rename files (and from and to must be of the same length). Where file permissions allow this will overwrite an existing element of to. This is subject to the limitations of the OS's corresponding system call (see something like man 2 rename on a Unix-alike): in particular in the interpretation of ‘file’: most platforms will not rename files from one file system to another. NB: This means that renaming a file from a temporary directory to the user's filespace or during package installation will often fail. (On Windows, file.rename can rename files but not directories across volumes.) On platforms which allow directories to be renamed, typically neither or both of from and to must be a directory, and if to exists it must be an empty directory.

file.append attempts to append the files named by its second argument to those named by its first. The R subscript recycling rule is used to align names given in vectors of different lengths.
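A minimal sketch of file.append with two temporary files (the contents are made up):

```r
a <- tempfile()
b <- tempfile()
writeLines("one", a)
writeLines("two", b)

# Append the contents of b to a; b itself is left unchanged.
appended <- file.append(a, b)

readLines(a)
readLines(b)
```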

file.copy works in a similar way to file.append but with the arguments in the natural order for copying. Copying to existing destination files is skipped unless overwrite = TRUE. The to argument can specify a single existing directory. If copy.mode = TRUE file read/write/execute permissions are copied where possible, restricted by ‘umask’. (On Windows this applies only to files.) Other security attributes such as ACLs are not copied. On a POSIX filesystem the targets of symbolic links will be copied rather than the links themselves, and hard links are copied separately. Using copy.date = TRUE may or may not copy the timestamp exactly (for example, fractional seconds may be omitted), but is more likely to do so as from R 3.4.0.
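The effect of overwrite can be seen by copying to the same destination twice. This is a sketch with temporary files and made-up contents:

```r
src <- tempfile()
dst <- tempfile()

writeLines("version 1", src)
first <- file.copy(src, dst)                     # dst does not exist yet

writeLines("version 2", src)
second <- file.copy(src, dst)                    # skipped: dst exists
after_skip <- readLines(dst)

third <- file.copy(src, dst, overwrite = TRUE)   # replaces dst
after_overwrite <- readLines(dst)
```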

file.symlink andfile.link make symbolic and hard linkson those file systems which support them. Forfile.symlink theto argument can specify a single existing directory. (Unix andmacOS native filesystems support both. Windows has hard links tofiles onNTFS file systems and concepts related to symbolic links onrecent versions: see the section below on the Windows version of thishelp page. What happens on a FAT orSMB-mounted file system is OS-specific.)

File arguments with a marked encoding (seeEncoding areif possible translated to the native encoding, except on Windows whereUnicode file operations are used (so marking as UTF-8 can be used toaccess file paths not in the native encoding on suitable filesystems).

Value

These functions return a logical vector indicating which operation succeeded for each of the files attempted. Using a missing value for a file or path name will always be regarded as a failure.

If showWarnings = TRUE, file.create will give a warning for an unexpected failure.

Case-insensitive file systems

Case-insensitive file systems are the norm on Windows and macOS, but can be found on all OSes (for example a FAT-formatted USB drive is probably case-insensitive).

These functions will most likely match existing files regardless of case on such file systems: however this is an OS function and it is possible that file names might be mapped to upper or lower case.

Warning

Always check the return value of these functions when used in package code. This is especially important for file.rename, which has OS-specific restrictions (and note that the session temporary directory is commonly on a different file system from the working directory): it is only portable to use file.rename to change file name(s) within a single directory.
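As an illustration of the advice above, the following is a minimal sketch of a move operation that checks every return value and falls back to copy-and-remove when file.rename fails (for example across file systems). The helper name move_file is hypothetical and not part of base R; the fallback uses overwrite = TRUE on the assumption that the caller wants the same overwriting behaviour that file.rename has where permissions allow.

```r
# Hypothetical helper (not part of base R): move a single file,
# checking each return value along the way.
move_file <- function(from, to) {
  stopifnot(length(from) == 1L, length(to) == 1L)
  # Try the cheap path first; a failed rename warns and returns FALSE.
  if (suppressWarnings(file.rename(from, to))) return(TRUE)
  # Fallback: copy, and remove the original only if the copy succeeded.
  if (!file.copy(from, to, overwrite = TRUE)) return(FALSE)
  file.remove(from)
}

src <- tempfile("src-")
dst <- tempfile("dst-")
writeLines("payload", src)

ok <- move_file(src, dst)
if (!ok) warning("could not move ", src)
```

Note that the fallback is not atomic: if the session is interrupted between the copy and the removal, both files exist.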

Author(s)

Ross Ihaka, Brian Ripley

See Also

file.info, file.access, file.path, file.show, list.files, unlink, basename, path.expand.

dir.create.

Sys.glob to expand wildcards in file specifications.

file_test, Sys.readlink (for ‘symlink’s).

https://en.wikipedia.org/wiki/Hard_link and https://en.wikipedia.org/wiki/Symbolic_link for the concepts of links and their limitations.

Examples

cat("file A\n", file = "A")
cat("file B\n", file = "B")
file.append("A", "B")
file.create("A") # (trashing previous)
file.append("A", rep("B", 10))
if(interactive()) file.show("A") # -> the 10 lines from 'B'
file.copy("A", "C")
dir.create("tmp")
file.copy(c("A", "B"), "tmp")
list.files("tmp") # -> "A" and "B"
setwd("tmp")
file.remove("A") # the tmp/A file
file.symlink(file.path("..", c("A", "B")), ".")
# |--> (TRUE,FALSE) : ok for A but not B as it exists already
setwd("..")
unlink("tmp", recursive = TRUE)
file.remove("A", "B", "C")

Manipulation of Directories and File Permissions

Description

These functions provide a low-level interface to the computer's file system.

Usage

dir.exists(paths)
dir.create(path, showWarnings = TRUE, recursive = FALSE, mode = "0777")
Sys.chmod(paths, mode = "0777", use_umask = TRUE)
Sys.umask(mode = NA)

Arguments

path

a character vector containing a single path name. Tilde expansion (see path.expand) is done.

paths

character vectors containing file or directory paths. Tilde expansion (see path.expand) is done.

showWarnings

logical; should the warnings on failure be shown?

recursive

logical. Should elements of the path other than the last be created? If true, like the Unix command mkdir -p.

mode

the mode to be used on Unix-alikes: it will be coerced by as.octmode. For Sys.chmod it is recycled along paths.

use_umask

logical: should the mode be restricted by the umask setting?

Details

dir.exists checks that the paths exist (in the same sense as file.exists) and are directories.

dir.create creates the last element of the path, unless recursive = TRUE. Trailing path separators are discarded.

The mode will be modified by the umask setting in the same way as for the system function mkdir. What modes can be set is OS-dependent, and it is unsafe to assume that more than three octal digits will be used. For more details see your OS's documentation on the system call mkdir, e.g. man 2 mkdir (and not that on the command-line utility of that name).

One of the idiosyncrasies of Windows is that directory creation may report success but create a directory with a different name, for example dir.create("G.S.") creates ‘"G.S"’. This is undocumented, and the precise circumstances are unknown (and might depend on the version of Windows). Also avoid directory names with a trailing space.

Sys.chmod sets the file permissions of one or more files. It may not be supported on a system (when a warning is issued). See the comments for dir.create for how modes are interpreted. Changing mode on a symbolic link is unlikely to work (nor be necessary). For more details see your OS's documentation on the system call chmod, e.g. man 2 chmod (and not that on the command-line utility of that name). Whether this changes the permission of a symbolic link or its target is OS-dependent (although to change the target is more common, and POSIX does not support modes for symbolic links: BSD-based Unixes do, though).

Sys.umask sets the umask and returns the previous value: as a special case mode = NA just returns the current value. It may not be supported (when a warning is issued and "0" is returned). For more details see your OS's documentation on the system call umask, e.g. man 2 umask.

How modes are handled depends on the file system, even on Unix-alikes (although their documentation is often written assuming a POSIX file system). So treat documentation cautiously if you are using, say, a FAT/FAT32 or network-mounted file system.

See files for how file paths with marked encodings are interpreted.

Value

dir.exists returns a logical vector of TRUE or FALSE values (without names).

dir.create and Sys.chmod return invisibly a logical vector indicating if the operation succeeded for each of the files attempted. Using a missing value for a path name will always be regarded as a failure. dir.create indicates failure if the directory already exists. If showWarnings = TRUE, dir.create will give a warning for an unexpected failure (e.g., not for a missing value nor for an already existing component for recursive = TRUE).

Sys.umask returns the previous value of the umask, as a length-one object of class "octmode": the visibility flag is off unless mode is NA.

See also the section in the help for file.exists on case-insensitive file systems for the interpretation of path and paths.
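Because dir.create reports failure when the directory already exists, code that only needs the directory to be present afterwards usually combines it with dir.exists. The sketch below shows one way to do so; the helper name ensure_dir is hypothetical, and the example works under a temporary path so that it leaves nothing behind.

```r
# Hypothetical helper: make sure a directory exists, treating
# "already there" as success rather than failure.
ensure_dir <- function(path) {
  if (dir.exists(path)) return(TRUE)
  # showWarnings = FALSE: the return value is inspected instead.
  created <- dir.create(path, recursive = TRUE, showWarnings = FALSE)
  # Re-check in case something else created it in the meantime.
  created || dir.exists(path)
}

root   <- tempfile("tree-")
nested <- file.path(root, "a", "b")

first  <- ensure_dir(nested)                        # creates root/a/b
second <- ensure_dir(nested)                        # already present
plain  <- dir.create(nested, showWarnings = FALSE)  # FALSE: it exists
unlink(root, recursive = TRUE)
```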

Author(s)

Ross Ihaka, Brian Ripley

See Also

file.info, file.exists, file.path, list.files, unlink, basename, path.expand.

Examples

## Not run:
## Fix up maximal allowed permissions in a file tree
Sys.chmod(list.dirs("."), "777")
f <- list.files(".", all.files = TRUE, full.names = TRUE, recursive = TRUE)
Sys.chmod(f, (file.mode(f) | "664"))
## End(Not run)

Find Packages

Description

Find the paths to one or more packages.

Usage

find.package(package, lib.loc = NULL, quiet = FALSE,
             verbose = getOption("verbose"))

path.package(package, quiet = FALSE)

packageNotFoundError(package, lib.loc, call = NULL)

Arguments

package

character vector: the names of packages.

lib.loc

a character vector describing the location of R library trees to search through, or NULL. The default value of NULL corresponds to checking the loaded namespace, then all libraries currently known in .libPaths().

quiet

logical. Should this not give warnings or an error if the package is not found?

verbose

a logical. If TRUE, additional diagnostics are printed, notably when a package is found more than once.

call

call expression.

Details

find.package returns the paths to the locations where the given packages are found. If lib.loc is NULL, then loaded namespaces are searched before the libraries. If a package is found more than once, the first match is used. Unless quiet = TRUE a warning will be given about the named packages which are not found, and an error if none are. If verbose is true, warnings about packages found more than once are given. For a package to be returned it must contain either a ‘Meta’ subdirectory or a ‘DESCRIPTION’ file containing a valid version field, but it need not be installed (it could be a source package if lib.loc was set suitably).

find.package is not usually the right tool to find out if a package is available for use: the only way to do that is to use require to try to load it. It need not be installed for the correct platform, it might have a version requirement not met by the running version of R, there might be dependencies which are not available, ....
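A minimal sketch of such an availability check follows. It uses requireNamespace rather than require (a substitution made here so that the search path is left untouched; it likewise attempts a real load). The helper name pkg_usable and the package name "noSuchPackage.xyz" are made up for illustration, the latter being assumed not to be installed.

```r
# Hypothetical helper: TRUE only if the package's namespace actually loads.
# requireNamespace() loads without attaching to the search path.
pkg_usable <- function(pkg) {
  isTRUE(suppressWarnings(requireNamespace(pkg, quietly = TRUE)))
}

usable_stats <- pkg_usable("stats")              # ships with R
usable_bogus <- pkg_usable("noSuchPackage.xyz")  # assumed not installed
```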

path.package returns the paths from which the named packages were loaded, or if none were named, for all currently attached packages. Unless quiet = TRUE it will warn if some of the packages named are not attached, and give an error if none are.

packageNotFoundError creates an error condition object of class packageNotFoundError for signaling errors. The condition object contains the fields package and lib.loc.
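The condition class can be used as a handler name in tryCatch. The following sketch assumes that "noSuchPackage.xyz" (a made-up name) is not installed, so that find.package signals the error described above.

```r
# "noSuchPackage.xyz" is a made-up name assumed not to be installed.
res <- tryCatch(
  find.package("noSuchPackage.xyz"),
  packageNotFoundError = function(e) {
    # The condition object carries the 'package' and 'lib.loc' fields.
    list(missing = e$package, searched = e$lib.loc)
  }
)

# With quiet = TRUE nothing is signalled; the result is simply empty.
none <- find.package("noSuchPackage.xyz", quiet = TRUE)
```

Handling the specific class rather than a generic error handler keeps unrelated failures visible.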

Value

A character vector of paths of package directories.

See Also

path.expand and normalizePath for path standardization.

Examples

try(find.package("knitr"))
## will not give an error, maybe a warning about *all* locations it is found:
find.package("kitty", quiet = TRUE, verbose = TRUE)

## Find all .libPaths() entries a package is found:
findPkgAll <- function(pkg)
  unlist(lapply(.libPaths(), function(lib)
           find.package(pkg, lib, quiet = TRUE, verbose = FALSE)))

findPkgAll("MASS")
findPkgAll("knitr")

Find Interval Numbers or Indices

Description

Given a vector of non-decreasing breakpoints in vec, find the interval containing each element of x; i.e., if i <- findInterval(x,v), for each index j in x, $v_{i_j} \le x_j < v_{i_j + 1}$ where $v_0 := -\infty$, $v_{N+1} := +\infty$, and N <- length(v). At the two boundaries, the returned index may differ by 1, depending on the optional arguments rightmost.closed and all.inside.

Usage

findInterval(x, vec, rightmost.closed = FALSE, all.inside = FALSE,
             left.open = FALSE)

Arguments

x

numeric.

vec

numeric, sorted (weakly) increasingly, of length N, say.

rightmost.closed

logical; if true, the rightmost interval, vec[N-1] .. vec[N] is treated as closed, see below.

all.inside

logical; if true, the returned indices are coerced into 1,...,N-1, i.e., 0 is mapped to 1 and N to N-1.

left.open

logical; if true all the intervals are open at left and closed at right; in the formulas below, $\le$ should be swapped with $<$ (and $>$ with $\ge$), and rightmost.closed means ‘leftmost is closed’. This may be useful, e.g., in survival analysis computations.

Details

The function findInterval finds the index of one vector x in another, vec, where the latter must be non-decreasing. Although this is conceptually trivial, being equivalent to apply( outer(x, vec, `>=`), 1, sum), the internal algorithm uses interval search, ensuring $O(n \log N)$ complexity where n <- length(x) (and N <- length(vec)). For (almost) sorted x, it will be even faster, basically $O(n)$.

This is the same computation as for the empirical distribution function, and indeed, findInterval(t, sort(X)) is identical to $n F_n(t; X_1,\dots,X_n)$ where $F_n$ is the empirical distribution function of $X_1,\dots,X_n$.

When rightmost.closed = TRUE, the result for x[j] = vec[N] ($= \max vec$), is N - 1 as for all other values in the last interval.

left.open = TRUE is occasionally useful, e.g., for survival data. For (anti-)symmetry reasons, it is equivalent to using “mirrored” data, i.e., the following is always true:

    identical(
          findInterval( x,  v,      left.open= TRUE, ...) ,
      N - findInterval(-x, -v[N:1], left.open=FALSE, ...) )

where N <- length(vec) as above.
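To make the boundary rules above concrete, the following small sketch evaluates the same x against the same breakpoints under each option, and compares the default result with the brute-force definition. The data are invented for illustration.

```r
x <- c(0, 5, 7, 10, 15, 20)
v <- c(5, 10, 15)              # N = 3 breakpoints

default   <- findInterval(x, v)                           # 0 1 1 2 3 3
closed    <- findInterval(x, v, rightmost.closed = TRUE)  # 0 1 1 2 2 3
inside    <- findInterval(x, v, all.inside = TRUE)        # 1 1 1 2 2 2
left_open <- findInterval(x, v, left.open = TRUE)         # 0 0 1 1 2 3

# The brute-force definition from the text: count breakpoints <= x[j].
brute <- apply(outer(x, v, `>=`), 1, sum)

# The mirroring identity for left.open = TRUE:
N <- length(v)
mirrored <- N - findInterval(-x, -v[N:1])
```

Only the elements equal to a breakpoint (5, 10, 15) or outside range(v) change between the variants; 7 is in interval 1 throughout.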

Value

vector of length length(x) with values in 0:N (and NA) where N <- length(vec), or values coerced to 1:(N-1) if and only if all.inside = TRUE (equivalently coercing all x values inside the intervals). Note that NAs are propagated from x, and Inf values are allowed in both x and vec.

Author(s)

Martin Maechler

See Also

approx(*, method = "constant") which is a generalization of findInterval(), ecdf for computing the empirical distribution function which is (up to a factor of $n$) also basically the same as findInterval(.).

Examples

x <- 2:18
v <- c(5, 10, 15) # create two bins [5,10) and [10,15)
cbind(x, findInterval(x, v))

N <- 100
X <- sort(round(stats::rt(N, df = 2), 2))
tt <- c(-100, seq(-2, 2, length.out = 201), +100)
it <- findInterval(tt, X)
tt[it < 1 | it >= N] # only first and last are outside range(X)

##  'left.open = TRUE' means  "mirroring" :
N <- length(v)
stopifnot(identical(
                  findInterval( x,  v,  left.open = TRUE),
              N - findInterval(-x, -v[N:1])))

Force Evaluation of an Argument

Description

Forces the evaluation of a function argument.

Usage

force(x)

Arguments

x

a formal argument of the enclosing function.

Details

force forces the evaluation of a formal argument. This can be useful if the argument will be captured in a closure by the lexical scoping rules and will later be altered by an explicit assignment or an implicit assignment in a loop or an apply function.

Note

This is semantic sugar: just evaluating the symbol will do the same thing (see the examples).

force does not force the evaluation of other promises. (It works by forcing the promise that is created when the actual arguments of a call are matched to the formal arguments of a closure, the mechanism which implements lazy evaluation.)

Examples

f <- function(y) function() y
lf <- vector("list", 5)
for (i in seq_along(lf)) lf[[i]] <- f(i)
lf[[1]]()  # returns 5

g <- function(y) { force(y); function() y }
lg <- vector("list", 5)
for (i in seq_along(lg)) lg[[i]] <- g(i)
lg[[1]]()  # returns 1

## This is identical to
g <- function(y) { y; function() y }

Call a function with Some Arguments Forced

Description

Call a function with a specified number of leading arguments forced before the call if the function is a closure.

Usage

forceAndCall(n, FUN, ...)

Arguments

n

number of leading arguments to force.

FUN

function to call.

...

arguments to FUN.

Details

forceAndCall calls the function FUN with arguments specified in .... If the value of FUN is a closure then the first n arguments to the function are evaluated (i.e. their delayed evaluation promises are forced) before executing the function body. If the value of FUN is a primitive then the call FUN(...) is evaluated in the usual way.

forceAndCall is intended to help defining higher order functions like apply to behave more reasonably when the result returned by the function applied is a closure that captured its arguments.
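The difference can be seen with a hand-written mapping function. In the sketch below the names make_adder, lazy_map and forced_map are hypothetical; the two mappers differ only in whether the call goes through forceAndCall.

```r
# A function that returns a closure capturing its (lazily evaluated) argument.
make_adder <- function(y) function(x) x + y

# Plain call: the promise for 'y' is still unevaluated when the loop ends.
lazy_map <- function(X, FUN) {
  out <- vector("list", length(X))
  for (i in seq_along(X)) out[[i]] <- FUN(X[[i]])
  out
}

# Same loop, but the first argument is forced before FUN's body runs.
forced_map <- function(X, FUN) {
  out <- vector("list", length(X))
  for (i in seq_along(X)) out[[i]] <- forceAndCall(1, FUN, X[[i]])
  out
}

lazy   <- lazy_map(1:3, make_adder)
forced <- forced_map(1:3, make_adder)

lazy[[1]](10)    # 13: 'y' is evaluated only now, when i is already 3
forced[[1]](10)  # 11: 'y' was fixed at 1 when the closure was created
```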

See Also

force, promise, closure.


Foreign Function Interface

Description

Functions to make calls to compiled code that has been loaded into R.

Usage

      .C(.NAME, ..., NAOK = FALSE, DUP = TRUE, PACKAGE, ENCODING)
.Fortran(.NAME, ..., NAOK = FALSE, DUP = TRUE, PACKAGE, ENCODING)

Arguments

.NAME

a character string giving the name of a C function or Fortran subroutine, or an object of class "NativeSymbolInfo", "RegisteredNativeSymbol" or "NativeSymbol" referring to such a name.

...

arguments to be passed to the foreign function. Up to 65.

NAOK

if TRUE then any NA or NaN or Inf values in the arguments are passed on to the foreign function. If FALSE, the presence of NA or NaN or Inf values is regarded as an error.

PACKAGE

if supplied, confine the search for a character string .NAME to the DLL given by this argument (plus the conventional extension, ‘.so’, ‘.dll’, ...).

This is intended to add safety for packages, which can ensure by using this argument that no other package can override their external symbols, and also speeds up the search (see ‘Note’).

DUP, ENCODING

For back-compatibility, accepted but ignored.

Details

These functions can be used to make calls to compiled C and Fortran code. Later interfaces are .Call and .External which are more flexible and have better performance.

These functions are primitive, and .NAME is always matched to the first argument supplied (which should not be named). The other named arguments follow ... and so cannot be abbreviated. For clarity, avoid using names in the arguments passed to ... that match or partially match .NAME.

Value

A list similar to the ... list of arguments passed in (including any names given to the arguments), but reflecting any changes made by the C or Fortran code.

Argument types

The mapping of the types of R arguments to C or Fortran arguments is

R            C                  Fortran
integer      int *              integer
numeric      double *           double precision
-- or --     float *            real
complex      Rcomplex *         double complex
logical      int *              integer
character    char **            [see below]
raw          unsigned char *    not allowed
list         SEXP *             not allowed
other        SEXP               not allowed

Note: The C types corresponding to integer and logical are int, not long as in S. This difference matters on most 64-bit platforms, where int is 32-bit and long is 64-bit (but not on 64-bit Windows).

Note: The Fortran type corresponding to logical is integer, not logical: the difference matters on some Fortran compilers.

Numeric vectors in R will be passed as type double * to C (and as double precision to Fortran) unless the argument has attribute Csingle set to TRUE (use as.single or single). This mechanism is only intended to be used to facilitate the interfacing of existing C and Fortran code.
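The Csingle marker can be inspected from R without any compiled code. The short sketch below only shows how the attribute is attached; it does not call .C or .Fortran.

```r
x <- c(1.5, 2.5)
s <- as.single(x)

type_s <- typeof(s)            # "double": R has no single-precision type
mark_s <- attr(s, "Csingle")   # TRUE: the marker described above
mark_x <- attr(x, "Csingle")   # NULL: an ordinary numeric vector
```

The values are still stored as doubles within R; according to the text above, the attribute only affects how the argument is passed to the foreign function.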

The C type Rcomplex is defined in ‘Complex.h’ as a typedef struct {double r; double i;}. It may or may not be equivalent to the C99 double complex type, depending on the compiler used.

Logical values are sent as 0 (FALSE), 1 (TRUE) or INT_MIN = -2147483648 (NA, but only if NAOK = TRUE), and the compiled code should return one of these three values: however non-zero values other than INT_MIN are mapped to TRUE.

Missing (NA) string values are passed to .C as the string "NA". As the C char type can represent all possible bit patterns there appears to be no way to distinguish missing strings from the string "NA". If this distinction is important use .Call.

Using a character string with .Fortran is deprecated and will give a warning. It passes the first (only) character string of a character vector as a C character array to Fortran: that may be usable as character*255 if its true length is passed separately. Only up to 255 characters of the string are passed back. (How well this works, and even if it works at all, depends on the C and Fortran compilers and the platform.)

Lists, functions or other R objects can (for historical reasons) be passed to .C, but the .Call interface is much preferred. All inputs apart from atomic vectors should be regarded as read-only, and all apart from vectors (including lists), functions and environments are now deprecated.

Fortran symbol names

All Fortran compilers known to be usable to compile R map symbol names to lower case, and so does .Fortran.

Symbol names containing underscores are not valid Fortran 77 (although they are valid in Fortran 9x). Many Fortran 77 compilers will allow them but may translate them in a different way to names not containing underscores. Such names will often work with .Fortran (since how they are translated is detected when R is built and the information used by .Fortran), but portable code should not use Fortran names containing underscores.

Use .Fortran with care for compiled Fortran 9x code: it may not work if the Fortran 9x compiler used differs from the Fortran compiler used when configuring R, especially if the subroutine name is not lower-case or includes an underscore. The most portable way to call Fortran 9x code from R is to use .C and the Fortran 2003 module iso_c_binding to provide a C interface to the Fortran code.

Copying of arguments

Character vectors are copied before calling the compiled code and to collect the results. For other atomic vectors the argument is copied before calling the compiled code if it is otherwise used in the calling code.

Non-atomic-vector objects are read-only to the C code and are never copied.

This behaviour can be changed by setting options(CBoundsCheck = TRUE). In that case raw, logical, integer, double and complex vector arguments are copied both before and after calling the compiled code. The first copy made is extended at each end by guard bytes, and on return it is checked that these are unaltered. For .C, each element of a character vector uses guard bytes.

Note

If one of these functions is to be used frequently, do specify PACKAGE (to confine the search to a single DLL) or pass .NAME as one of the native symbol objects. Searching for symbols can take a long time, especially when many namespaces are loaded.

You may see PACKAGE = "base" for symbols linked into R. Do not use this in your own code: such symbols are not part of the API and may be changed without warning.

References

Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988) The New S Language. Wadsworth & Brooks/Cole.

See Also

dyn.load, .Call.

The ‘Writing R Extensions’ manual.


Access to and Manipulation of the Formal Arguments

Description

Get or set the formal arguments of a function.

Usage

formals(fun = sys.function(sys.parent()), envir = parent.frame())
formals(fun, envir = environment(fun)) <- value

Arguments

fun

a function, or see ‘Details’.

envir

environment in which the function should be defined (or found via get() in the first case and when fun is a character string).

value

a list (or pairlist, hence possibly NULL) of R expressions.

Details

For the first form, fun can also be a character string naming the function to be manipulated, which is searched for in envir, by default from the parent frame. If it is not specified, the function calling formals is used.

Only closures, i.e., non-primitive functions, have formals; primitive functions do not.
Note that formals(args(f)) gives a formal argument list for all functions f, primitive or not.

Value

formals returns the formal argument list of the function specified, as a pairlist, or NULL for a non-function or primitive.

The replacement form sets the formals of a function to the list/pairlist on the right hand side, and (potentially) resets the environment of the function, dropping attributes.

See Also

formalArgs (from methods), a shortcut for names(formals(.)). args for a human-readable version, and as intermediary to get formals of a primitive function.
alist to construct a typical formals value, see the examples.

The three parts of a (non-primitive) function are its formals, body, and environment.

Examples

require(stats)
formals(lm)

## If you just want the names of the arguments, use formalArgs instead.
names(formals(lm))
methods::formalArgs(lm) # same

## formals returns a pairlist. Arguments with no default have type symbol (aka name).
str(formals(lm))

## formals returns NULL for primitive functions.  Use it in combination with
## args for this case.
is.primitive(`+`)
formals(`+`)
formals(args(`+`))

## You can overwrite the formal arguments of a function (though this is
## advanced, dangerous coding).
f <- function(x) a + b
formals(f) <- alist(a = , b = 3)
f    # function(a, b = 3) a + b
f(2) # result = 5

Encode in a Common Format

Description

Format an R object for pretty printing.

Usage

format(x, ...)

## Default S3 method:
format(x, trim = FALSE, digits = NULL, nsmall = 0L,
       justify = c("left", "right", "centre", "none"),
       width = NULL, na.encode = TRUE, scientific = NA,
       big.mark = "",   big.interval = 3L,
       small.mark = "", small.interval = 5L,
       decimal.mark = getOption("OutDec"),
       zero.print = NULL, drop0trailing = FALSE, ...)

## S3 method for class 'data.frame'
format(x, ..., justify = "none")

## S3 method for class 'factor'
format(x, ...)

## S3 method for class 'AsIs'
format(x, width = 12, ...)

Arguments

x

any R object (conceptually); typically numeric.

trim

logical; if FALSE, logical, numeric and complex values are right-justified to a common width: if TRUE the leading blanks for justification are suppressed.

digits

a positive integer indicating how many significant digits are to be used for numeric and complex x. The default, NULL, uses getOption("digits"). This is a suggestion: enough decimal places will be used so that the smallest (in magnitude) number has this many significant digits, and also to satisfy nsmall. (For more, notably the interpretation for complex numbers see signif.)

nsmall

the minimum number of digits to the right of the decimal point in formatting real/complex numbers in non-scientific formats. Allowed values are 0 <= nsmall <= 20.

justify

should a character vector be left-justified (the default), right-justified, centred or left alone. Can be abbreviated.

width

default method: the minimum field width or NULL or 0 for no restriction.

AsIs method: the maximum field width for non-character objects. NULL corresponds to the default 12.

na.encode

logical: should NA strings be encoded? Note this only applies to elements of character vectors, not to numerical, complex nor logical NAs, which are always encoded as "NA".

scientific

either a logical specifying whether elements of a real or complex vector should be encoded in scientific format, or an integer penalty (see options("scipen")). Missing values correspond to the current default penalty.

...

further arguments passed to or from other methods.

big.mark, big.interval, small.mark, small.interval, decimal.mark, zero.print, drop0trailing

used for prettying (longish) numerical and complex sequences. Passed to prettyNum: that help page explains the details.

Details

format is a generic function. Apart from the methods described here there are methods for dates (see format.Date), date-times (see format.POSIXct) and for other classes such as format.octmode and format.dist.

format.data.frame formats the data frame column by column, applying the appropriate method of format for each column. Methods for columns are often similar to as.character but offer more control. Matrix and data-frame columns will be converted to separate columns in the result, and character columns (normally all) will be given class "AsIs".

format.factor converts the factor to a character vector and then calls the default method (and so justify applies).

format.AsIs deals with columns of complicated objects that have been extracted from a data frame. Character objects and (atomic) matrices are passed to the default method (and so width does not apply). Otherwise it calls toString to convert the object to character (if a vector or list, element by element) and then right-justifies the result.

Justification for character vectors (and objects converted to character vectors by their methods) is done on display width (see nchar), taking double-width characters and the rendering of special characters (as escape sequences, including escaping backslash but not double quote: see print.default) into account. Thus the width is as displayed by print(quote = FALSE) and not as displayed by cat. Character strings are padded with blanks to the display width of the widest. (If na.encode = FALSE missing character strings are not included in the width computations and are not encoded.)

Numeric vectors are encoded with the minimum number of decimal places needed to display all the elements to at least the digits significant digits. However, if all the elements then have trailing zeroes, the number of decimal places is reduced until at least one element has a non-zero final digit; see also the argument documentation for big.*, small.* etc, above. See the note in print.default about digits >= 16.

Raw vectors are converted to their 2-digit hexadecimal representation by as.character.

format.default(x) now provides a “minimal” string when isS4(x) is true.

While the internal code respects the option getOption("OutDec") for the ‘decimal mark’ in general, decimal.mark takes precedence over that option. Similarly, scientific takes precedence over getOption("scipen").
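The rules above can be seen in a few one-line calls. This sketch pins the relevant options to their usual defaults first (an assumption made so that the commented results do not depend on the session), and restores them at the end.

```r
op <- options(digits = 7, OutDec = ".", scipen = 0)  # assumed defaults

f_digits <- format(3.14159, digits = 3)             # "3.14"
f_common <- format(c(1, 10, 100))                   # "  1" " 10" "100"
f_trim   <- format(c(1, 10, 100), trim = TRUE)      # "1" "10" "100"
f_nsmall <- format(2, nsmall = 2)                   # "2.00"
f_fixed  <- format(1e-10, scientific = FALSE)       # "0.0000000001"
f_big    <- format(123456789, big.mark = ",")       # "123,456,789"
f_dec    <- format(0.5, decimal.mark = ",")         # "0,5"
f_left   <- format(c("a", "bbb"))                   # "a  " "bbb"
f_right  <- format(c("a", "bbb"), justify = "right")  # "  a" "bbb"

options(op)
```

Numbers are right-justified to a common width and character strings are left-justified by default, which is why f_common and f_left pad on opposite sides.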

Value

An object of similar structure to x containing character representations of the elements of the first argument x in a common format, and in the current locale's encoding.

For character, numeric, complex or factor x, dims and dimnames are preserved on matrices/arrays and names on vectors: no other attributes are copied.

If x is a list, the result is a character vector obtained by applying format.default(x, ...) to each element of the list (after unlisting elements which are themselves lists), and then collapsing the result for each element with paste(collapse = ", "). The defaults in this case are trim = TRUE, justify = "none" since one does not usually want alignment in the collapsed strings.
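A small sketch of the list case described above, using an invented three-element list. Each element is formatted on its own and then collapsed into a single string.

```r
z  <- list(1:3, c(1.5, 10), "text")
fz <- format(z)

# One string per list element; numeric elements are formatted with
# trim = TRUE before being pasted together with ", ".
fz_plain <- unname(fz)
```

Within the second element the two numbers share a number of decimal places (as for any numeric vector) but are not padded to a common width, since trim = TRUE is the default here.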

References

Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988) The New S Language. Wadsworth & Brooks/Cole.

See Also

format.info indicates how an atomic vector would be formatted.

formatC, paste, as.character, sprintf, print, prettyNum, toString, encodeString.

Examples

format(1:10)
format(1:10, trim = TRUE)

zz <- data.frame("(row names)" = c("aaaaa", "b"), check.names = FALSE)
format(zz)
format(zz, justify = "left")

## use of nsmall
format(13.7)
format(13.7, nsmall = 3)
format(c(6.0, 13.1), digits = 2)
format(c(6.0, 13.1), digits = 2, nsmall = 1)

## use of scientific
format(2^31-1)
format(2^31-1, scientific = TRUE)
## scientific = numeric scipen (= {sci}entific notation {pen}alty) :
x <- c(1e5, 1000, 10, 0.1, .001, .123)
t(sapply(setNames(, -4:1),
         \(sci) sapply(x, format, scientific = sci)))

## a list
z <- list(a = letters[1:3], b = (-pi+0i)^((-2:2)/2), c = c(1,10,100,1000),
          d = c("a", "longer", "character", "string"),
          q = quote( a + b ), e = expression(1+x))
## can you find the "2" small differences?
(f1 <- format(z, digits = 2))
(f2 <- format(z, digits = 2, justify = "left", trim = FALSE))
f1 == f2 ## 2 FALSE, 4 TRUE

## A "minimal" format() for S4 objects without their own format() method:
cc <- methods::getClassDef("standardGeneric")
format(cc) ## "<S4 class ......>"

format(.) Information

Description

Information is returned on how format(x, digits, nsmall) would be formatted.

Usage

format.info(x, digits=NULL, nsmall=0)

Arguments

x

an atomic vector; a potential argument of format(x, ...).

digits

how many significant digits are to be used for numeric and complex x. The default, NULL, uses getOption("digits").

nsmall

(see format(..., nsmall)).

Value

An integer vector of length 1, 3 or 6, say r.

For logical, integer and character vectors a single element, the width which would be used by format if width = NULL.

For numeric vectors:

r[1]

width (in characters) used by format(x)

r[2]

number of digits after decimal point.

r[3]

in 0:2; if $\ge 1$, exponential representation would be used, with exponent length of r[3]+1.

For a complex vector the first three elements refer to the real parts, and there are three further elements corresponding to the imaginary parts.
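The three shapes of the result can be checked directly against format. The sketch below pins options(digits = 7) (an assumption, matching the usual default) and uses invented inputs.

```r
op <- options(digits = 7)   # assumed default, pinned for reproducibility

fi_int  <- format.info(123L)            # length 1: width 3
fi_chr  <- format.info(c("a", "bbb"))   # length 1: width 3
fi_num  <- format.info(c(1.5, 10.25))   # 5 2 0: width, decimals, fixed
fi_sci  <- format.info(1e8)             # 5 0 1: "1e+08"
fi_cplx <- format.info(1+2i)            # length 6: real part, then imaginary

# r[1] agrees with the width that format() actually produces.
widths <- nchar(format(c(1.5, 10.25)))

options(op)
```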

See Also

format (notably about digits >= 16), formatC.

Examples

dd <- options("digits") ; options(digits = 7) #-- for the following
format.info(123)   # 3 0 0
format.info(pi)    # 8 6 0
format.info(1e8)   # 5 0 1 - exponential "1e+08"
format.info(1e222) # 6 0 2 - exponential "1e+222"

x <- pi*10^c(-10,-2,0:2,8,20)
names(x) <- formatC(x, width = 1, digits = 3, format = "g")
cbind(sapply(x, format))
t(sapply(x, format.info))

## using at least 8 digits right of "."
t(sapply(x, format.info, nsmall = 8))

# Reset old options:
options(dd)

Format P Values

Description

format.pval is intended for formatting p-values.

Usage

format.pval(pv, digits = max(1, getOption("digits") - 2),
            eps = .Machine$double.eps, na.form = "NA", ...)

Arguments

pv

a numeric vector.

digits

how many significant digits are to be used.

eps

a numerical tolerance: see ‘Details’.

na.form

character representation of NAs.

...

further arguments to be passed to format such as nsmall.

Details

format.pval is mainly an auxiliary function for print.summary.lm etc., and does separate formatting for fixed, floating point and very small values; those less than eps are formatted as "< [eps]" (where ‘[eps]’ stands for format(eps, digits)).
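A short sketch of the eps rule, using an invented vector and a deliberately coarse eps = 0.001 so that the cut-off is easy to see. The exact spacing after "<" and the number of digits shown for eps depend on the other values being formatted, so the comments describe the form of the result rather than every character.

```r
p <- format.pval(c(0.2, 0.0004, NA), eps = 0.001, digits = 5)
# p[1] is the ordinary formatting of 0.2
# p[2] is reported as being below eps, i.e. "<" followed by 0.001
# p[3] is the na.form string, "NA" by default

tiny <- format.pval(1e-27)  # below the default eps = .Machine$double.eps
```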

Value

A character vector.

Examples

format.pval(c(stats::runif(5), pi^-100, NA))
format.pval(c(0.1, 0.0001, 1e-27))

Formatting Using C-style Formats

Description

formatC() formats numbers individually and flexibly using C style format specifications.

prettyNum() is used for “prettifying” (possibly formatted) numbers, also in format.default.

.format.zeros(x), an auxiliary function of prettyNum(), re-formats the zeros in a vector x of formatted numbers.
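As a quick orientation before the argument list, the following sketch shows a few typical calls with invented inputs. It assumes the default getOption("OutDec") of "." and a C runtime that prints two-digit exponents.

```r
c_fixed <- formatC(3.14159, digits = 2, format = "f")        # "3.14"
c_sig   <- formatC(3.14159, digits = 3, format = "g")        # "3.14"
c_exp   <- formatC(0.000123, digits = 2, format = "e")       # "1.23e-04"
c_zero  <- formatC(42, width = 8, flag = "0", format = "d")  # "00000042"
c_width <- formatC(c(1, 10, 100), width = 5, format = "d")   # "    1" "   10" "  100"
c_right <- formatC("a", width = 5)                           # "    a"
c_left  <- formatC("a", width = -5)                          # "a    "
c_big   <- formatC(123456789, big.mark = ",", format = "d")  # "123,456,789"

p_big   <- prettyNum(1234567, big.mark = ",")                # "1,234,567"
```

Unlike format, formatC treats each element individually, so a common width appears only when width is given explicitly, as in c_width.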

Usage

formatC(x, digits=NULL, width=NULL,        format=NULL, flag="", mode=NULL,        big.mark="", big.interval=3L,        small.mark="", small.interval=5L,        decimal.mark= getOption("OutDec"),        preserve.width="individual",        zero.print=NULL, replace.zero=TRUE,        drop0trailing=FALSE)prettyNum(x, big.mark="",   big.interval=3L,          small.mark="", small.interval=5L,          decimal.mark= getOption("OutDec"), input.d.mark= decimal.mark,          preserve.width= c("common","individual","none"),          zero.print=NULL, replace.zero=FALSE,          drop0trailing=FALSE, is.cmplx=NA,...).format.zeros(x, zero.print, nx= suppressWarnings(as.numeric(x)),              replace=FALSE, warn.non.fitting=TRUE)

Arguments

x

an atomic numerical or character object, possibly complex only for prettyNum(), typically a vector of real numbers. Any class is discarded, with a warning.

digits

the desired number of digits after the decimal point (format = "f") or significant digits (format = "g", = "e" or = "fg").

Default: 2 for integer, 4 for real numbers. If less than 0, the C default of 6 digits is used. If specified as more than 50, 50 will be used with a warning unless format = "f" where it is limited to typically 324. (Not more than 15–21 digits need be accurate, depending on the OS and compiler used. This limit is just a precaution against segfaults in the underlying C runtime.)

width

the total field width; if both digits and width are unspecified, width defaults to 1, otherwise to digits + 1. width = 0 will use width = digits, width < 0 means left justify the number in this field (equivalent to flag = "-"). If necessary, the result will have more characters than width. For character data this is interpreted in characters (not bytes nor display width).

format

equal to "d" (for integers), "f", "e", "E", "g", "G", "fg" (for reals), or "s" (for strings). Default is "d" for integers, "g" for reals.

"f" gives numbers in the usual xxx.xxx format; "e" and "E" give n.ddde+nn or n.dddE+nn (scientific format); "g" and "G" put x[i] into scientific format only if it saves space to do so and drop trailing zeros and decimal point - unless flag contains "#" which keeps trailing zeros for the "g", "G" formats.

"fg" (our own hybrid format) uses fixed format as "f", but digits as the minimum number of significant digits. This can lead to quite long result strings, see examples below. Note that unlike signif this prints large numbers with more significant digits than digits. Trailing zeros are dropped in this format, unless flag contains "#".

flag

for formatC, a character string giving a format modifier as in Kernighan and Ritchie (1988, page 243) or the C99 standard.

"0"

pads leading zeros;

"-"

does left adjustment,

"+"

ensures a sign in all cases, i.e., "+" for positive numbers,

" "

if the first character is not a sign, the space character " " will be used instead.

"#"

specifies “an alternative output form”, specifically depending on format.

"'"

on some platform–locale combinations, activates “thousands' grouping” for decimal conversion,

"I"

in some versions of ‘glibc’, allows for integer conversion to use the locale's alternative output digits, if any.

There can be more than one of these flags, in any order. Other characters used to have no effect for character formatting, but signal an error since R 3.4.0.

mode

"double" (or "real"), "integer" or "character". Default: Determined from the storage mode of x.

big.mark

character; if not empty used as mark between every big.interval decimals before (hence big) the decimal point.

big.interval

see big.mark above; defaults to 3.

small.mark

character; if not empty used as mark between every small.interval decimals after (hence small) the decimal point.

small.interval

see small.mark above; defaults to 5.

decimal.mark

the character to be used to indicate the numeric decimal point.

input.d.mark

if x is character, the character known to have been used as the numeric decimal point in x.

preserve.width

string specifying if the string widths should be preserved where possible in those cases where marks (big.mark or small.mark) are added. "common", the default, corresponds to format-like behavior whereas "individual" is the default in formatC(). Value can be abbreviated.

zero.print

logical, character string or NULL specifying if and how zeros should be formatted specially. Useful for pretty printing ‘sparse’ objects.

replace.zero, replace

logical; if zero.print is a character string, indicates if the exact zero entries in x should be simply replaced by zero.print. Otherwise, depending on the widths of the respective strings, the (formatted) zeroes are partly replaced by zero.print and then padded with " " to the right where applicable. In that case (false replace[.zero]), if the zero.print string does not fit, a warning is produced (if warn.non.fitting is true).

This works via prettyNum(), which calls .format.zeros(*, replace=replace.zero) three times in this case, see the ‘Details’.

warn.non.fitting

logical; if it is true, replace[.zero] is false and the zero.print string does not fit, a warning is signalled.

drop0trailing

logical, indicating if trailing zeros, i.e., "0" after the decimal mark, should be removed; also drops "e+00" in exponential formats. This is simply passed to prettyNum(), see the ‘Details’.

is.cmplx

optional logical, to be used when x is "character" to indicate if it stems from a complex vector or not. By default (NA), x is checked to ‘look like’ complex.

...

arguments passed to format.

nx

numeric vector of the same length as x, typically the numbers of which the character vector x is the pre-format.

Details

For numbers, formatC() calls prettyNum() when needed which itself calls .format.zeros(*, replace=replace.zero). (“when needed”: when zero.print is not NULL, drop0trailing is true, or one of big.mark, small.mark, or decimal.mark is not at default.)

If you set format it overrides the setting of mode, so formatC(123.45, mode = "double", format = "d") gives 123.

The rendering of scientific format is platform-dependent: some systems use n.ddde+nnn or n.dddenn rather than n.ddde+nn.

formatC does not necessarily align the numbers on the decimal point, so formatC(c(6.11, 13.1), digits = 2, format = "fg") gives c("6.1", " 13"). If you want common formatting for several numbers, use format.
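The points above can be tried out with a short sketch. The first two calls are the ones quoted in the text; the remaining values (3.14159, 42L, the width of 8) are arbitrary choices for illustration.

```r
## 'format' takes precedence over 'mode' (the case quoted in the Details)
a <- formatC(123.45, mode = "double", format = "d")

## Elements are formatted one at a time, so they need not line up ...
b <- formatC(c(6.11, 13.1), digits = 2, format = "fg")

## ... whereas format() produces a common layout for the whole vector
d <- format(c(6.11, 13.1), digits = 2)

## Fixed notation with two decimals, and zero padding to a field width
e <- formatC(3.14159, digits = 2, format = "f")
z <- formatC(42L, width = 8, flag = "0")

print(list(a = a, b = b, d = d, e = e, z = z))
```

When a column of numbers has to line up, format() is usually the simpler choice; formatC() fits better when each value is pasted into running text or a label.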

prettyNum is the utility function for prettifying x. x can be complex (or format(<complex>)), here. If x is not a character, format(x[i], ...) is applied to each element, and then it is left unchanged if all the other arguments are at their defaults. Use the input.d.mark argument for prettyNum(x) when x is a character vector not resulting from something like format(<number>) with a period as decimal mark.

Because gsub is used to insert the big.mark and small.mark, special characters need escaping. In particular, to insert a single backslash, use "\\\\".

The C doubles used for R numerical vectors have signed zeros, which formatC may output as -0, -0.000 ....

There is a warning if big.mark and decimal.mark are the same: that would be confusing to those reading the output.
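A minimal sketch of the mark arguments follows; the number and the two locale conventions are chosen only for illustration. Note that big.mark and decimal.mark are kept distinct, which avoids the warning mentioned above.

```r
x <- 1234567.891   # arbitrary example value

## Comma between thousands, period as decimal mark
us <- formatC(x, format = "f", digits = 2, big.mark = ",")

## One continental European convention: the two marks swapped
eu <- formatC(x, format = "f", digits = 2, big.mark = ".", decimal.mark = ",")

## prettyNum() can add marks to numbers that are already character strings
pn <- prettyNum("1234567", big.mark = ",")

print(c(us, eu, pn))
```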

Value

A character object of the same size and attributes as x (after discarding any class), in the current locale's encoding.

Unlike format, each number is formatted individually. Looping over each element of x, the C function sprintf(...) is called for numeric inputs (inside the C function str_signif).

formatC: for character x, do simple (left or right) padding with white space.

Note

The default for decimal.mark in formatC() was changed in R 3.2.0. For use within print methods in packages which might be used with earlier versions, use decimal.mark = getOption("OutDec") explicitly.

Author(s)

formatC was originally written by Bill Dunlap for S-PLUS, later much improved by Martin Maechler.

It was first adapted for R by Friedrich Leisch and since much improved by the R Core team.

References

Kernighan, B. W. and Ritchie, D. M. (1988) The C Programming Language. Second edition. Prentice Hall.

See Also

format.

sprintf for more general C-like formatting.

Examples

xx <- pi * 10^(-5:4)
cbind(format(xx, digits = 4), formatC(xx))
cbind(formatC(xx, width = 9, flag = "-"))
cbind(formatC(xx, digits = 5, width = 8, format = "f", flag = "0"))
cbind(format(xx, digits = 4), formatC(xx, digits = 4, format = "fg"))

f <- (-2:4); f <- f*16^f
# Default ("g") format:
formatC(pi*f)
# Fixed ("f") format, more than one flag ('width' partly "enlarged"):
cbind(formatC(pi*f, digits = 3, width = 9, format = "f", flag = "0+"))

formatC(    c("a","Abc","no way"), width = -7)  # <=> flag = "-"
formatC(c((-1:1)/0,c(1,100)*pi), width = 8, digits = 1)

## note that some of the results here depend on the implementation
## of long-double arithmetic, which is platform-specific.
xx <- c(1e-12,-3.98765e-10,1.45645e-69,1e-70,pi*1e37,3.44e4)
##       1        2             3        4      5       6
formatC(xx)
formatC(xx, format = "fg")       # special "fixed" format.
formatC(xx[1:4], format = "f", digits = 75) #>> even longer strings

formatC(c(3.24, 2.3e-6), format = "f", digits = 11)
formatC(c(3.24, 2.3e-6), format = "f", digits = 11, drop0trailing = TRUE)

r <- c("76491283764.97430", "29.12345678901", "-7.1234", "-100.1", "1123")
## American:
prettyNum(r, big.mark = ",")
## Some Europeans:
prettyNum(r, big.mark = "'", decimal.mark = ",")

(dd <- sapply(1:10, function(i) paste((9:0)[1:i], collapse = "")))
prettyNum(dd, big.mark = "'")

## examples of 'small.mark'
pN <- stats::pnorm(1:7, lower.tail = FALSE)
cbind(format (pN, small.mark = " ", digits = 15))
cbind(formatC(pN, small.mark = " ", digits = 17, format = "f"))

cbind(ff <- format(1.2345 + 10^(0:5), width = 11, big.mark = "'"))
## all with same width (one more than the specified minimum)

## individual formatting to common width:
fc <- formatC(1.234 + 10^(0:8), format = "fg", width = 11, big.mark = "'")
cbind(fc)
## Powers of two, stored exactly, formatted individually:
pow.2 <- formatC(2^-(1:32), digits = 24, width = 1, format = "fg")
## nicely printed (the last line showing 5^32 exactly):
noquote(cbind(pow.2))

## complex numbers:
r <- 10.0000001; rv <- (r/10)^(1:10)
(zv <- (rv + 1i*rv))
op <- options(digits = 7) ## (system default)
(pnv <- prettyNum(zv))
stopifnot(pnv == "1+1i", pnv == format(zv),
          pnv == prettyNum(zv, drop0trailing = TRUE))
## more digits change the picture:
options(digits = 8)
head(fv <- format(zv), 3)
prettyNum(fv)
prettyNum(fv, drop0trailing = TRUE) # a bit nicer
options(op)

## The  '  flag :
doLC <- FALSE # <= R warns, so change to TRUE manually if you want to see the effect
if(doLC) {
  oldLC <- Sys.getlocale("LC_NUMERIC")
           Sys.setlocale("LC_NUMERIC", "de_CH.UTF-8")
}
formatC(1.234 + 10^(0:4), format = "fg", width = 11, flag = "'")
## -->  .....  "      1'001" "     10'001"   on supported platforms
if(doLC) ## revert, typically to  "C"  :
  Sys.setlocale("LC_NUMERIC", oldLC)

Format Description Lists

Description

Format vectors of items and their descriptions as 2-column tables or LaTeX-style description lists.

Usage

formatDL(x, y, style = c("table", "list"),
         width = 0.9 * getOption("width"), indent = NULL)

Arguments

x

a vector giving the items to be described, or a list of length 2 or a matrix with 2 columns giving both items and descriptions.

y

a vector of the same length as x with the corresponding descriptions. Only used if x does not already give the descriptions.

style

a character string specifying the rendering style of the description information. Can be abbreviated. If "table", a two-column table with items and descriptions as columns is produced (similar to Texinfo's @table environment). If "list", a LaTeX-style tagged description list is obtained.

width

a positive integer giving the target column for wrapping lines in the output.

indent

a positive integer specifying the indentation of the second column in table style, and the indentation of continuation lines in list style. Must not be greater than width/2, and defaults to width/3 for table style and width/9 for list style.

Details

After extracting the vectors of items and corresponding descriptions from the arguments, both are coerced to character vectors.

In table style, items with more than indent - 3 characters are displayed on a line of their own.
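The effect of indent in table style can be seen in a small sketch. The items, descriptions, width and indent below are made up for the example; with indent = 12, any item longer than 9 characters is placed on its own line.

```r
## Hypothetical items and descriptions
items <- c("alpha", "beta", "a-rather-long-item-name")
descs <- c("first entry", "second entry", "third entry")

## Table style: descriptions start after 'indent' columns; the long item
## does not fit in the first column and gets a line of its own
tab <- formatDL(items, descs, style = "table", width = 40, indent = 12)
writeLines(tab)

## List style: one tagged entry per item
lst <- formatDL(items, descs, style = "list", width = 40)
writeLines(lst)
```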

Value

a character vector with the formatted entries.

Examples

## Provide a nice summary of the numerical characteristics of the
## machine R is running on:
writeLines(formatDL(unlist(.Machine)))
## Inspect Sys.getenv() results in "list" style (by default, these are
## printed in "table" style):
writeLines(formatDL(Sys.getenv(), style = "list"))

Function Definition

Description

These functions provide the base mechanisms for defining new functions in the R language.

Usage

function( arglist ) expr
\( arglist ) expr
return(value)

Arguments

arglist

empty or one or more (comma-separated) ‘name’ or ‘name = expression’ terms and/or the special token ....

expr

an expression.

value

an expression.

Details

The names in an argument list can be back-quoted non-standard names (see ‘backquote’).

If value is missing, NULL is returned. If it is a single expression, the value of the evaluated expression is returned. (The expression is evaluated as soon as return is called, in the evaluation frame of the function and before any on.exit expression is evaluated.)

If the end of a function is reached without calling return, the value of the last evaluated expression is returned.

The shorthand form \(x) x + 1 is parsed as function(x) x + 1. It may be helpful in making code containing simple function expressions more readable.
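The return rules above can be checked with a few small functions. All names here are made up for illustration, and the \(x) shorthand assumes R 4.1.0 or later.

```r
## The two spellings are parsed to the same function
inc_long  <- function(x) x + 1
inc_short <- \(x) x + 1

## return() with no value gives NULL
nothing <- function() return()

## Falling off the end returns the value of the last evaluated expression
last_value <- function(x) {
  y <- x * 2
  y + 1
}

## The returned value is evaluated before any on.exit() expression runs,
## so the later assignment does not affect the result
early <- function() {
  x <- "before"
  on.exit(x <- "after")
  return(x)
}

print(c(inc_long(1), inc_short(1), last_value(2)))
print(early())
```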

Technical details

This type of function is not the only type in R: they are called closures (a name with origins in LISP) to distinguish them from primitive functions.

A closure has three components, its formals (its argument list), its body (expr in the ‘Usage’ section) and its environment which provides the enclosure of the evaluation frame when the closure is used.

There is an optional further component if the closure has beenbyte-compiled. This is not normally user-visible, but is indicatedwhen functions are printed.
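As a sketch of the three components of a closure, the example below builds a small counter (a hypothetical function factory, not part of base R) and inspects it with formals, body and environment.

```r
## A function factory: each returned closure keeps its own enclosing environment
make_counter <- function(start = 0) {
  count <- start
  function(by = 1) {
    count <<- count + by
    count
  }
}

counter <- make_counter(10)
counter()       # increments by the default step
counter(5)      # increments by 5

f_args  <- names(formals(counter))        # the argument list
f_body  <- body(counter)                  # the expression evaluated on each call
f_env   <- environment(counter)           # the enclosure holding 'count'
current <- get("count", envir = f_env)

## Primitive functions are a different type of object
kinds <- c(closure = typeof(counter), primitive = typeof(sum))
print(kinds)
```

Because the state lives in the enclosure rather than in the global environment, two counters created by separate calls to make_counter do not interfere with each other.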

References

Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988) The New S Language. Wadsworth & Brooks/Cole.

See Also

args.

formals, body and environment for accessing the component parts of a function.

debug for debugging; using invisible inside return(.) for returning invisibly.

Examples

norm <- function(x) sqrt(x %*% x)
norm(1:4)

## An anonymous function:
(function(x, y) { z <- x^2 + y^2; x + y + z })(0:7, 1)

Common Higher-Order Functions in Functional Programming Languages

Description

Reduce

uses a binary function to successively combine the elements of a given vector and a possibly given initial value.

Filter

extracts the elements of a vector for which a predicate (logical) function gives true.

Find and Position

give the first or last such element and its position in the vector, respectively.

Map

applies a function to the corresponding elements of given vectors.

Negate

creates the negation of a given function.

Usage

Reduce(f, x, init, right = FALSE, accumulate = FALSE, simplify = TRUE)
Filter(f, x)
Find(f, x, right = FALSE, nomatch = NULL)
Map(f, ...)
Negate(f)
Position(f, x, right = FALSE, nomatch = NA_integer_)

Arguments

f

a function of the appropriate arity (binary for Reduce, unary for Filter, Find and Position, k-ary for Map if this is called with k arguments). An arbitrary predicate function for Negate.

x

a vector.

init

an R object of the same kind as the elements of x.

right

a logical indicating whether to proceed from left to right (default) or from right to left.

accumulate

a logical indicating whether the successive reduce combinations should be accumulated. By default, only the final combination is used.

simplify

a logical indicating whether accumulated results should be simplified (by unlisting) in case they all are length one.

nomatch

the value to be returned in the case when “no match” (no element satisfying the predicate) is found.

...

vectors to which the function is Map()ped, and other arguments of mapply passed to it, e.g., MoreArgs.

Details

If init is given, Reduce logically adds it to the start (when proceeding left to right) or the end of x, respectively. If this possibly augmented vector v has n > 1 elements, Reduce successively applies f to the elements of v from left to right or right to left, respectively. I.e., a left reduce computes l_1 = f(v_1, v_2), l_2 = f(l_1, v_3), etc., and returns l_{n-1} = f(l_{n-2}, v_n), and a right reduce does r_{n-1} = f(v_{n-1}, v_n), r_{n-2} = f(v_{n-2}, r_{n-1}) and returns r_1 = f(v_1, r_2). (E.g., if v is the sequence (2, 3, 4) and f is division, left and right reduce give (2 / 3) / 4 = 1/6 and 2 / (3 / 4) = 8/3, respectively.) If v has only a single element, this is returned; if there are no elements, NULL is returned. Thus, it is ensured that f is always called with 2 arguments.
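The division example from the text, together with the edge cases for init, single-element and empty input, can be sketched as follows (the values other than the sequence (2, 3, 4) are arbitrary).

```r
v <- c(2, 3, 4)

left  <- Reduce(`/`, v)                 # (2 / 3) / 4
right <- Reduce(`/`, v, right = TRUE)   # 2 / (3 / 4)

## accumulate = TRUE keeps every intermediate combination
running <- Reduce(`+`, 1:4, accumulate = TRUE)

## 'init' is logically placed at the start of x for a left reduce
with_init <- Reduce(`+`, 1:4, 100)

## A single element is returned as is; no elements give NULL
single <- Reduce(`+`, 5)
empty  <- Reduce(`+`, list())

print(c(left, right))
print(running)
```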

The current implementation is non-recursive to ensure stability andscalability.

Reduce is patterned after Common Lisp's reduce. A reduce is also known as a fold (e.g., in Haskell) or an accumulate (e.g., in the C++ Standard Template Library). The accumulative version corresponds to Haskell's scan functions.

Filter applies the unary predicate function f to each element of x, coercing to logical if necessary, and returns the subset of x for which this gives true. Note that possible NA values are currently always taken as false; control over NA handling may be added in the future. Filter corresponds to filter in Haskell or ‘remove-if-not’ in Common Lisp.

Find and Position are patterned after Common Lisp's ‘find-if’ and ‘position-if’, respectively. If there is an element for which the predicate function gives true, then the first or last such element or its position is returned depending on whether right is false (default) or true, respectively. If there is no such element, the value specified by nomatch is returned. The current implementation is not optimized for performance.
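A brief sketch of Filter, Find and Position follows, using a made-up predicate is_big and arbitrary data. The NA case is shown only for Filter, since that is the function whose NA handling is described above.

```r
is_big <- function(v) v > 5   # hypothetical predicate

## Filter: an NA result from the predicate is treated as false
kept <- Filter(is_big, c(4, NA, 7, 10, 3))

y <- c(4, 7, 10, 3)
first     <- Find(is_big, y)
last      <- Find(is_big, y, right = TRUE)
first_pos <- Position(is_big, y)
last_pos  <- Position(is_big, y, right = TRUE)

## No element satisfies the predicate: 'nomatch' is returned
none     <- Find(function(v) v > 100, y)
none_pos <- Position(function(v) v > 100, y, nomatch = 0L)

print(kept)
print(c(first = first, last = last))
print(c(first_pos, last_pos, none_pos))
```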

Map is a simple wrapper to mapply which does not attempt to simplify the result, similar to Common Lisp's mapcar (with arguments being recycled, however). Future versions may allow some control of the result type.

Negate corresponds to Common Lisp's complement. Given a (predicate) function f, it creates a function which returns the logical negation of what f returns.
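For illustration, the sketch below shows Map returning an unsimplified list, recycling a shorter argument, passing MoreArgs through to mapply, and Negate used to build a complementary predicate. The inputs and the name is_not_null are arbitrary.

```r
## Map() always returns a list, one element per set of corresponding elements
sums <- Map(`+`, 1:3, 4:6)

## Shorter arguments are recycled
scaled <- Map(function(x, k) x * k, 1:3, 10)

## Extra mapply() arguments such as MoreArgs are passed through
pasted <- Map(paste, c("a", "b"), c("x", "y"), MoreArgs = list(sep = "-"))

## Negate() builds the complementary predicate
is_not_null <- Negate(is.null)
present <- Filter(is_not_null, list(1, NULL, "two", NULL))

str(sums)
str(present)
```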

See Also

Functions clusterMap and mcmapply (not on Windows) in package parallel provide parallel versions of Map.

Examples

## A general-purpose adder:
add <- function(x) Reduce(`+`, x)
add(list(1, 2, 3))
## Like sum(), but can also be used for adding matrices etc., as it will
## use the appropriate '+' method in each reduction step.
## More generally, many generics meant to work on arbitrarily many
## arguments can be defined via reduction:
FOO <- function(...) Reduce(FOO2, list(...))
FOO2 <- function(x, y) UseMethod("FOO2")
## FOO() methods can then be provided via FOO2() methods.

## A general-purpose cumulative adder:
cadd <- function(x) Reduce(`+`, x, accumulate = TRUE)
cadd(seq_len(7))

## A simple function to compute continued fractions:
cfrac <- function(x) Reduce(function(u, v) u + 1 / v, x, right = TRUE)
## Continued fraction approximation for pi:
cfrac(c(3, 7, 15, 1, 292))
## Continued fraction approximation for Euler's number (e):
cfrac(c(2, 1, 2, 1, 1, 4, 1, 1, 6, 1, 1, 8))

## Map() now recycles similar to basic Ops:
Map(`+`, 1,         1:3) ;         1 + 1:3
Map(`+`, numeric(), 1:3) ; numeric() + 1:3

## Iterative function application:
Funcall <- function(f, ...) f(...)
## Compute log(exp(acos(cos(0))))
Reduce(Funcall, list(log, exp, acos, cos), 0, right = TRUE)
## n-fold iterate of a function, functional style:
Iterate <- function(f, n = 1)
    function(x) Reduce(Funcall, rep.int(list(f), n), x, right = TRUE)
## Continued fraction approximation to the golden ratio:
Iterate(function(x) 1 + 1 / x, 30)(1)
## which is the same as
cfrac(rep.int(1, 31))
## Computing square root approximations for x as fixed points of the
## function t |-> (t + x / t) / 2, as a function of the initial value:
asqrt <- function(x, n) Iterate(function(t) (t + x / t) / 2, n)
asqrt(2, 30)(10) # Starting from a positive value => +sqrt(2)
asqrt(2, 30)(-1) # Starting from a negative value => -sqrt(2)

## A list of all functions in the base environment:
funs <- Filter(is.function, sapply(ls(baseenv()), get, baseenv()))
## Functions in base with more than 10 arguments:
names(Filter(function(f) length(formals(f)) > 10, funs))
## Number of functions in base with a '...' argument:
length(Filter(function(f)
              any(names(formals(f)) %in% "..."),
              funs))

## Find all objects in the base environment which are *not* functions:
Filter(Negate(is.function),  sapply(ls(baseenv()), get, baseenv()))

Garbage Collection

Description

A call of gc causes a garbage collection to take place. gcinfo sets a flag so that automatic collection is either silent (verbose = FALSE) or prints memory usage statistics (verbose = TRUE).

Usage

gc(verbose = getOption("verbose"), reset = FALSE, full = TRUE)
gcinfo(verbose)

Arguments

verbose

logical; if TRUE, the garbage collection prints statistics about cons cells and the space allocated for vectors.

reset

logical; if TRUE the values for maximum space used are reset to the current values.

full

logical; if TRUE a full collection is performed; otherwise only more recently allocated objects may be collected.

Details

A call of gc causes a garbage collection to take place. This will also take place automatically without user intervention, and the primary purpose of calling gc is for the report on memory usage. For an accurate report full = TRUE should be used.

It can be useful to call gc after a large object has been removed, as this may prompt R to return memory to the operating system.

R allocates space for vectors in multiples of 8 bytes: hence the report of "Vcells", a relic of an earlier allocator (that used a vector heap).

When gcinfo(TRUE) is in force, messages are sent to the message connection at each garbage collection of the form

    Garbage collection 12 = 10+0+2 (level 0) ...
    6.4 Mbytes of cons cells used (58%)
    2.0 Mbytes of vectors used (32%)

Here the last two lines give the current memory usage rounded up to the next 0.1Mb and as a percentage of the current trigger value. The first line gives a breakdown of the number of garbage collections at various levels (for an explanation see the ‘R Internals’ manual).

Value

gc returns a matrix with rows "Ncells" (cons cells), usually 28 bytes each on 32-bit systems and 56 bytes on 64-bit systems, and "Vcells" (vector cells, 8 bytes each), and columns "used" and "gc trigger", each also interpreted in megabytes (rounded up to the next 0.1Mb).

If maxima have been set for either "Ncells" or "Vcells", a fifth column is printed giving the current limits in Mb (with NA denoting no limit).

The final two columns show the maximum space used since the last call to gc(reset = TRUE) (or since R started).

gcinfo returns the previous value of the flag.
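A minimal sketch of working with the returned matrix is shown below. The object name big and its size are arbitrary, and the second column is indexed by position on the assumption that it holds the megabytes for "used", as in the layout described above.

```r
usage <- gc()                 # full collection; returns the usage matrix
print(dim(usage))
print(rownames(usage))

## Megabytes currently in use, taken from the column that follows "used";
## positional indexing is used because the "(Mb)" column name repeats
mb_used <- usage[, 2]

## Remove a large object, then collect again so that the report reflects it
big <- numeric(1e6)
rm(big)
after <- gc(verbose = FALSE)

## gcinfo() returns the previous flag, which makes it easy to restore
old <- gcinfo(FALSE)
invisible(gcinfo(old))
```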

See Also

The ‘R Internals’ manual.

Memory on R's memory management, and gctorture if you are an R developer.

gc.time() reports time used for garbage collection.

reg.finalizer for actions to happen at garbage collection.

Examples

gc() #- do it now
gcinfo(TRUE) #-- in the future, show when R does it
##            vvvvv use larger to *show* something
x <- integer(100000); for(i in 1:18) x <- c(x, i)
gcinfo(verbose = FALSE) #-- don't show it anymore

gc(TRUE)

gc(reset = TRUE)

Report Time Spent in Garbage Collection

Description

This function reports the time spent in garbage collection so far in the R session while GC timing was enabled.

Usage

gc.time(on = TRUE)

Arguments

on

logical; if TRUE, GC timing is enabled.

Details

Due to timer resolution this may be an under-estimate.

This is a primitive.

Value

A numerical vector of length 5 giving the user CPU time, the system CPU time, the elapsed time and children's user and system CPU times (normally both zero), of time spent doing garbage collection whilst GC timing was enabled.

Times of child processes are not available on Windows and will always be given as NA.
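One way to use the cumulative timings is to take the difference around a piece of work, as in the sketch below. The allocation loop is arbitrary filler, and the result may well be zero given the timer resolution noted above.

```r
gc.time(TRUE)                        # make sure GC timing is enabled
before <- gc.time()

## Some allocation-heavy work (arbitrary, for illustration only)
junk <- lapply(1:200, function(i) numeric(1e4))
invisible(gc())

spent <- gc.time() - before          # GC time attributed to the work above
print(spent[1:3])                    # user, system, elapsed
```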

See Also

gc, proc.time for the timings for the session.

Examples

gc.time()

Torture Garbage Collector

Description

Provokes garbage collection on (nearly) every memory allocation. Intended to ferret out memory protection bugs. Also makes R run very slowly, unfortunately.

Usage

gctorture(on = TRUE)
gctorture2(step, wait = step, inhibit_release = FALSE)

Arguments

on

logical; turning it on/off.

step

integer; run GC every step allocations; step = 0 turns the GC torture off.

wait

integer; number of allocations to wait before starting GC torture.

inhibit_release

logical; do not release free objects for re-use: use with caution.

Details

Calling gctorture(TRUE) instructs the memory manager to force a full GC on every allocation. gctorture2 provides a more refined interface that allows the start of the GC torture to be deferred and also gives the option of running a GC only every step allocations.

The third argument to gctorture2 is only used if R has been configured with a strict write barrier enabled. When this is the case all garbage collections are full collections, and the memory manager marks free nodes and enables checks in many situations that signal an error when a free node is used. This can help greatly in isolating unprotected values in C code. It does not detect the case where a node becomes free and is reallocated. The inhibit_release argument can be used to prevent such reallocation. This will cause memory to grow and should be used with caution and in conjunction with operating system facilities to monitor and limit process memory use.

gctorture2 can also be invoked via environment variables at the start of the R session. R_GCTORTURE corresponds to the step argument, R_GCTORTURE_WAIT to wait, and R_GCTORTURE_INHIBIT_RELEASE to inhibit_release.
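Since GC torture slows everything down, it is usually switched on only around the code under suspicion. The helper below is a hypothetical wrapper (not part of base R) sketching that pattern; it relies on gctorture returning the previous setting so that it can be restored.

```r
## Hypothetical helper: evaluate an expression with GC torture switched on,
## restoring the previous setting afterwards even if an error occurs
with_gctorture <- function(expr) {
  old <- gctorture(TRUE)             # returns the previous setting
  on.exit(gctorture(old), add = TRUE)
  expr                               # the promise is forced here, under torture
}

## Keep the tested expression small: everything runs very slowly under torture
res <- with_gctorture(sum(as.numeric(1:10)))
print(res)
```

In practice the expression would typically be a call into the compiled code being checked for unprotected values.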

Value

Previous value of first argument.

Author(s)

Peter Dalgaard and Luke Tierney


Return the Value of a Named Object

Description

Search by name for an object (get) or zero or more objects (mget).

Usage

get(x, pos = -1, envir = as.environment(pos), mode = "any",
    inherits = TRUE)

mget(x, envir = as.environment(-1), mode = "any", ifnotfound,
     inherits = FALSE)

dynGet(x, ifnotfound = , minframe = 1L, inherits = FALSE)

Arguments

x

For get, an object name (given as a character string or a symbol).
For mget, a character vector of object names.

pos, envir

where to look for the object (see ‘Details’); if omitted search as if the name of the object appeared unquoted in an expression.

mode

the mode or type of object sought: see the ‘Details’ section.

inherits

should the enclosing frames of the environment be searched?

ifnotfound

For mget, a list of values to be used if the item is not found: it will be coerced to a list if necessary.
For dynGet any R object, e.g., a call to stop().

minframe

integer specifying the minimal frame number to look into.

Details

The pos argument can specify the environment in which to look for the object in any of several ways: as a positive integer (the position in the search list); as the character string name of an element in the search list; or as an environment (including using sys.frame to access the currently active function calls). The default of -1 indicates the current environment of the call to get. The envir argument is an alternative way to specify an environment.

These functions look to see if each of the name(s) x have a value bound to it in the specified environment. If inherits is TRUE and a value is not found for x in the specified environment, the enclosing frames of the environment are searched until the name x is encountered. See environment and the ‘R Language Definition’ manual for details about the structure of environments and their enclosures.

If mode is specified then only objects of that type are sought. mode here is a mixture of the meanings of typeof and mode: "function" covers primitive functions and operators, "numeric", "integer" and "double" all refer to any numeric type, "symbol" and "name" are equivalent but "language" must be used (and not "call" or "("). Currently, mode = "S4" and mode = "object" are equivalent.

For mget, the values of mode and ifnotfound can be either the same length as x or of length 1. The argument ifnotfound must be a list containing either the value to use if the requested item is not found or a function of one argument which will be called if the item is not found, with argument the name of the item being requested.

dynGet() is somewhat experimental and to be used inside another function. It looks for an object in the callers, i.e., the sys.frame()s of the function. Use with caution.
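The lookup rules above can be sketched with a small environment. All object and function names below (e, child, combine, outer_fun, inner_fun, secret) are made up for the example.

```r
e <- new.env()
assign("a", 1, envir = e)

a_val <- get("a", envir = e)

## mget(): fallbacks for missing names, either a value or a function of the name
m1 <- mget(c("a", "b"), envir = e, ifnotfound = list(NA))
m2 <- mget(c("a", "b"), envir = e,
           ifnotfound = list(function(nm) paste("no object named", nm)))

## inherits = FALSE restricts the search to the given environment itself
child <- new.env(parent = e)
found_up   <- get("a", envir = child)          # found in the enclosure
found_here <- tryCatch(get("a", envir = child, inherits = FALSE),
                       error = function(cond) "not found")

## 'mode' skips objects of the wrong type: the local vector 'c' is ignored
combine <- function() {
  c <- 1:3
  get("c", mode = "function")(c, 4L)
}

## dynGet() looks in the frames of the calling functions
outer_fun <- function() { secret <- 42; inner_fun() }
inner_fun <- function() dynGet("secret", ifnotfound = NA)

print(m1)
print(combine())
print(outer_fun())
```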

Value

For get, the object found. If no object is found an error results.

For mget, a named list of objects (found or specified via ifnotfound).

Note

The reverse (or “inverse”) of a <- get(nam) is assign(nam, a), assigning a to name nam.

inherits = TRUE is the default for get in R but not for S where it had a different meaning.

References

Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988) The New S Language. Wadsworth & Brooks/Cole.

See Also

exists for checking whether an object exists; get0 for an efficient way of both checking existence and getting an object.

assign, the inverse of get(), see above.

Use getAnywhere for searching for an object anywhere, including in other namespaces, and getFromNamespace to find an object in a specific namespace.

Examples

get("%o%")

## test mget
e1 <- new.env()
mget(letters, e1, ifnotfound = as.list(LETTERS))

Reflectance Information for C/Fortran routines in a DLL

Description

This function allows us to query the set of routines in a DLL that are registered with R to enhance dynamic lookup, error handling when calling native routines, and potentially security in the future. This function provides a description of each of the registered routines in the DLL for the different interfaces, i.e. .C, .Call, .Fortran and .External.

Usage

getDLLRegisteredRoutines(dll, addNames = TRUE)

Arguments

dll

a character string or DLLInfo object. The character string specifies the file name of the DLL of interest, and is given without the file name extension (e.g., the ‘.dll’ or ‘.so’) and with no directory/path information. So a file ‘MyPackage/libs/MyPackage.so’ would be specified as ‘MyPackage’.

The DLLInfo objects can be obtained directly in calls to dyn.load and library.dynam, or can be found after the DLL has been loaded using getLoadedDLLs, which returns a list of DLLInfo objects (index-able by DLL file name).

The DLLInfo approach avoids any ambiguities related to two DLLs having the same name but corresponding to files in different directories.

addNames

a logical value. If this is TRUE, the elements of the returned lists are named using the names of the routines (as seen by R via registration or raw name). If FALSE, these names are not computed and assigned to the lists. As a result, the call should be quicker. The name information is also available in the NativeSymbolInfo objects in the lists.

Details

This takes the registration information after it has been registered and processed by the R internals. In other words, it uses the extended information.

There is a print method for the class, which prints only the types which have registered routines.

Value

A list of class "DLLRegisteredRoutines" with four elements corresponding to the routines registered for the .C, .Call, .Fortran and .External interfaces. Each is a list (of class "NativeRoutineList") with as many elements as there were routines registered for that interface.

Each element identifies a routine and is an object of class "NativeSymbolInfo". An object of this class has the following fields:

name

the registered name of the routine (not necessarily the name in the C code).

address

the memory address of the routine as resolved in the loaded DLL. This may be NULL if the symbol has not yet been resolved.

dll

an object of class DLLInfo describing the DLL. This is the same for all elements returned.

numParameters

the number of arguments the native routine is to be called with.
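The structure of the returned value can be explored as in the following sketch, which uses the "base" entry from getLoadedDLLs(). Whether any .Call routines are registered there is not assumed, so that part is guarded.

```r
dlls <- getLoadedDLLs()
routines <- getDLLRegisteredRoutines(dlls[["base"]])

print(class(routines))
print(names(routines))       # one element per interface

## Count the registered routines for each interface
counts <- vapply(routines, length, integer(1))
print(counts)

## Inspect the first .Call routine, if any are registered
if (length(routines[[".Call"]]) > 0) {
  info <- routines[[".Call"]][[1]]
  print(info$name)
  print(info$numParameters)
}
```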

Author(s)

Duncan Temple Lang

References

‘Writing R Extensions’ manual for symbol registration.

Duncan Temple Lang (2001). “In Search of C/C++ & FORTRAN Routines”. R News, 1(3), 20–23. https://www.r-project.org/doc/Rnews/Rnews_2001-3.pdf.

See Also

getLoadedDLLs, getNativeSymbolInfo for information on the entry points listed.

Examples

dlls <- getLoadedDLLs()
getDLLRegisteredRoutines(dlls[["base"]])

getDLLRegisteredRoutines("stats")

Get DLLs Loaded in Current Session

Description

This function provides a way to get a list of all the DLLs (see dyn.load) that are currently loaded in the R session.

Usage

getLoadedDLLs()

Details

This queries the internal table that manages the DLLs.

Value

An object of class "DLLInfoList" which is a list with an element corresponding to each DLL that is currently loaded in the session. Each element is an object of class "DLLInfo" which has the following entries.

name

the abbreviated name.

path

the fully qualified name of the loaded DLL.

dynamicLookup

a logical value indicating whether R uses only the registration information to resolve symbols or whether it searches the entire symbol table of the DLL.

handle

a reference to the C-level data structure that provides access to the contents of the DLL. This is an object of class "DLLHandle".

Note that the class DLLInfo has a method for $ which can be used to resolve native symbols within that DLL. Therefore, one must access the R-level elements described above using [[, e.g. x[["name"]] or x[["handle"]].
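A minimal sketch of this access pattern, using the "base" entry that also appears in the getDLLRegisteredRoutines examples (the variable name base_dll is chosen here for illustration):

```r
dlls <- getLoadedDLLs()
base_dll <- dlls[["base"]]

## Use [[ for the R-level elements; $ is reserved for resolving native symbols
base_dll[["name"]]             # the abbreviated name
base_dll[["path"]]             # the fully qualified name
base_dll[["dynamicLookup"]]    # a logical value
class(base_dll[["handle"]])    # the handle is of class "DLLHandle"
```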

Note

We are starting to use the handle elements in the DLL object to resolve symbols more directly in R.

Author(s)

Duncan Temple Lang

See Also

getDLLRegisteredRoutines, getNativeSymbolInfo

Examples

getLoadedDLLs()

utils::tail(getLoadedDLLs(), 2) # the last 2 loaded ones, still a DLLInfoList

Obtain a Description of one or more Native (C/Fortran) Symbols

Description

This finds and returns a description of one or more dynamically loaded or ‘exported’ built-in native symbols. For each name, it returns information about the name of the symbol, the library in which it is located and, if available, the number of arguments it expects and by which interface it should be called (i.e., .Call, .C, .Fortran, or .External). Additionally, it returns the address of the symbol and this can be passed to other C routines. Specifically, this provides a way to explicitly share symbols between different dynamically loaded package libraries. Also, it provides a way to query where symbols were resolved, and aids diagnosing strange behavior associated with dynamic resolution.

Usage

getNativeSymbolInfo(name, PACKAGE, unlist = TRUE,
                    withRegistrationInfo = FALSE)

Arguments

name

the name(s) of the native symbol(s).

PACKAGE

an optional argument that specifies to which DLL to restrict the search for this symbol. If this is "base", we search in the R executable itself.

unlist

a logical value which controls how the result is returned if the function is called with the name of a single symbol. If unlist is TRUE and the number of symbol names in name is one, then the NativeSymbolInfo object is returned. If it is FALSE, then a list of NativeSymbolInfo objects is returned. This is ignored if the number of symbols passed in name is more than one. To be compatible with earlier versions of this function, this defaults to TRUE.

withRegistrationInfo

a logical value indicating whether, if TRUE, to return information that was registered with R about the symbol and its parameter types if such information is available, or if FALSE to return just the address of the symbol.

Details

This uses the same mechanism for resolving symbols as is used in all the native interfaces (.Call, etc.). If the symbol has been explicitly registered by the DLL in which it is contained, information about the number of arguments and the interface by which it should be called will be returned. Otherwise, a generic native symbol object is returned.

Value

Generally, a list of NativeSymbolInfo elements whose elements can be indexed by the elements of name in the call. Each NativeSymbolInfo object is a list containing the following elements:

name

the name of the symbol, as given by the name argument.

address

if withRegistrationInfo is FALSE, this is the native memory address of the symbol which can be used to invoke the routine, and also to compare with other symbol addresses. This is an external pointer object and of class NativeSymbol. If withRegistrationInfo is TRUE and registration information is available for the symbol, then this is an object of class RegisteredNativeSymbol and is a reference to an internal data type that has access to the routine pointer and registration information. This too can be used in calls to .Call, .C, .Fortran and .External.

dll

a list containing 3 elements:

name

the short form of the library name which can be used as the value of the PACKAGE argument in the different native interface functions.

path

the fully qualified name of the DLL.

dynamicLookup

a logical value indicating whether dynamic resolution is used when looking for symbols in this library, or only registered routines can be located.

If the routine was explicitly registered by the dynamically loaded library, the list contains a fourth field

numParameters

the number of arguments that should be passed in a call to this routine.

Additionally, the list will have an additional class, being CRoutine, CallRoutine, FortranRoutine or ExternalRoutine corresponding to the R interface by which it should be invoked.

If any of the symbols is not found, an error is raised.

If name contains only one symbol name and unlist is TRUE, then the single NativeSymbolInfo is returned rather than the list containing that one element.
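The effect of unlist can be sketched as follows. To avoid hard-coding a symbol name, the sketch takes the first .Call routine registered by the stats DLL; that choice, and the variable names nm, info and as_list, are assumptions made purely for illustration.

```r
## Pick the name of some registered routine from a DLL that is normally loaded
nm <- names(getDLLRegisteredRoutines("stats")[[".Call"]])[1]

## unlist = TRUE (the default): a single NativeSymbolInfo object
info <- getNativeSymbolInfo(nm, PACKAGE = "stats")
class(info)                 # includes "NativeSymbolInfo"
info[["name"]]              # the name as supplied
info[["dll"]][["name"]]     # short name of the DLL where it was found

## unlist = FALSE: a list with one NativeSymbolInfo element, named by 'name'
as_list <- getNativeSymbolInfo(nm, PACKAGE = "stats", unlist = FALSE)
length(as_list)
names(as_list)
```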

Note

The third element of the NativeSymbolInfo objects was renamed from package to dll in R version 3.6.0, for consistency with the names of the NativeSymbolInfo objects returned by getDLLRegisteredRoutines().

Note

One motivation for accessing this reflectance information is to be able to pass native routines to C routines as function pointers in C. This allows us to treat native routines and R functions in a similar manner, such as when passing an R function to C code that makes callbacks to that function at different points in its computation (e.g., nls). Additionally, we can resolve the symbol just once and avoid resolving it repeatedly or using the internal cache.

Author(s)

Duncan Temple Lang

References

For information about registering native routines, see “In Search of C/C++ & FORTRAN Routines”, R-News, volume 1, number 3, 2001, p20–23 (https://www.r-project.org/doc/Rnews/Rnews_2001-3.pdf).

See Also

getDLLRegisteredRoutines, is.loaded, .C, .Fortran, .External, .Call, dyn.load.


Translate Text Messages

Description

Translation of text messages, typically from calls to stop(), warning(), or message(), happens when Native Language Support (NLS) was enabled in this build of R, as it almost always is; see also the bindtextdomain() example.

The functions documented here are the low level building blocks used explicitly or implicitly in almost all such message producing calls; they attempt to translate character vectors or set where the translations are to be found.

Usage

gettext(..., domain = NULL, trim = TRUE)
ngettext(n, msg1, msg2, domain = NULL)
bindtextdomain(domain, dirname = NULL)

Sys.setLanguage(lang, unset = "en")

Arguments

...

one or more character vectors.

trim

logical indicating if the white space trimming in gettext() should happen. trim = FALSE may be needed for compiled code (C / C++) messages which often end with \n.

domain

the ‘domain’ for the translation, a character string, or NULL; see ‘Details’.

n

a non-negative integer.

msg1

the message to be used in English for n = 1.

msg2

the message to be used in English for n = 0, 2, 3, ....

dirname

the directory in which to find translated message catalogs for the domain.

lang

a character string specifying a language for which translations should be sought.

unset

a string, specifying the default language assumed to be current in the case Sys.getenv("LANGUAGE") is unset or empty.

Details

If domain is NULL (the default) in gettext or ngettext, the domain is inferred. If gettext or ngettext is called from a function in the namespace of package pkg (including when called via stop(), warning(), or message() from that function), or, say, evaluated as if called from that namespace (see the evalq() example), the domain is set to "R-pkg". Otherwise there is no default domain and messages are not translated.

Setting domain = NA in gettext or ngettext suppresses any translation.

"" does not match any domain. In gettext or ngettext, domain = "" is effectively the same as domain = NA.

If the domain is found, each character string is offered for translation, and replaced by its translation into the current language if one is found.
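A small sketch of the two ways of switching translation off described above; the message text is taken from the examples in this help page, and the variable name msg is chosen here for illustration:

```r
msg <- "empty model supplied"

gettext(msg, domain = NA)   # translation suppressed: the string comes back as is
gettext(msg, domain = "")   # "" matches no domain, so effectively the same
```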

The language to be used for message translation is determined by your OS default and/or the locale setting at R's startup, see Sys.getlocale(), and notably the LANGUAGE environment variable, and also Sys.setLanguage() here.

Conventionally the domain for R warning/error messages in package pkg is "R-pkg", and that for C-level messages is "pkg".

For gettext, when trim is true as by default, leading and trailing whitespace is ignored (“trimmed”) when looking for the translation.

ngettext is used where the message needs to vary by a single integer. Translating such messages is subject to very specific rules for different languages: see the GNU Gettext Manual. The string will often contain a single instance of %d to be used in sprintf. If English is used, msg1 is returned if n == 1 and msg2 in all other cases.
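The English rule can be sketched with a small helper. The function n_files_msg and its message strings are hypothetical, introduced only for illustration; domain = NA is used so that the English rule applies whatever the session language is.

```r
## Hypothetical helper: build a count-dependent message.
n_files_msg <- function(n) {
  sprintf(ngettext(n, "%d file was removed", "%d files were removed",
                   domain = NA),   # NA: no translation, English rule only
          n)
}

vapply(0:2, n_files_msg, "")
## "0 files were removed" "1 file was removed" "2 files were removed"
```

In package code one would normally leave domain at its default so that translators can supply the plural forms appropriate for their language.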

bindtextdomain is typically a wrapper for the C function of the same name: your system may have a man page for it. With a non-NULL dirname it specifies where to look for message catalogues: with dirname = NULL it returns the current location. If NLS is not enabled, bindtextdomain(*,*) returns NULL. The special case bindtextdomain(NULL) calls C level textdomain(textdomain(NULL)) for the purpose of flushing (i.e., emptying) the cache of already translated strings; it returns TRUE when NLS is enabled.

The utility Sys.setLanguage(lang) combines setting the LANGUAGE environment variable with flushing the translation cache by bindtextdomain(NULL).

Value

For gettext, a character vector, one element per string in .... If translation is not enabled or no domain is found or no translation is found in that domain, the original strings are returned.

For ngettext, a character string.

For bindtextdomain, a character string giving the current base directory, or NULL if setting it failed.

For Sys.setLanguage(), the previous LANGUAGE setting with attribute attr(*, "ok"), a logical indicating success. Note that currently a non-existing language lang is still set; no translation will then happen, and no message is given.

See Also

stop and warning make use of gettext to translate messages.

xgettext (package tools) for extracting translatable strings from R source files.

Examples

bindtextdomain("R")  # non-null if and only if NLS is enabled

for(n in 0:3)
    print(sprintf(ngettext(n, "%d variable has missing values",
                              "%d variables have missing values"),
                  n))

## Not run: 
## for translation, those strings should appear in R-pkg.pot as
msgid        "%d variable has missing values"
msgid_plural "%d variables have missing values"
msgstr[0] ""
msgstr[1] ""

## End(Not run)

miss <- "One only" # this line, or the next for the ngettext() below
miss <- c("one", "or", "another")
cat(ngettext(length(miss), "variable", "variables"),
    paste(sQuote(miss), collapse = ", "),
    ngettext(length(miss), "contains", "contain"), "missing values\n")

## better for translators would be to use
cat(sprintf(ngettext(length(miss),
                     "variable %s contains missing values\n",
                     "variables %s contain missing values\n"),
            paste(sQuote(miss), collapse = ", ")))

thisLang <- Sys.getenv("LANGUAGE", unset = NA) # so we can reset it
if(is.na(thisLang) || !nzchar(thisLang)) thisLang <- "en" # "factory" default
enT <- "empty model supplied"
Sys.setenv(LANGUAGE = "de") # may not always 'work'
gettext(enT, domain = "R-stats") # "leeres Modell angegeben" (if translation works)
tget <- function() gettext(enT)
tget() # not translated as fn tget() is not from "stats" pkg/namespace
evalq(function() gettext(enT), asNamespace("stats"))() # *is* translated

## Sys.setLanguage()  -- typical usage --
Sys.setLanguage("en") -> oldSet # does set LANGUAGE env.var
errMsg <- function(expr) tryCatch(expr, error = conditionMessage)
(errMsg(1 + "2") -> err)
Sys.setLanguage("fr")
errMsg(1 + "2")
Sys.setLanguage("de")
errMsg(1 + "2")
## Usually, you would reset the language to "previous" via
Sys.setLanguage(oldSet)

## A show off of translations -- platform (font etc) dependent:
## The translation languages available for "base" R in this version of R:
if(capabilities("NLS")) withAutoprint({
  langs <- list.files(bindtextdomain("R"),
                      pattern = "^[a-z]{2}(_[A-Z]{2}|@quot)?$")
  langs
  txts <- sapply(setNames(,langs),
                 function(lang) { Sys.setLanguage(lang)
                     gettext("incompatible dimensions", domain = "R-stats") })
  cbind(txts)
  (nTrans <- length(unique(txts)))
  (not_translated <- names(txts[txts == txts[["en"]]]))
})

## Here, we reset to the *original* setting before the full example started:
if(nzchar(thisLang)) { ## reset to previous and check
  Sys.setLanguage(thisLang)
  stopifnot(identical(errMsg(1 + "2"), err))
} # else staying at 'de' ..

Get or Set Working Directory

Description

getwd returns an absolute filepath representing the current working directory of the R process; setwd(dir) is used to set the working directory to dir.

Usage

getwd()
setwd(dir)

Arguments

dir

A character string: tilde expansion will be done.

Details

See files for how file paths with marked encodings are interpreted.

Value

getwd returns a character string or NULL if the working directory is not available. On Windows the path returned will use / as the path separator and be encoded in UTF-8. The path will not have a trailing / unless it is the root directory (of a drive or share on Windows).

setwd returns the current directory before the change, invisibly and with the same conventions as getwd. It will give an error if it does not succeed (including if it is not implemented).
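Because setwd returns the previous directory, a temporary change of directory can be undone reliably. The helper with_dir below is hypothetical (it is not part of base R) and is only a sketch of that idiom, assuming the working directory is available to begin with:

```r
## Hypothetical helper: evaluate 'expr' with 'dir' as the working directory.
with_dir <- function(dir, expr) {
  old <- setwd(dir)                 # setwd() returns the previous directory
  on.exit(setwd(old), add = TRUE)   # restore it even if 'expr' fails
  force(expr)                       # 'expr' is lazy, so it is evaluated inside 'dir'
}

start <- getwd()
inside <- with_dir(tempdir(), basename(getwd()))
inside                        # last component of the session's temporary directory
identical(getwd(), start)     # checks that the original directory was restored
```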

Note

Note that the return value is said to be an absolute filepath: there can be more than one representation of the path to a directory and on some OSes the value returned can differ after changing directories and changing back to the same directory (for example if symbolic links have been traversed).

See Also

list.files for the contents of a directory.

normalizePath for a ‘canonical’ path name.

Examples

(WD <- getwd())
if (!is.null(WD)) setwd(WD)

Generate Factor Levels

Description

Generate factors by specifying the pattern of their levels.

Usage

gl(n, k, length = n*k, labels = seq_len(n), ordered = FALSE)

Arguments

n

an integer giving the number of levels.

k

an integer giving the number of replications.

length

an integer giving the length of the result.

labels

an optional vector of labels for the resulting factor levels.

ordered

a logical indicating whether the result should be ordered or not.

Value

The result has levels from 1 to n with each value replicated in groups of length k out to a total length of length.
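The pattern can be spelled out with rep and factor. This is only a sketch of the behaviour described above, not a statement about how gl is implemented; the variable names are chosen for illustration.

```r
n <- 2; k <- 3; len <- 10

g <- gl(n, k, length = len)
g                 # 1 1 1 2 2 2 1 1 1 2, with levels 1 and 2

## The same pattern from more basic building blocks:
## each level repeated k times, recycled out to length 'len'
g2 <- factor(rep_len(rep(seq_len(n), each = k), len), levels = seq_len(n))
all(g == g2)
```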

gl is modelled on the GLIM function of the same name.

See Also

The underlying factor().

Examples

## First control, then treatment:
gl(2, 8, labels = c("Control", "Treat"))
## 20 alternating 1s and 2s
gl(2, 1, 20)
## alternating pairs of 1s and 2s
gl(2, 2, 20)

Pattern Matching and Replacement

Description

grep, grepl, regexpr, gregexpr, regexec and gregexec search for matches to argument pattern within each element of a character vector: they differ in the format of and amount of detail in the results.

sub and gsub perform replacement of the first and all matches respectively.

Usage

grep(pattern, x, ignore.case = FALSE, perl = FALSE, value = FALSE,
     fixed = FALSE, useBytes = FALSE, invert = FALSE)

grepl(pattern, x, ignore.case = FALSE, perl = FALSE,
      fixed = FALSE, useBytes = FALSE)

sub(pattern, replacement, x, ignore.case = FALSE, perl = FALSE,
    fixed = FALSE, useBytes = FALSE)

gsub(pattern, replacement, x, ignore.case = FALSE, perl = FALSE,
     fixed = FALSE, useBytes = FALSE)

regexpr(pattern, text, ignore.case = FALSE, perl = FALSE,
        fixed = FALSE, useBytes = FALSE)

gregexpr(pattern, text, ignore.case = FALSE, perl = FALSE,
         fixed = FALSE, useBytes = FALSE)

regexec(pattern, text, ignore.case = FALSE, perl = FALSE,
        fixed = FALSE, useBytes = FALSE)

gregexec(pattern, text, ignore.case = FALSE, perl = FALSE,
         fixed = FALSE, useBytes = FALSE)

Arguments

pattern

character string containing a regular expression (or character string for fixed = TRUE) to be matched in the given character vector. Coerced by as.character to a character string if possible. If a character vector of length 2 or more is supplied, the first element is used with a warning. Missing values are allowed except for regexpr, gregexpr and regexec.

x, text

a character vector where matches are sought, or an object which can be coerced by as.character to a character vector. Long vectors are supported.

ignore.case

if FALSE, the pattern matching is case sensitive and if TRUE, case is ignored during matching.

perl

logical. Should Perl-compatible regexps be used?

value

if FALSE, a vector containing the (integer) indices of the matches determined by grep is returned, and if TRUE, a vector containing the matching elements themselves is returned.

fixed

logical. If TRUE, pattern is a string to be matched as is. Overrides all conflicting arguments.

useBytes

logical. If TRUE the matching is done byte-by-byte rather than character-by-character. See ‘Details’.

invert

logical. If TRUE return indices or values for elements that do not match.

replacement

a replacement for matched pattern in sub and gsub. Coerced to character if possible. For fixed = FALSE this can include backreferences "\1" to "\9" to parenthesized subexpressions of pattern. For perl = TRUE only, it can also contain "\U" or "\L" to convert the rest of the replacement to upper or lower case and "\E" to end case conversion. If a character vector of length 2 or more is supplied, the first element is used with a warning. If NA, all elements in the result corresponding to matches will be set to NA.

Details

Arguments which should be character strings or character vectors are coerced to character if possible.

Each of these functions operates in one of three modes:

  1. fixed = TRUE: use exact matching.

  2. perl = TRUE: use Perl-style regular expressions.

  3. fixed = FALSE, perl = FALSE: use POSIX 1003.2 extended regular expressions (the default).
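The three modes can be contrasted on a tiny input. The vector x below is made up for illustration:

```r
x <- c("a.c", "abc")

grepl("a.c", x)                  # default mode: "." matches any character
## TRUE TRUE
grepl("a.c", x, fixed = TRUE)    # exact matching of the literal string "a.c"
## TRUE FALSE
grepl("a(?=b)", x, perl = TRUE)  # lookahead, a Perl-style construct
## FALSE TRUE
```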

See the help pages on regular expression for details of the different types of regular expressions.

The two *sub functions differ only in that sub replaces only the first occurrence of a pattern whereas gsub replaces all occurrences. If replacement contains backreferences which are not defined in pattern the result is undefined (but most often the backreference is taken to be "").
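A short sketch of the difference, together with a replacement that uses backreferences which are defined in the pattern (the input strings are made up for illustration):

```r
s <- "foo boo"
sub("o", "0", s)     # first occurrence only: "f0o boo"
gsub("o", "0", s)    # all occurrences:       "f00 b00"

## "\\1" and "\\2" refer to the first and second parenthesized subexpression
gsub("([a-z]+)@([a-z]+)", "\\2 at \\1", "user@host")   # "host at user"
```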

For regexpr, gregexpr, regexec and gregexec it is an error for pattern to be NA, otherwise NA is permitted and gives an NA match.

Both grep and grepl take missing values in x as not matching a non-missing pattern.
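A brief sketch of how a missing value in the input is treated by the different functions (the vector x is made up for illustration):

```r
x <- c("apple", NA, "banana")

grep("an", x)                  # 3: the NA element does not count as a match
grepl("an", x)                 # FALSE FALSE TRUE
as.integer(regexpr("an", x))   # -1 NA 2: an NA input gives an NA match
```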

The main effect of useBytes = TRUE is to avoid errors/warnings about invalid inputs and spurious matches in multibyte locales, but for regexpr it changes the interpretation of the output. It inhibits the conversion of inputs with marked encodings, and is forced if any input is found which is marked as "bytes" (see Encoding).

Caseless matching does not make much sense for bytes in a multibyte locale, and you should expect it only to work for ASCII characters if useBytes = TRUE.

regexpr and gregexpr with perl = TRUE allow Python-style named captures, but not for long vector inputs.

Invalid inputs in the current locale are warned about up to 5 times.

Caseless matching with perl = TRUE for non-ASCII characters depends on the PCRE library being compiled with ‘Unicode property support’, which PCRE2 is by default.

Value

grep(value = FALSE) returns a vector of the indices of the elements of x that yielded a match (or not, for invert = TRUE). This will be an integer vector unless the input is a long vector, when it will be a double vector.

grep(value = TRUE) returns a character vector containing the selected elements of x (after coercion, preserving names but no other attributes).

grepl returns a logical vector (match or not for each element of x).

sub and gsub return a character vector of the same length and with the same attributes as x (after possible coercion to character). Elements of character vectors x which are not substituted will be returned unchanged (including any declared encoding if useBytes = FALSE). If useBytes = FALSE a non-ASCII substituted result will often be in UTF-8 with a marked encoding (e.g., if there is a UTF-8 input, and in a multibyte locale unless fixed = TRUE). Such strings can be re-encoded by enc2native. If any of the inputs is marked as "bytes", elements of character vectors x which are substituted will be returned marked as "bytes", but the encoding flag on elements not substituted is unspecified (it may be the original or "bytes"). If none of the inputs is marked as "bytes", but useBytes = TRUE is given explicitly, the encoding flag is unspecified even on the substituted elements (it may be "bytes" or "unknown", possibly invalid in the current encoding). Mixed use of "bytes" and other marked encodings is discouraged, but if still desired one may use iconv to re-encode the result e.g. to UTF-8 with suitably substituted invalid bytes.

regexpr returns an integer vector of the same length as text giving the starting position of the first match or -1 if there is none, with attribute "match.length", an integer vector giving the length of the matched text (or -1 for no match). The match positions and lengths are in characters unless useBytes = TRUE is used, when they are in bytes (as they are for ASCII-only matching: in either case an attribute useBytes with value TRUE is set on the result). If named capture is used there are further attributes "capture.start", "capture.length" and "capture.names".
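The start positions and the "match.length" attribute are enough to cut out the matched text by hand, as the following sketch shows (regmatches does the same more conveniently). The input vector and variable names are made up for illustration.

```r
x <- c("ab123cd", "none", "7 up")
m <- regexpr("[0-9]+", x)

start <- as.integer(m)             #  3 -1  1
len   <- attr(m, "match.length")   #  3 -1  1

hit <- start > 0                   # elements with a match
substring(x[hit], start[hit], start[hit] + len[hit] - 1)   # "123" "7"
regmatches(x, m)                                            # "123" "7"
```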

gregexpr returns a list of the same length as text each element of which is of the same form as the return value for regexpr, except that the starting positions of every (disjoint) match are given.

regexec returns a list of the same length as text each element of which is either -1 if there is no match, or a sequence of integers with the starting positions of the match and all substrings corresponding to parenthesized subexpressions of pattern, with attribute "match.length" a vector giving the lengths of the matches (or -1 for no match). The interpretation of positions and length and the attributes follows regexpr.

gregexec returns the same as regexec, except that to accommodate multiple matches per element of text, the integer sequences for each match are made into columns of a matrix, with one matrix per element of text with matches.

Where matching failed because of resource limits (especially for perl = TRUE) this is regarded as a non-match, usually with a warning.

Warning

The POSIX 1003.2 mode of gsub and gregexpr does not work correctly with repeated word-boundaries (e.g., pattern = "\b"). Use perl = TRUE for such matches (but that may not work as expected with non-ASCII inputs, as the meaning of ‘word’ is system-dependent).

Performance considerations

If you are doing a lot of regular expression matching, including on very long strings, you will want to consider the options used. Generally perl = TRUE will be faster than the default regular expression engine, and fixed = TRUE faster still (especially when each pattern is matched only a few times).

If you are working with texts with non-ASCII characters, which can be easily turned into ASCII (e.g. by substituting fancy quotes), doing so is likely to improve performance.

If you are working in a single-byte locale (though not common since R 4.2) and have marked UTF-8 strings that are representable in that locale, convert them first as just one UTF-8 string will force all the matching to be done in Unicode, which attracts a penalty of around 3× for the default POSIX 1003.2 mode.

While useBytes = TRUE will improve performance further, because the strings will not be checked before matching and the actual matching will be faster, it can produce unexpected results so is best avoided. With fixed = TRUE and useBytes = FALSE, optimizations are in place that take advantage of byte-based matching working for such patterns in UTF-8. With useBytes = TRUE, character ranges, wildcards, and other regular expression patterns may produce unexpected results.

PCRE-based matching by default used to put additional effort into ‘studying’ the compiled pattern when x/text has length 10 or more. That study may use the PCRE JIT compiler on platforms where it is available (see pcre_config). As from PCRE2 (PCRE version >= 10.00 as reported by extSoftVersion), there is no study phase, but the patterns are optimized automatically when possible, and PCRE JIT is used when enabled. The details are controlled by options PCRE_study and PCRE_use_JIT. (Some timing comparisons can be seen by running file ‘tests/PCRE.R’ in the R sources (and perhaps installed).) People working with PCRE and very long strings can adjust the maximum size of the JIT stack by setting environment variable R_PCRE_JIT_STACK_MAXSIZE before JIT is used to a value between 1 and 1000 in MB: the default is 64. When JIT is not used with PCRE version < 10.30 (that is with PCRE1 and old versions of PCRE2), it might also be wise to set the option PCRE_limit_recursion.

Note

Aspects will be platform-dependent as well as locale-dependent: for example the implementation of character classes (except [:digit:] and [:xdigit:]). One can expect results to be consistent for ASCII inputs and when working in UTF-8 mode (when most platforms will use Unicode character tables, although those are updated frequently and subject to some degree of interpretation – is a circled capital letter alphabetic or a symbol?). However, results in 8-bit encodings can differ considerably between platforms, modes and from the UTF-8 versions.

Source

The C code for POSIX-style regular expression matching has changed over the years. As from R 2.10.0 (Oct 2009) the TRE library of Ville Laurikari (https://github.com/laurikari/tre) is used. The POSIX standard does give some room for interpretation, especially in the handling of invalid regular expressions and the collation of character ranges, so the results will have changed slightly over the years.

For Perl-style matching PCRE2 or PCRE (https://www.pcre.org) is used: again the results may depend (slightly) on the version of PCRE in use.

References

Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988) The New S Language. Wadsworth & Brooks/Cole (grep)

See Also

regular expression (aka regexp) for the details of the pattern specification.

regmatches for extracting matched substrings based on the results of regexpr, gregexpr and regexec.

glob2rx to turn wildcard matches into regular expressions.

agrep for approximate matching.

charmatch, pmatch for partial matching, match for matching to whole strings, startsWith for matching of initial parts of strings.

tolower, toupper and chartr for character translations.

apropos uses regexps and has more examples.

grepRaw for matching raw vectors.

Options PCRE_limit_recursion, PCRE_study and PCRE_use_JIT.

extSoftVersion for the versions of regex and PCRE libraries in use, pcre_config for more details for PCRE.

Examples

grep("[a-z]", letters)

txt <- c("arm", "foot", "lefroo", "bafoobar")
if(length(i <- grep("foo", txt)))
   cat("'foo' appears at least once in\n\t", txt, "\n")
i # 2 and 4
txt[i]

## Double all 'a' or 'b's;  "\" must be escaped, i.e., 'doubled'
gsub("([ab])", "\\1_\\1_", "abc and ABC")

txt <- c("The", "licenses", "for", "most", "software", "are",
  "designed", "to", "take", "away", "your", "freedom",
  "to", "share", "and", "change", "it.",
  "", "By", "contrast,", "the", "GNU", "General", "Public", "License",
  "is", "intended", "to", "guarantee", "your", "freedom", "to",
  "share", "and", "change", "free", "software", "--",
  "to", "make", "sure", "the", "software", "is",
  "free", "for", "all", "its", "users")
( i <- grep("[gu]", txt) ) # indices
stopifnot( txt[i] == grep("[gu]", txt, value = TRUE) )

## Note that for some implementations character ranges are
## locale-dependent (but not currently).  Then [b-e] in locales such as
## en_US may include B as the collation order is aAbBcCdDe ...
(ot <- sub("[b-e]", ".", txt))
txt[ot != gsub("[b-e]", ".", txt)] #- gsub does "global" substitution
## In caseless matching, ranges include both cases:
a <- grep("[b-e]", txt, value = TRUE)
b <- grep("[b-e]", txt, ignore.case = TRUE, value = TRUE)
setdiff(b, a)

txt[gsub("g", "#", txt) !=
    gsub("g", "#", txt, ignore.case = TRUE)] # the "G" words

regexpr("en", txt)

gregexpr("e", txt)

## Using grepl() for filtering
## Find functions with argument names matching "warn":
findArgs <- function(env, pattern) {
  nms <- ls(envir = as.environment(env))
  nms <- nms[is.na(match(nms, c("F", "T")))] # <-- work around "checking hack"
  aa <- sapply(nms, function(.) { o <- get(.)
               if(is.function(o)) names(formals(o)) })
  iw <- sapply(aa, function(a) any(grepl(pattern, a, ignore.case = TRUE)))
  aa[iw]
}
findArgs("package:base", "warn")

## trim trailing white space
str <- "Now is the time      "
sub(" +$", "", str)  ## spaces only
## what is considered 'white space' depends on the locale.
sub("[[:space:]]+$", "", str) ## white space, POSIX-style
## what PCRE considered white space changed in version 8.34: see ?regex
sub("\\s+$", "", str, perl = TRUE) ## PCRE-style white space

## capitalizing
txt <- "a test of capitalizing"
gsub("(\\w)(\\w*)", "\\U\\1\\L\\2", txt, perl = TRUE)
gsub("\\b(\\w)",    "\\U\\1",       txt, perl = TRUE)

txt2 <- "useRs may fly into JFK or laGuardia"
gsub("(\\w)(\\w*)(\\w)", "\\U\\1\\E\\2\\U\\3", txt2, perl = TRUE)
 sub("(\\w)(\\w*)(\\w)", "\\U\\1\\E\\2\\U\\3", txt2, perl = TRUE)

## named capture
notables <- c("  Ben Franklin and Jefferson Davis",
              "\tMillard Fillmore")
# name groups 'first' and 'last'
name.rex <- "(?<first>[[:upper:]][[:lower:]]+) (?<last>[[:upper:]][[:lower:]]+)"
(parsed <- regexpr(name.rex, notables, perl = TRUE))
gregexpr(name.rex, notables, perl = TRUE)[[2]]
parse.one <- function(res, result) {
  m <- do.call(rbind, lapply(seq_along(res), function(i) {
    if(result[i] == -1) return("")
    st <- attr(result, "capture.start")[i, ]
    substring(res[i], st, st + attr(result, "capture.length")[i, ] - 1)
  }))
  colnames(m) <- attr(result, "capture.names")
  m
}
parse.one(notables, parsed)

## Decompose a URL into its components.
## Example by LT (http://www.cs.uiowa.edu/~luke/R/regexp.html).
x <- "http://stat.umn.edu:80/xyz"
m <- regexec("^(([^:]+)://)?([^:/]+)(:([0-9]+))?(/.*)", x)
m
regmatches(x, m)
## Element 3 is the protocol, 4 is the host, 6 is the port, and 7
## is the path.  We can use this to make a function for extracting the
## parts of a URL:
URL_parts <- function(x) {
    m <- regexec("^(([^:]+)://)?([^:/]+)(:([0-9]+))?(/.*)", x)
    parts <- do.call(rbind,
                     lapply(regmatches(x, m), `[`, c(3L, 4L, 6L, 7L)))
    colnames(parts) <- c("protocol", "host", "port", "path")
    parts
}
URL_parts(x)

## gregexec() may match multiple times within a single string.
pattern <- "([[:alpha:]]+)([[:digit:]]+)"
s <- "Test: A1 BC23 DEF456"
m <- gregexec(pattern, s)
m
regmatches(s, m)

## Before gregexec() was implemented, one could emulate it by running
## regexec() on the regmatches obtained via gregexpr().  E.g.:
lapply(regmatches(s, gregexpr(pattern, s)),
       function(e) regmatches(e, regexec(pattern, e)))

Pattern Matching for Raw Vectors

Description

grepRaw searches for substring pattern matches within a raw vector x.

Usage

grepRaw(pattern, x, offset = 1L, ignore.case = FALSE,
        value = FALSE, fixed = FALSE, all = FALSE, invert = FALSE)

Arguments

pattern

raw vector containing a regular expression (or fixed pattern for fixed = TRUE) to be matched in the given raw vector. Coerced by charToRaw to a raw vector if possible.

x

a raw vector where matches are sought, or an object which can be coerced by charToRaw to a raw vector. Long vectors are not supported.

ignore.case

if FALSE, the pattern matching is case sensitive and if TRUE, case is ignored during matching.

offset

an integer specifying the offset from which the search should start. Must be positive. The beginning of line is defined to be at that offset so "^" will match there.

value

logical. Determines the return value: see ‘Value’.

fixed

logical. If TRUE, pattern is a pattern to be matched as is.

all

logical. If TRUE all matches are returned, otherwise just the first one.

invert

logical. If TRUE return indices or values for elements that do not match. Ignored (with a warning) unless value = TRUE.

Details

Unlike grep, grepRaw seeks matching patterns within the raw vector x. This has implications especially in the all = TRUE case, e.g., patterns matching empty strings are inherently infinite and thus may lead to unexpected results.

The argument invert is interpreted as asking to return the complement of the match, which is only meaningful for value = TRUE. Argument offset determines the start of the search, not of the complement. Note that invert = TRUE with all = TRUE will split x into pieces delimited by the pattern including leading and trailing empty strings (consequently the use of regular expressions with "^" or "$" in that case may lead to less intuitive results).
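The splitting behaviour of invert = TRUE with all = TRUE can be sketched on a small comma-delimited input (the input and variable names are made up for illustration):

```r
x <- charToRaw("a,b,c")

## Complement of all matches of ",": the pieces between the delimiters
pieces <- grepRaw(",", x, all = TRUE, invert = TRUE, value = TRUE)
vapply(pieces, rawToChar, "")   # "a" "b" "c"
```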

Some combinations of arguments such as fixed = TRUE with value = TRUE are supported but are less meaningful.

Value

grepRaw(value = FALSE) returns an integer vector of the offsets at which matches have occurred. If all = FALSE then it will be either of length zero (no match) or length one (first matching position).

grepRaw(value = TRUE, all = FALSE) returns a raw vector which is either empty (no match) or the matched part of x.

grepRaw(value = TRUE, all = TRUE) returns a (potentially empty) list of raw vectors corresponding to the matched parts.

Source

The TRE library of Ville Laurikari (https://github.com/laurikari/tre/) is used except for fixed = TRUE.

See Also

regular expression (aka regexp) for the details of the pattern specification.

grep for matching character vectors.

Examples

grepRaw("no match", "textText") # integer(0): no match
grepRaw("adf", "adadfadfdfadadf") # 3 - the first match
grepRaw("adf", "adadfadfdfadadf", all=TRUE, fixed=TRUE)
## [1]  3  6 13 -- three matches
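
The different return shapes described under ‘Value’ can be compared side by side. The following is a minimal sketch that reuses the input from the examples; the variable names (x, pos, first, parts, rest) are illustrative only, and the pieces returned with invert = TRUE are shown without an expected result.

```r
x <- charToRaw("adadfadfdfadadf")

## value = FALSE: integer offsets of the matches
pos <- grepRaw("adf", x, all = TRUE, fixed = TRUE)

## value = TRUE, all = FALSE: a raw vector holding the first matched part
first <- grepRaw("adf", x, value = TRUE)
rawToChar(first)

## value = TRUE, all = TRUE: a list of raw vectors, one per match
parts <- grepRaw("adf", x, value = TRUE, all = TRUE)
vapply(parts, rawToChar, "")

## invert = TRUE with all = TRUE: the pieces of x between the matches
rest <- grepRaw("adf", x, value = TRUE, all = TRUE, invert = TRUE)
vapply(rest, rawToChar, "")
```

rawToChar is used here only to make the raw results readable; it assumes the matched bytes form valid text.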

S3 Group Generic Functions

Description

Group generic methods can be defined for the following pre-specified groups of functions, Math, Ops, matrixOps, Summary and Complex. (There are no objects of these names in base R, but there are in the methods package, not yet for matrixOps.)

A method defined for an individual member of the group takes precedence over a method defined for the group as a whole.

Usage

## S3 methods for group generics have prototypes:
Math(x, ...)
Ops(e1, e2)
Complex(z)
Summary(..., na.rm=FALSE)
matrixOps(x, y)

Arguments

x, y, z, e1, e2

objects.

...

further arguments passed to methods.

na.rm

logical: should missing values be removed?

Details

There are five groups for which S3 methods can be written, namely the "Math", "Ops", "Summary", "matrixOps", and "Complex" groups. These are not R objects in base R, but methods can be supplied for them and base R contains factor, data.frame and difftime methods for the first three groups. (There is also an ordered method for Ops, POSIXt and Date methods for Math and Ops, package_version methods for Ops and Summary, as well as a ts method for Ops in package stats.)

  1. Group "Math":

    • abs, sign, sqrt,
      floor, ceiling, trunc,
      round, signif

    • exp, log, expm1, log1p,
      cos, sin, tan,
      cospi, sinpi, tanpi,
      acos, asin, atan,
      cosh, sinh, tanh,
      acosh, asinh, atanh

    • lgamma, gamma, digamma, trigamma

    • cumsum, cumprod, cummax, cummin

    Members of this group dispatch on x. Most members accept only one argument, but members log, round and signif accept one or two arguments, and trunc accepts one or more.

  2. Group "Ops":

    • "+", "-", "*", "/", "^", "%%", "%/%"

    • "&", "|", "!"

    • "==", "!=", "<", "<=", ">=", ">"

    This group contains both binary and unary operators (+, - and !): when a unary operator is encountered the Ops method is called with one argument and e2 is missing.

    The classes of both arguments are considered in dispatching any member of this group. For each argument its vector of classes is examined to see if there is a matching specific (preferred) or Ops method. If a method is found for just one argument or the same method is found for both, it is used. If different methods are found, then the generic chooseOpsMethod() is called to pick the appropriate method. (See ?chooseOpsMethod for details). If chooseOpsMethod() does not resolve the method, then there is a warning about ‘incompatible methods’: in that case or if no method is found for either argument the internal method is used.

    Note that the data.frame methods for the comparison ("Compare": ==, <, ...) and logic ("Logic": &, | and !) operators return a logical matrix instead of a data frame, for convenience and back compatibility.

    If the members of this group are called as functions, any argument names are removed to ensure that positional matching is always used.

  3. Group "matrixOps":

    • "%*%"

    This group currently contains the matrix multiply %*% binary operator only, where at least crossprod() and tcrossprod() are meant to follow. Members of the group have the same dispatch semantics (using both arguments) as the Ops group.

  4. Group "Summary":

    • all, any

    • sum, prod

    • min, max

    • range

    Members of this group dispatch on the first argument supplied.

    Note that the data.frame methods for the "Summary" and "Math" groups require “numeric-alike” columns x, i.e., fulfilling

          is.numeric(x) || is.logical(x) || is.complex(x)

  5. Group "Complex":

    • Arg, Conj, Im, Mod, Re

    Members of this group dispatch on z.

Note that a method will be used for one of these groups or one of its members only if it corresponds to a "class" attribute, as the internal code dispatches on oldClass and not on class. This is for efficiency: having to dispatch on, say, Ops.integer would be too slow.

The number of arguments supplied for primitive members of the"Math" group generic methods is not checked prior to dispatch.

There is no lazy evaluation of arguments for group-generic functions.

Technical Details

These functions are all primitive and internal generic.

The details of method dispatch and variables such as .Generic are discussed in the help for UseMethod. There are a few small differences:

  • For the operators of group Ops, the object .Method is a length-two character vector with elements the methods selected for the left and right arguments respectively. (If no method was selected, the corresponding element is "".)

  • Object .Group records the group used for dispatch (if a specific method is used this is "").
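
As an illustration of how a single group method serves every member of its group, here is a minimal sketch for a hypothetical class "temp" (the class, its constructor temp() and the choice to keep the class only for arithmetic results are assumptions made for this example, not part of base R). It relies on .Generic to find out which member triggered the dispatch.

```r
## Hypothetical class "temp": a numeric vector tagged with a class attribute.
temp <- function(x) structure(x, class = "temp")

## One Ops method serves every operator in the group; .Generic names the
## operator that triggered the dispatch.
Ops.temp <- function(e1, e2) {
    res <- if (missing(e2)) {
        get(.Generic)(unclass(e1))              # unary +, - or !
    } else {
        get(.Generic)(unclass(e1), unclass(e2)) # binary operator
    }
    ## Arithmetic results keep the class; comparisons and logic do not.
    if (.Generic %in% c("+", "-", "*", "/", "^", "%%", "%/%")) temp(res) else res
}

## Likewise one Math method covers sqrt, round, cumsum, ...
Math.temp <- function(x, ...) temp(get(.Generic)(unclass(x), ...))

x <- temp(c(1, 4))
x + 1        # dispatches to Ops.temp with .Generic == "+"
-x           # unary call: e2 is missing
x > 2        # plain logical vector
sqrt(x)      # dispatches to Math.temp
```

Calling unclass() before re-invoking the operator avoids dispatching back into the same method.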

Note

Package methods does contain objects with these names, which it has re-used in confusingly similar (but different) ways. See the help for that package.

References

Appendix A, Classes and Methods of
Chambers, J. M. and Hastie, T. J. eds (1992) Statistical Models in S. Wadsworth & Brooks/Cole.

See Also

methods for methods of non-internal generic functions.

S4groupGeneric for group generics for S4 methods.

Examples

require(utils)

d.fr <- data.frame(x=1:9, y= stats::rnorm(9))
class(1 + d.fr) == "data.frame" ##-- add to d.f. ...

methods("Math")
methods("Ops")
methods("Summary")
methods("Complex") # none in base R

Grouping Permutation

Description

grouping returns a permutation which rearranges its first argument such that identical values are adjacent to each other. Also returned as attributes are the group-wise partitioning and the maximum group size.

Usage

grouping(...)

Arguments

...

a sequence of numeric, character or logical vectors, all of the same length, or a classed R object.

Details

The function partially sorts the elements so that identical values are adjacent. NA values come last. This is guaranteed to be stable, so ties are preserved, and if the data are already grouped/sorted, the grouping is unchanged. This is useful for aggregation and is particularly fast for character vectors.

Under the covers, the "radix" method of order is used, and the same caveats apply, including restrictions on character encodings and lack of support for long vectors (those with 2^31 or more elements). Real-valued numbers are slightly rounded to account for numerical imprecision.

Like order, for a classed R object the grouping is based on the result of xtfrm.

Value

An object of class "grouping", the representation of which should be considered experimental and subject to change. It is an integer vector with two attributes:

ends

subscripts in the result corresponding to the last member of each group

maxgrpn

the maximum group size

See Also

order, xtfrm.

Examples

(ii <- grouping(x <- c(1,1,3:1,1:4,3), y <- c(9,9:1), z <- c(2,1:9)))
## 6  5  2  1  7  4 10  8  3  9
rbind(x, y, z)[, ii]
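
For aggregation, the "ends" attribute is enough to recover the members of each group. The following is a minimal sketch under the assumption that the current (experimental) representation described under ‘Value’ is in place; the helper names perm, sizes, members and values are illustrative only.

```r
x <- c(3L, 1L, 3L, 2L, 1L)
g <- grouping(x)

perm  <- as.integer(g)           # the permutation itself, attributes dropped
ends  <- attr(g, "ends")         # position in perm of each group's last member
sizes <- diff(c(0L, ends))       # group sizes recovered from the ends

## Original positions of the members of each group, and each group's value
members <- split(perm, rep(seq_along(ends), sizes))
values  <- x[perm[ends]]

## The largest group size agrees with the "maxgrpn" attribute
max(sizes) == attr(g, "maxgrpn")
```

Because the representation is documented as subject to change, code like this is best kept behind a small helper of its own.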

(De)compress I/O Through Connections

Description

gzcon provides a modified connection that wraps an existing connection, and decompresses reads or compresses writes through that connection. Standard gzip headers are assumed.

Usage

gzcon(con, level=6, allowNonCompressed=TRUE, text=FALSE)

Arguments

con

a connection.

level

integer between 0 and 9, the compression level when writing.

allowNonCompressed

logical. When reading, should non-compressed input be allowed?

text

logical. Should the connection be text-oriented? This is distinct from the mode of the connection (must always be binary). If TRUE, pushBack works on the connection, otherwise readBin and friends apply.

Details

If con is open then the modified connection is opened. Closing the wrapper connection will also close the underlying connection.

Reading from a connection which does not supply a gzip magic header is equivalent to reading from the original connection if allowNonCompressed is true, otherwise an error.

Compressed output will contain embedded NUL bytes, and so con is not permitted to be a textConnection opened with open = "w". Use a writable rawConnection to compress data into a variable.

The original connection becomes unusable: any object pointing to it will now refer to the modified connection. For this reason, the new connection needs to be closed explicitly.
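
The two reading behaviours described above can be seen with temporary files. This is a minimal sketch; the file contents and the variable names (f, p, magic, back, plain) are made up for the illustration, and level = 9 is an arbitrary choice.

```r
## Write through gzcon, then check the gzip magic header and read back
f <- tempfile(fileext = ".gz")
out <- gzcon(file(f, "wb"), level = 9)
writeLines(c("alpha", "beta"), out)
close(out)                          # also closes the underlying file connection

magic <- readBin(f, what = "raw", n = 2L)   # first two bytes of the file
inp <- gzcon(file(f, "rb"))
back <- readLines(inp)
close(inp)

## A file without a gzip header is passed through unchanged,
## because allowNonCompressed = TRUE is the default
p <- tempfile()
writeLines(c("plain", "text"), p)
pin <- gzcon(file(p, "rb"))
plain <- readLines(pin)
close(pin)

unlink(c(f, p))
```

Only the wrapper is closed in each case, since the original connection object now refers to it.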

Value

An object inheriting from class "connection". This is the same connection number as supplied, but with a modified internal structure. It has binary mode.

See Also

gzfile

Examples

## Uncompress a data file from a URL
z <- gzcon(url("https://www.stats.ox.ac.uk/pub/datasets/csb/ch12.dat.gz"))
# read.table can only read from a text-mode connection.
raw <- textConnection(readLines(z))
close(z)
dat <- read.table(raw)
close(raw)
dat[1:4, ]

## gzfile and gzcon can inter-work.
## Of course here one would use gzfile, but file() can be replaced by
## any other connection generator.
zzfil <- tempfile(fileext=".gz")
zz <- gzfile(zzfil, "w")
cat("TITLE extra line", "2 3 5 7", "", "11 13 17", file= zz, sep="\n")
close(zz)
readLines(zz <- gzcon(file(zzfil, "rb")))
close(zz)
unlink(zzfil)

zzfil2 <- tempfile(fileext=".gz")
zz <- gzcon(file(zzfil2, "wb"))
cat("TITLE extra line", "2 3 5 7", "", "11 13 17", file= zz, sep="\n")
close(zz)
readLines(zz <- gzfile(zzfil2))
close(zz)
unlink(zzfil2)

Integer Numbers Displayed in Hexadecimal

Description

Integers which are displayed in hexadecimal (short ‘hex’) format, with as many digits as are needed to display the largest, using leading zeroes as necessary.

Arithmetic works as for integers, and non-integer valued mathematical functions typically work by truncating the result to integer.

Usage

as.hexmode(x)

## S3 method for class 'hexmode'
as.character(x, keepStr=FALSE, ...)

## S3 method for class 'hexmode'
format(x, width=NULL, upper.case=FALSE, ...)

## S3 method for class 'hexmode'
print(x, ...)

Arguments

x

an object, for the methods inheriting from class "hexmode".

keepStr

a logical indicating that names and dimensions should be kept; set TRUE for back compatibility, if needed.

width

NULL or a positive integer specifying the minimum field width to be used, with padding by leading zeroes.

upper.case

a logical indicating whether to use upper-case letters or lower-case letters (default).

...

further arguments passed to or from other methods.

Details

Class "hexmode" consists of integer vectors with that class attribute, used primarily to ensure that they are printed in hex. Subsetting ([) works too, as do arithmetic or other mathematical operations, albeit truncated to integer.

as.character(x) drops all attributes (unless keepStr=TRUE, in which case it keeps dim, dimnames and names for back compatibility) and converts each entry individually, hence with no leading zeroes, whereas in format(), when width = NULL (the default), the output is padded with leading zeroes to the smallest width needed for all the non-missing elements.

as.hexmode can convert integers (of type "integer" or "double") and character vectors whose elements contain only 0-9, a-f, A-F (or are NA) to class "hexmode".

There is a ! method and methods for | and &: these recycle their arguments to the length of the longer and then apply the operators bitwise to each element.
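
The effect of the format() arguments and of the bitwise methods can be sketched as follows. The values and the mask are arbitrary choices for this illustration.

```r
x <- as.hexmode(c(10L, 255L))

format(x)                     # padded to a common width
format(x, width = 4)          # minimum field width of 4
format(x, upper.case = TRUE)  # upper-case digits

## Bitwise operators act element by element
mask <- as.hexmode(c("0f", "f0"))
masked <- x & mask
masked

## strtoi() converts hex strings back to integers
strtoi(format(x), base = 16L)
```

The result of the bitwise operation is again of class "hexmode", so it prints in hex.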

See Also

octmode, sprintf for other options in converting integers to hex, strtoi to convert hex strings to integers.

Examples

i <- as.hexmode("7fffffff")
i; class(i)
identical(as.integer(i), .Machine$integer.max)

hm <- as.hexmode(c(NA, 1)); hm
as.integer(hm)

Xm <- as.hexmode(1:16)
Xm # print()s via format()
stopifnot(nchar(format(Xm)) == 2)
Xm[-16] # *no* leading zeroes!
stopifnot(format(Xm[-16]) == c(1:9, letters[1:6]))

## Integer arithmetic (remaining "hexmode"):
16*Xm
Xm^2
-Xm
(fac <- factorial(Xm[1:12])) # 1!, 2!, 3!, 4! .. in hexadecimals
as.integer(fac) # indeed the same as  factorial(1:12)

Hyperbolic Functions

Description

These functions give the obvious hyperbolic functions. They respectively compute the hyperbolic cosine, sine, tangent, and their inverses, arc-cosine, arc-sine, arc-tangent (or ‘area cosine’, etc).

Usage

cosh(x)
sinh(x)
tanh(x)
acosh(x)
asinh(x)
atanh(x)

Arguments

x

a numeric or complex vector

Details

These are internal generic primitive functions: methods can be defined for them individually or via the Math group generic.

Branch cuts are consistent with the inverse trigonometric functions asin et seq, and agree with those defined in Abramowitz & Stegun, figure 4.7, page 86. The behaviour actually on the cuts follows the C99 standard which requires continuity coming round the endpoint in a counter-clockwise direction.

S4 methods

All are S4 generic functions: methods can be defined for them individually or via the Math group generic.

References

Abramowitz, M. and Stegun, I. A. (1972) Handbook of Mathematical Functions. New York: Dover.
Chapter 4. Elementary Transcendental Functions: Logarithmic, Exponential, Circular and Hyperbolic Functions

See Also

The trigonometric functions, cos, sin, tan, and their inverses acos, asin, atan.

The logistic distribution function plogis is a shifted version of tanh() for numeric x.
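
A few standard identities make convenient numerical sanity checks, including the relation to plogis mentioned above, which can be written as plogis(x) = (1 + tanh(x/2))/2. This is a minimal sketch; the grid x is an arbitrary choice and all.equal is used because the results agree only up to floating-point error.

```r
x <- seq(-3, 3, by = 0.5)

## Fundamental identity: cosh(x)^2 - sinh(x)^2 == 1
all.equal(cosh(x)^2 - sinh(x)^2, rep(1, length(x)))

## The inverse functions undo the forward ones
all.equal(asinh(sinh(x)), x)
all.equal(atanh(tanh(x)), x)

## plogis(x) equals (1 + tanh(x / 2)) / 2, i.e. tanh rescaled and shifted
all.equal(stats::plogis(x), (1 + tanh(x / 2)) / 2)
```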


Convert Character Vector between Encodings

Description

This uses system facilities to convert a character vector between encodings: the ‘i’ stands for ‘internationalization’.

Usage

iconv(x, from="", to="", sub=NA, mark=TRUE, toRaw=FALSE)

iconvlist()

Arguments

x

a character vector, or an object to be converted to a character vector by as.character, or a list with NULL and raw elements as returned by iconv(toRaw = TRUE).

from

a character string describing the current encoding.

to

a character string describing the target encoding.

sub

character string. If not NA it is used to replace any non-convertible bytes in the input. (This would normally be a single character, but can be more.) If "byte", the indication is "<xx>" with the hex code of the byte. If "Unicode" and converting from UTF-8, the Unicode point in the form "<U+xxxx>", or if "c99", a C99-style escape "\uxxxx". (For points in a ‘supplementary plane’, "\Uxxxxxxxx" is used, with zero-padding.)

mark

logical, for expert use. Should encodings be marked?

toRaw

logical. Should a list of raw vectors be returned rather than a character vector?

Details

The names of encodings and which ones are available are platform-dependent. All R platforms support "" (for the encoding of the current locale), "latin1" and "UTF-8". Generally case is ignored when specifying an encoding.

On most platforms iconvlist provides an alphabetical list of the supported encodings. On others, the information is on the man page for iconv(5) or elsewhere in the man pages (but beware that the system command iconv may not support the same set of encodings as the C functions R calls). Unfortunately, the names are rarely supported across all platforms.

Elements of x which cannot be converted (perhaps because they are invalid or because they cannot be represented in the target encoding) will be returned as NA (or NULL for toRaw = TRUE) unless sub is specified.

Most versions of iconv will allow transliteration by appending ‘//TRANSLIT’ to the to encoding: see the examples.

Encoding "ASCII" is accepted, and on most systems "C" and "POSIX" are synonyms for ASCII. Where "ASCII//TRANSLIT" is unsupported by the OS, "ASCII" is used with sub = "c99" if from UTF-8, else sub = "?". (However, musl's version of "ASCII" substitutes *.)

Elements of x with a declared encoding (UTF-8 or latin1, see Encoding) are converted from that encoding if from = "", otherwise they are taken as being in the encoding specified by from.

Note that implementations of iconv typically do not do much validity checking and will often mis-convert inputs which are invalid in encoding from.

If sub = "Unicode" or sub = "c99" is used for a non-UTF-8 input it is the same as sub = "byte".

Value

If toRaw = FALSE (the default), the value is a character vector of the same length and the same attributes as x (after conversion to a character vector). If conversion fails for an element that element of the result is set to NA_character_. (NB: whether conversion fails is implementation-specific.) NA_character_ inputs give NA_character_ outputs.

If mark = TRUE (the default) the elements of the result have a declared encoding if to is "latin1" or "UTF-8", or if to = "" and the current locale's encoding is detected as Latin-1 (or its superset CP1252 on Windows) or UTF-8.

If toRaw = TRUE, the value is a list of the same length and the same attributes as x whose elements are either NULL (if conversion fails or the input was NA_character_) or a raw vector.

For iconvlist(), a character vector (typically of a few hundred elements) of known encoding names.
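
The list form returned for toRaw = TRUE, and its use as input to a second conversion, can be sketched as follows. The example assumes the platform's iconv supports the name "UTF-16LE" (as the examples for this function also do); the variable names r and back are illustrative.

```r
x <- c("fran\u00e7ais", NA)

## toRaw = TRUE: a list with one raw vector per element, NULL for NA input
r <- iconv(x, from = "UTF-8", to = "UTF-16LE", toRaw = TRUE)
lengths(r)     # bytes per element; 0 for the NULL entry

## Such a list is accepted as input and converts back to a character vector
back <- iconv(r, from = "UTF-16LE", to = "UTF-8")
back
```

Raw output is the practical choice for encodings such as UTF-16, whose byte sequences contain embedded NUL bytes and so cannot be held in an R character string.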

Implementation Details

There are three main implementations of iconv in use. Linux's most common C runtime, ‘glibc’, contains one. Several platforms supply versions or emulations of GNU ‘libiconv’, including previous versions of macOS and FreeBSD, in some cases with additional encodings. On Windows we use a version of Yukihiro Nakadaira's ‘win_iconv’, which is based on Windows' codepages. (We have added many encoding names for compatibility with other systems.) All three have iconvlist, ignore case in encoding names and support ‘//TRANSLIT’ (but with different results, and for ‘win_iconv’ currently a ‘best fit’ strategy is used except for to = "ASCII").

The macOS 14 implementation is attributed to the ‘Citrus Project’: the Apple headers declare it as ‘compatible’ with GNU ‘libiconv’ 1.11 from 2006. However, it differs in significant ways including using transliteration for conversions which cannot be represented exactly in the target encoding. (It seems this implementation is also used in recent versions of FreeBSD. Earlier versions of macOS used GNU ‘libiconv’ 1.11 and some CRAN builds still do.) For a failing conversion macOS 14 generally translated character(s) to ? but 14.1 gives an error (so an NA result in R).

Most commercial Unixes contain an implementation of iconv but none we have encountered have supported the encoding names we need: the ‘R Installation and Administration’ manual recommended installing GNU ‘libiconv’ on Solaris and AIX.

Some Linux distributions use ‘musl’ as their C runtime. This is less comprehensive than ‘glibc’: it does not support ‘//TRANSLIT’ but does inexact conversions (currently using ‘*’).

There are other implementations, e.g. NetBSD has used one from the Citrus project (which does not support ‘//TRANSLIT’) and there is an older FreeBSD port.

Note that you cannot rely on invalid inputs being detected, especially for to = "ASCII" where some implementations allow 8-bit characters and pass them through unchanged or with transliteration or substitution.

Some of the implementations have interesting extra encodings: for example GNU ‘libiconv’ and macOS 14 allow to = "C99" to use ‘\uxxxx’ escapes (or if needed ‘\Uxxxxxxxx’) for non-ASCII characters.

Byte Order Marks

most commonly known as ‘BOMs’.

Encodings using character units which are more than one byte in size can be written on a file in either big-endian or little-endian order: this applies most commonly to UCS-2, UTF-16 and UTF-32/UCS-4 encodings. Some systems will write the Unicode character U+FEFF at the beginning of a file in these encodings and perhaps also in UTF-8. In that usage the character is known as a BOM, and should be handled during input (see the ‘Encodings’ section under connection: re-encoded connections have some special handling of BOMs). The rest of this section applies when this has not been done so x starts with a BOM.

Implementations will generally interpret a BOM for from given as one of "UCS-2", "UTF-16" and "UTF-32". Implementations differ in how they treat BOMs in x in other from encodings: they may be discarded, returned as character U+FEFF or regarded as invalid.

Note

The most portable name for the ISO 8859-15 encoding, commonly known as ‘Latin 9’, is "iso885915": most platforms support both "latin-9" and "latin9" but GNU ‘libiconv’ does not support the latter. ‘musl’ (as used by Alpine Linux and other lightweight Linux distributions) supports neither, but R remaps there to "iso885915".

Encoding names "utf8", "mac" and "macroman" are not portable. "utf8" is converted to "UTF-8" for from and to by iconv, but not for e.g. fileEncoding arguments. "macintosh" is the official (and most widely supported) name for ‘Mac Roman’ (https://en.wikipedia.org/wiki/Mac_OS_Roman).

Using sub substitutes each non-convertible byte in the input, so when converting from UTF-8 a non-convertible character may be replaced by two or more bytes. Using sub = "c99" or sub = "Unicode" will be clearer.

See Also

localeToCharset, file.

Examples

## In principle, as not all systems have iconvlist
try(utils::head(iconvlist(), n=50))

## Not run:
## convert from Latin-2 to UTF-8: two of the glibc iconv variants.
iconv(x, "ISO_8859-2", "UTF-8")
iconv(x, "LATIN2", "UTF-8")

## End(Not run)

## Both x below are in latin1 and will only display correctly in a
## locale that can represent and display latin1.
x <- "fran\xE7ais"
Encoding(x) <- "latin1"
x
charToRaw(xx <- iconv(x, "latin1", "UTF-8"))
xx

## The results in the comments are those from glibc and GNU libiconv
iconv(x, "latin1", "ASCII")          #   NA
iconv(x, "latin1", "ASCII", "?")     # "fran?ais"
iconv(x, "latin1", "ASCII", "")      # "franais"
iconv(x, "latin1", "ASCII", "byte")  # "fran<e7>ais"
iconv(xx, "UTF-8", "ASCII", "Unicode") # "fran<U+00E7>ais"
iconv(xx, "UTF-8", "ASCII", "c99")   # "fran\\u00e7ais"

## Extracts from old R help files (they are nowadays in UTF-8)
x <- c("Ekstr\xf8m", "J\xf6reskog", "bi\xdfchen Z\xfcrcher")
Encoding(x) <- "latin1"
x
try(iconv(x, "latin1", "ASCII//TRANSLIT")) # platform-dependent
## glibc gives "Ekstroem" "Joreskog" "bisschen Zurcher"
## macOS 14 gives "Ekstrom" "J\"oreskog" "bisschen Z\"urcher"
## musl gives "Ekstr*m" "J*reskog" "bi*chen Z*rcher"
iconv(x, "latin1", "ASCII", sub="byte")
## and for Windows' 'Unicode'
str(xx <- iconv(x, "latin1", "UTF-16LE", toRaw=TRUE))
iconv(xx, "UTF-16LE", "UTF-8")

emoji <- "\U0001f604"
iconv(emoji, , "latin1", sub="Unicode") # "<U+1F604>"
iconv(emoji, , "latin1", sub="c99")

Setup Collation by ICU

Description

Controls the way collation is done by ICU (an optional part of the R build).

Usage

icuSetCollate(...)

icuGetCollate(type = c("actual", "valid"))

Arguments

...

named arguments, see ‘Details’.

type

a character string: either the "actual" locale in use for collation, or the most specific locale which would be "valid". Can be abbreviated.

Details

Optionally, R can be built to collate character strings by ICU (https://icu.unicode.org/). For such systems, icuSetCollate can be used to tune the way collation is done. On other builds calling this function does nothing, with a warning.

Possible arguments are

locale:

A character string such as "da_DK" giving the language and country whose collation rules are to be used. If present, this should be the first argument.

case_first:

"upper", "lower" or "default", asking for upper- or lower-case characters to be sorted first. The default is usually lower-case first, but not in all languages (not under the default settings for Danish, for example).

alternate_handling:

Controls the handling of ‘variable’ characters (mainly punctuation and symbols). Possible values are "non_ignorable" (primary strength) and "shifted" (quaternary strength).

strength:

Which components should be used? Possible values "primary", "secondary", "tertiary" (default), "quaternary" and "identical".

french_collation:

In a French locale the way accents affect collation is from right to left, whereas in most other locales it is from left to right. Possible values "on", "off" and "default".

normalization:

Should strings be normalized? Possible values are "on" and "off" (default). This affects the collation of composite characters.

case_level:

An additional level between secondary and tertiary, used to distinguish large and small Japanese Kana characters. Possible values "on" and "off" (default).

hiragana_quaternary:

Possible values "on" (sort Hiragana first at quaternary level) and "off".

Only the first three are likely to be of interest except to those with a detailed understanding of collation and specialized requirements.

Some special values are accepted for locale:

"none":

ICU is not used for collation: the OS's collation services are used instead.

"ASCII":

ICU is not used for collation: the C function strcmp is used instead, which should sort byte-by-byte in (unsigned) numerical order.

"default":

obtains the locale from the OS as is done at the start of the session (except on Windows). If environment variable R_ICU_LOCALE is set to a non-empty value, its value is used rather than consulting the OS, unless environment variable LC_ALL is set to 'C' (or unset but LC_COLLATE is set to 'C').

"", "root":

the ‘root’ collation: see https://www.unicode.org/reports/tr35/tr35-collation.html#Root_Collation.

For the specifications of ‘real’ ICU locales, see https://unicode-org.github.io/icu/userguide/locale/. Note that ICU does not report that a locale is not supported, but falls back to its idea of ‘best fit’ (which could be rather different and is reported by icuGetCollate("actual"), often "root"). Most English locales fall back to "root" as although e.g. "en_GB" is a valid locale (at least on some platforms), it contains no special rules for collation. Note that "C" is not a supported ICU locale and hence R_ICU_LOCALE should never be set to "C".

Some examples are case_level = "on", strength = "primary" to ignore accent differences and alternate_handling = "shifted" to ignore space and punctuation characters.

Initially ICU will not be used for collation if the OS is set to use the C locale for collation and R_ICU_LOCALE is not set. Once this function is called with a value for locale, ICU will be used until it is called again with locale = "none". ICU will not be used once Sys.setlocale is called with a "C" value for LC_ALL or LC_COLLATE, even if R_ICU_LOCALE is set. ICU will be used again honoring R_ICU_LOCALE once Sys.setlocale is called to set a different collation order. Environment variables LC_ALL (or LC_COLLATE) take precedence over R_ICU_LOCALE if and only if they are set to 'C'. Due to the interaction with other ways of setting the collation order, R_ICU_LOCALE should be used with care and only when needed.

All customizations are reset to the default for the locale if locale is specified: the collation engine is reset if the OS collation locale category is changed by Sys.setlocale.
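
The special value "ASCII" gives the most predictable ordering, which makes it a convenient check that a collation setting has taken effect. This is a minimal sketch guarded by capabilities("ICU"); the input vector and the name bytewise are illustrative, and the final call simply restores the locale-derived default.

```r
x <- c("b", "A", "a", "B")

if (capabilities("ICU")) {
    ## "ASCII" bypasses ICU and compares bytes with strcmp(),
    ## so all upper-case ASCII letters precede the lower-case ones.
    icuSetCollate(locale = "ASCII")
    bytewise <- sort(x)

    ## Back to the locale-derived default; this also resets any customizations.
    icuSetCollate(locale = "default")
    icuGetCollate()
} else {
    bytewise <- NULL   # ICU not built in: icuSetCollate() would only warn
}
```

The guard matters because on builds without ICU the calls would do nothing apart from issuing a warning.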

Value

For icuGetCollate, a character string describing the ICU locale in use (which may be reported as "ICU not in use"). The ‘actual’ locale may be simpler than the requested locale: for example "da" rather than "da_DK": English locales are likely to report "root".

Note

Except on Windows, ICU is used by default wherever it is available. As it works internally in UTF-8, it will be most efficient in UTF-8 locales.

On Windows, R is normally built including ICU, but it will only be used if environment variable R_ICU_LOCALE had been set when R is started or after icuSetCollate is called to select the locale (as ICU and Windows differ in their idea of locale names). Note that icuSetCollate(locale = "default") should work reasonably well, but finds the system default ignoring environment variables such as LC_COLLATE.

See Also

Comparison, sort.

capabilities for whether ICU is available; extSoftVersion for its version.

The ICU user guide chapter on collation (https://unicode-org.github.io/icu/userguide/collation/).

Examples

## These examples depend on having ICU available, and on the locale.
## As we don't know the current settings, we can only reset to the default.
if(capabilities("ICU")) withAutoprint({
    icuGetCollate()
    icuGetCollate("valid")
    x <- c("Aarhus", "aarhus", "safe", "test", "Zoo")
    sort(x)
    icuSetCollate(case_first="upper"); sort(x)
    icuSetCollate(case_first="lower"); sort(x)
    ## Danish collates upper-case-first and with 'aa' as a single letter
    icuSetCollate(locale="da_DK", case_first="default"); sort(x)
    ## Estonian collates Z between S and T
    icuSetCollate(locale="et_EE"); sort(x)
    icuSetCollate(locale="default"); icuGetCollate("valid")
})

Test Objects for Exact Equality

Description

The safe and reliable way to test two objects for being exactly equal. It returns TRUE in this case, FALSE in every other case.

Usage

identical(x, y, num.eq=TRUE, single.NA=TRUE, attrib.as.set=TRUE,
          ignore.bytecode=TRUE, ignore.environment=FALSE,
          ignore.srcref=TRUE, extptr.as.ref=FALSE)

Arguments

x, y

any R objects.

num.eq

logical indicating if (double and complex non-NA) numbers should be compared using == (‘equal’), or by bitwise comparison. The latter (non-default) differentiates between -0 and +0.

single.NA

logical indicating if there is conceptually just one numeric NA and one NaN; single.NA = FALSE differentiates bit patterns.

attrib.as.set

logical indicating if attributes of x and y should be treated as unordered tagged pairlists (“sets”); this currently also applies to slots of S4 objects. It may well be too strict to set attrib.as.set = FALSE.

ignore.bytecode

logical indicating if byte code should be ignored when comparing closures.

ignore.environment

logical indicating if their environments should be ignored when comparing closures.

ignore.srcref

logical indicating if their "srcref" attributes should be ignored when comparing closures.

extptr.as.ref

logical indicating whether external pointer objects should be compared as reference objects and considered identical only if they are the same object in memory. By default, external pointers are considered identical if the addresses they contain are identical.

Details

A call to identical is the way to test exact equality in if and while statements, as well as in logical expressions that use && or ||. In all these applications you need to be assured of getting a single logical value.

Users often use the comparison operators, such as == or !=, in these situations. It looks natural, but it is not what these operators are designed to do in R. They return an object like the arguments. If you expected x and y to be of length 1, but it happened that one of them was not, you will not get a single FALSE. Similarly, if one of the arguments is NA, the result is also NA. In either case, the expression if(x == y).... won't work as expected.
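
The difference between the two kinds of test shows up with very small inputs. A minimal sketch, with x and y chosen arbitrarily for the illustration:

```r
x <- c(1, 2); y <- 1

x == y               # length-two logical: TRUE FALSE
identical(x, y)      # a single FALSE

NA == 1              # NA, which if() cannot use
identical(NA, 1)     # FALSE

## A guarded test that always yields one TRUE or FALSE
if (identical(x, y)) "same" else "different"
```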

The function all.equal is also sometimes used to test equality this way, but was intended for something different: it allows for small differences in numeric results.

The computations in identical are also reliable and usually fast. There should never be an error. The only known way to kill identical is by having an invalid pointer at the C level, generating a memory fault. It will usually find inequality quickly. Checking equality for two large, complicated objects can take longer if the objects are identical or nearly so, but represent completely independent copies. For most applications, however, the computational cost should be negligible.

If single.NA is true, as by default, identical sees NaN as different from NA_real_, but all NaNs are equal (and all NA of the same type are equal).

Character strings (except those in marked encoding "bytes") are regarded as identical even if they are in different marked encodings but would agree when translated to UTF-8. A character string in marked encoding "bytes" is only regarded as identical to a character string in the same encoding and with the same content.

If attrib.as.set is true, as by default, comparison of attributes views them as a set (and not a vector, so order is not tested).

If ignore.bytecode is true (the default), the compiled bytecode of a function (see cmpfun) will be ignored in the comparison. If it is false, functions will compare equal only if they are copies of the same compiled object (or both are uncompiled). To check whether two different compiles are equal, you should compare the results of disassemble().

You almost never want to use identical on datetimes of class "POSIXlt": not only can different times in different time zones represent the same time and time zones have multiple names, but several of the components are optional.
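
The treatment of marked encodings and of attribute order can be checked directly. This is a minimal sketch; the objects a, b, p and q and the attribute names foo and bar are made up for the illustration.

```r
## The same text in two marked encodings
a <- "fran\xe7ais"; Encoding(a) <- "latin1"
b <- enc2utf8(a)
Encoding(c(a, b))    # "latin1" "UTF-8"
identical(a, b)      # TRUE: they agree once translated to UTF-8

## The same attributes attached in a different order
p <- structure(1:3, foo = "a", bar = "b")
q <- structure(1:3, bar = "b", foo = "a")
identical(p, q)                         # TRUE: attributes compared as a set
identical(p, q, attrib.as.set = FALSE)  # FALSE: order now matters
```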

Note that the strictest test for equality is

    identical(x, y,
              num.eq = FALSE, single.NA = FALSE, attrib.as.set = FALSE,
              ignore.bytecode = FALSE, ignore.environment = FALSE,
              ignore.srcref = FALSE, extptr.as.ref = TRUE)

Value

A single logical value, TRUE or FALSE, never NA and never anything other than a single value.

Author(s)

John Chambers and R Core

References

Chambers, J. M. (1998) Programming with Data. A Guide to the S Language. Springer.

See Also

all.equal for descriptions of how two objects differ; Comparison and Logic for elementwise comparisons.

Examples

identical(1, NULL)           ## FALSE -- don't try this with ==
identical(1, 1.)             ## TRUE in R (both are stored as doubles)
identical(1, as.integer(1))  ## FALSE, stored as different types

x <- 1.0; y <- 0.99999999999
## how to test for object equality allowing for numeric fuzz :
(E <- all.equal(x, y))
identical(TRUE, E)
isTRUE(E)  # alternative test
## If all.equal thinks the objects are different, it returns a
## character string, and the above expression evaluates to FALSE

## even for unusual R objects :
identical(.GlobalEnv, environment())

### ------- Pickyness Flags : -----------------------------

## the infamous example:
identical(0., -0.)  # TRUE, i.e. not differentiated
identical(0., -0., num.eq = FALSE)
## similar:
identical(NaN, -NaN)  # TRUE
identical(NaN, -NaN, single.NA = FALSE)  # differ on bit-level

### For functions ("closure"s): ----------------------------------------------
###     ~~~~~~~~~
f <- function(x) x
f
g <- compiler::cmpfun(f)
g
identical(f, g)                          # TRUE, as bytecode is ignored by default
identical(f, g, ignore.bytecode = FALSE) # FALSE: bytecode differs

## GLM families contain several functions, some of which share an environment:
p1 <- poisson(); p2 <- poisson()
identical(p1, p2)                            # FALSE
identical(p1, p2, ignore.environment = TRUE) # TRUE

## in interactive use, the 'keep.source' option is typically true:
op <- options(keep.source = TRUE)  # and so, these have differing "srcref" :
f1 <- function() {}
f2 <- function() {}
identical(f1, f2)                         # ignore.srcref= TRUE : TRUE
identical(f1, f2, ignore.srcref = FALSE)  # FALSE
options(op)  # revert to previous state

Identity Function

Description

A trivial identity function returning its argument.

Usage

identity(x)

Arguments

x

an R object.

See Also

diag creates diagonal matrices, including identity ones.


Conditional Element Selection

Description

ifelse returns a value with the same shape as test which is filled with elements selected from either yes or no depending on whether the element of test is TRUE or FALSE.

Usage

ifelse(test, yes, no)

Arguments

test

an object which can be coerced to logical mode.

yes

return values for true elements of test.

no

return values for false elements of test.

Details

If yes or no are too short, their elements are recycled. yes will be evaluated if and only if any element of test is true, and analogously for no.

Missing values in test give missing values in the result.
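The recycling and missing-value rules can be seen in a short sketch (the input vectors are made up for illustration):

```r
# Length-one yes/no values are recycled along test; NA in test gives NA.
test <- c(TRUE, NA, FALSE, TRUE)
res  <- ifelse(test, "yes", "no")

# Shorter yes/no vectors are recycled to the length of test before
# elements are picked position by position.
short <- ifelse(c(TRUE, TRUE, FALSE, FALSE), 1:2, c(10, 20))
```

Here res has a missing value in its second position, and short takes positions 1 and 2 from the recycled yes and positions 3 and 4 from the recycled no.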

Value

A vector of the same length and attributes (including dimensions and "class") as test and data values from the values of yes or no. The mode of the answer will be coerced from logical to accommodate first any values taken from yes and then any values taken from no.

Warning

The mode of the result may depend on the value of test (see the examples), and the class attribute (see oldClass) of the result is taken from test and may be inappropriate for the values selected from yes and no.

Sometimes it is better to use a construction such as

  (tmp <- yes; tmp[!test] <- no[!test]; tmp)

, possibly extended to handle missing values in test.
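A sketch of why the construction above can be preferable, using made-up Date vectors. Because ifelse takes its attributes from test, the "Date" class of yes and no is lost, whereas subassignment into a copy of yes keeps it:

```r
# Hypothetical inputs for illustration.
test <- c(TRUE, FALSE, TRUE)
yes  <- as.Date("2024-01-01") + 0:2
no   <- as.Date("2000-01-01") + 0:2

via_ifelse <- ifelse(test, yes, no)  # plain numbers: class "Date" is dropped

tmp <- yes
tmp[!test] <- no[!test]              # still of class "Date"
```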

Further note that if(test) yes else no is much more efficient and often much preferable to ifelse(test, yes, no) whenever test is a simple true/false result, i.e., when length(test) == 1.

The srcref attribute of functions is handled specially: if test is a simple true result and yes evaluates to a function with srcref attribute, ifelse returns yes including its attribute (the same applies to a false test and no argument). This functionality is only for backwards compatibility; the form if(test) yes else no should be used whenever yes and no are functions.

References

Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988) The New S Language. Wadsworth & Brooks/Cole.

See Also

if.

Examples

x <- c(6:-4)
sqrt(x)  #- gives warning
sqrt(ifelse(x >= 0, x, NA))  # no warning

## Note: the following also gives the warning !
ifelse(x >= 0, sqrt(x), NA)

## ifelse() strips attributes
## This is important when working with Dates and factors
x <- seq(as.Date("2000-02-29"), as.Date("2004-10-04"), by = "1 month")
## has many "yyyy-mm-29", but a few "yyyy-03-01" in the non-leap years
y <- ifelse(as.POSIXlt(x)$mday == 29, x, NA)
head(y)  # not what you expected ... ==> need restore the class attribute:
class(y) <- class(x)
y
## This is a (not atypical) case where it is better *not* to use ifelse(),
## but rather the more efficient and still clear:
y2 <- x
y2[as.POSIXlt(x)$mday != 29] <- NA
## which gives the same as ifelse()+class() hack:
stopifnot(identical(y2, y))

## example of different return modes (and 'test' alone determining length):
yes <- 1:3
no  <- pi^(1:4)
utils::str( ifelse(NA,    yes, no) )  # logical, length 1
utils::str( ifelse(TRUE,  yes, no) )  # integer, length 1
utils::str( ifelse(FALSE, yes, no) )  # double,  length 1

Integer Vectors

Description

Creates or tests for objects of type "integer".

Usage

integer(length = 0)
as.integer(x, ...)
is.integer(x)

Arguments

length

a non-negative integer specifying the desired length. Double values will be coerced to integer: supplying an argument of length other than one is an error.

x

object to be coerced or tested.

...

further arguments passed to or from other methods.

Details

Integer vectors exist so that data can be passed to C or Fortran code which expects them, and so that (small) integer data can be represented exactly and compactly.

Note that current implementations of R use 32-bit integers for integer vectors, so the range of representable integers is restricted to about $\pm 2 \times 10^9$: doubles can hold much larger integers exactly.
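A brief sketch of the range limit, using only .Machine and basic arithmetic (variable names are arbitrary):

```r
# Largest representable integer: 2^31 - 1.
int_max <- .Machine$integer.max

# Going past it in integer arithmetic gives NA (with a warning,
# suppressed here to keep the sketch quiet).
overflow <- suppressWarnings(int_max + 1L)

# The same computation in double arithmetic is exact.
as_double <- int_max + 1
big <- 2^53 - 1   # a much larger whole number, still held exactly as a double
```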

Value

integer creates an integer vector of the specified length. Each element of the vector is equal to 0.

as.integer attempts to coerce its argument to be of integer type. The answer will be NA unless the coercion succeeds. Real values larger in modulus than the largest integer are coerced to NA (unlike S which gives the most extreme integer of the same sign). Non-integral numeric values are truncated towards zero (i.e., as.integer(x) equals trunc(x) there), and imaginary parts of complex numbers are discarded (with a warning). Character strings containing optional whitespace followed by either a decimal representation or a hexadecimal representation (starting with 0x or 0X) can be converted, as well as any allowed by the platform for real numbers. Like as.vector it strips attributes including names. (To ensure that an object x is of integer type without stripping attributes, use storage.mode(x) <- "integer".)
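The coercion rules above can be illustrated with a few calls. This is a sketch with made-up inputs:

```r
truncated <- as.integer(c(3.9, -3.9))             # truncation towards zero
from_hex  <- as.integer("0x1A")                   # hexadecimal string
from_text <- as.integer("  42")                   # leading whitespace allowed
too_big   <- suppressWarnings(as.integer(2^31))   # out of range gives NA

# as.integer() strips names; storage.mode<- changes the type but keeps them.
x <- c(a = 1, b = 2)
stripped <- as.integer(x)
kept <- x
storage.mode(kept) <- "integer"
```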

is.integer returns TRUE or FALSE depending on whether its argument is of integer type or not, unless it is a factor when it returns FALSE.

Note

is.integer(x) does not test if x contains integer numbers! For that, use round, as in the function is.wholenumber(x) in the examples.

References

Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988) The New S Language. Wadsworth & Brooks/Cole.

See Also

numeric, storage.mode.

round (and ceiling and floor on that help page) to convert to integral values.

Examples

## as.integer() truncates:
x <- pi * c(-1:1, 10)
as.integer(x)

is.integer(1)  # is FALSE !

is.wholenumber <-
    function(x, tol = .Machine$double.eps^0.5)  abs(x - round(x)) < tol
is.wholenumber(1)  # is TRUE
(x <- seq(1, 5, by = 0.5))
is.wholenumber(x)  #-->  TRUE FALSE TRUE ...

Compute Factor Interactions

Description

interaction computes a factor which represents the interaction of the given factors. The result of interaction is always unordered.

Usage

interaction(..., drop=FALSE, sep=".", lex.order=FALSE)

Arguments

...

the factors for which interaction is to be computed, or a single list giving those factors.

drop

if drop is TRUE, unused factor levels are dropped from the result. The default is to retain all factor levels.

sep

string to construct the new level labels by joining the constituent ones.

lex.order

logical indicating if the order of factor concatenation should be lexically ordered.

Value

A factor which represents the interaction of the given factors. The levels are labelled as the levels of the individual factors joined by sep which is . by default.

By default, when lex.order = FALSE, the levels are ordered so the level of the first factor varies fastest, then the second and so on. This is the reverse of lexicographic ordering (which you can get by lex.order = TRUE), and differs from :. (It is done this way for compatibility with S.)
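The two level orderings can be compared directly on a tiny example (the factors f and g are made up for illustration):

```r
f <- factor(c("a", "b"))
g <- factor(c("x", "y"))

default_levels <- levels(interaction(f, g))                    # first factor varies fastest
lex_levels     <- levels(interaction(f, g, lex.order = TRUE))  # lexicographic
colon_levels   <- levels(f:g)                                  # ':' is lexicographic too
```

With these inputs default_levels is "a.x" "b.x" "a.y" "b.y", while lex_levels is "a.x" "a.y" "b.x" "b.y".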

References

Chambers, J. M. and Hastie, T. J. (1992) Statistical Models in S. Wadsworth & Brooks/Cole.

See Also

factor; : where f:g is similar to interaction(f, g, sep = ":") when f and g are factors.

Examples

a <- gl(2, 4, 8)
b <- gl(2, 2, 8, labels = c("ctrl", "treat"))
s <- gl(2, 1, 8, labels = c("M", "F"))
interaction(a, b)
interaction(a, b, s, sep = ":")
stopifnot(identical(a:s,
                    interaction(a, s, sep = ":", lex.order = TRUE)),
          identical(a:s:b,
                    interaction(a, s, b, sep = ":", lex.order = TRUE)))

Is R Running Interactively?

Description

Return TRUE when R is being used interactively and FALSE otherwise.

Usage

interactive()

Details

An interactive R session is one in which it is assumed that there is a human operator to interact with, so for example R can prompt for corrections to incorrect input or ask what to do next or if it is OK to move to the next plot.

GUI consoles will arrange to start R in an interactive session. When R is run in a terminal (via Rterm.exe on Windows), it assumes that it is interactive if ‘stdin’ is connected to a (pseudo-)terminal and not if ‘stdin’ is redirected to a file or pipe. Command-line options --interactive (Unix) and --ess (Windows, Rterm.exe) override the default assumption. (On a Unix-alike, whether the readline command-line editor is used is not overridden by --interactive.)

Embedded uses of R can set a session to be interactive or not.

Internally, whether a session is interactive determines

  • how some errors are handled and reported, e.g. see stop and options("showWarnCalls").

  • whether one of --save, --no-save or --vanilla is required, and if R ever asks whether to save the workspace.

  • the choice of default graphics device launched when needed and by dev.new: see options("device")

  • whether graphics devices ever ask for confirmation of a new page.

In addition, R's own R code makes use of interactive(): for example help, debugger and install.packages do.
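User code can follow the same pattern. The sketch below is an assumption about how one might structure such a check, not code from R itself: the function ask_yes_no and its is_interactive argument are hypothetical names, and the argument exists only so the non-interactive branch can be exercised in a script.

```r
# Hypothetical helper: prompt the user only when a human is present,
# otherwise fall back to a default answer.
ask_yes_no <- function(question, default = FALSE,
                       is_interactive = interactive()) {
  if (!is_interactive) {
    return(default)                    # batch mode: never block on input
  }
  answer <- readline(paste0(question, " [y/n] "))
  tolower(substr(answer, 1L, 1L)) == "y"
}

# In a non-interactive run the default is returned without prompting.
batch_answer <- ask_yes_no("Overwrite file?", default = TRUE,
                           is_interactive = FALSE)
```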

Note

This is a primitive function.

See Also

source, .First

Examples

.First <- function() if(interactive()) x11()

Call an Internal Function

Description

.Internal performs a call to an internal code which is built in to the R interpreter.

Only true R wizards should even consider using this function, and only R developers can add to the list of internal functions.

Usage

.Internal(call)

Arguments

call

a call expression

See Also

.Primitive, .External (the nearest equivalent available to users).


Internal Generic Functions

Description

Many R-internal functions are generic and allow methods to be written for them.

Details

The following primitive and internal functions are generic, i.e., you can write methods for them:

[, [[, $, [<-, [[<-, $<-,

length, length<-, lengths, dimnames, dimnames<-, dim, dim<-, names, names<-, levels<-, @, @<-,

c, unlist, cbind, rbind,

as.character, as.complex, as.double, as.integer, as.logical, as.raw, as.vector, as.call, as.environment, is.array, is.matrix, is.na, anyNA, is.nan, is.finite, is.infinite, is.numeric, nchar, rep, rep.int, rep_len, seq.int (which dispatches methods for "seq"), is.unsorted and xtfrm

In addition, is.name is a synonym for is.symbol and dispatches methods for the latter. Similarly, as.numeric is a synonym for as.double and dispatches methods for the latter, i.e., S3 methods are for as.double, whereas S4 methods are to be written for as.numeric.

Note that all of the group generic functions are also internal/primitive and allow methods to be written for them.

.S3PrimitiveGenerics is a character vector listing the primitives which are internal generic and not group generic, (not only for S3 but also S4). Similarly, the .internalGenerics character vector contains the names of the internal (via .Internal(..)) non-primitive functions which are internally generic.

Note

For efficiency, internal dispatch only occurs on objects, that is those for which is.object returns true.
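As a minimal sketch of internal dispatch, the code below defines an S3 method for the internal generic length. The class name "myStack" and the list layout are hypothetical and chosen only for illustration:

```r
# A hypothetical class whose logical length is the number of stored items,
# not the number of top-level list components.
length.myStack <- function(x) length(x$items)

s <- structure(list(items = list(1, 2, 3), label = "demo"),
               class = "myStack")

n_classed <- length(s)           # is.object(s) is TRUE: the method is used
n_plain   <- length(unclass(s))  # no class attribute: internal default
```

Here n_classed counts the three stored items, while n_plain counts the two components of the underlying list.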

See Also

methods for the methods which are available.


Change the Print Mode to Invisible

Description

Return a (temporarily) invisible copy of an object.

Usage

invisible(x=NULL)

Arguments

x

an arbitrary R object, by default NULL.

Details

This function can be useful when it is desired to have functions return values which can be assigned, but which do not print when they are not assigned.

This is a primitive function.

References

Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988) The New S Language. Wadsworth & Brooks/Cole.

See Also

withVisible, return, function.

Examples

# These functions both return their argument
f1 <- function(x) x
f2 <- function(x) invisible(x)
f1(1)  # prints
f2(1)  # does not

Finite, Infinite and NaN Numbers

Description

is.finite and is.infinite return a vector of the same length as x, indicating which elements are finite (not infinite and not missing) or infinite.

Inf and -Inf are positive and negative infinity whereas NaN means ‘Not a Number’. (These apply to numeric values and real and imaginary parts of complex values but not to values of integer vectors.) Inf and NaN (as well as NA) are reserved words in the R language.

Usage

is.finite(x)
is.infinite(x)
is.nan(x)

Inf
NaN

Arguments

x

R object to be tested: the default methods handle atomic vectors.

Details

is.finite returns a vector of the same length as x the j-th element of which is TRUE if x[j] is finite (i.e., it is not one of the values NA, NaN, Inf or -Inf) and FALSE otherwise. Complex numbers are finite if both the real and imaginary parts are.

is.infinite returns a vector of the same length as x the j-th element of which is TRUE if x[j] is infinite (i.e., equal to one of Inf or -Inf) and FALSE otherwise. This will be false unless x is numeric or complex. Complex numbers are infinite if either the real or the imaginary part is.

is.nan tests if a numeric value is NaN. Do not test equality to NaN, or even use identical, since systems typically have many different NaN values. One of these is used for the numeric missing value NA, and is.nan is false for that value. A complex number is regarded as NaN if either the real or imaginary part is NaN but not NA. All elements of logical, integer and raw vectors are considered not to be NaN.

All three functions accept NULL as input and return a length zero result. The default methods accept character and raw vectors, and return FALSE for all entries. Prior to R version 2.14.0 they accepted all input, returning FALSE for most non-numeric values; cases which are not atomic vectors are now signalled as errors.
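The three predicates can be compared side by side on one vector that contains each special value (the vector itself is a made-up illustration):

```r
x <- c(1, NA, NaN, Inf, -Inf)

fin <- is.finite(x)     # TRUE only for the ordinary number
inf <- is.infinite(x)   # TRUE only for Inf and -Inf
nan <- is.nan(x)        # TRUE only for NaN, not for NA

# Character input is accepted by the default methods and is never finite.
chr <- is.finite(c("1", "a"))

# NULL gives a zero-length result.
nul <- is.nan(NULL)
```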

All three functions are generic: you can write methods to handle specific classes of objects, see InternalMethods.

Value

A logical vector of the same length as x: dim, dimnames and names attributes are preserved.

Note

In R, basically all mathematical functions (including basic Arithmetic), are supposed to work properly with +/- Inf and NaN as input or output.

The basic rule should be that calls and relations with Infs really are statements with a proper mathematical limit.

Computations involving NaN will return NaN or perhaps NA: which of those two is not guaranteed and may depend on the R platform (since compilers may re-order computations).

References

The IEC 60559 standard, also known as the ANSI/IEEE 754 Floating-Point Standard.

https://en.wikipedia.org/wiki/NaN.

D. Goldberg (1991). What Every Computer Scientist Should Know about Floating-Point Arithmetic. ACM Computing Surveys, 23(1), 5–48. doi:10.1145/103162.103163.
Also available at https://docs.oracle.com/cd/E19957-01/806-3568/ncg_goldberg.html.

The C99 function isfinite is used for is.finite.

See Also

NA, ‘Not Available’ which is not a number as well, however usually used for missing values and applies to many modes, not just numeric and complex.

Arithmetic, double.

Examples

pi / 0  ## = Inf a non-zero number divided by zero creates infinity
0 / 0   ## =  NaN

1/0 + 1/0  # Inf
1/0 - 1/0  # NaN

stopifnot(
    1/0 == Inf,
    1/Inf == 0
)
sin(Inf)
cos(Inf)
tan(Inf)

Is an Object of Type (Primitive) Function?

Description

Checks whether its argument is a (primitive) function.

Usage

is.function(x)
is.primitive(x)

Arguments

x

an R object.

Details

is.primitive(x) tests if x is a primitive function, i.e., if typeof(x) is either "builtin" or "special".

Value

TRUE if x is a (primitive) function, and FALSE otherwise.

Examples

is.function(1)               # FALSE
is.function(is.primitive)    # TRUE: it is a function, but ..
is.primitive(is.primitive)   # FALSE: it's not a primitive one, whereas
is.primitive(is.function)    # TRUE: that one *is*

Is an Object a Language Object?

Description

is.language returns TRUE if x is a variable name, a call, or an expression.

Usage

is.language(x)

Arguments

x

object to be tested.

Note

A name is also known as ‘symbol’, from its type (typeof), see is.symbol.

If typeof(x) == "language", then is.language(x) is always true, but the reverse does not hold as expressions or names y also fulfill is.language(y), see the examples.

This is a primitive function.

References

Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988) The New S Language. Wadsworth & Brooks/Cole.

Examples

ll <- list(a = expression(x^2 - 2*x + 1), b = as.name("Jim"),
           c = as.expression(exp(1)), d = call("sin", pi))
sapply(ll, typeof)
sapply(ll, mode)
stopifnot(sapply(ll, is.language))

Is an Object ‘internally classed’?

Description

A function mostly for internal use. It returns TRUE if the object x has the R internal OBJECT bit set, and FALSE otherwise. The OBJECT bit is set when a "class" attribute is added and removed when that attribute is removed, so this is a very efficient way to check if an object has a class attribute. (S4 objects always should.)

Note that typical basic (‘atomic’, see is.atomic) R vectors and arrays x are not objects in the above sense as attributes(x) does not contain "class".
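A short sketch of how the result tracks the "class" attribute. The class name "myClass" is hypothetical:

```r
x <- 1:3
before <- is.object(x)     # plain atomic vector: no class attribute

class(x) <- "myClass"      # adding a class attribute sets the OBJECT bit
during <- is.object(x)

attr(x, "class") <- NULL   # removing the attribute clears it again
after <- is.object(x)

# A matrix has an implicit class but no "class" attribute.
m <- matrix(1:4, 2)
implicit <- is.object(m)
```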

Usage

is.object(x)

Arguments

x

object to be tested.

Note

This is a primitive function.

See Also

class, and methods.

isS4.

Examples

is.object(1)                 # FALSE
is.object(as.factor(1:3))    # TRUE

Is an Object Atomic or Recursive?

Description

is.atomic returns TRUE if x is of an atomic type and FALSE otherwise.

is.recursive returns TRUE if x has a recursive (list-like) structure and FALSE otherwise.

Usage

is.atomic(x)
is.recursive(x)

Arguments

x

object to be tested.

Details

is.atomic is true for the atomic types ("logical", "integer", "numeric", "complex", "character" and "raw").

Most types of objects are regarded as recursive. Exceptions are the atomic types, NULL, symbols (as given by as.name), S4 objects with slots, external pointers, and (rarely visible from R) weak references and byte code, see typeof.

It is common to call the atomic types ‘atomic vectors’, but note that is.vector imposes further restrictions: an object can be atomic but not a vector (in that sense).
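The distinction can be seen on objects that carry attributes other than names. This is a sketch; the attribute name note is made up:

```r
# An integer vector with an extra attribute: atomic, but not a "vector"
# in the sense of is.vector(), which allows no attributes except names.
x <- structure(1:3, note = "hypothetical attribute")
x_atomic <- is.atomic(x)
x_vector <- is.vector(x)

# The same holds for a matrix, because of its dim attribute.
m <- matrix(1:4, 2)
m_atomic <- is.atomic(m)
m_vector <- is.vector(m)

# Names alone do not disqualify an object.
n_vector <- is.vector(c(a = 1, b = 2))
```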

These are primitive functions.

References

Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988) The New S Language. Wadsworth & Brooks/Cole.

See Also

is.list, is.language, etc, and the demo("is.things").

Examples

require(stats)

is.a.r <- function(x) c(is.atomic(x), is.recursive(x))

is.a.r(c(a = 1, b = 3))    # TRUE FALSE
is.a.r(list())             # FALSE TRUE - a list is a list
is.a.r(list(2))            # FALSE TRUE
is.a.r(lm)                 # FALSE TRUE
is.a.r(y ~ x)              # FALSE TRUE
is.a.r(expression(x+1))    # FALSE TRUE
is.a.r(quote(exp))         # FALSE FALSE
is.a.r(NULL)               # FALSE FALSE

Is an Object of Single Precision Type?

Description

is.single reports an error. There are no single precision values in R.

Usage

is.single(x)

Arguments

x

object to be tested.

References

Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988) The New S Language. Wadsworth & Brooks/Cole.


Test if an Object is Not Sorted

Description

Test if an object is not sorted (in increasing order), without the cost of sorting it.

Usage

is.unsorted(x, na.rm=FALSE, strictly=FALSE)

Arguments

x

an R object with a class or a numeric, complex, character, logical or raw vector.

na.rm

logical. Should missing values be removed before checking?

strictly

logical indicating if the check should be for strictly increasing values.

Details

is.unsorted is generic: you can write methods to handle specific classes of objects, see InternalMethods.

Value

A length-one logical value. All objects of length 0 or 1 are sorted. Otherwise, the result will be NA except for atomic vectors and objects with an S3 class (where the >= or > method is used to compare x[i] with x[i-1] for i in 2:length(x)) or with an S4 class where you have to provide a method for is.unsorted().
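A few calls on made-up atomic vectors sketch how the arguments interact:

```r
ties      <- is.unsorted(c(1, 2, 2, 3))                   # ties are allowed
ties_str  <- is.unsorted(c(1, 2, 2, 3), strictly = TRUE)  # ties now count as unsorted
shuffled  <- is.unsorted(c(3, 1, 2))

# A missing value makes the answer NA unless it is removed first.
with_na   <- is.unsorted(c(1, NA, 3))
na_rm     <- is.unsorted(c(1, NA, 3), na.rm = TRUE)

# Objects of length 0 or 1 are always sorted.
empty     <- is.unsorted(numeric(0))
```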

Note

This function is designed for objects with one-dimensional indices, as described above. Data frames, matrices and other arrays may give surprising results.

See Also

sort, order.


Date-time Conversion Functions from Numeric Representations

Description

Convenience wrappers to create date-times from numeric representations.

Usage

ISOdatetime(year, month, day, hour, min, sec, tz = "")
ISOdate(year, month, day, hour = 12, min = 0, sec = 0, tz = "GMT")

Arguments

year, month, day

numerical values to specify a day.

hour, min, sec

numerical values for a time within a day. Fractional seconds are allowed.

tz

a time zone specification to be used for the conversion. "" is the current time zone and "GMT" is UTC. Invalid values are most commonly treated as UTC, on some platforms with a warning.

Details

ISOdatetime and ISOdate are convenience wrappers for strptime that differ only in their defaults and that ISOdate sets UTC as the time zone. For dates without times it would normally be better to use the "Date" class.

The main arguments will be recycled using the usual recycling rules.

Because these make use of strptime, only years in the range 0:9999 are accepted.
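A short sketch of the defaults and of recycling, with arbitrarily chosen dates:

```r
# ISOdate() defaults to 12:00:00 in GMT.
noon <- ISOdate(2024, 2, 29)
noon_text <- format(noon, "%Y-%m-%d %H:%M:%S", tz = "UTC")

# ISOdatetime() needs the time fields; the time zone is given explicitly here.
midnight <- ISOdatetime(2024, 1, 1, 0, 0, 0, tz = "UTC")

# A day that does not exist gives NA rather than an error.
bad <- ISOdate(2023, 2, 29)

# Arguments are recycled: the first day of three consecutive months.
firsts <- ISOdate(2024, 1:3, 1)
```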

Value

An object of class "POSIXct".

See Also

DateTimeClasses for details of the date-time classes; strptime for conversions from character strings.


Test for an S4 object

Description

Tests whether the object is an instance of an S4 class.

Usage

isS4(object)

asS4(object, flag = TRUE, complete = TRUE)
asS3(object, flag = TRUE, complete = TRUE)

Arguments

object

Any R object.

flag

Optional, logical: indicate direction of conversion.

complete

Optional, logical: whether conversion to S3 is completed. Not usually needed, but see the details section.

Details

Note that isS4 does not rely on the methods package, so in particular it can be used to detect the need to require that package.

asS3 uses the value of complete to control whether an attempt is made to transform object into a valid object of the implied S3 class. If complete is TRUE, then an object from an S4 class extending an S3 class will be transformed into an S3 object with the corresponding S3 class (see S3Part). This includes classes extending the pseudo-classes array and matrix: such objects will have their class attribute set to NULL.

isS4 is primitive.

Value

isS4 always returns TRUE or FALSE according to whether the internal flag marking an S4 object has been turned on for this object.

asS4 and asS3 will turn this flag on or off, and asS3 will set the class from the object's .S3Class slot if one exists. Note that asS3 will not turn the object into an S3 object unless there is a valid conversion; that is, an object of type other than "S4" for which the S4 object is an extension, unless argument complete is FALSE.

See Also

is.object for a more general test; Introduction for general information on S4; Classes_Details for more on S4 class definitions.

Examples

isS4(pi)                              # FALSE
isS4(getClass("MethodDefinition"))    # TRUE

Test if a Matrix or other Object is Symmetric (Hermitian)

Description

Generic function to test if object is symmetric or not. Currently only a matrix method is implemented, where a complex matrix Z must be “Hermitian” for isSymmetric(Z) to be true.

Usage

isSymmetric(object, ...)
## S3 method for class 'matrix'
isSymmetric(object, tol = 100 * .Machine$double.eps,
            tol1 = 8 * tol, ...)

Arguments

object

any R object; a matrix for the matrix method.

tol

numeric scalar >= 0. Smaller differences are not considered, see all.equal.numeric.

tol1

numeric scalar >= 0. isSymmetric.matrix() ‘pre-tests’ the first and last few rows for fast detection of ‘obviously’ asymmetric cases with this tolerance. Setting it to length zero will skip the pre-tests.

...

further arguments passed to methods; the matrix method passes these to all.equal. If the row and column names of object are allowed to differ for the symmetry check do use check.attributes = FALSE!

Details

The matrix method is used inside eigen by default to test symmetry of matrices up to rounding error, using all.equal. It might not be appropriate in all situations.

Note that a matrix m is only symmetric if its rownames and colnames are identical. Consider using unname(m).
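The effect of dimnames can be sketched with a small matrix whose values are symmetric but whose row and column names differ (the names are made up):

```r
m <- matrix(c(2, 1, 1, 2), 2,
            dimnames = list(c("a", "b"), c("x", "y")))

with_names    <- isSymmetric(m)                            # names differ
without_names <- isSymmetric(unname(m))                    # values only
ignoring      <- isSymmetric(m, check.attributes = FALSE)  # names not checked

# With identical row and column names the matrix is symmetric again.
dimnames(m) <- list(c("a", "b"), c("a", "b"))
same_names <- isSymmetric(m)
```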

Value

logical indicating if object is symmetric or not.

See Also

eigen which calls isSymmetric when its symmetric argument is missing.

Examples

isSymmetric(D3 <- diag(3))  # -> TRUE

D3[2, 1] <- 1e-100
D3
isSymmetric(D3)           # TRUE
isSymmetric(D3, tol = 0)  # FALSE for zero-tolerance

## Complex Matrices - Hermitian or not
Z <- sqrt(matrix(-1:2 + 0i, 2)); Z <- t(Conj(Z)) %*% Z
Z
isSymmetric(Z)       # TRUE
isSymmetric(Z + 1)   # TRUE
isSymmetric(Z + 1i)  # FALSE -- a Hermitian matrix has a *real* diagonal

colnames(D3) <- c("X", "Y", "Z")
isSymmetric(D3)                            # FALSE (as row and column names differ)
isSymmetric(D3, check.attributes = FALSE)  # TRUE  (as names are not checked)

‘Jitter’ (Add Noise) to Numbers

Description

Add a small amount of noise to a numeric vector.

Usage

jitter(x, factor=1, amount=NULL)

Arguments

x

numeric vector to which jitter should be added.

factor

numeric.

amount

numeric; if positive, used as amount (see below), otherwise, if = 0 the default is factor * z/50.

Default (NULL): factor * d/5 where d is about the smallest difference between x values.

Details

The result, say r, is r <- x + runif(n, -a, a) where n <- length(x) and a is the amount argument (if specified).

Let z <- max(x) - min(x) (assuming the usual case). The amount a to be added is either provided as positive argument amount or otherwise computed from z, as follows:

If amount == 0, we set a <- factor * z/50 (same as S).

If amount is NULL (default), we set a <- factor * d/5 where d is the smallest difference between adjacent unique (apart from fuzz) x values.
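The three ways of choosing a can be checked by bounding the added noise. In this sketch the input x and the seed are arbitrary; for x = c(1, 2, 3) the range is z = 2 and the smallest difference is d = 1:

```r
set.seed(1)   # arbitrary seed, only so the sketch is reproducible
x <- c(1, 2, 3)

r_fixed   <- jitter(x, amount = 0.1)  # a = 0.1, given directly
r_zero    <- jitter(x, amount = 0)    # a = factor * z/50 = 2/50
r_default <- jitter(x)                # a = factor * d/5  = 1/5

# Largest absolute perturbation in each case; each should stay within its a.
noise_fixed   <- max(abs(r_fixed - x))
noise_zero    <- max(abs(r_zero - x))
noise_default <- max(abs(r_default - x))
```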

Value

jitter(x, ...) returns a numeric of the same length as x, but with an amount of noise added in order to break ties.

Author(s)

Werner Stahel and Martin Maechler, ETH Zurich

References

Chambers, J. M., Cleveland, W. S., Kleiner, B. and Tukey, P.A. (1983) Graphical Methods for Data Analysis. Wadsworth; figures 2.8, 4.22, 5.4.

Chambers, J. M. and Hastie, T. J. (1992) Statistical Models in S. Wadsworth & Brooks/Cole.

See Also

rug which you may want to combine with jitter.

Examples

round(jitter(c(rep(1, 3), rep(1.2, 4), rep(3, 3))), 3)
## These two 'fail' with S-plus 3.x:
jitter(rep(0, 7))
jitter(rep(10000, 5))

Compute or Estimate the Condition Number of a Matrix

Description

The condition number of a regular (square) matrix is the product of the norm of the matrix and the norm of its inverse (or pseudo-inverse), and hence depends on the kind of matrix-norm.

kappa() computes by default (an estimate of) the 2-norm condition number of a matrix or of the $R$ matrix of a $QR$ decomposition, perhaps of a linear fit. The 2-norm condition number can be shown to be the ratio of the largest to the smallest non-zero singular value of the matrix.

rcond() computes an approximation of the reciprocal condition number, see the details.

Usage

kappa(z, ...)
## Default S3 method:
kappa(z, exact = FALSE,
      norm = NULL, method = c("qr", "direct"),
      inv_z = solve(z),
      triangular = FALSE, uplo = "U", ...)
## S3 method for class 'lm'
kappa(z, ...)
## S3 method for class 'qr'
kappa(z, ...)

.kappa_tri(z, exact = FALSE, LINPACK = TRUE, norm = NULL, uplo = "U", ...)

rcond(x, norm = c("O", "I", "1"), triangular = FALSE, uplo = "U", ...)

Arguments

z, x

a numeric or complex matrix or a result of qr or a fit from a class inheriting from "lm".

exact

logical. Should the result be exact (up to small rounding error) as opposed to fast (but quite inaccurate)?

norm

character string, specifying the matrix norm with respect to which the condition number is to be computed, see the function norm(). For kappa(), the default is "2", for rcond() it is "O", and for .kappa_tri(), the default depends on exact: if that is true, the default is "2", otherwise "O", meaning the One- or 1-norm. For exact=FALSE, the currently only other possible value is "I" for the infinity norm. For exact=TRUE, norm may be "2", or any of the possible type values in norm(., type = *).

method

a partially matched character string specifying the method to be used; "qr" is the default for back-compatibility, mainly.

inv_z

for exact=TRUE, norm != "2", (an approximation of) solve(z); could be the pseudo inverse or a fast approximate inverse of the matrix z. By default, solve(z) is the most expensive part of the condition computation when exact is true.

triangular

logical. If true, the matrix used is just the upper or lower triangular part of z (or x), depending on uplo.

uplo

character string, either "U" or "L". Used only when triangular = TRUE, indicates if the upper or lower triangular part of the matrix is to be used.

LINPACK

logical. If true and z is not complex, the LINPACK routine dtrco() is called; otherwise the relevant LAPACK routine is.

...

further arguments passed to or from other methods; for kappa.*(), notably LINPACK when norm is not "2".

Details

For kappa(), if exact = FALSE (the default) the condition number is estimated by a cheap approximation to the 1-norm of the triangular matrix $R$ of the qr(x) decomposition $z = QR$. However, the exact 2-norm calculation (via svd) is also likely to be quick enough.
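As a small sketch of the relationship between the exact 2-norm condition number and the singular values, using the same 10 x 2 matrix as in the examples of this page (variable names are arbitrary):

```r
m  <- cbind(1, 1:10)
sv <- svd(m)$d

k_exact <- kappa(m, exact = TRUE)  # exact 2-norm condition number
k_svd   <- max(sv) / min(sv)       # ratio of largest to smallest singular value
k_est   <- kappa(m)                # default: cheap estimate, not exact
```

k_exact and k_svd should agree up to rounding error, whereas k_est is only an approximation and may differ noticeably.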

Note that the approximate 1- and Inf-norm condition numbers via method = "direct" are much faster to calculate, and rcond() computes these reciprocal condition numbers, also for complex matrices, using standard LAPACK routines. Currently, also the kappa*() functions compute these approximations whenever exact is false, i.e., by default.

kappa and rcond are different interfaces to partly identical functionality.

.kappa_tri is an internal function called by kappa.qr and kappa.default; tri is for triangular and its methods only consider the upper or lower triangular part of the matrix, depending on uplo = "U" or "L", where "U" was internally hardwired before R 4.4.0.

Unsuccessful results from the underlying LAPACK code will result in an error giving a positive error code: these can only be interpreted by detailed study of the FORTRAN code.

Value

The condition number, kappa, or an approximation if exact = FALSE.

Author(s)

The design was inspired by (but differs considerably from) the S function of the same name described in Chambers (1992).

Source

The LAPACK routines DTRCON and ZTRCON and the LINPACK routine DTRCO.

LAPACK and LINPACK are from https://netlib.org/lapack/ and https://netlib.org/linpack/ and their guides are listed in the references.

References

Anderson. E. and ten others (1999) LAPACK Users' Guide. Third Edition. SIAM.
Available on-line at https://netlib.org/lapack/lug/lapack_lug.html.

Chambers, J. M. (1992) Linear models. Chapter 4 of Statistical Models in S eds J. M. Chambers and T. J. Hastie, Wadsworth & Brooks/Cole.

Dongarra, J. J., Bunch, J. R., Moler, C. B. and Stewart, G. W. (1978) LINPACK Users Guide. Philadelphia: SIAM Publications.

See Also

norm; svd for the singular value decomposition and qr for the QR one.

Examples

kappa(x1 <- cbind(1, 1:10))    # 15.71
kappa(x1, exact = TRUE)        # 13.68
kappa(x2 <- cbind(x1, 2:11))   # high! [x2 is singular!]

hilbert <- function(n) { i <- 1:n; 1 / outer(i - 1, i, `+`) }
sv9 <- svd(h9 <- hilbert(9))$d
kappa(h9)  # pretty high; by default {exact=FALSE, method="qr"} :
kappa(h9) == kappa(qr.R(qr(h9)), norm = "1")
all.equal(kappa(h9, exact = TRUE),  # its definition:
          max(sv9) / min(sv9),
          tolerance = 1e-12)  ## the same (typically down to 2.22e-16)
kappa(h9, exact = TRUE) / kappa(h9)  # 0.677 (i.e., rel.error = 32%)

## Exact kappa for rectangular matrix
## panmagic.6npm1(7) :
pm7 <- rbind(c( 1, 13, 18, 23, 35, 40, 45),
             c(37, 49,  5, 10, 15, 27, 32),
             c(24, 29, 41, 46,  2, 14, 19),
             c(11, 16, 28, 33, 38, 43,  6),
             c(47,  3,  8, 20, 25, 30, 42),
             c(34, 39, 44,  7, 12, 17, 22),
             c(21, 26, 31, 36, 48,  4,  9))

kappa(pm7, exact = TRUE, norm = "1")  # no problem for square matrix

m76 <- pm7[, 1:6]
(m79 <- cbind(pm7, 50:56, 63:57))

## Moore-Penrose inverse { ~= MASS::ginv(); differing tol (value & meaning)}:
## pinv := p(seudo) inv(erse)
pinv <- function(X, s = svd(X), tol = 64 * .Machine$double.eps) {
    if (is.complex(X))
        s$u <- Conj(s$u)
    dx <- dim(X)
    ## X = U D V' ==> Result =  V {1/D} U'
    pI <- function(u, d, v) tcrossprod(v, u / rep(d, each = dx[1L]))
    pos <- (d <- s$d) > max(tol * max(dx) * d[1L], 0)
    if (all(pos))
        pI(s$u, d, s$v)
    else if (!any(pos))
        array(0, dx[2L:1L])  # all-zero result with transposed dimensions
    else {  # some pos, some not:
        i <- which(pos)
        pI(s$u[, i, drop = FALSE], d[i],
           s$v[, i, drop = FALSE])
    }
}

## rectangular
kappa(m76, norm = "1")
try( kappa(m76, exact = TRUE, norm = "1") )  # error in  solve().. must be square

## ==> use pseudo-inverse instead of solve() for rectangular {and norm != "2"}:
iZ <- pinv(m76)
kappa(m76, exact = TRUE, norm = "1", inv_z = iZ)
kappa(m76, exact = TRUE, norm = "M", inv_z = iZ)
kappa(m76, exact = TRUE, norm = "I", inv_z = iZ)

iX <- pinv(m79)
kappa(m79, exact = TRUE, norm = "1", inv_z = iX)
kappa(m79, exact = TRUE, norm = "M", inv_z = iX)
kappa(m79, exact = TRUE, norm = "I", inv_z = iX)

Kronecker Products on Arrays

Description

Computes the generalised Kronecker product of two arrays, X and Y.

Usage

kronecker(X, Y, FUN = "*", make.dimnames = FALSE, ...)
X %x% Y

Arguments

X

a vector or array.

Y

a vector or array.

FUN

a function; it may be a quoted string.

make.dimnames

logical: provide dimnames that are the product of the dimnames of X and Y.

...

optional arguments to be passed to FUN.

Details

If X and Y do not have the same number of dimensions, the smaller array is padded with dimensions of size one. The returned array comprises submatrices constructed by taking X one term at a time and expanding that term as FUN(x, Y, ...).

%x% is an alias for kronecker (where FUN is hardwired to "*").

Value

An array A with dimensions dim(X) * dim(Y).
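As a small illustrative sketch (the matrices X and Y below are arbitrary choices, not taken from the examples in this page), the dimension rule and the block structure can be checked directly, and FUN can be replaced by another binary function:

```r
X <- matrix(1:4, 2, 2)
Y <- matrix(1:6, 2, 3)

K <- kronecker(X, Y)
dim(K)                    # elementwise product of dim(X) and dim(Y)
K[1:2, 1:3]               # top-left block: X[1, 1] * Y
same <- identical(K, X %x% Y)   # %x% is kronecker() with FUN = "*"

# a generalised product: each block is X[i, j] + Y instead of X[i, j] * Y
S <- kronecker(X, Y, FUN = "+")
```

Here each 2 x 3 block of K is one element of X multiplied by the whole of Y.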

Author(s)

Jonathan Rougier

References

Shayle R. Searle (1982) Matrix Algebra Useful for Statistics. John Wiley and Sons.

See Also

outer, on which kronecker is built, and %*% for usual matrix multiplication.

Examples

# simple scalar multiplication
( M <- matrix(1:6, ncol = 2) )
kronecker(4, M)
# Block diagonal matrix:
kronecker(diag(1, 3), M)

# ask for dimnames

fred <- matrix(1:12, 3, 4, dimnames = list(LETTERS[1:3], LETTERS[4:7]))
bill <- c("happy" = 100, "sad" = 1000)
kronecker(fred, bill, make.dimnames = TRUE)

bill <- outer(bill, c("cat" = 3, "dog" = 4))
kronecker(fred, bill, make.dimnames = TRUE)

Localization Information

Description

Report on localization information.

Usage

l10n_info()

Details

‘A Latin-1 locale’ includes supersets (for printable characters) such as Windows codepage 1252 but not Latin-9 (ISO 8859-15).

On Windows (where the resulting list contains codepage and system.codepage components additionally), common codepages are 1252 (Western European), 1250 (Central European), 1251 (Cyrillic), 1253 (Greek), 1254 (Turkish), 1255 (Hebrew), 1256 (Arabic), 1257 (Baltic), 1258 (Vietnamese), 874 (Thai), 932 (Japanese), 936 (Simplified Chinese), 949 (Korean) and 950 (Traditional Chinese). Codepage 28605 is Latin-9 and 65001 is UTF-8 (where supported). R does not allow the C locale, and uses 1252 as the default codepage.

Value

A list with three logical elements and further OS-specific elements:

MBCS

Is a multi-byte character set in use?

UTF-8

Is this known to be a UTF-8 locale?

Latin-1

Is this known to be a Latin-1 locale?

Not on Windows:

codeset

character. The encoding name as reported by the OS, possibly "". (Added in R 4.1.0. Encoding names are OS-specific.)

Only on Windows:

codepage

integer: the Windows codepage corresponding to the locale R is using (and not necessarily that Windows is using).

system.codepage

integer: the Windows system/ANSI codepage (the codepage Windows is using). Added in R 4.1.0.
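A minimal sketch of how the returned list might be inspected; which OS-specific elements are present depends on the platform, so the code below only relies on the three logical elements:

```r
info <- l10n_info()

# the three logical elements documented above
core <- info[c("MBCS", "UTF-8", "Latin-1")]
str(core)

# element names containing "-" need [[ ]] (or backquotes with $)
if (isTRUE(info[["UTF-8"]])) {
  message("running in a UTF-8 locale")
}

# OS-specific extras, if any (codeset, or codepage/system.codepage on Windows)
extras <- setdiff(names(info), names(core))
```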

See Also

Sys.getlocale, localeconv

Examples

l10n_info()

LAPACK Library

Description

Report the name of the shared object file with the LAPACK implementation in use.

Usage

La_library()

Value

A character vector of length one ("" when the name is not known). The value can be used as an indication of which LAPACK implementation is in use. Typically, the R version of LAPACK will appear as libRlapack.so (libRlapack.dylib), depending on how R was built. Note that libRlapack.so (libRlapack.dylib) may also be shown for an external LAPACK implementation that had been copied, hard-linked or renamed by the system administrator. Otherwise, the shared object file will be given and its path/name may indicate the vendor/version.

The detection does not work on Windows, nor for the Accelerate framework on macOS, nor in the rare (and unsupported) case of a static external library.

It is possible to build R against an enhanced BLAS which contains some but not all LAPACK routines, in which case this function reports the library containing routine ILAVER.

See Also

extSoftVersion for versions of other third-party software including BLAS.

La_version for the version of LAPACK in use.

Examples

La_library()

LAPACK Version

Description

Report the version of LAPACK in use.

Usage

La_version()

Value

A character vector of length one.

Note that this is the version as reported by the library at runtime. It may differ from the reference (‘netlib’) implementation, for example by having some optimized or patched routines. For the version included with R, the older (not Fortran 90) versions of

    DLARTG DLASSQ ZLARTG ZLASSQ

are used.

See Also

extSoftVersion for versions of other third-party software.

La_library for binary/executable file with LAPACK in use.

Examples

La_version()

Find Labels from Object

Description

Find a suitable set of labels from an object for use in printing or plotting, for example. A generic function.

Usage

labels(object,...)

Arguments

object

any R object: the function is generic.

...

further arguments passed to or from other methods.

Value

A character vector or list of such vectors. For a vector the result is the names or seq_along(x), and for a data frame or array it is the dimnames (with NULL expanded to seq_len(d[i])).
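The following sketch illustrates the three cases described above for the default method; the objects v, u and m are made up for the illustration:

```r
v <- c(a = 1, b = 2)                                       # named vector
u <- 1:3                                                   # unnamed vector
m <- matrix(1:4, 2, dimnames = list(NULL, c("x", "y")))    # rows unnamed

lv <- labels(v)   # the names
lu <- labels(u)   # positions 1, 2, 3 stand in for the missing names
lm <- labels(m)   # a list: expanded row positions, then the column names
```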

References

Chambers, J. M. and Hastie, T. J. (1992) Statistical Models in S. Wadsworth & Brooks/Cole.


Apply a Function over a List or Vector

Description

lapply returns a list of the same length as X, each element of which is the result of applying FUN to the corresponding element of X.

sapply is a user-friendly version and wrapper of lapply by default returning a vector, matrix or, if simplify = "array", an array if appropriate, by applying simplify2array(). sapply(x, f, simplify = FALSE, USE.NAMES = FALSE) is the same as lapply(x, f).

vapply is similar to sapply, but has a pre-specified type of return value, so it can be safer (and sometimes faster) to use.

replicate is a wrapper for the common use of sapply for repeated evaluation of an expression (which will usually involve random number generation).

simplify2array() is the utility called from sapply() when simplify is not false and is similarly called from mapply().

Usage

lapply(X, FUN, ...)

sapply(X, FUN, ..., simplify = TRUE, USE.NAMES = TRUE)

vapply(X, FUN, FUN.VALUE, ..., USE.NAMES = TRUE)

replicate(n, expr, simplify = "array")

simplify2array(x, higher = TRUE, except = c(0L, 1L))

Arguments

X

a vector (atomic or list) or an expression object. Other objects (including classed objects) will be coerced by base::as.list.

FUN

the function to be applied to each element of X: see ‘Details’. In the case of functions like +, %*%, the function name must be backquoted or quoted.

...

optional arguments to FUN.

simplify

logical or character string; should the result be simplified to a vector, matrix or higher dimensional array if possible? For sapply it must be named and not abbreviated. The default value, TRUE, returns a vector or matrix if appropriate, whereas if simplify = "array" the result may be an array of “rank” (== length(dim(.))) one higher than the result of FUN(X[[i]]).

USE.NAMES

logical; if TRUE and if X is character, use X as names for the result unless it had names already. Since this argument follows ... its name cannot be abbreviated.

FUN.VALUE

a (generalized) vector; a template for the return value from FUN. See ‘Details’.

n

integer: the number of replications.

expr

the expression (a language object, usually a call) to evaluate repeatedly.

x

a list, typically returned from lapply().

higher

logical; if true, simplify2array() will produce a (“higher rank”) array when appropriate, whereas higher = FALSE would return a matrix (or vector) only. These two cases correspond to sapply(*, simplify = "array") or simplify = TRUE, respectively.

except

integer vector or NULL; the default c(0L, 1L) corresponds to the exceptions used by sapply: a list with elements of common length 0 or 1 is not simplified to an array but is returned, respectively, as is or unlisted. These exceptions can be disabled by specifying only a subset of 0:1, or NULL to always simplify to an array (if possible).

Details

FUN is found by a call to match.fun and typically is specified as a function or a symbol (e.g., a backquoted name) or a character string specifying a function to be searched for from the environment of the call to lapply.

Function FUN must be able to accept as input any of the elements of X. If the latter is an atomic vector, FUN will always be passed a length-one vector of the same type as X.

Arguments in ... cannot have the same name as any of the other arguments, and care may be needed to avoid partial matching to FUN. In general-purpose code it is good practice to name the first two arguments X and FUN if ... is passed through: this both avoids partial matching to FUN and ensures that a sensible error message is given if arguments named X or FUN are passed through ....

Simplification in sapply is only attempted if X has length greater than zero and if the return values from all elements of X are all of the same (positive) length. If the common length is one the result is a vector, and if greater than one is a matrix with a column corresponding to each element of X.
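A short sketch of these simplification rules, using made-up anonymous functions; each call differs only in the lengths of the values FUN returns:

```r
s1 <- sapply(1:3, function(i) i^2)         # common length 1: a vector
s2 <- sapply(1:3, function(i) c(i, i^2))   # common length 2: a matrix, one column per element
s3 <- sapply(1:3, function(i) seq_len(i))  # lengths differ: no simplification, a list
s4 <- sapply(integer(0), function(i) i)    # zero-length X: an empty list
```

Because the shape of the result depends on the data in this way, vapply is often preferred in non-interactive code.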

Simplification is always done in vapply. This function checks that all values of FUN are compatible with the FUN.VALUE, in that they must have the same length and type. (Types may be promoted to a higher type within the ordering logical < integer < double < complex, but not demoted.)
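As an illustration (with a made-up list x), the check accepts integer results where a double template is given, but rejects results of the wrong length:

```r
x <- list(a = 1:3, b = 4:6)

# sum() of an integer vector is an integer; promotion to double is allowed
ok <- vapply(x, sum, FUN.VALUE = numeric(1))

# range() returns two values, so a length-one template is an error
bad <- tryCatch(vapply(x, range, FUN.VALUE = numeric(1)),
                error = function(e) "error")

# demotion is not allowed: double results do not fit an integer template
bad2 <- tryCatch(vapply(x, mean, FUN.VALUE = integer(1)),
                 error = function(e) "error")
```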

Users of S4 classes should pass a list to lapply and vapply: the internal coercion is done by the as.list in the base namespace and not one defined by a user (e.g., by setting S4 methods on the base function).

Value

For lapply, sapply(simplify = FALSE) and replicate(simplify = FALSE), a list.

For sapply(simplify = TRUE) and replicate(simplify = TRUE): if X has length zero or n = 0, an empty list. Otherwise an atomic vector or matrix or list of the same length as X (of length n for replicate). If simplification occurs, the output type is determined from the highest type of the return values in the hierarchy NULL < raw < logical < integer < double < complex < character < list < expression, after coercion of pairlists to lists.

vapply returns a vector or array of type matching the FUN.VALUE. If length(FUN.VALUE) == 1 a vector of the same length as X is returned, otherwise an array. If FUN.VALUE is not an array, the result is a matrix with length(FUN.VALUE) rows and length(X) columns, otherwise an array a with dim(a) == c(dim(FUN.VALUE), length(X)).

The (Dim)names of the array value are taken from the FUN.VALUE if it is named, otherwise from the result of the first function call. Column names of the matrix or more generally the names of the last dimension of the array value or names of the vector value are set from X as in sapply.
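The shape rules for vapply can be checked with a small sketch; the functions and templates below are illustrative choices only:

```r
X <- 1:4

# FUN.VALUE is a named vector of length 2: the result is a 2 x length(X) matrix
m <- vapply(X, function(i) c(min = i, max = i^2),
            FUN.VALUE = c(min = 0, max = 0))

# FUN.VALUE is a 2 x 2 matrix: the result is a 2 x 2 x length(X) array
a <- vapply(X, function(i) diag(i, 2),
            FUN.VALUE = matrix(0, 2, 2))
```

Here the row names of m come from the names of the template, as described in the next paragraph.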

Note

sapply(*, simplify = FALSE, USE.NAMES = FALSE) is equivalent to lapply(*).

For historical reasons, the calls created by lapply are unevaluated, and code has been written (e.g., bquote) that relies on this. This means that the recorded call is always of the form FUN(X[[i]], ...), with i replaced by the current (integer or double) index. This is not normally a problem, but it can be if FUN uses sys.call or match.call or if it is a primitive function that makes use of the call. This means that it is often safer to call primitive functions with a wrapper, so that e.g. lapply(ll, function(x) is.numeric(x)) is required to ensure that method dispatch for is.numeric occurs correctly.

If expr is a function call, be aware of assumptions about where it is evaluated, and in particular what ... might refer to. You can pass additional named arguments to a function call as additional named arguments to replicate: see ‘Examples’.

References

Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988) The New S Language. Wadsworth & Brooks/Cole.

See Also

apply, tapply, mapply for applying a function to multiple arguments, and rapply for a recursive version of lapply(), eapply for applying a function to each entry in an environment.

Examples

require(stats); require(graphics)

x <- list(a = 1:10, beta = exp(-3:3), logic = c(TRUE, FALSE, FALSE, TRUE))
# compute the list mean for each list element
lapply(x, mean)
# median and quartiles for each list element
lapply(x, quantile, probs = 1:3/4)
sapply(x, quantile)
i39 <- sapply(3:9, seq)  # list of vectors
sapply(i39, fivenum)
vapply(i39, fivenum,
       c(Min. = 0, "1st Qu." = 0, Median = 0, "3rd Qu." = 0, Max. = 0))

## sapply(*, "array") -- artificial example
(v <- structure(10*(5:8), names = LETTERS[1:4]))
f2 <- function(x, y) outer(rep(x, length.out = 3), y)
(a2 <- sapply(v, f2, y = 2*(1:5), simplify = "array"))
a.2 <- vapply(v, f2, outer(1:3, 1:5), y = 2*(1:5))
stopifnot(dim(a2) == c(3, 5, 4), all.equal(a2, a.2),
          identical(dimnames(a2), list(NULL, NULL, LETTERS[1:4])))

hist(replicate(100, mean(rexp(10))))

## use of replicate() with parameters:
foo <- function(x = 1, y = 2) c(x, y)
# does not work: bar <- function(n, ...) replicate(n, foo(...))
bar <- function(n, x) replicate(n, foo(x = x))
bar(5, x = 3)

Value of Last Evaluated Expression

Description

The value of the internal evaluation of a top-level R expression is always assigned to .Last.value (in package:base) before further processing (e.g., printing).

Usage

.Last.value

Details

The value of a top-level assignment is put in .Last.value, unlike S.

Do not assign to .Last.value in the workspace, because this will always mask the object of the same name in package:base.

See Also

eval

Examples

## These will not work correctly from example(),
## but they will in make check or if pasted in,
## as example() does not run them at the top level
gamma(1:15)           # think of some intensive calculation...
fac14 <- .Last.value  # keep them

library("splines")  # returns invisibly
.Last.value         # shows what library(.) above returned

Length of an Object

Description

Get or set the length of vectors (including lists) and factors, and of any other R object for which a method has been defined.

Usage

length(x)
length(x) <- value

Arguments

x

an R object. For replacement, a vector or factor.

value

a non-negative integer or double (which will be rounded down).

Details

Both functions are generic: you can write methods to handle specific classes of objects, see InternalMethods. length<- has a "factor" method.

The replacement form can be used to reset the length of a vector. If a vector is shortened, extra values are discarded and when a vector is lengthened, it is padded out to its new length with NAs (nul for raw vectors).
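A brief sketch of the replacement form on three made-up vectors, covering shortening, lengthening and the raw case:

```r
x <- c(a = 1, b = 2, c = 3)
length(x) <- 2     # shortened: the third value (and its name) is discarded

y <- 1:3
length(y) <- 5     # lengthened: padded with NA

r <- as.raw(1:2)
length(r) <- 3     # raw vectors have no NA, so the padding is nul (00)
```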

Both are primitive functions.

Value

The default method for length currently returns a non-negative integer of length 1, except for vectors of more than $2^{31}-1$ elements, when it returns a double.

For vectors (including lists) and factors the length is the number of elements. For an environment it is the number of objects in the environment, and NULL has length 0. For expressions and pairlists (including language objects and dot-dot-dot lists) it is the length of the pairlist chain. All other objects (including functions) have length one: note that for functions this differs from S.
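The cases listed above can be illustrated with a few throwaway objects (the environment e and its contents are made up for the sketch):

```r
n_null <- length(NULL)                  # 0
n_list <- length(list(1, "a", TRUE))    # 3 elements

e <- new.env()
assign("x", 1, envir = e)
assign("y", 2, envir = e)
n_env <- length(e)                      # 2 objects in the environment

n_call <- length(quote(f(x, y)))        # 3: the function name and two arguments
n_fun  <- length(function(x) x)         # 1: functions have length one
```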

The replacement form removes all the attributes of x except its names, which are adjusted (and if necessary extended by "").

Warning

Package authors have written methods that return a result of length other than one (Formula) and that return a vector of type double (Matrix), even with non-integer values (earlier versions of sets). Where a single double value is returned that can be represented as an integer it is returned as a length-one integer vector.

References

Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988) The New S Language. Wadsworth & Brooks/Cole.

See Also

nchar for counting the number of characters in character vectors, lengths for getting the length of every element in a list.

Examples

length(diag(4))           # = 16 (4 x 4)
length(options())         # 12 or more
length(y ~ x1 + x2 + x3)  # 3
length(expression(x, {y <- x^2; y+2}, x^y))  # 3

## from example(warpbreaks)
require(stats)

fm1 <- lm(breaks ~ wool * tension, data = warpbreaks)
length(fm1$call)      # 3, lm() and two arguments.
length(formula(fm1))  # 3, ~ lhs rhs

Lengths of List or Vector Elements

Description

Get the length of each element of a list or atomic vector (is.atomic) as an integer or numeric vector.

Usage

lengths(x, use.names = TRUE)

Arguments

x

a list, list-like such as an expression, NULL or an atomic vector (for which the result is trivial).

use.names

logical indicating if the result should inherit the names from x.

Details

This function loops over x and returns a compatible vector containing the length of each element in x. Effectively, length(x[[i]]) is called for all i, so any methods on length are considered.

lengths is generic: you can write methods to handle specific classes of objects, see InternalMethods.

Value

A non-negative integer of length length(x), except when any element has a length of more than $2^{31}-1$ elements, when it returns a double vector. When use.names is true, the names are taken from the names on x, if any.

Note

One raison d'être of lengths(x) is its use as a more efficient version of sapply(x, length) and similar *apply calls to length. This is the reason why x may be an atomic vector, even though lengths(x) is trivial in that case.
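A small sketch of this equivalence on a made-up list; no timings are shown, only that the two forms agree:

```r
x <- list(a = 1:3, b = letters, c = NULL)

n1 <- lengths(x)                      # one call, names inherited from x
n2 <- sapply(x, length)               # the *apply form it replaces
n3 <- lengths(x, use.names = FALSE)   # the same counts without names

n4 <- lengths(1:4)                    # atomic vector: every element has length 1
```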

See Also

length for getting the length of any R object.

Examples

require(stats)
## summarize by month
l <- split(airquality$Ozone, airquality$Month)
avgOz <- lapply(l, mean, na.rm = TRUE)
## merge result
airquality$avgOz <- rep(unlist(avgOz, use.names = FALSE), lengths(l))
## this is safer and cleaner, but can be slower
airquality$avgOz <- unsplit(avgOz, airquality$Month)

## should always be true, except when a length does not fit in 32 bits
stopifnot(identical(lengths(l), vapply(l, length, integer(1L))))

## empty lists are not a problem
x <- list()
stopifnot(identical(lengths(x), integer()))

## nor are "list-like" expressions:
lengths(expression(u, v, 1 + 0:9))

## and we should dispatch to length methods
f <- c(rep(1, 3), rep(2, 6), 3)
dates <- split(as.POSIXlt(Sys.time() + 1:10), f)
stopifnot(identical(lengths(dates), vapply(dates, length, integer(1L))))

Levels Attributes

Description

levels provides access to the levels attribute of a variable. The first form returns the value of the levels of its argument and the second sets the attribute.

Usage

levels(x)
levels(x) <- value

Arguments

x

an object, for example a factor.

value

a valid value for levels(x). For the default method, NULL or a character vector. For the factor method, a vector of character strings with length at least the number of levels of x, or a named list specifying how to rename the levels.

Details

Both the extractor and replacement forms are generic and new methods can be written for them. The most important method for the replacement function is that for factors.

For the factor replacement method, a NA in value causes that level to be removed from the levels and the elements formerly with that level to be replaced by NA.
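As an illustration of the NA rule (the factor f is made up for the sketch), setting one entry of the levels to NA drops that level and turns the affected elements into missing values:

```r
f <- factor(c("a", "b", "c", "a"))

# replace the level "b" by NA
levels(f)[levels(f) == "b"] <- NA

f           # the former "b" element is now NA
levels(f)   # "b" is no longer a level
```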

Note that for a factor, replacing the levels via levels(x) <- value is not the same as (and is preferred to) attr(x, "levels") <- value.

The replacement function is primitive.

References

Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988) The New S Language. Wadsworth & Brooks/Cole.

See Also

nlevels, relevel, reorder.

Examples

## assign individual levels
x <- gl(2, 4, 8)
levels(x)[1] <- "low"
levels(x)[2] <- "high"
x

## or as a group
y <- gl(2, 4, 8)
levels(y) <- c("low", "high")
y

## combine some levels
z <- gl(3, 2, 12, labels = c("apple", "salad", "orange"))
z
levels(z) <- c("fruit", "veg", "fruit")
z

## same, using a named list
z <- gl(3, 2, 12, labels = c("apple", "salad", "orange"))
z
levels(z) <- list("fruit" = c("apple", "orange"),
                  "veg"   = "salad")
z

## we can add levels this way:
f <- factor(c("a", "b"))
levels(f) <- c("c", "a", "b")
f

f <- factor(c("a", "b"))
levels(f) <- list(C = "C", A = "a", B = "b")
f

Report Version of libcurl

Description

Report version of libcurl in use.

Usage

libcurlVersion()

Value

A character string, with value the libcurl version in use or "" if none is. If libcurl is available, has attributes

ssl_version

A character string naming the SSL/TLS implementation and version, possibly "none". It is intended for the version of OpenSSL used, but not all implementations of libcurl use OpenSSL: for example macOS reports "SecureTransport", its wrapper for SSL/TLS.

libssh_version

A character string naming the libssh version, which may or may not be available (it is used for e.g. scp and sftp protocols). Where present, something like "libssh2/1.5.0".

protocols

A character vector of the names of supported protocols, also known as ‘schemes’ when part of a URL.
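A minimal sketch of reading these attributes; the values differ between installations, so the code only guards on libcurl being available and makes no assumption about what is reported:

```r
v <- libcurlVersion()

if (nzchar(v)) {
  ssl    <- attr(v, "ssl_version")      # SSL/TLS implementation, possibly "none"
  protos <- attr(v, "protocols")        # supported URL schemes
  has_https <- "https" %in% protos      # e.g. check before using an https:// URL
} else {
  message("libcurl is not available in this build of R")
}
```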

Warning

In late 2017 a libcurl installation was seen divided into two libraries, libcurl and libcurl-feature, and the first had been updated but not the second. As the compiled function recording the version was in the latter, the version reported by libcurlVersion was misleading.

See Also

extSoftVersion for versions of other third-party software.

curlGetHeaders, download.file and url for functions which (optionally) use libcurl.

https://curl.se/docs/sslcerts.html and https://curl.se/docs/ssl-compared.html for more details on SSL versions (the current standard being known as TLS). Normally libcurl used with R uses SecureTransport on macOS, OpenSSL on Windows and GnuTLS, NSS or OpenSSL on Unix-alikes. (At the time of writing Debian-based Linuxen use GnuTLS and RedHat-based ones use OpenSSL, having previously used NSS.)

Examples

libcurlVersion()

Search Paths for Packages

Description

.libPaths gets/sets the library trees within which packages are looked for.

Usage

.libPaths(new, include.site = TRUE)

.Library
.Library.site

Arguments

new

a character vector with the locations of R library trees. Tilde expansion (path.expand) is done, and if any element contains one of *?[, globbing is done where supported by the platform: see Sys.glob.

include.site

a logical value indicating whether the value of .Library.site should be included in the new set of library tree locations. Defaulting to TRUE, it is ignored when .libPaths is called without the new argument.

Details

.Library is a character string giving the location of the default library, the ‘library’ subdirectory of R_HOME.

.Library.site is a (possibly empty) character vector giving the locations of the site libraries.

.libPaths is used for getting or setting the library trees that R knows about and hence uses when looking for packages (the library search path). If called with argument new, by default, the library search path is set to the existing directories in unique(c(new, .Library.site, .Library)) and this is returned. If include.site is FALSE when the new argument is set, .Library.site is not added to the new library search path. If called without the new argument, a character vector with the currently active library trees is returned.
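The following sketch shows one way to add a library tree temporarily and then restore the previous search path. The directory tmp is a throwaway created with tempfile() purely for the illustration:

```r
old <- .libPaths()               # remember the current library search path

tmp <- tempfile("lib")           # hypothetical scratch library tree
dir.create(tmp)

.libPaths(tmp)                   # tmp first, then site libraries and .Library
cur <- .libPaths()

# restore: 'old' already holds every tree that was active before,
# so the site libraries need not be added again
.libPaths(old, include.site = FALSE)
restored <- .libPaths()

unlink(tmp, recursive = TRUE)    # tidy up the scratch directory
```

Only directories that exist are kept, which is why tmp is created before the call.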

How paths in new with a trailing slash are treated is OS-dependent. On a POSIX filesystem existing directories can usually be specified with a trailing slash. On Windows filepaths with a trailing slash (or backslash) are invalid and existing directories specified with a trailing slash may not be added to the library search path.

At startup, the library search path is initialized from the environment variables R_LIBS, R_LIBS_USER and R_LIBS_SITE, which if set should give lists of directories where R library trees are rooted, colon-separated on Unix-alike systems and semicolon-separated on Windows. For the latter two, a value of NULL indicates an empty list of directories. (Note that as from R 4.2.0, both are set by R start-up code if not already set or empty so can be interrogated from an R session to find their defaults: in earlier versions this was true only for R_LIBS_USER.)

First, .Library.site is initialized from R_LIBS_SITE. If this is unset or empty, the ‘site-library’ subdirectory of R_HOME is used. Only directories which exist at the time of initialization are retained. Then, .libPaths() is called with the combination of the directories given by R_LIBS and R_LIBS_USER. By default R_LIBS is unset, and if R_LIBS_USER is unset or empty, it is set to directory ‘R/R.version$platform-library/x.y’ of the home directory on Unix-alike systems (or ‘Library/R/m/x.y/library’ for CRAN macOS builds, with m being Sys.info()["machine"]) and the ‘R/win-library/x.y’ subdirectory of LOCALAPPDATA on Windows, for R x.y.z.

Both R_LIBS_USER and R_LIBS_SITE feature possible expansion of specifiers for R-version-specific information as part of the startup process. The possible conversion specifiers all start with a ‘%’ and are followed by a single letter (use ‘%%’ to obtain ‘%’), with currently available conversion specifications as follows:

%V

R version number including the patch level (e.g., ‘2.5.0’).

%v

R version number excluding the patch level (e.g., ‘2.5’).

%p

the platform for which R was built, the value of R.version$platform.

%o

the underlying operating system, the value of R.version$os.

%a

the architecture (CPU) R was built on/for, the value of R.version$arch.

(See version for details on R version information.) In addition, ‘%U’ and ‘%S’ expand to the R defaults for, respectively, R_LIBS_USER and R_LIBS_SITE.
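R performs this expansion itself during startup, so it cannot be invoked from a running session. The sketch below merely emulates it by hand to show what the specifiers stand for; the template string is a hypothetical setting of the kind one might put in an Renviron file, and the sub() calls are an illustration, not R's own mechanism:

```r
V <- as.character(getRversion())   # what %V stands for (version with patch level)
v <- sub("\\.[0-9]+$", "", V)      # what %v stands for (patch level dropped)
p <- R.version$platform            # %p
o <- R.version$os                  # %o
a <- R.version$arch                # %a

# hypothetical template, e.g. R_LIBS_USER=~/R/%p-library/%v
template <- "~/R/%p-library/%v"
expanded <- sub("%v", v, sub("%p", p, template, fixed = TRUE), fixed = TRUE)
expanded
```

From R 4.2.0 the value actually in effect can simply be read with Sys.getenv("R_LIBS_USER").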

Function .libPaths always uses the values of .Library and .Library.site in the base namespace. .Library.site can be set by the site in ‘Rprofile.site’, which should be followed by a call to .libPaths(.libPaths()) to make use of the updated value.

For consistency, the paths are always normalized by normalizePath(winslash = "/").

LOCALAPPDATA (usually C:\Users\username\AppData\Local) on Windows is a hidden directory and may not be viewed by some software. It may be opened by shell.exec(Sys.getenv("LOCALAPPDATA")).

Value

A character vector of file paths.

References

Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988) The New S Language. Wadsworth & Brooks/Cole.

See Also

library

Examples

.libPaths()  # all library trees R knows about

Loading/Attaching and Listing of Packages

Description

library and require load and attach add-on packages.

Usage

library(package, help, pos = 2, lib.loc = NULL,
        character.only = FALSE, logical.return = FALSE,
        warn.conflicts, quietly = FALSE,
        verbose = getOption("verbose"),
        mask.ok, exclude, include.only,
        attach.required = missing(include.only))

require(package, lib.loc = NULL, quietly = FALSE,
        warn.conflicts,
        character.only = FALSE,
        mask.ok, exclude, include.only,
        attach.required = missing(include.only))

conflictRules(pkg, mask.ok = NULL, exclude = NULL)

Arguments

package, help

the name of a package, given as a name or literal character string, or a character string, depending on whether character.only is FALSE (default) or TRUE.

pos

the position on the search list at which to attach the loaded namespace. Can also be the name of a position on the current search list as given by search().

lib.loc

a character vector describing the location of R library trees to search through, or NULL. The default value of NULL corresponds to all libraries currently known to .libPaths(). Non-existent library trees are silently ignored.

character.only

a logical indicating whether package or help can be assumed to be character strings.

logical.return

logical. If it is TRUE, FALSE or TRUE is returned to indicate success.

warn.conflicts

logical. If TRUE, warnings are printed about conflicts from attaching the new package. A conflict is a function masking a function, or a non-function masking a non-function. The default is TRUE unless specified as FALSE in the conflicts.policy option.

verbose

a logical. If TRUE, additional diagnostics are printed.

quietly

a logical. If TRUE, no message confirming package attaching is printed, and most often, no errors/warnings are printed if package attaching fails.

pkg

character string naming a package.

mask.ok

character vector of names of objects that can mask objects on the search path without signaling an error when strict conflict checking is enabled.

exclude, include.only

character vector of names of objects to exclude or include in the attached frame. Only one of these arguments may be used in a call to library or require.

attach.required

logical specifying whether required packages listed in the Depends clause of the DESCRIPTION file should be attached automatically.

Details

library(package) and require(package) both load the namespace of the package with name package and attach it on the search list. require is designed for use inside other functions; it returns FALSE and gives a warning (rather than an error as library() does by default) if the package does not exist. Both functions check and update the list of currently attached packages and do not reload a namespace which is already loaded. (If you want to reload such a package, call detach(unload = TRUE) or unloadNamespace first.) If you want to load a package without attaching it on the search list, see requireNamespace.
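The difference in failure behaviour can be sketched as follows. The package name notAPackage123 is hypothetical and assumed not to be installed; stats is used only because it ships with R:

```r
pkg <- "notAPackage123"   # hypothetical name, assumed not to be installed

# require(): a warning plus FALSE, so the caller can branch on the result
ok <- suppressWarnings(require(pkg, character.only = TRUE, quietly = TRUE))

# library(): an error by default, caught here for the illustration
res <- tryCatch(library(pkg, character.only = TRUE),
                error = function(e) "error")

# load a namespace without attaching it, then use :: to reach its functions
if (requireNamespace("stats", quietly = TRUE)) {
  med <- stats::median(c(1, 5, 9))
}
```

In package code the requireNamespace() form is usually preferable, since it leaves the search list untouched.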

To suppress messages during the loading of packages use suppressPackageStartupMessages: this will suppress all messages from R itself but not necessarily all those from package authors.

If library is called with no package or help argument, it lists all available packages in the libraries specified by lib.loc, and returns the corresponding information in an object of class "libraryIQR". (The structure of this class may change in future versions.) Use .packages(all = TRUE) to obtain just the names of all available packages, and installed.packages() for even more information.

library(help = somename) computes basic information about the package somename, and returns this in an object of class "packageInfo". (The structure of this class may change in future versions.) When used with the default value (NULL) for lib.loc, the attached packages are searched before the libraries.

Value

Normally library returns (invisibly) the list of attached packages, but TRUE or FALSE if logical.return is TRUE. When called as library() it returns an object of class "libraryIQR", and for library(help=), one of class "packageInfo".

require returns (invisibly) a logical indicating whether the required package is available.

Conflicts

Handling of conflicts depends on the setting of the conflicts.policy option. If this option is not set, then conflicts result in warning messages if the argument warn.conflicts is TRUE. If the option is set to the character string "strict", then all unresolved conflicts signal errors. Conflicts can be resolved using the mask.ok, exclude, and include.only arguments to library and require. Defaults for mask.ok and exclude can be specified using conflictRules.

If the conflicts.policy option is set to the string "depends.ok" then conflicts resulting from attaching declared dependencies will not produce errors, but other conflicts will. This is likely to be the best setting for most users wanting some additional protection against unexpected conflicts.

The policy can be tuned further by specifying the conflicts.policy option as a named list with the following fields:

error:

logical; if TRUE treat unresolved conflicts as errors.

warn:

logical; unless FALSE issue a warning message when conflicts are found.

generics.ok:

logical; if TRUE ignore conflicts created by defining S4 generics for functions on the search path.

depends.ok:

logical; if TRUE do not treat conflicts with required packages as errors.

can.mask:

character vector of names of packages that are allowed to be masked. These would typically be base packages attached by default.
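A sketch of setting the policy, first as one of the strings described above and then as a named list built from the fields just listed. The particular field values, including the packages named in can.mask, are illustrative choices rather than recommended settings:

```r
# the string form: declared dependencies may conflict, other conflicts are errors
old <- options(conflicts.policy = "depends.ok")
policy_string <- getOption("conflicts.policy")

# the list form, for finer control
options(conflicts.policy = list(
  error       = TRUE,                 # unresolved conflicts are errors
  warn        = TRUE,                 # and are reported
  generics.ok = TRUE,                 # ignore conflicts from S4 generics
  depends.ok  = TRUE,                 # tolerate conflicts with required packages
  can.mask    = c("base", "stats")    # packages that may be masked
))
policy_list <- getOption("conflicts.policy")

options(old)                          # restore the previous setting
```

The setting only affects packages attached afterwards, so it is typically placed in a startup file.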

Licenses

Some packages have restrictive licenses, and there is a mechanism to allow users to be aware of such licenses. If getOption("checkPackageLicense") == TRUE, then at first use of a namespace of a package with a not-known-to-be-FOSS (see below) license the user is asked to view and accept the license: a list of accepted licenses is stored in file ‘~/.R/licensed’. In a non-interactive session it is an error to use such a package whose license has not already been recorded as accepted.

Free or Open Source Software (FOSS, e.g. https://en.wikipedia.org/wiki/FOSS) packages are determined by the same filters used by available.packages but applied to just the current package, not its dependencies.

There can also be a site-wide file ‘R_HOME/etc/licensed.site’ of packages (one per line).

Formal methods

library takes some further actions when package methods is attached (as it is by default). Packages may define formal generic functions as well as re-defining functions in other packages (notably base) to be generic, and this information is cached whenever such a namespace is loaded after methods and re-defined functions (implicit generics) are excluded from the list of conflicts. The caching and check for conflicts require looking for a pattern of objects; the search may be avoided by defining an object .noGenerics (with any value) in the namespace. Naturally, if the package does have any such methods, this will prevent them from being used.

Note

library and require can only load/attach an installed package, and this is detected by having a ‘DESCRIPTION’ file containing a ‘Built:’ field.

Under Unix-alikes, the code checks that the package was installed under a similar operating system as given by R.version$platform (the canonical name of the platform under which R was compiled), provided it contains compiled code. Packages which do not contain compiled code can be shared between Unix-alikes, but not to other OSes because of potential problems with line endings and OS-specific help files. If sub-architectures are used, the OS similarity is not checked since the OS used to build may differ (e.g. i386-pc-linux-gnu code can be built on an x86_64-unknown-linux-gnu OS).

The package name given to library and require must match the name given in the package's ‘DESCRIPTION’ file exactly, even on case-insensitive file systems such as are common on Windows and macOS.

References

Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988) The New S Language. Wadsworth & Brooks/Cole.

See Also

.libPaths, .packages.

attach, detach, search, objects, autoload, requireNamespace, library.dynam, data, install.packages and installed.packages; INSTALL, REMOVE.

The initial set of packages attached is set by options(defaultPackages=): see also Startup.

Examples

library()                   # list all available packages
library(lib.loc = .Library) # list all packages in the default library
library(help = splines)     # documentation on package 'splines'
library(splines)            # attach package 'splines'
require(splines)            # the same
search()                    # "splines", too
detach("package:splines")

# if the package name is in a character vector, use
pkg <- "splines"
library(pkg, character.only = TRUE)
detach(pos = match(paste("package", pkg, sep = ":"), search()))

require(pkg, character.only = TRUE)
detach(pos = match(paste("package", pkg, sep = ":"), search()))

require(nonexistent)        # FALSE
## Not run:
## if you want to mask as little as possible, use
library(mypkg, pos = "package:base")
## End(Not run)

Loading DLLs from Packages

Description

Load the specified file of compiled code if it has not been loaded already, or unload it.

Usage

library.dynam(chname, package, lib.loc,
              verbose = getOption("verbose"),
              file.ext = .Platform$dynlib.ext, ...)

library.dynam.unload(chname, libpath,
                     verbose = getOption("verbose"),
                     file.ext = .Platform$dynlib.ext)

.dynLibs(new)

Arguments

chname

a character string naming a DLL (also known as a dynamic shared object or library) to load.

package

a character vector with the name of the package.

lib.loc

a character vector describing the location of R library trees to search through.

libpath

the path to the loaded package whose DLL is to be unloaded.

verbose

a logical value indicating whether an announcement is printed on the console before loading the DLL. The default value is taken from the verbose entry in the system options.

file.ext

the extension (including ‘.’ if used) to append to the file name to specify the library to be loaded. This defaults to the appropriate value for the operating system.

...

additional arguments needed by some libraries that are passed to the call to dyn.load to control how the library and its dependencies are loaded.

new

a list of "DLLInfo" objects corresponding to the DLLs loaded by packages. Can be missing.

Details

See dyn.load for what sort of objects these functions handle.

library.dynam is designed to be used inside a package rather than at the command line, and should really only be used inside .onLoad. The system-specific extension for DLLs (e.g., ‘.so’ or ‘.sl’ on Unix-alike systems, ‘.dll’ on Windows) should not be added.

library.dynam.unload is designed for use in .onUnload: it unloads the DLL and updates the value of .dynLibs().

.dynLibs is used for getting (with no argument) or setting the DLLs which are currently loaded by packages (using library.dynam).

Value

If chname is not specified, library.dynam returns an object of class "DLLInfoList" corresponding to the DLLs loaded by packages.

If chname is specified, an object of class "DLLInfo" that identifies the DLL and which can be used in future calls is returned invisibly. Note that the class "DLLInfo" has a method for $ which can be used to resolve native symbols within that DLL.

library.dynam.unload invisibly returns an object of class "DLLInfo" identifying the DLL successfully unloaded.

.dynLibs returns an object of class "DLLInfoList" corresponding to its current value.
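The return values described above can be inspected from an ordinary session. The following is a minimal sketch; which DLLs are listed depends on the packages loaded, and the component name "name" is assumed from the "DLLInfo" structure documented under getLoadedDLLs.

```r
## A sketch: inspect the DLLs registered by packages via library.dynam().
dlls <- library.dynam()        # no 'chname': returns the current list
class(dlls)

## .dynLibs() with no argument returns the same value
identical(dlls, .dynLibs())

## Use [[ rather than $ to read components of a "DLLInfo" object,
## because $ is used to resolve native symbols within the DLL.
vapply(dlls, function(d) d[["name"]], character(1))
```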

Warning

Do not use dyn.unload on a DLL loaded by library.dynam: use library.dynam.unload to ensure that .dynLibs gets updated. Otherwise a subsequent call to library.dynam will be told the object is already loaded.

Note that whether or not it is possible to unload a DLL and then reload a revised version of the same file is OS-dependent: see the ‘Value’ section of the help for dyn.unload.

References

Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988) The New S Language. Wadsworth & Brooks/Cole.

See Also

getLoadedDLLs for information on "DLLInfo" and "DLLInfoList" objects.

.onLoad, library, dyn.load, .packages, .libPaths

SHLIB for how to create suitable DLLs.

Examples

## Which DLLs were dynamically loaded by packages?
library.dynam()

## More on library.dynam.unload() :

require(nlme)
nlme:::.onUnload # shows library.dynam.unload() call
detach("package:nlme")  # by default, unload=FALSE ,  so,
tail(library.dynam(), 2)# nlme still there

## How to unload the DLL ?
## Best is to unload the namespace,  unloadNamespace("nlme")
## If we need to do it separately which should be exceptional:
pd.file <- attr(packageDescription("nlme"), "file")
library.dynam.unload("nlme", libpath = sub("/Meta.*", '', pd.file))
tail(library.dynam(), 2)# 'nlme' is gone now
unloadNamespace("nlme") # now gives warning

The R License Terms

Description

The license terms under which R is distributed.

Usage

license()
licence()

Details

R is distributed under the terms of the GNU GENERAL PUBLIC LICENSE, either Version 2, June 1991 or Version 3, June 2007. A copy of the version 2 license is in file ‘R_HOME/doc/COPYING’ and can be viewed by RShowDoc("COPYING"). Version 3 of the license can be displayed by RShowDoc("GPL-3").

A small number of files (some of the API header files) are distributed under the LESSER GNU GENERAL PUBLIC LICENSE, version 2.1 or later. A copy of this license is in file ‘R_SHARE_DIR/licenses/LGPL-2.1’ and can be viewed by RShowDoc("LGPL-2.1"). Version 3 of the license can be displayed by RShowDoc("LGPL-3").


Lists – Generic and Dotted Pairs

Description

Functions to construct, coerce and check for both kinds of R lists.

Usage

list(...)
pairlist(...)

as.list(x, ...)
## S3 method for class 'environment'
as.list(x, all.names = FALSE, sorted = FALSE, ...)
as.pairlist(x)

is.list(x)
is.pairlist(x)

alist(...)

Arguments

...

objects, possibly named.

x

object to be coerced or tested.

all.names

a logical indicating whether to copy all values or (default) only those whose names do not begin with a dot.

sorted

a logical indicating whether the names of the resulting list should be sorted (increasingly). Note that this is somewhat costly, but may be useful for comparison of environments.

Details

Almost all lists in R internally are Generic Vectors, whereas traditional dotted pair lists (as in LISP) remain available but rarely seen by users (except as formals of functions).

The arguments to list or pairlist are of the form value or tag = value. The functions return a list or dotted pair list composed of its arguments with each value either tagged or untagged, depending on how the argument was specified.

alist handles its arguments as if they described function arguments. So the values are not evaluated, and tagged arguments with no value are allowed whereas list simply ignores them. alist is most often used in conjunction with formals.

as.list attempts to coerce its argument to a list. For functions, this returns the concatenation of the list of formal arguments and the function body. For expressions, the list of constituent elements is returned. as.list is generic, and as the default method calls as.vector(mode = "list") for a non-list, methods for as.vector may be invoked. as.list turns a factor into a list of one-element factors, keeping names. Other attributes may be dropped unless the argument already is a list or expression. (This is inconsistent with functions such as as.character which always drop attributes, and is for efficiency since lists can be expensive to copy.)

is.list returns TRUE if and only if its argument is a list or a pairlist of length > 0. is.pairlist returns TRUE if and only if the argument is a pairlist or NULL (see below).

The "environment" method for as.list copies the name-value pairs (for names not beginning with a dot) from an environment to a named list. The user can request that all named objects are copied. Unless sorted = TRUE, the list is in no particular order (the order depends on the order of creation of objects and whether the environment is hashed). No enclosing environments are searched. (Objects copied are duplicated so this can be an expensive operation.) Note that there is an inverse operation, the as.environment() method for list objects.

An empty pairlist, pairlist() is the same as NULL. This is different from list(): some but not all operations will promote an empty pairlist to an empty list.

as.pairlist is implemented as as.vector(x, "pairlist"), and hence will dispatch methods for the generic function as.vector. Lists are copied element-by-element into a pairlist and the names of the list used as tags for the pairlist: the return value for other types of argument is undocumented.

list, is.list and is.pairlist are primitive functions.
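A few of the points made in the Details can be checked directly. The following is a minimal sketch; the object names are arbitrary.

```r
## list() evaluates its arguments; alist() keeps them unevaluated
l  <- list(x = 1, y = 2 + 3)
al <- alist(x = , y = 2 + 3)
l$y            # the value of 2 + 3
al$y           # the call 2 + 3, not its value
class(al$y)

## an empty pairlist is NULL, which is not a list
is.null(pairlist())
is.pairlist(NULL)
is.list(NULL)
is.list(pairlist(a = 1))

## a function becomes its formal arguments followed by its body
fl <- as.list(function(a, b = 2) a + b)
names(fl)

## a factor becomes a list of one-element factors
as.list(factor(c("u", "v")))
```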

References

Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988) The New S Language. Wadsworth & Brooks/Cole.

See Also

vector("list", length) for creation of a list with empty components; c, for concatenation; formals. unlist is an approximate inverse to as.list().

‘plotmath’ for the use of list in plot annotation.

Examples

require(graphics)

# create a plotting structure
pts <- list(x = cars[,1], y = cars[,2])
plot(pts)

is.pairlist(.Options)  # a user-level pairlist

## "pre-allocate" an empty list of length 5
vector("list", 5)

# Argument lists
f <- function() x
# Note the specification of a "..." argument:
formals(f) <- al <- alist(x = , y = 2+3, ... = )
f
al

## environment->list coercion

e1 <- new.env()
e1$a <- 10
e1$b <- 20
as.list(e1)

List the Files in a Directory/Folder

Description

These functions produce a character vector of the names of files or directories in the named directory.

Usage

list.files(path = ".", pattern = NULL, all.files = FALSE,
           full.names = FALSE, recursive = FALSE,
           ignore.case = FALSE, include.dirs = FALSE, no.. = FALSE)

       dir(path = ".", pattern = NULL, all.files = FALSE,
           full.names = FALSE, recursive = FALSE,
           ignore.case = FALSE, include.dirs = FALSE, no.. = FALSE)

list.dirs(path = ".", full.names = TRUE, recursive = TRUE)

Arguments

path

a character vector of full path names; the default corresponds to the working directory, getwd(). Tilde expansion (see path.expand) is performed. Missing values will be ignored. Elements with a marked encoding will be converted to the native encoding (and if that fails, considered non-existent).

pattern

an optional regular expression. Only file names which match the regular expression will be returned.

all.files

a logical value. If FALSE, only the names of visible files are returned (following Unix-style visibility, that is files whose name does not start with a dot). If TRUE, all file names will be returned.

full.names

a logical value. If TRUE, the directory path is prepended to the file names to give a relative file path. If FALSE, the file names (rather than paths) are returned.

recursive

logical. Should the listing recurse into directories?

ignore.case

logical. Should pattern-matching be case-insensitive?

include.dirs

logical. Should subdirectory names be included in recursive listings? (They always are in non-recursive ones).

no..

logical. Should both "." and ".." be excluded also from non-recursive listings?

Value

A character vector containing the names of the files in the specified directories (empty if there were no files). If a path does not exist or is not a directory or is unreadable it is skipped.

The files are sorted in alphabetical order, on the full path if full.names = TRUE.

list.dirs implicitly has all.files = TRUE, and if recursive = TRUE, the answer includes path itself (provided it is a readable directory).

dir is an alias for list.files.
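How the arguments interact is perhaps easiest to see in a scratch directory. The sketch below builds one under tempdir(); the directory and file names are made up for the illustration.

```r
## A sketch in a scratch directory (all names below are invented for the example)
d <- file.path(tempdir(), "listfiles-demo")
dir.create(file.path(d, "sub"), recursive = TRUE)
file.create(file.path(d, c("a.txt", "b.csv", ".hidden")),
            file.path(d, "sub", "c.txt"))

top        <- list.files(d)                                 # non-recursive: includes directory "sub"
txt        <- list.files(d, pattern = "\\.txt$", recursive = TRUE)
everything <- list.files(d, all.files = TRUE, no.. = TRUE)  # hidden files too, without "." and ".."
dirs       <- list.dirs(d, full.names = FALSE)              # includes an entry for 'd' itself

top
txt
everything
dirs

unlink(d, recursive = TRUE)   # clean up
```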

Note

File naming conventions are platform dependent. The pattern matching works with the case of file names as returned by the OS.

On a POSIX filesystem recursive listings will follow symbolic links to directories.

Author(s)

Ross Ihaka, Brian Ripley

See Also

file.info, file.access and files for many more file handling functions and file.choose for interactive selection.

glob2rx to convert wildcards (as used by system file commands and shells) to regular expressions.

Sys.glob for wildcard expansion on file paths. basename and dirname, useful for splitting paths into non-directory (aka ‘filename’) and directory parts.

Examples

list.files(R.home())
## Only files starting with a-l or r
## Note that a-l is locale-dependent, but using case-insensitive
## matching makes it unambiguous in English locales
dir("../..", pattern = "^[a-lr]", full.names = TRUE, ignore.case = TRUE)

list.dirs(R.home("doc"))
list.dirs(R.home("doc"), full.names = FALSE)

Create Data Frame From List

Description

Create a data frame from a list of variables.

Usage

list2DF(x = list(), nrow = 0)

Arguments

x

A list of same-length variables for the data frame.

nrow

An integer giving the desired number of rows for the data frame in case x gives no variables (i.e., has length zero).

Details

Note that all list elements are taken “as is”.

Value

A data frame with the given variables.
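Because the list elements are taken “as is”, a list element becomes a list column rather than being expanded. A small sketch (the variable names id and tags are arbitrary):

```r
## each list element becomes one column, taken as is
df <- list2DF(list(id = 1:3, tags = list("a", c("b", "c"), character())))
str(df)
class(df$tags)      # still a list, with one entry per row

## no variables, but a requested number of rows
empty <- list2DF(nrow = 3L)
dim(empty)
```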

See Also

data.frame

Examples

## Create a data frame holding a list of character vectors and the
## corresponding lengths:
x <- list(character(), "A", c("B", "C"))
n <- lengths(x)
list2DF(list(x = x, n = n))

## Create data frames with no variables and the desired number of rows:
list2DF()
list2DF(nrow = 3L)

From A List, Build or Add To an Environment

Description

From a named list x, create an environment containing all list components as objects, or “multi-assign” from x into a pre-existing environment.

Usage

list2env(x, envir = NULL, parent = parent.frame(),
         hash = (length(x) > 100), size = max(29L, length(x)))

Arguments

x

a list, where names(x) must not contain empty ("") elements.

envir

an environment or NULL.

parent

(for the case envir = NULL): a parent frame aka enclosing environment, see new.env.

hash

(for the case envir = NULL): logical indicating if the created environment should use hashing, see new.env.

size

(in the case envir = NULL, hash = TRUE): hash size, see new.env.

Details

This will be very slow for large inputs unless hashing is used on the environment.

Environments must have uniquely named entries, but named lists need not: where the list has duplicate names it is the last element with the name that is used. Empty names throw an error.

Value

An environment, either newly created (as by new.env) if the envir argument was NULL, otherwise the updated environment envir. Since environments are never duplicated, the argument envir is also changed.
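The treatment of duplicate names and the in-place update of envir can be seen in a short sketch (object names are arbitrary):

```r
## with duplicated names the last element with that name is used
e <- list2env(list(a = 1, b = "x", a = 2))
e$a
sort(ls(e))

## "multi-assign" into an existing environment: 'target' itself is modified
target <- new.env()
res <- list2env(list(n = 10L, msg = "hi"), envir = target)
identical(res, target)   # the same environment is returned
target$n
```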

Author(s)

Martin Maechler

See Also

environment, new.env, as.environment; further, assign.

The (semantical) “inverse”: as.list.environment.

Examples

L <- list(a = 1, b = 2:4, p = pi, ff = gl(3, 4, labels = LETTERS[1:3]))
e <- list2env(L)
ls(e)
stopifnot(ls(e) == sort(names(L)),
          identical(L$b, e$b)) # "$" working for environments as for lists

## consistency, when we do the inverse:
ll <- as.list(e)  # -> dispatching to the as.list.environment() method
rbind(names(L), names(ll)) # not in the same order, typically,
                           # but the same content:
stopifnot(identical(L [sort.list(names(L ))],
                    ll[sort.list(names(ll))]))

## now add to e -- can be seen as a fast "multi-assign":
list2env(list(abc = LETTERS, note = "just an example",
              df = data.frame(x = rnorm(20), y = rbinom(20, 1, prob = 0.2))),
         envir = e)
utils::ls.str(e)

Reload Saved Datasets

Description

Reload datasets written with the function save.

Usage

load(file, envir = parent.frame(), verbose = FALSE)

Arguments

file

a (readable binary-mode) connection or a character string giving the name of the file to load (when tilde expansion is done).

envir

the environment where the data should be loaded.

verbose

should item names be printed during loading?

Details

load can load R objects saved in the current or any earlier format. It can read a compressed file (see save) directly from a file or from a suitable connection (including a call to url).

A not-open connection will be opened in mode "rb" and closed after use. Any connection other than a gzfile or gzcon connection will be wrapped in gzcon to allow compressed saves to be handled: note that this leaves the connection in an altered state (in particular, binary-only), and that it needs to be closed explicitly (it will not be garbage-collected).

Only R objects saved in the current format (used since R 1.4.0) can be read from a connection. If no input is available on a connection a warning will be given, but any input not in the current format will result in an error.

Loading from an earlier version will give a warning about the ‘magic number’: magic numbers 1971:1977 are from R < 0.99.0, and RD[ABX]1 from R 0.99.0 to R 1.3.1. These are all obsolete, and you are strongly recommended to re-save such files in a current format.

The verbose argument is mainly intended for debugging. If it is TRUE, then as objects from the file are loaded, their names will be printed to the console. If verbose is set to an integer value greater than one, additional names corresponding to attributes and other parts of individual objects will also be printed. Larger values will print names to a greater depth.

Objects can be saved with references to namespaces, usually as part of the environment of a function or formula. Such objects can be loaded even if the namespace is not available: it is replaced by a reference to the global environment with a warning. The warning identifies the first object with such a reference (but there may be more than one).

Value

A character vector of the names of objects created, invisibly.

Warning

Saved R objects are binary files, even those saved with ascii = TRUE, so ensure that they are transferred without conversion of end of line markers. load tries to detect such a conversion and gives an informative error message.

load(file) replaces all existing objects with the same names in the current environment (typically your workspace, .GlobalEnv) and hence potentially overwrites important data. It is considerably safer to use envir = to load into a different environment, or to attach(file) which load()s into a new entry in the search path.
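The safer pattern of loading into a separate environment might look like the following sketch, which uses a temporary file and arbitrary object names.

```r
f <- tempfile(fileext = ".rda")
xx <- 1:3
save(xx, file = f)

xx <- "precious"                    # something we do not want overwritten
env <- new.env()
loaded <- load(f, envir = env)      # names of the objects created
loaded
env$xx                              # the saved value
xx                                  # untouched in the calling environment

unlink(f)                           # clean up
```

The saved objects can then be inspected in env, or copied selectively, before anything in the workspace is replaced.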

See Also

save, download.file; further attach as wrapper for load().

For other interfaces to the underlying serialization format, see unserialize and readRDS.

Examples

## save all data
xx <- pi # to ensure there is some data
save(list = ls(all.names = TRUE), file = "all.rda")
rm(xx)

## restore the saved values to the current environment
local({
   load("all.rda")
   ls()
})

xx <- exp(1:3)
## restore the saved values to the user's workspace
load("all.rda") ## which is here *equivalent* to
## load("all.rda", .GlobalEnv)
## This however annihilates all objects in .GlobalEnv with the same names !
xx # no longer exp(1:3)
rm(xx)
attach("all.rda") # safer and will warn about masked objects w/ same name in .GlobalEnv
ls(pos = 2)
##  also typically need to cleanup the search path:
detach("file:all.rda")

## clean up (the example):
unlink("all.rda")

## Not run:
con <- url("http://some.where.net/R/data/example.rda")
## print the value to see what objects were created.
print(load(con))
close(con) # url() always opens the connection
## End(Not run)

Query or Set Aspects of the Locale

Description

Get details of or set aspects of the locale for the R process.

Usage

Sys.getlocale(category = "LC_ALL")
Sys.setlocale(category = "LC_ALL", locale = "")

.LC.categories

Arguments

category

character string. The following categories should always be supported: "LC_ALL", "LC_COLLATE", "LC_CTYPE", "LC_MONETARY", "LC_NUMERIC" and "LC_TIME". Some systems (not Windows) will also support "LC_MESSAGES", "LC_PAPER" and "LC_MEASUREMENT". These category names are available in .LC.categories; even when not supported, Sys.getlocale(.) will return "", e.g., for the "LC_PAPER" example on Windows.

locale

character string. A valid locale name on the system in use. Normally "" (the default) will pick up the default locale for the system.

Details

The locale describes aspects of the internationalization of a program. Initially most aspects of the locale of R are set to "C" (which is the default for the C language and reflects North-American usage – also known as "POSIX"). R sets "LC_CTYPE" and "LC_COLLATE", which allow the use of a different character set and alphabetic comparisons in that character set (including the use of sort), "LC_MONETARY" (for use by Sys.localeconv) and "LC_TIME" may affect the behaviour of as.POSIXlt and strptime and functions which use them (but not date).

The first seven categories described here are those specified by POSIX. "LC_MESSAGES" will be "C" on systems that do not support message translation, and is not supported on Windows, where you must use the LANGUAGE environment variable for message translation, see below and the Sys.setLanguage() utility. Trying to use an unsupported category is an error for Sys.setlocale.

Note that setting category "LC_ALL" sets only categories "LC_COLLATE", "LC_CTYPE", "LC_MONETARY" and "LC_TIME".

Attempts to set an invalid locale are ignored. There may or may not be a warning, depending on the OS.

Attempts to change the character set (by Sys.setlocale("LC_CTYPE", ), if that implies a different character set) during a session may not work and are likely to lead to some confusion.

Note that the LANGUAGE environment variable has precedence over "LC_MESSAGES" in selecting the language for message translation on most R platforms.

On platforms where ICU is used for collation the locale used for collation can be reset by icuSetCollate. Except on Windows, the initial setting is taken from the "LC_COLLATE" category, and it is reset when this is changed by a call to Sys.setlocale.

Value

A character string of length one describing the locale in use (after setting for Sys.setlocale), or an empty character string if the current locale settings are invalid or NULL if locale information is unavailable.

For category = "LC_ALL" the details of the string are system-specific: it might be a single locale name or a set of locale names separated by "/" (macOS) or ";" (Windows, Linux). For portability, it is best to query categories individually: it is not necessarily the case that the result of foo <- Sys.getlocale() can be used in Sys.setlocale("LC_ALL", locale = foo).
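Following the portability advice above, one way to work with locales is to query and restore categories one at a time. This is only a sketch: the choice of categories and of the "C" locale for the temporary switch is an assumption made for the example.

```r
## query individual categories rather than parsing the "LC_ALL" string
cats <- c("LC_COLLATE", "LC_CTYPE", "LC_MONETARY", "LC_TIME")
current <- vapply(cats, Sys.getlocale, character(1))
current

## temporarily switch one category and restore it afterwards
old_time <- Sys.getlocale("LC_TIME")
Sys.setlocale("LC_TIME", "C")
in_c <- format(as.Date("2001-01-01"), "%A %d %B %Y")   # English day and month names
in_c
Sys.setlocale("LC_TIME", old_time)
```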

Available locales

On most Unix-alikes the POSIX shell command locale -a will list the ‘available public’ locales. What that means is platform-dependent. On recent Linuxen this may mean ‘available to be installed’ as on some RPM-based systems the locale data is in separate RPMs. On Debian/Ubuntu the set of available locales is managed by OS-specific facilities such as locale-gen and locale -a lists those currently enabled.

For Windows, Microsoft moves its documentation frequently so a Web search is the best way to find current information. From R 4.2, UCRT locale names should be used. The character set should match the system/ANSI codepage (l10n_info()$codepage be the same as l10n_info()$system.codepage). Setting it to any other value results in a warning and may cause encoding problems. As from R 4.2 on recent Windows the system codepage is 65001 and one should always use locale names ending with ".UTF-8" (except for "C" and ""), otherwise Windows may add a different character set.

Warning

Setting "LC_NUMERIC" to any value other than "C" may cause R to function anomalously, so gives a warning. Input conversions in R itself are unaffected, but the reading and writing of ASCII save files will be, as may packages which do their own input/output.

Setting it temporarily on a Unix-alike to produce graphical or text output may work well enough, but options(OutDec) is often preferable.

Almost all the output routines used by R itself under Windows ignore the setting of "LC_NUMERIC" since they make use of the Trio library which is not internationalized.

Note

Changing the values of locale categories whilst R is running ought to be noticed by the OS services, and usually is but exceptions have been seen (usually in collation services).

Do not use the value of Sys.getlocale("LC_CTYPE") to attempt to find the character set – for example UTF-8 locales can have suffix ‘.UTF-8’ or ‘.utf8’ (more common on Linux than ‘UTF-8’) or none (as on macOS) and Latin-9 locales can have suffix ‘ISO8859-15’, ‘iso885915’, ‘iso885915@euro’ or ‘ISO8859-15@euro’. Use l10n_info instead.

See Also

strptime for uses of category = "LC_TIME". Sys.localeconv for details of numerical and monetary representations.

l10n_info gives some summary facts about the locale and its encoding (including if it is UTF-8).

The ‘R Installation and Administration’ manual for background on locales and how to find out locale names on your system.

Examples

Sys.getlocale()
## Date-time  related :
Sys.getlocale("LC_TIME") -> olcT
then <- as.POSIXlt("2001-01-01 01:01:01", tz = "UTC")
## Not run:
c(m = months(then), wd = weekdays(then)) # locale specific
Sys.setlocale("LC_TIME", "de")     # Solaris: details are OS-dependent
Sys.setlocale("LC_TIME", "de_DE")  # Many Unix-alikes
Sys.setlocale("LC_TIME", "de_DE.UTF-8")  # Linux, macOS, other Unix-alikes
Sys.setlocale("LC_TIME", "de_DE.utf8")   # some Linux versions
Sys.setlocale("LC_TIME", "German.UTF-8") # Windows
Sys.getlocale("LC_TIME") # the last one successfully set above
c(m = months(then), wd = weekdays(then)) # in C_TIME locale 'cT' ; typically German
## End(Not run)
Sys.setlocale("LC_TIME", "C")
c(m = months(then), wd = weekdays(then)) # "standard" (still platform specific ?)
Sys.setlocale("LC_TIME", olcT)           # reset to previous

## Other locales
Sys.getlocale("LC_PAPER")          # may or may not be set
.LC.categories # of length 9 on all platforms

## Not run:
Sys.setlocale("LC_COLLATE", "C")   # turn off locale-specific sorting,
                                   # usually (but not on all platforms)
Sys.setenv("LANGUAGE" = "es") # set the language for error/warning messages
## End(Not run)

## some nice formatting; should work on most platforms,
## macOS does not name the entries.
sep <- switch(Sys.info()[["sysname"]],
              "Darwin"=, "SunOS" = "/",
              "Linux" =, "Windows" = ";")
##' show a "full" Sys.getlocale() nicely:
showL <- function(loc) {
    sl <- strsplit(strsplit(loc, sep)[[1L]], "=")
    if(all(sapply(sl, length) == 2L))
        setNames(sapply(sl, `[[`, 2L), sapply(sl, `[[`, 1L))
    else
        setNames(as.character(sl), .LC.categories[1+seq_along(sl)])
}
print.Dlist(lloc <- showL(Sys.getlocale()))
## R-supported ones (but LC_ALL):
lloc[.LC.categories[-1]]

Logarithms and Exponentials

Description

log computes logarithms, by default natural logarithms, log10 computes common (i.e., base 10) logarithms, and log2 computes binary (i.e., base 2) logarithms. The general form log(x, base) computes logarithms with base base.

log1p(x) computes $\log(1+x)$ accurately also for $|x| \ll 1$.

exp computes the exponential function.

expm1(x) computes $\exp(x) - 1$ accurately also for $|x| \ll 1$.

Usage

log(x, base = exp(1))
logb(x, base = exp(1))
log10(x)
log2(x)

log1p(x)

exp(x)
expm1(x)

Arguments

x

a numeric or complex vector.

base

a positive or complex number: the base with respect to which logarithms are computed. Defaults to $e$ = exp(1).

Details

All except logb are generic functions: methods can be defined for them individually or via the Math group generic.

log10 and log2 are only convenience wrappers, but logs to bases 10 and 2 (whether computed via log or the wrappers) will be computed more efficiently and accurately where supported by the OS. Methods can be set for them individually (and otherwise methods for log will be used).

logb is a wrapper for log for compatibility with S. If (S3 or S4) methods are set for log they will be dispatched. Do not set S4 methods on logb itself.

All except log are primitive functions.

Value

A vector of the same length as x containing the transformed values. log(0) gives -Inf, and log(x) for negative values of x is NaN. exp(-Inf) is 0.

For complex inputs to the log functions, the value is a complex number with imaginary part in the range $[-\pi, \pi]$: which end of the range is used might be platform-specific.
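The reason for log1p and expm1 is that $1 + x$ rounds to exactly 1 in double precision when $|x|$ is below about $10^{-16}$, so the naive forms lose the whole result. A short sketch, with the value of x chosen only for illustration:

```r
x <- 1e-20
log(1 + x)     # 0: 1 + x rounds to 1 in double precision
log1p(x)       # close to x
exp(x) - 1     # 0, for the same reason
expm1(x)       # close to x

log(8, base = 2)            # same value as log2(8)
log(0)                      # -Inf
suppressWarnings(log(-1))   # NaN (a warning is given if not suppressed)
log(-1 + 0i)                # complex result; imaginary part of magnitude pi
```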

S4 methods

exp, expm1, log, log10, log2 and log1p are S4 generic and are members of the Math group generic.

Note that this means that the S4 generic for log has a signature with only one argument, x, but that base can be passed to methods (but will not be used for method selection). On the other hand, if you only set a method for the Math group generic then the base argument of log will be ignored for your class.

Source

log1p and expm1 may be taken from the operating system, but if not available there then they are based on the Fortran subroutine dlnrel by W. Fullerton of Los Alamos Scientific Laboratory (see https://netlib.org/slatec/fnlib/dlnrel.f) and (for small x) a single Newton step for the solution of log1p(y) = x respectively.

References

Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988) The New S Language. Wadsworth & Brooks/Cole. (for log, log10 and exp.)

Chambers, J. M. (1998) Programming with Data. A Guide to the S Language. Springer. (for logb.)

See Also

Trig, sqrt, Arithmetic.

Examples

log(exp(3))
log10(1e7) # = 7

x <- 10^-(1+2*1:9)
cbind(deparse.level=2, # to get nice column names
      x, log(1+x), log1p(x), exp(x)-1, expm1(x))

Logical Operators

Description

These operators act on raw, logical and number-like vectors.

Usage

! x
x & y
x && y
x | y
x || y
xor(x, y)

isTRUE(x)
isFALSE(x)

Arguments

x,y

raw, logical or ‘number-like’ vectors (i.e., of types double (class numeric), integer and complex), or objects for which methods have been written.

Details

! indicates logical negation (NOT).

& and && indicate logical AND and | and || indicate logical OR. The shorter forms perform elementwise comparisons in much the same way as arithmetic operators. The longer forms evaluate left to right, proceeding only until the result is determined. The longer form is appropriate for programming control-flow and typically preferred in if clauses.

Using vectors of more than one element in && or || will give an error.

xor indicates elementwise exclusive OR.

isTRUE(x) is the same as { is.logical(x) && length(x) == 1 && !is.na(x) && x }; isFALSE() is defined analogously. Consequently, if(isTRUE(cond)) may be preferable to if(cond) because of NAs.
In earlier R versions, isTRUE was defined as function(x) identical(x, TRUE), which had the drawback of being false, e.g., for x <- c(val = TRUE).

Numeric and complex vectors will be coerced to logical values, with zero being false and all non-zero values being true. Raw vectors are handled without any coercion for !, &, | and xor, with these operators being applied bitwise (so ! is the 1s-complement).

The operators !, & and | are generic functions: methods can be written for them individually or via the Ops (or S4 Logic, see below) group generic function. (See Ops for how dispatch is computed.)

NA is a valid logical object. Where a component of x or y is NA, the result will be NA if the outcome is ambiguous. In other words NA & TRUE evaluates to NA, but NA & FALSE evaluates to FALSE. See the examples below.

See Syntax for the precedence of these operators: unlike many other languages (including S) the AND and OR operators do not have the same precedence (the AND operators have higher precedence than the OR operators).
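The rules above on NA handling, short-circuit evaluation, precedence and raw vectors can be tried out directly. A minimal sketch:

```r
## NA propagates only when the outcome is genuinely ambiguous
NA & TRUE      # NA
NA & FALSE     # FALSE
NA | TRUE      # TRUE

## && and || stop as soon as the result is determined
FALSE && stop("never evaluated")   # FALSE, the stop() call is not reached

## AND binds more tightly than OR
TRUE || FALSE && FALSE             # parsed as TRUE || (FALSE && FALSE)

## isTRUE() is safe in conditions where NA may occur
cond <- NA
if (isTRUE(cond)) "yes" else "no"  # if (cond) would be an error here
isTRUE(c(val = TRUE))              # names do not matter

## raw vectors are handled bitwise
!as.raw(0x0f)
xor(c(TRUE, TRUE), c(TRUE, FALSE))
```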

Value

For !, a logical or raw vector (for raw x) of the same length as x: names, dims and dimnames are copied from x, and all other attributes (including class) if no coercion is done.

For |, & and xor a logical or raw vector. If involving a zero-length vector the result has length zero. Otherwise, the elements of shorter vectors are recycled as necessary (with a warning when they are recycled only fractionally). The rules for determining the attributes of the result are rather complicated. Most attributes are taken from the longer argument, the first if they are of the same length. Names will be copied from the first if it is the same length as the answer, otherwise from the second if that is. For time series, these operations are allowed only if the series are compatible, when the class and tsp attribute of whichever is a time series (the same, if both are) are used. For arrays (and an array result) the dimensions and dimnames are taken from first argument if it is an array, otherwise the second.

For ||, && and isTRUE, a length-one logical vector.

S4 methods

!, & and | are S4 generics, the latter two part of the Logic group generic (and hence methods need argument names e1, e2).

Note

The elementwise operators are sometimes called as functions as e.g. `&`(x, y): see the description of how argument-matching is done in Ops.

References

Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988) The New S Language. Wadsworth & Brooks/Cole.

See Also

TRUE or logical.

any and all for OR and AND on many scalar arguments.

Syntax for operator precedence.

L %||% R which takes L if it is not NULL, and R otherwise.

bitwAnd for bitwise versions for integer vectors.

Examples

y <- 1 + (x <- stats::rpois(50, lambda = 1.5) / 4 - 1)
x[(x > 0) & (x < 1)]    # all x values between 0 and 1
if (any(x == 0) || any(y == 0)) "zero encountered"

## construct truth tables :

x <- c(NA, FALSE, TRUE)
names(x) <- as.character(x)
outer(x, x, `&`) ## AND table
outer(x, x, `|`) ## OR  table

Logical Vectors

Description

Create or test for objects of type "logical", and the basic logical constants.

Usage

TRUE
FALSE
T; F

logical(length = 0)
as.logical(x, ...)
is.logical(x)

Arguments

length

a non-negative integer specifying the desired length. Double values will be coerced to integer: supplying an argument of length other than one is an error.

x

object to be coerced or tested.

...

further arguments passed to or from other methods.

Details

TRUE and FALSE are reserved words denoting logical constants in the R language, whereas T and F are global variables whose initial values are set to these. All four are logical(1) vectors.

as.logical is a generic function. Methods should return an object of type "logical".

Logical vectors are coerced to integer vectors in contexts where a numerical value is required, with TRUE being mapped to 1L, FALSE to 0L and NA to NA_integer_.
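
The coercion to integer is what makes counting with logical vectors work. A minimal sketch, using a hypothetical vector flags:

```r
flags <- c(TRUE, FALSE, TRUE, NA)

n_true <- sum(flags, na.rm = TRUE)   # counts the TRUE values
n_true                               # 2
mean(flags, na.rm = TRUE)            # proportion of TRUE among non-missing values
TRUE + TRUE                          # 2L: arithmetic works on the integer codes
as.integer(flags)                    # 1 0 1 NA
```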

Value

logical creates a logical vector of the specified length. Each element of the vector is equal to FALSE.

as.logical attempts to coerce its argument to be of logical type. In numeric and complex vectors, zeros are FALSE and non-zero values are TRUE. For factors, this uses the levels (labels). Like as.vector it strips attributes including names. Character strings c("T", "TRUE", "True", "true") are regarded as true, c("F", "FALSE", "False", "false") as false, and all others as NA.

is.logical returns TRUE or FALSE depending on whether its argument is of logical type or not.

References

Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988) The New S Language. Wadsworth & Brooks/Cole.

See Also

NA, the other logical constant. Logical operators are documented in Logic.

Examples

## non-zero values are TRUE
as.logical(c(pi, 0))
if (length(letters)) cat("26 is TRUE\n")

## logical interpretation of particular strings
charvec <- c("FALSE", "F", "False", "false", "fAlse", "0",
             "TRUE",  "T", "True",  "true",  "tRue",  "1")
as.logical(charvec)

## factors are converted via their levels, so string conversion is used
as.logical(factor(charvec))
as.logical(factor(c(0, 1)))  # "0" and "1" give NA

Long Vectors

Description

Vectors of $2^{31}$ or more elements were added in R 3.0.0.

Details

Prior to R 3.0.0, all vectors in R were restricted to at most $2^{31} - 1$ elements and could be indexed by integer vectors.

Currently all atomic (raw, logical, integer, numeric, complex, character) vectors, lists and expressions can be much longer on 64-bit platforms: such vectors are referred to as ‘long vectors’ and have a slightly different internal structure. In theory they can contain up to $2^{52}$ elements, but address space limits of current CPUs and OSes will be much smaller. Such objects will have a length that is expressed as a double, and can be indexed by double vectors.

Arrays (including matrices) can be based on long vectors provided each of their dimensions is at most $2^{31} - 1$: thus there are no 1-dimensional long arrays.

R code typically only needs minor changes to work with long vectors, maybe only checking that as.integer is not used unnecessarily for e.g. lengths. However, compiled code typically needs quite extensive changes. Note that the .C and .Fortran interfaces do not accept long vectors, so .Call (or similar) has to be used.
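
The following sketch shows why as.integer should be avoided for lengths that may exceed the integer range. It does not allocate a long vector; n is a hypothetical length used only for illustration:

```r
n <- 2^31                            # one more than the largest integer
.Machine$integer.max                 # 2147483647, i.e. 2^31 - 1
is.double(n)                         # TRUE: such lengths must be doubles

# Coercing the length to integer loses it (NA, with a warning)
n_int <- suppressWarnings(as.integer(n))
is.na(n_int)                         # TRUE
```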

Because of the storage requirements (a minimum of 64 bytes per character string), character vectors are only going to be usable if they have a small number of distinct elements, and even then factors will be more efficient (4 bytes per element rather than 8). So it is expected that most of the usage of long vectors will be integer vectors (including factors) and numeric vectors.

Matrix algebra

It is now possible to use $m \times n$ matrices with more than 2 billion elements. Whether matrix algebra (including %*%, crossprod, svd, qr, solve and eigen) will actually work is somewhat implementation dependent, including the Fortran compiler used and if an external BLAS or LAPACK is used.

An efficient parallel BLAS implementation will often be important to obtain usable performance. For example on one particular platform chol on a 47,000 square matrix took about 5 hours with the internal BLAS, 21 minutes using an optimized BLAS on one core, and 2 minutes using an optimized BLAS on 16 cores.


Lower and Upper Triangular Part of a Matrix

Description

Returns a matrix of logicals the same size as a given matrix with entries TRUE in the lower or upper triangle.

Usage

lower.tri(x, diag = FALSE)
upper.tri(x, diag = FALSE)

Arguments

x

a matrix or other R object with length(dim(x)) == 2. For back compatibility reasons, when the above is not fulfilled, as.matrix(x) is called first.

diag

logical. Should the diagonal be included?
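
A short sketch of the effect of the diag argument, using a small hypothetical matrix m:

```r
m <- matrix(1:9, 3, 3)

lower.tri(m)                 # TRUE strictly below the diagonal
lower.tri(m, diag = TRUE)    # the diagonal is included as well
upper <- m[upper.tri(m)]     # extract the upper triangle, column by column
upper                        # 4 7 8
```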

See Also

diag, matrix; further row and col on which lower.tri() and upper.tri() are built.

Examples

(m2 <- matrix(1:20, 4, 5))
lower.tri(m2)
m2[lower.tri(m2)] <- NA
m2

List Objects

Description

ls and objects return a vector of character strings giving the names of the objects in the specified environment. When invoked with no argument at the top level prompt, ls shows what data sets and functions a user has defined. When invoked with no argument inside a function, ls returns the names of the function's local variables: this is useful in conjunction with browser.

Usage

ls(name, pos = -1L, envir = as.environment(pos),
   all.names = FALSE, pattern, sorted = TRUE)
objects(name, pos = -1L, envir = as.environment(pos),
        all.names = FALSE, pattern, sorted = TRUE)

Arguments

name

which environment to use in listing the available objects. Defaults to the current environment. Although called name for back compatibility, in fact this argument can specify the environment in any form; see the ‘Details’ section.

pos

an alternative argument to name for specifying the environment as a position in the search list. Mostly there for back compatibility.

envir

an alternative argument to name for specifying the environment. Mostly there for back compatibility.

all.names

a logical value. If TRUE, all object names are returned. If FALSE, names which begin with a ‘.’ are omitted.

pattern

an optional regular expression. Only names matching pattern are returned. glob2rx can be used to convert wildcard patterns to regular expressions.

sorted

logical indicating if the resulting character should be sorted alphabetically. Note that this part of ls() may take most of the time.

Details

The name argument can specify the environment from which object names are taken in one of several forms: as an integer (the position in the search list); as the character string name of an element in the search list; or as an explicit environment (including using sys.frame to access the currently active function calls). By default, the environment of the call to ls or objects is used. The pos and envir arguments are an alternative way to specify an environment, but are primarily there for back compatibility.

Note that the order of strings for sorted = TRUE is locale dependent, see Sys.getlocale. If sorted = FALSE the order is arbitrary, depending if the environment is hashed, the order of insertion of objects, ....
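
As a sketch of listing an explicit environment, the following uses a hypothetical environment e with made-up object names:

```r
e <- new.env()
assign("beta", 2, envir = e)
assign("alpha", 1, envir = e)
assign(".hidden", 0, envir = e)

ls(e)                        # "alpha" "beta"
ls(envir = e)                # the same environment, given via 'envir'
ls(e, all.names = TRUE)      # also lists ".hidden"
ls(e, pattern = "^al")       # "alpha"
```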

References

Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988) The New S Language. Wadsworth & Brooks/Cole.

See Also

glob2rx for converting wildcard patterns to regular expressions.

ls.str for a long listing based on str. apropos (or find) for finding objects in the whole search path; grep for more details on ‘regular expressions’; class, methods, etc., for object-oriented programming.

Examples

.Ob <- 1
ls(pattern = "O")
ls(pattern = "O", all.names = TRUE)    # also shows ".[foo]"

# shows an empty list because inside myfunc no variables are defined
myfunc <- function() {ls()}
myfunc()

# define a local variable inside myfunc
myfunc <- function() {y <- 1; ls()}
myfunc()                # shows "y"

Make Syntactically Valid Names

Description

Make syntactically valid names out of character vectors.

Usage

make.names(names, unique = FALSE, allow_ = TRUE)

Arguments

names

character vector to be coerced to syntactically valid names. This is coerced to character if necessary.

unique

logical; if TRUE, the resulting elements are unique. This may be desired for, e.g., column names.

allow_

logical. For compatibility with R prior to 1.9.0.

Details

A syntactically valid name consists of letters, numbers and the dot or underline characters and starts with a letter or the dot not followed by a number. Names such as ".2way" are not valid, and neither are the reserved words.

The definition of a letter depends on the current locale, but only ASCII digits are considered to be digits.

The character "X" is prepended if necessary. All invalid characters are translated to ".". A missing value is translated to "NA". Names which match R keywords have a dot appended to them. Duplicated values are altered by make.unique.
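
The rules above can be seen on a few hypothetical inputs (the vector raw is made up for illustration):

```r
raw <- c("2way", ".2way", "a b", "if", "ok_name")
fixed <- make.names(raw)
fixed
# "X2way" "X.2way" "a.b" "if." "ok_name"

make.names(c("a b", "a.b"), unique = TRUE)   # "a.b" "a.b.1"
make.names("ok_name", allow_ = FALSE)        # "ok.name"
```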

Value

A character vector of same length as names with each changed to a syntactically valid name, in the current locale's encoding.

Warning

Some OSes, notably FreeBSD, report extremely incorrect information about which characters are alphabetic in some locales (typically, all multi-byte locales including UTF-8 locales). However, R provides substitutes on Windows, macOS and AIX.

Note

Prior to R version 1.9.0, underscores were not valid in variable names, and code that relies on them being converted to dots will no longer work. Use allow_ = FALSE for back-compatibility.

allow_ = FALSE is also useful when creating names for export to applications which do not allow underline in names (such as some DBMSes).

See Also

make.unique, names, character, data.frame.

Examples

make.names(c("a and b", "a-and-b"), unique = TRUE)
# "a.and.b"  "a.and.b.1"
make.names(c("a and b", "a_and_b"), unique = TRUE)
# "a.and.b"  "a_and_b"
make.names(c("a and b", "a_and_b"), unique = TRUE, allow_ = FALSE)
# "a.and.b"  "a.and.b.1"
make.names(c("", "X"), unique = TRUE)
# "X.1" "X" currently; R up to 3.0.2 gave "X" "X.1"

state.name[make.names(state.name) != state.name] # those 10 with a space

Make Character Strings Unique

Description

Makes the elements of a character vector unique by appending sequence numbers to duplicates.

Usage

make.unique(names, sep = ".")

Arguments

names

a character vector.

sep

a character string used to separate a duplicate name from its sequence number.

Details

The algorithm used by make.unique has the property that make.unique(c(A, B)) == make.unique(c(make.unique(A), B)).

In other words, you can append one string at a time to a vector, making it unique each time, and get the same result as applying make.unique to all of the strings at once.

If character vector A is already unique, then make.unique(c(A, B)) preserves A.
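
The incremental property can be checked directly. In this sketch A and B are hypothetical character vectors:

```r
A <- c("a", "a")
B <- c("a.1", "b", "a")

all_at_once <- make.unique(c(A, B))
stepwise    <- make.unique(c(make.unique(A), B))
identical(all_at_once, stepwise)         # TRUE

make.unique(c("x", "y", "x"))            # "x" "y" "x.1"
make.unique(c("x", "x"), sep = "_")      # "x" "x_1"
```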

Value

A character vector of same length as names with duplicates changed, in the current locale's encoding.

Author(s)

Thomas P. Minka

See Also

make.names

Examples

make.unique(c("a", "a", "a"))
make.unique(c(make.unique(c("a", "a")), "a"))

make.unique(c("a", "a", "a.2", "a"))
make.unique(c(make.unique(c("a", "a")), "a.2", "a"))

## Now show a bit where this is used :
trace(make.unique)
## Applied in data.frame() constructions:
(d1 <- data.frame(x = 1, x = 2, x = 3)) # direct
 d2 <- data.frame(data.frame(x = 1, x = 2), x = 3) # pairwise
stopifnot(identical(d1, d2),
          colnames(d1) == c("x", "x.1", "x.2"))
untrace(make.unique)

Apply a Function to Multiple List or Vector Arguments

Description

mapply is a multivariate version of sapply. mapply applies FUN to the first elements of each ... argument, the second elements, the third elements, and so on. Arguments are recycled if necessary.

.mapply() is a bare-bones version of mapply(), e.g., to be used in other functions.

Usage

mapply(FUN, ..., MoreArgs = NULL, SIMPLIFY = TRUE,
       USE.NAMES = TRUE)
.mapply(FUN, dots, MoreArgs)

Arguments

FUN

function to apply, found via match.fun.

...

arguments to vectorize over, will be recycled to common length (zero if one of them is). See also ‘Details’.

dots

list or pairlist of arguments to vectorize over, see ... above.

MoreArgs

a list of other arguments to FUN.

SIMPLIFY

logical or character string; attempt to reduce the result to a vector, matrix or higher dimensional array; see the simplify argument of sapply.

USE.NAMES

logical; use the names of the first ... argument, or if that is an unnamed character vector, use that vector as the names.

Details

mapply calls FUN for the values of ... (re-cycled to the length of the longest, unless any have length zero where recycling to zero length will return list()), followed by the arguments given in MoreArgs. The arguments in the call will be named if ... or MoreArgs are named.

For the arguments in ... (or components in dots) class specific subsetting (such as [) and length methods will be used where applicable.
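
A minimal sketch of how the vectorized arguments and MoreArgs combine; add3 is a hypothetical helper defined only for this illustration:

```r
# Calls add3(1, 4, z = 100), add3(2, 5, z = 100), add3(3, 6, z = 100)
add3 <- function(x, y, z) x + y + z
sums <- mapply(add3, 1:3, 4:6, MoreArgs = list(z = 100))
sums                                        # 105 107 109

# The shorter argument is recycled to the length of the longest
prods <- mapply(function(x, y) x * y, 1:4, 2)
prods                                       # 2 4 6 8

# SIMPLIFY = FALSE always returns a list
reps <- mapply(rep, 1:2, 2:1, SIMPLIFY = FALSE)
```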

Value

A list, or for SIMPLIFY = TRUE, a vector, array or list.

See Also

sapply, after which mapply() is modelled.

outer, which applies a vectorized function to all combinations of two arguments.

Examples

mapply(rep, 1:4, 4:1)

mapply(rep, times = 1:4, x = 4:1)

mapply(rep, times = 1:4, MoreArgs = list(x = 42))

mapply(function(x, y) seq_len(x) + y,
       c(a =  1, b = 2, c = 3),  # names from first
       c(A = 10, B = 0, C = -10))

word <- function(C, k) paste(rep.int(C, k), collapse = "")
## names from the first, too:
utils::str(L <- mapply(word, LETTERS[1:6], 6:1, SIMPLIFY = FALSE))

mapply(word, "A", integer()) # gave Error, now list()

Compute Table Margins

Description

For a contingency table in array form, compute the sum of table entries for a given margin or set of margins.

Usage

marginSums(x, margin = NULL)
margin.table(x, margin = NULL)

Arguments

x

an array, usually a table.

margin

a vector giving the margins to compute sums for. E.g., for a matrix 1 indicates rows, 2 indicates columns, c(1, 2) indicates rows and columns. When x has named dimnames, it can be a character vector selecting dimension names.

Value

The relevant marginal table, or just the sum of all entries if margin has length zero. The class of x is copied to the output table if margin is non-NULL.
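
A small sketch of the three cases described above, using a hypothetical 2 x 2 matrix counts with made-up dimnames:

```r
counts <- matrix(c(10, 20, 30, 40), nrow = 2,
                 dimnames = list(group = c("a", "b"),
                                 outcome = c("no", "yes")))

total <- marginSums(counts)       # no margin: sum of all entries
total                             # 100
marginSums(counts, 1)             # row totals: a = 40, b = 60
marginSums(counts, 2)             # column totals: no = 30, yes = 70
```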

Note

margin.table is an earlier name, retained for back-compatibility.

Author(s)

Peter Dalgaard

See Also

rowSums and colSums for similar functionality.

proportions and addmargins.

Examples

m <- matrix(1:4, 2)
marginSums(m, 1)  # = rowSums(m)
marginSums(m, 2)  # = colSums(m)

DF <- as.data.frame(UCBAdmissions)
tbl <- xtabs(Freq ~ Gender + Admit, DF)
tbl
marginSums(tbl, "Gender")  # a 1-dim "table"
rowSums(tbl)               # a numeric vector

Create a Matrix or a Vector

Description

mat.or.vec creates an nr by nc zero matrix if nc is greater than 1, and a zero vector of length nr if nc equals 1.

Usage

mat.or.vec(nr, nc)

Arguments

nr, nc

numbers of rows and columns.

Examples

mat.or.vec(3, 1)
mat.or.vec(3, 2)

Value Matching

Description

match returns a vector of the positions of (first) matches of its first argument in its second.

%in% is a more intuitive interface as a binary operator, which returns a logical vector indicating if there is a match or not for its left operand.

Usage

match(x, table, nomatch = NA_integer_, incomparables = NULL)

x %in% table

Arguments

x

vector or NULL: the values to be matched. Long vectors are supported.

table

vector or NULL: the values to be matched against. Long vectors are not supported.

nomatch

the value to be returned in the case when no match is found. Note that it is coerced to integer.

incomparables

a vector of values that cannot be matched. Any value in x matching a value in this vector is assigned the nomatch value. For historical reasons, FALSE is equivalent to NULL.

Details

%in% is currently defined as
"%in%" <- function(x, table) match(x, table, nomatch = 0) > 0

Factors, raw vectors and lists are converted to character vectors, internally classed objects are transformed via mtfrm, and then x and table are coerced to a common type (the later of the two types in R's ordering, logical < integer < numeric < complex < character) before matching. If incomparables has positive length it is coerced to the common type.

Matching for lists is potentially very slow and best avoided except in simple cases.

Exactly what matches what is to some extent a matter of definition. For all types, NA matches NA and no other value. For real and complex values, NaN values are regarded as matching any other NaN value, but not matching NA, where for complex x, real and imaginary parts must match both (unless containing at least one NA).
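
The NA and NaN rules can be illustrated on a short numeric vector (tbl is a hypothetical lookup table):

```r
tbl <- c(1, NA, NaN)

match(NA,  tbl)        # 2: NA matches NA only
match(NaN, tbl)        # 3: NaN matches NaN, not NA
NA == NA               # NA: '==' is stricter than match()
match(2L, c(1.5, 2))   # 2: both sides are coerced to a common type first
```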

Character strings will be compared as byte sequences if any input is marked as "bytes", and otherwise are regarded as equal if they are in different encodings but would agree when translated to UTF-8 (see Encoding).

That %in% never returns NA makes it particularly useful in if conditions.
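
A sketch contrasting match() with %in%, including the behaviour on a missing value; x, tbl and value are hypothetical names:

```r
x   <- c("b", "z", NA)
tbl <- c("a", "b", "c")

match(x, tbl)                # 2 NA NA
match(x, tbl, nomatch = 0)   # 2 0 0
x %in% tbl                   # TRUE FALSE FALSE, never NA

value <- NA
status <- if (value %in% tbl) "found" else "not found"   # safe even for NA
status                       # "not found"
```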

Value

A vector of the same length asx.

match: An integer vector giving the position in table of the first match if there is a match, otherwise nomatch.

If x[i] is found to equal table[j] then the value returned in the i-th position of the return value is j, for the smallest possible j. If no match is found, the value is nomatch.

%in%: A logical vector, indicating if a match was located for each element of x: thus the values are TRUE or FALSE and never NA.

References

Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988) The New S Language. Wadsworth & Brooks/Cole.

See Also

pmatch and charmatch for (partial) string matching, match.arg, etc for function argument matching. findInterval similarly returns a vector of positions, but finds numbers within intervals, rather than exact matches.

is.element for an S-compatible equivalent of %in%.

unique (and duplicated) are using the same definitions of “match” or “equality” as match(), and these are less strict than ==, e.g., for NA and NaN in numeric or complex vectors, or for strings with different encodings, see also above.

Examples

## The intersection of two sets can be defined via match():
## Simple version:
## intersect <- function(x, y) y[match(x, y, nomatch = 0)]
intersect # the R function in base is slightly more careful
intersect(1:10, 7:20)

1:10 %in% c(1, 3, 5, 9)
sstr <- c("c", "ab", "B", "bba", "c", NA, "@", "bla", "a", "Ba", "%")
sstr[sstr %in% c(letters, LETTERS)]

"%w/o%" <- function(x, y) x[!x %in% y] #--  x without y
(1:10) %w/o% c(3, 7, 12)
## Note that setdiff() is very similar and typically makes more sense:
        c(1:6, 7:2) %w/o% c(3, 7, 12)  # -> keeps duplicates
setdiff(c(1:6, 7:2),      c(3, 7, 12)) # -> unique values

## Illuminating example about NA matching
r <- c(1, NA, NaN)
zN <- c(complex(real = NA , imaginary =  r ), complex(real =  r , imaginary = NA ),
        complex(real =  r , imaginary = NaN), complex(real = NaN, imaginary =  r ))
zM <- cbind(Re = Re(zN), Im = Im(zN), match = match(zN, zN))
rownames(zM) <- format(zN)
zM ##--> many "NA's" (= 1) and the four non-NA's (3 different ones, at 7,9,10)

length(zN) # 12
unique(zN) # the "NA" and the 3 different non-NA NaN's
stopifnot(identical(unique(zN), zN[c(1, 7, 9, 10)]))

## very strict equality would have 4 duplicates (of 12):
symnum(outer(zN, zN, Vectorize(identical, c("x", "y")),
             FALSE, FALSE, FALSE, FALSE))
## removing "(very strictly) duplicates",
i <- c(5, 8, 11, 12)  # we get 8 pairwise non-identicals :
Ixy <- outer(zN[-i], zN[-i], Vectorize(identical, c("x", "y")),
             FALSE, FALSE, FALSE, FALSE)
stopifnot(identical(Ixy, diag(8) == 1))

Argument Verification Using Partial Matching

Description

match.arg matches a character arg against a table of candidate values as specified by choices.

Usage

match.arg(arg, choices, several.ok = FALSE)

Arguments

arg

a character vector (of length one unless several.ok is TRUE) or NULL which means to take choices[1].

choices

a character vector of candidate values, often missing, see ‘Details’.

several.ok

logical specifying if arg should be allowed to have more than one element.

Details

In the one-argument form match.arg(arg), the choices are obtained from a default setting for the formal argument arg of the function from which match.arg was called. (Since default argument matching will set arg to choices, this is allowed as an exception to the ‘length one unless several.ok is TRUE’ rule, and returns the first element.)

Matching is done using pmatch, so arg may be abbreviated and the empty string ("") never matches, not even itself, see pmatch.
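
The one-argument form might be used as in the following sketch; fit_curve and its choices are hypothetical and exist only for this illustration:

```r
fit_curve <- function(kind = c("linear", "quadratic", "cubic")) {
  kind <- match.arg(kind)   # choices come from the default of 'kind'
  kind
}

fit_curve()          # "linear": the first choice when 'kind' is not supplied
fit_curve("quad")    # "quadratic": unique partial match
res <- try(fit_curve("spline"), silent = TRUE)
inherits(res, "try-error")   # TRUE: no candidate matches
```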

Value

The unabbreviated version of the exact or unique partial match if there is one; otherwise, an error is signalled if several.ok is false, as per default. When several.ok is true and (at least) one element of arg has a match, all unabbreviated versions of matches are returned.

Warning

The error messages given are liable to change and did so in R 4.2.0. Do not test them in packages.

See Also

pmatch, match.fun, match.call.

Examples

require(stats)
## Extends the example for 'switch'
center <- function(x, type = c("mean", "median", "trimmed")) {
  type <- match.arg(type)
  switch(type,
         mean = mean(x),
         median = median(x),
         trimmed = mean(x, trim = .1))
}
x <- rcauchy(10)
center(x, "t")       # Works
center(x, "med")     # Works
try(center(x, "m"))  # Error
stopifnot(identical(center(x),       center(x, "mean")),
          identical(center(x, NULL), center(x, "mean")))

## Allowing more than one 'arg' and hence more than one match:
match.arg(c("gauss", "rect", "ep"),
          c("gaussian", "epanechnikov", "rectangular", "triangular"),
          several.ok = TRUE)
match.arg(c("a", ""),  c("", NA, "bb", "abc"), several.ok = TRUE) # |-->  "abc"

Argument Matching

Description

match.call returns a call in which all of the specified arguments are specified by their full names.

Usage

match.call(definition = sys.function(sys.parent()),
           call = sys.call(sys.parent()),
           expand.dots = TRUE,
           envir = parent.frame(2L))

Arguments

definition

a function, by default the function from which match.call is called. See details.

call

an unevaluated call to the function specified by definition, as generated by call.

expand.dots

logical. Should arguments matching ... in the call be included or left as a ... argument?

envir

an environment, from which the ... in call are retrieved, if any.

Details

‘function’ on this help page means an interpreted function (also known as a ‘closure’): match.call does not support primitive functions (where argument matching is normally positional).

match.call is most commonly used in two circumstances:

  • To record the call for later re-use: for example most model-fitting functions record the call as element call of the list they return. Here the default expand.dots = TRUE is appropriate.

  • To pass most of the call to another function, often model.frame. Here the common idiom is that expand.dots = FALSE is used, and the ... element of the matched call is removed. An alternative is to explicitly select the arguments to be passed on, as is done in lm.
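
The two circumstances listed above can be sketched as follows. The functions record and forward are hypothetical and only illustrate the idioms; they are not taken from any modelling function:

```r
# Record the full call, with '...' expanded (the default)
record <- function(x, ...) match.call()
cl <- record(1, scale = 2)
deparse(cl)                      # "record(x = 1, scale = 2)"

# Keep '...' as a single element so that it can be dropped
forward <- function(x, ...) {
  mc <- match.call(expand.dots = FALSE)
  mc$... <- NULL                 # remove everything matched by '...'
  mc
}
deparse(forward(1, scale = 2))   # "forward(x = 1)"
```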

Calling match.call outside a function without specifying definition is an error.

Value

An object of class call.

References

Chambers, J. M. (1998) Programming with Data. A Guide to the S Language. Springer.

See Also

sys.call() is similar, but does not expand the argument names; call, pmatch, match.arg, match.fun.

Examples

match.call(get, call("get", "abc", i = FALSE, p = 3))
## -> get(x = "abc", pos = 3, inherits = FALSE)
fun <- function(x, lower = 0, upper = 1) {
  structure((x - lower) / (upper - lower), CALL = match.call())
}
fun(4 * atan(1), u = pi)

Extract a Function Specified by Name

Description

When called inside functions that take a function as argument, extract the desired function object while avoiding undesired matching to objects of other types.

Usage

match.fun(FUN, descend = TRUE)

Arguments

FUN

item to match as function: a function, symbol or character string. See ‘Details’.

descend

logical; control whether to search past non-function objects.

Details

match.fun is not intended to be used at the top level since it will perform matching in the parent of the caller.

If FUN is a function, it is returned. If it is a symbol (for example, enclosed in backquotes) or a character vector of length one, it will be looked up using get in the environment of the parent of the caller. If it is of any other mode, it is attempted first to get the argument to the caller as a symbol (using substitute twice), and if that fails, an error is declared.

If descend = TRUE, match.fun will look past non-function objects with the given name; otherwise if FUN points to a non-function object then an error is generated.

This is used in base functions such as apply, lapply, outer, and sweep.
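
A minimal sketch of the intended usage inside a function that takes a function argument; apply_twice is a hypothetical helper:

```r
apply_twice <- function(FUN, x) {
  FUN <- match.fun(FUN)    # accepts a function or its name as a string
  FUN(FUN(x))
}

apply_twice(sqrt, 16)      # 2
apply_twice("sqrt", 16)    # 2
```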

Value

A function matching FUN or an error is generated.

Bugs

The descend argument is a bit of a misnomer and probably not actually needed by anything. It may go away in the future.

It is impossible to fully foolproof this. If one attaches a list or data frame containing a length-one character vector with the same name as a function, it may be used (although namespaces will help).

Author(s)

Peter Dalgaard and Robert Gentleman, based on an earlier version by Jonathan Rougier.

See Also

match.arg, get

Examples

# Same as get("*"):
match.fun("*")
# Overwrite outer with a vector
outer <- 1:5
try(match.fun(outer, descend = FALSE)) #-> Error:  not a function
match.fun(outer) # finds it anyway
is.function(match.fun("outer")) # as well

Miscellaneous Mathematical Functions

Description

abs(x) computes the absolute value of x, sqrt(x) computes the (principal) square root of x, $\sqrt{x}$.

The naming follows the standard for computer languages such as C or Fortran.

Usage

abs(x)
sqrt(x)

Arguments

x

a numeric or complex vector or array.

Details

These are internal generic primitive functions: methods can be defined for them individually or via the Math group generic. For complex arguments (and the default method), z, abs(z) == Mod(z) and sqrt(z) == z^0.5.

abs(x) returns an integer vector when x is integer or logical.
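
A few illustrative calls for the type rules and the complex case described above:

```r
is.integer(abs(-3L))      # TRUE: integer in, integer out
abs(c(TRUE, FALSE))       # 1 0, as integers
abs(3 + 4i)               # 5, the same as Mod(3 + 4i)
sqrt(as.complex(-4))      # 0+2i, the principal square root
```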

S4 methods

Both are S4 generic and members of the Math group generic.

References

Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988) The New S Language. Wadsworth & Brooks/Cole.

See Also

Arithmetic for simple, log for logarithmic, sin for trigonometric, and Special for special mathematical functions.

‘plotmath’ for the use of sqrt in plot annotation.

Examples

require(stats) # for spline
require(graphics)
xx <- -9:9
plot(xx, sqrt(abs(xx)),  col = "red")
lines(spline(xx, sqrt(abs(xx)), n = 101), col = "pink")

Matrix Multiplication

Description

Multiplies two matrices, if they are conformable. If one argument is a vector, it will be promoted to either a row or column matrix to make the two arguments conformable. If both are vectors of the same length, it will return the inner product (as a matrix).

Usage

x %*% y

Arguments

x, y

numeric or complex matrices or vectors.

Details

When a vector is promoted to a matrix, its names are not promoted to row or column names, unlike as.matrix.

Promotion of a vector to a 1-row or 1-column matrix happens when one of the two choices allows x and y to get conformable dimensions.
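
The promotion rule can be seen from the dimensions of a few products; v and m are hypothetical inputs:

```r
v <- 1:3
m <- matrix(1:6, nrow = 2)    # 2 x 3

dim(v %*% v)                  # 1 1: inner product, returned as a matrix
dim(m %*% v)                  # 2 1: v is treated as a 3 x 1 column
dim(c(1, 2) %*% m)            # 1 3: the vector is treated as a 1 x 2 row
drop(v %*% v)                 # 14 as a plain number
```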

This operator is a generic function: methods can be written for it individually or via the matOps group generic function; it dispatches to S3 and S4 methods. Methods need to be written for a function that takes two arguments named x and y.

Value

A double or complex matrix product. Use drop to remove dimensions which have only one level.

Note

The propagation of NaN/Inf values, precision, and performance of matrix products can be controlled by options("matprod").

References

Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988) The New S Language. Wadsworth & Brooks/Cole.

See Also

For matrix cross products, crossprod() and tcrossprod() are typically preferable. matrix, Arithmetic, diag.

Examples

x <- 1:4
(z <- x %*% x)    # scalar ("inner") product (1 x 1 matrix)
drop(z)           # as scalar

y <- diag(x)
z <- matrix(1:12, ncol = 3, nrow = 4)
y %*% z
y %*% x
x %*% z

Matrices

Description

matrix creates a matrix from the given set of values.

as.matrix attempts to turn its argument into a matrix.

is.matrix tests if its argument is a (strict) matrix.

Usage

matrix(data = NA, nrow = 1, ncol = 1, byrow = FALSE,
       dimnames = NULL)

as.matrix(x, ...)
## S3 method for class 'data.frame'
as.matrix(x, rownames.force = NA, ...)

is.matrix(x)

Arguments

data

an optional data vector (including a list or expression vector). Non-atomic classed R objects are coerced by as.vector and all attributes discarded.

nrow

the desired number of rows.

ncol

the desired number of columns.

byrow

logical. If FALSE (the default) the matrix is filled by columns, otherwise the matrix is filled by rows.

dimnames

a dimnames attribute for the matrix: NULL or a list of length 2 giving the row and column names respectively. An empty list is treated as NULL, and a list of length one as row names. The list can be named, and the list names will be used as names for the dimensions.

x

an R object.

...

additional arguments to be passed to or from methods.

rownames.force

logical indicating if the resulting matrix should have character (rather than NULL) rownames. The default, NA, uses NULL rownames if the data frame has ‘automatic’ row.names or for a zero-row data frame.

Details

If one of nrow or ncol is not given, an attempt is made to infer it from the length of data and the other parameter. If neither is given, a one-column matrix is returned.

If there are too few elements in data to fill the matrix, then the elements in data are recycled. If data has length zero, NA of an appropriate type is used for atomic vectors (0 for raw vectors) and NULL for lists.
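
A short sketch of dimension inference, recycling and fill order, with values chosen only for illustration:

```r
dim(matrix(1:6, nrow = 2))          # 2 3: ncol inferred from length(data)
dim(matrix(1:6))                    # 6 1: neither given, one column
recycled <- matrix(1:2, nrow = 2, ncol = 3)   # data recycled to fill six cells
recycled[, 3]                       # 1 2
matrix(1:6, nrow = 2, byrow = TRUE)[1, ]      # 1 2 3: filled row by row
```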

is.matrix returns TRUE if x is a vector and has a "dim" attribute of length 2 and FALSE otherwise. Note that a data.frame is not a matrix by this test. The function is generic: you can write methods to handle specific classes of objects, see InternalMethods.

as.matrix is a generic function. The method for data frames will return a character matrix if there are only atomic columns and any non-(numeric/logical/complex) column, applying as.vector to factors and format to other non-character columns. Otherwise, the usual coercion hierarchy (logical < integer < double < complex) will be used, e.g., all-logical data frames will be coerced to a logical matrix, mixed logical-integer will give an integer matrix, etc.

The default method for as.matrix calls as.vector(x), and hence e.g. coerces factors to character vectors.

When coercing a vector, it produces a one-column matrix, and promotes the names (if any) of the vector to the rownames of the matrix.

is.matrix is a primitive function.

The print method for a matrix gives a rectangular layout with dimnames or indices. For a list matrix, the entries of length not one are printed in the form ‘integer,7’ indicating the type and length.

Note

If you just want to convert a vector to a matrix, something like

  dim(x) <- c(nx, ny)
  dimnames(x) <- list(row_names, col_names)

will avoid duplicating x and preserve class(x) which may be useful, e.g., for Date objects.
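
The difference for classed objects can be sketched with a hypothetical vector of dates d:

```r
d <- as.Date("2024-01-01") + 0:5

m1 <- matrix(d, nrow = 2)     # attributes, including the class, are dropped
inherits(m1, "Date")          # FALSE

m2 <- d
dim(m2) <- c(2, 3)            # only adds a "dim" attribute
inherits(m2, "Date")          # TRUE: the class is preserved
```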

References

Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988) The New S Language. Wadsworth & Brooks/Cole.

See Also

data.matrix, which attempts to convert to a numeric matrix.

A matrix is the special case of a two-dimensional array. inherits(m, "array") is true for a matrix m.

Examples

is.matrix(as.matrix(1:10))
!is.matrix(warpbreaks)  # data.frame, NOT matrix!
warpbreaks[1:10,]
as.matrix(warpbreaks[1:10,])  # using as.matrix.data.frame(.) method

## Example of setting row and column names
mdat <- matrix(c(1,2,3, 11,12,13), nrow = 2, ncol = 3, byrow = TRUE,
               dimnames = list(c("row1", "row2"),
                               c("C.1", "C.2", "C.3")))
mdat

Find Maximum Position in Matrix

Description

Find the maximum position for each row of a matrix, breaking ties at random.

Usage

max.col(m, ties.method= c("random","first","last"))

Arguments

m

a numerical matrix.

ties.method

a character string specifying how ties arehandled,"random" by default; can be abbreviated; see‘Details’.

Details

When ties.method = "random", as per default, ties are broken at random. In this case, the determination of a tie assumes that the entries are probabilities: there is a relative tolerance of 10^{-5}, relative to the largest (in magnitude, omitting infinity) entry in the row.

If ties.method = "first", max.col returns the column number of the first of several maxima in every row, the same as unname(apply(m, 1, which.max)) if m has no missing values.
Correspondingly, ties.method = "last" returns the last of possibly several indices.
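A short sketch of the deterministic tie-breaking rules. The matrix below is entered by hand (an assumption made so the example does not depend on the random number generator); its values are those printed in the Examples for this page.

```r
# Rows x, y and z each contain a tied maximum.
mm <- rbind(x = c(1, 1, 1, 2, 0, 2, 2, 1, 1, 0, 0, 0),
            y = c(3, 2, 4, 2, 4, 5, 2, 4, 5, 1, 3, 1),
            z = c(2, 3, 0, 3, 7, 3, 4, 5, 4, 1, 7, 5))

first_max <- max.col(mm, ties.method = "first")  # leftmost maximum per row
last_max  <- max.col(mm, ties.method = "last")   # rightmost maximum per row
via_apply <- unname(apply(mm, 1, which.max))     # row-wise which.max

first_max
last_max
all(first_max == via_apply)
```

With ties.method = "random" the result can differ between calls for rows with ties, so it is left out of this sketch.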

Value

index of a maximal value for each row, an integer vector of length nrow(m).

References

Venables, W. N. and Ripley, B. D. (2002) Modern Applied Statistics with S. New York: Springer (4th ed).

See Also

which.max for vectors.

Examples

table(mc <- max.col(swiss))  # mostly "1" and "5", 5 x "2" and once "4"
swiss[unique(print(mr <- max.col(t(swiss)))), ]  # 3 33 45 45 33 6

set.seed(1)  # reproducible example:
(mm <- rbind(x = round(2*stats::runif(12)),
             y = round(5*stats::runif(12)),
             z = round(8*stats::runif(12))))
## Not run:
  [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10] [,11] [,12]
x    1    1    1    2    0    2    2    1    1     0     0     0
y    3    2    4    2    4    5    2    4    5     1     3     1
z    2    3    0    3    7    3    4    5    4     1     7     5
## End(Not run)
## column indices of all row maxima :
utils::str(lapply(1:3, function(i) which(mm[i,] == max(mm[i,]))))
max.col(mm) ; max.col(mm)  # "random"
max.col(mm, "first")  # -> 4 6 5
max.col(mm, "last")   # -> 7 9 11

Arithmetic Mean

Description

Generic function for the (trimmed) arithmetic mean.

Usage

mean(x, ...)

## Default S3 method:
mean(x, trim = 0, na.rm = FALSE, ...)

Arguments

x

an R object. Currently there are methods for numeric/logical vectors and date, date-time and time interval objects. Complex vectors are allowed for trim = 0, only.

trim

the fraction (0 to 0.5) of observations to be trimmed from each end of x before the mean is computed. Values of trim outside that range are taken as the nearest endpoint.

na.rm

a logical evaluating to TRUE or FALSE indicating whether NA values should be stripped before the computation proceeds.

...

further arguments passed to or from other methods.

Value

If trim is zero (the default), the arithmetic mean of the values in x is computed, as a numeric or complex vector of length one. If x is not logical (coerced to numeric), numeric (including integer) or complex, NA_real_ is returned, with a warning.

If trim is non-zero, a symmetrically trimmed mean is computed with a fraction of trim observations deleted from each end before the mean is computed.
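To make the trimming rule concrete, the sketch below recomputes a 10% trimmed mean by hand. It assumes that the number of observations dropped from each end is floor(n * trim); the names k and by_hand are introduced only for this illustration.

```r
x <- c(0:10, 50)
n <- length(x)

k <- floor(n * 0.10)                        # observations dropped from each end
by_hand <- mean(sort(x)[(k + 1):(n - k)])   # mean of the remaining middle values
trimmed <- mean(x, trim = 0.10)

c(plain = mean(x), trimmed = trimmed, by_hand = by_hand)
```

Here the single outlier 50 (and the smallest value 0) is removed before averaging, which is why the trimmed value is well below the plain mean.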

References

Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988) The New S Language. Wadsworth & Brooks/Cole.

See Also

weighted.mean, mean.POSIXct, colMeans for row and column means.

Examples

x <- c(0:10, 50)
xm <- mean(x)
c(xm, mean(x, trim = 0.10))

In-memory Compression and Decompression

Description

In-memory compression or decompression for raw vectors.

Usage

memCompress(from, type = c("gzip", "bzip2", "xz", "none"))

memDecompress(from,
              type = c("unknown", "gzip", "bzip2", "xz", "none"),
              asChar = FALSE)

Arguments

from

raw vector. For memCompress, a character vector will be converted to a raw vector with character strings separated by "\n". Types except "bzip2" support long raw vectors.

type

character string, the type of compression. May be abbreviated to a single letter, defaults to the first of the alternatives.

asChar

logical: should the result be converted to a character string? NB: character strings have a limit of 2^{31}-1 bytes, so raw vectors should be used for large inputs.

Details

type = "none" passes the input through unchanged, but may be useful if type is a variable.

type = "unknown" attempts to detect the type of compression applied (if any): this will always succeed for bzip2 compression, and will succeed for other forms if there is a suitable header. If no type of compression is detected this is the same as type = "none" but a warning is given.

gzip compression uses whatever is the default compression level of the underlying library (usually 6). This supports the RFC 1950 format, sometimes known as ‘zlib’ format, for compression and decompression and for decompression only RFC 1952, the ‘gzip’ format (which wraps the ‘zlib’ format with a header and footer).

bzip2 compression always adds a header ("BZh"). The underlying library only supports in-memory (de)compression of up to 2^{31}-1 elements. Compression is equivalent to bzip2 -9 (the default).

Compressing with type = "xz" is equivalent to compressing a file with xz -9e (including adding the ‘magic’ header): decompression should cope with the contents of any file compressed by xz version 4.999 and later, as well as by some versions of lzma. There are other versions, in particular ‘raw’ streams, that are not currently handled.

All the types of compression can expand the input: for "gzip" and "bzip2" the maximum expansion is known and so memCompress can always allocate sufficient space. For "xz" it is possible (but extremely unlikely) that compression will fail if the output would have been too large.
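A minimal round-trip sketch for the points above, using a small made-up payload (the variable names are illustrative only). It relies on the statements that bzip2 data is always detected with type = "unknown" and that a character vector is joined with "\n" before compression.

```r
# A repetitive payload compresses well with every scheme.
payload <- charToRaw(paste(rep("R base package ", 200), collapse = ""))

z_gz  <- memCompress(payload, type = "gzip")
z_bz2 <- memCompress(payload, type = "bzip2")

back_gz  <- memDecompress(z_gz, type = "gzip")
back_bz2 <- memDecompress(z_bz2)   # default type = "unknown": bzip2 is detected

c(original = length(payload), gzip = length(z_gz), bzip2 = length(z_bz2))
rawToChar(z_bz2[1:3])              # the bzip2 header

# A character vector is joined with "\n" before it is compressed.
joined <- memDecompress(memCompress(c("a", "b"), "gzip"), "gzip", asChar = TRUE)
joined
```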

Value

A raw vector or a character string (if asChar = TRUE).

libdeflate

Support for the libdeflate library was added for R 4.4.0. It uses different code for the RFC 1950 ‘zlib’ format (and RFC 1952 for decompression), expected to be substantially faster than using the reference (or system) zlib library. It is used for type = "gzip" if available.

The headers and sources can be downloaded from https://github.com/ebiggers/libdeflate and pre-built versions are available for most Linux distributions. It is used for binary Windows distributions.

See Also

connections.

extSoftVersion for the versions of the zlib or libdeflate, bzip2 and xz libraries in use.

https://en.wikipedia.org/wiki/Data_compression for background on data compression, https://zlib.net/, https://en.wikipedia.org/wiki/Gzip, http://www.bzip.org/, https://en.wikipedia.org/wiki/Bzip2, and https://en.wikipedia.org/wiki/XZ_Utils for references about the particular schemes used.

Examples

txt <- readLines(file.path(R.home("doc"), "COPYING"))
sum(nchar(txt))
txt.gz <- memCompress(txt, "g") # "gzip", the default
length(txt.gz)
txt2 <- strsplit(memDecompress(txt.gz, "g", asChar = TRUE), "\n")[[1]]
stopifnot(identical(txt, txt2))
## as from R 4.4.0 this is detected if not specified.
txt2b <- strsplit(memDecompress(txt.gz, asChar = TRUE), "\n")[[1]]
stopifnot(identical(txt2b, txt2))

txt.bz2 <- memCompress(txt, "b")
length(txt.bz2)
## can auto-detect bzip2:
txt3 <- strsplit(memDecompress(txt.bz2, asChar = TRUE), "\n")[[1]]
stopifnot(identical(txt, txt3))

## xz compression is only worthwhile for large objects
txt.xz <- memCompress(txt, "x")
length(txt.xz)
txt3 <- strsplit(memDecompress(txt.xz, asChar = TRUE), "\n")[[1]]
stopifnot(identical(txt, txt3))

## test decompressing a gzip-ed file
tf <- tempfile(fileext = ".gz")
con <- gzfile(tf, "w")
writeLines(txt, con)
close(con)
(nf <- file.size(tf))
# if (nzchar(Sys.which("file"))) system2("file", tf)
foo <- readBin(tf, "raw", n = nf)
unlink(tf)
## will detect the gzip header and choose type = "gzip"
txt3 <- strsplit(memDecompress(foo, asChar = TRUE), "\n")[[1]]
stopifnot(identical(txt, txt3))

Query and Set Heap Size Limits

Description

Query and set the maximal size of the vector heap and the maximal number of heap nodes for the current R process.

Usage

mem.maxVSize(vsize = 0)
mem.maxNSize(nsize = 0)

Arguments

vsize

numeric; new size limit in Mb.

nsize

numeric; new maximal node number.

Details

New limits lower than current usage are ignored. Specifying a size of Inf sets the limit to the maximal possible value for the platform.

The default maximal values are unlimited on most platforms, but can be adjusted using environment variables as described in Memory. On macOS a lower default vector heap limit is used to protect against the R process being killed when macOS over-commits memory.

Adjusting the maximal number of nodes is rarely necessary. Adjusting the vector heap size limit can be useful on macOS in particular but should be done with caution.

Value

The current or new value, in Mb for mem.maxVSize. Inf is returned if the current value is unlimited.
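A query-only sketch. It assumes that calling either function with its default argument (0) leaves the limit unchanged, which is what the rule that limits below current usage are ignored suggests; no limit is modified here.

```r
# Query the current limits without attempting to change them.
vmax <- mem.maxVSize()   # vector heap limit in Mb, or Inf if unlimited
nmax <- mem.maxNSize()   # maximal number of nodes, or Inf if unlimited

c(max_vsize_Mb = vmax, max_nodes = nmax)
```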

See Also

Memory.


Memory Available for Data Storage

Description

How R manages its workspace.

Details

R has a variable-sized workspace. There are (rarely-used) command-line options to control its minimum size, but no longer any to control the maximum size.

R maintains separate areas for fixed and variable sized objects. The first of these is allocated as an array of cons cells (Lisp programmers will know what they are, others may think of them as the building blocks of the language itself, parse trees, etc.), and the second are thrown on a heap of ‘Vcells’ of 8 bytes each. Each cons cell occupies 28 bytes on a 32-bit build of R, (usually) 56 bytes on a 64-bit build.

The default values are (currently) an initial setting of 350k cons cells and 6Mb of vector heap. Note that the areas are not actually allocated initially: rather these values are the sizes for triggering garbage collection. These values can be set by the command line options --min-nsize and --min-vsize (or if they are not used, the environment variables R_NSIZE and R_VSIZE) when R is started. Thereafter R will grow or shrink the areas depending on usage, never decreasing below the initial values. The maximal vector heap size can be set with the environment variable R_MAX_VSIZE. An attempt to set a lower maximum than the current usage is ignored. Vector heap limits are given in bytes.

How much time R spends in the garbage collector will depend on these initial settings and on the trade-off the memory manager makes, when memory fills up, between collecting garbage to free up unused memory and growing these areas. The strategy used for growth can be specified by setting the environment variable R_GC_MEM_GROW to an integer value between 0 and 3. This variable is read at start-up. Higher values grow the heap more aggressively, thus reducing garbage collection time but using more memory.

You can find out the current memory consumption (the heap and cons cells used as numbers and megabytes) by typing gc() at the R prompt. Note that following gcinfo(TRUE), automatic garbage collection always prints memory use statistics.
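A brief sketch of reading the output of gc(). The conversion at the end assumes the 8-byte Vcell size stated above; the reported megabyte figure is rounded, so the two numbers need only agree approximately.

```r
usage <- gc()        # triggers a collection and returns a usage matrix
rownames(usage)      # one row for cons cells, one for the vector heap
usage[, 1:2]         # cells in use and the corresponding megabytes

# Rough check of the vector heap size, assuming 8 bytes per Vcell.
vcells_used <- usage["Vcells", 1]
approx_mb <- vcells_used * 8 / 1024^2
approx_mb
```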

The command-line option --max-ppsize controls the maximum size of the pointer protection stack. This defaults to 50000, but can be increased to allow deep recursion or large and complicated calculations to be done. Note that parts of the garbage collection process go through the full reserved pointer protection stack and hence become slower when the size is increased. Currently the maximum value accepted is 500000.

See Also

An Introduction to R for more command-line options.

Memory-limits for the design limitations.

gc for information on the garbage collector and total memory usage, object.size(a) for the (approximate) size of R object a. memory.profile for profiling the usage of cons cells.


Memory Limits in R

Description

R holds objects it is using in virtual memory. This help file documents the current design limitations on large objects: these differ between 32-bit and 64-bit builds of R.

Details

Currently R runs on 32- and 64-bit operating systems, and most 64-bit OSes (including Linux, Solaris, Windows and macOS) can run either 32- or 64-bit builds of R. The memory limits depend mainly on the build, but for a 32-bit build of R on Windows they also depend on the underlying OS version.

R holds all objects in virtual memory, and there are limits based on theamount of memory that can be used by all objects:

  • There may be limits on the size of the heap and the number of cons cells allowed – see Memory – but these are usually not imposed.

  • There is a limit on the (user) address space of a single process such as the R executable. This is system-specific, and can depend on the executable.

  • The environment may impose limitations on the resources available to a single process: Windows' versions of R do so directly.

Error messages beginning ‘cannot allocate vector of size’ indicate a failure to obtain memory, either because the size exceeded the address-space limit for a process or, more likely, because the system was unable to provide the memory. Note that on a 32-bit build there may well be enough free memory available, but not a large enough contiguous block of address space into which to map it.

There are also limits on individual objects. The storage space cannot exceed the address limit, and if you try to exceed that limit, the error message begins ‘cannot allocate vector of length’. The number of bytes in a character string is limited to 2^{31} - 1 \approx 2 \times 10^9, which is also the limit on each dimension of an array.
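The per-object limit quoted above is the largest value representable as an R integer, which can be inspected directly. The second part is only a sketch of how object.size (from the utils package) can be used to gauge how much storage a vector takes; the variable names are illustrative.

```r
# Largest integer value, equal to the limit on string bytes and array dimensions.
int_max <- .Machine$integer.max
int_max == 2^31 - 1

# Approximate storage used by one million doubles (8 bytes each plus a header).
sz <- utils::object.size(numeric(1e6))
format(sz, units = "Mb")
```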

Unix

The address-space limit is system-specific: 32-bit OSes impose a limit of no more than 4Gb: it is often 3Gb. Running 32-bit executables on a 64-bit OS will have similar limits: 64-bit executables will have an essentially infinite system-specific limit (e.g., 128Tb for Linux on x86_64 CPUs).

See the OS/shell's help on commands such as limit or ulimit for how to impose limitations on the resources available to a single process. For example a bash user could use

ulimit -t 600 -v 4000000

whereas a csh user might use

limit cputime 10m
limit vmemoryuse 4096m

to limit a process to 10 minutes of CPU time and (around) 4Gb of virtual memory. (There are other options to set the RAM in use, but they are not generally honoured.)

Windows

The address-space limit is 2Gb under 32-bit Windows unless the OS's default has been changed to allow more (up to 3Gb). See https://docs.microsoft.com/en-gb/windows/desktop/Memory/physical-address-extension and https://docs.microsoft.com/en-gb/windows/desktop/Memory/4-gigabyte-tuning. Under most 64-bit versions of Windows the limit for a 32-bit build of R is 4Gb: for the oldest ones it is 2Gb. The limit for a 64-bit build of R (imposed by the OS) is 8Tb.

It is not normally possible to allocate as much as 2Gb to a single vector in a 32-bit build of R even on 64-bit Windows because of preallocations by Windows in the middle of the address space.

See Also

object.size(a) for the (approximate) size of R object a.


Profile the Usage of Cons Cells

Description

Lists the usage of the cons cells by SEXPREC type.

Usage

memory.profile()

Details

The current types and their uses are listed in the include file ‘Rinternals.h’.

Value

A vector of counts, named by the types. See typeof for an explanation of types.

See Also

gc for the overall usage of cons cells. Rprofmem and tracemem allow memory profiling of specific code or objects, but need to be enabled at compile time.

Examples

memory.profile()

Merge Two Data Frames

Description

Merge two data frames by common columns or row names, or do other versions of database join operations.

Usage

merge(x, y, ...)

## Default S3 method:
merge(x, y, ...)

## S3 method for class 'data.frame'
merge(x, y, by = intersect(names(x), names(y)),
      by.x = by, by.y = by, all = FALSE, all.x = all, all.y = all,
      sort = TRUE, suffixes = c(".x", ".y"), no.dups = TRUE,
      incomparables = NULL, ...)

Arguments

x, y

data frames, or objects to be coerced to one.

by, by.x, by.y

specifications of the columns used for merging. See ‘Details’.

all

logical; all = L is shorthand for all.x = L and all.y = L, where L is either TRUE or FALSE.

all.x

logical; if TRUE, then extra rows will be added to the output, one for each row in x that has no matching row in y. These rows will have NAs in those columns that are usually filled with values from y. The default is FALSE, so that only rows with data from both x and y are included in the output.

all.y

logical; analogous to all.x.

sort

logical. Should the result be sorted on the by columns?

suffixes

a character vector of length 2 specifying the suffixes to be used for making unique the names of columns in the result which are not used for merging (appearing in by etc).

no.dups

logical indicating that suffixes are appended in more cases to avoid duplicated column names in the result. This was implicitly false before R version 3.5.0.

incomparables

values which cannot be matched. See match. This is intended to be used for merging on one column, so these are incomparable values of that column.

...

arguments to be passed to or from methods.

Details

merge is a generic function whose principal method is for data frames: the default method coerces its arguments to data frames and calls the "data.frame" method.

By default the data frames are merged on the columns with names they both have, but separate specifications of the columns can be given by by.x and by.y. The rows in the two data frames that match on the specified columns are extracted, and joined together. If there is more than one match, all possible matches contribute one row each. For the precise meaning of ‘match’, see match.

Columns to merge on can be specified by name, number or by a logical vector: the name "row.names" or the number 0 specifies the row names. If specified by name it must correspond uniquely to a named column in the input.

If by or both by.x and by.y are of length 0 (a length zero vector or NULL), the result, r, is the Cartesian product of x and y, i.e., dim(r) = c(nrow(x)*nrow(y), ncol(x) + ncol(y)).

If all.x is true, all the non matching cases of x are appended to the result as well, with NA filled in the corresponding columns of y; analogously for all.y.

If the columns in the data frames not used in merging have any common names, these have suffixes (".x" and ".y" by default) appended to try to make the names of the result unique. If this is not possible, an error is thrown.

If a by.x column name matches one of y, and if no.dups is true (as by default), the y version gets suffixed as well, avoiding duplicate column names in the result.

The complexity of the algorithm used is proportional to the length of the answer.

In SQL database terminology, the default value of all = FALSE gives a natural join, a special case of an inner join. Specifying all.x = TRUE gives a left (outer) join, all.y = TRUE a right (outer) join, and both (all = TRUE) a (full) outer join. DBMSes do not match NULL records, equivalent to incomparables = NA in R.
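The join types named above can be compared side by side on two tiny tables. The data frames left_tbl and right_tbl are invented for this sketch; they share only the key column id.

```r
left_tbl  <- data.frame(id = c(1, 2, 3), a = c("p", "q", "r"))
right_tbl <- data.frame(id = c(2, 3, 4), b = c("s", "t", "u"))

inner <- merge(left_tbl, right_tbl)                # natural (inner) join
left  <- merge(left_tbl, right_tbl, all.x = TRUE)  # left outer join
right <- merge(left_tbl, right_tbl, all.y = TRUE)  # right outer join
full  <- merge(left_tbl, right_tbl, all = TRUE)    # full outer join
cross <- merge(left_tbl, right_tbl, by = NULL)     # Cartesian product

# Number of rows produced by each kind of join.
sapply(list(inner = inner, left = left, right = right,
            full = full, cross = cross), nrow)

left[left$id == 1, ]   # the unmatched row of left_tbl has NA in column b
```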

Value

A data frame. The rows are by default lexicographically sorted on the common columns, but for sort = FALSE are in an unspecified order. The columns are the common columns followed by the remaining columns in x and then those in y. If the matching involved row names, an extra character column called Row.names is added at the left, and in all cases the result has ‘automatic’ row names.

Note

This is intended to work with data frames with vector-like columns:some aspects work with data frames containing matrices, but not all.

Currently long vectors are not accepted for inputs, which are thusrestricted to less than 2^31 rows. That restriction also applies tothe result for 32-bit platforms.

See Also

data.frame, by, cbind.

dendrogram for a class which has a merge method.

Examples

authors <- data.frame(
    ## I(*) : use character columns of names to get sensible sort order
    surname = I(c("Tukey", "Venables", "Tierney", "Ripley", "McNeil")),
    nationality = c("US", "Australia", "US", "UK", "Australia"),
    deceased = c("yes", rep("no", 4)))
authorN <- within(authors, { name <- surname; rm(surname) })
books <- data.frame(
    name = I(c("Tukey", "Venables", "Tierney",
               "Ripley", "Ripley", "McNeil", "R Core")),
    title = c("Exploratory Data Analysis",
              "Modern Applied Statistics ...",
              "LISP-STAT",
              "Spatial Statistics", "Stochastic Simulation",
              "Interactive Data Analysis",
              "An Introduction to R"),
    other.author = c(NA, "Ripley", NA, NA, NA, NA,
                     "Venables & Smith"))

(m0 <- merge(authorN, books))
(m1 <- merge(authors, books, by.x = "surname", by.y = "name"))
 m2 <- merge(books, authors, by.x = "name", by.y = "surname")
stopifnot(exprs = {
   identical(m0, m2[, names(m0)])
   as.character(m1[, 1]) == as.character(m2[, 1])
   all.equal(m1[, -1], m2[, -1][ names(m1)[-1] ])
   identical(dim(merge(m1, m2, by = NULL)),
             c(nrow(m1)*nrow(m2), ncol(m1)+ncol(m2)))
})

## "R core" is missing from authors and appears only here :
merge(authors, books, by.x = "surname", by.y = "name", all = TRUE)

## example of using 'incomparables'
x <- data.frame(k1 = c(NA,NA,3,4,5), k2 = c(1,NA,NA,4,5), data = 1:5)
y <- data.frame(k1 = c(NA,2,NA,4,5), k2 = c(NA,NA,3,4,5), data = 1:5)
merge(x, y, by = c("k1","k2")) # NA's match
merge(x, y, by = "k1") # NA's match, so 6 rows
merge(x, y, by = "k2", incomparables = NA) # 2 rows

Diagnostic Messages

Description

Generate a diagnostic message from its arguments.

Usage

message(..., domain = NULL, appendLF = TRUE)
suppressMessages(expr, classes = "message")

packageStartupMessage(..., domain = NULL, appendLF = TRUE)
suppressPackageStartupMessages(expr)

.makeMessage(..., domain = NULL, appendLF = FALSE)

Arguments

...

zero or more objects which can be coerced to character (and which are pasted together with no separator) or (for message only) a single condition object.

domain

see gettext. If NA, messages will not be translated, see also the note in stop.

appendLF

logical: should messages given as a character string have a newline appended?

expr

expression to evaluate.

classes

character, indicating which classes of messages should be suppressed.

Details

message is used for generating ‘simple’ diagnostic messages which are neither warnings nor errors, but nevertheless represented as conditions. Unlike warnings and errors, a final newline is regarded as part of the message, and is optional. The default handler sends the message to the stderr() connection.

If a condition object is supplied to message it should be the only argument, and further arguments will be ignored, with a warning.

While the message is being processed, a muffleMessage restart is available.

suppressMessages evaluates its expression in a context that ignores all ‘simple’ diagnostic messages.

packageStartupMessage is a variant whose messages can be suppressed separately by suppressPackageStartupMessages. (They are still messages, so can be suppressed by suppressMessages.)

.makeMessage is a utility used by message, warning and stop to generate a text message from the ... arguments by possible translation (see gettext) and concatenation (with no separator).
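Because messages are conditions, they can be caught or muffled with the ordinary condition-handling tools. The sketch below is one way to illustrate the trailing newline and the muffleMessage restart; the counter n_seen is an illustrative name, not part of the API.

```r
# Capture the condition instead of letting it print to stderr().
msg <- tryCatch(message("ABC", "DEF"),
                message = function(m) conditionMessage(m))
msg   # the appended newline is part of the message text

# Count messages, then muffle them so nothing reaches stderr().
n_seen <- 0
withCallingHandlers(
  {
    message("one")
    message("two")
  },
  message = function(m) {
    n_seen <<- n_seen + 1
    invokeRestart("muffleMessage")
  })
n_seen
```

Using a calling handler with the restart lets the code after each message continue to run, whereas tryCatch abandons the expression at the first message.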

See Also

warning and stop for generating warnings and errors; conditions for condition handling and recovery.

gettext for the mechanisms for the automated translation of text.

Examples

message("ABC", "DEF")
suppressMessages(message("ABC"))

testit <- function() {
  message("testing package startup messages")
  packageStartupMessage("initializing ...", appendLF = FALSE)
  Sys.sleep(1)
  packageStartupMessage(" done")
}

testit()
suppressPackageStartupMessages(testit())
suppressMessages(testit())

Does a Formal Argument have a Value?

Description

missing can be used to test whether a value was specified as an argument to a function.

Usage

missing(x)

Arguments

x

a formal argument.

Details

missing(x) is only reliable if x has not been altered since entering the function: in particular it will always be false after x <- match.arg(x).

The example shows how a plotting function can be written to work with either a pair of vectors giving x and y coordinates of points to be plotted or a single vector giving y values to be plotted against their indices.

Currently missing can only be used in the immediate body of the function that defines the argument, not in the body of a nested function or a local call. This may change in the future.

This is a ‘special’ primitive function: it must not evaluate its argument.
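A small sketch of the reliability caveat: the test has to be made before the argument is reassigned. The functions has_y and pick are hypothetical helpers written for this illustration.

```r
has_y <- function(x, y) !missing(y)
has_y(1)       # y not supplied
has_y(1, 2)    # y supplied

pick <- function(type = c("linear", "quadratic")) {
  was_missing <- missing(type)   # record the answer before altering 'type'
  type <- match.arg(type)
  c(before = was_missing, after = missing(type))
}
pick()
```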

References

Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988) The New S Language. Wadsworth & Brooks/Cole.

Chambers, J. M. (1998) Programming with Data. A Guide to the S Language. Springer.

See Also

substitute for argument expression; NA for missing values in data.

Examples

myplot <- function(x, y) {
    if(missing(y)) {
        y <- x
        x <- 1:length(y)
    }
    plot(x, y)
}

The (Storage) Mode of an Object

Description

Get or set the ‘mode’ (a kind of ‘type’), or the storage mode of an R object.

Usage

mode(x)
mode(x) <- value
storage.mode(x)
storage.mode(x) <- value

Arguments

x

any R object.

value

a character string giving the desired mode or ‘storage mode’ (type) of the object.

Details

Both mode and storage.mode return a character string giving the (storage) mode of the object (often the same), both relying on the output of typeof(x), see the example below.

mode(x) <- "newmode" changes the mode of object x to newmode. This is only supported if there is an appropriate as.newmode function, for example "logical", "integer", "double", "complex", "raw", "character", "list", "expression", "name", "symbol" and "function". Attributes are preserved (but see below).

storage.mode(x) <- "newmode" is a more efficient primitive version of mode<-, which works for "newmode" which is one of the internal types (see typeof), but not for "single". Attributes are preserved.

As storage mode "single" is only a pseudo-mode in R, it will not be reported by mode or storage.mode: use attr(object, "Csingle") to examine this. However, mode<- can be used to set the mode to "single", which sets the real mode to "double" and the "Csingle" attribute to TRUE. Setting any other mode will remove this attribute.

Note (in the examples below) that some calls have mode "(" which is S compatible.

Mode names

Modes have the same set of names as types (see typeof) except that

  • types "integer" and "double" are returned as "numeric".

  • types "special", "builtin" and "closure" are returned as "function".

  • type "symbol" is called mode "name".

  • type "language" is returned as "(" or "call".
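The mapping in the list above can be tabulated for a handful of objects. The list objs and the table tab are names chosen for this sketch only.

```r
objs <- list(int = 1L, dbl = 1.5, fun = sum, clo = mean,
             sym = quote(x), call = quote(f(x)), paren = quote((x)))

# One row per object: its internal type next to its mode.
tab <- t(sapply(objs, function(o) c(typeof = typeof(o), mode = mode(o))))
tab

# Changing the mode converts the values via the matching as.* function.
v <- 1:3
mode(v) <- "character"
v

# storage.mode<- works on the internal type directly.
w <- 1:3
storage.mode(w) <- "double"
typeof(w)
```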

References

Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988) The New S Language. Wadsworth & Brooks/Cole.

See Also

typeof for the R-internal ‘mode’ or ‘type’, type.convert, attributes.

Examples

require(stats)

sapply(options(), mode)

cex3 <- c("NULL", "1", "1:1", "1i", "list(1)", "data.frame(x = 1)",
  "pairlist(pi)", "c", "lm", "formals(lm)[[1]]", "formals(lm)[[2]]",
  "y ~ x", "expression((1))[[1]]", "(y ~ x)[[1]]",
  "expression(x <- pi)[[1]][[1]]")
lex3 <- sapply(cex3, function(x) eval(str2lang(x)))
mex3 <- t(sapply(lex3,
                 function(x) c(typeof(x), storage.mode(x), mode(x))))
dimnames(mex3) <- list(cex3, c("typeof(.)", "storage.mode(.)", "mode(.)"))
mex3

## This also makes a local copy of 'pi':
storage.mode(pi) <- "complex"
storage.mode(pi)
rm(pi)

Auxiliary Function for Matching

Description

Transform objects for matching via match(), think “match form” -> "mtfrm". base provides the S3 generic and a default plus "POSIXct" and "POSIXlt" methods.

Usage

mtfrm(x)

Arguments

x

an R object

Details

Matching via match will use mtfrm to transform internally classed objects (see is.object) to a vector representation appropriate for matching. The default method performs as.character if this preserves the length.

Ideally, methods for mtfrm should ensure that comparisons of same-classed objects via match are consistent with those employed by methods for duplicated/unique and ==/!= (where applicable).

Value

A vector of the same length as x.


‘Not Available’ / Missing Values

Description

NA is a logical constant of length 1 which contains a missing value indicator. NA can be coerced to any other vector type except raw. There are also constants NA_integer_, NA_real_, NA_complex_ and NA_character_ of the other atomic vector types which support missing values: all of these are reserved words in the R language.

The generic function is.na indicates which elements are missing.

The generic function is.na<- sets elements to NA.

The generic function anyNA implements any(is.na(x)) in a possibly faster way (especially for atomic vectors).

Usage

NA
is.na(x)
anyNA(x, recursive = FALSE)

## S3 method for class 'data.frame'
is.na(x)

is.na(x) <- value

Arguments

x

an R object to be tested: the default method for is.na and anyNA handle atomic vectors, lists, pairlists, and NULL.

recursive

logical: should anyNA be applied recursively to lists and pairlists?

value

a suitable index vector for use with x.

Details

The NA of character type is distinct from the string "NA". Programmers who need to specify an explicit missing string should use NA_character_ (rather than "NA") or set elements to NA using is.na<-.

is.na and anyNA are generic: you can write methods to handle specific classes of objects, see InternalMethods.

Function is.na<- may provide a safer way to set missingness. It behaves differently for factors, for example.

Numerical computations using NA will normally result in NA: a possible exception is where NaN is also involved, in which case either might result (which may depend on the R platform). However, this is not guaranteed and future CPUs and/or compilers may behave differently. Dynamic binary translation may also impact this behavior (with valgrind, computations using NA may result in NaN even when no NaN is involved).

Logical computations treat NA as a missing TRUE/FALSE value, and so may return TRUE or FALSE if the expression does not depend on the NA operand.
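A short sketch of the points above: logical operators can return a definite answer when the NA operand cannot change the outcome, and the missing string differs from the text "NA". The variable names are illustrative.

```r
# The result is known whatever the missing value would have been.
or_known  <- TRUE | NA     # TRUE
and_known <- FALSE & NA    # FALSE
# Here the result does depend on the missing operand.
and_open  <- TRUE & NA     # NA

# The string "NA" is not a missing value; NA_character_ is.
chr_test <- is.na(c("NA", NA_character_))
chr_test

# Setting missingness through the replacement function.
xx <- 0:4
is.na(xx) <- c(2, 4)
xx
```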

The default method for anyNA handles atomic vectors without a class and NULL. It calls any(is.na(x)) on objects with classes and for recursive = FALSE, on lists and pairlists.

Value

The default method for is.na applied to an atomic vector returns a logical vector of the same length as its argument x, containing TRUE for those elements marked NA or, for numeric or complex vectors, NaN, and FALSE otherwise. (A complex value is regarded as NA if either its real or imaginary part is NA or NaN.) dim, dimnames and names attributes are copied to the result.

The default methods also work for lists and pairlists:
For is.na, elementwise the result is false unless that element is a length-one atomic vector and the single element of that vector is regarded as NA or NaN (note that any is.na method for the class of the element is ignored).
anyNA(recursive = FALSE) works the same way as is.na; anyNA(recursive = TRUE) applies anyNA (with method dispatch) to each element.

The data frame method for is.na returns a logical matrix with the same dimensions as the data frame, and with dimnames taken from the row and column names of the data frame.

anyNA(NULL) is false; is.na(NULL) is logical(0) (no longer warning since R version 3.5.0).
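The list and data frame behaviour described above, in a small sketch with made-up inputs (LL, nested and df are illustrative names):

```r
# For lists, only length-one atomic elements can count as missing.
LL <- list(1, NA, c(NA, 1), "NA")
list_na <- is.na(LL)
list_na

# Without recursion the length-two element is not inspected.
nested <- list(1, c(NA, 1))
flat_any <- anyNA(nested)
deep_any <- anyNA(nested, recursive = TRUE)
c(flat = flat_any, deep = deep_any)

# The data frame method returns a logical matrix of the same shape.
df <- data.frame(a = c(1, NA), b = c("x", "y"))
df_na <- is.na(df)
df_na

null_any <- anyNA(NULL)
null_na  <- is.na(NULL)
```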

References

Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988) The New S Language. Wadsworth & Brooks/Cole.

Chambers, J. M. (1998) Programming with Data. A Guide to the S Language. Springer.

See Also

NaN, is.nan, etc., and the utility function complete.cases.

na.action, na.omit, na.fail on how methods can be tuned to deal with missing values.

Examples

is.na(c(1, NA))        #> FALSE  TRUE
is.na(paste(c(1, NA))) #> FALSE FALSE

(xx <- c(0:4))
is.na(xx) <- c(2, 4)
xx                     #> 0 NA  2 NA  4
anyNA(xx) # TRUE

# Some logical operations do not return NA
c(TRUE, FALSE) & NA
c(TRUE, FALSE) | NA

## Measure speed difference in a favourable case:
## the difference depends on the platform, on most ca 3x.
x <- 1:10000; x[5000] <- NaN  # coerces x to be double
if(require("microbenchmark")) { # does not work reliably on all platforms
  print(microbenchmark(any(is.na(x)), anyNA(x)))
} else {
  nSim <- 2^13
  print(rbind(is.na = system.time(replicate(nSim, any(is.na(x)))),
              anyNA = system.time(replicate(nSim, anyNA(x)))))
}

## anyNA() can work recursively with list()s:
LL <- list(1:5, c(NA, 5:8), c("A", "NA"), c("a", NA_character_))
L2 <- LL[c(1,3)]
sapply(LL, anyNA); c(anyNA(LL), anyNA(LL, TRUE))
sapply(L2, anyNA); c(anyNA(L2), anyNA(L2, TRUE))

## ... lists, and hence data frames, too:
dN <- dd <- USJudgeRatings; dN[3,6] <- NA
anyNA(dd) # FALSE
anyNA(dN) # TRUE

Names and Symbols

Description

A ‘name’ (also known as a ‘symbol’) is a way to refer to R objects by name (rather than the value of the object, if any, bound to that name).

as.name and as.symbol are identical: they attempt to coerce the argument to a name.

is.symbol and the identical is.name return TRUE or FALSE depending on whether the argument is a name or not.

Usage

as.symbol(x)
is.symbol(x)

as.name(x)
is.name(x)

Arguments

x

object to be coerced or tested.

Details

Names are limited to 10,000 bytes (and were limited to 256 bytes in versions of R before 2.13.0).

as.name first coerces its argument internally to a character vector (so methods for as.character are not used). It then takes the first element and, provided it is not "", returns a symbol of that name (and if the element is NA_character_, the name is `NA`).

as.name is implemented as as.vector(x, "symbol"), and hence will dispatch methods for the generic function as.vector.

is.name and is.symbol are primitive functions.
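The coercion rules described above can be checked with a short sketch. The symbol names used here are arbitrary and chosen only for illustration.

```r
# Only the first element of a character vector is used.
s <- as.name(c("alpha", "beta"))
identical(s, quote(alpha))      # TRUE

# Non-character input is coerced to character internally.
n <- as.name(42)
typeof(n)                       # "symbol"
as.character(n)                 # "42"

# A missing string becomes the symbol `NA`.
na_sym <- as.name(NA_character_)
as.character(na_sym)            # "NA"

# The empty string cannot be turned into a symbol: this is an error.
empty <- tryCatch(as.name(""), error = function(e) "error")
empty                           # "error"
```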

Value

For as.name and as.symbol, an R object of type "symbol" (see typeof).

For is.name and is.symbol, a length-one logical vector with value TRUE or FALSE.

Note

The term ‘symbol’ is from the LISP background of R, whereas ‘name’ has been the standard S term for this.

References

Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988) The New S Language. Wadsworth & Brooks/Cole.

See Also

call, is.language. For the internal object mode, typeof.

plotmath for another use of ‘symbol’.

Examples

an <- as.name("arrg")
is.name(an) # TRUE
mode(an)    # name
typeof(an)  # symbol

The Names of an Object

Description

Functions to get or set the names of an object.

Usage

names(x)
names(x) <- value

Arguments

x

an R object.

value

a character vector of up to the same length as x, or NULL.

Details

names is a generic accessor function, and names<- is a generic replacement function. The default methods get and set the "names" attribute of a vector (including a list) or pairlist.

For an environment env, names(env) gives the names of the corresponding list, i.e., names(as.list(env, all.names = TRUE)) which are also given by ls(env, all.names = TRUE, sorted = FALSE). If the environment is used as a hash table, names(env) are its “keys”.

If value is shorter than x, it is extended by character NAs to the length of x.

It is possible to update just part of the names attribute via the general rules: see the examples. This works because the expression there is evaluated as z <- "names<-"(z, "[<-"(names(z), 3, "c2")).

The name "" is special: it is used to indicate that there is no name associated with an element of a (atomic or generic) vector. Subscripting by "" will match nothing (not even elements which have no name).

A name can be character NA, but such a name will never be matched and is likely to lead to confusion.

Both are primitive functions.
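A minimal sketch of three of the rules above: padding with NA, partial replacement of names, and the special name "". The objects are made up for illustration.

```r
# A shorter 'value' is padded with NA_character_.
x <- 1:3
names(x) <- c("a", "b")
names(x)                        # "a" "b" NA

# Partial update, written once with replacement syntax and once
# as the nested call it is evaluated as.
z1 <- z2 <- list(a = 1, b = "c", c = 1:3)
names(z1)[3] <- "c2"
z2 <- "names<-"(z2, "[<-"(names(z2), 3, "c2"))
identical(z1, z2)               # TRUE

# "" marks an unnamed element, and subscripting by "" matches nothing.
y <- c(a = 1, 2)
names(y)                        # "a" ""
unname(is.na(y[""]))            # TRUE: no element was matched
```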

Value

For names, NULL or a character vector of the same length as x. (NULL is given if the object has no names, including for objects of types which cannot have names.) For an environment, the length is the number of objects in the environment but the order of the names is arbitrary.

For names<-, the updated object. (Note that the value of names(x) <- value is that of the assignment, value, not the return value from the left-hand side.)

Note

For vectors, the names are one of the attributes with restrictions on the possible values. For pairlists, the names are the tags and converted to and from a character vector.

For a one-dimensional array the names attribute really is dimnames[[1]].

Formally classed aka “S4” objects typically have slotNames() (and no names()).

References

Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988) The New S Language. Wadsworth & Brooks/Cole.

See Also

slotNames, dimnames.

Examples

# print the names attribute of the islands data set
names(islands)
# remove the names attribute
names(islands) <- NULL
islands
rm(islands) # remove the copy made

z <- list(a = 1, b = "c", c = 1:3)
names(z)
# change just the name of the third element.
names(z)[3] <- "c2"
z

z <- 1:3
names(z)
## assign just one name
names(z)[2] <- "b"
z

The Number of Arguments to a Function

Description

When used inside a function body, nargs returns the number of arguments supplied to that function, including positional arguments left blank.

Usage

nargs()

Details

The count includes empty (missing) arguments, so that foo(x,,z) will be considered to have three arguments (see ‘Examples’). This can occur in rather indirect ways, so for example x[] might dispatch a call to `[.some_method`(x, ) which is considered to have two arguments.

This is a primitive function.
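The indirect case mentioned above can be made visible with a small sketch. The class "demo" and its `[` method are hypothetical and exist only to report what nargs() sees.

```r
# Hypothetical S3 class "demo" whose `[` method just reports nargs().
`[.demo` <- function(x, i, j, ...) nargs()
obj <- structure(1:6, class = "demo")

obj[]       # 2: x plus one empty argument
obj[1]      # 2
obj[1, ]    # 3: the blank after the comma is counted
obj[, 2]    # 3
```

Methods such as `[.data.frame` rely on this kind of count to tell list-style indexing from matrix-style indexing.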

References

Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988) The New S Language. Wadsworth & Brooks/Cole.

See Also

args, formals and sys.call.

Examples

tst <- function(a, b = 3, ...) {nargs()}
tst() # 0
tst(clicketyclack) # 1 (even non-existing)
tst(c1, a2, rr3) # 3

foo <- function(x, y, z, w) {
   cat("call was ", deparse(match.call()), "\n", sep = "")
   nargs()
}
foo()      # 0
foo(, , 3) # 3
foo(z = 3) # 1, even though this is the same call

nargs() # not really meaningful

Count the Number of Characters (or Bytes or Width)

Description

nchar takes a character vector as an argument and returns a vector whose elements contain the sizes of the corresponding elements of x. Internally, it is a generic, for which methods can be defined (see InternalMethods).

nzchar is a fast way to find out if elements of a character vector are non-empty strings.

Usage

nchar(x, type = "chars", allowNA = FALSE, keepNA = NA)

nzchar(x, keepNA = FALSE)

Arguments

x

character vector, or a vector to be coerced to a character vector. Giving a factor is an error.

type

character string: partial matching to one of c("bytes", "chars", "width"). See ‘Details’.

allowNA

logical: should NA be returned for invalid multibyte strings or "bytes"-encoded strings (rather than throwing an error)?

keepNA

logical: should NA be returned when x is NA? If false, nchar() returns 2, as that is the number of printing characters used when strings are written to output, and nzchar() is TRUE. The default for nchar(), NA, means to use keepNA = TRUE unless type is "width".

Details

The ‘size’ of a character string can be measured in one of three ways (corresponding to the type argument):

bytes

The number of bytes needed to store the string (plus in C a final terminator which is not counted).

chars

The number of characters.

width

The number of columns cat will use to print the string in a monospaced font. The same as chars if this cannot be calculated.

These will often be the same, and usually will be in single-byte locales (but note how type determines the default for keepNA). There will be differences between the first two with multibyte character sequences, e.g. in UTF-8 locales.

The internal equivalent of the default method of as.character is performed on x (so there is no method dispatch). If you want to operate on non-vector objects passing them through deparse first will be required.

Value

For nchar, an integer vector giving the sizes of each element. For missing values (i.e., NA, i.e., NA_character_), nchar() returns NA_integer_ if keepNA is true, and 2, the number of printing characters, if false.

type = "width" gives (an approximation to) the number of columns used in printing each element in a terminal font, taking into account double-width, zero-width and ‘composing’ characters. The approximation is likely to be poor when there are unassigned or non-printing characters.

If allowNA = TRUE and an element is detected as invalid in a multi-byte character set such as UTF-8, its number of characters and the width will be NA. Otherwise the number of characters will be non-negative, so !is.na(nchar(x, "chars", TRUE)) is a test of validity.

A character string marked with "bytes" encoding (see Encoding) has a number of bytes, but neither a known number of characters nor a width, so the latter two types are NA if allowNA = TRUE, otherwise an error.

Names, dims and dimnames are copied from the input.

For nzchar, a logical vector of the same length as x, true if and only if the element has non-zero size; if the element is NA, nzchar() is true when keepNA is false (the default) or NA, and NA otherwise.
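How keepNA and type interact for a missing string can be seen in a short sketch. The vector s is made up, and NA_character_ is used explicitly so that no coercion from logical NA is involved.

```r
s <- c("abc", NA_character_, "")

nchar(s)                      # 3 NA 0   (keepNA = NA acts as TRUE for "chars")
nchar(s, keepNA = FALSE)      # 3  2 0
nchar(s, type = "width")      # 3  2 0   (keepNA = NA acts as FALSE for "width")

nzchar(s)                     # TRUE TRUE FALSE
nzchar(s, keepNA = TRUE)      # TRUE   NA FALSE
```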

Note

This does not by default give the number of characters that will be used to print() the string. Use encodeString to find that.

Where character strings have been marked as UTF-8, the number of characters and widths will be computed in UTF-8, even though printing may use escapes such as ‘<U+2642>’ in a non-UTF-8 locale.

The concept of ‘width’ is a slippery one even in a monospaced font. Some human languages have the concept of combining characters, in which two or more characters are rendered together: an example would be "y\u306", which is two characters of width one: combining characters are given width zero, and there are other zero-width characters such as the zero-width space "\u200b".

Some East Asian languages have ‘wide’ characters, ideographs which are conventionally printed across two columns when mixed with ASCII and other ‘narrow’ characters in those languages. The problem is that whether a computer prints wide characters over two or one columns depends on the font, with it not being uncommon to use two columns in a font intended for East Asian users and a single column in a ‘Western’ font. Unicode has encodings for ‘fullwidth’ versions of ASCII characters and ‘halfwidth’ versions of Katakana (Japanese) and Hangul (Korean) characters. Then there is the ‘East Asian Ambiguous class’ (Greek, Cyrillic, signs, some accented Latin chars, etc), for which the historical practice was to use two columns in East Asia and one elsewhere. The width quoted by nchar for characters in that class (and some others) depends on the locale, being one except in some East Asian locales on some OSes (notably Windows).

Control characters are usually given width zero: this includes CR and LF. Computing the width of a string containing control characters should be avoided (and may depend on the OS and R version).

References

Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988) The New S Language. Wadsworth & Brooks/Cole.

Unicode Standard Annex #11: East Asian Width. https://www.unicode.org/reports/tr11/

See Also

strwidth giving width of strings for plotting; paste, substr, strsplit

Examples

x <- c("asfef", "qwerty", "yuiop[", "b", "stuff.blah.yech")
nchar(x)
# 5  6  6  1 15

nchar(deparse(mean))
# 18 17  <-- unless mean differs from base::mean

## NA behaviour as function of keepNA=* :
logi <- setNames(, c(FALSE, NA, TRUE))
sapply(logi, \(k) data.frame(nchar =  nchar(NA, keepNA = k),
                             nzchar = nzchar(NA, keepNA = k)))

x[3] <- NA; x
nchar(x, keepNA = TRUE)  #  5  6 NA  1 15
nchar(x, keepNA = FALSE) #  5  6  2  1 15
stopifnot(identical(nchar(x), nchar(x, keepNA = TRUE)),
          identical(nchar(x, "w"), nchar(x, keepNA = FALSE)),
          identical(is.na(x), is.na(nchar(x))))

##' nchar() for all three types :
nchars <- function(x, ...)
   vapply(c("chars", "bytes", "width"),
          function(tp) nchar(x, tp, ...), integer(length(x)))

nchars("\u200b") # in R versions (>= 2015-09-xx):
## chars bytes width
##     1     3     0

data.frame(x, nchars(x)) ## all three types : same unless for NA
## force the same by forcing 'keepNA':
(ncT <- nchars(x, keepNA = TRUE))  ## .... NA NA NA ....
(ncF <- nchars(x, keepNA = FALSE)) ## ....  2  2  2 ....
stopifnot(apply(ncT, 1, function(.) length(unique(.))) == 1,
          apply(ncF, 1, function(.) length(unique(.))) == 1)

The Number of Levels of a Factor

Description

Return the number of levels which its argument has.

Usage

nlevels(x)

Arguments

x

an object, usually a factor.

Details

This is usually applied to a factor, but other objects can have levels.

The actual factor levels (if they exist) can be obtained with the levels function.

Value

The length of levels(x), which is zero if x has no levels.
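A brief sketch of the value described above, using a made-up factor with one unused level:

```r
f <- factor(c("lo", "hi", "lo"), levels = c("lo", "mid", "hi"))

nlevels(f)                 # 3: unused levels are counted too
nlevels(droplevels(f))     # 2
nlevels(1:10)              # 0: an integer vector has no levels
```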

See Also

levels, factor.

Examples

nlevels(gl(3, 7)) # = 3

Class for ‘no quote’ Printing of Character Strings

Description

Print character strings without quotes.

Usage

noquote(obj, right = FALSE)

## S3 method for class 'noquote'
print(x, quote = FALSE, right = FALSE, ...)

## S3 method for class 'noquote'
c(..., recursive = FALSE)

Arguments

obj

any R object, typically a vector of character strings.

right

optional logical eventually to be passed to print(), used by print.default(), indicating whether or not strings should be right aligned.

x

an object of class "noquote".

quote, ...

further options passed to next methods, such as print.

recursive

for compatibility with the generic c function.

Details

noquote returns its argument as an object of class "noquote". There is a method for c() and subscript method ("[.noquote") which ensures that the class is not lost by subsetting. The print method (print.noquote) prints character strings without quotes ("...." is printed as ....).

If right is specified in a call print(x, right=*), it takes precedence over a possible right setting of x, e.g., created by x <- noquote(*, right=TRUE).

These functions exist both as utilities and as an example of using (S3) class and object orientation.
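The class-preserving methods described above can be observed directly. This is a small sketch on a made-up vector.

```r
nq <- noquote(c("a", "b", "c"))

class(nq[2:3])                # "noquote": kept by the `[` method
class(c(nq, "d"))             # "noquote": kept by the c() method

# The quotes return once the class is removed.
print(nq)                     # [1] a b c
print(unclass(nq))            # [1] "a" "b" "c"
```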

Author(s)

Martin Maechler

See Also

methods, class, print.

Examples

letters
nql <- noquote(letters)
nql
nql[1:4] <- "oh"
nql[1:12]

cmp.logical <- function(log.v)
{
    ## Purpose: compact printing of logicals
    log.v <- as.logical(log.v)
    noquote(if(length(log.v) == 0) "()" else c(".", "|")[1 + log.v])
}
cmp.logical(stats::runif(20) > 0.8)

chmat <- as.matrix(format(stackloss)) # a "typical" character matrix
## noquote(*, right=TRUE)  so it prints exactly like a data frame
chmat <- noquote(chmat, right = TRUE)
chmat

Compute the Norm of a Matrix

Description

Computes a matrix norm of x using LAPACK. The norm can be the one ("O") norm, the infinity ("I") norm, the Frobenius ("F") norm, the maximum modulus ("M") among elements of a matrix, or the “spectral” or "2"-norm, as determined by the value of type.

Usage

norm(x, type= c("O","I","F","M","2"))

Arguments

x

numeric matrix; note that packages such as Matrix define more norm() methods.

type

character string, specifying the type of matrix norm to be computed. A character indicating the type of norm desired.

"O", "o" or "1"

specifies the one norm (maximum absolute column sum);

"I" or "i"

specifies the infinity norm (maximum absolute row sum);

"F", "f", "E" or "e"

specifies the Frobenius norm (the Euclidean norm of x treated as if it were a vector);

"M" or "m"

specifies the maximum modulus of all the elements in x; and

"2"

specifies the “spectral” or 2-norm, which is the largest singular value (svd) of x.

The default is "O". Only the first character of type[1] is used.

Details

The base method of norm() calls the LAPACK function dlange.

Note that the 1-, Inf- and "M" norms are faster to calculate than the Frobenius one.

Unsuccessful results from the underlying LAPACK code will result in anerror giving a positive error code: these can only be interpreted bydetailed study of the FORTRAN code.

Value

The matrix norm, a non-negative number. Zero for a 0-extent (empty) matrix.
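The definitions listed under ‘Arguments’ can be checked against direct computations in base R. The matrix m below is made up for illustration.

```r
m <- matrix(c(1, -2, 3, 4), nrow = 2)   # columns are (1, -2) and (3, 4)

all.equal(norm(m, "O"), max(colSums(abs(m))))   # one norm: 7
all.equal(norm(m, "I"), max(rowSums(abs(m))))   # infinity norm: 6
all.equal(norm(m, "F"), sqrt(sum(m^2)))         # Frobenius norm: sqrt(30)
all.equal(norm(m, "M"), max(abs(m)))            # maximum modulus: 4
all.equal(norm(m, "2"), max(svd(m)$d))          # spectral norm
```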

Source

Except for type = "2", the LAPACK routine DLANGE.

LAPACK is from https://netlib.org/lapack/.

References

Anderson, E., et al (1994). LAPACK User's Guide, 2nd edition, SIAM, Philadelphia.

See Also

rcond for the (reciprocal) condition number.

Examples

(x1 <- cbind(1, 1:10))
norm(x1)
norm(x1, "I")
norm(x1, "M")
stopifnot(all.equal(norm(x1, "F"),
                    sqrt(sum(x1^2))))

hilbert <- function(n) { i <- 1:n; 1 / outer(i - 1, i, `+`) }
h9 <- hilbert(9)
## all 5 (4 different) types of norm:
(nTyp <- eval(formals(base::norm)$type))
sapply(nTyp, norm, x = h9)
stopifnot(exprs = { # 0-extent matrices:
    sapply(nTyp, norm, x = matrix(, 1, 0)) == 0
    sapply(nTyp, norm, x = matrix(, 0, 0)) == 0
})

Express File Paths in Canonical Form

Description

Convert file paths to canonical form for the platform, to display them in a user-understandable form and so that relative and absolute paths can be compared.

Usage

normalizePath(path, winslash="\\", mustWork=NA)

Arguments

path

character vector of file paths.

winslash

the separator to be used on Windows – ignored elsewhere. Must be one of c("/", "\\").

mustWork

logical: if TRUE then an error is given if the result cannot be determined; if NA then a warning.

Details

Tilde-expansion (see path.expand) is first done on paths.

Where the Unix-alike platform supports it, this attempts to turn paths into absolute paths in their canonical form (no ‘./’, ‘../’ nor symbolic links). It relies on the POSIX system function realpath: if the platform does not have that (we know of no current example) then the result will be an absolute path but might not be canonical. Even where realpath is used the canonical path need not be unique, for example via hard links or multiple mounts.

On Windows it converts relative paths to absolute paths, resolves symbolic links, converts short names for path elements to long names and ensures the separator is that specified by winslash. It will match each path element case-insensitively or case-sensitively as during the usual name lookup and return the canonical case. It relies on Windows API function GetFinalPathNameByHandle and in case of an error (such as insufficient permissions) it currently falls back to the R 3.6 (and older) implementation, which relies on GetFullPathName and GetLongPathName with limitations described in the Notes section. An attempt is made not to introduce UNC paths in presence of mapped drives or symbolic links: if GetFinalPathNameByHandle returns a UNC path, but GetLongPathName returns a path starting with a drive letter, R falls back to the R 3.6 (and older) implementation. UTF-8-encoded paths not valid in the current locale can be used.

mustWork = FALSE is useful for expressing paths for use in messages.
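As a minimal sketch of the two uses described above (comparing paths, and expressing a path that may not exist), the following works on a temporary file. The file and directory names are made up.

```r
f <- tempfile(fileext = ".txt")
file.create(f)

# Two spellings of the same existing file normalize to the same string.
direct   <- normalizePath(f)
indirect <- normalizePath(file.path(dirname(f), ".", basename(f)))
identical(direct, indirect)

# A path that does not exist: no error and no warning with mustWork = FALSE.
missing_path <- file.path(tempdir(), "no-such-dir", "file.txt")
shown <- normalizePath(missing_path, mustWork = FALSE)

unlink(f)
```

What `shown` contains for the missing path is system-dependent, as noted under ‘Value’.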

Value

A character vector.

If an input is not a real path the result is system-dependent (unless mustWork = TRUE, when this should be an error). It will be either the corresponding input element or a transformation of it into an absolute path.

Converting to an absolute file path can fail for a large number of reasons. The most common are

Note

The canonical form of paths may not be what you expect. For example, on macOS absolute paths such as ‘/tmp’ and ‘/var’ are symbolic links. On Linux, a path produced by bash process substitution is a symbolic link (such as ‘/proc/fd/63’) to a pipe and there is no canonical form of such path. In R 3.6 and older on Windows, symlinks will not be resolved and the long names for path elements will be returned with the case in which they are in path, which may not be canonical in case-insensitive folders.

Examples

cat(normalizePath(c(R.home(), tempdir())), sep="\n")

Not Yet Implemented Functions and Unused Arguments

Description

In order to pinpoint missing functionality, the R core team uses these functions for missing R functions and not yet used arguments of existing R functions (which are typically there for compatibility purposes).

You are very welcome to contribute your code ...

Usage

.NotYetImplemented()
.NotYetUsed(arg, error = TRUE)

Arguments

arg

an argument of a function that is not yet used.

error

a logical. If TRUE, an error is signalled; if FALSE, only a warning is given.
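A sketch of how a function might flag an argument it accepts but does not use yet. The function rescale and its method argument are hypothetical.

```r
# Hypothetical function: 'method' is accepted for compatibility but ignored.
rescale <- function(x, factor = 1, method) {
  if (!missing(method)) .NotYetUsed("method", error = FALSE)
  x * factor
}

rescale(1:3, 2)                        # 2 4 6, silently

# With 'method' supplied, a warning is signalled and the result is unchanged.
res <- withCallingHandlers(
  rescale(1:3, 2, method = "robust"),
  warning = function(w) {
    message("caught: ", conditionMessage(w))
    invokeRestart("muffleWarning")
  })
res                                    # 2 4 6
```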

See Also

the contrary, Deprecated and Defunct for outdated code.

Examples

require(graphics)
barplot(1:5, inside = TRUE) # 'inside' is not yet used

The Number of Rows/Columns of an Array

Description

nrow and ncol return the number of rows or columns present in x. NCOL and NROW do the same treating a vector as 1-column matrix, even a 0-length vector, compatibly with as.matrix() or cbind(), see the example.

Usage

nrow(x)
ncol(x)
NCOL(x)
NROW(x)

Arguments

x

a vector, array, data frame, or NULL.

Value

an integer of length 1 or NULL, the latter only for ncol and nrow.
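The difference between the lower-case and upper-case forms matters when input may be a plain vector. A short sketch, with a hypothetical helper n_obs:

```r
v <- 1:4
nrow(v)        # NULL: a plain vector has no dim attribute
NROW(v)        # 4
NCOL(v)        # 1

# Hypothetical helper that accepts a vector, matrix or data frame.
n_obs <- function(x) NROW(x)
n_obs(v)                        # 4
n_obs(matrix(0, 5, 2))          # 5
n_obs(data.frame(a = 1:3))      # 3
```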

References

Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988) The New S Language. Wadsworth & Brooks/Cole (ncol and nrow.)

See Also

dim which returns all dimensions, and length which gives a number (a ‘count’) also in cases where dim() is NULL, and hence nrow() and ncol() return NULL; array, matrix.

Examples

ma <- matrix(1:12, 3, 4)
nrow(ma)   # 3
ncol(ma)   # 4

ncol(array(1:24, dim = 2:4)) # 3, the second dimension
NCOL(1:12) # 1
NROW(1:12) # 12, the length() of the vector

## as.matrix() produces 1-column matrices from 0-length vectors,
## and so does cbind() :
dim(as.matrix(numeric())) # 0 1
dim(    cbind(numeric())) # ditto
NCOL(numeric()) # 1
## However, as.matrix(NULL) fails and cbind(NULL) gives NULL, hence for
## consistency:
NCOL(NULL)      # 0
## (This gave 1 in R < 4.4.0.)

Double Colon and Triple Colon Operators

Description

Accessing exported and internal variables, i.e. R objects (including lazy loaded data sets) in a namespace.

Usage

pkg::name
pkg:::name

Arguments

pkg

package name: symbol or literal character string.

name

variable name: symbol or literal character string.

Details

For a package pkg, pkg::name returns the value of the exported variable name in namespace pkg, whereas pkg:::name returns the value of the internal variable name. The package namespace will be loaded if it was not loaded before the call, but the package will not be attached to the search path.

Specifying a variable or package that does not exist is an error.

Note that pkg::name does not access the objects in the environment package:pkg (which does not exist until the package's namespace is attached): the latter may contain objects not exported from the namespace. It can access datasets made available by lazy-loading.
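A short sketch of the behaviour described above, using the standard package tools. The name no_such_object_xyz is deliberately made up to trigger the error case.

```r
# Call an exported function without attaching its package.
ext <- tools::file_ext("report.txt")
ext                                        # "txt"

# The namespace is now loaded; whether it is on the search path
# depends only on whether it was attached separately.
isNamespaceLoaded("tools")                 # TRUE
"package:tools" %in% search()

# 'name' may be given as a literal character string.
base::"+"(1, 2)                            # 3

# Asking for something that is not exported is an error.
res <- tryCatch(stats::no_such_object_xyz, error = function(e) "error")
res                                        # "error"
```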

Note

It is typically a design mistake to use ::: in your code since the corresponding object has probably been kept internal for a good reason. Consider contacting the package maintainer if you feel the need to access the object for anything but mere inspection.

See Also

get to access an object masked by another of the same name. loadNamespace, asNamespace for more about namespaces.

Examples

base::log
base::"+"

## Beware --  use ':::' at your own risk! (see "Details")
stats:::coef.default

Hooks for Namespace Events

Description

Packages can supply functions to be called when loaded, attached, detached or unloaded.

Usage

.onLoad(libname, pkgname)
.onAttach(libname, pkgname)
.onUnload(libpath)
.onDetach(libpath)
.Last.lib(libpath)

Arguments

libname

a character string giving the library directory where the package defining the namespace was found.

pkgname

a character string giving the name of the package.

libpath

a character string giving the complete path to the package.

Details

After loading, loadNamespace looks for a hook function named .onLoad and calls it (with two unnamed arguments) before sealing the namespace and processing exports.

When the package is attached (via library or attachNamespace), the hook function .onAttach is looked for and if found is called (with two unnamed arguments) before the package environment is sealed.

If a function .onDetach is in the namespace or .Last.lib is exported from the package, it will be called (with a single argument) when the package is detached. Beware that it might be called if .onAttach has failed, so it should be written defensively. (It is called within tryCatch, so errors will not stop the package being detached.)

If a namespace is unloaded (via unloadNamespace), a hook function .onUnload is run (with a single argument) before final unloading.

Note that the code in .onLoad and .onUnload should not assume any package except the base package is on the search path. Objects in the current package will be visible (unless this is circumvented), but objects from other packages should be imported or the double colon operator should be used.

.onLoad, .onUnload, .onAttach and .onDetach are looked for as internal objects in the namespace and should not be exported (whereas .Last.lib should be).

Note that packages are not detached nor namespaces unloaded at the end of an R session unless the user arranges to do so (e.g., via .Last).

Anything needed for the functioning of the namespace should be handled at load/unload times by the .onLoad and .onUnload hooks. For example, DLLs can be loaded (unless done by a useDynLib directive in the ‘NAMESPACE’ file) and initialized in .onLoad and unloaded in .onUnload. Use .onAttach only for actions that are needed only when the package becomes visible to the user (for example a start-up message) or need to be run after the package environment has been created.

Good practice

Loading a namespace should where possible be silent, with startup messages given by .onAttach. These messages (and any essential ones from .onLoad) should use packageStartupMessage so they can be silenced where they would be a distraction.

There should be no calls to library nor require in these hooks. The way for a package to load other packages is via the ‘Depends’ field in the ‘DESCRIPTION’ file: this ensures that the dependence is documented and packages are loaded in the correct order. Loading a namespace should not change the search path, so rather than attach a package, dependence of a namespace on another package should be achieved by (selectively) importing from the other package's namespace.

Uses of library with argument help to display basic information about the package should use format on the computed package information object and pass this to packageStartupMessage.

There should be no calls to installed.packages in startup code: it is potentially very slow and may fail in versions of R before 2.14.2 if package installation is going on in parallel. See its help page for alternatives.

Compiled code should be loaded (e.g., via library.dynam) in .onLoad or a useDynLib directive in the ‘NAMESPACE’ file, and not in .onAttach. Similarly, compiled code should not be unloaded (e.g., via library.dynam.unload) in .Last.lib nor .onDetach, only in .onUnload.
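The division of labour recommended above might look as follows in a package's R code. This is only a sketch: the package name demoPkg and the option demoPkg.verbose are hypothetical, and a real package would place these definitions in one of its source files rather than run them at top level.

```r
# Load time: set up state the namespace needs, silently.
.onLoad <- function(libname, pkgname) {
  # Set a package option only if the user has not already chosen a value.
  if (is.null(getOption("demoPkg.verbose")))
    options(demoPkg.verbose = FALSE)
  invisible()
}

# Attach time: only user-facing actions, via packageStartupMessage()
# so that suppressPackageStartupMessages() can silence them.
.onAttach <- function(libname, pkgname) {
  packageStartupMessage("demoPkg attached from ", libname)
}

# Unload time: undo what .onLoad did.
.onUnload <- function(libpath) {
  # A package with compiled code loaded in .onLoad would also call
  # library.dynam.unload("demoPkg", libpath) here.
  options(demoPkg.verbose = NULL)
  invisible()
}
```

Calling the hooks by hand, for example `.onLoad("lib", "demoPkg")`, is a convenient way to test them outside a package.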

See Also

setHook shows how users can set hooks on the same events, and lists the sequence of events involving all of the hooks.

reg.finalizer for hooks to be run at the end of a session.

loadNamespace for more about namespaces.


Loading and Unloading Name Spaces

Description

Functions to load and unload name spaces.

Usage

attachNamespace(ns, pos = 2L, depends = NULL, exclude, include.only)
loadNamespace(package, lib.loc = NULL,
              keep.source = getOption("keep.source.pkgs"),
              partial = FALSE, versionCheck = NULL,
              keep.parse.data = getOption("keep.parse.data.pkgs"))
requireNamespace(package, ..., quietly = FALSE)
loadedNamespaces()
unloadNamespace(ns)
isNamespaceLoaded(name)

Arguments

ns

string or name space object.

pos

integer specifying position to attach.

depends

NULL or a character vector of dependencies to be recorded in object .Depends in the package.

package

string naming the package/name space to load.

lib.loc

character vector specifying library search path (the location of R library trees to search through).

keep.source

now ignored except during package installation.

keep.parse.data

ignored except during package installation.

partial

logical; if true, stop just after loading code.

versionCheck

NULL or a version specification (a list with components op and version).

quietly

logical: should progress and error messages be suppressed?

name

string or ‘name’, see as.symbol, of a package, e.g., "stats".

exclude, include.only

character vectors; see library.

...

further arguments to be passed to loadNamespace.

Details

The functions loadNamespace and attachNamespace are usually called implicitly when library is used to load a name space and any imports needed. However it may be useful at times to call these functions directly.

loadNamespace loads the specified name space and registers it in an internal data base. A request to load a name space when one of that name is already loaded has no effect. The arguments have the same meaning as the corresponding arguments to library, whose help page explains the details of how a particular installed package comes to be chosen. After loading, loadNamespace looks for a hook function named .onLoad as an internal variable in the name space (it should not be exported). Partial loading is used to support installation with lazy-loading.

Optionally the package licence is checked during loading: see section ‘Licenses’ in the help for library.

loadNamespace does not attach the name space it loads to the search path. attachNamespace can be used to attach a frame containing the exported values of a name space to the search path (but this is almost always done via library). The hook function .onAttach is run after the name space exports are attached.

requireNamespace is a wrapper for loadNamespace analogous to require that returns a logical value.

loadedNamespaces returns a character vector of the names of the loaded name spaces.

isNamespaceLoaded(pkg) is equivalent to but more efficient than pkg %in% loadedNamespaces().

unloadNamespace can be used to attempt to force a name space to be unloaded. If the name space is attached, it is first detached, thereby running a .onDetach or .Last.lib function in the name space if one is exported. An error is signaled and the name space is not unloaded if the name space is imported by other loaded name spaces. If defined, a hook function .onUnload is run before removing the name space from the internal registry.

See the comments in the help for detach about some issues with unloading and reloading name spaces.
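A common use of requireNamespace is to test for an optional dependency without attaching it. The following sketch assumes a helper called has_pkg and a package name that is not installed; both are made up for illustration.

```r
# Hypothetical helper: TRUE if the namespace can be loaded, FALSE otherwise.
has_pkg <- function(pkg) requireNamespace(pkg, quietly = TRUE)

has_pkg("stats")                    # TRUE
has_pkg("surelyNotInstalledPkg")    # FALSE, with no error

# loadNamespace() returns the namespace environment itself.
ns <- loadNamespace("stats")
identical(ns, asNamespace("stats")) # TRUE
"stats" %in% loadedNamespaces()     # TRUE
```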

Value

attachNamespace returns invisibly the package environment it adds to the search path.

loadNamespace returns the name space environment, either one already loaded or the one the function causes to be loaded.

requireNamespace returns TRUE if it succeeds or FALSE.

loadedNamespaces returns a character vector.

unloadNamespace returns NULL, invisibly.

Tracing

As from R 4.1.0 the operation of loadNamespace can be traced, which can help track down the causes of unexpected messages (including which package(s) they come from since loadNamespace is called in many ways including from itself and by :: and can be called by load). Setting the environment variable _R_TRACE_LOADNAMESPACE_ to a numerical value will generate additional messages on progress. Non-zero values, e.g. 1, report which namespace is being loaded and when loading completes: values 2 to 4 report in increasing detail. Negative values are reserved for tracing specific features and their current meanings are documented in source-code comments.

Loading standard packages is never traced.

Author(s)

Luke Tierney and R-core

References

The ‘Writing R Extensions’ manual, section “Package namespaces”.

See Also

getNamespace, asNamespace, topenv, .onLoad (etc); further environment.

Examples

 (lns <- loadedNamespaces())
 statL <- isNamespaceLoaded("stats")
 stopifnot( identical(statL, "stats" %in% lns) )

 ## The string "foo" and the symbol 'foo' can be used interchangeably here:
 stopifnot( identical(isNamespaceLoaded(  "foo"   ), FALSE),
            identical(isNamespaceLoaded(quote(foo)), FALSE),
            identical(isNamespaceLoaded(quote(stats)), statL))

hasS <- isNamespaceLoaded("splines") # (to restore if needed)
Sns <- asNamespace("splines") # loads it if not already
stopifnot(   isNamespaceLoaded("splines"))
if (is.null(try(unloadNamespace(Sns)))) # try unloading the NS 'object'
  stopifnot( ! isNamespaceLoaded("splines"))
if (hasS) loadNamespace("splines") # (restoring previous state)

Top Level Environment

Description

Finding the top level environment from an environment envir and its enclosing environments.

Usage

topenv(envir = parent.frame(),
       matchThisEnv = getOption("topLevelEnvironment"))

Arguments

envir

environment.

matchThisEnv

return this environment, if it matches before any other criterion is satisfied. The default, the option ‘topLevelEnvironment’, is set by sys.source, which treats a specific environment as the top level environment. Supplying the argument as NULL or emptyenv() means it will never match.

Details

topenv returns the first top level environment found when searching envir and its enclosing environments. If no top level environment is found, .GlobalEnv is returned. An environment is considered top level if it is the internal environment of a namespace, a package environment in the search path, or .GlobalEnv.
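The search described above can be traced with a few hand-built environments. This is a sketch; the environment names are arbitrary.

```r
# A local environment enclosed by a namespace resolves to that namespace.
e <- new.env(parent = asNamespace("stats"))
identical(topenv(e), asNamespace("stats"))               # TRUE

# With no top level environment in the chain, .GlobalEnv is returned.
e0 <- new.env(parent = emptyenv())
identical(topenv(e0), globalenv())                       # TRUE

# matchThisEnv lets an arbitrary enclosing environment count as top level.
outer_env <- new.env(parent = globalenv())
inner_env <- new.env(parent = outer_env)
identical(topenv(inner_env), globalenv())                # TRUE
identical(topenv(inner_env, matchThisEnv = outer_env),
          outer_env)                                     # TRUE
```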

See Also

environment, notably parent.env() on “enclosing environments”; loadNamespace for more on namespaces.

Examples

topenv(.GlobalEnv)
topenv(new.env()) # also global env

topenv(environment(ls)) # namespace:base
topenv(environment(lm)) # namespace:stats

The Null Object

Description

NULL represents the null object in R: it is a reserved word. NULL is often returned by expressions and functions whose value is undefined.

Usage

NULL

as.null(x, ...)
is.null(x)

Arguments

x

an object to be tested or coerced.

...

ignored.

Details

NULL can be indexed (see Extract) in just about any syntactically legal way: apart from NULL[[]] which is an error, the result is always NULL. Objects with value NULL can be changed by replacement operators and will be coerced to the type of the right-hand side.

NULL is also used as the empty pairlist: see the examples. Because pairlists are often promoted to lists, you may encounter NULL being promoted to an empty list.

Objects with value NULL cannot have attributes as there is only one null object: attempts to assign them are either an error (attr) or promote the object to an empty list with attribute(s) (attributes and structure).
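The indexing, replacement and attribute rules above can be illustrated briefly. The variable names and the attribute name "note" are made up.

```r
# Indexing NULL gives NULL.
is.null(NULL[1])              # TRUE
is.null(NULL[["a"]])          # TRUE
is.null(NULL$a)               # TRUE

# Replacement coerces to the type of the right-hand side.
x <- NULL
x[3] <- 1
x                             # NA NA  1
typeof(x)                     # "double"

# NULL is the empty pairlist.
identical(pairlist(), NULL)   # TRUE

# Setting an attribute with attr<- on NULL is an error.
y <- NULL
res <- tryCatch({ attr(y, "note") <- "x"; "ok" },
                error = function(e) "error")
res                           # "error"
```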

Value

as.null ignores its argument and returnsNULL.

is.null returnsTRUE if its argument's valueisNULL andFALSE otherwise.

Note

is.null is a primitive function.

References

Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988) The New S Language. Wadsworth & Brooks/Cole.

See Also

%||%: L %||% R is equivalent to if (!is.null(L)) L else R

Examples

is.null(list())      # FALSE (on purpose!)
is.null(pairlist())  # TRUE
is.null(integer(0))  # FALSE
is.null(logical(0))  # FALSE
as.null(list(a = 1, b = "c"))

Numeric Vectors

Description

Creates or coerces objects of type "numeric". is.numeric is a more general test of an object being interpretable as numbers.

Usage

numeric(length = 0)
as.numeric(x, ...)
is.numeric(x)

Arguments

length

a non-negative integer specifying the desired length. Double values will be coerced to integer: supplying an argument of length other than one is an error.

x

object to be coerced or tested.

...

further arguments passed to or from other methods.

Details

numeric is identical to double. It creates a double-precision vector of the specified length with each element equal to 0.

as.numeric is a generic function, but S3 methods must be written for as.double. It is identical to as.double.

is.numeric is an internal generic primitive function: you can write methods to handle specific classes of objects, see InternalMethods. It is not the same as is.double. Factors are handled by the default method, and there are methods for classes "Date", "POSIXt" and "difftime" (all of which return false). Methods for is.numeric should only return true if the base type of the class is double or integer and values can reasonably be regarded as numeric (e.g., arithmetic on them makes sense, and comparison should be done via the base type).

Value

For numeric and as.numeric see double.

The default method for is.numeric returns TRUE if its argument is of mode "numeric" (type "double" or type "integer") and not a factor, and FALSE otherwise. That is, is.integer(x) || is.double(x), or (mode(x) == "numeric") && !is.factor(x).
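To make the rule above concrete, the following sketch applies is.numeric to a few sample objects. The list samples and its element names are made up for illustration.

```r
## Hypothetical sample objects covering the cases named above.
samples <- list(int = 1L, dbl = 1.5, chr = "1",
                fct = factor(c("10", "20")),
                date = as.Date("2024-01-01"))

## is.numeric() is TRUE only for the integer and double vectors,
## although the factor and the Date are stored as numbers internally.
num_flags <- vapply(samples, is.numeric, logical(1))
num_flags

## typeof() shows the storage type underneath each object.
vapply(samples, typeof, character(1))
```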

Warning

If x is a factor, as.numeric will return the underlying numeric (integer) representation, which is often meaningless as it may not correspond to the factor levels, see the ‘Warning’ section in factor (and the 2nd example below).

S4 methods

as.numeric and is.numeric are internally S4 generic and so methods can be set for them via setMethod.

To ensure that as.numeric and as.double remain identical, S4 methods can only be set for as.numeric.

Note on names

It is a historical anomaly that R has two names for its floating-point vectors, double and numeric (and formerly had real).

double is the name of the type. numeric is the name of the mode and also of the implicit class. As an S4 formal class, use "numeric".

The potential confusion is that R has used mode "numeric" to mean ‘double or integer’, which conflicts with the S4 usage. Thus is.numeric tests the mode, not the class, but as.numeric (which is identical to as.double) coerces to the class.

References

Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988) The New S Language. Wadsworth & Brooks/Cole.

See Also

double, integer, storage.mode.

Examples

## Conversion does trim whitespace; non-numeric strings give NA + warning
as.numeric(c("-.1", " 2.7 ", "B"))

## Numeric values are sometimes accidentally converted to factors.
## Converting them back to numeric is trickier than you'd expect.
f <- factor(5:10)
as.numeric(f)  # not what you might expect, probably not what you want
## what you typically meant and want:
as.numeric(as.character(f))
## the same, considerably more efficient (for long vectors):
as.numeric(levels(f))[f]

Numeric Versions

Description

A simple S3 class for representing numeric versions including package versions, and associated methods.

Usage

numeric_version(x, strict = TRUE)
package_version(x, strict = TRUE)
R_system_version(x, strict = TRUE)
getRversion()

as.numeric_version(x)
as.package_version(x)
is.numeric_version(x)
is.package_version(x)

Arguments

x

for the creators, a character vector with suitable numeric version strings (see ‘Details’); for package_version, alternatively an R version object as obtained by R.version. For as.numeric_version and as.package_version, suitable character vectors as above, or numeric version objects. For is.numeric_version and is.package_version, arbitrary R objects.

strict

a logical indicating whether invalid numeric versions should result in an error (default) or not.

Details

Numeric versions are sequences of one or more non-negative integers, usually (e.g., in package ‘DESCRIPTION’ files) represented as character strings with the elements of the sequence concatenated and separated by single ‘.’ or ‘-’ characters. R package versions consist of at least two such integers, an R system version of exactly three (major, minor and patch level).

Functions numeric_version, package_version and R_system_version create a representation from such strings (if suitable) which allows for coercion and testing, combination, comparison, summaries (min/max), inclusion in data frames, subscripting, and printing. The classes can hold a vector of such representations.

getRversion returns the version of the running R as an R system version object.

The [[ operator extracts or replaces a single version. To access the integers of a version use two indices: see the examples.
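As a brief sketch of why the class is useful, the snippet below compares versions component by component and extracts individual integers. The version strings are arbitrary examples.

```r
## Version comparison is done component by component, not as text.
newer <- numeric_version("1.10.0") > numeric_version("1.9.0")
newer          # TRUE

## Both '.' and '-' are accepted as separators.
v <- package_version("1.2-4")
v$major        # 1
v$minor        # 2
v[[1, 3]]      # third integer of the first version: 4

## With strict = FALSE an invalid string becomes NA instead of an error.
bad <- numeric_version("not-a-version", strict = FALSE)
is.na(bad)     # TRUE
```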

See Also

compareVersion; packageVersion for the version of a specific R package. R.version etc for the version of R (and the information underlying getRversion()).

Examples

x <- package_version(c("1.2-4", "1.2-3", "2.1"))
x < "1.4-2.3"
c(min(x), max(x))
x[2, 2]
x$major
x$minor

if(getRversion() <= "2.5.0") { ## work around missing feature
  cat("Your version of R, ", as.character(getRversion()),
      ", is outdated.\n",
      "Now trying to work around that ...\n", sep = "")
}

x[[1]]
x[[c(1, 3)]]       # '4' as a numeric version
x[1, 3]            # same
x[[1, 3]]          # 4 as an integer
x[[2, 3]] <- 0     # zero the patchlevel
x[[c(2, 3)]] <- 0  # same
x
x[[3]] <- "2.2.3"
x
x <- c(x, package_version("0.0"))
is.na(x)[4] <- TRUE
stopifnot(identical(is.na(x), c(rep(FALSE, 3), TRUE)),
          anyNA(x))

Numeric Constants

Description

How R parses numeric constants.

Details

R parses numeric constants in its input in a very similar way to C99 floating-point constants.

Inf and NaN are numeric constants (with typeof(.) "double"). In text input (e.g., in scan and as.double), these are recognized ignoring case as is infinity as an alternative to Inf. NA_real_ and NA_integer_ are constants of types "double" and "integer" representing missing values. All other numeric constants start with a digit or period and are either a decimal or hexadecimal constant optionally followed by L.

Hexadecimal constants start with 0x or 0X followed by a non-empty sequence from 0-9 a-f A-F . which is interpreted as a hexadecimal number, optionally followed by a binary exponent. A binary exponent consists of a P or p followed by an optional plus or minus sign followed by a non-empty sequence of (decimal) digits, and indicates multiplication by a power of two. Thus 0x123p456 is 291 × 2^456.

Decimal constants consist of a non-empty sequence of digits possibly containing a period (the decimal point), optionally followed by a decimal exponent. A decimal exponent consists of an E or e followed by an optional plus or minus sign followed by a non-empty sequence of digits, and indicates multiplication by a power of ten.

Values which are too large or too small to be representable will overflow to Inf or underflow to 0.0.

A numeric constant immediately followed by i is regarded as an imaginary complex number.

A numeric constant immediately followed by L is regarded as an integer number when possible (and with a warning if it contains a ".").

Only the ASCII digits 0–9 are recognized as digits, even in languages which have other representations of digits. The ‘decimal separator’ is always a period and never a comma.

Note that a leading plus or minus is not regarded by the parser as part of a numeric constant but as a unary operator applied to the constant.
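The parsing rules above can be checked directly at the prompt. This is a small sketch; the particular constants are arbitrary choices.

```r
## Hexadecimal with a binary exponent: 0x10 is 16, and p2 multiplies by 2^2.
0x10p2 == 64

## The example from the text: 0x123 is 291.
0x123p456 == 291 * 2^456

## Decimal exponent, and the L suffix giving an integer.
typeof(1e3)    # "double"
typeof(1e3L)   # "integer"

## Out-of-range constants overflow to Inf or underflow to 0.
1e400 == Inf
1e-400 == 0

## A leading minus is a unary operator, so quote(-1) is a call,
## whereas quote(1) is the numeric constant itself.
is.call(quote(-1))
is.numeric(quote(1))
```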

Note

When a string is parsed to input a numeric constant, the number may or may not be representable exactly in the C double type used. If not, one of the nearest representable numbers will be returned.

R's own C code is used to convert constants to binary numbers, so the effect can be expected to be the same on all platforms implementing full IEC 60559 arithmetic (the most likely area of difference being the handling of numbers less than .Machine$double.xmin). The same code is used by scan.

See Also

Syntax. For complex numbers, see complex. Quotes for the parsing of character constants, Reserved for the “reserved words” in R.

Examples

## You can create numbers using fixed or scientific formatting.
2.1
2.1e10
-2.1E-10

## The resulting objects have class numeric and type double.
class(2.1)
typeof(2.1)

## This holds even if what you typed looked like an integer.
class(2)
typeof(2)

## If you actually wanted integers, use an "L" suffix.
class(2L)
typeof(2L)

## These are equal but not identical
2 == 2L
identical(2, 2L)

## You can write numbers between 0 and 1 without a leading "0"
## (but typically this makes code harder to read)
.1234

sqrt(1i)  # remember elementary math?
utils::str(0xA0)
identical(1L, as.integer(1))

## You can combine the "0x" prefix with the "L" suffix :
identical(0xFL, as.integer(15))

Integer Numbers Displayed in Octal

Description

Integers which are displayed in octal (base-8 number system) format, with as many digits as are needed to display the largest, using leading zeroes as necessary.

Arithmetic works as for integers, and non-integer valued mathematical functions typically work by truncating the result to integer.

Usage

as.octmode(x)

## S3 method for class 'octmode'
as.character(x, keepStr = FALSE, ...)

## S3 method for class 'octmode'
format(x, width = NULL, ...)

## S3 method for class 'octmode'
print(x, ...)

Arguments

x

an object, for the methods inheriting from class "octmode".

keepStr

a logical indicating that names and dimensions should be kept; set TRUE for back compatibility, if needed.

width

NULL or a positive integer specifying the minimum field width to be used, with padding by leading zeroes.

...

further arguments passed to or from other methods.

Details

"octmode" objects are integer vectors with that class attribute, used primarily to ensure that they are printed in octal notation, specifically for Unix-like file permissions such as 755. Subsetting ([) works too, as do arithmetic or other mathematical operations, albeit truncated to integer.

as.character(x) drops all attributes (unless keepStr = TRUE, in which case it keeps dim, dimnames and names for back compatibility) and converts each entry individually, hence with no leading zeroes, whereas in format(), when width = NULL (the default), the output is padded with leading zeroes to the smallest width needed for all the non-missing elements.

as.octmode can convert integers (of type "integer" or "double") and character vectors whose elements contain only digits 0-7 (or are NA) to class "octmode".

There is a ! method and methods for | and &: these recycle their arguments to the length of the longer and then apply the operators bitwise to each element.
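The points above can be sketched as follows. The mode 666 and the umask 022 are hypothetical values chosen to resemble typical file permissions.

```r
## format() pads all entries to a common width.
m <- as.octmode(c(8, 64))
padded <- format(m)
padded            # "010" "100"

## Character input is read as octal digits: 755 (octal) is 493 (decimal).
as.integer(as.octmode("755"))

## Bitwise operators: clear the bits of a hypothetical umask 022 from 666.
masked <- format(as.octmode("666") & !as.octmode("022"))
masked            # "644"
```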

See Also

These are auxiliary functions for file.info.

hexmode, sprintf for other options in converting integers to octal, strtoi to convert octal strings to integers.

Examples

(on <- as.octmode(c(16, 32, 127:129)))  # "020" "040" "177" "200" "201"
unclass(on[3:4])  # subsetting

## manipulate file modes
fmode <- as.octmode("170")
(fmode | "644") & "755"

(umask <- Sys.umask())  # depends on platform
c(fmode, "666", "755") & !umask

om <- as.octmode(1:12)
om  # print()s via format()
stopifnot(nchar(format(om)) == 2)
om[1:7]  # *no* leading zeroes!
stopifnot(format(om[1:7]) == as.character(1:7))
om2 <- as.octmode(c(1:10, 60:70))
om2  # prints via format() -> with 3 octals
stopifnot(nchar(format(om2)) == 3)
as.character(om2)  # strings of length 1, 2, 3

## Integer arithmetic (remaining "octmode"):
om^2
om * 64
-om
(fac <- factorial(om))  # !1, !2, !3, !4 .. in octal
as.integer(fac)  # indeed the same as  factorial(1:12)

Function Exit Code

Description

on.exit records the expression given as its argument as needing to be executed when the current function exits (either naturally or as the result of an error). This is useful for resetting graphical parameters or performing other cleanup actions.

If no expression is provided, i.e., the call is on.exit(), then the current on.exit code is removed.

Usage

on.exit(expr = NULL, add = FALSE, after = TRUE)

Arguments

expr

an expression to be executed.

add

if TRUE, add expr to be executed after any previously set expressions (or before if after is FALSE); otherwise (the default) expr will overwrite any previously set expressions.

after

if add is TRUE and after is FALSE, then expr will be added on top of the expressions that were already registered. The resulting last in first out order is useful for freeing or closing resources in reverse order.

Details

The expr argument passed to on.exit is recorded without evaluation. If it is not subsequently removed/replaced by another on.exit call in the same function, it is evaluated in the evaluation frame of the function when it exits (including during standard error handling). Thus any functions or variables in the expression will be looked for in the function and its environment at the time of exit: to capture the current value in expr use substitute or similar.

If multiple on.exit expressions are set using add = TRUE then all expressions will be run even if one signals an error.

This is a ‘special’ primitive function: it only evaluates the arguments add and after.
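The ordering controlled by add and after can be illustrated with a small sketch. The names exit_log and cleanup_demo are hypothetical, and the error is raised deliberately to show that exit code still runs.

```r
## 'exit_log' and 'cleanup_demo' are illustrative names.
exit_log <- character()

cleanup_demo <- function() {
  ## Registered first; without add = TRUE a later call would replace it.
  on.exit(exit_log <<- c(exit_log, "registered first"))
  ## add = TRUE keeps the earlier expression; after = FALSE puts this one
  ## in front of it, so it runs first (last in, first out).
  on.exit(exit_log <<- c(exit_log, "registered second"),
          add = TRUE, after = FALSE)
  stop("simulated failure")   # exit expressions still run on error
}

try(cleanup_demo(), silent = TRUE)
exit_log   # "registered second" "registered first"
```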

Value

Invisible NULL.

References

Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988) The New S Language. Wadsworth & Brooks/Cole.

See Also

sys.on.exit which returns the expression stored for use by on.exit() in the function in which sys.on.exit() is evaluated.

Examples

require(graphics)

opar <- par(mai = c(1, 1, 1, 1))
on.exit(par(opar))

Operators on the Date Class

Description

Operators for the "Date" class.

There is an Ops method and specific methods for + and - for the Date class.

Usage

date + x
x + date
date - x
date1 lop date2

Arguments

date

an object of class "Date".

date1, date2

date objects or character vectors. (Character vectors are converted by as.Date.)

x

a numeric vector (in days) or an object of class "difftime", rounded to the nearest whole day.

lop

one of ==, !=, <, <=, > or >=.

Details

x does not need to be integer if specified as a numeric vector, but see the comments about fractional days in the help for Dates.
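A minimal sketch of these operators with a fixed date follows, so that the results do not depend on the day the code is run. The date itself is an arbitrary choice.

```r
d <- as.Date("2024-01-31")

## Adding a number of days, or a difftime, gives another Date.
d + 1                                   # "2024-02-01"
d + as.difftime(2, units = "days")      # "2024-02-02"

## Subtracting a number of days also gives a Date.
d - 31                                  # "2023-12-31"

## Comparison operators convert character vectors via as.Date().
d < "2024-06-01"                        # TRUE
d == "2024-01-31"                       # TRUE
```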

Examples

(z <- Sys.Date())
z + 10
z < c("2009-06-01", "2010-01-01", "2015-01-01")

Options Settings

Description

Allow the user to set and examine a variety of global options which affect the way in which R computes and displays its results.

Usage

options(...)

getOption(x, default = NULL)

.Options

Arguments

...

any options can be defined, using name = value. However, only the ones below are used in base R.

Options can also be passed by giving a single unnamed argument whichis a named list.

x

a character string holding an option name.

default

if the specified option is not set in the options list, this value is returned. This facilitates retrieving an option and checking whether it is set and setting it separately if not.

Details

Invoking options() with no arguments returns a list with the current values of the options. Note that not all options listed below are set initially. To access the value of a single option, one should use, e.g., getOption("width") rather than options("width") which is a list of length one.

Value

For getOption, the current value set for option x, or default (which defaults to NULL) if the option is unset.

For options(), a list of all set options sorted by name. For options(name), a list of length one containing the set value, or NULL if it is unset. For uses setting one or more options, a list with the previous values of the options changed (returned invisibly).
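The return values described above lend themselves to a save-and-restore idiom. The sketch below uses a made-up option name, my.hypothetical.option, to show the default argument.

```r
## options("width") is a list of length one; getOption() returns the value.
is.list(options("width"))
getOption("width")

## An unset option yields 'default'. The option name is made up.
getOption("my.hypothetical.option", default = "fallback")

## Setting options returns the previous values invisibly, which makes
## a temporary change easy to undo.
old <- options(digits = 4)
short_pi <- format(pi)
short_pi          # "3.142"
options(old)      # restore the earlier setting
```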

Options used in base R

add.smooth:

typically logical, defaulting to TRUE. Could also be set to an integer for specifying how many (simulated) smooths should be added. This is currently only used by plot.lm.

askYesNo:

a function (typically set by a front-end) to ask the user binary response questions in a consistent way, or a vector of strings used by askYesNo to use as default responses for such questions.

browserNLdisabled:

logical: whether newline is disabled as a synonym for "n" in the browser.

catch.script.errors:

logical, false by default. If true and interactive() is false, e.g., when an R script is run by R CMD BATCH <script>.R, then errors do not stop execution of the script. Rather evaluation continues after printing the error (and jumping to top level). Also, traceback() would provide info about the error. Do use with care!

checkPackageLicense:

logical, not set by default. If true, loadNamespace asks a user to accept any non-standard license at first load of the package.

check.bounds:

logical, defaulting to FALSE. If true, a warning is produced whenever a vector (atomic or list) is extended, by something like x <- 1:3; x[5] <- 6.

CBoundsCheck:

logical, controlling whether .C and .Fortran make copies to check for array over-runs on the atomic vector arguments.

Initially set from value of the environment variable R_C_BOUNDS_CHECK (set to yes to enable).

conflicts.policy:

character string or list controlling handling of conflicts found in calls to library or require. See library for details.

continue:

a non-empty string setting the prompt used for lines which continue over one line.

defaultPackages:

the packages that are attached by default when R starts up. Initially set from the value of the environment variable R_DEFAULT_PACKAGES, or if that is unset to c("datasets", "utils", "grDevices", "graphics", "stats", "methods"). (Set R_DEFAULT_PACKAGES to NULL or a comma-separated list of package names.) This option can be changed in a ‘.Rprofile’ file, but it will not work to exclude the methods package at this stage, as the value is screened for methods before that file is read.

deparse.cutoff:

integer value controlling the printing of language constructs which are deparsed. Default 60.

deparse.max.lines:

controls the number of lines used when deparsing in browser, upon entry to a function whose debugging flag is set, and if option traceback.max.lines is unset, of traceback(). Initially unset, and only used if set to a positive integer.

traceback.max.lines:

controls the number of lines used when deparsing in traceback, if set. Initially unset, and only used if set to a positive integer.

digits:

controls the number of significant (see signif) digits to print when printing numeric values. It is a suggestion only. Valid values are 1...22 with default 7. See the note in print.default about values greater than 15.

digits.secs:

controls the maximum number of digits to print when formatting time values in seconds. Valid values are 0...6 with default 0 (equivalent to NULL which is used when it is undefined as on vanilla startup). See strftime.

download.file.extra:

Extra command-line argument(s) for non-default methods: see download.file.

download.file.method:

Method to be used for download.file. Currently download methods "internal", "wininet" (Windows only), "libcurl", "wget" and "curl" are available. If not set, method = "auto" is chosen: see download.file.

echo:

logical. Only used in non-interactive mode, when it controls whether input is echoed. Command-line option --no-echo sets this to FALSE, but otherwise it starts the session as TRUE.

encoding:

The name of an encoding, default "native.enc". See connections.

error:

either a function or an expression governing the handling of non-catastrophic errors such as those generated by stop as well as by signals and internally detected errors. If the option is a function, a call to that function, with no arguments, is generated as the expression. By default the option is not set: see stop for the behaviour in that case. The functions dump.frames and recover provide alternatives that allow post-mortem debugging. Note that these need to be specified as e.g. options(error = utils::recover) in startup files such as ‘.Rprofile’.

expressions:

sets a limit on the number of nested expressions that will be evaluated. Valid values are 25...500000 with default 5000. If you increase it, you may also want to start R with a larger protection stack; see --max-ppsize in Memory. Note too that you may cause a segfault from overflow of the C stack, and on OSes where it is possible you may want to increase that. Once the limit is reached an error is thrown. The current number under evaluation can be found by calling Cstack_info.

interrupt:

a function taking no arguments to be called on a user interrupt if the interrupt condition is not otherwise handled.

keep.parse.data:

When internally storing source code (keep.source is TRUE), also store parse data. Parse data can then be retrieved with getParseData() and used e.g. for spell checking of string constants or syntax highlighting. The value has effect only when internally storing source code (see keep.source). The default is TRUE.

keep.parse.data.pkgs:

As for keep.parse.data, used only when packages are installed. Defaults to FALSE unless the environment variable R_KEEP_PKG_PARSE_DATA is set to yes. The space overhead of parse data can be substantial even after compression and it causes performance overhead when loading packages.

keep.source:

When TRUE, the source code for functions (newly defined or loaded) is stored internally allowing comments to be kept in the right places. Retrieve the source by printing or using deparse(fn, control = "useSource").

The default is interactive(), i.e., TRUE for interactive use.

keep.source.pkgs:

As for keep.source, used only when packages are installed. Defaults to FALSE unless the environment variable R_KEEP_PKG_SOURCE is set to yes.

matprod:

a string selecting the implementation of the matrix products %*%, crossprod, and tcrossprod for double and complex vectors:

"internal"

uses an unoptimized 3-loop algorithm which correctly propagates NaN and Inf values and is consistent in precision with other summation algorithms inside R like sum or colSums (which now means that it uses a long double accumulator for summation if available and enabled, see capabilities).

"default"

uses BLAS to speed up computation, but to ensure correct propagation of NaN and Inf values it uses an unoptimized 3-loop algorithm for inputs that may contain NaN or Inf values. When deemed beneficial for performance, "default" may call the 3-loop algorithm unconditionally, i.e., without checking the input for NaN/Inf values. The 3-loop algorithm uses (only) a double accumulator for summation, which is consistent with the reference BLAS implementation.

"blas"

uses BLAS unconditionally without any checks and should be used with extreme caution. BLAS libraries do not propagate NaN or Inf values correctly and for inputs with NaN/Inf values the results may be undefined.

"default.simd"

is experimental and will likely be removed in future versions of R. It provides the same behavior as "default", but the check whether the input contains NaN/Inf values is faster on some SIMD hardware. On older systems it will run correctly, but may be much slower than "default".

max.print:

integer, defaulting to 99999. print or show methods can make use of this option, to limit the amount of information that is printed, to something in the order of (and typically slightly less than) max.print entries.

OutDec:

character string containing a single character. The preferred character to be used as the decimal point in output conversions, that is in printing, plotting, format, formatC and as.character but not when deparsing nor by sprintf (which is sometimes used prior to printing).

pager:

the command used for displaying text files by file.show, details depending on the platform:

On a unix-alike

defaults to ‘R_HOME/bin/pager’, which is a shell script running the command-line specified by the environment variable PAGER whose default is set at configuration, usually to less.

On Windows

defaults to "internal", which uses a pager similar to the GUI console. Another possibility is "console" to use the console itself.

Can be a character string or an R function, in which case it needs to accept the arguments (files, header, title, delete.file) corresponding to the first four arguments of file.show.

papersize:

the default paper format used by postscript; set by environment variable R_PAPERSIZE when R is started: if that is unset or invalid it defaults platform dependently

on a unix-alike

to a value derived from the locale category LC_PAPER, or if that is unavailable to a default set when R was built.

on Windows

to "a4", or "letter" in US and Canadian locales.

PCRE_limit_recursion:

Logical: should grep(perl = TRUE) and similar limit the maximal recursion allowed when matching? Only relevant for PCRE1 and PCRE2 <= 10.23.

PCRE can be built not to use a recursion stack (see pcre_config), but it uses recursion by default with a recursion limit of 10000000 which potentially needs a very large C stack: see the discussion at https://www.pcre.org/original/doc/html/pcrestack.html. If true, the limit is reduced using R's estimate of the C stack size available (if known), otherwise 10000. If NA, the limit is imposed only if any input string has 1000 or more bytes. The limit has no effect when PCRE's Just-in-Time compiler is used.

PCRE_study:

Logical or integer: should grep(perl = TRUE) and similar ‘study’ the patterns? Either logical or a numerical threshold for the minimum number of strings to be matched for the pattern to be studied (the default is 10). Missing values and negative numbers are treated as false. This option is ignored with PCRE2 (PCRE version >= 10.00) which does not have a separate study phase and patterns are automatically optimized when possible.

PCRE_use_JIT:

Logical: should grep(perl = TRUE), strsplit(perl = TRUE) and similar make use of PCRE's Just-In-Time compiler if available? (This applies only to studied patterns with PCRE1.) Default: true. Missing values are treated as false.

pdfviewer:

default PDF viewer. The default is set from the environment variable R_PDFVIEWER, the default value of which

on a unix-alike

is set when R is configured, and

on Windows

is the full path to open.exe, a utility supplied with R.

printcmd:

the command used by postscript for printing; set by environment variable R_PRINTCMD when R is started. This should be a command that expects either input to be piped to ‘stdin’ or to be given a single filename argument. Usually set to "lpr" on a Unix-alike.

prompt:

a non-empty string to be used for R's prompt; should usually end in a blank (" ").

rl_word_breaks:

(Unix only:) Used for the readline-based terminal interface. Default value " \t\n\"\\'`><=%;,|&{()}".

This is the set of characters used to break the input line into tokens for object- and file-name completion. Those who do not use spaces around operators may prefer
" \t\n\"\\'`><=+-*%;,|&{()}"

save.defaults, save.image.defaults:

see save.

scipen:

integer. A penalty to be applied when deciding to print numeric values in fixed or exponential notation. Positive values bias towards fixed and negative towards scientific notation: fixed notation will be preferred unless it is more than scipen digits wider.

setWidthOnResize:

a logical. If set and TRUE, R run in a terminal using a recent readline library will set the width option when the terminal is resized.

showWarnCalls, showErrorCalls:

a logical. Should warning and error messages produced by the default handlers show a summary of the call stack? By default error call stacks are shown in non-interactive sessions. When warning or stop are called on a condition object the call stacks are only shown if the value returned by conditionCall for the condition object is not NULL.

showNCalls:

integer. Controls how long the sequence of calls must be (in bytes) before ellipses are used. Defaults to 50 and should be at least 30 and no more than 500.

show.error.locations:

Should source locations of errors be printed? If set to TRUE or "top", the source location that is highest on the stack (the most recent call) will be printed. "bottom" will print the location of the earliest call found on the stack.

Integer values can select other entries. The value 0 corresponds to "top" and positive values count down the stack from there. The value -1 corresponds to "bottom" and negative values count up from there.

show.error.messages:

a logical. Should error messages be printed? Intended for use with try or a user-installed error handler.

texi2dvi:

used by functions texi2dvi and texi2pdf in package tools.

unix-alike only:

Set at startup from the environment variable R_TEXI2DVICMD, which defaults first to the value of environment variable TEXI2DVI, and then to a value set when R was installed (the full path to a texi2dvi script if one was found). If necessary, that environment variable can be set to "emulation".

timeout:

positive integer. The timeout for some Internet operations, in seconds. Default 60 (seconds) but can be set from environment variable R_DEFAULT_INTERNET_TIMEOUT. (Invalid values of the option or the variable are silently ignored: non-integer numeric values will be truncated.) See download.file and connections.

topLevelEnvironment:

see topenv and sys.source.

url.method:

character string: the default method for url. Normally unset, which is equivalent to "default", which is "internal" except on Windows.

useFancyQuotes:

controls the use of directional quotes in sQuote, dQuote and in rendering text help (see Rd2txt in package tools). Can be TRUE, FALSE, "TeX" or "UTF-8".

verbose:

logical. Should R report extra information on progress? Set to TRUE by the command-line option --verbose.

warn:

integer value to set the handling of warning messages by the default warning handler. If warn is negative all warnings are ignored. If warn is zero (the default) warnings are stored until the top-level function returns. If 10 or fewer warnings were signalled they will be printed, otherwise a message saying how many were signalled. An object called last.warning is created and can be printed through the function warnings. If warn is one, warnings are printed as they occur. If warn is two (or larger, coercible to integer), all warnings are turned into errors. While sometimes useful for debugging, turning warnings into errors may trigger bugs and resource leaks that would not have been triggered otherwise.

warnPartialMatchArgs:

logical. If true, warns if partial matching is used in argument matching.

warnPartialMatchAttr:

logical. If true, warns if partial matching is used in extracting attributes via attr.

warnPartialMatchDollar:

logical. If true, warns if partial matching is used for extraction by $.

warning.expression:

an R code expression to be called if a warning is generated, replacing the standard message. If non-null it is called irrespective of the value of option warn.

warning.length:

sets the truncation limit in bytes for error and warning messages. A non-negative integer, with allowed values 100...8170, default 1000.

nwarnings:

the limit for the number of warnings kept when warn = 0, default 50. This will discard messages if called whilst they are being collected. If you increase this limit, be aware that the current implementation pre-allocates the equivalent of a named list for them, i.e., do not increase it to more than say a million.

width:

controls the maximum number of columns on a line used in printing vectors, matrices and arrays, and when filling by cat.

Columns are normally the same as characters except in East Asian languages.

You may want to change this if you re-size the window that R is running in. Valid values are 10...10000 with default normally 80. (The limits on valid values are in file ‘Print.h’ and can be changed by re-compiling R.) Some R consoles automatically change the value when they are resized.

See the examples on Startup for one way to set this automatically from the terminal width when R is started.

The ‘factory-fresh’ default settings of some of these options are

add.smooth            TRUE
check.bounds          FALSE
continue              "+ "
digits                7
echo                  TRUE
encoding              "native.enc"
error                 NULL
expressions           5000
keep.source           interactive()
keep.source.pkgs      FALSE
max.print             99999
OutDec                "."
prompt                "> "
scipen                0
show.error.messages   TRUE
timeout               60
verbose               FALSE
warn                  0
warning.length        1000
width                 80

Others are set from environment variables or are platform-dependent.
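The effect of two of the options above, scipen and warn, can be seen in a short sketch. The variable names are illustrative, and the previous settings are restored at the end.

```r
old <- options(scipen = 0, warn = 0)

## scipen: with the default 0, the narrower scientific form is chosen;
## a positive penalty tips the choice towards fixed notation.
sci_form <- format(1e5)
sci_form                 # "1e+05"
options(scipen = 10)
fixed_form <- format(1e5)
fixed_form               # "100000"

## warn = 2: the default handler turns warnings into errors,
## caught here with tryCatch() for illustration.
options(warn = 2)
outcome <- tryCatch({ as.integer("x"); "completed" },
                    error = function(e) "error")
outcome                  # "error"

options(old)             # restore the previous settings
```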

Options set in package grDevices

These will be set when package grDevices (or its namespace) is loaded if not already set.

bitmapType:

(Unix only, incl. macOS) character. The default type for the bitmap devices such as png. Defaults to "cairo" on systems where that is available, or to "quartz" on macOS where that is available.

device:

a character string giving the name of a function, or the function object itself, which when called creates a new graphics device of the default type for that session. The value of this option defaults to the normal screen device (e.g., X11, windows or quartz) for an interactive session, and pdf in batch use or if a screen is not available. If set to the name of a device, the device is looked for first from the global environment (that is down the usual search path) and then in the grDevices namespace.

The default values in interactive and non-interactive sessions are configurable via environment variables R_INTERACTIVE_DEVICE and R_DEFAULT_DEVICE respectively.

The search logic for ‘the normal screen device’ is that this is windows on Windows, and quartz if available on macOS (running at the console, and compiled into the build). Otherwise X11 is used if environment variable DISPLAY is set.

device.ask.default:

logical. The default for devAskNewPage("ask") when a device is opened.

locatorBell:

logical. Should selection in locator and identify be confirmed by a bell? Default TRUE. Honoured at least on X11 and windows devices.

windowsTimeout:

(Windows-only) integer vector of length 2 representing two times in milliseconds. These control the double-buffering of windows devices when that is enabled: the first is the delay after plotting finishes (default 100) and the second is the update interval during continuous plotting (default 500). The values at the time the device is opened are used.

Other options used by package graphics

max.contour.segments:

positive integer, defaulting to 25000 if not set. A limit on the number of segments in a single contour line in contour or contourLines.

Options set in package stats

These will be set when package stats (or its namespace) is loaded if not already set.

contrasts:

the default contrasts used in model fitting such as with aov or lm. A character vector of length two, the first giving the function to be used with unordered factors and the second the function to be used with ordered factors. By default the elements are named c("unordered", "ordered"), but the names are unused.

na.action:

the name of a function for treating missing values (NA's) for certain situations, see na.action and na.pass.

show.coef.Pvalues:

logical, affecting whether P values are printed in summary tables of coefficients. See printCoefmat.

show.nls.convergence:

logical, should nls convergence messages be printed for successful fits?

show.signif.stars:

logical, should stars be printed on summary tables of coefficients? See printCoefmat.

ts.eps:

the relative tolerance for certain time series (ts) computations. Default 1e-05.

ts.S.compat:

logical. Used to select S compatibility for plotting time-series spectra. See the description of argument log in plot.spec.

Options set (or used) in package utils

These will be set (apart from Ncpus) when package utils (or its namespace) is loaded if not already set.

BioC_mirror:

The URL of a Bioconductor mirror for use by setRepositories, e.g. the default ‘"https://bioconductor.org"’ or the European mirror ‘"https://bioconductor.statistik.tu-dortmund.de"’. Can be set by chooseBioCmirror.

browser:

The HTML browser to be used by browseURL. This sets the default browser on UNIX or a non-default browser on Windows. Alternatively, an R function that is called with a URL as its argument. See browseURL for further details.

ccaddress:

default Cc: address used by create.post (and hence bug.report and help.request). Can be FALSE or "".

citation.bibtex.max:

default 1; the maximal number of bibentries (bibentry) in a citation for which the BibTeX version is printed in addition to the text one.

de.cellwidth:

integer: the cell widths (number of characters) to be used in the data editor dataentry. If this is unset (the default), 0, negative or NA, variable cell widths are used.

demo.ask:

default for the ask argument of demo.

editor:

a non-empty character string or an R function that sets the default text editor, e.g., for edit and file.edit. Set from the environment variable EDITOR on UNIX, or if unset VISUAL or vi. As a string it should specify the name of or path to an external command.

example.ask:

default for the ask argument of example.

help.ports:

optional integer vector for setting ports of the internal HTTP server, see startDynamicHelp.

help.search.types:

default types of documentation to be searched by help.search and ??.

help.try.all.packages:

default for an argument of help.

help_type:

default for an argument of help, used also as the help type by ?.

help.htmlmath:

default for the texmath argument of Rd2HTML, controlling how LaTeX-like mathematical equations are displayed in R help pages (if enabled). Useful values are "katex" (equivalent to NULL, the default) and "mathjax"; for all other values basic substitutions are used.

help.htmltoc:

default for the toc argument of Rd2HTML, controlling whether a table of contents should be included.

HTTPUserAgent:

string used as the ‘user agent’ in HTTP(S) requests by download.file, url and curlGetHeaders, or NULL when requests will be made without a user agent header. The default is "R (version platform arch os)" except when ‘libcurl’ is used when it is "libcurl/version" for the ‘libcurl’ version in use.

install.lock:

logical: should per-directory package locking be used by install.packages? Most useful for binary installs on macOS and Windows, but can be used in a startup file for source installs via R CMD INSTALL. For binary installs, can also be the character string "pkglock".

internet.info:

The minimum level of information to be printed on URL downloads etc, using the "internal" and "libcurl" methods. Default is 2, for failure causes. Set to 1 or 0 to get more detailed information (for the "internal" method 0 provides more information than 1).

install.packages.check.source:

Used by install.packages (and indirectly update.packages) on platforms which support binary packages. Possible values "yes" and "no", with unset being equivalent to "yes".

install.packages.compile.from.source:

Used by install.packages(type = "both") (and indirectly update.packages) on platforms which support binary packages. Possible values are "never", "interactive" (which means ask in interactive use and "never" in batch use) and "always". The default is taken from environment variable R_COMPILE_AND_INSTALL_PACKAGES, with default "interactive" if unset. However, install.packages uses "never" unless a make program is found, consulting the environment variable MAKE.

mailer:

default emailing method used by create.post and hence bug.report and help.request.

menu.graphics:

Logical: should graphical menus be used if available? Defaults to TRUE. Currently applies to select.list, chooseCRANmirror, setRepositories and to select from multiple (text) help files in help.

Ncpus:

an integer n ≥ 1, used in install.packages as default for the number of CPUs to use in a potentially parallel installation, as Ncpus = getOption("Ncpus", 1L), i.e., when unset is equivalent to a setting of 1.
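The lookup pattern quoted above, getOption() with a fallback, can be sketched as follows. The variable name n_cpus is hypothetical; only getOption("Ncpus", 1L) comes from the text.

```r
# Read the option, falling back to 1L when it has not been set
n_cpus <- getOption("Ncpus", 1L)
n_cpus

# A session that wants parallel installs could set it beforehand, e.g.
#   options(Ncpus = 2L)
# (left commented out so that this sketch does not alter the session)
```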

pkgType:

The default type of packages to be downloaded and installed – see install.packages. Possible values are platform dependently

on Windows

"win.binary", "source" and "both" (the default).

on Unix-alikes

"source" (the default except under a CRAN macOS build), "mac.binary" and "both" (the default for CRAN macOS builds). ("mac.binary.el-capitan", "mac.binary.mavericks", "mac.binary.leopard" and "mac.binary.universal" are no longer in use.)

Value "binary" is a synonym for the native binary type (if there is one); "both" is used by install.packages to choose between source and binary installs.

repos:

character vector of repository URLs for use by available.packages and related functions. Initially set from entries marked as default in the ‘repositories’ file, whose path is configurable via environment variable R_REPOSITORIES (set this to NULL to skip initialization at startup). The ‘factory-fresh’ setting from the file in R.home("etc") is c(CRAN="@CRAN@"), a value that causes some utilities to prompt for a CRAN mirror. To avoid this do set the CRAN mirror, by something like

local({
    r <- getOption("repos")
    r["CRAN"] <- "https://my.local.cran"
    options(repos = r)
})

in your ‘.Rprofile’, or use a personal ‘repositories’ file.

Note that you can add more repositories (Bioconductor, R-Forge, RForge.net, ...) for the current session using setRepositories.

str:

a list of options controlling the default str display. Defaults to strOptions().

str.dendrogram.last:

see str.dendrogram.

SweaveHooks, SweaveSyntax:

see Sweave.

unzip:

a character string used by unzip: the path of the external program unzip or "internal". Defaults (platform dependently)

on unix-alikes

to the value of R_UNZIPCMD, which is set in ‘etc/Renviron’ to the path of the unzip command found during configuration and otherwise to "".

on Windows

to "internal" when the internal unzip code is used.

Options set in package parallel

These will be set when package parallel (or its namespace) is loaded if not already set.

mc.cores:

an integer giving the maximum allowed number of additional R processes allowed to be run in parallel to the current R process. Defaults to the setting of the environment variable MC_CORES if set. Most applications which use this assume a limit of 2 if it is unset.

Options used on Unix only

dvipscmd:

character string giving a command to be used in the (deprecated) off-line printing of help pages via PostScript. Defaults to "dvips".

Options used on Windows only

warn.FPU:

logical, by default undefined. If true, a warning is produced whenever dyn.load repairs the control word damaged by a buggy DLL.

Note

For compatibility with S there is a visible object .Options whose value is a pairlist containing the current options() (in no particular order). Assigning to it will make a local copy and not change the original. (Using it however is faster than calling options()).

An option set to NULL is indistinguishable from a non existing option.
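The last point can be checked with a small sketch like the one below. The option name my.demo.option is hypothetical and used only for illustration.

```r
options(my.demo.option = 42)        # hypothetical option name
getOption("my.demo.option")         # 42

options(my.demo.option = NULL)      # setting to NULL removes the option
is.null(getOption("my.demo.option"))       # TRUE
"my.demo.option" %in% names(options())     # FALSE: it is no longer listed
```

This is why getOption() takes a default argument: a caller cannot otherwise tell an unset option from one that was set to NULL.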

References

Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988) The New S Language. Wadsworth & Brooks/Cole.

Examples

op <- options(); utils::str(op) # op is a named list

getOption("width") == options()$width # the latter needs more memory
options(digits = 15)
pi

# set the editor, and save previous value
old.o <- options(editor = "nedit")
old.o

options(check.bounds = TRUE, warn = 1)
x <- NULL; x[4] <- "yes" # gives a warning

options(digits = 5)
print(1e5)
options(scipen = 3); print(1e5)

options(op)     # reset (all) initial options
options("digits")

## Not run: 
## set contrast handling to be like S
options(contrasts = c("contr.helmert", "contr.poly"))
## End(Not run)

## Not run: 
## on error, terminate the R session with error status 66
options(error = quote(q("no", status = 66, runLast = FALSE)))
stop("test it")
## End(Not run)

## Not run: 
## Set error actions for debugging:
## enter browser on error, see ?recover:
options(error = recover)
## allows to call debugger() afterwards, see ?debugger:
options(error = dump.frames)
## A possible setting for non-interactive sessions
options(error = quote({dump.frames(to.file = TRUE); q()}))
## End(Not run)

# Compare the two ways to get an option and use it
# accounting for the possibility it might not be set.
if(as.logical(getOption("performCleanup", TRUE)))
   cat("do cleanup\n")

## Not run: 
# a clumsier way of expressing the above w/o the default.
tmp <- getOption("performCleanup")
if(is.null(tmp))
  tmp <- TRUE
if(tmp)
   cat("do cleanup\n")
## End(Not run)

Ordering Permutation

Description

order returns a permutation which rearranges its first argument into ascending or descending order, breaking ties by further arguments. sort.list does the same, using only one argument. See the examples for how to use these functions to sort data frames, etc.

Usage

order(..., na.last = TRUE, decreasing = FALSE,
      method = c("auto", "shell", "radix"))

sort.list(x, partial = NULL, na.last = TRUE, decreasing = FALSE,
          method = c("auto", "shell", "quick", "radix"))

Arguments

...

a sequence of numeric, complex, character or logical vectors, all of the same length, or a classed R object.

x

an atomic vector for methods "shell" and "quick". When x is a non-atomic R object, the default "auto" and "radix" methods may work if order(x, ..) does.

partial

vector of indices for partial sorting. (Non-NULL values are not implemented.)

decreasing

logical. Should the sort order be increasing or decreasing? For the "radix" method, this can be a vector of length equal to the number of arguments in ... and the elements are recycled as necessary. For the other methods, it must be length one.

na.last

for controlling the treatment of NAs. If TRUE, missing values in the data are put last; if FALSE, they are put first; if NA, they are removed (see ‘Note’).

method

the method to be used: partial matches are allowed. The default ("auto") implies "radix" for numeric vectors, integer vectors, logical vectors and factors with fewer than 2^31 elements. Otherwise, it implies "shell". For details of methods "shell", "quick", and "radix", see the help for sort.

Details

In the case of ties in the first vector, values in the second are used to break the ties. If the values are still tied, values in the later arguments are used to break the tie (see the first example). The sort used is stable (except for method = "quick"), so any unresolved ties will be left in their original ordering.

Complex values are sorted first by the real part, then the imaginary part.

Except for method "radix", the sort order for character vectors will depend on the collating sequence of the locale in use: see Comparison.

The "shell" method is generally the safest bet and is the default method, except for short factors, numeric vectors, integer vectors and logical vectors, where "radix" is assumed. Method "radix" stably sorts logical, numeric and character vectors in linear time. It outperforms the other methods, although there are drawbacks, especially for character vectors (see sort). Method "quick" for sort.list is only supported for numeric x with na.last = NA, is not stable, and is slower than "radix".

partial = NULL is supported for compatibility with other implementations of S, but no other values are accepted and ordering is always complete.

For a classed R object, the sort order is taken from xtfrm: as its help page notes, this can be slow unless a suitable method has been defined or is.numeric(x) is true. For factors, this sorts on the internal codes, which is particularly appropriate for ordered factors.

Value

An integer vector unless any of the inputs has 2^31 or more elements, when it is a double vector.

Warning

In programmatic use it is unsafe to name the ... arguments, as the names could match current or future control arguments such as decreasing. A sometimes-encountered unsafe practice is to call do.call('order', df_obj) where df_obj might be a data frame: copy df_obj and remove any names, for example using unname.
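One way to follow this advice is sketched below. The data frame and its column names are hypothetical; the column named decreasing is chosen deliberately because it would clash with the control argument if the names were kept. Converting to a plain list before unname is an assumption made here for clarity, not something the text prescribes.

```r
# Hypothetical data frame whose second column name clashes with an argument
df_obj <- data.frame(x = c(2, 1, 2), decreasing = c(3, 2, 1))

# Drop the names so every column is passed positionally through '...'
keys <- unname(as.list(df_obj))
o <- do.call(order, keys)
o              # 2 3 1: sorted by first column, ties broken by the second
df_obj[o, ]
```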

Note

sort.list can get called by mistake as a method for sort with a list argument: it gives a suitable error message for list x.

There is a historical difference in behaviour for na.last = NA: sort.list removes the NAs and then computes the order amongst the remaining elements: order computes the order amongst the non-NA elements of the original vector. Thus

   x[order(x, na.last = NA)]
   zz <- x[!is.na(x)]; zz[sort.list(x, na.last = NA)]

both sort the non-NA values of x.

Prior to R 3.3.0 method = "radix" was only supported for integers of range less than 100,000.

References

Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988) The New S Language. Wadsworth & Brooks/Cole.

Knuth, D. E. (1998) The Art of Computer Programming, Volume 3: Sorting and Searching. 2nd ed. Addison-Wesley.

See Also

sort, rank, xtfrm.

Examples

require(stats)

(ii <- order(x <- c(1,1,3:1,1:4,3), y <- c(9,9:1), z <- c(2,1:9)))
## 6  5  2  1  7  4 10  8  3  9
rbind(x, y, z)[,ii] # shows the reordering (ties via 2nd & 3rd arg)

## Suppose we wanted descending order on y.
## A simple solution for numeric 'y' is
rbind(x, y, z)[, order(x, -y, z)]
## More generally we can make use of xtfrm
cy <- as.character(y)
rbind(x, y, z)[, order(x, -xtfrm(cy), z)]
## The radix sort supports multiple 'decreasing' values:
rbind(x, y, z)[, order(x, cy, z, decreasing = c(FALSE, TRUE, FALSE),
                       method = "radix")]

## Sorting data frames:
dd <- transform(data.frame(x, y, z),
                z = factor(z, labels = LETTERS[9:1]))
## Either as above {for factor 'z' : using internal coding}:
dd[ order(x, -y, z), ]
## or along 1st column, ties along 2nd, ... *arbitrary* no.{columns}:
dd[ do.call(order, dd), ]

set.seed(1)  # reproducible example:
d4 <- data.frame(x = round(   rnorm(100)), y = round(10*runif(100)),
                 z = round( 8*rnorm(100)), u = round(50*runif(100)))
(d4s <- d4[ do.call(order, d4), ])
(i <- which(diff(d4s[, 3]) == 0))
#   in 2 places, needed 3 cols to break ties:
d4s[ rbind(i, i+1), ]

## rearrange matched vectors so that the first is in ascending order
x <- c(5:1, 6:8, 12:9)
y <- (x - 5)^2
o <- order(x)
rbind(x[o], y[o])

## tests of na.last
a <- c(4, 3, 2, NA, 1)
b <- c(4, NA, 2, 7, 1)
z <- cbind(a, b)
(o <- order(a, b)); z[o, ]
(o <- order(a, b, na.last = FALSE)); z[o, ]
(o <- order(a, b, na.last = NA)); z[o, ]

##  speed examples on an average laptop for long vectors:
##  factor/small-valued integers:
x <- factor(sample(letters, 1e7, replace = TRUE))
system.time(o <- sort.list(x, method = "quick", na.last = NA)) # 0.1 sec
stopifnot(!is.unsorted(x[o]))
system.time(o <- sort.list(x, method = "radix")) # 0.05 sec, 2X faster
stopifnot(!is.unsorted(x[o]))
##  large-valued integers:
xx <- sample(1:200000, 1e7, replace = TRUE)
system.time(o <- sort.list(xx, method = "quick", na.last = NA)) # 0.3 sec
system.time(o <- sort.list(xx, method = "radix")) # 0.2 sec
##  character vectors:
xx <- sample(state.name, 1e6, replace = TRUE)
system.time(o <- sort.list(xx, method = "shell")) # 2 sec
system.time(o <- sort.list(xx, method = "radix")) # 0.007 sec, 300X faster
##  double vectors:
xx <- rnorm(1e6)
system.time(o <- sort.list(xx, method = "shell")) # 0.4 sec
system.time(o <- sort.list(xx, method = "quick", na.last = NA)) # 0.1 sec
system.time(o <- sort.list(xx, method = "radix")) # 0.05 sec, 2X faster

Outer Product of Arrays

Description

The outer product of the arrays X and Y is the array A with dimension c(dim(X), dim(Y)) where element A[c(arrayindex.x, arrayindex.y)] = FUN(X[arrayindex.x], Y[arrayindex.y], ...).

Usage

outer(X, Y, FUN = "*", ...)
X %o% Y

Arguments

X, Y

first and second arguments for function FUN. Typically a vector or array.

FUN

a function to use on the outer products, found via match.fun (except for the special case "*").

...

optional arguments to be passed to FUN.

Details

X and Y must be suitable arguments for FUN. Each will be extended by rep to length the products of the lengths of X and Y before FUN is called.

FUN is called with these two extended vectors as arguments (plus any arguments in ...). It must be a vectorized function (or the name of one) expecting at least two arguments and returning a value with the same length as the first (and the second).

Where they exist, the [dim]names of X and Y will be copied to the answer, and a dimension assigned which is the concatenation of the dimensions of X and Y (or lengths if dimensions do not exist).

FUN = "*" is handled as a special case via as.vector(X) %*% t(as.vector(Y)), and is intended only for numeric vectors and arrays.

%o% is a binary operator providing a wrapper for outer(x, y, "*").
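Because FUN receives the two extended vectors in a single call, a function that only handles scalars has to be vectorized first. The sketch below uses Vectorize for this; pair_max is a hypothetical helper introduced only for illustration.

```r
# Hypothetical scalar function: max() collapses its inputs to one value,
# so it is not vectorized in the sense outer() requires
pair_max <- function(a, b) max(a, b)

# Vectorize() wraps it so that it is applied element by element
m <- outer(1:3, 1:2, Vectorize(pair_max))
dim(m)     # 3 2, i.e. c(length(X), length(Y))
m[1, 2]    # max(1, 2) = 2
m[3, 1]    # max(3, 1) = 3
```

Passing pair_max directly would return a single value rather than one per pair, which outer rejects.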

Author(s)

Jonathan Rougier

References

Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988) The New S Language. Wadsworth & Brooks/Cole.

See Also

%*% for usual (inner) matrix vector multiplication; kronecker which is based on outer; Vectorize for vectorizing a non-vectorized function.

Examples

x <- 1:9; names(x) <- x
# Multiplication & Power Tables
x %o% x
y <- 2:8; names(y) <- paste(y, ":", sep = "")
outer(y, x, `^`)

outer(month.abb, 1999:2003, FUN = paste)

## three way multiplication table:
x %o% x %o% y[1:3]

Parentheses and Braces

Description

Open parenthesis, (, and open brace, {, are .Primitive functions in R.

Effectively, ( is semantically equivalent to the identity function(x) x, whereas { is slightly more interesting, see examples.

Usage

( ... )

{ ... }

Value

For (, the result of evaluating the argument. This has visibility set, so will auto-print if used at top-level.

For {, the result of the last expression evaluated. This has the visibility of the last evaluation.
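The difference in visibility can be inspected programmatically with withVisible, as in this short sketch. The variable names are hypothetical.

```r
# ( always marks its result as visible, even if the argument was invisible
v_paren <- withVisible((invisible(5)))$visible

# { keeps the visibility of the last expression it evaluated
v_brace <- withVisible({ invisible(5) })$visible

v_paren   # TRUE
v_brace   # FALSE
```

This is the reason wrapping an assignment in parentheses, as in (x <- 1), prints the assigned value at top level.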

References

Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988) The New S Language. Wadsworth & Brooks/Cole.

See Also

if, return, etc for other objects used in the R language itself.

Syntax for operator precedence.

Examples

f <- get("(")
e <- expression(3 + 2 * 4)
identical(f(e), e)

do <- get("{")
do(x <- 3, y <- 2*x-3, 6-x-y); x; y

## note the differences
(2+3)
{2+3; 4+5}
(invisible(2+3))
{invisible(2+3)}

Parse R Expressions

Description

parse() returns the parsed but unevaluated expressions in an expression, a “list” of calls.

str2expression(s) and str2lang(s) return special versions of parse(text=s, keep.source=FALSE) and can therefore be regarded as transforming character strings s to expressions, calls, etc.

Usage

parse(file = "", n = NULL, text = NULL, prompt = "?",
      keep.source = getOption("keep.source"), srcfile,
      encoding = "unknown")

str2lang(s)
str2expression(text)

Arguments

file

a connection, or a character string giving the name of a file or a URL to read the expressions from. If file is "" and text is missing or NULL then input is taken from the console.

n

integer (or coerced to integer). The maximum number of expressions to parse. If n is NULL or negative or NA the input is parsed in its entirety.

text

character vector. The text to parse. Elements are treated as if they were lines of a file. Other R objects will be coerced to character if possible.

prompt

the prompt to print when parsing from the keyboard. NULL means to use R's prompt, getOption("prompt").

keep.source

a logical value; if TRUE, keep source reference information.

srcfile

NULL, a character vector, or a srcfile object. See the ‘Details’ section.

encoding

encoding to be assumed for input strings. If the value is "latin1" or "UTF-8" it is used to mark character strings as known to be in Latin-1 or UTF-8: it is not used to re-encode the input. To do the latter, specify the encoding as part of the connection con or via options(encoding=): see the example under file. Arguments encoding = "latin1" and encoding = "UTF-8" are ignored with a warning when running in a MBCS locale.

s

a character vector of length 1, i.e., a “string”.

Details

parse(....):

If text has length greater than zero (after coercion) it is used in preference to file.

All versions of R accept input from a connection with end of line marked by LF (as used on Unix), CRLF (as used on DOS/Windows) or CR (as used on classic Mac OS). The final line can be incomplete, that is missing the final EOL marker.

When input is taken from the console, n = NULL is equivalent to n = 1, and n < 0 will read until an EOF character is read. (The EOF character is Ctrl-Z for the Windows front-ends.) The line-length limit is 4095 bytes when reading from the console (which may impose a lower limit: see ‘An Introduction to R’).

The default for srcfile is set as follows. If keep.source is not TRUE, srcfile defaults to a character string, either "<text>" or one derived from file. When keep.source is TRUE, if text is used, srcfile will be set to a srcfilecopy containing the text. If a character string is used for file, a srcfile object referring to that file will be used.

When srcfile is a character string, error messages will include the name, but source reference information will not be added to the result. When srcfile is a srcfile object, source reference information will be retained.

str2expression(s):

for a character vector s, str2expression(s) corresponds to parse(text = s, keep.source=FALSE), which is always of type (typeof) and class expression.

str2lang(s):

for a character string s, str2lang(s) corresponds to parse(text = s, keep.source=FALSE)[[1]] (plus a check that both s and the parse(*) result are of length one) which is typically a call but may also be a symbol aka name, NULL or an atomic constant such as 2, 1L, or TRUE. Put differently, the value of str2lang(.) is a call or one of its parts, in short “a call or simpler”.

Currently, encoding is not handled in str2lang() and str2expression().

Value

parse() and str2expression() return an object of type "expression", for parse() with up to n elements if specified as a non-negative integer.

str2lang(s), s a string, returns “a call or simpler”, see the ‘Details:’ section.

When srcfile is non-NULL, a "srcref" attribute will be attached to the result containing a list of srcref records corresponding to each element, a "srcfile" attribute will be attached containing a copy of srcfile, and a "wholeSrcref" attribute will be attached containing a srcref record corresponding to all of the parsed text. Detailed parse information will be stored in the "srcfile" attribute, to be retrieved by getParseData.

A syntax error (including an incomplete expression) will throw an error.

Character strings in the result will have a declared encoding if encoding is "latin1" or "UTF-8", or if text is supplied with every element of known encoding in a Latin-1 or UTF-8 locale.

Partial parsing

When a syntax error occurs during parsing, parse signals an error. The partial parse data will be stored in the srcfile argument if it is a srcfile object and the text argument was used to supply the text. In other cases it will be lost when the error is triggered.

The partial parse data can be retrieved using getParseData applied to the srcfile object. Because parsing was incomplete, it will typically include references to "parent" entries that are not present.

Note

Using parse(text = *, ..) or its simplified and hence more efficient versions str2lang() or str2expression() is at least an order of magnitude less efficient than call(..) or as.call().
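As an illustration of the two routes mentioned in this note, the sketch below builds the same call once directly with call() and once by parsing a string. No timings are given here; the variable names are hypothetical.

```r
# Build the call directly from its parts
cl1 <- call("sum", 1, 2)

# Build an equivalent call by parsing text
cl2 <- str2lang("sum(1, 2)")

is.call(cl1); is.call(cl2)   # both are calls
eval(cl1)                    # 3
eval(cl2)                    # 3
```

When the pieces of a call are already available as R objects, call() or as.call() avoids the round trip through text entirely.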

References

Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988) The New S Language. Wadsworth & Brooks/Cole.

Murdoch, D. (2010). “Source References”. The R Journal, 2(2), 16–19. doi:10.32614/RJ-2010-010.

See Also

scan, source, eval, deparse.

The source reference information can be used for debugging (see e.g. setBreakpoint) and profiling (see Rprof). It can be examined by getSrcref and related functions. More detailed information is available through getParseData.

Examples

fil <- tempfile(fileext = ".Rdmped")
cat("x <- c(1, 4)\n  x ^ 3 -10 ; outer(1:7, 5:9)\n", file = fil)
# parse 3 statements from our temp file
parse(file = fil, n = 3)
unlink(fil)

## str2lang(<string>)  || str2expression(<character>) :
stopifnot(exprs = {
  identical( str2lang("x[3] <- 1+4"), quote(x[3] <- 1+4))
  identical( str2lang("log(y)"),      quote(log(y)) )
  identical( str2lang("abc"),         quote(abc) -> qa)
  is.symbol(qa) & !is.call(qa)           # a symbol/name, not a call
  identical( str2lang("1.375"), 1.375)   # just a number, not a call
  identical( str2expression(c("# a comment", "", "42")), expression(42) )
})

# A partial parse with a syntax error
txt <- "
x <- 1
an error
"
sf <- srcfile("txt")
tryCatch(parse(text = txt, srcfile = sf), error = function(e) "Syntax error.")
getParseData(sf)

Concatenate Strings

Description

Concatenate vectors after converting to character. Concatenation happens in two basically different ways, determined by collapse being a string or not.

Usage

paste (..., sep = " ", collapse = NULL, recycle0 = FALSE)
paste0(...,            collapse = NULL, recycle0 = FALSE)

Arguments

...

one or more R objects, to be converted to character vectors.

sep

a character string to separate the terms. Not NA_character_.

collapse

an optional character string to separate the results. Not NA_character_. When collapse is a string, the result is always a string (character of length 1).

recycle0

logical indicating if zero-length character arguments should result in the zero-length character(0). Note that when collapse is a string, recycle0 does not recycle to zero-length, but to "".

Details

paste converts its arguments (via as.character) to character strings, and concatenates them (separating them by the string given by sep).

If the arguments are vectors, they are concatenated term-by-term to give a character vector result. Vector arguments are recycled as needed. Zero-length arguments are recycled as "" unless recycle0 is TRUE and collapse is NULL.

Note that paste() coerces NA_character_, the character missing value, to "NA" which may seem undesirable, e.g., when pasting two character vectors, or very desirable, e.g. in paste("the value of p is ", p).

paste0(..., collapse) is equivalent to paste(..., sep = "", collapse), slightly more efficiently.

If a value is specified for collapse, the values in the result are then concatenated into a single string, with the elements being separated by the value of collapse.
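The two phases described above (term-by-term pasting with sep, then optional collapsing) and the treatment of zero-length arguments can be seen in a compact sketch. The variable names are hypothetical.

```r
# Phase 1 only: term-by-term, one result element per position
by_term <- paste(c("a", "b"), 1:2, sep = "-")
by_term      # "a-1" "b-2"

# Phase 1 followed by phase 2: the elements are joined into one string
joined <- paste(c("a", "b"), 1:2, sep = "-", collapse = "+")
joined       # "a-1+b-2"

# Zero-length arguments are treated as "" by default ...
padded <- paste("a", character(0))
padded       # "a "

# ... but give a zero-length result when recycle0 = TRUE and collapse is NULL
empty <- paste("a", character(0), recycle0 = TRUE)
empty        # character(0)
```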

Value

A character vector of the concatenated values. This will be of length zero if all the objects are, unless collapse is non-NULL, in which case it is "" (a single empty string).

If any input into an element of the result is in UTF-8 (and none are declared with encoding "bytes", see Encoding), that element will be in UTF-8, otherwise in the current encoding in which case the encoding of the element is declared if the current locale is either Latin-1 or UTF-8, at least one of the corresponding inputs (including separators) had a declared encoding and all inputs were either ASCII or declared.

If an input into an element is declared with encoding "bytes", no translation will be done of any of the elements and the resulting element will have encoding "bytes". If collapse is non-NULL, this applies also to the second, collapsing, phase, but some translation may have been done in pasting object together in the first phase.

References

Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988) The New S Language. Wadsworth & Brooks/Cole.

See Also

toString typically calls paste(*, collapse=", "). String manipulation with as.character, substr, nchar, strsplit; further, cat which concatenates and writes to a file, and sprintf for C like string construction.

‘plotmath’ for the use of paste in plot annotation.

Examples

## When passing a single vector, paste0 and paste work like as.character.
paste0(1:12)
paste(1:12)        # same
as.character(1:12) # same

## If you pass several vectors to paste0, they are concatenated in a
## vectorized way.
(nth <- paste0(1:12, c("st", "nd", "rd", rep("th", 9))))

## paste works the same, but separates each input with a space.
## Notice that the recycling rules make every input as long as the longest input.
paste(month.abb, "is the", nth, "month of the year.")
paste(month.abb, letters)

## You can change the separator by passing a sep argument
## which can be multiple characters.
paste(month.abb, "is the", nth, "month of the year.", sep = "_*_")

## To collapse the output into a single string, pass a collapse argument.
paste0(nth, collapse = ", ")

## For inputs of length 1, use the sep argument rather than collapse
paste("1st", "2nd", "3rd", collapse = ", ") # probably not what you wanted
paste("1st", "2nd", "3rd", sep = ", ")

## You can combine the sep and collapse arguments together.
paste(month.abb, nth, sep = ": ", collapse = "; ")

## Using paste() in combination with strwrap() can be useful
## for dealing with long strings.
(title <- paste(strwrap(
    "Stopping distance of cars (ft) vs. speed (mph) from Ezekiel (1930)",
    width = 30), collapse = "\n"))
plot(dist ~ speed, cars, main = title)

## zero length arguments recycled as `""` -- NB: `{}` <==> character(0)  here
paste({}, 1:2)

## 'recycle0 = TRUE' allows standard vectorized behaviour, i.e., zero-length
##                   recycling resulting in zero-length result character(0):
valid <- FALSE
val <- pi
paste("The value is", val[valid], "-- not so good!") # -> ".. value is  -- not .."
paste("The value is", val[valid], "-- good: empty!", recycle0 = TRUE) # -> character(0)

## When 'collapse = <string>',  result is (length 1) string in all cases
paste("foo", {}, "bar", collapse = "|")                 # |-->  "foo  bar"
paste("foo", {},        collapse = "|", recycle0 = TRUE) # |-->  ""
## If all arguments are empty (and collapse a string),   ""  results always
paste(    collapse = "|")
paste(    collapse = "|", recycle0 = TRUE)
paste({}, collapse = "|")
paste({}, collapse = "|", recycle0 = TRUE)

Expand File Paths

Description

Expand a path name, for example by replacing a leading tilde by the user's home directory (if defined on that platform).

Usage

path.expand(path)

Arguments

path

character vector containing one or more path names.

Details

On Unix-alikes:

On most builds of R a leading ~user will expand to the home directory of user.

There are possibly different concepts of ‘home directory’: that usually used is the setting of the environment variable HOME.

The ‘path names’ need not exist nor be valid path names but they do need to be representable in the session encoding.

On Windows:

The definition of the ‘home’ directory is in the ‘rw-FAQ’ Q2.14: it is taken from the R_USER environment variable when path.expand is first called in a session.

The ‘path names’ need not exist nor be valid path names.

Value

A character vector of possibly expanded path names: where the home directory is unknown or none is specified the path is returned unchanged.

If the expansion would exceed the maximum path length the result may be truncated or the path may be returned unchanged.

See Also

basename, normalizePath, file.path.

Examples

path.expand("~/foo")
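As a small illustrative sketch (the path names are made up and need not exist), the following shows that path.expand is vectorized and that paths without a leading tilde come back unchanged. What "~/foo" expands to depends on the platform and the home directory setting, so no output is shown for it.

```r
# Hypothetical path names, used only for illustration.
paths <- c("~/foo", "data/file.csv", "/tmp/file.txt")

expanded <- path.expand(paths)
print(expanded)

# Only a leading tilde is a candidate for expansion, so the second and
# third elements are returned as they were given.
unchanged <- identical(expanded[2:3], paths[2:3])
print(unchanged)
```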

Report Configuration Options for PCRE

Description

Report some of the configuration options of the version of PCRE in use in this R session.

Usage

pcre_config()

Value

A named logical vector, currently with elements

UTF-8

Support for UTF-8 inputs. Required.

Unicode properties

Support for ‘\p{xx}’ and ‘\P{xx}’ in regular expressions. Desirable and used by some CRAN packages. As of PCRE2, always present with support for UTF-8.

JIT

Support for just-in-time compilation. Desirable for speed (but only available as a compile-time option on certain architectures, and may be unused as unreliable on some of those, e.g. arm64).

stack

Does match recursion use a stack (TRUE, the default for PCRE1 and PCRE2 older than 10.30) or a heap? See the discussion at https://www.pcre.org/original/doc/html/pcrestack.html (Added in R 3.4.0.). No longer relevant and always FALSE in PCRE2 since version 10.30 which no longer uses function recursion to remember backtracking positions.

See Also

extSoftVersion for the PCRE version.

Examples

pcre_config()
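One possible use, sketched here under the assumption that the element names are exactly those listed under Value, is to branch on a configuration flag before relying on a PCRE feature. The test string and pattern are illustrative only.

```r
cfg <- pcre_config()

# Report whether just-in-time compilation is available in this build.
if (isTRUE(cfg[["JIT"]])) {
  message("PCRE JIT support is available")
} else {
  message("PCRE JIT support is not available")
}

# Use a Unicode property class only if the build supports it.
if (isTRUE(cfg[["Unicode properties"]])) {
  has_upper <- grepl("\\p{Lu}", "Abc", perl = TRUE)
  print(has_upper)
}
```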

Forward Pipe Operator

Description

Pipe a value into a call expression or a function expression.

Usage

lhs |> rhs

Arguments

lhs

expression producing a value.

rhs

a call expression.

Details

A pipe expression passes, or ‘pipes’, the result of the left-hand-side expression lhs to the right-hand-side expression rhs.

The lhs is inserted as the first argument in the call. So x |> f(y) is interpreted as f(x, y).

To avoid ambiguities, functions in rhs calls may not be syntactically special, such as + or if.

It is also possible to use a named argument with the placeholder _ in the rhs call to specify where the lhs is to be inserted. The placeholder can only appear once on the rhs.

The placeholder can also be used as the first argument in an extraction call, such as _$coef. More generally, it can be used as the head of a chain of extractions, such as _$coef[[2]], using a sequence of the extraction functions $, [, [[, or @.

Pipe notation allows a nested sequence of calls to be written in a way that may make the sequence of processing steps easier to follow.

Currently, pipe operations are implemented as syntax transformations. So an expression written as x |> f(y) is parsed as f(x, y). It is worth emphasizing that while the code in a pipeline is written sequentially, regular R semantics for evaluation apply and so piped expressions will be evaluated only when first used in the rhs expression.
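The syntax transformation can be checked directly by comparing quoted expressions. This is a minimal sketch that assumes R 4.2.0 or later (the _ placeholder is not available in 4.1.x); x, y, f and ignore_arg are placeholder names, not existing objects.

```r
# The parser rewrites the pipe, so both sides are the same call object.
same_call <- identical(quote(x |> f(y)), quote(f(x, y)))

# With a named placeholder the lhs is inserted at that argument instead.
placeholder_call <- identical(quote(x |> f(y, z = _)), quote(f(y, z = x)))

# Lazy evaluation still applies: ignore_arg() (hypothetical) never touches
# its argument, so the lhs is never evaluated and no error is raised.
ignore_arg <- function(...) "lhs was not evaluated"
result <- stop("never reached") |> ignore_arg()

print(c(same_call, placeholder_call))
print(result)
```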

Value

Returns the result of evaluating the transformed expression.

Background

The forward pipe operator is motivated by the pipe introduced in the magrittr package, but is more streamlined. It is similar to the pipe or pipeline operators introduced in other languages, including F#, Julia, and JavaScript.

Warning

This was introduced in R 4.1.0. Code using it will not be parsed as intended (probably with an error) in earlier versions of R.

Examples

# simple uses:
mtcars |> head()                      # same as head(mtcars)
mtcars |> head(2)                     # same as head(mtcars, 2)
mtcars |> subset(cyl == 4) |> nrow()  # same as nrow(subset(mtcars, cyl == 4))

# to pass the lhs into an argument other than the first, either
# use the _ placeholder with a named argument:
mtcars |> subset(cyl == 4) |> lm(mpg ~ disp, data = _)
# or use an anonymous function:
mtcars |> subset(cyl == 4) |> (function(d) lm(mpg ~ disp, data = d))()
mtcars |> subset(cyl == 4) |> (\(d) lm(mpg ~ disp, data = d))()
# or explicitly name the argument(s) before the "one":
mtcars |> subset(cyl == 4) |> lm(formula = mpg ~ disp)

# using the placeholder as the head of an extraction chain:
mtcars |> subset(cyl == 4) |> lm(formula = mpg ~ disp) |> _$coef[[2]]

# the pipe operator is implemented as a syntax transformation:
quote(mtcars |> subset(cyl == 4) |> nrow())

# regular R evaluation semantics apply
stop() |> (function(...) {})() # stop() is not used on RHS so is not evaluated

Generic X-Y Plotting

Description

Generic function for plotting of R objects.

For simple scatter plots, plot.default will be used. However, there are plot methods for many R objects, including functions, data.frames, density objects, etc. Use methods(plot) and the documentation for these. Most of these methods are implemented using traditional graphics (the graphics package), but this is not mandatory.

For more details about graphical parameter arguments used by traditional graphics, see par.

Usage

plot(x, y, ...)

Arguments

x

the coordinates of points in the plot. Alternatively, a single plotting structure, function or any R object with a plot method can be provided.

y

the y coordinates of points in the plot, optional if x is an appropriate structure.

...

arguments to be passed to methods, such as graphical parameters (see par). Many methods will accept the following arguments:

type

what type of plot should be drawn. Possible types are

  • "p" for points,

  • "l" for lines,

  • "b" for both,

  • "c" for the lines part alone of "b",

  • "o" for both ‘overplotted’,

  • "h" for ‘histogram’ like (or ‘high-density’) vertical lines,

  • "s" for stair steps,

  • "S" for other steps, see ‘Details’ below,

  • "n" for no plotting.

All other types give a warning or an error; using, e.g., type = "punkte" being equivalent to type = "p" for S compatibility. Note that some methods, e.g. plot.factor, do not accept this.

main

an overall title for the plot: see title.

sub

a subtitle for the plot: see title.

xlab

a title for the x axis: see title.

ylab

a title for the y axis: see title.

asp

the y/x aspect ratio, see plot.window.

Details

The two step types differ in their x-y preference: Going from (x1, y1) to (x2, y2) with x1 < x2, type = "s" moves first horizontal, then vertical, whereas type = "S" moves the other way around.
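The difference between the two step types can be made concrete by listing the corner points each one passes through. The helper step_vertices() below is hypothetical (it is not part of base R or graphics) and only mirrors the description above; it does not call the plotting code itself.

```r
# Hypothetical helper: corner points visited by a step plot.
step_vertices <- function(x, y, type = c("s", "S")) {
  type <- match.arg(type)
  n <- length(x)
  stopifnot(n >= 2, length(y) == n)
  i <- seq_len(n - 1)
  if (type == "s") {
    # horizontal first: (x[i], y[i]) -> (x[i+1], y[i]) -> (x[i+1], y[i+1])
    vx <- c(x[1], rbind(x[i + 1], x[i + 1]))
    vy <- c(y[1], rbind(y[i], y[i + 1]))
  } else {
    # vertical first: (x[i], y[i]) -> (x[i], y[i+1]) -> (x[i+1], y[i+1])
    vx <- c(x[1], rbind(x[i], x[i + 1]))
    vy <- c(y[1], rbind(y[i + 1], y[i + 1]))
  }
  data.frame(x = vx, y = vy)
}

x <- c(1, 2, 3)
y <- c(1, 3, 2)
s_lower <- step_vertices(x, y, "s")
s_upper <- step_vertices(x, y, "S")
print(s_lower)
print(s_upper)
```

To compare visually, one could draw plot(x, y, type = "s") and overlay lines(step_vertices(x, y, "s")) on an open graphics device.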

Note

The plot generic was moved from the graphics package to the base package in R 4.0.0. It is currently re-exported from the graphics namespace to allow packages importing it from there to continue working, but this may change in future versions of R.

See Also

plot.default, plot.formula and other methods; points, lines, par. For thousands of points, consider using smoothScatter() instead of plot().

For X-Y-Z plotting see contour, persp and image.

Examples

require(stats) # for lowess, rpois, rnorm
require(graphics) # for plot methods
plot(cars)
lines(lowess(cars))

plot(sin, -pi, 2*pi) # see ?plot.function

## Discrete Distribution Plot:
plot(table(rpois(100, 5)), type = "h", col = "red", lwd = 10,
     main = "rpois(100, lambda = 5)")

## Simple quantiles/ECDF, see ecdf() {library(stats)} for a better one:
plot(x <- sort(rnorm(47)), type = "s", main = "plot(x, type = \"s\")")
points(x, cex = .5, col = "dark red")

Partial String Matching

Description

pmatch seeks matches for the elements of its first argument among those of its second.

Usage

pmatch(x, table, nomatch=NA_integer_, duplicates.ok=FALSE)

Arguments

x

the values to be matched: converted to a character vector by as.character. Long vectors are supported.

table

the values to be matched against: converted to a character vector. Long vectors are not supported.

nomatch

the value to be returned at non-matching or multiply partially matching positions. Note that it is coerced to integer.

duplicates.ok

should elements in table be used more than once?

Details

The behaviour differs by the value of duplicates.ok. Consider first the case where this is true. First exact matches are considered, and the positions of the first exact matches are recorded. Then unique partial matches are considered, and if found recorded. (A partial match occurs if the whole of the element of x matches the beginning of the element of table.) Finally, all remaining elements of x are regarded as unmatched. In addition, an empty string can match nothing, not even an exact match to an empty string. This is the appropriate behaviour for partial matching of character indices, for example.

If duplicates.ok is FALSE, values of table once matched are excluded from the search for subsequent matches. This behaviour is equivalent to the R algorithm for argument matching, except for the consideration of empty strings (which in argument matching are matched after exact and partial matching to any remaining arguments).

charmatch is similar to pmatch with duplicates.ok true, the differences being that it differentiates between no match and an ambiguous partial match, it does match empty strings, and it does not allow multiple exact matches.

NA values are treated as if they were the string constant "NA".
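A short sketch of the rules above, using made-up lookup tables: an ambiguous partial match yields nomatch, and duplicates.ok decides whether a table entry that has already been matched can be matched again.

```r
tbl <- c("mean", "median", "mode")

# "me" partially matches both "mean" and "median", so it is unmatched;
# "mo" has a unique partial match.
ambiguous <- pmatch(c("me", "mo"), tbl)
print(ambiguous)

# The value used for unmatched positions can be changed via nomatch.
with_zero <- pmatch(c("me", "mo"), tbl, nomatch = 0L)
print(with_zero)

# With duplicates.ok = FALSE the exact match "x" is used up by the first
# element, so the second falls back to the unique partial match "xy".
no_dup <- pmatch(c("x", "x"), c("x", "xy"), duplicates.ok = FALSE)
dup_ok <- pmatch(c("x", "x"), c("x", "xy"), duplicates.ok = TRUE)
print(no_dup)
print(dup_ok)
```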

Value

An integer vector (possibly including NA if nomatch = NA) of the same length as x, giving the indices of the elements in table which matched, or nomatch.

References

Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988) The New S Language. Wadsworth & Brooks/Cole.

Chambers, J. M. (1998) Programming with Data. A Guide to the S Language. Springer.

See Also

match, charmatch and match.arg, match.fun, match.call, for function argument matching etc., startsWith for particular checking of initial matches; grep etc for more general (regexp) matching of strings.

Examples

pmatch("", "")                             # returns NA
pmatch("m",   c("mean", "median", "mode")) # returns NA
pmatch("med", c("mean", "median", "mode")) # returns 2

pmatch(c("", "ab", "ab"), c("abc", "ab"), duplicates.ok = FALSE)
pmatch(c("", "ab", "ab"), c("abc", "ab"), duplicates.ok = TRUE)
## compare
charmatch(c("", "ab", "ab"), c("abc", "ab"))

Find Zeros of a Real or Complex Polynomial

Description

Find zeros of a real or complex polynomial.

Usage

polyroot(z)

Arguments

z

the vector of polynomial coefficients in increasing order.

Details

A polynomial of degree n - 1,

p(x) = z_1 + z_2 x + \cdots + z_n x^{n-1}

is given by its coefficient vector z[1:n]. polyroot returns the n - 1 complex zeros of p(x) using the Jenkins-Traub algorithm.

If the coefficient vector z has zeroes for the highest powers, these are discarded.

There is no maximum degree, but numerical stability may be an issue for all but low-degree polynomials.

Value

A complex vector of length n - 1, where n is the position of the largest non-zero element of z.
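As a worked check of the coefficient ordering, consider p(x) = 6 - 5x + x^2 = (x - 2)(x - 3). The sketch below is only an illustration; the tolerance used when comparing is an arbitrary choice, since the roots are computed numerically.

```r
# Coefficients in increasing order of power: 6 - 5x + x^2
coefs <- c(6, -5, 1)
roots <- polyroot(coefs)
print(roots)

# Evaluate p at each computed root; the values should be close to zero.
p_at_roots <- sapply(roots, function(r) sum(coefs * r^(0:2)))
print(Mod(p_at_roots))

# Zero coefficients for the highest powers are discarded, so padding
# does not change the number of roots returned.
n_padded <- length(polyroot(c(6, -5, 1, 0, 0)))
print(n_padded)
```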

Source

C translation by Ross Ihaka of Fortran code in the reference, with modifications by the R Core Team.

References

Jenkins, M. A. and Traub, J. F. (1972). Algorithm 419: zeros of a complex polynomial. Communications of the ACM, 15(2), 97–99. doi:10.1145/361254.361262.

See Also

uniroot for numerical root finding of arbitrary functions; complex and the zero example in the demos directory.

Examples

polyroot(c(1, 2, 1))
round(polyroot(choose(8, 0:8)), 11) # guess what!
for (n1 in 1:4) print(polyroot(1:n1), digits = 4)
polyroot(c(1, 2, 1, 0, 0)) # same as the first

Convert Positions in the Search Path to Environments

Description

Returns the environment at a specified position in the search path.

Usage

pos.to.env(x)

Arguments

x

an integer between 1 and length(search()), the length of the search path, or -1.

Details

Several R functions for manipulating objects in environments (such as get and ls) allow specifying environments via corresponding positions in the search path. pos.to.env is a convenience function for programmers which converts these positions to corresponding environments; users will typically have no need for it. It is primitive.

-1 is interpreted as the environment the function is called from.

This is a primitive function.
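The correspondence between positions and environments can be sketched as follows, assuming an ordinary session in which position 1 is the global environment and the last position is the base package.

```r
n <- length(search())

# Position 1 is the global environment; the last position is base.
first_is_global <- identical(pos.to.env(1), globalenv())
last_is_base <- identical(pos.to.env(n), baseenv())
print(c(first_is_global, last_is_base))

# Looking an object up by position or by the converted environment
# gives the same result.
same_lookup <- identical(get("pi", pos = n),
                         get("pi", envir = pos.to.env(n)))
print(same_lookup)
```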

Examples

pos.to.env(1) # R_GlobalEnv
# the next returns the base environment
pos.to.env(length(search()))

Pretty Breakpoints

Description

Compute a sequence of about n+1 equally spaced ‘round’ values which cover the range of the values in x. The values are chosen so that they are 1, 2 or 5 times a power of 10.

Usage

pretty(x, ...)

## Default S3 method:
pretty(x, n = 5, min.n = n %/% 3,  shrink.sml = 0.75,
       high.u.bias = 1.5, u5.bias = .5 + 1.5*high.u.bias,
       eps.correct = 0, f.min = 2^-20, ...)

.pretty(x, n = 5L, min.n = n %/% 3,  shrink.sml = 0.75,
       high.u.bias = 1.5, u5.bias = .5 + 1.5*high.u.bias,
       eps.correct = 0L, f.min = 2^-20, bounds = TRUE)

Arguments

x

an object coercible to numeric by as.numeric.

n

integer giving the desired number of intervals. Non-integer values are rounded down.

min.n

nonnegative integer giving the minimal number of intervals. If min.n == 0, pretty(.) may return a single value.

shrink.sml

positive number, a factor (smaller than one) by which a default scale is shrunk in the case when range(x) is very small (usually 0).

high.u.bias

non-negative numeric, typically > 1. The interval unit is determined as {1,2,5,10} times b, a power of 10. Larger high.u.bias values favor larger units.

u5.bias

non-negative numeric multiplier favoring factor 5 over 2. Default and ‘optimal’: u5.bias = .5 + 1.5*high.u.bias.

eps.correct

integer code, one of {0,1,2}. If non-0, an epsilon correction is made at the boundaries such that the result boundaries will be outside range(x); in the small case, the correction is only done if eps.correct >= 2.

f.min

positive factor multiplied by .Machine$double.xmin to get the smallest “acceptable” cell c_m which determines the unit of the algorithm. Smaller cell values are set to c_n signalling a warning about being “corrected”. New from R 4.2.0: previously f.min = 20 was hardcoded in the algorithm.

bounds

a logical indicating if the resulting vector should cover the full range(x), i.e., strictly include the bounds of x. New from R 4.2.0, allowing bounds=FALSE to reproduce how R's graphics engine computes axis tick locations (in GEPretty()).

...

further arguments for methods.

Details

pretty ignores non-finite values in x.

Let d <- max(x) - min(x) \ge 0. If d is not (very close) to 0, we let c <- d/n, otherwise more or less c <- max(abs(range(x)))*shrink.sml / min.n. Then, the 10 base b is 10^{\lfloor \log_{10}(c) \rfloor} such that b \le c < 10b.

Now determine the basic unit u as one of \{1,2,5,10\} b, depending on c/b \in [1,10) and the two ‘bias’ coefficients, h = high.u.bias and f = u5.bias.
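The first steps of this computation can be followed by hand for x = 1:15 and n = 5. The sketch below only reproduces the quantities d, c and b described above (the variable names cell and ratio are chosen here for illustration); the choice of the unit among {1,2,5,10} b is left to pretty() itself.

```r
x <- 1:15
n <- 5

d <- max(x) - min(x)          # range width: 14
cell <- d / n                 # the quantity called c above: 2.8
b <- 10^floor(log10(cell))    # power of 10 with b <= c < 10 b: here 1
ratio <- cell / b             # lies in [1, 10)

breaks <- pretty(x, n = n)
unit <- unique(diff(breaks))  # the unit actually chosen, a multiple of b

print(c(d = d, cell = cell, b = b, ratio = ratio))
print(breaks)
print(unit)
```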

.........

Value

pretty() returns a numeric vector of approximately n increasing numbers which are “pretty” in decimal notation. (in extreme range cases, the numbers can no longer be “pretty” given the other constraints; e.g., for pretty(..)

For ease of investigating the underlying C R_pretty() function, .pretty() returns a named list. By default, when bounds=TRUE, the entries are l, u, and n, whereas for bounds=FALSE, they are ns, nu, n, and (a “pretty”) unit where the n*'s are integer valued (but only n is of class integer). Programmers may use this to create pretty sequence (iterator) objects.
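A brief sketch of inspecting the list returned by .pretty(), assuming R 4.2.0 or later for the bounds argument. Rebuilding the sequence from l, u and n in this way is an assumption about how the entries relate to the output of pretty(), with n taken to be the number of intervals.

```r
# Default: entries l, u and n.
p_bounds <- .pretty(1:15)
print(names(p_bounds))

# Rebuild the breakpoints: n intervals between l and u.
rebuilt <- seq(p_bounds$l, p_bounds$u, length.out = p_bounds$n + 1)
print(rebuilt)

# With bounds = FALSE the list has a different set of entries.
p_free <- .pretty(1:15, bounds = FALSE)
print(names(p_free))
```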

References

Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988) The New S Language. Wadsworth & Brooks/Cole.

See Also

axTicks for the computation of pretty axis tick locations in plots, particularly on the log scale.

Examples

pretty(1:15)                    # 0  2  4  6  8 10 12 14 16
pretty(1:15, high.u.bias = 2)   # 0  5 10 15
pretty(1:15, n = 4)             # 0  5 10 15
pretty(1:15 * 2)                # 0  5 10 15 20 25 30
pretty(1:20)                    # 0  5 10 15 20
pretty(1:20, n = 2)             # 0 10 20
pretty(1:20, n = 10)            # 0  2  4 ... 20

for (k in 5:11) {
  cat("k=", k, ": "); print(diff(range(pretty(100 + c(0, pi*10^-k)))))}

##-- more bizarre, when  min(x) == max(x):
pretty(pi)

add.names <- function(v) { names(v) <- paste(v); v}
utils::str(lapply(add.names(-10:20), pretty))
## min.n = 0  returns a length-1 vector "if pretty":
utils::str(lapply(add.names(0:20),  pretty, min.n = 0))
sapply(    add.names(0:20),   pretty, min.n = 4)

pretty(1.234e100)
pretty(1001.1001)
pretty(1001.1001, shrink.sml = 0.2)
for (k in -7:3)
  cat("shrink=", formatC(2^k, width = 9), ":",
      formatC(pretty(1001.1001, shrink.sml = 2^k), width = 6), "\n")

Look Up a Primitive Function

Description

.Primitive looks up by name a ‘primitive’ (internally implemented) function.

Usage

.Primitive(name)

Arguments

name

name of the R function.

Details

The advantage of .Primitive over .Internal functions is the potential efficiency of argument passing, and that positional matching can be used where desirable, e.g. in switch. For more details, see the ‘R Internals’ manual.

All primitive functions are in the base namespace.

This function is almost never used: `name` or, more carefully, get(name, envir = baseenv()) work equally well and do not depend on knowing which functions are primitive (which does change as R evolves).
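The two ways of obtaining a primitive mentioned above can be compared side by side. This is a small sketch; sqrt and if are used only because they are long-standing primitives.

```r
# Look the function up as a primitive, and via the base environment.
sqrt_prim <- .Primitive("sqrt")
sqrt_get <- get("sqrt", envir = baseenv())

print(is.primitive(sqrt_prim))
print(sqrt_prim(16))
print(sqrt_get(16))

# Primitives come in two types, see is.primitive.
type_sqrt <- typeof(.Primitive("sqrt"))
type_if <- typeof(.Primitive("if"))
print(c(type_sqrt, type_if))
```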

See Also

is.primitive showing that primitive functions come in two types (typeof), .Internal.

Examples

mysqrt <- .Primitive("sqrt")
c
.Internal # this one *must* be primitive!
`if` # need backticks

Print Values

Description

print prints its argument and returns it invisibly (via invisible(x)). It is a generic function which means that new printing methods can be easily added for new classes.

Usage

print(x, ...)

## S3 method for class 'factor'
print(x, quote = FALSE, max.levels = NULL,
      width = getOption("width"), ...)

## S3 method for class 'table'
print(x, digits = getOption("digits"), quote = FALSE,
      na.print = "", zero.print = "0",
      right = is.numeric(x) || is.complex(x),
      justify = "none", ...)

## S3 method for class 'function'
print(x, useSource = TRUE, ...)

Arguments

x

an object used to select a method.

...

further arguments passed to or from other methods.

quote

logical, indicating whether or not strings should be printed with surrounding quotes.

max.levels

integer, indicating how many levels should be printed for a factor; if 0, no extra "Levels" line will be printed. The default, NULL, entails choosing max.levels such that the levels print on one line of width width.

width

only used when max.levels is NULL, see above.

digits

minimal number of significant digits, see print.default.

na.print

character string (or NULL) indicating NA values in printed output, see print.default.

zero.print

character specifying how zeros (0) should be printed; for sparse tables, using "." can produce more readable results, similar to printing sparse matrices in Matrix.

right

logical, indicating whether or not strings should be right aligned.

justify

character indicating if strings should be left- or right-justified or left alone, passed to format.

useSource

logical indicating if internally stored source should be used for printing when present, e.g., if options(keep.source = TRUE) has been in use.

Details

The default method, print.default has its own help page. Use methods("print") to get all the methods for the print generic.

print.factor allows some customization and is used for printing ordered factors as well.

print.table for printing tables allows other customization. As of R 3.0.0, it only prints a description in case of a table with 0-extents (this can happen if a classifier has no valid data).

See noquote as an example of a class whose main purpose is a specific print method.
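Adding a print method for a new class takes only a function named print.<class>. The class "temperature" and its method below are hypothetical, written only to illustrate dispatch and the convention of returning the argument invisibly.

```r
# Hypothetical S3 class with its own print method.
print.temperature <- function(x, ...) {
  cat("Temperature:", format(unclass(x)), "degrees C\n")
  invisible(x)  # follow the convention of returning x invisibly
}

t1 <- structure(21.5, class = "temperature")

# print() dispatches on the class attribute.
shown <- capture.output(print(t1))
print(shown)

# The value is returned, but marked invisible.
res <- withVisible(print(t1))
print(res$visible)
```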

References

Chambers, J. M. and Hastie, T. J. (1992) Statistical Models in S. Wadsworth & Brooks/Cole.

See Also

The default method print.default, and help for the methods above; further options, noquote.

For more customizable (but cumbersome) printing, see cat, format or also write. For a simple prototypical print method, see .print.via.format in package tools.

Examples

require(stats)

ts(1:20)  #-- print is the "Default function" --> print.ts(.) is called
for (i in 1:3) print(1:i)

## Printing of factors
attenu$station ## 117 levels -> 'max.levels' depending on width

## ordered factors: levels  "l1 < l2 < .."
esoph$agegp[1:12]
esoph$alcgp[1:12]

## Printing of sparse (contingency) tables
set.seed(521)
t1 <- round(abs(rt(200, df = 1.8)))
t2 <- round(abs(rt(200, df = 1.4)))
table(t1, t2) # simple
print(table(t1, t2), zero.print = ".") # nicer to read

## same for non-integer "table":
T <- table(t2, t1)
T <- T * (1 + round(rlnorm(length(T)))/4)
print(T, zero.print = ".") # quite nicer,
print.table(T[, 2:8] * 1e9, digits = 3, zero.print = ".")
## still slightly inferior to  Matrix::Matrix(T)  for larger T

## Corner cases with empty extents:
table(1, NA) # < table of extent 1 x 0 >

Printing Data Frames

Description

Print a data frame.

Usage

## S3 method for class 'data.frame'
print(x, ..., digits = NULL,
      quote = FALSE, right = TRUE, row.names = TRUE, max = NULL)

Arguments

x

object of class data.frame.

...

optional arguments to print methods.

digits

the minimum number of significant digits to be used: see print.default.

quote

logical, indicating whether or not entries should be printed with surrounding quotes.

right

logical, indicating whether or not strings should be right-aligned. The default is right-alignment.

row.names

logical (or character vector), indicating whether (or what) row names should be printed.

max

numeric or NULL, specifying the maximal number of entries to be printed. By default, when NULL, getOption("max.print") is used.

Details

This calls format which formats the data frame column-by-column, then converts to a character matrix and dispatches to the print method for matrices.

When quote = TRUE only the entries are quoted not the row names nor the column names.
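The effect of quote and row.names can be inspected by capturing the printed lines. The data frame below is made up for illustration; the exact column alignment is not shown because it depends on the formatted widths.

```r
df <- data.frame(n = 1:2, s = c("a", "b"))

# Default printing: no quotes, row names shown.
out_default <- capture.output(print(df))
print(out_default)

# Entries quoted, row names suppressed; the header stays unquoted.
out_quoted <- capture.output(print(df, quote = TRUE, row.names = FALSE))
print(out_quoted)
```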

See Also

data.frame.

Examples

(dd <- data.frame(x = 1:8, f = gl(2, 4), ch = I(letters[1:8])))
     # print() with defaults
print(dd, quote = TRUE, row.names = FALSE)
     # suppresses row.names and quotes all entries

Default Printing

Description

print.default is the default method of the generic print function which prints its argument.

Usage

## Default S3 method:
print(x, digits = NULL, quote = TRUE,
      na.print = NULL, print.gap = NULL, right = FALSE,
      max = NULL, width = NULL, useSource = TRUE, ...)

Arguments

x

the object to be printed.

digits

a non-null value for digits specifies the minimum number of significant digits to be printed in values. The default, NULL, uses getOption("digits"). (For the interpretation for complex numbers see signif.) Non-integer values will be rounded down, and only values greater than or equal to 1 and no greater than 22 are accepted.

quote

logical, indicating whether or not strings (characters) should be printed with surrounding quotes.

na.print

a character string which is used to indicate NA values in printed output, or NULL (see ‘Details’).

print.gap

a non-negative integer \le 1024, or NULL (meaning 1), giving the spacing between adjacent columns in printed vectors, matrices and arrays.

right

logical, indicating whether or not strings should be right aligned. The default is left alignment.

max

a non-null value for max specifies the approximate maximum number of entries to be printed. The default, NULL, uses getOption("max.print"): see that help page for more details.

width

controls the maximum number of columns on a line used in printing vectors, matrices, etc. The default, NULL, uses getOption("width"): see that help page for more details including allowed values.

useSource

logical, indicating whether to use source references or copies rather than deparsing language objects. The default is to use the original source if it is available.

...

further arguments to be passed to or from other methods. They are ignored in this function.

Details

The default for printing NAs is to print NA (without quotes) unless this is a character NA and quote = FALSE, when ‘<NA>’ is printed.

The same number of decimal places is used throughout a vector. This means that digits specifies the minimum number of significant digits to be used, and that at least one entry will be encoded with that minimum number. However, if all the encoded elements then have trailing zeroes, the number of decimal places is reduced until at least one element has a non-zero final digit. Decimal points are only included if at least one decimal place is selected.
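The rule about a common number of decimal places can be seen by capturing a few printed vectors. This is a minimal sketch with made-up values, assuming the default digits option of 7 for the calls that do not set digits.

```r
# 2.25 needs two decimals to show 3 significant digits,
# so every element is printed with two decimals.
out_common <- capture.output(print(c(1.5, 2.25, 3), digits = 3))
print(out_common)

# Trailing zeroes shared by all elements are dropped.
out_trim <- capture.output(print(c(1.10, 2.20)))
print(out_trim)

# No decimal point when no decimal place is needed.
out_int <- capture.output(print(c(10, 20)))
print(out_int)
```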

You can suppress “exponential” / scientific notation in printing of numbers (atomic vectors x), via format(., scientific=FALSE), see the prI() example below, or also by increasing global option scipen, e.g., options(scipen = 12).
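Both approaches can be sketched for a single large value. The example assumes the default scipen = 0 at the start and restores the previous option value afterwards.

```r
# Default formatting prefers the shorter scientific representation.
sci <- format(1e10)
print(sci)

# Per-call: force fixed notation.
fixed <- format(1e10, scientific = FALSE)
print(fixed)

# Globally: penalise scientific notation, then restore the old setting.
old <- options(scipen = 12)
fixed_print <- capture.output(print(1e10))
options(old)
print(fixed_print)
```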

Attributes are printed respecting their class(es), using the values of digits to print.default, but using the default values (for the methods called) of the other arguments.

Option width controls the printing of vectors, matrices and arrays, and option deparse.cutoff controls the printing of language objects such as calls and formulae.

When the methods package is attached, print will call show for R objects with formal classes (‘S4’) if called with no optional arguments.

Large number of digits

Note that for large values of digits, currently for digits >= 16, the calculation of the number of significant digits will depend on the platform's internal (C library) implementation of ‘sprintf()’ functionality.

Single-byte locales

If a non-printable character is encountered during output, it is represented as one of the ANSI escape sequences (‘\a’, ‘\b’, ‘\f’, ‘\n’, ‘\r’, ‘\t’, ‘\v’, ‘\\’ and ‘\0’: see Quotes), or failing that as a 3-digit octal code: for example the UK currency pound sign in the C locale (if implemented correctly) is printed as ‘\243’. Which characters are non-printable depends on the locale. (Because some versions of Windows get this wrong, all bytes with the upper bit set are regarded as printable on Windows in a single-byte locale.)

Unicode and other multi-byte locales

In all locales, the characters in the ASCII range (‘0x00’ to ‘0x7f’) are printed in the same way, as-is if printable, otherwise via ANSI escape sequences or 3-digit octal escapes as described for single-byte locales. Whether a character is printable depends on the current locale and the operating system (C library).

Multi-byte non-printing characters are printed as an escape sequence of the form ‘\uxxxx’ or ‘\Uxxxxxxxx’ (in hexadecimal). This is the internal code for the wide-character representation of the character. If this is not known to be Unicode code points, a warning is issued. The only known exceptions are certain Japanese ISO 2022 locales on commercial Unixes, which use a concatenation of the bytes: it is unlikely that R compiles on such a system.

It is possible to have a character string in a character vector that is not valid in the current locale. If a byte is encountered that is not part of a valid character it is printed in hex in the form ‘\xab’ and this is repeated until the start of a valid character. (This will rapidly recover from minor errors in UTF-8.)

See Also

The generic print, options. The "noquote" class and print method.

encodeString, which encodes a character vector the way it would be printed.
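The escaping applied when printing can be observed without printing by calling encodeString. A small sketch, with strings chosen only for illustration:

```r
# A tab and a newline become the two-character escapes \t and \n.
enc <- encodeString("a\tb\n")
print(nchar(enc))
cat(enc, "\n")

# With a quote character, embedded quotes are escaped as well.
enc_quoted <- encodeString("say \"hi\"", quote = '"')
cat(enc_quoted, "\n")

# This matches what print() shows for the same string.
printed <- capture.output(print("a\tb\n"))
print(printed)
```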

Examples

pi
print(pi, digits = 16)
LETTERS[1:16]
print(LETTERS, quote = FALSE)

M <- cbind(I = 1, matrix(1:10000, ncol = 10,
                         dimnames = list(NULL, LETTERS[1:10])))
utils::head(M) # makes more sense than
print(M, max = 1000) # prints 90 rows and a message about omitting 910

(x <- 2^seq(-8, 30, by = 1/4)) # auto-prints; by default all in "exponential" format
prI <- function(x) noquote(format(x, scientific = FALSE))
prI(x) # prints more "nicely" (using a bit more space)

Print Matrices, Old-style

Description

An earlier method for printing matrices, provided for S compatibility.

Usage

prmatrix(x, rowlab =, collab =,
         quote = TRUE, right = FALSE, na.print = NULL, ...)

Arguments

x

numeric or character matrix.

rowlab,collab

(optional) character vectors giving row or column names respectively. By default, these are taken from dimnames(x).

quote

logical; if TRUE and x is of mode "character", quotes (‘"’) are used.

right

if TRUE and x is of mode "character", the output columns are right-justified.

na.print

how NAs are printed. If this is non-null, its value is used to represent NA.

...

arguments for print methods.

Details

prmatrix is an earlier form of print.matrix, and is very similar to the S function of the same name.

Value

Invisibly returns its argument, x.

References

Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988) The New S Language. Wadsworth & Brooks/Cole.

See Also

print.default, and other print methods.

Examples

prmatrix(m6 <- diag(6), rowlab = rep("", 6), collab = rep("", 6))

chm <- matrix(scan(system.file("help", "AnIndex", package = "splines"),
                   what = ""), , 2, byrow = TRUE)
chm # uses print.matrix()
prmatrix(chm, collab = paste("Column", 1:3), right = TRUE, quote = FALSE)

Running Time of R

Description

proc.time determines how much real and CPU time (in seconds) the currently running R process has already taken.

Usage

proc.time()

Details

proc.time returns five elements for backwards compatibility, but its print method prints a named vector of length 3. The first two entries are the total user and system CPU times of the current R process and any child processes on which it has waited, and the third entry is the ‘real’ elapsed time since the process was started.

Value

An object of class "proc_time" which is a numeric vector of length 5, containing the user, system, and total elapsed times for the currently running R process, and the cumulative sum of user and system times of any child processes spawned by it on which it has waited. (The print method uses the summary method to combine the child times with those of the main process.)
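The difference between the length-5 object and its length-3 summary can be checked directly. This sketch also times a short, arbitrary computation by differencing two calls; the measured values will vary from run to run.

```r
pt <- proc.time()
print(length(pt))   # five elements
print(class(pt))

# The summary method folds the child times into user and system.
s <- summary(pt)
print(length(s))    # three elements

# Timing by differencing two calls.
t0 <- proc.time()
for (i in 1:50) mad(stats::runif(500))
elapsed <- (proc.time() - t0)[["elapsed"]]
print(elapsed)
```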

The definition of ‘user’ and ‘system’ times is from your OS. Typically it is something like

The ‘user time’ is the CPU time charged for the execution of user instructions of the calling process. The ‘system time’ is the CPU time charged for execution by the system on behalf of the calling process.

Times of child processes are not available on Windows and will always be given as NA.

The resolution of the times will be system-specific and on Unix-alikes times are rounded down to milliseconds. On modern systems they will be that accurate, but on older systems they might be accurate to 1/100 or 1/60 sec. They are typically available to 10ms on Windows.

This is a primitive function.

References

Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988) The New S Language. Wadsworth & Brooks/Cole.

See Also

system.time for timing an R expression, gc.time for how much of the time was spent in garbage collection.

setTimeLimit to limit the CPU or elapsed time for the session or an expression.

Examples

## a way to time an R expression: system.time is preferred
ptm <- proc.time()
for (i in 1:50) mad(stats::runif(500))
proc.time() - ptm

Product of Vector Elements

Description

prod returns the product of all the values present in its arguments.

Usage

prod(..., na.rm=FALSE)

Arguments

...

numeric or complex or logical vectors.

na.rm

logical. Should missing values be removed?

Details

If na.rm is FALSE an NA value in any of the arguments will cause a value of NA to be returned, otherwise NA values are ignored.

This is a generic function: methods can be defined for it directly or via the Summary group generic. For this to work properly, the arguments ... should be unnamed, and dispatch is on the first argument.

Logical true values are regarded as one, false values as zero. For historical reasons, NULL is accepted and treated as if it were numeric(0).

Value

The product, a numeric (of type "double") or complex vector of length one. NB: the product of an empty set is one, by definition.
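The edge cases described above can be checked with a few one-line calls. This is only an illustration; the values are arbitrary.

```r
empty_prod <- prod(numeric(0))              # empty product is 1
null_prod <- prod(NULL)                     # NULL is treated as numeric(0)
na_prod <- prod(c(2, NA, 3))                # NA propagates by default
na_removed <- prod(c(2, NA, 3), na.rm = TRUE)
logical_prod <- prod(TRUE, TRUE, FALSE)     # TRUE counts as 1, FALSE as 0
result_type <- typeof(prod(1:3))            # double, even for integer input

print(c(empty_prod, null_prod, na_prod, na_removed, logical_prod))
print(result_type)
```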

S4 methods

This is part of the S4 Summary group generic. Methods for it must use the signature x, ..., na.rm.

References

Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988) The New S Language. Wadsworth & Brooks/Cole.

See Also

sum, cumprod, cumsum.

‘plotmath’ for the use of prod in plot annotation.

Examples

print(prod(1:7)) == print(gamma(8))

Express Table Entries as Fraction of Marginal Table

Description

Returns conditional proportions givenmargins, i.e.,entries ofx, divided by the appropriate marginal sums.

Usage

proportions(x, margin = NULL)
prop.table(x, margin = NULL)

Arguments

x

an array, usually a table.

margin

a vector giving the margins to split by. E.g., for a matrix 1 indicates rows, 2 indicates columns, c(1, 2) indicates rows and columns. When x has named dimnames, it can be a character vector selecting dimension names.

Value

A table or array like x, expressed relative to margin.
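
As a small sketch of what the margin argument does (the matrix m is arbitrary), the proportions sum to one over whatever is left once the chosen margin is fixed. The comparison with sweep() is included only to illustrate the more general mechanism.

```r
m <- matrix(1:4, 2)

p_all <- proportions(m)      # margin = NULL: divide by sum(m)
p_row <- proportions(m, 1)   # each row sums to 1
p_col <- proportions(m, 2)   # each column sums to 1

# The same row proportions via the more general sweep():
p_row2 <- sweep(m, 1, rowSums(m), "/")

stopifnot(
  all.equal(sum(p_all), 1),
  all.equal(rowSums(p_row), c(1, 1)),
  all.equal(colSums(p_col), c(1, 1)),
  all.equal(p_row, p_row2)
)
```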

Note

prop.table is an earlier name, retained for back-compatibility.

Author(s)

Peter Dalgaard

See Also

marginSums.

apply and sweep are more general mechanisms for sweeping out marginal statistics.

Examples

m <- matrix(1:4, 2)
m
proportions(m, 1)

DF <- as.data.frame(UCBAdmissions)
tbl <- xtabs(Freq ~ Gender + Admit, DF)
tbl
proportions(tbl, "Gender")

Push Text Back on to a Connection

Description

Functions to push back text lines onto a connection, and to enquire how many lines are currently pushed back.

Usage

pushBack(data, connection, newLine = TRUE,
         encoding = c("", "bytes", "UTF-8"))
pushBackLength(connection)
clearPushBack(connection)

Arguments

data

a character vector.

connection

a connection.

newLine

logical. If true, a newline is appended to each string pushed back.

encoding

character string, partially matched. See details.

Details

Several character strings can be pushed back on one or more occasions. The occasions form a stack, so the first line to be retrieved will be the first string from the last call to pushBack. Lines which are pushed back are read prior to the normal input from the connection, by the normal text-reading functions such as readLines and scan.

Pushback is only allowed for readable connections in text mode.

Not all uses of connections respect pushbacks, in particular the input connection is still wired directly, so for example parsing commands from the console and scan("") ignore pushbacks on stdin.

When character strings with a marked encoding (see Encoding) are pushed back they are converted to the current encoding if encoding = "". This may involve representing characters as ‘<U+xxxx>’ if they cannot be converted. They will be converted to UTF-8 if encoding = "UTF-8" or left as-is if encoding = "bytes".

Value

pushBack and clearPushBack() return nothing, invisibly.

pushBackLength returns the number of lines currently pushed back.
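
The stack behaviour described under ‘Details’ can be sketched with two successive calls to pushBack on a text connection. The strings are made up; the point is only the order in which they come back.

```r
zz <- textConnection(c("line 1", "line 2"))

pushBack(c("a", "b"), zz)        # first call
pushBack("c", zz)                # second (most recent) call
n_pushed <- pushBackLength(zz)   # 3 lines are now pushed back

first  <- readLines(zz, 1)  # "c": from the most recent call
second <- readLines(zz, 2)  # "a" "b": the earlier call, in its own order
third  <- readLines(zz, 1)  # "line 1": only now the connection's own input

close(zz)
```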

See Also

connections, readLines.

Examples

zz <- textConnection(LETTERS)
readLines(zz, 2)
pushBack(c("aa", "bb"), zz)
pushBackLength(zz)
readLines(zz, 1)
pushBackLength(zz)
readLines(zz, 1)
readLines(zz, 1)
close(zz)

The QR Decomposition of a Matrix

Description

qr computes the QR decomposition of a matrix.

Usage

qr(x, ...)
## Default S3 method:
qr(x, tol = 1e-07, LAPACK = FALSE, ...)

qr.coef(qr, y)
qr.qy(qr, y)
qr.qty(qr, y)
qr.resid(qr, y)
qr.fitted(qr, y, k = qr$rank)
qr.solve(a, b, tol = 1e-7)
## S3 method for class 'qr'
solve(a, b, ...)

is.qr(x)
as.qr(x)

Arguments

x

a numeric or complex matrix whose QR decomposition is to be computed. Logical matrices are coerced to numeric.

tol

the tolerance for detecting linear dependencies in the columns of x. Only used if LAPACK is false and x is real.

qr

a QR decomposition of the type computed by qr.

y, b

a vector or matrix of right-hand sides of equations.

a

a QR decomposition or (qr.solve only) a rectangular matrix.

k

effective rank.

LAPACK

logical. For real x, if true use LAPACK otherwise use LINPACK (the default).

...

further arguments passed to or from other methods.

Details

The QR decomposition plays an important role in many statistical techniques. In particular it can be used to solve the equation $\mathbf{Ax} = \mathbf{b}$ for given matrix $\mathbf{A}$, and vector $\mathbf{b}$. It is useful for computing regression coefficients and in applying the Newton-Raphson algorithm.

The functions qr.coef, qr.resid, and qr.fitted return the coefficients, residuals and fitted values obtained when fitting y to the matrix with QR decomposition qr. (If pivoting is used, some of the coefficients will be NA.) qr.qy and qr.qty return Q %*% y and t(Q) %*% y, where Q is the (complete) $\mathbf{Q}$ matrix.

All the above functions keep dimnames (and names) of x and y if there are any.

solve.qr is the method for solve for qr objects. qr.solve solves systems of equations via the QR decomposition: if a is a QR decomposition it is the same as solve.qr, but if a is a rectangular matrix the QR decomposition is computed first. Either will handle over- and under-determined systems, providing a least-squares fit if appropriate.
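
A minimal sketch of how these helpers fit together for an overdetermined least-squares problem. The design matrix X, the response y and the seed are made-up illustration data, not taken from the package examples.

```r
set.seed(1)                           # arbitrary seed, for reproducibility only
X <- cbind(1, matrix(rnorm(20), 10))  # 10 x 3 design matrix (made-up data)
y <- rnorm(10)

qrX  <- qr(X)
beta <- qr.coef(qrX, y)     # least-squares coefficients
fit  <- qr.fitted(qrX, y)   # fitted values
res  <- qr.resid(qrX, y)    # residuals

stopifnot(
  all.equal(fit + res, y),            # fitted values and residuals add up to y
  all.equal(drop(X %*% beta), fit),   # fitted values are X %*% beta
  all.equal(qr.solve(X, y), beta),    # matrix input: decomposition computed first
  all.equal(solve(qrX, y), beta)      # solve() method for "qr" objects
)
```

Computing the decomposition once and reusing qrX is the natural choice when several right-hand sides are fitted against the same matrix.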

is.qr returns TRUE if x is a list and inherits from "qr".

It is not possible to coerce objects to mode "qr". Objects either are QR decompositions or they are not.

The LINPACK interface is restricted to matrices x with less than $2^{31}$ elements.

qr.fitted and qr.resid only support the LINPACK interface.

Unsuccessful results from the underlying LAPACK code will result in an error giving a positive error code: these can only be interpreted by detailed study of the FORTRAN code.

Value

The QR decomposition of the matrix as computed by LINPACK(*) or LAPACK. The components in the returned value correspond directly to the values returned by DQRDC(2)/DGEQP3/ZGEQP3.

qr

a matrix with the same dimensions as x. The upper triangle contains the $\mathbf{R}$ of the decomposition and the lower triangle contains information on the $\mathbf{Q}$ of the decomposition (stored in compact form). Note that the storage used by DQRDC and DGEQP3 differs.

qraux

a vector of length ncol(x) which contains additional information on $\mathbf{Q}$.

rank

the rank of x as computed by the decomposition(*): always full rank in the LAPACK case.

pivot

information on the pivoting strategy used during the decomposition.

Non-complex QR objects computed by LAPACK have the attribute "useLAPACK" with value TRUE.

*) dqrdc2 instead of LINPACK's DQRDC

In the (default) LINPACK case (LAPACK = FALSE), qr() uses a modified version of LINPACK's DQRDC, called ‘dqrdc2’. It differs by using the tolerance tol for a pivoting strategy which moves columns with near-zero 2-norm to the right-hand edge of the x matrix. This strategy means that sequential one degree-of-freedom effects can be computed in a natural way.
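
The pivoting strategy can be seen on a small rank-deficient matrix. In this sketch (the matrix is made up) column b is an exact multiple of column a, so the default path is expected to move it to the right-hand edge, while the LAPACK path reports full rank as stated under ‘Value’.

```r
x <- cbind(a = c(1, 2, 3, 4),
           b = c(2, 4, 6, 8),   # exactly 2 * a, so linearly dependent
           c = c(1, 0, 1, 0))

q <- qr(x)                      # default LINPACK-based dqrdc2 path
q$rank                          # 2
q$pivot                         # 1 3 2: column "b" moved to the right-hand edge
co <- qr.coef(q, c(1, 1, 2, 2))
co                              # the coefficient for "b" is NA

q_lapack <- qr(x, LAPACK = TRUE)
q_lapack$rank                   # 3: no rank detection on the LAPACK path
```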

Note

To compute the determinant of a matrix (do you really need it?), the QR decomposition is much more efficient than using eigenvalues (eigen). See det.

Using LAPACK (including in the complex case) uses column pivoting and does not attempt to detect rank-deficient matrices.

Source

For qr, the LINPACK routine DQRDC (but modified to dqrdc2(*)) and the LAPACK routines DGEQP3 and ZGEQP3. Further LINPACK and LAPACK routines are used for qr.coef, qr.qy and qr.qty.

LAPACK and LINPACK are from https://netlib.org/lapack/ and https://netlib.org/linpack/ and their guides are listed in the references.

References

Anderson, E. and ten others (1999) LAPACK Users' Guide. Third Edition. SIAM. Available on-line at https://netlib.org/lapack/lug/lapack_lug.html.

Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988) The New S Language. Wadsworth & Brooks/Cole.

Dongarra, J. J., Bunch, J. R., Moler, C. B. and Stewart, G. W. (1978) LINPACK Users Guide. Philadelphia: SIAM Publications.

See Also

qr.Q, qr.R, qr.X for reconstruction of the matrices. lm.fit, lsfit, eigen, svd.

det (using qr) to compute the determinant of a matrix.

Examples

hilbert <- function(n) { i <- 1:n; 1 / outer(i - 1, i, `+`) }
h9 <- hilbert(9); h9
qr(h9)$rank           #--> only 7
qrh9 <- qr(h9, tol = 1e-10)
qrh9$rank             #--> 9
##-- Solve linear equation system  H %*% x = y :
y <- 1:9/10
x <- qr.solve(h9, y, tol = 1e-10) # or equivalently :
x <- qr.coef(qrh9, y) #-- is == but much better than
                      #-- solve(h9) %*% y
h9 %*% x              # = y

## overdetermined system
A <- matrix(runif(12), 4)
b <- 1:4
qr.solve(A, b) # or solve(qr(A), b)
solve(qr(A, LAPACK = TRUE), b)
# this is a least-squares solution, cf. lm(b ~ 0 + A)

## underdetermined system
A <- matrix(runif(12), 3)
b <- 1:3
qr.solve(A, b)
solve(qr(A, LAPACK = TRUE), b)
# solutions will have one zero, not necessarily the same one

Reconstruct the Q, R, or X Matrices from a QR Object

Description

Returns the original matrix from which the object was constructed or the components of the decomposition.

Usage

qr.X(qr, complete = FALSE, ncol =)
qr.Q(qr, complete = FALSE, Dvec =)
qr.R(qr, complete = FALSE)

Arguments

qr

object representing a QR decomposition. This will typically have come from a previous call to qr or lsfit.

complete

logical expression of length 1. Indicates whether an arbitrary orthogonal completion of the $\mathbf{Q}$ or $\mathbf{X}$ matrices is to be made, or whether the $\mathbf{R}$ matrix is to be completed by binding zero-value rows beneath the square upper triangle.

ncol

integer in the range 1:nrow(qr$qr). The number of columns to be in the reconstructed $\mathbf{X}$. The default when complete is FALSE is the first min(ncol(X), nrow(X)) columns of the original $\mathbf{X}$ from which the qr object was constructed. The default when complete is TRUE is a square matrix with the original $\mathbf{X}$ in the first ncol(X) columns and an arbitrary orthogonal completion (unitary completion in the complex case) in the remaining columns.

Dvec

vector (not matrix) of diagonal values. Each column of the returned $\mathbf{Q}$ will be multiplied by the corresponding diagonal value. Defaults to all 1s.

Value

qr.X returns $\mathbf{X}$, the original matrix from which the qr object was constructed, provided ncol(X) <= nrow(X). If complete is TRUE or the argument ncol is greater than ncol(X), additional columns from an arbitrary orthogonal (unitary) completion of X are returned.

qr.Q returns part or all of Q, the orthogonal (unitary) transformation of order nrow(X) represented by qr. If complete is TRUE, Q has nrow(X) columns. If complete is FALSE, Q has ncol(X) columns. When Dvec is specified, each column of Q is multiplied by the corresponding value in Dvec.

Note that qr.Q(qr, *) is a special case of qr.qy(qr, y) (with a “diagonal” y), and qr.X(qr, *) is basically qr.qy(qr, R) (apart from pivoting and dimnames setting).

qr.R returns R. This may be pivoted, e.g., if a <- qr(x) then x[, a$pivot] = QR. The number of rows of R is either nrow(X) or ncol(X) (and may depend on whether complete is TRUE or FALSE).
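
The dimensions described above can be verified on a small tall matrix. This is a sketch with made-up values, chosen so that the matrix has full column rank.

```r
x <- matrix(c(1, 2, 3, 4,
              5, 7, 2, 1,
              0, 1, 1, 3), nrow = 4)   # 4 x 3, full column rank
q <- qr(x)

Q  <- qr.Q(q)                    # 4 x 3
Qc <- qr.Q(q, complete = TRUE)   # 4 x 4
R  <- qr.R(q)                    # 3 x 3
Rc <- qr.R(q, complete = TRUE)   # 4 x 3, a zero row bound beneath the triangle

stopifnot(
  identical(dim(Q), c(4L, 3L)), identical(dim(Qc), c(4L, 4L)),
  identical(dim(R), c(3L, 3L)), identical(dim(Rc), c(4L, 3L)),
  all.equal(Q %*% R, x[, q$pivot]),     # holds with or without pivoting
  all.equal(crossprod(Qc), diag(4)),    # the completed Q is orthogonal
  all.equal(unname(qr.X(q)), x)         # qr.X recovers the original matrix
)
```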

See Also

qr, qr.qy.

Examples

p <- ncol(x <- LifeCycleSavings[, -1]) # not the 'sr'
qrstr <- qr(x)   # dim(x) == c(n,p)
qrstr $ rank     # = 4 = p
Q <- qr.Q(qrstr) # dim(Q) == dim(x)
R <- qr.R(qrstr) # dim(R) == ncol(x)
X <- qr.X(qrstr) # X == x
range(X - as.matrix(x)) # ~ < 6e-12
## X == Q %*% R if there has been no pivoting, as here:
all.equal(unname(X),
          unname(Q %*% R))

# example of pivoting
x <- cbind(int = 1,
           b1 = rep(1:0, each = 3), b2 = rep(0:1, each = 3),
           c1 = rep(c(1,0,0), 2), c2 = rep(c(0,1,0), 2), c3 = rep(c(0,0,1), 2))
x # is singular, columns "b2" and "c3" are "extra"
a <- qr(x)
zapsmall(qr.R(a)) # columns are int b1 c1 c2 b2 c3
a$pivot
pivI <- sort.list(a$pivot) # the inverse permutation
all.equal(x, qr.Q(a) %*% qr.R(a)) # no, no
stopifnot(
  all.equal(x[, a$pivot], qr.Q(a) %*% qr.R(a)),          # TRUE
  all.equal(x,            qr.Q(a) %*% qr.R(a)[, pivI]))  # TRUE too!

Terminate an R Session

Description

The function quit or its alias q terminate the current R session.

Usage

quit(save = "default", status = 0, runLast = TRUE)
   q(save = "default", status = 0, runLast = TRUE)

Arguments

save

a character string indicating whether the environment (workspace) should be saved, one of "no", "yes", "ask" or "default".

status

the (numerical) error status to be returned to the operating system, where relevant. Conventionally 0 indicates successful completion.

runLast

should .Last() be executed?

Details

save must be one of "no", "yes", "ask" or "default". In the first case the workspace is not saved, in the second it is saved and in the third the user is prompted and can also decide not to quit. The default is to ask in interactive use but may be overridden by command-line arguments (which must be supplied in non-interactive use).

Immediately before normal termination, .Last() is executed if the function .Last exists and runLast is true. If in interactive use there are errors in the .Last function, control will be returned to the command prompt, so do test the function thoroughly. There is a system analogue, .Last.sys(), which is run after .Last() if runLast is true.

Exactly what happens at termination of an R session depends on the platform and GUI interface in use. A typical sequence is to run .Last() and .Last.sys() (unless runLast is false), to save the workspace if requested (and in most cases also to save the session history: see savehistory), then run any finalizers (see reg.finalizer) that have been set to be run on exit, close all open graphics devices, remove the session temporary directory and print any remaining warnings (e.g., from .Last() and device closure).

Some error status values are used by R itself. The default error handler for non-interactive use effectively calls q("no", 1, FALSE) and returns error status 1. Error status 2 is used for R ‘suicide’, that is a catastrophic failure, and other small numbers are used by specific ports for initialization failures. It is recommended that users choose statuses of 10 or more.

Valid values of status are system-dependent, but 0:255 are normally valid. (Many OSes will report the last byte of the value, that is report the value modulo 256. But not all.)
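
Because quit ends the calling session, the exit status is easiest to observe from a child process. The following sketch assumes that the Rscript front end is installed alongside the running R (located via R.home("bin")); the status value 10 is an arbitrary choice following the recommendation above.

```r
# Path to the Rscript front end of the running R installation (assumed present).
rscript <- file.path(R.home("bin"), "Rscript")

# Start a child R process that quits with status 10, and collect that status.
status <- system2(rscript,
                  c("--vanilla", "-e",
                    shQuote("quit(save = \"no\", status = 10)")))
status   # the exit status reported by the operating system
```

A calling shell script or build tool would see the same value as the exit code of the R process.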

Warning

The value of .Last is for the end user to control: as it can be replaced later in the session, it cannot safely be used programmatically, e.g. by a package. The other way to set code to be run at the end of the session is to use a finalizer: see reg.finalizer.

Note

The R.app GUI on macOS has its own version of these functions with slightly different behaviour for the save argument (the GUI's ‘Startup’ preferences for this action are taken into account).

References

Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988) The New S Language. Wadsworth & Brooks/Cole.

See Also

.First for setting things on startup.

Examples

## Not run: 
## Unix-flavour example
.Last <- function() {
  graphics.off() # close devices before printing
  cat("Now sending PDF graphics to the printer:\n")
  system("lpr Rplots.pdf")
  cat("bye bye...\n")
}
quit("yes")
## End(Not run)

Quotes

Description

Descriptions of the various uses of quoting in R.

Details

Three types of quotes are part of the syntax of R: single and double quotation marks and the backtick (or back quote, ‘`’). In addition, backslash is used to escape the following character inside character constants.

Character constants

Single and double quotes delimit character constants. They can be used interchangeably but double quotes are preferred (and character constants are printed using double quotes), so single quotes are normally only used to delimit character constants containing double quotes.

Backslash is used to start an escape sequence inside character constants. Escaping a character not in the following table is an error.

Single quotes need to be escaped by backslash in single-quoted strings, and double quotes in double-quoted strings.

\n           newline (aka ‘line feed’)
\r           carriage return
\t           tab
\b           backspace
\a           alert (bell)
\f           form feed
\v           vertical tab
\\           backslash ‘\’
\'           ASCII apostrophe ‘'’
\"           ASCII quotation mark ‘"’
\`           ASCII grave accent (backtick) ‘`’
\nnn         character with given octal code (1, 2 or 3 digits)
\xnn         character with given hex code (1 or 2 hex digits)
\unnnn       Unicode character with given code (1 to 4 hex digits)
\Unnnnnnnn   Unicode character with given code (1 to 8 hex digits)

Alternative forms for the last two are ‘\u{nnnn}’ and ‘\U{nnnnnnnn}’. All except the Unicode escape sequences are also supported when reading character strings by scan and read.table if allowEscapes = TRUE. Unicode escapes can be used to enter Unicode characters not in the current locale's charset (when the string will be stored internally in UTF-8). The maximum allowed value for ‘\nnn’ is ‘\377’ (the same character as ‘\xff’).

As from R 4.1.0 the largest allowed ‘\U’ value is ‘\U10FFFF’, the maximum Unicode point.

The parser does not allow the use of both octal/hex and Unicode escapes in a single string.

These forms will also be used by print.default when outputting non-printable characters (including backslash).

Embedded NULs are not allowed in character strings, so using escapes (such as ‘\0’) for a NUL will result in the string being truncated at that point (usually with a warning).
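
A few of the escape forms from the table, as a brief sketch. Each escape sequence denotes a single character, and the octal, hex and Unicode forms below all denote the letter A (code 65).

```r
nchar("a\tb")                  # 3: the escape \t is one character
identical("\101", "A")         # TRUE: octal 101 is 65
identical("\x41", "A")         # TRUE: hex 41 is 65
identical("\u0041", "A")       # TRUE: Unicode point U+0041
identical("\u{e9}", "\u00e9")  # TRUE: braced and unbraced forms agree
utf8ToInt("\x41")              # 65

s <- "C:\\temp"                # a literal backslash must be doubled
nchar(s)                       # 7
writeLines(s)                  # prints C:\temp
```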

Raw character constants are also available using a syntax similar to the one used in C++: r"(...)" with ... any character sequence, except that it must not contain the closing sequence ‘)"’. The delimiter pairs [] and {} can also be used, and R can be used in place of r. For additional flexibility, a number of dashes can be placed between the opening quote and the opening delimiter, as long as the same number of dashes appear between the closing delimiter and the closing quote.
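
As a sketch of the raw-constant syntax (which requires R 4.0.0 or later; the file path is hypothetical), each raw constant below is compared with its conventionally escaped equivalent.

```r
p_raw     <- r"(C:\Users\me\data.csv)"   # hypothetical Windows path
p_escaped <- "C:\\Users\\me\\data.csv"
identical(p_raw, p_escaped)              # TRUE

identical(R"[a\b]", "a\\b")              # TRUE: R and [] work as well

# One dash on each side lets the text contain the sequence )"
s <- r"-(contains )" inside)-"
identical(s, "contains )\" inside")      # TRUE
```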

Names and Identifiers

Identifiers consist of a sequence of letters, digits, the period (.) and the underscore. They must not start with a digit nor underscore, nor with a period followed by a digit. Reserved words are not valid identifiers.

The definition of a letter depends on the current locale, but only ASCII digits are considered to be digits.

Such identifiers are also known as syntactic names and may be used directly in R code. Almost always, other names can be used provided they are quoted. The preferred quote is the backtick (‘`’), and deparse will normally use it, but under many circumstances single or double quotes can be used (as a character constant will often be converted to a name). One place where backticks may be essential is to delimit variable names in formulae: see formula.
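
One way to check these rules, sketched here with made-up candidate names, relies on the fact that make.names leaves syntactically valid names unchanged. Comparing its output with the input therefore flags the valid ones.

```r
candidates <- c("x1", ".x", "x_1", "1st", ".2way", "_x", "if", "a b")

# TRUE where the name is already a syntactic name
is_syntactic <- candidates == make.names(candidates)
setNames(is_syntactic, candidates)

# Non-syntactic names can still be used when quoted with backticks.
`a b` <- 1:3
sum(`a b`)   # 6
```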

Note

UTF-16 surrogate pairs in ‘\unnnn\uoooo’ form will be converted to a single Unicode point, so for example ‘\uD834\uDD1E’ gives the single character ‘\U1D11E’. However, unpaired values in the surrogate range such as in the string "abc\uD834de" will be converted to a non-standard-conformant UTF-8 string (as is done by most other software): this may change in future.

See Also

Syntax for other aspects of the syntax.

sQuote for quoting English text.

shQuote for quoting OS commands.

The ‘R Language Definition’ manual.

Examples

'single quotes can be used more-or-less interchangeably'
"with double quotes to create character vectors"

## Single quotes inside single-quoted strings need backslash-escaping.
## Ditto double quotes inside double-quoted strings.
##
identical('"It\'s alive!", he screamed.',
          "\"It's alive!\", he screamed.") # same

## Backslashes need doubling, or they have a special meaning.
x <- "In ALGOL, you could do logical AND with /\\."
print(x)      # shows it as above ("input-like")
writeLines(x) # shows it as you like it ;-)

## Single backslashes followed by a letter are used to denote
## special characters like tab(ulator)s and newlines:
x <- "long\tlines can be\nbroken with newlines"
writeLines(x) # see also ?strwrap

## Backticks are used for non-standard variable names.
## (See make.names and ?Reserved for what counts as
## non-standard.)
`x y` <- 1:5
`x y`
d <- data.frame(`1st column` = rchisq(5, 2), check.names = FALSE)
d$`1st column`

## Backslashes followed by up to three numbers are interpreted as
## octal notation for ASCII characters.
"\110\145\154\154\157\40\127\157\162\154\144\41"

## \x followed by up to two numbers is interpreted as
## hexadecimal notation for ASCII characters.
(hw1 <- "\x48\x65\x6c\x6c\x6f\x20\x57\x6f\x72\x6c\x64\x21")

## Mixing octal and hexadecimal in the same string is OK
(hw2 <- "\110\x65\154\x6c\157\x20\127\x6f\162\x6c\144\x21")

## \u is also hexadecimal, but supports up to 4 digits,
## using Unicode specification.  In the previous example,
## you can simply replace \x with \u.
(hw3 <- "\u48\u65\u6c\u6c\u6f\u20\u57\u6f\u72\u6c\u64\u21")

## The last three are all identical to
hw <- "Hello World!"
stopifnot(identical(hw, hw1), identical(hw1, hw2), identical(hw2, hw3))

## Using Unicode makes more sense for non-latin characters.
(nn <- "\u0126\u0119\u1114\u022d\u2001\u03e2\u0954\u0f3f\u13d3\u147b\u203c")

## Mixing \x and \u throws a _parse_ error (which is not catchable!)
## Not run: 
  "\x48\u65\x6c\u6c\x6f\u20\x57\u6f\x72\u6c\x64\u21"
## End(Not run)
##   -->   Error: mixing Unicode and octal/hex escapes .....

## \U works like \u, but supports up to six hex digits.
## So we can replace \u with \U in the previous example.
n2 <- "\U0126\U0119\U1114\U022d\U2001\U03e2\U0954\U0f3f\U13d3\U147b\U203c"
stopifnot(identical(nn, n2))

## Under systems supporting multi-byte locales (and not Windows),
## \U also supports the rarer characters outside the usual 16^4 range.
## See the R language manual,
## https://cran.r-project.org/doc/manuals/r-release/R-lang.html#Literal-constants
## and bug 16098 https://bugs.r-project.org/show_bug.cgi?id=16098
## This character may or may not be printable (the platform decides)
## and if it is, may not have a glyph in the font used.
"\U1d4d7" # On Windows this used to give the incorrect value of "\Ud4d7"

## nul characters (for terminating strings in C) are not allowed (parse errors)
## Not run: 
  "foo\0bar"     # Error: nul character not allowed (line 1)
  "foo\u0000bar" # same error
## End(Not run)

## A Windows path written as a raw string constant:
r"(c:\Program files\R)"

## More raw strings:
r"{(\1\2)}"
r"(use both "double" and 'single' quotes)"
r"---(\1--)-)---"

Version Information

Description

R.Version() provides detailed information about the version of R running.

R.version is a variable (a list) holding this information (and version is a copy of it for S compatibility).

Usage

R.Version()
R.version
R.version.string
version

R_compiled_by()

Details

This gives details of the OS under which R was built, not the one under which it is currently running (for which see Sys.info).

Note that OS names might not be what you expect: for example macOS Mavericks 10.9.4 identifies itself as ‘darwin13.3.0’, Linux usually as ‘linux-gnu’, Solaris 10 as ‘solaris2.10’ and Windows as ‘mingw32’.

R.version$crt is supported on Windows since R 4.2.0 and returns "ucrt" to denote the Universal C Runtime. It would return "msvcrt" for the older Microsoft Visual C++ Runtime (but R does not use that runtime since 4.2.0).

Value

R.Version returns a list with character-string components

platform

the platform for which R was built. A triplet of the form CPU-VENDOR-OS, as determined by the configure script. E.g., "i686-unknown-linux-gnu" or "i386-pc-mingw32".

arch

the architecture (CPU) R was built on/for.

os

the underlying operating system.

crt

the C runtime on Windows.

system

CPU and OS, separated by a comma.

status

the status of the version (e.g.,"alpha").

major

the major version number.

minor

the minor version number, including the patch level.

year

the year the version was released.

month

the month the version was released.

day

the day the version was released.

svn rev

the Subversion revision number, which should be either "unknown" or a single number. (A range of numbers or a number with ‘M’ or ‘S’ appended indicates inconsistencies in the sources used to build this version of R.)

language

always "R".

version.string

a character string concatenating some of the info above, useful for plotting, etc.

R.version and version are lists of class "simple.list" which has a print method.

R_compiled_by returns a two-element character vector giving details of the C and Fortran compilers used to build R. (Empty strings if no information is available.)

Note

Do not use R.version$os to test the platform the code is running on: use .Platform$OS.type instead. Slightly different versions of the OS may report different values of R.version$os, as may different versions of R. Alternatively, osVersion typically contains more details about the platform R is running on.

R.version.string is a copy of R.version$version.string for simplicity and backwards compatibility.
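
A short sketch of the recommended checks, using only objects named on this page: the platform test goes through .Platform$OS.type, and the version number assembled from the list components agrees with getRversion().

```r
# Branch on the OS family the code is running on, not on R.version$os.
os_family <- .Platform$OS.type          # "unix" or "windows"

# Assemble the version number from the components of R.Version() ...
v   <- R.Version()
ver <- paste(v$major, v$minor, sep = ".")

# ... which matches getRversion(), the usual choice inside R code.
stopifnot(
  os_family %in% c("unix", "windows"),
  ver == as.character(getRversion()),
  identical(R.version.string, R.version$version.string)
)
```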

See Also

sessionInfo which provides additional information; getRversion typically used inside R code, osVersion, .Platform, Sys.info.

Examples

require(graphics)

R.version$os # to check how lucky you are ...
plot(0) # any plot
mtext(R.version.string, side = 1, line = 4, adj = 1) # a useful bottom-right note

## a good way to detect macOS:
if(grepl("^darwin", R.version$os)) message("running on macOS")

## Short R version string, ("space free", useful in file/directory names;
##                          also fine for unreleased versions of R):
shortRversion <- function() {
   rvs <- R.version.string
   if(grepl("devel", (st <- R.version$status)))
       rvs <- sub(paste0(" ", st, " "), "-devel_", rvs, fixed = TRUE)
   gsub("[()]", "", gsub(" ", "_", sub(" version ", "-", rvs)))
}
shortRversion()

Random Number Generation

Description

.Random.seed is an integer vector, containing the random number generator (RNG) state for random number generation in R. It can be saved and restored, but should not be altered by the user.

RNGkind is a more friendly interface to query or set the kind of RNG in use.

RNGversion can be used to set the random generators as they were in an earlier R version (for reproducibility).

set.seed is the recommended way to specify seeds.

Usage

.Random.seed <- c(rng.kind, n1, n2, ...)

RNGkind(kind = NULL, normal.kind = NULL, sample.kind = NULL)
RNGversion(vstr)
set.seed(seed, kind = NULL, normal.kind = NULL, sample.kind = NULL)

Arguments

kind

character or NULL. If kind is a character string, set R's RNG to the kind desired. Use "default" to return to the R default. See ‘Details’ for the interpretation of NULL.

normal.kind

character string or NULL. If it is a character string, set the method of Normal generation. Use "default" to return to the R default. NULL makes no change.

sample.kind

character string or NULL. If it is a character string, set the method of discrete uniform generation (used in sample, for instance). Use "default" to return to the R default. NULL makes no change.

seed

a single value, interpreted as an integer, or NULL (see ‘Details’).

vstr

a character string containing a version number, e.g., "1.6.2". The default RNG configuration of the current R version is used if vstr is greater than the current version.

rng.kind

integer code in 0:k for the above kind.

n1, n2, ...

integers. See the details for how many are required (which depends on rng.kind).

Details

The currently available RNG kinds are given below. kind is partially matched to this list. The default is "Mersenne-Twister".

"Wichmann-Hill":

The seed, .Random.seed[-1] == r[1:3] is an integer vector of length 3, where each r[i] is in 1:(p[i] - 1), where p is the length 3 vector of primes, p = (30269, 30307, 30323). The Wichmann–Hill generator has a cycle length of $6.9536 \times 10^{12}$ (= prod(p-1)/4, see Applied Statistics (1984) 33, 123 which corrects the original article). It exhibits 12 clear failures in the TestU01 Crush suite and 22 in the BigCrush suite (L'Ecuyer, 2007).
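
The quoted cycle length follows directly from the three primes; as a worked check of the formula prod(p-1)/4:

```r
p <- c(30269, 30307, 30323)   # the three primes given above
cycle <- prod(p - 1) / 4      # the cycle length formula from the text
signif(cycle, 5)              # about 6.9536e+12
```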

"Marsaglia-Multicarry":

A multiply-with-carry RNG is used, as recommended by George Marsaglia in his post to the mailing list ‘sci.stat.math’. It has a period of more than $2^{60}$.

It exhibits 40 clear failures in L'Ecuyer's TestU01 Crush suite. Combined with Ahrens-Dieter or Kinderman-Ramage it exhibits deviations from normality even for univariate distribution generation. See PR#18168 for a discussion.

The seed is two integers (all values allowed).

"Super-Duper":

Marsaglia's famous Super-Duper from the 70's. This is the original version which does not pass the MTUPLE test of the Diehard battery. It has a period of $\approx 4.6 \times 10^{18}$ for most initial seeds. The seed is two integers (all values allowed for the first seed: the second must be odd).

We use the implementation by Reeds et al. (1982–84).

The two seeds are the Tausworthe and congruence long integers, respectively.

It exhibits 25 clear failures in the TestU01 Crush suite (L'Ecuyer, 2007).

"Mersenne-Twister":

From Matsumoto and Nishimura (1998); code updated in 2002. A twisted GFSR with period $2^{19937} - 1$ and equidistribution in 623 consecutive dimensions (over the whole period). The ‘seed’ is a 624-dimensional set of 32-bit integers plus a current position in that set.

R uses its own initialization method due to B. D. Ripley and is not affected by the initialization issue in the 1998 code of Matsumoto and Nishimura addressed in a 2002 update.

It exhibits 2 clear failures in each of the TestU01 Crush and the BigCrush suite (L'Ecuyer, 2007).

"Knuth-TAOCP-2002":

A 32-bit integer GFSR using lagged Fibonacci sequences with subtraction. That is, the recurrence used is

$$X_j = (X_{j-100} - X_{j-37}) \bmod 2^{30}$$

and the ‘seed’ is the set of the 100 last numbers (actually recorded as 101 numbers, the last being a cyclic shift of the buffer). The period is around $2^{129}$.
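
To make the recurrence concrete, here is a minimal sketch of a single step written in plain R. It is not R's internal implementation: the function name lagfib_step and the buffer layout (oldest value first) are assumptions made for illustration, and the starting values are a toy sequence rather than a properly seeded state.

```r
# One step of X_j = (X_{j-100} - X_{j-37}) mod 2^30.
# `buf` holds the 100 most recent values, oldest first (an assumed layout).
lagfib_step <- function(buf) {
  stopifnot(length(buf) == 100)
  x_new <- (buf[1] - buf[64]) %% 2^30   # buf[1] is X_{j-100}, buf[64] is X_{j-37}
  c(buf[-1], x_new)                     # drop the oldest value, append the new one
}

buf <- as.numeric(1:100)   # toy starting values, not a properly seeded state
buf <- lagfib_step(buf)
buf[100]                   # (1 - 64) mod 2^30, i.e. 2^30 - 63
```

R's %% operator returns a non-negative result for a positive modulus, so the subtraction needs no separate sign correction.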

"Knuth-TAOCP":

An earlier version from Knuth (1997).

The 2002 version was not backwards compatible with the earlier version: the initialization of the GFSR from the seed was altered. R did not allow you to choose consecutive seeds, the reported ‘weakness’, and already scrambled the seeds. Otherwise, the algorithm is identical to Knuth-TAOCP-2002, with the same lagged Fibonacci recurrence formula.

Initialization of this generator is done in interpreted R code and so takes a short but noticeable time.

It exhibits 3 clear failures in the TestU01 Crush suite and 4 clear failures in the BigCrush suite (L'Ecuyer, 2007).

"L'Ecuyer-CMRG":

A ‘combined multiple-recursive generator’ from L'Ecuyer (1999), each element of which is a feedback multiplicative generator with three integer elements: thus the seed is a (signed) integer vector of length 6. The period is around $2^{191}$.

The 6 elements of the seed are internally regarded as 32-bit unsigned integers. Neither the first three nor the last three should be all zero, and they are limited to less than 4294967087 and 4294944443 respectively.

This is not particularly interesting of itself, but provides the basis for the multiple streams used in package parallel.

It exhibits 6 clear failures in each of the TestU01 Crush and the BigCrush suite (L'Ecuyer, 2007).

"user-supplied":

Use a user-supplied generator. See Random.user for details.

normal.kind can be "Kinderman-Ramage", "Buggy Kinderman-Ramage" (not for set.seed), "Ahrens-Dieter", "Box-Muller", "Inversion" (the default), or "user-supplied". (For inversion, see the reference in qnorm.) The Kinderman-Ramage generator used in versions prior to 1.7.0 (now called "Buggy") had several approximation errors and should only be used for reproduction of old results. The "Box-Muller" generator is stateful as pairs of normals are generated and returned sequentially. The state is reset whenever it is selected (even if it is the current normal generator) and when kind is changed.

sample.kind can be "Rounding" or "Rejection", or partial matches to these. The former was the default in versions prior to 3.6.0: it made sample noticeably non-uniform on large populations, and should only be used for reproduction of old results. See PR#17494 for a discussion.

set.seed uses a single integer argument to set as many seeds as are required. It is intended as a simple way to get quite different seeds by specifying small integer arguments, and also as a way to get valid seed sets for the more complicated methods (especially "Mersenne-Twister" and "Knuth-TAOCP"). There is no guarantee that different values of seed will seed the RNG differently, although any exceptions would be extremely rare. If called with seed = NULL it re-initializes (see ‘Note’) as if no seed had yet been set.

The use of kind = NULL, normal.kind = NULL or sample.kind = NULL in RNGkind or set.seed selects the currently-used generator (including that used in the previous session if the workspace has been restored): if no generator has been used it selects "default".
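
A minimal sketch of reproducible streams with set.seed. The seed values are arbitrary; the second pair of calls names the uniform and normal generators explicitly instead of relying on whatever is currently selected.

```r
set.seed(42); a <- runif(3)
set.seed(42); b <- runif(3)
identical(a, b)   # TRUE: same seed, same generator, same stream

# Fixing the generators explicitly as well as the seed
set.seed(2024, kind = "Mersenne-Twister", normal.kind = "Inversion")
x1 <- rnorm(3)
set.seed(2024, kind = "Mersenne-Twister", normal.kind = "Inversion")
x2 <- rnorm(3)
identical(x1, x2) # TRUE
```

Note that the explicit form also changes the generators selected for the rest of the session.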

Value

.Random.seed is an integer vector whose first element codes the kind of RNG and normal generator. The lowest two decimal digits are in 0:(k-1) where k is the number of available RNGs. The hundreds represent the type of normal generator (starting at 0), and the ten thousands represent the type of discrete uniform sampler.
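
The encoding of the first element can be unpacked with integer arithmetic. The following sketch selects the three default kinds explicitly and then splits the code into its three fields; the variable names are illustrative only.

```r
set.seed(1, kind = "Mersenne-Twister", normal.kind = "Inversion",
         sample.kind = "Rejection")
code <- get(".Random.seed", envir = globalenv())[1]

rng_code    <- code %% 100            # lowest two digits: uniform RNG
normal_code <- (code %/% 100) %% 100  # hundreds: normal generator
sample_code <- code %/% 10000         # ten thousands: discrete uniform sampler
c(rng = rng_code, normal = normal_code, sample = sample_code)
```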

In the underlying C,.Random.seed[-1] isunsigned;therefore inR.Random.seed[-1] can be negative, due tothe representation of an unsigned integer by a signed integer.

RNGkind returns a three-element character vector of the RNG,normal and sample kinds selectedbefore the call, invisibly if either argument is notNULL. A type starts a session as the default, and is selected either by a call toRNGkind or by setting.Random.seed in the workspace. (NB: prior toR 3.6.0 the firsttwo kinds were returned in a two-element character vector.)

RNGversion returns the same information as RNGkind about the defaults in a specific R version.

set.seed returns NULL, invisibly.

Note

Initially, there is no seed; a new one is created from the current time and the process ID when one is required. Hence different sessions will give different simulation results, by default. However, the seed might be restored from a previous session if a previously saved workspace is restored.

.Random.seed saves the seed set for the uniform random-number generator, at least for the system generators. It does not necessarily save the state of other generators, and in particular does not save the state of the Box–Muller normal generator. If you want to reproduce work later, call set.seed (preferably with explicit values for kind and normal.kind) rather than set .Random.seed.

The object .Random.seed is only looked for in the user's workspace.

Do not rely on randomness of low-order bits from RNGs. Most of the supplied uniform generators return 32-bit integer values that are converted to doubles, so they take at most 2^32 distinct values and long runs will return duplicated values (Wichmann-Hill is the exception, and all give at least 30 varying bits.)

Author(s)

of RNGkind: Martin Maechler. Current implementation, B. D. Ripley with modifications by Duncan Murdoch.

References

Ahrens, J. H. and Dieter, U. (1973). Extensions of Forsythe's method for random sampling from the normal distribution. Mathematics of Computation, 27, 927–937.

Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988). The New S Language. Wadsworth & Brooks/Cole. (set.seed, storing in .Random.seed.)

Box, G. E. P. and Muller, M. E. (1958). A note on the generation of normal random deviates. Annals of Mathematical Statistics, 29, 610–611. doi:10.1214/aoms/1177706645.

De Matteis, A. and Pagnutti, S. (1993). Long-range Correlation Analysis of the Wichmann-Hill Random Number Generator. Statistics and Computing, 3, 67–70. doi:10.1007/BF00153065.

Kinderman, A. J. and Ramage, J. G. (1976). Computer generation of normal random variables. Journal of the American Statistical Association, 71, 893–896. doi:10.2307/2286857.

Knuth, D. E. (1997). The Art of Computer Programming. Volume 2, third edition.
Source code at https://www-cs-faculty.stanford.edu/~knuth/taocp.html.

Knuth, D. E. (2002). The Art of Computer Programming. Volume 2, third edition, ninth printing.

L'Ecuyer, P. (1999). Good parameters and implementations for combined multiple recursive random number generators. Operations Research, 47, 159–164. doi:10.1287/opre.47.1.159.

L'Ecuyer, P. and Simard, R. (2007). TestU01: A C Library for Empirical Testing of Random Number Generators. ACM Transactions on Mathematical Software, 33, Article 22. doi:10.1145/1268776.1268777.
The TestU01 C library is available from http://simul.iro.umontreal.ca/testu01/tu01.html or also https://github.com/umontreal-simul/TestU01-2009.

Marsaglia, G. (1997). A random number generator for C. Discussion paper, posting on Usenet newsgroup sci.stat.math on September 29, 1997.

Marsaglia, G. and Zaman, A. (1994). Some portable very-long-period random number generators. Computers in Physics, 8, 117–121. doi:10.1063/1.168514.

Matsumoto, M. and Nishimura, T. (1998). Mersenne Twister: A 623-dimensionally equidistributed uniform pseudo-random number generator, ACM Transactions on Modeling and Computer Simulation, 8, 3–30.
Source code formerly at http://www.math.keio.ac.jp/~matumoto/emt.html.
Now see http://www.math.sci.hiroshima-u.ac.jp/~m-mat/MT/VERSIONS/C-LANG/c-lang.html.

Reeds, J., Hubert, S. and Abrahams, M. (1982–4). C implementation of SuperDuper, University of California at Berkeley. (Personal communication from Jim Reeds to Ross Ihaka.)

Wichmann, B. A. and Hill, I. D. (1982). Algorithm AS 183: An Efficient and Portable Pseudo-random Number Generator. Applied Statistics, 31, 188–190; Remarks: 34, 198 and 35, 89. doi:10.2307/2347988.

See Also

sample for random sampling with and without replacement.

Distributions for functions for random-variate generation from standard distributions.

Examples

require(stats)

## Seed the current RNG, i.e., set the RNG status
set.seed(42); u1 <- runif(30)
set.seed(42); u2 <- runif(30) # the same because of identical RNG status:
stopifnot(identical(u1, u2))

## the default random seed is 626 integers, so only print a few
runif(1); .Random.seed[1:6]; runif(1); .Random.seed[1:6]
## If there is no seed, a "random" new one is created:
rm(.Random.seed); runif(1); .Random.seed[1:6]

ok <- RNGkind()
RNGkind("Wich")  # (partial string matching on 'kind')

## This shows how 'runif(.)' works for Wichmann-Hill,
## using only R functions:

p.WH <- c(30269, 30307, 30323)
a.WH <- c(  171,   172,   170)
next.WHseed <- function(i.seed = .Random.seed[-1])
  { (a.WH * i.seed) %% p.WH }
my.runif1 <- function(i.seed = .Random.seed)
  { ns <- next.WHseed(i.seed[-1]); sum(ns / p.WH) %% 1 }
set.seed(1998-12-04) # (when the next lines were added to the souRce)
rs <- .Random.seed
(WHs <- next.WHseed(rs[-1]))
u <- runif(1)
stopifnot(
  next.WHseed(rs[-1]) == .Random.seed[-1],
  all.equal(u, my.runif1(rs))
)

## ----
.Random.seed
RNGkind("Super") # matches  "Super-Duper"
RNGkind()
.Random.seed # new, corresponding to  Super-Duper

## Reset:
RNGkind(ok[1])

RNGversion(getRversion()) # the default version for this R version

## ----
sum(duplicated(runif(1e6))) # around 110 for default generator
## and we would expect about almost sure duplicates beyond about
qbirthday(1 - 1e-6, classes = 2e9) # 235,000

User-supplied Random Number Generation

Description

Function RNGkind allows user-coded uniform and normal random number generators to be supplied. The details are given here.

Details

A user-specified uniform RNG is called from entry points in dynamically-loaded compiled code. The user must supply the entry point user_unif_rand, which takes no arguments and returns a pointer to a double. The example below will show the general pattern. The generator should have at least 25 bits of precision.

Optionally, the user can supply the entry point user_unif_init, which is called with an unsigned int argument when RNGkind (or set.seed) is called, and is intended to be used to initialize the user's RNG code. The argument is intended to be used to set the ‘seeds’; it is the seed argument to set.seed or an essentially random seed if RNGkind is called.

If only these functions are supplied, no information about the generator's state is recorded in .Random.seed. Optionally, functions user_unif_nseed and user_unif_seedloc can be supplied which are called with no arguments and should return pointers to the number of seeds and to an integer (specifically, ‘Int32’) array of seeds. Calls to GetRNGstate and PutRNGstate will then copy this array to and from .Random.seed.

A user-specified normal RNG is specified by a single entry point user_norm_rand, which takes no arguments and returns a pointer to a double.

Warning

As with all compiled code, mis-specifying these functions can crash R. Do include the ‘R_ext/Random.h’ header file for type checking.

Examples

## Not run:
##  Marsaglia's congruential PRNG
#include <R_ext/Random.h>

static Int32 seed;
static double res;
static int nseed = 1;

double * user_unif_rand(void)
{
    seed = 69069 * seed + 1;
    res = seed * 2.32830643653869e-10;
    return &res;
}

void  user_unif_init(Int32 seed_in) { seed = seed_in; }
int * user_unif_nseed(void) { return &nseed; }
int * user_unif_seedloc(void) { return (int *) &seed; }

/*  ratio-of-uniforms for normal  */
#include <math.h>
static double x;

double * user_norm_rand(void)
{
    double u, v, z;
    do {
        u = unif_rand();
        v = 0.857764 * (2. * unif_rand() - 1);
        x = v/u; z = 0.25 * x * x;
        if (z < 1. - u) break;
        if (z > 0.259/u + 0.35) continue;
    } while (z > -log(u));
    return &x;
}

## Use under Unix:
R CMD SHLIB urand.c
R
> dyn.load("urand.so")
> RNGkind("user")
> runif(10)
> .Random.seed
> RNGkind(, "user")
> rnorm(10)
> RNGkind()
[1] "user-supplied" "user-supplied"

## End(Not run)

Range of Values

Description

range returns a vector containing the minimum and maximum of all the given arguments.

Usage

range(..., na.rm = FALSE)
## Default S3 method:
range(..., na.rm = FALSE, finite = FALSE)
## same for classes 'Date' and 'POSIXct'

.rangeNum(..., na.rm, finite, isNumeric)

Arguments

...

any numeric or character objects.

na.rm

logical, indicating if NA's should be omitted.

finite

logical, indicating if all non-finite elements should be omitted.

isNumeric

a function returning TRUE or FALSE when called on c(..., recursive = TRUE), is.numeric() for the default range() method.

Details

range is a generic function: methods can be defined for it directly or via the Summary group generic. For this to work properly, the arguments ... should be unnamed, and dispatch is on the first argument.

If na.rm is FALSE, NA and NaN values in any of the arguments will cause NA values to be returned, otherwise NA values are ignored.

If finite is TRUE, the minimum and maximum of all finite values is computed, i.e., finite = TRUE includes na.rm = TRUE.
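The effect of the two arguments can be seen on a small vector that mixes missing and infinite values. A minimal sketch; the vector and variable names are illustrative.

```r
v <- c(NA, 2, -Inf, 7, NaN, Inf)

r_default <- range(v)                  # missing values propagate
r_narm    <- range(v, na.rm = TRUE)    # NA and NaN dropped, infinities kept
r_finite  <- range(v, finite = TRUE)   # only finite values are considered

r_narm     # -Inf  Inf
r_finite   #    2    7
```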

A special situation occurs when there is no (after omission of NAs) nonempty argument left, see min.

S4 methods

This is part of the S4 Summary group generic. Methods for it must use the signature x, ..., na.rm.

References

Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988) The New S Language. Wadsworth & Brooks/Cole.

See Also

min, max.

The extendrange() utility in package grDevices.

Examples

(r.x <- range(stats::rnorm(100)))
diff(r.x) # the SAMPLE range

x <- c(NA, 1:3, -1:1/0); x
range(x)
range(x, na.rm = TRUE)
range(x, finite = TRUE)

Sample Ranks

Description

Returns the sample ranks of the values in a vector. Ties (i.e., equal values) and missing values can be handled in several ways.

Usage

rank(x, na.last = TRUE,
     ties.method = c("average", "first", "last", "random", "max", "min"))

Arguments

x

a numeric, complex, character or logical vector.

na.last

a logical or character string controlling the treatment of NAs. If TRUE, missing values in the data are put last; if FALSE, they are put first; if NA, they are removed; if "keep" they are kept with rank NA.

ties.method

a character string specifying how ties are treated, see ‘Details’; can be abbreviated.

Details

If all components are different (and no NAs), the ranks are well defined, with values in seq_along(x). With some values equal (called ‘ties’), the argument ties.method determines the result at the corresponding indices. The "first" method results in a permutation with increasing values at each index set of ties, and analogously "last" with decreasing values. The "random" method puts these in random order whereas the default, "average", replaces them by their mean, and "max" and "min" replace them by their maximum and minimum respectively, the latter being the typical sports ranking.

NA values are never considered to be equal: for na.last = TRUE and na.last = FALSE they are given distinct ranks in the order in which they occur in x.
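The four settings of na.last can be compared on a short vector with two missing values. A minimal sketch; the data and variable names are illustrative.

```r
v <- c(10, NA, 5, NA)

r_last  <- rank(v)                    # NAs ranked last, in order of occurrence
r_first <- rank(v, na.last = FALSE)   # NAs ranked first, in order of occurrence
r_drop  <- rank(v, na.last = NA)      # NAs removed: result is shorter than v
r_keep  <- rank(v, na.last = "keep")  # NAs keep rank NA

r_last    # 2 3 1 4
r_first   # 4 1 3 2
r_drop    # 2 1
r_keep    # 2 NA 1 NA
```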

NB: rank is not itself generic but xtfrm is, and rank(xtfrm(x), ....) will have the desired result if there is a xtfrm method. Otherwise, rank will make use of ==, >, is.na and extraction methods for classed objects, possibly rather slowly.

Value

A numeric vector of the same length as x with names copied from x (unless na.last = NA, when missing values are removed). The vector is of integer type unless x is a long vector or ties.method = "average" when it is of double type (whether or not there are any ties).

References

Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988) The New S Language. Wadsworth & Brooks/Cole.

See Also

order and sort; xtfrm, see above.

Examples

(r1 <- rank(x1 <- c(3, 1, 4, 15, 92)))
x2 <- c(3, 1, 4, 1, 5, 9, 2, 6, 5, 3, 5)
names(x2) <- letters[1:11]
(r2 <- rank(x2)) # ties are averaged

## rank() is "idempotent": rank(rank(x)) == rank(x) :
stopifnot(rank(r1) == r1, rank(r2) == r2)

## ranks without averaging
rank(x2, ties.method = "first")  # first occurrence wins
rank(x2, ties.method = "last")   #  last occurrence wins
rank(x2, ties.method = "random") # ties broken at random
rank(x2, ties.method = "random") # and again

## keep ties ties, no average
(rma <- rank(x2, ties.method = "max"))  # as used classically
(rmi <- rank(x2, ties.method = "min"))  # as in Sports
stopifnot(rma + rmi == round(r2 + r2))

## Comparing all tie.methods:
tMeth <- eval(formals(rank)$ties.method)
rx2 <- sapply(tMeth, function(M) rank(x2, ties.method = M))
cbind(x2, rx2)
## ties.method's does not matter w/o ties:
x <- sample(47)
rx <- sapply(tMeth, function(MM) rank(x, ties.method = MM))
stopifnot(all(rx[,1] == rx))

Recursively Apply a Function to a List

Description

rapply is a recursive version of lapply with flexibility in how the result is structured (how = "..").

Usage

rapply(object, f, classes = "ANY", deflt = NULL,
       how = c("unlist", "replace", "list"), ...)

Arguments

object

a list or expression, i.e., “list-like”.

f

a function of one “principal” argument, passing further arguments via ....

classes

character vector of class names, or "ANY" to match any class.

deflt

the default result (not used if how = "replace").

how

character string partially matching the three possibilities given: see ‘Details’.

...

additional arguments passed to the call to f.

Details

This function has two basic modes. If how = "replace", each element of object which is not itself list-like and has a class included in classes is replaced by the result of applying f to the element.

Otherwise, with mode how = "list" or how = "unlist", conceptually object is copied, all non-list elements which have a class included in classes are replaced by the result of applying f to the element and all others are replaced by deflt. Finally, if how = "unlist", unlist(recursive = TRUE) is called on the result.
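The three values of how can be contrasted on one small nested list. A minimal sketch; the list cfg and the doubling function are illustrative only.

```r
cfg <- list(a = 1.5, b = list(c = "text", d = 4))
double_it <- function(v) v * 2

## "replace": matching leaves are transformed, all others are left alone.
res_replace <- rapply(cfg, double_it, classes = "numeric", how = "replace")

## "list": same shape, but non-matching leaves become 'deflt'.
res_list <- rapply(cfg, double_it, classes = "numeric", deflt = NA, how = "list")

## "unlist": as "list" (here with the default deflt = NULL), then flattened.
res_unlist <- rapply(cfg, double_it, classes = "numeric", how = "unlist")

str(res_replace)
str(res_list)
res_unlist
```

With the default deflt = NULL the non-matching leaves vanish when the result is unlisted, which is why res_unlist has only two elements.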

The semantics differ in detail from lapply: in particular the arguments are evaluated before calling the C code.

In R 3.5.x and earlier, object was required to be a list, which was not the case for its list-like components.

Value

If how = "unlist", a vector, otherwise “list-like” of similar structure as object.

References

Chambers, J. M. (1998) Programming with Data. Springer.
(rapply is only described briefly there.)

See Also

lapply, dendrapply.

Examples

X <- list(list(a = pi, b = list(c = 1L)), d = "a test")
# the "identity operation":
rapply(X, function(x) x, how = "replace") -> X.; stopifnot(identical(X, X.))
rapply(X, sqrt, classes = "numeric", how = "replace")
rapply(X, deparse, control = "all") # passing extras. argument of deparse()
rapply(X, nchar, classes = "character", deflt = NA_integer_, how = "list")
rapply(X, nchar, classes = "character", deflt = NA_integer_, how = "unlist")
rapply(X, nchar, classes = "character",                      how = "unlist")
rapply(X, log, classes = "numeric", how = "replace", base = 2)

## with expression() / list():
E  <- expression(list(a = pi, b = expression(c = C1 * C2)), d = "a test")
LE <- list(expression(a = pi, b = expression(c = C1 * C2)), d = "a test")
rapply(E, nchar, how = "replace") # "expression(c = C1 * C2)" are 23 chars
rapply(E, nchar, classes = "character", deflt = NA_integer_, how = "unlist")
rapply(LE, as.character) # a "pi" | b1 "expression" | b2 "C1 * C2" ..
rapply(LE, nchar)        # (see above)
stopifnot(exprs = {
  identical(E , rapply(E , identity, how = "replace"))
  identical(LE, rapply(LE, identity, how = "replace"))
})

Raw Vectors

Description

Creates or tests for objects of type "raw".

Usage

raw(length = 0)
as.raw(x)
is.raw(x)

Arguments

length

desired length.

x

object to be coerced.

Details

The raw type is intended to hold raw bytes. It is possible to extract subsequences of bytes, and to replace elements (but only by elements of a raw vector). The relational operators (see Comparison, using the numerical order of the byte representation) work, as do the logical operators (see Logic) with a bitwise interpretation.
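A short illustration of the bitwise and relational operators on single bytes follows. This is a minimal sketch; the variable names are illustrative.

```r
r1 <- as.raw(0x0c)   # bit pattern 00001100
r2 <- as.raw(0x0a)   # bit pattern 00001010

and_bits <- r1 & r2  # 08: bits set in both
or_bits  <- r1 | r2  # 0e: bits set in either
not_bits <- !r1      # f3: every bit flipped
is_gt    <- r1 > r2  # TRUE: compared as the numbers 12 and 10

## Replacement must be by raw values.
bytes <- as.raw(1:5)
bytes[1] <- as.raw(255)
bytes[1:3]
```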

A raw vector is printed with each byte separately represented as a pair of hex digits. If you want to see a character representation (with escape sequences for non-printing characters) use rawToChar.

Coercion to raw treats the input values as representing small (decimal) integers, so the input is first coerced to integer, and then values which are outside the range [0 ... 255] or are NA are set to 0 (the nul byte).
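The coercion rules can be checked directly. A minimal sketch; suppressWarnings() is used only because out-of-range input triggers a warning, and the variable names are illustrative.

```r
in_range  <- as.raw(c(0, 65, 255))                     # 00 41 ff
truncated <- as.raw(65.9)                              # coerced to integer 65 first
bad_input <- suppressWarnings(as.raw(c(-1, 256, NA)))  # all set to 00

as.integer(in_range)
as.integer(truncated)
as.integer(bad_input)
```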

as.raw and is.raw are primitive functions.

Value

raw creates a raw vector of the specified length. Each element of the vector is equal to 0. Raw vectors are used to store fixed-length sequences of bytes.

as.raw attempts to coerce its argument to be of raw type. The (elementwise) answer will be 0 unless the coercion succeeds (or if the original value successfully coerces to 0).

is.raw returns true if and only if typeof(x) == "raw".

See Also

charToRaw, rawShift, etc.

& for bitwise operations on raw vectors.

Examples

xx <- raw(2)
xx[1] <- as.raw(40)     # NB, not just 40.
xx[2] <- charToRaw("A")
xx       ## 28 41   -- raw prints hexadecimals
dput(xx) ## as.raw(c(0x28, 0x41))
as.integer(xx) ## 40 65

x <- "A test string"
(y <- charToRaw(x))
is.vector(y) # TRUE
rawToChar(y)
is.raw(x)
is.raw(y)
stopifnot( charToRaw("\xa3") == as.raw(0xa3) )

isASCII <- function(txt) all(charToRaw(txt) <= as.raw(127))
isASCII(x)  # true
isASCII("\xa325.63") # false (in Latin-1, this is an amount in UK pounds)

Raw Connections

Description

Input and output raw connections.

Usage

rawConnection(object, open = "r")

rawConnectionValue(con)

Arguments

object

character or raw vector. A description of the connection. For an input this is an R raw vector object, and for an output connection the name for the connection.

open

character. Any of the standard connection open modes.

con

an output raw connection.

Details

An input raw connection is opened and the raw vector is copied at the time the connection object is created, and close destroys the copy.

An output raw connection is opened and creates an R raw vector internally. The raw vector can be retrieved via rawConnectionValue.
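A round trip through an output and then an input raw connection might look as follows. This is a minimal sketch; the connection and variable names are illustrative, and size = 1 is chosen only to keep the byte layout easy to read.

```r
## Collect bytes in an output raw connection.
out_con <- rawConnection(raw(0), "wb")
writeBin(1:3, out_con, size = 1)          # one byte per integer
collected <- rawConnectionValue(out_con)  # retrieve before closing
close(out_con)

## Read the same bytes back through an input raw connection.
in_con <- rawConnection(collected, "rb")
values <- readBin(in_con, integer(), n = 3, size = 1)
close(in_con)

collected
values
```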

If a connection is open for both input and output the initial raw vector supplied is copied when the connection is opened.

Value

For rawConnection, a connection object of class "rawConnection" which inherits from class "connection".

For rawConnectionValue, a raw vector.

Note

As output raw connections keep the internal raw vector up to date call-by-call, they are relatively expensive to use (although over-allocation is used), and it may be better to use an anonymous file() connection to collect output.

On (rare) platforms where vsnprintf does not return the needed length of output there is a 100,000 character limit on the length of line for output connections: longer lines will be truncated with a warning.

See Also

connections, showConnections.

Examples

zz <- rawConnection(raw(0), "r+") # start with empty raw vector
writeBin(LETTERS, zz)
seek(zz, 0)
readLines(zz) # raw vector has embedded nuls
seek(zz, 0)
writeBin(letters[1:3], zz)
rawConnectionValue(zz)
close(zz)

Convert to or from (Bit/Packed) Raw Vectors

Description

Conversion to and from and manipulation of objects of type "raw", both used as bits or “packed” 8 bits.

Usage

charToRaw(x)
rawToChar(x, multiple = FALSE)

rawShift(x, n)

rawToBits(x)
intToBits(x)
packBits(x, type = c("raw", "integer", "double"))

numToInts(x)
numToBits(x)

Arguments

x

object to be converted or shifted.

multiple

logical: should the conversion be to a single character string or multiple individual characters?

n

the number of bits to shift. Positive numbers shift right and negative numbers shift left: allowed values are -8 ... 8.

type

the result type, partially matched.

Details

packBits accepts raw, integer or logical inputs, the last two without any NAs.

numToBits(.) and packBits(., type="double") are inverse functions of each other, see also the examples.

Note that ‘bytes’ are not necessarily the same as characters, e.g. in UTF-8 locales.

Value

charToRaw converts a length-one character string to raw bytes. It does so without taking into account any declared encoding (see Encoding).

rawToChar converts raw bytes either to a single character string or a character vector of single bytes (with "" for 0). (Note that a single character string could contain embedded NULs; only trailing nulls are allowed and will be removed.) In either case it is possible to create a result which is invalid in a multibyte locale, e.g. one using UTF-8. Long vectors are allowed if multiple is true.

rawShift(x, n) shifts the bits in x by n positions to the right, see the argument n, above.
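Here ‘right’ and ‘left’ refer to the least-significant-bit-first order in which rawToBits lists the bits. In numeric terms, and to the best of my understanding of current R versions, a positive n multiplies the byte by 2^n (dropping overflow) and a negative n divides it. A minimal sketch with illustrative variable names:

```r
z <- as.raw(5)                    # bits, least significant first: 1 0 1 0 0 0 0 0

up_one   <- rawShift(z,  1)       # numerically 5 * 2
down_one <- rawShift(z, -1)       # numerically 5 %/% 2
gone     <- rawShift(z, -3)       # all set bits shifted off

as.integer(c(up_one, down_one, gone))
as.integer(rawToBits(up_one))     # the set bits have moved one place to the right
```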

rawToBits returns a raw vector of 8 times the length of a raw vector with entries 0 or 1. intToBits returns a raw vector of 32 times the length of an integer vector with entries 0 or 1. (Non-integral numeric values are truncated to integers.) In both cases the unpacking is least-significant bit first.

packBits packs its input (using only the lowest bit for raw or integer vectors) least-significant bit first to a raw, integer or double (“numeric”) vector.
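The unpacking order and the round trip through packBits can be verified on a single value. A minimal sketch; the variable names are illustrative.

```r
bits32 <- intToBits(5L)           # 32 raw values, each 00 or 01
length(bits32)
as.integer(bits32[1:8])           # least-significant bit first: 1 0 1 0 0 0 0 0

back_int <- packBits(bits32, type = "integer")     # recovers 5L

bits8    <- rawToBits(as.raw(5))  # 8 raw values for one byte
back_raw <- packBits(bits8)       # default type "raw": recovers as.raw(5)
```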

numToInts() and numToBits() split double precision numeric vectors either into two integers each or into 64 bits each, stored as raw. In both cases the unpacking is least-significant element first.

Examples

x <- "A test string"
(y <- charToRaw(x))
is.vector(y) # TRUE

rawToChar(y)
rawToChar(y, multiple = TRUE)
(xx <- c(y,  charToRaw("&"), charToRaw(" more")))
rawToChar(xx)

rawShift(y, 1)
rawShift(y, -2)

rawToBits(y)

showBits <- function(r) stats::symnum(as.logical(rawToBits(r)))

z <- as.raw(5)
z; showBits(z)
showBits(rawShift(z, 1)) # shift to right
showBits(rawShift(z, 2))
showBits(z)
showBits(rawShift(z, -1)) # shift to left
showBits(rawShift(z, -2)) # ..
showBits(rawShift(z, -3)) # shifted off entirely

packBits(as.raw(0:31))
i <- -2:3
stopifnot(exprs = {
  identical(i, packBits(intToBits(i), "integer"))
  identical(packBits(       0:31) ,
            packBits(as.raw(0:31)))
})
str(pBi <- packBits(intToBits(i)))
data.frame(B = matrix(pBi, nrow = 6, byrow = TRUE),
           hex = format(as.hexmode(i)), i)

## Look at internal bit representation of ...

## ... of integers :
bitI <- function(x) vapply(as.integer(x), function(x) {
            b <- substr(as.character(rev(intToBits(x))), 2L, 2L)
            paste0(c(b[1L], " ", b[2:32]), collapse = "")
          }, "")
print(bitI(-8:8), width = 35, quote = FALSE)

## ... of double precision numbers in format  'sign exp | mantissa'
## where  1 bit sign  1 <==> "-";
##       11 bit exp   is the base-2 exponent biased by 2^10 - 1 (1023)
##       52 bit mantissa is without the implicit leading '1'
### Bit representation  [ sign | exponent | mantissa ] of double prec numbers :

bitC <- function(x) noquote(vapply(as.double(x), function(x) { # split one double
    b <- substr(as.character(rev(numToBits(x))), 2L, 2L)
    paste0(c(b[1L], " ", b[2:12], " | ", b[13:64]), collapse = "")
  }, ""))
bitC(17)
bitC(c(-1, 0, 1))
bitC(2^(-2:5))
bitC(1+2^-(1:53)) # from 0.5 converge to 1

###  numToBits(.)  <==>   intToBits(numToInts(.)) :
d2bI <- function(x) vapply(as.double(x), function(x) intToBits(numToInts(x)), raw(64L))
d2b  <- function(x) vapply(as.double(x), function(x)           numToBits(x) , raw(64L))
set.seed(1)
x <- c(sort(rt(2048, df = 1.5)), 2^(-10:10), 1+2^-(1:53))
str(bx <- d2b(x)) # a  64 x 2122  raw matrix
stopifnot( identical(bx, d2bI(x)) )

## Show that  packBits(*, "double")  is the inverse of numToBits() :
packBits(numToBits(pi), type = "double")
bitC(2050)
b <- numToBits(2050)
identical(b, numToBits(packBits(b, type = "double")))
pbx <- apply(bx, 2, packBits, type = "double")
stopifnot( identical(pbx, x))

Utilities for Processing Rd Files

Description

Utilities for converting files in R documentation (Rd) format to other formats or creating indices from them, and for converting documentation in other formats to Rd format.

Usage

R CMD Rdconv [options] file
R CMD Rd2pdf [options] files

Arguments

file

the path to a file to be processed.

files

a list of file names specifying the R documentation sources to use, by either giving the paths to the files, or the path to a directory with the sources of a package.

options

further options to control the processing, or for obtaining information about usage and version of the utility.

Details

R CMD Rdconv converts Rd format to plain text, HTML or LaTeX formats: it can also extract the examples.

R CMD Rd2pdf is the user-level program for producing PDF output from Rd sources. It will make use of the environment variables R_PAPERSIZE (set by R CMD, with a default set when R was installed: values for R_PAPERSIZE are a4, letter, legal and executive) and R_PDFVIEWER (the PDF previewer). Also, RD2PDF_INPUTENC can be set to inputenx to make use of the LaTeX package of that name rather than inputenc: this might be needed for better support of the UTF-8 encoding.

R CMD Rd2pdf calls tools::texi2pdf to produce its PDF file: see its help for the possibilities for the texi2dvi command which that function uses (and which can be overridden by setting environment variable R_TEXI2DVICMD).

Use R CMD foo --help to obtain usage information on utility foo.

See Also

The section ‘Processing documentation files’ in the ‘Writing R Extensions’ manual: RShowDoc("R-exts").


Transfer Binary Data To and From Connections

Description

Read binary data from or write binary data to a connection or raw vector.

Usage

readBin(con, what, n = 1L, size = NA_integer_, signed = TRUE,
        endian = .Platform$endian)

writeBin(object, con, size = NA_integer_,
         endian = .Platform$endian, useBytes = FALSE)

Arguments

con

A connection object or a character string naming a file or a raw vector.

what

Either an object whose mode will give the mode of the vector to be read, or a character vector of length one describing the mode: one of "numeric", "double", "integer", "int", "logical", "complex", "character", "raw".

n

numeric. The (maximal) number of records to be read. You can use an over-estimate here, but not too large as storage is reserved for n items.

size

integer. The number of bytes per element in the byte stream. The default, NA_integer_, uses the natural size. Size changing is not supported for raw and complex vectors.

signed

logical. Only used for integers of sizes 1 and 2, when it determines if the quantity on file should be regarded as a signed or unsigned integer.

endian

The endianness ("big" or "little") of the target system for the file. Using "swap" will force swapping endianness.

object

An R object to be written to the connection.

useBytes

See writeLines.

Details

These functions can only be used with binary-mode connections. If con is a character string, the functions call file to obtain a binary-mode file connection which is opened for the duration of the function call.

If the connection is open it is read/written from its current position. If it is not open, it is opened for the duration of the call in an appropriate mode (binary read or write) and then closed again. An open connection must be in binary mode.

If readBin is called with con a raw vector, the data in the vector is used as input. If writeBin is called with con a raw vector, it is just an indication that a raw vector should be returned.

If size is specified and not the natural size of the object, each element of the vector is coerced to an appropriate type before being written or as it is read. Possible sizes are 1, 2, 4 and possibly 8 for integer or logical vectors, and 4, 8 and possibly 12/16 for numeric vectors. (Note that coercion occurs as signed types except if signed = FALSE when reading integers of sizes 1 and 2.) Changing sizes is unlikely to preserve NAs, and the extended precision sizes are unlikely to be portable across platforms.
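The interaction of size and signed can be seen by writing integers as single bytes to a raw vector and reading them back both ways. A minimal sketch; the variable names are illustrative, and the signed values shown assume the usual two's-complement behaviour.

```r
one_byte <- writeBin(c(0L, 127L, 200L, 255L), raw(), size = 1)
one_byte                                         # 00 7f c8 ff

as_signed   <- readBin(one_byte, integer(), n = 4, size = 1)
as_unsigned <- readBin(one_byte, integer(), n = 4, size = 1, signed = FALSE)

as_signed     # values above 127 come back negative
as_unsigned   # the original values
```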

readBin and writeBin read and write C-style zero-terminated character strings. Input strings are limited to 10000 characters. readChar and writeChar can be used to read and write fixed-length strings. No check is made that the string is valid in the current locale's encoding.

Handling R's missing and special (Inf, -Inf and NaN) values is discussed in the ‘R Data Import/Export’ manual.

Only 2^31 - 1 bytes can be written in a single call (and that is the maximum capacity of a raw vector on 32-bit platforms).

‘Endian-ness’ is relevant for size > 1, and should always be set for portable code (the default is only appropriate when writing and then reading files on the same platform).
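Writing the same integer with each byte order makes the difference visible, and shows what happens when the wrong order is assumed on reading. A minimal sketch; the variable names are illustrative.

```r
big    <- writeBin(1L, raw(), size = 4, endian = "big")     # 00 00 00 01
little <- writeBin(1L, raw(), size = 4, endian = "little")  # 01 00 00 00

right_order <- readBin(big, integer(), size = 4, endian = "big")
wrong_order <- readBin(big, integer(), size = 4, endian = "little")

right_order   # 1
wrong_order   # 16777216, i.e. 2^24
```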

Value

For readBin, a vector of appropriate mode and length the number of items read (which might be less than n).

For writeBin, a raw vector (if con is a raw vector) or invisibly NULL.

Note

Integer read/writes of size 8 will be available if either C type long is of size 8 bytes or C type long long exists and is of size 8 bytes.

Real read/writes of size sizeof(long double) (usually 12 or 16 bytes) will be available only if that type is available and different from double.

If readBin(what = character()) is used incorrectly on a file which does not contain C-style character strings, warnings (usually many) are given. From a file or connection, the input will be broken into pieces of length 10000 with any final part being discarded.

See Also

The ‘R Data Import/Export’ manual.

readChar and writeChar to read/write fixed-length strings.

connections, readLines, writeLines.

.Machine for the sizes of long, long long and long double.

Examples

zzfil <- tempfile("testbin")
zz <- file(zzfil, "wb")
writeBin(1:10, zz)
writeBin(pi, zz, endian = "swap")
writeBin(pi, zz, size = 4)
writeBin(pi^2, zz, size = 4, endian = "swap")
writeBin(pi+3i, zz)
writeBin("A test of a connection", zz)
z <- paste("A very long string", 1:100, collapse = " + ")
writeBin(z, zz)
if(.Machine$sizeof.long == 8 || .Machine$sizeof.longlong == 8)
    writeBin(as.integer(5^(1:10)), zz, size = 8)
if((s <- .Machine$sizeof.longdouble) > 8)
    writeBin((pi/3)^(1:10), zz, size = s)
close(zz)

zz <- file(zzfil, "rb")
readBin(zz, integer(), 4)
readBin(zz, integer(), 6)
readBin(zz, numeric(), 1, endian = "swap")
readBin(zz, numeric(), size = 4)
readBin(zz, numeric(), size = 4, endian = "swap")
readBin(zz, complex(), 1)
readBin(zz, character(), 1)
z2 <- readBin(zz, character(), 1)
if(.Machine$sizeof.long == 8 || .Machine$sizeof.longlong == 8)
    readBin(zz, integer(), 10, size = 8)
if((s <- .Machine$sizeof.longdouble) > 8)
    readBin(zz, numeric(), 10, size = s)
close(zz)
unlink(zzfil)
stopifnot(z2 == z)

## signed vs unsigned ints
zzfil <- tempfile("testbin")
zz <- file(zzfil, "wb")
x <- as.integer(seq(0, 255, 32))
writeBin(x, zz, size = 1)
writeBin(x, zz, size = 1)
x <- as.integer(seq(0, 60000, 10000))
writeBin(x, zz, size = 2)
writeBin(x, zz, size = 2)
close(zz)
zz <- file(zzfil, "rb")
readBin(zz, integer(), 8, size = 1)
readBin(zz, integer(), 8, size = 1, signed = FALSE)
readBin(zz, integer(), 7, size = 2)
readBin(zz, integer(), 7, size = 2, signed = FALSE)
close(zz)
unlink(zzfil)

## use of raw
z <- writeBin(pi^{1:5}, raw(), size = 4)
readBin(z, numeric(), 5, size = 4)
z <- writeBin(c("a", "test", "of", "character"), raw())
readBin(z, character(), 4)

Transfer Character Strings To and From Connections

Description

Transfer character strings to and from connections, without assumingthey are null-terminated on the connection.

Usage

readChar(con, nchars, useBytes = FALSE)

writeChar(object, con, nchars = nchar(object, type = "chars"),
          eos = "", useBytes = FALSE)

Arguments

con

a connection object, or a character string naming a file, or a raw vector.

nchars

integer vector, giving the lengths in characters of (unterminated) character strings to be read or written. Elements must be >= 0 and not NA.

useBytes

logical: For readChar, should nchars be regarded as a number of bytes not characters in a multi-byte locale? For writeChar, see writeLines.

object

a character vector to be written to the connection, at least as long as nchars.

eos

‘end of string’: character string. The terminator to be written after each string, followed by an ASCII nul; use NULL for no terminator at all.

Details

These functions complement readBin and writeBin which read and write C-style zero-terminated character strings. They are for strings of known length, and can optionally write an end-of-string mark. They are intended only for character strings valid in the current locale.

These functions are intended to be used with binary-mode connections. If con is a character string, the functions call file to obtain a binary-mode file connection which is opened for the duration of the function call.

If the connection is open it is read/written from its current position. If it is not open, it is opened for the duration of the call in an appropriate mode (binary read or write) and then closed again. An open connection must be in binary mode.

If readChar is called with con a raw vector, the data in the vector is used as input. If writeChar is called with con a raw vector, it is just an indication that a raw vector should be returned.

Character strings containing ASCII nul(s) will be read correctly by readChar but truncated at the first nul with a warning.

If the character length requested for readChar is longer than the data available on the connection, what is available is returned. For writeChar if too many characters are requested the output is zero-padded, with a warning.
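Using a raw vector as con is a convenient way to inspect exactly which bytes are written. A minimal sketch with ASCII data, so that bytes and characters coincide; the variable names are illustrative.

```r
no_term   <- writeChar("abc", raw(), eos = NULL)  # 61 62 63: no terminator
with_term <- writeChar("abc", raw())              # default eos = "": a nul is appended

## Read two fixed-length fields of 1 and 2 characters from the raw bytes.
fields <- readChar(no_term, c(1, 2))

no_term
with_term
fields
```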

Missing strings are written asNA.
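The raw-vector behaviour described above can be tried without touching the file system. The following is a minimal sketch; the variable names are illustrative only.

```r
# Passing a raw vector as 'con' asks writeChar() to return the bytes
# instead of writing them to a connection.
bytes <- writeChar(c("abc", "de"), raw(), nchars = c(3, 2), eos = NULL)
length(bytes)                       # 5 bytes: no terminators were written

# readChar() accepts the raw vector as its input and splits it by length.
back <- readChar(bytes, c(3, 2))
back                                # "abc" "de"
```

With the default eos = "" each string would instead be followed by a nul byte, which readChar would have to skip explicitly.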

Value

For readChar, a character vector of length the number of items read (which might be less than length(nchars)).

For writeChar, a raw vector (if con is a raw vector) or invisibly NULL.

Note

Earlier versions of R allowed embedded NUL bytes within character strings, but not R >= 2.8.0. readChar was commonly used to read fixed-size zero-padded byte fields for which readBin was unsuitable. readChar can still be used for such fields if there are no embedded NULs: otherwise readBin(what = "raw") provides an alternative.

nchars will be interpreted in bytes not characters in a non-UTF-8 multi-byte locale, with a warning.

There is little validity checking of UTF-8 reads.

Using these functions on a text-mode connection may work but should not be mixed with text-mode access to the connection, especially if the connection was opened with an encoding argument.

See Also

The ‘R Data Import/Export’ manual.

connections, readLines, writeLines, readBin

Examples

## test fixed-length strings
zzfil <- tempfile("testchar")
zz <- file(zzfil, "wb")
x <- c("a", "this will be truncated", "abc")
nc <- c(3, 10, 3)
writeChar(x, zz, nc, eos = NULL)
writeChar(x, zz, eos = "\r\n")
close(zz)

zz <- file(zzfil, "rb")
readChar(zz, nc)
readChar(zz, nchar(x) + 3) # need to read the terminator explicitly
close(zz)
unlink(zzfil)

Read a Line from the Terminal

Description

readline reads a line from the terminal (in interactive use).

Usage

readline(prompt="")

Arguments

prompt

the string printed when prompting the user for input. Should usually end with a space " ".

Details

The prompt string will be truncated to a maximum allowed length, normally 256 chars (but can be changed in the source code).

This can only be used in an interactive session.

Value

A character vector of length one. Both leading and trailing spaces and tabs are stripped from the result.

In non-interactive use the result is as if the response was RETURN and the value is "".
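Because readline returns "" immediately in non-interactive use, a prompting loop needs a guard. The helper below is a hypothetical sketch (ask_yes_no and its default argument are not part of base R) showing one way to re-prompt interactively while falling back to a default in scripts.

```r
# Hypothetical helper: prompt until the reply starts with "y" or "n".
ask_yes_no <- function(prompt, default = "n") {
  # In a non-interactive session readline() returns "" at once,
  # so return the default rather than looping forever.
  if (!interactive()) return(default)
  repeat {
    ans <- tolower(substr(readline(prompt), 1, 1))
    if (ans %in% c("y", "n")) return(ans)
    cat("Please answer y or n.\n")
  }
}

reply <- ask_yes_no("Continue? [y/n] ")
```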

See Also

readLines for reading text lines from connections, including files.

Examples

fun <- function() {
  ANSWER <- readline("Are you a satisfied R user? ")
  ## a better version would check the answer less cursorily, and
  ## perhaps re-prompt
  if (substr(ANSWER, 1, 1) == "n")
    cat("This is impossible.  YOU LIED!\n")
  else
    cat("I knew it.\n")
}
if(interactive()) fun()

Read Text Lines from a Connection

Description

Read some or all text lines from a connection.

Usage

readLines(con = stdin(), n = -1L, ok = TRUE, warn = TRUE,
          encoding = "unknown", skipNul = FALSE)

Arguments

con

a connection object or a character string.

n

integer. The (maximal) number of lines to read. Negative values indicate that one should read up to the end of input on the connection.

ok

logical. Is it OK to reach the end of the connection before n > 0 lines are read? If not, an error will be generated.

warn

logical. Warn if a text file is missing a final EOL or if there are embedded NULs in the file.

encoding

encoding to be assumed for input strings. It is used to mark character strings as known to be in Latin-1, UTF-8 or to be bytes: it is not used to re-encode the input. To do the latter, specify the encoding as part of the connection con or via options(encoding=): see the examples and ‘Details’.

skipNul

logical: should NULs be skipped?

Details

If the con is a character string, the function calls file to obtain a file connection which is opened for the duration of the function call. This can be a compressed file. (Tilde expansion of the file path is done by file.)

If the connection is open it is read from its current position. If it is not open, it is opened in "rt" mode for the duration of the call and then closed (but not destroyed; one must call close to do that).

If the final line is incomplete (no final EOL marker) the behaviour depends on whether the connection is blocking or not. For a non-blocking text-mode connection the incomplete line is pushed back, silently. For all other connections the line will be accepted, with a warning.

Whatever mode the connection is opened in, any of LF, CRLF or CR will be accepted as the EOL marker for a line.

Embedded NULs in the input stream will terminate the line currently being read, with a warning (unless skipNul = TRUE or warn = FALSE).

If con is a not-already-open connection with a non-default encoding argument, the text is converted to UTF-8 and declared as such (and the encoding argument to readLines is ignored). See the examples.
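Two of the points above, reading from the current position of an open connection and the acceptance of mixed EOL markers, can be illustrated with a temporary file. This is a small sketch; the file contents are made up for the illustration.

```r
fil <- tempfile()

# An already-open connection is read from its current position,
# so successive calls continue where the previous one stopped.
writeLines(c("first", "second", "third"), fil)
con <- file(fil, "r")
head1 <- readLines(con, n = 1)   # "first"
rest  <- readLines(con)          # "second" "third"
close(con)

# LF, CRLF and CR are all accepted as end-of-line markers.
writeBin(charToRaw("a\r\nb\rc\n"), fil)
mixed <- readLines(fil)          # "a" "b" "c"

unlink(fil)
```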

Value

A character vector of length the number of lines read.

The elements of the result have a declared encoding if encoding is "latin1" or "UTF-8".

Note

The default connection, stdin, may be different from con = "stdin": see file.

See Also

connections, writeLines, readBin, scan

Examples

fil <- tempfile(fileext = ".data")
cat("TITLE extra line", "2 3 5 7", "", "11 13 17", file = fil,
    sep = "\n")
readLines(fil, n = -1)
unlink(fil) # tidy up

## difference in blocking
fil <- tempfile("test")
cat("123\nabc", file = fil)
readLines(fil) # line with a warning

con <- file(fil, "r", blocking = FALSE)
readLines(con) # "123"
cat(" def\n", file = fil, append = TRUE)
readLines(con) # gets both
close(con)

unlink(fil) # tidy up

## Not run:
# read a 'Windows Unicode' file
A <- readLines(con <- file("Unicode.txt", encoding = "UCS-2LE"))
close(con)
unique(Encoding(A)) # will most likely be UTF-8

## End(Not run)

Serialization Interface for Single Objects

Description

Functions to write a single R object to a file, and to restore it.

Usage

saveRDS(object, file = "", ascii = FALSE, version = NULL,
        compress = TRUE, refhook = NULL)

readRDS(file, refhook = NULL)

infoRDS(file)

Arguments

object

R object to serialize.

file

a connection or the name of the file where the R object is saved to or read from.

ascii

a logical. If TRUE or NA, an ASCII representation is written; otherwise (default), a binary one is used. See the comments in the help for save.

version

the workspace format version to use. NULL specifies the current default version (3). The only other supported value is 2, the default from R 1.4.0 to R 3.5.0.

compress

a logical specifying whether saving to a named file is to use "gzip" compression, or one of "gzip", "bzip2" or "xz" to indicate the type of compression to be used. Ignored if file is a connection.

refhook

a hook function for handling reference objects.

Details

saveRDS and readRDS provide the means to save a single R object to a connection (typically a file) and to restore the object, quite possibly under a different name. This differs from save and load, which save and restore one or more named objects into an environment. They are widely used by R itself, for example to store metadata for a package and to store the help.search databases: the ".rds" file extension is most often used.

Functions serialize and unserialize provide a slightly lower-level interface to serialization: objects serialized to a connection by serialize can be read back by readRDS and conversely.

Function infoRDS retrieves meta-data about serialization produced by saveRDS or serialize. infoRDS cannot be used to detect whether a file is a serialization nor whether it is valid.

All of these interfaces use the same serialization format, but save writes a single line header (typically "RDXs\n") before the serialization of a single object (a pairlist of all the objects to be saved).

If file is a file name, it is opened by gzfile except for save(compress = FALSE) which uses file. Only for the exception are marked encodings of file which cannot be translated to the native encoding handled on Windows.

Compression is handled by the connection opened when file is a file name, so is only possible when file is a connection if handled by the connection. So e.g. url connections will need to be wrapped in a call to gzcon.

If a connection is supplied it will be opened (in binary mode) for the duration of the function if not already open: if it is already open it must be in binary mode for saveRDS(ascii = FALSE) or to read non-ASCII saves.

Value

For readRDS, an R object.

For saveRDS, NULL invisibly.

For infoRDS, an R list with elements version (version number, currently 2 or 3), writer_version (version of R that produced the serialization), min_reader_version (minimum version of R that can read the serialization), format (data representation) and native_encoding (native encoding of the session that produced the serialization, available since version 3). The data representation is given as "xdr" for big-endian binary representation, "ascii" for ASCII representation (produced via ascii = TRUE or ascii = NA) or "binary" (binary representation with native ‘endianness’ which can be produced by serialize).
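As a rough illustration of the values described above, the sketch below saves a built-in data set with a non-default compression type and inspects the metadata reported by infoRDS. The choice of mtcars and of "xz" is arbitrary.

```r
fil <- tempfile(fileext = ".rds")

# Save with xz compression; readRDS() and infoRDS() detect the
# compression type when the file is opened.
saveRDS(mtcars, fil, compress = "xz")

info <- infoRDS(fil)
info$version                 # 3, the current default format version
info$format                  # "xdr" for the default binary representation

restored <- readRDS(fil)
same <- identical(restored, mtcars)

unlink(fil)
```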

Warning

Files produced by saveRDS (or serialize to a file connection) are not suitable as an interchange format between machines, for example to download from a website. The files produced by save have a header identifying the file type and so are better protected against erroneous use.

See Also

serialize, save and load.

The ‘R Internals’ manual for details of the format used.

Examples

fil <- tempfile("women", fileext = ".rds")
## save a single object to file
saveRDS(women, fil)
## restore it under a different name
women2 <- readRDS(fil)
identical(women, women2)
## or examine the object via a connection, which will be opened as needed.
con <- gzfile(fil)
readRDS(con)
close(con)

## Less convenient ways to restore the object
## which demonstrate compatibility with unserialize()
con <- gzfile(fil, "rb")
identical(unserialize(con), women)
close(con)
con <- gzfile(fil, "rb")
wm <- readBin(con, "raw", n = 1e4) # size is a guess
close(con)
identical(unserialize(wm), women)

## Format compatibility with serialize():
fil2 <- tempfile("women")
con <- file(fil2, "w")
serialize(women, con) # ASCII, uncompressed
close(con)
identical(women, readRDS(fil2))
fil3 <- tempfile("women")
con <- bzfile(fil3, "w")
serialize(women, con) # binary, bzip2-compressed
close(con)
identical(women, readRDS(fil3))

unlink(c(fil, fil2, fil3))

Set Environment Variables from a File

Description

Read a file such as ‘.Renviron’ or ‘Renviron.site’ in the format described in the help for Startup, and set environment variables as defined in the file.

Usage

readRenviron(path)

Arguments

path

A length-one character vector giving the path to the file. Tilde-expansion is performed where supported.

Value

Scalar logical indicating if the file was read successfully. Returned invisibly. If the file cannot be opened for reading, a warning is given.
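The return value and the effect on the environment can be checked with a throw-away file. In this sketch the variable name MY_DEMO_VAR is made up for the illustration.

```r
env_file <- tempfile()
writeLines("MY_DEMO_VAR=hello", env_file)

# The result is returned invisibly, so assign it to inspect it.
ok <- readRenviron(env_file)
ok                                   # TRUE if the file was read
val <- Sys.getenv("MY_DEMO_VAR")
val                                  # "hello"

# Clean up the variable and the temporary file.
Sys.unsetenv("MY_DEMO_VAR")
unlink(env_file)
```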

See Also

Startup for the file format.

Examples

## Not run:
## re-read a startup file (or read it in a vanilla session)
readRenviron("~/.Renviron")

## End(Not run)

Recursive Calling

Description

Recall is used as a placeholder for the name of the function in which it is called. It allows the definition of recursive functions which still work after being renamed, see example below.

Usage

Recall(...)

Arguments

...

all the arguments to be passed.

Note

Recall will not work correctly when passed as a function argument, e.g. to the apply family of functions.
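Since Recall stands in for the enclosing function whatever it is called, it also works in a function that has no name at all. A minimal sketch, computing a factorial with an anonymous function:

```r
# The anonymous function calls itself through Recall().
res <- (function(n) if (n <= 1) 1 else n * Recall(n - 1))(5)
res   # 120
```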

See Also

do.call and call.

local for another way to write anonymous recursive functions.

Examples

## A trivial (but inefficient!) example:
fib <- function(n)
   if(n <= 2) { if(n >= 0) 1 else 0 } else Recall(n-1) + Recall(n-2)
fibonacci <- fib; rm(fib)
## renaming wouldn't work without Recall
fibonacci(10) # 55

Finalization of Objects

Description

Registers an R function to be called upon garbage collection of an object or (optionally) at the end of an R session.

Usage

reg.finalizer(e, f, onexit=FALSE)

Arguments

e

object to finalize. Must be an environment or an external pointer.

f

function to call on finalization. Must accept a single argument, which will be the object to finalize.

onexit

logical: should the finalizer be run if the object is still uncollected at the end of the R session?

Details

The main purpose of this function is to allow objects that refer to external items (a temporary file, say) to perform cleanup actions when they are no longer referenced from within R. This only makes sense for objects that are never copied on assignment, hence the restriction to environments and external pointers.

Inter alia, it provides a way to program code to be run at the end of an R session without manipulating .Last. For use in a package, it is often a good idea to set a finalizer on an object in the namespace: then it will be called at the end of the session, or soon after the namespace is unloaded if that is done during the session.
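The temporary-file use case mentioned above might be sketched as follows. make_resource is a hypothetical helper, and whether the file has already gone after the gc() call depends on when the finalizer is actually run, so no output is claimed for the last line.

```r
# Hypothetical constructor: an environment that owns a temporary file.
make_resource <- function() {
  res <- new.env()
  res$path <- tempfile()
  file.create(res$path)
  # Remove the file when 'res' is collected, or at the end of the session.
  reg.finalizer(res, function(e) unlink(e$path), onexit = TRUE)
  res
}

r <- make_resource()
p <- r$path
existed_before <- file.exists(p)   # TRUE: the file was just created

rm(r)                # drop the only reference to the environment
invisible(gc())      # schedules the finalizer; it normally runs soon after
file.exists(p)
```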

Value

NULL.

Note

R's interpreter is not re-entrant and the finalizer could be run in the middle of a computation. So there are many functions which it is potentially unsafe to call from f: one example which caused trouble is options. Finalizers are scheduled at garbage collection but only run at a relatively safe time thereafter.

See Also

gc and Memory for garbage collection and memory management.

Examples

f <- function(e) print("cleaning....")
g <- function(x){ e <- environment(); reg.finalizer(e, f) }
g()
invisible(gc()) # trigger cleanup

Regular Expressions as used in R

Description

This help page documents the regular expression patterns supported by grep and related functions grepl, regexpr, gregexpr, sub and gsub, as well as by strsplit and optionally by agrep and agrepl.

Details

A ‘regular expression’ is a pattern that describes a set of strings. Two types of regular expressions are used in R, extended regular expressions (the default) and Perl-like regular expressions used by perl = TRUE. There is also fixed = TRUE which can be considered to use a literal regular expression.

Other functions which use regular expressions (often via the use of grep) include apropos, browseEnv, help.search, list.files and ls. These will all use extended regular expressions.

Patterns are described here as they would be printed by cat: (do remember that backslashes need to be doubled when entering R character strings, e.g. from the keyboard).
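The doubling of backslashes is a property of R's string parser, not of the regular expression itself. A short sketch of the difference between what is typed and what the regex engine sees:

```r
pat <- "\\."            # typed with two backslashes
cat(pat, "\n")          # printed by cat as the two-character pattern \.
n <- nchar(pat)         # 2: a backslash and a period

# The pattern matches a literal period only.
hits <- grepl(pat, c("a.b", "ab"))
hits                    # TRUE FALSE
```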

Long regular expression patterns may or may not be accepted: the POSIX standard only requires up to 256 bytes.

Extended Regular Expressions

This section covers the regular expressions allowed in the default mode of grep, grepl, regexpr, gregexpr, sub, gsub, regexec and strsplit. They use an implementation of the POSIX 1003.2 standard: that allows some scope for interpretation and the interpretations here are those currently used by R. The implementation supports some extensions to the standard.

Regular expressions are constructed analogously to arithmetic expressions, by using various operators to combine smaller expressions. The whole expression matches zero or more characters (read ‘character’ as ‘byte’ if useBytes = TRUE).

The fundamental building blocks are the regular expressions that match a single character. Most characters, including all letters and digits, are regular expressions that match themselves. Any metacharacter with special meaning may be quoted by preceding it with a backslash. The metacharacters in extended regular expressions are ‘. \ | ( ) [ { ^ $ * + ?’, but note that whether these have a special meaning depends on the context.

Escaping non-metacharacters with a backslash is implementation-dependent. The current implementation interprets ‘\a’ as ‘BEL’, ‘\e’ as ‘ESC’, ‘\f’ as ‘FF’, ‘\n’ as ‘LF’, ‘\r’ as ‘CR’ and ‘\t’ as ‘TAB’. (Note that these will be interpreted by R's parser in literal character strings.)

A character class is a list of characters enclosed between ‘[’ and ‘]’ which matches any single character in that list; unless the first character of the list is the caret ‘^’, when it matches any character not in the list. For example, the regular expression ‘[0123456789]’ matches any single digit, and ‘[^abc]’ matches anything except the characters ‘a’, ‘b’ or ‘c’. A range of characters may be specified by giving the first and last characters, separated by a hyphen. (Because their interpretation is locale- and implementation-dependent, character ranges are best avoided. Some but not all implementations include both cases in ranges when doing caseless matching.) The only portable way to specify all ASCII letters is to list them all as the character class ‘[ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz]’. (The current implementation uses numerical order of the encoding, normally a single-byte encoding or Unicode points.)

Certain named classes of characters are predefined. Their interpretation depends on the locale (see locales); the interpretation below is that of the POSIX locale.

[:alnum:]

Alphanumeric characters: ‘[:alpha:]’ and ‘[:digit:]’.

[:alpha:]

Alphabetic characters: ‘[:lower:]’ and ‘[:upper:]’.

[:blank:]

Blank characters: space and tab, and possibly other locale-dependent characters, but on most platforms not including non-breaking space.

[:cntrl:]

Control characters. In ASCII, these characters have octal codes 000 through 037, and 177 (DEL). In another character set, these are the equivalent characters, if any.

[:digit:]

Digits: ‘0 1 2 3 4 5 6 7 8 9’.

[:graph:]

Graphical characters: ‘[:alnum:]’ and ‘[:punct:]’.

[:lower:]

Lower-case letters in the current locale.

[:print:]

Printable characters: ‘[:alnum:]’, ‘[:punct:]’ and space.

[:punct:]

Punctuation characters: ‘! " # $ % & ' ( ) * + , - . / : ; < = > ? @ [ \ ] ^ _ ` { | } ~’.

[:space:]

Space characters: tab, newline, vertical tab, form feed, carriage return, space and possibly other locale-dependent characters – on most platforms this does not include non-breaking spaces.

[:upper:]

Upper-case letters in the current locale.

[:xdigit:]

Hexadecimal digits: ‘0 1 2 3 4 5 6 7 8 9 A B C D E F a b c d e f’.

For example, ‘[[:alnum:]]’ means ‘[0-9A-Za-z]’, except the latter depends upon the locale and the character encoding, whereas the former is independent of locale and character set. (Note that the brackets in these class names are part of the symbolic names, and must be included in addition to the brackets delimiting the bracket list.) Most metacharacters lose their special meaning inside a character class. To include a literal ‘]’, place it first in the list. Similarly, to include a literal ‘^’, place it anywhere but first. Finally, to include a literal ‘-’, place it first or last (or, for perl = TRUE only, precede it by a backslash). (Only ‘^ - \ ]’ are special inside character classes.)
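The named classes and the placement rules for ‘]’, ‘^’ and ‘-’ can be tried directly. The strings below are made up for the illustration and contain only ASCII characters, so the results should not depend on the locale.

```r
x <- c("abc123", "Hello, world!", "a]b^c-d")

# A named class needs its own brackets inside the bracket list.
all_alnum <- grepl("^[[:alnum:]]+$", x)
all_alnum                                  # TRUE FALSE FALSE

no_punct <- gsub("[[:punct:]]", "", x[2])
no_punct                                   # "Hello world"

# Literal ']' first, '^' not first, '-' last.
replaced <- gsub("[]^-]", "_", x[3])
replaced                                   # "a_b_c_d"
```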

The period ‘.’ matches any single character. The symbol ‘\w’ matches a ‘word’ character (a synonym for ‘[[:alnum:]_]’, an extension) and ‘\W’ is its negation (‘[^[:alnum:]_]’). Symbols ‘\d’, ‘\s’, ‘\D’ and ‘\S’ denote the digit and space classes and their negations (these are all extensions).

The caret ‘^’ and the dollar sign ‘$’ are metacharacters that respectively match the empty string at the beginning and end of a line. The symbols ‘\<’ and ‘\>’ match the empty string at the beginning and end of a word. The symbol ‘\b’ matches the empty string at either edge of a word, and ‘\B’ matches the empty string provided it is not at an edge of a word. (The interpretation of ‘word’ depends on the locale and implementation: these are all extensions.)

A regular expression may be followed by one of several repetition quantifiers:

?

The preceding item is optional and will be matched at most once.

*

The preceding item will be matched zero or more times.

+

The preceding item will be matched one or more times.

{n}

The preceding item is matched exactly n times.

{n,}

The preceding item is matched n or more times.

{n,m}

The preceding item is matched at least n times, but not more than m times.

By default repetition is greedy, so the maximal possible number of repeats is used. This can be changed to ‘minimal’ by appending ? to the quantifier. (There are further quantifiers that allow approximate matching: see the TRE documentation.)
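The difference between greedy and minimal repetition, and the interval quantifier, can be seen in a short sketch using the default (extended) regular expressions:

```r
x <- "<a><b>"

greedy  <- sub("<.*>", "", x)    # matches the whole string
greedy                           # ""
minimal <- sub("<.*?>", "", x)   # stops at the first '>'
minimal                          # "<b>"

# {n,m}: between two and three repetitions of 'a', anchored at both ends.
ok <- grepl("^a{2,3}$", c("a", "aa", "aaa", "aaaa"))
ok                               # FALSE TRUE TRUE FALSE
```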

Regular expressions may be concatenated; the resulting regular expression matches any string formed by concatenating the substrings that match the concatenated subexpressions.

Two regular expressions may be joined by the infix operator ‘|’; the resulting regular expression matches any string matching either subexpression. For example, ‘abba|cde’ matches either the string abba or the string cde. Note that alternation does not work inside character classes, where ‘|’ has its literal meaning.

Repetition takes precedence over concatenation, which in turn takes precedence over alternation. A whole subexpression may be enclosed in parentheses to override these precedence rules.
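A small sketch of how parentheses change what alternation applies to; the test strings are arbitrary.

```r
x <- c("ab", "cd", "abd", "acd")

# Alternation between the two whole sequences 'ab' and 'cd'.
whole <- grepl("^(ab|cd)$", x)
whole                            # TRUE TRUE FALSE FALSE

# Alternation restricted to the middle character.
middle <- grepl("^a(b|c)d$", x)
middle                           # FALSE FALSE TRUE TRUE
```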

The backreference ‘\N’, where ‘N = 1 ... 9’, matches the substring previously matched by the Nth parenthesized subexpression of the regular expression. (This is an extension for extended regular expressions: POSIX defines them only for basic ones.)
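Backreferences can be used both inside a pattern and in the replacement argument of sub. A brief sketch:

```r
# In the replacement, \1 and \2 refer to the captured words.
swapped <- sub("(\\w+) (\\w+)", "\\2 \\1", "hello world")
swapped                          # "world hello"

# In the pattern, \1 must repeat exactly what group 1 matched.
doubled <- grepl("(ab)\\1", c("abab", "abba"))
doubled                          # TRUE FALSE
```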

Perl-like Regular Expressions

The perl = TRUE argument to grep, regexpr, gregexpr, sub, gsub and strsplit switches to the PCRE library that implements regular expression pattern matching using the same syntax and semantics as Perl 5.x, with just a few differences.

For complete details please consult the man pages for PCRE, especially man pcrepattern and man pcreapi, on your system or from the sources at https://www.pcre.org. (The version in use can be found by calling extSoftVersion. It need not be the version described in the system's man page. PCRE1 (reported as version < 10.00 by extSoftVersion) has been feature-frozen for some time (essentially 2012), the man pages at https://www.pcre.org/original/doc/html/ should be a good match. PCRE2 (PCRE version >= 10.00) has man pages at https://www.pcre.org/current/doc/html/).

Perl regular expressions can be computed byte-by-byte or (UTF-8) character-by-character: the latter is used in all multibyte locales and if any of the inputs are marked as UTF-8 (see Encoding), or as Latin-1 except in a Latin-1 locale.

All the regular expressions described for extended regular expressions are accepted except ‘\<’ and ‘\>’: in Perl all backslashed metacharacters are alphanumeric and backslashed symbols always are interpreted as a literal character. ‘{’ is not special if it would be the start of an invalid interval specification. There can be more than 9 backreferences (but the replacement in sub can only refer to the first 9).

Character ranges are interpreted in the numerical order of the characters, either as bytes in a single-byte locale or as Unicode code points in UTF-8 mode. So in either case ‘[A-Za-z]’ specifies the set of ASCII letters.

In UTF-8 mode the named character classes only match ASCII characters: see ‘\p’ below for an alternative.

The construct ‘(?...)’ is used for Perl extensions in a variety of ways depending on what immediately follows the ‘?’.

Perl-like matching can work in several modes, set by the options ‘(?i)’ (caseless, equivalent to Perl's ‘/i’), ‘(?m)’ (multiline, equivalent to Perl's ‘/m’), ‘(?s)’ (single line, so a dot matches all characters, even new lines: equivalent to Perl's ‘/s’) and ‘(?x)’ (extended, whitespace data characters are ignored unless escaped and comments are allowed: equivalent to Perl's ‘/x’). These can be concatenated, so for example, ‘(?im)’ sets caseless multiline matching. It is also possible to unset these options by preceding the letter with a hyphen, and to combine setting and unsetting such as ‘(?im-sx)’. These settings can be applied within patterns, and then apply to the remainder of the pattern. Additional options not in Perl include ‘(?U)’ to set ‘ungreedy’ mode (so matching is minimal unless ‘?’ is used as part of the repetition quantifier, when it is greedy). Initially none of these options are set.
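A few of the mode options described above, sketched with perl = TRUE; the inputs are arbitrary.

```r
# (?i): caseless matching.
caseless <- grepl("(?i)hello", "HELLO", perl = TRUE)
caseless                                        # TRUE

# (?s): the dot also matches a newline.
dot_default <- grepl("a.b", "a\nb", perl = TRUE)
dot_single  <- grepl("(?s)a.b", "a\nb", perl = TRUE)
c(dot_default, dot_single)                      # FALSE TRUE

# (?U): quantifiers become minimal by default.
ungreedy <- sub("(?U)<.*>", "", "<a><b>", perl = TRUE)
ungreedy                                        # "<b>"
```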

If you want to remove the special meaning from a sequence of characters, you can do so by putting them between ‘\Q’ and ‘\E’. This is different from Perl in that ‘$’ and ‘@’ are handled as literals in ‘\Q...\E’ sequences in PCRE, whereas in Perl, ‘$’ and ‘@’ cause variable interpolation.
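A minimal sketch of quoting with ‘\Q...\E’, where the period would otherwise match any character:

```r
x <- c("a.b", "axb")

unquoted <- grepl("a.b", x, perl = TRUE)
unquoted                         # TRUE TRUE

# Between \Q and \E the period is taken literally.
quoted <- grepl("\\Qa.b\\E", x, perl = TRUE)
quoted                           # TRUE FALSE
```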

The escape sequences ‘\d’, ‘\s’ and ‘\w’ represent any decimal digit, space character and ‘word’ character (letter, digit or underscore in the current locale: in UTF-8 mode only ASCII letters and digits are considered) respectively, and their upper-case versions represent their negation. Vertical tab was not regarded as a space character in a C locale before PCRE 8.34. Sequences ‘\h’, ‘\v’, ‘\H’ and ‘\V’ match horizontal and vertical space or the negation. (In UTF-8 mode, these do match non-ASCII Unicode code points.)

There are additional escape sequences: ‘\cx’ is ‘cntrl-x’ for any ‘x’, ‘\ddd’ is the octal character (for up to three digits unless interpretable as a backreference, as ‘\1’ to ‘\7’ always are), and ‘\xhh’ specifies a character by two hex digits. In a UTF-8 locale, ‘\x{h...}’ specifies a Unicode code point by one or more hex digits. (Note that some of these will be interpreted by R's parser in literal character strings.)

Outside a character class, ‘\A’ matches at the start of a subject (even in multiline mode, unlike ‘^’), ‘\Z’ matches at the end of a subject or before a newline at the end, ‘\z’ matches only at end of a subject, and ‘\G’ matches at first matching position in a subject (which is subtly different from Perl's end of the previous match). ‘\C’ matches a single byte, including a newline, but its use is warned against. In UTF-8 mode, ‘\R’ matches any Unicode newline character (not just CR), and ‘\X’ matches any number of Unicode characters that form an extended Unicode sequence. ‘\X’, ‘\R’ and ‘\B’ cannot be used inside a character class (with PCRE1, they are treated as characters ‘X’, ‘R’ and ‘B’; with PCRE2 they cause an error).

A hyphen (minus) inside a character class is treated as a range, unless it is first or last character in the class definition. It can be quoted to represent the hyphen literal (‘\-’). PCRE1 allows an unquoted hyphen at some other locations inside a character class where it cannot represent a valid range, but PCRE2 reports an error in such cases.

In UTF-8 mode, some Unicode properties may be supported via ‘\p{xx}’ and ‘\P{xx}’ which match characters with and without property ‘xx’ respectively. For a list of supported properties see the PCRE documentation, but for example ‘Lu’ is ‘upper case letter’ and ‘Sc’ is ‘currency symbol’. Note that properties such as ‘\w’, ‘\W’, ‘\d’, ‘\D’, ‘\s’, ‘\S’, ‘\b’ and ‘\B’ by default do not refer to full Unicode, but one can override this by starting a pattern with ‘(*UCP)’ (which comes with a performance penalty). (This support depends on the PCRE library being compiled with ‘Unicode property support’ which can be checked via pcre_config. PCRE2 when compiled with Unicode support always supports also Unicode properties.)

The sequence ‘(?#’ marks the start of a comment which continues up to the next closing parenthesis. Nested parentheses are not permitted. The characters that make up a comment play no part at all in the pattern matching.

If the extended option is set, an unescaped ‘#’ character outside a character class introduces a comment that continues up to the next newline character in the pattern.

The pattern ‘(?:...)’ groups characters just as parentheses do but does not make a backreference.

Patterns ‘(?=...)’ and ‘(?!...)’ are zero-width positive and negative lookahead assertions: they match if an attempt to match the ... forward from the current position would succeed (or not), but use up no characters in the string being processed. Patterns ‘(?<=...)’ and ‘(?<!...)’ are the lookbehind equivalents: they do not allow repetition quantifiers nor ‘\C’ in ....
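Because the assertions consume no characters, they are convenient for extracting or replacing text that is defined by its neighbours. A short sketch with made-up strings:

```r
x <- "10 USD, 20 EUR, 30 USD"

# Positive lookahead: numbers that are followed by " USD".
usd <- regmatches(x, gregexpr("[0-9]+(?= USD)", x, perl = TRUE))[[1]]
usd                                             # "10" "30"

# Negative lookahead: names that do not start with "tmp_".
keep <- grepl("^(?!tmp_)", c("tmp_x", "data"), perl = TRUE)
keep                                            # FALSE TRUE

# Positive lookbehind: replace each 'b' that follows an 'a'.
out <- gsub("(?<=a)b", "X", "abab", perl = TRUE)
out                                             # "aXaX"
```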

regexpr and gregexpr support ‘named capture’. If groups are named, e.g., "(?<first>[A-Z][a-z]+)" then the positions of the matches are also returned by name. (Named backreferences are not supported by sub.)
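With perl = TRUE the match object returned by regexpr carries the capture positions as attributes, which can be turned into substrings by hand. A sketch, using an arbitrary input string:

```r
x <- "Ada Lovelace"
m <- regexpr("(?<first>[A-Z][a-z]+) (?<last>[A-Z][a-z]+)", x, perl = TRUE)

# Start positions and lengths of each named group.
st  <- attr(m, "capture.start")
len <- attr(m, "capture.length")
nms <- attr(m, "capture.names")
nms                                      # "first" "last"

parts <- substring(x, st, st + len - 1)
names(parts) <- nms
parts[["first"]]                         # "Ada"
parts[["last"]]                          # "Lovelace"
```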

Atomic grouping, possessive quantifiers and conditional and recursive patterns are not covered here.

Author(s)

This help page is based on the TRE documentation and the POSIX standard, and the pcre2pattern man page from PCRE2 10.35.

See Also

grep, apropos, browseEnv, glob2rx, help.search, list.files, ls, strsplit and agrep.

The TRE regexp syntax.

The POSIX 1003.2 standard at https://pubs.opengroup.org/onlinepubs/9699919799/basedefs/V1_chap09.html.

The pcre2pattern or pcrepattern man page (found as part of https://www.pcre.org/original/pcre.txt), and details of Perl's own implementation at https://perldoc.perl.org/perlre.


Extract or Replace Matched Substrings

Description

Extract or replace matched substrings from match data obtained by regexpr, gregexpr, regexec or gregexec.

Usage

regmatches(x, m, invert = FALSE)

regmatches(x, m, invert = FALSE) <- value

Arguments

x

a character vector.

m

an object with match data.

invert

a logical: if TRUE, extract or replace the non-matched substrings.

value

an object with suitable replacement values for the matched or non-matched substrings (see Details).

Details

If invert is FALSE (default), regmatches extracts the matched substrings as specified by the match data. For vector match data (as obtained from regexpr), empty matches are dropped; for list match data, empty matches give empty components (zero-length character vectors).

If invert is TRUE, regmatches extracts the non-matched substrings, i.e., the strings are split according to the matches similar to strsplit (for vector match data, at most a single split is performed).

If invert is NA, regmatches extracts both non-matched and matched substrings, always starting and ending with a non-match (empty if the match occurred at the beginning or the end, respectively).

Note that the match data can be obtained from regular expression matching on a modified version of x with the same numbers of characters.

The replacement function can be used for replacing the matched or non-matched substrings. For vector match data, if invert is FALSE, value should be a character vector with length the number of matched elements in m. Otherwise, it should be a list of character vectors with the same length as m, each as long as the number of replacements needed. Replacement coerces values to character or list and generously recycles values as needed. Missing replacement values are not allowed.
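The simplest case of the rules above, vector match data with invert = FALSE, together with invert = NA on list match data, might look like this (the inputs are made up):

```r
x <- c("apple 12", "pear 7")
m <- regexpr("[0-9]+", x)

found <- regmatches(x, m)
found                                # "12" "7"

# One replacement value per matched element of m.
regmatches(x, m) <- c("N", "M")
x                                    # "apple N" "pear M"

# invert = NA interleaves non-matches and matches.
s <- "a1b2c"
both <- regmatches(s, gregexpr("[0-9]", s), invert = NA)
both[[1]]                            # "a" "1" "b" "2" "c"
```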

Value

For regmatches, a character vector with the matched substrings if m is a vector and invert is FALSE. Otherwise, a list with the matched or/and non-matched substrings.

For regmatches<-, the updated character vector.

Examples

x <- c("A and B", "A, B and C", "A, B, C and D", "foobar")
pattern <- "[[:space:]]*(,|and)[[:space:]]"
## Match data from regexpr()
m <- regexpr(pattern, x)
regmatches(x, m)
regmatches(x, m, invert = TRUE)
## Match data from gregexpr()
m <- gregexpr(pattern, x)
regmatches(x, m)
regmatches(x, m, invert = TRUE)

## Consider
x <- "John (fishing, hunting), Paul (hiking, biking)"
## Suppose we want to split at the comma (plus spaces) between the
## persons, but not at the commas in the parenthesized hobby lists.
## One idea is to "blank out" the parenthesized parts to match the
## parts to be used for splitting, and extract the persons as the
## non-matched parts.
## First, match the parenthesized hobby lists.
m <- gregexpr("\\([^)]*\\)", x)
## Create blank strings with given numbers of characters.
blanks <- function(n) strrep(" ", n)
## Create a copy of x with the parenthesized parts blanked out.
s <- x
regmatches(s, m) <- Map(blanks, lapply(regmatches(s, m), nchar))
s
## Compute the positions of the split matches (note that we cannot call
## strsplit() on x with match data from s).
m <- gregexpr(", *", s)
## And finally extract the non-matched parts.
regmatches(x, m, invert = TRUE)

## regexec() and gregexec() return overlapping ranges because the
## first match is the full match.  This conflicts with regmatches()<-
## and regmatches(..., invert=TRUE).  We can work-around by dropping
## the first match.
drop_first <- function(x) {
    if(!anyNA(x) && all(x > 0)) {
        ml <- attr(x, 'match.length')
        if(is.matrix(x)) x <- x[-1,] else x <- x[-1]
        attr(x, 'match.length') <- if(is.matrix(ml)) ml[-1,] else ml[-1]
    }
    x
}
m <- gregexec("(\\w+) \\(((?:\\w+(?:, )?)+)\\)", x)
regmatches(x, m)
try(regmatches(x, m, invert = TRUE))
regmatches(x, lapply(m, drop_first))
## invert=TRUE loses matrix structure because we are retrieving what
## is in between every sub-match
regmatches(x, lapply(m, drop_first), invert = TRUE)
y <- z <- x
## Notice **list**(...) on the RHS
regmatches(y, lapply(m, drop_first)) <- list(c("<NAME>", "<HOBBY-LIST>"))
y
regmatches(z, lapply(m, drop_first), invert = TRUE) <-
    list(sprintf("<%d>", 1:5))
z

## With `perl = TRUE` and `invert = FALSE` capture group names
## are preserved.  Collect functions and arguments in calls:
NEWS <- head(readLines(file.path(R.home(), 'doc', 'NEWS.2')), 100)
m <- gregexec("(?<fun>\\w+)\\((?<args>[^)]*)\\)", NEWS, perl = TRUE)
y <- regmatches(NEWS, m)
y[[16]]
## Make tabular, adding original line numbers
mdat <- as.data.frame(t(do.call(cbind, y)))
mdat <- cbind(mdat, line = rep(seq_along(y), lengths(y) / ncol(mdat)))
head(mdat)
NEWS[head(mdat[['line']])]

Remove Objects from a Specified Environment

Description

remove and rm are identical R functions that can be used to remove objects. These can be specified successively as character strings, or in the character vector list, or through a combination of both. All objects thus specified will be removed.

If envir is NULL then the currently active environment is searched first.

If inherits is TRUE then parents of the supplied environment are searched until a variable with the given name is encountered. A warning is printed for each variable that is not found.

Usage

remove(..., list = character(), pos = -1,
       envir = as.environment(pos), inherits = FALSE)

rm(..., list = character(), pos = -1,
       envir = as.environment(pos), inherits = FALSE)

Arguments

...

the objects to be removed, as names (unquoted) or character strings (quoted).

list

a character vector (or NULL) naming objects to be removed.

pos

where to do the removal. By default, uses the current environment. See ‘details’ for other possibilities.

envir

the environment to use. See ‘details’.

inherits

should the enclosing frames of the environment be inspected?

Details

The pos argument can specify the environment from which to remove the objects in any of several ways: as an integer (the position in the search list); as the character string name of an element in the search list; or as an environment (including using sys.frame to access the currently active function calls). The envir argument is an alternative way to specify an environment, but is primarily there for back compatibility.

It is not allowed to remove variables from the base environment and base namespace, nor from any environment which is locked (see lockEnvironment).

Earlier versions of R incorrectly claimed that supplying a character vector in ... removed the objects named in the character vector, but it removed the character vector. Use the list argument to specify objects via a character vector.
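
As a small illustration of the distinction drawn above, the following sketch removes objects from a scratch environment, first by unquoted name and then through the list argument. The names e, a, b, cc and doomed are made up for this example.

```r
## Scratch environment, so nothing in the workspace is touched.
e <- new.env()
assign("a", 1, envir = e)
assign("b", 2, envir = e)
assign("cc", 3, envir = e)

## Remove one object by (unquoted) name.
rm(a, envir = e)

## Remove the objects *named in* a character vector: use 'list'.
doomed <- c("b", "cc")
rm(list = doomed, envir = e)

remaining <- ls(e)   # names still present in 'e'
remaining
```

Passing the names through list leaves the vector doomed itself in place; only the objects it names are removed from e.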

References

Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988) The New S Language. Wadsworth & Brooks/Cole.

See Also

ls, objects

Examples

tmp <- 1:4
## work with tmp  and cleanup
rm(tmp)

## Not run:
## remove (almost) everything in the working environment.
## You will get no warning, so don't do this unless you are really sure.
rm(list = ls())

## End(Not run)

Replicate Elements of Vectors and Lists

Description

rep replicates the values in x. It is a generic function, and the (internal) default method is described here.

rep.int and rep_len are faster simplified versions for two common cases. Internally, they are generic, so methods can be defined for them (see InternalMethods).

Usage

rep(x, ...)

rep.int(x, times)

rep_len(x, length.out)

Arguments

x

a vector (of any mode including a list) or a factor or (for rep only) a POSIXct or POSIXlt or Date object; or an S4 object containing such an object.

...

further arguments to be passed to or from other methods. For the internal default method these can include:

times

an integer-valued vector giving the (non-negative) number of times to repeat each element if of length length(x), or to repeat the whole vector if of length 1. Negative or NA values are an error. A double vector is accepted, other inputs being coerced to an integer or double vector.

length.out

non-negative integer. The desired length of the output vector. Other inputs will be coerced to a double vector and the first element taken. Ignored if NA or invalid.

each

non-negative integer. Each element of x is repeated each times. Other inputs will be coerced to an integer or double vector and the first element taken. Treated as 1 if NA or invalid.

times, length.out

see ... above.

Details

The default behaviour is as if the call was

  rep(x, times = 1, length.out = NA, each = 1)

Normally just one of the additional arguments is specified, but if each is specified with either of the other two, its replication is performed first, and then that implied by times or length.out.

If times consists of a single integer, the result consists of the whole input repeated this many times. If times is a vector of the same length as x (after replication by each), the result consists of x[1] repeated times[1] times, x[2] repeated times[2] times and so on.

length.out may be given in place of times, in which case x is repeated as many times as is necessary to create a vector of this length. If both are given, length.out takes priority and times is ignored.

Non-integer values of times will be truncated towards zero. If times is a computed quantity it is prudent to add a small fuzz or use round. And analogously for each.

If x has length zero and length.out is supplied and is positive, the values are filled in using the extraction rules, that is by an NA of the appropriate class for an atomic vector (0 for raw vectors) and NULL for a list.
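
The two rules just described (length.out taking priority over times, and zero-length inputs being filled by the extraction rules) can be sketched as follows. The variable names r1 to r4 are only illustrative.

```r
## 'length.out' takes priority; 'times' is ignored here.
r1 <- rep(1:3, times = 2, length.out = 5)

## Zero-length inputs with a positive 'length.out' are filled
## as if by extraction beyond the end of the vector.
r2 <- rep(integer(0),   length.out = 3)  # integer NAs
r3 <- rep(character(0), length.out = 2)  # character NAs
r4 <- rep(list(),       length.out = 2)  # NULL elements

r1
r2
str(r4)
```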

Value

An object of the same type as x.

rep.int and rep_len return no attributes (except the class if returning a factor).

The default method of rep gives the result names (which will almost always contain duplicates) if x had names, but retains no other attributes.

Note

Function rep.int is a simple case which was provided as a separate function partly for S compatibility and partly for speed (especially when names can be dropped). The performance of rep has been improved since, but rep.int is still at least twice as fast when x has names.

The name rep.int long precedes making rep generic.

Function rep is a primitive, but (partial) matching of argument names is performed as for normal functions.

For historical reasons rep (only) works on NULL: the result is always NULL even when length.out is positive.

Although it has never been documented, these functions have always worked on expression vectors.

References

Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988) The New S Language. Wadsworth & Brooks/Cole.

See Also

seq, sequence, replicate.

Examples

rep(1:4, 2)
rep(1:4, each = 2)       # not the same.
rep(1:4, c(2,2,2,2))     # same as second.
rep(1:4, c(2,1,2,1))
rep(1:4, each = 2, length.out = 4)    # first 4 only.
rep(1:4, each = 2, length.out = 10)   # 8 integers plus two recycled 1's.
rep(1:4, each = 2, times = 3)         # length 24, 3 complete replications

rep(1, 40*(1-.8)) # length 7 on most platforms
rep(1, 40*(1-.8)+1e-7) # better

## replicate a list
fred <- list(happy = 1:10, name = "squash")
rep(fred, 5)

# date-time objects
x <- .leap.seconds[1:3]
rep(x, 2)
rep(as.POSIXlt(x), rep(2, 3))

## named factor
x <- factor(LETTERS[1:4]); names(x) <- letters[1:4]
x
rep(x, 2)
rep(x, each = 2)
rep.int(x, 2)  # no names
rep_len(x, 10)

Replace Values in a Vector

Description

replace replaces the values in x with indices given in list by those given in values. If necessary, the values in values are recycled.

Usage

replace(x, list, values)

Arguments

x

a vector.

list

an index vector.

values

replacement values.

Value

A vector with the values replaced.

Note

x is unchanged: remember to assign the result.
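
A minimal sketch of the point made in the note, using made-up vectors x, y and z: the replacement is returned as a new vector, and the original is left as it was.

```r
x <- c(10, 20, 30, 40)

## Replace by position; assign the result to keep it.
y <- replace(x, c(2, 4), 0)

## Replace by logical index; the single value is recycled.
z <- replace(x, x > 25, NA)

x   # still the original values
y
z
```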

References

Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988) The New S Language. Wadsworth & Brooks/Cole.


Reserved Words in R

Description

The reserved words in R's parser are

if else repeat while function for in next break

TRUE FALSE NULL Inf NaN NA NA_integer_ NA_real_ NA_complex_ NA_character_

... and ..1, ..2 etc, which are used to refer to arguments passed down from a calling function, see ....

Details

Reserved words outside quotes are always parsed to be references to the objects linked to in the ‘Description’, and hence they are not allowed as syntactic names (see make.names). They are allowed as non-syntactic names, e.g. inside backtick quotes.
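
The following sketch illustrates both halves of that statement. The names nm, lst and val are chosen for the example only.

```r
## make.names() alters reserved words so that they become syntactic.
nm <- make.names(c("if", "NULL", "ok"))
nm

## Inside backticks a reserved word is accepted as a
## (non-syntactic) name, here as list component names.
lst <- list(`if` = 1, `next` = 2)
val <- lst$`if`
val
```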


Reverse Elements

Description

rev provides a reversed version of its argument. It is a generic function with a default method for vectors and one for dendrograms.

Note that this is no longer needed (nor efficient) for obtaining vectors sorted into descending order, since that is now rather more directly achievable by sort(x, decreasing = TRUE).

Usage

rev(x)

Arguments

x

a vector or another object for which reversal is defined.

References

Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988) The New S Language. Wadsworth & Brooks/Cole.

See Also

seq, sort.

Examples

x <- c(1:5, 5:3)
## sort into descending order; first more efficiently:
stopifnot(sort(x, decreasing = TRUE) == rev(sort(x)))
stopifnot(rev(1:7) == 7:1)  #- don't need 'rev' here

Return the R Home Directory

Description

Return the R home directory, or the full path to a component of the R installation.

Usage

R.home(component="home")

Arguments

component

"home" gives the R home directory, other known values are "bin", "doc", "etc", "include", "modules" and "share" giving the paths to the corresponding parts of an R installation.

Details

The R home directory is the top-level directory of the R installation being run.

The R home directory is often referred to as R_HOME, and is the value of an environment variable of that name in an R session. It can be found outside an R session by R RHOME.

The paths to components often are subdirectories of R_HOME but need not be: "doc", "include" and "share" are not for some Linux binary installations of R.
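
Because of this, it is usually safer to ask for a component by name than to build the path by hand. A small sketch (doc_dir, guess and same are illustrative names):

```r
## Ask R for the component directly ...
doc_dir <- R.home("doc")

## ... rather than assuming it sits below the home directory.
guess <- file.path(R.home(), "doc")

## The two agree on many installations, but need not.
same <- identical(normalizePath(doc_dir, mustWork = FALSE),
                  normalizePath(guess,   mustWork = FALSE))
same
```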

Value

A character string giving the R home directory or path to a particular component. Normally the components are all subdirectories of the R home directory, but this need not be the case in a Unix-like installation.

The value for "modules" and on Windows "bin" is a sub-architecture-specific location. (This is not so for "etc", which may have sub-architecture-specific files as well as common ones.)

On a Unix-alike, the constructed paths are based on the current values of the environment variables R_HOME and where set R_SHARE_DIR, R_DOC_DIR and R_INCLUDE_DIR (these are set on startup and should not be altered).

On Windows the values of R.home() and R_HOME are switched to the 8.3 short form of path elements if required and if the Windows service to do that is enabled. The value of R_HOME is set to use forward slashes (since many package maintainers pass it unquoted to shells, for example in ‘Makefile’s).

See Also

commandArgs()[1] may provide related information.

Examples

## These result quite platform-dependently :
rbind(home = R.home(),
      bin  = R.home("bin")) # often the 'bin' sub directory of 'home'
                            # but not always ...
list.files(R.home("bin"))

Run Length Encoding

Description

Compute the lengths and values of runs of equal values in a vector, or the reverse operation.

Usage

rle(x)
inverse.rle(x, ...)

## S3 method for class 'rle'
print(x, digits = getOption("digits"), prefix = "", ...)

Arguments

x

a vector (atomic, not a list) for rle(); an object of class "rle" for inverse.rle().

...

further arguments; ignored here.

digits

number of significant digits for printing, see print.default.

prefix

character string, prepended to each printed line.

Details

‘vector’ is used in the sense of is.vector.

Missing values are regarded as unequal to the previous value, even if that is also missing.
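
A short sketch of what this means in practice, with a made-up input x: two adjacent NAs form two runs of length one, not one run of length two.

```r
x <- c(1, 1, NA, NA, 2)
r <- rle(x)

r$lengths   # each NA is its own run
r$values

## The encoding still round-trips.
back <- inverse.rle(r)
```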

inverse.rle() is the inverse function of rle(), reconstructing x from the runs.

Value

rle() returns an object of class "rle" which is a list with components:

lengths

an integer vector containing the length of each run.

values

a vector of the same length as lengths with the corresponding values.

inverse.rle() returns an atomic vector.

Examples

x <- rev(rep(6:10, 1:5))
rle(x)
## lengths [1:5]  5 4 3 2 1
## values  [1:5] 10 9 8 7 6

z <- c(TRUE, TRUE, FALSE, FALSE, TRUE, FALSE, TRUE, TRUE, TRUE)
rle(z)
rle(as.character(z))
print(rle(z), prefix = "..| ")

N <- integer(0)
stopifnot(x == inverse.rle(rle(x)),
          identical(N, inverse.rle(rle(N))),
          z == inverse.rle(rle(z)))

Rounding of Numbers

Description

ceiling takes a single numeric argument x and returns a numeric vector containing the smallest integers not less than the corresponding elements of x.

floor takes a single numeric argument x and returns a numeric vector containing the largest integers not greater than the corresponding elements of x.

trunc takes a single numeric argument x and returns a numeric vector containing the integers formed by truncating the values in x toward 0.

round rounds the values in its first argument to the specified number of decimal places (default 0). See ‘Details’ about “round to even” when rounding off a 5.

signif rounds the values in its first argument to the specified number of significant digits. Hence, for numeric x, signif(x, dig) is the same as round(x, dig - ceiling(log10(abs(x)))).

Usage

ceiling(x)
floor(x)
trunc(x, ...)

round(x, digits = 0, ...)
signif(x, digits = 6)

Arguments

x

a numeric vector. Or, for round and signif, a complex vector.

digits

integer indicating the number of decimal places (round) or significant digits (signif) to be used. For round, negative values are allowed (see ‘Details’).

...

arguments to be passed to methods.

Details

These are generic functions: methods can be defined for them individually or via the Math group generic.

Note that for rounding off a 5, the IEC 60559 standard (see also ‘IEEE 754’) is expected to be used, ‘go to the even digit’. Therefore round(0.5) is 0 and round(-1.5) is -2. However, this is dependent on OS services and on representation error (since e.g. 0.15 is not represented exactly, the rounding rule applies to the represented number and not to the printed number, and so round(0.15, 1) could be either 0.1 or 0.2).

Rounding to a negative number of digits means rounding to a power of ten, so for example round(x, digits = -2) rounds to the nearest hundred.
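
The effect of the digits argument and of the ‘go to the even digit’ rule can be sketched with values that are exactly representable, so that representation error does not interfere. The variable names are illustrative.

```r
x <- 1234.5678

r_hundred <- round(x, digits = -2)  # to the nearest hundred
r_two     <- round(x, digits = 2)   # to two decimal places

## Exact halves go to the even neighbour.
half <- round(c(0.5, 1.5, 2.5, -1.5))

r_hundred
r_two
half
```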

For signif the recognized values of digits are 1...22, and non-missing values are rounded to the nearest integer in that range. Each element of the vector is rounded individually, unlike printing.

These are all primitive functions.

S4 methods

These are all (internally) S4 generic.

ceiling, floor and trunc are members of the Math group generic. As an S4 generic, trunc has only one argument.

round and signif are members of the Math2 group generic.

Warning

The realities of computer arithmetic can cause unexpected results, especially with floor and ceiling. For example, we ‘know’ that floor(log(x, base = 8)) for x = 8 is 1, but 0 has been seen on an R platform. It is normally necessary to use a tolerance.
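
One possible way to apply such a tolerance is sketched below. The helper floor_tol and its default tol are hypothetical, not part of base R, and the right tolerance depends on how the input was computed.

```r
## Hypothetical helper: floor() after nudging the input upwards by
## a small relative tolerance (an assumption of this sketch).
floor_tol <- function(x, tol = sqrt(.Machine$double.eps)) {
    floor(x + tol * pmax(1, abs(x)))
}

v <- log(8, base = 8)                    # mathematically 1
f1 <- floor_tol(v)

## A value a couple of units in the last place below 3:
f2 <- floor_tol(3 - 4 * .Machine$double.eps)

## Values well inside an interval are unaffected.
f3 <- floor_tol(2.5)
```

The price of this approach is that a value which genuinely lies just below an integer (within the tolerance) is also moved up, so it suits computed quantities that are expected to be whole numbers.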

Rounding to decimal digits in binary arithmetic is non-trivial (when digits != 0) and may be surprising. Be aware that most decimal fractions are not exactly representable in binary double precision. In R 4.0.0, the algorithm for round(x, d), for d > 0, has been improved to measure and round “to nearest even”, contrary to earlier versions of R (or also to sprintf() or format() based rounding).

References

Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988) The New S Language. Wadsworth & Brooks/Cole.

The ISO/IEC/IEEE 60559:2011 standard is available for money from https://www.iso.org.

The IEEE 754:2008 standard is more openly documented, e.g., at https://en.wikipedia.org/wiki/IEEE_754.

See Also

as.integer. Package round's roundX() for several versions or implementations of rounding, including some previous and the current R version (as version = "3d.C").

Examples

round(.5 + -2:4) # IEEE / IEC rounding: -2  0  0  2  2  4  4
## (this is *good* behaviour -- do *NOT* report it as bug !)

( x1 <- seq(-2, 4, by = .5) )
round(x1) #-- IEEE / IEC rounding !
x1[trunc(x1) != floor(x1)]
x1[round(x1) != floor(x1 + .5)]
(non.int <- ceiling(x1) != floor(x1))

x2 <- pi * 100^(-1:3)
round(x2, 3)
signif(x2, 3)

Round / Truncate Date-Time Objects

Description

Round or truncate date-time objects.

Usage

## S3 method for class 'POSIXt'
round(x,
      units = c("secs", "mins", "hours", "days", "months", "years"))
## S3 method for class 'POSIXt'
trunc(x,
      units = c("secs", "mins", "hours", "days", "months", "years"),
      ...)

## S3 method for class 'Date'
round(x, ...)
## S3 method for class 'Date'
trunc(x,
      units = c("secs", "mins", "hours", "days", "months", "years"),
      ...)

Arguments

x

an object inheriting from "POSIXt" or "Date".

units

one of the units listed, a string. Can be abbreviated.

...

arguments to be passed to or from other methods, notably digits for round.

Details

The time is rounded or truncated to the second, minute, hour, day, month or year. Time zones are only relevant to days or more, when midnight in the current time zone is used.

For units arguments besides “months” and “years”, the methods for class "Date" are of little use except to remove fractional days.
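
A brief sketch of rounding versus truncating a date-time, using a fixed instant in UTC so that the local time zone does not affect the result. The names t0 and fmt are illustrative.

```r
t0  <- as.POSIXct("2024-06-15 10:29:31", tz = "UTC")
fmt <- "%Y-%m-%d %H:%M:%S"

r_min  <- format(round(t0, "mins"),  fmt)  # 31 s rounds up a minute
t_hour <- format(trunc(t0, "hours"), fmt)  # drops minutes and seconds
t_day  <- format(trunc(t0, "days"),  fmt)  # midnight in the time zone

r_min
t_hour
t_day
```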

Value

An object of class "POSIXlt" or "Date".

See Also

round for the generic function and default methods.

DateTimeClasses, Date

Examples

round(.leap.seconds + 1000, "hour")
trunc(Sys.time(), "day")
(timM <- trunc(Sys.time() -> St, "months")) # shows timezone
(datM <- trunc(Sys.Date() -> Sd, "months"))
(timY <- trunc(St, "years")) # + timezone
(datY <- trunc(Sd, "years"))
stopifnot(inherits(datM, "Date"), inherits(timM, "POSIXt"),
          substring(format(datM), 9,10) == "01", # first of month
          substring(format(datY), 6,10) == "01-01", # Jan 1
          identical(format(datM), format(timM)),
          identical(format(datY), format(timY)))

Row Indexes

Description

Returns a matrix of integers indicating their row number in a matrix-like object, or a factor indicating the row labels.

Usage

row(x, as.factor = FALSE)
.row(dim)

Arguments

x

a matrix-like object, that is one with a two-dimensional dim.

dim

a matrix dimension, i.e., an integer valued numeric vector of length two (with non-negative entries).

as.factor

a logical value indicating whether the value should be returned as a factor of row labels (created if necessary) rather than as numbers.

Value

An integer (or factor) matrix with the same dimensions as x and whose ij-th element is equal to i (or the i-th row label).

References

Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988) The New S Language. Wadsworth & Brooks/Cole.

See Also

col to get columns; slice.index for a general way to get slice indices in an array.

Examples

x <- matrix(1:12, 3, 4)
# extract the diagonal of a matrix - more slowly than diag(x)
dx <- x[row(x) == col(x)]
dx

# create an identity 5-by-5 matrix more slowly than diag(n = 5):
x <- matrix(0, nrow = 5, ncol = 5)
x[row(x) == col(x)] <- 1
x

(i34 <- .row(3:4))
stopifnot(identical(i34, .row(c(3,4)))) # 'dim' maybe "double"

Get and Set Row Names for Data Frames

Description

All data frames have row names, a character vector of length the number of rows with no duplicates nor missing values.

There are generic functions for getting and setting row names, with default methods for arrays. The description here is for the data.frame method.

`.rowNamesDF<-` is a (non-generic replacement) function to set row names for data frames, with extra argument make.names. This function only exists as a workaround, as we cannot easily change the row.names<- generic without breaking legacy code in existing packages.

Usage

row.names(x)
row.names(x) <- value
.rowNamesDF(x, make.names = FALSE) <- value

Arguments

x

object of class "data.frame", or any other class for which a method has been defined.

make.names

logical, i.e., one of FALSE, NA, TRUE, indicating what should happen if the specified row names, i.e., value, are invalid, e.g., duplicated or NA. The default (which is back compatible), FALSE, will signal an error, whereas NA will produce “automatic” row names and TRUE will call make.names(value, unique=TRUE) for constructing valid names.

value

an object to be coerced to character unless an integer vector. It should have (after coercion) the same length as the number of rows of x with no duplicated nor missing values. NULL is also allowed: see ‘Details’.

Details

A data frame has (by definition) a vector of row names which has length the number of rows in the data frame, and contains neither missing nor duplicated values. Where a row names sequence has been added by the software to meet this requirement, they are regarded as ‘automatic’.

Row names are currently allowed to be integer or character, but for backwards compatibility (with R <= 2.4.0) row.names will always return a character vector. (Use attr(x, "row.names") if you need to retrieve an integer-valued set of row names.)

Using NULL for the value resets the row names to seq_len(nrow(x)), regarded as ‘automatic’.
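
The behaviour described above can be sketched on a small made-up data frame. The names df, df_na, df_fix, chr and int are illustrative.

```r
df <- data.frame(x = 1:3)
row.names(df) <- c("a", "b", "c")
chr <- row.names(df)              # always a character vector

## make.names = NA: invalid (here duplicated) names are replaced
## by automatic row names instead of signalling an error.
df_na <- df
.rowNamesDF(df_na, make.names = NA) <- c("a", "a", "b")

## make.names = TRUE: the duplicates are repaired.
df_fix <- df
.rowNamesDF(df_fix, make.names = TRUE) <- c("a", "a", "b")

## NULL resets to automatic row names; attr() shows them as integer.
row.names(df) <- NULL
int <- attr(df, "row.names")

row.names(df_na)
row.names(df_fix)
int
```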

Value

row.names returns a character vector.

row.names<- returns a data frame with the row names changed.

Note

row.names is similar to rownames for arrays, and it has a method that calls rownames for an array argument.

Row names of the form 1:n for n > 2 are stored internally in a compact form, which might be seen from C code or by deparsing but never via row.names or attr(x, "row.names"). Additionally, some names of this sort are marked as ‘automatic’ and handled differently by as.matrix and data.matrix (and potentially other functions). (All zero-row data frames are regarded as having automatic row names.)

References

Chambers, J. M. (1992) Data for models. Chapter 3 of Statistical Models in S, eds J. M. Chambers and T. J. Hastie, Wadsworth & Brooks/Cole.

See Also

data.frame, rownames, names.

.row_names_info for the internal representations.

Examples

## To illustrate the note:
df <- data.frame(x = c(TRUE, FALSE, NA, NA), y = c(12, 34, 56, 78))
row.names(df) <- 1 : 4
attr(df, "row.names") #> 1:4
deparse(df) # or dput(df)
##--> c(NA, 4L) : Compact storage, *not* regarded as automatic.

row.names(df) <- NULL
attr(df, "row.names") #> 1:4
deparse(df) # or dput(df) -- shows
##--> c(NA, -4L) : Compact storage, regarded as automatic.

Row and Column Names

Description

Retrieve or set the row or column names of a matrix-like object.

Usage

rownames(x, do.NULL = TRUE, prefix = "row")
rownames(x) <- value

colnames(x, do.NULL = TRUE, prefix = "col")
colnames(x) <- value

Arguments

x

a matrix-like R object, with at least two dimensions for colnames.

do.NULL

logical. If FALSE and names are NULL, names are created.

prefix

for created names.

value

a valid value for that component of dimnames(x). For a matrix or array this is either NULL or a character vector of non-zero length equal to the appropriate dimension.

Details

The extractor functions try to do something sensible for any matrix-like object x. If the object has dimnames the first component is used as the row names, and the second component (if any) is used for the column names. For a data frame, rownames and colnames eventually call row.names and names respectively, but the latter are preferred.

If do.NULL is FALSE, a character vector (of length NROW(x) or NCOL(x)) is returned in any case, prepending prefix to simple numbers, if there are no dimnames or the corresponding component of the dimnames is NULL.

The replacement methods for arrays/matrices coerce vector and factor values of value to character, but do not dispatch methods for as.character.

For a data frame, value for rownames should be a character vector of non-duplicated and non-missing names (this is enforced), and for colnames a character vector of (preferably) unique syntactically-valid names. In both cases, value will be coerced by as.character, and setting colnames will convert the row names to character.

Note

If the replacement versions are called on a matrix without any existing dimnames, they will add suitable dimnames. But constructions such as

    rownames(x)[3] <- "c"

may not work unless x already has dimnames, since this will create a length-3 value from the NULL value of rownames(x).
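
One way to sidestep the problem, sketched here on a made-up matrix m, is to create a full set of row names first (via do.NULL = FALSE) and only then replace a single element.

```r
m <- matrix(1:6, nrow = 3)
had_names <- !is.null(rownames(m))   # FALSE: no dimnames yet

## Create complete row names first, then change one of them.
rownames(m) <- rownames(m, do.NULL = FALSE, prefix = "r")
rownames(m)[3] <- "c"

rn <- rownames(m)
rn
```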

See Also

dimnames, case.names, variable.names.

Examples

m0 <- matrix(NA, 4, 0)
rownames(m0)

m2 <- cbind(1, 1:4)
colnames(m2, do.NULL = FALSE)
colnames(m2) <- c("x","Y")
rownames(m2) <- rownames(m2, do.NULL = FALSE, prefix = "Obs.")
m2

Give Column Sums of a Matrix or Data Frame, Based on a Grouping Variable

Description

Compute column sums across rows of a numeric matrix-like object for each level of a grouping variable. rowsum is generic, with a method for data frames and a default method for vectors and matrices.

Usage

rowsum(x, group, reorder = TRUE, ...)

## S3 method for class 'data.frame'
rowsum(x, group, reorder = TRUE, na.rm = FALSE, ...)

## Default S3 method:
rowsum(x, group, reorder = TRUE, na.rm = FALSE, ...)

Arguments

x

a matrix, data frame or vector of numeric data. Missing values are allowed. A numeric vector will be treated as a column vector.

group

a vector or factor giving the grouping, with one element per row of x. Missing values will be treated as another group and a warning will be given.

reorder

if TRUE, then the result will be in order of sort(unique(group)), if FALSE, it will be in the order that groups were encountered.

na.rm

logical (TRUE or FALSE). Should NA (including NaN) values be discarded?

...

other arguments to be passed to or from methods.

Details

The default is to reorder the rows to agree with tapply as in the example below. Reordering should not add noticeably to the time except when there are very many distinct values of group and x has few columns.

The original function was written by Terry Therneau, but this is a new implementation using hashing that is much faster for large matrices.

To sum over all the rows of a matrix (i.e., a single group) use colSums, which should be even faster.

For integer arguments, over/underflow in forming the sum results in NA.

Value

A matrix or data frame containing the sums. There will be one row per unique value of group.

See Also

tapply, aggregate, rowSums

Examples

require(stats)

x <- matrix(runif(100), ncol = 5)
group <- sample(1:8, 20, TRUE)
(xsum <- rowsum(x, group))
## Slower versions
tapply(x, list(group[row(x)], col(x)), sum)
t(sapply(split(as.data.frame(x), group), colSums))
aggregate(x, list(group), sum)[-1]

Register S3 Methods

Description

Register S3 methods in R scripts.

Usage

.S3method(generic, class, method)

Arguments

generic

a character string naming an S3 generic function.

class

a character string naming an S3 class.

method

a character string or function giving the S3 method to be registered. If not given, the function named generic.class is used.

Details

This function should only be used in R scripts: for package code, one should use the corresponding ‘S3method’ ‘NAMESPACE’ directive.

Examples

## Create a generic function and register a method for objects
## inheriting from class 'cls':
gen <- function(x) UseMethod("gen")
met <- function(x) writeLines("Hello world.")
.S3method("gen", "cls", met)
## Create an object inheriting from class 'cls', and call the
## generic on it:
x <- structure(123, class = "cls")
gen(x)

Random Samples and Permutations

Description

sample takes a sample of the specified size from the elements of x using either with or without replacement.

Usage

sample(x, size, replace = FALSE, prob = NULL)

sample.int(n, size = n, replace = FALSE, prob = NULL,
           useHash = (n > 1e+07 && !replace && is.null(prob) && size <= n/2))

Arguments

x

either a vector of one or more elements from which to choose, or a positive integer. See ‘Details.’

n

a positive number, the number of items to choose from. See ‘Details.’

size

a non-negative integer giving the number of items to choose.

replace

should sampling be with replacement?

prob

a vector of probability weights for obtaining the elements of the vector being sampled.

useHash

logical indicating if the hash-version of the algorithm should be used. Can only be used for replace = FALSE, prob = NULL, and size <= n/2, and really should be used for large n, as useHash=FALSE will use memory proportional to n.

Details

If x has length 1, is numeric (in the sense of is.numeric) and x >= 1, sampling via sample takes place from 1:x. Note that this convenience feature may lead to undesired behaviour when x is of varying length in calls such as sample(x). See the examples.

Otherwise x can be any R object for which length and subsetting by integers make sense: S3 or S4 methods for these operations will be dispatched as appropriate.

For sample the default for size is the number of items inferred from the first argument, so that sample(x) generates a random permutation of the elements of x (or 1:x).

It is allowed to ask for size = 0 samples with n = 0 or a length-zero x, but otherwise n > 0 or positive length(x) is required.

Non-integer positive numerical values of n or x will be truncated to the next smallest integer, which has to be no larger than .Machine$integer.max.

The optional prob argument can be used to give a vector of weights for obtaining the elements of the vector being sampled. They need not sum to one, but they should be non-negative and not all zero. If replace is true, Walker's alias method (Ripley, 1987) is used when there are more than 200 reasonably probable values: this gives results incompatible with those from R < 2.2.0.

If replace is false, these probabilities are applied sequentially, that is the probability of choosing the next item is proportional to the weights amongst the remaining items. The number of nonzero weights must be at least size in this case.
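
The requirement on the number of nonzero weights can be sketched as follows. The vectors items and w are made up for the example, and tryCatch is used only to capture the error as a string.

```r
set.seed(1)
items <- c("a", "b", "c", "d")
w     <- c(0.5, 0.5, 0, 0)     # only two nonzero weights

## Two items can be drawn without replacement: always "a" and "b",
## in some order.
picked <- sample(items, size = 2, prob = w)

## Asking for three is an error, as only two weights are nonzero.
too_many <- tryCatch(sample(items, size = 3, prob = w),
                     error = function(e) "error")

picked
too_many
```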

sample.int is a bare interface in which both n and size must be supplied as integers.

Argument n can be larger than the largest integer of type integer, up to the largest representable integer in type double. Only uniform sampling is supported. Two random numbers are used to ensure uniform sampling of large integers.

Value

For sample a vector of length size with elements drawn from either x or from the integers 1:x.

For sample.int, an integer vector of length size with elements from 1:n, or a double vector if n >= 2^31.

References

Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988) The New S Language. Wadsworth & Brooks/Cole.

Ripley, B. D. (1987) Stochastic Simulation. Wiley.

See Also

RNGkind(sample.kind = ..) about random number generation, notably the change of sample() results with R version 3.6.0.

CRAN package sampling for other methods of weighted sampling without replacement.

Examples

x <- 1:12
# a random permutation
sample(x)
# bootstrap resampling -- only if length(x) > 1 !
sample(x, replace = TRUE)

# 100 Bernoulli trials
sample(c(0,1), 100, replace = TRUE)

## More careful bootstrapping --  Consider this when using sample()
## programmatically (i.e., in your function or simulation)!

# sample()'s surprise -- example
x <- 1:10
    sample(x[x >  8]) # length 2
    sample(x[x >  9]) # oops -- length 10!
    sample(x[x > 10]) # length 0

## safer version:
resample <- function(x, ...) x[sample.int(length(x), ...)]
resample(x[x >  8]) # length 2
resample(x[x >  9]) # length 1
resample(x[x > 10]) # length 0

## R 3.0.0 and later
sample.int(1e10, 12, replace = TRUE)
sample.int(1e10, 12) # not that there is much chance of duplicates

Save R Objects

Description

save writes an external representation ofR objects to thespecified file. The objects can be read back from the file at a laterdate by using the functionload orattach(ordata in some cases).

save.image() is just a short-cut for ‘save my currentworkspace’, i.e.,save(list = ls(all.names = TRUE), file = ".RData", envir = .GlobalEnv).It is also what happens withq("yes").

Usage

save(..., list= character(),     file= stop("'file' must be specified"),     ascii=FALSE, version=NULL, envir= parent.frame(),     compress= isTRUE(!ascii), compression_level,     eval.promises=TRUE, precheck=TRUE)save.image(file=".RData", version=NULL, ascii=FALSE,           compress=!ascii, safe=TRUE)

Arguments

...

the names of the objects to be saved (as symbols orcharacter strings).

list

a character vector (orNULL) containing the names of objects to besaved.

file

a (writable binary-mode)connection or the name of thefile where the data will be saved (whentilde expansionis done). Must be a file name forsave.image orversion = 1.

ascii

if TRUE, an ASCII representation of the data is written. The default value of ascii is FALSE which leads to a binary file being written. If NA and version >= 2, a different ASCII representation is used which writes double/complex numbers as binary fractions.

version

the workspace format version to use. NULL specifies the current default format (3). Version 1 was the default from R 0.99.0 to R 1.3.1 and version 2 from R 1.4.0 to 3.5.0. Version 3 is supported from R 3.5.0.

envir

environment to search for objects to be saved.

compress

logical or character string specifying whether saving to a named file is to use compression. TRUE corresponds to gzip compression, and character strings "gzip", "bzip2" or "xz" specify the type of compression. Ignored when file is a connection and for workspace format version 1.

compression_level

integer: the level of compression to be used. Defaults to 6 for gzip compression and to 9 for bzip2 or xz compression. See the help for file for possible values and their merits.

eval.promises

logical: should objects which are promises be forced before saving?

precheck

logical: should the existence of the objects be checked before starting to save (and in particular before opening the file/connection)? Does not apply to version 1 saves.

safe

logical. If TRUE, a temporary file is used for creating the saved workspace. The temporary file is renamed to file if the save succeeds. This preserves an existing workspace file if the save fails, but at the cost of using extra disk space during the save.

Details

The names of the objects specified either as symbols (or character strings) in ... or as a character vector in list are used to look up the objects from environment envir. By default promises are evaluated, but if eval.promises = FALSE promises are saved (together with their evaluation environments). (Promises embedded in objects are always saved unevaluated.)
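The name lookup described above can be sketched as follows. The object names (alpha, beta) and the environments are hypothetical, and a temporary file is used so that the working directory is left untouched.

```r
# Hypothetical objects living in their own environment, not in the workspace.
e <- new.env()
assign("alpha", 1:3, envir = e)
assign("beta", "text", envir = e)

f <- tempfile(fileext = ".RData")
# The names are given as a character vector and looked up in 'e'.
save(list = c("alpha", "beta"), file = f, envir = e)

# Load into a fresh environment so nothing in the workspace is overwritten.
target <- new.env()
restored <- load(f, envir = target)  # load() returns the names it created
unlink(f)
```

Saving from and loading into explicit environments is one way to avoid the name clashes that can occur when load() restores objects under their original names.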

All R platforms use the XDR (big-endian) representation of C ints and doubles in binary save-d files, and these are portable across all R platforms.

ASCII saves used to be useful for moving data between platforms but are now mainly of historical interest. They can be more compact than binary saves where compression is not used, but are almost always slower to both read and write: binary saves compress much better than ASCII ones. Further, decimal ASCII saves may not restore double/complex values exactly, and what value is restored may depend on the R platform.

Default values for the ascii, compress, safe and version arguments can be modified with the "save.defaults" option (used both by save and save.image), see also the ‘Examples’ section. If a "save.image.defaults" option is set it is used in preference to "save.defaults" for function save.image (which allows this to have different defaults). In addition, compression_level can be part of the "save.defaults" option.

A connection that is not already open will be opened in mode "wb". Supplying a connection which is open and not in binary mode gives an error.

Compression

Large files can be reduced considerably in size by compression. A particular 46MB R object was saved as 35MB without compression in 2 seconds, 22MB with gzip compression in 8 secs, 19MB with bzip2 compression in 13 secs and 9.4MB with xz compression in 40 secs. The load times were 1.3, 2.8, 5.5 and 5.7 seconds respectively. These results are indicative, but the relative performances do depend on the actual file: xz compressed unusually well here.

It is possible to compress later (with gzip, bzip2 or xz) a file saved with compress = FALSE: the effect is the same as saving with compression. Also, a saved file can be uncompressed and re-compressed under a different compression scheme (and see resaveRdaFiles for a way to do so from within R).
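A small sketch of how the compress argument might be compared on one's own data. The object below is hypothetical and highly repetitive, so the sizes it produces say nothing about real workloads; it also assumes an R build with xz support.

```r
obj <- rep(c(1.5, 2.5, 3.5), 1e4)  # hypothetical, very compressible data

files <- c(none = tempfile(), gzip = tempfile(), xz = tempfile())
save(obj, file = files[["none"]], compress = FALSE)
save(obj, file = files[["gzip"]], compress = "gzip")
save(obj, file = files[["xz"]],   compress = "xz")

sizes <- setNames(file.size(files), names(files))
print(sizes)  # bytes on disk for each scheme

# load() recognises the compression scheme without being told.
env <- new.env()
load(files[["xz"]], envir = env)
unlink(files)
```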

Parallel compression

That file can be a connection can be exploited to make use of an external parallel compression utility such as pigz (https://zlib.net/pigz/) or pbzip2 (https://launchpad.net/pbzip2) via a pipe connection. For example, using 8 threads,

    con <- pipe("pigz -p8 > fname.gz", "wb")
    save(myObj, file = con); close(con)

    con <- pipe("pbzip2 -p8 -9 > fname.bz2", "wb")
    save(myObj, file = con); close(con)

    con <- pipe("xz -T8 -6 -e > fname.xz", "wb")
    save(myObj, file = con); close(con)

where the last requires xz 5.1.1 or later built with support for multiple threads (and parallel compression is only effective for large objects: at level 6 it will compress in serialized chunks of 12MB).

Warnings

The ... arguments only give the names of the objects to be saved: they are searched for in the environment given by the envir argument, and the actual objects given as arguments need not be those found.

Saved R objects are binary files, even those saved with ascii = TRUE, so ensure that they are transferred without conversion of end-of-line markers and of 8-bit characters. The lines are delimited by LF on all platforms.

Although the default version was not changed between R 1.4.0 and R 3.4.4 nor since R 3.5.0, this does not mean that saved files are necessarily backwards compatible. You will be able to load a saved image into an earlier version of R which supports its version unless use is made of later additions (for example for version 2, raw vectors, external pointers and some S4 objects).

One such ‘later addition’ was long vectors, introduced in R 3.0.0 and loadable only on 64-bit platforms.

Loading files saved with ascii = NA requires a C99-compliant C function sscanf: this is a problem on Windows, first worked around in R 3.1.2: version-2 files in that format should be readable in earlier versions of R on all other platforms.

Note

For saving single R objects, saveRDS() is mostly preferable to save(), notably because of the functional nature of readRDS(), as opposed to load().
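The difference can be illustrated with a minimal sketch; the object and file below are hypothetical. readRDS() returns the object, so the caller chooses the name it is bound to, whereas load() restores objects under the names they were saved with.

```r
fit <- list(coef = c(a = 1, b = 2), note = "hypothetical object")

f <- tempfile(fileext = ".rds")
saveRDS(fit, f)
copy <- readRDS(f)  # the result is assigned like any other function value
unlink(f)
```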

The most common reason for failure is lack of write permission in the current directory. For save.image and for saving at the end of a session this will be shown by messages like

    Error in gzfile(file, "wb") : unable to open connection
    In addition: Warning message:
    In gzfile(file, "wb") :
      cannot open compressed file '.RDataTmp',
      probable reason 'Permission denied'

See Also

dput, dump, load, data.

For other interfaces to the underlying serialization format, see serialize and saveRDS.

Examples

x <- stats::runif(20)
y <- list(a = 1, b = TRUE, c = "oops")
save(x, y, file = "xy.RData")
save.image() # creating ".RData" in current working directory
unlink("xy.RData")

# set save defaults using option:
options(save.defaults = list(ascii = TRUE, safe = FALSE))
save.image() # creating ".RData"
if(interactive()) withAutoprint({
   file.info(".RData")
   readLines(".RData", n = 7) # first 7 lines; first starts w/ "RDA"..
})
unlink(".RData")

Scaling and Centering of Matrix-like Objects

Description

scale is a generic function whose default method centers and/or scales the columns of a numeric matrix.

Usage

scale(x, center = TRUE, scale = TRUE)

Arguments

x

a numeric matrix (like object).

center

either a logical value or numeric-alike vector of length equal to the number of columns of x, where ‘numeric-alike’ means that as.numeric(.) will be applied successfully if is.numeric(.) is not true.

scale

either a logical value or a numeric-alike vector of length equal to the number of columns of x.

Details

The value of center determines how column centering is performed. If center is a numeric-alike vector with length equal to the number of columns of x, then each column of x has the corresponding value from center subtracted from it. If center is TRUE then centering is done by subtracting the column means (omitting NAs) of x from their corresponding columns, and if center is FALSE, no centering is done.

The value of scale determines how column scaling is performed (after centering). If scale is a numeric-alike vector with length equal to the number of columns of x, then each column of x is divided by the corresponding value from scale. If scale is TRUE then scaling is done by dividing the (centered) columns of x by their standard deviations if center is TRUE, and the root mean square otherwise. If scale is FALSE, no scaling is done.

The root-mean-square for a (possibly centered) column is defined as \sqrt{\sum(x^2)/(n-1)}, where x is a vector of the non-missing values and n is the number of non-missing values. In the case center = TRUE, this is the same as the standard deviation, but in general it is not. (To scale by the standard deviations without centering, use scale(x, center = FALSE, scale = apply(x, 2, sd, na.rm = TRUE)).)
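The root-mean-square definition can be checked numerically. In the sketch below the matrix and the helper rms() are hypothetical and only restate the formula given above.

```r
x <- matrix(c(1, 2, 3, 4, 10, 20, 30, 40), ncol = 2)

# Hypothetical helper: root mean square with an (n - 1) denominator.
rms <- function(v) {
  v <- v[!is.na(v)]
  sqrt(sum(v^2) / (length(v) - 1))
}

# Without centering, the default scaling divides by the root mean square.
s1 <- scale(x, center = FALSE)
rms_used <- attr(s1, "scaled:scale")

# With centering, the same formula reduces to the standard deviation.
s2 <- scale(x)
sd_used <- attr(s2, "scaled:scale")
```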

Value

For scale.default, the centered, scaled matrix. The numeric centering and scalings used (if any) are returned as attributes "scaled:center" and "scaled:scale".

References

Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988) The New S Language. Wadsworth & Brooks/Cole.

See Also

sweep which allows centering (and scaling) with arbitrary statistics.

For working with the scale of a plot, see par.

Examples

require(stats)
x <- matrix(1:10, ncol = 2)
(centered.x <- scale(x, scale = FALSE))
cov(centered.scaled.x <- scale(x)) # all 1

Read Data Values

Description

Read data into a vector or list from the console or file.

Usage

scan(file = "", what = double(), nmax = -1, n = -1, sep = "",
     quote = if(identical(sep, "\n")) "" else "'\"", dec = ".",
     skip = 0, nlines = 0, na.strings = "NA",
     flush = FALSE, fill = FALSE, strip.white = FALSE,
     quiet = FALSE, blank.lines.skip = TRUE, multi.line = TRUE,
     comment.char = "", allowEscapes = FALSE,
     fileEncoding = "", encoding = "unknown", text, skipNul = FALSE)

Arguments

file

the name of a file to read data values from. If the specified file is "", then input is taken from the keyboard (or whatever stdin() reads if input is redirected or R is embedded). (In this case input can be terminated by a blank line or an EOF signal, ‘Ctrl-D’ on Unix and ‘Ctrl-Z’ on Windows.)

Otherwise, the file name is interpreted relative to the current working directory (given by getwd()), unless it specifies an absolute path. Tilde-expansion is performed where supported. When running R from a script, file = "stdin" can be used to refer to the process's stdin file stream.

This can be a compressed file (see file).

Alternatively, file can be a connection, which will be opened if necessary, and if so closed at the end of the function call. Whatever mode the connection is opened in, any of LF, CRLF or CR will be accepted as the EOL marker for a line and so will match sep = "\n".

file can also be a complete URL. (For the supported URL schemes, see the ‘URLs’ section of the help for url.)

To read a data file not in the current encoding (for example a Latin-1 file in a UTF-8 locale or conversely) use a file connection setting its encoding argument (or scan's fileEncoding argument).

what

the type of what gives the type of data to be read. (Here ‘type’ is used in the sense of typeof.) The supported types are logical, integer, numeric, complex, character, raw and list. If what is a list, it is assumed that the lines of the data file are records each containing length(what) items (‘fields’) and the list components should have elements which are one of the first six (atomic) types listed or NULL, see section ‘Details’ below.

nmax

the maximum number of data values to be read, or if what is a list, the maximum number of records to be read. If omitted or not positive or an invalid value for an integer (and nlines is not set to a positive value), scan will read to the end of file.

n

integer: the maximum number of data values to be read, defaulting to no limit. Invalid values will be ignored.

sep

by default, scan expects to read ‘white-space’ delimited input fields. Alternatively, sep can be used to specify a character which delimits fields. A field is always delimited by an end-of-line marker unless it is quoted.

If specified this should be the empty character string (the default) or NULL or a character string containing just one single-byte character.

quote

the set of quoting characters as a single character string or NULL. In a multibyte locale the quoting characters must be ASCII (single-byte).

dec

decimal point character. This should be a character string containing just one single-byte character. (NULL and a zero-length character vector are also accepted, and taken as the default.)

skip

the number of lines of the input file to skip before beginning to read data values.

nlines

if positive, the maximum number of lines of data to be read.

na.strings

character vector. Elements of this vector are to be interpreted as missing (NA) values. Blank fields are also considered to be missing values in logical, integer, numeric and complex fields. Note that the test happens after white space is stripped from the input (if enabled), so na.strings values may need their own white space stripped in advance.

flush

logical: if TRUE, scan will flush to the end of the line after reading the last of the fields requested. This allows putting comments after the last field, but precludes putting more than one record on a line.

fill

logical: if TRUE, scan will implicitly add empty fields to any lines with fewer fields than implied by what.

strip.white

vector of logical value(s) corresponding to items in the what argument. It is used only when sep has been specified, and allows the stripping of leading and trailing ‘white space’ from character fields (other fields are always stripped). Note: white space inside quoted strings is not stripped.

If strip.white is of length 1, it applies to all fields; otherwise, if strip.white[i] is TRUE and the i-th field is of mode character (because what[i] is) then the leading and trailing unquoted white space from field i is stripped.
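A brief sketch of strip.white with an explicit separator; the input string is hypothetical.

```r
padded <- " a , b ,c "

# With sep specified, strip.white = TRUE removes the leading and trailing
# white space around each unquoted character field.
stripped <- scan(text = padded, what = "", sep = ",",
                 strip.white = TRUE, quiet = TRUE)
print(stripped)
```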

quiet

logical: if FALSE (default), scan() will print a line, saying how many items have been read.

blank.lines.skip

logical: if TRUE blank lines in the input are ignored, except when counting skip and nlines.

multi.line

logical. Only used if what is a list. If FALSE, all of a record must appear on one line (but more than one record can appear on a single line). Note that using fill = TRUE implies that a record will be terminated at the end of a line.

comment.char

character: a character vector of length one containing a single character or an empty string. Use "" to turn off the interpretation of comments altogether (the default).

allowEscapes

logical. Should C-style escapes such as ‘\n’ be processed or read verbatim (the default)? Note that if not within quotes these could be interpreted as a delimiter (but not as a comment character).

The escapes which are interpreted are the control characters ‘\a, \b, \f, \n, \r, \t, \v’ and octal and hexadecimal representations like ‘\040’ and ‘\0x2A’. Any other escaped character is treated as itself, including backslash. Note that Unicode escapes (starting ‘\u’ or ‘\U’: see Quotes) are never processed.

fileEncoding

character string: if non-empty declares the encoding used on a file (not a connection nor the keyboard) so the character data can be re-encoded. See the ‘Encoding’ section of the help for file, and the ‘R Data Import/Export Manual’.

encoding

encoding to be assumed for input strings. If the value is "latin1" or "UTF-8" it is used to mark character strings as known to be in Latin-1 or UTF-8: it is not used to re-encode the input (see fileEncoding). See also ‘Details’.

text

character string: if file is not supplied and this is, then data are read from the value of text via a text connection.

skipNul

logical: should NULs be skipped when reading character fields?

Details

The value of what can be a list of types, in which case scan returns a list of vectors with the types given by the types of the elements in what. This provides a way of reading columnar data. If any of the types is NULL, the corresponding field is skipped (but a NULL component appears in the result).
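Reading columnar records with a list-valued what might look like the following sketch. The field names (name, value) and the inline data are hypothetical; the third field is skipped by giving NULL as its type.

```r
records <- "a 1 x\nb 2 y"

res <- scan(text = records,
            what = list(name = "", value = 0, NULL),
            quiet = TRUE)

res$name   # character column
res$value  # numeric column
```

The names of the result are taken from what, which is a convenient way to label the columns at the point of reading.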

The type of what or its components can be one of the six atomic vector types or NULL (see is.atomic).

‘White space’ is defined for the purposes of this function as one or more contiguous characters from the set space, horizontal tab, carriage return and line feed (aka “newline”, "\n"). It does not include form feed nor vertical tab, but in Latin-1 and Windows 8-bit locales (but not UTF-8) 'space' includes the non-breaking space ‘"\xa0"’.

Empty numeric fields are always regarded as missing values. Empty character fields are scanned as empty character vectors, unless na.strings contains "" when they are regarded as missing values.

The allowed input for a numeric field is optional whitespace, followed by either NA or an optional sign followed by a decimal or hexadecimal constant (see NumericConstants), or NaN, Inf or infinity (ignoring case). Out-of-range values are recorded as Inf, -Inf or 0.

For an integer field the allowed input is optional whitespace, followed by either NA or an optional sign and one or more digits (‘0-9’): all out-of-range values are converted to NA_integer_.

If sep is the default (""), the character ‘\’ in a quoted string escapes the following character, so quotes may be included in the string by escaping them.

If sep is non-default, the fields may be quoted in the style of ‘.csv’ files where separators inside quotes ('' or "") are ignored and quotes may be put inside strings by doubling them. However, if sep = "\n" it is assumed by default that one wants to read entire lines verbatim.
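The ‘.csv’-style quoting rules can be sketched with a small inline example; the data are hypothetical.

```r
# Two lines of comma-separated text: the first field of line 1 contains a
# separator inside quotes, the first field of line 2 contains doubled quotes.
csv_text <- '"a,b",c\n"say ""hi""",d'

fields <- scan(text = csv_text, what = "", sep = ",", quiet = TRUE)
print(fields)
```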

Quoting is only interpreted in character fields and in NULL fields (which might be skipping character fields).

Note that since sep is a separator and not a terminator, reading a file by scan("foo", sep = "\n", blank.lines.skip = FALSE) will give an empty final line if the file ends in a line feed ("\n") and not if it does not. This might not be what you expected; see also readLines.

If comment.char occurs (except inside a quoted character field), it signals that the rest of the line should be regarded as a comment and be discarded. Lines beginning with a comment character (possibly after white space with the default separator) are treated as blank lines.

There is a line-length limit of 4095 bytes when reading from the console (which may impose a lower limit: see ‘An Introduction to R’).

There is a check for a user interrupt every 1000 lines if what is a list, otherwise every 10000 items.

If file is a character string and fileEncoding is non-default, or if it is a not-already-open connection with a non-default encoding argument, the text is converted to UTF-8 and declared as such (and the encoding argument to scan is ignored). See the examples of readLines.

Embedded NULs in the input stream will terminate the field currently being read, with a warning once per call to scan. Setting skipNul = TRUE causes them to be ignored.

Value

If what is a list, a list of the same length and same names (if any) as what.

Otherwise, a vector of the type of what.

Character strings in the result will have a declared encoding if encoding is "latin1" or "UTF-8".

Note

The default for multi.line differs from S. To read one record per line, use flush = TRUE and multi.line = FALSE. (Note that quoted character strings can still include embedded newlines.)

If the number of items is not specified, the internal mechanism re-allocates memory in powers of two and so could use up to three times as much memory as needed. (It needs both old and new copies.) If you can, specify either n or nmax whenever inputting a large vector, and nmax or nlines when inputting a large list.

Using scan on an open connection to read partial lines can lose chars: use an explicit separator to avoid this.

Having nul bytes in fields (including ‘\0’ if allowEscapes = TRUE) may lead to interpretation of the field being terminated at the nul. They are not normally present in text files: see readBin.

References

Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988) The New S Language. Wadsworth & Brooks/Cole.

See Also

read.table for more user-friendly reading of data matrices; readLines to read a file a line at a time. write.

Quotes for the details of C-style escape sequences.

readChar and readBin to read fixed or variable length character strings or binary representations of numbers a few at a time from a connection.

Examples

cat("TITLE extra line", "2 3 5 7", "11 13 17", file = "ex.data", sep = "\n")
pp <- scan("ex.data", skip = 1, quiet = TRUE)
scan("ex.data", skip = 1)
scan("ex.data", skip = 1, nlines = 1) # only 1 line after the skipped one
scan("ex.data", what = list("","","")) # flush is F -> read "7"
scan("ex.data", what = list("","",""), flush = TRUE)
unlink("ex.data") # tidy up

## "inline" usage
scan(text = "1 2 3")

Give Search Path for R Objects

Description

Gives a list of attached packages (see library), and R objects, usually data.frames.

Usage

search()
searchpaths()

Value

A character vector, starting with ".GlobalEnv", and ending with "package:base" which is R's base package required always.

searchpaths gives a similar character vector, with the entries for packages being the path to the package used to load the code.
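The shape of the two results can be inspected directly; this sketch assumes an ordinary R session and only looks at the first and last entries, since the entries in between depend on what is attached.

```r
sp <- search()
paths <- searchpaths()

sp[1]           # the global environment comes first
sp[length(sp)]  # the base package comes last

# The two vectors are parallel: one entry per item on the search path.
length(sp) == length(paths)
```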

References

Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988) The New S Language. Wadsworth & Brooks/Cole. (search.)

Chambers, J. M. (1998) Programming with Data. A Guide to the S Language. Springer. (searchpaths.)

See Also

.packages to list just the packages on search path.

loadedNamespaces to list loaded namespaces.

attach and detach to change the search path, objects to find R objects in there.

Examples

search()
searchpaths()

Functions to Reposition Connections

Description

Functions to re-position connections.

Usage

seek(con, ...)
## S3 method for class 'connection'
seek(con, where = NA, origin = "start", rw = "", ...)

isSeekable(con)

truncate(con, ...)

Arguments

con

a connection.

where

numeric. A file position (relative to the origin specified by origin), or NA.

rw

character string. Empty or "read" or "write", partial matches allowed.

origin

character string. One of "start", "current", "end": see ‘Details’.

...

further arguments passed to or from other methods.

Details

seek with where = NA returns the current byte offset of a connection (from the beginning), and with a non-missing where argument the connection is re-positioned (if possible) to the specified position. isSeekable returns whether the connection in principle supports seek: currently only (possibly gz-compressed) file connections do.

where is stored as a real but should represent an integer: non-integer values are likely to be truncated. Note that the possible values can exceed the largest representable number in an R integer on 64-bit builds, and on some 32-bit builds.

File connections can be open for both reading and writing/appending, in which case R keeps separate positions for reading and writing. Which seek refers to can be set by its rw argument: the default is the last mode (reading or writing) which was used. Most files are only opened for reading or writing and so default to that state. If a file is open for both reading and writing but has not been used, the default is to give the reading position (0).

The initial file position for reading is always at the beginning. The initial position for writing is at the beginning of the file for modes "r+" and "r+b", otherwise at the end of the file. Some platforms only allow writing at the end of the file in the append modes. (The reported write position for a file opened in an append mode will typically be unreliable until the file has been written to.)

gzfile connections support seek with a number of limitations, using the file position of the uncompressed file. They do not support origin = "end". When writing, seeking is only possible forwards: when reading seeking backwards is supported by rewinding the file and re-reading from its start.

If seek is called with a non-NA value of where, any pushback on a text-mode connection is discarded.

truncate truncates a file opened for writing at its current position. It works only for file connections, and is not implemented on all platforms: on others (including Windows) it will not work for large (> 2Gb) files.

None of these should be expected to work on text-mode connections with re-encoding selected.
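A minimal sketch of querying and changing the read position of a binary file connection. The file contents are hypothetical, and the example assumes a platform where file positioning behaves as described here.

```r
tf <- tempfile()
writeBin(as.raw(1:10), tf)  # ten bytes with values 1..10

con <- file(tf, "rb")
pos0 <- seek(con)                    # where = NA: report the current offset
first2 <- readBin(con, "raw", n = 2) # reading advances the position
pos1 <- seek(con, where = 5)         # move to offset 5; returns the old offset
b <- readBin(con, "raw", n = 1)      # the byte stored at offset 5
close(con)
unlink(tf)
```

Note that seek returns the position before the move, so the value of pos1 reflects the two bytes already read rather than the new offset.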

Value

seek returns the current position (before any move), as a (numeric) byte offset from the origin, if relevant, or 0 if not. Note that the position can exceed the largest representable number in an R integer on 64-bit builds, and on some 32-bit builds.

truncate returns NULL: it stops with an error if it fails (or is not implemented).

isSeekable returns a logical value, whether the connection supports seek.

Warning

Use of seek on Windows is discouraged. We have found so many errors in the Windows implementation of file positioning that users are advised to use it only at their own risk, and asked not to waste the R developers' time with bug reports on Windows' deficiencies.

See Also

connections


Sequence Generation

Description

Generate regular sequences. seq is a standard generic with a default method. seq.int is a primitive which can be much faster but has a few restrictions. seq_along and seq_len are very fast primitives for two common cases.

Usage

seq(...)

## Default S3 method:
seq(from = 1, to = 1, by = ((to - from)/(length.out - 1)),
    length.out = NULL, along.with = NULL, ...)

seq.int(from, to, by, length.out, along.with, ...)

seq_along(along.with)
seq_len(length.out)

Arguments

...

arguments passed to or from methods.

from,to

the starting and (maximal) end values of the sequence. Of length 1 unless just from is supplied as an unnamed argument.

by

number: increment of the sequence.

length.out

desired length of the sequence. A non-negative number, which for seq and seq.int will be rounded up if fractional.

along.with

take the length from the length of this argument.

Details

Numerical inputs should all be finite (that is, not infinite, NaN or NA).

The interpretation of the unnamed arguments of seq and seq.int is not standard, and it is recommended always to name the arguments when programming.

seq is generic, and only the default method is described here. Note that it dispatches on the class of the first argument irrespective of argument names. This can have unintended consequences if it is called with just one argument intending this to be taken as along.with: it is much better to use seq_along in that case.

seq.int is an internal generic which dispatches on methods for "seq" based on the class of the first supplied argument (before argument matching).

Typical usages are

seq(from, to)
seq(from, to, by= )
seq(from, to, length.out= )
seq(along.with= )
seq(from)
seq(length.out= )

The first form generates the sequence from, from+/-1, ..., to (identical to from:to).

The second form generates from, from+by, ..., up to the sequence value less than or equal to to. Specifying to - from and by of opposite signs is an error. Note that the computed final value can go just beyond to to allow for rounding error, but is truncated to to. (‘Just beyond’ is by up to 10^{-10} times abs(from - to).)

The third generates a sequence of length.out equally spaced values from from to to. (length.out is usually abbreviated to length or len, and seq_len is much faster.)

The fourth form generates the integer sequence 1, 2, ..., length(along.with). (along.with is usually abbreviated to along, and seq_along is much faster.)

The fifth form generates the sequence 1, 2, ..., length(from) (as if argument along.with had been specified), unless the argument is numeric of length 1 when it is interpreted as 1:from (even for seq(0) for compatibility with S). Using either seq_along or seq_len is much preferred (unless strict S compatibility is essential).

The final form generates the integer sequence 1, 2, ..., length.out unless length.out = 0, when it generates integer(0).
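The forms above, and the reason seq_along and seq_len are preferred in programs, can be sketched as follows. All variable names are hypothetical.

```r
a <- seq(1, 9, by = 2)           # by form: ends exactly at 'to'
b <- seq(1, 9, by = pi)          # by form: last value stays below 'to'
s <- seq(0, 1, length.out = 5)   # length.out form: equally spaced values

x <- c(10, 20, 30)
i1 <- seq_along(x)               # indices of x

# A single number is read as 1:from by seq(), not as a vector of length 1.
one <- 7
n_seq   <- length(seq(one))
n_along <- length(seq_along(one))

# Zero-length input: seq_along() gives an empty index vector, whereas
# 1:length(empty) counts down from 1 to 0.
empty <- numeric(0)
safe  <- seq_along(empty)
risky <- 1:length(empty)
```

Because the result type of seq may be integer or double, comparisons in the sketch are best made by value rather than with identical().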

Very small sequences (with from - to of the order of 10^{-14} times the larger of the ends) will return from.

For seq (only), up to two of from, to and by can be supplied as complex values provided length.out or along.with is specified. More generally, the default method of seq will handle classed objects with methods for the Math, Ops and Summary group generics.

seq.int, seq_along and seq_len are primitive.

Value

seq.int and the default method of seq for numeric arguments return a vector of type "integer" or "double": programmers should not rely on which.

seq_along and seq_len return an integer vector, unless it is a long vector when it will be double.

References

Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988) The New S Language. Wadsworth & Brooks/Cole.

See Also

The methods seq.Date and seq.POSIXt.

:, rep, sequence, row, col.

Examples

seq(0, 1, length.out = 11)
seq(stats::rnorm(20)) # effectively 'along'
seq(1, 9, by = 2)     # matches 'end'
seq(1, 9, by = pi)    # stays below 'end'
seq(1, 6, by = 3)
seq(1.575, 5.125, by = 0.05)
seq(17) # same as 1:17, or even better seq_len(17)

Generate Regular Sequences of Dates

Description

The method for seq for objects of class "Date" representing calendar dates.

Usage

## S3 method for class 'Date'
seq(from, to, by, length.out = NULL, along.with = NULL, ...)

Arguments

from

starting date. Required.

to

end date. Optional.

by

increment of the sequence. Optional. See ‘Details’.

length.out

integer, optional. Desired length of the sequence.

along.with

take the length from the length of this argument.

...

arguments passed to or from other methods.

Details

by can be specified in several ways.

  • A number, taken to be in days.

  • An object of class difftime.

  • A character string, containing one of "day", "week", "month", "quarter" or "year". This can optionally be preceded by a (positive or negative) integer and a space, or followed by "s".

    See seq.POSIXt for the details of "month".
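The numeric and character forms of by listed above can be sketched as follows; the dates are hypothetical.

```r
start <- as.Date("2024-01-01")

d <- seq(start, by = 10, length.out = 3)         # a number: steps of 10 days
w <- seq(start, by = "2 weeks", length.out = 3)  # integer, space, unit, "s"
m <- seq(as.Date("2024-03-01"), by = "-1 month", length.out = 3)  # backwards

format(d)
format(w)
format(m)
```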

Value

A vector of class "Date".

See Also

Date

Examples

## first days of years
seq(as.Date("1910/1/1"), as.Date("1999/1/1"), "years")
## by month
seq(as.Date("2000/1/1"), by = "month", length.out = 12)
## quarters
seq(as.Date("2000/1/1"), as.Date("2003/1/1"), by = "quarter")

## find all 7th of the month between two dates, the last being a 7th.
st <- as.Date("1998-12-17")
en <- as.Date("2000-1-7")
ll <- seq(en, st, by = "-1 month")
rev(ll[ll > st & ll < en])

Generate Regular Sequences of Times

Description

The method for seq for date-time classes.

Usage

## S3 method for class 'POSIXt'
seq(from, to, by, length.out = NULL, along.with = NULL, ...)

Arguments

from

starting date. Required.

to

end date. Optional.

by

increment of the sequence. Optional. See ‘Details’.

length.out

integer, optional. Desired length of the sequence.

along.with

take the length from the length of this argument.

...

arguments passed to or from other methods.

Details

by can be specified in several ways.

  • A number, taken to be in seconds.

  • An object of class difftime.

  • A character string, containing one of "sec", "min", "hour", "day", "DSTday", "week", "month", "quarter" or "year". This can optionally be preceded by a (positive or negative) integer and a space, or followed by "s".

The difference between "day" and "DSTday" is that the former ignores changes to/from daylight savings time and the latter takes the same clock time each day. "week" ignores DST (it is a period of 168 hours), but "7 DSTdays" can be used as an alternative. "month" and "year" allow for DST.

The time zone of the result is taken from from: remember that GMT means UTC (and not the time zone of Greenwich, England) and so does not have daylight savings time.

Using "month" first advances the month without changing the day: if this results in an invalid day of the month, it is counted forward into the next month: see the examples.
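The "month" and "week" rules can be sketched with a start date on the 31st. The start time is hypothetical and UTC is used so that daylight saving time plays no part.

```r
start <- as.POSIXct("2000-01-31", tz = "UTC")

# Advancing by month keeps the day of the month; an invalid date such as
# 31 February is counted forward into the following month.
m <- seq(start, by = "month", length.out = 4)
month_days <- format(m, "%Y-%m-%d")
print(month_days)

# "week" is a fixed-length step of seven 24-hour days.
w <- seq(start, by = "week", length.out = 2)
week_hours <- as.numeric(difftime(w[2], w[1], units = "hours"))
print(week_hours)
```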

Value

A vector of class "POSIXct".

See Also

DateTimeClasses

Examples

## first days of years
seq(ISOdate(1910,1,1), ISOdate(1999,1,1), "years")
## by month
seq(ISOdate(2000,1,1), by = "month", length.out = 12)
seq(ISOdate(2000,1,31), by = "month", length.out = 4)
## quarters
seq(ISOdate(1990,1,1), ISOdate(2000,1,1), by = "quarter") # or "3 months"
## days vs DSTdays: use c() to lose the time zone.
seq(c(ISOdate(2000,3,20)), by = "day", length.out = 10)
seq(c(ISOdate(2000,3,20)), by = "DSTday", length.out = 10)
seq(c(ISOdate(2000,3,20)), by = "7 DSTdays", length.out = 4)

Create A Vector of Sequences

Description

The default method for sequence generates the sequence seq(from[i], by = by[i], length.out = nvec[i]) for each element i in the parallel (and recycled) vectors from, by and nvec. It then returns the result of concatenating those sequences.

Usage

sequence(nvec, ...)

## Default S3 method:
sequence(nvec, from = 1L, by = 1L, ...)

Arguments

nvec

coerced to a non-negative integer vector each element of which specifies the length of a sequence.

from

coerced to an integer vector each element of which specifies the first element of a sequence.

by

coerced to an integer vector each element of which specifies the step size between elements of a sequence.

...

additional arguments passed to methods.

Details

Negative values are supported for from and by. sequence(nvec, from, by=0L) is equivalent to rep(from, each=nvec).
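A short sketch of the parallel vectors at work; the values are hypothetical. The last line compares the by = 0L case against rep() called with times =, which is one way of writing the repetition when nvec has more than one element.

```r
# Two sequences of lengths 2 and 3, starting at 1 and 5 respectively.
s1 <- sequence(c(2, 3), from = c(1L, 5L))
print(s1)

# With a zero step each starting value is simply repeated.
s2 <- sequence(c(3, 2), from = c(10L, 20L), by = 0L)
print(s2)

r <- rep(c(10L, 20L), times = c(3, 2))
```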

This function was originally implemented in R with fewer features, but it has since become more flexible, and the default method is implemented in C for speed.

Author(s)

Of the current version, Michael Lawrence based on code from the S4Vectors Bioconductor package

See Also

gl, seq, rep.

Examples

sequence(c(3, 2)) # the concatenated sequences 1:3 and 1:2.
#> [1] 1 2 3 1 2
sequence(c(3, 2), from=2L)
#> [1] 2 3 4 2 3
sequence(c(3, 2), from=2L, by=2L)
#> [1] 2 4 6 2 4
sequence(c(3, 2), by=c(-1L, 1L))
#> [1] 1 0 -1 1 2

Simple Serialization Interface

Description

A simple low-level interface for serializing to connections.

Usage

serialize(object, connection, ascii, xdr = TRUE,
          version = NULL, refhook = NULL)

unserialize(connection, refhook = NULL)

Arguments

object

R object to serialize.

connection

an open connection or (for serialize) NULL or (for unserialize) a raw vector (see ‘Details’).

ascii

a logical. If TRUE or NA, an ASCII representation is written; otherwise (default) a binary one. See also the comments in the help for save.

xdr

a logical: if a binary representation is used, should a big-endian one (XDR) be used?

version

the workspace format version to use. NULL specifies the current default version (3). The only other supported value is 2, the default from R 1.4.0 to R 3.5.0.

refhook

a hook function for handling reference objects.

Details

The function serialize serializes object to the specified connection. If connection is NULL then object is serialized to a raw vector, which is returned as the result of serialize.

Sharing of reference objects is preserved within the object but not across separate calls to serialize.

unserialize reads an object (as written by serialize) from connection or a raw vector.

The refhook functions can be used to customize handling of non-system reference objects (all external pointers and weak references, and all environments other than namespace and package environments and .GlobalEnv). The hook function for serialize should return a character vector for references it wants to handle; otherwise it should return NULL. The hook for unserialize will be called with character vectors supplied to serialize and should return an appropriate object.

For a text-mode connection, the default value of ascii is set to TRUE: only ASCII representations can be written to text-mode connections and attempting to use ascii = FALSE will throw an error.

The format consists of a single line followed by the data: the first line contains a single character: X for binary serialization and A for ASCII serialization, followed by a new line. (The format used is identical to that used by readRDS.)

As almost all systems in current use are little-endian, xdr = FALSE can be used to avoid byte-shuffling at both ends when transferring data from one little-endian machine to another (or between processes on the same machine). Depending on the system, this can speed up serialization and unserialization by a factor of up to 3x.
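The behaviour described above can be tried out with a small round trip. This is only a sketch: the object obj is made up for illustration, and no timing is attempted, so the speed-up claimed for xdr = FALSE is not demonstrated here.

```r
obj <- list(a = 1:3, b = "text")   # illustrative object

# connection = NULL returns the serialization as a raw vector.
bin <- serialize(obj, connection = NULL)
asc <- serialize(obj, connection = NULL, ascii = TRUE)

# The first byte is the format marker described in the Details.
rawToChar(bin[1])   # marker of the binary (XDR) serialization
rawToChar(asc[1])   # marker of the ASCII serialization

# Round trip, including the native byte order variant (xdr = FALSE).
native <- serialize(obj, connection = NULL, xdr = FALSE)
stopifnot(identical(unserialize(bin), obj),
          identical(unserialize(asc), obj),
          identical(unserialize(native), obj))
```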

Value

For serialize, NULL unless connection = NULL, when the result is returned in a raw vector.

For unserialize an R object.

Warning

These functions have provided a stable interface since R 2.4.0 (when the storage of serialized objects was changed from character to raw vectors). However, the serialization format may change in future versions of R, so this interface should not be used for long-term storage of R objects.

On 32-bit platforms a raw vector is limited to 2^{31} - 1 bytes, but R objects can exceed this and their serializations will normally be larger than the objects.

See Also

saveRDS for a more convenient interface to serialize an object to a file or connection.

save and load to serialize and restore one or more named objects.

The ‘R Internals’ manual for details of the format used.

Examples

x <- serialize(list(1,2,3), NULL)
unserialize(x)

## see also the examples for saveRDS

Set Operations

Description

Performs set union, intersection, (asymmetric!) difference, equality and membership on two vectors.

Usage

union(x, y)
intersect(x, y)
setdiff(x, y)
setequal(x, y)

is.element(el, set)

Arguments

x, y, el, set

vectors (of the same mode) containing a sequence of items (conceptually) with no duplicated values.

Details

Each of union, intersect, setdiff and setequal will discard any duplicated values in the arguments, and they apply as.vector to their arguments (and so in particular coerce factors to character vectors).

is.element(x, y) is identical to x %in% y.
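A small illustration of the points above, using made-up inputs: duplicates are dropped, setdiff is asymmetric, factors end up as character vectors, and is.element agrees with %in%.

```r
x <- c(1, 1, 2, 3)
y <- c(3, 3, 4)

u <- union(x, y)        # duplicates within x and y are dropped
i <- intersect(x, y)
d <- setdiff(x, y)      # asymmetric: elements of x that are not in y
e <- setequal(c(1, 2, 2), c(2, 1))

# Factors are coerced to character vectors by as.vector().
f <- union(factor(c("a", "b")), "c")

# is.element() gives the same answer as %in%.
m <- is.element(c(1, 5), x)
stopifnot(identical(m, c(1, 5) %in% x))
```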

Value

For union, a vector of a common mode.

For intersect, a vector of a common mode, or NULL if x or y is NULL.

For setdiff, a vector of the same mode as x.

A logical scalar for setequal and a logical of the same length as x for is.element.

See Also

%in%

‘plotmath’ for the use of union and intersect in plot annotation.

Examples

(x <- c(sort(sample(1:20, 9)), NA))
(y <- c(sort(sample(3:23, 7)), NA))
union(x, y)
intersect(x, y)
setdiff(x, y)
setdiff(y, x)
setequal(x, y)

## True for all possible x & y :
setequal( union(x, y),
          c(setdiff(x, y), intersect(x, y), setdiff(y, x)))

is.element(x, y) # length 10
is.element(y, x) # length  8

Set CPU and/or Elapsed Time Limits

Description

Functions to set CPU and/or elapsed time limits for top-level computations or the current session.

Usage

setTimeLimit(cpu = Inf, elapsed = Inf, transient = FALSE)

setSessionTimeLimit(cpu = Inf, elapsed = Inf)

Arguments

cpu, elapsed

double (of length one). Set a limit on the total or elapsed CPU time in seconds, respectively.

transient

logical. If TRUE, the limits apply only to the rest of the current computation.

Details

setTimeLimit sets limits which apply to each top-level computation, that is a command line (including any continuation lines) entered at the console or from a file. If it is called from within a computation the limits apply to the rest of the computation and (unless transient = TRUE) to subsequent top-level computations.

setSessionTimeLimit sets limits for the rest of the session. Once a session limit is reached it is reset to Inf.

Setting any limit has a small overhead – well under 1% on the systems measured.

Time limits are checked whenever a user interrupt could occur. This will happen frequently in R code and during Sys.sleep, but only at points in compiled C and Fortran code identified by the code author.

‘Total CPU time’ includes that used by child processes where the latter is reported.
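One common way to use a transient limit is to set it inside the computation that should be bounded and to catch the resulting error. The following is a minimal sketch of that idiom, not a recommended wrapper: the 0.2 second limit and the Sys.sleep call are stand-ins for a real workload, and how promptly the limit is noticed depends on where interrupts are checked, as described above.

```r
res <- tryCatch({
  # Applies only to the rest of this computation.
  setTimeLimit(elapsed = 0.2, transient = TRUE)
  Sys.sleep(2)              # stand-in for slow work; checks for interrupts
  "finished"
},
error = function(e) "timed out",
# Remove the limit again whatever happened.
finally = setTimeLimit(cpu = Inf, elapsed = Inf, transient = FALSE))

res
```

Resetting the limits in the finally clause is a precaution so that later code in the same top-level computation is not affected.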


Display Connections

Description

Display aspects of connections.

Usage

showConnections(all = FALSE)
getConnection(what)
closeAllConnections()

stdin()
stdout()
stderr()
nullfile()

isatty(con)

getAllConnections()

Arguments

all

logical: if true all connections, including closed ones and the standard ones are displayed. If false only open user-created connections are included.

what

integer: a row number of the table given by showConnections.

con

a connection.

Details

stdin(), stdout() and stderr() are standard connections corresponding to input, output and error on the console respectively (and not necessarily to file streams). They are text-mode connections of class "terminal" which cannot be opened or closed, and are read-only, write-only and write-only respectively. The stdout() and stderr() connections can be re-directed by sink (and in some circumstances the output from stdout() can be split: see the help page).

The encoding for stdin() when redirected can be set by the command-line flag --encoding.

nullfile() returns filename of the null device ("/dev/null" on Unix, "nul:" on Windows).

showConnections returns a matrix of information. If a connection object has been lost or forgotten, getConnection will take a row number from the table and return a connection object for that connection, which can be used to close the connection, for example. However, if there is no R level object referring to the connection it will be closed automatically at the next garbage collection (except for gzcon connections).

closeAllConnections closes (and destroys) all user connections, restoring all sink diversions as it does so.

isatty returns true if the connection is one of the class "terminal" connections and it is apparently connected to a terminal, otherwise false. This may not be reliable in embedded applications, including GUI consoles.

getAllConnections returns a sequence of integer connection descriptors for use with getConnection, corresponding to the row names of the table returned by showConnections(all = TRUE).
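The relationship between a connection object, its integer descriptor and the table of connections can be sketched as follows. The temporary file is only there to have an open user connection to look at.

```r
con <- file(tempfile(), open = "w")   # an open user-created connection
id  <- as.integer(con)                # its integer descriptor

ids  <- getAllConnections()           # descriptors of all connections
same <- getConnection(id)             # recover a connection object by number

is_listed <- id %in% ids
open_tab  <- showConnections()        # matrix of open user connections
tty       <- isatty(con)              # a file connection is not a terminal

close(con)
```

Recovering the object with getConnection is mainly useful when the original variable has been lost, as the Details note.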

Value

stdin(), stdout() and stderr() return connection objects.

showConnections returns a character matrix of information with a row for each connection, by default only for open non-standard connections.

getConnection returns a connection object, or NULL.

Note

stdin() refers to the ‘console’ and not to the C-level ‘stdin’ of the process. The distinction matters in GUI consoles (which may not have an active ‘stdin’, and if they do it may not be connected to console input), and also in embedded applications. If you want access to the C-level file stream ‘stdin’, use file("stdin").

When R is reading a script from a file, the file is the ‘console’: this is traditional usage to allow in-line data (see ‘An Introduction to R’ for an example).

See Also

connections

Examples

showConnections(all = TRUE)
## Not run: 
textConnection(letters)
# oops, I forgot to record that one
showConnections()
#  class     description      mode text   isopen   can read can write
#3 "letters" "textConnection" "r"  "text" "opened" "yes"    "no"
mycon <- getConnection(3)

## End(Not run)

c(isatty(stdin()), isatty(stdout()), isatty(stderr()))

Quote Strings for Use in OS Shells

Description

Quote a string to be passed to an operating system shell.

Usage

shQuote(string, type = c("sh", "csh", "cmd", "cmd2"))

Arguments

string

a character vector, usually of length one.

type

character: the type of shell quoting. Partial matching is supported. "cmd" and "cmd2" refer to the Windows shell. "cmd" is the default under Windows.

Details

The default type of quoting supported under Unix-alikes is that for the Bourne shell sh. If the string does not contain single quotes, we can just surround it with single quotes. Otherwise, the string is surrounded in double quotes, which suppresses all special meanings of metacharacters except dollar, backquote and backslash, so these (and of course double quote) are preceded by backslash. This type of quoting is also appropriate for bash, ksh and zsh.

The other type of quoting is for the C-shell (csh and tcsh). Once again, if the string does not contain single quotes, we can just surround it with single quotes. If it does contain single quotes, we can use double quotes provided it does not contain dollar or backquote (and we need to escape backslash, exclamation mark and double quote). As a last resort, we need to split the string into pieces not containing single quotes (some may be empty) and surround each with single quotes, and the single quotes with double quotes.

In Windows, command line interpretation is done by the application as well as the shell. It may depend on the compiler used: Microsoft's rules for the C run-time are given at https://learn.microsoft.com/en-us/cpp/c-language/parsing-c-command-line-arguments?view=msvc-160. It may depend on the whim of the programmer of the application: check its documentation. type = "cmd" prepares the string for parsing as an argument by Microsoft's rules and makes shQuote safe for use with many applications when used with system or system2. It surrounds the string by double quotes and escapes internal double quotes by a backslash. Any trailing backslashes and backslashes that were originally before double quotes are doubled.

The Windows cmd.exe shell (used by default with shell) uses type = "cmd2" quoting: special characters are prefixed with "^". In some cases, two types of quoting should be used: first for the application, and then type = "cmd2" for cmd.exe. See the examples below.
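The rules for type = "sh" and type = "cmd" can be seen on a few made-up strings. The type is given explicitly in this sketch so that the result does not depend on the platform default.

```r
# No single quote in the input: wrap in single quotes.
plain <- shQuote("two words", type = "sh")

# A single quote is present: wrap in double quotes and escape the dollar.
mixed <- shQuote("it's $HOME", type = "sh")

# Windows application rules: double quotes, inner double quotes escaped.
win <- shQuote('say "hi"', type = "cmd")

cat(plain, mixed, win, sep = "\n")
```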

Value

A character vector of the same length as string.

References

Loukides, M. et al (2002) Unix Power Tools Third Edition. O'Reilly. Section 27.12.

Discussion in PR#16636.

See Also

Quotes for quoting R code.

sQuote for quoting English text.

Examples

test <- "abc$def`gh`i\\j"
cat(shQuote(test), "\n")
## Not run: system(paste("echo", shQuote(test)))
test <- "don't do it!"
cat(shQuote(test), "\n")

tryit <- paste("use the", sQuote("-c"), "switch\nlike this")
cat(shQuote(tryit), "\n")
## Not run: system(paste("echo", shQuote(tryit)))
cat(shQuote(tryit, type = "csh"), "\n")

## Windows-only example, assuming cmd.exe:
perlcmd <- 'print "Hello World\\n";'
## Not run: 
shell(shQuote(paste("perl -e",
                    shQuote(perlcmd, type = "cmd")),
              type = "cmd2"))

## End(Not run)

Sign Function

Description

sign returns a vector with the signs of the corresponding elements of x (the sign of a real number is 1, 0, or -1 if the number is positive, zero, or negative, respectively).

Note that sign does not operate on complex vectors.

Usage

sign(x)

Arguments

x

a numeric vector

Details

This is an internal generic primitive function: methods can be defined for it directly or via the Math group generic.

See Also

abs

Examples

sign(pi)    # == 1
sign(-2:3)  # -1 -1 0 1 1 1

Interrupting Execution of R

Description

On receiving SIGUSR1 R will save the workspace and quit. SIGUSR2 has the same result except that the .Last function and on.exit expressions will not be called.

Usage

kill -USR1 pid
kill -USR2 pid

Arguments

pid

The process ID of the R process.

Details

The command history will also be saved if it would be at normal termination.

This is not available on Windows, and possibly on other OSes which do not support these signals.

Warning

It is possible that one or more R objects will be undergoing modification at the time the signal is sent. These objects could be saved in a corrupted form.

See Also

Sys.getpid to report the process ID for future use.


Send R Output to a File

Description

sink diverts R output to a connection (and stops such diversions).

sink.number() reports how many diversions are in use.

sink.number(type = "message") reports the number of the connection currently being used for error messages.

Usage

sink(file = NULL, append = FALSE, type = c("output", "message"),
     split = FALSE)

sink.number(type = c("output", "message"))

Arguments

file

a writable connection or a character string naming the file to write to, or NULL to stop sink-ing.

append

logical. If TRUE, output will be appended to file; otherwise, it will overwrite the contents of file.

type

character string. Either the output stream or the messages stream. The name will be partially matched so can be abbreviated.

split

logical: if TRUE, output will be sent to the new sink and to the current output stream, like the Unix program tee.

Details

sink diverts R output to a connection (and must be used again to finish such a diversion, see below!). If file is a character string, a file connection with that name will be established for the duration of the diversion.

Normal R output (to connection stdout) is diverted by the default type = "output". Only prompts and (most) messages continue to appear on the console. Messages sent to stderr() (including those from message, warning and stop) can be diverted by sink(type = "message") (see below).

sink() or sink(file = NULL) ends the last diversion (of the specified type). There is a stack of diversions for normal output, so output reverts to the previous diversion (if there was one). The stack is of up to 21 connections (20 diversions).

If file is a connection it will be opened if necessary (in "wt" mode) and closed once it is removed from the stack of diversions.

split = TRUE only splits R output (via Rvprintf) and the default output from writeLines: it does not split all output that might be sent to stdout().

Sink-ing the messages stream should be done only with great care. For that stream file must be an already open connection, and there is no stack of connections.

If file is a character string, the file will be opened using the current encoding. If you want a different encoding (e.g., to represent strings which have been stored in UTF-8), use a file connection, but some ways to produce R output will already have converted such strings to the current encoding.
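The stack behaviour for normal output can be observed with sink.number(). This sketch diverts output to a temporary file and compares the count before, during and after; counts are taken relative to the starting value in case a diversion is already active.

```r
tmp <- tempfile(fileext = ".txt")
before <- sink.number()        # diversions already on the stack

sink(tmp)                      # start diverting normal output to the file
print(1:3)
during <- sink.number()
sink()                         # end this diversion; the file is closed

after <- sink.number()
captured <- readLines(tmp)
unlink(tmp)

stopifnot(during == before + 1L, after == before)
captured
```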

Value

sink returns NULL.

For sink.number() the number (0, 1, 2, ...) of diversions of output in place.

For sink.number("message") the connection number used for messages, 2 if no diversion has been used.

Warning

Do not use a connection that is open for sink for any other purpose. The software will stop you closing one such inadvertently.

Do not sink the messages stream unless you understand the source code implementing it and hence the pitfalls.

References

Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988) The New S Language. Wadsworth & Brooks/Cole.

Chambers, J. M. (1998) Programming with Data. A Guide to the S Language. Springer.

See Also

capture.output

Examples

sink("sink-examp.txt")
i <- 1:10
outer(i, i)
sink()

## capture all the output to a file.
zz <- file("all.Rout", open = "wt")
sink(zz)
sink(zz, type = "message")
try(log("a"))
## revert output back to the console -- only then access the file!
sink(type = "message")
sink()
file.show("all.Rout", delete.file = TRUE)

Slice Indexes in an Array

Description

Returns a matrix of integers indicating the number of their slice in a given array.

Usage

slice.index(x, MARGIN)

Arguments

x

an array. If x has no dimension attribute, it is considered a one-dimensional array.

MARGIN

an integer vector giving the dimension numbers to slice by.

Details

If MARGIN gives a single dimension, then all elements of slice number i with respect to this have value i. In general, slice numbers are obtained by numbering all combinations of indices in the dimensions given by MARGIN in column-major order. I.e., with m_1, ..., m_k the dimension numbers (elements of MARGIN) sliced by and d_{m_1}, ..., d_{m_k} the corresponding extents, and n_1 = 1, n_2 = d_{m_1}, ..., n_k = d_{m_1} \cdots d_{m_{k-1}}, the number of the slice where dimension m_1 has value i_1, ..., dimension m_k has value i_k is 1 + n_1 (i_1 - 1) + \cdots + n_k (i_k - 1).
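The formula can be checked by hand for a small array. The sketch below takes MARGIN = c(1, 3) on a 2 x 3 x 4 array, so that n_1 = 1 and n_2 = d_1 = 2, and compares the hand-computed slice numbers with slice.index(). The variable names are illustrative only.

```r
x <- array(1:24, dim = c(2, 3, 4))
d <- dim(x)

# Array indices (i1, i2, i3) of every element, in storage order.
idx <- arrayInd(seq_along(x), d)

# MARGIN = c(1, 3): n_1 = 1 and n_2 = d_1.
manual <- 1L + 1L * (idx[, 1] - 1L) + d[1] * (idx[, 3] - 1L)

builtin <- slice.index(x, c(1, 3))
stopifnot(all(manual == as.vector(builtin)))

# i1 = 1 and i3 = 3 give slice number 1 + 0 + 2 * 2 = 5.
builtin[1, 1, 3]
```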

Value

An integer array y with dimensions corresponding to those of x.

See Also

row and col for determining row and column indexes; in fact, these are special cases of slice.index corresponding to MARGIN equal to 1 and 2, respectively when x is a matrix.

Examples

x <- array(1:24, c(2, 3, 4))
slice.index(x, 2)
slice.index(x, c(1, 3))
## When slicing by dimensions 1 and 3, slice index 5 is obtained for
## dimension 1 has value 1 and dimension 3 has value 3 (see above):
which(slice.index(x, c(1, 3)) == 5, arr.ind = TRUE)

Extract or Replace a Slot or Property

Description

Extract or replace the contents of a slot or property of an object.

Usage

object@name
object@name <- value

Arguments

object

An object from a formally defined (S4) class, or an object with a class for which '@' or '@<-' S3 methods are defined.

name

The name of the slot or property, supplied as a character string or unquoted symbol. If object has an S4 class, then name must be the name of a slot in the definition of the class of object.

value

A suitable replacement value for the slot or property. For an S4 object this must be from a class compatible with the class defined for this slot in the definition of the class of object.

Details

If object is not an S4 object, then a suitable S3 method for '@' or '@<-' is searched for. If no method is found, then an error is signaled.

If object is an S4 object, then these operators are for slot access, and are enabled only when package methods is loaded (as per default). The slot must be formally defined. (There is an exception for the name .Data, intended for internal use only.) The replacement operator checks that the slot already exists on the object (which it should if the object is really from the class it claims to be). See slot for further details, in particular for the differences between slot() and the @ operator.
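For the S4 case, slot extraction and replacement look like this. The class "Person" and its slots are made up purely for illustration; the sketch assumes package methods is available, as it is by default.

```r
library(methods)

# "Person" is a hypothetical class used only for this illustration.
setClass("Person", representation(name = "character", age = "numeric"))

p <- new("Person", name = "Ada", age = 36)
p@age                 # extract the contents of a slot
p@age <- 37           # replace with a value of a compatible class

# A value of an incompatible class is rejected by the replacement operator.
bad <- tryCatch({ p@age <- "old"; "assigned" },
                error = function(e) "rejected")

# A name that is not a slot of the class is an error as well.
missing_slot <- tryCatch(p@height, error = function(e) "no such slot")
```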

These are internal generic operators: see InternalMethods.

Value

The current contents of the slot.

See Also

Extract, slot


Wait on Socket Connections

Description

Waits for the first of several socket connections and server sockets to become available.

Usage

socketSelect(socklist, write = FALSE, timeout = NULL)

Arguments

socklist

list of open socket connections and server sockets.

write

logical. If TRUE wait for corresponding socket to become available for writing; otherwise wait for it to become available for reading or for accepting an incoming connection (server sockets).

timeout

numeric or NULL. Time in seconds to wait for a socket to become available; NULL means wait indefinitely.

Details

The values in write are recycled if necessary to make up a logical vector the same length as socklist. Socket connections can appear more than once in socklist; this can be useful if you want to determine whether a socket is available for reading or writing.

Value

Logical the same length as socklist indicating whether the corresponding socket connection is available for output or input, depending on the corresponding value of write. Server sockets can only become available for input.

Examples

## Not run: 
## test whether socket connection s is available for writing or reading
socketSelect(list(s, s), c(TRUE, FALSE), timeout = 0)

## End(Not run)

Solve a System of Equations

Description

This generic function solves the equation a %*% x = b for x, where b can be either a vector or a matrix.

Usage

solve(a, b, ...)

## Default S3 method:
solve(a, b, tol, LINPACK = FALSE, ...)

Arguments

a

a square numeric or complex matrix containing the coefficients of the linear system. Logical matrices are coerced to numeric.

b

a numeric or complex vector or matrix giving the right-hand side(s) of the linear system. If missing, b is taken to be an identity matrix and solve will return the inverse of a.

tol

the tolerance for detecting linear dependencies in the columns of a. The default is .Machine$double.eps.

LINPACK

logical. Defunct and an error.

...

further arguments passed to or from other methods.

Details

a or b can be complex, but this uses double complex arithmetic which might not be available on all platforms.

The row and column names of the result are taken from the column names of a and of b respectively. If b is missing the column names of the result are the row names of a. No check is made that the column names of a match the row names of b.

For back-compatibility a can be a (real) QR decomposition, although qr.solve should be called in that case. qr.solve can handle non-square systems.

Unsuccessful results from the underlying LAPACK code will result in an error giving a positive error code: these can only be interpreted by detailed study of the FORTRAN code.

What happens if a and/or b contain missing, NaN or infinite values is platform-dependent, including on the version of LAPACK in use.

tol is a tolerance for the (estimated 1-norm) ‘reciprocal condition number’: the check is skipped if tol <= 0.

For historical reasons, the default method accepts a as an object of class "qr" (with a warning) and passes it on to solve.qr.
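A small worked case may help fix the conventions described above. The 2 x 2 system and its dimnames are made up for illustration; the last part shows an exactly singular matrix producing an error from the LAPACK code.

```r
a <- matrix(c(2, 1, 1, 3), nrow = 2,
            dimnames = list(c("r1", "r2"), c("u", "v")))
b <- c(5, 10)

x     <- solve(a, b)        # solves a %*% x = b
a_inv <- solve(a)           # b missing: the inverse of a

stopifnot(isTRUE(all.equal(as.vector(a %*% x), b)),
          isTRUE(all.equal(unname(a_inv %*% a), diag(2))))

# Row names of the inverse come from the column names of a.
rownames(a_inv)

# An exactly singular matrix makes the underlying routine signal an error.
singular <- matrix(c(1, 2, 2, 4), nrow = 2)
msg <- tryCatch(solve(singular), error = function(e) "singular")
```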

Source

The default method is an interface to the LAPACK routines DGESV and ZGESV.

LAPACK is from https://netlib.org/lapack/.

References

Anderson, E. and ten others (1999) LAPACK Users' Guide. Third Edition. SIAM.
Available on-line at https://netlib.org/lapack/lug/lapack_lug.html.

Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988) The New S Language. Wadsworth & Brooks/Cole.

See Also

solve.qr for the qr method, chol2inv for inverting from the Cholesky factor, backsolve, qr.solve.

Examples

hilbert <- function(n) { i <- 1:n; 1 / outer(i - 1, i, `+`) }
h8 <- hilbert(8); h8
sh8 <- solve(h8)
round(sh8 %*% h8, 3)

A <- hilbert(4)
A[] <- as.complex(A)
## might not be supported on all platforms
try(solve(A))

Sorting or Ordering Vectors

Description

Sort (or order) a vector or factor (partially) into ascending or descending order. For ordering along more than one variable, e.g., for sorting data frames, see order.

Usage

sort(x, decreasing = FALSE, ...)

## Default S3 method:
sort(x, decreasing = FALSE, na.last = NA, ...)

sort.int(x, partial = NULL, na.last = NA, decreasing = FALSE,
         method = c("auto", "shell", "quick", "radix"), index.return = FALSE)

Arguments

x

for sort an R object with a class or a numeric, complex, character or logical vector. For sort.int, a numeric, complex, character or logical vector, or a factor.

decreasing

logical. Should the sort be increasing or decreasing? Not available for partial sorting.

...

arguments to be passed to or from methods or (for the default methods and objects without a class) to sort.int.

na.last

for controlling the treatment of NAs. If TRUE, missing values in the data are put last; if FALSE, they are put first; if NA, they are removed.

partial

NULL or a vector of indices for partial sorting.

method

character string specifying the algorithm used. Not available for partial sorting. Can be abbreviated.

index.return

logical indicating if the ordering index vector should be returned as well. Supported by method == "radix" for any na.last mode and data type, and the other methods when na.last = NA (the default) and fully sorting non-factors.

Details

sort is a generic function for which methods can be written, and sort.int is the internal method which is compatible with S if only the first three arguments are used.

The default sort method makes use of order for classed objects, which in turn makes use of the generic function xtfrm (and can be slow unless a xtfrm method has been defined or is.numeric(x) is true).

Complex values are sorted first by the real part, then the imaginary part.

The "auto" method selects "radix" for short (less than 2^{31} elements) numeric vectors, integer vectors, logical vectors and factors; otherwise, "shell".

Except for method "radix", the sort order for character vectors will depend on the collating sequence of the locale in use: see Comparison. The sort order for factors is the order of their levels (which is particularly appropriate for ordered factors).

If partial is not NULL, it is taken to contain indices of elements of the result which are to be placed in their correct positions in the sorted array by partial sorting. For each of the result values in a specified position, any values smaller than that one are guaranteed to have a smaller index in the sorted array and any values which are greater are guaranteed to have a bigger index in the sorted array. (This is included for efficiency, and many of the options are not available for partial sorting. It is only substantially more efficient if partial has a handful of elements, and a full sort is done (a Quicksort if possible) if there are more than 10.) Names are discarded for partial sorting.

Method "shell" uses Shellsort (an O(n^{4/3}) variant from Sedgewick (1986)). If x has names a stable modification is used, so ties are not reordered. (This only matters if names are present.)

Method "quick" uses Singleton (1969)'s implementation of Hoare's Quicksort method and is only available when x is numeric (double or integer) and partial is NULL. (For other types of x Shellsort is used, silently.) It is normally somewhat faster than Shellsort (perhaps 50% faster on vectors of length a million and twice as fast at a billion) but has poor performance in the rare worst case. (Peto's modification using a pseudo-random midpoint is used to make the worst case rarer.) This is not a stable sort, and ties may be reordered.

Method "radix" relies on simple hashing to scale time linearly with the input size, i.e., its asymptotic time complexity is O(n). The specific variant and its implementation originated from the data.table package and are due to Matt Dowle and Arun Srinivasan. For small inputs (< 200), the implementation uses an insertion sort (O(n^2)) that operates in-place to avoid the allocation overhead of the radix sort. For integer vectors of range less than 100,000, it switches to a simpler and faster linear time counting sort. In all cases, the sort is stable; the order of ties is preserved. It is the default method for integer vectors and factors.

The "radix" method generally outperforms the other methods, especially for small integers. Compared to quick sort, it is slightly faster for vectors with large integer or real values (but unlike quick sort, radix is stable and supports all na.last options). The implementation is orders of magnitude faster than shell sort for character vectors, but collation does not respect the locale and so gives incorrect answers even in English locales.

However, there are some caveats for the radix sort:

  • If x is a character vector, all elements must share the same encoding. Only UTF-8 (including ASCII) and Latin-1 encodings are supported. Collation follows that with LC_COLLATE=C, that is lexicographically byte-by-byte using numerical ordering of bytes.

  • Long vectors (with 2^{31} or more elements) and complex vectors are not supported.
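Some of the behaviour described in the Details can be seen on small inputs. This sketch uses made-up data and only checks properties stated above: byte-wise collation and stability for "radix", the guarantee given by partial sorting, and the effect of na.last.

```r
words <- c("banana", "Apple", "cherry", "Banana")

# Radix sorting compares bytes as in the C locale, so upper case sorts first.
radix <- sort(words, method = "radix")

# index.return = TRUE also gives the ordering permutation; radix is stable,
# so the two tied 10s keep their original relative order.
v <- c(30, 10, 20, 10)
res <- sort(v, method = "radix", index.return = TRUE)
stopifnot(identical(v[res$ix], res$x))

# Partial sorting only guarantees the requested position.
set.seed(1)
z <- runif(20)
p <- sort(z, partial = 5)
stopifnot(p[5] == sort(z)[5], all(p[1:4] <= p[5]), all(p[6:20] >= p[5]))

# na.last controls whether NAs are dropped (the default) or kept.
with_na <- c(3, NA, 1)
dropped <- sort(with_na)
last    <- sort(with_na, na.last = TRUE)
```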

Value

For sort, the result depends on the S3 method which is dispatched. If x does not have a class sort.int is used and its description applies. For classed objects which do not have a specific method the default method will be used and is equivalent to x[order(x, ...)]: this depends on the class having a suitable method for [ (and also that order will work, which requires a xtfrm method).

For sort.int the value is the sorted vector unless index.return is true, when the result is a list with components named x and ix containing the sorted numbers and the ordering index vector. In the latter case, if method == "quick" ties may be reversed in the ordering (unlike sort.list) as quicksort is not stable. For method == "radix", index.return is supported for all na.last modes. The other methods only support index.return when na.last is NA. The index vector refers to element numbers after removal of NAs: see order if you want the original element numbers.

All attributes are removed from the return value (see Becker et al., 1988, p.146) except names, which are sorted. (If partial is specified even the names are removed.) Note that this means that the returned value has no class, except for factors and ordered factors (which are treated specially and whose result is transformed back to the original class).

References

Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988). The New S Language. Wadsworth & Brooks/Cole.

Knuth, D. E. (1998). The Art of Computer Programming, Volume 3: Sorting and Searching, 2nd ed. Addison-Wesley.

Sedgewick, R. (1986). A new upper bound for Shellsort. Journal of Algorithms, 7, 159–173. doi:10.1016/0196-6774(86)90001-5.

Singleton, R. C. (1969). Algorithm 347: an efficient algorithm for sorting with minimal storage. Communications of the ACM, 12, 185–186. doi:10.1145/362875.362901.

See Also

‘Comparison’ for how character strings are collated.

order for sorting on or reordering multiple variables.

is.unsorted, rank.

Examples

require(stats)

x <- swiss$Education[1:25]
x; sort(x); sort(x, partial = c(10, 15))

## illustrate 'stable' sorting (of ties):
sort(c(10:3, 2:12), method = "shell", index.return = TRUE) # is stable
## $x : 2  3  3  4  4  5  5  6  6  7  7  8  8  9  9 10 10 11 12
## $ix: 9  8 10  7 11  6 12  5 13  4 14  3 15  2 16  1 17 18 19
sort(c(10:3, 2:12), method = "quick", index.return = TRUE) # is not
## $x : 2  3  3  4  4  5  5  6  6  7  7  8  8  9  9 10 10 11 12
## $ix: 9 10  8  7 11  6 12  5 13  4 14  3 15 16  2 17  1 18 19

x <- c(1:3, 3:5, 10)
is.unsorted(x)                  # FALSE: is sorted
is.unsorted(x, strictly = TRUE) # TRUE : is not (and cannot be)
                                # sorted strictly
## Not run: 
## Small speed comparison simulation:
N <- 2000
Sim <- 20
rep <- 1000 # << adjust to your CPU
c1 <- c2 <- numeric(Sim)
for(is in seq_len(Sim)){
  x <- rnorm(N)
  c1[is] <- system.time(for(i in 1:rep) sort(x, method = "shell"))[1]
  c2[is] <- system.time(for(i in 1:rep) sort(x, method = "quick"))[1]
  stopifnot(sort(x, method = "shell") == sort(x, method = "quick"))
}
rbind(ShellSort = c1, QuickSort = c2)
cat("Speedup factor of quick sort():\n")
summary({qq <- c1 / c2; qq[is.finite(qq)]})

## A larger test
x <- rnorm(1e7)
system.time(x1 <- sort(x, method = "shell"))
system.time(x2 <- sort(x, method = "quick"))
system.time(x3 <- sort(x, method = "radix"))
stopifnot(identical(x1, x2))
stopifnot(identical(x1, x3))

## End(Not run)

Sorting Vectors or Data Frames by Other Vectors

Description

Generic function to sort an object in the order determined by one ormore other objects, typically vectors. A method is defined for dataframes to sort its rows (typically by one or more columns), and thedefault method handles vector-like objects.

Usage

sort_by(x, y,...)## Default S3 method:sort_by(x, y,...)## S3 method for class 'data.frame'sort_by(x, y,...)

Arguments

x

An object to be sorted, typically a vector or data frame.

y

Variables to sort by.

For the default method, this can be a vector, or more generally anyobject that has axtfrm method.

For thedata.frame method, typically a formula specifying thevariables to sort by. The formula can take the forms ~ g or~ list(g) to sort by the variableg, or more generallythe forms ~ g1 + ... + gk or~ list(g1, ..., gk)to sort by the variablesg1, ...,gk, using thelater ones to resolve ties in the preceding ones. These variablesare evaluated in the data framex using the usualnon-standard evaluation rules. If not a formula,y = g isequivalent toy = ~ g andy = list(g1, ..., gk) isequivalent toy = ~ list(g1, ..., gk). However,non-standard evaluation inx is not done in this case.

...

Additional arguments, typically passed on to order. These may include additional variables to sort by, as well as named arguments recognized by order.

Value

A sorted version of x. If x is a data frame, this means that the rows of x have been reordered to sort the variables specified in y.
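As a small illustration of the two methods, the following sketch sorts a toy data frame through the formula interface and compares the result with an explicit order() call. The names df, g and v are made up for this example, and it assumes R 4.4.0 or later, where sort_by is available.

```r
## Toy data; 'df', 'g' and 'v' are illustrative names only
df <- data.frame(g = c("b", "a", "b", "a"), v = c(2, 4, 1, 3))

## data.frame method, formula interface: sort by g, resolve ties with v
by_formula <- sort_by(df, ~ g + v)

## The same ordering written out with order()
by_order <- df[order(df$g, df$v), ]
identical(by_formula, by_order)

## Default method: reorder one vector by another
by_vector <- sort_by(c(10, 20, 30), c(3, 1, 2))

## Named arguments such as 'decreasing' are passed on to order()
by_desc <- sort_by(df, ~ v, decreasing = TRUE)
```

The row names of the sorted data frame keep track of the original positions, which is why the comparison with the order() version can use identical().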

See Also

sort, order.

Examples

mtcars$am
mtcars$mpg
with(mtcars, sort_by(mpg, am)) # group mpg by am

## data.frame method
sort_by(mtcars, runif(nrow(mtcars))) # random row permutation
sort_by(mtcars, list(mtcars$am, mtcars$mpg))

# formula interface
sort_by(mtcars, ~ am + mpg) |> subset(select = c(am, mpg))
sort_by.data.frame(mtcars, ~ list(am, -mpg)) |> subset(select = c(am, mpg))

Read R Code from a File, a Connection or Expressions

Description

source causes R to accept its input from the named file or URL or connection or expressions directly. Input is read and parsed from that file until the end of the file is reached, then the parsed expressions are evaluated sequentially in the chosen environment.

withAutoprint(exprs) is a wrapper for source(exprs = exprs, ..) with different defaults. Its main purpose is to evaluate and auto-print expressions as if in a toplevel context, e.g., as in the R console.

Usage

source(file, local = FALSE, echo = verbose, print.eval = echo,
       exprs, spaced = use_file,
       verbose = getOption("verbose"),
       prompt.echo = getOption("prompt"),
       max.deparse.length = 150, width.cutoff = 60L,
       deparseCtrl = "showAttributes",
       chdir = FALSE,
       catch.aborts = FALSE,
       encoding = getOption("encoding"),
       continue.echo = getOption("continue"),
       skip.echo = 0, keep.source = getOption("keep.source"))

withAutoprint(exprs, evaluated = FALSE, local = parent.frame(),
              print. = TRUE, echo = TRUE, max.deparse.length = Inf,
              width.cutoff = max(20, getOption("width")),
              deparseCtrl = c("keepInteger", "showAttributes", "keepNA"),
              skip.echo = 0, ...)

Arguments

file

a connection or a character string giving the pathname of the file or URL to read from. The stdin() connection reads from the console when interactive.

local

TRUE, FALSE or an environment, determining where the parsed expressions are evaluated. FALSE (the default) corresponds to the user's workspace (the global environment) and TRUE to the environment from which source is called.

echo

logical; if TRUE, each expression is printed after parsing, before evaluation.

print.eval, print.

logical; if TRUE, the result of eval(i) is printed for each expression i; defaults to the value of echo.

exprs

for source() and withAutoprint(*, evaluated=TRUE): instead of specifying file, an expression, call, or list of call's, but not an unevaluated “expression”.

for withAutoprint() (with default evaluated=FALSE): one or more unevaluated “expressions”.

evaluated

logical indicating that exprs is passed to source(exprs= *) and hence must be evaluated, i.e., a formal expression, call or list of calls.

spaced

logical indicating if newline (hence empty line) should be printed before each expression (when echo = TRUE).

verbose

if TRUE, more diagnostics (than just echo = TRUE) are printed during parsing and evaluation of input, including extra info for each expression.

prompt.echo

character; gives the prompt to be used if echo = TRUE.

max.deparse.length

integer; is used only if echo is TRUE and gives the maximal number of characters output for the deparse of a single expression.

width.cutoff

integer, passed to deparse() which is used (only) when there are no source references.

deparseCtrl

character vector, passed as control to deparse(), see also .deparseOpts. In R version <= 3.3.x, this was hardcoded to "showAttributes", which is the default currently; deparseCtrl = "all" may be preferable, when strict back compatibility is not of importance.

chdir

logical; if TRUE and file is a pathname, the R working directory is temporarily changed to the directory containing file for evaluating.

catch.aborts

logical indicating that “abort”ing errors should be caught.

encoding

character vector. The encoding(s) to be assumed when file is a character string: see file. A possible value is "unknown" when the encoding is guessed: see the ‘Encodings’ section.

continue.echo

character; gives the prompt to use on continuation lines if echo = TRUE.

skip.echo

integer; how many comment lines at the start of the file to skip if echo = TRUE.

keep.source

logical: should the source formatting be retained when echoing expressions, if possible?

...

(for withAutoprint():) further (non-file related) arguments to be passed to source(.).

Details

Note that running code via source differs in a few respects from entering it at the R command line. Since expressions are not executed at the top level, auto-printing is not done. So you will need to include explicit print calls for things you want to be printed (and remember that this includes plotting by lattice, FAQ Q7.22). Since the complete file is parsed before any of it is run, syntax errors result in none of the code being run. If an error occurs in running a syntactically correct script, anything assigned into the workspace by code that has been run will be kept (just as from the command line), but diagnostic information such as traceback() will contain additional calls to withVisible.
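The lack of auto-printing can be seen in a minimal sketch that sources two expressions from a text connection, once with the defaults and once with print.eval = TRUE. The names code, env, quiet and shown are illustrative only, and a separate environment is used so that the global workspace is left alone.

```r
## Two expressions supplied through a text connection
code <- c("x <- 1:3", "sum(x)")
env <- new.env()  # evaluate here rather than in the global environment

## Defaults: sum(x) is evaluated but not auto-printed
tc <- textConnection(code)
quiet <- capture.output(res <- source(tc, local = env))
close(tc)

## print.eval = TRUE prints the value of each visible expression
tc <- textConnection(code)
shown <- capture.output(source(tc, local = env, print.eval = TRUE))
close(tc)

length(quiet)  # no output was captured in the first run
shown          # the second run captured the printed value of sum(x)
res$value      # value of the last expression evaluated
```

A fresh connection is opened for the second call because the first call reads the connection to its end.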

All versions of R accept input from a connection with end of line marked by LF (as used on Unix), CRLF (as used on DOS/Windows) or CR (as used on classic Mac OS) and map this to newline. The final line can be incomplete, that is missing the final end-of-line marker.

If keep.source is true (the default in interactive use), the source of functions is kept so they can be listed exactly as input.

Unlike input from a console, lines in the file or on a connection can contain an unlimited number of characters.

When skip.echo > 0, that many comment lines at the start of the file will not be echoed. This does not affect the execution of the code at all. If there are executable lines within the first skip.echo lines, echoing will start with the first of them.

If echo is true and a deparsed expression exceeds max.deparse.length, that many characters are output followed by .... [TRUNCATED].

Encodings

By default the input is read and parsed in the current encoding of the R session. This is usually what is required, but occasionally re-encoding is needed, e.g. if a file from a UTF-8-using system is to be read on Windows (or vice versa).

The rest of this paragraph applies if file is an actual filename or URL (and not a connection). If encoding = "unknown", an attempt is made to guess the encoding: the result of localeToCharset() is used as a guide. If encoding has two or more elements, they are tried in turn until the file/URL can be read without error in the trial encoding. If an actual encoding is specified (rather than the default or "unknown") in a Latin-1 or UTF-8 locale then character strings in the result will be translated to the current encoding and marked as such (see Encoding).

If file is a connection, it is not possible to re-encode the input inside source, and so the encoding argument is just used to mark character strings in the parsed input in Latin-1 and UTF-8 locales: see parse.

References

Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988) The New S Language. Wadsworth & Brooks/Cole.

See Also

demo which uses source; eval, parse and scan; options("keep.source").

sys.source which is a streamlined version to source a file into an environment.

‘The R Language Definition’ for a discussion of source directives.

Examples

someCond <- 7 > 6
## want an if-clause to behave "as top level" wrt auto-printing :
## (all should look "as if on top level", e.g. non-assignments should print:)
if(someCond) withAutoprint({
   x <- 1:12
   x-1
   (y <- (x-5)^2)
   z <- y
   z - 10
})

## If you want to source() a bunch of files, something like
## the following may be useful:
sourceDir <- function(path, trace = TRUE, ...) {
    op <- options(); on.exit(options(op)) # to reset after each
    for (nm in list.files(path, pattern = "[.][RrSsQq]$")) {
       if(trace) cat(nm, ":")
       source(file.path(path, nm), ...)
       if(trace) cat("\n")
       options(op)
    }
}

suppressWarnings( rm(x, y) ) # remove 'x' or 'y' from global env
withAutoprint({ x <- 1:2; cat("x=", x, "\n"); y <- x^2 })
## x and y now exist:
stopifnot(identical(x, 1:2), identical(y, x^2))

withAutoprint({ formals(sourceDir); body(sourceDir) },
              max.deparse.length = 20, verbose = TRUE)

## Continuing after (catchable) errors:
tc <- textConnection('1:3
 2 + "3"
 cat(" .. in spite of error: happily continuing! ..\n")
 6*7')
r <- source(tc, catch.aborts = TRUE)
## Error in 2 + "3" ....
## .. in spite of error: happily continuing! ..
stopifnot(identical(r, list(value = 42, visible = TRUE)))

Special Functions of Mathematics

Description

Special mathematical functions related to the beta and gamma functions.

Usage

beta(a, b)
lbeta(a, b)

gamma(x)
lgamma(x)
psigamma(x, deriv = 0)
digamma(x)
trigamma(x)

choose(n, k)
lchoose(n, k)
factorial(x)
lfactorial(x)

Arguments

a, b

non-negative numeric vectors.

x, n

numeric vectors.

k, deriv

integer vectors.

Details

The functions beta and lbeta return the beta function and the natural logarithm of the beta function,

B(a,b) = \frac{\Gamma(a)\Gamma(b)}{\Gamma(a+b)}.

The formal definition is

B(a, b) = \int_0^1 t^{a-1} (1-t)^{b-1} dt

(Abramowitz and Stegun section 6.2.1, page 258). Note that it is only defined in R for non-negative a and b, and is infinite if either is zero.
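The two expressions for the beta function can be checked against each other numerically. The sketch below is only an illustration: the arguments a and b are arbitrary positive values, and integrate() from the stats package stands in for the integral.

```r
a <- 2.5; b <- 4  # arbitrary positive arguments

## Beta function three ways: built-in, via gamma(), and by quadrature
via_beta <- beta(a, b)
via_gamma <- gamma(a) * gamma(b) / gamma(a + b)
via_integral <- integrate(function(t) t^(a - 1) * (1 - t)^(b - 1), 0, 1)$value

all.equal(via_beta, via_gamma)
all.equal(via_beta, via_integral, tolerance = 1e-6)

## lbeta() is the natural logarithm of beta()
all.equal(lbeta(a, b), log(via_beta))

## Infinite if either argument is zero
beta(0, 1)
```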

The functions gamma and lgamma return the gamma function \Gamma(x) and the natural logarithm of the absolute value of the gamma function. The gamma function is defined by (Abramowitz and Stegun section 6.1.1, page 255)

\Gamma(x) = \int_0^\infty t^{x-1} e^{-t} dt

for all real x except zero and negative integers (when NaN is returned). There will be a warning on possible loss of precision for values which are too close (within about 10^{-8}) to a negative integer less than ‘-10’.

factorial(x) (x! for non-negative integer x) is defined to be gamma(x+1) and lfactorial to be lgamma(x+1).
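A few spot checks of these definitions, given as an informal sketch rather than a test of the implementation; the variable name bad is illustrative.

```r
## factorial(x) is gamma(x + 1)
all.equal(factorial(5), gamma(6))
all.equal(factorial(5), 120)

## A classical value: gamma(1/2) equals sqrt(pi)
all.equal(gamma(0.5), sqrt(pi))

## Non-integer arguments are allowed
all.equal(factorial(0.5), gamma(1.5))

## lfactorial() stays finite where factorial() exceeds the double range
suppressWarnings(factorial(200))
lfactorial(200)

## NaN (with a warning) at zero and at the negative integers
bad <- suppressWarnings(gamma(c(0, -1, -2)))
bad
```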

The functions digamma and trigamma return the first and second derivatives of the logarithm of the gamma function. psigamma(x, deriv) (deriv >= 0) computes the deriv-th derivative of \psi(x).

digamma(x) = \psi(x) = \frac{d}{dx}\ln\Gamma(x) = \frac{\Gamma'(x)}{\Gamma(x)}

\psi and its derivatives, the psigamma() functions, are often called the ‘polygamma’ functions, e.g. in Abramowitz and Stegun (section 6.4.1, page 260); and higher derivatives (deriv = 2:4) have occasionally been called ‘tetragamma’, ‘pentagamma’, and ‘hexagamma’.
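The relation between lgamma, digamma and psigamma can be illustrated with a central finite difference. This is a rough numerical sketch; the step size h and the evaluation points in x are arbitrary choices.

```r
x <- c(0.5, 1, 2.5, 10)
h <- 1e-5

## Central finite difference of lgamma() approximates digamma()
fd1 <- (lgamma(x + h) - lgamma(x - h)) / (2 * h)
all.equal(fd1, digamma(x), tolerance = 1e-6)

## psigamma(x, 0) and psigamma(x, 1) coincide with digamma() and trigamma()
all.equal(psigamma(x, deriv = 0), digamma(x))
all.equal(psigamma(x, deriv = 1), trigamma(x))

## Two classical values: psi(1) is minus Euler's constant, psi'(1) = pi^2 / 6
digamma(1)
all.equal(trigamma(1), pi^2 / 6)
```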

The functions choose and lchoose return binomial coefficients and the logarithms of their absolute values. Note that choose(n, k) is defined for all real numbers n and integer k. For k \ge 1 it is defined as n(n-1)\cdots(n-k+1) / k!, as 1 for k = 0 and as 0 for negative k. Non-integer values of k are rounded to an integer, with a warning.
choose(*, k) uses direct arithmetic (instead of [l]gamma calls) for small k, for speed and accuracy reasons. Note the function combn (package utils) for enumeration of all possible combinations.
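The cases of the definition can be worked through by hand and compared with choose(). The values below follow directly from the product formula above.

```r
choose(5, 2)    # 5 * 4 / 2! = 10
choose(5, 0)    # 1, since k = 0
choose(5, -1)   # 0, since k is negative
choose(0.5, 2)  # 0.5 * (-0.5) / 2! = -0.125
choose(-2, 3)   # (-2) * (-3) * (-4) / 3! = -4

## lchoose() is the log of the absolute value
all.equal(exp(lchoose(50, 25)), choose(50, 25))
all.equal(lchoose(-2, 3), log(4))
```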

The gamma, lgamma, digamma and trigamma functions are internal generic primitive functions: methods can be defined for them individually or via the Math group generic.

Source

gamma, lgamma, beta and lbeta are based on C translations of Fortran subroutines by W. Fullerton of Los Alamos Scientific Laboratory (now available as part of SLATEC).

digamma, trigamma and psigamma for x >= 0 are based on

Amos, D. E. (1983). A portable Fortran subroutine for derivatives of the psi function, Algorithm 610, ACM Transactions on Mathematical Software 9(4), 494–502.

For x < 0 and deriv <= 5, the reflection formula (6.4.7) of Abramowitz and Stegun is used.

References

Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988) The New S Language. Wadsworth & Brooks/Cole. (For gamma and lgamma.)

Abramowitz, M. and Stegun, I. A. (1972) Handbook of Mathematical Functions. New York: Dover. https://en.wikipedia.org/wiki/Abramowitz_and_Stegun provides links to the full text which is in public domain.
Chapter 6: Gamma and Related Functions.

See Also

Arithmetic for simple, sqrt for miscellaneous mathematical functions and Bessel for the real Bessel functions.

For the incomplete gamma function see pgamma.

Examples

require(graphics)

choose(5, 2)
for (n in 0:10) print(choose(n, k = 0:n))

factorial(100)
lfactorial(10000)

## gamma has 1st order poles at 0, -1, -2, ...
## this will generate loss of precision warnings, so turn off
op <- options("warn")
options(warn = -1)
x <- sort(c(seq(-3, 4, length.out = 201), outer(0:-3, (-1:1)*1e-6, `+`)))
plot(x, gamma(x), ylim = c(-20, 20), col = "red", type = "l", lwd = 2,
     main = expression(Gamma(x)))
abline(h = 0, v = -3:0, lty = 3, col = "midnightblue")
options(op)

x <- seq(0.1, 4, length.out = 201); dx <- diff(x)[1]
par(mfrow = c(2, 3))
for (ch in c("", "l", "di", "tri", "tetra", "penta")) {
  is.deriv <- nchar(ch) >= 2
  nm <- paste0(ch, "gamma")
  if (is.deriv) {
    dy <- diff(y) / dx # finite difference
    der <- which(ch == c("di", "tri", "tetra", "penta")) - 1
    nm2 <- paste0("psigamma(*, deriv = ", der, ")")
    nm  <- if(der >= 2) nm2 else paste(nm, nm2, sep = " ==\n")
    y <- psigamma(x, deriv = der)
  } else {
    y <- get(nm)(x)
  }
  plot(x, y, type = "l", main = nm, col = "red")
  abline(h = 0, col = "lightgray")
  if (is.deriv) lines(x[-1], dy, col = "blue", lty = 2)
}
par(mfrow = c(1, 1))

## "Extended" Pascal triangle:
fN <- function(n) formatC(n, width = 2)
for (n in -4:10) {
    cat(fN(n), ":", fN(choose(n, k = -2:max(3, n+2))))
    cat("\n")
}

## R code version of choose()  [simplistic; warning for k < 0]:
mychoose <- function(r, k)
    ifelse(k <= 0, (k == 0),
           sapply(k, function(k) prod(r:(r-k+1))) / factorial(k))
k <- -1:6
cbind(k = k, choose(1/2, k), mychoose(1/2, k))

## Binomial theorem for n = 1/2 ;
## sqrt(1+x) = (1+x)^(1/2) = sum_{k=0}^Inf  choose(1/2, k) * x^k :
k <- 0:10 # 10 is sufficient for ~ 9 digit precision:
sqrt(1.25)
sum(choose(1/2, k) * .25^k)

Divide into Groups and Reassemble

Description

split divides the data in the vector x into the groups defined by f. The replacement forms replace values corresponding to such a division. unsplit reverses the effect of split.

Usage

split(x, f, drop = FALSE, ...)
## Default S3 method:
split(x, f, drop = FALSE, sep = ".", lex.order = FALSE, ...)

split(x, f, drop = FALSE, ...) <- value
unsplit(value, f, drop = FALSE)

Arguments

x

vector or data frame containing values to be divided into groups.

f

a ‘factor’ in the sense that as.factor(f) defines the grouping, or a list of such factors in which case their interaction is used for the grouping. If x is a data frame, f can also be a formula of the form ~ g to split by the variable g, or more generally of the form ~ g1 + ... + gk to split by the interaction of the variables g1, ..., gk, where these variables are evaluated in the data frame x using the usual non-standard evaluation rules.

drop

logical indicating if levels that do not occur should be dropped (if f is a factor or a list).

value

a list of vectors or data frames compatible with a splitting of x. Recycling applies if the lengths do not match.

sep

character string, passed to interaction in the case where f is a list.

lex.order

logical, passed to interaction when f is a list.

...

further potential arguments passed to methods.

Details

split and split<- are generic functions with default and data.frame methods. The data frame method can also be used to split a matrix into a list of matrices, and the replacement form likewise, provided they are invoked explicitly.

unsplit works with lists of vectors or data frames (assumed to have compatible structure, as if created by split). It puts elements or rows back in the positions given by f. In the data frame case, row names are obtained by unsplitting the row name vectors from the elements of value.

f is recycled as necessary and if the length of x is not a multiple of the length of f a warning is printed.

Any missing values in f are dropped together with the corresponding values of x.

The default method calls interaction when f is a list. If the levels of the factors contain ‘.’ the factors may not be split as expected, unless sep is set to a string not present in the factor levels.
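The behaviour described above can be seen on a small vector. The following is an illustrative sketch; all object names (parts, back, centred, with_na, two_way) are made up for the example.

```r
x <- c(1, 2, 3, 4, 5, 6)
f <- c("a", "b", "a", "b", "a", "b")

parts <- split(x, f)       # components 'a' and 'b'
back <- unsplit(parts, f)  # elements return to their original positions

## Replacement form: centre each group in place
centred <- x
split(centred, f) <- lapply(split(x, f), function(v) v - mean(v))

## Missing values in f drop the corresponding elements of x
with_na <- split(1:3, c("a", NA, "a"))

## A list of factors is combined through interaction(); 'sep' joins the levels
two_way <- split(1:4, list(c("x", "x", "y", "y"), c("p", "q", "p", "q")),
                 sep = "_")
names(two_way)
```

Choosing a sep that cannot occur in the levels, as with "_" here, keeps the combined names unambiguous.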

Value

The value returned from split is a list of vectors containing the values for the groups. The components of the list are named by the levels of f (after converting to a factor, or if already a factor and drop = TRUE, dropping unused levels).

The replacement forms return their right hand side. unsplit returns a vector or data frame for which split(x, f) equals value.

References

Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988) The New S Language. Wadsworth & Brooks/Cole.

See Also

cut to categorize numeric values.

strsplit to split strings.

Examples

require(stats); require(graphics)
n <- 10; nn <- 100
g <- factor(round(n * runif(n * nn)))
x <- rnorm(n * nn) + sqrt(as.numeric(g))
xg <- split(x, g)
boxplot(xg, col = "lavender", notch = TRUE, varwidth = TRUE)
sapply(xg, length)
sapply(xg, mean)

### Calculate 'z-scores' by group (standardize to mean zero, variance one)
z <- unsplit(lapply(split(x, g), scale), g)

# or

zz <- x
split(zz, g) <- lapply(split(x, g), scale)

# and check that the within-group std dev is indeed one
tapply(z, g, sd)
tapply(zz, g, sd)

### data frame variation

## Notice that assignment form is not used since a variable is being added

g <- airquality$Month
l <- split(airquality, g)

## Alternative using a formula
identical(l, split(airquality, ~ Month))

l <- lapply(l, transform, Oz.Z = scale(Ozone))
aq2 <- unsplit(l, g)
head(aq2)
with(aq2, tapply(Oz.Z, Month, sd, na.rm = TRUE))

### Split a matrix into a list by columns
ma <- cbind(x = 1:10, y = (-4:5)^2)
split(ma, col(ma))

split(1:10, 1:2)

Use C-style String Formatting Commands

Description

A wrapper for the C function sprintf, that returns a character vector containing a formatted combination of text and variable values.

Usage

sprintf(fmt, ...)
gettextf(fmt, ..., domain = NULL, trim = TRUE)

Arguments

fmt

a character vector of format strings, each of up to 8192 bytes.

...

values to be passed into fmt. Only logical, integer, real and character vectors are supported, but some coercion will be done: see the ‘Details’ section. Up to 100.

trim, domain

see gettext.

Details

sprintf is a wrapper for the system sprintf C-library function. Attempts are made to check that the mode of the values passed match the format supplied, and R's special values (NA, Inf, -Inf and NaN) are handled correctly.

gettextf is a convenience function which provides C-style string formatting with possible translation of the format string.

The arguments (including fmt) are recycled if possible a whole number of times to the length of the longest, and then the formatting is done in parallel. Zero-length arguments are allowed and will give a zero-length result. All arguments are evaluated even if unused, and hence some types (e.g., "symbol" or "language", see typeof) are not allowed. Arguments unused by fmt result in a warning. (The format %.0s can be used to “skip” an argument.)
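The recycling rules can be illustrated briefly. The object names below are illustrative only.

```r
## fmt and the arguments are recycled to the length of the longest
recycled <- sprintf("%s %d", "test", 1:3)
recycled  # "test 1" "test 2" "test 3"

## A vector of formats is handled in parallel with the arguments
per_fmt <- sprintf(c("%d apple", "%d apples"), 1:2)
per_fmt   # "1 apple" "2 apples"

## Zero-length input gives a zero-length result
empty <- sprintf("%d", integer(0))
length(empty)  # 0

## "%.0s" consumes an argument without printing it
skipped <- sprintf("%.0s%d", "ignored", 2L)
skipped   # "2"
```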

The following is abstracted from Kernighan and Ritchie (1988): however the actual implementation will follow the C99 standard and fine details (especially the behaviour under user error) may depend on the platform. References to numbered arguments come from POSIX.

The string fmt contains normal characters, which are passed through to the output string, and also conversion specifications which operate on the arguments provided through .... The allowed conversion specifications start with a % and end with one of the letters in the set aAdifeEgGosxX%. These letters denote the following types:

d, i, o, x, X

Integer value, o being octal, x and X being hexadecimal (using the same case for a-f as the code). Numeric variables with exactly integer values will be coerced to integer. Formats d and i can also be used for logical variables, which will be converted to 0, 1 or NA.

f

Double precision value, in “fixed point” decimal notation of the form ‘"[-]mmm.ddd"’. The number of decimal places ("d") is specified by the precision: the default is 6; a precision of 0 suppresses the decimal point. Non-finite values are converted to NA, NaN or (perhaps a sign followed by) Inf.

e, E

Double precision value, in “exponential” decimal notation of the form [-]m.ddde[+-]xx or [-]m.dddE[+-]xx.

g, G

Double precision value, in %e or %E format if the exponent is less than -4 or greater than or equal to the precision, and %f format otherwise. (The precision (default 6) specifies the number of significant digits here, whereas in %f, %e, it is the number of digits after the decimal point.)

a, A

Double precision value, in binary notation of the form [-]0xh.hhhp[+-]d. This is a binary fraction expressed in hex multiplied by a (decimal) power of 2. The number of hex digits after the decimal point is specified by the precision: the default is enough digits to represent exactly the internal binary representation. Non-finite values are converted to NA, NaN or (perhaps a sign followed by) Inf. Format %a uses lower-case for x, p and the hex values: format %A uses upper-case.

This should be supported on all platforms as it is a feature of C99. The format is not uniquely defined: although it would be possible to make the leading h always zero or one, this is not always done. Most systems will suppress trailing zeros, but a few do not. On a well-written platform, for normal numbers there will be a leading one before the decimal point plus (by default) 13 hexadecimal digits, hence 53 bits. The treatment of denormalized (aka ‘subnormal’) numbers is very platform-dependent.

s

Character string. Character NAs are converted to "NA".

%

Literal % (none of the extra formatting characters given below are permitted in this case).

Conversion by as.character is used for non-character arguments with s and by as.double for non-double arguments with f, e, E, g, G. NB: the length is determined before conversion, so do not rely on the internal coercion if this would change the length. The coercion is done only once, so if length(fmt) > 1 then all elements must expect the same types of arguments.

In addition, between the initial % and the terminating conversion character there may be, in any order:

m.n

Two numbers separated by a period, denoting the field width (m) and the precision (n).

-

Left adjustment of converted argument in its field.

+

Always print number with sign: by default only negative numbers are printed with a sign.

a space

Prefix a space if the first character is not a sign.

0

For numbers, pad to the field width with leading zeros. For characters, this zero-pads on some platforms and is ignored on others.

#

specifies “alternate output” for numbers, its action depending on the type: For x or X, 0x or 0X will be prefixed to a non-zero result. For e, E, f, g and G, the output will always have a decimal point; for g and G, trailing zeros will not be removed.
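The conversion letters and flags above can be tried on simple values. The sketch below restricts the expected results in the comments to cases that do not depend on the platform; the exponent width of %e and %g output is left uncommented for that reason.

```r
## Conversion types
sprintf("%d", 42L)                       # "42"
sprintf("%d", 3)                         # "3": integer-valued double is coerced
sprintf("%d", TRUE)                      # "1"
sprintf("%x / %X / %o", 255L, 255L, 8L)  # "ff / FF / 10"
sprintf("%.2f", pi)                      # "3.14"
sprintf("%s", NA_character_)             # "NA"
sprintf("%e", 12345.678)                 # exponential notation
sprintf("%g", 1e6 * pi)                  # exponential, 6 significant digits

## Field width, precision and flags
sprintf("%5.1f|", pi)                    # "  3.1|"
sprintf("%-5d|", 42L)                    # "42   |"
sprintf("%+d", 42L)                      # "+42"
sprintf("% d", 42L)                      # " 42"
sprintf("%05d", 42L)                     # "00042"
sprintf("%#x", 255L)                     # "0xff"
```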

Further, immediately after % may come 1$ to 99$ to refer to a numbered argument: this allows arguments to be referenced out of order and is mainly intended for translators of error messages. If this is done it is best if all formats are numbered: if not the unnumbered ones process the arguments in order. See the examples. This notation allows arguments to be used more than once, in which case they must be used as the same type (integer, double or character).

A field width or precision (but not both) may be indicated by an asterisk *: in this case an argument specifies the desired number. A negative field width is taken as a '-' flag followed by a positive field width. A negative precision is treated as if the precision were omitted. The argument should be integer, but a double argument will be coerced to integer.
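Numbered arguments and the asterisk notation can be sketched as follows; the strings and values are arbitrary.

```r
## Numbered arguments can be referenced out of order ...
sprintf("%2$s %1$s", "world", "hello")  # "hello world"

## ... and more than once, provided the type is the same each time
sprintf("%1$d in hex is %1$x", 255L)    # "255 in hex is ff"

## '*' takes the field width or the precision from an argument
sprintf("%*d", 5, 42L)                  # "   42"
sprintf("%.*f", 2, pi)                  # "3.14"

## A negative width acts as the '-' flag followed by a positive width
sprintf("%*d|", -5, 42L)                # "42   |"
```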

There is a limit of 8192 bytes on elements of fmt, and on strings included from a single %letter conversion specification.

Field widths and precisions of %s conversions are interpreted as bytes, not characters, as described in the C standard.

The C doubles used for R numerical vectors have signed zeros, which sprintf may output as -0, -0.000 ....

Value

A character vector of length that of the longest input. If any element of fmt or any character argument is declared as UTF-8, the element of the result will be in UTF-8 and have the encoding declared as UTF-8. Otherwise it will be in the current locale's encoding.

Warning

The format string is passed down the OS's sprintf function, and incorrect formats can cause the latter to crash the R process. R does perform sanity checks on the format, but not all possible user errors on all platforms have been tested, and some might be terminal.

The behaviour on inputs not documented here is ‘undefined’, which means it is allowed to differ by platform.

Author(s)

Original code by Jonathan Rougier.

References

Kernighan, B. W. and Ritchie, D. M. (1988) The C Programming Language. Second edition, Prentice Hall. Describes the format options in table B-1 in the Appendix.

The C Standards, especially ISO/IEC 9899:1999 for ‘C99’. Links can be found at https://developer.r-project.org/Portability.html.

https://pubs.opengroup.org/onlinepubs/9699919799/functions/snprintf.html for POSIX extensions such as numbered arguments.

man sprintf on a Unix-alike system.

See Also

formatC for a way of formatting vectors of numbers in a similar fashion.

paste for another way of creating a vector combining text and values.

gettext for the mechanisms for the automated translation of text.

Examples

## be careful with the format: most things in R are floats
## only integer-valued reals get coerced to integer.

sprintf("%s is %f feet tall\n", "Sven", 7.1)      # OK
try(sprintf("%s is %i feet tall\n", "Sven", 7.1)) # not OK
    sprintf("%s is %i feet tall\n", "Sven", 7)    # OK

## use a literal % :

sprintf("%.0f%% said yes (out of a sample of size %.0f)", 66.666, 3)

## various formats of pi :

sprintf("%f", pi)
sprintf("%.3f", pi)
sprintf("%1.0f", pi)
sprintf("%5.1f", pi)
sprintf("%05.1f", pi)
sprintf("%+f", pi)
sprintf("% f", pi)
sprintf("%-10f", pi) # left justified
sprintf("%e", pi)
sprintf("%E", pi)
sprintf("%g", pi)
sprintf("%g",   1e6 * pi) # -> exponential
sprintf("%.9g", 1e6 * pi) # -> "fixed"
sprintf("%G", 1e-6 * pi)

## no truncation:
sprintf("%1.f", 101)

## re-use one argument three times, show difference between %x and %X
xx <- sprintf("%1$d %1$x %1$X", 0:15)
xx <- matrix(xx, dimnames = list(rep("", 16), "%d%x%X"))
noquote(format(xx, justify = "right"))

## More sophisticated:

sprintf("min 10-char string '%10s'",
        c("a", "ABC", "and an even longer one"))

n <- 1:18
sprintf(paste0("e with %2d digits = %.", n, "g"), n, exp(1))

## Platform-dependent bad example: may pad with spaces or zeroes
sprintf("%09s", month.name)

## Using arguments out of order
sprintf("second %2$1.0f, first %1$5.2f, third %3$1.0f", pi, 2, 3)

## Using asterisk for width or precision
sprintf("precision %.*f, width '%*.3f'", 3, pi, 8, pi)

## Asterisk and argument re-use, 'e' example reiterated:
sprintf("e with %1$2d digits = %2$.*1$g", n, exp(1))

## re-cycle arguments
sprintf("%s %d", "test", 1:3)

## binary output showing rounding/representation errors
x <- seq(0, 1.0, 0.1); y <- c(0,.1,.2,.3,.4,.5,.6,.7,.8,.9,1)
cbind(x, sprintf("%a", x), sprintf("%a", y))

Quote Text

Description

Single or double quote text by combining with appropriate single or double left and right quotation marks.

Usage

sQuote(x, q = getOption("useFancyQuotes"))
dQuote(x, q = getOption("useFancyQuotes"))

Arguments

x

an R object, to be coerced to a character vector.

q

the kind of quotes to be used, see ‘Details’.

Details

The purpose of the functions is to provide a simple means of markup for quoting text to be used in the R output, e.g., in warnings or error messages.

The choice of the appropriate quotation marks depends on both the locale and the available character sets. Older Unix/X11 fonts displayed the grave accent (ASCII code 0x60) and the apostrophe (0x27) in a way that they could also be used as matching open and close single quotation marks. Using modern fonts, or non-Unix systems, these characters no longer produce matching glyphs. Unicode provides left and right single quotation mark characters (U+2018 and U+2019); if Unicode markup cannot be assumed to be available, it seems good practice to use the apostrophe as a non-directional single quotation mark.

Similarly, Unicode has left and right double quotation mark characters (U+201C and U+201D); if only ASCII's typewriter characteristics can be employed, then the ASCII quotation mark (0x22) should be used as both the left and right double quotation mark.

Some other locales also have the directional quotation marks, notably on Windows. TeX uses grave and apostrophe for the directional single quotation marks, and doubled grave and doubled apostrophe for the directional double quotation marks.

What rendering is used depends on q which by default depends on the options setting for useFancyQuotes. If this is FALSE then the undirectional ASCII quotation style is used. If this is TRUE (the default), Unicode directional quotes are used where available (currently, UTF-8 locales on Unix-alikes and all Windows locales except C): if set to "UTF-8" UTF-8 markup is used (whatever the current locale). If set to "TeX", TeX-style markup is used. Finally, if this is set to a character vector of length four, the first two entries are used for beginning and ending single quotes and the second two for beginning and ending double quotes: this can be used to implement non-English quoting conventions such as the use of guillemets.
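The settings can also be passed directly through q, which avoids changing the global option. In the sketch below, the length-four vector marks is a made-up ASCII convention chosen only so that the result is the same in any locale.

```r
## Undirectional ASCII quotes
sQuote("x", q = FALSE)  # "'x'"
dQuote("x", q = FALSE)  # "\"x\""

## TeX-style markup
sQuote("x", q = "TeX")  # "`x'"
dQuote("x", q = "TeX")  # "``x''"

## Custom marks: single open, single close, double open, double close
marks <- c("<", ">", "<<", ">>")
sQuote("x", q = marks)  # "<x>"
dQuote("x", q = marks)  # "<<x>>"

## Vectorised over x
sQuote(c("a", "b"), q = FALSE)
```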

Where fancy quotes are used, you should be aware that they may not be rendered correctly as not all fonts include the requisite glyphs: for example some have directional single quotes but not directional double quotes.

Value

A character vector of the same length as x (after any coercion) in the current locale's encoding.

References

Markus Kuhn, “ASCII and Unicode quotation marks”. https://www.cl.cam.ac.uk/~mgk25/ucs/quotes.html

See Also

Quotes for quoting R code.

shQuote for quoting OS commands.

Examples

op <- options("useFancyQuotes")
paste("argument", sQuote("x"), "must be non-zero")
options(useFancyQuotes = FALSE)
cat("\ndistinguish plain", sQuote("single"), "and",
    dQuote("double"), "quotes\n")
options(useFancyQuotes = TRUE)
cat("\ndistinguish fancy", sQuote("single"), "and",
    dQuote("double"), "quotes\n")
options(useFancyQuotes = "TeX")
cat("\ndistinguish TeX", sQuote("single"), "and",
    dQuote("double"), "quotes\n")
if(l10n_info()$`Latin-1`) {
    options(useFancyQuotes = c("\xab", "\xbb", "\xbf", "?"))
    cat("\n", sQuote("guillemet"), "and",
        dQuote("Spanish question"), "styles\n")
} else if(l10n_info()$`UTF-8`) {
    options(useFancyQuotes = c("\xc2\xab", "\xc2\xbb", "\xc2\xbf", "?"))
    cat("\n", sQuote("guillemet"), "and",
        dQuote("Spanish question"), "styles\n")
}
options(op)

References to Source Files and Code

Description

These functions are for working with source files and more generally with “source references” ("srcref"), i.e., references to source code. The resulting data is used for printing and source level debugging, and is typically available in interactive R sessions, namely when options(keep.source = TRUE).

Usage

srcfile(filename, encoding = getOption("encoding"), Enc = "unknown")
srcfilecopy(filename, lines, timestamp = Sys.time(), isFile = FALSE)
srcfilealias(filename, srcfile)
getSrcLines(srcfile, first, last)
srcref(srcfile, lloc)
## S3 method for class 'srcfile'
print(x, ...)
## S3 method for class 'srcfile'
summary(object, ...)
## S3 method for class 'srcfile'
open(con, line, ...)
## S3 method for class 'srcfile'
close(con, ...)
## S3 method for class 'srcref'
print(x, useSource = TRUE, ...)
## S3 method for class 'srcref'
summary(object, useSource = FALSE, ...)
## S3 method for class 'srcref'
as.character(x, useSource = TRUE, to = x, ...)
.isOpen(srcfile)

Arguments

filename

The name of a file.

encoding

The character encoding to assume for the file.

Enc

The encoding with which to make strings: see theencoding argument ofparse.

lines

A character vector of source lines. OtherR objectswill be coerced to character.

timestamp

The timestamp to use on a copy of a file.

isFile

Is thissrcfilecopy known to come from a file system file?

srcfile

Asrcfile object.

first,last,line

Line numbers.

lloc

A vector of four, six or eight values giving a source location; see‘Details’.

x,object,con

An object of the appropriate class.

useSource

Whether to read thesrcfile to obtain thetext of asrcref.

to

An optional second srcref object to mark the end of the character range.

...

Additional arguments to the methods; these will be ignored.

Details

These functions and classes handle source code references.

The srcfile function produces an object of class srcfile, which contains the name and directory of a source code file, along with its timestamp, for use in source-level debugging (not yet implemented) and source echoing. The encoding of the file is saved; see file for a discussion of encodings, and iconvlist for a list of allowable encodings on your platform.

The srcfilecopy function produces an object of the descendant class srcfilecopy, which saves the source lines in a character vector. It copies the value of the isFile argument, to help debuggers identify whether this text comes from a real file in the file system.

The srcfilealias function produces an object of the descendant class srcfilealias, which gives an alternate name to another srcfile. This is produced by the parser when a #line directive is used.

The getSrcLines function reads the specified lines from srcfile.

The srcref function produces an object of class srcref, which describes a range of characters in a srcfile. The lloc value gives the following values:

c(first_line, first_byte, last_line, last_byte, first_column,
  last_column, first_parsed, last_parsed)

Bytes (elements 2, 4) and columns (elements 5, 6) may be different due to multibyte characters. If only four values are given, the columns and bytes are assumed to match. Lines (elements 1, 3) and parsed lines (elements 7, 8) may differ if a #line directive is used in code: the former will respect the directive, the latter will just count lines. If only 4 or 6 elements are given, the parsed lines will be assumed to match the lines.
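As a hedged illustration of the eight location values, the sketch below parses a short snippet with keep.source = TRUE and inspects the srcref that the parser attaches to the second expression. The variable names (exprs, refs, loc) and the element names assigned to loc are illustrative only; the names simply follow the layout described above.

```r
# Parse three lines of text; the second expression spans lines 2 and 3.
exprs <- parse(text = c("x <- 1", "y <- x +", "  2"), keep.source = TRUE)

# The parser attaches one srcref per top-level expression.
refs <- attr(exprs, "srcref")

# Drop the class and srcfile attributes to see the raw eight integers.
loc <- as.integer(refs[[2]])
names(loc) <- c("first_line", "first_byte", "last_line", "last_byte",
                "first_column", "last_column",
                "first_parsed", "last_parsed")
print(loc)

# The text covered by the reference, one element per source line.
print(as.character(refs[[2]]))
```

Because the snippet is plain ASCII and contains no #line directive, the byte and column values coincide, as do the line and parsed-line values.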

Methods are defined for print, summary, open, and close for classes srcfile and srcfilecopy. The open method opens its internal file connection at a particular line; if it was already open, it will be repositioned to that line.

Methods are defined for print, summary and as.character for class srcref. The as.character method will read the associated source file to obtain the text corresponding to the reference. If the to argument is given, it should be a second srcref that follows the first, in the same file; they will be treated as one reference to the whole range. The exact behaviour depends on the class of the source file. If the source file inherits from class srcfilecopy, the lines are taken from the saved copy using the “parsed” line counts. If not, an attempt is made to read the file, and the original line numbers of the srcref record (i.e., elements 1 and 3) are used. If an error occurs (e.g., the file no longer exists), text like ‘<srcref: "file" chars 1:1 to 2:10>’ will be returned instead, indicating the line:column ranges of the first and last character. The summary method defaults to this type of display.
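The behaviour for a srcfilecopy can be sketched without touching the file system. In the example below, the file name "demo.R" and the three source lines are made up for illustration; no such file needs to exist because the lines are stored in the object itself.

```r
# Hypothetical source lines held in memory rather than in a real file.
src_lines <- c("alpha <- 1", "beta <- alpha + 1", "gamma <- beta * 2")
sf <- srcfilecopy("demo.R", src_lines)

# Four-element lloc: line 2, byte 1 through line 2, byte 17.
ref <- srcref(sf, c(2L, 1L, 2L, 17L))

# The text is taken from the saved copy.
print(as.character(ref))

# Without reading the source, only the location record is returned.
print(as.character(ref, useSource = FALSE))

# getSrcLines() reads a range of lines from the same object.
print(getSrcLines(sf, 1, 2))
```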

Lists of srcref objects may be attached to expressions as the "srcref" attribute. (The list of srcref objects should be the same length as the expression.) By default, expressions are printed by print.default using the associated srcref. To see deparsed code instead, call print with argument useSource = FALSE. If a srcref object is printed with useSource = FALSE, the ‘<srcref: ....>’ record will be printed.

.isOpen is intended for internal use: it checks whether the connection associated with a srcfile object is open.

Value

srcfile returns a srcfile object.

srcfilecopy returns a srcfilecopy object.

getSrcLines returns a character vector of source code lines.

srcref returns a srcref object.

Author(s)

Duncan Murdoch

See Also

getSrcFilename for extracting information from a source reference, or removeSource to remove it from a (non-primitive) function (aka ‘closure’).

Examples

src <- srcfile(system.file("DESCRIPTION", package = "base"))
summary(src)
getSrcLines(src, 1, 4)
ref <- srcref(src, c(1, 1, 2, 1000))
ref
print(ref, useSource = FALSE)

Stack Overflow Errors

Description

Errors signaled by R when stacks used in evaluation overflow.

Details

R uses several stacks in evaluating expressions: the C stack, the pointer protection stack, and the node stack used by the byte code engine. In addition, the number of nested R expressions currently under evaluation is limited by the value set as options("expressions"). Overflowing these stacks or limits signals an error that inherits from classes stackOverflowError, error, and condition.

The specific classes signaled are:

  • CStackOverflowError: Signaled when the C stack overflows. The usage field of the error object contains the current stack usage.

  • protectStackOverflowError: Signaled when the pointer protection stack overflows.

  • nodeStackOverflowError: Signaled when the node stack used by the byte code engine overflows.

  • expressionStackOverflowError: Signaled when the evaluation depth, the number of nested R expressions currently under evaluation, exceeds the limit set by options("expressions").

Stack overflow errors can be caught and handled by exiting handlers established with tryCatch(). Calling handlers established by withCallingHandlers() may fail since there may not be enough stack space to run the handler. In this case the next available exiting handler will be run, or error handling will fall back to the default handler. Default handlers set by tryCatch("error") may also fail to run in a stack overflow situation.
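A minimal sketch of catching such an error with an exiting handler follows. The function recurse_forever is a hypothetical helper written for this example, and the nesting limit is lowered only so that the overflow is reached quickly; which specific subclass is signaled may depend on the platform and settings, so the handler is registered for the common parent class.

```r
# Hypothetical helper: recurses without a base case.
recurse_forever <- function(n) 1 + recurse_forever(n + 1)

# Lower the nesting limit for the demonstration, then restore it.
old <- options(expressions = 500)
caught <- tryCatch(
  recurse_forever(1),
  stackOverflowError = function(e) class(e)
)
options(old)

# The class vector of the condition that was caught.
print(caught)
```

An exiting handler is used here because, as noted above, a calling handler may itself fail for lack of stack space.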

See Also

Cstack_info for information on the environment and the evaluation depth limit.

Memory and options for information on the protection stack.


Formal Method System – Dispatching S4 Methods

Description

The function standardGeneric initiates dispatch of S4 methods: see the references and the documentation of the methods package. Usually, calls to this function are generated automatically and not explicitly by the programmer.

Usage

standardGeneric(f, fdef)

Arguments

f

The name of the generic.

fdef

The generic function definition. Never passed when defining a new generic.

Details

standardGeneric dispatches the method defined for a generic function named f, using the actual arguments in the frame from which it is called.

The argument fdef is inserted (automatically) when dispatching methods for a primitive function. If present, it must always be the function definition for the corresponding generic. Don't insert this argument by hand, as there is no validity checking and misspecifying the function definition will cause certain failure.

For more, use the methods package, and see the documentation in GenericFunctions.
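The usual way standardGeneric appears in user code is inside the function passed to setGeneric, as in the sketch below. The generic name describe and its two methods are hypothetical and exist only for this illustration; fdef is deliberately not supplied.

```r
library(methods)

# 'describe' is a hypothetical generic invented for this sketch.
setGeneric("describe", function(obj, ...) standardGeneric("describe"))

# Methods selected by the class of the actual 'obj' argument.
setMethod("describe", "numeric",
          function(obj, ...) paste("numeric of length", length(obj)))
setMethod("describe", "character",
          function(obj, ...) paste("character of length", length(obj)))

print(describe(c(1, 2, 3)))
print(describe(c("a", "b")))
```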

Author(s)

John Chambers

References

Chambers, John M. (2008) Software for Data Analysis: Programming with R. Springer. (For the R version.)

Chambers, John M. (1998) Programming with Data. Springer. (For the original S4 version.)


Does String Start or End With Another String?

Description

Determines if entries of x start or end with string (entries of) prefix or suffix respectively, where strings are recycled to common lengths.

Usage

startsWith(x, prefix)
  endsWith(x, suffix)

Arguments

x

character vector whose “starts” or “ends” are considered.

prefix, suffix

character vector, typically of length one, i.e., a string.

Details

startsWith() is equivalent to but much faster than

  substring(x, 1, nchar(prefix)) == prefix

or also

  grepl("^<prefix>", x)

where prefix is not to contain special regular expression characters (and for grepl, x does not contain missing values, see below).

The code has an optimized branch for the most common usage in which prefix or suffix is of length one, and is further optimized in a UTF-8 or 8-bit locale if that is an ASCII string.

Value

A logical vector, of “common length” of x and prefix (or suffix), i.e., of the longer of the two lengths unless one of them is zero when the result is also of zero length. A shorter input is recycled to the output length.
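The recycling and zero-length rules can be seen in a short sketch; the fruit names are arbitrary sample data.

```r
fruit <- c("apple", "banana", "apricot", "blueberry")

# A single prefix is compared against every element of x.
print(startsWith(fruit, "ap"))

# A length-two prefix is recycled along x: "a", "b", "a", "b".
print(startsWith(fruit, c("a", "b")))

# A zero-length suffix gives a zero-length result.
print(endsWith(fruit, character(0)))

# Agreement with the substring() formulation when there are no NAs.
print(identical(startsWith(fruit, "ap"),
                substring(fruit, 1, nchar("ap")) == "ap"))
```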

See Also

grepl, substring; the partial string matching functions charmatch and pmatch solve a different task.

Examples

startsWith(search(), "package:") # typically at least two FALSE, nowadays often three

x1 <- c("Foobar", "bla bla", "something", "another", "blu", "brown",
        "blau blüht der Enzian") # non-ASCII
x2 <- cbind(
      startsWith(x1, "b"),
      startsWith(x1, "bl"),
      startsWith(x1, "bla"),
        endsWith(x1, "n"),
        endsWith(x1, "an"))
rownames(x2) <- x1; colnames(x2) <- c("b", "bl", "bla", "n", "an")
x2

## Non-equivalence in case of missing values in 'x', see Details:
x <- c("all", "but", NA_character_)
cbind(startsWith(x, "a"),
      substring(x, 1L, 1L) == "a",
      grepl("^a", x))

Initialization at Start of an R Session

Description

In R, the startup mechanism is as follows.

Unless --no-environ was given on the command line, R searches for site and user files to process for setting environment variables. The name of the site file is the one pointed to by the environment variable R_ENVIRON; if this is unset, ‘R_HOME/etc/Renviron.site’ is used (if it exists, which it does not in a ‘factory-fresh’ installation). The name of the user file can be specified by the R_ENVIRON_USER environment variable; if this is unset, the files searched for are ‘.Renviron’ in the current or in the user's home directory (in that order). See ‘Details’ for how the files are read.

Then R searches for the site-wide startup profile file of R code unless the command line option --no-site-file was given. The path of this file is taken from the value of the R_PROFILE environment variable (after tilde expansion). If this variable is unset, the default is ‘R_HOME/etc/Rprofile.site’, which is used if it exists (which it does not in a ‘factory-fresh’ installation).

This code is sourced into the workspace (global environment). Users need to be careful not to unintentionally create objects in the workspace, and it is normally advisable to use local if code needs to be executed: see the examples. .Library.site may be assigned to and the assignment will effectively modify the value of the variable in the base namespace where .libPaths() finds it. One may also assign to .First and .Last, but assigning to other variables in the execution environment is not recommended and does not work in some older versions of R.

Then, unless --no-init-file was given, R searches for a user profile, a file of R code. The path of this file can be specified by the R_PROFILE_USER environment variable (and tilde expansion will be performed). If this is unset, a file called ‘.Rprofile’ is searched for in the current directory or in the user's home directory (in that order). The user profile file is sourced into the workspace.

Note that when the site and user profile files are sourced only the base package is loaded, so objects in other packages need to be referred to by e.g. utils::dump.frames or after explicitly loading the package concerned.

R then loads a saved image of the user workspace from ‘.RData’ in the current directory if there is one (unless --no-restore-data or --no-restore was specified on the command line).

Next, if a function .First is found on the search path, it is executed as .First(). Finally, function .First.sys() in the base package is run. This calls require to attach the default packages specified by options("defaultPackages"). If the methods package is included, this will have been attached earlier (by function .OptRequireMethods()) so that namespace initializations such as those from the user workspace will proceed correctly.

A function .First (and .Last) can be defined in appropriate ‘.Rprofile’ or ‘Rprofile.site’ files or have been saved in ‘.RData’. If you want a different set of packages than the default ones when you start, insert a call to options in the ‘.Rprofile’ or ‘Rprofile.site’ file. For example, options(defaultPackages = character()) will attach no extra packages on startup (only the base package) (or set R_DEFAULT_PACKAGES=NULL as an environment variable before running R). Using options(defaultPackages = "") or R_DEFAULT_PACKAGES="" enforces the R system default.

On front-ends which support it, the commands history is read from the file specified by the environment variable R_HISTFILE (default ‘.Rhistory’ in the current directory) unless --no-restore-history or --no-restore was specified.

The command-line option --vanilla implies --no-site-file, --no-init-file, --no-environ and (except for R CMD) --no-restore.

Details

Note that there are two sorts of files used in startup: environment files which contain lists of environment variables to be set, and profile files which contain R code.

Lines in a site or user environment file should be either comment lines starting with #, or lines of the form name=value. The latter sets the environmental variable name to value, overriding an existing value. If value contains an expression of the form ${foo-bar}, the value is that of the environmental variable foo if that is set, otherwise bar. For ${foo:-bar}, the value is that of foo if that is set to a non-empty value, otherwise bar. (If it is of the form ${foo}, the default is "".) This construction can be nested, so bar can be of the same form (as in ${foo-${bar-blah}}). Note that the braces are essential: for example $HOME will not be interpreted.

Leading and trailing white space in value are stripped. value is then processed in a similar way to a Unix shell: in particular (single or double) quotes not preceded by backslash are removed and backslashes are removed except inside such quotes.

For readability and future compatibility it is recommended to only use constructs that have the same behavior as in a Unix shell. Hence, expansions of variables should be in double quotes (e.g. "${HOME}", in case they may contain a backslash) and literals including a backslash should be in single quotes. If a variable value may end in a backslash, such as PATH on Windows, it may be necessary to protect the following quote from it, e.g. "${PATH}/". It is recommended to use forward slashes instead of backslashes. It is ok to mix text in single and double quotes, see examples below.
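The name=value syntax can be tried in a running session with readRenviron, which processes a file in the same way. In the sketch below the file is a temporary one, and the variable names DEMO_A, DEMO_B and DEMO_UNSET_VAR are hypothetical names chosen so as not to clash with real settings.

```r
# Write a small environment file to a temporary location.
env_file <- tempfile(fileext = ".Renviron")
writeLines(c(
  "# comment lines start with a hash",
  "DEMO_A=${DEMO_UNSET_VAR-fallback}",
  'DEMO_B="two words"'
), env_file)

# Make sure the variable used in the expansion really is unset.
Sys.unsetenv("DEMO_UNSET_VAR")

readRenviron(env_file)
demo_values <- Sys.getenv(c("DEMO_A", "DEMO_B"))
print(demo_values)

unlink(env_file)
```

The first value falls back to the text after the hyphen because DEMO_UNSET_VAR is not set; the second has its surrounding double quotes removed.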

On systems with sub-architectures (mainly Windows), the files ‘Renviron.site’ and ‘Rprofile.site’ are looked for first in architecture-specific directories, e.g. ‘R_HOME/etc/i386/Renviron.site’. And e.g. ‘.Renviron.i386’ will be used in preference to ‘.Renviron’.

There is a 100,000 byte limit on the length of a line (after expansions) in environment files.

Note

It is not intended that there be interaction with the user during startup code. Attempting to do so can crash the R process.

On Unix versions of R there is also a file ‘R_HOME/etc/Renviron’ which is read very early in the start-up processing. It contains environment variables set by R in the configure process. Values in that file can be overridden in site or user environment files: do not change ‘R_HOME/etc/Renviron’ itself. Note that this is distinct from ‘R_HOME/etc/Renviron.site’.

Command-line options may well not apply to alternative front-ends: they do not apply to R.app on macOS.

R CMD check and R CMD build do not always read the standard startup files, but they do always read specific ‘Renviron’ files. The location of these can be controlled by the environment variables R_CHECK_ENVIRON and R_BUILD_ENVIRON. If these are set their value is used as the path for the ‘Renviron’ file; otherwise, files ‘~/.R/check.Renviron’ or ‘~/.R/build.Renviron’ or sub-architecture-specific versions are employed.

If you want ‘~/.Renviron’ or ‘~/.Rprofile’ to be ignored by child R processes (such as those run by R CMD check and R CMD build), set the appropriate environment variable R_ENVIRON_USER or R_PROFILE_USER to (if possible, which it is not on Windows) "" or to the name of a non-existent file.

See Also

For the definition of the ‘home’ directory on Windows see the ‘rw-FAQ’ Q2.14. It can be found from a running R by Sys.getenv("R_USER").

.Last for final actions at the close of an R session. commandArgs for accessing the command line arguments.

There are examples of using startup files to set defaults for graphics devices in the help for X11 and quartz.

An Introduction to R for more command-line options: those affecting memory management are covered in the help file for Memory.

readRenviron to read ‘.Renviron’ files.

For profiling code, see Rprof.

Examples

## Not run:
## Example ~/.Renviron on Unix
R_LIBS=~/R/library
PAGER=/usr/local/bin/less

## Example .Renviron on Windows
R_LIBS=C:/R/library
MY_TCLTK="c:/Program Files/Tcl/bin"
# Variable expansion in double quotes, string literals with backslashes in
# single quotes.
R_LIBS_USER="${APPDATA}"'\R-library'

## Example of setting R_DEFAULT_PACKAGES (from R CMD check)
R_DEFAULT_PACKAGES='utils,grDevices,graphics,stats'
# this loads the packages in the order given, so they appear on
# the search path in reverse order.

## Example of .Rprofile
options(width = 65, digits = 5)
options(show.signif.stars = FALSE)
setHook(packageEvent("grDevices", "onLoad"),
        function(...) grDevices::ps.options(horizontal = FALSE))
set.seed(1234)
.First <- function() cat("\n   Welcome to R!\n\n")
.Last <- function()  cat("\n   Goodbye!\n\n")

## Example of Rprofile.site
local({
  # add MASS to the default packages, set a CRAN mirror
  old <- getOption("defaultPackages"); r <- getOption("repos")
  r["CRAN"] <- "http://my.local.cran"
  options(defaultPackages = c(old, "MASS"), repos = r)
  ## (for Unix terminal users) set the width from COLUMNS if set
  cols <- Sys.getenv("COLUMNS")
  if(nzchar(cols)) options(width = as.integer(cols))
  # interactive sessions get a fortune cookie (needs fortunes package)
  if(interactive())
    fortunes::fortune()
})

## if .Renviron contains
FOOBAR="coo\bar"doh\ex"abc\"def'"

## then we get
# > cat(Sys.getenv("FOOBAR"), "\n")
# coo\bardoh\exabc"def'

## End(Not run)

Stop Function Execution

Description

stop stops execution of the current expression and executes an error action.

geterrmessage gives the last error message.

Usage

stop(..., call. = TRUE, domain = NULL)
geterrmessage()

Arguments

...

zero or more objects which can be coerced to character (and which are pasted together with no separator) or a single condition object.

call.

logical, indicating if the call should become part of the error message.

domain

see gettext. If NA, messages will not be translated.

Details

The error action is controlled by error handlers established within the executing code and by the current default error handler set by options(error=). The error is first signaled as if using signalCondition(). If there are no handlers or if all handlers return, then the error message is printed (if options("show.error.messages") is true) and the default error handler is used. The default behaviour (the NULL error-handler) in interactive use is to return to the top level prompt or the top level browser, and in non-interactive use to (effectively) call q("no", status = 1, runLast = FALSE) unless getOption("catch.script.errors") is true.

The default handler stores the error message in a buffer; it can be retrieved by geterrmessage(). It also stores a trace of the call stack that can be retrieved by traceback().

Errors will be truncated to getOption("warning.length") characters, default 1000.

If a condition object is supplied it should be the only argument, and further arguments will be ignored, with a warning.
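The two ways of calling stop, with message pieces or with a single condition object, can be sketched as follows. The class name validationError and the helper fail are hypothetical and used only for illustration.

```r
# 'validationError' is a hypothetical condition class for this sketch.
cond <- errorCondition("value must be positive",
                       class = "validationError", call = NULL)

# A condition object is signaled as-is, so a handler can match its class.
handled <- tryCatch(stop(cond),
                    validationError = function(e) conditionMessage(e))
print(handled)

# Message pieces are pasted together with no separator.
fail <- function() stop("something ", "went wrong", call. = FALSE)
try(fail(), silent = TRUE)

# try() records the message, which geterrmessage() then retrieves.
msg <- geterrmessage()
print(msg)
```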

Value

geterrmessage gives the last error message, as a character string ending in "\n".

Note

Use domain = NA whenever ... contains a result from gettextf(), as that is translated already.

References

Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988) The New S Language. Wadsworth & Brooks/Cole.

See Also

warning, try to catch errors and retry, and options for setting error handlers. stopifnot for validity testing. tryCatch and withCallingHandlers can be used to establish custom handlers while executing an expression.

gettext for the mechanisms for the automated translation of messages.

Examples

iter <- 12
try(if(iter > 10) stop("too many iterations"))

tst1 <- function(...) stop("dummy error")
try(tst1(1:10, long, calling, expression))

tst2 <- function(...) stop("dummy error", call. = FALSE)
try(tst2(1:10, longcalling, expression, but.not.seen.in.Error))

Ensure the Truth of R Expressions

Description

If any of the expressions (in ... or exprs) are not all TRUE, stop is called, producing an error message indicating the first expression which was not (all) true.

Usage

stopifnot(..., exprs, exprObject, local=TRUE)

Arguments

..., exprs

any number of R expressions, which should each evaluate to (a logical vector of all) TRUE. Use either ... or exprs, the latter typically an unevaluated expression of the form

{
   expr1
   expr2
   ....
}

Note that e.g., positive numbers are not TRUE, even when they are coerced to TRUE, e.g., inside if(.) or in arithmetic computations in R.

If names are provided to ..., they will be used in lieu of the default error message.

exprObject

alternative to exprs or ...: an ‘expression-like’ object, typically an expression, but also a call, a name, or atomic constant such as TRUE.

local

(only when exprs is used:) indicates the environment in which the expressions should be evaluated; by default the one from where stopifnot() has been called.

Details

This function is intended for use in regression tests or also argument checking of functions, in particular to make them easier to read.

stopifnot(A, B) or equivalently stopifnot(exprs = {A ; B}) are conceptually equivalent to

 { if(any(is.na(A)) || !all(A)) stop(...);
   if(any(is.na(B)) || !all(B)) stop(...) }

Since R version 3.6.0, stopifnot() no longer handles potential errors or warnings (by tryCatch() etc) for each single expression and may use sys.call(n) to get a meaningful and short error message in case an expression did not evaluate to all TRUE. This provides considerably less overhead.

Since R version 3.5.0, expressions are evaluated sequentially, and hence evaluation stops as soon as there is a “non-TRUE”, as indicated by the above conceptual equivalence statement.
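Sequential evaluation can be observed with a small sketch that records which expressions were actually evaluated. The helper note and the vector evaluated are hypothetical names introduced for this example.

```r
# Record the tag of every expression that gets evaluated.
evaluated <- character()
note <- function(tag, value) {
  evaluated <<- c(evaluated, tag)
  value
}

# The second expression is not TRUE, so the third is never evaluated.
res <- tryCatch(
  stopifnot(note("first", TRUE), note("second", FALSE), note("third", TRUE)),
  error = conditionMessage
)

print(evaluated)
print(res)
```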

Also, since R version 3.5.0, stopifnot(exprs = { ... }) can be used alternatively and may be preferable in the case of several expressions, as they are more conveniently evaluated interactively (“no extraneous , ”).

Since R version 3.4.0, when an expression (from ...) is not true and is a call to all.equal, the error message will report the (first part of the) differences reported by all.equal(*); since R 4.3.0, this happens for all calls where "all.equal" pmatch()es the function called, e.g., when that is called all.equalShow, see the example in all.equal.

Value

(NULL if all statements in ... are TRUE.)

Note

Trying to use the stopifnot(exprs = ..) version via a shortcut, say,

 assertWRONG <- function(exprs) stopifnot(exprs = exprs)

is delicate and the above is not a good idea. Contrary to stopifnot() which takes care to evaluate the parts of exprs one by one and stop at the first non-TRUE, the above short cut would typically evaluate all parts of exprs and pass the result, i.e., typically of the last entry of exprs to stopifnot().

However, a more careful version,

 assert <- function(exprs) eval.parent(substitute(stopifnot(exprs = exprs)))

may be a nice short cut for stopifnot(exprs = *) calls using the more commonly known verb as function name.
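A brief usage sketch of the more careful assert wrapper, repeated here so the example is self-contained. The expressions inside the braces are arbitrary; the call to stop in the second block is there only to show that evaluation does not reach it.

```r
# The wrapper from the note above.
assert <- function(exprs) eval.parent(substitute(stopifnot(exprs = exprs)))

# All parts are TRUE: returns NULL invisibly.
ok <- assert({
  1 < 2
  is.character("a")
})
print(is.null(ok))

# Evaluation stops at the first part that is not TRUE.
msg <- tryCatch(
  assert({
    1 < 2
    2 < 1
    stop("never reached")
  }),
  error = conditionMessage
)
print(msg)
```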

See Also

stop, warning; assertCondition in package tools complements stopifnot() for testing warnings and errors.

Examples

## NB: Some of these examples are expected to produce an error. To
##     prevent them from terminating a run with example() they are
##     piped into a call to try().

stopifnot(1 == 1, all.equal(pi, 3.14159265), 1 < 2) # all TRUE

m <- matrix(c(1, 3, 3, 1), 2, 2)
stopifnot(m == t(m), diag(m) == rep(1, 2)) # all(.) |=>  TRUE

stopifnot(length(10)) |> try() # gives an error: '1' is *not* TRUE
## even when   if(1) "ok"   works

stopifnot(all.equal(pi, 3.141593), 2 < 2, (1:10 < 12), "a" < "b") |> try()
## More convenient for interactive "line by line" evaluation:
stopifnot(exprs = {
  all.equal(pi, 3.1415927)
  2 < 2
  1:10 < 12
  "a" < "b"
}) |> try()

eObj <- expression(2 < 3, 3 <= 3:6, 1:10 < 2)
stopifnot(exprObject = eObj) |> try()
stopifnot(exprObject = quote(3 == 3))
stopifnot(exprObject = TRUE)

# long all.equal() error messages are abbreviated:
stopifnot(all.equal(rep(list(pi), 4), list(3.1, 3.14, 3.141, 3.1415))) |> try()

# The default error message can be overridden to be more informative:
m[1, 2] <- 12
stopifnot("m must be symmetric" = m == t(m)) |> try()
#=> Error: m must be symmetric

##' warnifnot(): a "only-warning" version of stopifnot()
##'   {Yes, learn how to use do.call(substitute, ...) in a powerful manner !!}
warnifnot <- stopifnot; N <- length(bdy <- body(warnifnot))
bdy        <- do.call(substitute, list(bdy, list(stopifnot = quote(warnifnot))))
bdy[[N-1]] <- do.call(substitute, list(bdy[[N-1]], list(stop = quote(warning))))
body(warnifnot) <- bdy
warnifnot(1 == 1, 1 < 2, 2 < 2) # => warns " 2 < 2 is not TRUE  "
warnifnot(exprs = {
    1 == 1
    3 < 3  # => warns "3 < 3 is not TRUE"
})

Date-time Conversion Functions to and from Character

Description

Functions to convert between character representations and objects of classes "POSIXlt" and "POSIXct" representing calendar dates and times.

Usage

## S3 method for class 'POSIXct'
format(x, format = "", tz = "", usetz = FALSE, ...)
## S3 method for class 'POSIXlt'
format(x, format = "", usetz = FALSE,
       digits = getOption("digits.secs"), ...)
## S3 method for class 'POSIXt'
as.character(x, digits = if(inherits(x, "POSIXlt")) 14L else 6L,
             OutDec = ".", ...)

strftime(x, format = "", tz = "", usetz = FALSE, ...)
strptime(x, format, tz = "")

Arguments

x

an object to be converted: a character vector for strptime, an object which can be converted to "POSIXlt" for strftime.

tz

a character string specifying the time zone to be used for the conversion. System-specific (see as.POSIXlt), but "" is the current time zone, and "GMT" is UTC. Invalid values are most commonly treated as UTC, on some platforms with a warning.

format

a character string. The default for the format methods is "%Y-%m-%d %H:%M:%S" if any element has a time component which is not midnight, and "%Y-%m-%d" otherwise. If options("digits.secs") is set, up to the specified number of digits will be printed for seconds.

...

further arguments to be passed from or to other methods.

usetz

logical. Should the time zone abbreviation be appended to the output? This is used in printing times, and more reliable than using "%Z".

digits

integer determining the format()ing of seconds when needed. Note that the defaults for format() and as.character() differ on purpose, as.character() giving close to full accuracy as it does for numbers.

OutDec

a 1-character string specifying the decimal point to be used; the default is not getOption("OutDec") on purpose.

Details

The format and as.character methods and strftime convert objects from the classes "POSIXlt" and "POSIXct" to character vectors.

strptime converts character vectors to class "POSIXlt": its input x is first converted by as.character. Each input string is processed as far as necessary for the format specified: any trailing characters are ignored.
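A short sketch of the input side: the string below carries trailing text beyond what the format asks for, and the time zone is fixed to "UTC" so the result does not depend on the session's settings. The input strings are made up for illustration.

```r
# Trailing characters beyond the format are ignored.
parsed <- strptime("2024-06-15 17:27:47 and some trailing text",
                   format = "%Y-%m-%d %H:%M:%S", tz = "UTC")
print(class(parsed))

# Convert back to character with a different layout.
print(format(parsed, "%d/%m/%Y %H:%M"))

# Input that does not match the format gives NA rather than an error.
unparsed <- strptime("not a date", "%Y-%m-%d", tz = "UTC")
print(is.na(unparsed))
```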

strftime is a wrapper for format.POSIXlt, and it and format.POSIXct first convert to class "POSIXlt" by calling as.POSIXlt (so they also work for class "Date"). Note that only that conversion depends on the time zone. Since R version 4.2.0, as.POSIXlt() conversion now treats the non-finite numeric -Inf, Inf, NA and NaN differently (where previously all were treated as NA). Also the format() method for POSIXlt now treats these different non-finite times and dates analogously to type double.

The usual vector re-cycling rules are applied to x and format so the answer will be of length of the longer of these vectors.
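The default format, the usetz argument and the recycling of format can be sketched on two fixed date-times. The objects midnight and evening are illustrative; "UTC" is used so the output does not depend on the local time zone.

```r
midnight <- as.POSIXct("2024-06-15 00:00:00", tz = "UTC")
evening  <- as.POSIXct("2024-06-15 17:27:47", tz = "UTC")

# Default format: date only when the time component is midnight.
print(format(midnight))
print(format(evening))

# Append the time zone abbreviation.
print(format(evening, usetz = TRUE))

# One date-time recycled against three format strings.
parts <- format(evening, c("%Y", "%m", "%d"))
print(parts)
```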

Locale-specific conversions to and from character strings are used where appropriate and available. This affects the names of the days and months, the AM/PM indicator (if used) and the separators in output formats such as %x and %X, via the setting of the LC_TIME locale category. The ‘current locale’ of the descriptions might mean the locale in use at the start of the R session or when these functions are first used. (For input, the locale-specific conversions can be changed by calling Sys.setlocale with category LC_TIME (or LC_ALL). For output, what happens depends on the OS but usually works.)

The details of the formats are platform-specific, but the following are likely to be widely available: most are defined by the POSIX standard. A conversion specification is introduced by %, usually followed by a single letter or O or E and then a single letter. Any character in the format string not part of a conversion specification is interpreted literally (and %% gives %). Widely implemented conversion specifications include

%a

Abbreviated weekday name in the current locale on this platform. (Also matches full name on input: in some locales there are no abbreviations of names.)

%A

Full weekday name in the current locale. (Also matches abbreviated name on input.)

%b

Abbreviated month name in the current locale on this platform. (Also matches full name on input: in some locales there are no abbreviations of names.)

%B

Full month name in the current locale. (Also matches abbreviated name on input.)

%c

Date and time. Locale-specific on output, "%a %b %e %H:%M:%S %Y" on input.

%C

Century (00–99): the integer part of the year divided by 100.

%d

Day of the month as decimal number (01–31).

%D

Date format such as %m/%d/%y: the C99 standard says it should be that exact format (but not all OSes comply).

%e

Day of the month as decimal number (1–31), with a leading space for a single-digit number.

%F

Equivalent to %Y-%m-%d (the ISO 8601 date format).

%g

The last two digits of the week-based year (see %V). (Accepted but ignored on input.)

%G

The week-based year (see %V) as a decimal number. (Accepted but ignored on input.)

%h

Equivalent to %b.

%H

Hours as decimal number (00–23). As a special exception strings such as ‘24:00:00’ are accepted for input, since ISO 8601 allows these.

%I

Hours as decimal number (01–12).

%j

Day of year as decimal number (001–366): for input, 366 is only valid in a leap year.

%m

Month as decimal number (01–12).

%M

Minute as decimal number (00–59).

%n

Newline on output, arbitrary whitespace on input.

%p

AM/PM indicator in the locale. Used in conjunction with %I and not with %H. An empty string in some locales (for example on some OSes, non-English European locales including Russia). The behaviour is undefined if used for input in such a locale.

Some platforms accept %P for output, which uses a lower-case version (%p may also use lower case): others will output P.

%r

For output, the 12-hour clock time (using the locale's AM or PM): only defined in some locales, and on some OSes misleading in locales which do not define an AM/PM indicator. For input, equivalent to %I:%M:%S %p.

%R

Equivalent to %H:%M.

%S

Second as integer (00–61), allowing for up to two leap-seconds (but POSIX-compliant implementations will ignore leap seconds).

%t

Tab on output, arbitrary whitespace on input.

%T

Equivalent to %H:%M:%S.

%u

Weekday as a decimal number (1–7, Monday is 1).

%U

Week of the year as decimal number (00–53) using Sunday as the first day of the week (and typically with the first Sunday of the year as day 1 of week 1). The US convention.

%V

Week of the year as decimal number (01–53) as defined in ISO 8601. If the week (starting on Monday) containing 1 January has four or more days in the new year, then it is considered week 1. Otherwise, it is the last week of the previous year, and the next week is week 1. See %G (%g) for the year corresponding to the week given by %V. (Accepted but ignored on input.)

%w

Weekday as decimal number (0–6, Sunday is 0).

%W

Week of the year as decimal number (00–53) using Monday as the first day of week (and typically with the first Monday of the year as day 1 of week 1). The UK convention.

%x

Date. Locale-specific on output, "%y/%m/%d" on input.

%X

Time. Locale-specific on output, "%H:%M:%S" on input.

%y

Year without century (00–99). On input, values 00 to 68 are prefixed by 20 and 69 to 99 by 19 – that is the behaviour specified by the 2018 POSIX standard, but it does also say ‘it is expected that in a future version the default century inferred from a 2-digit year will change’.

%Y

Year with century. Note that whereas there was no zero in the original Gregorian calendar, ISO 8601:2004 defines it to be valid (interpreted as 1BC): see https://en.wikipedia.org/wiki/0_(year). However, the standards also say that years before 1582 in its calendar should only be used with agreement of the parties involved.

For input, only years 0:9999 are accepted.

%z

Signed offset in hours and minutes from UTC, so -0800 is 8 hours behind UTC. (Standard only for output. For input R currently supports it on all platforms: values from -1400 to +1400 are accepted.)

%Z

(Output only.) Time zone abbreviation as a character string (empty if not available). This may not be reliable when a time zone has changed abbreviations over the years.

Where leading zeros are shown they will be used on output but are optional on input. Names are matched case-insensitively on input: whether they are capitalized on output depends on the platform and the locale. Note that abbreviated names are platform-specific (although the standards specify that in the ‘C’ locale they must be the first three letters of the capitalized English name: this convention is widely used in English-language locales but for example the French month abbreviations are not the same on any two of Linux, macOS, Solaris and Windows). Knowing what the abbreviations are is essential if you wish to use %a, %b or %h as part of an input format: see the examples for how to check.

When %z or %Z is used for output with an object with an assigned time zone an attempt is made to use the values for that time zone, but it is not guaranteed to succeed.

The definition of ‘whitespace’ for %n and %t is platform-dependent: for most it does not include non-breaking spaces.
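As a rough illustration of the week-number and two-digit-year rules listed above, the following sketch formats one date on which the three week conventions give different answers, and reads two years with %y. The date and the variable names are arbitrary choices for the example, and it assumes that the platform supports %U, %W, %V and %G on output.

```r
## 1 January 2021 fell on a Friday, so it precedes both the first Sunday
## and the first Monday of 2021 and belongs to the last ISO week of 2020.
d <- as.Date("2021-01-01")
weeks <- c(U = format(d, "%U"),   # Sunday-based weeks (US convention)
           W = format(d, "%W"),   # Monday-based weeks (UK convention)
           V = format(d, "%V"),   # ISO 8601 week
           G = format(d, "%G"))   # year that the ISO week belongs to
weeks             # "00" "00" "53" "2020" (named U, W, V, G)
format(d, "%u")   # "5": Monday is 1, so Friday is 5

## Two-digit years on input: 00-68 are read as 20xx, 69-99 as 19xx.
yy <- as.Date(c("01/01/68", "01/01/69"), format = "%d/%m/%y")
format(yy, "%Y")  # "2068" "1969"
```

Pairing %V with %G rather than %Y avoids reporting week 53 of the wrong year.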

Not in the standards and less widely implemented are

%k

The 24-hour clock time with single digits preceded by a blank.

%l

The 12-hour clock time with single digits preceded by a blank.

%s

(Output only.) The number of seconds since the epoch.

%+

(Output only.) Similar to %c, often "%a %b %e %H:%M:%S %Z %Y". May depend on the locale.

For output there are also %O[dHImMUVwWy] which may emit numbers in an alternative locale-dependent format (e.g., roman numerals), and %E[cCyYxX] which can use an alternative ‘era’ (e.g., a different religious calendar). Which of these are supported is OS-dependent. These are accepted for input, but with the standard interpretation.

Specific to R is %OSn, which for output gives the seconds truncated to 0 <= n <= 6 decimal places (and if %OS is not followed by a digit, it uses the setting of getOption("digits.secs"), or if that is unset, n = 0). Further, for strptime %OS will input seconds including fractional seconds. Note that %S does not include fractional parts on output.
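A minimal sketch of %OSn, assuming a time whose fractional part (.25) is exactly representable in binary so that truncation is easy to follow; the timestamp itself is made up for the example.

```r
## Read fractional seconds with %OS (plain %S would stop at the integer part).
x <- strptime("2024-03-01 12:34:56.25", "%Y-%m-%d %H:%M:%OS", tz = "UTC")
x$sec                     # 56.25

## On output the seconds are truncated, not rounded, to n decimal places.
format(x, "%H:%M:%S")     # "12:34:56"
format(x, "%H:%M:%OS1")   # "12:34:56.2"
format(x, "%H:%M:%OS2")   # "12:34:56.25"
```

Because of the truncation, values that are not exactly representable (such as .789) may print with a last digit one lower than expected.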

The behaviour of other conversion specifications (and even if other character sequences commencing with % are conversion specifications) is system-specific. Some systems document that the use of multi-byte characters in format is unsupported: UTF-8 locales are unlikely to cause a problem.

Value

The format methods and strftime return character vectors representing the time. NA times are returned as NA_character_.

strptime turns character representations into an object of class "POSIXlt". The time zone is used to set the isdst component and to set the "tzone" attribute if tz != "". If the specified time is invalid (for example ‘"2010-02-30 08:00"’) all the components of the result are NA. (NB: this does mean exactly what it says: if it is an invalid time, not just a time that does not exist in some time zone.)
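The behaviour for invalid times can be checked directly. This is a small sketch using the invalid date quoted above next to a valid one; the variable names are illustrative only.

```r
bad <- strptime("2010-02-30 08:00", "%Y-%m-%d %H:%M", tz = "UTC")
ok  <- strptime("2010-02-28 08:00", "%Y-%m-%d %H:%M", tz = "UTC")

is.na(bad)          # TRUE: 30 February does not exist
bad$mday            # NA, like the other components
class(ok)           # "POSIXlt" "POSIXt"
attr(ok, "tzone")   # "UTC", set because tz != ""
```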

Printing years

Everyone agrees that years from 1000 to 9999 should be printed with 4 digits, but the standards do not define what is to be done outside that range. For years 0 to 999 most OSes pad with zeros or spaces to 4 characters, but Linux/glibc outputs just the number.

OS facilities will probably not print years before 1 CE (aka 1 AD) ‘correctly’ (they tend to assume the existence of a year 0: see https://en.wikipedia.org/wiki/0_(year), and some OSes get them completely wrong). Common formats are -45 and -045.

Years after 9999 and before -999 are normally printed with five or more characters.

Some platforms support modifiers from POSIX 2008 (and others). On Linux/glibc the format "%04Y" assures a minimum of four characters and zero-padding (the default is no padding). The internal code (as used on Windows and by default on macOS) uses zero-padding by default (this can be controlled by environment variable R_PAD_YEARS_BY_ZERO). On those platforms, formats %04Y, %_4Y and %_Y can be used for zero, space and no padding respectively. (On macOS, the native code (not the default) supports none of these and uses zero-padding to 4 digits.)

Time zone offsets

Offsets from GMT (also known as UTC) are part of the conversion between timezones and to/from class "POSIXct", but cause difficulties as they are often computed incorrectly.

They conventionally have the opposite sign from time-zone specifications (see Sys.timezone): positive values are East of the meridian. Although there have been time zones with offsets like +00:09:21 (Paris in 1900), and -00:44:30 (Liberia until 1972), offsets are usually treated as whole numbers of minutes, and are most often seen in RFC 5322 email headers in forms like -0800 (e.g., used on the Pacific coast of the USA in winter).

Format %z can be used for input or output: it is a character string, conventionally plus or minus followed by two digits for hours and two for minutes: the standards say that an empty string should be output if the offset is undetermined, but some systems use +0000 or the offsets for the time zone in use for the current year. (On some platforms this works better after conversion to "POSIXct". Some platforms only recognize hour or half-hour offsets for output.)

Using %z for input makes most sense with tz = "UTC".
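A short sketch of reading an offset with %z: the input below is a local time four hours behind UTC, and with tz = "UTC" the result is expressed in UTC. The timestamp is an arbitrary example.

```r
## 14:36:38 at offset -0400 is 18:36:38 in UTC.
x <- strptime("2010-03-23 14:36:38 -0400", "%Y-%m-%d %H:%M:%S %z", tz = "UTC")
format(x)                     # "2010-03-23 18:36:38"

## On output, %z reports the offset of the object's own time zone.
format(as.POSIXct(x), "%z")   # "+0000"
```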

Sources

Input uses the POSIX function strptime and output the C99 function strftime.

However, not all OSes (notably Windows) provided strptime and many issues were found for those which did, so since 2000 R has used a fork of code from ‘glibc’. The forked code uses the system's strftime to find the locale-specific day and month names and any AM/PM indicator.

On some platforms (including Windows and by default on macOS) the system's strftime is replaced (along with most of the rest of the C-level datetime code) by code modified from IANA's ‘tzcode’ distribution (https://www.iana.org/time-zones).

Note that as strftime is used for output (and not wcsftime), argument format is translated if necessary to the session encoding.

Note

The default formats follow the rules of the ISO 8601 international standard which expresses a day as "2001-02-28" and a time as "14:01:02" using leading zeroes as here. (The ISO form uses no space, possibly ‘T’, to separate dates and times: R uses a space by default.)

For strptime the input string need not specify the date completely: it is assumed that unspecified seconds, minutes or hours are zero, and an unspecified year, month or day is the current one. (However, if a month is specified, the day of that month has to be specified by %d or %e since the current day of the month need not be valid for the specified month.) Some components may be returned as NA (but an unknown tzone component is represented by an empty string).

If the time zone specified is invalid on your system, what happens is system-specific but it will probably be ignored.

Remember that in most time zones some times do not occur and some occur twice because of transitions to/from ‘daylight saving’ (also known as ‘summer’) time. strptime does not validate such times (it does not assume a specific time zone), but conversion by as.POSIXct will do so. Conversion by strftime and formatting/printing uses OS facilities and may return nonsensical results for non-existent times at DST transitions.

In a C locale %c is required to be "%a %b %e %H:%M:%S %Y". As Windows does not comply (and uses a date format not understood outside N. America), that format is used by R on Windows in all locales.

There is a limit of 2048 bytes on each string produced by strftime and the format methods. As from R 4.3.0 attempting to exceed this is an error (previous versions silently truncated at 255 bytes).

References

International Organization for Standardization (2004, 2000, ...) ‘ISO 8601. Data elements and interchange formats – Information interchange – Representation of dates and times.’, slightly updated to International Organization for Standardization (2019) ‘ISO 8601-1:2019. Date and time – Representations for information interchange – Part 1: Basic rules’, and further amended in 2022. For links to versions available on-line see (at the time of writing) https://dotat.at/tmp/ISO_8601-2004_E.pdf and https://www.qsl.net/g1smd/isopdf.htm; for information on the current official version, see https://www.iso.org/iso/iso8601 and https://en.wikipedia.org/wiki/ISO_8601.

The POSIX 1003.1 standard, which is in some respects stricter than ISO 8601.

See Also

DateTimeClasses for details of the date-time classes; locales to query or set a locale.

Your system's help page on strftime to see how to specify their formats. (On some systems, including Windows, strftime is replaced by more comprehensive internal code.)

Examples

## locale-specific version of date()
format(Sys.time(), "%a %b %d %X %Y %Z")
## time to sub-second accuracy (if supported by the OS)
format(Sys.time(), "%H:%M:%OS3")
## read in date info in format 'ddmmmyyyy'
## This will give NA(s) in some non-English locales; setting the C locale
## as in the commented lines will overcome this on most systems.
## lct <- Sys.getlocale("LC_TIME"); Sys.setlocale("LC_TIME", "C")
x <- c("1jan1960", "2jan1960", "31mar1960", "30jul1960")
z <- strptime(x, "%d%b%Y")
## Sys.setlocale("LC_TIME", lct)
z
(chz <- as.character(z)) # same w/o TZ
## *here* (but not in general), the same as format():
stopifnot(exprs = {
    identical(chz, format(z))
    grepl("^1960-0[137]-[03][012]$", chz[!is.na(z)])
})
## read in date/time info in format 'm/d/y h:m:s'
dates <- c("02/27/92", "02/27/92", "01/14/92", "02/28/92", "02/01/92")
times <- c("23:03:20", "22:29:56", "01:03:30", "18:21:03", "16:56:26")
x <- paste(dates, times)
z2 <- strptime(x, "%m/%d/%y %H:%M:%S")
z2
## *here* (but not in general), the same as format():
stopifnot(identical(format(z2), as.character(z2)))
## time with fractional seconds
z3 <- strptime("20/2/06 11:16:16.683", "%d/%m/%y %H:%M:%OS")
z3 # prints without fractional seconds by default, digits.sec = NULL ("= 0")
op <- options(digits.secs = 3)
z3 # shows the 3 extra digits
as.character(z3) # ditto
options(op)
## time zone names are not portable, but 'EST5EDT' comes pretty close.
## (but its interpretation may not be universal: see ?timezones)
z4 <- strptime(c("2006-01-08 10:07:52", "2006-08-07 19:33:02"),
               "%Y-%m-%d %H:%M:%S", tz = "EST5EDT")
z4
attr(z4, "tzone")
as.character(z4)
z4$sec[2] <- pi # "very" fractional seconds
as.character(z4) # shows full precision
format(z4) # no fractional sec
format(z4, digits = 8) # shows only 6  (hard-wired maximum)
format(z4, digits = 4)
## An RFC 5322 header (Eastern Canada, during DST)
## In a non-English locale the commented lines may be needed.
## prev <- Sys.getlocale("LC_TIME"); Sys.setlocale("LC_TIME", "C")
strptime("Tue, 23 Mar 2010 14:36:38 -0400", "%a, %d %b %Y %H:%M:%S %z")
## Sys.setlocale("LC_TIME", prev)
## Make sure you know what the abbreviated names are for you if you wish
## to use them for input (they are matched case-insensitively):
format(s1 <- seq.Date(as.Date('1978-01-01'), by = 'day',   len = 7), "%a")
format(s2 <- seq.Date(as.Date('2000-01-01'), by = 'month', len = 12), "%b")
## Non-finite date-times :
format(as.POSIXct(Inf)) # "Inf"  (was  NA  in R <= 4.1.x)
format(as.POSIXlt(c(-Inf, Inf, NaN, NA))) # were all NA

Repeat the Elements of a Character Vector

Description

Repeat the character strings in a character vector a given number of times (i.e., concatenate the respective numbers of copies of the strings).

Usage

strrep(x, times)

Arguments

x

a character vector, or an object which can be coerced to a character vector using as.character.

times

an integer vector giving the (non-negative) numbers of times to repeat the respective elements of x.

Details

The elements of x and times will be recycled as necessary (if one has no elements, an empty character vector is returned). Missing elements in x or times result in missing elements of the return value.
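The recycling and missing-value rules can be seen in a few calls. This is only a sketch with made-up inputs; the variable names are not part of the API.

```r
r1 <- strrep("ab", 0:2)          # x recycled along times: "" "ab" "abab"
r2 <- strrep(c("x", "y"), 3)     # times recycled along x: "xxx" "yyy"
r3 <- strrep(c("a", NA), 2)      # missing x gives a missing result: "aa" NA
r4 <- strrep("a", c(1, NA))      # missing times does too: "a" NA
r5 <- strrep("a", integer(0))    # a zero-length argument gives character(0)
```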

Value

A character vector with the elements of the given character vector repeated the given numbers of times.

Examples

strrep("ABC", 2)
strrep(c("A", "B", "C"), 1:3)
## Create vectors with the given numbers of spaces:
strrep(" ", 1:5)

Split the Elements of a Character Vector

Description

Split the elements of a character vector x into substrings according to the matches to substring split within them.

Usage

strsplit(x, split, fixed = FALSE, perl = FALSE, useBytes = FALSE)

Arguments

x

character vector, each element of which is to be split. Other inputs, including a factor, will give an error.

split

character vector (or object which can be coerced to such) containing regular expression(s) (unless fixed = TRUE) to use for splitting. If empty matches occur, in particular if split has length 0, x is split into single characters. If split has length greater than 1, it is re-cycled along x.

fixed

logical. If TRUE match split exactly, otherwise use regular expressions. Has priority over perl.

perl

logical. Should Perl-compatible regexps be used?

useBytes

logical. If TRUE the matching is done byte-by-byte rather than character-by-character, and inputs with marked encodings are not converted. This is forced (with a warning) if any input is found which is marked as "bytes" (see Encoding).

Details

Argument split will be coerced to character, so you will see uses with split = NULL to mean split = character(0), including in the examples below.

Note that splitting into single characters can be done via split = character(0) or split = ""; the two are equivalent. The definition of ‘character’ here depends on the locale: in a single-byte locale it is a byte, and in a multi-byte locale it is the unit represented by a ‘wide character’ (almost always a Unicode code point).

A missing value of split does not split the corresponding element(s) of x at all.
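The rules for the split argument described above can be sketched as follows; the input strings and variable names are arbitrary.

```r
## 'split' is recycled along 'x': each string gets its own separator.
per_element <- strsplit(c("a-b", "c_d"), c("-", "_"))
per_element                   # list of c("a", "b") and c("c", "d")

## "", character(0) and NULL all split into single characters.
chars <- strsplit("abc", "")[[1]]
chars                         # "a" "b" "c"
identical(strsplit("abc", character(0)), strsplit("abc", NULL))   # TRUE

## A missing 'split' leaves the corresponding element unsplit.
unsplit <- strsplit("a b", NA_character_)[[1]]
unsplit                       # "a b"
```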

The algorithm applied to each input string is

    repeat {
        if the string is empty
            break.
        if there is a match
            add the string to the left of the match to the output.
            remove the match and all to the left of it.
        else
            add the string to the output.
            break.
    }

Note that this means that if there is a match at the beginning of a (non-empty) string, the first element of the output is "", but if there is a match at the end of the string, the output is the same as with the match removed.

Note also that if there is an empty match at the beginning of a non-empty string, the first character is returned and the algorithm continues with the rest of the string. This needs to be kept in mind when designing the regular expressions. For example, when looking for a word boundary followed by a letter ("[[:<:]]" with perl = TRUE), one can disallow a match at the beginning of a string (via "(?!^)[[:<:]]").
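To make the loop above concrete, here is a hedged re-implementation in R for the simplest case only: a fixed, non-empty separator. The function name split_sketch is hypothetical and this is not how strsplit is implemented internally; it merely follows the stated algorithm so that the leading and trailing cases can be compared.

```r
split_sketch <- function(s, sep) {
    out <- character(0)
    repeat {
        if (!nzchar(s)) break                    # the string is empty
        m <- regexpr(sep, s, fixed = TRUE)
        start <- as.integer(m)
        if (start > 0) {                         # there is a match
            len <- attr(m, "match.length")
            out <- c(out, substr(s, 1, start - 1))   # text left of the match
            s <- substr(s, start + len, nchar(s))    # drop match and text before it
        } else {
            out <- c(out, s)                     # no match: keep the remainder
            break
        }
    }
    out
}

split_sketch(",a", ",")        # "" "a": a leading match yields ""
split_sketch("a,", ",")        # "a":    a trailing match yields nothing extra
split_sketch("a,b,,c,", ",")   # "a" "b" "" "c"
```

For these inputs the sketch agrees with strsplit(x, ",", fixed = TRUE)[[1]]; empty matches and regular expressions are deliberately left out.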

Invalid inputs in the current locale are warned about up to 5 times.

Value

A list of the same length as x, the i-th element of which contains the vector of splits of x[i].

If any element of x or split is declared to be in UTF-8 (see Encoding), all non-ASCII character strings in the result will be in UTF-8 and have their encoding declared as UTF-8. (This also holds if any element is declared to be Latin-1 except in a Latin-1 locale.) For perl = TRUE, useBytes = FALSE all non-ASCII strings in a multibyte locale are translated to UTF-8.

If any element of x or split is marked as "bytes" (see Encoding), all non-ASCII character strings created by the splitting in the result will be marked as "bytes", but encoding of the resulting character strings not split is unspecified (may be "bytes" or the original). If no element of x or split is marked as "bytes", but useBytes = TRUE, even the encoding of the resulting character strings created by splitting is unspecified (may be "bytes" or "unknown", possibly invalid in the current encoding). Mixed use of "bytes" and other marked encodings is discouraged, but if still desired one may use iconv to re-encode the result e.g. to UTF-8 with suitably substituted invalid bytes.

See Also

paste for the reverse, grep and sub for string search and manipulation; also nchar, substr.

‘regular expression’ for the details of the pattern specification.

Option PCRE_use_JIT controls the details when perl = TRUE.

Examples

noquote(strsplit("A text I want to display with spaces", NULL)[[1]])
x <- c(as = "asfef", qu = "qwerty", "yuiop[", "b", "stuff.blah.yech")
# split x on the letter e
strsplit(x, "e")
unlist(strsplit("a.b.c", "."))
## [1] "" "" "" "" ""
## Note that 'split' is a regexp!
## If you really want to split on '.', use
unlist(strsplit("a.b.c", "[.]"))
## [1] "a" "b" "c"
## or
unlist(strsplit("a.b.c", ".", fixed = TRUE))
## a useful function: rev() for strings
strReverse <- function(x)
        sapply(lapply(strsplit(x, NULL), rev), paste, collapse = "")
strReverse(c("abc", "Statistics"))
## get the first names of the members of R-core
a <- readLines(file.path(R.home("doc"), "AUTHORS"))[-(1:8)]
a <- a[(0:2) - length(a)]
(a <- sub(" .*", "", a))
# and reverse them
strReverse(a)
## Note that final empty strings are not produced:
strsplit(paste(c("", "a", ""), collapse = "#"), split = "#")[[1]]
# [1] ""  "a"
## and also an empty string is only produced before a definite match:
strsplit("", " ")[[1]]    # character(0)
strsplit(" ", " ")[[1]]   # [1] ""

Convert Strings to Integers

Description

Convert strings to integers according to the given base using the C function strtol, or choose a suitable base following the C rules.

Usage

strtoi(x, base = 0L)

Arguments

x

a character vector, or something coercible to this by as.character.

base

an integer which is between 2 and 36 inclusive, or zero (default).

Details

Conversion is based on the C library function strtol.

For the default base = 0L, the base is chosen from the string representation of that element of x, so different elements can have different bases (see the first example). The standard C rules for choosing the base are that octal constants (prefix 0 not followed by x or X) and hexadecimal constants (prefix 0x or 0X) are interpreted as base 8 and 16; all other strings are interpreted as base 10.

For a base greater than 10, letters a to z (or A to Z) are used to represent 10 to 35.

Value

An integer vector of the same length as x. Values which cannot be interpreted as integers or would overflow are returned as NA_integer_.
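A brief sketch of the base rules and of the NA results. The base is passed explicitly in every call so that the example does not depend on the default; the inputs are arbitrary.

```r
## base = 0L picks the base per element from its prefix.
auto <- strtoi(c("0x1A", "010", "10"), base = 0L)
auto                                   # 26 8 10

bin <- strtoi("11111111", base = 2L)   # 255
z36 <- strtoi("z", base = 36L)         # 35: letters stand for 10 to 35

## Strings that are not valid in the base, or that overflow, give NA.
bad_digit <- strtoi("9", base = 8L)             # NA: 9 is not an octal digit
too_big   <- strtoi("2147483648", base = 10L)   # NA: exceeds the integer range
```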

See Also

For decimal strings as.integer is equally useful.

Examples

strtoi(c("0xff", "077", "123"))
strtoi(c("ffff", "FFFF"), 16L)
strtoi(c("177", "377"), 8L)

Trim Character Strings to Specified Display Widths

Description

Trim character strings to specified display widths.

Usage

strtrim(x, width)

Arguments

x

a character vector, or an object which can be coerced to a character vector by as.character.

width

positive integer values: recycled to the length of x.

Details

‘Width’ is interpreted as the display width in a monospaced font. What happens with non-printable characters (such as backspace, tab) is implementation-dependent and may depend on the locale (e.g., they may be included in the count or they may be omitted).

Using this function rather than substr is important when there might be double-width (e.g., Chinese/Japanese/Korean) characters in the character vector.
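The difference from substr can be sketched as below. The ASCII results are certain; for the three-character CJK string the outcome is assumed to depend on running in a UTF-8 locale, so no output is claimed for it.

```r
## ASCII: one column per character, so width and character count coincide.
a1 <- strtrim("abcdef", 3)                   # "abc"
a2 <- strtrim(c("abcdef", "xyz"), c(2, 5))   # "ab" "xyz": width is per element

## Double-width characters (assumes a UTF-8 locale).
cjk <- "\u65e5\u672c\u8a9e"
nchar(cjk, type = "chars")   # number of characters
nchar(cjk, type = "width")   # number of display columns
strtrim(cjk, 4)              # trims by display columns
substr(cjk, 1, 4)            # trims by characters
```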

Value

A character vector of the same length and with the same attributes as x (after possible coercion).

Elements of the result will have the encoding declared as that of the current locale (see Encoding) if the corresponding input had a declared encoding and the current locale is either Latin-1 or UTF-8.

Examples

strtrim(c("abcdef","abcdef","abcdef"), c(1,5,10))

Attribute Specification

Description

structure returns the given object with further attributes set.

Usage

structure(.Data, ...)

Arguments

.Data

an object which will have various attributes attached to it.

...

attributes, specified in tag = value form, which will be attached to data.

Details

Adding a class "factor" will ensure that numeric codes are given integer storage mode.

For historical reasons (these names are used when deparsing), attributes ".Dim", ".Dimnames", ".Names", ".Tsp" and ".Label" are renamed to "dim", "dimnames", "names", "tsp" and "levels".

It is possible to give the same tag more than once, in which case the last value assigned wins. As with other ways of assigning attributes, using tag = NULL removes attribute tag from .Data if it is present.
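The three points above (factor codes, renamed historical tags, and removal with NULL) can be sketched with a few small objects. The attribute name note and the variable names are invented for the example.

```r
## Historical tags are renamed: .Dim becomes dim.
m <- structure(1:6, .Dim = 2:3)
dim(m)                       # 2 3

## tag = NULL removes an attribute that is present.
x <- structure(1:3, names = c("a", "b", "c"), note = "demo")
y <- structure(x, note = NULL)
attributes(y)                # only names remain

## Double codes are stored as integer once class "factor" is added.
f <- structure(c(1, 2, 1), levels = c("lo", "hi"), class = "factor")
typeof(f)                    # "integer"
```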

References

Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988) The New S Language. Wadsworth & Brooks/Cole.

See Also

attributes, attr.

Examples

structure(1:6, dim=2:3)

Wrap Character Strings to Format Paragraphs

Description

Each character string in the input is first split into paragraphs (or lines containing whitespace only). The paragraphs are then formatted by breaking lines at word boundaries. The target columns for wrapping lines and the indentation of the first and all subsequent lines of a paragraph can be controlled independently.

Usage

strwrap(x, width = 0.9 * getOption("width"), indent = 0,
        exdent = 0, prefix = "", simplify = TRUE, initial = prefix)

Arguments

x

a character vector, or an object which can be converted to a character vector by as.character.

width

a positive integer giving the target column for wrapping lines in the output.

indent

a non-negative integer giving the indentation of the first line in a paragraph.

exdent

a non-negative integer specifying the indentation of subsequent lines in paragraphs.

prefix,initial

a character string to be used as prefix for each line except the first, for which initial is used.

simplify

a logical. If TRUE, the result is a single character vector of line text; otherwise, it is a list of the same length as x the elements of which are character vectors of line text obtained from the corresponding element of x. (Hence, the result in the former case is obtained by unlisting that of the latter.)

Details

Whitespace (space, tab or newline characters) in the input is destroyed. Double spaces after periods, question and exclamation marks (thought of as representing sentence ends) are preserved. Currently, possible sentence ends at line breaks are not considered specially.

Indentation is relative to the number of characters in the prefix string.
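A small sketch of how width, indent, exdent and prefix interact, using a made-up sentence. The exact line breaks are not claimed here, only the properties checked in the comments.

```r
txt <- "The quick brown fox jumps over the lazy dog and keeps running"

## Plain wrapping: every line stays within the target width.
plain <- strwrap(txt, width = 20)
max(nchar(plain)) <= 20                      # TRUE
identical(paste(plain, collapse = " "), txt) # TRUE: only line breaks changed

## The prefix comes first, and indentation is added after it.
quoted <- strwrap(txt, width = 20, indent = 2, exdent = 4, prefix = "> ")
writeLines(quoted)

## simplify = FALSE keeps one character vector per input element.
as_list <- strwrap(c(txt, "short"), width = 20, simplify = FALSE)
lengths(as_list)
```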

Value

A character vector (if simplify is TRUE), or a list of such character vectors, with declared input encodings preserved.

Examples

## Read in file 'THANKS'.
x <- paste(readLines(file.path(R.home("doc"), "THANKS")), collapse = "\n")
## Split into paragraphs and remove the first three ones
x <- unlist(strsplit(x, "\n[ \t\n]*\n"))[-(1:3)]
## Join the rest
x <- paste(x, collapse = "\n\n")
## Now for some fun:
writeLines(strwrap(x, width = 60))
writeLines(strwrap(x, width = 60, indent = 5))
writeLines(strwrap(x, width = 60, exdent = 5))
writeLines(strwrap(x, prefix = "THANKS> "))
## Note that messages are wrapped AT the target column indicated by
## 'width' (and not beyond it).
## From an R-devel posting by J. Hosking <[email protected]>.
x <- paste(sapply(sample(10, 100, replace = TRUE),
                  function(x) substring("aaaaaaaaaa", 1, x)), collapse = " ")
sapply(10:40,
       function(m)
       c(target = m, actual = max(nchar(strwrap(x, m)))))

Subsetting Vectors, Matrices and Data Frames

Description

Return subsets of vectors, matrices or data frames which meet conditions.

Usage

subset(x, ...)
## Default S3 method:
subset(x, subset, ...)
## S3 method for class 'matrix'
subset(x, subset, select, drop = FALSE, ...)
## S3 method for class 'data.frame'
subset(x, subset, select, drop = FALSE, ...)

Arguments

x

object to be subsetted.

subset

logical expression indicating elements or rows to keep: missing values are taken as false.

select

expression, indicating columns to select from a data frame.

drop

passed on to [ indexing operator.

...

further arguments to be passed to or from other methods.

Details

This is a generic function, with methods supplied for matrices, data frames and vectors (including lists). Packages and users can add further methods.

For ordinary vectors, the result is simply x[subset & !is.na(subset)].

For data frames, the subset argument works on the rows. Note that subset will be evaluated in the data frame, so columns can be referred to (by name) as variables in the expression (see the examples).

The select argument exists only for the methods for data frames and matrices. It works by first replacing column names in the selection expression with the corresponding column numbers in the data frame and then using the resulting integer vector to index the columns. This allows the use of the standard indexing conventions so that for example ranges of columns can be specified easily, or single columns can be dropped (see the examples).

The drop argument is passed on to the indexing method for matrices and data frames: note that the default for matrices is different from that for indexing.

Factors may have empty levels after subsetting; unused levels are not automatically removed. See droplevels for a way to drop all unused levels from a data frame.
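The points above about missing values, select and unused factor levels can be sketched on a tiny data frame. The data frame df and its columns are invented for the illustration.

```r
df <- data.frame(id = 1:4,
                 g  = factor(c("a", "b", "a", "c")),
                 v  = c(10, NA, 30, 40))

## Rows where the condition is NA are dropped, as if it were FALSE.
s <- subset(df, v > 15, select = c(id, g))
s$id                        # 3 4

## Unused levels survive subsetting until droplevels() is applied.
levels(s$g)                 # "a" "b" "c"
levels(droplevels(s)$g)     # "a" "c"

## Negative selection drops a column; vectors follow the same NA rule.
names(subset(df, select = -v))            # "id" "g"
subset(c(5, NA, 7), c(TRUE, NA, TRUE))    # 5 7
```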

Value

An object similar to x containing just the selected elements (for a vector), rows and columns (for a matrix or data frame), and so on.

Warning

This is a convenience function intended for use interactively. For programming it is better to use the standard subsetting functions like [, and in particular the non-standard evaluation of argument subset can have unanticipated consequences.
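One way to follow this advice inside a function is to build the logical index explicitly and use [ directly. The helper keep_above and the data frame below are hypothetical, shown only as a sketch of the pattern.

```r
df <- data.frame(v = c(10, NA, 30, 40), w = 1:4)

## Standard evaluation: the column is named by a string, and NA handling
## is spelled out instead of being implied.
keep_above <- function(data, column, threshold) {
    vals <- data[[column]]
    data[!is.na(vals) & vals > threshold, , drop = FALSE]
}

res <- keep_above(df, "v", 15)
res$w                        # 3 4, the same rows as subset(df, v > 15)
```

Because nothing is looked up by non-standard evaluation, a variable named v in the calling environment cannot change the result.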

Author(s)

Peter Dalgaard and Brian Ripley

See Also

[, transform, droplevels

Examples

subset(airquality, Temp > 80, select = c(Ozone, Temp))
subset(airquality, Day == 1, select = -Temp)
subset(airquality, select = Ozone:Wind)
with(airquality, subset(Ozone, Temp > 80))
## sometimes requiring a logical 'subset' argument is a nuisance
nm <- rownames(state.x77)
start_with_M <- nm %in% grep("^M", nm, value = TRUE)
subset(state.x77, start_with_M, Illiteracy:Murder)
# but in recent versions of R this can simply be
subset(state.x77, grepl("^M", nm), Illiteracy:Murder)

Substituting and Quoting Expressions

Description

substitute returns the parse tree for the (unevaluated) expression expr, substituting any variables bound in env.

quote simply returns its argument. The argument is not evaluated and can be any R expression.

enquote is a simple one-line utility which transforms a call of the form Foo(....) into the call quote(Foo(....)). This is typically used to protect a call from early evaluation.

Usage

substitute(expr, env)
quote(expr)
enquote(cl)

Arguments

expr

any syntactically valid R expression.

cl

a call, i.e., an R object of class (and mode) "call".

env

an environment or a list object. Defaults to the current evaluation environment.

Details

The typical use of substitute is to create informative labels for data sets and plots. The myplot example below shows a simple use of this facility. It uses the functions deparse and substitute to create labels for a plot which are character string versions of the actual arguments to the function myplot.

Substitution takes place by examining each component of the parse tree as follows: If it is not a bound symbol in env, it is unchanged. If it is a promise object, i.e., a formal argument to a function or explicitly created using delayedAssign(), the expression slot of the promise replaces the symbol. If it is an ordinary variable, its value is substituted, unless env is .GlobalEnv in which case the symbol is left unchanged.

Both quote and substitute are ‘special’ primitive functions which do not evaluate their arguments.
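The substitution rules can be sketched without any plotting. The helper label_of is a hypothetical name; the symbols a, b and z deliberately have no values, since nothing here is evaluated.

```r
## A promise (formal argument) is replaced by its expression.
label_of <- function(x) deparse1(substitute(x))
label_of(a + b)                                   # "a + b"

## With a list as env, bound symbols are replaced by their values.
s <- substitute(a + b, list(a = 1, b = quote(z)))
s                                                 # 1 + z

## enquote() wraps a call in quote(), so one round of eval() returns
## the call itself rather than its value.
cl <- quote(sqrt(16))
identical(eval(enquote(cl)), cl)                  # TRUE
eval(cl)                                          # 4
```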

Value

The mode of the result is generally "call" but may in principle be any type. In particular, single-variable expressions have mode "name" and constants have the appropriate base mode.
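A quick check of the modes mentioned above, using a hypothetical one-line wrapper f around substitute:

```r
f <- function(arg) substitute(arg)

m_call  <- mode(f(x + 1))   # "call"
m_name  <- mode(f(x))       # "name"
m_num   <- mode(f(1))       # "numeric"
m_chr   <- mode(f("a"))     # "character"
c(m_call, m_name, m_num, m_chr)
```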

Note

substitute works on a purely lexical basis. There is no guarantee that the resulting expression makes any sense.

Substituting and quoting often cause confusion when the argument is expression(...). The result is a call to the expression constructor function and needs to be evaluated with eval to give the actual expression object.

References

Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988) The New S Language. Wadsworth & Brooks/Cole.

See Also

missing for argument ‘missingness’, bquote for partial substitution, sQuote and dQuote for adding quotation marks to strings. Quotes about forward, back, and double quotes ‘'’, ‘`’, and ‘"’.

all.names to retrieve the symbol names from an expression or call.

Examples

require(graphics)
(s.e <- substitute(expression(a + b), list(a = 1)))  #> expression(1 + b)
(s.s <- substitute( a + b,            list(a = 1)))  #> 1 + b
c(mode(s.e), typeof(s.e)) #  "call", "language"
c(mode(s.s), typeof(s.s)) #   (the same)
# but:
(e.s.e <- eval(s.e))          #>  expression(1 + b)
c(mode(e.s.e), typeof(e.s.e)) #  "expression", "expression"
substitute(x <- x + 1, list(x = 1)) # nonsense
myplot <- function(x, y)
    plot(x, y, xlab = deparse1(substitute(x)),
               ylab = deparse1(substitute(y)))
## Simple examples about lazy evaluation, etc:
f1 <- function(x, y = x)             { x <- x + 1; y }
s1 <- function(x, y = substitute(x)) { x <- x + 1; y }
s2 <- function(x, y) { if(missing(y)) y <- substitute(x); x <- x + 1; y }
a <- 10
f1(a)  # 11
s1(a)  # 11
s2(a)  # a
typeof(s2(a))  # "symbol"

Substrings of a Character Vector

Description

Extract or replace substrings in a character vector.

Usage

substr(x, start, stop)
substring(text, first, last = 1000000L)
substr(x, start, stop) <- value
substring(text, first, last = 1000000L) <- value

Arguments

x,text

a character vector.

start,first

integer. The first element to be extracted or replaced.

stop,last

integer. The last element to be extracted or replaced.

value

a character vector, recycled if necessary.

Details

substring is compatible with S, with first and last instead of start and stop. For vector arguments, it expands the arguments cyclically to the length of the longest provided none are of zero length.

When extracting, if start is larger than the string length then "" is returned.

For the extraction functions, x or text will be converted to a character vector by as.character if it is not already one.

For the replacement functions, if start is larger than the string length then no replacement is done. If the portion to be replaced is longer than the replacement string, then only the portion the length of the string is replaced.

If any argument is an NA element, the corresponding element of the answer is NA.

Elements of the result will have the encoding declared as that of the current locale (see Encoding) if the corresponding input had a declared Latin-1 or UTF-8 encoding and the current locale is either Latin-1 or UTF-8.

If an input element has declared "bytes" encoding (see Encoding), the subsetting is done in units of bytes not characters.
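The edge cases described above can be sketched with ASCII strings; the inputs are arbitrary.

```r
## Extraction
e1 <- substr("abcdef", 10, 12)         # "": start is beyond the string
e2 <- substring("abcdef", 1:3, 3:5)    # "abc" "bcd" "cde": arguments recycled
e3 <- substr(NA, 1, 2)                 # NA

## Replacement never changes the length of the string.
x <- "abcdef"
substr(x, 2, 4) <- "ZZZZZ"   # only three characters fit
x                            # "aZZZef"

y <- "abcdef"
substr(y, 2, 5) <- "Z"       # a short replacement changes one character only
y                            # "aZcdef"

z <- "abc"
substr(z, 5, 6) <- "Q"       # start beyond the string: nothing is replaced
z                            # "abc"
```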

Value

For substr, a character vector of the same length and with the same attributes as x (after possible coercion).

For substring, a character vector of length the longest of the arguments. This will have names taken from x (if it has any after coercion, repeated as needed), and other attributes copied from x if it is the longest of the arguments.

For the replacement functions, a character vector of the same length as x or text, with attributes such as names preserved.

Elements of x or text with a declared encoding (see Encoding) will be returned with the same encoding.

Note

The S version of substring<- ignores last; this version does not.

These functions are often used with nchar to truncate a display. That does not really work (you want to limit the width, not the number of characters, so it would be better to use strtrim), but at least make sure you use the default nchar(type = "chars").

References

Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988) The New S Language. Wadsworth & Brooks/Cole. (substring.)

See Also

strsplit, paste, nchar.

Examples

substr("abcdef", 2, 4)
substring("abcdef", 1:6, 1:6)
## strsplit() is more efficient ...
substr(rep("abcdef", 4), 1:4, 4:5)
x <- c("asfef", "qwerty", "yuiop[", "b", "stuff.blah.yech")
substr(x, 2, 5)
substring(x, 2, 4:6)
X <- x
names(X) <- LETTERS[seq_along(x)]
comment(X) <- noquote("is a named vector")
str(aX <- attributes(X))
substring(x, 2) <- c("..", "+++")
substring(X, 2) <- c("..", "+++")
X
stopifnot(x == X, identical(aX, attributes(X)), nzchar(comment(X)))

Sum of Vector Elements

Description

sum returns the sum of all the valuespresent in its arguments.

Usage

sum(..., na.rm=FALSE)

Arguments

...

numeric or complex or logical vectors.

na.rm

logical. Should missing values (including NaN) be removed?

Details

This is a generic function: methods can be defined for it directly or via the Summary group generic. For this to work properly, the arguments ... should be unnamed, and dispatch is on the first argument.

If na.rm is FALSE an NA or NaN value in any of the arguments will cause a value of NA or NaN to be returned, otherwise NA and NaN values are ignored.

Logical true values are regarded as one, false values as zero. For historical reasons, NULL is accepted and treated as if it were integer(0).

Loss of accuracy can occur when summing values of different signs: this can even occur for sufficiently long integer inputs if the partial sums would cause integer overflow. Where possible extended-precision accumulators are used, typically well supported with C99 and newer, but possibly platform-dependent.

Value

The sum. If all of the ... arguments are of type integer or logical, then the sum is integer when possible and is double otherwise. Integer overflow should no longer happen since R version 3.5.0. For other argument types it is a length-one numeric (double) or complex vector.

NB: the sum of an empty set is zero, by definition.
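The rules above can be checked with a few small calls. This is only an illustrative sketch; the variable names are introduced here and are not part of the original page.

```r
n_true    <- sum(c(TRUE, FALSE, TRUE))       # logicals count as 1 and 0
empty_sum <- sum(NULL)                       # NULL is treated as integer(0)
with_na   <- sum(1:5, NA)                    # a missing value propagates
without   <- sum(1:5, NaN, na.rm = TRUE)     # NaN is dropped along with NA

# Result type: integer for integer/logical input, double otherwise.
type_int <- typeof(sum(1:3, TRUE))
type_dbl <- typeof(sum(1:3, 0.5))
```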

S4 methods

This is part of the S4 Summary group generic. Methods for it must use the signature x, ..., na.rm.

See ‘plotmath’ for the use of sum in plot annotation.

References

Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988) The New S Language. Wadsworth & Brooks/Cole.

See Also

colSums for row and column sums.

Examples

## Pass a vector to sum, and it will add the elements together.
sum(1:5)

## Pass several numbers to sum, and it also adds the elements.
sum(1, 2, 3, 4, 5)

## In fact, you can pass vectors into several arguments, and everything gets added.
sum(1:2, 3:5)

## If there are missing values, the sum is unknown, i.e., also missing, ....
sum(1:5, NA)
## ... unless  we exclude missing values explicitly:
sum(1:5, NA, na.rm = TRUE)

Object Summaries

Description

summary is a generic function used to produce result summaries of the results of various model fitting functions. The function invokes particular methods which depend on the class of the first argument.

Usage

summary(object, ...)

## Default S3 method:
summary(object, ..., digits, quantile.type = 7)
## S3 method for class 'data.frame'
summary(object, maxsum = 7,
       digits = max(3, getOption("digits") - 3), ...)

## S3 method for class 'factor'
summary(object, maxsum = 100, ...)

## S3 method for class 'matrix'
summary(object, ...)

## S3 method for class 'summaryDefault'
format(x, digits = max(3L, getOption("digits") - 3L), ...)
## S3 method for class 'summaryDefault'
print(x, digits = max(3L, getOption("digits") - 3L), ...)

Arguments

object

an object for which a summary is desired.

x

a result of the default method of summary().

maxsum

integer, indicating how many levels should be shown for factors.

digits

integer, used for number formatting with signif() (for summary.default) or format() (for summary.data.frame). In summary.default, if not specified (i.e., missing(.)), signif() will not be called anymore (since R >= 3.4.0, where the default has been changed to only round in the print and format methods).

quantile.type

integer code used in quantile(*, type=quantile.type) for the default method.

...

additional arguments affecting the summary produced.

Details

For factors, the frequency of the first maxsum - 1 most frequent levels is shown, and the less frequent levels are summarized in "(Others)" (resulting in at most maxsum frequencies).

The functions summary.lm and summary.glm are examples of particular methods which summarize the results produced by lm and glm.

Value

The form of the value returned by summary depends on the class of its argument. See the documentation of the particular methods for details of what is produced by that method.

The default method returns an object of class c("summaryDefault", "table") which has specialized format and print methods. The factor method returns an integer vector.

The matrix and data frame methods return a matrix of class "table", obtained by applying summary to each column and collating the results.
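The return types described above can be inspected directly. The following is a small sketch on made-up data; the object names are illustrative only.

```r
# Default method: numeric input gives a "summaryDefault" object.
s_num <- summary(c(1, 2, 3, 4, 100))
class(s_num)

# Factor method: a named integer vector of level counts.
f <- factor(c("a", "b", "a", "c", "d", "e"))
s_fac <- summary(f)

# With maxsum = 3 the rarer levels are pooled, so at most 3 counts come back.
s_pooled <- summary(f, maxsum = 3)

# Data frame method: a "table" built column by column.
s_df <- summary(data.frame(v = 1:4, g = factor(c("x", "y", "x", "y"))))
class(s_df)
```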

References

Chambers, J. M. and Hastie, T. J. (1992) Statistical Models in S. Wadsworth & Brooks/Cole.

See Also

anova, summary.glm, summary.lm.

Examples

summary(attenu, digits = 4) #-> summary.data.frame(...), default precision
summary(attenu$station, maxsum = 20) #-> summary.factor(...)

lst <- unclass(attenu$station) > 20 # logical with NAs
## summary.default() for logicals -- different from *.factor:
summary(lst)
summary(as.factor(lst))

Singular Value Decomposition of a Matrix

Description

Compute the singular-value decomposition of a rectangular matrix.

Usage

svd(x, nu = min(n, p), nv = min(n, p), LINPACK = FALSE)

La.svd(x, nu = min(n, p), nv = min(n, p))

Arguments

x

a numeric or complex matrix whose SVD decomposition is to be computed. Logical matrices are coerced to numeric.

nu

the number of left singular vectors to be computed. This must be between 0 and n = nrow(x).

nv

the number of right singular vectors to be computed. This must be between 0 and p = ncol(x).

LINPACK

logical. Defunct and an error.

Details

The singular value decomposition plays an important role in many statistical techniques. svd and La.svd provide two interfaces which differ in their return values.

Computing the singular vectors is the slow part for large matrices. The computation will be more efficient if both nu <= min(n, p) and nv <= min(n, p), and even more so if both are zero.

Unsuccessful results from the underlying LAPACK code will result in an error giving a positive error code (most often 1): these can only be interpreted by detailed study of the FORTRAN code but mean that the algorithm failed to converge.

Missing, NaN or infinite values in x will give an error.

Value

The SVD decomposition of the matrix as computed by LAPACK,

$$\mathbf{X} = \mathbf{U} \mathbf{D} \mathbf{V}',$$

where $\mathbf{U}$ and $\mathbf{V}$ are orthogonal, $\mathbf{V}'$ means V transposed (and conjugated for complex input), and $\mathbf{D}$ is a diagonal matrix with the (non-negative) singular values $D_{ii}$ in decreasing order. Equivalently, $\mathbf{D} = \mathbf{U}' \mathbf{X} \mathbf{V}$, which is verified in the examples.

The returned value is a list with components

d

a vector containing the singular values of x, of length min(n, p), sorted decreasingly.

u

a matrix whose columns contain the left singular vectors of x, present if nu > 0. Dimension c(n, nu).

v

a matrix whose columns contain the right singular vectors of x, present if nv > 0. Dimension c(p, nv).

Recall that the singular vectors are only defined up to sign (a constant of modulus one in the complex case). If a left singular vector has its sign changed, changing the sign of the corresponding right vector gives an equivalent decomposition.

For La.svd the return value replaces v by vt, the (conjugated if complex) transpose of v.
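As a brief sketch of the components described above, the decomposition of a small random matrix can be reassembled and compared with the La.svd result. The seed and matrix are arbitrary choices for illustration.

```r
set.seed(42)                      # arbitrary seed, for reproducibility only
X <- matrix(rnorm(12), nrow = 4)  # n = 4, p = 3

s <- svd(X)
# Reconstruct X = U D V' from the returned components.
X_hat <- s$u %*% diag(s$d) %*% t(s$v)
all.equal(X, X_hat)

# La.svd returns vt, the transpose of v, instead of v.
l <- La.svd(X)
all.equal(l$vt, t(s$v))

# Singular values only: skip the (slow) singular vectors.
d_only <- svd(X, nu = 0, nv = 0)
names(d_only)
```

Requesting nu = 0 and nv = 0 is the cheapest call when only the singular values are needed, for example to estimate a condition number.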

Source

The main functions used are the LAPACK routines DGESDD and ZGESDD.

LAPACK is from https://netlib.org/lapack/ and its guide is listed in the references.

References

Anderson, E. and ten others (1999) LAPACK Users' Guide. Third Edition. SIAM. Available on-line at https://netlib.org/lapack/lug/lapack_lug.html.

The ‘Singular-value decomposition’ Wikipedia article.

Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988) The New S Language. Wadsworth & Brooks/Cole.

See Also

eigen, qr.

Examples

hilbert <- function(n) { i <- 1:n; 1 / outer(i - 1, i, `+`) }
X <- hilbert(9)[, 1:6]
(s <- svd(X))
D <- diag(s$d)
s$u %*% D %*% t(s$v) #  X = U D V'
t(s$u) %*% X %*% s$v #  D = U' X V

Sweep out Array Summaries

Description

Return an array obtained from an input array by sweeping out a summary statistic.

Usage

sweep(x, MARGIN, STATS, FUN="-", check.margin=TRUE,...)

Arguments

x

an array, including a matrix.

MARGIN

a vector of indices giving the extent(s) of x which correspond to STATS. Where x has named dimnames, it can be a character vector selecting dimension names.

STATS

the summary statistic which is to be swept out.

FUN

the function to be used to carry out the sweep.

check.margin

logical. If TRUE (the default), warn if the length or dimensions of STATS do not match the specified dimensions of x. Set to FALSE for a small speed gain when you know that dimensions match.

...

optional arguments to FUN.

Details

FUN is found by a call to match.fun. As in the default, binary operators can be supplied if quoted or backquoted.

FUN should be a function of two arguments: it will be called with arguments x and an array of the same dimensions generated from STATS by aperm.

The consistency check among STATS, MARGIN and x is stricter if STATS is an array than if it is a vector. In the vector case, some kinds of recycling are allowed without a warning. Use sweep(x, MARGIN, as.array(STATS)) if STATS is a vector and you want to be warned if any recycling occurs.

Value

An array with the same shape as x, but with the summary statistics swept out.
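A minimal sketch of the two most common uses, centering columns with the default FUN = "-" and scaling rows with a quoted binary operator. The matrix and statistics are made up for illustration.

```r
m <- matrix(1:6, nrow = 2)        # 2 x 3 matrix: columns (1,2), (3,4), (5,6)

# Sweep the column means out of the columns (MARGIN = 2, FUN = "-").
col_means <- colMeans(m)
centred <- sweep(m, 2, col_means)
colMeans(centred)

# Multiply each row by its own factor (MARGIN = 1, FUN = "*").
scaled <- sweep(m, 1, c(10, 100), FUN = "*")
scaled
```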

References

Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988) The New S Language. Wadsworth & Brooks/Cole.

See Also

apply on which sweep used to be based; scale for centering and scaling.

Examples

require(stats) # for median
med.att <- apply(attitude, 2, median)
sweep(data.matrix(attitude), 2, med.att)  # subtract the column medians

## More sweeping:
A <- array(1:24, dim = 4:2)

## no warnings in normal use
sweep(A, 1, 5)
(A.min <- apply(A, 1, min))  # == 1:4
sweep(A, 1, A.min)
sweep(A, 1:2, apply(A, 1:2, median))

## warnings when mismatch
sweep(A, 1, 1:3)  # STATS does not recycle
sweep(A, 1, 6:1)  # STATS is longer

## exact recycling:
sweep(A, 1, 1:2)  # no warning
sweep(A, 1, as.array(1:2))  # warning

## Using named dimnames
dimnames(A) <- list(fee = 1:4, fie = 1:3, fum = 1:2)
mn_fum_fie <- apply(A, c("fum", "fie"), mean)
mn_fum_fie
sweep(A, c("fum", "fie"), mn_fum_fie)

Select One of a List of Alternatives

Description

switch evaluates EXPR and accordingly chooses one of the further arguments (in ...).

Usage

switch(EXPR,...)

Arguments

EXPR

an expression evaluating to a number or a character string.

...

the list of alternatives. If it is intended that EXPR has a character-string value these will be named, perhaps except for one alternative to be used as a ‘default’ value.

Details

switch works in two distinct ways depending whether the first argument evaluates to a character string or a number.

If the value of EXPR is not a character string it is coerced to integer. Note that this also happens for factors, with a warning, as typically the character level is meant. If the integer is between 1 and nargs()-1 then the corresponding element of ... is evaluated and the result returned: thus if the first argument is 3 then the fourth argument is evaluated and returned.

If EXPR evaluates to a character string then that string is matched (exactly) to the names of the elements in .... If there is a match then that element is evaluated unless it is missing, in which case the next non-missing element is evaluated, so for example switch("cc", a = 1, cc =, cd =, d = 2) evaluates to 2. If there is more than one match, the first matching element is used. In the case of no match, if there is an unnamed element of ... its value is returned. (If there is more than one such argument an error is signaled.)

The first argument is always taken to be EXPR: if it is named its name must (partially) match.

A warning is signaled if no alternatives are provided, as this is usually a coding error.

This is implemented as a primitive function that only evaluates its first argument and one other if one is selected.

Value

The value of one of the elements of ..., or NULL, invisibly (whenever no element is selected).

The result has the visibility (see invisible) of the element evaluated.
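The two modes can be sketched as follows. The helper kind() and its labels are hypothetical, introduced only to show fall-through and the unnamed default.

```r
# Character EXPR: empty alternatives fall through to the next non-missing
# one, and the single unnamed argument acts as the default.
kind <- function(x) switch(x, a = , b = "early", z = "late", "other")

fell_through <- kind("a")   # "a" is empty, so the value for "b" is used
matched      <- kind("z")
defaulted    <- kind("q")   # no name matches: the unnamed default is used

# Numeric EXPR: selects by position; out-of-range gives NULL (invisibly).
second  <- switch(2, "x", "y")
nothing <- switch(4, "x", "y")
```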

Warning

It is possible to write calls to switch that can be confusing and may not work in the same way in earlier versions of R. For compatibility (and clarity), always have EXPR as the first argument, naming it if partial matching is a possibility. For the character-string form, have a single unnamed argument as the default after the named values.

References

Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988) The New S Language. Wadsworth & Brooks/Cole.

Examples

require(stats)
centre <- function(x, type) {
  switch(type,
         mean = mean(x),
         median = median(x),
         trimmed = mean(x, trim = .1))
}
x <- rcauchy(10)
centre(x, "mean")
centre(x, "median")
centre(x, "trimmed")

ccc <- c("b", "QQ", "a", "A", "bb")
# note: cat() produces no output for NULL
for(ch in ccc)
    cat(ch, ":", switch(EXPR = ch, a = 1, b = 2:3), "\n")
for(ch in ccc)
    cat(ch, ":", switch(EXPR = ch, a =, A = 1, b = 2:3, "Otherwise: last"), "\n")

## switch(f, *) with a factor f
ff <- gl(3, 1, labels = LETTERS[3:1])
ff[1] # C
## so one might expect " is C" here, but
switch(ff[1], A = "I am A", B = "Bb..", C = " is C") # -> "I am A"
## so we give a warning

## Numeric EXPR does not allow a default value to be specified
## -- it is always NULL
for(i in c(-1:3, 9))  print(switch(i, 1, 2, 3, 4))

## visibility
switch(1, invisible(pi), pi)
switch(2, invisible(pi), pi)

Operator Syntax and Precedence

Description

Outlines R syntax and gives the precedence of operators.

Details

The following unary and binary operators are defined. They are listed in precedence groups, from highest to lowest.

:: :::            access variables in a namespace
$ @               component / slot extraction
[ [[              indexing
^                 exponentiation (right to left)
- +               unary minus and plus
:                 sequence operator
%any% |>          special operators (including %% and %/%)
* /               multiply, divide
+ -               (binary) add, subtract
< > <= >= == !=   ordering and comparison
!                 negation
& &&              and
| ||              or
~                 as in formulae
-> ->>            rightwards assignment
<- <<-            assignment (right to left)
=                 assignment (right to left)
?                 help (unary and binary)

Within an expression operators of equal precedence are evaluated from left to right except where indicated. (Note that = is not necessarily an operator.)

The binary operators ::, :::, $ and @ require names or string constants on the right hand side, and the first two also require them on the left.

The links in the See Also section cover most other aspects of the basic syntax.
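A few one-line checks illustrate how the precedence groups listed above play out in practice. The variable names are introduced here for illustration only.

```r
neg_square <- -2^2     # ^ binds tighter than unary minus: -(2^2)
tower      <- 2^3^2    # ^ groups right to left: 2^(3^2)
from_minus <- -1:2     # unary minus binds tighter than ':' : (-1):2
to_nine    <- 1:3^2    # ^ binds tighter than ':' : 1:(3^2)
doubled    <- 1:3 * 2  # ':' binds tighter than '*': (1:3) * 2

# Assignment groups right to left, so both names receive 3.
a <- b <- 3
```

When in doubt, adding parentheses costs nothing and makes the intended grouping explicit.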

Note

There are substantial precedence differences between R and S. In particular, in S ? has the same precedence as (binary) + - and & && | || have equal precedence.

References

Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988) The New S Language. Wadsworth & Brooks/Cole.

See Also

Arithmetic, Comparison, Control, Extract, Logic, NumericConstants, Paren, Quotes, Reserved.

The ‘R Language Definition’ manual.

Examples

## Logical AND ("&&") has higher precedence than OR ("||"):
TRUE || TRUE && FALSE   # is the same as
TRUE || (TRUE && FALSE) # and different from
(TRUE || TRUE) && FALSE

## Special operators have higher precedence than "!" (logical NOT).
## You can use this for %in% :
! 1:10 %in% c(2, 3, 5, 7) # same as !(1:10 %in% c(2, 3, 5, 7))
## but we strongly advise to use the "!( ... )" form in this case!

## '=' has lower precedence than '<-' ... so you should not mix them
##     (and '<-' is considered better style anyway):
## Not run: 
## Consequently, this gives a ("non-catchable") error
 x <- y = 5  #->  Error in (x <- y) = 5 : ....
## End(Not run)

Get Environment Variables

Description

Sys.getenv obtains the values of the environment variables.

Usage

Sys.getenv(x=NULL, unset="", names=NA)

Arguments

x

a character vector, or NULL.

unset

a character string.

names

logical: should the result be named? If NA (the default) single-element results are not named whereas multi-element results are.

Details

Both arguments will be coerced to character if necessary.

Setting unset = NA will enable unset variables and those set to the value "" to be distinguished, if the OS does. POSIX requires the OS to distinguish, and all known current R platforms do.

Value

A vector of the same length as x, with (if names == TRUE) the variable names as its names attribute. Each element holds the value of the environment variable named by the corresponding component of x (or the value of unset if no environment variable with that name was found).

On most platforms Sys.getenv() will return a named vector giving the values of all the environment variables, sorted in the current locale. It may be confused by names containing = which some platforms allow but POSIX does not. (Windows is such a platform: there names including = are truncated just before the first =.)

When x is missing and names is not false, the result is of class "Dlist" in order to get a nice print method.
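A short sketch of the unset and names arguments. The variable name R_DOC_DEMO_UNSET is hypothetical, chosen only because it is unlikely to be set in any real session.

```r
# "R_DOC_DEMO_UNSET" is a made-up name; make sure it really is unset.
Sys.unsetenv("R_DOC_DEMO_UNSET")

default_value <- Sys.getenv("R_DOC_DEMO_UNSET")              # unset gives ""
na_value      <- Sys.getenv("R_DOC_DEMO_UNSET", unset = NA)  # unset gives NA

# names = TRUE forces a named result even for a single variable.
named_value <- Sys.getenv("R_DOC_DEMO_UNSET", names = TRUE)
names(named_value)
```

Using unset = NA is the usual way to tell "not set at all" apart from "set to the empty string".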

See Also

Sys.setenv, Sys.getlocale for the locale in use, getwd for the working directory.

The help for ‘environment variables’ lists many of the environment variables used by R.

Examples

## whether HOST is set will be shell-dependent e.g. Solaris' csh did not.
Sys.getenv(c("R_HOME", "R_PAPERSIZE", "R_PRINTCMD", "HOST"))

s <- Sys.getenv() # *all* environment variables
op <- options(width = 111) # (nice printing)
names(s)    # all settings (the values could be very long)
head(s, 12) # using the Dlist print() method

## Language and Locale settings -- but rather use Sys.getlocale()
s[grep("^L(C|ANG)", names(s))]
## typically R-related:
s[grep("^_?R_", names(s))]
options(op) # reset

Get the Process ID of the R Session

Description

Get the process ID of the R Session. It is guaranteed by the operating system that two R sessions running simultaneously will have different IDs, but it is possible that R sessions running at different times will have the same ID.

Usage

Sys.getpid()

Value

An integer, often between 1 and 32767 under Unix-alikes (but for example FreeBSD and macOS use IDs up to 99999) and a positive integer (up to 32767) under Windows.
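One possible use, sketched here under the assumption that several R sessions write to the same directory at the same time, is to embed the process ID in a file name. The file name pattern is hypothetical.

```r
pid <- Sys.getpid()

# Hypothetical per-process log file name under the session temp directory.
# Unique among simultaneously running sessions, but not across time.
log_file <- file.path(tempdir(), sprintf("worker-%d.log", pid))
log_file
```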

Examples

Sys.getpid()

## Show files opened from this R process
if(.Platform$OS.type == "unix") ## on Unix-alikes such as Linux, macOS, FreeBSD:
   system(paste("lsof -p", Sys.getpid()))

Wildcard Expansion on File Paths

Description

Function to do wildcard expansion (also known as ‘globbing’) on file paths.

Usage

Sys.glob(paths, dirmark=FALSE)

Arguments

paths

character vector of patterns for relative or absolute filepaths. Missing values will be ignored.

dirmark

logical: should matches to directories from patterns that do not already end in / have a slash appended? May not be supported on all platforms.

Details

This expands tilde (see tilde expansion) and wildcards in file paths. For precise details of wildcards expansion, see your system's documentation on the glob system call. There is a POSIX 1003.2 standard (see https://pubs.opengroup.org/onlinepubs/9699919799/functions/glob.html) but some OSes will go beyond this.

All systems should interpret * (match zero or more characters), ? (match a single character) and (probably) [ (begin a character class or range). The handling of paths ending with a separator is system-dependent. On a POSIX-2008 compliant OS they will match directories (only), but as they are not valid filepaths on Windows, they match nothing there. (Earlier POSIX standards allowed them to match files.)

The rest of these details are indicative (and based on the POSIX standard).

If a filename starts with . this may need to be matched explicitly: for example Sys.glob("*.RData") may or may not match ‘.RData’ but will not usually match ‘.aa.RData’. Note that this is platform-dependent: e.g. on Solaris Sys.glob("*.*") matches ‘.’ and ‘..’.

[ begins a character class. If the first character in [...] is not !, this is a character class which matches a single character against any of the characters specified. The class cannot be empty, so ] can be included provided it is first. If the first character is !, the character class matches a single character which is none of the specified characters. Whether . in a character class matches a leading . in the filename is OS-dependent.

Character classes can include ranges such as [A-Z]: include - as a character by having it first or last in a class. (The interpretation of ranges should be locale-specific, so the example is not a good idea in an Estonian locale.)

One can remove the special meaning of ?, * and [ by preceding them by a backslash (except within a character class).

Value

A character vector of matched file paths. The order is system-specific (but in the order of the elements of paths): it is normally collated in either the current locale or in byte (ASCII) order; however, on Windows collation is in the order of Unicode points.

Directory errors are normally ignored, so the matches are to accessible file paths (but not necessarily accessible files).
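The wildcard forms described above can be tried safely in a scratch directory. This sketch creates a few empty, made-up files under tempdir() and removes them again; results are sorted because the match order is system-specific.

```r
# Scratch directory with three empty files (names are illustrative only).
d <- tempfile("globdemo")
dir.create(d)
d <- normalizePath(d, winslash = "/")
file.create(file.path(d, c("a.txt", "b.txt", "c.csv")))

# '*' matches zero or more characters.
txt_files <- sort(basename(Sys.glob(file.path(d, "*.txt"))))

# '[ab]' is a character class matching a single 'a' or 'b'.
ab_files <- sort(basename(Sys.glob(file.path(d, "[ab].*"))))

# '?' matches exactly one character.
csv_files <- basename(Sys.glob(file.path(d, "?.csv")))

unlink(d, recursive = TRUE)  # clean up
```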

See Also

path.expand.

Quotes for handling backslashes in character strings.

Examples

Sys.glob(file.path(R.home(),"library","*","R","*.rdx"))

Extract System and User Information

Description

Reports system and user information.

Usage

Sys.info()

Details

This uses POSIX or Windows system calls. Note that OS names (sysname) might not be what you expect: for example macOS identifies itself as ‘Darwin’ and Solaris as ‘SunOS’.

Sys.info() returns details of the platform R is running on, whereas R.version gives details of the platform R was built on: the release and version may well be different.

Value

A character vector with fields

sysname

The operating system name.

release

The OS release.

version

The OS version.

nodename

A name by which the machine is known on the network (if any).

machine

A concise description of the hardware, often the CPU type.

login

The user's login name, or "unknown" if it cannot be ascertained.

user

The name of the real user ID, or "unknown" if it cannot be ascertained.

effective_user

The name of the effective user ID, or "unknown" if it cannot be ascertained. This may differ from the real user in ‘set-user-ID’ processes.

On Unix-alike platforms:

The first five fields come from the uname(2) system call. The login name comes from getlogin(2), and the user names from getpwuid(getuid()) and getpwuid(geteuid()).

On Windows:

The last three fields give the same value.
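Since the result is a named character vector, individual fields are extracted by name. A small sketch follows; the on_macos flag is an illustrative name, not part of any API.

```r
si <- Sys.info()

# Individual fields are picked out by name.
os_name <- si[["sysname"]]
host    <- si[["nodename"]]

# macOS reports itself as "Darwin", so test for that rather than "macOS".
on_macos <- identical(os_name, "Darwin")
```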

Note

The meaning of release and version is system-dependent: on a Unix-alike they normally refer to the kernel. There, usually release contains a numeric version and version gives additional information. Examples for release:

    "4.17.11-200.fc28.x86_64" # Linux (Fedora)
    "3.16.0-5-amd64"          # Linux (Debian)
    "17.7.0"                  # macOS 10.13.6
    "5.11"                    # Solaris

There is no guarantee that the node or login or user names will be what you might reasonably expect. (In particular on some Linux distributions the login name is unknown from sessions with re-directed inputs.)

The use of alternatives such as system("whoami") is not portable: the POSIX command system("id") is much more portable on Unix-alikes, provided only the POSIX options -[Ggu][nr] are used (and not the many BSD and GNU extensions). whoami is equivalent to id -un (on Solaris, /usr/xpg4/bin/id -un).

Windows may report unexpected versions: there, see the help for

See Also

.Platform, and R.version. sessionInfo() gives a synopsis of both your system and the R session (and gives the OS version in a human-readable form).

Examples

Sys.info()
## An alternative (and probably better) way to get the login name on Unix
Sys.getenv("LOGNAME")

Find Details of the Numerical and Monetary Representations in the Current Locale

Description

Get details of the numerical and monetary representations in the current locale.

Usage

Sys.localeconv()

Details

Normally R is run without looking at the value of LC_NUMERIC, so the decimal point remains '.'. So the first three of these components will only be useful if you have set the locale category LC_NUMERIC using Sys.setlocale in the current R session (when R may not work correctly).

The monetary components will only be set to non-default values (see the ‘Examples’ section) if the LC_MONETARY category is set. It often is not set: see the examples for how to trigger setting it.

Value

A character vector with 18 named components. See your ISO C documentation for details of the meaning.

It is possible to compile R without support for locales, in which case the value will be NULL.
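Because the value may be NULL, code that reads a component should guard for that case. A minimal sketch follows; falling back to "." is an assumption made here for illustration, not documented behavior.

```r
lc <- Sys.localeconv()

# Guard against builds without locale support, where the value is NULL.
# The "." fallback is an illustrative choice.
dec <- if (is.null(lc)) "." else lc[["decimal_point"]]
dec
```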

See Also

Sys.setlocale for ways to set locales.

Examples

Sys.localeconv()
## The results in the C locale are
##    decimal_point     thousands_sep          grouping   int_curr_symbol
##              "."                ""                ""                ""
##  currency_symbol mon_decimal_point mon_thousands_sep      mon_grouping
##               ""                ""                ""                ""
##    positive_sign     negative_sign   int_frac_digits       frac_digits
##               ""                ""             "127"             "127"
##    p_cs_precedes    p_sep_by_space     n_cs_precedes    n_sep_by_space
##            "127"             "127"             "127"             "127"
##      p_sign_posn       n_sign_posn
##            "127"             "127"

## Now try your default locale (which might be "C").
old <- Sys.getlocale()
## The category may not be set:
## the following may do so, but it might not be supported.
Sys.setlocale("LC_MONETARY", locale = "")
Sys.localeconv()
## or set an appropriate value yourself, e.g.
Sys.setlocale("LC_MONETARY", "de_AT")
Sys.localeconv()
Sys.setlocale(locale = old)

## Not run: read.table("foo", dec=Sys.localeconv()["decimal_point"])

Functions to Access the Function Call Stack

Description

These functions provide access to environments (‘frames’ in S terminology) associated with functions further up the calling stack.

Usage

sys.call(which = 0)
sys.frame(which = 0)
sys.nframe()
sys.function(which = 0)
sys.parent(n = 1)

sys.calls()
sys.frames()
sys.parents()
sys.on.exit()
sys.status()
parent.frame(n = 1)

Arguments

which

the frame number if non-negative, the number of frames to go back if negative.

n

the number of generations to go back. (See the ‘Details’ section.)

Details

.GlobalEnv is given number 0 in the list of frames. Each subsequent function evaluation increases the frame stack by 1. The call, function definition and the environment for evaluation of that function are returned by sys.call, sys.function and sys.frame with the appropriate index.

sys.call, sys.function and sys.frame accept integer values for the argument which. Non-negative values of which are frame numbers starting from .GlobalEnv whereas negative values are counted back from the frame number of the current evaluation.

The parent frame of a function evaluation is the environment in which the function was called. It is not necessarily numbered one less than the frame number of the current evaluation, nor is it the environment within which the function was defined. sys.parent returns the number of the parent frame if n is 1 (the default), the grandparent if n is 2, and so on. See also the ‘Note’.

sys.nframe returns an integer, the number of the current frame as described in the first paragraph.

sys.calls and sys.frames give a pairlist of all the active calls and frames, respectively, and sys.parents returns an integer vector of indices of the parent frames of each of those frames.

Notice that even though the sys.xxx functions (except sys.status) are interpreted, their contexts are not counted nor are they reported. There is no access to them.

sys.status() returns a list with components sys.calls, sys.parents and sys.frames, the results of calls to those three functions (which will include the call to sys.status: see the first example).

sys.on.exit() returns the expression stored for use by on.exit in the function currently being evaluated. (Note that this differs from S, which returns a list of expressions for the current frame and its parents.)

parent.frame(n) is a convenient shorthand for sys.frame(sys.parent(n)) (implemented slightly more efficiently).
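The relationship between a function, its call and its caller's environment can be sketched with two small functions. The names outer_fn and inner_fn are hypothetical and exist only for this illustration.

```r
outer_fn <- function() {
  local_var <- "from outer"   # lives in outer_fn's evaluation frame
  inner_fn()
}

inner_fn <- function() {
  caller <- parent.frame()    # environment from which inner_fn was called
  list(
    call       = sys.call(),                          # the call to inner_fn
    caller_var = get("local_var", envir = caller),    # read the caller's variable
    depth      = sys.nframe()                         # current frame number
  )
}

res <- outer_fn()
res$call
res$caller_var
```

The frame number in res$depth depends on how the code is run (for instance, it is larger under example() or source()), so it is best treated as relative rather than absolute.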

Value

sys.call returns a call, sys.function a function definition, and sys.frame and parent.frame return an environment.

For the other functions, see the ‘Details’ section.

Note

Strictly, sys.parent and parent.frame refer to the context of the parent interpreted function. So internal functions (which may or may not set contexts and so may or may not appear on the call stack) may not be counted, and S3 methods can also do surprising things.

As an effect of lazy evaluation, these functions look at the call stack at the time they are evaluated, not at the time they are called. Passing calls to them as function arguments is unlikely to be a good idea, but these functions still look at the call stack and count frames from the frame of the function evaluation from which they were called.

Hence, when these functions are called to provide default values for function arguments, they are evaluated in the evaluation of the called function and they count frames accordingly (see e.g. the envir argument of eval).

References

Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988) The New S Language. Wadsworth & Brooks/Cole. (Not parent.frame.)

See Also

eval for a usage of sys.frame and parent.frame.

Examples

require(utils)

## Note: the first two examples will give different results
## if run by example().
ff <- function(x) gg(x)
gg <- function(y) sys.status()
str(ff(1))

gg <- function(y) {
    ggg <- function() {
        cat("current frame is", sys.nframe(), "\n")
        cat("parents are", sys.parents(), "\n")
        print(sys.function(0)) # ggg
        print(sys.function(2)) # gg
    }
    if(y > 0) gg(y - 1) else ggg()
}
gg(3)

t1 <- function() {
  aa <- "here"
  t2 <- function() {
    ## in frame 2 here
    cat("current frame is", sys.nframe(), "\n")
    str(sys.calls()) ## list with two components t1() and t2()
    cat("parents are frame numbers", sys.parents(), "\n") ## 0 1
    print(ls(envir = sys.frame(-1))) ## [1] "aa" "t2"
    invisible()
  }
  t2()
}
t1()

test.sys.on.exit <- function() {
  on.exit(print(1))
  ex <- sys.on.exit()
  str(ex)
  cat("exiting...\n")
}
test.sys.on.exit()
## gives 'language print(1)', prints 1 on exit

## An example where the parent is not the next frame up the stack
## since method dispatch uses a frame.
as.double.foo <- function(x)
{
    str(sys.calls())
    print(sys.frames())
    print(sys.parents())
    print(sys.frame(-1)); print(parent.frame())
    x
}
t2 <- function(x) as.double(x)
a <- structure(pi, class = "foo")
t2(a)

Read File Symbolic Links

Description

Find out if a file path is a symbolic link, and if so what it is linked to, via the system call readlink.

Symbolic links are a POSIX concept, not implemented on Windows but for most filesystems on Unix-alikes.

Usage

Sys.readlink(paths)

Arguments

paths

character vector of file paths. Tilde expansion is done: see path.expand.

Value

A character vector of the same length as paths. The entries are the path of the file linked to, "" if the path is not a symbolic link, and NA if there is an error (e.g., the path does not exist or cannot be converted to the native encoding).

On platforms without the readlink system call, all elements are "".
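A small sketch of the return values, using scratch files under tempdir(). The symbolic link part is guarded because it only applies on Unix-alikes; the file names are made up.

```r
target <- tempfile("target")
file.create(target)

# A regular file is not a symbolic link, so the result is "".
regular <- Sys.readlink(target)

link_target <- NULL
if (.Platform$OS.type == "unix") {
  link <- tempfile("link")
  file.symlink(target, link)          # create link -> target
  link_target <- Sys.readlink(link)   # the path the link points to
  unlink(link)
}
unlink(target)
```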

See Also

file.symlink for the creation of symbolic links (and their Windows analogues), file.info

Examples

##' To check if files (incl. directories) are symbolic links:
is.symlink <- function(paths) isTRUE(nzchar(Sys.readlink(paths), keepNA = TRUE))
## will return all FALSE when the platform has no `readlink` system call.
is.symlink("/foo/bar")

Set or Unset Environment Variables

Description

Sys.setenv sets environment variables (for other processes called from within R or future calls to Sys.getenv from this R process).

Sys.unsetenv removes environment variables.

Usage

Sys.setenv(...)

Sys.unsetenv(x)

Arguments

...

named arguments with values coercible to a character string.

x

a character vector, or an object coercible to character.

Details

Non-standard R names must be quoted in Sys.setenv: see the examples. Most platforms (and POSIX) do not allow names containing "=". Windows does, but the facilities provided by R may not handle these correctly so they should be avoided. Most platforms allow setting an environment variable to "", but Windows does not and there Sys.setenv(FOO = "") unsets FOO.

There may be system-specific limits on the maximum length of the values of individual environment variables or of names+values of all environment variables.

Recent versions of Windows have a maximum length of 32,767 characters for an environment variable; however cmd.exe has a limit of 8192 characters for a command line, hence set can only set 8188.

Value

A logical vector, with elements being true if (un)setting the corresponding variable succeeded. (For Sys.unsetenv this includes attempting to remove a non-existent variable.)
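A round trip through setting, reading and unsetting can be sketched as follows. The variable names R_DOC_DEMO and R_DOC_DEMO_N are hypothetical and chosen to avoid clashing with real settings.

```r
# Values are coerced to character, so 42 is stored as "42".
set_ok <- Sys.setenv(R_DOC_DEMO = "hello", R_DOC_DEMO_N = 42)
after_set <- Sys.getenv(c("R_DOC_DEMO", "R_DOC_DEMO_N"))

# Remove both again; one logical result per variable.
unset_ok <- Sys.unsetenv(c("R_DOC_DEMO", "R_DOC_DEMO_N"))
after_unset <- Sys.getenv("R_DOC_DEMO", unset = NA)
```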

Note

On Unix-alikes, if Sys.unsetenv is not supported, it will at least try to set the value of the environment variable to "", with a warning.

See Also

Sys.getenv, Startup for ways to set environment variables for the R session.

setwd for the working directory.

Sys.setlocale to set (and get) language locale variables, and notably Sys.setLanguage to set the LANGUAGE environment variable which is used for conditionMessage translations.

The help for ‘environment variables’ lists many of the environment variables used by R.

Examples

print(Sys.setenv(R_TEST = "testit", "A+C" = 123))  # `A+C` could also be used
Sys.getenv("R_TEST")
Sys.unsetenv("R_TEST")  # on Unix-alike may warn and not succeed
Sys.getenv("R_TEST", unset = NA)

Set File Time

Description

Uses system calls to set the times on a file or directory.

Usage

Sys.setFileTime(path, time)

Arguments

path

A character vector containing file or directory paths.

time

A date-time of class "POSIXct" or an object which can be coerced to one. Fractions of a second may be ignored. Recycled along path.

Details

This attempts to set the file time to the value specified.

On a Unix-alike it uses the system call utimensat if that is available, otherwise utimes or utime. On a POSIX file system it sets both the last-access and modification times. Fractional seconds will be set as from R 3.4.0 on OSes with the requisite system calls and suitable filesystems.

On Windows it uses the system call SetFileTime to set the ‘last write time’. Some Windows file systems only record the time at a resolution of two seconds.

Sys.setFileTime has been vectorized in R 3.6.0. Earlier versions of R required path and time to be vectors of length one.

Value

A logical vector indicating if the operation succeeded for each of the files and directories attempted, returned invisibly.
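Examples

This help page has no example of its own, so the following is only an illustrative sketch. It sets the modification time of a temporary file and reads it back with file.mtime. The target date is arbitrary, and a tolerance of two seconds is assumed to allow for coarse file-system resolution.

```r
f <- tempfile()
writeLines("x", f)

# An arbitrary target time, chosen for illustration.
target <- as.POSIXct("2020-01-02 03:04:05", tz = "UTC")

ok <- Sys.setFileTime(f, target)   # result is returned invisibly
print(ok)

mt <- file.mtime(f)
# Difference in seconds between what was requested and what was recorded.
print(abs(as.numeric(mt) - as.numeric(target)))

unlink(f)
```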


Suspend Execution for a Time Interval

Description

Suspend execution of R expressions for a specified time interval.

Usage

Sys.sleep(time)

Arguments

time

The time interval to suspend execution for, in seconds.

Details

Using this function allows R to temporarily be given very low priority and hence not to interfere with more important foreground tasks. A typical use is to allow a process launched from R to set itself up and read its input files before R execution is resumed.

The intention is that this function suspends execution of R expressions but wakes the process up often enough to respond to GUI events, typically every half second. It can be interrupted (e.g. by ‘Ctrl-C’ or ‘Esc’ at the R console).

There is no guarantee that the process will sleep for the whole of the specified interval (sleep might be interrupted), and it may well take slightly longer in real time to resume execution.

time must be non-negative (and not NA nor NaN): Inf is allowed (and might be appropriate if the intention is to wait indefinitely for an interrupt). The resolution of the time interval is system-dependent, but will normally be 20ms or better. (On modern Unix-alikes it will be better than 1ms.)

Value

Invisible NULL.

Note

Despite its name, this is not currently implemented using the sleep system call (although on Windows it does make use of Sleep).

Examples

testit <- function(x) {
    p1 <- proc.time()
    Sys.sleep(x)
    proc.time() - p1 # The cpu usage should be negligible
}
testit(3.7)

Parse and Evaluate Expressions from a File

Description

Parses expressions in the given file, and then successively evaluates them in the specified environment.

Usage

sys.source(file, envir = baseenv(), chdir = FALSE,
           keep.source = getOption("keep.source.pkgs"),
           keep.parse.data = getOption("keep.parse.data.pkgs"),
           toplevel.env = as.environment(envir))

Arguments

file

a character string naming the file to be read from.

envir

an R object specifying the environment in which the expressions are to be evaluated. May also be a list or an integer. The default baseenv() corresponds to evaluation in the base environment. This is probably not what you want; you should typically supply an explicit envir argument, see the ‘Note’.

chdir

logical; if TRUE, the R working directory is changed to the directory containing file for evaluating.

keep.source

logical. If TRUE, functions keep their source including comments, see options(keep.source = *) for more details.

keep.parse.data

logical. If TRUE and keep.source is also TRUE, functions keep parse data with their source, see options(keep.parse.data = *) for more details.

toplevel.env

an R environment to be used as top level while evaluating the expressions. This argument is useful for frameworks running package tests; the default should be used in other cases.

Details

For large files, keep.source = FALSE may save quite a bit of memory. Disabling only parse data via keep.parse.data = FALSE can already save a lot.

Note on envir

In order for the code being evaluated to use the correct environment (for example, in global assignments), source code in packages should call topenv(), which will return the namespace, if any, the environment set up by sys.source, or the global environment if a saved image is being used.
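The advice to supply an explicit envir can be illustrated with a small sketch. The file contents and the object names aaa and f are made up for this example; the point is only that the objects end up in the supplied environment rather than in the base environment.

```r
# Write a tiny script to a temporary file (contents are illustrative).
tmp <- tempfile(fileext = ".R")
writeLines(c("aaa <- 1:3", "f <- function() sum(aaa)"), tmp)

# Evaluate it in a fresh environment instead of the default baseenv().
env <- new.env()
sys.source(tmp, envir = env)
unlink(tmp)

print(ls(env))     # objects created by the script
print(env$f())     # f finds aaa in the environment it was defined in

# Nothing was added to the base environment.
print(exists("aaa", envir = baseenv(), inherits = FALSE))
```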

See Also

source, and loadNamespace which is called from library(.) and uses sys.source(.).

Examples

## a simple way to put some objects in an environment
## high on the search path
tmp <- tempfile()
writeLines("aaa <- pi", tmp)
env <- attach(NULL, name = "myenv")
sys.source(tmp, env)
unlink(tmp)
search()
aaa
detach("myenv")

Get Current Date and Time

Description

Sys.time and Sys.Date return the system's idea of the current date with and without time.

Usage

Sys.time()
Sys.Date()

Details

Sys.time returns an absolute date-time value which can be converted to various time zones and may return different days.

Sys.Date returns the current day in the current time zone.

Value

Sys.time returns an object of class "POSIXct" (see DateTimeClasses). On almost all systems it will have sub-second accuracy, possibly microseconds or better. On Windows it increments in clock ticks (usually 1/60 of a second) reported to millisecond accuracy.

Sys.Date returns an object of class "Date" (see Date).

Note

Sys.time may return fractional seconds, but they are ignored by the default conversions (e.g., printing) for class "POSIXct". See the examples and format.POSIXct for ways to reveal them.
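One way to reveal the fractional seconds for a single value, without changing options, is the %OSn conversion accepted by format.POSIXct. The sketch below also shows the same instant rendered in another time zone, which is where the calendar day can differ; the choice of UTC is only an example.

```r
now <- Sys.time()
print(class(now))

# %OS3 prints seconds with three decimal places.
stamp <- format(now, "%Y-%m-%d %H:%M:%OS3")
print(stamp)

# The same instant shown in UTC; the day may differ from the local one.
print(format(now, "%Y-%m-%d %H:%M:%S", tz = "UTC"))
print(as.Date(now, tz = "UTC"))
```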

See Also

date for the system time in a fixed-format character string.

Sys.timezone.

system.time for measuring elapsed/CPU time of expressions.

Examples

Sys.time()
## print with possibly greater accuracy:
op <- options(digits.secs = 6)
Sys.time()
options(op)

## locale-specific version of date()
format(Sys.time(), "%a %b %d %X %Y")

Sys.Date()

Find Full Paths to Executables

Description

This is an interface to the system command which, or to an emulation on Windows.

Usage

Sys.which(names)

Arguments

names

Character vector of names or paths of possible executables.

Details

The system command which reports on the full path names of an executable (including an executable script) as would be executed by a shell, accepting either absolute paths or looking on the path.

On Windows an ‘executable’ is a file with extension ‘.exe’, ‘.com’, ‘.cmd’ or ‘.bat’. Such files need not actually be executable, but they are what system tries.

On a Unix-alike the full path to which (usually ‘/usr/bin/which’) is found when R is installed.

Value

A character vector of the same length as names, named by names. The elements are either the full path to the executable or some indication that no executable of that name was found. Typically the indication is "", but this does depend on the OS (and the known exceptions are changed to ""). Missing values in names have missing return values.

On Windows the paths will be short paths (8+3 components, no spaces) with \ as the path delimiter.
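A common use of the return value is to test whether a tool is available before calling it. The sketch below assumes that "" is the not-found indication, as is typical; the second tool name is made up so that it should not be found.

```r
# "sh" is likely to exist on Unix-alikes; the second name is hypothetical.
tools <- c("sh", "hypothetical-tool-xyz")
paths <- Sys.which(tools)
print(paths)

# Non-empty entries are the tools that were found.
available <- nzchar(paths)
print(tools[available])
```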

Note

Except on Windows this calls the system command which: since that is not part of e.g. the POSIX standards, exactly what it does is OS-dependent. It will usually do tilde-expansion and it may make use of csh aliases.

Examples

## the first two are likely to exist everywhere
## texi2dvi exists on most Unix-alikes and under MiKTeX
Sys.which(c("ftp", "ping", "texi2dvi", "this-does-not-exist"))

Invoke a System Command

Description

system invokes the OS command specified by command.

Usage

system(command, intern = FALSE,
       ignore.stdout = FALSE, ignore.stderr = FALSE,
       wait = TRUE, input = NULL, show.output.on.console = TRUE,
       minimized = FALSE, invisible = TRUE, timeout = 0,
       receive.console.signals = wait)

Arguments

command

the system command to be invoked, as a character string.

intern

a logical (not NA) which indicates whether to capture the output of the command as an R character vector.

ignore.stdout, ignore.stderr

a logical (not NA) indicating whether messages written to ‘stdout’ or ‘stderr’ should be ignored.

wait

a logical (not NA) indicating whether the R interpreter should wait for the command to finish, or run it asynchronously. This will be ignored (and the interpreter will always wait) if intern = TRUE. When running the command asynchronously, no output will be displayed on the Rgui console in Windows (it will be dropped, instead).

input

if a character vector is supplied, this is copied one string per line to a temporary file, and the standard input of command is redirected to the file.

timeout

timeout in seconds, ignored if 0. This is a limit for the elapsed time running command in a separate process. Fractions of seconds are ignored.

receive.console.signals

a logical (not NA) indicating whether the command should receive events from the terminal/console that R runs from, particularly whether it should be interrupted by Ctrl-C. This will be ignored and events will always be received when intern = TRUE or wait = TRUE.

show.output.on.console, minimized, invisible

arguments that are accepted on Windows but ignored on this platform, with a warning.

Details

This interface has become rather complicated over the years: see system2 for a more portable and flexible interface which is recommended for new code.

command is parsed as a command plus arguments separated by spaces. So if the path to the command (or a single argument such as a file path) contains spaces, it must be quoted e.g. by shQuote.
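As a small sketch of the quoting rule, the following builds a command line around a file path that contains spaces. The path is hypothetical and nothing is executed. type = "sh" is spelled out so that the result is the same everywhere; by default shQuote picks the convention of the current platform.

```r
# A hypothetical path with embedded spaces.
path <- "/tmp/my dir/report file.txt"

# Without quoting, the path would be split into three arguments.
cmd <- paste("ls -l", shQuote(path, type = "sh"))
print(cmd)
```

The resulting string could then be passed as the command argument of system.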

Unix-alikes pass the command line to a shell (normally ‘/bin/sh’, and POSIX requires that shell), so command can be anything the shell regards as executable, including shell scripts, and it can contain multiple commands separated by ;.

On Windows, system does not use a shell and there is a separate function shell which passes command lines to a shell.

If intern is TRUE then popen is used to invoke the command and the output collected, line by line, into an R character vector. If intern is FALSE then the C function system is used to invoke the command.

wait = FALSE is implemented by appending & to the command: this is in principle shell-dependent, but required by POSIX and so widely supported.

When timeout is non-zero, the command is terminated after the given number of seconds. The termination works for typical commands, but is not guaranteed: it is possible to write a program that would keep running after the time is out. Timeouts can only be set with wait = TRUE.

Timeouts cannot be used with interactive commands: the command is run with standard input redirected from ‘/dev/null’ and it must not modify terminal settings. As long as the tty tostop option is disabled, which it usually is by default, the executed command may write to standard output and standard error. One cannot rely on the execution time of the child processes being included in the user.child and sys.child elements of the proc_time object returned by proc.time. For the time to be included, all child processes have to be waited for by their parents, which has to be implemented in the parent applications.

The ordering of arguments after the first two has changed from time to time: it is recommended to name all arguments after the first.

There are many pitfalls in using system to ascertain if a command can be run: Sys.which is more suitable.

receive.console.signals = TRUE is useful when running asynchronous processes (using wait = FALSE) to implement a synchronous operation. In all other cases it is recommended to use the default.

Value

If intern = TRUE, a character vector giving the output of the command, one line per character string. (Output lines of more than 8095 bytes will be split on some systems.) If the command could not be run an R error is generated.

If command runs but gives a non-zero exit status this will be reported with a warning and in the attribute "status" of the result: an attribute "errmsg" may also be available.

If intern = FALSE, the return value is an error code (0 for success), given the invisible attribute (so needs to be printed explicitly). If the command could not be run for any reason, the value is 127 and a warning is issued (as from R 3.5.0). Otherwise if wait = TRUE the value is the exit status returned by the command, and if wait = FALSE it is 0 (the conventional success value).

If the command times out, a warning is reported and the exit status is 124.
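The two kinds of return value can be sketched by launching a child R process. Using Rscript from R.home("bin") as the command is an assumption made here so that the example does not depend on other tools being installed; the exit status 3 is arbitrary.

```r
# Quoted path to the Rscript executable that belongs to this R installation.
rscript <- shQuote(file.path(R.home("bin"), "Rscript"))

# intern = FALSE: the (invisible) return value is the exit status.
status <- system(paste(rscript, "--vanilla -e", shQuote("quit(status = 3)")))
print(status)

# intern = TRUE: output is captured; a non-zero exit status gives a
# warning (suppressed here) and is recorded in the "status" attribute.
out <- suppressWarnings(
    system(paste(rscript, "--vanilla -e",
                 shQuote("cat(42, fill = TRUE); quit(status = 3)")),
           intern = TRUE)
)
print(as.vector(out))
print(attr(out, "status"))
```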

Stdout and stderr

For command-line R, error messages written to ‘stderr’ will be sent to the terminal unless ignore.stderr = TRUE. They can be captured (in the most likely shells) by

    system("some command 2>&1", intern = TRUE)

For GUIs, what happens to output sent to ‘stdout’ or ‘stderr’ if intern = FALSE is interface-specific, and it is unsafe to assume that such messages will appear on a GUI console (they do on the macOS GUI's console, but not on some others).

Differences between Unix and Windows

How processes are launched differs fundamentally between Windows and Unix-alike operating systems, as do the higher-level OS functions on which this R function is built. So it should not be surprising that there are many differences between OSes in how system behaves. For the benefit of programmers, the more important ones are summarized in this section.

  • The most important difference is that on a Unix-alike system launches a shell which then runs command. On Windows the command is run directly – use shell for an interface which runs command via a shell (by default the Windows shell cmd.exe, which has many differences from a POSIX shell).

    This means that it cannot be assumed that redirection or piping will work in system (redirection sometimes does, but we have seen cases where it stopped working after a Windows security patch), and system2 (or shell) must be used on Windows.

  • What happens to stdout and stderr when not captured depends on how R is running: Windows batch commands behave like a Unix-alike, but from the Windows GUI they are generally lost. system(intern = TRUE) captures ‘stderr’ when run from the Windows GUI console unless ignore.stderr = TRUE.

  • The behaviour on error is different in subtle ways (and has differed between R versions).

  • The quoting conventions for command differ, but shQuote is a portable interface.

  • Arguments show.output.on.console, minimized, invisible only do something on Windows (and are most relevant to Rgui there).

See Also

man system and man sh for how this is implemented on the OS in use.

.Platform for platform-specific variables.

pipe to set up a pipe connection.

Examples

# list all files in the current directory using the -F flag
## Not run: system("ls -F")

# t1 is a character vector, each element giving a line of output from who
# (if the platform has who)
t1 <- try(system("who", intern = TRUE))

try(system("ls fizzlipuzzli", intern = TRUE, ignore.stderr = TRUE))
# zero-length result since file does not exist, and will give warning.

Find Names of R System Files

Description

Finds the full file names of files in packages etc.

Usage

system.file(..., package = "base", lib.loc = NULL,
            mustWork = FALSE)

Arguments

...

character vectors, specifying subdirectory and file(s) within some package. The default, none, returns the root of the package. Wildcards are not supported.

package

a character string with the name of a single package. An error occurs if more than one package name is given.

lib.loc

a character vector with path names of R libraries. See ‘Details’ for the meaning of the default value of NULL.

mustWork

logical. If TRUE, an error is given if there are no matching files.

Details

This checks the existence of the specified files with file.exists. So file paths are only returned if there are sufficient permissions to establish their existence.

The unnamed arguments in ... are usually character strings, but if character vectors they are recycled to the same length.

This uses find.package to find the package, and hence with the default lib.loc = NULL looks first for attached packages then in each library listed in .libPaths(). Note that if a namespace is loaded but the package is not attached, this will look only on .libPaths().

Value

A character vector of positive length, containing the file paths that matched ..., or the empty string, "", if none matched (unless mustWork = TRUE).

If matching the root of a package, there is no trailing separator.

system.file() with no arguments gives the root of the base package.
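The difference between the default and mustWork = TRUE can be sketched as follows. The file name no-such-file.txt is made up so that no match exists; package stats is used because it is part of every R installation.

```r
root <- system.file(package = "stats")
desc <- system.file("DESCRIPTION", package = "stats")
print(root)
print(desc)

# A file that does not exist (hypothetical name): the default returns "".
missing_path <- system.file("no-such-file.txt", package = "stats")
print(missing_path)

# With mustWork = TRUE the same lookup signals an error instead.
res <- tryCatch(
    system.file("no-such-file.txt", package = "stats", mustWork = TRUE),
    error = function(e) "error"
)
print(res)
```

Checking nzchar() on the result is a simple way to test for a match when mustWork is left at its default.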

See Also

R.home for the root directory of the R installation, list.files.

Sys.glob to find paths via wildcards.

Examples

system.file()                  # The root of the 'base' package
system.file(package = "stats") # The root of package 'stats'
system.file("INDEX")
system.file("help", "AnIndex", package = "splines")

CPU Time Used

Description

Return CPU (and other) times that expr used.

Usage

system.time(expr, gcFirst=TRUE)

Arguments

expr

Valid R expression to be timed.

gcFirst

Logical - should a garbage collection be performed immediately before the timing? Default is TRUE.

Details

system.time calls the function proc.time, evaluates expr, and then calls proc.time once more, returning the difference between the two proc.time calls.

unix.time was an alias of system.time, for compatibility with S; it was deprecated in 2016 and finally became defunct in 2022.

Timings of evaluations of the same expression can vary considerably depending on whether the evaluation triggers a garbage collection. When gcFirst is TRUE a garbage collection (gc) will be performed immediately before the evaluation of expr. This will usually produce more consistent timings.

Value

An object of class "proc_time": see proc.time for details.
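As a brief sketch of working with the result, individual timings can be extracted by name. The loop being timed is arbitrary.

```r
tm <- system.time(for (i in 1:1000) sqrt(i))
print(class(tm))
print(names(tm))

# Wall-clock time in seconds for the expression.
elapsed <- tm[["elapsed"]]
print(elapsed)
```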

See Also

proc.time, time which is for time series.

setTimeLimit to limit the (CPU/elapsed) time R is allowed to use.

Sys.time to get the current date & time.

Examples

require(stats)
system.time(for(i in 1:100) mad(runif(1000)))
## Not run:
exT <- function(n = 10000) {
  # Purpose: Test if system.time works ok;   n: loop size
  system.time(for(i in 1:n) x <- mean(rt(1000, df = 4)))
}
#-- Try to interrupt one of the following (using Ctrl-C / Escape):
exT()                 #- about 4 secs on a 2.5GHz Xeon
system.time(exT())    #~ +/- same
## End(Not run)

Invoke a System Command

Description

system2 invokes the OS command specified by command.

Usage

system2(command, args = character(),
        stdout = "", stderr = "", stdin = "", input = NULL,
        env = character(), wait = TRUE,
        minimized = FALSE, invisible = TRUE, timeout = 0,
        receive.console.signals = wait)

Arguments

command

the system command to be invoked, as a character string.

args

a character vector of arguments to command. The arguments have to be quoted e.g. by shQuote in case they contain space or other special characters (a double quote or backslash on Windows, shell-specific special characters on Unix).

stdout, stderr

where output to ‘stdout’ or ‘stderr’ should be sent. Possible values are "", to the R console (the default), NULL or FALSE (discard output), TRUE (capture the output in a character vector) or a character string naming a file.

stdin

should input be diverted? "" means the default, alternatively a character string naming a file. Ignored if input is supplied.

input

if a character vector is supplied, this is copied one string per line to a temporary file, and the standard input of command is redirected to the file.

env

character vector of name=value strings to set environment variables.

wait

a logical (not NA) indicating whether the R interpreter should wait for the command to finish, or run it asynchronously. This will be ignored (and the interpreter will always wait) if stdout = TRUE or stderr = TRUE. When running the command asynchronously, no output will be displayed on the Rgui console in Windows (it will be dropped, instead).

timeout

timeout in seconds, ignored if 0. This is a limit for the elapsed time running command in a separate process. Fractions of seconds are ignored.

receive.console.signals

a logical (not NA) indicating whether the command should receive events from the terminal/console that R runs from, particularly whether it should be interrupted by Ctrl-C. This will be ignored and events will always be received when intern = TRUE or wait = TRUE.

minimized, invisible

arguments that are accepted on Windows but ignored on this platform, with a warning.

Details

Unlike system, command is always quoted by shQuote, so it must be a single command without arguments.

For details of how command is found see system.

On Windows, env is only supported for commands such as R and make which accept environment variables on their command line.

Some Unix commands (such as some implementations of ls) change their output if they consider it to be piped or redirected: stdout = TRUE uses a pipe whereas stdout = "some_file_name" uses redirection.

Because of the way it is implemented, on a Unix-alike stderr = TRUE implies stdout = TRUE: a warning is given if this is not what was specified.

When timeout is non-zero, the command is terminated after the given number of seconds. The termination works for typical commands, but is not guaranteed: it is possible to write a program that would keep running after the time is out. Timeouts can only be set with wait = TRUE.

Timeouts cannot be used with interactive commands: the command is run with standard input redirected from /dev/null and it must not modify terminal settings. As long as the tty tostop option is disabled, which it usually is by default, the executed command may write to standard output and standard error.

receive.console.signals = TRUE is useful when running asynchronous processes (using wait = FALSE) to implement a synchronous operation. In all other cases it is recommended to use the default.

Value

If stdout = TRUE or stderr = TRUE, a character vector giving the output of the command, one line per character string. (Output lines of more than 8095 bytes will be split.) If the command could not be run an R error is generated. If command runs but gives a non-zero exit status this will be reported with a warning and in the attribute "status" of the result: an attribute "errmsg" may also be available.

In other cases, the return value is an error code (0 for success), given the invisible attribute (so needs to be printed explicitly). If the command could not be run for any reason, the value is 127 and a warning is issued (as from R 3.5.0). Otherwise if wait = TRUE the value is the exit status returned by the command, and if wait = FALSE it is 0 (the conventional success value).

If the command times out, a warning is issued and the exit status is 124.
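Examples

This help page has no example of its own, so the following is only an illustrative sketch. It runs a child R process through Rscript (located via R.home("bin"), an assumption made so that no other tool is needed) and shows both kinds of return value. Note that command is passed unquoted while each element of args is quoted by the caller.

```r
rscript <- file.path(R.home("bin"), "Rscript")

# stdout = TRUE: the output is captured as a character vector.
out <- system2(rscript,
               c("--vanilla", "-e", shQuote("cat(6 * 7, fill = TRUE)")),
               stdout = TRUE)
print(out)

# Default stdout: the (invisible) return value is the exit status.
status <- system2(rscript,
                  c("--vanilla", "-e", shQuote("quit(status = 2)")))
print(status)
```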

Note

system2 is a more portable and flexible interface than system. It allows redirection of output without needing to invoke a shell on Windows, a portable way to set environment variables for the execution of command, and finer control over the redirection of stdout and stderr. Conversely, system (and shell on Windows) allows the invocation of arbitrary command lines.

There is no guarantee that if stdout and stderr are both TRUE or the same file that the two streams will be interleaved in order. This depends on both the buffering used by the command and the OS.

See Also

system.


Matrix Transpose

Description

Given a matrix or data.frame x, t returns the transpose of x.

Usage

t(x)

Arguments

x

a matrix or data frame, typically.

Details

This is a generic function for which methods can be written. The description here applies to the default and "data.frame" methods.

A data frame is first coerced to a matrix: see as.matrix. When x is a vector, it is treated as a column, i.e., the result is a 1-row matrix.
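Both statements can be checked with a short sketch. The vector and the data frame are made up for illustration.

```r
v <- 1:3
print(dim(t(v)))      # a vector is treated as a column, so t(v) has 1 row
print(dim(t(t(v))))   # transposing again gives a 3 x 1 matrix

# A data frame with mixed column types is coerced to a character matrix.
df <- data.frame(a = 1:2, b = c("x", "y"))
m <- t(df)
print(m)
print(typeof(m))
```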

Value

A matrix, with dim and dimnames constructed appropriately from those of x, and other attributes except names copied across.

Note

The conjugate transpose of a complex matrix A, denoted A^H or A^*, is computed as Conj(t(A)).
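A small sketch with an arbitrary 2 x 2 complex matrix shows the element-wise relationship and one familiar consequence, namely that the product of the conjugate transpose with the matrix itself is Hermitian.

```r
# An arbitrary complex matrix, filled column by column.
A <- matrix(c(1+2i, 3-1i, 0+1i, 2+0i), nrow = 2)

AH <- Conj(t(A))                       # conjugate transpose of A
print(AH)
print(AH[2, 1] == Conj(A[1, 2]))       # element (j, i) is Conj of element (i, j)

# G equals its own conjugate transpose.
G <- AH %*% A
print(isTRUE(all.equal(G, Conj(t(G)))))
```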

References

Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988) The New S Language. Wadsworth & Brooks/Cole.

See Also

aperm for permuting the dimensions of arrays.

Examples

a <- matrix(1:30, 5, 6)
ta <- t(a) ##-- i.e.,  a[i, j] == ta[j, i] for all i,j :
for(j in seq(ncol(a)))
  if(! all(a[, j] == ta[j, ])) stop("wrong transpose")

Cross Tabulation and Table Creation

Description

table uses cross-classifying factors to build a contingency table of the counts at each combination of factor levels.

Usage

table(...,
      exclude = if (useNA == "no") c(NA, NaN),
      useNA = c("no", "ifany", "always"),
      dnn = list.names(...), deparse.level = 1)

as.table(x, ...)
is.table(x)

## S3 method for class 'table'
as.data.frame(x, row.names = NULL, ...,
              responseName = "Freq", stringsAsFactors = TRUE,
              sep = "", base = list(LETTERS))

Arguments

...

one or more objects which can be interpreted as factors (including numbers or character strings), or a list (such as a data frame) whose components can be so interpreted. (For as.table, arguments passed to specific methods; for as.data.frame, unused.)

exclude

levels to remove for all factors in .... If it does not contain NA and useNA is not specified, it implies useNA = "ifany". See ‘Details’ for its interpretation for non-factor arguments.

useNA

whether to include NA values in the table. See ‘Details’. Can be abbreviated.

dnn

the names to be given to the dimensions in the result (the dimnames names).

deparse.level

controls how the default dnn is constructed. See ‘Details’.

x

an arbitrary R object, or an object inheriting from class "table" for the as.data.frame method. Note that as.data.frame.table(x, *) may be called explicitly for non-table x for “reshaping” arrays.

row.names

a character vector giving the row names for the data frame.

responseName

the name to be used for the column of table entries, usually counts.

stringsAsFactors

logical: should the classifying factors be returned as factors (the default) or character vectors?

sep, base

passed to provideDimnames.

Details

If the argument dnn is not supplied, the internal function list.names is called to compute the ‘dimname names’ as follows: If ... is one list with its own names(), these names are used. Otherwise, if the arguments in ... are named, those names are used. For the remaining arguments, deparse.level = 0 gives an empty name, deparse.level = 1 uses the supplied argument if it is a symbol, and deparse.level = 2 will deparse the argument.

Only when exclude is specified (i.e., not by default) and non-empty, will table potentially drop levels of factor arguments.

useNA controls if the table includes counts of NA values: the allowed values correspond to never ("no"), only if the count is positive ("ifany") and even for zero counts ("always"). Note the somewhat “pathological” case of two different kinds of NAs which are treated differently, depending on both useNA and exclude, see d.patho in the ‘Examples:’ below.

Both exclude and useNA operate on an “all or none” basis. If you want to control the dimensions of a multiway table separately, modify each argument using factor or addNA.
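The three useNA settings can be compared side by side in a minimal sketch. The vectors x (one NA) and y (no NA) are made up for illustration.

```r
x <- c("a", "b", NA, "a")   # contains one NA
y <- c("a", "b", "a")       # contains no NA

print(table(x))                     # default "no": the NA is not counted
print(table(x, useNA = "ifany"))    # NA column appears, its count is positive
print(table(y, useNA = "ifany"))    # no NA column, its count would be zero
print(table(y, useNA = "always"))   # NA column appears even with a zero count
```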

Non-factor arguments a are coerced via factor(a, exclude=exclude). Since R 3.4.0, care is taken not to count the excluded values (where they were included in the NA count, previously).

The summary method for class "table" (used for objects created by table or xtabs) gives basic information and performs a chi-squared test for independence of factors (note that the function chisq.test currently only handles 2-d tables).

Value

table() returns a contingency table, an object of class "table", an array of integer values. Note that unlike S the result is always an array, a 1D array if one factor is given.

as.table and is.table coerce to and test for contingency table, respectively.

The as.data.frame method for objects inheriting from class "table" can be used to convert the array-based representation of a contingency table to a data frame containing the classifying factors and the corresponding entries (the latter as component named by responseName). This is the inverse of xtabs.

References

Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988) The New S Language. Wadsworth & Brooks/Cole.

See Also

tabulate is the underlying function and allows finer control.

Use ftable for printing (and more) of multidimensional tables. margin.table, prop.table, addmargins.

addNA for constructing factors with NA as a level.

xtabs for cross tabulation of data frames with a formula interface.

Examples

require(stats) # for rpois and xtabs
## Simple frequency distribution
table(rpois(100, 5))
## Check the design:
with(warpbreaks, table(wool, tension))
table(state.division, state.region)

# simple two-way contingency table
with(airquality, table(cut(Temp, quantile(Temp)), Month))

a <- letters[1:3]
table(a, sample(a))                    # dnn is c("a", "")
table(a, sample(a), dnn = NULL)        # dimnames() have no names
table(a, sample(a), deparse.level = 0) # dnn is c("", "")
table(a, sample(a), deparse.level = 2) # dnn is c("a", "sample(a)")

## xtabs() <-> as.data.frame.table() :
UCBAdmissions ## already a contingency table
DF <- as.data.frame(UCBAdmissions)
class(tab <- xtabs(Freq ~ ., DF)) # xtabs & table
## tab *is* "the same" as the original table:
all(tab == UCBAdmissions)
all.equal(dimnames(tab), dimnames(UCBAdmissions))

a <- rep(c(NA, 1/0:3), 10)
table(a)                 # does not report NA's
table(a, exclude = NULL) # reports NA's
b <- factor(rep(c("A","B","C"), 10))
table(b)
table(b, exclude = "B")
d <- factor(rep(c("A","B","C"), 10), levels = c("A","B","C","D","E"))
table(d, exclude = "B")
print(table(b, d), zero.print = ".")

## NA counting:
is.na(d) <- 3:4
d. <- addNA(d)
d.[1:7]
table(d.) # ", exclude = NULL" is not needed
## i.e., if you want to count the NA's of 'd', use
table(d, useNA = "ifany")

## "pathological" case:
d.patho <- addNA(c(1,NA,1:2,1:3))[-7]; is.na(d.patho) <- 3:4
d.patho
## just 3 consecutive NA's ? --- well, have *two* kinds of NAs here :
as.integer(d.patho) # 1 4 NA NA 1 2
##
## In R >= 3.4.0, table() allows to differentiate:
table(d.patho)                   # counts the "unusual" NA
table(d.patho, useNA = "ifany")  # counts all three
table(d.patho, exclude = NULL)   #  (ditto)
table(d.patho, exclude = NA)     # counts none

## Two-way tables with NA counts. The 3rd variant is absurd, but shows
## something that cannot be done using exclude or useNA.
with(airquality,
   table(OzHi = Ozone > 80, Month, useNA = "ifany"))
with(airquality,
   table(OzHi = Ozone > 80, Month, useNA = "always"))
with(airquality,
   table(OzHi = Ozone > 80, addNA(Month)))

Tabulation for Vectors

Description

tabulate takes the integer-valued vector bin and counts the number of times each integer occurs in it.

Usage

tabulate(bin, nbins = max(1, bin, na.rm = TRUE))

Arguments

bin

a numeric vector (of positive integers), or a factor. Long vectors are supported.

nbins

the number of bins to be used.

Details

tabulate is the workhorse for the table function.

If bin is a factor, its internal integer representation is tabulated.

If the elements of bin are numeric but not integers, they are truncated by as.integer.
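The truncation and the factor case can be sketched briefly. The input values are arbitrary.

```r
# 1.9, 2.1 and 2.9 are truncated to 1, 2 and 2 before counting.
counts <- tabulate(c(1.9, 2.1, 2.9))
print(counts)

# For a factor the integer codes are tabulated, so the counts agree
# with those given by table(), but without names.
f <- factor(c("b", "a", "b"))
print(tabulate(f))
print(as.vector(table(f)))
```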

Value

An integer-valued integer or double vector (without names). There is a bin for each of the values 1, ..., nbins; values outside that range and NAs are (silently) ignored.

On 64-bit platforms bin can have 2^31 or more elements (i.e., length(bin) > .Machine$integer.max), and hence a count could exceed the maximum integer. For this reason, the return value is of type double for such long bin vectors.

See Also

table, factor.

Examples

tabulate(c(2,3,5))
tabulate(c(2,3,3,5), nbins = 10)
tabulate(c(-2,0,2,3,3,5))  # -2 and 0 are ignored
tabulate(c(-2,0,2,3,3,5), nbins = 3)
tabulate(factor(letters[1:10]))

Tailcall and Exec

Description

Tailcall and Exec allow writing more stack-space-efficient recursive functions in R.

Usage

Tailcall(FUN, ...)
Exec(expr, envir)

Arguments

FUN

a function or a non-empty character string naming the function to be called.

...

all the arguments to be passed.

expr

a call expression.

envir

environment for evaluating expr; default is the environment from which Exec is called.

Details

Tailcall evaluates a call to FUN with arguments ... in the current environment, and Exec evaluates the call expr in environment envir. If a Tailcall or Exec expression appears in tail position in an R function, and if there are no on.exit expressions set, then the evaluation context of the new call replaces the currently executing call context. If the requirements for context re-use are not met, then evaluation proceeds in the standard way, adding another context to the stack.

Using Tailcall it is possible to define tail-recursive functions that do not grow the evaluation stack. Exec can be used to simplify the call stack for functions that create and then evaluate an expression.

Because of lazy evaluation of arguments in R it may be necessary to force evaluation of some arguments to avoid accumulating deferred evaluations.

This tail call optimization has the advantage of not growing the call stack and permitting arbitrarily deep tail recursions. It does also mean that stack traces produced by traceback or sys.calls will only show the call specified by Tailcall or Exec, not the previous call whose stack entry has been replaced.

Note

Tailcall and Exec are experimental and may be changed or dropped in future released versions of R.

See Also

Recall and force.

Examples

## tail-recursive log10-factorial
lfact <- function(n) {
    lfact_iter <- function(val, n) {
        if (n <= 0)
            val
        else {
            val <- val + log10(n) # forces val
            Tailcall(lfact_iter, val, n - 1)
        }
    }
    lfact_iter(0, n)
}
10 ^ lfact(3)
lfact(100000)

## simplified variant of do.call using Exec:
docall <- function(what, args, quote = FALSE) {
    if (!is.list(args))
        stop("second argument must be a list")
    if (quote)
        args <- lapply(args, enquote)
    Exec(as.call(c(list(substitute(what)), args)), parent.frame())
}
## the call stack does not contain the call to docall:
docall(function() sys.calls(), list()) |>
    Find(function(x) identical(x[[1]], quote(docall)), x = _)
## contrast to do.call:
do.call(function(x) sys.calls(), list()) |>
    Find(function(x) identical(x[[1]], quote(do.call)), x = _)

Apply a Function Over a Ragged Array

Description

Apply a function to each cell of a ragged array, that is to each (non-empty) group of values or data rows given by a unique combination of the levels of certain factors.

Usage

tapply(X, INDEX, FUN = NULL, ..., default = NA, simplify = TRUE)

Arguments

X

an R object for which a split method exists. Typically vector-like, allowing subsetting with [, or a data frame.

INDEX

a list of one or more factors, each of same length as X. The elements are coerced to factors by as.factor. Can also be a formula, which is useful if X is a data frame; see the f argument in split for interpretation.

FUN

a function (or name of a function) to be applied, or NULL. In the case of functions like +, %*%, etc., the function name must be backquoted or quoted. If FUN is NULL, tapply returns a vector which can be used to subscript the multi-way array tapply normally produces.

...

optional arguments to FUN; see the Note section.

default

(only in the case of simplification to an array) the value with which the array is initialized as array(default, dim = ..). Before R 3.4.0, this was hard coded to array()'s default NA. If it is NA (the default), the missing value of the answer type, e.g. NA_real_, is chosen (as.raw(0) for "raw"). In a numerical case, it may be set, e.g., to FUN(integer(0)), e.g., in the case of FUN = sum to 0 or 0L.

simplify

logical; if FALSE, tapply always returns an array of mode "list"; in other words, a list with a dim attribute. If TRUE (the default), then if FUN always returns a scalar, tapply returns an array with the mode of the scalar.

Details

If FUN is not NULL, it is passed to match.fun, and hence it can be a function or a symbol or character string naming a function.

Value

When FUN is present, tapply calls FUN for each cell that has any data in it. If FUN returns a single atomic value for each such cell (e.g., functions mean or var) and when simplify is TRUE, tapply returns a multi-way array containing the values, and NA for the empty cells. The array has the same number of dimensions as INDEX has components; the number of levels in a dimension is the number of levels (nlevels()) in the corresponding component of INDEX. Note that if the return value has a class (e.g., an object of class "Date") the class is discarded.

simplify = TRUE always returns an array, possibly 1-dimensional.

If FUN does not return a single atomic value, tapply returns an array of mode list whose components are the values of the individual calls to FUN, i.e., the result is a list with a dim attribute.

When there is an array answer, its dimnames are named by the names of INDEX and are based on the levels of the grouping factors (possibly after coercion).

For a list result, the elements corresponding to empty cells are NULL.

The array2DF function can be used to convert the array returned by tapply into a data frame, which may be more convenient for further analysis.
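The remark above that a class on the return value is discarded can be seen with dates. The following is a small sketch with made-up data, contrasting the simplified result with simplify = FALSE.

```r
# Hypothetical data: three dates in two groups
d <- as.Date(c("2024-01-01", "2024-03-01", "2024-02-01"))
g <- c("a", "a", "b")

# With simplification the "Date" class is lost and plain numbers remain
latest <- tapply(d, g, max)
inherits(latest, "Date")

# With simplify = FALSE each cell keeps the object returned by FUN
kept <- tapply(d, g, max, simplify = FALSE)
inherits(kept[[1]], "Date")
```

When the class matters, working with the list result (or re-applying the class afterwards) is one way around the loss.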

Note

Optional arguments to FUN supplied by the ... argument are not divided into cells. It is therefore inappropriate for FUN to expect additional arguments with the same length as X.
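One possible workaround when FUN needs a second vector split in the same way, sketched here with made-up data, is to apply tapply to the indices and subset both vectors inside FUN. The variable names are illustrative only.

```r
x <- c(1, 2, 3, 4)          # values
w <- c(1, 3, 1, 1)          # weights, same length as x
g <- c("a", "a", "b", "b")  # grouping

# Passing w through ... would hand the full weight vector to every cell.
# Splitting the indices instead keeps x and w aligned within each group.
wm <- tapply(seq_along(x), g, function(i) weighted.mean(x[i], w[i]))
print(wm)
```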

References

Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988) The New S Language. Wadsworth & Brooks/Cole.

See Also

the convenience functions by and aggregate (using tapply); apply, lapply with its versions sapply and mapply.

array2DF to convert the result into a data frame.

Examples

require(stats)
groups <- as.factor(rbinom(32, n = 5, prob = 0.4))
tapply(groups, groups, length) #- is almost the same as
table(groups)

## contingency table from data.frame : array with named dimnames
tapply(warpbreaks$breaks, warpbreaks[,-1], sum)
tapply(warpbreaks$breaks, warpbreaks[, 3, drop = FALSE], sum)

n <- 17; fac <- factor(rep_len(1:3, n), levels = 1:5)
table(fac)
tapply(1:n, fac, sum)
tapply(1:n, fac, sum, default = 0) # maybe more desirable
tapply(1:n, fac, sum, simplify = FALSE)
tapply(1:n, fac, range)
tapply(1:n, fac, quantile)
tapply(1:n, fac, length) ## NA's
tapply(1:n, fac, length, default = 0) # == table(fac)

## example of ... argument: find quarterly means
tapply(presidents, cycle(presidents), mean, na.rm = TRUE)

ind <- list(c(1, 2, 2), c("A", "A", "B"))
table(ind)
tapply(1:3, ind) #-> the split vector
tapply(1:3, ind, sum)

## Some assertions (not held by all patch proposals):
nq <- names(quantile(1:5))
stopifnot(
  identical(tapply(1:3, ind), c(1L, 2L, 4L)),
  identical(tapply(1:3, ind, sum),
            matrix(c(1L, 2L, NA, 3L), 2, dimnames = list(c("1", "2"), c("A", "B")))),
  identical(tapply(1:n, fac, quantile)[-1],
            array(list(`2` = structure(c(2, 5.75, 9.5, 13.25, 17), names = nq),
                 `3` = structure(c(3, 6, 9, 12, 15), names = nq),
                 `4` = NULL, `5` = NULL), dim = 4, dimnames = list(as.character(2:5)))))

Add or Remove a Top-Level Task Callback

Description

addTaskCallback registers an R function that is to be called each time a top-level task is completed.

removeTaskCallback un-registers a function that was registered earlier via addTaskCallback.

These provide low-level access to the internal/native mechanism for managing task-completion actions. One can use taskCallbackManager at the R-language level to manage R functions that are called at the completion of each task. This is easier and more direct.

Usage

addTaskCallback(f, data = NULL, name = character())
removeTaskCallback(id)

Arguments

f

the function that is to be invoked each time a top-level task is successfully completed. This is called with 5 or 4 arguments depending on whether data is specified or not, respectively. The return value should be a logical value indicating whether to keep the callback in the list of active callbacks or discard it.

data

if specified, this is the 5-th argument in the call to the callback function f.

id

a string or an integer identifying the element in the internal callback list to be removed. Integer indices are 1-based, i.e., the first element is 1. The names of currently registered handlers are available using getTaskCallbackNames and are also returned in a call to addTaskCallback.

name

character: names to be used.

Details

Top-level tasks are individual expressions rather than entire lines of input. Thus an input line of the form expression1 ; expression2 will give rise to 2 top-level tasks.

A top-level task callback is called with the expression for the top-level task, the result of the top-level task, a logical value indicating whether it was successfully completed or not (always TRUE at present), and a logical value indicating whether the result was printed or not. If the data argument was specified in the call to addTaskCallback, that value is given as the fifth argument.

The callback function should return a logical value. If the value is FALSE, the callback is removed from the task list and will not be called again by this mechanism. If the function returns TRUE, it is kept in the list and will be called on the completion of the next top-level task.
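The calling convention described above can be sketched as follows. The callback log_task, the name "logger" and the contents of data are hypothetical and chosen for illustration; only addTaskCallback, getTaskCallbackNames and removeTaskCallback come from this page.

```r
# Hypothetical logger; 'data' arrives as the fifth argument because
# data = is supplied when the callback is registered.
log_task <- function(expr, value, ok, visible, data) {
  cat(data$prefix, deparse(expr)[1], "\n")
  TRUE  # TRUE keeps the callback registered; FALSE would remove it
}

id <- addTaskCallback(log_task, data = list(prefix = "done:"), name = "logger")
registered <- "logger" %in% getTaskCallbackNames()

# Removing by name avoids relying on the position, which can shift
# when earlier callbacks are removed.
removed <- removeTaskCallback("logger")
```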

Value

addTaskCallback returns an integer value giving the position in the list of task callbacks that this new callback occupies. This is only the current position of the callback. It can be used to remove the entry as long as no other values are removed from earlier positions in the list first.

removeTaskCallback returns a logical value indicating whether the specified element was removed. This can fail (i.e., return FALSE) if an incorrect name or index is given that does not correspond to the name or position of an element in the list.

Note

There is also C-level access to top-level task callbacks to allow C routines rather than R functions to be used.

See Also

getTaskCallbackNames, taskCallbackManager, https://developer.r-project.org/TaskHandlers.pdf

Examples

times <- function(total = 3, str = "Task a") {
  ctr <- 0
  function(expr, value, ok, visible) {
    ctr <<- ctr + 1
    cat(str, ctr, "\n")
    keep.me <- (ctr < total)
    if (!keep.me)
      cat("handler removing itself\n")
    # return
    keep.me
  }
}

# add the callback that will work for
# 4 top-level tasks and then remove itself.
n <- addTaskCallback(times(4))

# now remove it, assuming it is still first in the list.
removeTaskCallback(n)

## See how the handler is called every time till "self destruction":

addTaskCallback(times(4)) # counts as once already

sum(1:10); mean(1:3) # two more
sinpi(1)             # 4th - and "done"
cospi(1)
tanpi(1)

Create an R-level Task Callback Manager

Description

This provides an entirely R-language mechanism for managing callbacks or actions that are invoked at the conclusion of each top-level task. Essentially, we register a single R function from this manager with the underlying, native task-callback mechanism and this function handles invoking the other R callbacks under the control of the manager. The manager consists of a collection of functions that access shared variables to manage the list of user-level callbacks.

Usage

taskCallbackManager(handlers = list(), registered = FALSE,
                    verbose = FALSE)

Arguments

handlers

this can be a list of callbacks in which each element is a list with an element named "f" which is a callback function, and an optional element named "data" which is the 5-th argument to be supplied to the callback when it is invoked. Typically this argument is not specified, and one uses add to register callbacks after the manager is created.

registered

a logical value indicating whether the evaluate function has already been registered with the internal task callback mechanism. This is usually FALSE and the first time a callback is added via the add function, the evaluate function is automatically registered. One can control when the function is registered by specifying TRUE for this argument and calling addTaskCallback manually.

verbose

a logical value, which if TRUE, causes information to be printed to the console about certain activities this dispatch manager performs. This is useful for debugging callbacks and the handler itself.

Value

A list containing 6 functions:

add()

register a callback with this manager, giving the function, an optional 5-th argument, an optional name by which the callback is stored in the list, and a register argument which controls whether the evaluate function is registered with the internal C-level dispatch mechanism if necessary.

remove()

remove an element from the manager's collection of callbacks, either by name or position/index.

evaluate()

the ‘real’ callback function that is registered with the C-level dispatch mechanism and which invokes each of the R-level callbacks within this manager's control.

suspend()

a function to set the suspend state of the manager. If it is suspended, none of the callbacks will be invoked when a task is completed. One sets the state by specifying a logical value for the status argument.

register()

a function to register the evaluate function with the internal C-level dispatch mechanism. This is done automatically by the add function, but can be called manually.

callbacks()

returns the list of callbacks being maintained by this manager.
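A minimal sketch of how the functions in the returned list might be combined. The callback and its name "noop" are hypothetical; the name "R-taskCallbackManager" is the one used in the examples on this page for the manager's own registration.

```r
mgr <- taskCallbackManager()

# add() stores the callback under the given name and, by default,
# registers the manager's evaluate() function on first use.
nm <- mgr$add(function(expr, value, ok, visible) TRUE, name = "noop")
known <- names(mgr$callbacks())

mgr$suspend(TRUE)    # callbacks are skipped while suspended
mgr$suspend(FALSE)   # resume

mgr$remove("noop")   # drop the R-level callback from the manager
left <- length(mgr$callbacks())

# The manager itself remains registered until removed explicitly.
cleaned <- removeTaskCallback("R-taskCallbackManager")
```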

References

Duncan Temple Lang (2001) Top-level Task Callbacks in R, https://developer.r-project.org/TaskHandlers.pdf

See Also

addTaskCallback, removeTaskCallback, getTaskCallbackNames and the reference.

Examples

# create the manager
h <- taskCallbackManager()

# add a callback
h$add(function(expr, value, ok, visible) {
                       cat("In handler\n")
                       return(TRUE)
                     }, name = "simpleHandler")

# look at the internal callbacks.
getTaskCallbackNames()

# look at the R-level callbacks
names(h$callbacks())

removeTaskCallback("R-taskCallbackManager")

Query the Names of the Current Internal Top-Level Task Callbacks

Description

This provides a way to get the names (or identifiers) for the currently registered task callbacks that are invoked at the conclusion of each top-level task. These identifiers can be used to remove a callback.

Usage

getTaskCallbackNames()

Value

A character vector giving the name for each of the registered callbacks which are invoked when a top-level task is completed successfully. Each name is the one used when registering the callback and returned in the call to addTaskCallback.

Note

One can use taskCallbackManager to manage user-level task callbacks, i.e., R-language functions, entirely within the R language and access the names more directly.

See Also

addTaskCallback, removeTaskCallback, taskCallbackManager, https://developer.r-project.org/TaskHandlers.pdf

Examples

n <- addTaskCallback(function(expr, value, ok, visible) {
                        cat("In handler\n")
                        return(TRUE)
                      }, name = "simpleHandler")

getTaskCallbackNames()

# now remove it by name
removeTaskCallback("simpleHandler")

h <- taskCallbackManager()
h$add(function(expr, value, ok, visible) {
                        cat("In handler\n")
                        return(TRUE)
                      }, name = "simpleHandler")
getTaskCallbackNames()
removeTaskCallback("R-taskCallbackManager")

Create Names for Temporary Files

Description

tempfile returns a vector of character strings which can be used as names for temporary files.

Usage

tempfile(pattern = "file", tmpdir = tempdir(), fileext = "")
tempdir(check = FALSE)

Arguments

pattern

a non-empty character vector giving the initial part of the name.

tmpdir

a non-empty character vector giving the directory name.

fileext

a non-empty character vector giving the file extension.

check

logical indicating if tempdir() should be checked and recreated if no longer valid.

Details

The length of the result is the maximum of the lengths of the three arguments; values of shorter arguments are recycled.

The names are very likely to be unique among calls to tempfile in an R session and across simultaneous R sessions (unless tmpdir is specified). The filenames are guaranteed not to be currently in use.

The file name is made by concatenating the path given by tmpdir, the pattern string, a random string in hex and a suffix of fileext.

By default, tmpdir will be the directory given by tempdir(). This will be a subdirectory of the per-session temporary directory found by the following rule when the R session is started. The environment variables TMPDIR, TMP and TEMP are checked in turn and the first found which points to a writable directory is used: if none succeeds ‘/tmp’ is used. The path must not contain spaces.

Note that setting any of these environment variables in the R session has no effect on tempdir(): the per-session temporary directory is created before the interpreter is started.

Value

For tempfile a character vector giving the names of possible (temporary) files. Note that no files are generated by tempfile.
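As a small sketch of the points above, the following shows that the arguments are recycled and that tempfile only produces names. The pattern and extensions are arbitrary.

```r
# Two names are returned because 'fileext' has length 2 and
# 'pattern' is recycled to match.
paths <- tempfile(pattern = "plot", fileext = c(".ps", ".pdf"))
exists_before <- file.exists(paths)   # tempfile() only generates names

# The caller creates the file, and removes it when done.
writeLines("placeholder", paths[1])
exists_after <- file.exists(paths[1])
unlink(paths[1])
```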

For tempdir, the path of the per-session temporary directory.

On Windows, both will use a backslash as the path separator.

On a Unix-alike, the value will be an absolute path (unless tmpdir is set to a relative path), but it need not be canonical (see normalizePath) and on macOS it often is not.

Note on parallel use

R processes forked by functions such as mclapply and makeForkCluster in package parallel share a per-session temporary directory. Further, the ‘guaranteed not to be currently in use’ applies only at the time of asking, and two children could ask simultaneously. This is circumvented by ensuring that tempfile calls in different children try different names.

Source

The final component of tempdir() is created by the POSIX system call mkdtemp, or if this is not available (e.g. on Windows) a version derived from the source code of GNU glibc.

It will be of the form ‘RtmpXXXXXX’ where the last 6 characters are replaced in a platform-specific way. POSIX only requires that the replacements be ASCII, which allows . (so the value may appear to have a file extension) and regexp metacharacters such as +. Most commonly the replacements are from the regexp pattern [A-Za-z0-9], but . has been seen.

References

Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988) The New S Language. Wadsworth & Brooks/Cole.

See Also

unlink for deleting files.

Examples

tempfile(c("ab", "a b c"))   # give file name with spaces in!

tempfile("plot", fileext = c(".ps", ".pdf"))

tempdir() # works on all platforms with a platform-dependent result

## Show how 'check' is working on some platforms:
if(exists("I'm brave") && `I'm brave` &&
   identical(.Platform$OS.type, "unix") && grepl("^/tmp/", tempdir())) {
  cat("Current tempdir(): ", tempdir(), "\n")
  cat("Removing it :", file.remove(tempdir()),
      "; dir.exists(tempdir()):", dir.exists(tempdir()), "\n")
  cat("and now  tempdir(check = TRUE) :", tempdir(check = TRUE), "\n")
}

Text Connections

Description

Input and output text connections.

Usage

textConnection(object, open = "r", local = FALSE,
               name = deparse1(substitute(object)),
               encoding = c("", "bytes", "UTF-8"))
textConnectionValue(con)

Arguments

object

character. A description of the connection. For an input this is an R character vector object, and for an output connection the name for the R character vector to receive the output, or NULL (for none).

open

character string. Either "r" (or equivalently "") for an input connection or "w" or "a" for an output connection.

local

logical. Used only for output connections. If TRUE, output is assigned to a variable in the calling environment. Otherwise the global environment is used.

name

a character string specifying the connection name.

encoding

character string, partially matched. Used only for input connections. How marked strings in object should be handled: converted to the current locale, used byte-by-byte or translated to UTF-8.

con

an output text connection.

Details

An input text connection is opened and the character vector is copied at the time the connection object is created, and close destroys the copy. object should be the name of a character vector: however, short expressions will be accepted provided they deparse to less than 60 bytes.

An output text connection is opened and creates an R character vector of the given name in the user's workspace or in the calling environment, depending on the value of the local argument. This object will at all times hold the completed lines of output to the connection, and isIncomplete will indicate if there is an incomplete final line. Closing the connection will output the final line, complete or not. (A line is complete once it has been terminated by end-of-line, represented by "\n" in R.) The output character vector has locked bindings (see lockBinding) until close is called on the connection. The character vector can also be retrieved via textConnectionValue, which is the only way to do so if object = NULL. If the current locale is detected as Latin-1 or UTF-8, non-ASCII elements of the character vector will be marked accordingly (see Encoding).
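The behaviour of completed and incomplete lines described above can be sketched with an output connection created with object = NULL, where textConnectionValue is the only way to read the result. The strings written are arbitrary.

```r
con <- textConnection(NULL, open = "w")      # no named variable is created
writeLines(c("first", "second"), con)
cat("partial", file = con)                   # no end-of-line yet
pending <- isIncomplete(con)

complete_lines <- textConnectionValue(con)   # holds completed lines only
cat("\n", file = con)                        # terminate the pending line
all_lines <- textConnectionValue(con)
close(con)
```

The value has to be fetched before close(con), since the connection no longer exists afterwards.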

Opening a text connection with open = "a" will attempt to append to an existing character vector with the given name in the user's workspace or the calling environment. If none is found (even if an object exists of the right name but the wrong type) a new character vector will be created, with a warning.

You cannot seek on a text connection, and seek will always return zero as the position.

Text connections have slightly unusual semantics: they are always open, and throwing away an input text connection without closing it (so it gets garbage-collected) does not give a warning.

Value

For textConnection, a connection object of class "textConnection" which inherits from class "connection".

For textConnectionValue, a character vector.

Note

As output text connections keep the character vector up to date line-by-line, they are relatively expensive to use, and it is often better to use an anonymous file() connection to collect output.
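A minimal sketch of the alternative mentioned above, collecting output in an anonymous file() connection and reading it back in one step. The strings written are arbitrary.

```r
tmp <- file()                         # anonymous file connection
writeLines(c("alpha", "beta"), tmp)
cat("gamma\n", file = tmp)
collected <- readLines(tmp)           # read everything written so far
close(tmp)                            # the anonymous file is discarded
```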

On (rare) platforms where vsnprintf does not return the needed length of output there is a 100,000 character limit on the length of line for output connections: longer lines will be truncated with a warning.

References

Chambers, J. M. (1998) Programming with Data. A Guide to the S Language. Springer.
[S has input text connections only.]

See Also

connections, showConnections, pushBack, capture.output.

Examples

zz <- textConnection(LETTERS)
readLines(zz, 2)
scan(zz, "", 4)
pushBack(c("aa", "bb"), zz)
scan(zz, "", 4)
close(zz)

zz <- textConnection("foo", "w")
writeLines(c("testit1", "testit2"), zz)
cat("testit3 ", file = zz)
isIncomplete(zz)
cat("testit4\n", file = zz)
isIncomplete(zz)
close(zz)
foo

# capture R output: use part of example from help(lm)
zz <- textConnection("foo", "w")
ctl <- c(4.17, 5.58, 5.18, 6.11, 4.5, 4.61, 5.17, 4.53, 5.33, 5.14)
trt <- c(4.81, 4.17, 4.41, 3.59, 5.87, 3.83, 6.03, 4.89, 4.32, 4.69)
group <- gl(2, 10, 20, labels = c("Ctl", "Trt"))
weight <- c(ctl, trt)
sink(zz)
anova(lm.D9 <- lm(weight ~ group))
cat("\nSummary of Residuals:\n\n")
summary(resid(lm.D9))
sink()
close(zz)
cat(foo, sep = "\n")

Tilde Operator

Description

Tilde is used to separate the left- and right-hand sides in a model formula.

Usage

y ~ model

Arguments

y, model

symbolic expressions.

Details

The left-hand side is optional, and one-sided formulae are used insome contexts.

A formula has mode call. It can be subsetted by [[: the components are ~, the left-hand side (if present) and the right-hand side, in that order. (Thus one-sided formulae have two components.)
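The component structure described above can be inspected directly. This is a small sketch with arbitrary variable names.

```r
f <- y ~ x + z     # two-sided formula
mode(f)            # "call"
length(f)          # three components
f[[1]]             # the tilde operator
f[[2]]             # left-hand side
f[[3]]             # right-hand side

g <- ~ x           # one-sided formula
length(g)          # two components: the tilde and the right-hand side
```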

References

Chambers, J. M. and Hastie, T. J. (1992) Statistical models. Chapter 2 of Statistical Models in S, eds J. M. Chambers and T. J. Hastie, Wadsworth & Brooks/Cole.

See Also

formula


Time Zones

Description

Information about time zones in R. Sys.timezone returns the name of the current time zone.

Usage

Sys.timezone(location = TRUE)
OlsonNames(tzdir = NULL)

Arguments

location

logical. Defunct, with a warning if FALSE.

tzdir

the time-zone database to be used: the default is to tryknown locations until one is found.

Details

Time zones are a system-specific topic, but these days almost all R platforms use similar underlying code, used by Linux, macOS, Solaris, AIX and FreeBSD, and installed with R on Windows. (Unfortunately there are many system-specific errors in the implementations.) It is possible to use the R sources' version of the code on Unix-alikes as well as on Windows: this is the default on macOS.

It should be possible to set the current time zone via the environment variable TZ: see the section on ‘Time zone names’ for suitable values. Sys.timezone() will return the value of TZ if set initially (and on some OSes it is always set), otherwise it will try to retrieve from the OS a value which if set for TZ would give the initial time zone. (‘Initially’ means before any time-zone functions are used: if TZ is being set to override the OS setting or if the ‘try’ does not get this right, it should be set before the R process is started or (probably early enough) in file .Rprofile).

If TZ is set but invalid, most platforms default to ‘UTC’, the time zone colloquially known as ‘GMT’ (see https://en.wikipedia.org/wiki/Coordinated_Universal_Time). (Some but not all platforms will give a warning for invalid values.) If it is unset or empty the system time zone is used (the one returned by Sys.timezone).

Time zones did not come into use until the middle of the nineteenth century and were not widely adopted until the twentieth, and daylight saving time (DST, also known as summer time) was first introduced in the early twentieth century, most widely in 1916. Over the last 100 years places have changed their affiliation between major time zones, have opted out of (or in to) DST in various years or adopted DST rule changes late or not at all. (For example, the UK experimented with DST throughout 1971, only.) In a few countries (one is the Irish Republic) it is the summer time which is the ‘standard’ time and a different name is used in winter. And there can be multiple changes during a year, for example for Ramadan.

A quite common system implementation of POSIXct was as signed 32-bit integers and so only went back to the end of 1901: on such systems R assumes that dates prior to that are in the same time zone as they were in 1902. Most of the world had not adopted time zones by 1902 (so used local ‘mean time’ based on longitude) but for a few places there had been time-zone changes before then. 64-bit representations are becoming by far the most common; unfortunately on some 64-bit OSes the database information is 32-bit and so only available for the range 1901–2038, and incompletely for the end years.

When a time zone location is first found in a session its value is cached in object .sys.timezone in the base environment.

Value

Sys.timezone returns an OS-specific character string, possibly NA or an empty string (which on some OSes means ‘UTC’). This will be a location such as "Europe/London" if one can be ascertained.

A time zone region may be known by several names: for example ‘"Europe/London"’ may also be known as ‘GB’, ‘GB-Eire’, ‘Europe/Belfast’, ‘Europe/Guernsey’, ‘Europe/Isle_of_Man’ and ‘Europe/Jersey’. A few regions are also known by a summary of their time zone, e.g. ‘PST8PDT’ is (on most but not all systems) an alias for ‘America/Los_Angeles’.

OlsonNames returns a character vector, see the examples for typical cases. It may have an attribute "Version", something like ‘"2023a"’. (It does on systems using --with-internal-tzcode and those like Fedora distributing file ‘tzdata.zi’.)
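A brief sketch of querying these values in a session. The results are platform-dependent, so no particular output is assumed here.

```r
tz <- Sys.timezone()      # a location name, or possibly NA or ""
nms <- OlsonNames()       # may be empty (with a warning) if no database is found

head(nms)
attr(nms, "Version")      # NULL when the database version is not recorded
"Europe/London" %in% nms  # check whether a location name is known
```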

Time zone names

Names "UTC" and its synonym "GMT" are accepted on all platforms.

Where OSes describe their valid time zones can be obscure. The help for the C function tzset can be helpful, but it can also be inaccurate. There is a cumbersome POSIX specification (listed under environment variable TZ at https://pubs.opengroup.org/onlinepubs/9699919799/basedefs/V1_chap08.html#tag_08), which is often at least partially supported, but there are other more user-friendly ways to specify time zones.

Almost all R platforms make use of a time-zone database originally compiled by Arthur David Olson and now managed by IANA, in which the preferred way to refer to a time zone is by a location (typically of a city), e.g., Europe/London, America/Los_Angeles, Pacific/Easter within a ‘time zone region’. Some traditional designations are also allowed such as EST5EDT or GB. (Beware that some of these designations may not be what you expect: in particular EST is a time zone used in Canada without daylight saving time, and not EST5EDT nor (Australian) Eastern Standard Time.) The designation can also be an optional colon prepended to the path to a file giving compiled zone information (and the examples above are all files in a system-specific location). See https://data.iana.org/time-zones/tz-link.html for more details and references. By convention, regions with a unique time-zone history since 1970 have specific names in the database, but those with different earlier histories may not. Each time zone has one or two (the second for ‘summer’) abbreviations used when formatting times.

Increasingly OSes are (optionally or always) not including ‘legacy’ names such as US/Eastern: only names of the forms Continent/City and Etc/... are fully portable.

The abbreviations used have changed over the years: for example France used ‘PMT’ (‘Paris Mean Time’) from 1891 to 1911 then ‘WET/WEST’ up to 1940 and ‘CET/CEST’ from 1946. (In almost all time zones the abbreviations have been stable since 1970.) The POSIX standard allows only one or two abbreviations per time zone, so you may see the current abbreviation(s) used for older times.

For some time zones abbreviations are like ‘-03’ and ‘+0845’: this is done when there is no official abbreviation. (Negative values are behind (West of) UTC, as for the "%z" format for strftime.)

The function OlsonNames returns the time-zone names known to the currently selected Olson/IANA database. The system-specific location in the file system varies, e.g. ‘/usr/share/zoneinfo’ (Linux, macOS, FreeBSD), ‘/usr/share/lib/zoneinfo’ (Solaris, AIX), .... It is likely that there is a file named something like ‘zone1970.tab’ or (older) ‘zone.tab’ under that directory listing the locations known as time-zone names (but not for example EST5EDT). See also https://en.wikipedia.org/wiki/Zone.tab.

Where R was configured with option --with-internal-tzcode (the default on Windows), the database at file.path(R.home("share"), "zoneinfo") is used by default: file ‘VERSION’ in that directory states the version. That option is also the default on macOS but there whichever is more recent of the system database at ‘/var/db/timezone/zoneinfo’ and that distributed with R is used by default. Environment variable TZDIR can be used to give the full path to a different ‘zoneinfo’ database: value "internal" indicates the database from the R sources and "macOS" indicates the system database. (Setting either of those values would not be recognized by other software using TZDIR.)

Setting TZDIR is also supported by the native services on some OSes, e.g. Linux using glibc except in secure modes.

Time zones given by name (via environment variable TZ, in tz arguments to functions such as as.POSIXlt and perhaps the system time zone) are loaded from the currently selected ‘zoneinfo’ database.

On Windows only: An attempt is made (once only per session) to map Windows' idea of the current time zone to a location, following a version of http://unicode.org/repos/cldr/trunk/common/supplemental/windowsZones.xml with additional values deduced from the Windows Registry and documentation. It can be overridden by setting the TZ environment variable before any date-times are used in the session.

Most platforms support time zones of the form ‘Etc/GMT+n’ and ‘Etc/GMT-n’ (possibly also without prefix ‘Etc/’), which assume a fixed offset from UTC (hence no DST). Contrary to some expectations (but consistent with names such as ‘PST8PDT’), negative offsets are times ahead of (East of) UTC, positive offsets are times behind (West of) UTC.
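The sign convention for the ‘Etc/GMT’ names can be checked with a short sketch. It assumes a platform whose time-zone database includes these names, which the text above says is the case for most but not all platforms.

```r
# Noon UTC on an arbitrary date
t0 <- as.POSIXct("2024-01-01 12:00:00", tz = "UTC")

# "Etc/GMT+5" is 5 hours behind (West of) UTC
west <- format(t0, format = "%H:%M", tz = "Etc/GMT+5")

# "Etc/GMT-5" is 5 hours ahead of (East of) UTC
east <- format(t0, format = "%H:%M", tz = "Etc/GMT-5")

c(west = west, east = east)
```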

Immediately prior to the advent of legislated time zones, most people used time based on their longitude (or that of a nearby town), known as ‘Local Mean Time’ and abbreviated as ‘LMT’ in the databases: in many countries that was codified with a specific name before the switch to a standard time. For example, Paris codified its LMT as ‘Paris Mean Time’ in 1891 (to be used throughout mainland France) and switched to ‘GMT+0’ in 1911.

Some systems (notably Linux) have a tzselect command which allows the interactive selection of a supported time zone name. On systems using systemd (notably Linux), the OS command timedatectl list-timezones will list all available time zone names.

Warnings

There is a system-specific upper limit on the number of bytes in (abbreviated) time-zone names which can be as low as 6 (as required by POSIX). Some OSes allow the setting of time zones with names which exceed their limit, and that can crash the R session.

Information about future times is speculative (‘proleptic’): the database provides the best-known information based on current rules set by civil authorities. For the period 1900–1970 those rules (and which of any authority's rules were enacted) are often obscure, and the databases do get corrected frequently.

OlsonNames tries to find an Olson database in known locations. It might not succeed (when it returns an empty vector with a warning) and even if it does it might not locate the database used by the date-time code linked into R. Fortunately names are added rarely and most databases are pretty complete. On the other hand, many names which duplicate other named timezones have been moved to the ‘backward’ list – these are regarded as optional and omitted on minimal installations. Similarly, there are timezones named in file ‘backzone’ which differ only from those in the main lists prior to 1970 – these are usually included but may not be in minimalist systems.

For many years, the legacy names EST5EDT and PST8PDT were portable, but musl (the C runtime used by Alpine Linux) does not use DST with those names.

How the system time zone is found – on Unix-alikes

This section is of background interest for users of a Unix-alike, but may help if an NA value is returned unexpectedly.

Commercial Unixen such as Solaris and AIX set TZ, so the value when R is started is used.

All other common platforms (Linux, macOS, *BSD) use similar schemes, either derived from tzcode (currently distributed from https://www.iana.org/time-zones) or independently coded (glibc, musl-libc). Such systems read the time-zone information from a file ‘localtime’, usually under ‘/etc’ (but possibly under ‘/usr/local/etc’ or ‘/usr/local/etc/zoneinfo’). As the usual Linux manual page for localtime says

‘Because the time zone identifier is extracted from the symlink target name of ‘/etc/localtime’, this file may not be a normal file or hardlink.’

Nevertheless, some Linux distributions (including the one from which that quote was taken) or sysadmins have chosen to copy a time-zone file to ‘localtime’. For a non-symlink, the ultimate fallback is to compare that file to all files in the time-zone database.

Some Linux platforms provide two other mechanisms which are tried in turn before looking at ‘/etc/localtime’.

  • ‘Modern’ Linux systems use systemd which provides mechanisms to set and retrieve the time zone (amongst other things). There is a command timedatectl to give details. (Unfortunately RHEL/Centos 6.x were not ‘modern’.)

  • Debian-derived systems since ca 2007 have supplied a file ‘/etc/timezone’. Its format is undocumented but empirically it contains a single line of text naming the time zone.

In each case a sanity check is performed that the time-zone name is the name of a file in the time-zone database. (The systems probably use the time-zone file (symlinked to) ‘/etc/localtime’, but the Sys.timezone code does not check that is the same as the named file in the database. This is deliberate as they may be from different dates.)
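The lookup order described above can be sketched roughly as follows. This is not the code used by Sys.timezone: the function name guess_tz and the default database path ‘/usr/share/zoneinfo’ are assumptions made for illustration, and the systemd (timedatectl) step and the file-comparison fallback are left out.

```r
# Hypothetical helper: a simplified imitation of the lookup order, for
# illustration only.  'tzdir' is an assumed location of the database.
guess_tz <- function(tzdir = "/usr/share/zoneinfo") {
  # 1. An explicitly set TZ environment variable wins.
  tz <- Sys.getenv("TZ", unset = NA_character_)
  if (!is.na(tz) && nzchar(tz)) return(tz)

  # 2. Debian-style /etc/timezone: a single line naming the zone.
  if (file.exists("/etc/timezone")) {
    tz <- trimws(readLines("/etc/timezone", n = 1L, warn = FALSE))
    # Sanity check: the name must be a file in the database.
    if (length(tz) == 1L && file.exists(file.path(tzdir, tz))) return(tz)
  }

  # 3. /etc/localtime as a symlink: take the part after '.../zoneinfo/'.
  target <- Sys.readlink("/etc/localtime")
  if (!is.na(target) && nzchar(target)) {
    tz <- sub("^.*/zoneinfo/", "", target)
    if (file.exists(file.path(tzdir, tz))) return(tz)
  }

  NA_character_  # nothing usable found
}

guess_tz()
```

The sanity check in steps 2 and 3 mirrors the one described in the text: a name is only accepted if it corresponds to a file in the database.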

Note

Since 2007 there has been considerable disruption over changes to the timings of the DST transitions; these often have short notice and time-zone databases may not be up to date. (Morocco in 2013 announced a change to the end of DST at a day's notice. In 2023 there was chaos in Lebanon as the authorities changed their minds repeatedly and some changes were not widely implemented.)

There have also been changes to the ‘standard’ time with little notice (Kazakhstan switched to a single time zone in Mar 2024 with six weeks' notice), and to whether ‘summer’ or ‘winter’ time is regarded as ‘standard’ (and hence to abbreviations).

On platforms with case-insensitive file systems, time zone names will be case-insensitive. They may or may not be on other platforms and so, for example, "gmt" is valid on some platforms and not on others.

Note that except where replaced, the operation of time zones is an OS service, and even where replaced a third-party database is used and can be updated (see the section on ‘Time zone names’). Incorrect results will never be an R issue, so please ensure that you have the courtesy not to blame R for them.

See Also

Sys.time, as.POSIXlt.

https://en.wikipedia.org/wiki/Time_zone and https://data.iana.org/time-zones/tz-link.html for extensive sets of links.

https://data.iana.org/time-zones/theory.html for the ‘rules’ of the Olson/IANA database.

Examples

Sys.timezone()

str(OlsonNames()) ## typically around six hundred names,
## typically some acronyms/aliases such as "UTC", "NZ", "MET", "Eire", ..., but
## mostly pairs (and triplets) such as "Pacific/Auckland"
table(sl <- grepl("/", OlsonNames()))
OlsonNames()[ !sl ] # the simple ones
head(Osl <- strsplit(OlsonNames()[sl], "/"))
(tOS1 <- table(vapply(Osl, `[[`, "", 1))) # Continents, countries, ...
table(lengths(Osl)) # most are pairs, some triplets
str(Osl[lengths(Osl) >= 3]) # "America" South and North ...

Convert an R Object to a Character String

Description

This is a helper function for format to produce a single character string describing an R object.

Usage

toString(x, ...)

## Default S3 method:
toString(x, width = NULL, ...)

Arguments

x

The object to be converted.

width

Suggestion for the maximum field width. Values of NULL or 0 indicate no maximum. The minimum value accepted is 6 and smaller values are taken as 6.

...

Optional arguments passed to or from methods.

Details

This is a generic function for which methods can be written: only the default method is described here. Most methods should honor the width argument to specify the maximum display width (as measured by nchar(type = "width")) of the result.

The default method first converts x to character and then concatenates the elements separated by ", ". If width is supplied and is not NULL, the default method returns the first width - 4 characters of the result with .... appended, if the full result would use more than width characters.
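The truncation rule can be checked on a short vector. The following is a small sketch of the default method's behaviour; the variable names are arbitrary.

```r
full  <- toString(1:10)               # "1, 2, 3, 4, 5, 6, 7, 8, 9, 10"
short <- toString(1:10, width = 12)   # first 12 - 4 = 8 characters + "...."
tiny  <- toString(1:10, width = 3)    # widths below 6 are taken as 6

nchar(full)    # 29, so both limits above force truncation
short          # "1, 2, 3,...."
tiny           # "1,...."
```

Note that the result is cut at a character position, not at an element boundary, so the last retained element may be incomplete.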

Value

A character vector of length 1 is returned.

Author(s)

Robert Gentleman

See Also

format

Examples

x <- c("a", "b", "aaaaaaaaaaa")
toString(x)
toString(x, width = 8)

Interactive Tracing and Debugging of Calls to a Function or Method

Description

A call to trace allows you to insert debugging code (e.g., a call to browser or recover) at chosen places in any function. A call to untrace cancels the tracing. Specified methods can be traced the same way, without tracing all calls to the generic function. Trace code (tracer) can be any R expression. Tracing can be temporarily turned on or off globally by calling tracingState.

Usage

trace(what, tracer, exit, at, print, signature,
      where = topenv(parent.frame()), edit = FALSE)
untrace(what, signature = NULL, where = topenv(parent.frame()))

tracingState(on = NULL)
.doTrace(expr, msg)
returnValue(default = NULL)

Arguments

what

the name, possibly quote()d, of a function to be traced or untraced. For untrace or for trace with more than one argument, more than one name can be given in the quoted form, and the same action will be applied to each one. For “hidden” functions such as S3 methods in a namespace, where = * typically needs to be specified as well.

tracer

either a function or an unevaluated expression. The function will be called or the expression will be evaluated either at the beginning of the call, or before those steps in the call specified by the argument at. See the details section.

exit

either a function or an unevaluated expression. The function will be called or the expression will be evaluated on exiting the function. See the details section.

at

optional numeric vector or list. If supplied, tracer will be called just before the corresponding step in the body of the function. See the details section.

print

if TRUE (as per default), a descriptive line is printed before any trace expression is evaluated.

signature

an optional signature for a method for function what. If supplied, the method, and not the function itself, is traced.

edit

For complicated tracing, such as tracing within a loop inside the function, you will need to insert the desired calls by editing the body of the function. If so, supply the edit argument either as TRUE, or as the name of the editor you want to use. Then trace() will call edit and use the version of the function after you edit it. See the details section for additional information.

where

where to look for the function to be traced; by default, the top-level environment of the call to trace.

An important use of this argument is to trace functions from a package which are “hidden” or called from another package. The namespace mechanism imports the functions to be called (with the exception of functions in the base package). The functions being called are not the same objects seen from the top-level (in general, the imported packages may not even be attached). Therefore, you must ensure that the correct versions are being traced. The way to do this is to set argument where to a function in the namespace (or that namespace). The tracing computations will then start looking in the environment of that function (which will be the namespace of the corresponding package). (Yes, it's subtle, but the semantics here are central to how namespaces work in R.)

on

logical; a call to the support function tracingState returns TRUE if tracing is globally turned on, FALSE otherwise. An argument of one or the other of those values sets the state. If the tracing state is FALSE, none of the trace actions will actually occur (used, for example, by debugging functions to shut off tracing during debugging).

expr, msg

arguments to the support function .doTrace, calls to which are inserted into the modified function or method: expr is the tracing action (such as a call to browser()), and msg is a string identifying the place where the trace action occurs.

default

if returnValue finds no return value (e.g., when a function exited because of an error, restart or as a result of evaluating a return from a caller function), it will return default instead.

Details

The trace function operates by constructing a revised version of the function (or of the method, if signature is supplied), and assigning the new object back where the original was found. If only the what argument is given, a line of trace printing is produced for each call to the function (back compatible with the earlier version of trace).

The object constructed by trace is from a class that extends "function" and which contains the original, untraced version. A call to untrace re-assigns this version.

If the argument tracer or exit is the name of a function, the tracing expression will be a call to that function, with no arguments. This is the easiest and most common case, with the functions browser and recover the likeliest candidates; the former browses in the frame of the function being traced, and the latter allows browsing in any of the currently active calls. The arguments tracer and exit are evaluated to see whether they are functions, but only their names are used in the tracing expressions. The lookup is done again when the traced function executes, so it may not be tracer or exit that will be called while tracing.

The tracer or exit argument can also be an unevaluated expression (such as returned by a call to quote or substitute). This expression itself is inserted in the traced function, so it will typically involve arguments or local objects in the traced function. An expression of this form is useful if you only want to interact when certain conditions apply (and in this case you probably want to supply print = FALSE in the call to trace also).

When the at argument is supplied, it can be a vector of integers referring to the substeps of the body of the function (this only works if the body of the function is enclosed in { ... }). In this case tracer is not called on entry, but instead just before evaluating each of the steps listed in at. (Hint: you don't want to try to count the steps in the printed version of a function; instead, look at as.list(body(f)) to get the numbers associated with the steps in function f.)

The at argument can also be a list of integer vectors. In this case, each vector refers to a step nested within another step of the function. For example, at = list(c(3,4)) will call the tracer just before the fourth step of the third step of the function. See the example below.
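As a non-interactive sketch of the at mechanism, the following traces a toy function just before one of its steps. The function step_fn and the message text are invented for the example, and the code assumes it is run at top level so that the default where finds the function.

```r
# A toy function whose body is enclosed in { ... }.
step_fn <- function(x) {
  y <- x + 1
  z <- y * 2
  z
}

# Step 1 is `{` itself, step 2 is 'y <- x + 1', step 3 is 'z <- y * 2'.
as.list(body(step_fn))

# Run the tracer just before step 3, when 'y' already exists.
trace("step_fn", tracer = quote(cat("y is", y, "\n")), at = 3, print = FALSE)
step_fn(1)          # the tracer reports y before z is computed
untrace("step_fn")  # restore the original function
```

Because the tracer is an unevaluated expression inserted into the function, it can refer to local objects such as y directly.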

Using setBreakpoint (from package utils) may be an alternative, calling trace(...., at, ...).

The exit argument is called during on.exit processing. In an on.exit expression, the experimental returnValue() function may be called to obtain the value about to be returned by the function. Calling this function in other circumstances will give undefined results.
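The use of returnValue() during on.exit processing can be sketched without trace at all. The function doubled below is made up for illustration; an exit tracer supplied to trace runs in the same phase.

```r
doubled <- function(x) {
  # The on.exit expression runs after the value has been computed,
  # so returnValue() can report what is about to be returned.
  on.exit(cat("about to return", returnValue(), "\n"))
  x * 2
}

doubled(21)
```

The same expression passed as exit = quote(...) to trace would be evaluated at this point of the traced function.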

An intrinsic limitation in the exit argument is that it won't work if the function itself uses on.exit with add = FALSE (the default), since the existing calls will override the one supplied by trace.

Tracing does not nest. Any call to trace replaces previously traced versions of that function or method (except for edited versions as discussed below), and untrace always restores an untraced version. (Allowing nested tracing has too many potentials for confusion and for accidentally leaving traced versions behind.)

When the edit argument is used repeatedly with no call to untrace on the same function or method in between, the previously edited version is retained. If you want to throw away all the previous tracing and then edit, call untrace before the next call to trace. Editing may be combined with automatic tracing; just supply the other arguments such as tracer, and the edit argument as well. The edit = TRUE argument uses the default editor (see edit).

Tracing primitive functions (builtins and specials) from the base package works, but only by a special mechanism and not very informatively. Tracing a primitive causes the primitive to be replaced by a function with argument ... (only). You can get a bit of information out, but not much. A warning message is issued when trace is used on a primitive.

The practice of saving the traced version of the function back where the function came from means that tracing carries over from one session to another, if the traced function is saved in the session image. (In the next session, untrace will remove the tracing.) On the other hand, functions that were in a package, not in the global environment, are not saved in the image, so tracing expires with the session for such functions.

Tracing an S4 method is basically just like tracing a function, with the exception that the traced version is stored by a call to setMethod rather than by direct assignment, and so is the untraced version after a call to untrace.

The version of trace described here is largely compatible with the version in S-Plus, although the two work by entirely different mechanisms. The S-Plus trace uses the session frame, with the result that tracing never carries over from one session to another (R does not have a session frame). Another relevant distinction has nothing directly to do with trace: The browser in S-Plus allows changes to be made to the frame being browsed, and the changes will persist after exiting the browser. The R browser allows changes, but they disappear when the browser exits. This may be relevant in that the S-Plus version allows you to experiment with code changes interactively, but the R version does not. (A future revision may include a ‘destructive’ browser for R.)

Value

In the simple version (just the first argument), trace returns an invisible NULL. Otherwise, it returns the name(s) of the traced function(s). The relevant consequence is the assignment that takes place.

untrace returns the function name invisibly.

tracingState returns the current global tracing state, and possibly changes it.

When called during on.exit processing, returnValue returns the value about to be returned by the exiting function. Behaviour in other circumstances is undefined.

Note

Using trace() is conceptually a generalization of debug, implemented differently, namely by calling browser via its tracer or exit argument.

The version of function tracing that includes any of the arguments except for the function name requires the methods package (because it uses special classes of objects to store and restore versions of the traced functions).

If methods dispatch is not currently on, trace will load the methods namespace, but will not put the methods package on the search list.

References

Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988) The New S Language. Wadsworth & Brooks/Cole.

See Also

browser and recover, the likeliest tracing functions; also, quote and substitute for constructing general expressions.

Examples

require(stats)

##  Very simple use
trace(sum)
hist(rnorm(100)) # shows about 3-4 calls to sum()
untrace(sum)

## Show how pt() is called from inside power.t.test():
if(FALSE)
  trace(pt) ## would show ~20 calls, but we want to see more:
trace(pt, tracer = quote(cat(sprintf("tracing pt(*, ncp = %.15g)\n", ncp))),
      print = FALSE) # <- not showing typical extra
power.t.test(20, 1, power = 0.8, sd = NULL) ##--> showing the ncp root finding:
untrace(pt)

f <- function(x, y) {
    y <- pmax(y, 0.001)
    if (x > 0) x ^ y else stop("x must be positive")
}

## arrange to call the browser on entering and exiting
## function f
trace("f", quote(browser(skipCalls = 4)),
      exit = quote(browser(skipCalls = 4)))

## instead, conditionally assign some data, and then browse
## on exit, but only then.  Don't bother me otherwise
trace("f", quote(if(any(y < 0)) yOrig <- y),
      exit = quote(if(exists("yOrig")) browser(skipCalls = 4)),
      print = FALSE)

## Enter the browser just before stop() is called.  First, find
## the step numbers
untrace(f) # (as it has changed f's body !)
as.list(body(f))
as.list(body(f)[[3]]) # -> stop(..) is [[4]]

## Now call the browser there
trace("f", quote(browser(skipCalls = 4)), at = list(c(3,4)))
## Not run:
f(-1, 2) # --> enters browser just before stop(..)
## End(Not run)

## trace a utility function, with recover so we
## can browse in the calling functions as well.
trace("as.matrix", recover)

## turn off the tracing (that happened above)
untrace(c("f", "as.matrix"))

## Not run:
## Useful to find how system2() is called in a higher-up function:
trace(base::system2, quote(print(ls.str())))
## End(Not run)

##-------- Tracing hidden functions : need 'where = *'
##
## 'where' can be a function whose environment is meant:
trace(quote(ar.yw.default), where = ar)
a <- ar(rnorm(100)) # "Tracing ..."
untrace(quote(ar.yw.default), where = ar)

## trace() more than one function simultaneously:
##         expression(E1, E2, ...)  here is equivalent to
##          c(quote(E1), quote(E2), quote(.*), ..)
trace(expression(ar.yw, ar.yw.default), where = ar)
a <- ar(rnorm(100)) # --> 2 x "Tracing ..."
# and turn it off:
untrace(expression(ar.yw, ar.yw.default), where = ar)

## Not run:
## trace calls to the function lm() that come from
## the nlme package.
trace("lm", where = asNamespace("nlme"))
      lm (len ~ log(dose) * supp, ToothGrowth) -> fit1 # NOT traced
nlme::lmList(len ~ log(dose) | supp, ToothGrowth) -> fit2 # traced
untrace("lm", where = asNamespace("nlme"))
## End(Not run)

Get and Print Call Stacks

Description

By default traceback() prints the call stack of the last uncaught error, i.e., the sequence of calls that led to the error. This is useful when an error occurs with an unidentifiable error message. It can also be used to print the current stack or arbitrary lists of calls.

.traceback() now returns the above call stack (and traceback(x, *) can be regarded as a convenience function for printing the result of .traceback(x)).

Usage

traceback(x = NULL, max.lines = getOption("traceback.max.lines",
                                           getOption("deparse.max.lines", -1L)))
.traceback(x = NULL, max.lines = getOption("traceback.max.lines",
                                           getOption("deparse.max.lines", -1L)))

Arguments

x

NULL (default, meaning .Traceback), or an integer count of calls to skip in the current stack, or a list or pairlist of calls. See the details.

max.lines

a number, the maximum number of lines to be printed per call. The default is unlimited. Applies only when x is NULL, a list or a pairlist of calls, see the details.

Details

The default display is of the stack of the last uncaught error as stored as a list of calls in .Traceback, which traceback prints in a user-friendly format. The stack of calls always contains all function calls and all foreign function calls (such as .Call): if profiling is in progress it will include calls to some primitive functions. (Calls to builtins are included, but not to specials.)

Errors which are caught via try or tryCatch do not generate a traceback, so what is printed is the call sequence for the last uncaught error, and not necessarily for the last error.

If x is numeric, then the current stack is printed, skipping x entries at the top of the stack. For example, options(error = function() traceback(3)) will print the stack at the time of the error, skipping the call to traceback() and .traceback() and the error function that called it.
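The numeric form can also be used outside an error handler to inspect the current stack. The sketch below uses two made-up functions, inner_fn and outer_fn, and formats each entry defensively because the exact representation of the entries is not something to rely on.

```r
# Skip one entry: the call to .traceback() itself.
inner_fn <- function() .traceback(1)
outer_fn <- function() inner_fn()

stack <- outer_fn()

# Collapse each stack entry to a single string for display.
calls <- vapply(stack, function(cl) paste(format(cl), collapse = " "), "")
calls
```

The entries are ordered deepest call first, so the call to inner_fn() is expected to appear before the call to outer_fn().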

Otherwise, x is assumed to be a list or pairlist of calls or deparsed calls and will be displayed in the same way.

.traceback() and by extension traceback() may trigger deparsing of calls. This is an expensive operation for large calls so it may be advisable to set max.lines to a reasonable value when such calls are on the call stack.

Value

.traceback() returns the deparsed call stack deepest call first as a list or pairlist. The number of lines deparsed from the call can be limited via max.lines. Calls for which max.lines results in truncated output will gain a "truncated" attribute.

traceback() formats, prints, and returns the call stack produced by .traceback() invisibly.

Warning

It is undocumented where .Traceback is stored and whether it is visible, and this is subject to change. Currently .Traceback contains the calls as language objects.

References

Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988) The New S Language. Wadsworth & Brooks/Cole.

Examples

foo <- function(x) { print(1); bar(2) }
bar <- function(x) { x + a.variable.which.does.not.exist }
## Not run:
foo(2) # gives a strange error
traceback()
## End(Not run)
## 2: bar(2)
## 1: foo(2)
bar
## Ah, this is the culprit ...

## This will print the stack trace at the time of the error.
options(error = function() traceback(3))

Trace Copying of Objects

Description

This function marks an object so that a message is printed whenever the internal code copies the object. Such copying is a major cause of hard-to-predict memory use in R.

Usage

tracemem(x)
untracemem(x)
retracemem(x, previous = NULL)

Arguments

x

An R object, not a function or environment or NULL.

previous

A value as returned by tracemem or retracemem.

Details

This functionality is optional, determined at compilation, because it makes R run a little more slowly even when no objects are being traced. tracemem and untracemem give errors when R is not compiled with memory profiling; retracemem does not (so it can be left in code during development).

It is enabled in the CRAN macOS and Windows builds of R.

When an object is traced any copying of the object by the C function duplicate produces a message to standard output, as does type coercion and copying when passing arguments to .C or .Fortran.

The message consists of the string tracemem, the identifying strings for the object being copied and the new object being created, and a stack trace showing where the duplication occurred. retracemem() is used to indicate that a variable should be considered a copy of a previous variable (e.g., after subscripting).
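Since tracemem gives an error on builds without memory profiling, code that may run elsewhere can guard the call. The following is a minimal sketch with arbitrary variable names:

```r
if (capabilities("profmem")) {
  v  <- c(1, 2, 3)
  id <- tracemem(v)   # identifying string: an address in angle brackets
  w  <- v             # no copy yet: v and w share memory
  w[1] <- 10          # modifying w duplicates the vector and reports it
  untracemem(v)
} else {
  message("this build of R was not compiled with memory profiling")
}
```

The duplication message appears at the assignment to w[1], not at w <- v, which is the point of tracing: it shows where the copy actually happens.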

The messages can be turned off with tracingState.

It is not possible to trace functions, as this would conflict with trace and it is not useful to trace NULL, environments, promises, weak references, or external pointer objects, as these are not duplicated.

These functions are primitive.

Value

A character string for identifying the object in the trace output (an address in hex enclosed in angle brackets), or NULL (invisibly).

See Also

capabilities("profmem") to see if this was enabled for this build of R.

trace, Rprofmem

https://developer.r-project.org/memory-profiling.html

Examples

## Not run:
a <- 1:10
tracemem(a)
## b and a share memory
b <- a
b[1] <- 1
untracemem(a)

## copying in lm: less than R <= 2.15.0
d <- stats::rnorm(10)
tracemem(d)
lm(d ~ a+log(b))
## f is not a copy and is not traced
f <- d[-1]
f+1
## indicate that f should be traced as a copy of d
retracemem(f, retracemem(d))
f+1
## End(Not run)

Transform an Object, for Example a Data Frame

Description

transform is a generic function, which (at least currently) only does anything useful with data frames. transform.default converts its first argument to a data frame if possible and calls transform.data.frame.

Usage

transform(`_data`, ...)

Arguments

_data

The object to be transformed

...

Further arguments of the form tag=value

Details

The ... arguments to transform.data.frame are tagged vector expressions, which are evaluated in the data frame _data. The tags are matched against names(_data), and for those that match, the values replace the corresponding variables in _data, and the others are appended to _data.
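A small sketch of the matching rule, using a made-up data frame: a tag that matches an existing column replaces it, and a new tag is appended. All expressions are evaluated in the original data, so a new column cannot see a replacement made in the same call.

```r
df <- data.frame(a = 1:3, b = c(10, 20, 30))

# 'b' matches an existing column and is replaced;
# 'total' is new and is appended, computed from the ORIGINAL b.
out <- transform(df, b = b / 10, total = a + b)

names(out)   # "a" "b" "total"
out$b        # 1 2 3
out$total    # 11 21 31
```

Chaining two calls to transform is one way to build a column from an already transformed one.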

Value

The modified value of _data.

Warning

This is a convenience function intended for use interactively. For programming it is better to use the standard subsetting and arithmetic functions, and in particular the non-standard evaluation of the arguments to transform can have unanticipated consequences.

Note

If some of the values are not vectors of the appropriate length,you deserve whatever you get!

Author(s)

Peter Dalgaard

See Also

within for a more flexible approach, subset, list, data.frame

Examples

transform(airquality, Ozone = -Ozone)
transform(airquality, new = -Ozone, Temp = (Temp-32)/1.8)

attach(airquality)
transform(Ozone, logOzone = log(Ozone)) # marginally interesting ...
detach(airquality)

Trigonometric Functions

Description

These functions give the obvious trigonometric functions. They respectively compute the cosine, sine, tangent, arc-cosine, arc-sine, arc-tangent, and the two-argument arc-tangent.

cospi(x), sinpi(x), and tanpi(x), compute cos(pi*x), sin(pi*x), and tan(pi*x).

Usage

cos(x)
sin(x)
tan(x)

acos(x)
asin(x)
atan(x)
atan2(y, x)

cospi(x)
sinpi(x)
tanpi(x)

Arguments

x, y

numeric or complex vectors.

Details

The arc-tangent of two arguments atan2(y, x) returns the angle between the x-axis and the vector from the origin to (x, y), i.e., for positive arguments atan2(y, x) == atan(y/x).
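The restriction to positive arguments matters because atan(y/x) cannot tell opposite quadrants apart. A brief sketch, with arbitrary variable names:

```r
# First quadrant: both forms agree.
q1_two <- atan2(1, 1)      # pi/4
q1_one <- atan(1 / 1)      # pi/4

# Second quadrant (x negative, y positive): only atan2 gets it right.
q2_two <- atan2(1, -1)     # 3*pi/4
q2_one <- atan(1 / -1)     # -pi/4, the quadrant information is lost

c(q1_two, q1_one, q2_two, q2_one) / pi
```

The two results in the second quadrant differ by exactly pi.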

Angles are in radians, not degrees, for the standard versions (i.e., a right angle is π/2), and in ‘half-rotations’ for cospi etc.

cospi(x), sinpi(x), and tanpi(x) are accurate for x values which are multiples of a half.
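The difference shows up at exactly those multiples of a half, where pi*x cannot be represented exactly in floating point. A short sketch:

```r
# sin(pi) is not exactly zero, because 'pi' is only an approximation ...
via_sin   <- sin(pi * 1)
# ... whereas sinpi() and cospi() return exact zeros here.
via_sinpi <- sinpi(1)
via_cospi <- cospi(0.5)

c(sin = via_sin, sinpi = via_sinpi, cospi = via_cospi)
```

The value from sin(pi) is tiny but non-zero, which can matter when results are compared with == or used as divisors.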

All except atan2 are internal generic primitive functions: methods can be defined for them individually or via the Math group generic.

These are all wrappers to system calls of the same name (with prefix c for complex arguments) where available. (cospi, sinpi, and tanpi are part of a C11 extension and provided by e.g. macOS and Solaris: where not yet available calls to cos etc. are used, with special cases for multiples of a half.)

Value

tanpi(0.5) is NaN. Similarly for other inputs with fractional part 0.5.

Complex values

For the inverse trigonometric functions, branch cuts are defined as in Abramowitz and Stegun, figure 4.4, page 79.

For asin and acos, there are two cuts, both along the real axis: (-∞, -1] and [1, ∞).

For atan there are two cuts, both along the pure imaginary axis: (-∞i, -1i] and [1i, ∞i).

The behaviour actually on the cuts follows the C99 standard which requires continuity coming round the endpoint in a counter-clockwise direction.

Complex arguments for cospi, sinpi, and tanpi are not yet implemented, and they are a ‘future direction’ of ISO/IEC TS 18661-4.

S4 methods

All except atan2 are S4 generic functions: methods can be defined for them individually or via the Math group generic.

References

Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988) The New S Language. Wadsworth & Brooks/Cole.

Abramowitz, M. and Stegun, I. A. (1972). Handbook of Mathematical Functions. New York: Dover. Chapter 4. Elementary Transcendental Functions: Logarithmic, Exponential, Circular and Hyperbolic Functions.

For cospi, sinpi, and tanpi the C11 extension ISO/IEC TS 18661-4:2015 (draft at https://www.open-std.org/jtc1/sc22/wg14/www/docs/n1950.pdf).

Examples

x <- seq(-3, 7, by = 1/8)
tx <- cbind(x, cos(pi*x), cospi(x), sin(pi*x), sinpi(x),
               tan(pi*x), tanpi(x), deparse.level = 2)
op <- options(digits = 4, width = 90) # for nice formatting
head(tx)
tx[ (x %% 1) %in% c(0, 0.5) ,]
options(op)

Remove Leading/Trailing Whitespace

Description

Remove leading and/or trailing whitespace from character strings.

Usage

trimws(x, which = c("both", "left", "right"), whitespace = "[ \t\r\n]")

Arguments

x

a character vector.

which

a character string specifying whether to remove both leading and trailing whitespace (default), or only leading ("left") or trailing ("right"). Can be abbreviated.

whitespace

a string specifying a regular expression to match (one character of) “white space”, see Details for alternatives to the default.

Details

Internally, sub(re, "", *, perl = TRUE), i.e., PCRE library regular expressions are used. For portability, the default ‘whitespace’ is the character class [ \t\r\n] (space, horizontal tab, carriage return, newline). Alternatively, [\h\v] is a good (PCRE) generalization to match all Unicode horizontal and vertical white space characters, see also https://www.pcre.org.
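Because whitespace is just a regular expression for one character, the same function can strip other padding characters. A brief sketch with made-up inputs:

```r
# Default class: spaces, tabs, carriage returns and newlines.
both  <- trimws("  a b \t\n")            # "a b" (inner space is kept)

# Only one side.
left  <- trimws("  a ", which = "left")  # "a "

# A custom character class: strip leading and trailing dots.
dots  <- trimws("..abc..", whitespace = "[.]")  # "abc"

c(both, left, dots)
```

Only runs at the very start or end of each string are removed; matching characters in the middle are left alone.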

Examples

x <- "  Some text. "
x
trimws(x)
trimws(x, "l")
trimws(x, "r")

## Unicode --> need "stronger" 'whitespace' to match all :
tt <- "text with unicode 'non breakable space'."
xu <- paste(" \t\v", tt, "\u00a0 \n\r")
(tu <- trimws(xu, whitespace = "[\\h\\v]"))
stopifnot(identical(tu, tt))

Try an Expression Allowing Error Recovery

Description

try is a wrapper to run an expression that might fail and allow the user's code to handle error-recovery.

Usage

try(expr, silent = FALSE,
    outFile = getOption("try.outFile", default = stderr()))

Arguments

expr

an R expression to try.

silent

logical: should the report of error messages be suppressed?

outFile

a connection, or a character string naming the file to print to (via cat(*, file = outFile)); used only if silent is false, as by default.

Details

try evaluates an expression and traps any errors that occur during the evaluation. If an error occurs then the error message is printed to the stderr connection unless options("show.error.messages") is false or the call includes silent = TRUE. The error message is also stored in a buffer where it can be retrieved by geterrmessage. (This should not be needed as the value returned in case of an error contains the error message.)

try is implemented using tryCatch; for programming, instead of try(expr, silent = TRUE), something like tryCatch(expr, error = function(e) e) (or other simple error handler functions) may be more efficient and flexible.
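The two styles can be compared side by side. This is a small sketch; the variable names are arbitrary.

```r
# Style 1: try() returns a "try-error" object carrying the message,
# with the underlying condition stored as an attribute.
res_try <- try(log("a"), silent = TRUE)
inherits(res_try, "try-error")
cond_from_try <- attr(res_try, "condition")

# Style 2: tryCatch() with a handler that returns the condition itself.
res_catch <- tryCatch(log("a"), error = function(e) e)
inherits(res_catch, "error")
conditionMessage(res_catch)
```

With tryCatch the handler decides what is returned, so it could equally return NA or a default value instead of the condition object.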

It may be useful to set the default for outFile to stdout(), i.e.,

  options(try.outFile = stdout())

instead of the default stderr(), notably when try() is used inside a Sweave code chunk and the error message should appear in the resulting document.

Value

The value of the expression if expr is evaluated without error: otherwise an invisible object inheriting from class "try-error" containing the error message with the error condition as the "condition" attribute.

Warning

Do not test

    if (class(res) == "try-error")

as if there is no error, the result might (now or in future) have a class of length > 1. Use if(inherits(res, "try-error")) instead.

See Also

options for setting error handlers and suppressing the printing of error messages; geterrmessage for retrieving the last error message. The underlying tryCatch provides more flexible means of catching and handling errors.

assertCondition in package tools is related and useful for testing.

Examples

## this example will not work correctly in example(try), but
## it does work correctly if pasted in
options(show.error.messages = FALSE)
try(log("a"))
print(.Last.value)
options(show.error.messages = TRUE)

## alternatively,
print(try(log("a"), TRUE))

## run a simulation, keep only the results that worked.
set.seed(123)
x <- stats::rnorm(50)
doit <- function(x)
{
    x <- sample(x, replace = TRUE)
    if(length(unique(x)) > 30) mean(x)
    else stop("too few unique points")
}
## alternative 1
res <- lapply(1:100, function(i) try(doit(x), TRUE))
## alternative 2
## Not run: res <- vector("list", 100)
for(i in 1:100) res[[i]] <- try(doit(x), TRUE)
## End(Not run)
unlist(res[sapply(res, function(x) !inherits(x, "try-error"))])

The Type of an Object

Description

typeof determines the (R internal) type or storage mode of any object.

Usage

typeof(x)

Arguments

x

any R object.

Value

A character string. The possible values are listed in the structure TypeTable in ‘src/main/util.c’. Current values are the vector types "logical", "integer", "double", "complex", "character", "raw" and "list", "NULL", "closure" (function), "special" and "builtin" (basic functions and operators), "environment", "S4" (some S4 objects) and others that are unlikely to be seen at user level ("symbol", "pairlist", "promise", "object", "language", "char", "...", "any", "expression", "externalptr", "bytecode" and "weakref").
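A short sketch showing some of these values for common objects; the choice of objects is only illustrative.

```r
types <- c(
  logical   = typeof(TRUE),
  integer   = typeof(1L),
  double    = typeof(1),
  complex   = typeof(1i),
  character = typeof("a"),
  raw       = typeof(raw(1)),
  list      = typeof(list()),
  null      = typeof(NULL),
  closure   = typeof(function(x) x),
  builtin   = typeof(sum),
  special   = typeof(`if`),
  symbol    = typeof(quote(x)),
  language  = typeof(quote(x + 1)),
  env       = typeof(globalenv())
)
types
```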

See Also

mode, storage.mode.

isS4 to determine if an object has an S4 class.

Examples

typeof(2)
mode(2)
## for a table of examples, see  ?mode  /  examples(mode)

Extract Unique Elements

Description

unique returns a vector, data frame or array like x but with duplicate elements/rows removed.

Usage

unique(x, incomparables = FALSE, ...)

## Default S3 method:
unique(x, incomparables = FALSE, fromLast = FALSE,
       nmax = NA, ...)

## S3 method for class 'matrix'
unique(x, incomparables = FALSE, MARGIN = 1,
       fromLast = FALSE, ...)

## S3 method for class 'array'
unique(x, incomparables = FALSE, MARGIN = 1,
       fromLast = FALSE, ...)

Arguments

x

a vector or a data frame or an array or NULL.

incomparables

a vector of values that cannot be compared. FALSE is a special value, meaning that all values can be compared, and may be the only value accepted for methods other than the default. It will be coerced internally to the same type as x.

fromLast

logical indicating if duplication should be considered from the last, i.e., the last (or rightmost) of identical elements will be kept. This only matters for names or dimnames.

nmax

the maximum number of unique items expected (greater than one). See duplicated.

...

arguments for particular methods.

MARGIN

the array margin to be held fixed: a single integer.

Details

This is a generic function with methods for vectors, data frames and arrays (including matrices).

The array method calculates for each element of the dimension specified by MARGIN if the remaining dimensions are identical to those for an earlier element (in row-major order). This would most commonly be used for matrices to find unique rows (the default) or columns (with MARGIN = 2).

Note that unlike the Unix command uniq this omits duplicated and not just repeated elements/rows. That is, an element is omitted if it is equal to any previous element and not just if it is equal to the immediately previous one. (For the latter, see rle).
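A small sketch of the difference, using a made-up vector: unique drops every later copy of a value, while rle only collapses adjacent runs, as uniq does.

```r
x <- c(1, 1, 2, 1, 1, 3)

u <- unique(x)      # each value once, in order of first appearance
r <- rle(x)$values  # only adjacent repeats collapsed

u  # 1 2 3
r  # 1 2 1 3
```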

Missing values ("NA") are regarded as equal, numeric and complex ones differing from NaN; character strings will be compared in a “common encoding”; for details, see match (and duplicated) which use the same concept.

Values in incomparables will never be marked as duplicated. This is intended to be used for a fairly small set of values and will not be efficient for a very large set.
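The treatment of missing values and of incomparables can be sketched as follows; the input vectors are invented for illustration.

```r
# NA values are regarded as equal to each other, and distinct from NaN
a <- unique(c(NA, NaN, NA, NaN))

# By default the repeated NA is dropped ...
b <- unique(c(1, 1, NA, NA))

# ... but values listed in 'incomparables' are never marked as duplicated
d <- unique(c(1, 1, NA, NA), incomparables = NA)

length(a)  # 2
length(b)  # 2
length(d)  # 3
```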

When used on a data frame with more than one column, or an array or matrix when comparing dimensions of length greater than one, this tests for identity of character representations. This will catch people who unwisely rely on exact equality of floating-point numbers!

Value

For a vector, an object of the same type of x, but with only one copy of each duplicated element. No attributes are copied (so the result has no names).

For a data frame, a data frame is returned with the same columns but possibly fewer rows (and with row names from the first occurrences of the unique rows).

A matrix or array is subsetted by [, drop = FALSE], so dimensions and dimnames are copied appropriately, and the result always has the same number of dimensions as x.

Warning

Using this for lists is potentially slow, especially if the elements are not atomic vectors (see vector) or differ only in their attributes. In the worst case it is O(n^2).

References

Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988) The New S Language. Wadsworth & Brooks/Cole.

See Also

duplicated which gives the indices of duplicated elements.

rle which is the equivalent of the Unix uniq -c command.

Examples

x <- c(3:5, 11:8, 8 + 0:5)
(ux <- unique(x))
(u2 <- unique(x, fromLast = TRUE)) # different order
stopifnot(identical(sort(ux), sort(u2)))

length(unique(sample(100, 100, replace = TRUE)))
## approximately 100(1 - 1/e) = 63.21

unique(iris)

Units

Description

Get or set units.

Usage

units(x)
units(x) <- value

Arguments

x

an R object

value

an R object

Details

These are generic functions, with methods for "difftime" objects.
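A brief sketch of the "difftime" methods, using an arbitrary 90-minute interval: the replacement form changes the unit and rescales the underlying number accordingly.

```r
d <- as.difftime(90, units = "mins")
before <- units(d)        # "mins"

units(d) <- "hours"       # convert in place
after <- units(d)         # "hours"
hours <- as.numeric(d)    # 1.5
```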


Delete Files and Directories

Description

unlink deletes the file(s) or directories specified by x.

Usage

unlink(x, recursive=FALSE, force=FALSE, expand=TRUE)

Arguments

x

a character vector with the names of the file(s) or directories to be deleted.

recursive

logical. Should directories be deleted recursively?

force

logical. Should permissions be changed (if possible) to allow the file or directory to be removed?

expand

logical. Should wildcards (see ‘Details’ below) and tilde (see path.expand) be expanded?

Details

If recursive = FALSE directories are not deleted, not even empty ones.

On most platforms ‘file’ includes symbolic links, fifos and sockets. unlink(x, recursive = TRUE) deletes just the symbolic link if the target of such a link is a directory.

Wildcard expansion (normally ‘*’ and ‘?’ are allowed) is done by the internal code of Sys.glob. Wildcards never match a leading ‘.’ in the filename, and files ‘.’, ‘..’ and ‘~’ will never be considered for deletion. Wildcards will only be expanded if the system supports it. Most systems will support not only ‘*’ and ‘?’ but also character classes such as ‘[a-z]’ (see the man pages for the system call glob on your OS). The metacharacters * ? [ can occur in Unix filenames, and this makes it difficult to use unlink to delete such files (see file.remove), although escaping the metacharacters by backslashes usually works. If a metacharacter matches nothing it is considered as a literal character.

recursive = TRUE might not be supported on all platforms, when it will be ignored, with a warning: however there are no known current examples.

Value

0 for success, 1 for failure, invisibly. Not deleting a non-existent file is not a failure, nor is being unable to delete a directory if recursive = FALSE. However, missing values in x are regarded as failures.
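The return values described above can be seen in a small sketch that works only on temporary paths created for the purpose (the variable names are illustrative).

```r
# A temporary file: deleted, status 0
tmp <- tempfile()
file.create(tmp)
status_file <- unlink(tmp)

# Deleting it again is not a failure either
status_again <- unlink(tmp)

# A temporary directory: left alone when recursive = FALSE,
# yet the status is still 0
d <- tempfile("dir")
dir.create(d)
status_dir <- unlink(d)
still_there <- dir.exists(d)

# With recursive = TRUE the directory is removed
status_rec <- unlink(d, recursive = TRUE)
gone <- !dir.exists(d)
```

Because a status of 0 does not guarantee that anything was removed, checking file.exists or dir.exists afterwards is the safer test.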

References

Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988) The New S Language. Wadsworth & Brooks/Cole.

See Also

file.remove.


Flatten Lists

Description

Given a list structure x, unlist simplifies it to produce a vector which contains all the atomic components which occur in x.

Usage

unlist(x, recursive=TRUE, use.names=TRUE)

Arguments

x

an R object, typically a list or vector.

recursive

logical. Should unlisting be applied to list components of x?

use.names

logical. Should names be preserved?

Details

unlist is generic: you can write methods to handle specific classes of objects, see InternalMethods, and note, e.g., relist with the unlist method for relistable objects.

If recursive = FALSE, the function will not recurse beyond the first level items in x.

Factors are treated specially. If all non-list elements of x are factor (or ordered factor) objects then the result will be a factor with levels the union of the level sets of the elements, in the order the levels occur in the level sets of the elements (which means that if all the elements have the same level set, that is the level set of the result).
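A minimal sketch of the factor rule with two invented factors whose level sets overlap: the result is a factor whose levels are the union, in order of first occurrence across the level sets.

```r
f1 <- factor(c("a", "b"))   # levels: a, b
f2 <- factor(c("c", "a"))   # levels: a, c

u <- unlist(list(f1, f2))

is.factor(u)      # TRUE
levels(u)         # "a" "b" "c"
as.character(u)   # "a" "b" "c" "a"
```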

x can be an atomic vector, but then unlist does nothing useful, not even drop names.

By default, unlist tries to retain the naming information present in x. If use.names = FALSE all naming information is dropped.

Where possible the list elements are coerced to a common mode during the unlisting, and so the result often ends up as a character vector. Vectors will be coerced to the highest type of the components in the hierarchy NULL < raw < logical < integer < double < complex < character < list < expression: pairlists are treated as lists.

A list is a (generic) vector, and the simplified vector might still be a list (and might be unchanged). Non-vector elements of the list (for example language elements such as names, formulas and calls) are not coerced, and so a list containing one or more of these remains a list. (The effect of unlisting an lm fit is a list which has individual residuals as components.) Note that unlist(x) now returns x unchanged also for non-vector x, instead of signalling an error in that case.
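The coercion hierarchy and the default naming can be sketched with a few small, made-up lists.

```r
# The result takes the highest type among the components
t1 <- typeof(unlist(list(TRUE, 1L)))    # logical < integer
t2 <- typeof(unlist(list(1L, 2.5)))     # integer < double
t3 <- typeof(unlist(list(1, "a")))      # double  < character

# Names of nested components are combined with "."
nested <- list(a = 1, b = list(c = 2, d = 3))
n1 <- names(unlist(nested))
n2 <- names(unlist(nested, use.names = FALSE))
```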

Value

NULL or an expression or a vector of an appropriate mode to hold the list components.

The output type is determined from the highest type of the components in the hierarchy NULL < raw < logical < integer < double < complex < character < list < expression, after coercion of pairlists to lists.

References

Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988) The New S Language. Wadsworth & Brooks/Cole.

See Also

c, as.list, relist.

Examples

unlist(options())
unlist(options(), use.names = FALSE)

l.ex <- list(a = list(1:5, LETTERS[1:5]), b = "Z", c = NA)
unlist(l.ex, recursive = FALSE)
unlist(l.ex, recursive = TRUE)

l1 <- list(a = "a", b = 2, c = pi+2i)
unlist(l1) # a character vector
l2 <- list(a = "a", b = as.name("b"), c = pi+2i)
unlist(l2) # remains a list

ll <- list(as.name("sinc"), quote( a + b ), 1:10, letters, expression(1+x))
utils::str(ll)
for(x in ll)
  stopifnot(identical(x, unlist(x)))

Remove names or dimnames

Description

Remove the names or dimnames attribute of an R object.

Usage

unname(obj, force=FALSE)

Arguments

obj

an R object.

force

logical; if true, the dimnames (names and row names) are removed even from data.frames.

Value

Object as obj but without names or dimnames.
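A minimal sketch on a named vector and a matrix with dimnames (both invented for illustration):

```r
v <- c(a = 1, b = 2)
m <- matrix(1:4, nrow = 2,
            dimnames = list(c("r1", "r2"), c("c1", "c2")))

v0 <- unname(v)   # same values, no names
m0 <- unname(m)   # same values and dim, no dimnames
```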

Examples

require(graphics); require(stats)

## Answering a question on R-help (14 Oct 1999):
col3 <- 750 + 100*rt(1500, df = 3)
breaks <- factor(cut(col3, breaks = 360 + 5*(0:155)))
z <- table(breaks)
z[1:5] # The names are larger than the data ...
barplot(unname(z), axes = FALSE)

Use Packages

Description

Use packages in R scripts by loading their namespace and attaching a package environment including (a subset of) their exports to the search path.

Usage

use(package, include.only)

Arguments

package

a character string giving the name of a package.

include.only

character vector of names of objects to include in the attached environment frame. If missing, all exports are included.

Details

This is a simple wrapper around library which always uses attach.required = FALSE, so that packages listed in the Depends clause of the DESCRIPTION file of the package to be used never get attached automatically to the search path.

This therefore allows R scripts to be written with full control over what gets found on the search path. In addition, such scripts can easily be integrated as package code, replacing the calls to use by the corresponding ImportFrom directives in ‘NAMESPACE’ files.

Value

(invisibly) a logical indicating whether the package to be used is available.

Note

This functionality is still experimental: interfaces may change in future versions.


Class Methods

Description

R possesses a simple generic function mechanism which can be used for an object-oriented style of programming. Method dispatch takes place based on the class(es) of the first argument to the generic function or of the object supplied as an argument to UseMethod or NextMethod.

Usage

UseMethod(generic, object)

NextMethod(generic = NULL, object = NULL, ...)

Arguments

generic

a character string naming a function (and not a built-in operator). Required for UseMethod.

object

for UseMethod: an object whose class will determine the method to be dispatched. Defaults to the first argument of the enclosing function.

...

further arguments to be passed to the next method.

Details

An R object is a data object which has a class attribute (and this can be tested by is.object). A class attribute is a character vector giving the names of the classes from which the object inherits.

If the object does not have a class attribute, it has an implicit class. Matrices and arrays have class "matrix" or "array" followed by the class of the underlying vector. Most vectors have class the result of mode(x), except that integer vectors have class c("integer", "numeric") and real vectors have class c("double", "numeric"). Function .class2(x) (since R 4.0.x) returns the full implicit (or explicit) class vector of x.

When a function calling UseMethod("fun") is applied to an object with class vector c("first", "second"), the system searches for a function called fun.first and, if it finds it, applies it to the object. If no such function is found a function called fun.second is tried. If no class name produces a suitable function, the function fun.default is used, if it exists, or an error results.
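The search order just described can be sketched with a hypothetical generic describe() and made-up classes; none of these names exist in base R.

```r
# Hypothetical generic and methods
describe <- function(x, ...) UseMethod("describe")
describe.first   <- function(x, ...) "method for 'first'"
describe.second  <- function(x, ...) "method for 'second'"
describe.default <- function(x, ...) "default method"

obj1 <- structure(list(), class = c("first", "second"))
obj2 <- structure(list(), class = c("third", "second"))
obj3 <- structure(list(), class = "fourth")

r1 <- describe(obj1)  # describe.first is found first
r2 <- describe(obj2)  # no describe.third, so describe.second is tried
r3 <- describe(obj3)  # no class matches, so describe.default is used
r4 <- describe(1:3)   # implicit class, again the default here
```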

Function methods can be used to find out about the methods for a particular generic function or class.

UseMethod is a primitive function but uses standard argument matching. It is not the only means of dispatch of methods, for there are internal generic and group generic functions. UseMethod currently dispatches on the implicit class even for arguments that are not objects, but the other means of dispatch do not.

NextMethod invokes the next method (determined by the class vector, either of the object supplied to the generic, or of the first argument to the function containing NextMethod if a method was invoked directly). Normally NextMethod is used with only one argument, generic, but if further arguments are supplied these modify the call to the next method.

NextMethod should not be called except in methods called by UseMethod or from internal generics (see InternalGenerics). In particular it will not work inside anonymous calling functions (e.g., get("print.ts")(AirPassengers)).

Namespaces can register methods for generic functions. To support this, UseMethod and NextMethod search for methods in two places: in the environment in which the generic function is called, and in the registration data base for the environment in which the generic is defined (typically a namespace). So methods for a generic function need to be available in the environment of the call to the generic, or they must be registered. (It does not matter whether they are visible in the environment in which the generic is defined.) As from R 3.5.0, the registration data base is searched after the top level environment (see topenv) of the calling environment (but before the parents of the top level environment).
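A minimal sketch of chaining with NextMethod, using a hypothetical generic chain() and invented classes "child" and "parent": each method adds its label and hands over to the next method along the class vector.

```r
# Hypothetical generic and methods
chain <- function(x, ...) UseMethod("chain")
chain.default <- function(x, ...) "base"
chain.parent  <- function(x, ...) paste("parent ->", NextMethod())
chain.child   <- function(x, ...) paste("child ->", NextMethod())

obj <- structure(list(), class = c("child", "parent"))

res <- chain(obj)
res  # "child -> parent -> base"
```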

Technical Details

Now for some obscure details that need to appear somewhere. These comments will be slightly different than those in Chambers (1992). (See also the draft ‘R Language Definition’.) UseMethod creates a new function call with arguments matched as they came in to the generic. [Previously local variables defined before the call to UseMethod were retained; as of R 4.4.0 this is no longer the case.] Any statements after the call to UseMethod will not be evaluated as UseMethod does not return. UseMethod can be called with more than two arguments: a warning will be given and additional arguments ignored. (They are not completely ignored in S.) If it is called with just one argument, the class of the first argument of the enclosing function is used as object: unlike S this is the first actual argument passed and not the current value of the object of that name.

NextMethod works by creating a special call frame for the next method. If no new arguments are supplied, the arguments will be the same in number, order and name as those to the current method but their values will be promises to evaluate their name in the current method and environment. Any named arguments matched to ... are handled specially: they either replace existing arguments of the same name or are appended to the argument list. They are passed on as the promise that was supplied as an argument to the current environment. (S does this differently!) If they have been evaluated in the current (or a previous environment) they remain evaluated. (This is a complex area, and subject to change: see the draft ‘R Language Definition’.)

The search for methods for NextMethod is slightly different from that for UseMethod. Finding no fun.default is not necessarily an error, as the search continues to the generic itself. This is to pick up an internal generic like [ which has no separate default method, and succeeds only if the generic is a primitive function or a wrapper for a .Internal function of the same name. (When a primitive is called as the default method, argument matching may not work as described above due to the different semantics of primitives.)

You will see objects such as .Generic, .Method, and .Class used in methods. These are set in the environment within which the method is evaluated by the dispatch mechanism, which is as follows:

  1. Find the context for the calling function (the generic): this gives us the unevaluated arguments for the original call.

  2. Evaluate the object (usually an argument) to be used for dispatch, and find a method (possibly the default method) or throw an error.

  3. Create an environment for evaluating the method and insert special variables (see below) into that environment. Also copy any variables in the environment of the generic that are not formal (or actual) arguments.

  4. Fix up the argument list to be the arguments of the call matched to the formals of the method.

.Generic is a length-one character vector naming the generic function.

.Method is a character vector (normally of length one) naming the method function. (For functions in the group generic Ops it is of length two.)

.Class is a character vector of classes used to find the next method. NextMethod adds an attribute "previous" to .Class giving the .Class last used for dispatch, and shifts .Class along to that used for dispatch.

.GenericCallEnv and .GenericDefEnv are the environments of the call to the generic and of the definition of the generic, respectively. (The latter is used to find methods registered for the generic.)

Note that .Class is set when the generic is called, and is unchanged if the class of the dispatching argument is changed in a method. It is possible to change the method that NextMethod would dispatch by manipulating .Class, but ‘this is not recommended unless you understand the inheritance mechanism thoroughly’ (Chambers & Hastie, 1992, p. 469).
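As a small illustration of these special variables, the sketch below reads .Generic inside a method of a hypothetical generic which_generic(); it only inspects the variable and does not modify it.

```r
# Hypothetical generic whose default method reports the
# name of the generic it was dispatched from
which_generic <- function(x, ...) UseMethod("which_generic")
which_generic.default <- function(x, ...) .Generic

g <- which_generic(1)
g  # "which_generic"
```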

Note

This scheme is called S3 (S version 3). For new projects, it is recommended to use the more flexible and robust S4 scheme provided in the methods package.

References

Chambers, J. M. (1992) Classes and methods: object-oriented programming in S. Appendix A of Statistical Models in S, eds J. M. Chambers and T. J. Hastie, Wadsworth & Brooks/Cole.

See Also

The draft ‘R Language Definition’.

methods, class incl. .class2(); getS3method, is.object.


Functions to Get and Set Hooks for Load, Attach, Detach and Unload

Description

These functions allow users to set actions to be taken before packages are attached/detached and namespaces are (un)loaded.

Usage

getHook(hookName)
setHook(hookName, value,
        action = c("append", "prepend", "replace"))

packageEvent(pkgname,
             event = c("onLoad", "attach", "detach", "onUnload"))

Arguments

hookName

character string: the hook name.

pkgname

character string: the package/namespace name.

event

character string: an event for the package. Can be abbreviated.

value

a function or a list of functions, or for action = "replace", NULL.

action

the action to be taken. Can be abbreviated.

Details

setHook provides a general mechanism for users to register hooks, a list of functions to be called from system (or user) functions. The initial set of hooks was associated with events on packages/namespaces: these hooks are named via calls to packageEvent.

To remove a hook completely, call setHook(hookName, NULL, "replace").
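A minimal sketch of registering, inspecting and removing hooks. The hook name "my.custom.hook" is hypothetical and chosen so that nothing in the session depends on it.

```r
hook <- "my.custom.hook"   # hypothetical hook name

# The default action is "append"
setHook(hook, function(...) message("first"))
setHook(hook, function(...) message("second"))
n_after_add <- length(getHook(hook))      # 2

# Remove the hook completely
setHook(hook, NULL, "replace")
n_after_remove <- length(getHook(hook))   # 0

# Names for package event hooks are derived with packageEvent()
ev <- packageEvent("stats", "onLoad")
```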

When an R package is attached by library or loaded by other means, it can call initialization code. See .onLoad for a description of the package hook functions called during initialization. Users can add their own initialization code via the hooks provided by setHook(), functions which will be called as funname(pkgname, pkgpath) inside a try call.

The sequence of events depends on which hooks are defined, and whether a package is attached or just loaded. In the case where all hooks are defined and a package is attached, the order of initialization events is as follows:

  1. The package namespace is loaded.

  2. The package's .onLoad function is run.

  3. If S4 methods dispatch is on, any actions set by setLoadAction are run.

  4. The namespace is sealed.

  5. The user's "onLoad" hook is run.

  6. The package is added to the search path.

  7. The package's .onAttach function is run.

  8. The package environment is sealed.

  9. The user's "attach" hook is run.

A similar sequence (but in reverse) is run when a package is detached and its namespace unloaded:

  1. The user's "detach" hook is run.

  2. The package's .Last.lib function is run.

  3. The package is removed from the search path.

  4. The user's "onUnload" hook is run.

  5. The package's .onUnload function is run.

  6. The package namespace is unloaded.

Note that when an R session is finished, packages are not detached and namespaces are not unloaded, so the corresponding hooks will not be run.

Also note that some of the user hooks are run without the package being on the search path, so in those hooks objects in the package need to be referred to using the double (or triple) colon operator, as in the example.

If multiple hooks are added, they are normally run in the order shown by getHook, but the "detach" and "onUnload" hooks are run in reverse order so the default for package events is to add hooks ‘inside’ existing ones.

The hooks are stored in the environment .userHooksEnv in the base package, with ‘mangled’ names.

Value

For getHook, a list of functions (possibly empty). For setHook, no return value. For packageEvent, the derived hook name (a character string).

Note

Hooks need to be set before the event they modify: for standard packages this can be problematic as methods is loaded and attached early in the startup sequence. The usual place to set hooks such as the example below is in the ‘.Rprofile’ file, but that will not work for methods.

See Also

library, detach, loadNamespace.

See :: for a discussion of the double and triple colon operators.

Other hooks may be added later: functions plot.new and persp already have them.

Examples

setHook(packageEvent("grDevices", "onLoad"),
        function(...) grDevices::ps.options(horizontal = FALSE))

Convert Integer Vectors to or from UTF-8-encoded Character Vectors

Description

Conversion of UTF-8 encoded character vectors to and from integer vectors representing a UTF-32 encoding.

Usage

utf8ToInt(x)
intToUtf8(x, multiple = FALSE, allow_surrogate_pairs = FALSE)

Arguments

x

object to be converted.

multiple

logical: should the conversion be to a single character string or multiple individual characters?

allow_surrogate_pairs

logical: should interpretation of surrogate pairs be attempted? (See ‘Details’.) Only supported for multiple = FALSE.

Details

These will work in any locale, including on platforms that do not otherwise support multi-byte character sets.

Unicode defines a name and a number of all of the glyphs it encompasses: the numbers are called code points: since RFC 3629 they run from 0 to 0x10FFFF (with about 5% being assigned by version 13.0 of the Unicode standard and 7% reserved for ‘private use’).

intToUtf8 does not by default handle surrogate pairs: inputs in the surrogate ranges are mapped to NA. They might occur if a UTF-16 byte stream has been read as 2-byte integers (in the correct byte order), in which case allow_surrogate_pairs = TRUE will try to interpret them (with unmatched surrogate values still treated as NA).

Value

utf8ToInt converts a length-one character string encoded in UTF-8 to an integer vector of Unicode code points.

intToUtf8 converts a numeric vector of Unicode code points either (default) to a single character string or a character vector of single characters. Non-integral numeric values are truncated to integers. For output to a single character string 0 is silently omitted: otherwise 0 is mapped to "". The Encoding of a non-NA return value is declared as "UTF-8".

Invalid and NA inputs are mapped to NA output.
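The conversion rules above can be sketched with ASCII code points, which display in any locale; the particular values are arbitrary.

```r
cp <- utf8ToInt("abc")                           # 97 98 99

s1 <- intToUtf8(c(72, 105))                      # one string: "Hi"
s2 <- intToUtf8(c(72, 105), multiple = TRUE)     # "H" "i"

# 0 is dropped in a single string, but becomes "" with multiple = TRUE
s3 <- intToUtf8(c(72, 0, 105))
s4 <- intToUtf8(c(72, 0, 105), multiple = TRUE)

s5 <- intToUtf8(65.9)                            # truncated to 65: "A"
s6 <- intToUtf8(NA)                              # NA input gives NA
```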

Validity

Which code points are regarded as valid has changed over the lifetime of UTF-8. Originally all 32-bit unsigned integers were potentially valid and could be converted to up to 6 bytes in UTF-8. Since 2003 it has been stated that there will never be valid code points larger than 0x10FFFF, and so valid UTF-8 encodings are never more than 4 bytes.

The code points in the surrogate-pair range 0xD800 to 0xDFFF are prohibited in UTF-8 and so are regarded as invalid by utf8ToInt and by default by intToUtf8.

The position of ‘noncharacters’ (notably 0xFFFE and 0xFFFF) was clarified by ‘Corrigendum 9’ in 2013. These are valid but will never be given an official interpretation. (In some earlier versions of R utf8ToInt treated them as invalid.)

References

https://www.rfc-editor.org/rfc/rfc3629, the current standard for UTF-8.

https://www.unicode.org/versions/corrigendum9.html for non-characters.

Examples

## will only display in some locales and fonts
intToUtf8(0x03B2L) # Greek beta

utf8ToInt("bi\u00dfchen")
utf8ToInt("\xfa\xb4\xbf\xbf\x9f")

## A valid UTF-16 surrogate pair (for U+10437)
x <- c(0xD801, 0xDC37)
intToUtf8(x)
intToUtf8(x, TRUE)
(xx <- intToUtf8(x, , TRUE)) # will only display in some locales and fonts
charToRaw(xx)

## An example of how surrogate pairs might occur
x <- "\U10437"
charToRaw(x)
foo <- tempfile()
writeLines(x, file(foo, encoding = "UTF-16LE"))
## next two are OS-specific, but are mandated by POSIX
system(paste("od -x", foo)) # 2-byte units, correct on little-endian platforms
system(paste("od -t x1", foo)) # single bytes as hex
y <- readBin(foo, "integer", 2, 2, FALSE, endian = "little")
sprintf("%X", y)
intToUtf8(y, , TRUE)

File Paths not in the Native Encoding

Description

Most modern file systems store file-path components (names of directories and files) in a character encoding of wide scope: usually UTF-8 on a Unix-alike and UCS-2/UTF-16 on Windows. However, this was not true when R was first developed and there are still exceptions amongst file systems, e.g. FAT32.

This was not something anticipated by the C and POSIX standards which only provide means to access files via file paths encoded in the current locale, for example those specified in Latin-1 in a Latin-1 locale.

Everything here apart from the specific section on Windows is about Unix-alikes.

Details

It is possible to mark character strings (elements of character vectors) as being in UTF-8 or Latin-1 (see Encoding). This allows file paths not in the native encoding to be expressed in R character vectors but there is almost no way to use them unless they can be translated to the native encoding. That is of course not a problem if that is UTF-8, so these details are really only relevant to the use of a non-UTF-8 locale (including a C locale) on a Unix-alike.

Functions to open a file such as file, fifo, pipe, gzfile, bzfile, xzfile and unz give an error for non-native filepaths. Where functions look at existence such as file.exists, dir.exists, unlink, file.info and list.files, non-native filepaths are treated as non-existent.

Many other functions use file or gzfile to open their files.

file.path allows non-native file paths to be combined, marking them as UTF-8 if needed.

path.expand only handles paths in the native encoding.

Windows

Windows provides proprietary entry points to access its file systems, and these gained ‘wide’ versions in Windows NT that allowed file paths in UCS-2/UTF-16 to be accessed from any locale.

Some R functions use these entry points when file paths are marked as Latin-1 or UTF-8 to allow access to paths not in the current encoding. These include file, file.access, file.append, file.copy, file.create, file.exists, file.info, file.link, file.remove, file.rename, file.symlink and dir.create, dir.exists, normalizePath, path.expand, pipe, Sys.glob, Sys.junction, unlink but not gzfile, bzfile, xzfile nor unz.

For functions using gzfile (including load, readRDS, read.dcf and tar), it is often possible to use a gzcon connection wrapping a file connection.

Other notable exceptions are list.files, list.dirs, system and file-path inputs for graphics devices.

Historical comment

Before R 4.0.0, file paths marked as being in Latin-1 or UTF-8 were silently translated to the native encoding using escapes such as ‘<e7>’ or ‘<U+00e7>’. This created valid file names but maybe not those intended.

Note

This document is still a work-in-progress.


Check if a Character Vector is Validly Encoded

Description

Check if each element of a character vector is valid in its implied encoding.

Usage

validUTF8(x)

validEnc(x)

Arguments

x

a character vector.

Details

These use similar checks to those used by functions such as grep.

validUTF8 ignores any marked encoding (see Encoding) and so looks directly if the bytes in each string are valid UTF-8. (For the validity of ‘noncharacters’ see the help for intToUtf8.)

validEnc regards character strings as validly encoded unless their encodings are marked as UTF-8 or they are unmarked and the R session is in a UTF-8 or other multi-byte locale. (The checks in other multi-byte locales depend on the OS and as with iconv not all invalid inputs may be detected.)

Value

A logical vector of the same length as x. NA elements are regarded as validly encoded.
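A brief sketch of validUTF8 on a made-up vector containing plain ASCII, a valid two-byte UTF-8 sequence, an invalid byte sequence and NA. Only validUTF8 is shown, since the result of validEnc depends on the locale.

```r
x <- c("abc",                      # ASCII, valid
       "caf\xc3\xa9",              # bytes of a valid UTF-8 sequence
       "\xfa\xb4\xbf\xbf\x9f",     # not valid UTF-8
       NA)                         # NA counts as valid

ok <- validUTF8(x)
ok  # TRUE TRUE FALSE TRUE
```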

Note

It would be possible to check for the validity of character strings in a Latin-1 encoding, but extensions such as CP1252 are widely accepted as ‘Latin-1’ and 8-bit encodings rarely need to be checked for validity.

Examples

x <-
  ## from example(text)
  c("Jetz", "no", "chli", "z\xc3\xbcrit\xc3\xbc\xc3\xbctsch:",
    "(noch", "ein", "bi\xc3\x9fchen", "Z\xc3\xbc", "deutsch)",
    ## from a CRAN check log
    "\xfa\xb4\xbf\xbf\x9f")
validUTF8(x)
validEnc(x) # depends on the locale
Encoding(x) <- "UTF-8"
validEnc(x) # typically the last, x[10], is invalid

## Maybe advantageous to declare it "unknown":
G <- x ; Encoding(G[!validEnc(G)]) <- "unknown"
try( substr(x, 1,1) ) # gives 'invalid multibyte string' error in a UTF-8 locale
try( substr(G, 1,1) ) # works in a UTF-8 locale
nchar(G) # fine, too
## but it is not "more valid" typically:
all.equal(validEnc(x),
          validEnc(G)) # typically TRUE

Vectors - Creation, Coercion, etc

Description

A vector in R is either an atomic vector i.e., one of the atomic types, see ‘Details’, or of type (typeof) or mode list or expression.

vector produces a ‘simple’ vector of the given length and mode, where a ‘simple’ vector has no attribute, i.e., fulfills is.null(attributes(.)).

as.vector, a generic, attempts to coerce its argument into a vector of mode mode (the default is to coerce to whichever vector mode is most convenient): if the result is atomic (is.atomic), all attributes are removed. For mode="any", see ‘Details’.

is.vector(x) returns TRUE if x is a vector of the specified mode having no attributes other than names. For mode="any", see ‘Details’.

Usage

vector(mode = "logical", length = 0)
as.vector(x, mode = "any")
is.vector(x, mode = "any")

Arguments

mode

character string naming an atomic mode or "list" or "expression" or (except for vector) "any". Currently, is.vector() allows any type (see typeof) for mode, and when mode is not "any", is.vector(x, mode) is almost the same as typeof(x) == mode.

length

a non-negative integer specifying the desired length. For a long vector, i.e., length > .Machine$integer.max, it has to be of type "double". Supplying an argument of length other than one is an error.

x

an R object.

Details

The atomic modes are "logical", "integer", "numeric" (synonym "double"), "complex", "character" and "raw".

If mode = "any", is.vector may return TRUE for the atomic modes, list and expression. For any mode, it will return FALSE if x has any attributes except names. (This is incompatible with S.) On the other hand, as.vector removes all attributes including names for results of atomic mode.

For mode = "any", and atomic vectors x, as.vector(x) strips all attributes (including names), returning a simple atomic vector. However, when x is of type "list" or "expression", as.vector(x) currently returns the argument x unchanged, unless there is an as.vector method for class(x).

Note that factors are not vectors; is.vector returns FALSE and as.vector converts a factor to a character vector for mode = "any".
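
The rules above can be checked with a short sketch. The attribute name myattr and the object names are made up for illustration; the behaviour shown is the one described in this section.

```r
x <- c(a = 1, b = 2)
is.vector(x)                 # TRUE: names are the only attribute
names(as.vector(x))          # NULL: as.vector() drops names from atomic vectors

y <- structure(1:3, myattr = "note")   # 'myattr' is a made-up attribute
is.vector(y)                 # FALSE: an attribute other than names is present
is.vector(as.vector(y))      # TRUE: the attribute has been stripped

l <- structure(list(1, "a"), myattr = "note")
identical(as.vector(l), l)   # TRUE: a list is returned unchanged

f <- factor(c("lo", "hi", "lo"))
is.vector(f)                 # FALSE: factors are not vectors
as.vector(f)                 # character vector "lo" "hi" "lo"
```

Where names need to survive, subsetting or c() is usually a safer choice than as.vector().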

Value

For vector, a vector of the given length and mode. Logical vector elements are initialized to FALSE, numeric vector elements to 0, character vector elements to "", raw vector elements to nul bytes and list/expression elements to NULL.

For as.vector, a vector (atomic or of type list or expression). All attributes are removed from the result if it is of an atomic mode, but not in general for a list or expression result. The default method handles 24 input types and 12 values of type: the details of most coercions are undocumented and subject to change.

For is.vector, TRUE or FALSE. is.vector(x, mode = "numeric") can be true for vectors of types "integer" or "double" whereas is.vector(x, mode = "double") can only be true for those of type "double".

Methods for as.vector()

Writers of methods for as.vector need to take care to follow the conventions of the default method. In particular

  • Argument mode can be "any", any of the atomic modes, "list", "expression", "symbol", "pairlist" or one of the aliases "double" and "name".

  • The return value should be of the appropriate mode. For mode = "any" this means an atomic vector or list or expression.

  • Attributes should be treated appropriately: in particular when the result is an atomic vector there should be no attributes, not even names.

  • is.vector(as.vector(x, m), m) should be true for any mode m, including the default "any".

    Currently this is not fulfilled in R when m == "any" and x is of type list or expression with attributes in addition to names, which is typically the case for (S3 or S4) objects (see is.object) which are lists internally.

Note

as.vector and is.vector are quite distinct from the meaning of the formal class "vector" in the methods package, and hence as(x, "vector") and is(x, "vector").

Note that as.vector(x) is not necessarily a null operation if is.vector(x) is true: any names will be removed from an atomic vector.

Non-vector modes "symbol" (synonym "name") and "pairlist" are accepted but have long been undocumented: they are used to implement as.name and as.pairlist, and those functions should preferably be used directly. None of the description here applies to those modes: see the help for the preferred forms.

References

Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988) The New S Language. Wadsworth & Brooks/Cole.

See Also

c, is.numeric, is.list, etc.

Examples

df <- data.frame(x = 1:3, y = 5:7)
## Error:
try(as.vector(data.frame(x = 1:3, y = 5:7), mode = "numeric"))

x <- c(a = 1, b = 2)
is.vector(x)
as.vector(x)
all.equal(x, as.vector(x)) ## FALSE

###-- All the following are TRUE:
is.list(df)
! is.vector(df)
! is.vector(df, mode = "list")

is.vector(list(), mode = "list")

Vectorize a Scalar Function

Description

Vectorize creates a function wrapper that vectorizes the action of its argument FUN.

Usage

Vectorize(FUN, vectorize.args = arg.names, SIMPLIFY = TRUE,
          USE.NAMES = TRUE)

Arguments

FUN

function to apply, found via match.fun.

vectorize.args

a character vector of arguments which should be vectorized. Defaults to all arguments of FUN.

SIMPLIFY

logical or character string; attempt to reduce the result to a vector, matrix or higher dimensional array; see the simplify argument of sapply.

USE.NAMES

logical; use names if the first ... argument has names, or if it is a character vector, use that character vector as the names.

Details

The arguments named in the vectorize.args argument to Vectorize are the arguments passed in the ... list to mapply. Only those that are actually passed will be vectorized; default values will not. See the examples.

Vectorize cannot be used with primitive functions as they do not have a value for formals.

It also cannot be used with functions that have arguments named FUN, vectorize.args, SIMPLIFY or USE.NAMES, as they will interfere with the Vectorize arguments. See the combn example below for a workaround.

Value

A function with the same arguments as FUN, wrapping a call to mapply.
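
As a rough illustration of the wrapping described above, the following sketch compares a vectorized function with a hand-written mapply call. The scalar function clamp is hypothetical and exists only for this example; arguments not listed in vectorize.args are assumed to be passed on unvectorized, which is written here with MoreArgs.

```r
# 'clamp' is a made-up scalar function: if() only handles one value at a time
clamp <- function(x, lo, hi) if (x < lo) lo else if (x > hi) hi else x

vclamp <- Vectorize(clamp, vectorize.args = "x")
v <- vclamp(c(-1, 0.5, 2), lo = 0, hi = 1)

# The same computation written directly with mapply()
m <- mapply(clamp, c(-1, 0.5, 2), MoreArgs = list(lo = 0, hi = 1))

v                 # 0.0 0.5 1.0
identical(v, m)   # TRUE
```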

Examples

# We use rep.int as rep is primitive
vrep <- Vectorize(rep.int)
vrep(1:4, 4:1)
vrep(times = 1:4, x = 4:1)

vrep <- Vectorize(rep.int, "times")
vrep(times = 1:4, x = 42)

f <- function(x = 1:3, y) c(x, y)
vf <- Vectorize(f, SIMPLIFY = FALSE)
f(1:3, 1:3)
vf(1:3, 1:3)
vf(y = 1:3) # Only vectorizes y, not x

# Nonlinear regression contour plot, based on nls() example
require(graphics)
SS <- function(Vm, K, resp, conc) {
    pred <- (Vm * conc)/(K + conc)
    sum((resp - pred)^2 / pred)
}
vSS <- Vectorize(SS, c("Vm", "K"))
Treated <- subset(Puromycin, state == "treated")

Vm <- seq(140, 310, length.out = 50)
K <- seq(0, 0.15, length.out = 40)
SSvals <- outer(Vm, K, vSS, Treated$rate, Treated$conc)
contour(Vm, K, SSvals, levels = (1:10)^2, xlab = "Vm", ylab = "K")

# combn() has an argument named FUN
combnV <- Vectorize(function(x, m, FUNV = NULL) combn(x, m, FUN = FUNV),
                    vectorize.args = c("x", "m"))
combnV(4, 1:4)
combnV(4, 1:4, sum)

Warning Messages

Description

Generates a warning message that corresponds to its argument(s) and (optionally) the expression or function from which it was called.

Usage

warning(..., call. = TRUE, immediate. = FALSE, noBreaks. = FALSE,
        domain = NULL)
suppressWarnings(expr, classes = "warning")

Arguments

...

either zero or more objects which can be coerced to character (and which are pasted together with no separator) or a single condition object.

call.

logical, indicating if the call should become part of the warning message.

immediate.

logical, indicating if the warning should be output immediately, even if getOption("warn") <= 0. NB: this is not respected for condition objects.

noBreaks.

logical, indicating as far as possible the message should be output as a single line when options(warn = 1).

expr

expression to evaluate.

domain

see gettext. If NA, messages will not be translated, see also the note in stop.

classes

character, indicating which classes of warnings should be suppressed.

Details

The result depends on the value of options("warn") and on handlers established in the executing code.

If a condition object is supplied it should be the only argument, and further arguments will be ignored, with a message. options(warn = 1) can be used to request an immediate report.

warning signals a warning condition by (effectively) calling signalCondition. If there are no handlers or if all handlers return, then the value of warn = getOption("warn") is used to determine the appropriate action. If warn is negative warnings are ignored; if it is zero they are stored and printed after the top-level function has completed; if it is one they are printed as they occur and if it is 2 (or larger) warnings are turned into errors. Calling warning(immediate. = TRUE) turns warn <= 0 into warn = 1 for this call only.

If warn is zero (the default), a read-only variable last.warning is created. It contains the warnings which can be printed via a call to warnings.

Warnings will be truncated to getOption("warning.length") characters, default 1000, indicated by [... truncated].

While the warning is being processed, a muffleWarning restart is available. If this restart is invoked with invokeRestart, then warning returns immediately.

An attempt is made to coerce other types of inputs to warning to character vectors.

suppressWarnings evaluates its expression in a context that ignores all warnings.
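
The muffleWarning restart mentioned above is typically reached through a calling handler. The following is a minimal sketch, assuming a made-up function risky and a made-up collector variable seen; it records each warning message and then lets execution continue just after the warning() call.

```r
# 'risky' is a hypothetical function used only for illustration
risky <- function() {
  warning("something looks off")
  "finished"
}

seen <- character()
result <- withCallingHandlers(
  risky(),
  warning = function(w) {
    seen <<- c(seen, conditionMessage(w))  # record the message
    invokeRestart("muffleWarning")         # warning() returns; nothing is printed or stored
  }
)

result  # "finished"
seen    # "something looks off"
```

Unlike tryCatch, this approach does not abandon the computation when the warning is signalled, which is why risky() still returns its value.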

Value

The warning message as character string, invisibly.

References

Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988) The New S Language. Wadsworth & Brooks/Cole.

See Also

stop for fatal errors, message for diagnostic messages, warnings, and options with argument warn=.

gettext for the mechanisms for the automated translation of messages.

Examples

testit <- function() warning("testit")
testit() ## shows call
testit <- function() warning("problem in testit", call. = FALSE)
testit() ## no call
suppressWarnings(warning("testit"))

Print Warning Messages

Description

warnings and its print method print the variable last.warning in a pleasing form.

Usage

warnings(...)

## S3 method for class 'warnings'
summary(object, ...)

## S3 method for class 'warnings'
print(x, tags,
      header = ngettext(n, "Warning message:\n", "Warning messages:\n"),
      ...)

## S3 method for class 'summary.warnings'
print(x, ...)

Arguments

...

arguments to be passed to cat (for warnings()).

object

a "warnings" object as returned by warnings().

x

a "warnings" or "summary.warnings" object.

tags

if not missing, a character vector of the same length as x, to “label” the messages. Defaults to paste0(seq_len(n), ": ") for n >= 2 where n <- length(x).

header

a character string cat()ed before the messages are printed.

Details

See the description of options("warn") for the circumstances under which there is a last.warning object and warnings() is used. In essence this is if options(warn = 0) and warning has been called at least once.

Note that the length(last.warning) is maximally getOption("nwarnings") (at the time the warnings are generated) which is 50 by default. To increase, use something like

  options(nwarnings = 10000)

It is possible that last.warning refers to the last recorded warning and not to the last warning, for example if options(warn) has been changed or if a catastrophic error occurred.

Value

warnings() returns an object of S3 class "warnings", basically a named list. In R versions before 4.4.0, it returned NULL when there were no warnings, contrary to the above documentation.

summary(<warnings>) returns a "summary.warnings" object which is basically the list of unique warnings (unique(object)) with a "counts" attribute, somewhat experimentally.

Warning

It is undocumented where last.warning is stored nor that it is visible, and this is subject to change.

References

Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988) The New S Language. Wadsworth & Brooks/Cole.

See Also

warning.

Examples

## NB this example is intended to be pasted in,
##    rather than run by example()
ow <- options("warn")
for(w in -1:1) {
   options(warn = w); cat("\n warn =", w, "\n")
   for(i in 1:3) { cat(i, "..\n"); m <- matrix(1:7, 3, 4) }
   cat("--=--=--\n")
}
## at the end prints all three warnings, from the 'option(warn = 0)' above
options(ow) # reset to previous, typically 'warn = 0'
tail(warnings(), 2) # see the last two warnings only (via '[' method)

## Often the most useful way to look at many warnings:
summary(warnings())

op <- options(nwarnings = 10000) ## <- get "full statistics"
x <- 1:36; for(n in 1:13) for(m in 1:12) A <- matrix(x, n, m) # There were 105 warnings ...
summary(warnings())
options(op) # revert to previous (keeping 50 messages by default)

Extract Parts of a POSIXt or Date Object

Description

Extract the weekday, month or quarter, or the Julian time (days since some origin). These are generic functions: the methods for the internal date-time classes are documented here.

Usage

weekdays(x, abbreviate)
## S3 method for class 'POSIXt'
weekdays(x, abbreviate = FALSE)
## S3 method for class 'Date'
weekdays(x, abbreviate = FALSE)

months(x, abbreviate)
## S3 method for class 'POSIXt'
months(x, abbreviate = FALSE)
## S3 method for class 'Date'
months(x, abbreviate = FALSE)

quarters(x, abbreviate)
## S3 method for class 'POSIXt'
quarters(x, ...)
## S3 method for class 'Date'
quarters(x, ...)

julian(x, ...)
## S3 method for class 'POSIXt'
julian(x, origin = as.POSIXct("1970-01-01", tz = "GMT"), ...)
## S3 method for class 'Date'
julian(x, origin = as.Date("1970-01-01"), ...)

Arguments

x

an object inheriting from class "POSIXt" or "Date".

abbreviate

logical vector (possibly recycled). Should the names be abbreviated?

origin

a length-one object inheriting from class "POSIXt" or "Date".

...

arguments for other methods.

Value

weekdays and months return a character vector of names in the locale in use, i.e., Sys.getlocale("LC_TIME").

quarters returns a character vector of "Q1" to "Q4".

julian returns the number of days (possibly fractional) since the origin, with the origin as an "origin" attribute. All time calculations in R are done ignoring leap-seconds.

Note

Other components such as the day of the month or the year are very easy to compute: just use as.POSIXlt and extract the relevant component. Alternatively (especially if the components are desired as character strings), use strftime.

See Also

DateTimeClasses, Date; Sys.getlocale("LC_TIME") crucially for months() and weekdays().
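
A small sketch of the return values described above, using fixed dates so that the results do not depend on the current day. The variable names are illustrative only; as.numeric() is used to look at the day count without its attributes.

```r
d <- as.Date("2000-01-01")

j <- julian(d)                    # days since the default origin, 1970-01-01
as.numeric(j)                     # 10957
attr(j, "origin")                 # the Date 1970-01-01

j2 <- julian(d, origin = as.Date("1999-12-25"))
as.numeric(j2)                    # 7

quarters(d)                       # "Q1" (not locale dependent)
```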

Examples

## first two are locale dependent:
weekdays(.leap.seconds)
months  (.leap.seconds)
quarters(.leap.seconds)

## Show how easily you get month, day, year, day (of {month, week, yr}), ... :
## (remember to count from 0 (!): mon = 0..11, wday = 0..6,  etc !!)

##' Transform (Time-)Date vector  to  convenient data frame :
dt2df <- function(dt, dName = deparse(substitute(dt))) {
    DF <- as.data.frame(unclass(as.POSIXlt( dt )))
    `names<-`(cbind(dt, DF, deparse.level = 0L), c(dName, names(DF)))
}
## e.g.,
dt2df(.leap.seconds)    # date+time
dt2df(Sys.Date() + 0:9) # date

##' Even simpler:  Date -> Matrix - dropping time info {sec,min,hour, isdst}
d2mat <- function(x) simplify2array(unclass(as.POSIXlt(x))[4:7])
## e.g.,
d2mat(seq(as.Date("2000-02-02"), by = 1, length.out = 30)) # has R 1.0.0's release date

## Julian Day Number (JDN, https://en.wikipedia.org/wiki/Julian_day)
## is the number of days since noon UTC on the first day of 4713 BCE
## in the proleptic Julian calendar.  More recently it is given in
## 'Terrestrial Time', which differs from UTC by a few seconds.
## See https://en.wikipedia.org/wiki/Terrestrial_Time
julian(Sys.Date(), -2440588) # from a day
floor(as.numeric(julian(Sys.time())) + 2440587.5) # from a date-time

Which indices are TRUE?

Description

Give the TRUE indices of a logical object, allowing for array indices.

Usage

which(x, arr.ind = FALSE, useNames = TRUE)
arrayInd(ind, .dim, .dimnames = NULL, useNames = FALSE)

Arguments

x

a logical vector or array. NAs are allowed and omitted (treated as if FALSE).

arr.ind

logical; should array indices be returned when x is an array? Anything other than a single true value is treated as false.

ind

integer-valued index vector, as resulting from which(x).

.dim

dim(.) integer vector.

.dimnames

optional list of character dimnames(.). If useNames is true, to be used for constructing dimnames for arrayInd() (and hence, which(*, arr.ind=TRUE)). If names(.dimnames) is not empty, these are used as column names. .dimnames[[1]] is used as row names.

useNames

logical indicating if the value of arrayInd() should have (non-null) dimnames at all.

Value

If arr.ind == FALSE (the default), an integer vector, or a double vector if x is a long vector, with length equal to sum(x), i.e., to the number of TRUEs in x.

Basically, the result is (1:length(x))[x] in typical cases; more generally, including when x has NA's, which(x) is seq_along(x)[!is.na(x) & x] plus names when x has them.

If arr.ind == TRUE and x is an array (has a dim attribute), the result is arrayInd(which(x), dim(x), dimnames(x)), namely a matrix whose rows each are the indices of one element of x; see Examples below.
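
The relationship between the two forms of the result can be seen on a small logical matrix. This is an illustrative sketch with made-up data, not part of the original examples.

```r
m <- matrix(c(TRUE, FALSE, FALSE, TRUE, NA, TRUE), nrow = 2)

idx <- which(m)               # 1 4 6: the NA at position 5 is omitted
idx

# Same answer as the general formula given above
seq_along(m)[!is.na(m) & m]

ai <- arrayInd(idx, dim(m))   # one row per TRUE: (1,1), (2,2), (2,3)
ai

wi <- which(m, arr.ind = TRUE) # same indices, with column names "row" and "col"
wi
```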

Note

Unlike most other base R functions this does not coerce x to logical: only arguments with typeof logical are accepted and others give an error.

Author(s)

Werner Stahel and Peter Holzer (ETH Zurich) proposed the arr.ind option.

See Also

Logic, which.min for the index of the minimum or maximum, and match for the first index of an element in a vector, i.e., for a scalar a, match(a, x) is equivalent to min(which(x == a)) but much more efficient.

Examples

which(LETTERS == "R")
which(ll <- c(TRUE, FALSE, TRUE, NA, FALSE, FALSE, TRUE)) #> 1 3 7
names(ll) <- letters[seq(ll)]
which(ll)
which((1:12) %% 2 == 0) # which are even?
which(1:10 > 3, arr.ind = TRUE)

( m <- matrix(1:12, 3, 4) )
div.3 <- m %% 3 == 0
which(div.3)
which(div.3, arr.ind = TRUE)
rownames(m) <- paste("Case", 1:3, sep = "_")
which(m %% 5 == 0, arr.ind = TRUE)

dim(m) <- c(2, 2, 3); m
which(div.3, arr.ind = FALSE)
which(div.3, arr.ind = TRUE)

vm <- c(m)
dim(vm) <- length(vm) #-- funny thing with  length(dim(...)) == 1
which(div.3, arr.ind = TRUE)

Where is the Min() or Max() or first TRUE or FALSE ?

Description

Determines the location, i.e., index of the (first) minimum or maximum of a numeric (or logical) vector.

Usage

which.min(x)
which.max(x)

Arguments

x

numeric (logical, integer or double) vector or an R object for which the internal coercion to double works whose min or max is searched for.

Value

Missing and NaN values are discarded.

an integer or on 64-bit platforms, if length(x) =: n >= 2^31 an integer valued double of length 1 or 0 (iff x has no non-NAs), giving the index of the first minimum or maximum respectively of x.

If this extremum is unique (or empty), the results are the same as (but more efficient than) which(x == min(x, na.rm = TRUE)) or which(x == max(x, na.rm = TRUE)) respectively.

Logical x – First TRUE or FALSE

For a logical vector x with both FALSE and TRUE values, which.min(x) and which.max(x) return the index of the first FALSE or TRUE, respectively, as FALSE < TRUE. However, match(FALSE, x) or match(TRUE, x) are typically preferred, as they do indicate mismatches.
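
The difference between the two approaches only shows up when the value being looked for is absent. A brief sketch with made-up vectors:

```r
x <- c(FALSE, FALSE, TRUE, TRUE)
which.max(x)        # 3: index of the first TRUE
match(TRUE, x)      # 3

none <- c(FALSE, FALSE, FALSE)
which.max(none)     # 1: the maximum is FALSE, so index 1 is not a TRUE at all
match(TRUE, none)   # NA: the mismatch is visible
```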

Author(s)

Martin Maechler

See Also

which, max.col, max, etc.

Use arrayInd(), if you need array/matrix indices instead of 1D vector ones.

which.is.max in package nnet differs in breaking ties at random (and having a ‘fuzz’ in the definition of ties).

Examples

x <- c(1:4, 0:5, 11)
which.min(x)
which.max(x)

## it *does* work with NA's present, by discarding them:
presidents[1:30]
range(presidents, na.rm = TRUE)
which.min(presidents) # 28
which.max(presidents) #  2

## Find the first occurrence, i.e. the first TRUE, if there is at least one:
x <- rpois(10000, lambda = 10); x[sample.int(50, 20)] <- NA
## where is the first value >= 20 ?
which.max(x >= 20)

## Also works for lists (which can be coerced to numeric vectors):
which.min(list(A = 7, pi = pi)) ##  ->  c(pi = 2L)

Evaluate an Expression in a Data Environment

Description

Evaluate an R expression in an environment constructed from data, possibly modifying (a copy of) the original data.

Usage

with(data, expr, ...)
within(data, expr, ...)
## S3 method for class 'list'
within(data, expr, keepAttrs = TRUE, ...)

Arguments

data

data to use for constructing an environment. For the default with method this may be an environment, a list, a data frame, or an integer as in sys.call. For within, it can be a list or a data frame.

expr

expression to evaluate; particularly for within() often a “compound” expression, i.e., of the form

   {
     a <- somefun()
     b <- otherfun()
     .....
     rm(unused1, temp)
   }

keepAttrs

for the list method of within(), a logical specifying if the resulting list should keep the attributes from data and have its names in the same order. Often this is unneeded as the result is a named list anyway, and then keepAttrs = FALSE is more efficient.

...

arguments to be passed to (future) methods.

Details

with is a generic function that evaluates expr in a local environment constructed from data. The environment has the caller's environment as its parent. This is useful for simplifying calls to modeling functions. (Note: if data is already an environment then this is used with its existing parent.)

Note that assignments within expr take place in the constructed environment and not in the user's workspace.

within is similar, except that it examines the environment after the evaluation of expr and makes the corresponding modifications to a copy of data (this may fail in the data frame case if objects are created which cannot be stored in a data frame), and returns it. within can be used as an alternative to transform.
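
The contrast between the two functions can be sketched on a tiny data frame. The data and the variable name total are made up for illustration, and the exists() check assumes no object called total was defined beforehand.

```r
df <- data.frame(a = 1:3, b = c(10, 20, 30))

# with(): returns the value of the expression; 'total' is only temporary
s <- with(df, { total <- a + b; sum(total) })
s                                   # 66
exists("total", inherits = FALSE)   # FALSE, assuming 'total' was not defined before

# within(): returns a modified copy of the data
df2 <- within(df, {
  total <- a + b
  rm(b)                             # drops column 'b' from the copy
})
names(df2)                          # "a" "total"
names(df)                           # "a" "b": the original is unchanged
```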

Value

For with, the value of the evaluated expr. For within, the modified object.

Note

For interactive use this is very effective and nice to read. For programming however, i.e., in one's functions, more care is needed, and typically one should refrain from using with(), as, e.g., variables in data may accidentally override local variables, see the reference.

Further, when using modeling or graphics functions with an explicit data argument (and typically using formulas), it is typically preferred to use the data argument of that function rather than to use with(data, ...).

References

Thomas Lumley (2003) Standard nonstandard evaluation rules. https://developer.r-project.org/nonstandard-eval.pdf

See Also

evalq, attach, assign, transform.

Examples

with(mtcars, mpg[cyl == 8  &  disp > 350])
    # is the same as, but nicer than
mtcars$mpg[mtcars$cyl == 8  &  mtcars$disp > 350]

require(stats); require(graphics)

# examples from glm:
with(data.frame(u = c(5,10,15,20,30,40,60,80,100),
                lot1 = c(118,58,42,35,27,25,21,19,18),
                lot2 = c(69,35,26,21,18,16,13,12,12)),
    list(summary(glm(lot1 ~ log(u), family = Gamma)),
         summary(glm(lot2 ~ log(u), family = Gamma))))

aq <- within(airquality, {     # Notice that multiple vars can be changed
    lOzone <- log(Ozone)
    Month <- factor(month.abb[Month])
    cTemp <- round((Temp - 32) * 5/9, 1) # From Fahrenheit to Celsius
    S.cT <- Solar.R / cTemp  # using the newly created variable
    rm(Day, Temp)
})
head(aq)

# example from boxplot:
with(ToothGrowth, {
    boxplot(len ~ dose, boxwex = 0.25, at = 1:3 - 0.2,
            subset = (supp == "VC"), col = "yellow",
            main = "Guinea Pigs' Tooth Growth",
            xlab = "Vitamin C dose mg",
            ylab = "tooth length", ylim = c(0, 35))
    boxplot(len ~ dose, add = TRUE, boxwex = 0.25, at = 1:3 + 0.2,
            subset = supp == "OJ", col = "orange")
    legend(2, 9, c("Ascorbic acid", "Orange juice"),
           fill = c("yellow", "orange"))
})

# alternate form that avoids subset argument:
with(subset(ToothGrowth, supp == "VC"),
     boxplot(len ~ dose, boxwex = 0.25, at = 1:3 - 0.2,
             col = "yellow", main = "Guinea Pigs' Tooth Growth",
             xlab = "Vitamin C dose mg",
             ylab = "tooth length", ylim = c(0, 35)))
with(subset(ToothGrowth,  supp == "OJ"),
     boxplot(len ~ dose, add = TRUE, boxwex = 0.25, at = 1:3 + 0.2,
             col = "orange"))
legend(2, 9, c("Ascorbic acid", "Orange juice"),
       fill = c("yellow", "orange"))

Return both a Value and its Visibility

Description

This function evaluates an expression, returning it in a two element list containing its value and a flag showing whether it would automatically print.

Usage

withVisible(x)

Arguments

x

an expression to be evaluated.

Details

The argument, not an expression object, rather an (unevaluated function) call, is evaluated in the caller's context.

This is a primitive function.

Value

value

The value of x after evaluation.

visible

logical; whether the value would auto-print.

See Also

invisible, eval; withAutoprint() calls source() which itself uses withVisible() in order to correctly “auto print”.

Examples

x <- 1
withVisible(x <- 1) # *$visible is FALSE
x
withVisible(x) # *$visible is TRUE

# Wrap the call in evalq() for special handling
df <- data.frame(a = 1:5, b = 1:5)
evalq(withVisible(a + b), envir = df)
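
As an illustration of the two components, the sketch below inspects a function that returns its value invisibly. The function quiet_sq is hypothetical and exists only for this example.

```r
# 'quiet_sq' is a made-up function that hides its result from auto-printing
quiet_sq <- function(v) invisible(v^2)

r <- withVisible(quiet_sq(3))
r$value                      # 9
r$visible                    # FALSE: calling quiet_sq(3) at the prompt prints nothing

withVisible(3^2)$visible     # TRUE: an ordinary expression would auto-print
```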

Write Data to a File

Description

Write data x to a file or other connection. As it simply calls cat(), less formatting happens than with print()ing. If x is a matrix you need to transpose it (and typically set ncolumns) to get the columns in file the same as those in the internal representation.

Whereas atomic vectors (numeric, character, etc, including matrices) are written plainly, i.e., without any names, less simple vector-like objects such as "factor", "Date", or "POSIXt" may be formatted to character before writing.

Usage

write(x, file = "data",
      ncolumns = if(is.character(x)) 1 else 5,
      append = FALSE, sep = " ")
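
The remark about transposing a matrix can be made concrete with a small sketch. The file is a temporary one and the object names are illustrative only.

```r
m <- matrix(1:6, nrow = 2)          # rows are 1 3 5 and 2 4 6
f <- tempfile(fileext = ".txt")

# Elements are written in storage (column-major) order, so transpose first
# and ask for as many columns per line as the matrix has
write(t(m), f, ncolumns = ncol(m))

lines <- readLines(f)
lines                               # "1 3 5" "2 4 6"
unlink(f)                           # tidy up
```

For anything beyond a quick dump, write.table is usually the more convenient tool for matrices.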

Arguments

x

the data to be written out.

file

a connection, or a character string naming the file to write to. If "", print to the standard output connection.

When .Platform$OS.type != "windows", and it is "|cmd", the output is piped to the command given by ‘cmd’.

ncolumns

the number of columns to write the data in.

append

if TRUE the data x are appended to the connection.

sep

a string used to separate columns. Using sep = "\t" gives tab delimited output; default is " ".

References

Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988) The New S Language. Wadsworth & Brooks/Cole.

See Also

write is a wrapper for cat, which gives further details on the format used.

write.table for matrix and data frame objects, writeLines for lines of text, and scan for reading data.

saveRDS and save are often preferable (for writing any R objects).

Examples

# Demonstrate default ncolumns, writing to the console
write(month.abb,  "") # 1 element  per line for "character"
write(stack.loss, "") # 5 elements per line for "numeric"

# Build a file with sequential calls
fil <- tempfile("data")
write("# Model settings", fil)
write(month.abb, fil, ncolumns = 6, append = TRUE)
write("\n# Initial parameter values", fil, append = TRUE)
write(sqrt(stack.loss), fil, append = TRUE)
if(interactive()) file.show(fil)
unlink(fil) # tidy up

Write Lines to a Connection

Description

Write text lines to a connection.

Usage

writeLines(text, con = stdout(), sep = "\n", useBytes = FALSE)

Arguments

text

a character vector.

con

a connection object or a character string.

sep

character string. A string to be written to the connectionafter each line of text.

useBytes

logical. See ‘Details’.

Details

If the con is a character string, the function calls file to obtain a file connection which is opened for the duration of the function call. (Tilde expansion of the file path is done by file.)

If the connection is open it is written from its current position. If it is not open, it is opened for the duration of the call in "wt" mode and then closed again.

Normally writeLines is used with a text-mode connection, and the default separator is converted to the normal separator for that platform (LF on Unix/Linux, CRLF on Windows). For more control, open a binary connection and specify the precise value you want written to the file in sep. For even more control, use writeChar on a binary connection.

useBytes is for expert use. Normally (when false) character strings with marked encodings are converted to the current encoding before being passed to the connection (which might do further re-encoding). useBytes = TRUE suppresses the re-encoding of marked strings so they are passed byte-by-byte to the connection: this can be useful when strings have already been re-encoded by e.g. iconv. (It is invoked automatically for strings with marked encoding "bytes".)

See Also

connections, writeChar, writeBin, readLines, cat
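
The advice about binary connections can be sketched as follows: the file is opened in "wb" mode so that the separator is written exactly as given, here CRLF on any platform. The file and variable names are illustrative only.

```r
f <- tempfile()
con <- file(f, open = "wb")          # binary mode: 'sep' is written verbatim
writeLines(c("alpha", "beta"), con, sep = "\r\n")
close(con)

bytes <- readBin(f, what = "raw", n = 20)
bytes                                # each line is followed by 0d 0a
lines <- readLines(f)                # readLines accepts CRLF line endings
lines                                # "alpha" "beta"
unlink(f)                            # tidy up
```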


Auxiliary Function for Sorting and Ranking

Description

A generic auxiliary function that produces a numeric vector which will sort in the same order as x.

Usage

xtfrm(x)

Arguments

x

an R object.

Details

This is a special case of ranking, but as a less general function than rank is more suitable to be made generic. The default method is similar to rank(x, ties.method = "min", na.last = "keep"), so NA values are given rank NA and all tied values are given equal integer rank.

The factor method extracts the codes.

The default method will unclass the object if is.numeric(x) is true but otherwise make use of == and > methods for the class of x[i] (for integers i), and the is.na method for the class of x, but might be rather slow when doing so.

This is an internal generic primitive, so S3 or S4 methods can be written for it. Differently to other internal generics, the default method is called explicitly when no other dispatch has happened.

Value

A numeric (usually integer) vector of the same length as x.

See Also

rank, sort, order.
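
A short sketch of the factor and numeric cases described above, using made-up data:

```r
f <- factor(c("b", "a", "c", "a"))
xtfrm(f)                              # 2 1 3 1: the integer codes of the factor

x <- c(3.5, NA, 1, 3.5)
identical(order(xtfrm(x)), order(x))  # TRUE: the result sorts in the same order as x
```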


Rounding of Numbers: Zapping Small Ones to Zero

Description

zapsmall determines a digits argument dr for calling round(x, digits = dr) such that values close to zero (compared with the maximal absolute value in the vector) are ‘zapped’, i.e., replaced by 0.

Usage

zapsmall(x, digits = getOption("digits"),
         mFUN = function(x, ina) max(abs(x[!ina])),
         min.d = 0L)

Arguments

x

a numeric or complex vector or any R number-like object which has a round method and basic arithmetic methods including log10().

digits

integer indicating the precision to be used.

mFUN

a function(x, ina) of the numeric (or complex) x and the logical ina := is.na(x) returning a positive number in the order of magnitude of the maximal abs(x) value. The default is back compatible but not robust, and e.g., not very useful when x has infinite entries.

min.d

an integer specifying the minimal number of digits to use in the resulting round(x, digits=*) call when mFUN(*) > 0.

References

Chambers, J. M. (1998) Programming with Data. A Guide to the S Language. Springer.

Examples

x2 <- pi * 100^(-2:2)/10
   print(  x2, digits = 4)
zapsmall(  x2) # automatic digits
zapsmall(  x2, digits = 4)
zapsmall(c(x2, Inf)) # round()s to integer ..
zapsmall(c(x2, Inf), min.d = -Inf) # everything  is small wrt  Inf

(z <- exp(1i*0:4*pi/2))
zapsmall(z)

zapShow <- function(x, ...) rbind(orig = x, zapped = zapsmall(x, ...))
zapShow(x2)

## using a *robust* mFUN
mF_rob <- function(x, ina) boxplot.stats(x, do.conf = FALSE)$stats[5]
## with robust mFUN(), 'Inf' is no longer distorting the picture:
zapShow(c(x2, Inf), mFUN = mF_rob)
zapShow(c(x2, Inf), mFUN = mF_rob, min.d = -5) # the same
zapShow(c(x2, 999), mFUN = mF_rob) # same *rounding* as w/ Inf
zapShow(c(x2, 999), mFUN = mF_rob, min.d =  3) # the same
zapShow(c(x2, 999), mFUN = mF_rob, min.d =  8) # small diff
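
A minimal sketch of the effect with default arguments, using made-up values; the second case is a typical floating-point residue in complex arithmetic.

```r
x <- c(1, 1e-20)
zx <- zapsmall(x)
zx                    # 1 0: 1e-20 is negligible next to 1

z <- exp(1i * pi)     # mathematically -1, with a tiny imaginary part from rounding error
zz <- zapsmall(z)
zz                    # -1+0i
```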

Listing of Packages

Description

.packages returns information about package availability.

Usage

.packages(all.available=FALSE, lib.loc=NULL)

Arguments

all.available

logical; if TRUE return a character vector of all available packages in lib.loc.

lib.loc

a character vector describing the location of R library trees to search through, or NULL. The default value of NULL corresponds to .libPaths().

Details

.packages() returns the names of the currently attached packages invisibly whereas .packages(all.available = TRUE) gives (visibly) all packages available in the library location path lib.loc.

For a package to be regarded as being ‘available’ it must have valid metadata (and hence be an installed package). However, this will report a package as available if the metadata does not match the directory name: use find.package to confirm that the metadata match or installed.packages for a much slower but more comprehensive check of ‘available’ packages.

Value

A character vector of package base names, invisible unless all.available = TRUE.

Note

.packages(all.available = TRUE) is not a way to find out if a small number of packages are available for use: not only is it expensive when thousands of packages are installed, it is an incomplete test. See the help for find.package for why require should be used.

Author(s)

R core; Guido Masarotto for the all.available = TRUE part of .packages.

See Also

library, .libPaths, installed.packages.

Examples

(.packages()) # maybe just "base"
.packages(all.available = TRUE) # return all available as character vector
require(splines)
(.packages()) # "splines", too
detach("package:splines")

Miscellaneous Internal/Programming Utilities

Description

Miscellaneous internal/programming utilities.

Usage

.standard_regexps()

Details

.standard_regexps returns a list of ‘standard’ regexps, including elements named valid_package_name and valid_package_version with the obvious meanings. The regexps are not anchored.
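
Because the regexps are not anchored, testing a whole string requires adding the anchors oneself. The sketch below does this for package names; the candidate strings are made up, and the outcomes noted in the comments assume the usual rule that a package name starts with a letter and contains only letters, digits and dots.

```r
re <- .standard_regexps()$valid_package_name
anchored <- paste0("^", re, "$")     # anchor to match the whole string

ok <- grepl(anchored, c("stats", "2fast", "my_pkg"))
ok                    # TRUE FALSE FALSE

loose <- grepl(re, "2fast")
loose                 # TRUE: unanchored, the pattern matches the substring "fast"
```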


