| Type: | Package |
| Title: | Tools for Data Manipulation |
| Version: | 0.5.0 |
| BugReports: | https://github.com/wahani/dat/issues |
| Description: | An implementation of common higher order functions with syntactic sugar for anonymous functions. Also provides a link to 'dplyr' and 'data.table' for common transformations on data frames to work around non standard evaluation by default. |
| License: | MIT + file LICENSE |
| Depends: | methods |
| Imports: | data.table, Formula, magrittr, progress, aoos |
| Suggests: | dplyr, lintr, knitr, rbenchmark, nycflights13, rmarkdown, testthat, tibble |
| VignetteBuilder: | knitr |
| Encoding: | UTF-8 |
| ByteCompile: | TRUE |
| RoxygenNote: | 7.1.0 |
| Collate: | 'NAMESPACE.R' 'FormulaList.R' 'helper.R' 'DataFrame.R' 'as.function.R' 'bindRows.R' 'dataTableBackend.R' 'deparse.R' 'extract.R' 'map.R' 'mutar.R' 'pipeExport.R' 'replace.R' 'useDplyr.R' 'verboseApply.R' |
| NeedsCompilation: | no |
| Packaged: | 2020-05-15 18:57:26 UTC; lswarnholz |
| Author: | Sebastian Warnholz [aut, cre] |
| Maintainer: | Sebastian Warnholz <wahani@gmail.com> |
| Repository: | CRAN |
| Date/Publication: | 2020-05-15 19:40:03 UTC |
DataFrame and methods
Description
This is a 'data.table' like implementation of a data.frame. Either dplyr or data.table is used as backend. The only purpose is to have R CMD check friendly syntax.
Usage
DataFrame(...)
as.DataFrame(x, ...)
## Default S3 method:
as.DataFrame(x, ...)
## S3 method for class 'data.frame'
as.DataFrame(x, ...)
## S3 method for class 'DataFrame'
x[i, j, ..., by, sby, drop]
Arguments
... | arbitrary number of args |
x | (DataFrame | data.frame) |
| i | (logical | numeric | integer | OneSidedFormula | TwoSidedFormula | FormulaList) see the examples. |
| j | (logical | character | TwoSidedFormula | FormulaList | function) a character beginning with '^' is interpreted as a regular expression |
| by, sby | (character) variables to group by. by will be used to do transformations within groups. sby will collapse each group to one row. |
drop | (ignored) never drops the class. |
Details
OneSidedFormula is always used for subsetting rows.
TwoSidedFormula is used instead of name-value expressions in summarise and mutate.
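The two formula roles can be pictured with a small base R sketch. The helpers filterByFormula and summariseByFormula are hypothetical names introduced only for illustration; they are not part of 'dat' and do not claim to mirror how the package evaluates formulas internally.

```r
# Hypothetical helpers for illustration; they are not part of 'dat'.

# One-sided formula: keep the rows for which the right-hand side is TRUE.
filterByFormula <- function(df, f) {
  stopifnot(inherits(f, "formula"), length(f) == 2L)
  keep <- eval(f[[2L]], envir = df, enclos = environment(f))
  df[which(keep), , drop = FALSE]
}

# Two-sided formula: the left-hand side names the new column, the
# right-hand side is evaluated once per group (one row per group).
summariseByFormula <- function(df, f, by) {
  stopifnot(inherits(f, "formula"), length(f) == 3L)
  groups <- split(df, df[[by]])
  values <- vapply(
    groups,
    function(g) eval(f[[3L]], envir = g, enclos = environment(f)),
    numeric(1)
  )
  # Note: the grouping column comes back as character in this sketch.
  out <- data.frame(names(groups), unname(values), stringsAsFactors = FALSE)
  names(out) <- c(by, as.character(f[[2L]]))
  out
}

late <- filterByFormula(airquality, ~ Month > 5)
res <- summariseByFormula(late, meanWind ~ mean(Wind), by = "Month")
res
```

Because the formula carries its own environment, variables referenced on the right-hand side that are not columns are looked up where the formula was written.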
See Also
Examples
data("airquality")
dat <- as.DataFrame(airquality)
dat[~ Month > 4, ][meanWind ~ mean(Wind), sby = "Month"]["meanWind"]
dat[FL(.n ~ mean(.n), .n = c("Wind", "Temp")), sby = "Month"]
Dynamically generate formulas
Description
Function to dynamically generate formulas - (F)ormula (L)ist - to be used in mutar.
Usage
FL(..., .n = NULL, pattern = "\\.n")
makeFormulas(..., .n, pattern = "\\.n")
## S3 method for class 'FormulaList'
update(object, data, ...)
Arguments
... | (formulas) |
| .n | names to be used in formulas. Can be any object which can be used by extract to select columns. NULL is interpreted to use the formulas without change. |
pattern | (character) pattern to be replaced in formulas |
object | (FormulaList) |
data | (data.frame) |
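The role of pattern and .n can be sketched in base R as a textual substitution that yields one formula per name. makeFormulasSketch is a hypothetical helper written for this illustration; it is an assumption about the idea, not the code behind FL or makeFormulas, and it only handles a character vector of names.

```r
# Hypothetical helper for illustration; not the implementation used by 'dat'.
# Replaces every match of `pattern` in the formula text with each name in `.n`.
makeFormulasSketch <- function(f, .n, pattern = "\\.n") {
  stopifnot(inherits(f, "formula"), is.character(.n))
  template <- paste(deparse(f), collapse = " ")
  lapply(.n, function(name) {
    as.formula(gsub(pattern, name, template), env = environment(f))
  })
}

fs <- makeFormulasSketch(.n ~ mean(.n), c("Wind", "Temp"))
fs
```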
See Also
Examples
FL(.n ~ mean(.n), .n = "variable")
as(makeFormulas(.n ~ mean(.n), .n = "variable"), "FormulaList")
Coerce a formula into a function
Description
Convert a formula into a function. See map and extract for examples.
Usage
## S3 method for class 'formula'
as.function(x, ...)
Arguments
x | (formula) see examples |
... | not used |
Value
An object inheriting from class function.
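A minimal base R sketch of the coercion, assuming only the two simplest forms: a one-sided formula whose argument is called `.`, and a two-sided formula whose left-hand side is a single argument name. formulaToFunction is a hypothetical name; the forms with several arguments or type checks shown in the examples are not covered here.

```r
# Hypothetical helper for illustration; not the method shipped with 'dat'.
formulaToFunction <- function(f) {
  stopifnot(inherits(f, "formula"))
  if (length(f) == 2L) {
    argName <- "."          # ~ body        becomes function(.) body
    body <- f[[2L]]
  } else {
    argName <- as.character(f[[2L]])  # x ~ body becomes function(x) body
    body <- f[[3L]]
  }
  stopifnot(length(argName) == 1L)
  # One formal argument without a default value, named after argName.
  args <- setNames(alist(x = ), argName)
  # Build the function in the environment where the formula was created.
  eval(call("function", as.pairlist(args), body), environment(f))
}

addOne <- formulaToFunction(~ . + 1)
double <- formulaToFunction(x ~ x * 2)
addOne(1)
double(3)
```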
Examples
as.function(~ .)(1)
as.function(x ~ x)(1)
as.function(f(x, y) ~ c(x, y))(1, 2)
as.function(numeric : x ~ x)(1) # check for class
as.function(numeric(1) : x ~ x)(1) # check for class + length
Bind rows
Description
This is a wrapper around rbindlist to preserve the input class.
Usage
bindRows(x, id = NULL, useNames = TRUE, fill = TRUE)
Arguments
x | (list) a list of data frames |
| id, useNames, fill | passed to rbindlist |
Value
If the first element of x inherits from data.frame, the type of that first element; x otherwise.
Extract elements from a vector
Description
Extract elements from an object as S4 generic function. See the examples.
Usage
extract(x, ind, ...)
## S4 method for signature 'list,'function''
extract(x, ind, ...)
## S4 method for signature 'atomic,'function''
extract(x, ind, ...)
## S4 method for signature 'ANY,formula'
extract(x, ind, ...)
## S4 method for signature 'atomicORlist,numericORintegerORlogical'
extract(x, ind, ...)
## S4 method for signature 'ANY,character'
extract(x, ind, ...)
## S4 method for signature 'data.frame,character'
extract(x, ind, ...)
extract2(x, ind, ...)
## S4 method for signature 'atomicORlist,numericORinteger'
extract2(x, ind, ...)
## S4 method for signature 'ANY,formula'
extract2(x, ind, ...)
## S4 method for signature 'atomicORlist,'function''
extract2(x, ind, ...)
## S4 method for signature 'ANY,character'
extract2(x, ind, ...)
Arguments
x | (atomic | list) a vector. |
| ind | (function | formula | character | numeric | integer | logical) a formula is coerced into a function. For lists the function is applied to each element (and has to return a logical of length 1). For atomics a vectorized function is expected. If you supply an atomic it is used for subsetting. A character of length 1 beginning with "^" is interpreted as regular expression. |
... | arguments passed to ind. |
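The rules for ind described above can be sketched as a plain base R function. extractSketch is a hypothetical name, and the if/else chain is an assumption standing in for the S4 dispatch the package uses; formulas are left out to keep the sketch short.

```r
# Hypothetical helper for illustration; 'dat' implements this via S4 methods.
extractSketch <- function(x, ind, ...) {
  if (is.function(ind)) {
    # Lists: the predicate is applied to each element and must return a
    # logical of length 1. Atomics: a vectorized function is expected.
    keep <- if (is.list(x)) vapply(x, ind, logical(1), ...) else ind(x, ...)
    x[keep]
  } else if (is.character(ind) && length(ind) == 1L && startsWith(ind, "^")) {
    # A single string with a leading "^" is matched against the names of x.
    x[grepl(ind, names(x))]
  } else {
    # Any other atomic is used for ordinary subsetting.
    x[ind]
  }
}

divisors <- extractSketch(1:15, function(i) 15 %% i == 0)
byRegex <- extractSketch(list(xy = 1, zy = 2), "^z")
byType <- extractSketch(list(x = 1, y = ""), is.character)
```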
Examples
extract(1:15, ~ 15 %% . == 0)
extract(list(xy = 1, zy = 2), "^z")
extract(list(x = 1, z = 2), 1)
extract(list(x = 1, y = ""), is.character)
# Example: even numbers:
is.even <- function(x) (x %% 2) == 0
sum((1:10)[is.even(1:10)])
extract(1:10, ~ . %% 2 == 0) %>% sum
extract(1:10, is.even) %>% sum
# Example: factors of 15
extract(1:15, ~ 15 %% . == 0)
# Example: relative prime numbers
gcd <- function(a, b) {
  .gcd <- function(a, b) if (b == 0) a else Recall(b, a %% b)
  flatmap(a ~ b, .gcd)
}
extract(1:10, x ~ gcd(x, 10) == 1)
# Example: real prime numbers
isPrime <- function(n) {
  .isPrime <- function(n) {
    iter <- function(i) {
      if (i * i > n) TRUE
      else if (n %% i == 0 || n %% (i + 2) == 0) FALSE
      else Recall(i + 6)
    }
    if (n <= 1) FALSE
    else if (n <= 3) TRUE
    else if (n %% 2 == 0 || n %% 3 == 0) FALSE
    else iter(5)
  }
  flatmap(n, x ~ .isPrime(x))
}
extract(1:10, isPrime)
An implementation of map
Description
An implementation of map and flatmap. They support the use of formulas as syntactic sugar for anonymous functions.
Usage
map(x, f, ...)
## S4 method for signature 'ANY,formula'
map(x, f, ...)
## S4 method for signature 'atomic,'function''
map(x, f, ...)
## S4 method for signature 'list,'function''
map(x, f, p = function(x) TRUE, ...)
## S4 method for signature 'list,numericORcharacteORlogical'
map(x, f, ...)
## S4 method for signature 'MList,'function''
map(x, f, ..., simplify = FALSE)
## S4 method for signature 'formula,'function''
map(x, f, ...)
flatmap(x, f, ..., flatten = unlist)
## S4 method for signature 'ANY,formula'
flatmap(x, f, ..., flatten = unlist)
sac(x, f, by, ..., combine = bindRows)
## S4 method for signature 'data.frame,'function''
sac(x, f, by, ..., combine = bindRows)
## S4 method for signature 'ANY,formula'
sac(x, f, by, ..., combine = bindRows)
vmap(x, f, ..., .mc = min(length(x), detectCores()), .bar = "bar")
Arguments
| x | (vector | data.frame | formula) if x inherits from data.frame, a data.frame is returned. Use as.list if this is not what you want. When x is a formula it is interpreted to trigger a multivariate map. |
| f | (function | formula | character | logical | numeric) something which can be interpreted as a function. formula objects are coerced to a function. atomics are used for subsetting in each element of x. See the examples. |
... | further arguments passed to the apply function. |
| p | (function | formula) a predicate function indicating which columns in a data.frame to use in map. This is a filter for the map operation, the full data.frame is returned. |
| simplify | see SIMPLIFY in mapply |
flatten | (function | formula) a function used to flatten the results. |
| by | (e.g. character) argument is passed to extract to select columns. |
| combine | (function | formula) a function which knows how to combine the list of results. bindRows is the default. |
| .mc | (integer) the number of cores. Passed down to mclapply or mcmapply. |
| .bar | (character) see verboseApply. |
Details
map will dispatch to lapply. When x is a formula this is interpreted as a multivariate map; this is implemented using mapply. When x is a data.frame, map will iterate over columns; the return value, however, is a data.frame. p can be used to map over a subset of x.
flatmap will dispatch to map. The result is then wrapped by flatten, which is unlist by default.
sac is a naive implementation of split-apply-combine and is implemented using flatmap.
vmap is a 'verbose' version of map and provides a progress bar and a link to parallel map (mclapply).
map, flatmap, and sac can be extended; they are S4 generic functions. You do not need to, and should not, implement a new method for formulas. The existing formula method will coerce a formula into a function and pass it down to your map(newtype, function) method.
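The behaviour described in these details can be approximated with base R. The helpers mapColumns and sacSketch are hypothetical names; they are rough stand-ins built on lapply, mapply and split, with rbind assumed as the combine step instead of bindRows, and they are not the package's S4 methods.

```r
# Hypothetical helpers for illustration; not the implementation used by 'dat'.

# Map over the columns selected by the predicate p, but return the full
# data.frame with the selected columns replaced.
mapColumns <- function(df, f, p = function(x) TRUE) {
  use <- vapply(df, p, logical(1))
  df[use] <- lapply(df[use], f)
  df
}

# Multivariate map: iterate over several vectors in parallel with mapply.
pairSums <- mapply(function(x, y) x + y, 1:2, 3:4, SIMPLIFY = FALSE)

# Split-apply-combine: split by the grouping column(s), apply f to each
# piece, then combine the results (rbind is assumed here).
sacSketch <- function(df, f, by,
                      combine = function(parts) do.call(rbind, parts)) {
  combine(lapply(split(df, df[by], drop = TRUE), f))
}

mapped <- mapColumns(
  data.frame(y = 1:3, z = c("a", "b", "c")),
  function(x) x + 1,
  is.numeric
)
means <- sacSketch(
  data.frame(y = 1:10, z = 1:2),
  function(df) data.frame(my = mean(df$y)),
  "z"
)
```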
Examples
# Sugar for anonymous functions
map(data.frame(y = 1:10, z = 2), x ~ x + 1)
map(data.frame(y = 1:10, z = 2), x ~ x + 1, is.numeric)
map(data.frame(y = 1:10, z = 2), x ~ x + 1, x ~ all(x == 2))
sac(data.frame(y = 1:10, z = 1:2), df ~ data.frame(my = mean(df$y)), "z")
# Trigger a multivariate map with a formula
map(1:2 ~ 3:4, f(x, y) ~ x + y)
map(1:2 ~ 3:4, f(x, y) ~ x + y, simplify = TRUE)
map(1:2 ~ 3:4, f(x, y, z) ~ x + y + z, z = 1)
# Extracting values from lists
map(list(1:2, 3:4), 2)
map(list(1:3, 2:5), 2:3)
map(list(1:3, 2:5), c(TRUE, FALSE, TRUE))
# Some type checking along the way
map(as.numeric(1:2), numeric : x ~ x)
map(1:2, integer(1) : x ~ x)
map(1:2, numeric(1) : x ~ x + 0.5)
Tools for Data Frames
Description
mutar is literally the same function as [.DataFrame and can be used as an interface to dplyr or data.table. The other functions listed here are a convenience to mimic dplyr's syntax in an R CMD check friendly way. These functions can also be used with S4 data.frame(s) / data_frame(s) / data.table(s). They will always try to preserve the input class.
Usage
mutar(x, i, j, ..., by, sby, drop)
filtar(x, i)
sumar(x, ..., by)
withReference(expr)
Arguments
x | (DataFrame | data.frame) |
| i | (logical | numeric | integer | OneSidedFormula | TwoSidedFormula | FormulaList) see the examples. |
| j | (logical | character | TwoSidedFormula | FormulaList | function) a character beginning with '^' is interpreted as a regular expression |
... | arbitrary number of args |
| by | (character) variables to group by. by will be used to do transformations within groups. sby will collapse each group to one row. |
| sby | (character) variables to group by. by will be used to do transformations within groups. sby will collapse each group to one row. |
drop | (ignored) never drops the class. |
| expr | (expression) any R expression that should be evaluated using data.table's reference semantics on data transformations. |
Details
The real workhorse of this interface is mutar. All other functions exist to ease the transition from dplyr.
OneSidedFormula is always used for subsetting rows.
TwoSidedFormula is used instead of name-value expressions. Instead of writing x = 1 you simply write x ~ 1.
FormulaList can be used to repeat the same operation on different columns. See more details in FL.
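To make the x ~ 1 convention concrete, here is a base R sketch of a grouped transformation that keeps every row, which is the behaviour described for by. mutateByFormula is a hypothetical helper; it is an assumption about the idea and does not reflect how mutar delegates to dplyr or data.table.

```r
# Hypothetical helper for illustration; not part of 'dat'.
# The left-hand side of the formula names the column to create, the
# right-hand side is evaluated within each group; all rows are kept.
mutateByFormula <- function(df, f, by = NULL) {
  stopifnot(inherits(f, "formula"), length(f) == 3L)
  target <- as.character(f[[2L]])
  compute <- function(d) {
    # A length-one result is recycled to the number of rows in the group.
    d[[target]] <- eval(f[[3L]], envir = d, enclos = environment(f))
    d
  }
  if (is.null(by)) return(compute(df))
  out <- do.call(rbind, lapply(split(df, df[[by]]), compute))
  rownames(out) <- NULL  # rows come back ordered by group
  out
}

withMean <- mutateByFormula(airquality, meanWind ~ mean(Wind), by = "Month")
head(withMean)
```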
See Also
Examples
data("airquality")
airquality %>%
  filtar(~Month > 4) %>%
  mutar(meanWind ~ mean(Wind), by = "Month") %>%
  sumar(meanWind ~ mean(Wind), by = "Month") %>%
  extract("meanWind")
airquality %>%
  sumar(
    .n ~ mean(.n) | c("Wind", "Temp"),
    by = "Month"
  )
# Enable data.tables reference semantics with:
withReference({
  x <- data.table::data.table(x = 1)
  mutar(x, y ~ 2)
})
## Not run:
# Use dplyr as back-end:
options(dat.use.dplyr = TRUE)
x <- data.frame(x = 1)
mutar(x, y ~ dplyr::n())
## End(Not run)
Objects exported from other packages
Description
These objects are imported from other packages. Follow the links below to see their documentation.
- magrittr
Replace elements in a vector
Description
This function replaces elements in a vector. It is a link to replace as a generic function.
Usage
replace(x, ind, values, ...)
## S4 method for signature 'ANY,'function''
replace(x, ind, values, ...)
## S4 method for signature 'ANY,formula'
replace(x, ind, values, ...)
## S4 method for signature 'ANY,character'
replace(x, ind, values, ...)
Arguments
x | (atomic | list) a vector. |
ind | used as index for elements to be replaced. See details. |
values | the values used for replacement. |
... | arguments passed to |
Details
The idea is to provide a more flexible interface for the specification of the index. It can be a character, numeric, integer or logical which is then simply used in base::replace. It can be a regular expression, in which case x should be named: a character of length 1 with a leading "^" is interpreted as regex. When ind is a function (or formula) and x is a list then it should be a predicate function; see the examples. When x is an atomic the function is applied on x and the result is used for subsetting.
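The index handling described in these details can be sketched on top of base::replace. replaceSketch is a hypothetical name and the branching is an assumption about the idea, not the package's S4 methods; formulas and the removal of list elements via NULL are not covered.

```r
# Hypothetical helper for illustration; not the generic exported by 'dat'.
replaceSketch <- function(x, ind, values) {
  if (is.function(ind)) {
    # Lists need a predicate applied per element; atomics a vectorized function.
    ind <- if (is.list(x)) vapply(x, ind, logical(1)) else ind(x)
  } else if (is.character(ind) && length(ind) == 1L && startsWith(ind, "^")) {
    # A single string with a leading "^" is matched against the names of x.
    ind <- grepl(ind, names(x))
  }
  # Every other index is handed to base::replace unchanged.
  base::replace(x, ind, values)
}

noMissing <- replaceSketch(c(1, 2, NA), is.na, 0)
byRegex <- replaceSketch(list(x = 1, y = 2), "^x$", 0)
byName <- replaceSketch(list(x = 1, y = 2), "x", 0)
```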
Examples
replace(c(1, 2, NA), is.na, 0)
replace(c(1, 2, NA), rep(TRUE, 3), 0)
replace(c(1, 2, NA), 3, 0)
replace(list(x = 1, y = 2), "x", 0)
replace(list(x = 1, y = 2), "^x$", 0)
replace(list(x = 1, y = "a"), is.character, NULL)
Verbose apply function
Description
This apply function has a progress bar and enables computations in parallel. By default it is not verbose. For an interactive version with proper 'verbose' output by default, please use vmap.
Usage
verboseApply(x, f, ..., .mc = 1, .mapper = mclapply, .bar = "none")
Arguments
x | (vector) |
f | (function) |
... | arguments passed to |
.mc | (integer) the number of processes to start |
| .mapper | (function) the actual apply function used. Should have an argument |
| .bar | (character) one of 'none', '.' or 'bar' |
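The general idea of an apply function with a progress bar can be sketched sequentially with utils::txtProgressBar. applyWithBar is a hypothetical helper; the package itself imports 'progress' and can run in parallel, neither of which this sketch attempts.

```r
# Hypothetical helper for illustration; not the implementation of verboseApply.
applyWithBar <- function(x, f, ...) {
  if (length(x) == 0L) return(list())
  bar <- utils::txtProgressBar(min = 0, max = length(x), style = 3)
  on.exit(close(bar))
  out <- vector("list", length(x))
  for (i in seq_along(x)) {
    # Wrapping in list() keeps NULL results instead of dropping the slot.
    out[i] <- list(f(x[[i]], ...))
    utils::setTxtProgressBar(bar, i)
  }
  out
}

squares <- applyWithBar(1:3, function(i) i^2)
```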
Examples
## Not run:
verboseApply(
  1:4,
  function(...) Sys.sleep(1),
  .bar = "bar",
  .mc = 2
)
## End(Not run)