plyr 1.8.9

Fixes for R CMD check.

plyr 1.8.8

Update so R CMD check passes cleanly in future R-devel.

plyr 1.8.7

Update so R CMD check passes cleanly in future R-devel.

plyr 1.8.6

Update so R CMD check passes cleanly on R and R-devel.

plyr 1.8.5

Update so R CMD check passes cleanly on R and R-devel.

Version 1.8.4

Update so R CMD check passes cleanly on R and R-devel.

Version 1.8.3

Revert to C version of loop_apply() as the Rcpp version appears to be having PROTECTion problems. (Also fixes #256)

Version 1.8.2

Update for changes in R namespace best-practices.
New parameter .id to adply() that specifies the name(s) of the index column(s). (Thanks to Kirill Müller, #191)
Fix bug in split_indices() when n isn’t supplied.
Fix bug in .id parameter to ldply() and rdply() allowing for .id = NULL to work as described in the help. (Thanks to Doug Mitarotonda, #207, and Marek, #224 and #225)
Deprecate exotic functions liply() and isplit2(), remove unused and unexported functions dots() and parallel_fe() (Thanks to Kirill Müller, #242, #248)
Warn on duplicate names that cause certain array functions to fail. (Thanks to Kirill Müller, #211)
Parameter .inform is now honored for ?_ply() calls. (Thanks to Kirill Müller, #209)

Version 1.8.1

New parameter .id to ldply() and rdply() that specifies the name of the index column. (Thanks to Kirill Müller, #107, #140, #142)
The .id column in ldply() is generated as a factor to preserve the sort order, but only if the new .id parameter is set. (Thanks to Kirill Müller, #137)
rbind.fill now silently drops NULL inputs (#138)
rbind.fill avoids array copying which had produced quadratic time complexity. *dply of large numbers of groups should be faster. (Contributed by Peter Meilstrup)
rbind.fill handles non-numeric matrix columns (i.e. factor arrays, character arrays, list arrays); also arrays with more than 2 dimensions can be used. Dimnames of array columns are now preserved. (Contributed by Peter Meilstrup)
rbind.fill(x,y) converts factor columns of Y to character when columns of X are character. join(x,y) and match_df(x,y) now work when the key column in X is character and Y is factor. (Contributed by Peter Meilstrup)
Fix faulty array allocation which caused problems when using split_indices with large (> 2^24) vectors. (Fixes #131)
list_to_array() incorrectly determined dimensions if column of labels contained any missing values (#169).
r*ply expression is evaluated exactly .n times, evaluation results are consistent with side effects. (#158, thanks to Kirill Müller)
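
A minimal sketch of the .id parameter described above. The list `scores` and the column name "group" are made up for illustration; the sketch assumes plyr 1.8.1 or later is installed.

```r
library(plyr)

# Hypothetical named list; the names become the index column.
scores <- list(a = 1:3, b = 4:6)

# Setting .id names the index column and, per the notes above,
# stores it as a factor so the original order is preserved.
res <- ldply(scores, function(x) data.frame(total = sum(x)), .id = "group")

names(res)            # "group" "total"
is.factor(res$group)  # TRUE when .id is set explicitly
```

Leaving .id at its default keeps the older behaviour of a column called ".id".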

Version 1.8

New features and functions

**ply gain a .inform argument (previously only available in llply) - this gives more useful debugging information at the cost of some speed. (Thanks to Brian Diggs, #57)
if .dims = TRUE alply’s output gains dimensions and dimnames, similar to apply. Sequential indexing of a list produced by alply should be unaffected. (Peter Meilstrup)
colwise, numcolwise and catcolwise now all accept additional arguments in …. (Thanks to Stavros Macrakis, #62)
here makes it possible to use **ply + a function that uses non-standard evaluation (e.g. summarise, mutate, subset, arrange) inside a function. (Thanks to Peter Meilstrup, #3)
join_all recursively joins a list of data frames. (Fixes #29)
name_rows provides a convenient way of saving and then restoring row names so that you can preserve them if you need to. (#61)
progress_time (used with .progress = "time") estimates the amount of time remaining before the job is completed. (Thanks to Mike Lawrence, #78)
summarise now works iteratively so that later columns can refer to earlier. (Thanks to Jim Hester, #44)
take makes it easy to subset along an arbitrary dimension.
Improved documentation thanks to patches from Tim Bates.
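
The iterative behaviour of summarise noted above can be sketched as follows. The data frame `df` and its columns are invented for illustration; the second summary column reuses the first.

```r
library(plyr)

# Hypothetical data: two groups with a numeric column.
df <- data.frame(g = c("a", "a", "b"), x = c(1, 3, 10))

# `avg` refers to `total`, which was created earlier in the same call.
out <- ddply(df, "g", summarise,
             total = sum(x),
             avg   = total / length(x))
out
```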

Parallel plyr

**ply gains a .paropts argument, a list of options that is passed onto foreach for controlling parallel computation.
*_ply now accepts .parallel argument to enable parallel processing. (Fixes #60)
Progress bars are disabled when using parallel plyr (Fixes #32)

Performance improvements

a*ply: 25x speedup when indexing array objects, 3x speedup when indexing data frames. This should substantially reduce the overhead of using a*ply
d*ply subsetting has been considerably optimised: this will have a small impact unless you have a very large number of groups, in which case it will be considerably faster.
idata.frame: Subsetting immutable data frames with [.idf is now faster (Peter Meilstrup)
quickdf is around 20% faster
split_indices, which powers much internal splitting code (like vaggregate, join and d*ply) is about 2x faster. It was already incredibly fast (~0.2s for 1,000,000 obs), so this won’t have much impact on overall performance

Bug fixes

*aply functions now bind list mode results into a list-array (Peter Meilstrup)
*aply now accepts 0-dimension arrays as inputs. (#88)
count now works correctly for factor and Date inputs. (Fixes #130)
*dply now deals better with matrix results, converting them to data frames, rather than vectors. (Fixes #12)
d*ply will now preserve factor levels input if drop = FALSE (#81)
join works correctly when there are no common rows (Fixes #74), or when one input has no rows (Fixes #48). It also consistently orders the columns: common columns, then x cols, then y cols (Fixes #40).
quickdf correctly handles NA variable names. (Fixes #66. Thanks to Scott Kostyshak)
rbind.fill and rbind.fill.matrix work consistently with matrices and data frames with zero rows. Fixes #79. (Peter Meilstrup)
rbind.fill now stops if inputs are not data frames. (Fixes #51)
rbind.fill now works consistently with 0 column data frames
round_any now works with POSIXct objects, thanks to Jean-Olivier Irisson (#76)

Version 1.7.1

Fix bug in id, using numeric instead of integer

Version 1.7

rbind.fill: if a column contains both factors and characters (in different inputs), the resulting column will be coerced to character
When there are more than 2^31 distinct combinations, id switches to a slower fallback strategy using strings (inspired by merge) that guarantees correct results. This fixes problems with join when joining across many columns. (Fixes #63)
split_indices checks input more aggressively to prevent segfaults. Fixes #43.
fix small bug in loop_apply which led to segfaults in certain circumstances. (Thanks to Pål Westermark for patch)
itertools and iterators moved to suggests from imports so that plyr now only depends on base R.
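
To illustrate the rbind.fill coercion rule in the first item above, here is a small sketch. The data frames `a` and `b` are hypothetical; one stores x as a factor and the other as character.

```r
library(plyr)

# Hypothetical inputs: same column name, different storage types,
# plus one column that exists in only one input.
a <- data.frame(x = factor("u"), y = 1)
b <- data.frame(x = "v", z = 2, stringsAsFactors = FALSE)

out <- rbind.fill(a, b)

class(out$x)  # mixed factor/character input is coerced to character
out           # columns missing from an input are filled with NA
```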

Version 1.6

documentation improved using new features of roxygen2
fixed namespacing issue which led to lost labels when subsetting the results of *lply
colwise automatically strips off split variables.
rlply now correctly deals with rlply(4, NULL) (thanks to bug report from Eric Goldlust)
rbind.fill tries harder to keep attributes, retaining the attributes from the first occurrence of each column it finds. It also now works with variables of class POSIXlt and preserves the ordered status of factors.
arrange now works with one column data frames

Version 1.5.2

d*ply returns correct number of rows when function returns vector
fix NAMESPACE bug which was causing problems with ggplot2

Version 1.5.1

rbind.fill now treats 1d arrays in the same way as rbind (i.e. it turns them into ordinary vectors)
fix bug in rename when renaming multiple columns

Version 1.5 (2011-03-02)

New features

new strip_splits function removes splitting variables from the data frames returned by ddply.
rename moved in from reshape, and rewritten.
new match_df function makes it easy to subset a data frame to only contain values matching another data frame. Inspired by http://stackoverflow.com/questions/4693849.
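
A short sketch of match_df and rename as described above. The data frames `x` and `keep` and the new column name "value" are assumptions made for the example.

```r
library(plyr)

# Hypothetical data: keep only the rows of x whose key appears in `keep`.
x    <- data.frame(k = c(1, 2, 3), v = c("a", "b", "c"))
keep <- data.frame(k = c(1, 3))

matched <- match_df(x, keep, on = "k")

# rename takes a named character vector of the form c(old = "new").
renamed <- rename(matched, c(v = "value"))
renamed
```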

Bug fixes

**ply now works when passed a list of functions
*dply now correctly names output even when some output combinations are missing (NULL) (Thanks to bug report from Karl Ove Hufthammer)
*dply preserves the class of many more object types.
a*ply now correctly works with zero length margins, operating on the entire object (Thanks to bug report from Stavros Macrakis)
join now implements joins in a more SQL like way, returning all possible matches, not just the first one. It is still a (little) faster than merge. The previous behaviour is accessible with match = "first".
join is now more symmetric so that join(x, y, "left") is closer to join(y, x, "right"), modulo column ordering
named.quoted failed when quoted expressions were longer than 50 characters. (Thanks to bug report from Eric Goldlust)
rbind.fill now correctly maintains POSIXct tzone attributes and preserves missing factor levels
split_labels correctly preserves empty factor levels, which means that drop = FALSE should work in more places. Use base::droplevels to remove levels that don’t occur in the data, and drop = T to remove combinations of levels that don’t occur.
vaggregate now passes ... to the aggregation function when working out the output type (thanks to bug report by Pavan Racherla)
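
The change in join matching described above can be sketched like this. The data frames `x` and `y` are invented, with a deliberately duplicated key in `y`.

```r
library(plyr)

# Hypothetical inputs: key 1 appears twice in y, key 2 not at all.
x <- data.frame(k = c(1, 2), a = c("p", "q"))
y <- data.frame(k = c(1, 1, 3), b = c("r", "s", "t"))

# Default (left join, all matches): the x row with k = 1 is repeated.
all_matches <- join(x, y, by = "k")

# Previous behaviour: only the first match in y is used per x row.
first_match <- join(x, y, by = "k", match = "first")

nrow(all_matches)
nrow(first_match)
```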

Version 1.4.1 (2011-04-05)

Add citation to JSS article

Version 1.4 (2011-01-03)

count now takes an additional parameter wt_var which allows you to compute weighted sums. This is as fast, or faster than, tapply or xtabs.
Really fix bug in names.quoted
. now captures the environment in which it was evaluated. This should fix an esoteric class of bugs which probably no-one ever encountered, but will form the basis for an improved version of ggplot2::aes.
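
As a rough illustration of the wt_var parameter of count mentioned above (the data frame `df` and its weight column `w` are made up):

```r
library(plyr)

# Hypothetical data: a grouping column and a weight column.
df <- data.frame(g = c("a", "a", "b"), w = c(2, 3, 5))

# Plain counts: number of rows per group.
plain <- count(df, "g")

# Weighted counts: sum of w per group.
weighted <- count(df, "g", wt_var = "w")

plain$freq
weighted$freq
```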

Version 1.3.1 (2010-12-30)

Fix bug in names.quoted that interfered with ggplot2

Version 1.3 (2010-12-28)

New features

new function mutate that works like transform to add new columns or overwrite existing columns, but computes new columns iteratively so later transformations can use columns created by earlier transformations. (It’s also about 10x faster) (Fixes #21)
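
A minimal sketch of the iterative evaluation in mutate, using an invented data frame `df`. The call is namespaced in case another attached package also defines mutate.

```r
library(plyr)

df <- data.frame(x = 1:3)

# z is computed from y, which is created earlier in the same call.
# transform() would fail here because y does not exist in df yet.
out <- plyr::mutate(df, y = x * 2, z = y + 1)
out
```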

Bug fixes

split column names are no longer coerced to valid R names.
quickdf now adds names if missing
summarise preserves variable names if explicit names not provided (Fixes #17)
arrays with names should be sorted correctly once again (also fixed a bug in the test case that prevented me from catching this automatically)
m_ply no longer possesses .parallel argument (mistakenly added)
ldply (and hence adply and ddply) now correctly passes on .parallel argument (Fixes #16)
id uses a better strategy for converting to integers, making it possible to use for cases with larger potential numbers of combinations

Version 1.2.1 (2010-09-10)

Fix bug in llply fast path that causes problems with ggplot2.

Version 1.2 (2010-09-09)

New features

l*ply, d*ply, a*ply and m*ply all gain a .parallel argument that when TRUE, applies functions in parallel using a parallel backend registered with the foreach package:

    x <- seq_len(20)
    wait <- function(i) Sys.sleep(0.1)
    system.time(llply(x, wait))
    #  user  system elapsed
    # 0.007   0.005   2.005

    doParallel::registerDoParallel(2)
    system.time(llply(x, wait, .parallel = TRUE))
    #  user  system elapsed
    # 0.020   0.011   1.038

This work has been generously supported by BD (Becton Dickinson).

Minor changes

aply and mply gain an .expand argument that controls whether data frames produce a single output dimension (one element for each row), or an output dimension for each variable.
new vaggregate (vector aggregate) function, which is equivalent to tapply, but much faster (~ 10x), since it avoids copying the data.
llply: for simple lists and vectors, with no progress bar, no extra info, and no parallelisation, llply calls lapply directly to avoid all the overhead associated with those unused extra features.
llply: in serial case, for loop replaced with custom C function that takes about 40% less time (or about 20% less time than lapply). Note that as a whole, llply still has much more overhead than lapply.
round_any now lives in plyr instead of reshape

Bug fixes

list_to_array works correctly even when there are missing values in the array. This is particularly important for daply.

Version 1.1 (2010-07-19)

*dply deals more gracefully with the case when all results are NULL (fixes #10)
*aply correctly orders output regardless of dimension names (fixes #11)
join gains type = "full" which preserves all x and y rows

Version 1.0 (2010-07-02)

New functions

arrange, a new helper method for reordering a data frame.
count, a version of table that returns data frames immediately and that is much much faster for high-dimensional data.
desc makes it easy to sort any vector in descending order
join, works like merge but can be much faster and has a somewhat simpler syntax drawing from SQL terminology
rbind.fill.matrix is like rbind.fill but works for matrices, code contributed by C. Beleites
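
A brief sketch of arrange together with desc from the list above. The data frame `df` is hypothetical, and the calls are namespaced in case another attached package defines the same names.

```r
library(plyr)

df <- data.frame(g = c("b", "a", "b"), x = c(1, 2, 3))

# Sort by g ascending, then by x descending within each g.
out <- plyr::arrange(df, g, plyr::desc(x))
out
```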

Speed improvements

experimental immutable data frame (idata.frame) that vastly speeds up subsetting - for large datasets with large numbers of groups, this can yield 10-fold speed ups. See examples in ?idata.frame to see how to use it.
rbind.fill rewritten again to increase speed and work with more data types
d*ply now much faster with nested groups
This work has been generously supported by BD (Becton Dickinson).

Version 0.2

New features:

d*ply now accepts NULL for splitting variables, indicating that the data should not be split
plyr no longer exports internal functions, many of which were causing clashes with other packages
rbind.fill now works with data frame columns that are lists or matrices
test suite ensures that plyr behaviour is correct and will remain correct as I make future improvements.

Bug fixes:

**ply: if zero splits, empty list(), data.frame() or logical() returned, as appropriate for the output type
**ply: leaving .fun as NULL now always returns list (thanks to Stavros Macrakis for the bug report)
a*ply: labels now respect options(stringsAsFactors)
each: scoping bug fixed, thanks to Yasuhisa Yoshida for the bug report
list_to_dataframe is more consistent when processing a single data frame
NAs preserved in more places
progress bars: guaranteed to terminate even if **ply prematurely terminates
progress bars: misspelling gives informative warning, instead of uninformative error
splitter_d: fixed ordering bug when .drop = FALSE

Version 0.1.9 (2009-06-23)

fix bug in rbind.fill when NULLs present in list
improve each to recognise when all elements are numeric
fix labelling bug in d*ply when .drop = FALSE
additional methods for quoted objects
add summarise helper - this function is like transform, but creates a new data frame rather than reusing the old (thanks to Brendan O’Connor for the neat idea)

Version 0.1.8 (2009-04-20)

made rbind a little faster (~20%) using an idea from Richard Raubertas
daply now works correctly when splitting variables that contain empty factor levels

Version 0.1.7 (2009-04-15)

Version of rbind.fill that copies attributes.

Version 0.1.6 (2009-04-15)

Improvements:

all ply functions deal more elegantly when given function names: can supply a vector of function names, and name is used as label in output
failwith and each now work with function names as well as functions (i.e. "nrow" instead of nrow)
each now accepts a list of functions or a vector of function names
l*ply will use list names where present
if .inform is TRUE, error messages will give you information about where errors occur within your data - hopefully this will make problems easier to track down
d*ply no longer converts splitting variables to factors when drop = T (thanks to bug report from Charlotte Wickham)
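
The function-name support in each described above might be used as in the following sketch. The input vector is arbitrary.

```r
library(plyr)

# Functions passed directly: the result is labelled by function name.
by_fun <- each(min, max)(c(3, 1, 2))

# The same functions passed by name, as character strings.
by_name <- each("min", "max")(c(3, 1, 2))

by_fun
by_name
```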

Speed-ups

massive speed ups for splitting large arrays
fixed typo that was causing a 50% speed penalty for d*ply
rewritten rbind.fill is considerably (> 4x) faster for many data frames
colwise about twice as fast

Bug fixes:

daply: now works when the data frame is split by multiple variables
aaply: now works with vectors
ddply: first variable now varies slowest as you’d expect

Version 0.1.5 (2009-02-23)

colwise now accepts a quoted list as its second argument. This allows you to specify the names of columns to work on: colwise(mean, .(lat, long))
d_ply and a_ply now correctly pass … to the function
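
The colwise(mean, .(lat, long)) call mentioned above can be tried on a small made-up data frame. The column `id` is included only to show that unselected columns are ignored.

```r
library(plyr)

# Hypothetical data with two numeric columns and one label column.
df <- data.frame(lat = c(10, 20), long = c(30, 50), id = c("a", "b"))

# Build a function that applies mean to the lat and long columns only.
mean_cols <- colwise(mean, .(lat, long))

out <- mean_cols(df)
out
```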

Version 0.1.4 (2008-12-12)

Greatly improved speed (> 10x faster) and memory usage (50%) for splitting data frames with many combinations
Splitting variables containing missing values now handled consistently

Version 0.1.3 (2008-11-19)

Fixed problem where, when splitting by a variable that contained missing values, missing combinations would be dropped, and labels wouldn’t match up

Version 0.1.2 (2008-11-18)

a*ply now works correctly with array-lists
drop. -> .drop
r*ply now works with …
use inherits instead of is so the methods package doesn’t need to be loaded
fix bug with using formulas

Version 0.1.1 (2008-10-08)

argument names now start with . (instead of ending with it) - this should prevent name clashes with arguments of the called function
return informative error if .fun is not a function
use full names in all internal calls to avoid argument name clashes
