R's data.table package extends data.frame:
Rdatatable/data.table
data.table provides a high-performance version of base R's data.frame with syntax and feature enhancements for ease of use, convenience and programming speed.
The data.table project uses a custom governance agreement and is fiscally sponsored by NumFOCUS. Consider making a tax-deductible donation to help the project pay for developer time, professional services, travel, workshops, and a variety of other needs.
- concise syntax: fast to type, fast to read
- fast speed
- memory efficient
- careful API lifecycle management
- community
- feature rich
- fast and friendly delimited file reader: `?fread`, see also convenience features for small data
- fast and feature rich delimited file writer: `?fwrite`
- low-level parallelism: many common operations are internally parallelized to use multiple CPU threads
- fast and scalable aggregations; e.g. 100GB in RAM (see benchmarks on up to two billion rows)
- fast and feature rich joins: ordered joins (e.g. rolling forwards, backwards, nearest and limited staleness), overlapping range joins (similar to `IRanges::findOverlaps`), non-equi joins (i.e. joins using operators `>, >=, <, <=`), aggregate on join (`by=.EACHI`), update on join
- fast add/update/delete columns by reference by group using no copies at all
- fast and feature rich reshaping data: `?dcast` (pivot/wider/spread) and `?melt` (unpivot/longer/gather)
- any R function from any R package can be used in queries not just the subset of functions made available by a database backend, also columns of type `list` are supported
- has no dependencies at all other than base R itself, for simpler production/maintenance
- the R dependency is as old as possible for as long as possible, dated April 2014, and we continuously test against that version; e.g. v1.11.0 released on 5 May 2018 bumped the dependency up from 5 year old R 3.0.0 to 4 year old R 3.1.0
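A few of the features listed above can be sketched on toy data. This is a minimal illustration, not taken from the package documentation: the tables `trades` and `quotes`, their columns, and the temporary file are all made up for the example.

```r
library(data.table)

# Hypothetical toy data, invented for this sketch
trades = data.table(sym = c("A", "A", "B"), time = c(3L, 8L, 5L), qty = c(10L, 20L, 30L))
quotes = data.table(sym = c("A", "A", "B"), time = c(1L, 6L, 4L), px = c(100, 101, 50))

# number of threads data.table is currently allowed to use
n_threads = getDTthreads()

# fwrite/fread round trip through a temporary CSV file
tmp = tempfile(fileext = ".csv")
fwrite(trades, tmp)
trades2 = fread(tmp)
unlink(tmp)

# add a column by reference, computed by group
trades[, total_qty := sum(qty), by = sym]

# rolling join: for each trade, the most recent quote at or before the trade time
rolled = quotes[trades, on = .(sym, time), roll = TRUE]

# non-equi join with aggregate on join: count quotes strictly before each trade
n_before = quotes[trades, on = .(sym, time < time), .N, by = .EACHI]

# reshape: wide to long with melt, then back to wide with dcast
long = melt(trades, id.vars = c("sym", "time"), measure.vars = c("qty", "total_qty"))
wide = dcast(long, sym + time ~ variable, value.var = "value")
```

In the `on=` expressions the left-hand side refers to a column of `quotes` and the right-hand side to a column of `trades`.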
```r
install.packages("data.table")

# latest development version (only if newer available)
data.table::update_dev_pkg()

# latest development version (force install)
install.packages("data.table", repos="https://rdatatable.gitlab.io/data.table")
```
See the Installation wiki for more details.
Use the data.table subset `[` operator the same way you would use the data.frame one, but...
- no need to prefix each column with `DT$` (like `subset()` and `with()` but built-in)
- any R expression using any package is allowed in the `j` argument, not just a list of columns
- extra argument `by` to compute the `j` expression by group
```r
library(data.table)
DT = as.data.table(iris)

# FROM[WHERE, SELECT, GROUP BY]
# DT  [i, j, by]

DT[Petal.Width > 1.0, mean(Petal.Length), by = Species]
#       Species       V1
#1: versicolor 4.362791
#2:  virginica 5.552000
```
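As a further sketch of the same `DT[i, j, by]` form, `j` can return several named columns at once, or hold an arbitrary R expression such as a model fit. The result names `res` and `slopes` and the summary column names are chosen here for illustration only.

```r
library(data.table)
DT = as.data.table(iris)

# several named summaries per group; .N is the number of rows in each group
res = DT[Petal.Width > 1.0,
         .(n = .N, mean_len = mean(Petal.Length), max_len = max(Petal.Length)),
         by = Species]

# j can be any R expression, e.g. one regression slope per group
slopes = DT[, .(slope = coef(lm(Petal.Length ~ Petal.Width))[2]), by = Species]
```

Wrapping `j` in `.()` (an alias for `list()`) is what lets the result columns carry names instead of the default `V1`.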
- Introduction to data.table vignette
- Getting started wiki page
- Examples produced by `example(data.table)`
data.table is widely used by the R community. It is being directly used by hundreds of CRAN and Bioconductor packages, and indirectly by thousands. It is one of the top most starred R packages on GitHub, and was highly rated by the Depsy project. If you need help, the data.table community is active on StackOverflow.
A list of packages that significantly support, extend, or make use of data.table can be found in the Seal of Approval document.
- click the Watch button at the top and right of the GitHub project page
- read the NEWS file
- follow #rdatatable and the r_data_table account on X/Twitter
- follow #rdatatable and the r_data_table account on fosstodon
- follow the data.table community page on LinkedIn
- watch recent Presentations
- read recent Articles
- read posts on The Raft
Guidelines for filing issues / pull requests: Contribution Guidelines.


