Simplified R caching for reproducible big data projects


databio/simpleCache


simpleCache is an R package providing functions for caching R objects. Its purpose is to encourage writing reusable, restartable, and reproducible analysis pipelines for projects with massive data and computational requirements.

As its name indicates, simpleCache is intended to be simple. You choose a location to store your caches, and then provide the function with nothing more than a cache name and instructions (R code) for how to produce the R object. While simple, simpleCache also provides some advanced options like environment assignments, recreating caches, reloading caches, and even cluster compute bindings (using the batchtools package), making it flexible enough for use in large-scale data analysis projects.


Installing simpleCache

simpleCache is on CRAN and can be installed as usual:

install.packages("simpleCache")

Running simpleCache

simpleCache comes with a single primary function (simpleCache()) that will do almost everything you need. In short, you run it with a few lines like this:

    library(simpleCache)
    # Store caches in a temporary directory for this session
    setCacheDir(tempdir())
    # recreate=TRUE rebuilds the cache even if one already exists
    simpleCache("normSample", { rnorm(1e7, 0, 1) }, recreate=TRUE)
    # Without recreate, an existing cache is reused instead of recomputed
    simpleCache("normSample", { rnorm(1e7, 0, 1) })

simpleCache also interfaces with the batchtools package to let you build caches on any cluster resource manager.


Highlights of exported functions

  • simpleCache(): Creates and caches or reloads cached results of provided R instruction code
  • listCaches(): Lists all of the caches available in the cacheDir
  • deleteCaches(): Deletes cache(s) from the cacheDir
  • setCacheDir(): Sets a global option for a cache directory so you don't have to specify one in each simpleCache call
  • simpleCacheOptions(): Views all of the simpleCache global options that have been set
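
As a rough illustration of how these functions fit together, the sketch below sets a cache directory, builds one small cache, and then inspects the directory and the global options. The cache name "demoNumbers" is made up for this example, and the calls to listCaches() and simpleCacheOptions() assume both can be run without arguments; deleteCaches() is left out because its arguments are not described here.

```r
library(simpleCache)

# Use a throwaway directory for this demonstration
setCacheDir(tempdir())

# Build (or rebuild) a small cache; "demoNumbers" is an illustrative name
simpleCache("demoNumbers", { seq_len(5) }, recreate=TRUE)

# Inspect what is stored in the cache directory and which options are set
# (assumption: both functions work with their default arguments)
listCaches()
simpleCacheOptions()
```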

simpleCache Philosophy

The use case I had in mind for simpleCache is that you find yourself constantly recalculating the same R object in several different scripts, or repeatedly in the same script, every time you open it and want to continue that project. simpleCache is well-suited for interactive analysis, allowing you to pick up right where you left off in a new R session without having to recalculate everything. It is equally useful in automated pipelines, where separate scripts may benefit from loading, instead of recalculating, the same R objects produced by other scripts.

R provides some base functions (save, serialize, and load) to let you save and reload such objects, but these low-level functions are a bit cumbersome. simpleCache simply provides a convenient, user-friendly interface to these functions, streamlining the process. For example, a single simpleCache call will check for a cache and load it if it exists, or create it if it does not. With the base R save and load functions, you can't just write a single function call and then run the same thing every time you start the script; even this simple use case requires additional logic to check for an existing cache. simpleCache just does all this for you.
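
To make the "additional logic" concrete, here is a minimal base R sketch of the check-then-load-or-create pattern described above. The helper loadOrCreate(), the ".RData" file naming, and the demo directory are assumptions made for illustration; this is not how simpleCache itself is implemented.

```r
# Hypothetical helper (not part of simpleCache): load a cached object if its
# file exists, otherwise build it and save it for next time.
loadOrCreate <- function(cacheName, build, cacheDir = tempdir()) {
  cacheFile <- file.path(cacheDir, paste0(cacheName, ".RData"))
  if (file.exists(cacheFile)) {
    # load() restores the object under the name it was saved with ("ret")
    load(cacheFile)
  } else {
    ret <- build()
    save(ret, file = cacheFile)
  }
  ret
}

# Illustrative cache location inside the session's temporary directory
cacheDir <- file.path(tempdir(), "manualCacheDemo")
dir.create(cacheDir, showWarnings = FALSE)

# The first call computes and saves; the second finds the file and loads it
first  <- loadOrCreate("normSample", function() rnorm(10), cacheDir)
second <- loadOrCreate("normSample", function() rnorm(10), cacheDir)
identical(first, second)
```

Every script that wants this behavior with plain save and load has to carry a helper like this around, which is the bookkeeping a single simpleCache call replaces.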

The thing to keep in mind with simpleCache is that the cache name is paramount. simpleCache assumes that your name for an object is a perfect identifier for that object; in other words, don't cache things that you plan to change.
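
A small sketch of what this means in practice: if the instruction changes but the cache name stays the same, the existing cache is reused and the new code is not run. The name "smallSample" is invented for this example, and the sketch assumes simpleCache makes the cached object available as a variable with the same name as the cache.

```r
library(simpleCache)
setCacheDir(tempdir())

# Create the cache from the first version of the instruction
simpleCache("smallSample", { c(1, 2, 3) }, recreate=TRUE)

# The instruction has changed, but the name has not, so the existing
# cache is reused and the new code is not evaluated
simpleCache("smallSample", { c(10, 20, 30) })
staleValue <- smallSample

# Rebuilding has to be requested explicitly with recreate=TRUE
simpleCache("smallSample", { c(10, 20, 30) }, recreate=TRUE)
freshValue <- smallSample
```

If an object is likely to change, giving each variant its own cache name avoids silently reusing a stale result.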

Contributing

simpleCache is licensed under the 2-Clause BSD License. Questions, feature requests and bug reports are welcome via the issue queue. The maintainer will review pull requests and incorporate contributions at his discretion.

For more information refer to the contributing document and pull request / issue templates in the .github folder of this repository.
