MarcellGranat/currr

currr

Overview

A long journey is best broken into small steps, and the importance of taking a rest must never be underestimated.

The currr package is a wrapper for the purrr::map() family that extends the iteration process with a certain number of checkpoints (currr = checkpoints + purrr), where the evaluated results are saved, so we can always restart from there.

It implements the family of map() functions with frequent saving of the intermediate results. The functions let you start the evaluation of the iterations where you stopped (reading the already evaluated ones from the cache), and work with the currently evaluated iterations while the remaining ones are running in a background job. Parallel computing is also easier with the workers parameter.

Installation

    install.packages("currr")

Usage

The following example uses currr to present an everyday issue: you run a time-demanding iteration and later need to run it again.

    library(tidyverse)
    library(currr)

    # folder in your wd, where to save cache data
    options(currr.folder = ".currr", currr.wait = Inf)

    avg_n <- function(.data, .col, x) {
      # meaningless function that takes about 1 sec
      Sys.sleep(1)

      .data |>
        dplyr::pull({{ .col }}) |>
        (\(m) mean(m) * x) ()
    }

Checkpoints

    tictoc::tic(msg = "First evaluation")

    cp_map(.x = 1:50, .f = avg_n, .data = iris, .col = Sepal.Length, name = "iris_mean") |>
      head(3)
    #> [[1]]
    #> [1] 5.843333
    #>
    #> [[2]]
    #> [1] 11.68667
    #>
    #> [[3]]
    #> [1] 17.53

    tictoc::toc() # ~ 1:50 => 50 x 1 sec
    #> First evaluation: 50.351 sec elapsed

    tictoc::tic(msg = "Second evaluation")

    cp_map(.x = 1:50, .f = avg_n, .data = iris, .col = Sepal.Length, name = "iris_mean") |>
      head(3)
    #> ✓ Everything is unchanged. Reading cache.
    #> [[1]]
    #> [1] 5.843333
    #>
    #> [[2]]
    #> [1] 11.68667
    #>
    #> [[3]]
    #> [1] 17.53

    tictoc::toc() # ~ 0 sec
    #> Second evaluation: 0.034 sec elapsed

If the .x input and .f are the same, then the second time you call the function, it reads the outcome from the specified folder (.currr). If .x changes but some of its elements remain the same, then the results for those elements are taken from the previously saved results, and only the new elements of .x are evaluated. (If .f changes, then the process starts from zero.)

    tictoc::tic(msg = "Partly modification")

    cp_map(.x = 20:60, .f = avg_n, .data = iris, .col = Sepal.Length, name = "iris_mean") |>
      head(3)
    #> ⚠ .x has changed. Looking for mathcing result to save them as cache
    #> ◌ Cache updated based on the new .x values
    #> [[1]]
    #> [1] 116.8667
    #>
    #> [[2]]
    #> [1] 122.71
    #>
    #> [[3]]
    #> [1] 128.5533

    tictoc::toc() # ~ 51:60 => 10 x 1 sec
    #> Partly modification: 10.378 sec elapsed
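The reuse of earlier results when .x only partly changes can be illustrated with a minimal base R sketch. This is not how currr is implemented: `cached_map()` and `slow_double()` are hypothetical names, the cache is an in-memory list rather than files in the .currr folder, and the sketch does not check whether .f has changed.

```r
# Hypothetical helper, not part of currr: an in-memory, element-wise cache.
cached_map <- function(.x, .f, cache = list()) {
  keys <- as.character(.x)
  # Elements whose result is not yet stored in the cache
  is_new <- !(keys %in% names(cache))
  new_keys <- keys[is_new]
  # Evaluate only the new elements and store them under their key
  cache[new_keys] <- lapply(.x[is_new], .f)
  list(
    result = unname(cache[keys]),   # results in the order of .x
    cache = cache,                  # updated cache to pass to the next call
    n_evaluated = length(new_keys)  # how many elements were actually computed
  )
}

slow_double <- function(x) x * 2  # stands in for a time-demanding function

first <- cached_map(1:50, slow_double)
second <- cached_map(20:60, slow_double, cache = first$cache)

first$n_evaluated   # 50
second$n_evaluated  # 10, only the elements 51:60 are new
```

Keying the cache by element value is the design choice that lets the overlapping part of .x (here 20:50) be served without calling the function again.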

You can remove the cache files if you want to reset the process (or remove files you no longer need from your folder).

    # only cache files for iris_mean
    remove_currr_cache("iris_mean")

    # all cache files
    remove_currr_cache()

Parallel process

You can also use multicore processing (built on the parallel package). After evaluation, the computation automatically resets to sequential.

    options(currr.workers = 5)

    tictoc::tic(msg = "Parallel computation")

    cp_map(.x = 1:50, .f = avg_n, .data = iris, .col = Sepal.Length, name = "iris_mean") |>
      head(3)
    #> [[1]]
    #> [1] 5.843333
    #>
    #> [[2]]
    #> [1] 11.68667
    #>
    #> [[3]]
    #> [1] 17.53

    tictoc::toc() # ~ 50 / 5 => 10 sec
    #> Parallel computation: 21.159 sec elapsed
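For readers unfamiliar with the parallel package mentioned above, the following sketch shows the general pattern of starting workers, distributing iterations, and shutting the workers down again. It only illustrates the base R tooling; how currr uses the package internally is not shown here, and `slow_square()` is a hypothetical stand-in for a time-demanding function.

```r
library(parallel)

# Hypothetical stand-in for a time-demanding function
slow_square <- function(x) {
  Sys.sleep(0.1)
  x^2
}

cl <- makeCluster(2)                    # start two worker processes
res <- parLapply(cl, 1:6, slow_square)  # split the iterations across the workers
stopCluster(cl)                         # release the workers again

unlist(res)
```

Starting the workers has its own overhead, which is one plausible reason why a measured run time can exceed the idealised "iterations / workers" estimate.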

Background process

This is another useful feature of currr. Working in RStudio, you can use the wait parameter to define how many iterations you want to wait for, and then let R work on the remaining iterations in the background while you work with the evaluated ones. A value of 1 or more is interpreted as a number of iterations; if wait < 1, it is interpreted as the proportion of the iterations you want to wait for. Whenever you call the function again, it returns the already evaluated iterations (use the fill parameter to specify whether you want to get NULLs for the pending ones).

    options(currr.wait = 20, currr.fill = FALSE)

In the example above, you get your results when 20 iterations have been evaluated, but the job in the background keeps running.
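The two readings of wait (a count or a proportion) can be made concrete with a small sketch. `n_to_wait()` is a hypothetical helper, not a currr function; rounding a proportion up and capping a count at the number of iterations are assumptions made for illustration.

```r
# Hypothetical helper, not part of currr: how many iterations to wait for
n_to_wait <- function(wait, n) {
  stopifnot(wait >= 0, n >= 0)
  if (wait < 1) {
    ceiling(wait * n)  # proportion of the iterations; rounding up is an assumption
  } else {
    min(wait, n)       # number of iterations, capped at n (so Inf means "wait for all")
  }
}

n_to_wait(20, 50)   # 20
n_to_wait(0.5, 50)  # 25
n_to_wait(Inf, 50)  # 50
n_to_wait(0, 50)    # 0
```

Under this reading, the `currr.wait = Inf` setting used in the Usage section means the call returns only when every iteration has finished.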

About

marcellgranat.com/currr
