- Notifications
You must be signed in to change notification settings - Fork2
Alternative caching backends for `{memoise}` & `{shiny}`.
License
Unknown, MIT licenses found
Licenses found
ThinkR-open/bank
Folders and files
Name | Name | Last commit message | Last commit date | |
---|---|---|---|---|
Repository files navigation
The goal of{bank}
is to provide alternative backends for caching with{memoise}
&{shiny}
.
# install.packages("remotes")remotes::install_github("thinkr-open/bank")
You’re reading the doc about version : 0.0.0.9002
This README has been compiled on the
Sys.time()#> [1] "2023-03-27 14:58:39 CEST"
Here are the test & coverage results :
devtools::check(quiet=TRUE)#> ℹ Loading bank#> ── R CMD check results ──────────────────────────────────── bank 0.0.0.9002 ────#> Duration: 40.3s#>#> 0 errors ✔ | 0 warnings ✔ | 0 notes ✔
covr::package_coverage()#> bank Coverage: 68.00%#> R/postgres.R: 60.87%#> R/redis.R: 66.67%#> R/mongo.R: 81.01%
When using{bank}
backends with{shiny}
, caching will done at theapp-level, in other words the cache is stored across sessions. Be awareof this behavior if you have sensitive data inside your app, as thismight imply data leakage.
See?shiny::bindCache
With an app-level cache scope, one user can benefit from the work donefor another user’s session. In most cases, this is the best way to getperformance improvements from caching. However, in some cases, thiscould leak information between sessions. For example, if the cache keydoes not fully encompass the inputs used by the value, then data couldleak between the sessions. Or if a user sees that a cached reactivereturns its value very quickly, they may be able to infer that someoneelse has already used it with the same values.
As with any{cachem}
compatible objects, the cache can be manuallyflushed using the$reset()
method – this will calldrop()
onMongoDb,FLUSHALL
in Redis, &DBI::dbRemoveTable()
+DBI::dbCreateTable()
with Postgres.
library(bank)mongo_cache<-cache_mongo$new(db="bank",url="mongodb://localhost:27066",prefix="sn")mongo_cache$reset()
As{bank}
relies on external backends, it’s probably better to let theDBMS handle the flushing of cache. For example, inredis.conf
, you canset :
maxmemory 2mbmaxmemory-policy allkeys-lru
LRU (least recently used) will allow redis to flush the key based onwhen they were used. Seehttps://redis.io/topics/lru-cache.
MongoDB doesn’t come with a LRU mechanism, but you can set data to beephemeral withTTLindex inside yourcollection.
{bank}
also tries to help with that by updating alastAccessed
datemetadata field in Mongo whenever you$get()
the key, meaning that youcan implement your own caching strategy to evict least recently usedcached objects.
Postgrebytea
column can only store up to 1GB elements, so you can’twrite a cache that’s > 1GB.
Note that if you want to use{bank}
in a{shiny}
app:
renderCachedPlot()
require{shiny}
version 1.5.0 or higherbindCache()
require{shiny}
version 1.6.0 or higher
For now, the following backends are supported:
Launching a container with mongo.
docker run --rm --name mongobank -d -p 27066:27017 -e MONGO_INITDB_ROOT_USERNAME=bebop -e MONGO_INITDB_ROOT_PASSWORD=aloula mongo:4
First, thecache_mongo
can be used
library(memoise)library(bank)# Create a mongo cache.# The arguments will be passed to mongo::gridfsmongo_cache<-cache_mongo$new(db="bank",url="mongodb://bebop:aloula@localhost:27066",prefix="sn")#> Loading required namespace: mongolitef<-function(x) { sample(1:1000,x)}mf<- memoise(f,cache=mongo_cache)mf(5)#> [1] 31 609 671 766 219mf(5)#> [1] 31 609 671 766 219
Here is a first simple application that shows you the basics :
library(shiny)ui<- fluidPage(# Creating a slider input that will be used as a cache key sliderInput("nrow","NROW",1,32,32),# Plotting a piece of mtcars plotOutput("plot"))server<-function(input,output,session) {output$plot<- renderCachedPlot( {# Pretending this takes a long time Sys.sleep(2) plot(mtcars[1:input$nrow, ]) },cacheKeyExpr=list(# Defining the cache keyinput$nrow ),# Using our mongo cachecache=mongo_cache )}shinyApp(ui,server)
As you can see, the first time you set the slider to a given value, ittakes a little bit to compute. Then it’s almost instantaneous.
Let’s try a more complex application:
# We'll put everything in a function so that it can later be reused with other backendslibrary(magrittr)generate_app<-function(cache_object) {ui<- fluidPage( h1( sprintf("Caching in an external DB using %s", deparse( substitute(cache_object) ) ) ), sidebarLayout(sidebarPanel= sidebarPanel(# This sliderInput will be the cache key# i.e we don't want to recompute the plot everytime sliderInput("nrow","Nrow",1,32,32),# Allow to clear the cache actionButton("clear","Clear Cache") ),mainPanel= mainPanel(# Outputing the reactive and a plot verbatimTextOutput("txt"), plotOutput("plot"),# If you care about listing the cache keys uiOutput("keys") ) ) )server<-function(input,output,session) {# Our plot, cached using the cache object and# watching the nrowoutput$plot<- renderCachedPlot( { showNotification( h2("I'm computing the plot"),type="message" )# Fake long computation Sys.sleep(2)# Plot plot(mtcars[1:input$nrow, ]) },# We cache on the input$nrowcacheKeyExpr=list(input$nrow ),# The cache object is used herecache=cache_object )rc<- reactive({ showNotification( h2("I'm computing the reactive()"),type="message" )# Fake long computation Sys.sleep(2)input$nrow*100 }) %>%# Using bindCache() require shiny > 1.6.0 bindCache(input$nrow,cache=cache_object )output$txt<- renderText({ rc() })keys<- reactive({# Listing the keys invalidateLater(500)cache_object$keys() })output$keys<- renderUI({tags$ul( lapply(keys(),tags$li) ) }) observeEvent(input$clear, {# Sometime you might want to remove everything from the cachecache_object$reset() showNotification( h2("Cache reset"),type="message" ) }) } shinyApp(ui,server)}generate_app(mongo_cache)
All keys registered to MongoDB comes with ametadata.lastAccessed
parameter. Using this parameter, you’ll be able to flush old cache ifneeded.
mongo<-mongolite::gridfs(db="bank",url="mongodb://bebop:aloula@localhost:27066",prefix="sn")get_metadata<-function(mongo) {purrr::map(mongo$find()$metadata,jsonlite::fromJSON)}Sys.sleep(10)mf(5)#> [1] 31 609 671 766 219get_metadata(mongo)#> [[1]]#> [[1]]$key#> [1] "116acf5d3c7188709a0374305ba3a33747ef5ce323f3e170862551ed523d7a425658d2fb50d11a557250935326669e0a37efb90205d2ba114795359bb55234ca"#>#> [[1]]$lastAccessed#> [1] "2023-03-27 15:00:26"Sys.sleep(10)mf(5)#> [1] 31 609 671 766 219get_metadata(mongo)#> [[1]]#> [[1]]$key#> [1] "116acf5d3c7188709a0374305ba3a33747ef5ce323f3e170862551ed523d7a425658d2fb50d11a557250935326669e0a37efb90205d2ba114795359bb55234ca"#>#> [[1]]$lastAccessed#> [1] "2023-03-27 15:00:36"
Launching a container with redis.
docker run --rm --name redisbank -d -p 6379:6379 redis:5.0.5 --requirepass bebopalula
# Create a redis cache.# The arguments will be passed to redux::hiredisredis_cache<-cache_redis$new(password="bebopalula")#> Loading required namespace: reduxf<-function(x) { sample(1:1000,x)}mf<- memoise(f,cache=redis_cache)mf(5)#> [1] 680 901 873 651 605mf(5)#> [1] 680 901 873 651 605
Here is a first simple application that shows you the basics :
ui<- fluidPage(# Creating a slider input that will be used as a cache key sliderInput("nrow","NROW",1,32,32),# Plotting a piece of mtcars plotOutput("plot"))server<-function(input,output,session) {output$plot<- renderCachedPlot( {# Pretending this takes a long time Sys.sleep(2) plot(mtcars[1:input$nrow, ]) },cacheKeyExpr=list(# Defining the cache keyinput$nrow ),# Using our redis cachecache=redis_cache )}shinyApp(ui,server)
For the larger app:
generate_app(redis_cache)
Launching a container with postgres.
docker run --rm --name some-postgres -e POSTGRES_PASSWORD=mysecretpassword -d -p 5433:5432 postgres
# Create a postgres cache.# The arguments will be passed to DBI::dbConnect(RPostgres::Postgres(), ...)postgres_cache<-cache_postgres$new(dbname="postgres",host="localhost",port=5433,user="postgres",password="mysecretpassword")#> Loading required namespace: RPostgresf<-function(x) { sample(1:1000,x)}mf<- memoise(f,cache=postgres_cache)mf(5)#> [1] 758 764 404 689 557mf(5)#> [1] 758 764 404 689 557
Here is a first simple application that shows you the basics :
ui<- fluidPage(# Creating a slider input that will be used as a cache key sliderInput("nrow","NROW",1,32,32),# Plotting a piece of mtcars plotOutput("plot"))server<-function(input,output,session) {output$plot<- renderCachedPlot( {# Pretending this takes a long time Sys.sleep(2) plot(mtcars[1:input$nrow, ]) },cacheKeyExpr=list(# Defining the cache keyinput$nrow ),# Using our postgres cachecache=postgres_cache )}shinyApp(ui,server)
For the larger app:
generate_app(postgres_cache)
As we are deporting the caching to an external DBMS, the query will ofcourse be slower than using memory cache of disk cache. But thisdifference in speed comes with a simpler scalability of the caching, asseveral instances of the app can rely on the same caching backendwithout the need to be on the same machine.
library(magrittr)library(bank)big_iris<-purrr::rerun(100,iris) %>%data.table::rbindlist()#> Warning: `rerun()` was deprecated in purrr 1.0.0.#> ℹ Please use `map()` instead.#> # Previously#> rerun(100, iris)#>#> # Now#> map(1:100, ~iris)#> This warning is displayed once every 8 hours.#> Call `lifecycle::last_lifecycle_warnings()` to see where this warning was#> generated.nrow(big_iris)#> [1] 15000pryr::object_size(big_iris)#> 542.11 kBlibrary(cachem)mem_cache<- cache_mem()disk_cache<- cache_disk()mongo_cache<-cache_mongo$new(db="bank",url="mongodb://bebop:aloula@localhost:27066",prefix="sn")redis_cache<-cache_redis$new(password="bebopalula")postgres_cache<-cache_postgres$new(dbname="postgres",host="localhost",port=5433,user="postgres",password="mysecretpassword")bench::mark(mem_cache=mem_cache$set("iris",big_iris),disk_cache=disk_cache$set("iris",big_iris),mongo_cache=mongo_cache$set("iris",big_iris),redis_cache=redis_cache$set("iris",big_iris),postgres_cache=postgres_cache$set("iris",big_iris),check=FALSE,iterations=100)#> # A tibble: 5 × 6#> expression min median `itr/sec` mem_alloc `gc/sec`#> <bch:expr> <bch:tm> <bch:tm> <dbl> <bch:byt> <dbl>#> 1 mem_cache 8.04µs 9.18µs 94621. 2.23KB 956.#> 2 disk_cache 3.3ms 3.4ms 283. 16.19KB 0#> 3 mongo_cache 9.6ms 13.02ms 71.1 2.52MB 4.54#> 4 redis_cache 4.6ms 5.69ms 167. 536.58KB 1.68#> 5 postgres_cache 1.42ms 1.93ms 419. 626.09KB 0bench::mark(mem_cache=mem_cache$get("iris"),disk_cache=disk_cache$get("iris"),mongo_cache=mongo_cache$get("iris"),redis_cache=redis_cache$get("iris"),postgres_cache=postgres_cache$get("iris"),iterations=100)#> # A tibble: 5 × 6#> expression min median `itr/sec` mem_alloc `gc/sec`#> <bch:expr> <bch:tm> <bch:tm> <dbl> <bch:byt> <dbl>#> 1 mem_cache 6.11µs 6.52µs 144938. 0B 0#> 2 disk_cache 556.12µs 605.16µs 1503. 528.62KB 15.2#> 3 mongo_cache 7.85ms 10.01ms 88.3 2.06MB 6.64#> 4 redis_cache 3.68ms 5.17ms 182. 1.03MB 3.71#> 5 postgres_cache 3.05ms 3.85ms 194. 566.62KB 3.97
docker stop mongobank redisbank postgresbank
If you have any other backend in mind, feel free to open an issue hereand we’ll discuss the possibility of implementing it in{bank}
.
Please note that the bank project is released with aContributor CodeofConduct.By contributing to this project, you agree to abide by its terms.
About
Alternative caching backends for `{memoise}` & `{shiny}`.