Prepare objects for serialization with a consistent interface
Typically, models in R exist in memory and can be saved as .rds files. However, some models store information in locations that cannot be saved using save() or saveRDS() directly. The goal of bundle is to provide a common interface to capture this information, situate it within a portable object, and restore it for use in new settings.
You can install the released version of bundle from CRAN with:

    install.packages("bundle")

And the development version from GitHub with:

    # install.packages("pak")
    pak::pak("rstudio/bundle")

We often imagine a trained model as a somewhat “standalone” R object: given some new data, the R object can generate predictions on its own. In reality, some types of model objects also make use of references to generate predictions. A reference is a piece of information that a model object refers to that isn’t part of the object itself; this could be anything from a connection with a server to an internal function in the package used to train the model. When we call predict(), model objects know where to look to retrieve that data, but saving model objects can sometimes disrupt those references. Thus, if we want to train a model, save it, re-load it into memory in a production setting, and generate predictions with it, we may run into issues because those references do not exist in the new computational environment.
We need some way to preserve access to those references. The bundle package provides a consistent interface for bundling model objects with their references so that they can be safely saved and re-loaded in production:
For more on this diagram, see the main bundle vignette.
When you’re ready to save your model, bundle() it first. Once you’ve loaded it in a new setting, unbundle() it!
The bundle package prepares model objects so that they can be effectively saved and re-loaded for use in new R sessions. To demonstrate using bundle, we will train a boosted tree model using XGBoost, bundle it, and then pass the bundle into another R session to generate predictions on new data.
First, load needed packages:

    library(bundle)
    library(parsnip)
    library(callr)
    library(waldo)

Fit the boosted tree model:

    # fit a boosted tree with xgboost via parsnip
    mod <-
      boost_tree(trees = 5, mtry = 3) |>
      set_mode("regression") |>
      set_engine("xgboost") |>
      fit(mpg ~ ., data = mtcars[1:25, ])

    mod
    #> parsnip model object
    #>
    #> ##### xgb.Booster
    #> call:
    #>   xgboost::xgb.train(params = list(eta = 0.3, max_depth = 6, gamma = 0,
    #>     colsample_bytree = 1, colsample_bynode = 0.3, min_child_weight = 1,
    #>     subsample = 1, nthread = 1, objective = "reg:squarederror"),
    #>     data = x$data, nrounds = 5, evals = x$watchlist, verbose = 0)
    #> # of features: 10
    #> # of rounds:  5
    #> callbacks:
    #>    evaluation_log
    #> evaluation_log:
    #>   iter training_rmse
    #>  <int>         <num>
    #>      1      4.655552
    #>      2      3.648086
    #>      3      2.877980
    #>      4      2.302021
    #>      5      1.850090

Note that simply saving and loading the model results in changes to the fitted model:

    temp_file <- tempfile()
    saveRDS(mod, temp_file)
    mod2 <- readRDS(temp_file)

    compare(mod, mod2, ignore_formula_env = TRUE)
    #> `old$fit$ptr` is <pointer: 0x11a0cf990>
    #> `new$fit$ptr` is <pointer: 0x10bed38a0>

Saving the model and reloading it as mod2 didn’t preserve XGBoost’s reference to its pointer, which may result in failures later in the modeling process.
We thus need to prepare the fitted model to be saved before passing it to another R session. We can do so by bundling it:

    # bundle the model
    bundled_mod <- bundle(mod)

    bundled_mod
    #> bundled model_fit object.

Passing the model to another R session and generating predictions on new data:

    # load the model in a fresh R session and predict on new data
    r(
      func = function(bundled_mod) {
        library(bundle)
        library(parsnip)

        unbundled_mod <- unbundle(bundled_mod)
        predict(unbundled_mod, new_data = mtcars[26:32, ])
      },
      args = list(bundled_mod = bundled_mod)
    )
    #> # A tibble: 7 × 1
    #>   .pred
    #>   <dbl>
    #> 1  28.6
    #> 2  25.8
    #> 3  25.8
    #> 4  16.7
    #> 5  20.2
    #> 6  15.3
    #> 7  21.2

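The example above hands the bundle directly to a child session through callr. In a deployment setting, the bundle would more likely be written to disk and read back later. The following is a minimal sketch of that round trip, assuming the same packages as above are installed; the `bundle_path` name and the temporary file location are illustrative choices, not part of the package.

```r
library(bundle)
library(parsnip)

# Fit the same small model as in the example above.
mod <- boost_tree(trees = 5, mtry = 3) |>
  set_mode("regression") |>
  set_engine("xgboost") |>
  fit(mpg ~ ., data = mtcars[1:25, ])

# Bundle first, then write the bundle (not the raw model) to disk.
# `bundle_path` is a hypothetical location chosen for this sketch.
bundle_path <- tempfile(fileext = ".rds")
saveRDS(bundle(mod), bundle_path)

# Later, possibly in a different R session: read, unbundle, predict.
restored_mod <- unbundle(readRDS(bundle_path))
preds <- predict(restored_mod, new_data = mtcars[26:32, ])
```

Here the object passed to saveRDS() is the bundle rather than the fitted model itself, so the restored object is rebuilt by unbundle() instead of relying on a pointer that no longer exists.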
For a more in-depth demonstration of the package, see the main vignette with vignette("bundle").
This project is released with a Contributor Code of Conduct. By contributing to this project, you agree to abide by its terms.
For questions and discussions about our packages, modeling, and machine learning, please post on RStudio Community.
If you think you have encountered a bug, please submit an issue.
Either way, learn how to create and share a reprex (a minimal, reproducible example), to clearly communicate about your code.
