Typically, models in R exist in memory and can be saved as .rds files. However, some models store information in locations that cannot be saved using save() or saveRDS() directly. The goal of bundle is to provide a common interface to capture this information, situate it within a portable object, and restore it for use in new settings.
You can install the released version of bundle from CRAN with:

    install.packages("bundle")

And the development version from GitHub with:

    # install.packages("pak")
    pak::pak("rstudio/bundle")

We often imagine a trained model as a somewhat “standalone” R object: given some new data, the R object can generate predictions on its own. In reality, some types of model objects also make use of references to generate predictions. A reference is a piece of information that a model object refers to that isn’t part of the object itself; this could be anything from a connection with a server to an internal function in the package used to train the model. When we call predict(), model objects know where to look to retrieve that data, but saving model objects can sometimes disrupt those references. Thus, if we want to train a model, save it, re-load it into memory in a production setting, and generate predictions with it, we may run into issues because those references do not exist in the new computational environment.
We need some way to preserve access to those references. The bundle package provides a consistent interface for bundling model objects with their references so that they can be safely saved and re-loaded in production:

For more on this diagram, see the main bundle vignette.
When you’re ready to save your model, bundle() it first. Once you’ve loaded it in a new setting, unbundle() it!
The bundle package prepares model objects so that they can be effectively saved and re-loaded for use in new R sessions. To demonstrate using bundle, we will train a boosted tree model using XGBoost, bundle it, and then pass the bundle into another R session to generate predictions on new data.
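To make the idea of a reference concrete, here is a minimal base-R sketch. Everything in it is hypothetical and for illustration only: `fit_toy()`, `toy_bundle()`, and `toy_unbundle()` are not part of the bundle package and do not reflect how it is implemented. The toy model keeps its coefficients in a file outside the R object, so the object alone is not enough to predict once that file is gone.

```r
# A toy "model" whose coefficients live in a file outside the R object.
# The object itself stores only a reference: the path to that file.
fit_toy <- function(intercept, slope) {
  path <- tempfile(fileext = ".rds")
  saveRDS(c(intercept = intercept, slope = slope), path)
  structure(list(coef_path = path), class = "toy_model")
}

predict.toy_model <- function(object, new_x, ...) {
  coefs <- readRDS(object$coef_path) # follows the reference
  coefs[["intercept"]] + coefs[["slope"]] * new_x
}

# Hypothetical helpers (not the bundle package's implementation):
# bundling copies the referenced data into the object itself ...
toy_bundle <- function(model) {
  structure(list(coefs = readRDS(model$coef_path)), class = "toy_bundle")
}

# ... and unbundling recreates the reference in the current environment.
toy_unbundle <- function(bundled) {
  path <- tempfile(fileext = ".rds")
  saveRDS(bundled$coefs, path)
  structure(list(coef_path = path), class = "toy_model")
}

toy_mod <- fit_toy(intercept = 1, slope = 2)

# Bundle first, then save the bundle as usual.
bundle_file <- tempfile(fileext = ".rds")
saveRDS(toy_bundle(toy_mod), bundle_file)

# Simulate a new environment in which the referenced file does not exist.
unlink(toy_mod$coef_path)
broken <- inherits(
  suppressWarnings(try(predict(toy_mod, 3), silent = TRUE)),
  "try-error"
)
broken # TRUE: the original object can no longer predict

# The bundle carries everything needed to restore a working model.
toy_restored <- toy_unbundle(readRDS(bundle_file))
pred <- predict(toy_restored, 3)
pred # 7
```

Real model objects hold other kinds of references, such as pointers into memory managed by compiled code, but the shape of the workflow is the same: capture the referenced information, save it inside a portable object, and restore it in the new setting.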
First, load needed packages:

    library(bundle)
    library(parsnip)
    library(callr)
    library(waldo)

Fit the boosted tree model:

    # fit a boosted tree with xgboost via parsnip
    mod <-
      boost_tree(trees = 5, mtry = 3) |>
      set_mode("regression") |>
      set_engine("xgboost") |>
      fit(mpg ~ ., data = mtcars[1:25, ])

    mod
    #> parsnip model object
    #>
    #> ##### xgb.Booster
    #> call:
    #>   xgboost::xgb.train(params = list(eta = 0.3, max_depth = 6, gamma = 0,
    #>     colsample_bytree = 1, colsample_bynode = 0.3, min_child_weight = 1,
    #>     subsample = 1, nthread = 1, objective = "reg:squarederror"),
    #>     data = x$data, nrounds = 5, evals = x$watchlist, verbose = 0)
    #> # of features: 10
    #> # of rounds: 5
    #> callbacks:
    #>    evaluation_log
    #> evaluation_log:
    #>   iter training_rmse
    #>  <int>         <num>
    #>      1      4.655552
    #>      2      3.648086
    #>      3      2.877980
    #>      4      2.302021
    #>      5      1.850090

Note that simply saving and loading the model results in changes to the fitted model:

    temp_file <- tempfile()
    saveRDS(mod, temp_file)
    mod2 <- readRDS(temp_file)

    compare(mod, mod2, ignore_formula_env = TRUE)
    #> `old$fit$ptr` is <pointer: 0x11a0cf990>
    #> `new$fit$ptr` is <pointer: 0x10bed38a0>

Saving the model and reloading it as mod2 didn’t preserve XGBoost’s reference to its pointer, which may result in failures later in the modeling process.
We thus need to prepare the fitted model to be saved before passing it to another R session. We can do so by bundling it:

    # bundle the model
    bundled_mod <- bundle(mod)

    bundled_mod
    #> bundled model_fit object.

Passing the model to another R session and generating predictions on new data:

    # load the model in a fresh R session and predict on new data
    r(
      func = function(bundled_mod) {
        library(bundle)
        library(parsnip)

        unbundled_mod <- unbundle(bundled_mod)

        predict(unbundled_mod, new_data = mtcars[26:32, ])
      },
      args = list(
        bundled_mod = bundled_mod
      )
    )
    #> # A tibble: 7 × 1
    #>   .pred
    #>   <dbl>
    #> 1  28.6
    #> 2  25.8
    #> 3  25.8
    #> 4  16.7
    #> 5  20.2
    #> 6  15.3
    #> 7  21.2

For a more in-depth demonstration of the package, see the main vignette with vignette("bundle").
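The example above hands the bundle to another session through callr. In practice the bundle would more often travel through a file on disk. The following is a sketch of that round trip, assuming the bundle, parsnip, and xgboost packages are installed; `bundle_path` is a hypothetical location chosen for illustration, and the reload step is run in the same session here only to keep the example short.

```r
library(bundle)
library(parsnip)

# Refit the same model as above so that this block is self-contained.
mod <-
  boost_tree(trees = 5, mtry = 3) |>
  set_mode("regression") |>
  set_engine("xgboost") |>
  fit(mpg ~ ., data = mtcars[1:25, ])

# Training side: bundle first, then save the bundle rather than the raw model.
bundle_path <- tempfile(fileext = ".rds") # hypothetical location
saveRDS(bundle(mod), bundle_path)

# Production side (normally a separate R session): read, unbundle, predict.
restored_mod <- unbundle(readRDS(bundle_path))
preds <- predict(restored_mod, new_data = mtcars[26:32, ])
```

The key point is the order of operations: bundle() before saveRDS() on the way out, and unbundle() after readRDS() on the way in.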
This project is released with a Contributor Code of Conduct. By contributing to this project, you agree to abide by its terms.
For questions and discussions about our packages, modeling, and machine learning, please post on RStudio Community.
If you think you have encountered a bug, please submit an issue.
Either way, learn how to create and share a reprex (a minimal, reproducible example), to clearly communicate about your code.