The main motivation for developing the libr package is to create and use data libraries and data dictionaries. These concepts are useful when dealing with sets of related data files. The libname() function allows you to define a library for an entire directory of data files. The library can then be manipulated as a whole using the lib_* functions in the libr package.
There are four main libr functions for creating and using a data library:
libname()
lib_load()
lib_unload()
lib_write()

The libname() function creates a data library. The function has parameters for the library name and a directory to associate it with. If the directory has existing data files, those data files will be automatically loaded into the library. Once in the library, the data can be accessed using list syntax.
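As a minimal sketch of the list syntax mentioned above, the following creates a library over a scratch directory and reads one dataset from it. The directory name `libr_list_demo`, the library name `demo`, and the file name `cars.rds` are hypothetical choices made only for this illustration.

```r
library(libr)

# Hypothetical scratch directory, so the sketch does not touch real data
demo_dir <- file.path(tempdir(), "libr_list_demo")
dir.create(demo_dir, showWarnings = FALSE)

# Place one rds file in the directory
saveRDS(mtcars, file.path(demo_dir, "cars.rds"))

# Create the library; the existing file is picked up automatically
libname(demo, demo_dir)

# List syntax: both forms refer to the same dataset
n_rows <- nrow(demo$cars)
col_names <- names(demo[["cars"]])

# Clean up: removes the library and its files
lib_delete(demo)
```

Because the library behaves like a list here, no call to lib_load() is needed for this kind of access.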
You may create a data library for several different types of files: ‘rds’, ‘Rdata’, ‘rda’, ‘csv’, ‘xlsx’, ‘xls’, ‘sas7bdat’, ‘xpt’, and ‘dbf’. The type of library is defined using the engine parameter on the libname() function. The default data engine is ‘rds’. The data engines will attempt to identify the correct data type for each column of data. You may also control the data type of the columns using the import_specs parameter and the specs() and import_spec() functions.
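The sketch below shows one way the engine and import_specs parameters might be combined for a ‘csv’ library. The file name `scores.csv`, its columns, and the library name `csvlib` are hypothetical, and the type strings "character" and "numeric" are assumed to be accepted by import_spec(); check the package reference for the full list of supported types.

```r
library(libr)

# Hypothetical scratch directory with one small csv file
csv_dir <- file.path(tempdir(), "libr_csv_demo")
dir.create(csv_dir, showWarnings = FALSE)
write.csv(data.frame(id = c("001", "002"), score = c(1.5, 2.5)),
          file.path(csv_dir, "scores.csv"), row.names = FALSE)

# One import_spec() per dataset, keyed by dataset name.
# Here 'id' is requested as character rather than left to type guessing.
sp <- specs(scores = import_spec(id = "character", score = "numeric"))

# Create a csv library that applies the import specs
libname(csvlib, csv_dir, engine = "csv", import_specs = sp)

# Inspect the resulting column class
id_class <- class(csvlib$scores$id)

# Clean up: removes the library and its files
lib_delete(csvlib)
```

Specifying the type explicitly is mainly useful for identifier-like columns, where automatic type detection could otherwise read the values as numbers.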
If you prefer to access the data via the workspace, call the lib_load() function on the library. This function will load the library data into the parent frame, where it can be accessed using a two-level (<library>.<dataset>) name.
When you are done with the data, call the lib_unload() function to remove the data from the parent frame and put it back in the library list. To write any added or modified data to disk, call the lib_write() function. The lib_write() function will only write data that has changed since the last write.
The following example illustrates some basic functionality of the libr package regarding the creation of libnames and use of dictionaries. The example first places some sample data in a temp directory for illustration purposes. Then the example creates a libname from the temp directory, loads it into memory, adds data to it, and then unloads and writes everything to disk:
library(libr)

# Create temp directory
tmp <- tempdir()

# Save some data to temp directory
# for illustration purposes
saveRDS(trees, file.path(tmp, "trees.rds"))
saveRDS(rock, file.path(tmp, "rocks.rds"))

# Create library
libname(dat, tmp)
# library 'dat': 2 items
# - attributes: not loaded
# - path: C:\Users\User\AppData\Local\Temp\RtmpCSJ6Gc
# - items:
#    Name Extension Rows Cols   Size        LastModified
# 1 rocks       rds   48    4 3.1 Kb 2020-11-05 23:25:34
# 2 trees       rds   31    3 2.4 Kb 2020-11-05 23:25:34

# Examine data dictionary for library
dictionary(dat)
# A tibble: 7 x 9
#   Name  Column Class   Label Description Format Width  Rows   NAs
#   <chr> <chr>  <chr>   <lgl> <lgl>       <lgl>  <lgl> <int> <int>
# 1 rocks area   integer NA    NA          NA     NA       48     0
# 2 rocks peri   numeric NA    NA          NA     NA       48     0
# 3 rocks shape  numeric NA    NA          NA     NA       48     0
# 4 rocks perm   numeric NA    NA          NA     NA       48     0
# 5 trees Girth  numeric NA    NA          NA     NA       31     0
# 6 trees Height numeric NA    NA          NA     NA       31     0
# 7 trees Volume numeric NA    NA          NA     NA       31     0

# Load library
lib_load(dat)

# Examine workspace
ls()
# [1] "dat"       "dat.rocks" "dat.trees" "tmp"

# Use data from the library
summary(dat.rocks)
#       area            peri            shape              perm
#  Min.   : 1016   Min.   : 308.6   Min.   :0.09033   Min.   :   6.30
#  1st Qu.: 5305   1st Qu.:1414.9   1st Qu.:0.16226   1st Qu.:  76.45
#  Median : 7487   Median :2536.2   Median :0.19886   Median : 130.50
#  Mean   : 7188   Mean   :2682.2   Mean   :0.21811   Mean   : 415.45
#  3rd Qu.: 8870   3rd Qu.:3989.5   3rd Qu.:0.26267   3rd Qu.: 777.50
#  Max.   :12212   Max.   :4864.2   Max.   :0.46413   Max.   :1300.00

# Add data to the library
dat.trees_subset <- subset(dat.trees, Girth > 11)

# Add more data to the library
dat.cars <- mtcars

# Unload the library from memory
lib_unload(dat)

# Examine workspace again
ls()
# [1] "dat" "tmp"

# Write the library to disk
lib_write(dat)
# library 'dat': 4 items
# - attributes: not loaded
# - path: C:\Users\User\AppData\Local\Temp\RtmpCSJ6Gc
# - items:
#           Name Extension Rows Cols   Size        LastModified
# 1        rocks       rds   48    4 3.1 Kb 2020-11-05 23:37:45
# 2        trees       rds   31    3 2.4 Kb 2020-11-05 23:37:45
# 3         cars       rds   32   11 7.3 Kb 2020-11-05 23:37:45
# 4 trees_subset       rds   23    3 1.8 Kb 2020-11-05 23:37:45

# Clean up
lib_delete(dat)

# Examine workspace again
ls()
# [1] "tmp"