The iglu package is designed to assist in the analysis of data from Continuous Glucose Monitors (CGMs). CGMs are small wearable devices that measure glucose levels continuously throughout the day, with some monitors taking measurements as often as every 5 minutes. Data from these monitors provide a detailed quantification of the variation in blood glucose levels during the course of the day, and thus CGMs play an increasing role in clinical practice. For more on CGMs, see Rodbard (2016), “Continuous Glucose Monitoring: A Review of Successes, Challenges, and Opportunities.”
Multiple CGM-derived metrics have been developed to assess the quality of glycemic control and glycemic variability, many of which are summarized in Rodbard (2009), “Interpretation of continuous glucose monitoring data: glycemic variability and quality of glycemic control.” The iglu package streamlines the calculation of these metrics by providing clearly named functions that output metric values with one line of code.
The iglu package is designed to work with Continuous Glucose Monitor (CGM) data in the form of a data frame with the following three columns present:
Glucose level measurement [in mg/dL] ("gl")
Timestamp for glucose measurement ("time")
Subject identification ("id")
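As a rough illustration of this layout, the following base R sketch builds a small data frame with the three required columns. The subject label, timestamps, and glucose values are made up for illustration and do not come from the package.

```r
# Hypothetical CGM readings for one subject, taken 5 minutes apart
cgm <- data.frame(
  id   = factor(rep("Subject A", 4)),
  time = as.POSIXct("2020-01-01 08:00:00", tz = "UTC") + 60 * 5 * (0:3),
  gl   = c(112, 118, 125, 121)
)

# The three columns use the names iglu expects: "id", "time", and "gl"
str(cgm)
```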
The iglu package comes with example data from 5 subjects with Type II diabetes whose glucose levels were measured using a Dexcom G4 CGM. These data are part of a larger study analyzed in Gaynanova et al. (2020).
The iglu package comes with a shiny app containing all of the metric calculations as well as all plot types of the package itself.
The full app can be accessed by running iglu::iglu_shiny() (iglu must be installed to use the iglu_shiny function).
The app itself has a demo (reduced functionality) available at https://stevebroll.shinyapps.io/shinyigludemo/ with data pre-loaded.
The iglu package includes two datasets: example_data_1_subject and example_data_5_subject. The one-subject data is simply the first subject of the five-subject data.
Example data with 1 subject can be loaded with:
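A minimal sketch of the loading step, assuming the iglu package is installed; the dataset name is the one given above.

```r
library(iglu)

# Load the bundled single-subject example dataset into the workspace
data("example_data_1_subject")
```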
This dataset contains 2915 observations of 3 columns corresponding to the three components listed in the introduction:
"id" - Factor (character string) column for subject identification
"time" - POSIXct column for datetime values
"gl" - Numeric column for glucose measurement
Data used with iglu functions may have additional columns, but the columns for id, time and glucose values must be named as above.
```r
dim(example_data_1_subject)
#> [1] 2915    3
str(example_data_1_subject)
#> 'data.frame':    2915 obs. of  3 variables:
#>  $ id  : Factor w/ 1 level "Subject 1": 1 1 1 1 1 1 1 1 1 1 ...
#>  $ time: POSIXct, format: "2015-06-06 16:50:27" "2015-06-06 17:05:27" ...
#>  $ gl  : int  153 137 128 121 120 138 155 159 154 152 ...
head(example_data_1_subject)
#>          id                time  gl
#> 1 Subject 1 2015-06-06 16:50:27 153
#> 2 Subject 1 2015-06-06 17:05:27 137
#> 3 Subject 1 2015-06-06 17:10:27 128
#> 4 Subject 1 2015-06-06 17:15:28 121
#> 5 Subject 1 2015-06-06 17:25:27 120
#> 6 Subject 1 2015-06-06 17:45:27 138
```

Example data with multiple subjects can be loaded with:
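The multi-subject data can be loaded in the same way. This is a minimal sketch that assumes the iglu package is installed.

```r
library(iglu)

# Load the bundled five-subject example dataset into the workspace
data("example_data_5_subject")
```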
This dataset contains the same 3 columns as the dataset in the single-subject case, but now with 13866 observations from 5 subjects. The first subject in this multiple-subject dataset is the same as the single subject from the previous examples.
```r
dim(example_data_5_subject)
#> [1] 13866     3
str(example_data_5_subject)
#> 'data.frame':    13866 obs. of  3 variables:
#>  $ id  : Factor w/ 5 levels "Subject 1","Subject 2",..: 1 1 1 1 1 1 1 1 1 1 ...
#>  $ time: POSIXct, format: "2015-06-06 16:50:27" "2015-06-06 17:05:27" ...
#>  $ gl  : int  153 137 128 121 120 138 155 159 154 152 ...
```

Iglu comes with its own set of functions for importing raw data from several common CGM formats, as well as for the general reformatting of any data to work with the package.
To import raw data from a Dexcom, FreeStyle Libre, Libre Pro, ASC, or iPro monitor, the read_raw_data function can be used. The first parameter should be the name of the .csv file you are attempting to read. Note that the function currently only accepts files that read.csv can parse. The next parameter is sensor = [sensor name]. Currently the supported values for sensor are “dexcom”, “libre”, “librepro”, “asc” and “ipro”. These correspond to the sensor whose format you are attempting to read. The next parameter is id. This is the value that will be used as the subject’s id. This parameter has two special values: setting it to “filename” will cause the function to use the filename as the id, and setting it to “read” will cause the function to attempt to read the subject id from the data. A value of “read” is not supported for the “asc” sensor. If no id parameter is passed, the filename will be used. Also, when reading from FreeStyle Libre format, if the phrase “mmol/l” is found in the column names, the glucose values will be multiplied by 18 to convert them to mg/dL. The read_raw_data function returns a dataframe with three columns, “id”, “time”, and “gl”, corresponding to subject id, time, and glucose readings respectively. Sensor formats change with ongoing development, so these functions may become deprecated. If any issues are encountered, contact the package maintainer. This is currently Irina Gaynanova, who can be reached at irinag@stat.tamu.edu.
Ex.:
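A hedged sketch of such a call follows. The file name is hypothetical, so the import is guarded and only runs when that file exists. The sensor and id values are taken from the description above.

```r
# Hypothetical path to a Dexcom export; replace with a real file
cgm_file <- "my_dexcom_export.csv"

# Only attempt the import when the file is actually present
if (file.exists(cgm_file)) {
  # sensor selects the file format; id = "read" takes the subject id from the data
  cgm <- iglu::read_raw_data(cgm_file, sensor = "dexcom", id = "read")
  head(cgm)
}
```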
The process_data function is designed to take a dataframe or tibble with an arbitrary number of columns with arbitrary column names and return a dataframe with only the columns “id”, “time”, and “gl”, which is the format used by iglu. It currently takes five parameters: data, the data to be processed; id, a string indicating the column name of the subject ids; timestamp, a string matching the column name where the timestamps can be found; glu, a string matching the column name where the glucose readings can be found; and time_parser, a function used to parse the time strings into time objects, which currently defaults to as.POSIXct. Currently data, timestamp, and glu are required parameters. If no id parameter is passed, an id of 1 will be assigned to all values in the data. If “mmol/l” is found in the column name for glucose readings, the readings will be multiplied by 18 in the returned dataframe.
Ex.
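The following sketch assumes iglu is installed and applies process_data to a small, made-up data frame. The input column names (patient, reading_time, glucose, extra_note) are hypothetical; the parameter names are those listed above.

```r
library(iglu)

# Hypothetical raw data with arbitrary column names and one extra column
raw <- data.frame(
  patient      = rep("A", 3),
  reading_time = c("2020-01-01 08:00:00", "2020-01-01 08:05:00", "2020-01-01 08:10:00"),
  glucose      = c(110, 115, 121),
  extra_note   = c("x", "y", "z"),
  stringsAsFactors = FALSE
)

# Map the arbitrary column names onto the id / time / gl layout
processed <- process_data(raw, id = "patient", timestamp = "reading_time", glu = "glucose")
names(processed)
```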
If your times are in a format not parsable by as.POSIXct, you can parse a custom format by passing function(time_string) {strptime(time_string, format = [format string])} as the time_parser parameter.
For example, the following call parses datetimes in mm/dd/yyyy hh:mm format.
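As a sketch of the parser itself, the base R snippet below defines such a function and applies it to two made-up strings. The name parse_mdy_hm and the fixed tz = "UTC" are illustrative assumptions, chosen so the result does not depend on the system time zone.

```r
# Parser for strings such as "06/15/2015 14:30" (mm/dd/yyyy hh:mm)
parse_mdy_hm <- function(time_string) {
  strptime(time_string, format = "%m/%d/%Y %H:%M", tz = "UTC")
}

# Apply the parser to two hypothetical timestamps
parsed <- parse_mdy_hm(c("06/15/2015 14:30", "12/01/2015 07:05"))
format(parsed, "%Y-%m-%d %H:%M")
```

A function like this would then be supplied as the time_parser argument of process_data.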
All the metrics implemented in the package can be divided into two categories: time-independent and time-dependent.
Time-independent metrics do not use any linear interpolation because the time component of the data is not used in their calculations. Because the time component is not necessary, when working with a single subject only a glucose vector is required. If a glucose vector for multiple subjects is supplied, or if a data frame that doesn’t have all three columns is supplied, these functions will treat all glucose values as though they are from the same subject.
All metric functions in iglu will produce the output in a tibble form. See documentation on tibbles with vignette('tibble') or ?tbl_df-class.
Some metric functions, like above_percent(), will return multiple values for a single subject.
```r
above_percent(example_data_1_subject)
#> # A tibble: 1 × 4
#>   id        above_140 above_180 above_250
#>   <fct>         <dbl>     <dbl>     <dbl>
#> 1 Subject 1      26.1      8.20     0.377
```

When a data frame is passed, the subject id will always be printed in the id column, and metrics will be printed in the following columns.
As discussed above, just the glucose vector can be supplied for the single-subject case.
```r
above_percent(example_data_1_subject$gl)
#> # A tibble: 1 × 3
#>   above_140 above_180 above_250
#>       <dbl>     <dbl>     <dbl>
#> 1      26.1      8.20     0.377
```

However, it is not recommended to pass just glucose values whenever the time and subject are also available, because this output will not contain the subject ID.
The list of target values for the above_percent metric is a parameter that can be changed:
```r
above_percent(example_data_1_subject, targets = c(100, 200, 300))
#> # A tibble: 1 × 4
#>   id        above_100 above_200 above_300
#>   <fct>         <dbl>     <dbl>     <dbl>
#> 1 Subject 1      72.7      3.40         0
```

Many metrics have parameters that can be changed. To see the available parameters for a given metric, see its documentation, e.g. ?above_percent or help(above_percent).
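To make the idea of a time-independent metric concrete, here is a base R sketch of a percent-above-target calculation on a plain glucose vector. It is not iglu's implementation: the readings are made up, and the use of a strict inequality is an assumption.

```r
# Hypothetical glucose readings in mg/dL
gl <- c(90, 135, 150, 185, 260, 120, 175, 145)
targets <- c(140, 180, 250)

# Percent of readings strictly above each target (strictness is an assumption)
pct_above <- sapply(targets, function(target) mean(gl > target) * 100)
names(pct_above) <- paste0("above_", targets)
pct_above
```

No timestamps enter the calculation, which is why a bare vector is enough for this kind of metric.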
Not all metric functions return multiple values. Many, like the Hyperglycemia index metric (function call: hyper_index()), will return just a single value for each subject, producing a column for value and a column for subject id (if a dataframe is passed), with one row for each subject.
```r
hyper_index(example_data_5_subject)
#> # A tibble: 5 × 2
#>   id        hyper_index
#>   <fct>           <dbl>
#> 1 Subject 1       0.391
#> 2 Subject 2       4.17
#> 3 Subject 3       1.18
#> 4 Subject 4       0.358
#> 5 Subject 5       2.21
```

In this example, Subject 2 has the largest Hyperglycemia index, indicating the worst hyperglycemia. This is reflected in the percent of time Subject 2 spends above fixed glucose targets (see results of above_percent).
Unlike time-independent metrics, time-dependent metrics require the input to be a dataframe with subject ids, time, and glucose values. Time-dependent metric functions cannot be passed a vector of glucose values. With timestamped data, a potential challenge arises when timestamps are not on perfect intervals due to missing measurements. To address this challenge, we developed the CGMS2DayByDay function.
Observe that the timestamps in the first rows are not evenly spaced. The CGMS2DayByDay function addresses this issue by linearly interpolating glucose measures for each subject on an equally spaced time grid from day to day. To prevent extrapolation, missing values are inserted between any two measurements that are more than inter_gap minutes apart (the default value is 45 minutes and can be changed by the user). This function is automatically called by all metrics that require such interpolation; however, it is also available to the user directly. The function is designed to work with data from one subject at a time. The structure of the function output is shown below.
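The interpolation idea can be sketched in base R with approx(). This is an illustration under simplifying assumptions (a single day, timestamps given as minutes since midnight, made-up readings) and not the code used inside CGMS2DayByDay.

```r
# Hypothetical irregular timestamps (minutes since midnight) and glucose values
t_obs  <- c(0, 5, 11, 15, 90, 95)
gl_obs <- c(100, 104, 110, 108, 140, 138)

dt0 <- 5          # grid spacing in minutes
inter_gap <- 45   # largest gap (minutes) we are willing to interpolate across

# Linear interpolation onto an equally spaced grid
grid <- seq(0, 95, by = dt0)
gl_grid <- approx(t_obs, gl_obs, xout = grid)$y

# Blank out grid points that fall strictly inside a gap longer than inter_gap
for (i in which(diff(t_obs) > inter_gap)) {
  in_gap <- grid > t_obs[i] & grid < t_obs[i + 1]
  gl_grid[in_gap] <- NA
}

data.frame(minute = grid, gl = gl_grid)
```

Here the 75-minute gap between the readings at minutes 15 and 90 exceeds inter_gap, so the grid points inside it are set to NA instead of being interpolated.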
```r
str(CGMS2DayByDay(example_data_1_subject))
#> List of 3
#>  $ gd2d        : num [1:14, 1:288] NA 112.2 92 90.1 143.1 ...
#>  $ actual_dates: Date[1:14], format: "2015-06-06" "2015-06-07" ...
#>  $ dt0         : num 5
```

The first part of the output, gd2d, is the interpolated grid of values. Each row corresponds to one day of measurements, and the columns correspond to an equidistant time grid covering a 24-hour time span. The grid is chosen to match the frequency of the sensor (5 minutes in this example, leading to \((24 \times 60)/5 = 288\) columns), which is returned as dt0. The returned actual_dates allows one to map the rows in gd2d back to the original dates. The achieved alignment of glucose measurement times across the days enables both the calculation of corresponding metrics and the creation of time-dependent visuals such as lasagna plots. The default frequency can be adjusted as follows.
```r
str(CGMS2DayByDay(example_data_1_subject, dt0 = 10))
#> List of 3
#>  $ gd2d        : num [1:14, 1:144] NA 111.1 92.9 89.1 138.2 ...
#>  $ actual_dates: Date[1:14], format: "2015-06-06" "2015-06-07" ...
#>  $ dt0         : num 10
```

Note that the final part of the output reflects our input, and there are now only 144 columns instead of 288.
The CGMS2DayByDay function also allows specification of the maximum allowable gap to interpolate values across (default is 45 minutes) and a string corresponding to time zone (default is the time zone of the user’s system).
Functions for metrics requiring linear interpolation will accept the following three parameters that are passed on to CGMS2DayByDay():
"dt0" - Time frequency (numeric) for interpolation. Default will automatically match the frequency of the data
"inter_gap" - Maximum allowable gap in minutes (numeric) for interpolation
"tz" - String corresponding to the time zone where the data’s measurements were recorded
In the example_data_5_subject dataset, it is important to specify tz = 'EST', because a Daylight Saving Time shift can cause miscalculations if the wrong time zone is used. A proper call for this dataset, being recorded in EST, would be:
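A sketch of such a call, assuming iglu is installed. It uses the single-subject data because CGMS2DayByDay is described above as working on one subject at a time, and it spells out the dt0 and inter_gap values shown earlier for clarity.

```r
library(iglu)

# Interpolate one subject onto a 5-minute grid, allowing gaps of up to 45 minutes,
# with timestamps interpreted in EST
day_grid <- CGMS2DayByDay(example_data_1_subject, dt0 = 5, inter_gap = 45, tz = "EST")
dim(day_grid$gd2d)
```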
Examples of proper metric function calls will be shown in the next section.
Some metric functions, like conga() (Continuous Overlapping Net Glycemic Action), will return just a single value for each subject, resulting in a 2-column tibble (1 column for id and 1 for the single value).
```r
conga(example_data_1_subject, tz = 'EST')
#> # A tibble: 1 × 2
#>   id        CONGA
#>   <fct>     <dbl>
#> 1 Subject 1  37.0
```

Other metrics can return multiple values for a single subject. For example, sd_measures(), which requires linear interpolation, computes 6 unique standard deviation subtypes per subject.
```r
sd_measures(example_data_5_subject)
#> # A tibble: 5 × 7
#>   id          SDw SDhhmm SDwsh  SDdm   SDb SDbdm
#>   <fct>     <dbl>  <dbl> <dbl> <dbl> <dbl> <dbl>
#> 1 Subject 1  26.4   19.6  6.54  16.7  27.9  24.0
#> 2 Subject 2  36.7   22.8  7.62  52.0  48.0  35.9
#> 3 Subject 3  42.9   14.4  9.51  12.4  42.8  42.5
#> 4 Subject 4  24.5   12.9  6.72  16.9  25.5  22.0
#> 5 Subject 5  50.0   29.6 12.8   23.3  50.3  45.9
```

Notice the high fluctuations in Subject 5: all but one of the standard deviation subtypes (SDdm) are largest for Subject 5. This provides an additional level of CGM data interpretation, since frequent or large glucose fluctuations may contribute to diabetes-related complications independently of chronic hyperglycemia.
Iglu provides a function for calculating metrics over a given time range. The calculate_sleep_wake function allows the user to apply an arbitrary function to given data after it has been filtered by time of day. It supports calculating inside the time range, outside the time range, or both separately.
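The time-of-day filtering can be sketched in base R as follows. This is an illustration, not the package's implementation: the readings are made up, in_range is a hypothetical helper, and the half-open interval (start included, end excluded) is an assumption.

```r
# Hypothetical readings: one every 3 hours over a single day (UTC for reproducibility)
timestamps <- seq(as.POSIXct("2015-06-07 00:00:00", tz = "UTC"),
                  by = "3 hours", length.out = 8)
gl <- c(95, 102, 110, 150, 165, 140, 120, 100)

# Hour of day as a decimal number, e.g. 13.5 for 13:30
hour_of_day <- as.numeric(format(timestamps, "%H")) +
  as.numeric(format(timestamps, "%M")) / 60

# TRUE for hours in [start, end); also handles ranges that wrap past midnight
in_range <- function(h, start, end) {
  if (start < end) h >= start & h < end else h >= start | h < end
}

# Apply a function (here sd) inside and outside the 0-6 window
inside <- in_range(hour_of_day, 0, 6)
sd_inside  <- sd(gl[inside])
sd_outside <- sd(gl[!inside])
```

The wrap-around branch covers windows such as 23 to 7, where the range crosses midnight.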
To use this function, pass it the data you want and the function or name of the function you want to apply. For example, to calculate the standard deviation of glucose readings from 12am to 6am, the function’s default time period, the following call will work.
```r
calculate_sleep_wake(example_data_1_subject, sd_glu, calculate = "sleep")
#> # A tibble: 1 × 2
#>   id           SD
#>   <fct>     <dbl>
#> 1 Subject 1  25.4
```

A custom time period can be defined with the sleep_start and sleep_end parameters. The values of these parameters should be real numbers between 0 and 24. If an integer is passed to sleep_start, the whole hour will be included. If an integer is passed to sleep_end, everything up to but not including that hour will be included. The following call will calculate the metrics for readings between 2:00-7:59am.
```r
calculate_sleep_wake(example_data_5_subject, sd_measures, sleep_start = 2, sleep_end = 8, calculate = "sleep")
#> # A tibble: 5 × 7
#>   id          SDw SDhhmm SDwsh  SDdm   SDb SDbdm
#>   <fct>     <dbl>  <dbl> <dbl> <dbl> <dbl> <dbl>
#> 1 Subject 1  10.2   3.04  3.15  21.3  23.9  12.2
#> 2 Subject 2  18.9  12.8   6.22  41.9  43.6  18.2
#> 3 Subject 3  17.6   9.99  4.86  20.1  32.7  26.8
#> 4 Subject 4  10.8   3.98  3.43  12.2  17.3  12.0
#> 5 Subject 5  28.1  13.3   5.88  29.4  38.4  27.2
```

There is an option to calculate for either sleep, wake, or both periods. By default, the metric will be calculated for the sleep period. To calculate for the wake period, set the “calculate” parameter to “wake”. The following call will calculate the metric for all readings outside of 11pm-6:59am.
```r
calculate_sleep_wake(example_data_5_subject, grade, sleep_start = 23, sleep_end = 7, calculate = "wake")
#> # A tibble: 5 × 2
#>   id        GRADE
#>   <fct>     <dbl>
#> 1 Subject 1  4.44
#> 2 Subject 2 15.5
#> 3 Subject 3  7.28
#> 4 Subject 4  3.51
#> 5 Subject 5 11.7
```

To calculate for sleep and wake periods separately and return both of those values, labeled accordingly, set the “calculate” parameter to “both”.
```r
calculate_sleep_wake(example_data_5_subject, gmi, calculate = "both")
#> # A tibble: 5 × 3
#>   id        `GMI sleep` `GMI wake`
#>   <fct>           <dbl>      <dbl>
#> 1 Subject 1        5.97       6.41
#> 2 Subject 2        8.58       8.52
#> 3 Subject 3        7.05       6.98
#> 4 Subject 4        6.70       6.31
#> 5 Subject 5        6.94       7.69
```

Custom parameters can still be passed to the applied function. Any parameters with different names from calculate_sleep_wake’s parameters will be passed on to the function. The following call applies COGI with custom-defined targets and weights.
```r
calculate_sleep_wake(example_data_5_subject, cogi, calculate = "sleep", targets = c(80, 150), weights = c(.3, .2, .5))
#> # A tibble: 5 × 2
#>   id         COGI
#>   <fct>     <dbl>
#> 1 Subject 1  91.5
#> 2 Subject 2  52.4
#> 3 Subject 3  77.8
#> 4 Subject 4  89.4
#> 5 Subject 5  63.9
```

All of these options can be combined:
```r
calculate_sleep_wake(example_data_5_subject, grade_eugly, sleep_start = 1, sleep_end = 9, calculate = "both", lower = 80, upper = 150)
#> # A tibble: 5 × 3
#>   id        `GRADE_eugly sleep` `GRADE_eugly wake`
#>   <fct>                   <dbl>              <dbl>
#> 1 Subject 1               59.0               31.0
#> 2 Subject 2                1.32               1.38
#> 3 Subject 3               41.0               21.6
#> 4 Subject 4               55.7               37.4
#> 5 Subject 5               25.3                6.99
```

The iglu package supports multiple plot types summarized below:
| Function call | Visualization description | Main parameters |
|---|---|---|
| plot_glu | Multiple plot types: time series and lasagna | plottype, lasagnatype |
| plot_roc | Time series of glucose values colored by rate of change (ROC) | subjects, timelag |
| hist_roc | Histogram of rate of change (ROC) values | subjects, timelag |
| plot_lasagna | Lasagna plot of glucose values for multiple subjects | datatype, lasagnatype |
| plot_lasagna_1subject | Lasagna plot of glucose values for a single subject | lasagnatype |
| agp | Ambulatory Glucose Profile (AGP) | maxd, daily |
| epicalc_profile | Profile of glycemic episodes | hypo_thresh, hyper_thresh |
| mage | MAGE plot displaying peaks and nadirs | plot, title |
Time-series and rate of change plots are shown in the examples below. For details on the other plot types, see the specific vignette corresponding to the visualization.
The simplest visual is the time series plot generated using the function plot_glu. This plot type can support both single and multiple subjects.
We set the ‘tz’ (time zone) parameter to be EST because the data was collected in the eastern time zone. If left blank, the time zone used for plotting will be the system’s time zone. Time zone is mainly an issue in cases where daylight saving time might make it appear as though there were duplicate values at some time points.
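A sketch of such calls, assuming iglu is installed. It uses only parameters that appear elsewhere in this document (plottype and tz).

```r
library(iglu)

# Time series plot for a single subject, with timestamps interpreted in EST
plot_glu(example_data_1_subject, plottype = "tsplot", tz = "EST")

# The same call with the multi-subject data
plot_glu(example_data_5_subject, plottype = "tsplot", tz = "EST")
```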
To plot just a single subject of interest from the grid of time series plots, set the ‘subjects’ parameter to be that subject’s ID.
The red lines can be shifted to any Lower and Upper Target Range Limits with the ‘LLTR’ and ‘ULTR’ arguments.
```r
plot_glu(example_data_5_subject, plottype = 'tsplot', subjects = 'Subject 3', LLTR = 80, ULTR = 150, tz = "EST")
```

The plot_glu function also supports lasagna plots by changing the ‘plottype’ parameter. For more on lasagna plots, see Swihart et al. (2010), “Lasagna Plots: A Saucy Alternative to Spaghetti Plots.” The lasagna plots in iglu can be single-subject or multi-subject. For more information see the lasagna_plots vignette.
In addition, iglu also allows one to visualize local changes in glucose variability as measured by rate of change (Clarke et al., 2009). There are two types of visualizations associated with rate of change, a time-series plot and a histogram. For both plots, the colors indicate rate of change: white indicates a stable rate of change, while red and blue represent times at which the glucose is significantly rising or falling, respectively. Thus colored points represent times of glucose variability, while white points represent glucose stability. The figure below shows a side-by-side comparison of rate of change time-series plots for two subjects. Subject 1 shows significantly less glucose variability than Subject 5.
The next figure shows a side-by-side comparison of rate of change histogram plots for the same subjects. Once again, the colors show in what direction and how quickly the glucose is changing. The histogram plots allow one to immediately assess the variation in rate of change. Extreme values on either end of the histogram indicate very rapid rises or drops in glucose, that is, a high degree of local variability. Here, Subject 1 once again shows lower glucose variability by having a narrower histogram with most values falling between -2 mg/dl/min and 2 mg/dl/min. Subject 5 has a shorter, more widely distributed histogram, indicating greater glucose variability.
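The quantity behind both plots can be sketched in base R as the change in glucose divided by the elapsed time. This is a simplified illustration on made-up readings, not the calculation used by plot_roc or hist_roc, and the cutoff of 1 mg/dL/min separating "stable" from "rising" or "falling" is an arbitrary assumption.

```r
# Hypothetical readings 15 minutes apart (timestamps in minutes)
t_min <- c(0, 15, 30, 45, 60)
gl    <- c(100, 130, 135, 90, 92)

# Rate of change in mg/dL per minute between consecutive readings
roc <- diff(gl) / diff(t_min)

# Classify each interval; the +/- 1 mg/dL/min cutoff is an arbitrary illustration
direction <- cut(roc, breaks = c(-Inf, -1, 1, Inf),
                 labels = c("falling", "stable", "rising"))
data.frame(roc = round(roc, 2), direction)
```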