Title: A Traceability Focused Grammar of Clinical Data Summary
Version: 1.2.1
Description: A traceability focused tool created to simplify the data manipulation necessary to create clinical summaries.
License: MIT + file LICENSE
URL: https://github.com/atorus-research/Tplyr
BugReports: https://github.com/atorus-research/Tplyr/issues
Encoding: UTF-8
Depends: R (≥ 3.5.0)
Imports: rlang (≥ 0.4.6), assertthat (≥ 0.2.1), magrittr (≥ 1.5), dplyr (≥ 1.0.0), purrr (≥ 0.3.3), stringr (≥ 1.4.0), tidyr (≥ 1.0.2), tidyselect (≥ 1.1.0), tibble (≥ 3.0.1), lifecycle, forcats (≥ 1.0.0)
Suggests: testthat (≥ 2.1.0), haven (≥ 2.2.0), knitr, rmarkdown, huxtable, tidyverse, readr, kableExtra, pharmaRTF, withr
VignetteBuilder: knitr
RoxygenNote: 7.2.3
RdMacros: lifecycle
Config/testthat/edition: 3
LazyData: true
NeedsCompilation: no
Packaged: 2024-02-19 21:00:25 UTC; mike.stackhouse
Author: Eli Miller [aut], Mike Stackhouse [aut, cre], Ashley Tarasiewicz [aut], Nathan Kosiba [ctb], Sadchla Mascary [ctb], Andrew Bates [ctb], Shiyu Chen [ctb], Oleksii Mikryukov [ctb], Atorus Research LLC [cph]
Maintainer: Mike Stackhouse <mike.stackhouse@atorusresearch.com>
Repository: CRAN
Date/Publication: 2024-02-20 08:00:02 UTC

A grammar of summary data for clinical reports

Description

'r lifecycle::badge("experimental")'

Details

'Tplyr' is a package dedicated to simplifying the data manipulation necessary to create clinical reports. Clinical data summaries can often be broken down into two factors - counting discrete variables (or counting shifts in state), and descriptive statistics around a continuous variable. Many of the tables that go into a clinical report are made up of these two scenarios. By abstracting this process away, 'Tplyr' allows you to rapidly build these tables without worrying about the underlying data manipulation.

'Tplyr' takes this process a few steps further by abstracting away most of the programming that goes into proper presentation, which is where a great deal of programming time is spent. For example, 'Tplyr' allows you to easily control:

String formatting

Different reports warrant different presentation of your strings. Programming this can get tedious, as you typically want to make sure that your decimals properly align. 'Tplyr' abstracts this process away and provides you with a simple interface to specify how you want your data presented.

Treatment groups

Need a total column? Need to group summaries of multiple treatments? 'Tplyr' makes it simple to add additional treatment groups into your report.

Denominators

n (%) counts often vary based on the summary being performed. 'Tplyr' allows you to easily control what denominators are used based on a few common scenarios.

Sorting

Summarizing data is one thing, but ordering it for presentation is another. 'Tplyr' automatically derives sorting variables to give you the data you need to order your table properly. This process is flexible, so you can easily get what you want by leveraging your data or characteristics of R.

Another powerful aspect of 'Tplyr' is the objects themselves. 'Tplyr' does more than format your data. Metadata about your table is kept under the hood, and functions allow you to access the information that you need. For example, 'Tplyr' allows you to calculate and access the raw numeric data of calculations as well, and easily pick out just the pieces of information that you need.

Lastly, 'Tplyr' was built to be flexible, yet intuitive. A common pitfall of building tools like this is over-automation. By doing too much, you end up not doing enough. 'Tplyr' aims to hit the sweet spot in between. Additionally, we designed our function interfaces to be clean. Modifier functions offer you flexibility when you need it, but defaults can be set to keep the code concise. This allows you to quickly assemble your table, and easily make changes where necessary.

Author(s)

Maintainer: Mike Stackhouse <mike.stackhouse@atorusresearch.com> (ORCID)

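Authors:

Eli Miller
Ashley Tarasiewicz

Other contributors:

Nathan Kosiba [contributor]
Sadchla Mascary [contributor]
Andrew Bates [contributor]
Shiyu Chen [contributor]
Oleksii Mikryukov [contributor]
Atorus Research LLC [copyright holder]

See Also

Useful links:

https://github.com/atorus-research/Tplyr
Report bugs at https://github.com/atorus-research/Tplyr/issues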

Examples

# Load in pipe
library(magrittr)

# Use just the defaults
tplyr_table(mtcars, gear) %>%
  add_layer(
    group_desc(mpg, by=cyl)
  ) %>%
  add_layer(
    group_count(carb, by=cyl)
  ) %>%
  build()

# Customize and modify
tplyr_table(mtcars, gear) %>%
  add_layer(
    group_desc(mpg, by=cyl) %>%
      set_format_strings(
        "n"         = f_str("xx", n),
        "Mean (SD)" = f_str("a.a+1 (a.a+2)", mean, sd, empty='NA'),
        "Median"    = f_str("a.a+1", median),
        "Q1, Q3"    = f_str("a, a", q1, q3, empty=c(.overall='NA')),
        "Min, Max"  = f_str("a, a", min, max),
        "Missing"   = f_str("xx", missing)
      )
  ) %>%
  add_layer(
    group_count(carb, by=cyl) %>%
      add_risk_diff(
        c('5', '3'),
        c('4', '3')
      ) %>%
      set_format_strings(
        n_counts = f_str('xx (xx%)', n, pct),
        riskdiff = f_str('xx.xxx (xx.xxx, xx.xxx)', dif, low, high)
      ) %>%
      set_order_count_method("bycount") %>%
      set_ordering_cols('4') %>%
      set_result_order_var(pct)
  ) %>%
  build()

# A Shift Table
tplyr_table(mtcars, am) %>%
  add_layer(
    group_shift(vars(row=gear, column=carb), by=cyl) %>%
      set_format_strings(f_str("xxx (xx.xx%)", n, pct))
  ) %>%
  build()

Pipe operator

Description

See magrittr::%>% for details.

Usage

lhs %>% rhs

Arguments

lhs

A value or the magrittr placeholder.

rhs

A function call using the magrittr semantics.

Value

The result of calling 'rhs(lhs)'.


Add an anti-join onto a tplyr_meta object

Description

An anti-join allows a tplyr_meta object to refer to data that should be extracted from a separate dataset, like the population data of a Tplyr table, that is unavailable in the target dataset. The primary use case for this is the presentation of missing subjects, which in a Tplyr table is presented using the function add_missing_subjects_row(). The missing subjects themselves are not present in the target data, and are thus only available in the population data. The add_anti_join() function allows you to provide the meta information relevant to the population data, and then specify the on variable that should be used to join with the target dataset and find the values present in the population data that are missing from the target data.

Usage

add_anti_join(meta, join_meta, on)

Arguments

meta

A tplyr_meta object referring to the target data

join_meta

A tplyr_meta object referring to the population data

on

A list of quosures containing symbols - most likely set to USUBJID.

Value

A tplyr_meta object

Examples

tm <- tplyr_meta(
  rlang::quos(TRT01A, SEX, ETHNIC, RACE),
  rlang::quos(TRT01A == "Placebo", TRT01A == "SEX", ETHNIC == "HISPANIC OR LATINO")
)

tm %>%
  add_anti_join(
    tplyr_meta(
      rlang::quos(TRT01A, ETHNIC),
      rlang::quos(TRT01A == "Placebo", ETHNIC == "HISPANIC OR LATINO")
    ),
    on = rlang::quos(USUBJID)
  )

Attach column headers to a Tplyr output

Description

When working with 'huxtable' tables, column headers can be controlled as if they are rows in the data frame. add_column_headers eases the process of introducing these headers.

Usage

add_column_headers(.data, s, header_n = NULL)

Arguments

.data

The data.frame/tibble on which the headers shall be attached

s

The text containing the intended header string

header_n

A header_n or generic data.frame to use for binding count values. This is required if you are using the token replacement.

Details

Headers are created by providing a single string. Columns are specified by delimiting each header with a '|' symbol. Instead of specifying the destination of each header, add_column_headers assumes that you have organized the columns of your data frame beforehand. This means that after you use Tplyr::build(), if you'd like to reorganize the default column order (which is simply alphabetical), simply pass the build output to a dplyr::select or dplyr::relocate statement before passing it into add_column_headers.

Spanning headers are also supported. A spanning header is an overarching header that sits across multiple columns. Spanning headers are introduced to add_column_headers by providing the spanner text (i.e. the text that you'd like to sit in the top row), and then the spanned text (the bottom row) within curly brackets ('{}'). For example, take the iris dataset. We have the names:

"Sepal.Length" "Sepal.Width" "Petal.Length" "Petal.Width" "Species"

If we wanted to provide a header string for this dataset, with spanners to help with categorization ofthe variables, we could provide the following string:

"Sepal {Length | Width} | Petal {Length | Width} | Species"

Value

A data.frame with the processed header string elements attached as the top rows

Important note

Make sure you are aware of the order of your variables prior to passing them in to add_column_headers. The only requirement is that the number of columns matches. The rest is up to you.

Development notes

There are a few features of add_column_headers that are intended but not yet supported:

Token Replacement

This function has support for reading values from the header_n object in a Tplyr table and adding them in the column headers. Note: The order of the parameters passed in the token is important. They should be first the treatment variable, then any cols variables in the order they were passed in the table construction.

Use a double asterisk "**" at the beginning to start the token and another double asterisk to close it. You can separate column parameters in the token with a single underscore. For example, **group1_flag2_param3** will pull the count from the header_n binding for group1 in the treat_var, flag2 in the first cols argument, and param3 in the second cols argument.

You can pass fewer arguments in the token to get the sum of multiple columns. For example, **group1** would get the sum of the group1 treat_var, and all cols from the header_n.

Examples

# Load in pipe
library(magrittr)
library(dplyr)

header_string <- "Sepal {Length | Width} | Petal {Length | Width} | Species"

iris2 <- iris %>%
  mutate_all(as.character)

iris2 %>% add_column_headers(header_string)

# Example with counts
mtcars2 <- mtcars %>%
  mutate_all(as.character)

t <- tplyr_table(mtcars2, vs, cols = am) %>%
  add_layer(
    group_count(cyl)
  )

b_t <- build(t) %>%
  mutate_all(as.character)

count_string <- paste0(" | V N=**0** {auto N=**0_0** | man N=**0_1**} |",
                       " S N=**1** {auto N=**1_0** | man N=**1_1**} | | ")

add_column_headers(b_t, count_string, header_n(t))

Attach a layer to a tplyr_table object

Description

add_layer attaches a tplyr_layer to a tplyr_table object. This allows for a tidy style of programming (using magrittr piping, i.e. %>%) with a secondary advantage - the construction of the layer object may consist of a series of piped functions itself.

Tplyr encourages a user to view the construction of a table as a series of "layers". The construction of each of these layers is isolated and independent of the others - but each layer is a child of the table itself. add_layer isolates the construction of an individual layer and allows the user to construct that layer and insert it back into the parent. The syntax for this is intuitive and allows for tidy piping. Simply pipe the current table object in, and write the code to construct your layer within the layer parameter.

add_layers is another approach to attaching layers to a tplyr_table. Instead of constructing the entire table at once, add_layers allows you to construct layers as different objects. These layers can then be attached into the tplyr_table all at once.

add_layer and add_layers both additionally allow you to name the layers as you attach them. This is helpful when using functions like get_numeric_data or get_stats_data, where you can access information from a layer directly. add_layer has a name parameter, and layers can be named in add_layers by submitting the layer as a named argument.
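As a minimal sketch of why naming a layer is useful, the example below attaches a named descriptive statistics layer and then retrieves only that layer's numeric results. The layer name "mpg_stats" is an arbitrary choice for illustration, and the sketch assumes that get_numeric_data accepts the layer name through its layer argument.

```r
library(Tplyr)
library(magrittr)

# Name the layer as it is attached to the table
t <- tplyr_table(mtcars, gear) %>%
  add_layer(name = "mpg_stats", group_desc(mpg))

# Process the table so the numeric results are available
res <- build(t)

# Pull the unformatted numeric results for the named layer only
num <- get_numeric_data(t, layer = "mpg_stats")
head(num)
```

Referring to a layer by name rather than by position keeps this kind of lookup stable if layers are later reordered.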

Usage

add_layer(parent, layer, name = NULL)

add_layers(parent, ...)

Arguments

parent

A tplyr_table or tplyr_layer/tplyr_subgroup_layer object

layer

A layer construction function and associated modifier functions

name

A name to provide the layer in the table layers container

...

Layers to be added

Value

A tplyr_table or tplyr_layer/tplyr_subgroup_layer with a new layer inserted into the layer binding

See Also

[tplyr_table(), tplyr_layer(), group_count(), group_desc(), group_shift()]

Examples

# Load in pipe
library(magrittr)

## Single layer
t <- tplyr_table(mtcars, cyl) %>%
  add_layer(
    group_desc(target_var=mpg)
  )

## Single layer with name
t <- tplyr_table(mtcars, cyl) %>%
  add_layer(name='mpg',
    group_desc(target_var=mpg)
  )

# Using add_layers
t <- tplyr_table(mtcars, cyl)
l1 <- group_desc(t, target_var=mpg)
l2 <- group_count(t, target_var=cyl)

t <- add_layers(t, l1, 'cyl' = l2)

Add a missing subject row into a count summary.

Description

This function calculates the number of subjects missing from a particular group of results. The calculation is done by examining the total number of subjects potentially available from the Header N values within the result column, and finding the difference with the total number of subjects present in the result group. Note that for accurate results, the subject variable needs to be defined using the 'set_distinct_by()' function. As with other methods, this function instructs how distinct results should be identified.

Usage

add_missing_subjects_row(e, fmt = NULL, sort_value = NULL)

Arguments

e

A 'count_layer' object

fmt

An f_str object used to format the total row. If none is provided, display is based on the layer formatting.

sort_value

The value that will appear in the ordering column for total rows. This must be a numeric value.

Examples

tplyr_table(mtcars, gear) %>%
  add_layer(
    group_count(cyl) %>%
      add_missing_subjects_row(f_str("xxxx", n))
  ) %>%
  build()

Add risk difference to a count layer

Description

A very common requirement for summary tables is to calculate the risk difference between treatment groups. add_risk_diff allows you to do this. The underlying risk difference calculations are performed using the Base R function prop.test - so prior to using this function, be sure to familiarize yourself with its functionality.

Usage

add_risk_diff(layer, ..., args = list(), distinct = TRUE)

Arguments

layer

Layer upon which the risk difference will be attached

...

Comparison groups, provided as character vectors where the first group is the comparison, and the second is the reference

args

Arguments passed directly into prop.test

distinct

Logical - Use distinct counts (if available).

Details

add_risk_diff can only be attached to a count layer, so the count layer must be constructed first. add_risk_diff allows you to compare the difference between treatment groups, so all comparisons should be based upon the values within the specified treat_var in your tplyr_table object.

Comparisons are specified by providing two-element character vectors. You can provide as many of these groups as you want. You can also use groups that have been constructed using add_treat_grps or add_total_group. The first element provided will be considered the 'comparison' group (i.e. the left side of the comparison), and the second element will be considered the 'reference' group. So if you'd like to see the risk difference of 'T1 - Placebo', you would specify this as c('T1', 'Placebo').

Tplyr forms your two-way table in the background, and then runs prop.test appropriately. Similar to the way that the display of layers is specified, the exact values and format of how you'd like the risk difference displayed are set using set_format_strings. This controls both the values and the format of how the risk difference is displayed. Risk difference formats are set within set_format_strings by using the name 'riskdiff'.

You have 5 variables to choose from in your data presentation:

comp

Probability of the left hand side group (i.e. comparison)

ref

Probability of the right hand side group (i.e. reference)

dif

Difference of comparison - reference

low

Lower end of the confidence interval (default is 95%, override with the args parameter)

high

Upper end of the confidence interval (default is 95%, override with the args parameter)

Use these variable names when forming your f_str objects. The default presentation, if no string format is specified, will be:

f_str('xx.xxx (xx.xxx, xx.xxx)', dif, low, high)

Note - within Tplyr, you can account for negatives by allowing an extra space within your integer side settings. This will help with your alignment.

If columns are specified on a Tplyr table, risk difference comparisons still only take place between groups within the treat_var variable - but they are instead calculated treating the cols variables as by variables. Just like the tplyr layers themselves, the risk difference will then be transposed and display each risk difference as separate variables by each of the cols variables.

If distinct is TRUE (the default), all calculations will take place on the distinct counts, if they are available. Otherwise, non-distinct counts will be used.

One final note - prop.test may throw quite a few warnings. This is natural, because it alerts you when there's not enough data for the approximations to be correct. This may be unnerving coming from a SAS programming world, but this is R trying to alert you that the values provided don't have enough data to truly be statistically accurate.
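To make the five display variables concrete, here is a rough sketch of the underlying Base R calculation using prop.test directly. The event counts and group sizes are made up for illustration, and the sketch does not show how Tplyr assembles its two-way table internally; it only shows where values corresponding to comp, ref, dif, low and high can be read from a prop.test result.

```r
# Hypothetical counts: 12 of 40 subjects in the comparison group and
# 5 of 40 subjects in the reference group had the event
events <- c(12, 5)
totals <- c(40, 40)

res <- prop.test(x = events, n = totals)

comp <- unname(res$estimate[1])  # proportion in the comparison group
ref  <- unname(res$estimate[2])  # proportion in the reference group
dif  <- comp - ref               # comparison - reference
low  <- res$conf.int[1]          # lower confidence limit of the difference
high <- res$conf.int[2]          # upper confidence limit of the difference

# Options such as the confidence level are ordinary prop.test arguments
res90 <- prop.test(x = events, n = totals, conf.level = 0.9)
```

Arguments such as conf.level, correct and alternative are the kind of options that the args parameter forwards to prop.test.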

Examples

library(magrittr)

## Two group comparisons with default options applied
t <- tplyr_table(mtcars, gear)

# Basic risk diff for two groups, using defaults
l1 <- group_count(t, carb) %>%
  # Compare 3 vs. 4, 3 vs. 5
  add_risk_diff(
    c('3', '4'),
    c('3', '5')
  )

# Build and show output
add_layers(t, l1) %>% build()

## Specify custom formats and display variables
t <- tplyr_table(mtcars, gear)

# Create the layer with custom formatting
l2 <- group_count(t, carb) %>%
  # Compare 3 vs. 4, 3 vs. 5
  add_risk_diff(
    c('3', '4'),
    c('3', '5')
  ) %>%
  set_format_strings(
    'n_counts' = f_str('xx (xx.x)', n, pct),
    'riskdiff' = f_str('xx.xxx, xx.xxx, xx.xxx, xx.xxx, xx.xxx', comp, ref, dif, low, high)
  )

# Build and show output
add_layers(t, l2) %>% build()

## Passing arguments to prop.test
t <- tplyr_table(mtcars, gear)

# Create the layer with args option
l3 <- group_count(t, carb) %>%
  # Compare 3 vs. 4, 3 vs. 5
  add_risk_diff(
    c('3', '4'),
    c('3', '5'),
    args = list(conf.level = 0.9, correct=FALSE, alternative='less')
  )

# Build and show output
add_layers(t, l3) %>% build()

Add a Total row into a count summary.

Description

Adding a total row creates an additional observation in the count summary that presents the total counts (i.e. the n's that are summarized). The total row is formatted in the same way as the other count strings.

Usage

add_total_row(e, fmt = NULL, count_missings = TRUE, sort_value = NULL)

Arguments

e

A count_layer object

fmt

An f_str object used to format the total row. If none is provided, display is based on the layer formatting.

count_missings

Whether or not to ignore the named arguments passed in 'set_count_missing()' when calculating the counts for the total row. This is useful if you need to exclude/include the missing counts in your total row. Defaults to TRUE, meaning the total row will not ignore any values.

sort_value

The value that will appear in the ordering column for total rows. This must be a numeric value.

Details

Totals are calculated using all grouping variables, including treat_var and cols from the table level. If by variables are included, the grouping of the total and the application of denominators becomes ambiguous. You will be warned specifically if a percent is included in the format. To rectify this, use set_denoms_by(), and the grouping of add_total_row() will be updated accordingly.

Note that when using add_total_row() with set_pop_data(), you should call add_total_row() AFTER calling set_pop_data(), otherwise there is potential for unexpected behavior with treatment groups.

Examples

# Load in Pipe
library(magrittr)

tplyr_table(mtcars, gear) %>%
  add_layer(
    group_count(cyl) %>%
      add_total_row(f_str("xxxx", n))
  ) %>%
  build()

Combine existing treatment groups for summary

Description

Summary tables often present individual treatment groups, but may additionally have a "Treatment vs. Placebo" or "Total" group added to show grouped summary statistics or counts. This set of functions offers an interface to add these groups at a table level and be consumed by subsequent layers.

Usage

add_treat_grps(table, ...)

add_total_group(table, group_name = "Total")

treat_grps(table)

Arguments

table

A tplyr_table object

...

A named vector where names will become the new treatment group names, and values will be used to construct those treatment groups

group_name

The treatment group name used for the constructed 'Total' group

Details

add_treat_grps allows you to specify specific groupings. This is done by supplying named arguments, where the name becomes the new treatment group's name, and those treatment groups are made up of the argument's values.

add_total_group is a simple wrapper around add_treat_grps. Instead of producing custom groupings, it produces a "Total" group by the supplied name, which defaults to "Total". This "Total" group is made up of all existing treatment groups within the population dataset.

Note that when using add_treat_grps or add_total_row() with set_pop_data(), you should call add_total_row() AFTER calling set_pop_data(), otherwise there is potential for unexpected behavior with treatment groups.

The function treat_grps allows you to see the custom treatment groups available in your tplyr_table object.

Value

The modified table object

Examples

tab <- tplyr_table(iris, Species)

# A custom group
add_treat_grps(tab, "Not Setosa" = c("versicolor", "virginica"))

# Add a total group
add_total_group(tab)

treat_grps(tab)
# Returns:
# $`Not Setosa`
#[1] "versicolor" "virginica"
#
#$Total
#[1] "setosa"     "versicolor" "virginica"

Add variables to a tplyr_meta object

Description

Add additional variable names to a tplyr_meta() object.

Usage

add_variables(meta, names)

add_filters(meta, filters)

Arguments

meta

A tplyr_meta object

names

A list of names, providing variable names of interest. Provide as a list of quosures using rlang::quos()

filters

A list of filter conditions (for example, a == 1) to apply to the variables of interest. Provide as a list of quosures using 'rlang::quos()'

Value

tplyr_meta object

Examples

m <- tplyr_meta()
m <- add_variables(m, rlang::quos(a, b, c))
m <- add_filters(m, rlang::quos(a==1, b==2, c==3))
m

Append the Tplyr table metadata dataframe

Description

append_metadata() allows a user to extend the Tplyr metadata data frame with user-provided data. In some tables, Tplyr may be able to provide most of the data, but a user may have to extend the table with other summaries, statistics, etc. This function allows the user to extend the tplyr_table's metadata with their own metadata content using custom data frames created using the tplyr_meta object.

Usage

append_metadata(t, meta)

Arguments

t

A tplyr_table object

meta

A dataframe fitting the specifications of the details section of this function

Details

As this is an advanced feature of Tplyr, ownership is on the user to make sure the metadata data frame is assembled properly. The only restrictions applied by append_metadata() are that meta must have a column named row_id, and the values in row_id cannot be duplicates of any row_id value already present in the Tplyr metadata dataframe. tplyr_meta() objects align with constructed dataframes using the row_id and output dataset column name. As such, tplyr_meta() objects should be inserted into a data frame using a list column.

Value

A tplyr_table object

Examples

t <- tplyr_table(mtcars, gear) %>%
  add_layer(
    group_desc(wt)
  )

t %>%
  build(metadata=TRUE)

m <- tibble::tibble(
  row_id = c('x1_1'),
  var1_3 = list(tplyr_meta(rlang::quos(a, b, c), rlang::quos(a==1, b==2, c==3)))
)

append_metadata(t, m)

Conditional reformatting of a pre-populated string of numbers

Description

This function allows you to conditionally re-format a string of numbers based on a numeric value within the string itself. By selecting a "format group", which is targeting a specific number within the string, a user can establish a condition upon which a provided replacement string can be used. Either the entire replacement can be used to replace the entire string, or the replacement text can refill the "format group" while preserving the original width and alignment of the target string.

Usage

apply_conditional_format(
  string,
  format_group,
  condition,
  replacement,
  full_string = FALSE
)

Arguments

string

Target character vector where text may be replaced

format_group

An integer representing the targeted numeric field within the string, numbered from left to right

condition

An expression, using the variable name 'x' as the target variable within the condition

replacement

A string to use as the replacement value

full_string

TRUE if the full string should be replaced, FALSE if the replacement should be done within the format group

Value

A character vector

Examples

string <- c(" 0  (0.0%)", " 8  (9.3%)", "78 (90.7%)")

apply_conditional_format(string, 2, x == 0, " 0        ", full_string=TRUE)

apply_conditional_format(string, 2, x < 1, "(<1%)")

Apply Format Strings outside of a Tplyr table

Description

The f_str object in Tplyr is used to drive formatting of the output strings within a Tplyr table. This function allows a user to use the same interface to apply formatted strings on any data frame within a dplyr::mutate() context.

Usage

apply_formats(format_string, ..., empty = c(.overall = ""))

Arguments

format_string

The desired display format. X's indicate digits. On the left, the number of x's indicates the integer length. On the right, the number of x's controls decimal precision and rounding. Variables are inferred by any separation of the 'x' values other than a decimal.

...

The variables to be formatted using the format specified in format_string. These must be numeric variables.

empty

The string to display when the numeric data is not available. Use a single element character vector, with the element named '.overall' to instead replace the whole string.

Details

Note that auto-precision is not currently supported within apply_formats()

Value

Character vector of formatted values

Examples

library(dplyr)

mtcars %>%
  head() %>%
  mutate(
    fmt_example = apply_formats('xxx (xx.x)', hp, wt)
  )

Replace repeating row label variables with blanks in preparation for display.

Description

Depending on the display package being used, row label values may need to be blanked out if they are repeating. This gives the data frame supporting the table the appearance of the grouping variables being grouped together in blocks. apply_row_masks does this work by blanking out the value of any row_label variable where the current value is equal to the value before it. Note - apply_row_masks assumes that the data frame has already been sorted and therefore should only be applied once the data frame is in its final sort sequence.

Usage

apply_row_masks(dat, row_breaks = FALSE, ...)

Arguments

dat

Data.frame / tibble to mask repeating row_labels

row_breaks

Boolean - set to TRUE to insert row breaks

...

Variable used to determine where row-breaks should be inserted. Breaks will be inserted when this group of variables changes values. This is determined by dataset order, so sorting should be done prior to using apply_row_masks. If left empty, ord_layer_index will be used.

Details

Additionally, apply_row_masks can add row breaks for you between each layer. Row breaks are inserted as blank rows. This relies on the "break by" variables (submitted via ...) constructed in build still being attached to the dataset. An additional order variable is attached named ord_break, but the output dataset is sorted to properly insert the row breaks between layers.

Value

tibble with blanked out rows where values are repeating
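A minimal sketch of apply_row_masks on a two-layer table follows. The mtcars data and the "Cylinders" and "MPG" text labels are chosen only for illustration. The build output is passed in as-is, on the assumption that it is already in its final sort order and still carries ord_layer_index.

```r
library(Tplyr)
library(magrittr)

dat <- tplyr_table(mtcars, gear) %>%
  add_layer(
    group_count(cyl, by = "Cylinders")
  ) %>%
  add_layer(
    group_desc(mpg, by = "MPG")
  ) %>%
  build()

# Blank out repeating row labels and insert a blank row between layers.
# No break variables are supplied, so ord_layer_index is used.
masked <- apply_row_masks(dat, row_breaks = TRUE)
masked
```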


Trigger the execution of the tplyr_table

Description

The functions used to assemble a tplyr_table object and each of the layers do not trigger the processing of any data. Rather, a lazy execution style is used to allow you to construct your table and then explicitly state when the data processing should happen. build triggers this event.

Usage

build(x, metadata = FALSE)

Arguments

x

A tplyr_table object

metadata

Trigger to build metadata. Defaults to FALSE

Details

When the build command is executed, all of the data processing commences. Any pre-processing necessary within the table environment takes place first. Next, each of the layers begins executing. Once the layers complete executing, the output of each layer is stacked into the resulting data frame.

Once this process is complete, any post-processing necessary within the table environment takes place, and the final output can be delivered. Metadata and traceability information are kept within each of the layer environments, which allows an investigation into the source of the resulting data points. For example, numeric data from any summaries performed is maintained and accessible within a layer using get_numeric_data.

The 'metadata' option of build will trigger the construction of traceability metadata for the constructed data frame. Essentially, for every "result" that Tplyr produces, Tplyr can also generate the steps necessary to obtain the source data which produced that result from the input. For more information, see vignette("metadata").
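The sketch below illustrates the metadata option on a small table. It assumes that the built output carries a row_id column when metadata = TRUE, that get_metadata() returns the table's metadata data frame, and that get_meta_result() looks up one result cell by its row_id and output column name. The column name var1_3 corresponds to gear == 3 in this mtcars example.

```r
library(Tplyr)
library(magrittr)

t <- tplyr_table(mtcars, gear) %>%
  add_layer(
    group_desc(wt)
  )

# Constructing the table processed nothing; build() does the work
res <- build(t, metadata = TRUE)

# Each output row is tagged with an identifier that links it to its metadata
res$row_id

# The full metadata data frame for the table
meta <- get_metadata(t)

# Variables and filters needed to trace one result cell back to the source
cell_meta <- get_meta_result(t, res$row_id[1], "var1_3")
cell_meta
```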

Value

An executed tplyr_table

See Also

tplyr_table, tplyr_layer, add_layer, add_layers, layer_constructors

Examples

# Load in Pipe
library(magrittr)

tplyr_table(iris, Species) %>%
  add_layer(
    group_desc(Sepal.Length, by = "Sepal Length")
  ) %>%
  add_layer(
    group_desc(Sepal.Width, by = "Sepal Width")
  ) %>%
  build()

Collapse row labels into a single column

Description

This is a generalized post processing function that allows you to take groups of by variables and collapse them into a single column. Repeating values are split into separate rows, and for each level of nesting, a specified indentation level can be applied.

Usage

collapse_row_labels(x, ..., indent = "  ", target_col = row_label)

Arguments

x

Input data frame

...

Row labels to be collapsed

indent

Indentation string to be used, which is multiplied at each indentation level

target_col

The desired name of the output column containing collapsed row labels

Value

data.frame with row labels collapsed into a single column

Examples

x <- tibble::tribble(
  ~row_label1, ~row_label2, ~row_label3, ~row_label4, ~var1,
  "A",         "C",         "G",         "M",        1L,
  "A",         "C",         "G",         "N",        2L,
  "A",         "C",         "H",         "O",        3L,
  "A",         "D",         "H",         "P",        4L,
  "A",         "D",         "I",         "Q",        5L,
  "A",         "D",         "I",         "R",        6L,
  "B",         "E",         "J",         "S",        7L,
  "B",         "E",         "J",         "T",        8L,
  "B",         "E",         "K",         "U",        9L,
  "B",         "F",         "K",         "V",        10L,
  "B",         "F",         "L",         "W",        11L
)

collapse_row_labels(x, row_label1, row_label2, row_label3, row_label4)

collapse_row_labels(x, row_label1, row_label2, row_label3)

collapse_row_labels(x, row_label1, row_label2, indent = "    ", target_col = rl)

Create a f_str object

Description

f_str objects are intended to be used within the function set_format_strings. The f_str object carries information that powers a significant amount of layer processing. The format_string parameter is capable of controlling the display of a data point and decimal precision. The variables provided in ... control which data points are used to populate the string formatted output.

Usage

f_str(format_string, ..., empty = c(.overall = ""))

Arguments

format_string

The desired display format. X's indicate digits. On the left, the number of x's indicates the integer length. On the right, the number of x's controls decimal precision and rounding. Variables are inferred by any separation of the 'x' values other than a decimal.

...

The variables to be formatted using the format specified in format_string.

empty

The string to display when the numeric data is not available. For desc layers, an unnamed character vector will populate within the provided format string, set to the same width as the fitted numbers. Use a single element character vector, with the element named '.overall', to instead replace the whole string.

Details

Format strings are one of the most powerful components of 'Tplyr'. Traditionally, converting numeric values into strings for presentation can consume a good deal of time. Values and decimals need to align between rows, rounding before trimming is sometimes forgotten - it can become a tedious mess that is realistically not an important part of the analysis being performed. 'Tplyr' makes this process as simple as we can, while still allowing flexibility to the user.

Tplyr provides both manual and automatic decimal precision formatting. The display of the numbers in the resulting data frame is controlled by the format_string parameter. For manual precision, just as dummy values may be presented on your mocks, integer and decimal precision is specified by the user providing a string of 'x's for how you'd like your numbers formatted. If you'd like 2 integers with 3 decimal places, you specify your string as 'xx.xxx'. 'Tplyr' does the work to get the numbers in the right place.

To take this a step further, automatic decimal precision can also be obtained based on the collected precision within the data. When creating tables where results vary by some parameter, different results may call for different degrees of precision. To use automatic precision, use a single 'a' on either the integer or decimal side. If you'd like to use increased precision (i.e. you'd like mean to be collected precision +1), use 'a+1'. So if you'd like both integer and decimal precision to be based on the data as collected, you can use a format like 'a.a' - or for collected+1 decimal precision, 'a.a+1'. You can mix and match this with manual formats as well, making format strings such as 'xx.a+1'.
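The idea of "collected precision" can be pictured with a small base R sketch. This is not Tplyr's implementation: collected_precision() is a hypothetical helper written for this illustration, and it simply counts the digits on each side of the decimal point as the values are printed (non-negative values assumed).

```r
# Hypothetical helper, not part of Tplyr: count the integer and decimal
# digits of each value as written, then keep the maximum of each.
collected_precision <- function(x) {
  chr <- as.character(x)
  int_part <- sub("\\..*$", "", chr)      # text before the decimal point
  dec_part <- sub("^[^.]*\\.?", "", chr)  # text after the decimal point
  c(int = max(nchar(int_part)), dec = max(nchar(dec_part)))
}

values <- c(1.5, 12.25, 103)
prec <- collected_precision(values)

# Manual format 'xx.xxx': 2 integer spaces, 3 decimals, total width 2 + 1 + 3
manual <- formatC(mean(values), width = 6, format = "f", digits = 3)

# Automatic format 'a.a+1': collected integer width, collected decimals + 1
digits <- prec[["dec"]] + 1
auto <- formatC(mean(values), width = prec[["int"]] + 1 + digits,
                format = "f", digits = digits)

print(prec)
print(c(manual = manual, auto = auto))
```

With these toy values the collected precision is 3 integer digits and 2 decimal digits, so 'a.a+1' reserves a wider field than 'xx.xxx' for the same mean.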

If you want two numbers on the same line, you provide two sets of x's. For example, if you're presenting a value like "mean (sd)" - you could provide the string 'xx.xx (xx.xxx)', or perhaps 'a.a+1 (a.a+2)'. Note that you're able to provide different integer lengths and different decimal precision for the two values. Each format string is independent and relates only to the format specified.

As described above, when using 'x' or 'a', any other character within the format string will stay stationary. So for example, if your format string is 'xx (xxx.x)', your number may format as '12 ( 34.5)'. So the left side parenthesis stays fixed. In some displays, you may want the parenthesis to 'hug' your number. Following this example, when allotting 3 spaces for the integer within parentheses, the parenthesis should shift to the right, making the numbers appear '12 (34.5)'. Using f_str() you can achieve this by using a capital 'X' or 'A'. For this example, the format string would be 'xx (XXX.x)'.
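The difference between a stationary and a 'hugging' parenthesis can be sketched in base R. This is only an illustration of the idea, not Tplyr's code, and it assumes that the overall field width is kept, so the padding moves to the left of the parenthesis rather than disappearing.

```r
n <- 12L
pct <- 34.5

# 'xx (xxx.x)': pad the number to width 5 first, then add the parenthesis,
# so the parenthesis stays in a fixed position
fixed <- paste0(formatC(n, width = 2, format = "d"), " (",
                formatC(pct, width = 5, format = "f", digits = 1), ")")

# 'xx (XXX.x)': attach the parenthesis to the number first, then pad the
# pair to width 6 (assumption: the total width is preserved)
hugged_num <- paste0("(", formatC(pct, format = "f", digits = 1))
hugged <- paste0(formatC(n, width = 2, format = "d"), " ",
                 formatC(hugged_num, width = 6), ")")

print(c(fixed = fixed, hugged = hugged))
```

In the first string the blank sits between the parenthesis and the number; in the second it sits before the parenthesis.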

There are two rules when using 'parenthesis hugging':

The other parameters of the f_str call specify what values should fill the x's. f_str objects are used slightly differently between different layers. When declaring a format string within a count layer, f_str() expects to see the values n or distinct_n for event or distinct counts, pct or distinct_pct for event or distinct percentages, or total or distinct_total for denominator calculations. Note that in an f_str() for a count layer 'A' or 'a' are based on n counts, and therefore don't make sense to use in percentages. But in descriptive statistic layers, f_str parameters refer to the names of the summaries being performed, either by built in defaults, or custom summaries declared using set_custom_summaries(). See set_format_strings() for some more notes about layer-specific implementation.

An f_str() may also be used outside of a Tplyr table. The function apply_formats() allows you to apply an f_str within the context of dplyr::mutate() or more generally a vectorized function.

Value

An f_str object

Valid f_str() Variables by Layer Type

Valid variables allowed within the ... parameter of f_str() differ by layer type.

Examples

f_str("xx.x (xx.x)", mean, sd)

f_str("a.a+1 (a.a+2)", mean, sd)

f_str("xx.a (xx.a+1)", mean, sd)

f_str("xx.x, xx.x, xx.x", q1, median, q3)

f_str("xx (XXX.x%)", n, pct)

f_str("a.a+1 (A.a+2)", mean, sd)

Set or return by layer binding

Description

Set or return by layer binding

Usage

get_by(layer)

set_by(layer, by)

Arguments

layer

A tplyr_layer object

by

A string, a variable name, or a list of variable names supplied using dplyr::vars.

Value

For get_by, the by binding of the supplied layer. For set_by, the modified layer environment.

Examples

# Load in pipe
library(magrittr)

iris$Species2 <- iris$Species
lay <- tplyr_table(iris, Species) %>%
  group_count(Species) %>%
  set_by(vars(Species2, Sepal.Width))

Get Data Labels

Description

Get labels for data sets included in Tplyr.

Usage

get_data_labels(data)

Arguments

data

A Tplyr data set.

Value

A data.frame with columns 'name' and 'label' containing the names and labels of each column.


Get or set the default format strings for descriptive statistics layers

Description

Tplyr provides you with the ability to set table-wide defaults of format strings. You may wish to reuse the same format strings across numerous layers. set_desc_layer_formats and set_count_layer_formats allow you to apply your desired format strings within the entire scope of the table.

Usage

get_desc_layer_formats(obj)

set_desc_layer_formats(obj, ...)

get_count_layer_formats(obj)

set_count_layer_formats(obj, ...)

get_shift_layer_formats(obj)

set_shift_layer_formats(obj, ...)

Arguments

obj

A tplyr_table object

...

formats to pass forward

Details

For descriptive statistic layers, you can also use set_format_strings and set_desc_layer_formats together within a table, but not within the same layer. In the absence of specified format strings, first the table will be checked for any available defaults, and otherwise the tplyr.desc_layer_default_formats option will be used. set_format_strings will always take precedence over either. Defaults cannot be combined between set_format_strings, set_desc_layer_formats, and the tplyr.desc_layer_default_formats option because the order of presentation of results is controlled by the format strings, so relying on combinations of these settings would not be intuitive.

For count layers, you can override the n_counts or riskdiff format strings separately, and the narrowest scope available will be used from layer, to table, to default options.
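The "narrowest scope wins" rule described above can be sketched as a simple lookup. The function resolve_formats() and the plain strings standing in for format definitions are hypothetical and exist only for this illustration; Tplyr's own resolution logic is internal to the package.

```r
# Hypothetical illustration of the precedence order: layer, then table,
# then the package-level default option. Nothing is merged across scopes.
resolve_formats <- function(layer = NULL, table = NULL, default = NULL) {
  if (!is.null(layer)) return(layer)  # set at the layer level
  if (!is.null(table)) return(table)  # table-wide default
  default                             # fall back to the option
}

option_default <- "xx (xx.x%)"

from_layer   <- resolve_formats(layer = "xxx (xx%)", table = "xx (xx.xx%)",
                                default = option_default)
from_table   <- resolve_formats(table = "xx (xx.xx%)", default = option_default)
from_default <- resolve_formats(default = option_default)

print(c(from_layer, from_table, from_default))
```

Returning exactly one scope, rather than combining them, mirrors the point made above that mixing defaults from several places would make the order of results hard to predict.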


Extract the result metadata of a Tplyr table

Description

Given a row_id value and a result column, this function will return the tplyr_meta object associated with that 'cell'.

Usage

get_meta_result(x, row_id, column, ...)

Arguments

x

A built Tplyr table or a dataframe

row_id

The row_id value of the desired cell, provided as a character string

column

The result column of interest, provided as a character string

...

additional arguments

Details

If a Tplyr table is built with the metadata=TRUE option specified, then metadata is assembled behind the scenes to provide traceability on each result cell derived. The functions get_meta_result() and get_meta_subset() allow you to access that metadata by using an ID provided in the row_id column and the column name of the result you'd like to access. The purpose of the row_id variable, instead of a simple row index, is to provide a sort resistant reference of the originating column, so the output Tplyr table can be sorted in any order but the metadata are still easily accessible.

The tplyr_meta object provides a list with two elements - names and filters. The names contain every column from the target data.frame of the Tplyr table that factored into the specified result cell, and the filters contain all the necessary filters to subset to the data summarized to create the specified result cell. get_meta_subset() additionally provides a parameter to specify any additional columns you would like to include in the returned subset data frame.

Value

A tplyr_meta object

Examples

t <- tplyr_table(mtcars, cyl) %>%
  add_layer(
    group_desc(hp)
  )

dat <- t %>% build(metadata = TRUE)

get_meta_result(t, 'd1_1', 'var1_4')

# The metadata data frame can be supplied in place of the table
m <- t$metadata
dat <- t$target

get_meta_result(m, 'd1_1', 'var1_4')

Extract the subset of data based on result metadata

Description

Given a row_id value and a result column, this function will return the subset of data referenced by the tplyr_meta object associated with that 'cell', which provides traceability to tie a result to its source.

Usage

get_meta_subset(x, row_id, column, add_cols = vars(USUBJID), ...)

## S3 method for class 'data.frame'
get_meta_subset(
  x,
  row_id,
  column,
  add_cols = vars(USUBJID),
  target = NULL,
  pop_data = NULL,
  ...
)

## S3 method for class 'tplyr_table'
get_meta_subset(x, row_id, column, add_cols = vars(USUBJID), ...)

Arguments

x

A built Tplyr table or a dataframe

row_id

The row_id value of the desired cell, provided as a character string

column

The result column of interest, provided as a character string

add_cols

Additional columns to include in subset data.frame output

...

additional arguments

target

A data frame to be subset (if not pulled from a Tplyr table)

pop_data

A data frame to be subset through an anti-join (if not pulled from a Tplyr table)

Details

If a Tplyr table is built with the metadata=TRUE option specified, then metadata is assembled behind the scenes to provide traceability on each result cell derived. The functions get_meta_result() and get_meta_subset() allow you to access that metadata by using an ID provided in the row_id column and the column name of the result you'd like to access. The purpose of the row_id variable, instead of a simple row index, is to provide a sort resistant reference of the originating column, so the output Tplyr table can be sorted in any order but the metadata are still easily accessible.

The tplyr_meta object provides a list with two elements - names and filters. The names contain every column from the target data.frame of the Tplyr table that factored into the specified result cell, and the filters contain all the necessary filters to subset to the data summarized to create the specified result cell. get_meta_subset() additionally provides a parameter to specify any additional columns you would like to include in the returned subset data frame.
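How a pair of names and filters leads back to source records can be sketched in base R. The list below is a hypothetical stand-in for a tplyr_meta object (the real object is produced by Tplyr and its internal structure may differ); the loop simply applies each filter in turn and then keeps the named columns plus any additional ones.

```r
# Hypothetical stand-in for the two elements described above
meta <- list(
  names   = list(quote(cyl), quote(hp)),
  filters = list(quote(cyl == 4), quote(hp > 90))
)

# Apply every filter to the target data; a row is kept only if all hold
keep <- rep(TRUE, nrow(mtcars))
for (f in meta$filters) {
  keep <- keep & eval(f, mtcars)
}

# Columns that factored into the result, plus extra columns requested in
# the spirit of the add_cols argument
cols <- vapply(meta$names, as.character, character(1))
add_cols <- "carb"
traced <- mtcars[keep, c(cols, add_cols)]

print(traced)
```

Every row in traced satisfies all filters, which is the property that ties a result cell back to its source data.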

Value

A data.frame

Examples

t <- tplyr_table(mtcars, cyl) %>%
  add_layer(
    group_desc(hp)
  )

dat <- t %>% build(metadata = TRUE)

get_meta_subset(t, 'd1_1', 'var1_4', add_cols = dplyr::vars(carb))

# With a metadata data frame, supply the target data explicitly
m <- t$metadata
dat <- t$target

get_meta_subset(m, 'd1_1', 'var1_4', add_cols = dplyr::vars(carb), target = dat)

Get the metadata dataframe from a tplyr_table

Description

Pull out the metadata dataframe from a tplyr_table to work with it directly

Usage

get_metadata(t)

Arguments

t

A Tplyr table with metadata built

Value

Tplyr metadata dataframe

Examples

t <- tplyr_table(mtcars, gear) %>%
  add_layer(
    group_desc(wt)
  )

t %>%
  build(metadata=TRUE)

get_metadata(t)

Retrieve the numeric data from a tplyr object

Description

get_numeric_data provides access to the un-formatted numeric data for each of the layers within a tplyr_table, with options to allow you to extract distinct layers and filter as desired.

Usage

get_numeric_data(x, layer = NULL, where = TRUE, ...)

Arguments

x

A tplyr_table or tplyr_layer object

layer

Layer name or index to select out specifically

where

Subset criteria passed to dplyr::filter

...

Additional arguments to pass forward

Details

When used on a tplyr_table object, this method will aggregate the numeric data from all Tplyr layers. The data will be returned to the user in a list of data frames. If the data has already been processed (i.e. build has been run), the numeric data is already available and will be returned without reprocessing. Otherwise, the numeric portion of the layer will be processed.

Using the layer and where parameters, data for a specific layer can be extracted and subset. This is most clear when layers are given text names instead of using a layer index, but a numeric index works as well.

Value

Numeric data from the Tplyr layer

Examples

# Load in pipe
library(magrittr)

t <- tplyr_table(mtcars, gear) %>%
  add_layer(name='drat',
            group_desc(drat)
  ) %>%
  add_layer(name='cyl',
            group_count(cyl)
  )

# Return a list of the numeric data frames
get_numeric_data(t)

# Get the data from a specific layer
get_numeric_data(t, layer='drat')
get_numeric_data(t, layer=1)

# Choose multiple layers by name or index
get_numeric_data(t, layer=c('cyl', 'drat'))
get_numeric_data(t, layer=c(2, 1))

# Get the data and filter it
get_numeric_data(t, layer='drat', where = gear==3)

Set or return precision_by layer binding

Description

The precision_by variables are used to collect the integer and decimal precision when auto-precision is used. These by variables are used to group the input data and identify the maximum precision available within the dataset for each by group. The precision_by variables must be a subset of the by variables.
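Collecting the maximum precision within each by group can be pictured with a base R sketch. The data frame, the column names param and value, and the digit counting below are all made up for illustration (non-negative values assumed); this is not how Tplyr computes precision internally.

```r
# Toy data: two parameters collected at different decimal precision
dat <- data.frame(
  param = c("A", "A", "B", "B"),
  value = c(1.5, 10.2, 0.125, 3.75)
)

# Count digits on each side of the decimal point as the values are written
chr <- as.character(dat$value)
dat$int_digits <- nchar(sub("\\..*$", "", chr))
dat$dec_digits <- nchar(sub("^[^.]*\\.?", "", chr))

# Maximum integer and decimal precision within each group
precision <- aggregate(cbind(int_digits, dec_digits) ~ param,
                       data = dat, FUN = max)

print(precision)
```

Each group ends up with its own integer and decimal widths, which is why a format such as 'a.a+1' can render differently from one parameter to the next.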

Usage

get_precision_by(layer)

set_precision_by(layer, precision_by)

Arguments

layer

A tplyr_layer object

precision_by

A string, a variable name, or a list of variable names supplied using dplyr::vars.

Value

For get_precision_by, the precision_by binding of the supplied layer. For set_precision_by, the modified layer environment.

Examples

# Load in pipe
library(magrittr)

lay <- tplyr_table(mtcars, gear) %>%
  add_layer(
    group_desc(mpg, by=vars(carb, am)) %>%
    set_precision_by(carb)
  )

Set or return precision_on layer binding

Description

The precision_on variable is the variable used to establish numeric precision. This variable must be included in the list of target_var variables.

Usage

get_precision_on(layer)

set_precision_on(layer, precision_on)

Arguments

layer

A tplyr_layer object

precision_on

A string, a variable name, or a list of variable names supplied using dplyr::vars.

Value

For get_precision_on, the precision_on binding of the supplied layer. For set_precision_on, the modified layer environment.

Examples

# Load in pipe
library(magrittr)

lay <- tplyr_table(mtcars, gear) %>%
  add_layer(
    group_desc(vars(mpg, disp), by=vars(carb, am)) %>%
    set_precision_on(disp)
  )

Get statistics data

Description

Like the layer numeric data, Tplyr also stores the numeric data produced from statistics like risk difference. This helper function gives you access to that data from the environment.

Usage

get_stats_data(x, layer = NULL, statistic = NULL, where = TRUE, ...)

Arguments

x

A tplyr_table or tplyr_layer object

layer

Layer name or index to select out specifically

statistic

Statistic name or index to select

where

Subset criteria passed to dplyr::filter

...

Additional arguments passed to dispatch

Details

When used on a tplyr_table object, this method will aggregate the numeric data from all Tplyr layers and calculate all statistics. The data will be returned to the user in a list of data frames. If the data has already been processed (i.e. build has been run), the numeric data is already available and the statistic data will simply be returned. Otherwise, the numeric portion of the layer will be processed.

Using the layer, where, and statistic parameters, data for a specific layer statistic can be extracted and subset, allowing you to directly access data of interest. This is most clear when layers are given text names instead of using a layer index, but a numeric index works as well. If just a statistic is specified, that statistic will be collected and returned in a list of data frames, allowing you to grab, for example, just the risk difference statistics across all layers.

Value

The statistics data of the supplied layer

Examples

library(magrittr)

t <- tplyr_table(mtcars, gear) %>%
  add_layer(name='drat',
            group_desc(drat)
  ) %>%
  add_layer(name="cyl",
            group_count(cyl)
  ) %>%
  add_layer(name="am",
            group_count(am) %>%
              add_risk_diff(c('4', '3'))
  ) %>%
  add_layer(name="carb",
            group_count(carb) %>%
              add_risk_diff(c('4', '3'))
  )

# Returns a list of lists, containing stats data from each layer
get_stats_data(t)

# Returns just the riskdiff statistics from each layer - NULL
# for layers without riskdiff
get_stats_data(t, statistic="riskdiff")

# Return the statistic data for just the "am" layer - a list
get_stats_data(t, layer="am")
get_stats_data(t, layer=3)

# Return the statistic data for just the "am" and "cyl" layers - a
# list of lists
get_stats_data(t, layer=c("am", "cyl"))
get_stats_data(t, layer=c(3, 2))

# Return just the statistic data for "am" and "cyl" - a list
get_stats_data(t, layer=c("am", "cyl"), statistic="riskdiff")
get_stats_data(t, layer=c(3, 2), statistic="riskdiff")

# Return the riskdiff for the "am" layer - a data frame
get_stats_data(t, layer="am", statistic="riskdiff")

# Return and filter the riskdiff for the am layer - a data frame
get_stats_data(t, layer="am", statistic="riskdiff", where = summary_var==1)

Set or return target_var binding

Description

Set or return target_var binding

Usage

get_target_var(layer)

set_target_var(layer, target_var)

Arguments

layer

A tplyr_layer object

target_var

A symbol to perform the analysis on

Value

For get_target_var, the target variable binding of the layer object. For set_target_var, the modified layer environment.

Examples

# Load in pipe
library(magrittr)

iris$Species2 <- iris$Species
lay <- tplyr_table(iris, Species) %>%
  group_count(Species) %>%
  set_target_var(Species2)

Retrieve one of Tplyr's regular expressions

Description

This function allows you to extract important regular expressions used inside Tplyr.

Usage

get_tplyr_regex(rx = c("format_string", "format_group"))

Arguments

rx

A character string with either the value 'format_string' or 'format_group'

Details

There are two important regular expressions used within Tplyr. The format_string expression is the expression to parse format strings. This is what is used to make sense out of strings like 'xx (XX.x%)' or 'a+1 (A.a+2)' by inferring what the user is specifying about number formatting.

The 'format_group' regex is the opposite of this, and when given a string of numbers, such as ' 5 (34%) [9]', will return the separate segments of numbers broken into their format groups, which in this example would be ' 5', '(34%)', and '[9]'.
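The kind of splitting that 'format_group' performs can be sketched in base R. The pattern below is a hypothetical one written for this illustration and is not claimed to be the expression Tplyr uses; call get_tplyr_regex('format_group') to obtain the package's own.

```r
# Hypothetical pattern: optional leading non-digit characters, optional
# whitespace, a number, then any trailing non-space characters
format_group_rx <- "[^\\s\\d]*\\s*(-?\\d+(\\.\\d+)?)\\S*"

x <- " 5 (34%) [9]"

# Extract every non-overlapping match from the formatted string
groups <- regmatches(x, gregexpr(format_group_rx, x, perl = TRUE))[[1]]

print(groups)
```

Each extracted piece keeps its surrounding characters, such as the parentheses and percent sign, so a formatted cell can be taken apart and reassembled.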

Value

A regular expression object

Examples

get_tplyr_regex('format_string')

get_tplyr_regex('format_group')

Set or return where binding for layer or table

Description

Set or return where binding for layer or table

Usage

## S3 method for class 'tplyr_layer'
get_where(obj)

## S3 method for class 'tplyr_layer'
set_where(obj, where)

get_where(obj)

## S3 method for class 'tplyr_table'
get_where(obj)

set_where(obj, where)

## S3 method for class 'tplyr_table'
set_where(obj, where)

set_pop_where(obj, where)

get_pop_where(obj)

Arguments

obj

A tplyr_layer or tplyr_table object.

where

An expression (i.e. syntax) to be used to subset the data. Supply as programming logic (i.e. x < 5 & y == 10)

Value

For get_where, the where binding of the supplied object. For set_where, the modified object.

Examples

# Load in pipe
library(magrittr)

iris$Species2 <- iris$Species
lay <- tplyr_table(iris, Species) %>%
  group_count(Species) %>%
  set_where(Petal.Length > 3) %>%
  # Set logic for pop_data as well
  set_pop_where(Petal.Length > 3)

Create a count, desc, or shift layer for discrete count based summaries, descriptive statistics summaries, or shift count summaries

Description

This family of functions specifies the type of summary that is to be performed within a layer. count layers are used to create summary counts of some discrete variable. desc layers create summary statistics, and shift layers summarize the counts of different changes in states. See the "details" section below for more information.

Usage

group_count(parent, target_var, by = vars(), where = TRUE, ...)

group_desc(parent, target_var, by = vars(), where = TRUE, ...)

group_shift(parent, target_var, by = vars(), where = TRUE, ...)

Arguments

parent

Required. The parent environment of the layer. This must be the tplyr_table object that the layer is contained within.

target_var

Symbol. Required. The variable name(s) on which the summary is to be performed. Must be a variable within the target dataset. Enter unquoted - i.e. target_var = AEBODSYS. You may also provide multiple variables with vars.

by

A string, a variable name, or a list of variable names supplied using vars

where

Call. Filter logic used to subset the target data when performing a summary.

...

Additional arguments to pass forward

Details

Count Layers

Count layers allow you to create summaries based on counting values within a variable. Additionally, this layer allows you to create n (%) summaries where you're also summarizing the proportion of instances a value occurs compared to some denominator. Count layers are also capable of producing counts of nested relationships. For example, if you want to produce counts of an overall outside group, and then the subgroup counts within that group, you can specify the target variable as vars(OutsideVariable, InsideVariable). This allows you to do tables like Adverse Events where you want to see the Preferred Terms within Body Systems, all in one layer. Further control over denominators is available using the function set_denoms_by and distinct counts can be set using set_distinct_by.

Descriptive Statistics Layers

Descriptive statistics layers perform summaries on continuous variables. There are a number of summaries built into Tplyr already that you can perform, including n, mean, median, standard deviation, variance, min, max, inter-quartile range, Q1, Q3, and missing value counts. From these available summaries, the default presentation of a descriptive statistic layer will output 'n', 'Mean (SD)', 'Median', 'Q1, Q3', 'Min, Max', and 'Missing'. You can change these summaries using set_format_strings, and you can also add your own summaries using set_custom_summaries. This allows you to implement any additional summary statistics you want presented.

Shift Layers

A shift layer displays an endpoint's 'shift' throughout the duration of the study. It is an abstraction over the count layer; however, we have provided an interface that is more efficient and intuitive. Targets are passed as named symbols using dplyr::vars. Generally the baseline is passed with the name 'row' and the shift is passed with the name 'column'. Both counts (n) and percentages (pct) are supported and can be specified with the set_format_strings function. To allow for flexibility when defining percentages, you can define the denominator using the set_denoms_by function. This function takes variable names and uses those to determine the denominator for the counts.

Value

A tplyr_layer environment that is a child of the specified parent. The environment contains the object as listed below.

A tplyr_layer object

See Also

add_layer, add_layers, tplyr_table, tplyr_layer

Examples

# Load in pipe
library(magrittr)

t <- tplyr_table(iris, Species) %>%
  add_layer(
    group_desc(target_var=Sepal.Width)
  )

t <- tplyr_table(mtcars, am) %>%
  add_layer(
    group_shift(vars(row=gear, column=carb), by=cyl)
  )

Return or set header_n binding

Description

The 'header_n()' functions can be used to automatically pull the header_n derivations from the table or change them for future use.

Usage

header_n(table)

header_n(x) <- value

set_header_n(table, value)

Arguments

table

A tplyr_table object

x

A tplyr_table object

value

A data.frame with columns for the treatment variable, column variables, and a variable with counts named 'n'.

header_n

A data.frame with columns for the treatment variable, column variables, and a variable with counts named 'n'.

Details

The 'header_n' object is created by Tplyr when a table is built and intended to be used by the 'add_column_headers()' function when displaying table level population totals. These methods are intended to be used for calling the population totals calculated by Tplyr, and to overwrite them if a user chooses to.

If you have a need to change the header Ns that appear in your table headers, say you know you are working with a subset of the data that doesn't represent the totals, you can replace the data used with 'set_header_n()'.
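A replacement header N data frame has to follow the shape described for the value argument: one column for the treatment variable and a count column named 'n'. The base R sketch below builds such a data frame from a full dataset; using mtcars and gear as the "population" is an assumption made purely for illustration.

```r
# Count records per treatment group in the full (unsubset) data.
# mtcars and gear are stand-ins for a population dataset and its
# treatment variable.
hdr <- as.data.frame(table(gear = mtcars$gear), responseName = "n",
                     stringsAsFactors = FALSE)

# table() stores the group values as text; convert back to numeric so the
# column matches the type of the treatment variable
hdr$gear <- as.numeric(hdr$gear)

print(hdr)

# With Tplyr attached, a data frame of this shape is what would be passed on:
# tab <- tplyr_table(mtcars, gear)
# header_n(tab) <- hdr
```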

Value

For header_n, the header_n binding of the tplyr_table object. For header_n<- and set_header_n, the modified object.

Examples

tab <- tplyr_table(mtcars, gear)

header_n(tab) <- data.frame(
  gear = c(3, 4, 5),
  n = c(10, 15, 45)
)

Select levels to keep in a count layer

Description

In certain cases you only want a layer to include certain values of a factor. The 'keep_levels()' function allows you to pass character values to be included in the layer. The others are ignored.

NOTE: Denominator calculation is unaffected by this function; see the examples on how to include this logic in your percentages.

Usage

keep_levels(e, ...)

Arguments

e

A count_layer object

...

Character values to count in the layer

Value

The modified Tplyr layer object

Examples

library(dplyr)

mtcars <- mtcars %>%
  mutate_all(as.character)

t <- tplyr_table(mtcars, gear) %>%
  add_layer(
    group_count(cyl) %>%
      keep_levels("4", "8") %>%
      set_denom_where(cyl %in% c("4", "8"))
  ) %>%
  build()

Create, view, extract, remove, and use Tplyr layer templates

Description

There are several scenarios where a layer template may be useful. Some tables, like demographics tables, may have many layers that will all essentially look the same. Categorical variables will have the same count layer settings, and continuous variables will have the same desc layer settings. A template allows a user to build those settings once per layer, then reference the template when the Tplyr table is actually built.

Usage

new_layer_template(name, template)

remove_layer_template(name)

get_layer_template(name)

get_layer_templates()

use_template(name, ..., add_params = NULL)

Arguments

name

Template name

template

Template layer syntax, starting with a layer constructor group_count|desc|shift. This function should be called with an ellipsis argument (i.e. group_count(...)).

...

Arguments passed directly into a layer constructor, matching the target, by, and where parameters.

add_params

Additional parameters passed into layer modifier functions. These arguments are specified in a template within curly brackets such as {param}. Supply as a named list, where the element name is the parameter.

Details

This suite of functions allows a user to create and use layer templates. Layer templates allow a user to pre-build and reuse an entire layer configuration, from the layer constructor down to all modifying functions. Furthermore, users can specify parameters they may want to be interchangeable. Additionally, layer templates are extensible, so a template can be used and then further extended with additional layer modifying functions.

Layer templates are created using new_layer_template(). To use a layer template, use the function use_template() in place of group_count|desc|shift(). If you want to view a specific template, use get_layer_template(). If you want to view all templates, use get_layer_templates(). And to remove a layer template use remove_layer_template(). Layer templates themselves are stored in the option tplyr.layer_templates, but a user should not access this directly and instead use the Tplyr supplied functions.

When providing the template layer syntax, the layer must start with a layer constructor. These are one of the functions group_count(), group_desc(), or group_shift(). Instead of passing arguments into these functions, templates are specified using an ellipsis in the constructor, i.e. group_count(...). This is required, as after the template is built a user supplies these arguments via use_template().

use_template() takes the group_count|desc|shift() arguments by default. If a user specified additional arguments in the template, these are provided in a list through the argument add_params. Provide these arguments exactly as you would in a normal layer. When creating the template, these parameters can be specified by using curly brackets. See the examples for details.

Examples

op <- options()

new_layer_template(
  "example_template",
  group_count(...) %>%
    set_format_strings(f_str('xx (xx%)', n, pct))
)

get_layer_templates()

get_layer_template("example_template")

tplyr_table(mtcars, vs) %>%
  add_layer(
    use_template("example_template", gear)
  ) %>%
  build()

remove_layer_template("example_template")

new_layer_template(
  "example_template",
  group_count(...) %>%
    set_format_strings(f_str('xx (xx%)', n, pct)) %>%
    set_order_count_method({sort_meth}) %>%
    set_ordering_cols({sort_cols})
)

get_layer_template("example_template")

tplyr_table(mtcars, vs) %>%
  add_layer(
    use_template("example_template", gear, add_params =
                   list(
                     sort_meth = "bycount",
                     sort_cols = `1`
                   ))
  ) %>%
  build()

remove_layer_template("example_template")

options(op)

Return or set population data bindings

Description

The population data is used to gather information that may not be available from the target dataset. For example, missing treatment groups, population N counts, and proper N counts for denominators will be provided through the population dataset. The population dataset defaults to the target dataset unless otherwise specified using set_pop_data.

Usage

pop_data(table)

pop_data(x) <- value

set_pop_data(table, pop_data)

Arguments

table

A tplyr_table object

x

A tplyr_table object

value

A data.frame with population level information

pop_data

A data.frame with population level information

Value

For pop_data, the pop_data binding of the tplyr_table object. For pop_data<-, nothing is returned; the pop_data binding is set silently. For set_pop_data, the modified object.

Examples

tab <- tplyr_table(iris, Species)

pop_data(tab) <- mtcars

tab <- tplyr_table(iris, Species) %>%
  set_pop_data(mtcars)

Return or set pop_treat_var binding

Description

The treatment variable used in the target data may be different than the variable within the population dataset. set_pop_treat_var allows you to change this.

Usage

pop_treat_var(table)

set_pop_treat_var(table, pop_treat_var)

Arguments

table

A tplyr_table object

pop_treat_var

Variable containing treatment group assignments within the pop_data binding. Supply unquoted.

Value

For pop_treat_var, the pop_treat_var binding of the tplyr_table object. For set_pop_treat_var, the modified object.

Examples

tab <- tplyr_table(iris, Species)

pop_data(tab) <- mtcars
set_pop_treat_var(tab, mpg)

Process layers to get formatted and pivoted tables.

Description

This is an internal method, but is exported to support S3 dispatch. Not intended for direct use by a user.

Usage

process_formatting(x, ...)

Arguments

x

A tplyr_layer object

...

arguments passed to dispatch

Value

The formatted_table object that is bound to the layer


Process layers to get metadata tables

Description

This is an internal method, but is exported to support S3 dispatch. Not intended for direct use by a user.

Usage

process_metadata(x, ...)

Arguments

x

A tplyr_layer object

...

arguments passed to dispatch

Value

The formatted_meta object that is bound to the layer


Process a tplyr_statistic object

Description

This is an internal function that is not meant for use externally, but must be exported. Use with caution.

Usage

process_statistic_data(x, ...)

Arguments

x

A tplyr_statistic environment

...

Additional pass through parameters

Value

Numeric statistic data from a tplyr statistic


Process string formatting on a tplyr_statistic object

Description

This is an internal function that is not meant for use externally, but must be exported. Use with caution.

Usage

process_statistic_formatting(x, ...)

Arguments

x

A tplyr_statistic environment

...

Additional pass through parameters

Value

Formatted tplyr_statistic data


Process layers to get numeric results of layer

Description

This is an internal method, but is exported to support S3 dispatch. Not intended for direct use by a user.

Usage

process_summaries(x, ...)

Arguments

x

a tplyr_layer object

...

arguments passed to dispatch

Value

The tplyr_layer object with a 'built_table' binding


Reformat strings with leading whitespace for HTML

Description

Reformat strings with leading whitespace for HTML

Usage

replace_leading_whitespace(x, tab_width = 4)

Arguments

x

Target string

tab_width

Number of spaces to compensate for tabs

Value

String with leading whitespace replaced by &nbsp;

Examples

x <- c(" Hello there", "  Goodbye Friend ",  "\tNice to meet you",
       "  \t What are you up to? \t \t ")
replace_leading_whitespace(x)

replace_leading_whitespace(x, tab=2)

Set custom summaries to be performed within a descriptive statistics layer

Description

This function allows a user to define custom summaries to be performed in a call to dplyr::summarize(). A custom summary by the same name as a default summary will override the default. This allows the user to override the default behavior of summaries built into 'Tplyr', while also adding new desired summary functions.

Usage

set_custom_summaries(e, ...)

Arguments

e

desc layer on which the summaries should be bound

...

Named parameters containing syntax to be used in a call to dplyr::summarize()

Details

When programming the logic of the summary function, use the variable name .var within your summary functions. This allows you to apply the summary function to each variable when multiple target variables are declared.

An important, yet not immediately obvious, part of using set_custom_summaries is to understand the link between the named parameters you set in set_custom_summaries and the names called in f_str objects within set_format_strings. In f_str, after you supply the string format you'd like your numbers to take, you specify the summaries that fill those strings.

When you go to set your format strings, the name you use to declare a summary in set_custom_summaries is the same name that you use in your f_str call. This is necessary because set_format_strings needs some means of putting two summaries in the same value, and setting a row label for the summary being performed.

Review the examples to see this put into practice. Note the relationship between the name created in set_custom_summaries and the name used in set_format_strings within the f_str call.

Value

Binds a variable custom_summaries to the specified layer

Examples

#Load in pipe
library(magrittr)

tplyr_table(iris, Species) %>%
  add_layer(
    group_desc(Sepal.Length, by = "Sepal Length") %>%
      set_custom_summaries(
        geometric_mean = exp(sum(log(.var[.var > 0]),
                                     na.rm=TRUE) / length(.var))
      ) %>%
      set_format_strings(
        'Geometric Mean' = f_str('xx.xx', geometric_mean)
      )
  ) %>%
  build()
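As a further sketch of the name link described in the Details, the following combines a built-in summary (mean) with a custom one in a single format string. The summary name cv and the row label are hypothetical choices made for illustration:

```r
library(Tplyr)
library(magrittr)

res <- tplyr_table(mtcars, gear) %>%
  add_layer(
    group_desc(mpg) %>%
      set_custom_summaries(
        # `cv` is a new, illustrative summary name; `.var` stands in for
        # the target variable (mpg here)
        cv = sd(.var, na.rm = TRUE) / mean(.var, na.rm = TRUE) * 100
      ) %>%
      set_format_strings(
        # `mean` is a built-in summary; `cv` refers to the name declared above
        "Mean (CV%)" = f_str("xx.x (xx.x)", mean, cv)
      )
  ) %>%
  build()

res
```

The name on the left of set_format_strings becomes the row label, while the bare names inside f_str select which summaries fill the two number slots.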

Set values the denominator calculation will ignore

Description

[Defunct]

This is generally used for missing values. Values like "", NA, "NA" are common ways missing values are presented in a data frame. In certain cases, percentages do not use "missing" values in the denominator. This function notes different values as "missing" and excludes them from the denominators.

Usage

set_denom_ignore(e, ...)

Arguments

e

A count_layer object

...

Values to exclude from the percentage calculation. If you use 'set_missing_counts()' this should be the name of the parameters instead of the values, see the example below.

Value

The modified layer object

Examples

library(magrittr)
mtcars2 <- mtcars
mtcars2[mtcars$cyl == 6, "cyl"] <- NA
mtcars2[mtcars$cyl == 8, "cyl"] <- "Not Found"

tplyr_table(mtcars2, gear) %>%
  add_layer(
    group_count(cyl) %>%
      set_missing_count(f_str("xx ", n), Missing = c(NA, "Not Found"))
      # This function is currently deprecated. It was replaced with an
      # argument in set_missing_count
      # set_denom_ignore("Missing")
  ) %>%
  build()

Set Logic for denominator subsetting

Description

By default, denominators in count layers are subset based on the layer-level where logic. In some cases this might not be correct. This function allows the user to override this behavior and pass custom logic that will be used to subset the target dataset when calculating denominators for the layer.

Usage

set_denom_where(e, denom_where)

Arguments

e

A count_layer/shift_layer object

denom_where

An expression (i.e. syntax) to be used to subset the target dataset for calculating layer denominators. Supply as programming logic (i.e. x < 5 & y == 10). To remove the layer where parameter subsetting for the total row and thus the percentage denominators, pass 'TRUE' to this function.

Value

The modified Tplyr layer object

Examples

library(magrittr)

t10 <- tplyr_table(mtcars, gear) %>%
  add_layer(
    group_count(cyl, where = cyl != 6) %>%
    set_denom_where(TRUE)
    # The denominators will be based on all of the values, including 6
  ) %>% build()
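To see the effect on the percentages, one option is to build the same layer twice and compare a result column. This is a minimal sketch; the column name var1_3 assumes the usual var1_<treatment value> naming of result columns for the gear == 3 group:

```r
library(Tplyr)
library(magrittr)

# Denominators follow the layer `where` (rows with cyl == 6 are excluded)
filtered_denoms <- tplyr_table(mtcars, gear) %>%
  add_layer(
    group_count(cyl, where = cyl != 6)
  ) %>%
  build()

# Denominators are based on every row of the target dataset
all_denoms <- tplyr_table(mtcars, gear) %>%
  add_layer(
    group_count(cyl, where = cyl != 6) %>%
      set_denom_where(TRUE)
  ) %>%
  build()

# Counts are the same, but the percentages differ
filtered_denoms$var1_3
all_denoms$var1_3
```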

Set variables used in pct denominator calculation

Description

This function is used when calculating pct in count or shift layers. The percentages default to the treatment variable and any column variables but can be calculated on any variables passed to target_var, treat_var, by, or cols.

Usage

set_denoms_by(e, ...)

Arguments

e

A count/shift layer object

...

Unquoted variable names

Value

The modified layer object

Examples

library(magrittr)

# Default has matrix of treatment group, additional columns,
# and by variables sum to 1
tplyr_table(mtcars, am) %>%
  add_layer(
    group_shift(vars(row=gear, column=carb), by=cyl) %>%
      set_format_strings(f_str("xxx (xx.xx%)", n, pct))
  ) %>%
  build()

tplyr_table(mtcars, am) %>%
  add_layer(
    group_shift(vars(row=gear, column=carb), by=cyl) %>%
      set_format_strings(f_str("xxx (xx.xx%)", n, pct)) %>%
      set_denoms_by(cyl, gear) # Row % sums to 1
  ) %>%
  build()

tplyr_table(mtcars, am) %>%
  add_layer(
    group_shift(vars(row=gear, column=carb), by=cyl) %>%
      set_format_strings(f_str("xxx (xx.xx%)", n, pct)) %>%
      set_denoms_by(cyl, gear, am) # % within treatment group sums to 1
  ) %>%
  build()

Set counts to be distinct by some grouping variable.

Description

In some situations, count summaries may want to see distinct counts by a variable like subject. For example, the number of subjects in a population who had a particular adverse event. set_distinct_by allows you to set the by variables used to determine a distinct count.

Usage

set_distinct_by(e, distinct_by)

Arguments

e

A count_layer/shift_layer object

distinct_by

Variable(s) to get the distinct data.

Details

When a distinct_by value is set, distinct counts will be used by default. If you wish to combine distinct and not distinct counts, you can choose which to display in your f_str() objects using n, pct, distinct_n, and distinct_pct. Additionally, denominators may be presented using total and distinct_total.

Value

The modified layer object

Examples

#Load in pipe
library(magrittr)

tplyr_table(mtcars, gear) %>%
  add_layer(
    group_count(cyl) %>%
      set_distinct_by(carb)
  ) %>%
  build()
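The adverse event case mentioned in the Description might look like the following sketch. It assumes the bundled tplyr_adae dataset contains the columns TRTA, AEDECOD, and USUBJID (standard ADaM names, but not confirmed in this section); the format string is an illustrative choice:

```r
library(Tplyr)
library(magrittr)

res <- tplyr_table(tplyr_adae, TRTA) %>%
  add_layer(
    group_count(AEDECOD) %>%
      # Count each subject once per preferred term
      set_distinct_by(USUBJID) %>%
      # Distinct subjects and percent, followed by the raw event count
      set_format_strings(
        f_str("xxx (xx.x%) [xxx]", distinct_n, distinct_pct, n)
      )
  ) %>%
  build()

head(res)
```

Mixing distinct_n with n in one format string is how a "subjects (percent) [events]" style cell can be produced from a single layer.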

Set the format strings and associated summaries to be performed in a layer

Description

'Tplyr' gives you extensive control over how strings are presented. set_format_strings allows you to apply these string formats to your layer. This behaves slightly differently between layers.

Usage

set_format_strings(e, ...)

## S3 method for class 'desc_layer'
set_format_strings(e, ..., cap = getOption("tplyr.precision_cap"))

## S3 method for class 'count_layer'
set_format_strings(e, ...)

Arguments

e

Layer on which to bind format strings

...

Named parameters containing calls to f_str to set the format strings

cap

A named character vector containing an 'int' element for the cap on integer precision, and a 'dec' element for the cap on decimal precision.

Details

Format strings are one of the most powerful components of 'Tplyr'. Traditionally, converting numeric values into strings for presentation can consume a good deal of time. Values and decimals need to align between rows, rounding before trimming is sometimes forgotten - it can become a tedious mess that, in the grand scheme of things, is not an important part of the analysis being performed. 'Tplyr' makes this process as simple as we can, while still allowing flexibility to the user.

In a count layer, you can simply provide a single f_str object to specify how you want your n's, percentages, and denominators formatted. If you are additionally supplying a statistic, like risk difference using add_risk_diff, you specify the count formats using the name 'n_counts'. The risk difference formats would then be specified using the name 'riskdiff'. In a descriptive statistic layer, set_format_strings allows you to do a couple more things:

See the f_str documentation for more details about how this implementation works.

Value

The layer environment with the format string binding added

tplyr_layer object with formats attached

Returns the modified layer object.

Examples

# Load in pipe
library(magrittr)

# In a count layer
tplyr_table(mtcars, gear) %>%
  add_layer(
    group_count(cyl) %>%
      set_format_strings(f_str('xx (xx%)', n, pct))
  ) %>%
  build()

# In a descriptive statistics layer
tplyr_table(mtcars, gear) %>%
  add_layer(
    group_desc(mpg) %>%
      set_format_strings(
        "n"         = f_str("xx", n),
        "Mean (SD)" = f_str("xx.x", mean, empty='NA'),
        "SD"        = f_str("xx.xx", sd),
        "Median"    = f_str("xx.x", median),
        "Q1, Q3"    = f_str("xx, xx", q1, q3, empty=c(.overall='NA')),
        "Min, Max"  = f_str("xx, xx", min, max),
        "Missing"   = f_str("xx", missing)
      )
  ) %>%
  build()

# In a shift layer
tplyr_table(mtcars, am) %>%
  add_layer(
    group_shift(vars(row=gear, column=carb), by=cyl) %>%
    set_format_strings(f_str("xxx (xx.xx%)", n, pct))
  ) %>%
  build()
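The 'n_counts' and 'riskdiff' names described in the Details could be used as in the sketch below. The comparison vector passed to add_risk_diff and the statistic names dif, low, and high follow my reading of the add_risk_diff documentation, which lies outside this section, so treat them as assumptions. The build is wrapped in suppressWarnings() because the underlying proportion test may warn on small counts:

```r
library(Tplyr)
library(magrittr)

res <- suppressWarnings(
  tplyr_table(mtcars, gear) %>%
    add_layer(
      group_count(cyl) %>%
        # Compare the gear == 4 group against the gear == 3 group
        add_risk_diff(c("4", "3")) %>%
        set_format_strings(
          # Format for the count cells
          "n_counts" = f_str("xx (xx.x%)", n, pct),
          # Format for the risk difference cells (assumed statistic names)
          "riskdiff" = f_str("xx.xxx (xx.xxx, xx.xxx)", dif, low, high)
        )
    ) %>%
    build()
)

res
```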

Set the option to prefix the row_labels in the inner count_layer

Description

When a count layer uses nesting (i.e. triggered by set_nest_count), the indentation argument's value will be used as a prefix for the inner layer's records.

Usage

set_indentation(e, indentation)

Arguments

e

A count_layer object

indentation

A character to prefix the row labels in an inner count layer

Value

The modified count_layer environment


Set variables to limit reported data values only to those that exist rather than fully completing all possible levels

Description

This function allows you to select a combination of by variables or potentially target variables for which you only want to display values present in the data. By default, Tplyr will create a cartesian combination of potential values of the data. For example, if you have 2 by variables present, then each potential combination of those by variables will have a row present in the final table. set_limit_data_by() allows you to choose the by variables whose combination you wish to limit to values physically present in the available data.

Usage

set_limit_data_by(e, ...)

Arguments

e

A tplyr_layer

...

Subset of variables within by or target variables

Value

a tplyr_table

Examples

tplyr_table(tplyr_adpe, TRT01A) %>%
  add_layer(
    group_desc(AVAL, by = vars(PECAT, PARAM, AVISIT))
  ) %>%
  build()

tplyr_table(tplyr_adpe, TRT01A) %>%
  add_layer(
    group_desc(AVAL, by = vars(PECAT, PARAM, AVISIT)) %>%
      set_limit_data_by(PARAM, AVISIT)
  ) %>%
  build()

tplyr_table(tplyr_adpe, TRT01A) %>%
  add_layer(
    group_count(AVALC, by = vars(PECAT, PARAM, AVISIT)) %>%
      set_limit_data_by(PARAM, AVISIT)
  ) %>%
  build()

tplyr_table(tplyr_adpe, TRT01A) %>%
  add_layer(
    group_count(AVALC, by = vars(PECAT, PARAM, AVISIT)) %>%
      set_limit_data_by(PECAT, PARAM, AVISIT)
  ) %>%
  build()

Set the display for missing strings

Description

Controls how missing counts are handled and displayed in the layer

Usage

set_missing_count(e, fmt = NULL, sort_value = NULL, denom_ignore = FALSE, ...)

Arguments

e

A count_layer object

fmt

An f_str object to change the display of the missing counts

sort_value

A numeric value that will be used in the ordering column. This should be numeric. If it is not supplied the ordering column will be the maximum value of what appears in the table plus one.

denom_ignore

A boolean. Specifies whether or not to include the missing counts specified within the ... parameter within denominators. If set to TRUE, the values specified within ... will be ignored.

...

Parameters used to note which values to describe as missing. Generally NA and "Missing" would be used here. Parameters can be named character vectors where the names become the row label.

Value

The modified layer

Examples

library(magrittr)
library(dplyr)

mtcars2 <- mtcars %>%
  mutate_all(as.character)
mtcars2[mtcars$cyl == 6, "cyl"] <- NA

tplyr_table(mtcars2, gear) %>%
  add_layer(
    group_count(cyl) %>%
      set_missing_count(f_str("xx ", n), Missing = NA)
  ) %>%
  build()
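The denom_ignore argument can be sketched as follows. The data preparation and format strings are illustrative choices; with denom_ignore = TRUE, the values labeled "Missing" should be left out of the percentage denominators:

```r
library(Tplyr)
library(magrittr)

# Recode cyl == 6 as missing in a character copy of the column
mtcars2 <- mtcars
mtcars2$cyl <- as.character(mtcars2$cyl)
mtcars2[mtcars$cyl == 6, "cyl"] <- NA

res <- tplyr_table(mtcars2, gear) %>%
  add_layer(
    group_count(cyl) %>%
      set_format_strings(f_str("xx (xx.x%)", n, pct)) %>%
      # Show missing values as a count only and drop them from denominators
      set_missing_count(f_str("xx", n), denom_ignore = TRUE, Missing = NA)
  ) %>%
  build()

res
```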

Set the label for the missing subjects row

Description

Set the label for the missing subjects row

Usage

set_missing_subjects_row_label(e, missing_subjects_row_label)

Arguments

e

A count_layer object

missing_subjects_row_label

A character to label the missing subjects row

Value

The modified count_layer object

Examples

t <- tplyr_table(mtcars, gear) %>%
  add_layer(
    group_count(cyl) %>%
      add_missing_subjects_row() %>%
      set_missing_subjects_row_label("Missing")
  )
build(t)

Set the option to nest count layers

Description

If set to TRUE, the second variable specified in target_var will be nested inside of the first variable. This allows you to create displays like those commonly used in adverse event tables, where one column holds both the labels of the outer categorical variable and the inside event variable (i.e. AEBODSYS and AEDECOD).

Usage

set_nest_count(e, nest_count)

Arguments

e

A count_layer object

nest_count

A logical value to set the nest option

Value

The modified layer
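Examples

A minimal sketch of a nested adverse event display, assuming the bundled tplyr_adae dataset contains TRTA, AEBODSYS, and AEDECOD (standard ADaM names, not confirmed in this section). The "--->" prefix is an arbitrary choice that makes the inner rows easy to spot:

```r
library(Tplyr)
library(magrittr)

res <- tplyr_table(tplyr_adae, TRTA) %>%
  add_layer(
    # Two target variables: the outer body system and the inner preferred term
    group_count(vars(AEBODSYS, AEDECOD)) %>%
      set_nest_count(TRUE) %>%
      # Prefix applied to the inner (AEDECOD) row labels
      set_indentation("--->")
  ) %>%
  build()

head(res$row_label1)
```

With nesting on, both levels share the first row label column, so the prefix set through set_indentation is what visually separates preferred terms from body systems.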


Set a numeric cutoff

Description

[Experimental]

In certain tables, it may be necessary to only include rows that meet numeric conditions. Rows that are less than a certain cutoff can be suppressed from the output. This function allows you to pass a cutoff and a cutoff stat (n, distinct_n, pct, or distinct_pct) to suppress values that are less than the cutoff.

Usage

set_numeric_threshold(e, numeric_cutoff, stat, column = NULL)

Arguments

e

A count_layer object

numeric_cutoff

A numeric cutoff; only values greater than or equal to it will be displayed.

stat

The statistic to use when filtering out rows. Either 'n', 'distinct_n', or 'pct' are allowable

column

If only a particular column should be used to cutoff values, it can be supplied here as a character value.

Value

The modified Tplyr layer object

Examples

mtcars %>%
tplyr_table(gear) %>%
  add_layer(
    group_count(cyl) %>%
      set_numeric_threshold(10, "n") %>%
      add_total_row() %>%
      set_order_count_method("bycount")
  )
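To observe the suppression, the sketch below builds the same count layer with and without a threshold and compares the number of rows. The cutoff of 5 is an arbitrary illustrative value:

```r
library(Tplyr)
library(magrittr)

# All rows, no threshold applied
full <- tplyr_table(mtcars, gear) %>%
  add_layer(group_count(cyl)) %>%
  build()

# Same layer, suppressing rows whose n falls below the cutoff
thresholded <- tplyr_table(mtcars, gear) %>%
  add_layer(
    group_count(cyl) %>%
      set_numeric_threshold(5, "n")
  ) %>%
  build()

nrow(full)
nrow(thresholded)
```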

Set the ordering logic for the count layer

Description

The sorting of a table can greatly vary depending on the situation at hand. For count layers, when creating tables like adverse event summaries, you may wish to order the table by descending occurrence within a particular treatment group. But in other situations, such as AEs of special interest, or subject disposition, there may be a specific order you wish to display values. Tplyr offers solutions to each of these situations.

Instead of allowing you to specify a custom sort order, Tplyr provides you with order variables that can be used to sort your table after the data are summarized. Tplyr has a default order in which the table will be returned, but the order variables will always persist. This allows you to use powerful sorting functions like arrange to get your desired order, and in double programming situations, helps your validator understand how you achieved a particular sort order and where discrepancies may be coming from.

When creating order variables for a layer, for each 'by' variable Tplyr will search for a <VAR>N version of that variable (i.e. VISIT <-> VISITN, PARAM <-> PARAMN). If available, this variable will be used for sorting. If not available, Tplyr will create a new ordered factor version of that variable to use in alphanumeric sorting. This allows the user to control a custom sorting order by leaving an existing <VAR>N variable in the dataset if it exists, or creating one based on the order in which you wish to sort - no custom functions in Tplyr required.

Ordering of results is where things start to differ. Different situations call for different methods. Descriptive statistics layers keep it simple - the order in which you input your formats using set_format_strings is the order in which the results will appear (with an order variable added). For count layers, Tplyr offers three solutions: If there is a <VAR>N version of your target variable, use that. If not, if the target variable is a factor, use the factor orders. Finally, you can use a specific data point from your results columns. The result column can often have multiple data points, between the n counts, percent, distinct n, and distinct percent. Tplyr allows you to choose which of these values will be used when creating the order columns for a specified result column (i.e. based on the treat_var and cols arguments). See the 'Sorting a Table' section for more information.

Shift layers sort very similarly to count layers, but to order your row shift variable, use an ordered factor.

Usage

set_order_count_method(e, order_count_method, break_ties = NULL)

set_ordering_cols(e, ...)

set_result_order_var(e, result_order_var)

Arguments

e

A count_layer object

order_count_method

The logic determining how the rows in the final layer output will be indexed. Options are 'bycount', 'byfactor', and 'byvarn'.

break_ties

In certain cases, a 'bycount' sort will result in conflicts if the counts aren't unique. break_ties will add a decimal to the sorting column to resolve conflicts. A character value of 'asc' will add a decimal based on the alphabetical sorting. 'desc' will do the same but sort descending in case that is the intention.

...

Unquoted variables used to select the columns whose values will be extracted for ordering.

result_order_var

The numeric value the ordering will be done on. This can be either n, distinct_n, pct, or distinct_pct. Due to the evaluation of the layer, you can add a value that isn't actually being evaluated; if this happens, it will only error out in the ordering.

Value

Returns the modified layer object. The 'ord_' columns are added during the build process.

Sorting a Table

When a table is built, the output has several ordering (ord_) columns that are appended. The first represents the layer index. The index is determined by the order the layer was added to the table. Following are the indices for the by variables and the target variable. The by variables are ordered based on:

  1. The 'by' variable is a factor in the target dataset

  2. If the variable isn't a factor, but has a <VAR>N variable (i.e. VISIT -> VISITN, TRT -> TRTN)

  3. If the variable is not a factor in the target dataset, it is coerced to one and ordered alphabetically.

The target variable is ordered depending on the type of layer. See more below.

Ordering a Count Layer

There are many ways to order a count layer depending on the preferences of the table programmer. Tplyr supports sorting by a descending amount in a column in the table, sorting by a <VAR>N variable, and sorting by a custom order. These can be set using the 'set_order_count_method' function.

Sorting by a numeric count

A selected numeric value from a selected column will be indexed based on the descending numeric value. The numeric value extracted defaults to 'n' but can be changed with 'set_result_order_var'. The column selected for sorting defaults to the first value in the treatment group variable. If there were arguments passed to the 'cols' argument in the table those must be specified with 'set_ordering_cols'.

Sorting by a 'varn' variable

If the treatment variable has a <VAR>N variable, it can be indexed to that variable.

Sorting by a factor (Default)

If a factor is found for the target variable in the target dataset, that is used to order. If no factor is found, it is coerced to a factor and sorted alphabetically.

Sorting a nested count layer

If two variables are targeted by a count layer, two methods can be passed to 'set_order_count_method'. If two are passed, the first is used to sort the blocks, the second is used to sort the "inside" of the blocks. If one method is passed, that will be used to sort both.

Ordering a Desc Layer

The order of a desc layer is mostly set during the object construction. The by variables are resolved and indexed with the same logic as the count layers. The target variable is ordered based on the format strings that were used when the layer was created.

Examples

library(dplyr)

# Default sorting by factor
t <- tplyr_table(mtcars, gear) %>%
  add_layer(
    group_count(cyl)
  )
build(t)

# Sorting by <VAR>N
mtcars$cylN <- mtcars$cyl
t <- tplyr_table(mtcars, gear) %>%
  add_layer(
    group_count(cyl) %>%
      set_order_count_method("byvarn")
  )

# Sorting by row count
t <- tplyr_table(mtcars, gear) %>%
  add_layer(
    group_count(cyl) %>%
      set_order_count_method("bycount") %>%
      # Orders based on the 6 gear group
      set_ordering_cols(6)
  )

# Sorting by row count by percentages
t <- tplyr_table(mtcars, gear) %>%
  add_layer(
    group_count(cyl) %>%
      set_order_count_method("bycount") %>%
      set_result_order_var(pct)
  )

# Sorting when you have column arguments in the table
t <- tplyr_table(mtcars, gear, cols = vs) %>%
  add_layer(
    group_count(cyl) %>%
      # Uses the fourth gear group and the 0 vs group in ordering
      set_ordering_cols(4, 0)
  )

# Using a custom factor to order
mtcars$cyl <- factor(mtcars$cyl, c(6, 4, 8))
t <- tplyr_table(mtcars, gear) %>%
  add_layer(
    group_count(cyl) %>%
      # This is the default but can be used to change the setting if it is
      #set at the table level.
      set_order_count_method("byfactor")
  )
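Since the order variables persist in the built output, the final sort can be applied with dplyr after the build. The sketch below assumes the built count layer exposes the columns ord_layer_index and ord_layer_1 (following the 'ord_' naming described in this section) and relies on the default ordering column:

```r
library(Tplyr)
library(dplyr)

res <- tplyr_table(mtcars, gear) %>%
  add_layer(
    group_count(cyl) %>%
      # Index the target variable by count rather than by factor order
      set_order_count_method("bycount")
  ) %>%
  build() %>%
  # Sort by layer first, then by descending count within the layer
  arrange(ord_layer_index, desc(ord_layer_1))

res
```

Keeping the sort as an explicit arrange() call leaves a visible record of how the presented order was reached, which is the traceability benefit the Description refers to.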

Set the value of an outer nested count layer to Inf or -Inf

Description

Set the value of an outer nested count layer to Inf or -Inf

Usage

set_outer_sort_position(e, outer_sort_position)

Arguments

e

A count_layer object

outer_sort_position

Either 'asc' or 'desc'. If 'desc' the final ordering helper will be set to Inf, if 'asc' the ordering helper is set to -Inf.

Value

The modified count layer.


Set precision data

Description

In some cases, there may be organizational standards surrounding decimal precision. For example, there may be a specific standard around the representation of precision relating to lab results. As such, set_precision_data() provides an interface to provide integer and decimal precision from an external data source.

Usage

set_precision_data(layer, prec, default = c("error", "auto"))

Arguments

layer

A tplyr_layer object

prec

A dataframe following the structure specified in the function details

default

Handling of unspecified by variable groupings. Defaults to 'error'. Set to 'auto' to automatically infer any missing groups.

Details

The ultimate behavior of this feature is just that of the existing auto precision method, except that the precision is specified in the provided precision dataset rather than inferred from the source data. At a minimum, the precision dataset must contain the integer variables max_int and max_dec. If by variables are provided, those variables must be available in the layer by variables.

When the table is built, by default Tplyr will error if the precision dataset is missing by variable groupings that exist in the target dataset. This can be overridden using the default parameter. If default is set to "auto", any missing values will be automatically inferred from the source data.

Examples

prec <- tibble::tribble(
  ~vs, ~max_int, ~max_dec,
  0,        1,        1,
  1,        2,        2
)

tplyr_table(mtcars, gear) %>%
  add_layer(
    group_desc(wt, by = vs) %>%
      set_format_strings(
        'Mean (SD)' = f_str('a.a+1 (a.a+2)', mean, sd)
      ) %>%
      set_precision_data(prec) %>%
      set_precision_on(wt)
  ) %>%
  build()

Set descriptive statistics as columns

Description

In many cases, treatment groups are represented as columns within a table. But some tables call for a transposed presentation, where the treatment groups are displayed by row, and the descriptive statistics are represented as columns. set_stats_as_columns() allows Tplyr to output a built table using this transposed format and deviate away from the standard representation of treatment groups as columns.

Usage

set_stats_as_columns(e, stats_as_columns = TRUE)

Arguments

e

desc_layer on which descriptive statistics summaries should be represented as columns

stats_as_columns

Boolean to set stats as columns

Details

This function leaves all specified by variables intact. The only switch that happens during the build process is that the provided descriptive statistics are transposed as columns and the treatment variable is left as rows. Column variables will remain represented as columns, and multiple target variables will also be respected properly.

Value

The input tplyr_layer

Examples

dat <- tplyr_table(mtcars, gear) %>%
  add_layer(
    group_desc(wt, by = vs) %>%
      set_format_strings(
        "n"        = f_str("xx", n),
        "sd"       = f_str("xx.x", sd, empty = c(.overall = "BLAH")),
        "Median"   = f_str("xx.x", median),
        "Q1, Q3"   = f_str("xx, xx", q1, q3),
        "Min, Max" = f_str("xx, xx", min, max),
        "Missing"  = f_str("xx", missing)
      ) %>%
      set_stats_as_columns()
  ) %>%
  build()

Set the label for the total row

Description

The row label for a total row defaults to "Total", however this can be overridden using this function.

Usage

set_total_row_label(e, total_row_label)

Arguments

e

A count_layer object

total_row_label

A character to label the total row

Value

The modified count_layer object

Examples

# Load in pipe
library(magrittr)

t <- tplyr_table(mtcars, gear) %>%
  add_layer(
    group_count(cyl) %>%
      add_total_row() %>%
      set_total_row_label("Total Cyl")
  )
build(t)

Extract format group strings or numbers

Description

These functions allow you to extract segments of information from within a result string by targeting specific format groups. str_extract_fmt_group() allows you to pull out the individual format group string, while str_extract_num() allows you to pull out that specific numeric result.

Usage

str_extract_fmt_group(string, format_group)

str_extract_num(string, format_group)

Arguments

string

A string of number results from which to extract format groups

format_group

An integer representing the format group that should be extracted

Details

Format groups refer to individual segments of a string. For example, given the string ' 5 (34.4%) [9]', there are three separate format groups, which are ' 5', '(34.4%)', and '[9]'.

Value

A character vector

Examples

string <- c(" 0  (0.0%)", " 8  (9.3%)", "78 (90.7%)")

str_extract_fmt_group(string, 2)

str_extract_num(string, 2)

Wrap strings to a specific width with hyphenation while preserving indentation

Description

str_indent_wrap() leverages stringr::str_wrap() under the hood, but takes some extra steps to preserve any indentation that has been applied to a character element, and use hyphenated wrapping of single words that run longer than the allotted wrapping width.

Usage

str_indent_wrap(x, width = 10, tab_width = 5)

Arguments

x

An input character vector

width

The desired width of elements within the output character vector

tab_width

The number of spaces to which tabs should be converted

Details

The function stringr::str_wrap() is highly efficient, but in the context of table creation there are two select features missing - hyphenation for long running strings that overflow width, and respect for pre-indentation of a character element. For example, in an adverse event table, you may have body system rows as an un-indented column, and preferred terms as indented columns. These strings may run long and require wrapping to not surpass the column width. Furthermore, for crowded tables a single word may be longer than the column width itself.

This function takes steps to resolve these two issues, while trying to minimize additional overhead required to apply the wrapping of strings.

Note: This function automatically converts tabs to spaces. Tab width varies depending on font, so width cannot automatically be determined within a data frame. As such, users can specify the width.

Value

A character vector with string wrapping applied

Examples

ex_text1 <- c("RENAL AND URINARY DISORDERS", "   NEPHROLITHIASIS")
ex_text2 <- c("RENAL AND URINARY DISORDERS", "\tNEPHROLITHIASIS")

cat(paste(str_indent_wrap(ex_text1, width=8), collapse="\n\n"),"\n")
cat(paste(str_indent_wrap(ex_text2, tab_width=4), collapse="\n\n"),"\n")

ADAE Data

Description

A subset of the PHUSE Test Data Factory ADAE data set.

Usage

tplyr_adae

Format

A data.frame with 276 rows and 55 columns.

Source

https://github.com/phuse-org/TestDataFactory

See Also

[get_data_labels()]


ADAS Data

Description

A subset of the PHUSE Test Data Factory ADAS data set.

Usage

tplyr_adas

Format

A data.frame with 1,040 rows and 40 columns.

Source

https://github.com/phuse-org/TestDataFactory

See Also

[get_data_labels()]


ADLB Data

Description

A subset of the PHUSE Test Data Factory ADLB data set.

Usage

tplyr_adlb

Format

A data.frame with 311 rows and 46 columns.

Source

https://github.com/phuse-org/TestDataFactory

See Also

[get_data_labels()]


ADPE Data

Description

A mock-up dataset that is fit for testing data limiting

Usage

tplyr_adpe

Format

A data.frame with 21 rows and 8 columns.


ADSL Data

Description

A subset of the PHUSE Test Data Factory ADSL data set.

Usage

tplyr_adsl

Format

A data.frame with 254 rows and 49 columns.

Source

https://github.com/phuse-org/TestDataFactory

See Also

[get_data_labels()]


Create a tplyr_layer object

Description

This object is the workhorse of the Tplyr package. A tplyr_layer can be thought of as a block, or "layer" of a table. Summary tables typically consist of different sections that require different summaries. When programming these sections, your code will create different layers that need to be stacked or merged together. A tplyr_layer is the container for those isolated building blocks.

When building the tplyr_table, each layer will execute independently. When all of the data processing has completed, the layers are brought together to construct the output.

tplyr_layer objects are not created directly, but are rather created using the layer constructor functions group_count, group_desc, and group_shift.

Usage

tplyr_layer(parent, target_var, by, where, type, ...)

Arguments

parent

tplyr_table or tplyr_layer. Required. The parent environment of the layer. This must be either the tplyr_table object that the layer is contained within, or another tplyr_layer object of which the layer is a subgroup.

target_var

Symbol. Required. The variable name on which the summary is to be performed. Must be a variable within the target dataset. Enter unquoted - i.e. target_var = AEBODSYS.

by

A string, a variable name, or a list of variable names supplied using dplyr::vars

where

Call. Filter logic used to subset the target data when performing a summary.

type

"count", "desc", or "shift". Required. The category of layer - either "count" for categorical counts, "desc" for descriptive statistics, or "shift" for shift table counts

...

Additional arguments

Value

A tplyr_layer environment that is a child of the specified parent. The environment contains the objects listed below.

tplyr_layer Core Object Structure

type

This is an attribute. A string indicating the layertype, which controls the summary that will be performed.

target_var

A quosure of a name, which is the variable on which a summary will be performed.

by

A list of quosures representing either text labels or variable names used in grouping. Variable names must exist within the target dataset. Text strings submitted do not need to exist in the target dataset.

cols

A list of quosures used to determine the variables that are displayed in columns.

where

A quosure of a call that contains the filter logic used to subset the target dataset. This filtering is in addition to any subsetting done based on where criteria specified in tplyr_table.

layers

A list with class tplyr_layer_container. Initialized as empty, but serves as the container for any sublayers of the current layer. Used internally.

Different layer types will have some different bindings specific to that layer's needs.
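Because a layer is an environment rather than a list, its bindings can be inspected with ordinary environment tools. The following is a rough sketch only: the mtcars dataset, the "Cylinders" label, and the particular bindings queried are illustrative assumptions, and the exact set of bindings may differ by layer type and package version.

```r
library(Tplyr)

tab <- tplyr_table(mtcars, gear)
lyr <- group_count(tab, cyl, by = "Cylinders")

# A layer is an environment carrying the tplyr_layer class
is.environment(lyr)
class(lyr)

# The layer's enclosing environment is the parent table
identical(parent.env(lyr), tab)

# Look up bindings described above without searching parent environments
exists("target_var", envir = lyr, inherits = FALSE)
exists("by", envir = lyr, inherits = FALSE)
```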

See Also

tplyr_table

Examples

tab <- tplyr_table(iris, Sepal.Width)
l <- group_count(tab, by = vars('Label Text', Species),
                 target_var = Species, where = Sepal.Width < 5.5,
                 cols = Species)

Tplyr Metadata Object

Description

If a Tplyr table is built with the 'metadata=TRUE' option specified, then metadata is assembled behind the scenes to provide traceability on each result cell derived. The functions 'get_meta_result()' and 'get_meta_subset()' allow you to access that metadata by using an ID provided in the row_id column and the column name of the result you'd like to access. The purpose of the row_id variable, as opposed to a simple row index, is to provide a sort-resistant reference to the originating column, so the output Tplyr table can be sorted in any order while the metadata remain easily accessible.

Usage

tplyr_meta(names = list(), filters = exprs())

Arguments

names

List of symbols

filters

List of expressions

Details

The 'tplyr_meta' object provides a list with two elements: names and filters. The names contain every column from the target data.frame of the Tplyr table that factored into the specified result cell, and the filters contain all the necessary filters to subset the target data to create the specified result cell. 'get_meta_subset()' additionally provides a parameter to specify any additional columns you would like to include in the returned subset data frame.
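A minimal sketch of that workflow follows. It assumes the built-in mtcars dataset with gear as the treatment variable, so the result column for gear equal to 3 is expected to be named var1_3. The row ID is read from the built table rather than typed by hand.

```r
library(Tplyr)
library(dplyr)  # loaded here only to supply the %>% pipe

tab <- tplyr_table(mtcars, gear) %>%
  add_layer(group_count(cyl))

# metadata = TRUE asks build() to assemble the traceability metadata
result <- build(tab, metadata = TRUE)

# Take the ID of the first row from the row_id column
id <- result$row_id[1]

# Names and filters behind one result cell
meta <- get_meta_result(tab, id, "var1_3")
meta

# The rows of the target data that produced that cell
subset_rows <- get_meta_subset(tab, id, "var1_3")
subset_rows
```

Looking up the cell by row_id instead of by row position means the same call keeps working after the output table has been re-sorted.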

Value

tplyr_meta object

Examples

tplyr_meta(
  names = rlang::quos(x, y, z),
  filters = rlang::quos(x == 1, y == 2, z == 3)
)

Create a Tplyr table object

Description

The tplyr_table object is the main container upon which a Tplyr table is constructed. Tplyr tables are made up of one or more layers. Each layer contains an instruction for a summary to be performed. The tplyr_table object contains those layers, and the general data, metadata, and logic necessary.

Usage

tplyr_table(target, treat_var, where = TRUE, cols = vars())

Arguments

target

Dataset upon which summaries will be performed

treat_var

Variable containing treatment group assignments. Supply unquoted.

where

A general subset to be applied to all layers. Supply as programming logic (i.e. x < 5 & y == 10)

cols

A grouping variable to summarize data by column (in addition to treat_var). Provide multiple column variables by using vars.

Details

When a tplyr_table is created, it will contain the following bindings:

tplyr_table gives you a basic interface to instantiate the object. Modifier functions are available to change individual parameters to suit your analysis. For example, to add a total group, you can use add_total_group.
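As a brief sketch of the modifier pattern, the snippet below applies add_total_group to a table before a count layer is added. The mtcars dataset and the cyl variable are assumptions chosen for illustration, and the default group name "Total" is relied upon.

```r
library(Tplyr)
library(dplyr)  # loaded here only to supply the %>% pipe

tab <- tplyr_table(mtcars, gear) %>%
  add_total_group() %>%          # adds a "Total" treatment group
  add_layer(group_count(cyl))

result <- build(tab)

# The built table gains a result column for the total group
names(result)
```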

In future releases, we will provide vignettes to fully demonstrate these capabilities.

Value

A tplyr_table object

Examples

tab <- tplyr_table(iris, Species, where = Sepal.Length < 5.8)

Return or set the treatment variable binding

Description

Return or set the treatment variable binding

Usage

treat_var(table)
set_treat_var(table, treat_var)

Arguments

table

A tplyr_table object for which to set or return the treatment variable the table is split by.

treat_var

Variable containing treatment group assignments. Supply unquoted.

Value

For treat_var, the treat_var binding of the tplyr_table object. For set_treat_var, the modified object.

Examples

tab <- tplyr_table(mtcars, cyl)
set_treat_var(tab, gear)
