Format Precedence and NA Handling

Wojciech Wójciak and Gabriel Becker

2025-12-14

Formats Precedence

Users of the rtables package can specify the format in which the numbers in the reporting tables are printed. Formatting functionality is provided by the formatters R package. See formatters::list_valid_format_labels() for a list of all available formats. The format can be specified by the user in a few different places. It may happen that, for a single table layout, the format is specified in more than one place. In such a case, the final format that will be applied depends on format precedence rules defined by rtables. In this vignette, we describe the basic rules of rtables format precedence.
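As a quick way to see what a format label does outside of a table, the following sketch calls the formatters functions directly. It assumes the formatters package is installed; the numeric value is only an illustrative input.

```r
library(formatters)

# All valid format labels, grouped as returned by list_valid_format_labels().
fmt_labels <- list_valid_format_labels()
names(fmt_labels)

# Apply two one-value formats to the same number.
x <- 33.7686567164179
one_dp <- format_value(x, format = "xx.x")   # one decimal place
two_dp <- format_value(x, format = "xx.xx")  # two decimal places
c(one_dp, two_dp)
```

Trying a label with format_value() first can be a convenient check before wiring it into a layout.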

The examples in this vignette use the example ADSL dataset to build a demographic table that summarizes the content of its variables for different population subsets (encoded in the columns).

library(rtables)
ADSL <- ex_adsl

Note that all ex_* data currently attached to the rtables package is provided by the formatters package and was created using the publicly available random.cdisc.data R package.

Format Precedence and Inheritance Rules

The format in which numbers are printed can be specified by the user in a few different places. In the context of precedence, it matters at which level of the split hierarchy a format is specified. In general, there are two such levels: the cell level and the so-called parent table level. The concept of the cell and the parent table results from the way in which the rtables package stores resulting tables. It models the resulting tables as hierarchical, tree-like objects with the cells (as leaves) containing multiple values. Particularly noteworthy in this context is the fact that the actual table splitting occurs in a row-dominant way (even if column splitting is present in the layout). rtables provides the user-facing function table_structure(), which prints the structure of a given table object.

For a simple illustration, consider the following example:

lyt <- basic_table() %>%
  split_cols_by("ARM") %>%
  split_rows_by("SEX") %>%
  analyze(vars = "AGE", afun = mean)

adsl_analyzed <- build_table(lyt, ADSL)
adsl_analyzed
#                       A: Drug X          B: Placebo       C: Combination
# —————————————————————————————————————————————————————————————————————————
# F
#   mean             32.7594936708861   34.1168831168831   35.1969696969697
# M
#   mean             35.5686274509804   37.4363636363636   35.3833333333333
# U
#   mean             31.6666666666667          31               35.25
# UNDIFFERENTIATED
#   mean                    28                 NA                 45

table_structure(adsl_analyzed)
# [TableTree] SEX
#  [TableTree] F
#   [ElementaryTable] AGE (1 x 3)
#  [TableTree] M
#   [ElementaryTable] AGE (1 x 3)
#  [TableTree] U
#   [ElementaryTable] AGE (1 x 3)
#  [TableTree] UNDIFFERENTIATED
#   [ElementaryTable] AGE (1 x 3)

In this table, there are 4 sub-tables under the SEX table. These are: F, M, U, and UNDIFFERENTIATED. Each of these sub-tables has one sub-table, AGE. For example, for the first AGE sub-table, its parent table is F.
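The same parent-child relationships can also be read off the row paths of the table. The sketch below rebuilds the table so that it runs on its own and then lists one path per row with row_paths(); the string-collapsing step is only a display convenience and not something the vignette itself does.

```r
library(rtables)

lyt <- basic_table() %>%
  split_cols_by("ARM") %>%
  split_rows_by("SEX") %>%
  analyze(vars = "AGE", afun = mean)
tbl <- build_table(lyt, ex_adsl)

# One path per row, from the outermost row split down to the row itself.
paths <- row_paths(tbl)

# Collapse each path into a single string for easier reading.
vapply(paths, paste, character(1), collapse = " -> ")
```

Each path lists the parent tables of a row in order, which is the same order in which formats are inherited.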

The concept of hierarchical, tree-like representations of resulting tables translates directly to format precedence and inheritance rules. As a general principle, the format finally applied to a cell is the most specific one, that is, the one closest to the cell along its path in the tree. Hence, the precedence-inheritance chain looks like the following:

parent_table -> parent_table -> ... -> parent_table -> cell

In such a chain, the outermost parent_table is the least specific place to specify the format, while the cell is the most specific one. In cases where the format is specified by the user in more than one place, the one which is most specific will be applied in the cell. If no specific format has been selected by the user for the split, then the default format will be applied. The default format is "xx" and it yields the same formatting as the as.character() function. In the following sections of this vignette, we will illustrate the format precedence rules with a few examples.
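The rule described above can be modelled in a few lines of base R. This is only a conceptual sketch: resolve_format() is a hypothetical helper written for illustration, not an rtables function, and it makes no claim about how rtables implements the lookup internally.

```r
# Hypothetical helper, not part of rtables: pick the format closest to the cell.
# `chain` lists formats from the outermost parent table down to the cell;
# NULL marks a level where the user did not set a format.
resolve_format <- function(chain, default = "xx") {
  set <- Filter(Negate(is.null), chain)
  if (length(set) == 0) {
    return(default)
  }
  set[[length(set)]]
}

f_cell    <- resolve_format(list(parent = "xx.x", cell = "xx.xx")) # cell wins
f_parent  <- resolve_format(list(parent = "xx.x", cell = NULL))    # inherited
f_default <- resolve_format(list(parent = NULL, cell = NULL))      # default
c(f_cell, f_parent, f_default)
```

The last non-NULL entry in the chain is returned, which mirrors "the most specific format wins"; an empty chain falls back to the default "xx".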

Standard Format

Below is a simple layout that does not explicitly set a format for the output of the analysis function. In such a case, the default format is applied.

lyt0 <- basic_table() %>%
  split_cols_by("ARM") %>%
  analyze(vars = "AGE", afun = mean)

build_table(lyt0, ADSL)
#           A: Drug X          B: Placebo       C: Combination
# —————————————————————————————————————————————————————————————
# mean   33.7686567164179   35.4328358208955   35.4318181818182

Cell Format

The format of a cell can be explicitly specified via the rcell() or in_rows() functions. The former is essentially a collection of data objects while the latter is a collection of rcell() objects. As previously mentioned, this is the most specific place where the format can be specified by the user.

lyt1 <- basic_table() %>%
  split_cols_by("ARM") %>%
  analyze(vars = "AGE", afun = function(x) {
    rcell(mean(x), format = "xx.xx", label = "Mean")
  })

build_table(lyt1, ADSL)
#        A: Drug X   B: Placebo   C: Combination
# ——————————————————————————————————————————————
# Mean     33.77       35.43          35.43

lyt1a <- basic_table() %>%
  split_cols_by("ARM") %>%
  analyze(vars = "AGE", afun = function(x) {
    in_rows(
      "Mean" = rcell(mean(x)),
      .formats = "xx.xx"
    )
  })

build_table(lyt1a, ADSL)
#        A: Drug X   B: Placebo   C: Combination
# ——————————————————————————————————————————————
# Mean     33.77       35.43          35.43

If the format is specified in both of these places at the same time, the one specified via in_rows() takes precedence. Technically, in this case, the format defined in rcell() is simply overwritten by the one defined in in_rows(). This is because the format specified in in_rows() is applied to the cells, not the rows (overriding the previously specified cell-specific values), so the precedence rules described above still hold.

lyt2 <- basic_table() %>%
  split_cols_by("ARM") %>%
  analyze(vars = "AGE", afun = function(x) {
    in_rows(
      "Mean" = rcell(mean(x), format = "xx.xxx"),
      .formats = "xx.xx"
    )
  })

build_table(lyt2, ADSL)
#        A: Drug X   B: Placebo   C: Combination
# ——————————————————————————————————————————————
# Mean     33.77       35.43          35.43
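One way to look at the overwrite directly is to call in_rows() outside of a layout and inspect the format stored on the resulting cell. The sketch below assumes that the object returned by in_rows() can be indexed like a list and that obj_format() (from formatters, attached together with rtables) reports the cell's format; the input numbers are made up for illustration.

```r
library(rtables)

# Same construction as the analysis function in lyt2, called directly on a
# small illustrative vector rather than on ADSL.
rows <- in_rows(
  "Mean" = rcell(mean(c(30, 35, 41)), format = "xx.xxx"),
  .formats = "xx.xx"
)

# Format stored on the first (and only) cell after in_rows() has been applied.
cell_fmt <- obj_format(rows[[1]])
cell_fmt
```

If the stored format is the in_rows() one, the cell itself carries "xx.xx", which is consistent with the printed table above.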

Parent Table Format and Inheritance

In addition to the cell level, the format can be specified at the parent table level. If no format has been set by the user for a cell, the most specific format for that cell is the one defined at its innermost parent table split (if any).

lyt3 <- basic_table() %>%
  split_cols_by("ARM") %>%
  analyze(vars = "AGE", mean, format = "xx.x")

build_table(lyt3, ADSL)
#        A: Drug X   B: Placebo   C: Combination
# ——————————————————————————————————————————————
# mean     33.8         35.4           35.4

If the cell format is also specified for a cell, then the parent table format is ignored for this cell since the cell format is more specific and therefore takes precedence.

lyt4 <- basic_table() %>%
  split_cols_by("ARM") %>%
  analyze(
    vars = "AGE", afun = function(x) {
      rcell(mean(x), format = "xx.xx", label = "Mean")
    },
    format = "xx.x"
  )

build_table(lyt4, ADSL)
#        A: Drug X   B: Placebo   C: Combination
# ——————————————————————————————————————————————
# Mean     33.77       35.43          35.43

lyt4a <- basic_table() %>%
  split_cols_by("ARM") %>%
  analyze(
    vars = "AGE", afun = function(x) {
      in_rows(
        "Mean" = rcell(mean(x)),
        "SD" = rcell(sd(x)),
        .formats = "xx.xx"
      )
    },
    format = "xx.x"
  )

build_table(lyt4a, ADSL)
#        A: Drug X   B: Placebo   C: Combination
# ——————————————————————————————————————————————
# Mean     33.77       35.43          35.43
# SD       6.55         7.90           7.72

In the following, slightly more complicated example, we can observe partial inheritance. That is, only the SD cells inherit the parent table's format while the Mean cells do not.

lyt5 <- basic_table() %>%
  split_cols_by("ARM") %>%
  analyze(
    vars = "AGE", afun = function(x) {
      in_rows(
        "Mean" = rcell(mean(x), format = "xx.xx"),
        "SD" = rcell(sd(x))
      )
    },
    format = "xx.x"
  )

build_table(lyt5, ADSL)
#        A: Drug X   B: Placebo   C: Combination
# ——————————————————————————————————————————————
# Mean     33.77       35.43          35.43
# SD        6.6         7.9            7.7
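The same per-row result can also be obtained entirely at the cell level, without relying on inheritance. The sketch below is an alternative, not the vignette's own example: it assumes that the .formats argument of in_rows() accepts one entry per row, matched in row order, and the names lyt_rows and tbl_rows are introduced here for illustration.

```r
library(rtables)

ADSL <- ex_adsl

# Assumption: .formats is matched to the rows positionally (Mean, then SD).
lyt_rows <- basic_table() %>%
  split_cols_by("ARM") %>%
  analyze(vars = "AGE", afun = function(x) {
    in_rows(
      "Mean" = rcell(mean(x)),
      "SD" = rcell(sd(x)),
      .formats = c("xx.xx", "xx.x")
    )
  })

tbl_rows <- build_table(lyt_rows, ADSL)
tbl_rows
```

Setting both formats at the cell level keeps the formatting next to the statistic it applies to, whereas the parent table format is more convenient when most rows share one format.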

NA Handling

Consider the following layout and the resulting table created:

lyt6 <- basic_table() %>%
  split_cols_by("ARM") %>%
  split_rows_by("SEX") %>%
  analyze(vars = "AGE", afun = mean, format = "xx.xx")

build_table(lyt6, ADSL)
#                    A: Drug X   B: Placebo   C: Combination
# ——————————————————————————————————————————————————————————
# F
#   mean               32.76       34.12          35.20
# M
#   mean               35.57       37.44          35.38
# U
#   mean               31.67       31.00          35.25
# UNDIFFERENTIATED
#   mean               28.00         NA           45.00

In the output, the cell corresponding to the UNDIFFERENTIATED level of SEX and the B: Placebo level of ARM is displayed as NA. This occurs because there were no non-NA values under this facet that could be used to compute the mean. rtables allows the user to specify a string to display when cell values are NA. Similar to formats for numbers, the user can specify a string to replace NA with the parameter format_na_str (in rcell()), .format_na_strs (in in_rows()), or na_str (in analyze()). This can be specified at the cell or parent table level. NA string precedence and inheritance rules are the same as those for number format precedence, described in the previous section of this vignette. We will illustrate this with a few examples.
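The effect of an NA replacement string can be tried on a single value before it is used in a layout. The sketch below uses format_value() from the formatters package with its na_str argument; the replacement text is an arbitrary example.

```r
library(formatters)

# Without a replacement string, a missing value is rendered as "NA".
na_default <- format_value(NA_real_, format = "xx.xx")

# With a replacement string, the missing value is rendered as that string.
na_custom <- format_value(NA_real_, format = "xx.xx", na_str = "<missing>")

# Non-missing values are unaffected by na_str.
not_na <- format_value(28, format = "xx.xx", na_str = "<missing>")

c(na_default, na_custom, not_na)
```

The replacement string only changes how the value is displayed; the underlying cell value remains NA.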

Replacing NA Values at the Cell Level

At the cell level, it is possible to replace NA values with a custom string by means of the format_na_str parameter in rcell() or the .format_na_strs parameter in in_rows().

lyt7 <- basic_table() %>%
  split_cols_by("ARM") %>%
  split_rows_by("SEX") %>%
  analyze(vars = "AGE", afun = function(x) {
    rcell(mean(x), format = "xx.xx", label = "Mean", format_na_str = "<missing>")
  })

build_table(lyt7, ADSL)
#                    A: Drug X   B: Placebo   C: Combination
# ——————————————————————————————————————————————————————————
# F
#   Mean               32.76       34.12          35.20
# M
#   Mean               35.57       37.44          35.38
# U
#   Mean               31.67       31.00          35.25
# UNDIFFERENTIATED
#   Mean               28.00     <missing>        45.00

lyt7a <- basic_table() %>%
  split_cols_by("ARM") %>%
  split_rows_by("SEX") %>%
  analyze(vars = "AGE", afun = function(x) {
    in_rows(
      "Mean" = rcell(mean(x), format = "xx.xx"),
      .format_na_strs = "<MISSING>"
    )
  })

build_table(lyt7a, ADSL)
#                    A: Drug X   B: Placebo   C: Combination
# ——————————————————————————————————————————————————————————
# F
#   Mean               32.76       34.12          35.20
# M
#   Mean               35.57       37.44          35.38
# U
#   Mean               31.67       31.00          35.25
# UNDIFFERENTIATED
#   Mean               28.00     <MISSING>        45.00

If the NA string is specified in both of these places at the same time, the one specified with in_rows() takes precedence. Technically, in this case the NA replacement string defined in rcell() will simply be overwritten by the one defined in in_rows(). This is because the NA string specified in in_rows() is applied to the cells, not the rows (overriding the previously specified cell-specific values), which means that the precedence rules described above are still in place.

lyt8 <- basic_table() %>%
  split_cols_by("ARM") %>%
  split_rows_by("SEX") %>%
  analyze(vars = "AGE", afun = function(x) {
    in_rows(
      "Mean" = rcell(mean(x), format = "xx.xx", format_na_str = "<missing>"),
      .format_na_strs = "<MISSING>"
    )
  })

build_table(lyt8, ADSL)
#                    A: Drug X   B: Placebo   C: Combination
# ——————————————————————————————————————————————————————————
# F
#   Mean               32.76       34.12          35.20
# M
#   Mean               35.57       37.44          35.38
# U
#   Mean               31.67       31.00          35.25
# UNDIFFERENTIATED
#   Mean               28.00     <MISSING>        45.00

Parent Table Replacement of NA Values and Inheritance Principles

In addition to the cell level, the string replacement for NA values can be specified at the parent table level. If no replacement string has been specified by the user for a cell, the most specific NA string for that cell is the one defined at its innermost parent table split (if any).

lyt9 <- basic_table() %>%
  split_cols_by("ARM") %>%
  split_rows_by("SEX") %>%
  analyze(vars = "AGE", mean, format = "xx.xx", na_str = "not available")

build_table(lyt9, ADSL)
#                    A: Drug X    B: Placebo     C: Combination
# —————————————————————————————————————————————————————————————
# F
#   mean               32.76         34.12           35.20
# M
#   mean               35.57         37.44           35.38
# U
#   mean               31.67         31.00           35.25
# UNDIFFERENTIATED
#   mean               28.00     not available       45.00

If an NA replacement string is also specified at the cell level, then the one set at the parent table level is ignored for this cell, as the cell-level NA string is more specific and therefore takes precedence.

lyt10 <- basic_table() %>%
  split_cols_by("ARM") %>%
  split_rows_by("SEX") %>%
  analyze(
    vars = "AGE", afun = function(x) {
      rcell(mean(x), format = "xx.xx", label = "Mean", format_na_str = "<missing>")
    },
    na_str = "not available"
  )

build_table(lyt10, ADSL)
#                    A: Drug X   B: Placebo   C: Combination
# ——————————————————————————————————————————————————————————
# F
#   Mean               32.76       34.12          35.20
# M
#   Mean               35.57       37.44          35.38
# U
#   Mean               31.67       31.00          35.25
# UNDIFFERENTIATED
#   Mean               28.00     <missing>        45.00

lyt10a <- basic_table() %>%
  split_cols_by("ARM") %>%
  split_rows_by("SEX") %>%
  analyze(
    vars = "AGE", afun = function(x) {
      in_rows(
        "Mean" = rcell(mean(x)),
        "SD" = rcell(sd(x)),
        .formats = "xx.xx",
        .format_na_strs = "<missing>"
      )
    },
    na_str = "not available"
  )

build_table(lyt10a, ADSL)
#                    A: Drug X   B: Placebo   C: Combination
# ——————————————————————————————————————————————————————————
# F
#   Mean               32.76       34.12          35.20
#   SD                 6.09         7.06           7.43
# M
#   Mean               35.57       37.44          35.38
#   SD                 7.08         8.69           8.24
# U
#   Mean               31.67       31.00          35.25
#   SD                 3.21         5.66           3.10
# UNDIFFERENTIATED
#   Mean               28.00     <missing>        45.00
#   SD               <missing>   <missing>         1.41

In the following, slightly more complicated example, we can observe partial inheritance of NA strings. That is, only the SD cells inherit the parent table's NA string, while the Mean cells do not.

lyt11 <- basic_table() %>%
  split_cols_by("ARM") %>%
  split_rows_by("SEX") %>%
  analyze(
    vars = "AGE", afun = function(x) {
      in_rows(
        "Mean" = rcell(mean(x), format_na_str = "<missing>"),
        "SD" = rcell(sd(x))
      )
    },
    format = "xx.xx",
    na_str = "not available"
  )

build_table(lyt11, ADSL)
#                      A: Drug X      B: Placebo     C: Combination
# —————————————————————————————————————————————————————————————————
# F
#   Mean                 32.76           34.12           35.20
#   SD                   6.09            7.06             7.43
# M
#   Mean                 35.57           37.44           35.38
#   SD                   7.08            8.69             8.24
# U
#   Mean                 31.67           31.00           35.25
#   SD                   3.21            5.66             3.10
# UNDIFFERENTIATED
#   Mean                 28.00         <missing>         45.00
#   SD               not available   not available        1.41
