- Notifications
You must be signed in to change notification settings - Fork0
Functions for automated statistical analysis of Xenopus Eleutherembryonic Thyroid Assays (XETA), Rapid Androgen Disruption Activity Reporter (RADAR) assays and Rapid Estrogen Activity In Vitro (REACTIV) assays.
basf/xeredar
Folders and files
Name | Name | Last commit message | Last commit date | |
---|---|---|---|---|
Repository files navigation
The package xeredar is an R-package for analysis of the New ApproachMethodology (NAM) assays of XETA (Xenopus Eleutheroembryonic Thyroid),RADAR (Rapid Androgen Disruption Activity Reporter) and REACTIV (RapidEstrogen ACTivity In Vivo) for assessing endocrine effects of chemicalson the thyroid, androgen/steroid and estrogen axis. The functionality isbased on the SAS-script recommended in the Annex 13 of OECD testguideline No. 248 of the XETA assay(2019),written by John Green.
You can install xeredar using one of the followig commands:
devtools::install_github("basf/xeredar")pak::pkg_install("basf/xeredar")
Data frames that are supposed to be analyzed with xeredar needs tofulfill certain requirements. The data frame or tibble needs to containthe following column headers:
knitr::kable(head(xeredar::testDataSpiked))
Replicate | Treatment | Row | Fluor | Conc | |
---|---|---|---|---|---|
11 | 1 | 0 + T3 | 4 | 19.768 | 0 |
12 | 1 | 0 + T3 | 4 | 27.928 | 0 |
13 | 1 | 0 + T3 | 4 | 29.592 | 0 |
14 | 1 | 0 + T3 | 4 | 22.816 | 0 |
15 | 1 | 0 + T3 | 4 | 26.080 | 0 |
16 | 1 | 0 + T3 | 4 | 25.332 | 0 |
The type of each column should be accordingly:
knitr::kable(purrr::map_df(xeredar::testDataSpiked, class))
Replicate | Treatment | Row | Fluor | Conc |
---|---|---|---|---|
factor | character | character | numeric | ordered |
factor | character | character | numeric | factor |
Replicate (i.e. run),Treatment (i.e. a unique name for eachtreatment level in either spiked or unspiked mode) andRow(i.e. exposure vessel) can either befactor orcharacter columns,butFluor (i.e. measured fluorescence) must always benumeric andConc (i.e. concentration of test item) must always containorderedfactors. The order of the columns is not relevant.
Replicate,Treatment andRow can either befactor orcharacter columns, butFluor must always benumeric andConcmust always containordered factors. The order of the columns is notrelevant. It is important that the decimal separator is a period insteadof a comma.
When simply aiming to use thedata_prep()
function, the data frameneeds to either contain spiked treatments or unspiked treatments. Whenstill having spiked and unspiked treatments in one data frame theyshould be separated. Imagine you have a XETA data frame (dat) whichcontains spiked and unspiked treatments as well as the T4 positivecontrol. The T3 or T4 additions are designated by “T3” and “T4” in theTreatment column. The spiked and unspiked datasets could quickly besubset using the following code:
datSpiked <- dat[grepl("T3",dat$Treatment),] # spiked data datUnspiked <- setdiff(dat,datSpiked)datUnspiked <- datUnspiked[which(datUnspiked$Treatment != "T4"),] # unspiked data
To demonstrate how to run XETA analysis, we will use one of the datasets by the French lab containing the spiked and unspiked measurementsfrom the XETA ring test as included in OECD test guideline No. 248 ofthe XETA assay(2019).
xeta_spiked <- xeredar::valid_data_xeta[["ptu_france_spiked"]]xeta_unspiked <- xeredar::valid_data_xeta[["ptu_france_unspiked"]]
The default XETA analysis can be run using thedata_prep()
functionwith either spiked or unspiked data. This function automatically decideswhether trimming, outlier removal and/or transformations are conductedfollowing the manuscript (Spyridonov et al. 2025).The actual analysis is carried out by theana()
function. Theana()
function iscalled by thedata_prep()
function and does not need to be calledseparately. For this dataset, the exposure well ID (Row of the 96 wellplate) is not recorded, therefore, we set therow
argument toFALSE
.In this case, we use the reduced mixed ANOVA model where the exposurewell ID is not included as a random effect. Please specifyrow=TRUE
ifthe exposure well ID is recorded and you want to use the full mixedANOVA model.
xeta_spiked_result <- xeredar::data_prep(dataframe = xeta_spiked, row = FALSE)xeta_unspiked_result <- xeredar::data_prep(dataframe = xeta_unspiked, row= FALSE)
Here we use the spike data as an example to demonstrate the output ofthedata_prep()
function.
The outputs of thedata_prep()
function are lists containing thefollowing elements:
- A reasoning for the recommended transformation and trimming. The rawdata should be used for the analysis because the residuals of themixed ANOVA are normally distributed and show homogeneous variancesamong treatment groups.
xeta_spiked_result$Justify#> [1] "The raw data should be used for the analysis because\n the residuals of the mixed ANOVA are normally\n distributed and show homogeneous variances\n among treatment groups."
- A data frame of the processed data (e.g. raw data, trimmed,transformed, or outlier removed) used for actual statistical testingfollowing the reasoning. The box plots of the processed data perrun/replicate (i.e. each panel represents each run/replicate) arealso provided for visual inspection.
knitr::kable(head(xeta_spiked_result$ProcessedData))
Replicate | Treatment | Fluor | Conc | Country | Substance | Spiked |
---|---|---|---|---|---|---|
1 | FETAXT3 | 4989.533 | 0 | france | ptu | TRUE |
1 | FETAXT3 | 5002.533 | 0 | france | ptu | TRUE |
1 | FETAXT3 | 6331.533 | 0 | france | ptu | TRUE |
1 | FETAXT3 | 4645.533 | 0 | france | ptu | TRUE |
1 | FETAXT3 | 4977.533 | 0 | france | ptu | TRUE |
1 | FETAXT3 | 6229.533 | 0 | france | ptu | TRUE |
xeta_spiked_result$BoxPlots
- Summary tables of the processed data (per replicate and overall)
knitr::kable(xeta_spiked_result$SummaryDF_Rep, caption="Summary statistics of fluorescence in different concentrations of test item per replicate")
Conc | Replicate | N | Mean | Standard deviation | Coefficient of variation |
---|---|---|---|---|---|
0 | 1 | 19 | 4651.060 | 944.7420 | 0.2031240 |
0 | 2 | 20 | 4024.300 | 603.6731 | 0.1500070 |
0 | 3 | 19 | 4952.225 | 1179.3597 | 0.2381475 |
1 | 1 | 19 | 3854.137 | 828.3944 | 0.2149364 |
1 | 2 | 20 | 4557.867 | 558.4921 | 0.1225337 |
1 | 3 | 19 | 4720.819 | 619.8992 | 0.1313118 |
3 | 1 | 19 | 4671.428 | 976.9674 | 0.2091368 |
3 | 2 | 19 | 4486.538 | 783.8641 | 0.1747147 |
3 | 3 | 20 | 5421.433 | 703.9490 | 0.1298456 |
10 | 1 | 20 | 4573.083 | 946.2029 | 0.2069070 |
10 | 2 | 19 | 4329.538 | 827.4898 | 0.1911266 |
10 | 3 | 19 | 5106.849 | 876.5940 | 0.1716507 |
30 | 1 | 20 | 5391.183 | 809.5849 | 0.1501683 |
30 | 2 | 19 | 4556.275 | 694.1698 | 0.1523547 |
30 | 3 | 19 | 5164.691 | 763.8837 | 0.1479050 |
100 | 1 | 19 | 5181.060 | 858.5454 | 0.1657085 |
100 | 2 | 19 | 5124.381 | 733.4611 | 0.1431317 |
100 | 3 | 20 | 6048.083 | 959.5900 | 0.1586602 |
knitr::kable(xeta_spiked_result$SummaryDF, caption="Summary statistics of fluorescence in different concentrations of test item of all replicates")
Conc | N | Mean | Standard deviation | Coefficient of variation |
---|---|---|---|---|
0 | 58 | 4533.593 | 998.2901 | 0.2201984 |
1 | 58 | 4380.715 | 764.2147 | 0.1744498 |
3 | 58 | 4869.483 | 910.7568 | 0.1870336 |
10 | 58 | 4668.156 | 928.9066 | 0.1989879 |
30 | 58 | 5043.483 | 825.4425 | 0.1636652 |
100 | 58 | 5461.466 | 945.7370 | 0.1731654 |
- Tables of results evaluated using increasing/decreasing Williamstest and/or Dunnett’s test, if applicable.
In the the Williams’ test result tables,Y.Tilde is the amalgamatedmean of the fluorescence in each treatment group,Y0 is the mean ofthe control fluorescence,DIFF is the estimated difference between thetreatment and the control,SE_DIFF is the standard error of theWilliams’ test,DF is the degrees of freedom for Williams’ test,WILLIncr orWill Decr is the Williams’ test statistic,crit Val is thecritical value of Williams distribution,Sign suggests if there issignificant difference between the treatment and the control, and%Incr is the percent increase of the fluorescence compared to thecontrol.
In the Dunnett’s test result table,Estimate is the estimateddifference between the treatment and the control,SE is the standarderror of the mixed ANOVA model,t value is the Dunnett’s teststatistic,adj p is the adjusted p value and%Incr is the percentincrease of the fluorescence compared to the control.
knitr::kable(xeta_spiked_result$WilliamsIncrease, caption="Increasing Williams' test")
Conc | Y.Tilde | Y0 | DIFF Incr | SE_DIFF | DF | WILL Incr | crit Val | Sign | % Incr | |
---|---|---|---|---|---|---|---|---|---|---|
Conc100 - Conc0 | 100 | 5461.47 | 4533.593 | 927.8769 | 249.7868 | 10 | 3.7146758 | 1.971 | TRUE | 20.466613 |
Conc30 - Conc0 | 30 | 5043.48 | 4533.593 | 509.8869 | 249.7868 | 10 | 2.0412886 | 1.965 | TRUE | 11.246933 |
Conc10 - Conc0 | 10 | 4768.82 | 4533.593 | 235.2269 | 249.7868 | 10 | 0.9417108 | 1.956 | FALSE | 2.968123 |
Conc3 - Conc0 | 3 | 4768.82 | 4533.593 | 235.2269 | 249.7868 | 10 | 0.9417108 | 1.940 | FALSE | 7.408918 |
Conc1 - Conc0 | 1 | 4380.72 | 4533.593 | -152.8731 | 249.7845 | 10 | -0.6120201 | 1.908 | FALSE | -3.372111 |
knitr::kable(xeta_spiked_result$WilliamsDecrease, caption="Decreasing Williams' test")
Conc | Y.Tilde | Y0 | DIFF Decr | SE_DIFF | DF | WILL Decr | crit Val | Sign | % Incr | |
---|---|---|---|---|---|---|---|---|---|---|
Conc100 - Conc0 | 100 | 4884.66 | 4533.593 | -351.0669 | 249.7868 | 10 | -1.405466 | 1.971 | FALSE | 20.466613 |
Conc30 - Conc0 | 30 | 4884.66 | 4533.593 | -351.0669 | 249.7868 | 10 | -1.405466 | 1.965 | FALSE | 11.246933 |
Conc10 - Conc0 | 10 | 4884.66 | 4533.593 | -351.0669 | 249.7868 | 10 | -1.405466 | 1.956 | FALSE | 2.968123 |
Conc3 - Conc0 | 3 | 4884.66 | 4533.593 | -351.0669 | 249.7868 | 10 | -1.405466 | 1.940 | FALSE | 7.408918 |
Conc1 - Conc0 | 1 | 4884.66 | 4533.593 | -351.0669 | 249.7845 | 10 | -1.405479 | 1.908 | FALSE | -3.372111 |
knitr::kable(xeta_spiked_result$Dunnetts, caption="Dunnett's test")
Estimate | SE | df | t value | adj p | % Incr | |
---|---|---|---|---|---|---|
Conc1 - Conc0 | -160.4067 | 249.7845 | 10 | -0.6421806 | 0.9458404 | -3.372111 |
Conc3 - Conc0 | 320.1063 | 249.7868 | 10 | 1.2815181 | 0.5970718 | 7.408918 |
Conc10 - Conc0 | 128.8281 | 249.7868 | 10 | 0.5157522 | 0.9769722 | 2.968123 |
Conc30 - Conc0 | 499.2993 | 249.7868 | 10 | 1.9989021 | 0.2367993 | 11.246933 |
Conc100 - Conc0 | 911.7088 | 249.7868 | 10 | 3.6499482 | 0.0174131 | 20.466613 |
- Further information about the normality test (Shapiro-Wilk), thehomogeneity of variance test (Levene’s test) of residuals of themixed ANOVA model, the monotonicity test and the model fit.
xeta_spiked_result$NormalityTest#> #> Shapiro-Wilk normality test#> #> data: stats::resid(mixedaov)#> W = 0.99186, p-value = 0.05343xeta_spiked_result$LeveneTest#> # A tibble: 1 × 4#> statistic p.value df df.residual#> <dbl> <dbl> <int> <int>#> 1 0.759 0.580 5 342xeta_spiked_result$`Monotonicity Test`#> Test t value Pr(>|t|) Significance#> 1 Linear 6.49 <0.0001 ***#> 2 Quadratic 2.14 0.0335 *xeta_spiked_result$MixedAnova#> Linear mixed model fit by REML ['lmerMod']#> Formula: #> Fluor ~ Conc + (1 | Replicate) + (1 | Replicate:Conc)#> Data: dataframe#> REML criterion at convergence: 5608.294#> Random effects:#> Groups Name Std.Dev.#> Replicate:Conc (Intercept) 241.1 #> Replicate (Intercept) 350.5 #> Residual 827.7 #> Number of obs: 348, groups: #> Replicate:Conc, 18; Replicate, 3#> Fixed Effects:#> (Intercept) Conc.L Conc.Q Conc.C #> 4824.2 758.5 264.5 52.6 #> Conc^4 Conc^5 #> 149.8 -270.8
The list output from running the data_prep() function can besummarized with the data_summary() function.
xeredar::data_summary(xeta_spiked_result) |> knitr::kable()
1 | 3 | 10 | 30 | 100 | |
---|---|---|---|---|---|
Replicate 1 | -17.13 | 0.44 | -1.68 | 15.91 | 11.4 |
Replicate 2 | 13.26 | 11.49 | 7.58 | 13.22 | 27.34 |
Replicate 3 | -4.67 | 9.47 | 3.12 | 4.29 | 22.13 |
Pooled | -3.37 | 7.41 | 2.97 | 11.25 | 20.47 |
Dunnett | ns | ns | ns | ns | * |
IncreasingWilliams | ns | ns | ns | * | * |
DecreasingWilliams | ns | ns | ns | ns | ns |
To demonstrate how to run RADAR analysis, we will use one of the datasets by the Pos_mDHT_Fraunhofer_RADAR containing the spiked andunspiked measurements from the RADAR study validation in the labFraunhofer with an androgen axis active chemical.
radar_spiked <- xeredar::RADAR_valid_data_table_spiked_unspiked[["mDHTFRAUNH_Spiked"]]radar_unspiked <- xeredar::RADAR_valid_data_table_spiked_unspiked[["mDHTFRAUNH_Unspiked"]]
The default radar analysis can be run using thedata_prep()
functionwith either spiked or unspiked data. This function automatically decideswhether trimming, outlier removal and/or transformations are conductedfollowing the manuscript (Spyridonov et al. unpublished). The actualanalysis is carried out by theana()
function. Theana()
function iscalled by thedata_prep()
function and does not need to be calledseparately. the analysis the RADAR assay follows the description of theMethod 2 (the mixed ANOVA approach) in the Annex 8: methods for thestatistical analysis of RADAR assay data of the OECD TG 251(2022).For this dataset, the exposure well ID (Row of the 96 well plate) is notrecorded, therefore, we set therow
argument toFALSE
. In this case,we use the reduced mixed ANOVA model where the exposure well ID is notincluded as a random effect. Please specifyrow=TRUE
if the exposurewell ID is recorded and you want to use the full mixed ANOVA model.Trimming is not required.
radar_spiked_result <- xeredar::data_prep(dataframe = radar_spiked, row = FALSE, trimming=FALSE)radar_unspiked_result <- xeredar::data_prep(dataframe = radar_unspiked, row= FALSE, trimming=FALSE)
Here we use the spike data as an example to demonstrate the output ofthedata_prep()
function.
The outputs of thedata_prep()
function are lists containing thefollowing elements:
- A reasoning for the recommended transformation and trimming. The rawdata (without trimming or outlier removal) where the fluorescencevalues are log transformed should be used for the analysis becauseonly after log transformation, the residuals of the mixed ANOVA arenormally distributed and have homogeneous variances among treatmentgroups.
radar_spiked_result$Justify#> [1] "The raw data (without trimming or outlier removal) where\n the fluorescence values are log transformed should be used\n for the analysis because only after log transformation,\n the residuals of the mixed ANOVA are normally distributed\n and have homogeneous variances among treatment groups."
- A data frame of the processed data (e.g. raw data, transformed, oroutlier removed) used for actual statistical testing following thereasoning. The box plots of the processed data per run/replicate(i.e. each panel represents each run/replicate) are also providedfor visual inspection.
knitr::kable(head(radar_spiked_result$ProcessedData))
lab | Compound | Treatment | Fluor | Conc | Replicate |
---|---|---|---|---|---|
FRAUNH | mDHT | FETAX MT | 113152 | 0 | 1 |
FRAUNH | mDHT | mDHT-1µg/L + 17MT | 708 | 1 | 1 |
FRAUNH | mDHT | mDHT-2µg/L + 17MT | 2498 | 2 | 1 |
FRAUNH | mDHT | mDHT-4µg/L + 17MT | 10152 | 4 | 1 |
FRAUNH | mDHT | mDHT-8µg/L + 17MT | 356 | 8 | 1 |
FRAUNH | mDHT | mDHT-16µg/L + 17MT | 4621 | 16 | 1 |
radar_spiked_result$BoxPlots
- Summary tables of the processed data (per replicate and overall)
knitr::kable(radar_spiked_result$SummaryDF_Rep, caption="Summary statistics of fluorescence in different concentrations of test item per replicate")
Conc | Replicate | N | Mean | Standard deviation | Coefficient of variation |
---|---|---|---|---|---|
0 | 1 | 17 | 23818.471 | 43072.456 | 1.808364 |
0 | 2 | 17 | 15009.824 | 37062.085 | 2.469189 |
0 | 3 | 17 | 29657.294 | 37428.116 | 1.262021 |
1 | 1 | 17 | 6505.471 | 10835.321 | 1.665571 |
1 | 2 | 17 | 10271.000 | 14904.156 | 1.451091 |
1 | 3 | 17 | 20555.882 | 29472.508 | 1.433775 |
2 | 1 | 17 | 29385.941 | 47143.811 | 1.604298 |
2 | 2 | 17 | 8911.118 | 10565.718 | 1.185678 |
2 | 3 | 17 | 41815.706 | 54205.056 | 1.296285 |
4 | 1 | 17 | 15968.941 | 33341.972 | 2.087926 |
4 | 2 | 17 | 14532.765 | 21802.622 | 1.500239 |
4 | 3 | 17 | 19419.471 | 28057.724 | 1.444824 |
8 | 1 | 17 | 5178.118 | 8983.787 | 1.734952 |
8 | 2 | 17 | 8220.882 | 15084.270 | 1.834872 |
8 | 3 | 17 | 16399.647 | 23315.385 | 1.421700 |
16 | 1 | 17 | 6119.000 | 8448.481 | 1.380696 |
16 | 2 | 17 | 7955.706 | 15390.405 | 1.934512 |
16 | 3 | 17 | 17665.412 | 19842.708 | 1.123252 |
knitr::kable(radar_spiked_result$SummaryDF, caption="Summary statistics of fluorescence in different concentrations of test item of all replicates")
Conc | N | Mean | Standard deviation | Coefficient of variation |
---|---|---|---|---|
0 | 51 | 22828.529 | 38967.63 | 1.706971 |
1 | 51 | 12444.118 | 20556.80 | 1.651929 |
2 | 51 | 26704.255 | 43299.93 | 1.621462 |
4 | 51 | 16640.392 | 27641.60 | 1.661115 |
8 | 51 | 9932.882 | 17189.94 | 1.730609 |
16 | 51 | 10580.039 | 15836.94 | 1.496870 |
- Tables of results evaluated using increasing/decreasing Williamstest and/or Dunnett’s test, if applicable.
In the the Williams’ test result tables,Y.Tilde is the amalgamatedmean of the fluorescence in each treatment group,Y0 is the mean ofthe control fluorescence,DIFF is the estimated difference between thetreatment and the control,SE_DIFF is the standard error of theWilliams’ test,DF is the degrees of freedom for Williams’ test,WILLIncr orWill Decr is the Williams’ test statistic,crit Val is thecritical value of Williams distribution,Sign suggests if there issignificant difference between the treatment and the control, and%Incr is the percent increase of the fluorescence compared to thecontrol.
In the Dunnett’s test result table,Estimate is the estimateddifference between the treatment and the control,SE is the standarderror of the mixed ANOVA model,t value is the Dunnett’s teststatistic,adj p is the adjusted p value and%Incr is the percentincrease of the fluorescence compared to the control.
knitr::kable(radar_spiked_result$WilliamsIncrease, caption="Increasing Williams' test")
Conc | Y.Tilde | Y0 | DIFF Incr | SE_DIFF | DF | WILL Incr | crit Val | Sign | % Incr | |
---|---|---|---|---|---|---|---|---|---|---|
Conc16 - Conc0 | 16 | 8.38858 | 8.51927 | -0.13069 | 0.3478327 | 10 | -0.3757267 | 1.971 | FALSE | -53.65431 |
Conc8 - Conc0 | 8 | 8.38858 | 8.51927 | -0.13069 | 0.3478327 | 10 | -0.3757267 | 1.965 | FALSE | -56.48917 |
Conc4 - Conc0 | 4 | 8.38858 | 8.51927 | -0.13069 | 0.3478327 | 10 | -0.3757267 | 1.956 | FALSE | -27.10703 |
Conc2 - Conc0 | 2 | 8.38858 | 8.51927 | -0.13069 | 0.3478327 | 10 | -0.3757267 | 1.940 | FALSE | 16.97755 |
Conc1 - Conc0 | 1 | 8.07433 | 8.51927 | -0.44494 | 0.3478327 | 10 | -1.2791783 | 1.908 | FALSE | -45.48875 |
knitr::kable(radar_spiked_result$WilliamsDecrease, caption="Decreasing Williams' test")
Conc | Y.Tilde | Y0 | DIFF Decr | SE_DIFF | DF | WILL Decr | crit Val | Sign | % Incr | |
---|---|---|---|---|---|---|---|---|---|---|
Conc16 - Conc0 | 16 | 7.99566 | 8.51927 | 0.52361 | 0.3478327 | 10 | 1.5053503 | 1.971 | FALSE | -53.65431 |
Conc8 - Conc0 | 8 | 7.99566 | 8.51927 | 0.52361 | 0.3478327 | 10 | 1.5053503 | 1.965 | FALSE | -56.48917 |
Conc4 - Conc0 | 4 | 8.54577 | 8.51927 | -0.02650 | 0.3478327 | 10 | -0.0761861 | 1.956 | FALSE | -27.10703 |
Conc2 - Conc0 | 2 | 8.54577 | 8.51927 | -0.02650 | 0.3478327 | 10 | -0.0761861 | 1.940 | FALSE | 16.97755 |
Conc1 - Conc0 | 1 | 8.54577 | 8.51927 | -0.02650 | 0.3478327 | 10 | -0.0761861 | 1.908 | FALSE | -45.48875 |
knitr::kable(radar_spiked_result$Dunnetts, caption="Dunnett's test")
Estimate | SE | df | t value | adj p | % Incr | |
---|---|---|---|---|---|---|
Conc1 - Conc0 | -0.4449358 | 0.3478327 | 10 | -1.279166 | 0.5986692 | -45.48875 |
Conc2 - Conc0 | 0.4707977 | 0.3478327 | 10 | 1.353518 | 0.5520005 | 16.97755 |
Conc4 - Conc0 | 0.0536514 | 0.3478327 | 10 | 0.154245 | 0.9999132 | -27.10703 |
Conc8 - Conc0 | -0.5776610 | 0.3478327 | 10 | -1.660744 | 0.3794252 | -56.48917 |
Conc16 - Conc0 | -0.4695496 | 0.3478327 | 10 | -1.349930 | 0.5543160 | -53.65431 |
- Further information about the normality test (Shapiro-Wilk), thehomogeneity of variance test (Levene’s test) of residuals of themixed ANOVA model, the monotonicity test and the model fit.
radar_spiked_result$NormalityTest#> #> Shapiro-Wilk normality test#> #> data: stats::resid(mixedaov)#> W = 0.99071, p-value = 0.05007radar_spiked_result$LeveneTest#> # A tibble: 1 × 4#> statistic p.value df df.residual#> <dbl> <dbl> <int> <int>#> 1 0.474 0.796 5 300radar_spiked_result$`Monotonicity Test`#> Test t value Pr(>|t|) Significance#> 1 Linear -1.57 0.1167 .#> 2 Quadratic -1.48 0.1400 .radar_spiked_result$MixedAnova#> Linear mixed model fit by REML ['lmerMod']#> Formula: #> log(Fluor) ~ Conc + (1 | Replicate) + (1 | Replicate:Conc)#> Data: dataframe#> REML criterion at convergence: 1218.932#> Random effects:#> Groups Name Std.Dev.#> Replicate:Conc (Intercept) 0.000 #> Replicate (Intercept) 0.466 #> Residual 1.756 #> Number of obs: 306, groups: #> Replicate:Conc, 18; Replicate, 3#> Fixed Effects:#> (Intercept) Conc.L Conc.Q Conc.C #> 8.35799 -0.37806 -0.37347 0.01863 #> Conc^4 Conc^5 #> 0.68924 -0.25055 #> optimizer (nloptwrap) convergence code: 0 (OK) ; 0 optimizer warnings; 1 lme4 warnings
The list output from running the data_prep() function can besummarized with the data_summary() function.
xeredar::data_summary(radar_spiked_result) |> knitr::kable()
1 | 2 | 4 | 8 | 16 | |
---|---|---|---|---|---|
Replicate 1 | -72.69 | 23.37 | -32.96 | -78.26 | -74.31 |
Replicate 2 | -31.57 | -40.63 | -3.18 | -45.23 | -47 |
Replicate 3 | -30.69 | 41 | -34.52 | -44.7 | -40.43 |
Pooled | -45.49 | 16.98 | -27.11 | -56.49 | -53.65 |
Dunnett | ns | ns | ns | ns | ns |
IncreasingWilliams | ns | ns | ns | ns | ns |
DecreasingWilliams | ns | ns | ns | ns | ns |
To demonstrate how to run REACTIV analysis, we will use one artificialdata set containing the spiked and unspiked measurements.
reactiv_spiked <- xeredar::REACTIV_valid_data_table_spiked_unspiked[["Anastrozole_UK"]]$Spiked reactiv_unspiked <- xeredar::REACTIV_valid_data_table_spiked_unspiked[["Anastrozole_UK"]]$Unspiked
The default REACTIV analysis can be run using thedata_prep()
functionwith either spiked or unspiked data. This function automatically decideswhether trimming, outlier removal and/or transformations are conductedfollowing the manuscript (Spyridonov et al. unpublished). The actualanalysis is carried out by theana()
function. Theana()
function iscalled by thedata_prep()
function and does not need to be calledseparately. The analysis the REACTIV assay follows the description ofthe Method 2 (the mixed ANOVA approach) in the Annex 8: methods for thestatistical analysis of REACTIV assay data of the Amended Draft new TestGuideline for the REACTIV assay for second WNT-review(30.01.2024).For this assay, therow
argument should be set toFALSE
. Trimming isnot required. In case there are residuals deviate from normality andvariance homogeneity, outlier removal (e.g. by applying the Tukey rule(Green et al., 2018) and data transformation (for example log- orsquare-root) can be conducted.
reactiv_spiked_result <- xeredar::data_prep(dataframe = reactiv_spiked, row = FALSE, trimming=FALSE, boxcox = FALSE)reactiv_unspiked_result <- xeredar::data_prep(dataframe = reactiv_unspiked, row= FALSE, trimming=FALSE, boxcox = FALSE)
Here we use the spiked data as an example to demonstrate the output ofthedata_prep()
function.
The outputs of thedata_prep()
function are lists containing thefollowing elements:
- A reasoning for the recommended transformation and trimming. Thedata from which outliers were removed with the Tukey-rule where thefluorescence values are square-root transformed, should be used forthe analysis because only after outlier removal and sqrttransformation, the residuals of the mixed ANOVA are normallydistributed and have homogeneous variances among treatment groups
reactiv_spiked_result$Justify#> [1] "The data from which outliers were removed with the\n Tukey-rule where the fluorescence values are square-root\n transformed, should be used for the analysis because only\n after outlier removal and sqrt transformation, the\n residuals of the mixed ANOVA are normally distributed\n and have homogeneous variances among treatment groups"
- A data frame of the processed data (e.g. raw data, transformed, oroutlier removed) used for actual statistical testing following thereasoning. The box plots of the processed data per run/replicate(i.e. each panel represents each run/replicate) are also providedfor visual inspection.
knitr::kable(head(reactiv_spiked_result$ProcessedData))
Treatment | Fluor | Conc | Replicate | Compound | Country |
---|---|---|---|---|---|
AN-0,18mg/L + Testostérone | 27037.33 | 0.18 | 1 | Anastrozole | UK |
AN-0,36mg/L + Testostérone | 201316.67 | 0.36 | 1 | Anastrozole | UK |
AN-0,73mg/L + Testostérone | 34651.00 | 0.73 | 1 | Anastrozole | UK |
AN-1,45mg/L + Testostérone | 289984.67 | 1.45 | 1 | Anastrozole | UK |
AN-2,9mg/L + Testostérone | 233271.00 | 2.9 | 1 | Anastrozole | UK |
AN-0,18mg/L + Testostérone | 64846.33 | 0.18 | 1 | Anastrozole | UK |
reactiv_spiked_result$BoxPlots
- Summary tables of the processed data (per replicate and overall)
knitr::kable(reactiv_spiked_result$SummaryDF_Rep, caption="Summary statistics of fluorescence in different concentrations of test item per replicate")
Conc | Replicate | N | Mean | Standard deviation | Coefficient of variation |
---|---|---|---|---|---|
0 | 1 | 6 | 1417505.4 | 367130.47 | 0.2589976 |
0 | 2 | 5 | 1299380.0 | 241756.84 | 0.1860555 |
0 | 3 | 6 | 1699476.3 | 382759.66 | 0.2252221 |
0.18 | 1 | 8 | 216154.0 | 211883.73 | 0.9802445 |
0.18 | 2 | 7 | 149156.5 | 116877.63 | 0.7835909 |
0.18 | 3 | 8 | 255036.8 | 162183.70 | 0.6359228 |
0.36 | 1 | 8 | 149004.8 | 96508.01 | 0.6476841 |
0.36 | 2 | 7 | 230237.3 | 105711.23 | 0.4591404 |
0.36 | 3 | 7 | 199221.1 | 116329.38 | 0.5839210 |
0.73 | 1 | 8 | 169331.4 | 115443.55 | 0.6817610 |
0.73 | 2 | 7 | 187208.8 | 215340.83 | 1.1502710 |
0.73 | 3 | 8 | 292332.4 | 139121.01 | 0.4759001 |
1.45 | 1 | 8 | 116438.8 | 102761.27 | 0.8825347 |
1.45 | 2 | 7 | 173724.2 | 134626.51 | 0.7749439 |
1.45 | 3 | 7 | 236771.7 | 100056.61 | 0.4225869 |
2.9 | 1 | 8 | 159732.2 | 78955.44 | 0.4942987 |
2.9 | 2 | 6 | 234475.2 | 170916.42 | 0.7289317 |
2.9 | 3 | 8 | 247387.4 | 127553.16 | 0.5156009 |
knitr::kable(reactiv_spiked_result$SummaryDF, caption="Summary statistics of fluorescence in different concentrations of test item of all replicates")
Conc | N | Mean | Standard deviation | Coefficient of variation |
---|---|---|---|---|
0 | 17 | 1482281.8 | 363637.5 | 0.2453228 |
0.18 | 23 | 209287.9 | 168250.6 | 0.8039193 |
0.36 | 22 | 190829.4 | 106636.7 | 0.5588065 |
0.73 | 23 | 217555.3 | 161918.4 | 0.7442633 |
1.45 | 22 | 172953.7 | 118883.2 | 0.6873702 |
2.9 | 22 | 211991.3 | 126959.5 | 0.5988900 |
- Tables of results evaluated using increasing/decreasing Williamstest and/or Dunnett’s test, if applicable.
In the Williams’ test result tables,Y.Tilde is the amalgamated meanof the fluorescence in each treatment group,Y0 is the mean of thecontrol fluorescence,DIFF is the estimated difference between thetreatment and the control,SE_DIFF is the standard error of theWilliams’ test,DF is the degrees of freedom for Williams’ test,WILLIncr orWill Decr is the Williams’ test statistic,crit Val is thecritical value of Williams distribution,Sign suggests if there issignificant difference between the treatment and the control, and%Incr is the percent increase of the fluorescence compared to thecontrol.
In the Dunnett’s test result table,Estimate is the estimateddifference between the treatment and the control,SE is the standarderror of the mixed ANOVA model,t value is the Dunnett’s teststatistic,adj p is the adjusted p value and%Incr is the percentincrease of the fluorescence compared to the control.
knitr::kable(reactiv_spiked_result$WilliamsIncrease, caption="Increasing Williams' test")
Conc | Y.Tilde | Y0 | DIFF Incr | SE_DIFF | DF | WILL Incr | crit Val | Sign | % Incr | |
---|---|---|---|---|---|---|---|---|---|---|
Conc2.9 - Conc0 | 2.9 | 439.658 | 1208.915 | -769.2567 | 48.94620 | 12 | -15.71637 | 1.933 | FALSE | -85.69831 |
Conc1.45 - Conc0 | 1.45 | 413.890 | 1208.915 | -795.0246 | 48.95169 | 12 | -16.24101 | 1.927 | FALSE | -88.33193 |
Conc0.73 - Conc0 | 0.73 | 413.890 | 1208.915 | -795.0246 | 48.47768 | 12 | -16.39981 | 1.918 | FALSE | -85.32295 |
Conc0.36 - Conc0 | 0.36 | 413.890 | 1208.915 | -795.0246 | 48.95169 | 12 | -16.24101 | 1.903 | FALSE | -87.12597 |
Conc0.18 - Conc0 | 0.18 | 413.890 | 1208.915 | -795.0246 | 48.47768 | 12 | -16.39981 | 1.873 | FALSE | -85.88070 |
knitr::kable(reactiv_spiked_result$WilliamsDecrease, caption="Decreasing Williams' test")
Conc | Y.Tilde | Y0 | DIFF Decr | SE_DIFF | DF | WILL Decr | crit Val | Sign | % Incr | |
---|---|---|---|---|---|---|---|---|---|---|
Conc2.9 - Conc0 | 2.9 | 411.240 | 1208.915 | 797.6747 | 48.94620 | 12 | 16.29697 | 1.933 | TRUE | -85.69831 |
Conc1.45 - Conc0 | 1.45 | 411.240 | 1208.915 | 797.6747 | 48.95169 | 12 | 16.29514 | 1.927 | TRUE | -88.33193 |
Conc0.73 - Conc0 | 0.73 | 424.957 | 1208.915 | 783.9576 | 48.47768 | 12 | 16.17152 | 1.918 | TRUE | -85.32295 |
Conc0.36 - Conc0 | 0.36 | 424.957 | 1208.915 | 783.9576 | 48.95169 | 12 | 16.01493 | 1.903 | TRUE | -87.12597 |
Conc0.18 - Conc0 | 0.18 | 424.957 | 1208.915 | 783.9576 | 48.47768 | 12 | 16.17152 | 1.873 | TRUE | -85.88070 |
knitr::kable(reactiv_spiked_result$Dunnetts, caption="Dunnett's test")
Estimate | SE | df | t value | adj p | % Incr | |
---|---|---|---|---|---|---|
Conc0.18 - Conc0 | -786.2696 | 48.47768 | 12 | -16.21921 | 0 | -85.88070 |
Conc0.36 - Conc0 | -786.3430 | 48.95169 | 12 | -16.06365 | 0 | -87.12597 |
Conc0.73 - Conc0 | -776.5006 | 48.47768 | 12 | -16.01769 | 0 | -85.32295 |
Conc1.45 - Conc0 | -822.2317 | 48.95169 | 12 | -16.79680 | 0 | -88.33193 |
Conc2.9 - Conc0 | -769.7654 | 48.94620 | 12 | -15.72677 | 0 | -85.69831 |
- Further information about the normality test (Shapiro-Wilk), thehomogeneity of variance test (Levene’s test) of residuals of themixed ANOVA model, the monotonicity test and the model fit.
reactiv_spiked_result$NormalityTest#> #> Shapiro-Wilk normality test#> #> data: stats::resid(mixedaov)#> W = 0.97621, p-value = 0.0227reactiv_spiked_result$LeveneTest#> # A tibble: 1 × 4#> statistic p.value df df.residual#> <dbl> <dbl> <int> <int>#> 1 0.829 0.532 5 123reactiv_spiked_result$`Monotonicity Test`#> Test t value Pr(>|t|) Significance#> 1 Linear -6.34 <0.0001 ***#> 2 Quadratic 6.05 <0.0001 ***reactiv_spiked_result$MixedAnova#> Linear mixed model fit by REML ['lmerMod']#> Formula: #> sqrt(Fluor) ~ Conc + (1 | Replicate) + (1 | Replicate:Conc)#> Data: wt_outlier#> REML criterion at convergence: 1607.975#> Random effects:#> Groups Name Std.Dev.#> Replicate:Conc (Intercept) 0.00 #> Replicate (Intercept) 51.74 #> Residual 151.56 #> Number of obs: 129, groups: #> Replicate:Conc, 18; Replicate, 3#> Fixed Effects:#> (Intercept) Conc.L Conc.Q Conc.C #> 551.13 -471.74 437.64 -271.05 #> Conc^4 Conc^5 #> 175.76 -30.96 #> optimizer (nloptwrap) convergence code: 0 (OK) ; 0 optimizer warnings; 1 lme4 warnings
The list output from running the data_prep() function can besummarized with the data_summary() function.
xeredar::data_summary(reactiv_spiked_result) |> knitr::kable()
0.18 | 0.36 | 0.73 | 1.45 | 2.9 | |
---|---|---|---|---|---|
Replicate 1 | -84.75 | -89.49 | -88.05 | -91.79 | -88.73 |
Replicate 2 | -88.52 | -82.28 | -85.59 | -86.63 | -81.95 |
Replicate 3 | -84.99 | -88.28 | -82.8 | -86.07 | -85.44 |
Pooled | -85.88 | -87.13 | -85.32 | -88.33 | -85.7 |
Dunnett | * | * | * | * | * |
IncreasingWilliams | ns | ns | ns | ns | ns |
DecreasingWilliams | * | * | * | * | * |
xeredar contains an integrated shiny app that is available to the usersby writingrun_app()
into the console. The only requirement to use theapp is the successful installation of xeredar. When the app started, theuser has a couple of options to analyse the data. These options can beadjusted by clicking on the little gear sign next toInputs. Thefollowing options are available:
- Use d’Agostino test?
- The default here is that this is not selected meaning theShapiro-Wilk test is utilized to check for residual normality.However, the RADAR TG mentions the d’Agostino test which is whyit is also available to the users.
- Apply 10% Trimming
- The default here is that this is selected, meaning that 10%Trimming is conducted when the ANOVA assumptions are notfulfilled by the raw data. Please be aware that this does notmean that 10% Trimming is always conducted. It is onlyconducted, when the raw data is violating the residual normalityand variance homogeneity assumptions, tested with the respectivetest with an adjustable alpha level.
- Remove outliers
- The default here is that this is selected, meaning that outlierremoval is conducted when the ANOVA assumptions are notfulfilled by the raw data or by 10% Trimming, if it is selectedabove.
- Try Box-Cox transformation
- The default here is that this is selected, meaning that box-coxtransformation is carried out when the ANOVA assumptions are notfulfilled by the raw or processed data (10% Trimming or outlierremoval) nor by log- or square-root transformation. Box-Coxtransformation is not mentioned in any of the TGs which is whyit is left to the choice of the user.
- alpha for residual normality and variance homogeneity tests
- The default here is 0.05. In the TGs the alpha level isdiscussed so the user is adviced to inspect the requirements ofthe respective study to analyze.
The next option in the little box on the top-left of the app is to set ahook or remove the hook to decide whether the exposure well ID isregarded as random term in the underlying mixed ANOVA model. In case aREACTIV study is investigated the hook should be removed. For RADAR andXETA, the hook should be placed as long as information about theexposure well ID was documented. Of course, reducing the complexity ofthe random term might also make sense for XETA and RADAR studies in caseno variance is explained by the exposure well-ID. However, this choiceis left to the user.
When own data is supposed to be analyzed xlsx files with either spikedor unspiked data can be uploaded. Please make sure that the uploadeddata fulfills the Data requirements explained above. In case ofinsecurity, inspect the uploaded data structure of the pre-loadedexample data.
The app contains several output boxes about the required dataprocessing, a conclusion table, the uploaded data, residual diagnosticsplots, boxplots, a data summary table, the output of the residualnormality and homogeneity tests, the monotonicity test result, theDunnett’s and increasing and decreasing Williams test result tablesalong with the summary table of the underlying mixed ANOVA model.
The depicted information can be downloaded in a simple report byclicking on the button ** Word report**. It takes a couple ofseconds until the final docx file is produced and ready to download.Avoid clicking the button several times as this can lead to long waitingtimes and the production of several reports.
OECD. 2019a. Validation Report of the Xenopus Eleutheroembryonic ThyroidSignaling Assay (XETA) for the Detection of Thyroid Active Substances.OECD.
OECD. 2019b. TG 248: Xenopus Eleutheroembryonic Thyroid Assay (XETA).OECD.https://doi.org/10.1787/a13f80ee-en.
OECD. 2022b. Test No. 251: Rapid Androgen Disruption Activity Reporter(RADAR) Assay. OECD.https://doi.org/10.1787/da264d82-en.
OECD. 2022a. Rapid Estrogen Activity in Vivo (REACTIV) Assay (OECD DraftTG): Guideline for the Testing of Chemicals, Section 2: Effects onBiotic System. OECD.
Inka Marie Spyridonov, Lijuan Yan, Eduard Szöcs, Ana Filipa Pereira Miranda, Carsten Lange,Andrew Tindall, David Du Pasquier, Gregory Lemkine, Lennart Weltje, Maike Habekost,Pernille Thorbek. 2025. Xeredar: An open-source R-package for the statistical analysisof endocrine new approach methods (NAMs) using fish or amphibian eleutheroembryos,Environmental Toxicology and Chemistry.https://doi.org/10.1093/etojnl/vgaf056
About
Functions for automated statistical analysis of Xenopus Eleutherembryonic Thyroid Assays (XETA), Rapid Androgen Disruption Activity Reporter (RADAR) assays and Rapid Estrogen Activity In Vitro (REACTIV) assays.