
The adoptr Package: Adaptive Optimal Designs for Clinical Trials in R

Kevin Kunzmann

Cambridge University
kevin.kunzmann@mrc-bsu.cam.ac.uk

Maximilian Pilz

University of Heidelberg
pilz@imbi.uni-heidelberg.de

Carolin Herrmann

Charité Berlin and Berlin Institute of Health
carolin.herrmann@charite.de

Geraldine Rauch

Charité Berlin and Berlin Institute of Health
geraldine.rauch@charite.de

Meinhard Kieser

University of Heidelberg
meinhard.kieser@imbi.uni-heidelberg.de

2024-10-03

Source: vignettes/adoptr_jss.Rmd

Abstract

Even though adaptive two-stage designs with unblinded interim analyses are becoming increasingly popular in clinical trial design, there is a lack of statistical software to make their application more straightforward. The package adoptr fills this gap for the common case of two-stage one- or two-arm trials with (approximately) normally distributed outcomes. In contrast to previous approaches, adoptr optimizes the entire design upfront, which allows maximal efficiency. To facilitate experimentation with different objective functions, adoptr supports a flexible way of specifying both (composite) objective scores and (conditional) constraints by the user. Special emphasis was put on providing measures to aid practitioners with the validation process of the package.

This manuscript is published in the Journal of Statistical Software.

Background

Confirmatory clinical trials are conducted in a strictly regulated environment. A key quality criterion put forward by the relevant agencies (US Food and Drug Administration et al. (2019), Committee for Medicinal Products for Human Use and others (2007)) for a study that is supposed to provide evidence for the regulatory acceptance of a new drug or treatment is strict type one error rate control. This requirement was often seen as conflicting with the perceived need to make trials more flexible by, e.g., early stopping for futility, group-sequential enrollment, or even adaptive sample size recalculation. An excellent historical review of the development of the field of adaptive clinical trial designs and the struggles along the way is given in Bauer et al. (2015).

In this manuscript, the focus lies exclusively on adaptive two-stage designs with one unblinded interim analysis. Both early stopping for futility and efficacy are allowed, and the final sample size as well as the critical value to reject the null hypothesis are chosen in a data-driven way. There is a plethora of methods for modifying the design of an ongoing trial based on interim results without compromising type one error rate control (Bauer et al. 2015), but the criteria for deciding which adaptation should be performed during an interim analysis and when to perform the interim analysis are still widely based on heuristics. Bauer et al. mention this issue of guiding adaptive decisions at interim in a principled (i.e., 'optimal') way by stating that '[t]he question might arise if potential decisions made at interim stages might not be better placed to the upfront planning stage.' Following Mehta and Pocock (2011), Jennison and Turnbull (2015) developed a principled approach to optimal interim sample size modifications, i.e., to conducting the interim decision (conditional on interim results) such that it optimizes an unconditional performance score. Their approach, however, was still restricted unnecessarily. Recently, Pilz et al. (2019) extended the work to a fully general variational problem where the optimization problem for any given performance score (optionally under further constraints) is solved simultaneously over the sample size adaptation function, the critical value function, and the time point of the interim decision. This approach is an application of ideas which have been put forward in single-arm trials with binary endpoint for several years (Englert and Kieser (2013), Kunzmann and Kieser (2016), Kunzmann and Kieser (2020)) to a setting with continuous test statistics. Clearly, by relaxing the problem to continuous sample sizes and test statistics, the theory becomes much more tractable, and important connections between conditional and unconditional optimality can be discussed much more easily (Pilz et al. 2019).

A key insight from this recent development is the fact that the true challenge in designing an adaptive trial is less the technical methodology for controlling the type one error rate but rather the choice of the optimality criterion. This issue is much less pressing in single-stage designs since most sensible criteria will be equivalent to minimizing the overall sample size. Thus, in this case, a 'design' is often completely specified by given power and type one error rate constraints. For the more complex adaptive designs, however, there are many more sensible criteria (minimize maximal sample size, expected sample size, expected costs, etc.) and the balance between conditional and unconditional properties must be explicitly specified (cf. Section 6). This added complexity might be seen as daunting by practitioners, but it is also a chance for tailoring adaptive designs more specifically to a particular situation. The R package (R Core Team 2019) adoptr aims at providing a simple yet customizable interface for specifying a broad class of objective functions and constraints for single- or two-arm, one- or two-stage designs with approximately normally distributed endpoints. The goal of adoptr is to enable relatively easy experimentation with different notions of optimality and thereby to shift the focus from how to optimize to what to optimize.

In the following, we first give a definition of the problem setting addressed in adoptr and the technicalities of translating the underlying variational problem into a simple multivariate optimization problem before motivating the need for an R package. We then present the core functionality of adoptr before addressing the issue of facilitating validation of open-source software in a regulated environment and discussing potential future work on adoptr.

Setting

We consider the problem of a two-stage, two-arm design to establish superiority of treatment over placebo with respect to the mean difference. Assume that to that end data $Y_j^{g,i}$ is observed for the $j$-th individual of the trial in stage $i\in\{1,2\}$ under treatment ($g=T$) or placebo ($g=C$). Let $n_i$ be the per-group sample size in stage $i$ and consider the stage-wise test statistics
$$X_i := \frac{\sum_{j=1}^{n_i} Y_j^{T,i} - \sum_{j=1}^{n_i} Y_j^{C,i}}{\sigma\,\sqrt{2\,n_i}}$$
for $i=1,2$. Under the assumption that the $Y_j^{g,i} \stackrel{iid}{\sim} F_g$ with $\boldsymbol{E}[F_T] - \boldsymbol{E}[F_C] = \theta$ and common variance $\sigma^2$, by the central limit theorem, the asymptotic distribution of $X_1$ is $\mathcal{N}(\sqrt{n_1/2}\,\theta, 1)$. Formally, the null hypothesis for the superiority test is thus $\mathcal{H}_0:\theta\leq 0$. Based on the interim outcome $X_1$, a decision can be made whether to stop the trial early for futility if $X_1<c_1^f$, to stop the trial early for efficacy (early rejection of the null hypothesis) if $X_1>c_1^e$, or to enter stage two if $X_1\in[c_1^f, c_1^e]$. Conditional on proceeding to a second stage, it holds that $X_2\,|\,X_1\in[c_1^f, c_1^e] \sim \mathcal{N}(\sqrt{n_2/2}\,\theta, 1)$. In the second stage, the null hypothesis is rejected if and only if $X_2 > c_2(X_1)$ for a stage-two critical value function $c_2: x_1\mapsto c_2(x_1)$. To test $\mathcal{H}_0$ at a significance level of $\alpha$, the stage-one critical values $c_1^f$ and $c_1^e$ as well as $c_2(\cdot)$ must be chosen in a way that protects the overall maximal type one error rate $\alpha$. Note that it is convenient to define $c_2(x_1) = \infty$ if $x_1<c_1^f$ and $c_2(x_1) = -\infty$ if $x_1>c_1^e$ since the power curve of the design is then given by $\theta\mapsto\boldsymbol{Pr}_\theta\big[X_2>c_2(X_1)\big]$. This results in a classical group-sequential design, and several methods were proposed in the literature for choosing the early-stopping boundaries $c_1^f$ and $c_1^e$ (O'Brien and Fleming (1979), Pocock (1977)) and for defining the stage-two rejection boundary function $c_2(\cdot)$ (Bauer and Köhne (1994), Hedges and Olkin (1985)). Often, the inverse-normal combination test (Lehmacher and Wassmer 1999) is applied and $c_2(\cdot)$ is defined as a linear function of the stage-one test statistic,
$$c_2(x_1) = \frac{c - w_1 x_1}{w_2},$$
for a critical value $c$ and predefined weights $w_1$ and $w_2$. Most commonly, the stage-wise test statistics are weighted in terms of their respective sample sizes, i.e., $w_1 = \sqrt{n_1/(n_1+n_2)}$ and $w_2 = \sqrt{n_2/(n_1+n_2)}$. This choice of the weights is optimal in the sense that it minimizes the variance of the final test statistic if the assumed sample sizes are indeed realized (Zaykin 2011). Note, however, that such prespecified weights become inefficient if the sample size deviates strongly from the anticipated value (cf. Wassmer and Brannath (2016), chapter 6.2.5). A natural extension of this group-sequential framework is to allow the second-stage sample size to also depend on the observed interim outcome, i.e., to consider a function $n_2: x_1\mapsto n_2(x_1)$ instead of a fixed value $n_2$. Such 'adaptive' two-stage designs are thus completely characterized by a five-tuple $\mathcal{D} := \big(n_1, c_1^f, c_1^e, n_2(\cdot), c_2(\cdot)\big)$.
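To make the distributional claim above tangible, the following base-R sketch (not part of adoptr; the values theta = 0.3, sigma = 1, and n1 = 50 are chosen purely for illustration) simulates the stage-one statistic from raw data and compares its mean and standard deviation to the asymptotic approximation.

R> # Illustrative simulation of the stage-one statistic X_1 defined above
R> set.seed(42)
R> theta <- 0.3; sigma <- 1; n1 <- 50   # assumed values, not from the manuscript
R> x1 <- replicate(1e4, {
+    yT <- rnorm(n1, mean = theta, sd = sigma)   # treatment group, stage one
+    yC <- rnorm(n1, mean = 0,     sd = sigma)   # control group, stage one
+    (sum(yT) - sum(yC)) / (sigma * sqrt(2 * n1))
+  })
R> c(mean(x1), sqrt(n1 / 2) * theta)   # simulated vs. asymptotic mean
R> sd(x1)                              # should be close to 1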

While the required sample size and the critical value for a single-stage design are uniquely defined by given type one error rate and power constraints, it is much less clear how the design parameters of a two-stage design should be selected. This is especially true since both $n_2$ and $c_2$ are functions and thus the parameter space is in fact infinite-dimensional. In order to compare different choices of the design parameters, appropriate scoring criteria are essential. A widely applied criterion is the expected sample size under the alternative hypothesis (see, e.g., Jennison and Turnbull (2015)). However, there is a variety of further scoring criteria that could be incorporated or even combined in order to rate a two-stage design. For instance, conditional power is defined as the probability to reject the null hypothesis under the alternative given the interim result $X_1=x_1$:
$$\operatorname{CP}_\theta(x_1) := \boldsymbol{Pr}_\theta \big[X_2 > c_2(X_1) \,\big|\, X_1=x_1\big].$$
Hence, conditional power is a conditional score given as the power conditioned on the first-stage outcome $X_1=x_1$. Vice versa, power can be seen as an unconditional score that is obtained by integrating conditional power over all possible stage-one outcomes, i.e.,
$$\operatorname{Power}_\theta = \boldsymbol{E}_\theta\big[\operatorname{CP}_\theta(X_1)\big].$$
Intuitively, it makes sense to require a minimal conditional power upon continuation to the second stage since one might otherwise continue a trial with little prospect of still rejecting the null hypothesis. We demonstrate the consequences of this heuristic in Section 6.4. Once the scoring criterion is selected, the design parameters may be chosen in order to optimize this objective. The first ones to address this problem were Jennison and Turnbull (2015) who minimized the expected sample size
$$\operatorname{ESS}_{\theta}(\mathcal{D}) := \boldsymbol{E}_{\theta}\big[n(X_1)\big] := \boldsymbol{E}_{\theta}\big[n_1 + n_2(X_1)\big]$$
of a two-stage design for given $n_1, c_1^f, c_1^e$ with respect to $n_2(\cdot)$ for given power and type one error rate constraints. The function $c_2(\cdot)$, however, was not optimized. Instead, Jennison and Turnbull (2015) used the inverse-normal combination test approach to derive $c_2$ given $n_2(\cdot)$ and $n_1$. In Pilz et al. (2019), the authors demonstrated that this restriction is not necessary and that the variational problem of deriving both functions $n_2(\cdot)$ and $c_2(\cdot)$ given $n_1, c_1^f, c_1^e$ to minimize expected sample size can be solved by analyzing the corresponding Euler-Lagrange equation. Nesting this step in a standard optimization over the stage-one parameters allows identifying an optimal set of all design parameters without imposing parametric assumptions on $c_2(\cdot)$. As a result, a fully optimal design $\mathcal{D}^* := \big(n_1^*, c_1^{f,*}, c_1^{e,*}, n_2^*(\cdot), c_2^*(\cdot)\big)$ for the following general optimization problem was derived:
$$\begin{aligned}
\text{minimize} \quad & \operatorname{ESS}_{\theta_1}(\mathcal{D}) \\
\text{subject to:} \quad & \boldsymbol{Pr}_{\theta_0}\big[X_2>c_2(X_1)\big] \leq \alpha, \\
& \boldsymbol{Pr}_{\theta_1}\big[X_2>c_2(X_1)\big] \geq 1-\beta,
\end{aligned}$$
where $\theta_0=0$.

Direct variational perspective

In adoptr, a simpler solution strategy than solving the Euler-Lagrange equation locally is applied to the same problem class. We propose to embed the entire problem in a finite-dimensional parameter space and to solve the corresponding problem over both stage-one and stage-two design parameters simultaneously using standard numerical libraries, i.e., we adopt a direct approach to solving the variational problem. This is done by defining a discrete set of pivot points $\widetilde{x}_1^{(i)}\in(c_1^f, c_1^e)$, $i=1,\ldots,k$, and interpolating $c_2$ and $n_2$ between these pivots. We use cubic Hermite splines (Fritsch and Carlson 1980) which are sufficiently flexible, even for a moderate number of pivots, to approximate any realistic stage-two sample size and critical value function. Since the optimal functions are generally very smooth (Pilz et al. 2019), they are well suited to spline interpolation. Within the adoptr validation report (cf. Section 7) we investigate empirically the shape of the approximated functions and find that increasing the number of pivots above a value of 5 to 7 does not improve the optimization results. The latter implies that a relatively small number of pivot points appears to be sufficient to obtain valid spline approximations of the optimal functions. Note that the pivots are only needed in the continuation region since both functions are (piecewise) constant within the early stopping regions. In adoptr, the pivots are defined as the nodes of a Gaussian quadrature rule of degree $k$. This choice allows fast and precise numerical integration of any conditional score over the continuation region, e.g.,
$$\operatorname{ESS}_{\theta}(\mathcal{D}) = \int n(x_1)\, f_{\theta}(x_1)\, \operatorname{d}\! x_1 \approx n_1 + \sum_{i=1}^{k} \omega_i\, n_2\big(\widetilde{x}_1^{(i)}\big)\, f_{\theta}\big(\widetilde{x}_1^{(i)}\big),$$
where $f_\theta$ is the probability density function of $X_1\,|\,\theta$ and the $\omega_i$ are the corresponding weights of the integration rule. The weights only depend on $k$, and the nodes just need to be scaled to the integration interval. Consequently, this objective function is smooth in the optimization parameters and the resulting optimization problem is of dimension $2k+3$, where the tuning parameters are
$$\big(n_1, c_1^f, c_1^e, n_2\big(\widetilde{x}_1^{(1)}\big),\dots, n_2\big(\widetilde{x}_1^{(k)}\big), c_2\big(\widetilde{x}_1^{(1)}\big),\dots, c_2\big(\widetilde{x}_1^{(k)}\big)\big).$$
Standard numerical solvers may then be employed to minimize it. Since adoptr enables generic objectives (cf. Section 6.3), it uses the gradient-free optimizer COBYLA (Powell 1994) internally via the R package nloptr (Johnson (2018), Ypma, Borchers, and Eddelbuettel (2018)).
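As a hedged illustration of this quadrature idea (this is not adoptr's internal implementation), the following base-R sketch constructs a k-node Gauss-Legendre rule via the Golub-Welsch algorithm, rescales it to an assumed continuation region, and approximates the expected sample size for an assumed stage-two sample size function; all numerical values and the function n2() are hypothetical.

R> # Hypothetical k-node quadrature approximation of ESS (illustration only)
R> k <- 7
R> i <- 1:(k - 1)
R> b <- i / sqrt(4 * i^2 - 1)                      # Legendre recurrence coefficients
R> J <- matrix(0, k, k); J[cbind(i, i + 1)] <- b; J[cbind(i + 1, i)] <- b
R> e <- eigen(J, symmetric = TRUE)                 # Golub-Welsch: nodes and weights
R> nodes   <- e$values
R> weights <- 2 * e$vectors[1, ]^2
R> # rescale from [-1, 1] to an assumed continuation region [c1f, c1e]
R> n1 <- 50; c1f <- 0; c1e <- 2; theta <- 0.3      # assumed values
R> x <- (c1e + c1f) / 2 + (c1e - c1f) / 2 * nodes
R> w <- (c1e - c1f) / 2 * weights
R> n2 <- function(x1) 80 - 20 * x1                 # assumed stage-two sample size
R> n1 + sum(w * n2(x) * dnorm(x, mean = sqrt(n1 / 2) * theta))   # approx. ESS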

Most commonly used unconditional performance scores $S(\mathcal{D})$ can be seen as expected values of conditional scores $S(\mathcal{D}\,|\,X_1)$ via $S(\mathcal{D}) = \boldsymbol{E}\big[S(\mathcal{D}\,|\,X_1)\big]$, in a similar way as power and expected sample size. Any such 'integral score' can be computed quickly and reliably in adoptr via the choice of pivots outlined above. The correctness of numerically integrated scores is checked in the adoptr validation report by comparing the numerical integrals to simulated results.

Note that we tacitly relaxed all sample sizes to be real numbers in the above argument while they are in fact restricted to positive integers. Integer-valued $n_1$ and $n_2$ would, however, lead to an NP-hard mixed-integer problem. In our experiments, we found that merely rounding both $n_1$ and $n_2$ after the optimization works fine. The extensive validation suite (cf. Section 7) evaluates by numerical integration and simulation whether the included constraints are fulfilled for optimal designs with rounded sample sizes. Up to now, neither constraint violations nor an efficiency loss with respect to the underlying objective function have been observed. In theory, one could re-adjust the decision boundaries for these rounded sample sizes, but we failed to see any practical benefit from this, even for small trials where the rounding error is largest (data not shown).

The need for an R package

Bauer et al. (2015) state that adequate statistical software for adaptive designs 'is increasingly needed to evaluate the adaptations and to find reasonable strategies'.

Commercial software such as JMP (SAS Institute Inc., n.d.a) or Minitab (Minitab, Inc. 2020) allows planning and analyzing a wide range of experimental setups. Amongst others, these tools provide support for randomization, stratification, block-building, or D-optimal designs. These general-purpose statistical software packages do not, however, allow planning of the more specialized multi-stage designs encountered in clinical trials. For group-sequential designs, some planning capabilities are available in the SAS procedure seqdesign (SAS Institute Inc., n.d.b), PASS (NCSS 2019), or ADDPLAN (ICON plc 2020). East (Cytel 2020) also supports design, simulation, and analysis of experiments with interim analyses; the East ADAPT and East SURVADAPT modules support sample size recalculation. Furthermore, there are various open-source R packages for the analysis of multi-stage designs. The package adaptTest (Vandemeulebroecke 2009) implements combination tests for adaptive two-stage designs. AGSDest (Hack, Brannath, and Brueckner 2019) allows estimation and computation of confidence intervals in adaptive group-sequential designs. More detailed overviews of software for adaptive clinical trial designs can be found in Bauer et al. (2015), chapter 6, or in Tymofyeyev (2014). The choice of software for optimally designing two- or multi-stage designs, however, is much more limited. Current R packages concerned with optimal clinical trial designs are OptGS (Wason and Burkardt (2015), Wason (2015)) and rpact (Wassmer and Pahlke 2019). These are, however, exclusively focused on group-sequential designs and lack the ability to specify custom objective functions and constraints.

The lack of flexibility in formulating the objective function and constraints might lead to off-the-shelf solutions not entirely reflecting the needs of a particular trial, consequently resulting in inefficient designs. The R package adoptr aims at providing a simple and interactive yet flexible interface for addressing a range of optimization problems encountered with two-stage one- or two-arm clinical trials. In particular, adoptr allows modeling a priori uncertainty over $\theta$ via prior distributions and thus supports optimization under uncertainty (cf. Section 6.2). adoptr also supports the combination of conditional (on $X_1$) and unconditional scores and constraints to address concerns such as type one error rate control (unconditional score) and, e.g., a minimal conditional power (conditional score) simultaneously (cf. Section 6.3). To facilitate the adoption of these advanced trial designs in the clinical trials community, adoptr also features an extensive test and validation suite (cf. Section 7).

In the following, we outline the key design principles for adoptr.

  1. Interactivity: A major advantage of the R programming language is its powerful metaprogramming capabilities and flexible class system. With a combination of non-standard evaluation and S4 classes, we hope to achieve a structured and modular way of expressing optimization problems in clinical trials that integrates nicely with an interactive workflow. We feel that a step-wise problem formulation via the creation of modular intermediate objects, which can be explored and modified separately, encourages exploration of different options.
  2. Reliability: A crux in open-source software development for clinical trials is achieving demonstrable validation. Potential users need to be convinced of the software quality and need to be able to comply with their respective validation requirements, which often require the ability to produce a validation report. This burden typically results in innovative software not being used at all, simply because the validation effort cannot be stemmed. We address this issue with an extensive unit test suite and a companion validation report (cf. Section 7).
  3. Extensibility: We do not want to impose a particular choice of scores or constraints or promote a particular notion of optimality for clinical trial designs. In cases where the composition of existing scores is not sufficient, the object-oriented approach of adoptr facilitates the definition of custom scores and constraints that seamlessly integrate with the remainder of the package.

Adoptr’s structure

The package adoptr is based on R's S4 class system. This allows the use of multiple dispatch on the classes of multiple arguments to a method. In this section, the central components of adoptr are described briefly. The following figure gives a structural overview of the main classes in adoptr.

To compute optimal designs, an object of class UnconditionalScore must be defined as the objective criterion. adoptr distinguishes between ConditionalScores and UnconditionalScores (cf. Section 2). All Scores can be evaluated using the method evaluate. For unconditional scores, this method only requires a Score object and a TwoStageDesign object; for conditional scores (like conditional power), it also requires the interim outcome $x_1$. Note that any ConditionalScore $S(\mathcal{D}\,|\,X_1=x_1)$ can be converted to an UnconditionalScore $S(\mathcal{D}) = \boldsymbol{E}\big[S(\mathcal{D}\,|\,X_1)\big]$ using the method expected. The two most widely used conditional scores are pre-implemented as ConditionalPower and ConditionalSampleSize. Their unconditional counterparts are Power and ExpectedSampleSize. Further predefined unconditional scores are MaximumSampleSize, evaluating the maximum sample size, N1, measuring the first-stage sample size, and AverageN2, evaluating the average of the stage-two sample size (improper prior). These scores may be used for regularization if variable stage-two sample sizes or a high stage-one sample size are to be penalized. Users are free to define their own Scores (cf. the vignette 'Defining New Scores' (Kunzmann and Pilz 2020)). Moreover, different Scores can be composed into a single one by the function composite (cf. Section 6.3). Both conditional and unconditional scores can also be used to define constraints, the most common case being constraints on power and the maximal type one error rate. The function minimize takes an unconditional score as objective and a set of constraints and optimizes the design parameters.
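To make the relation between conditional and unconditional scores concrete, the following short sketch (object names are illustrative; the constructors are introduced in detail in Section 6) converts a ConditionalPower object into an unconditional power score via expected():

R> # Illustrative only; see Section 6 for the full workflow
R> datadist    <- Normal(two_armed = TRUE)
R> alternative <- PointMassPrior(theta = .3, mass = 1.0)
R> cp  <- ConditionalPower(dist = datadist, prior = alternative)
R> pow <- expected(cp, data_distribution = datadist, prior = alternative)
R> # 'pow' is an unconditional score equivalent to Power(dist = datadist,
R> # prior = alternative) and can be evaluated for a design via evaluate(pow, design)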

In adoptr, different kinds of designs are implemented. The most frequently applied case is a TwoStageDesign, i.e., a design with one interim analysis and a sample size function that varies with the interim test statistic. Another option is the subclass GroupSequentialDesign, which restricts the sample size function on the continuation region to a single number, i.e., $n_2(x_1) = n_2\ \forall\, x_1\in[c_1^f, c_1^e]$. Additionally, adoptr supports the computation of optimal OneStageDesigns, i.e., designs without an interim analysis. Technically, one-stage designs are implemented as subclasses of TwoStageDesign since they can be viewed as the limiting case for $n_2\equiv 0$ and $c_1^f=c_1^e$. Hence, all methods that are implemented for TwoStageDesigns also work for GroupSequentialDesigns and OneStageDesigns. Users can choose to keep some elements of a design fixed during optimization using the method make_fixed (cf. Section 6.5).
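Analogously, a group-sequential variant can be requested and optimized with the same calls that Section 6.1 uses for TwoStageDesigns. The following snippet is purely illustrative and reuses the objects from the previous sketch; the exact value of the type argument is assumed to match the design types listed in Section 6.1.

R> # Assumed to mirror the two-stage workflow of Section 6.1
R> null  <- PointMassPrior(theta = .0, mass = 1.0)
R> ess   <- ExpectedSampleSize(dist = datadist, prior = alternative)
R> power <- Power(dist = datadist, prior = alternative)
R> toer  <- Power(dist = datadist, prior = null)
R> gs_start <- get_initial_design(theta = 0.3, alpha = 0.025, beta = 0.1,
+                                 type = "group-sequential",
+                                 dist = datadist, order = 7)
R> opt_gs <- minimize(ess, subject_to(power >= 0.9, toer <= 0.025), gs_start)
R> summary(opt_gs$design, "ESS" = ess, "Power" = power)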

The joint data distribution in adoptr consists of two elements. The distribution of the test statistic is specified by an object of class DataDistribution. Currently, the three options Normal, Binomial, and Student are implemented. The logical variable two_armed allows the differentiation between one- and two-armed trials. Furthermore, adoptr supports prior distributions on the effect size. These can be PointMassPriors (cf. Section 6.1) as well as ContinuousPriors (cf. Section 6.2).

In the following section, more hands-on examples demonstrate the capabilities of adoptr and its syntax.

Examples

Standard case

Consider the case of a randomized controlled clinical trial where efficacy is to be demonstrated in terms of superiority of the treatment over placebo with respect to the population mean difference $\theta$ of an outcome. Let the null hypothesis be $\mathcal{H}_0:\theta\leq 0$. Assume that the maximal type one error rate is to be controlled at a one-sided level $\alpha=2.5\%$ and a minimal power of $90\%$ at a point alternative of $\theta_1=0.3$ is deemed necessary. For simplicity's sake, we assume $\sigma^2=1$ without loss of generality. The required sample size for a one-stage design with analysis by the one-sided two-sample $t$-test would then be roughly 235 per group.
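Under the stated assumptions, this single-stage sample size can be reproduced with base R's power.t.test as a quick sanity check (not part of adoptr):

R> # one-sided two-sample t-test, alpha = 0.025, power = 0.9, theta = 0.3, sd = 1
R> power.t.test(delta = 0.3, sd = 1, sig.level = 0.025, power = 0.9,
+               type = "two.sample", alternative = "one.sided")$n
R> # approximately 235 per group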

Using adoptr, the two-stage design minimizing the expected sample size under the alternative hypothesis can be derived for the very same situation. First, the data distribution is specified to be normal. The two_armed parameter allows switching between single-armed and two-armed trials.

R> datadist <- Normal(two_armed = TRUE)

In this example, we use simple point priors for both the null and alternative hypotheses. The hypotheses and the corresponding scores (power values) can be specified as:

R> null        <- PointMassPrior(theta = .0, mass = 1.0)
R> alternative <- PointMassPrior(theta = .3, mass = 1.0)
R> power       <- Power(dist = datadist, prior = alternative)
R> toer        <- Power(dist = datadist, prior = null)
R> mss         <- MaximumSampleSize()

A Power score requires the data distribution and the prior to be specified. For this example, we choose PointMassPriors with the entire probability mass of $1$ on a single point: the null hypothesis $\theta = 0$ to compute the type one error rate, and the alternative hypothesis $\theta = 0.3$ to compute the power. The objective function is the expected sample size under the alternative.

R> ess <- ExpectedSampleSize(dist = datadist, prior = alternative)

Since adoptr internally relies on the COBYLA implementation of nloptr, an initial design is required. A heuristic initial choice is provided by the function get_initial_design. It is based on a fixed design that fulfills constraints on type one error rate and power. The type of the design (two-stage, group-sequential, or one-stage) and the data distribution have to be defined. For the Gaussian quadrature used during optimization, one also has to specify the order of the integration rule, i.e., the number of pivot points between early stopping for futility and early stopping for efficacy. In practice, order 7 turned out to be sufficiently flexible to obtain valid results (data not shown).

R> initial_design <- get_initial_design(theta = 0.3, alpha = 0.025,
+                                       beta = 0.1, type = "two-stage",
+                                       dist = datadist, order = 7)

It is easy to check that the initial design does not fulfill any of the constraints (minimal power of 90% and maximal type one error rate of 2.5%) with equality by evaluating the respective scores:

R> evaluate(toer, initial_design)
R> evaluate(power, initial_design)

Alternatively, one might also evaluate a constraint object directly via

R> evaluate(toer <= .025, initial_design)
R> evaluate(power >= .9, initial_design)

All constraint objects are normalized to the form $h(\mathcal{D}) \leq 0$ (unconditional) or $h(\mathcal{D}, x_1) \leq 0$ (conditional on $X_1=x_1$). Calling evaluate on a constraint object then simply returns the left-hand side of the inequality. The actual optimization is started by invoking minimize:

R> opt1 <- minimize(ess, subject_to(power >= 0.9, toer <= 0.025),
+                   initial_design)

The modular structure of the problem specification is intended to facilitate the inspection or modification of individual components. The call to minimize() is designed to be as close as possible to the mathematical formulation of the optimization problem and returns both the optimized design (opt1$design) as well as the full nloptr return value with details on the optimization procedure (opt1$nloptr_return).

A summary method for objects of the class TwoStageDesign is available to quickly evaluate a set of ConditionalScores such as conditional power as well as UnconditionalScores such as power and expected sample size.

R> cp <- ConditionalPower(dist = datadist, prior = alternative)
R> summary(opt1$design, "Power" = power, "ESS" = ess, "CP" = cp)

adoptr also implements a default plot method for the overall sample size and the stage-two critical value as functions of the first-stage test statistic $x_1$. The plot method also accepts additional ConditionalScores such as conditional power. Calling the plot method produces several plots with the interim test statistic $x_1$ on the $x$-axis and the respective function on the $y$-axis.

R> plot(opt1$design, `Conditional power` = cp)

Note the slightly bent shape of the $c_2(\cdot)$ function. For two-stage designs based on the inverse-normal combination function, $c_2(\cdot)$ would be linear by definition. Since the optimal shape of $c_2(\cdot)$ is not linear (but almost), inverse-normal combination methods are slightly less efficient (cf. Pilz et al. (2019) for a more detailed discussion of this issue).

Optimization under uncertainty

adoptr is not limited to point priors but also supports arbitrary continuous prior distributions. Consider the same situation as before but now assume that the prior over the effect size is given by a much more realistic truncated normal distribution with mean $0.3$ and standard deviation $0.1$, i.e., $\theta\sim\mathcal{N}_{[-1, 1]}(0.3, 0.1^2)$. The order of integration is set to 25 to obtain precise results.

R> prior <- ContinuousPrior(
+    pdf     = function(theta) dnorm(theta, mean = .3, sd = .1),
+    support = c(-1, 1),
+    order   = 25)

The objective function is the expected sample size under the prior

R> ess <- ExpectedSampleSize(dist = datadist, prior = prior)

and we replace power with the expected power
$$\boldsymbol{E}\Big[\boldsymbol{Pr}_\theta\big[X_2>c_2(X_1)\big] \,\Big|\, \theta \geq 0.1\Big],$$
which is the expected power given a relevant effect (here we define the minimal relevant effect as $0.1$). This score can be defined in adoptr by first conditioning the prior.

R> epower <- Power(dist = datadist, prior = condition(prior, c(.1, 1)))

The optimal design under the point prior only achieves an expected power of

R> evaluate(epower, opt1$design)

The optimal design under the truncated normal prior fulfilling the expected power constraint is then given by

R> opt2 <- minimize(ess, subject_to(epower >= 0.9, toer <= 0.025),
+                   initial_design,
+                   opts = list(algorithm = "NLOPT_LN_COBYLA",
+                               xtol_rel = 1e-5, maxeval = 20000))

Note that the increased complexity of the problem requires a larger maximal number of iterations for the underlying optimization procedure. adoptr exposes the nloptr options via the argument opts. In cases where the maximal number of iterations is exhausted, a warning is thrown.

The expected sample size under the prior of the obtained optimal design is larger than that of the design derived under a point alternative (see Section 6.1) when the latter is evaluated under the continuous prior considered in this section. This shows that increased uncertainty about $\theta$ requires larger sample sizes to fulfill the expected power constraint.

Utility maximization and composite scores

adoptr also supports composite scores. This can be used to derive utility-maximizing designs by defining an objective function that combines expected power and expected sample size instead of imposing a hard constraint on expected power. For example, in the above situation one could be interested in a utility-maximizing design. Here, we consider the utility function
$$u(\mathcal{D}) := 200000\, \boldsymbol{E}\Big[\boldsymbol{Pr}_\theta\big[X_2>c_2(X_1)\big] \,\Big|\, \theta \geq 0.1\Big] - \boldsymbol{E}\Big[n(X_1)^2\Big],$$
thus allowing a direct trade-off between power and sample size. The expected squared sample size is used because the practitioner might prefer flatter sample size curves: squaring penalizes large sample sizes more strongly than small ones. Furthermore, there is no longer a strict expected power constraint; instead, expected power becomes part of the utility function, which allows a direct trade-off between the two quantities. This can be interpreted as a pricing mechanism (cf. Kunzmann and Kieser (2020)): every additional percentage point of expected power has a (positive) value of \$2,000 while an increase of $\boldsymbol{E}\big[n(X_1)^2\big]$ by $1$ incurs costs of \$1. The goal is then to compute the design that maximizes the overall utility defined by the utility function $u(\mathcal{D})$ (or, equivalently, minimizes costs).

A composite score can be defined via any valid numerical R expression of score objects. We start by defining a score for the expected squared sample size

R> `n(X_1)`      <- ConditionalSampleSize()
R> `E[n(X_1)^2]` <- expected(composite({`n(X_1)`^2}),
+                            data_distribution = datadist,
+                            prior = prior)

before minimizing the corresponding negative utility without a hard expected power constraint.

R> opt3 <- minimize(composite({`E[n(X_1)^2]` - 200000 * epower}),
+                   subject_to(toer <= 0.025), initial_design)

The expected power of the design is

R> evaluate(epower, opt3$design)

The three optimal designs which have been computed so far are depicted in a joint plot. The design using the continuous prior requires higher sample sizes due to the higher uncertainty about $\theta$. The utility maximization approach results in similar shapes of $n(\cdot)$ and $c_2(\cdot)$ as the constrained optimization. However, the sample sizes are lower due to the design's lower power, which is only possible by allowing a trade-off between expected power and expected sample size. In particular, the maximal sample size of the utility-based design is distinctly smaller than in the case of a hard power constraint under a point prior or a continuous prior.

Conditional power constraint

adoptr also allows the incorporation of hard constraints on conditional scores such as conditional power. Conditional power constraints are intuitively sensible to make sure that a trial which continues to the second stage maintains a high chance of rejecting the null hypothesis at the end. For this example, we return to the case of a point prior on the effect size.

R> prior <- PointMassPrior(theta = .3, mass = 1.0)
R> ess   <- ExpectedSampleSize(dist = datadist, prior = prior)
R> cp    <- ConditionalPower(dist = datadist, prior = prior)
R> power <- expected(cp, data_distribution = datadist, prior = prior)

Here, power is derived as the expected score of the corresponding conditional power. A conditional power constraint is added in exactly the same way as unconditional constraints.

R> opt4 <- minimize(ess, subject_to(toer <= 0.025, power >= 0.9, cp >= 0.8),
+                   initial_design)

Comparing the optimal design computed here with the design optimized under the same constraints but without a conditional power constraint (cf. the beginning of this section), the design with the additional constraint requires larger sample sizes in regions where the conditional power would otherwise fall below the given threshold. Overall, the additional constraint reduces the feasible solution space and consequently increases the expected sample size. This example demonstrates that any additional binding conditional constraint comes at a cost in terms of global optimality. Whether or not the loss in unconditional performance is outweighed by more appealing conditional properties must be decided on a case-by-case basis.

Keeping design parameters fixed

In clinical practice, non-statistical considerations may impose direct constraints on design parameters. For instance, a sponsor might be subject to logistical constraints that render it necessary to design a trial with a specific stage-one sample size. Returning to the standard case discussed in Section 6.1, assume that a stage-one sample size of exactly 80 individuals per group is required ($n_1 = 80$) instead of the optimal value $n_1^*$. Furthermore, assume that the sponsor wants to stop early for futility if and only if there is a negative effect size at the interim analysis, i.e., $c_1^f = 0$. adoptr supports such considerations by allowing specific values of a design to be fixed:

R> initial_design@n1  <- 80
R> initial_design@c1f <- 0
R> initial_design     <- make_fixed(initial_design, n1, c1f)

Any 'fixed' parameter will be kept constant during optimization. Note that it is also possible to 'un-fix' parameters again using the make_tunable function.

R> opt5 <- minimize(ess, subject_to(toer <= 0.025, power >= 0.9),
+                   initial_design)

The following figure visually compares the original design with the new, more restricted design. The designs are qualitatively similar, but fixing $n_1$ and $c_1^f$ does come at the price of a slightly increased expected sample size compared to the less restricted case.

Validation concept

The conduct and analysis of clinical trials is a highly regulated process. An essential requirement put forward in Title 21 CFR (Code of Federal Regulations) Part 11 is the need to validate any software used to work with or produce records (US Food and Drug Administration and others 2003). The exact scope of regulations such as CFR Part 11 is sometimes difficult to assess, and it is not always clear which regulations apply to R packages used in a production environment (The R Foundation for Statistical Computing 2013). Irrespective of the applicability of CFR Part 11 to adoptr, the design of a clinical trial is undoubtedly crucial and package authors should provide extensive evidence of the correctness of the package functionality. Additionally, this evidence should be easily accessible and human-readable. The latter requirement is a consequence of the fact that, again following CFR Part 11 and the remarks in The R Foundation for Statistical Computing (2013), a 'validated R package' does not exist since the validation process must always be implemented by the responsible user.

To facilitate the process of validation as much as possible, adoptr implements the following measures:

  1. Open-source development: The entire development of adoptr is organized around a public GitHub.com repository (https://github.com/optad/adoptr). Anybody can freely download the source code, browse the development history, raise issues, or contribute to the code base by opening pull requests.
  2. CRAN releases: Regular CRAN (CRAN 2020) releases of updated versions maximize visibility and add an additional layer of testing and quality control. New features can be implemented and tested in the (public) development version on GitHub before pushing new releases to CRAN.
  3. Unit testing: adoptr implements an extensive test suite using the package testthat (Wickham, RStudio, and R Core Team (2018), Wickham (2011)) which allows spotting new errors early during development and localizing them quickly. Together with continuous integration (cf. below), this helps to improve quality and speeds up the development process.
  4. Continuous integration / continuous deployment: adoptr makes extensive use of the continuous integration and deployment service GitHub Actions (GitHub.com 2021). Each new commit on the public GitHub.com repository is immediately run through the automated testing pipeline. Merges to the main branch are only possible after tests were passed successfully and a contributor reviewed and approved the changes ('branch protection'). Continuous deployment allows automatically updating code-coverage statistics (cf. below) and up-to-date online documentation (cf. below).
  5. Coverage analyses: To document the extent to which the test suite covers the package code, adoptr relies on the codecov (Codecov LLC 2020) online service in conjunction with the covr package (Hester 2019). It provides statistics on the proportion of lines visited at least once during testing (currently 100%) and enables easy online publication of the results.
  6. Online documentation: Beyond the standard documentation generated using roxygen2 (Wickham et al. 2019), we also make use of the pkgdown (Wickham and Hesselberth 2019) package and the free GitHub Pages service to publish a static HTML version of the documentation online at https://optad.github.io/adoptr/. This includes both the function reference and the vignettes in a consistent and easily accessible format. The online documentation experience is further improved by the integration of a full-text docsearch engine (https://www.algolia.com/ref/docsearch).
  7. Extended validation report: There are limits to what can be done in the standard unit testing framework within a package itself (cf. https://cran.r-project.org/web/packages/policies.html). Long-running test suites also hinder active development with a strict continuous integration and continuous deployment (CI/CD) workflow since changes to the main branch require passing the automated tests. We therefore decided to restrict the internal unit tests to a bare minimum with a clear focus on coverage and technical integrity of the package. To demonstrate correctness of our results over a larger set of examples and in comparison with existing packages such as rpact, we implemented an external 'validation report' (sources: https://github.com/optad/adoptr-validation-report, current report: https://optad.github.io/adoptr-validation-report/) using the bookdown (Xie (2019), Xie (2016)) package. The report itself again uses CI/CD and daily rebuilds to automatically deploy the report corresponding to the most current CRAN-hosted version of the package. Within the report, we still use testthat to conduct formal tests; a minimal sketch of such a test is given below. In case any of these tests fails, the build of the report fails, the maintainers are notified, and the status indicator in the repository changes.
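
To give a flavor of such a formal check, the following hypothetical testthat snippet tests the type one error rate constraint of the design opt1 from Section 6.1; the tolerance is an assumption and the actual tests in the validation report may differ.

R> library(testthat)
R> test_that("optimal design respects the type one error rate constraint", {
+    # 'toer' and 'opt1' as defined in Section 6.1; 1e-4 is an assumed tolerance
+    expect_lte(evaluate(toer, opt1$design), 0.025 + 1e-4)
+  })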

Validating the software employed may well be as much work as developing it in the first place. The opaque requirements and the lack of adequate tools to automate validation tasks are a major hurdle for academic developers to address validation issues. The additional work, however, is worth it since it not only improves quality but also facilitates collaboration and makes it easier to promote packages for real-world use.

Future work

The main motivation for implementing adoptr in R is the fact that this is by far the most common programming language used by the target audience. Note, however, that using R for generic nonlinear constrained optimization problems leads to a performance bottleneck since there is currently no stable and efficient way of obtaining gradient information for generic, user-defined functions. Since one of the design principles of adoptr is extensibility, the ability to support custom objective functions is central. In R, this implies that one has to resort to either a finite-differences approximation of first- and second-order derivatives or to a completely gradient-free optimizer such as COBYLA. In our experiments, we found that COBYLA was far more stable than a finite-differences augmented Lagrangian method (data not shown). Still, for some problems, convergence using COBYLA is rather slow. An interesting alternative to R and nloptr would therefore be Julia (Bezanson et al. 2017) and the JuMP framework for mathematical programming (Lubin and Dunning 2015). This framework allows interfacing generic nonlinear solvers via a common interface and, leveraging Julia's excellent automatic-differentiation capabilities, is able to provide fast and precise (second-order) gradient information for user-defined objective functions.

Acknowledgments

The first two authors contributed equally to this manuscript.

This work was partly supported by the Deutsche Forschungsgemeinschaftunder Grant number KI 708/4-1.

References

Bauer, P., F. Bretz, V. Dragalin, F. König, and G. Wassmer. 2015. "Twenty-Five Years of Confirmatory Adaptive Designs: Opportunities and Pitfalls." Statistics in Medicine 35 (3): 325–47. https://doi.org/10.1002/sim.6472.
Bauer, P., and K. Köhne. 1994. "Evaluation of Experiments with Adaptive Interim Analyses." Biometrics 50 (4): 1029–41.
Bezanson, J., A. Edelman, S. Karpinski, and V. Shah. 2017. "Julia: A Fresh Approach to Numerical Computing." SIAM Review 59 (1): 65–98. https://doi.org/10.1137/141000671.
Codecov LLC. 2020. Codecov. https://about.codecov.io/.
Committee for Medicinal Products for Human Use and others. 2007. "Reflection Paper on Methodological Issues in Confirmatory Clinical Trials Planned with an Adaptive Design." London: EMEA. https://www.ema.europa.eu/en/methodological-issues-confirmatory-clinical-trials-planned-adaptive-design-scientific-guideline.
CRAN. 2020. Comprehensive R Archive Network. https://cran.r-project.org/.
Englert, S., and M. Kieser. 2013. "Optimal Adaptive Two-Stage Designs for Phase II Cancer Clinical Trials." Biometrical Journal 55: 955–68. https://doi.org/10.1002/bimj.201200220.
Fritsch, F. N., and R. E. Carlson. 1980. "Monotone Piecewise Cubic Interpolation." SIAM Journal on Numerical Analysis 17 (2): 238–46. https://doi.org/10.1137/0717021.
GitHub.com. 2021. GitHub Actions. https://docs.github.com/en/actions/.
Hack, N., W. Brannath, and M. Brueckner. 2019. AGSDest: Estimation in Adaptive Group Sequential Trials. https://CRAN.R-project.org/package=AGSDest.
Hedges, L., and I. Olkin. 1985. Statistical Methods in Meta-Analysis. Academic Press.
Hester, J. 2019. covr: Test Coverage for Packages. https://CRAN.R-project.org/package=covr.
ICON plc. 2020. ADDPLAN.
Jennison, C., and B. W. Turnbull. 2015. "Adaptive Sample Size Modification in Clinical Trials: Start Small Then Ask for More?" Statistics in Medicine 34 (29): 3793–3810. https://doi.org/10.1002/sim.6575.
Johnson, S. G. 2018. The NLopt Nonlinear-Optimization Package. https://nlopt.readthedocs.io/en/latest/.
Kunzmann, K., and M. Kieser. 2016. "Optimal Adaptive Two-Stage Designs for Single-Arm Trial with Binary Endpoint." arXiv, 1605.00249.
———. 2020. "Optimal Adaptive Single-Arm Phase II Trials Under Quantified Uncertainty." Journal of Biopharmaceutical Statistics 30 (1): 89–103. https://doi.org/10.1080/10543406.2019.1609016.
Kunzmann, K., and M. Pilz. 2020. Defining New Scores. https://rdrr.io/cran/adoptr/f/vignettes/defining-new-scores.Rmd.
Lehmacher, W., and G. Wassmer. 1999. "Adaptive Sample Size Calculations in Group Sequential Trials." Biometrics 55 (4): 1286–90. https://doi.org/10.1111/j.0006-341X.1999.01286.x.
Lubin, M., and I. Dunning. 2015. "Computing in Operations Research Using Julia." INFORMS Journal on Computing 27 (2): 238–48. https://doi.org/10.1287/ijoc.2014.0623.
Mehta, C. R., and S. J. Pocock. 2011. "Adaptive Increase in Sample Size When Interim Results Are Promising: A Practical Guide with Examples." Statistics in Medicine 30 (28): 3267–84. https://doi.org/10.1002/sim.4102.
Minitab, Inc. 2020. Minitab 19 Statistical Software. https://www.minitab.com.
NCSS. 2019. PASS Sample Size 2019. https://www.ncss.com/software/pass/.
O'Brien, P. C., and T. R. Fleming. 1979. "A Multiple Testing Procedure for Clinical Trials." Biometrics 35 (3): 549–56.
Pilz, M., K. Kunzmann, C. Herrmann, G. Rauch, and M. Kieser. 2019. "A Variational Approach to Optimal Two-Stage Designs." Statistics in Medicine 38 (21): 4159–71. https://doi.org/10.1002/sim.8291.
Pocock, S. J. 1977. "Group Sequential Methods in the Design and Analysis of Clinical Trials." Biometrika 64 (2): 191–99. https://doi.org/10.1093/biomet/64.2.191.
Powell, M. J. D. 1994. "A Direct Search Optimization Method That Models the Objective and Constraint Functions by Linear Interpolation." In Advances in Optimization and Numerical Analysis, 51–67. Dordrecht: Springer-Verlag Netherlands. https://doi.org/10.1007/978-94-015-8330-5_4.
R Core Team. 2019. R: A Language and Environment for Statistical Computing. Vienna, Austria: The R Foundation for Statistical Computing. https://www.R-project.org/.
SAS Institute Inc., Cary, NC. n.d.a. JMP Clinical.
———. n.d.b. SAS.
The R Foundation for Statistical Computing. 2013. "R: Regulatory Compliance and Validation Issues - a Guidance Document for the Use of R in Regulated Clinical Trial Environments." https://www.r-project.org/doc/R-FDA.pdf.
Tymofyeyev, Y. 2014. "A Review of Available Software and Capabilities for Adaptive Designs." In Practical Considerations for Adaptive Trial Design and Implementation, 139–55. Springer-Verlag New York. https://doi.org/10.1007/978-1-4939-1100-4_8.
US Food and Drug Administration et al. 2019. Adaptive Designs for Clinical Trials of Drugs and Biologics - Guidance for Industry. https://www.fda.gov/media/78495/download.
US Food and Drug Administration and others. 2003. "Guidance for Industry Part 11, Electronic Records; Electronic Signatures — Scope and Application." US Food and Drug Administration, Rockville. https://www.fda.gov/media/75414/download.
Vandemeulebroecke, M. 2009. adaptTest: Adaptive Two-Stage Tests. https://CRAN.R-project.org/package=adaptTest.
Wason, J. M. S. 2015. "OptGS: An R Package for Finding Near-Optimal Group-Sequential Designs." Journal of Statistical Software 66 (2): 1–13. https://doi.org/10.18637/jss.v066.i02.
Wason, J. M. S., and J. Burkardt. 2015. OptGS: Near-Optimal and Balanced Group-Sequential Designs for Clinical Trials with Continuous Outcomes. https://CRAN.R-project.org/package=OptGS.
Wassmer, G., and W. Brannath. 2016. Group Sequential and Confirmatory Adaptive Designs in Clinical Trials. Springer Series in Pharmaceutical Statistics. Springer International Publishing. https://doi.org/10.1007/978-3-319-32562-0.
Wassmer, G., and F. Pahlke. 2019. rpact: Confirmatory Adaptive Clinical Trial Design and Analysis. https://CRAN.R-project.org/package=rpact.
Wickham, H. 2011. "testthat: Get Started with Testing." The R Journal 3: 5–10. https://doi.org/10.32614/RJ-2011-002.
Wickham, H., P. Danenberg, G. Csárdi, and M. Eugster. 2019. roxygen2: In-Line Documentation for R. https://CRAN.R-project.org/package=roxygen2.
Wickham, H., and J. Hesselberth. 2019. pkgdown: Make Static HTML Documentation for a Package. https://CRAN.R-project.org/package=pkgdown.
Wickham, H., RStudio, and R Core Team. 2018. testthat: Unit Testing for R. https://CRAN.R-project.org/package=testthat.
Xie, Y. 2016. Bookdown: Authoring Books and Technical Documents with R Markdown. Boca Raton, Florida: Chapman & Hall/CRC. https://doi.org/10.1201/9781315204963.
———. 2019. Bookdown: Authoring Books and Technical Documents with R Markdown. https://github.com/rstudio/bookdown.
Ypma, J., H. W. Borchers, and D. Eddelbuettel. 2018. nloptr: R Interface to NLopt. https://CRAN.R-project.org/package=nloptr.
Zaykin, D. V. 2011. "Optimally Weighted Z-Test Is a Powerful Method for Combining Probabilities in Meta-Analysis." Journal of Evolutionary Biology 24 (8): 1836–41. https://doi.org/10.1111/j.1420-9101.2011.02297.x.
