This article/vignette provides a basic time-to-event endpoint designsfor fixed designs usingnSurv() and group sequentialdesigns usinggsSurv(). Some detail in specification comeswith the flexibility allowed by theLachin andFoulkes (1986) method for sample size under a proportionalhazards model with piecewise constant enrollment, piecewise exponentialfailure and dropout rates. Users may also be interested in theShiny interface as alearning tool. We only use the simplest options here with a singlestratum and exponential failure and dropout rates; see the help file forgsSurv() for examples with a stratified population orpiecewise exponential failure.
We apply theLachin and Foulkes (1986)sample size method and extend it to group sequential design. This methodfixes the duration of a study and varies enrollment rates to power atrial. We also use theLachin and Foulkes(1986) basic power calculation to compute sample size along thelines ofKim and Tsiatis (1990) whereenrollment rates are fixed and enrollment duration is allowed to vary toenroll a sufficient sample size to power a study.
Since the parameters used for a design with no interim are also usedfor a group sequential design, we first specify and derive a design withno interim analysis.
We begin with information about the median time-to-event in thecontrol group, dropout rate, hazard ratios under the null and alternatehypotheses for experimental therapy compared to control, and the desiredType I and II error rates.
# Median control time-to-eventmedian<-12# Exponential dropout rate per unit of timeeta<- .001# Hypothesized experimental/control hazard ratio# (alternate hypothesis)hr<- .75# Null hazard ratio (1 for superiority, >1 for non-inferiority)hr0<-1# Type I error (1-sided)alpha<- .025# Type II error (1-power)beta<- .1Next, we plan the trial duration and the enrollment pattern. Thereare two basic methods for doing this. TheLachinand Foulkes (1986) method demonstrated here fixes the enrollmentpattern and duration as well as the trial duration and changes theabsolute enrollment rates to obtain desired power. The alternaterecommended method is along the lines ofKim andTsiatis (1990), fixing the enrollment rates and follow-upduration, varying the total trial duration to power the design; thiswill also be demonstrated below.
The above information is sufficient to design a trial with no interimanalyses. Note that when callingnSurv(), we transform themedian time-to-event (\(m\)) to anexponential event rate (\(\lambda\))with the formula\[\lambda=\log(2)/m.\]
library(gsDesign)x<-nSurv(R = R,gamma = gamma,eta = eta,minfup = minfup,T = T,lambdaC =log(2)/ median,hr = hr,hr0 = hr0,beta = beta,alpha = alpha)A textual summary of this design is given by printing it. For thegroup sequential design shown later, much more complete formatted outputwill be shown.
x#> Fixed design, two-arm trial with time-to-event#> outcome (Lachin and Foulkes, 1986).#> Solving for: Accrual rate#> Hazard ratio H1/H0=0.75/1#> Study duration: T=36#> Accrual duration: 24#> Min. end-of-study follow-up: minfup=12#> Expected events (total, H1): 507.1519#> Expected sample size (total): 775.0306#> Accrual rates:#> Stratum 1#> 0-1 9.2818#> 1-3 13.9227#> 3-6 23.2045#> 6-24 37.1272#> Control event rates (H1):#> Stratum 1#> 0-Inf 0.0578#> Censoring rates:#> Stratum 1#> 0-Inf 0.001#> Power: 100*(1-beta)=90%#> Type I error (1-sided): 100*alpha=2.5%#> Equal randomization: ratio=1If we had setT = NULL above, the specified enrollmentrates would not be changed but enrollment duration would be adjusted toachieve desired power. For the low enrollment rates specified ingamma above, this would have resulted in a long trial.
Now we move on to a group sequential design.
All the parameters above are used. We set up the number of analyses,timing and spending function parameters. These deserve careful attentionfor every trial and tend to be somewhat customized to be fit-for-purposeaccording to all those involved in designing the trial. Here the choicesconsidered the following:
# Number of analyses (interim + final)k<-3# Timing of interim analyses (k-1 increasing numbers >0 and <1).# Proportion of final events at each interim.timing<-c(.25, .75)# Efficacy bound spending function.# We use Lan-DeMets spending function approximating O'Brien-Fleming bound.# No parameter required for this spending function.sfu<- sfLDOFsfupar<-NULL# Futility bound spending functionsfl<- sfHSD# Futility bound spending parameter specificationsflpar<--7Type II error (1-power) may be set up differently than for a fixeddesign so that more meaningful futility analyses can be performed duringthe course of the trial.
Now we are prepared to generate the design.
The design summary is:
Asymmetric two-sided group sequential design with non-bindingfutility bound, 3 analyses, time-to-event outcome with sample size 676and 443 events required, 85 percent power, 2.5 percent (1-sided) Type Ierror to detect a hazard ratio of 0.75. Enrollment and total studydurations are assumed to be 24 and 36 months, respectively. Efficacybounds derived using a Lan-DeMets O’Brien-Fleming approximation spendingfunction (no parameters). Futility bounds derived using aHwang-Shih-DeCani spending function with gamma = -7.
An important addition not provided above is that the mediantime-to-event is assumed to be 12 months in the control group.
Following are the enrollment rates required to power the trial.
library(gt)library(tibble)tibble(Period =paste("Month",rownames(x$gamma)),Rate =as.numeric(x$gamma))%>%gt()%>%tab_header(title ="Enrollment rate requirements")| Enrollment rate requirements | |
| Period | Rate |
|---|---|
| Month 0-1 | 8.090968 |
| Month 1-3 | 12.136452 |
| Month 3-6 | 20.227421 |
| Month 6-24 | 32.363873 |
Next we provide a tabular summary of bounds for the design. We haveadded extensive footnoting to the table, which may or may not berequired for your design. However, as seen here it makes many choicesfor design parameters and properties transparent. No attempt has beenmade to automate this, but it may be worth considering for a template ifyou wish to make the same choice across many trials. Note that theexclude argument forgsBoundSummary() allowsadditional descriptions for design bounds such as conditional orpredictive power; see the help file for details or just provideexclude = NULL togsBoundSummary() to see alloptions.
# Footnote text for tablefootnote1<-"P{Cross} is the probability of crossing the given bound (efficacy or futility) at or before the given analysis under the assumed hazard ratio (HR)."footnote2<-" Design assumes futility bound is discretionary (non-binding); upper boundary crossing probabilities shown here assume trial stops at first boundary crossed and thus total less than the design Type I error."footnoteHR<-"HR presented is not a requirement, but an estimate of approximately what HR would be required to cross each bound."footnoteM<-"Month is approximated given enrollment and event rate assumptions under alternate hypothesis."# Spending function footnotesfootnoteUS<-"Efficacy bound set using Lan-DeMets spending function approximating an O'Brien-Fleming bound."footnoteLS<-paste("Futility bound set using ", x$lower$name," beta-spending function with ", x$lower$parname,"=", x$lower$param,".",sep ="")# Caption text for tablecaption<-paste("Overall survival trial design with HR=", hr,", ",100* (1- beta),"% power and ",100* alpha,"% Type I error",sep ="")gsBoundSummary(x)%>%gt()%>%tab_header(title ="Time-to-event group sequential design")%>%cols_align("left")%>%tab_footnote(footnoteUS,locations =cells_column_labels(columns =3))%>%tab_footnote(footnoteLS,locations =cells_column_labels(columns =4))%>%tab_footnote(footnoteHR,locations =cells_body(columns =2,rows =c(3,8,13)))%>%tab_footnote(footnoteM,locations =cells_body(columns =1,rows =c(4,9,14)))%>%tab_footnote(footnote1,locations =cells_body(columns =2,rows =c(4,5,9,10,14,15)))%>%tab_footnote(footnote2,locations =cells_body(columns =2,rows =c(4,9,14)))| Time-to-event group sequential design | |||
| Analysis | Value | Efficacy1 | Futility2 |
|---|---|---|---|
| IA 1: 25% | Z | 4.3326 | -1.7019 |
| N: 414 | p (1-sided) | 0.0000 | 0.9556 |
| Events: 111 | ~HR at bound3 | 0.4386 | 1.3823 |
| Month: 164 | P(Cross) if HR=15,6 | 0.0000 | 0.0444 |
| P(Cross) if HR=0.755 | 0.0024 | 0.0007 | |
| IA 2: 75% | Z | 2.3398 | 0.6728 |
| N: 676 | p (1-sided) | 0.0096 | 0.2505 |
| Events: 332 | ~HR at bound3 | 0.7734 | 0.9288 |
| Month: 284 | P(Cross) if HR=15,6 | 0.0096 | 0.7500 |
| P(Cross) if HR=0.755 | 0.6110 | 0.0260 | |
| Final | Z | 2.0118 | 2.0118 |
| N: 676 | p (1-sided) | 0.0221 | 0.0221 |
| Events: 443 | ~HR at bound3 | 0.8258 | 0.8258 |
| Month: 364 | P(Cross) if HR=15,6 | 0.0249 | 0.9751 |
| P(Cross) if HR=0.755 | 0.8500 | 0.1500 | |
| 1 Efficacy bound set using Lan-DeMets spending function approximating an O'Brien-Fleming bound. | |||
| 2 Futility bound set using Hwang-Shih-DeCani beta-spending function with gamma=-7. | |||
| 3 HR presented is not a requirement, but an estimate of approximately what HR would be required to cross each bound. | |||
| 4 Month is approximated given enrollment and event rate assumptions under alternate hypothesis. | |||
| 5 P{Cross} is the probability of crossing the given bound (efficacy or futility) at or before the given analysis under the assumed hazard ratio (HR). | |||
| 6 Design assumes futility bound is discretionary (non-binding); upper boundary crossing probabilities shown here assume trial stops at first boundary crossed and thus total less than the design Type I error. | |||
Several plots are available to summarize a design; see help forplot.gsDesign(); one easy way to see how to generate eachis by checking plots and code generated by theShiny interface. Thepower plot is information-rich, but also requires some explanation;thus, we demonstrate here.
The solid black line represents the trial power by effect size. Powerat interim 1 is represented by the black dotted line. Cumulative powerat interim 2 is represented by the black dashed line. The red dottedline is 1 minus the probability of crossing the futility bound on thepercentage scale. The red dashed line is 1 minus the cumulativeprobability of crossing the futility bound by interim 2.
Analyses rarely occur at exactly the number of events which areplanned. The advantage of the spending function approach to design isthat bounds can be updated to account for the actual number of eventsobserved at each analysis. In fact, analyses can be added or deletednoting that any changes in timing or analyses should not be made withknowledge of unblinded study results. We suggest tables and a plot thatmay be of particular use. We also present computation of conditional andpredictive power.
First, we update the actual number of events for interims 1 and 2 andassume the final analysis event count is still as originallyplanned:
The simple updates to Z-values and p-values for the design based oninformation fraction just requires the fraction of final events planned,but does not include the number of events or treatment effect in theoutput:
xu<-gsDesign(alpha = x$alpha,beta = x$beta,test.type = x$test.type,maxn.IPlan = x$n.I[x$k],n.I = n.I,sfu = sfu,sfupar = sfupar,sfl = sfl,sflpar = sflpar,delta = x$delta,delta1 = x$delta1,delta0 = x$delta0)Now we print the design summary, selecting minimal calculations for atable to provide guidance for review of results. If you wish to see allpossible summaries of bounds, change toexclude = NULLbelow. Here we have assumed futility guidance is based on the hazardratio at interim analysis; this is not generally the case, but is anoption as these bounds are guidance rather than having strictinferential interpretation.
gsBoundSummary( xu,deltaname ="HR",logdelta =TRUE,Nname ="Events",exclude =c("Spending","B-value","CP","CP H1","PP","P(Cross) if HR=1","P(Cross) if HR=0.75" ))%>%gt()%>%cols_align("left")%>%tab_header(title ="Time-to-event group sequential bound guidance",subtitle ="Bounds updated based on event counts through IA2" )%>%tab_footnote("Nominal p-value required to establish statistical significance.",locations =cells_body(columns =3,rows =c(2,5,8)) )%>%tab_footnote("Interim futility guidance based on observed HR is non-binding.",locations =cells_body(columns =4,rows =c(3,6)) )%>%tab_footnote("HR bounds are approximations; decisions on crossing are based solely on p-values.",locations =cells_body(column =2,rows =c(3,6,9)) )| Time-to-event group sequential bound guidance | |||
| Bounds updated based on event counts through IA2 | |||
| Analysis | Value | Efficacy | Futility |
|---|---|---|---|
| IA 1: 26% | Z | 4.2416 | -1.6470 |
| Events: 115 | p (1-sided) | 0.00001 | 0.9502 |
| ~HR at bound2 | 0.4534 | 1.35963 | |
| IA 2: 82% | Z | 2.2115 | 1.0322 |
| Events: 364 | p (1-sided) | 0.01351 | 0.1510 |
| ~HR at bound2 | 0.7931 | 0.89743 | |
| Final | Z | 2.0323 | 2.0261 |
| Events: 443 | p (1-sided) | 0.02111 | 0.0214 |
| ~HR at bound2 | 0.8244 | 0.8249 | |
| 1 Nominal p-value required to establish statistical significance. | |||
| 2 HR bounds are approximations; decisions on crossing are based solely on p-values. | |||
| 3 Interim futility guidance based on observed HR is non-binding. | |||
We recommend 3 things to present to summarize results in addition tostandard summaries such as the logrank p-value, hazard ratio based onthe Cox model, median time-to-event and Kaplan-Meier curves by treatmentgroup.
For these summaries, we will assume the updated interim event countsused above along with interim Z-values of 0.25 and 2 at interim 1 andinterim 2, respectively.
Conditional power at interim analysis 2 is computed for the currenttrend, under the null hypothesis (HR=1), and under the alternatehypothesis (HR=0.75 in this case) as follows:
gsCP(x = xu,# Updated designi =2,# Interim analysis 2zi = Z[2]# Observed Z-value for testing)$upper$prob#> [,1] [,2] [,3]#> [1,] 0.6599398 0.301728 0.7764629Predictive power incorporates uncertainty into the above conditionalpower evaluation. The computation assumes a prior distribution for thetreatment effect and then updates to a posterior distribution for thetreatment effect based on the most recent interim result. Theconditional probability of a positive finding is then averaged accordingto this posterior. We specify a normal prior for the standardized effectsize using thegsDesign::normalGrid() function. We select aweak prior with mean half-way between the alternative(x$delta) and null (0) hypotheses and a variance equivalentto observing 5% (=1/20) of the targeted events at the final analysis;the following shows that the standard deviation for the prior is wellover twice the mean, so the prior is relatively weak.
prior<-normalGrid(mu = x$delta/2,sigma =sqrt(20/max(x$n.I)))cat(paste(" Prior mean:",round(x$delta/2,3),"\n Prior standard deviation",round(sqrt(20/ x$n.fix),3),"\n"))#> Prior mean: 0.072#> Prior standard deviation 0.215Now based on the interim 2 result, we compute the predictive power ofa positive final analysis.
gsPP(x = xu,# Updated designi =2,# Interim analysis 2zi = Z[2],# Observed Z-value for testingtheta = prior$z,# Grid points for above priorwgts = prior$wgts# Weights for averaging over grid)#> [1] 0.6407376A B-value (Proschan, Lan, and Wittes(2006)) is a Z-value multiplied by the square root of theinformation fraction (interim information divided by final plannedinformation. In the plot below on the B-value scale, we present theefficacy bounds at each analysis in black, futility guidance in red, theobserved interim tests in blue connected by solid lines, and a dashedblue line to project the final result. Under a constant treatment effect(proportional hazards for a time-to-event outcome tested with a logranktest) the blue line behaves like observations from a Brownian motionwith a linear trend (“constant drift”). While a comparable Z-value plotwould have the effect increasing with the square root of the number ofevents, the B-value plot trend is linear in the event count. The trendis proportional to the logarithm of the underlying hazard ratio. Theprojected final test is based on the dashed line which represents alinear trend based on the most recent B-value computed; this projectionis what was used in the conditional power calculation under the currenttrend that was computed above.
maxx<-450# Max for x-axis specified by userylim<-c(-1,3)# User-specified y-axis limitsanalysis<-2# Current analysis specified by user# Following code should require no further changesplot( xu,plottype ="B",base =TRUE,xlim =c(0, maxx),ylim = ylim,main ="B-value projection",lty =1,col =1:2,xlab ="Events")N<-c(0, xu$n.I[1:analysis])B<-c(0, Z*sqrt(xu$timing[1:analysis]))points(x = N,y = B,col =4)lines(x = N,y = B,col =4)slope<- B[analysis+1]/ N[analysis+1]Nvals<-c(N[analysis+1],max(xu$n.I))lines(x = Nvals,y = B[analysis+1]+c(0, slope* (Nvals[2]- Nvals[1])),col =4,lty =2)