Author manuscript; available in PMC: 2022 Feb 8.

Megastudies improve the impact of applied behavioural science

Katherine L Milkman (1), Dena Gromet (2), Hung Ho (1,26), Joseph S Kay (2), Timothy W Lee (2,27), Pepi Pandiloski (3), Yeji Park (4), Aneesh Rai (1), Max Bazerman (5), John Beshears (5), Lauri Bonacorsi (6), Colin Camerer (7), Edward Chang (5), Gretchen Chapman (8), Robert Cialdini (9), Hengchen Dai (10), Lauren Eskreis-Winkler (11), Ayelet Fishbach (11), James J Gross (12), Samantha Horn (8), Alexa Hubbard (13), Steven J Jones (14), Dean Karlan (15), Tim Kautz (16), Erika Kirgios (1), Joowon Klusowski (17), Ariella Kristal (18), Rahul Ladhania (19), George Loewenstein (8), Jens Ludwig (3), Barbara Mellers (17), Sendhil Mullainathan (11), Silvia Saccardo (8), Jann Spiess (20), Gaurav Suri (21), Joachim H Talloen (8), Jamie Taxer (12), Yaacov Trope (13), Lyle Ungar (22), Kevin G Volpp (23), Ashley Whillans (5), Jonathan Zinman (24), Angela L Duckworth (1,25)
1. Department of Operations, Information and Decisions, The Wharton School, University of Pennsylvania, Philadelphia, PA, USA
2. Behavior Change for Good Initiative, The Wharton School, University of Pennsylvania, Philadelphia, PA, USA
3. Harris School of Public Policy, University of Chicago, Chicago, IL, USA
4. Department of Psychology, Princeton University, Princeton, NJ, USA
5. Department of Negotiation, Organizations & Markets, Harvard Business School, Harvard University, Boston, MA, USA
6. Pritzker School of Law, Northwestern University, Chicago, IL, USA
7. Division of Humanities and Social Sciences, California Institute of Technology, Pasadena, CA, USA
8. Department of Social and Decision Sciences, Carnegie Mellon University, Pittsburgh, PA, USA
9. Department of Psychology, Arizona State University, Tempe, AZ, USA
10. Department of Management and Organizations, Anderson School of Management, University of California Los Angeles, Los Angeles, CA, USA
11. Department of Behavioral Science, Booth School of Business, University of Chicago, Chicago, IL, USA
12. Department of Psychology, Stanford University, Stanford, CA, USA
13. Department of Psychology, New York University, New York, NY, USA
14. Department of Psychology, Rutgers University, New Brunswick, NJ, USA
15. Department of Finance, Kellogg School of Management, Northwestern University, Evanston, IL, USA
16. Mathematica, Princeton, NJ, USA
17. Department of Marketing, The Wharton School, University of Pennsylvania, Philadelphia, PA, USA
18. Department of Organizational Behavior, Harvard Business School, Harvard University, Boston, MA, USA
19. Department of Health Management and Policy, School of Public Health, University of Michigan, Ann Arbor, MI, USA
20. Department of Operations, Information & Technology, Stanford Graduate School of Business, Stanford, CA, USA
21. Department of Psychology, San Francisco State University, San Francisco, CA, USA
22. Department of Computer and Information Sciences, University of Pennsylvania, Philadelphia, PA, USA
23. Department of Medical Ethics and Health Policy, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, USA
24. Department of Economics, Dartmouth College, Hanover, NH, USA
25. Department of Psychology, University of Pennsylvania, Philadelphia, PA, USA
26. Present address: Department of Marketing, Booth School of Business, University of Chicago, Chicago, IL, USA
27. Present address: McCormick School of Engineering, Northwestern University, Evanston, IL, USA

Author contributions: K.L.M., D.G., A.R., M.B., J.B., L.B., E.C., G.C., R.C., H.D., L.E.-W., A.F., J.J.G., S.H., A.H., S.J.J., D.K., E.K., J.K., A.K., G.L., B.M., S.M., S.S., G.S., J.H.T., J.T., Y.T., L.U., K.G.V., A.W., J.Z. and A.L.D. designed the research. K.L.M., D.G., J.S.K., P.P., Y.P., A.L.D. and A.R. performed the research. H.H., T.W.L., P.P. and Y.P. analysed the data. K.L.M. and A.L.D. wrote the paper. D.G., H.H., J.S.K., T.W.L., P.P., Y.P., A.R., M.B., J.B., C.C., G.C., H.D., A.F., J.J.G., D.K., T.K., E.K., J.K., R.L., J.L., B.M., S.M., S.S., J.S., A.W. and J.Z. provided feedback on the paper. K.L.M., D.G., J.S.K., T.K., R.L. and S.M. supervised data analysis. K.L.M., D.G., H.H., J.S.K. and T.W.L. prepared the Supplementary Information.

Correspondence and requests for materials should be addressed to Katherine L. Milkman or Angela L. Duckworth: kmilkman@wharton.upenn.edu; aduckworth@characterlab.org

Issue date 2021 Dec.

Reprints and permissions information is available at http://www.nature.com/reprints.

PMCID: PMC8822539  NIHMSID: NIHMS1774681  PMID: 34880497
The publisher's version of this article is available at Nature

Abstract

Policy-makers are increasingly turning to behavioural science for insights about how to improve citizens’ decisions and outcomes [1]. Typically, different scientists test different intervention ideas in different samples using different outcomes over different time intervals [2]. The lack of comparability of such individual investigations limits their potential to inform policy. Here, to address this limitation and accelerate the pace of discovery, we introduce the megastudy: a massive field experiment in which the effects of many different interventions are compared in the same population on the same objectively measured outcome for the same duration. In a megastudy targeting physical exercise among 61,293 members of an American fitness chain, 30 scientists from 15 different US universities worked in small independent teams to design a total of 54 different four-week digital programmes (or interventions) encouraging exercise. We show that 45% of these interventions significantly increased weekly gym visits by 9% to 27%; the top-performing intervention offered microrewards for returning to the gym after a missed workout. Only 8% of interventions induced behaviour change that was significant and measurable after the four-week intervention. Conditioning on the 45% of interventions that increased exercise during the intervention, we detected carry-over effects that were proportionally similar to those measured in previous research [3–6]. Forecasts by impartial judges failed to predict which interventions would be most effective, underscoring the value of testing many ideas at once and, therefore, the potential for megastudies to improve the evidentiary value of behavioural science.


A major impediment to prescribing behaviourally informed policy interventions is the inability to make apples-to-apples comparisons of their efficacy [2]. Scientific teams tend to run studies independently, recruiting their own samples, making their own decisions about design parameters and targeting behavioural outcomes of their own choosing. As a consequence, differences in treatment efficacy are obscured by massive heterogeneity in sample demographics, treatment and follow-up periods, contexts and outcomes. Furthermore, many promising ideas for changing behaviour do not work in practice [7], and it can be surprisingly difficult to predict ex ante which seeds will eventually bear fruit [7–11]. Thus, the ‘one-apple-at-a-time’ approach is an inefficient way to advance behavioural science.

We propose an experimental paradigm for evaluating many behavioural interventions at once: the megastudy is a massive field experiment in which many different treatments are tested synchronously in one large sample using a common, objectively measured outcome. This approach takes inspiration from the common task framework, which has substantially accelerated progress in the field of machine learning [12]. In a common task framework, researchers compete to solve the same problem (such as image recognition), subject to the same constraints (for example, the same validation method) and using the same dataset, with complete transparency in terms of hypotheses tested and results [12,13]. There are also precedents for this kind of research in online and laboratory environments [14,15]. Furthermore, scientific tournaments have a similar flavour to megastudies [16], although they rarely involve random assignment and have not focused on behaviour change.

Additional benefits of megastudies include enabling economies of scale and publishing null results. The centralized administration of megastudies both decreases the marginal costs of conducting field research for individual scientists and accelerates the pace of scientific exploration. Further, in the spirit of recent large scientific collaborations aimed at improving the openness and reproducibility of research [17], megastudies enable null findings to be published because those null results are part of a larger endeavour.

Here we present a demonstration megastudy involving scientists who worked in small teams to create dozens of different online programmes aimed at promoting gym attendance in American adults. We also summarize separate prediction studies in which lay and expert third-party observers made ex ante forecasts of the relative efficacy of these interventions.

Defining the primary outcome

As policy-makers agree that physical exercise is healthy and because gym attendance can be measured objectively and precisely, gym visits are a natural target for applied behavioural science research [3–5,18]. Currently, only 49% of American adults exercise at the recommended levels [19], and physical inactivity accounts for an estimated 9% of premature mortality globally [20].

Our final megastudy sample included n = 61,293 participants in 46 US states (65% female, mean age = 39.13, s.d. = 13.25). The outcomes of interest over a four-week intervention period were: (1) the number of days participants checked into the gym each week, and (2) an indicator for whether participants checked into the gym at least once in a given week (following previous research [5,6]). For simplicity, here we focus on the number of days that participants exercised, but include the discrete exercise measure in Extended Data Fig. 1, Extended Data Tables 1–3 and Supplementary Information 5, in which we show that results with this secondary outcome are remarkably similar to our main results below.

Gym attendance data were provided by 24 Hour Fitness, which requires members to check in to enter the gym. In the four weeks before joining our megastudy, participants’ mean number of weekly visits to the gym was 1.27 (s.d. = 1.48) and the mean number of participants who checked into the gym at least once in a given week was 47.7% (s.d. = 40.4%).

At least 455 participants were assigned to each megastudy condition (mean: n = 1,135; median: n = 839; Extended Data Table 4), yielding at least 90% power to detect a mean difference of 0.32 weekly gym visits per person between conditions when α is set at 0.05. Furthermore, as reported in Extended Data Table 5 and Supplementary Information 1 and 7, balance checks suggest that randomization was successful and participant characteristics were similar across experimental conditions.
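As a rough check on the power statement above, the sketch below uses a normal approximation for a two-sided comparison of two group means. It is not the authors' calculation: the helper `two_sample_power` is hypothetical, and it assumes both conditions have the minimum size of 455 and that the outcome s.d. equals the pre-study value of 1.48 weekly visits reported earlier.

```python
from math import sqrt
from statistics import NormalDist


def two_sample_power(delta, sd, n_per_group, alpha=0.05):
    """Approximate power of a two-sided z-test for a difference in means."""
    z = NormalDist()
    se = sd * sqrt(2.0 / n_per_group)   # standard error of the difference
    z_crit = z.inv_cdf(1 - alpha / 2)   # about 1.96 when alpha = 0.05
    shift = delta / se                  # true difference in standard-error units
    # Probability of rejecting in either tail when the true difference is delta.
    return (1 - z.cdf(z_crit - shift)) + z.cdf(-z_crit - shift)


# Assumed inputs: 455 participants in each of two conditions and an outcome
# s.d. of 1.48 weekly visits (the pre-study value, used here as a stand-in).
power = two_sample_power(delta=0.32, sd=1.48, n_per_group=455)
print(f"approximate power: {power:.2f}")
```

Under these assumptions the approximation lands close to the 90% figure quoted in the text; larger conditions would have more power.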

The effects of study conditions on exercise

Our megastudy included a placebo control condition in which participants received 1,500 points when they enrolled in the study (worth US$1.08 when redeemed at https://www.amazon.com, an amount equal to the expected earnings of participants in a typical experimental condition; see the ‘Descriptions of the 54 conditions in the megastudy’ section of the Supplementary Information). Participants in the placebo control condition received no other intervention content.

We also included a baseline intervention called planning, reminders and microincentives to exercise. This condition combined three low-cost, evidence-based components that are expected to increase exercise. First, as past research has shown that planning prompts facilitate follow-through [21–23], we prompted participants to plan the dates and times when they would exercise each week of the programme. Second, as reminders have been shown to enhance goal achievement [24], we texted participants reminders to exercise at these scheduled times. Finally, building on past work showing that cash rewards for exercise an order of magnitude larger than those used here can promote gym attendance [3–6] and that the effects of very small incentives on goal commitment can be surprisingly large [25], we offered participants microincentives for each gym visit (300 points per visit, redeemable for approximately US$0.22).

The other 52 experimental conditions in our megastudy augmented this planning, reminders and microincentives to exercise condition by adding new features (Supplementary Table 1).

Compared with the placebo control condition, 45% of the 53 experimental conditions tested in our megastudy produced a statistically significant (two-sided P < 0.05) increase in weekly gym visits during our four-week intervention, as estimated by an ordinary least squares (OLS) regression model (significant P values range from 2.39 × 10^−7 to 0.045; Fig. 1a and Extended Data Table 6 present these regressions; Table 2 shows the percentage of other treatments each experimental condition outperformed). In Extended Data Table 7, we present parallel analyses of whether study participants attended the gym at least once per week, and we found that, compared with the placebo control condition, approximately 34% of the experimental conditions had significantly more people visiting the gym at least once per week.

Fig. 1 | Measured versus predicted changes in weekly gym visits induced by interventions.

The measured change (blue) versus change predicted by third-party observers (gold) in weekly gym visits induced by each of the 53 experimental conditions in our megastudy compared with the placebo control condition during a four-week intervention period. The error bars represent the 95% confidence intervals (see Extended Data Table 6 for the complete OLS regression results shown here in blue and the sample sizes for each condition; Supplementary Information 11 for more details about the prediction data shown in gold; and Supplementary Table 1 for full descriptions of each treatment condition in our megastudy). Sample weights were included in the pooled third-party prediction data to ensure equal weighting of each of our three participant samples (professors, practitioners and Prolific respondents). The superscripts a–e denote the different incentive amounts offered in different versions of the bonus for returning after missed workouts, higher incentives and rigidity rewarded conditions, which are described in Supplementary Table 1. In conditions with the same name, superscripts that come earlier in the alphabet indicate larger incentives.

Table 2 | The percentage of treatments that each experimental condition outperformed

| Experimental condition | Percentage of conditions outperformed (P < 0.05) | List of conditions outperformed (P < 0.05) |
| --- | --- | --- |
| (1) Bonus for returning after missed workouts^b | 55 | 54***, 30**, 40**, 41**, 44–53**, 26–29*, 31–39*, 42*, 43* |
| (2) Higher incentives^a | 47 | 54***, 47–52**, 28–31*, 33*, 35–46*, 53* |
| (3) Exercise social norms shared (high and increasing) | 40 | 54***, 47–52**, 30*, 33*, 35–37*, 39–46*, 53* |
| (4) Free audiobook provided | 15 | 54**, 47–53* |
| (5) Bonus for returning after missed workouts^a | 38 | 54***, 47–52**, 30*, 33*, 36*, 37*, 39–46*, 53* |
| (6) Planning fallacy described and planning revision encouraged | 11 | 54**, 48–52* |
| (7) Choice of gain- or loss-framed microincentives | 32 | 54***, 47–52**, 30*, 37*, 40–46*, 53* |
| (8) Exercise commitment contract explained | 11 | 54**, 48–52* |
| (9) Free audiobook provided, temptation bundling explained | 17 | 54**, 45*, 47–53* |
| (10) Following workout plan encouraged | 13 | 54**, 47–52* |
| (11) Fitness questionnaire with decision support and cognitive reappraisal prompt | 11 | 54**, 48–52* |
| (12) Values affirmation | 4 | 51*, 54* |
| (13) Asked questions about workouts | 2 | 54* |
| (14) Rigidity rewarded^a | 6 | 54**, 51*, 52* |
| (15) Defaulted into three weekly workouts | 2 | 54* |
| (16) Exercise fun facts shared | 2 | 54* |
| (17) Exercise advice solicited | 2 | 54* |
| (18) Fitness questionnaire | 2 | 54** |
| (19) Planning revision encouraged | 2 | 54* |
| (20) Exercise social norms shared (low) | 2 | 54* |
| (21) Exercise encouraged with typed pledge | 0 | |
| (22) Gain-framed microincentives | 2 | 54* |
| (23) Higher incentives^b | 2 | 54* |
| (24) Rigidity rewarded^e | 2 | 54* |
| (25) Exercise encouraged with signed pledge | 0 | |
| (26) Values affirmation followed by diagnosis as gritty | 0 | |
| (27) Bonus for consistent exercise schedule | 0 | |
| (28) Rigidity rewarded^c | 0 | |
| (29) Loss-framed microincentives | 0 | |
| (30) Planning, reminders and microincentives to exercise | 2 | 54** |
| (31) Fitness questionnaire with cognitive reappraisal prompt | 0 | |
| (32) Exercise encouraged | 0 | |
| (33) Planning workouts encouraged | 0 | |
| (34) Gym routine encouraged | 0 | |
| (35) Reflecting on workouts encouraged | 0 | |
| (36) Planning workouts rewarded | 0 | |
| (37) Effective workouts encouraged | 0 | |
| (38) Planning benefits explained | 0 | |
| (39) Reflecting on workouts rewarded | 0 | |
| (40) Fun workouts encouraged | 0 | |
| (41) Monday–Friday consistency rewarded, Saturday–Sunday consistency rewarded | 0 | |
| (42) Exercise encouraged with electronically signed pledge | 0 | |
| (43) Bonus for variable exercise schedule | 0 | |
| (44) Exercise commitment contract explained post-intervention | 0 | |
| (45) Rewarded for responding to questions about workouts | 0 | |
| (46) Defaulted into one weekly workout | 0 | |
| (47) Exercise social norms shared (low but increasing) | 0 | |
| (48) Rigidity rewarded^d | 0 | |
| (49) Exercise commitment contract encouraged | 0 | |
| (50) Fitness questionnaire with decision support | 0 | |
| (51) Rigidity rewarded^b | 0 | |
| (52) Exercise advice solicited, shared with others | 0 | |
| (53) Exercise social norms shared (high) | 0 | |
| (54) Placebo control | 0 | |

The percentage of conditions outperformed (P < 0.05) was obtained by conducting pairwise Wald tests to assess whether paired regression coefficients significantly differed from one another in Extended Data Table 6.

The superscripts a–e denote the different incentive amounts offered in different versions of the bonus for returning after missed workouts, higher incentives and rigidity rewarded conditions, which are described in Supplementary Table 1. In conditions with the same name, superscripts that come earlier in the alphabet indicate larger incentives.

*P < 0.05; **P < 0.01; ***P < 0.001.

Rather than adjusting our P values for 53 paired comparisons, we report unadjusted standard errors, two-sided P values and confidence intervals (CI) so readers may choose a preferred correction. Using the Storey–Tibshirani method of computing the false-discovery rate [26], we estimate that the results identified as significant at the 5% level have less than a 5.07% chance of being a true null. The 45% of our experimental conditions that increased gym visits produced an estimated 0.14 to 0.40 extra weekly gym visits during the four-week intervention period (the CI lower bounds range from 0.004 to 0.21 and the CI upper bounds range from 0.23 to 0.59), increasing exercise by an estimated 9% to 27% compared with the placebo control condition, in which participants visited the gym a mean of 1.48 times per week during the intervention period. No treatment significantly reduced gym visits. Furthermore, an F-test enables us to reject the null hypothesis that all 53 treatment effects have the same true value (F = 1.392, P = 0.032).
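Because the unadjusted P values are left for readers to correct as they prefer, the sketch below shows two common choices: a Bonferroni correction and the Benjamini–Hochberg step-up procedure. Neither is the Storey–Tibshirani method cited above; they are simpler stand-ins, and the P values in `example_p` are hypothetical rather than taken from the study.

```python
def bonferroni(p_values, alpha=0.05):
    """Reject only P values at or below alpha divided by the number of tests."""
    m = len(p_values)
    return [p <= alpha / m for p in p_values]


def benjamini_hochberg(p_values, alpha=0.05):
    """Benjamini-Hochberg step-up procedure controlling the false-discovery rate."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])  # indices by ascending P
    k_max = 0
    for rank, idx in enumerate(order, start=1):
        # Track the largest rank whose P value sits under its threshold.
        if p_values[idx] <= rank / m * alpha:
            k_max = rank
    rejected = [False] * m
    for idx in order[:k_max]:
        rejected[idx] = True
    return rejected


# Hypothetical P values for five comparisons (not from the megastudy).
example_p = [0.5, 0.001, 0.045, 0.012, 0.02]
print(bonferroni(example_p))
print(benjamini_hochberg(example_p))
```

Bonferroni is the more conservative of the two, which is why it rejects fewer of the hypothetical comparisons here.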

The planning, reminders and microincentives to exercise condition produced an estimated 0.14 more weekly gym visits per participant (a 9% increase in exercise) compared with the placebo control condition (b = 0.14, 95% CI = 0.04–0.23, P = 0.006).
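The percentage increases quoted here follow from dividing each regression estimate by the placebo control mean of 1.48 weekly visits. A minimal sketch of that arithmetic, using the rounded figures from the text (the function name `percent_increase` is ours):

```python
CONTROL_MEAN = 1.48  # placebo control mean weekly visits during the intervention


def percent_increase(effect, baseline=CONTROL_MEAN):
    """Express an absolute effect (extra weekly visits) relative to a baseline."""
    return 100.0 * effect / baseline


low = percent_increase(0.14)   # smallest significant effect reported
high = percent_increase(0.40)  # largest significant effect reported
print(f"{low:.1f}% to {high:.1f}%")
```

Because the inputs are rounded, this reproduces the reported 9% to 27% range only approximately; intermediate conditions may differ by a percentage point.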

All of the 24 treatments that significantly increased exercise in comparison to the placebo control condition included planning, reminders and incentives to exercise, typically with an additional nudge or reward to visit the gym (Fig. 1). Five of these experimental conditions stood out, significantly outperforming the planning, reminders and microincentives condition according to Wald tests comparing the estimated treatment effects. As some effect-size estimates had wider confidence intervals than others, these five conditions were not exactly the same as the five conditions with the largest estimated effect sizes shown in Fig. 1. The conditions in question are presented in Table 1 with their estimated effects on exercise. Note that the criteria used for their selection (that they are the top performers in a distribution) mean that these estimated treatment effects are probably inflated.
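A Wald test of whether two regression coefficients differ divides their difference by the standard error of that difference. The sketch below is illustrative only: it backs out standard errors from the reported 95% confidence intervals and, as a loud simplification, sets the covariance between the two estimates to zero. In the actual regression both coefficients share the placebo control group, so the covariance is not zero and the result will not match the published P value exactly. The helper names are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def se_from_ci(lower, upper, level=0.95):
    """Recover a standard error from a symmetric normal-theory interval."""
    z = NormalDist().inv_cdf(0.5 + level / 2)
    return (upper - lower) / (2 * z)


def wald_test(b1, se1, b2, se2, cov=0.0):
    """Two-sided Wald z-test for H0: b1 == b2."""
    se_diff = sqrt(se1 ** 2 + se2 ** 2 - 2 * cov)
    z = (b1 - b2) / se_diff
    p = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p


# Reported estimates versus placebo: top condition b = 0.403 (CI 0.21-0.59),
# planning, reminders and microincentives b = 0.14 (CI 0.04-0.23).
se_top = se_from_ci(0.21, 0.59)
se_base = se_from_ci(0.04, 0.23)
# cov=0.0 is an assumption made for illustration, not a value from the paper.
z_stat, p_value = wald_test(0.403, se_top, 0.14, se_base, cov=0.0)
print(f"z = {z_stat:.2f}, two-sided P = {p_value:.3f}")
```

With the full covariance matrix from the fitted model, the same function would apply with `cov` set to the relevant off-diagonal entry.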

Table 1 | Regression-estimated effects of top-performing interventions

| Treatment | b (vs placebo control) | 95% CI | P | b (vs planning, reminders and microincentives) | 95% CI | P |
| --- | --- | --- | --- | --- | --- | --- |
| (1) Bonus for returning after missed workouts^b | 0.403 | 0.21–0.59 | <0.001 | 0.266 | 0.06–0.47 | 0.010 |
| (2) Higher incentives^a | 0.365 | 0.18–0.55 | <0.001 | 0.229 | 0.04–0.42 | 0.020 |
| (3) Exercise social norms shared (high and increasing) | 0.345 | 0.18–0.51 | <0.001 | 0.209 | 0.03–0.39 | 0.020 |
| (5) Bonus for returning after missed workouts^a | 0.336 | 0.18–0.49 | <0.001 | 0.200 | 0.03–0.37 | 0.022 |
| (7) Choice of gain- or loss-framed microincentives | 0.284 | 0.18–0.39 | <0.001 | 0.147 | 0.02–0.27 | 0.021 |

See Extended Data Table 6 for the complete OLS regression results summarized here in columns 2–4, and Extended Data Table 8 for the complete OLS regression results summarized in columns 5–7.

The superscripts a–b denote the different incentive amounts offered in different versions of the bonus for returning after missed workouts and higher incentives, which are described in Supplementary Table 1. In conditions with the same name, superscripts that come earlier in the alphabet indicate larger incentives.

As shown in Table 1, we found that rewarding participants with a bonus of 125 points (US$0.09) for returning to the gym after a missed workout produced an estimated 0.40 more weekly gym visits per participant (a 27% increase in exercise) compared with the placebo control (b = 0.40, P < 0.001). This condition produced a 16% increase in exercise relative to planning, reminders and microincentives (b = 0.27, P = 0.010). Second, offering participants larger incentives (that is, 490 points per gym visit, or US$1.75) produced an estimated 0.37 more weekly gym visits per participant (a 25% increase in exercise) compared with the placebo control (b = 0.37, P < 0.001). This condition produced a 14% increase in exercise relative to planning, reminders and microincentives (b = 0.23, P = 0.020). Third, telling participants that the majority of Americans exercise and the fraction is growing produced an estimated 0.35 more weekly gym visits per participant (a 24% increase in exercise) compared with the placebo control (b = 0.35, P < 0.001). This condition produced a 13% increase in exercise relative to planning, reminders and microincentives (b = 0.21, P = 0.020). Fourth, rewarding participants with a bonus of 225 points (US$0.16) for returning to the gym after a missed workout produced an estimated 0.34 more weekly gym visits per participant (a 23% increase in exercise) compared with the placebo control (b = 0.34, P < 0.001). This condition produced a 12% increase in exercise relative to planning, reminders and microincentives (b = 0.20, P = 0.022). Fifth, allowing participants to choose whether their rewards for gym visits would be framed as gains (such that they would earn points each day that they visited the gym) or losses (such that they would lose points each day that they did not visit the gym) produced an estimated 0.28 more weekly gym visits per participant (a 19% increase in exercise) compared with the placebo control (b = 0.28, P < 0.001). This condition produced a 9% increase in exercise relative to planning, reminders and microincentives (b = 0.15, P = 0.021). Note that, in different conditions, points had different cash values (Supplementary Table 1).

Enduring effects of study conditions

Although 45% of the experimental conditions in our megastudy outperformed the placebo control condition during our four-week intervention, only 8% produced significant increases in the frequency of gym visits during the four weeks post-intervention, compared with 2.5% that would be expected to do so by chance (Extended Data Table 9). An F-test enabled us to reject the null hypothesis that all 53 treatments have null effects beyond the treatment period (F = 1.418, P = 0.024).

Focusing on the 45% of interventions that outperformed the placebo control during the four-week intervention period, each extra gym visit that was generated during the four-week intervention period corresponded to between −0.07 and 0.76 extra gym visits during the ten weeks post-intervention (median = 0.354 extra gym visits post-intervention, 25th percentile = 0.085 extra gym visits post-intervention, 75th percentile = 0.522 extra gym visits post-intervention; Supplementary Table 5). We also pooled data from these interventions into a single category and estimated that they generated a mean of 0.30 extra gym visits during the 10-week post-intervention period for every additional gym visit that they produced during the four-week intervention (skew-corrected 95% CI = 0.13–0.54; see Supplementary Information 3 for details). These post-intervention returns are consistent with those from previous studies of gym attendance and habit formation [3–6], in which analogous returns range from 0.16 to 0.46 extra gym visits post-intervention for every extra gym visit induced during the intervention (Supplementary Table 5).

By selecting on the basis of those interventions that increased exercise significantly during the four-week intervention period, we focused on experimental conditions that will be of the greatest interest to policy makers, but we also probably overstate their post-intervention effects due to the winner’s curse. To address this, we pooled data from all 53 experimental conditions into a single category. We estimate that interventions in our study generated a mean of 0.28 extra gym visits during the 10-week post-intervention period for every additional gym visit that they produced during the four-week intervention (skew-corrected 95% CI = 0.07–0.59).

Prediction accuracy

One could argue that the harder it is to predict the results of experiments, the more valuable the megastudy approach. The more difficult it is to forecast ex ante which interventions will work, the harder it is to decide in advance which interventions to prioritize for testing, and the more useful it is to instead test a large number of treatment approaches.

To assess forecasting accuracy, we conducted a series of separate preregistered studies (see the ‘Data availability’ section) in which third-party observers were asked to predict the impact of three randomly selected interventions from our megastudy. We collected these data 14 months after conducting our megastudy. One study included 301 participants recruited from Prolific (who made a total of 903 predictions, or a mean of 17 predictions per treatment condition); another included 156 professors from the top 50 schools of public health as rated by U.S. News & World Report in 2019 (who made a total of 468 predictions, or a mean of 9 predictions per treatment condition; a list of schools is provided in Supplementary Information 11); and a final study included 90 practitioners recruited from companies that specialize in applied behavioural science (who made a total of 270 predictions, or a mean of 5 predictions per treatment condition). See the ‘Prediction study participants’ section in the Methods for demographic information about the study participants.
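The prediction counts above follow from each forecaster rating three interventions. The short sketch below reproduces that arithmetic, assuming the per-condition means are computed over the 53 experimental conditions (an inference from the reported figures, not something the text states explicitly).

```python
N_CONDITIONS = 53               # assumed divisor: experimental conditions forecast
PREDICTIONS_PER_FORECASTER = 3  # each observer rated three random interventions

samples = {"Prolific": 301, "professors": 156, "practitioners": 90}

summary = {}
for name, n_forecasters in samples.items():
    total = n_forecasters * PREDICTIONS_PER_FORECASTER
    # Store the total and the rounded mean number of predictions per condition.
    summary[name] = (total, round(total / N_CONDITIONS))

for name, (total, per_condition) in summary.items():
    print(f"{name}: {total} predictions, about {per_condition} per condition")
```

The small per-condition counts, especially for practitioners, are worth keeping in mind when reading the sample-level correlations that follow.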

We found no robust correlations (weighted pooled r = 0.02, P = 0.89) between these populations’ estimated treatment effects and observed treatment effects (Prolific participants r = 0.25, P = 0.07; professors r = −0.07, P = 0.63; practitioners r = −0.18, P = 0.19). Furthermore, predictions about the benefits of our interventions were a mean of 9.1 times too optimistic (Fig. 1b). Predictions of treatment effects for our secondary dependent variable (the likelihood of making a gym visit in a week) were similarly inaccurate and are presented in Supplementary Information 11.
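One way to compute a pooled correlation that weights observations unequally is a weighted Pearson coefficient. The sketch below is a generic implementation, not the authors' analysis code; how the study's sample weights were constructed is described only in the Supplementary Information, and the data and weights used here are hypothetical.

```python
from math import sqrt


def weighted_pearson(x, y, w):
    """Pearson correlation between x and y with non-negative weights w."""
    total = sum(w)
    mean_x = sum(wi * xi for wi, xi in zip(w, x)) / total
    mean_y = sum(wi * yi for wi, yi in zip(w, y)) / total
    cov = sum(wi * (xi - mean_x) * (yi - mean_y) for wi, xi, yi in zip(w, x, y))
    var_x = sum(wi * (xi - mean_x) ** 2 for wi, xi in zip(w, x))
    var_y = sum(wi * (yi - mean_y) ** 2 for wi, yi in zip(w, y))
    # The common 1/total factor cancels between numerator and denominator.
    return cov / sqrt(var_x * var_y)


# Hypothetical predicted and observed effects with made-up weights.
predicted = [0.5, 1.0, 1.5, 2.0]
observed = [0.1, 0.2, 0.3, 0.4]
weights = [1.0, 2.0, 1.0, 2.0]
r = weighted_pearson(predicted, observed, weights)
print(f"weighted r = {r:.2f}")
```

With equal weights this reduces to the ordinary Pearson correlation, so the same function covers the per-sample coefficients as well.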

Taken together, these results highlight how difficult it is to predict ex ante the efficacy of interventions and why it is therefore so valuable that megastudies enable the synchronous testing of many different approaches to changing behaviour.

Conclusions

The megastudy paradigm enables apples-to-apples comparisons of dozens of different behaviour change interventions, each designed by an independent scientific team. If we had tested only one or two interventions (as is typical in behavioural science research [27,28]), we probably would not have picked many top performers and would have failed to gain valuable new insights. Relatedly, few of the 20 preregistered studies embedded within our megastudy yielded results that were consistent with their preregistered hypotheses. The megastudy paradigm ensures that all results, including null results, are published and that insights can still be gleaned from comparing treatments across studies, as illustrated both by this megastudy and a follow-up megastudy testing the best strategies for nudging vaccination [29].

The megastudy paradigm has limitations. First, the insights of a megastudy depend on the strength of the included interventions. In the current demonstration, it is probable that more extensive interaction (such as in-person coaching) or greater financial incentives would have produced larger treatment effects [3–6,18]. Second, constraining scientists to a specific sample, dependent variable and timeframe arguably limits creativity in intervention design. Third, the effect sizes of top-performing interventions in megastudies will typically be overestimated, whereas the effect sizes of the worst-performing interventions in megastudies will typically be underestimated due to noise and mean reversion [30]. Replicating the effects of outlier interventions identified in megastudies will therefore be important for establishing their true impact.

Regarding contexts that are especially well-suited for megastudies, one prerequisite is a sufficiently large population for testing more than a handful of interventions with adequate statistical power. Furthermore, as is the case with any study intended to influence policy, a cost–benefit analysis should suggest that, if tested interventions yield plausible treatment effects, deploying those interventions widely would be a wise investment. For example, our use of microincentives in this megastudy (rather than the substantially larger incentives that have been proven impactful in previous gym studies) was informed by cost-effectiveness calculations that suggested that large incentives could not be justified by the expected treatment effects and the value of exercise to society (Supplementary Information 3 and 4). Furthermore, as megastudies add value to policy-makers by separating the wheat from the chaff, they are especially valuable when the targeted behaviour is of unambiguous consequence to individual and societal wellbeing. Finally, as megastudies reduce the downside of individual study failures, they may create incentives for scientists to design interventions with a low probability of a notable result, so they may be well-suited to environments where risk-taking could have a particularly large upside.

By enabling direct comparisons of diverse intervention ideas, megastudies can accelerate the generation and testing of new insights about human behaviour and the relevance of these insights for public policy.

Methods

Ethics approval

The Institutional Review Board at the University of Pennsylvania approved our study’s protocols, and this research was deemed to comply with all of the relevant ethical regulations. Informed consent was obtained from all of the study participants as part of the enrolment process. The reference number for the field experiment was 827107 and the reference number for the prediction accuracy studies was 833336.

Megastudy setting

We conducted our megastudy in partnership with 24 Hour Fitness, one of the largest gym chains in the United States. At the time of the study, 24 Hour Fitness had over four million members and 450 gym locations in 14 states (although some members of 24 Hour Fitness reside in states without a 24 Hour Fitness location, so our study participants came from more than 14 US states). The cost of a basic membership at 24 Hour Fitness varies by location, but ranges from approximately US$30 to US$60 per month. Members check in to 24 Hour Fitness gyms by either (1) giving their ID to a staff member at the front desk, (2) swiping or scanning a member card or (3) using a fingerprint reader and unique check-in code. We used 24 Hour Fitness check-in data to track gym attendance.

Participant recruitment and enrolment

All of the approximately 4 million adult members of 24 Hour Fitness gyms whose memberships were active between 21 March 2018 and 31 January 2019 were eligible to participate. Recruitment involved a multichannel marketing campaign advertising “a habit-building, science-based workout program” called StepUp, and 24 Hour Fitness members could sign up online anytime between 21 March 2018 and 31 January 2019. All of the recruitment materials informed members that they could sign up for free for the StepUp Program and earn Amazon cash rewards for exercising. Members were also told that they would earn a chance to receive a US$50 Amazon gift card by simply registering for the programme. Three participants were randomly selected to receive a US$50 gift card.

All of the recruitment materials included a URL that directed gym members to the StepUp Program website, which conveyed that StepUp was a 28-day digital experience being offered exclusively to 24 Hour Fitness members. Participants who visited the StepUp Program website were first prompted to consent to participate in research. Participants then provided their gym check-in code and date of birth to verify their gym membership. Finally, participants were prompted to provide their name, email address and phone number, and they were required to verify that their phone could receive text messages from StepUp (details are provided in the ‘Registration experience’ section of the Supplementary Information). After verifying that they could receive text messages, the participants were randomly assigned to one of twenty different preregistered substudies (all involving different versions of the StepUp Program) aimed at increasing gym visit frequency, and they were then randomly assigned to one of the 54 different experimental conditions within these studies. Participants were blind to study hypotheses.

Our initial, preregistered recruitment goal was to include at least 3,000 participants per experimental condition in our megastudy. However, shortly after launching recruitment, it became apparent that this would take nearly a decade. As a consequence, we updated our preregistrations early in the 10-month study to reflect a more realistic stopping rule of recruiting at least 400 participants per condition.

In total, 62,746 participants were randomized to one of the 54 study conditions in our megastudy, with at least 455 participants in each condition (Extended Data Table 4). Participants were excluded from analyses if they requested to withdraw (n = 123), signed up more than once for the StepUp Program (n = 355) or experienced severe technology glitches (n = 975). Further details about these exclusions are provided in Supplementary Information 9 and 10.

Thus, our final sample includes n = 61,293 study participants. 24 Hour Fitness shared a record of every gym visit made by study participants starting one year before each participant’s enrolment in the programme and continuing until one year after each participant’s programme participation concluded (for a total of 758 d of observations per participant).
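The sample accounting above can be checked with a few lines of arithmetic. The sketch below assumes that the three exclusion categories do not overlap and that the 758 d window is made up of 365 d before enrolment, the 28 d programme and 365 d afterwards; both assumptions are ours and are merely consistent with the totals reported in the text. The function names are illustrative.

```python
# Counts reported in the text; the non-overlap of categories is an assumption.
RANDOMIZED = 62_746
EXCLUSIONS = {"withdrew": 123, "duplicate sign-up": 355, "technology glitch": 975}


def final_sample(randomized, exclusions):
    """Participants remaining after removing every excluded participant."""
    return randomized - sum(exclusions.values())


def observation_days(pre_days=365, programme_days=28, post_days=365):
    """Length of the observation window per participant (assumed decomposition)."""
    return pre_days + programme_days + post_days


if __name__ == "__main__":
    print(final_sample(RANDOMIZED, EXCLUSIONS))
    print(observation_days())
```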

As reported in Extended Data Table 6 and Supplementary Information 1 and 7, balance checks suggest that randomization was successful. As we obtained informed consent to analyse data on study participants only, we unfortunately cannot determine how representative our final sample is of the 24 Hour Fitness membership.

Megastudy intervention content

After enrolling, participants in all 54 conditions of our megastudy were shown descriptions of the StepUp Program. All of the participants learned that they would receive points during the intervention period that were redeemable for an Amazon gift card after they completed the intervention. Participants in the 53 experimental conditions (that is, every condition except for the placebo control condition) received 100 points for registering and learned how they could earn incentives (through points that were redeemable for an Amazon gift card at the conclusion of the programme; notably, the conversion rate differed by experimental condition). Most conditions awarded points for gym visits. A number of the conditions offered additional bonuses based on the time of a participant’s gym visit or other observable behaviours (such as responding to text messages). Complete information about study stimuli and incentives in each condition is provided in the ‘Descriptions of the 54 conditions in the megastudy’ section of the Supplementary Information.

In 53 experimental conditions (all of the conditions except for the placebo control condition), the participants were prompted to create a weekly schedule of the days and times that they planned to work out during the four-week programme. The registration experience for the experimental conditions also included other content specific to the study condition (such as survey questions, instructions, images and videos). At the conclusion of the registration experience, all of the participants were informed that their four-week programme started the next day.

Participants across all 54 study conditions received a welcome text message shortly after they completed enrolment confirming the points that they received for registering, as well as a final text message on the last (28th) day of the programme confirming the programme’s end.

In all 53 experimental conditions, the participants received workout reminders by text 30 min before each scheduled workout (the language of these texts varied across conditions); most of the experimental conditions included additional text messages reinforcing intervention content. Moreover, the participants in all 53 experimental conditions received an email shortly after registration and once a week thereafter for four weeks. Each email confirmed the workout schedule that they had created and reinforced study-specific content.

The simplest experimental condition was the planning, reminders and microincentives to exercise condition. This condition included components that have previously been shown to increase exercise: prompts to plan workouts, reminders to exercise at planned times and microincentives for gym visits (ref. 6). The study participants in this condition were prompted to create a weekly workout schedule after registering for StepUp. Over the next four weeks, the participants received text message reminders before each scheduled gym visit, weekly emails containing their workout schedules and 300 points (worth a total of US$0.22) each time they visited the gym that were redeemable for an Amazon gift card at the conclusion of the study.

To develop our study’s 52 other experimental conditions, members of an interdisciplinary group of 34 scientists who study behaviour change were invited to independently submit designs (‘tournament’ entries) along with additional collaborators of their choosing, and submissions were then revised in partnership with the project’s principal investigators (a process that required extensive coordination). The first and last author invited all of the scientists affiliated with the University of Pennsylvania’s Behavior Change for Good Initiative (BCFG) to contribute submissions, and the 23 affiliated scientists who submitted study designs brought 13 of their own collaborators and graduate students to the project.

The participants in the placebo control condition received 1,500 points (US$1.08) when they signed up for our programme. This value was equivalent to the expected earnings of participants in our planning, reminders and microincentives condition, which was determined by calculating the mean historical gym attendance of the 24 Hour Fitness members and the point values that participants would earn in the planning, reminders and microincentives condition if they attended the gym at this frequency (100 points for registering and 300 points per gym visit × 1.17 expected gym visits per week for 4 weeks = 1,500 expected points). The participants in the placebo control condition did not create a workout schedule or receive any additional intervention content.
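The expected-earnings calculation behind the placebo payment can be written out as a short worked example. The point values and the visit rate come from the text; the helper name is illustrative, and rounding the result to the nearest 100 points is our assumption about how the stated figure of 1,500 was reached.

```python
# Values taken from the text above.
REGISTRATION_POINTS = 100
POINTS_PER_VISIT = 300
EXPECTED_VISITS_PER_WEEK = 1.17
PROGRAMME_WEEKS = 4


def expected_points(registration=REGISTRATION_POINTS,
                    per_visit=POINTS_PER_VISIT,
                    visits_per_week=EXPECTED_VISITS_PER_WEEK,
                    weeks=PROGRAMME_WEEKS):
    """Expected points for a participant attending at the historical mean rate."""
    return registration + per_visit * visits_per_week * weeks


if __name__ == "__main__":
    raw = expected_points()      # 100 + 300 x 1.17 x 4, about 1,504 points
    rounded = round(raw, -2)     # assumed rounding to the nearest 100 points
    print(raw, rounded)
```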

The other 52 experimental conditions in our megastudy involved augmentations to our planning, reminders and microincentives to exercise condition designed by scientists affiliated with BCFG. Scientists were invited to vary the (1) online registration experience delivered immediately after participants completed study enrolment, (2) text messages and emails sent during the four-week programme and (3) incentives for activities completed during the programme.

Megastudy randomization

The 54 conditions in our megastudy comprised 20 separate preregistered studies (links to all study preregistrations are provided in the ‘Full descriptions of each study condition’ section of the Supplementary Information). To offset the risk of underpowering all studies if we failed to reach our recruitment targets, megastudy participants were randomized using a weighted, time-varying algorithm as follows. At any given time, the plurality of participants (40–60%) was assigned with equal probability to conditions within one of the 20 studies noted above (the target study), 5% of participants were assigned to our placebo control condition and the remaining participants were randomly assigned with equal probability to treatment conditions in the other 19 studies. The randomization algorithm switched to a different target study after a predetermined number of participants enrolled, and this happened 26 times, creating 27 megastudy ‘stratification cohorts’. Our data analyses are weighted to account for these 27 different stratification cohorts, as described below. More details on randomization weighting are included in Supplementary Information 8.
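The weighted assignment rule for a single stratification cohort can be illustrated with a minimal sketch. This is not the code used in the study: the function names, the fixed 50% target share (the text reports a range of 40–60%) and the reading of 'equal probability' as equal across all non-target treatment conditions are assumptions made for illustration only.

```python
import random


def assignment_probabilities(studies, target, target_share=0.5, placebo_share=0.05):
    """Per-condition assignment probabilities within one stratification cohort.

    studies: dict mapping study name -> list of treatment condition names.
    target:  name of the study that is currently prioritised.
    """
    probs = {"placebo": placebo_share}
    target_conditions = studies[target]
    for cond in target_conditions:
        probs[cond] = target_share / len(target_conditions)
    # Remaining probability mass is spread evenly over all other treatment conditions.
    others = [c for name, conds in studies.items() if name != target for c in conds]
    remaining = 1.0 - target_share - placebo_share
    for cond in others:
        probs[cond] = remaining / len(others)
    return probs


def assign(probs, rng):
    """Draw one condition according to the cohort's probabilities."""
    conditions = list(probs)
    return rng.choices(conditions, weights=[probs[c] for c in conditions], k=1)[0]


if __name__ == "__main__":
    toy_studies = {"study_A": ["A1", "A2"], "study_B": ["B1"], "study_C": ["C1", "C2", "C3"]}
    p = assignment_probabilities(toy_studies, target="study_A")
    print(p, assign(p, random.Random(0)))
```

Switching the `target` argument after a fixed number of enrolments would reproduce the cohort structure described above, with each switch starting a new stratification cohort.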

Megastudy statistical analysis

Each of the 20 studies in our megastudy was preregistered on the Open Science Framework (details are provided in the ‘Data availability’ section). For analyses of our megastudy, we scaled up our standard, preregistered regression analysis strategy (including all of the study conditions in one giant regression model) to identify which of the 53 conditions across all 20 preregistered studies increased the frequency of gym visits during our intervention relative to our placebo control condition.

Although all 20 of the substudies in this megastudy were preregistered, the megastudy itself was not. This was an oversight on our part. We had planned to publish analyses on the totality of preregistered substudies within our megastudy, which is why we used a weighted random assignment scheme rather than sequential random assignment. Preregistering the individual substudies obviated concerns about selective inclusion of treatment arms in substudy analyses. We recommend that future megastudies are preregistered themselves.

To identify which experimental conditions were effective at increasing the frequency of gym visits during our megastudy’s four-week intervention period, we evaluated the mean estimated effect of each of the 53 experimental conditions compared with the placebo control condition. We used OLS regressions and weighted observations to account for the different probabilities of assignment across stratification cohorts.

Specifically, we used an OLS regression with participant fixed effects to estimate the following equation:

$$Y_{ict} = \alpha + \sum_{g=1}^{G} \beta_{g}\, d_{itg} + \delta_{ct} + v_{i} + \varepsilon_{ict},$$

where $Y_{ict}$ is the outcome (that is, gym attendance) of participant $i$ from stratification cohort $c$ in week $t$, $\alpha$ is a constant, $d_{itg}$ is an indicator for both whether participant $i$ is in experimental condition $g$ and whether week $t$ is during the intervention period, $\beta_{g}$ is the effect of experimental condition $g$ during the intervention period, $\delta_{ct}$ is a cohort-by-week fixed effect, $v_{i}$ is a participant fixed effect and $\varepsilon_{ict}$ is a random error term. $G$ is the number of treatment conditions in the analysis (53 when estimating the treatment effect of experimental conditions relative to the placebo control reference group). We estimate the cohort-by-week fixed effects by including cohort-by-week indicator variables in the regression. To account for clustering, we estimated cluster-robust standard errors that allowed for arbitrary correlations of the error term within individuals over time (ref. 31). This regression estimates the treatment effect of experimental condition $g$ relative to the reference group (either the placebo control, or the planning, reminders and microincentives treatment) across all of the cohorts. Participant fixed effects are not collinear with the indicators for whether an individual is in an experimental condition during the intervention period ($d_{itg}$): although each individual can be in only one condition (which would normally create collinearity), our model includes data on participants’ preintervention gym visits for up to 52 weeks (fewer weeks are included when fewer are available for new gym members), so $d_{itg}$ varies over time within each participant.
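To build intuition for why preintervention weeks identify the treatment effect, consider the simplest special case: a balanced panel with one treatment condition, one reference condition and a common intervention start week. In that case, the coefficient from a regression with participant and week fixed effects equals a difference-in-differences of cell means. The sketch below illustrates this special case only; it omits sample weights, multiple conditions, cohort-by-week effects and cluster-robust standard errors, and the function name and toy data are hypothetical.

```python
from statistics import mean


def did_estimate(rows):
    """Difference-in-differences of mean weekly gym visits.

    rows: list of (treated, during_intervention, visits) tuples, one per
    participant-week, from a balanced panel with a common start week.
    """
    def cell_mean(treated, during):
        return mean(v for t, d, v in rows if t == treated and d == during)

    change_treated = cell_mean(True, True) - cell_mean(True, False)
    change_reference = cell_mean(False, True) - cell_mean(False, False)
    return change_treated - change_reference


if __name__ == "__main__":
    # Toy panel: 4 participants, 6 weeks, weeks 4-5 form the intervention period.
    rows = []
    for person in range(4):
        treated = person < 2
        for week in range(6):
            during = week >= 4
            visits = 1.0 + 0.2 * person + 0.1 * week   # person and week effects
            if treated and during:
                visits += 0.5                          # true treatment effect
            rows.append((treated, during, visits))
    print(did_estimate(rows))
```

Because the participant-specific and week-specific components cancel in the double difference, the estimate recovers the built-in effect of the toy data; without preintervention weeks, the first difference could not be formed and the treatment indicator would be absorbed by the participant effect.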

To adjust for the compositional differences across cohorts, we weighted each observation such that each condition is equally weighted within a cohort, and each cohort is weighted proportionally to the length of the cohort in days. This weighting, along with the inclusion of individual and cohort-by-week fixed effects described above, accounts for differences in cohort assignment and seasonality and ensures that our regression produces unbiased estimates of treatment effects. By design, the probability of assignment to each study condition differs by cohort, which would produce unbalanced estimates without the use of sample weighting and fixed effects in our regression specification. Thus, we included sample weights that ensure that, for each cohort, each experimental group is equally represented such that the estimates are equivalent to those from an experiment with equal probabilities of assignment and are therefore balanced estimates. Furthermore, to control for chance imbalances and improve statistical precision, our models include individual fixed effects and cohort-by-week fixed effects. As cohorts were determined by when participants signed up for the StepUp Program, these fixed effects should absorb any remaining seasonal variation in gym attendance. Our simulations, which are presented in the ‘Simulation to ensure validity of analyses’ section of the Supplementary Information, show that this approach yields unbiased estimates of the mean treatment effects and our balance tests reveal that experimental groups do not systematically differ in ways that could lead to biases in our estimates (details about our weighting strategy are provided in Supplementary Information 8). We rely on this statistical analysis strategy for additional regression analyses presented in Supplementary Information 5 and 6.
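One way to construct weights with the two stated properties (equal weight per condition within a cohort, and cohort weight proportional to cohort length in days) is sketched below. The exact formula used in the study is described in Supplementary Information 8; the normalization chosen here (weights summing to 1) and all names are assumptions for illustration.

```python
from collections import Counter


def observation_weights(participants, cohort_days):
    """Return one weight per participant.

    participants: list of (cohort, condition) pairs, one per participant.
    cohort_days:  dict mapping cohort -> cohort length in days.
    """
    cell_counts = Counter(participants)                       # people per (cohort, condition)
    conditions_per_cohort = Counter(c for c, _ in cell_counts)  # distinct conditions per cohort
    total_days = sum(cohort_days.values())
    weights = []
    for cohort, condition in participants:
        cohort_share = cohort_days[cohort] / total_days
        cell_share = cohort_share / conditions_per_cohort[cohort]
        # Split the cell's share evenly among the people actually assigned to it.
        weights.append(cell_share / cell_counts[(cohort, condition)])
    return weights


if __name__ == "__main__":
    people = [("c1", "A")] * 3 + [("c1", "B")] + [("c2", "A")] * 2 + [("c2", "B")] * 2
    print(observation_weights(people, {"c1": 10, "c2": 30}))
```

Dividing by the number of people actually assigned to each cell, rather than the intended number, mirrors the choice described in the text for handling assignment errors.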

Approximately 6.6% of the megastudy participants were not assigned to the experimental condition that they were intended to experience according to a predefined randomization matrix due to a bug that manifested when there was heavy traffic on our website (leading to occasional skips or repeats in the conditions to which subsequent participants were assigned). Our weighting accounts for this error because it is based on the number of people who were actually assigned to each condition within a cohort, rather than the number of people to whom we intended to assign each condition within a cohort. Analyses based on the intended condition assignment are provided in the Supplementary Information (see Supplementary Information 5a–g for robustness checks) and provide very similar results to those presented here.

In addition to estimating treatment effects during the four-week StepUp Program, we also estimated treatment effects during the four-week post-intervention period. To measure the mean estimated effect of experimental conditions on post-intervention gym attendance, we ran a similar regression with an additional indicator term for the post-intervention period:

$$Y_{ict} = \alpha + \sum_{g=1}^{G} \beta_{1g}\, d_{itg} + \sum_{g=1}^{G} \beta_{2g}\, p_{itg} + \delta_{ct} + v_{i} + \varepsilon_{ict},$$

Here, $p_{itg}$ is an indicator for both whether participant $i$ is in experimental condition $g$ and whether week $t$ is during the four-week post-intervention period, $\beta_{1g}$ is the mean effect of experimental condition $g$ during the intervention period, $\beta_{2g}$ is the mean effect of experimental condition $g$ during the four-week post-intervention period and all of the other variables are as defined above.

Across all analyses, to identify the most effective interventions, we conducted Wald tests to compare effects across all of the experimental conditions. Specifically, each Wald test assessed the null hypothesis that the estimated treatment effect of experimental condition $g$ ($\beta_{g}$) minus the estimated treatment effect of experimental condition $k$ ($\beta_{k}$) equalled 0.
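For a single linear restriction such as $\beta_{g} - \beta_{k} = 0$, the Wald statistic has a simple closed form. The sketch below assumes that the two coefficient estimates, their variances and their covariance are available from the fitted regression, and that the statistic is compared against a chi-squared distribution with one degree of freedom; the function name and the example inputs are illustrative and are not values from the study.

```python
import math


def wald_test_difference(beta_g, beta_k, var_g, var_k, cov_gk=0.0):
    """Wald test of H0: beta_g - beta_k = 0.

    Returns (W, p), where W = (beta_g - beta_k)^2 / Var(beta_g - beta_k)
    and p is the upper-tail probability of a chi-squared(1) distribution.
    """
    diff = beta_g - beta_k
    var_diff = var_g + var_k - 2.0 * cov_gk
    w = diff ** 2 / var_diff
    # For one degree of freedom, P(chi2 > w) = erfc(sqrt(w / 2)).
    p = math.erfc(math.sqrt(w / 2.0))
    return w, p


if __name__ == "__main__":
    # Hypothetical inputs chosen only to exercise the function.
    print(wald_test_difference(0.40, 0.10, 0.01, 0.01, cov_gk=0.002))
```

The covariance term matters here because all coefficients come from one regression with a shared reference group, so the estimates are generally correlated.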

Prediction study participants

Study 1: lay participants.

We recruited 301 workers from Prolific to answer questions about different gym programmes in exchange for US$1.25. Participants each made predictions about the effects of three experimental conditions from our megastudy, producing a total of 903 predictions and a mean of 17 predictions per condition. The participants had the following demographic characteristics: mean age = 30.8 (s.d. = 10.5); 55% female; mean years of work experience = 10.9 (s.d. = 9.8); 66% reported having a gym membership in the past 10 years; degree level: high school or less = 11.3%, some college = 28.9%, associate’s degree = 9.6%, bachelor’s degree = 38.9%, master’s, doctoral or professional degree = 11.3%. This study was preregistered and the preregistration is available in the ‘Data availability’ section.

Study 2: public health school faculty.

We recruited faculty members from the top 50 public health schools according to the 2019 U.S. News & World Report to participate in this study. We contacted 1,037 faculty members (assistant, associate or full professors) from the department in each of the schools that most closely aligned with behavioural health (such as social and behavioural sciences, health promotion and behaviour, exercise science and health policy). If there was not a relevant department listed, we selected faculty members on the basis of whether one of their listed areas of expertise fell under behavioural health. Faculty members were emailed with a request to complete a short survey to identify techniques that scientists believe effectively promote exercise. They were offered a chance to win a US$50 Amazon gift card and provided with a link to our survey; a reminder email was sent 3 d later.

A total of 156 faculty members (mean age = 48.3, s.d. = 10.7; 68% female; academic title: assistant professor = 35.9%, associate professor = 39.1%, full professor = 25.0%; 79% reported having a gym membership in the past 10 years; research expertise: health education = 13.5%, health policy = 11.5%, mental health = 12.2%, nutrition = 9.6%, physical activity = 10.9%, other = 42.3%) responded to our survey. They made a total of 465 predictions about the effects of experimental conditions from our megastudy, giving a mean of 9 predictions per experimental condition. The study was preregistered and the preregistration is available in the ‘Data availability’ section.

Study 3: behavioural science practitioners.

We recruited practitioners at leading for-profit and non-profit organizations with a specialty in the application of behavioural science to real world issues to participate in this study. Leaders at 15 different organizations were emailed a request to forward an invitation to participate in a short survey to their colleagues on a strictly volunteer basis. The email described the survey as asking for predictions about the efficacy of a random sample of three nudges designed to increase gym visits. A total of 90 practitioners (mean age = 33.2, s.d. = 7.2; 62% female; 85% reported having a gym membership in the past 10 years; mean years of work experience = 10.1, s.d. = 7.6; 71% reported a degree in behavioural science; reported frequency of using behavioural science at work: every day: 69.7%, often: 16.9%, sometimes: 10.1%, rarely: 2.3%, never: 1.1%) responded to our survey. They made a total of 270 predictions about the effects of the experimental conditions from our megastudy, giving a mean of 5 forecasts per experimental condition. The study was preregistered and the preregistration is available in the ‘Data availability’ section.

Prediction study content

Before beginning the survey (which was the same for all participant populations with the exception of the demographic questions asked at the end), potential participants were screened out if they reported being familiar with any of the results from the megastudy (which were featured on an episode of the Freakonomics Radio podcast; ref. 32). The participants were first shown an overall description of the StepUp Program, and they were then asked to compare three of the megastudy’s experimental conditions with the placebo control condition (one at a time). The three conditions that the participants reviewed were randomly selected from the megastudy’s 53 experimental conditions and were presented in a random order.

For each experimental condition that they were prompted to examine, the participants were presented with a summary table comparing the key features of the experimental condition with the placebo control condition. The participants next viewed screenshots of the registration experience and a summary of the text messages and emails sent during the programme in both the experimental condition and the placebo control condition. Sample stimuli comparing the planning, reminders and microincentives to exercise condition with the placebo control condition are available in Prediction Study Stimuli on the Open Science Framework (https://osf.io/kyt7d/?view_only=8bb9282111c24f81a19c2237e7d7eba3). The participants were informed of how many days per week an average participant in the placebo control condition visited the gym during the StepUp Program as well as how likely a participant was to visit the gym in a given week, on average, in the placebo control condition. The participants were then asked to forecast the average number of days per week that gym members would visit the gym and the percentage of the time that members would visit the gym at least once in a given week in the StepUp Program experimental condition that they had just reviewed. Specifically, participants answered these two questions:

  1. On average, how many days per week do you think members in the enhanced version of StepUp went to the gym? (For reference, people in the basic version went to the gym 1.5 days per week.)

  2. In an average week, what percent of the time do you think members in the enhanced version of StepUp made it to the gym? (For reference, in a given week, members in the basic version of StepUp made it to the gym at least once 57% of the time)

For each study, our key dependent variable was the predicted increase in gym attendance induced by a given experimental condition (compared with the placebo control condition). To determine the extra number of gym visits per week that a participant predicted a condition would induce, we subtracted the placebo control condition’s mean of 1.5 d of gym visits per week from the participants’ estimated total weekly gym visits for a given experimental condition (the possible range of values was −1.5 to 5.5, as weeks include only 7 d). To determine the added likelihood of visiting the gym at least once in a given week that a participant predicted a condition would induce, we subtracted the placebo control condition’s mean visit likelihood of 57% from the participants’ estimated weekly visit likelihood for a given experimental condition (the possible range of values was −57% to 43% as the maximum likelihood was 100%). As any weekly gym attendance is not our primary focus, we present these results in Extended Data Fig. 1, Extended Data Tables 1–3 and 7 and Supplementary Information 2. Finally, we computed an unweighted correlation between the actual regression-estimated change in gym attendance induced by a given experimental condition in our megastudy (see estimates in Extended Data Tables 6 and 7) and the mean predicted change in gym attendance induced by that same experimental condition.
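The two derived quantities described above (the predicted change relative to the placebo control, and the unweighted correlation between measured and mean predicted changes) can be sketched as follows. The reference values of 1.5 d and 57% come from the survey questions; the function names are illustrative, the example numbers are hypothetical, and the use of a Pearson correlation is our assumption, as the text specifies only an unweighted correlation.

```python
from statistics import mean

PLACEBO_DAYS_PER_WEEK = 1.5       # reference value shown to forecasters
PLACEBO_WEEKLY_VISIT_PCT = 57.0   # reference value shown to forecasters


def predicted_increase_days(forecast_days):
    """Predicted extra gym days per week relative to the placebo control."""
    return forecast_days - PLACEBO_DAYS_PER_WEEK


def predicted_increase_pct(forecast_pct):
    """Predicted extra percentage points of weekly attendance likelihood."""
    return forecast_pct - PLACEBO_WEEKLY_VISIT_PCT


def pearson(xs, ys):
    """Unweighted Pearson correlation between two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5


if __name__ == "__main__":
    # Hypothetical numbers, one entry per experimental condition.
    measured = [0.40, 0.10, 0.25]
    forecasts = [[2.0, 2.5], [1.8, 1.6], [2.2, 2.0]]   # raw forecasts per condition
    predicted = [mean(predicted_increase_days(f) for f in fs) for fs in forecasts]
    print(pearson(measured, predicted))
```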

Extended Data

Extended Data Fig. 1 | Measured vs. predicted change in likelihood of gym visit in a given week.

The measured change (blue) vs. change predicted by third-party observers (gold) in whether participants visited the gym that was induced by each of our megastudy’s 53 experimental conditions compared to a Placebo Control condition during a four-week intervention period is depicted here. Error bars represent 95% confidence intervals. See Extended Data Table 7 for complete OLS regression results graphed here in blue, Supplementary Information 11 for more details about the prediction data graphed here in gold, and Supplementary Table 1 for full descriptions of each treatment condition in our megastudy. Sample weights were included in the pooled third-party prediction data to ensure equal weighting of each of our three participant samples (professors, practitioners and Prolific respondents). The superscripts a–e denote the different incentive amounts offered in different versions of the bonus for returning after missed workouts, higher incentives and rigidity rewarded conditions, which are described in Supplementary Table 1. In conditions with the same name, superscripts that come earlier in the alphabet indicate larger incentives.

Extended Data Table 1 | Regression-estimated effects of each experimental condition on whether participants visited the gym in a given week during the four-week intervention period relative to the Planning, Reminders and Micro-Incentives to Exercise condition

| Experimental Condition | b | SE | p-value | N |
|---|---|---|---|---|
| 03. Exercise Social Norms Shared (High and Increasing) | 0.071 | 0.026 | 0.006 | 798 |
| 02. Higher Incentives^a | 0.068 | 0.021 | 0.001 | 1,750 |
| 09. Free Audiobook Provided, Temptation Bundling Explained | 0.068 | 0.025 | 0.007 | 1,685 |
| 06. Planning Fallacy Described and Planning Revision Encouraged | 0.053 | 0.041 | 0.200 | 811 |
| 35. Reflecting on Workouts Encouraged | 0.051 | 0.026 | 0.051 | 517 |
| 01. Bonus for Returning after Missed Workouts^b | 0.050 | 0.024 | 0.038 | 1,633 |
| 11. Fitness Questionnaire with Decision Support & Cognitive Reappraisal Prompt | 0.047 | 0.031 | 0.123 | 825 |
| 05. Bonus for Returning after Missed Workouts^a | 0.045 | 0.027 | 0.099 | 1,719 |
| 13. Asked Questions about Workouts | 0.038 | 0.026 | 0.147 | 1,191 |
| 20. Exercise Social Norms Shared (Low) | 0.036 | 0.023 | 0.114 | 821 |
| 12. Values Affirmation | 0.025 | 0.028 | 0.364 | 824 |
| 36. Planning Workouts Rewarded | 0.025 | 0.026 | 0.340 | 1,466 |
| 10. Following Workout Plan Encouraged | 0.025 | 0.026 | 0.338 | 805 |
| 19. Planning Revision Encouraged | 0.024 | 0.024 | 0.328 | 860 |
| 21. Exercise Encouraged with Typed Pledge | 0.023 | 0.027 | 0.382 | 849 |
| 26. Values Affirmation Followed by Diagnosis as Gritty | 0.023 | 0.024 | 0.346 | 804 |
| 33. Planning Workouts Encouraged | 0.022 | 0.024 | 0.371 | 1,499 |
| 07. Choice of Gain- or Loss-Framed Micro-Incentives | 0.021 | 0.020 | 0.294 | 1,652 |
| 08. Exercise Commitment Contract Explained | 0.020 | 0.030 | 0.504 | 810 |
| 42. Exercise Encouraged with E-Signed Pledge | 0.016 | 0.029 | 0.586 | 878 |
| 04. Free Audiobook Provided | 0.014 | 0.037 | 0.701 | 1,604 |
| 14. Rigidity Rewarded^a | 0.011 | 0.025 | 0.653 | 1,816 |
| 34. Gym Routine Encouraged | 0.009 | 0.029 | 0.755 | 820 |
| 41. Mon-Fri Consistency Rewarded, Sat-Sun Consistency Rewarded | 0.008 | 0.022 | 0.727 | 564 |
| 24. Rigidity Rewarded^e | 0.006 | 0.028 | 0.831 | 548 |
| 28. Rigidity Rewarded^c | 0.005 | 0.026 | 0.836 | 1,701 |
| 18. Fitness Questionnaire | 0.004 | 0.023 | 0.864 | 799 |
| 46. Defaulted into 1 Weekly Workout | 0.003 | 0.025 | 0.891 | 455 |
| 17. Exercise Advice Solicited | 0.003 | 0.025 | 0.903 | 749 |
| 25. Exercise Encouraged with Signed Pledge | 0.003 | 0.031 | 0.924 | 802 |
| 39. Reflecting on Workouts Rewarded | 0.002 | 0.022 | 0.927 | 469 |
| 22. Gain-Framed Micro-Incentives | 0.000 | 0.027 | 0.986 | 783 |
| 32. Exercise Encouraged | −0.001 | 0.028 | 0.973 | 806 |
| 15. Defaulted into 3 Weekly Workouts | −0.001 | 0.023 | 0.965 | 477 |
| 48. Rigidity Rewarded^d | −0.004 | 0.024 | 0.880 | 1,613 |
| 37. Effective Workouts Encouraged | −0.007 | 0.023 | 0.768 | 852 |
| 52. Exercise Advice Solicited, Shared with Others | −0.009 | 0.031 | 0.780 | 707 |
| 47. Exercise Social Norms Shared (Low but Increasing) | −0.009 | 0.026 | 0.723 | 835 |
| 31. Fitness Questionnaire with Cognitive Reappraisal Prompt | −0.011 | 0.026 | 0.680 | 868 |
| 27. Bonus for Consistent Exercise Schedule | −0.013 | 0.027 | 0.635 | 798 |
| 43. Bonus for Variable Exercise Schedule | −0.016 | 0.026 | 0.529 | 865 |
| 16. Exercise Fun Facts Shared | −0.019 | 0.027 | 0.478 | 836 |
| 53. Exercise Social Norms Shared (High) | −0.022 | 0.023 | 0.340 | 841 |
| 40. Fun Workouts Encouraged | −0.023 | 0.026 | 0.381 | 770 |
| 23. Higher Incentives^b | −0.024 | 0.027 | 0.379 | 1,910 |
| 50. Fitness Questionnaire with Decision Support | −0.024 | 0.027 | 0.374 | 893 |
| 29. Loss-Framed Micro-Incentives | −0.025 | 0.025 | 0.309 | 872 |
| 38. Planning Benefits Explained | −0.025 | 0.035 | 0.473 | 859 |
| 54. Placebo Control | −0.029 | 0.015 | 0.055 | 4,992 |
| 49. Exercise Commitment Contract Encouraged | −0.031 | 0.030 | 0.301 | 812 |
| 45. Rewarded for Responding to Questions about Workouts | −0.036 | 0.028 | 0.208 | 1,199 |
| 51. Rigidity Rewarded^b | −0.042 | 0.032 | 0.188 | 1,850 |
| 44. Exercise Commitment Contract Explained Post-Intervention | −0.056 | 0.032 | 0.074 | 828 |

Number of observations: 2,397,729
Number of participants: 61,293
R²: 0.445

The table reports the results of an ordinary least squares regression predicting whether participants visited the gym in a given week during the four-week intervention period with indicators for experimental condition during the four-week intervention period, participant fixed effects, and cohort-week interactions. Robust standard errors were clustered by participant. Observations in the regression were weighted to ensure that each condition was equally weighted within a cohort and each cohort was weighted proportionally to its length. The reference group was the Planning, Reminders, and Micro-Incentives to Exercise condition. See Table S1 in the Supplementary Information for descriptions of each experimental condition.

a, b, c, d, e: These superscripts denote the different incentive amounts offered in different versions of the Bonus for Returning after Missed Workouts, Higher Incentives, and Rigidity Rewarded conditions, which are detailed in Table S1 in the Supplementary Information. In conditions with the same name, superscripts that come earlier in the alphabet indicate larger incentives.

Extended Data Table 2 | Regression-estimated effects of each experimental condition on whether participants visited the gym in a given week during the four-week post-intervention period relative to the Placebo Control condition

| Experimental Condition | b | SE | p-value | N |
|---|---|---|---|---|
| 01. Bonus for Returning after Missed Workouts^b | 0.085 | 0.026 | 0.001 | 1,633 |
| 03. Exercise Social Norms Shared (High and Increasing) | 0.077 | 0.027 | 0.005 | 798 |
| 06. Planning Fallacy Described and Planning Revision Encouraged | 0.061 | 0.036 | 0.091 | 811 |
| 04. Free Audiobook Provided | 0.058 | 0.031 | 0.060 | 1,604 |
| 20. Exercise Social Norms Shared (Low) | 0.048 | 0.023 | 0.042 | 821 |
| 02. Higher Incentives^a | 0.046 | 0.025 | 0.065 | 1,750 |
| 11. Fitness Questionnaire with Decision Support & Cognitive Reappraisal Prompt | 0.045 | 0.024 | 0.054 | 825 |
| 09. Free Audiobook Provided, Temptation Bundling Explained | 0.045 | 0.025 | 0.071 | 1,685 |
| 10. Following Workout Plan Encouraged | 0.044 | 0.026 | 0.086 | 805 |
| 26. Values Affirmation Followed by Diagnosis as Gritty | 0.039 | 0.023 | 0.092 | 804 |
| 18. Fitness Questionnaire | 0.038 | 0.025 | 0.127 | 799 |
| 33. Planning Workouts Encouraged | 0.037 | 0.020 | 0.063 | 1,499 |
| 25. Exercise Encouraged with Signed Pledge | 0.034 | 0.026 | 0.196 | 802 |
| 52. Exercise Advice Solicited, Shared with Others | 0.032 | 0.035 | 0.371 | 707 |
| 24. Rigidity Rewarded^e | 0.027 | 0.021 | 0.208 | 548 |
| 43. Bonus for Variable Exercise Schedule | 0.026 | 0.025 | 0.301 | 865 |
| 12. Values Affirmation | 0.024 | 0.024 | 0.326 | 824 |
| 37. Effective Workouts Encouraged | 0.022 | 0.024 | 0.364 | 852 |
| 28. Rigidity Rewarded^c | 0.020 | 0.023 | 0.385 | 1,701 |
| 47. Exercise Social Norms Shared (Low but Increasing) | 0.020 | 0.025 | 0.427 | 835 |
| 16. Exercise Fun Facts Shared | 0.017 | 0.026 | 0.510 | 836 |
| 41. Mon-Fri Consistency Rewarded, Sat-Sun Consistency Rewarded | 0.013 | 0.022 | 0.550 | 564 |
| 22. Gain-Framed Micro-Incentives | 0.013 | 0.025 | 0.608 | 783 |
| 05. Bonus for Returning after Missed Workouts^a | 0.012 | 0.026 | 0.655 | 1,719 |
| 13. Asked Questions about Workouts | 0.009 | 0.022 | 0.673 | 1,191 |
| 21. Exercise Encouraged with Typed Pledge | 0.008 | 0.027 | 0.780 | 849 |
| 35. Reflecting on Workouts Encouraged | 0.007 | 0.022 | 0.748 | 517 |
| 46. Defaulted into 1 Weekly Workout | 0.006 | 0.029 | 0.832 | 455 |
| 42. Exercise Encouraged with E-Signed Pledge | 0.006 | 0.023 | 0.790 | 878 |
| 50. Fitness Questionnaire with Decision Support | 0.004 | 0.024 | 0.866 | 893 |
| 49. Exercise Commitment Contract Encouraged | 0.004 | 0.028 | 0.889 | 812 |
| 17. Exercise Advice Solicited | 0.003 | 0.025 | 0.891 | 749 |
| 27. Bonus for Consistent Exercise Schedule | 0.002 | 0.025 | 0.924 | 798 |
| 31. Fitness Questionnaire with Cognitive Reappraisal Prompt | 0.000 | 0.025 | 0.999 | 868 |
| 15. Defaulted into 3 Weekly Workouts | 0.000 | 0.023 | 0.999 | 477 |
| 07. Choice of Gain- or Loss-Framed Micro-Incentives | 0.000 | 0.017 | 0.991 | 1,652 |
| 36. Planning Workouts Rewarded | −0.001 | 0.026 | 0.978 | 1,466 |
| 23. Higher Incentives^b | −0.002 | 0.022 | 0.931 | 1,910 |
| 19. Planning Revision Encouraged | −0.004 | 0.025 | 0.886 | 860 |
| 40. Fun Workouts Encouraged | −0.004 | 0.026 | 0.891 | 770 |
| 48. Rigidity Rewarded^d | −0.005 | 0.022 | 0.827 | 1,613 |
| 14. Rigidity Rewarded^a | −0.008 | 0.025 | 0.746 | 1,816 |
| 45. Rewarded for Responding to Questions about Workouts | −0.008 | 0.029 | 0.775 | 1,199 |
| 32. Exercise Encouraged | −0.014 | 0.024 | 0.569 | 806 |
| 34. Gym Routine Encouraged | −0.015 | 0.032 | 0.647 | 820 |
| 08. Exercise Commitment Contract Explained | −0.017 | 0.028 | 0.533 | 810 |
| 30. Planning, Reminders & Micro-Incentives to Exercise | −0.021 | 0.016 | 0.181 | 3,503 |
| 39. Reflecting on Workouts Rewarded | −0.027 | 0.027 | 0.314 | 469 |
| 51. Rigidity Rewarded^b | −0.030 | 0.028 | 0.296 | 1,850 |
| 44. Exercise Commitment Contract Explained Post-Intervention | −0.040 | 0.029 | 0.162 | 828 |
| 38. Planning Benefits Explained | −0.048 | 0.028 | 0.089 | 859 |
| 29. Loss-Framed Micro-Incentives | −0.051 | 0.024 | 0.033 | 872 |
| 53. Exercise Social Norms Shared (High) | −0.063 | 0.024 | 0.008 | 841 |

Number of observations: 2,642,901
Number of participants: 61,293
R²: 0.426

The table reports the results of an ordinary least squares regression predicting whether participants visited the gym during a given week in the first four weeks after the intervention period. Predictors were indicators for experimental condition during the four-week intervention period, indicators for experimental condition during the first four weeks post-intervention, participant fixed effects, and cohort-week interactions. Robust standard errors were clustered by participant. Observations in the regression were weighted to ensure that each condition was equally weighted within a cohort and each cohort was weighted proportionally to its length. The reference group was the Placebo Control condition. See Table S1 in the Supplementary Information for descriptions of each experimental condition.
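As a reading aid, the p-values in this table can be approximately recovered from the reported coefficient and standard error. The sketch below is illustrative only: it assumes a large-sample normal approximation (the source does not state the reference distribution), the function name `two_sided_p` is hypothetical, and because b and SE are rounded to three decimals the result will only roughly match the reported p-value.

```python
import math


def two_sided_p(b: float, se: float) -> float:
    """Two-sided p-value for H0: coefficient = 0 under a normal approximation."""
    z = abs(b / se)
    # erfc(z / sqrt(2)) equals 2 * (1 - Phi(z)), where Phi is the standard normal CDF
    return math.erfc(z / math.sqrt(2))


# Row 01 of Extended Data Table 2: b = 0.085, SE = 0.026 (reported p = 0.001)
p_row_01 = two_sided_p(0.085, 0.026)
print(f"approximate p-value: {p_row_01:.4f}")
```

The same check can be applied to any row; small discrepancies are expected from rounding and from the normal approximation.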

a, b, c, d, e

These superscripts denote the different incentive amounts offered in different versions of the Bonus for Returning after Missed Workouts, Higher Incentives, and Rigidity Rewarded conditions, which are detailed in Table S1 in the Supplementary Information. In conditions with the same name, superscripts that come earlier in the alphabet indicate larger incentives.

Extended Data Table 3 |.

The percentage of other conditions that each experimental condition outperformed for our dependent variable measuring whether participants visited the gym in a given week at p < .05 during the four-week intervention period

| Experimental Condition | % of Conditions Outperformed (p<.05) | List of Conditions Outperformed |
| --- | --- | --- |
| 01. Bonus for Returning after Missed Workouts^b | 30% | 17*, 23*, 27*, 29-31*, 37*, 40*, 43*, 49*, 50*, 53*; 44**, 45**, 51**; 54*** |
| 02. Higher Incentives^a | 62% | 7*, 14*, 16*, 18*, 22*, 24*, 25*, 28*, 32*, 38*, 41*, 46*, 52*; 15**, 17**, 23**, 27**, 31**, 37**, 39**, 40**, 43**, 47-50**; 29***, 30***, 44***, 45***, 51***, 53***, 54*** |
| 03. Exercise Social Norms Shared (High and Increasing) | 55% | 15*, 16*, 18*, 22*, 27*, 28*, 31*, 32*, 38*, 39*, 41*, 46-48*, 52*; 17**, 23**, 29**, 30**, 37**, 40**, 43**, 45**, 49-51**, 53**; 44***, 54*** |
| 04. Free Audiobook Provided | 0% | |
| 05. Bonus for Returning after Missed Workouts^a | 19% | 23*, 29*, 40*, 45*, 49-51*, 53*; 44**, 54** |
| 06. Planning Fallacy Described and Planning Revision Encouraged | 4% | 44*, 54* |
| 07. Choice of Gain- or Loss-Framed Micro-Incentives | 4% | 44*; 54** |
| 08. Exercise Commitment Contract Explained | 0% | |
| 09. Free Audiobook Provided, Temptation Bundling Explained | 55% | 15*, 16*, 18*, 22*, 27*, 28*, 31*, 32*, 38*, 39*, 41*, 46-48*, 52*; 17**, 23**, 29**, 30**, 37**, 40**, 43**, 45**, 49-51**, 53**; 44***, 54*** |
| 10. Following Workout Plan Encouraged | 4% | 44*, 54* |
| 11. Fitness Questionnaire with Decision Support & Cognitive Reappraisal Prompt | 13% | 29*, 45*, 49*, 51*, 53*; 44**, 54** |
| 12. Values Affirmation | 4% | 44*, 54* |
| 13. Asked Questions about Workouts | 11% | 29*, 44*, 45*, 51*, 53*; 54** |
| 14. Rigidity Rewarded^a | 0% | |
| 15. Defaulted into 3 Weekly Workouts | 0% | |
| 16. Exercise Fun Facts Shared | 0% | |
| 17. Exercise Advice Solicited | 0% | |
| 18. Fitness Questionnaire | 0% | |
| 19. Planning Revision Encouraged | 4% | 44*, 54* |
| 20. Exercise Social Norms Shared (Low) | 19% | 23*, 29*, 40*, 45*, 49-51*, 53*; 44**, 54** |
| 21. Exercise Encouraged with Typed Pledge | 4% | 44*, 54* |
| 22. Gain-Framed Micro-Incentives | 0% | |
| 23. Higher Incentives^b | 0% | |
| 24. Rigidity Rewarded^e | 0% | |
| 25. Exercise Encouraged with Signed Pledge | 0% | |
| 26. Values Affirmation Followed by Diagnosis as Gritty | 4% | 44*, 54* |
| 27. Bonus for Consistent Exercise Schedule | 0% | |
| 28. Rigidity Rewarded^c | 0% | |
| 29. Loss-Framed Micro-Incentives | 0% | |
| 30. Planning, Reminders & Micro-Incentives to Exercise | 0% | |
| 31. Fitness Questionnaire with Cognitive Reappraisal Prompt | 0% | |
| 32. Exercise Encouraged | 0% | |
| 33. Planning Workouts Encouraged | 4% | 44*, 54* |
| 34. Gym Routine Encouraged | 0% | |
| 35. Reflecting on Workouts Encouraged | 25% | 17*, 23*, 29*, 37*, 40*, 43*, 45*, 49-51*, 53*; 44**; 54*** |
| 36. Planning Workouts Rewarded | 4% | 44*, 54* |
| 37. Effective Workouts Encouraged | 0% | |
| 38. Planning Benefits Explained | 0% | |
| 39. Reflecting on Workouts Rewarded | 0% | |
| 40. Fun Workouts Encouraged | 0% | |
| 41. Mon-Fri Consistency Rewarded, Sat-Sun Consistency Rewarded | 0% | |
| 42. Exercise Encouraged with E-Signed Pledge | 0% | |
| 43. Bonus for Variable Exercise Schedule | 0% | |
| 44. Exercise Commitment Contract Explained Post-Intervention | 0% | |
| 45. Rewarded for Responding to Questions about Workouts | 0% | |
| 46. Defaulted into 1 Weekly Workout | 0% | |
| 47. Exercise Social Norms Shared (Low but Increasing) | 0% | |
| 48. Rigidity Rewarded^d | 0% | |
| 49. Exercise Commitment Contract Encouraged | 0% | |
| 50. Fitness Questionnaire with Decision Support | 0% | |
| 51. Rigidity Rewarded^b | 0% | |
| 52. Exercise Advice Solicited, Shared with Others | 0% | |
| 53. Exercise Social Norms Shared (High) | 0% | |
| 54. Placebo Control | 0% | |

The percentage of conditions outperformed (p < .05) was obtained from conducting pairwise Wald tests to assess whether paired regression coefficients significantly differed from one another in the regression presented in Extended Data Table 7.
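To make the pairwise comparison concrete, here is a minimal sketch of a Wald test for the difference between two regression coefficients. It is not the authors' code. The covariance between the two estimates is not reported in the extended data, so the sketch sets it to zero as a labeled, hypothetical simplification, and it uses a normal reference distribution; the function name `wald_difference` is also an assumption. The example values are the rounded b and SE for conditions 02 and 44 from Extended Data Table 7.

```python
import math


def wald_difference(b1, se1, b2, se2, cov12=0.0):
    """Wald test of H0: b1 == b2.

    cov12 is the covariance between the two estimates. It defaults to 0.0
    only because the source tables do not report it (an assumption).
    Returns (chi-square statistic with 1 df, two-sided p-value).
    """
    var_diff = se1 ** 2 + se2 ** 2 - 2.0 * cov12
    z = (b1 - b2) / math.sqrt(var_diff)
    p = math.erfc(abs(z) / math.sqrt(2))  # normal approximation
    return z ** 2, p


# Condition 02 (b = 0.097, SE = 0.018) versus condition 44 (b = -0.027, SE = 0.030)
chi2, p_value = wald_difference(0.097, 0.018, -0.027, 0.030)
print(f"chi2 = {chi2:.2f}, p = {p_value:.5f}")
```

With the full coefficient covariance matrix from the fitted regression, `cov12` would be replaced by the actual off-diagonal entry, which can change the result noticeably because all coefficients share the same reference group.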

a, b, c, d, e

These superscripts denote the different incentive amounts offered in different versions of the Bonus for Returning after Missed Workouts, Higher Incentives, and Rigidity Rewarded conditions, which are detailed in Table S1 in the Supplementary Information. In conditions with the same name, superscripts that come earlier in the alphabet indicate larger incentives.

Extended Data Table 4 |.

Participants’ mean age (in years), gender, length of gym membership (in weeks), and mean weekly gym visits in the four-week pre-intervention period across the 54 study conditions

| Experimental Condition | Sample Size | Age | Female (%) | White (%) | Weeks since Joining 24HF | Weekly Gym Visits Four Weeks before Intervention |
| --- | --- | --- | --- | --- | --- | --- |
| 1. Bonus for Returning after Missed Workouts^b | 1,633 | 40.0 (13.6) | 64.7% | 48.9% | 35.9 (20.3) | 1.1 (1.4) |
| 2. Higher Incentives^a | 1,750 | 39.7 (13.1) | 65.4% | 47.1% | 36.6 (20.2) | 1.3 (1.5) |
| 3. Exercise Social Norms Shared (High and Increasing) | 798 | 38.8 (13.4) | 66.3% | 50.3% | 34.8 (20.6) | 1.3 (1.5) |
| 4. Free Audiobook Provided | 1,604 | 39.6 (13.4) | 63.5% | 50.7% | 35.9 (20.3) | 1.2 (1.5) |
| 5. Bonus for Returning after Missed Workouts^a | 1,719 | 39.8 (13.9) | 65.6% | 48.8% | 35.5 (20.5) | 1.1 (1.4) |
| 6. Planning Fallacy Described and Planning Revision Encouraged | 811 | 40.4 (13.9) | 67.2% | 49.1% | 36.4 (20.0) | 1.3 (1.5) |
| 7. Choice of Gain- or Loss-Framed Micro-Incentives | 1,652 | 38.1 (12.8) | 66.5% | 46.7% | 33.8 (21.5) | 1.3 (1.4) |
| 8. Exercise Commitment Contract Explained | 810 | 40.9 (13.5) | 69.0% | 52.8% | 34.9 (20.5) | 1.1 (1.4) |
| 9. Free Audiobook Provided, Temptation Bundling Explained | 1,685 | 39.6 (13.3) | 63.6% | 49.8% | 36.9 (19.9) | 1.2 (1.4) |
| 10. Following Workout Plan Encouraged | 805 | 38.6 (13.0) | 60.9% | 49.8% | 31.7 (21.9) | 1.2 (1.5) |
| 11. Fitness Questionnaire with Decision Support & Cognitive Reappraisal Prompt | 825 | 39.3 (13.2) | 67.5% | 50.3% | 35.2 (20.5) | 1.4 (1.5) |
| 12. Values Affirmation | 824 | 38.1 (12.8) | 64.9% | 51.8% | 34.5 (20.8) | 1.4 (1.6) |
| 13. Asked Questions about Workouts | 1,191 | 37.6 (12.3) | 69.6% | 49.0% | 32.3 (21.5) | 1.3 (1.5) |
| 14. Rigidity Rewarded^a | 1,816 | 38.9 (13.2) | 65.9% | 48.7% | 34.8 (20.8) | 1.3 (1.5) |
| 15. Defaulted into 3 Weekly Workouts | 477 | 39.0 (13.1) | 68.1% | 48.8% | 34.7 (20.6) | 1.3 (1.4) |
| 16. Exercise Fun Facts Shared | 836 | 38.0 (13.0) | 65.8% | 48.7% | 35.3 (20.3) | 1.4 (1.5) |
| 17. Exercise Advice Solicited | 749 | 39.9 (13.4) | 66.2% | 51.0% | 34.8 (20.6) | 1.3 (1.5) |
| 18. Fitness Questionnaire | 799 | 39.4 (13.6) | 66.0% | 47.7% | 35.3 (20.9) | 1.3 (1.5) |
| 19. Planning Revision Encouraged | 860 | 39.5 (13.2) | 64.4% | 47.3% | 36.3 (20.2) | 1.3 (1.5) |
| 20. Exercise Social Norms Shared (Low) | 821 | 39.0 (13.1) | 65.2% | 50.3% | 35.2 (20.5) | 1.4 (1.5) |
| 21. Exercise Encouraged with Typed Pledge | 849 | 39.2 (13.2) | 68.7% | 53.1% | 34.3 (21.1) | 1.3 (1.5) |
| 22. Gain-Framed Micro-Incentives | 783 | 38.7 (12.9) | 69.2% | 48.9% | 33.7 (21.0) | 1.3 (1.5) |
| 23. Higher Incentives^b | 1,910 | 39.5 (13.1) | 64.9% | 50.8% | 35.6 (20.6) | 1.3 (1.5) |
| 24. Rigidity Rewarded^e | 548 | 38.8 (13.2) | 62.8% | 50.7% | 35.3 (20.8) | 1.2 (1.5) |
| 25. Exercise Encouraged with Signed Pledge | 802 | 38.6 (13.1) | 65.2% | 50.9% | 33.7 (21.2) | 1.3 (1.5) |
| 26. Values Affirmation Followed by Diagnosis as Gritty | 804 | 37.3 (12.1) | 68.5% | 49.4% | 35.1 (20.3) | 1.3 (1.5) |
| 27. Bonus for Consistent Exercise Schedule | 798 | 39.4 (13.4) | 65.9% | 51.4% | 34.7 (21.0) | 1.2 (1.4) |
| 28. Rigidity Rewarded^c | 1,701 | 39.7 (13.3) | 67.6% | 51.5% | 37.1 (19.9) | 1.2 (1.4) |
| 29. Loss-Framed Micro-Incentives | 872 | 38.6 (12.8) | 67.7% | 46.6% | 32.7 (21.6) | 1.3 (1.5) |
| 30. Planning, Reminders & Micro-Incentives to Exercise | 3,503 | 39.2 (13.3) | 66.5% | 51.2% | 35.4 (20.3) | 1.3 (1.5) |
| 31. Fitness Questionnaire with Cognitive Reappraisal Prompt | 868 | 39.9 (13.8) | 65.2% | 50.2% | 34.6 (20.9) | 1.3 (1.5) |
| 32. Exercise Encouraged | 806 | 38.2 (12.7) | 66.7% | 49.3% | 34.9 (20.5) | 1.3 (1.5) |
| 33. Planning Workouts Encouraged | 1,499 | 40.5 (13.9) | 65.1% | 51.2% | 35.6 (20.6) | 1.2 (1.4) |
| 34. Gym Routine Encouraged | 820 | 39.2 (13.1) | 66.6% | 48.2% | 35.2 (20.9) | 1.3 (1.5) |
| 35. Reflecting on Workouts Encouraged | 517 | 38.3 (12.8) | 64.0% | 47.4% | 35.4 (20.6) | 1.2 (1.4) |
| 36. Planning Workouts Rewarded | 1,466 | 40.2 (13.9) | 66.4% | 50.1% | 35.5 (20.9) | 1.2 (1.4) |
| 37. Effective Workouts Encouraged | 852 | 37.8 (12.8) | 63.7% | 47.5% | 33.0 (21.6) | 1.4 (1.5) |
| 38. Planning Benefits Explained | 859 | 38.2 (13.3) | 66.2% | 49.4% | 33.1 (21.7) | 1.3 (1.4) |
| 39. Reflecting on Workouts Rewarded | 469 | 37.6 (12.0) | 67.4% | 44.1% | 34.2 (21.3) | 1.3 (1.5) |
| 40. Fun Workouts Encouraged | 770 | 38.2 (13.3) | 64.9% | 49.0% | 32.8 (21.5) | 1.5 (1.6) |
| 41. Mon-Fri Consistency Rewarded, Sat-Sun Consistency Rewarded | 564 | 39.0 (13.5) | 62.4% | 53.2% | 36.4 (20.5) | 1.3 (1.6) |
| 42. Exercise Encouraged with E-Signed Pledge | 878 | 38.4 (13.2) | 64.8% | 49.7% | 33.5 (20.7) | 1.3 (1.5) |
| 43. Bonus for Variable Exercise Schedule | 865 | 39.9 (13.6) | 67.3% | 48.2% | 34.5 (21.1) | 1.3 (1.5) |
| 44. Exercise Commitment Contract Explained Post-Intervention | 828 | 40.3 (13.6) | 67.4% | 54.1% | 35.8 (20.1) | 1.2 (1.4) |
| 45. Rewarded for Responding to Questions about Workouts | 1,199 | 38.1 (12.9) | 66.9% | 50.8% | 33.4 (21.4) | 1.4 (1.6) |
| 46. Defaulted into 1 Weekly Workout | 455 | 38.6 (13.0) | 64.6% | 56.5% | 34.8 (20.7) | 1.3 (1.6) |
| 47. Exercise Social Norms Shared (Low but Increasing) | 835 | 38.3 (12.7) | 65.4% | 47.2% | 35.4 (20.5) | 1.4 (1.6) |
| 48. Rigidity Rewarded^d | 1,613 | 39.9 (13.5) | 64.6% | 52.3% | 36.5 (20.5) | 1.2 (1.5) |
| 49. Exercise Commitment Contract Encouraged | 812 | 40.4 (14.4) | 65.9% | 51.1% | 35.6 (20.4) | 1.3 (1.5) |
| 50. Fitness Questionnaire with Decision Support | 893 | 39.5 (13.5) | 65.7% | 49.2% | 36.2 (20.5) | 1.2 (1.5) |
| 51. Rigidity Rewarded^b | 1,850 | 39.1 (13.1) | 64.9% | 50.4% | 36.5 (20.1) | 1.3 (1.5) |
| 52. Exercise Advice Solicited, Shared with Others | 707 | 38.7 (12.9) | 65.3% | 49.4% | 33.2 (21.9) | 1.2 (1.5) |
| 53. Exercise Social Norms Shared (High) | 841 | 38.3 (13.4) | 68.1% | 46.8% | 36.3 (19.6) | 1.4 (1.6) |
| 54. Placebo Control | 4,992 | 38.9 (13.0) | 66.0% | 49.6% | 35.3 (20.6) | 1.3 (1.5) |
| Overall | 61,293 | 39.1 (13.3) | 65.9% | 49.8% | 35.1 (20.7) | 1.3 (1.5) |

Standard deviations for means are reported in parentheses. For summary statistics in this table, mean weekly gym visits prior to the intervention were calculated with a balanced panel constructed by inserting 0’s for weeks with no recorded gym visits. Conditions are numbered in descending order based on the beta coefficients from our primary analysis reported in the paper and in Extended Data Table 6, and the Placebo Control is always labeled 54. The values shown in the table are unweighted.
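The balanced-panel step described in the note can be sketched as follows. This is an illustration under stated assumptions, not the authors' pipeline: the data layout (a dictionary keyed by participant and week), the function names `balance_panel` and `mean_weekly_visits`, and the toy records are all hypothetical.

```python
def balance_panel(recorded, participants, weeks):
    """Return a panel with one entry per (participant, week).

    Weeks with no recorded gym visit are filled with 0, so every
    participant contributes the same number of weekly observations.
    """
    return {(p, w): recorded.get((p, w), 0) for p in participants for w in weeks}


def mean_weekly_visits(panel, participant, weeks):
    """Average weekly visits for one participant over the given weeks."""
    return sum(panel[(participant, w)] for w in weeks) / len(weeks)


# Hypothetical records: only weeks with at least one visit appear in the raw data
recorded = {("A", 1): 2, ("A", 3): 1, ("B", 2): 4}
weeks = [1, 2, 3, 4]
panel = balance_panel(recorded, ["A", "B"], weeks)

mean_a = mean_weekly_visits(panel, "A", weeks)
mean_b = mean_weekly_visits(panel, "B", weeks)
print(mean_a, mean_b)
```

Without the zero fill, participant A's mean would be computed over two weeks instead of four and would overstate attendance.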

a, b, c, d, e

These superscripts denote the different incentive amounts offered in different versions of the Bonus for Returning after Missed Workouts, Higher Incentives, and Rigidity Rewarded conditions, which are detailed in Table S1 in the Supplementary Information. In conditions with the same name, superscripts that come earlier in the alphabet indicate larger incentives.

Extended Data Table 5 |.

Percentage of significant p-values and absolute difference in coefficients from pairwise comparisons of the 54 study conditions in our megastudy on each variable listed (alpha = .05)

| Variable | Percentage of Paired Tests Yielding Significant Results | F-test p-value | Average Absolute Difference in Pairwise Coefficients |
| --- | --- | --- | --- |
| Age (years) | 7.1% | 0.21 | 0.91 |
| Membership Tenure at 24 Hour Fitness (weeks) | 2.8% | 0.85 | 1.26 |
| Average Weekly Gym Visits in 4 Weeks Before Intervention | 1.9% | 0.98 | 0.08 |
| Percent Female | 4.1% | 0.74 | 0.03 |
| Overall | 4.0% | | |

The table summarizes the results of Wald tests of equality for all pairwise comparisons of the 54 megastudy conditions based on ordinary least squares regressions testing if the composition of participants in these experimental conditions differed by age, membership tenure at 24 Hour Fitness, mean weekly gym visits in the four weeks prior to the start of the intervention, and gender. Regressions included robust standard errors. Observations in the regressions were weighted to ensure that each condition was weighted equally within a cohort and each cohort was weighted proportionally to its length.
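A short arithmetic sketch helps put these percentages in context. With 54 conditions there are many pairwise comparisons per variable, and at alpha = .05 roughly 5% of them would be expected to reach significance by chance under successful randomization. The observation that the Overall row matches a simple average of the four per-variable percentages is our own check, not something the source states.

```python
import math

n_conditions = 54
alpha = 0.05

# Number of distinct condition pairs compared for each variable
n_pairs = math.comb(n_conditions, 2)

# Expected count of significant pairs per variable if only chance were at work
expected_by_chance = alpha * n_pairs

# Percentages of significant paired tests reported in Extended Data Table 5
reported_pct = {
    "age": 7.1,
    "membership_tenure": 2.8,
    "pre_intervention_visits": 1.9,
    "percent_female": 4.1,
}
simple_average = sum(reported_pct.values()) / len(reported_pct)

print(n_pairs, round(expected_by_chance, 2), round(simple_average, 1))
```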

Extended Data Table 6 |.

Regression-estimated effects of each experimental condition on total weekly gym visits during the four-week intervention period relative to the Placebo Control condition

| Experimental Condition | b | SE | p-value | N |
| --- | --- | --- | --- | --- |
| 01. Bonus for Returning after Missed Workouts^b | 0.403 | 0.098 | <0.001 | 1,633 |
| 02. Higher Incentives^a | 0.365 | 0.092 | <0.001 | 1,750 |
| 03. Exercise Social Norms Shared (High and Increasing) | 0.345 | 0.083 | <0.001 | 798 |
| 04. Free Audiobook Provided | 0.343 | 0.123 | 0.005 | 1,604 |
| 05. Bonus for Returning after Missed Workouts^a | 0.336 | 0.081 | <0.001 | 1,719 |
| 06. Planning Fallacy Described and Planning Revision Encouraged | 0.325 | 0.122 | 0.008 | 811 |
| 07. Choice of Gain- or Loss-Framed Micro-Incentives | 0.284 | 0.055 | <0.001 | 1,652 |
| 08. Exercise Commitment Contract Explained | 0.279 | 0.095 | 0.003 | 810 |
| 09. Free Audiobook Provided, Temptation Bundling Explained | 0.278 | 0.077 | <0.001 | 1,685 |
| 10. Following Workout Plan Encouraged | 0.268 | 0.083 | 0.001 | 805 |
| 11. Fitness Questionnaire with Decision Support & Cognitive Reappraisal Prompt | 0.255 | 0.081 | 0.002 | 825 |
| 12. Values Affirmation | 0.243 | 0.095 | 0.011 | 824 |
| 13. Asked Questions about Workouts | 0.236 | 0.112 | 0.036 | 1,191 |
| 14. Rigidity Rewarded^a | 0.230 | 0.080 | 0.004 | 1,816 |
| 15. Defaulted into 3 Weekly Workouts | 0.213 | 0.085 | 0.012 | 477 |
| 16. Exercise Fun Facts Shared | 0.207 | 0.084 | 0.013 | 836 |
| 17. Exercise Advice Solicited | 0.207 | 0.084 | 0.014 | 749 |
| 18. Fitness Questionnaire | 0.206 | 0.080 | 0.009 | 799 |
| 19. Planning Revision Encouraged | 0.196 | 0.087 | 0.025 | 860 |
| 20. Exercise Social Norms Shared (Low) | 0.193 | 0.077 | 0.012 | 821 |
| 21. Exercise Encouraged with Typed Pledge | 0.191 | 0.108 | 0.076 | 849 |
| 22. Gain-Framed Micro-Incentives | 0.180 | 0.090 | 0.045 | 783 |
| 23. Higher Incentives^b | 0.175 | 0.078 | 0.025 | 1,910 |
| 24. Rigidity Rewarded^e | 0.167 | 0.083 | 0.043 | 548 |
| 25. Exercise Encouraged with Signed Pledge | 0.156 | 0.099 | 0.115 | 802 |
| 26. Values Affirmation Followed by Diagnosis as Gritty | 0.155 | 0.082 | 0.060 | 804 |
| 27. Bonus for Consistent Exercise Schedule | 0.151 | 0.088 | 0.087 | 798 |
| 28. Rigidity Rewarded^c | 0.142 | 0.076 | 0.060 | 1,701 |
| 29. Loss-Framed Micro-Incentives | 0.139 | 0.077 | 0.071 | 872 |
| 30. Planning, Reminders & Micro-Incentives to Exercise | 0.136 | 0.049 | 0.006 | 3,503 |
| 31. Fitness Questionnaire with Cognitive Reappraisal Prompt | 0.134 | 0.079 | 0.088 | 868 |
| 32. Exercise Encouraged | 0.132 | 0.088 | 0.135 | 806 |
| 33. Planning Workouts Encouraged | 0.131 | 0.071 | 0.064 | 1,499 |
| 34. Gym Routine Encouraged | 0.129 | 0.086 | 0.135 | 820 |
| 35. Reflecting on Workouts Encouraged | 0.122 | 0.084 | 0.146 | 517 |
| 36. Planning Workouts Rewarded | 0.118 | 0.078 | 0.129 | 1,466 |
| 37. Effective Workouts Encouraged | 0.112 | 0.069 | 0.104 | 852 |
| 38. Planning Benefits Explained | 0.111 | 0.096 | 0.248 | 859 |
| 39. Reflecting on Workouts Rewarded | 0.109 | 0.083 | 0.190 | 469 |
| 40. Fun Workouts Encouraged | 0.100 | 0.072 | 0.167 | 770 |
| 41. Mon-Fri Consistency Rewarded, Sat-Sun Consistency Rewarded | 0.095 | 0.075 | 0.203 | 564 |
| 42. Exercise Encouraged with E-Signed Pledge | 0.088 | 0.089 | 0.321 | 878 |
| 43. Bonus for Variable Exercise Schedule | 0.083 | 0.093 | 0.373 | 865 |
| 44. Exercise Commitment Contract Explained Post-Intervention | 0.076 | 0.081 | 0.346 | 828 |
| 45. Rewarded for Responding to Questions about Workouts | 0.066 | 0.084 | 0.432 | 1,199 |
| 46. Defaulted into 1 Weekly Workout | 0.062 | 0.094 | 0.510 | 455 |
| 47. Exercise Social Norms Shared (Low but Increasing) | 0.052 | 0.078 | 0.509 | 835 |
| 48. Rigidity Rewarded^d | 0.045 | 0.079 | 0.568 | 1,613 |
| 49. Exercise Commitment Contract Encouraged | 0.035 | 0.083 | 0.671 | 812 |
| 50. Fitness Questionnaire with Decision Support | 0.025 | 0.080 | 0.757 | 893 |
| 51. Rigidity Rewarded^b | 0.003 | 0.083 | 0.967 | 1,850 |
| 52. Exercise Advice Solicited, Shared with Others | 0.001 | 0.089 | 0.987 | 707 |
| 53. Exercise Social Norms Shared (High) | −0.030 | 0.137 | 0.827 | 841 |

Number of observations: 2,397,729
Number of participants: 61,293
R²: 0.574

The table reports the results of an ordinary least squares regression predicting participants’ weekly gym visits during the four-week intervention period. Predictors were indicators for experimental condition during the four-week intervention period, participant fixed effects, and cohort-week interactions. Robust standard errors were clustered by participant. Observations in the regression were weighted to ensure that each condition was equally weighted within a cohort and each cohort was weighted proportionally to its length. The reference group was the Placebo Control condition. See Table S1 in the Supplementary Information for descriptions of each experimental condition.
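One way to read the weighting rule in this note is sketched below. The source does not give the formula, so this is an assumption: each observation in cohort k and condition c receives weight length_k / (number of conditions in cohort k × number of observations in that cohort-condition cell). The function name `observation_weights`, the data layout, and the unit of cohort "length" are hypothetical.

```python
from collections import Counter


def observation_weights(rows, cohort_length):
    """Compute one weight per observation.

    rows: list of (cohort, condition) tuples, one per observation.
    cohort_length: dict mapping cohort -> its length (unit assumed, e.g. weeks).

    Under the assumed scheme, every condition carries the same total weight
    within a cohort, and a cohort's total weight equals its length.
    """
    cell_counts = Counter(rows)
    # Count how many distinct conditions appear in each cohort
    conditions_per_cohort = Counter(cohort for cohort, _ in cell_counts)
    return [
        cohort_length[cohort]
        / (conditions_per_cohort[cohort] * cell_counts[(cohort, condition)])
        for cohort, condition in rows
    ]


# Hypothetical toy data: cohort c1 has two conditions of unequal size
rows = [("c1", "A")] * 3 + [("c1", "B")] + [("c2", "A")] * 2
weights = observation_weights(rows, {"c1": 4, "c2": 2})
print(weights)
```

In the toy data, conditions A and B each sum to a weight of 2 within cohort c1 even though A has three observations and B has one.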

a, b, c, d, e

These superscripts denote the different incentive amounts offered in different versions of the Bonus for Returning after Missed Workouts, Higher Incentives, and Rigidity Rewarded conditions, which are detailed in Table S1 in the Supplementary Information. In conditions with the same name, superscripts that come earlier in the alphabet indicate larger incentives.

Extended Data Table 7 |.

Regression-estimated effects of each experimental condition on whether participants visited the gym in a given week during the four-week intervention period relative to the Placebo Control condition

| Experimental Condition | b | SE | p-value | N |
| --- | --- | --- | --- | --- |
| 03. Exercise Social Norms Shared (High and Increasing) | 0.100 | 0.024 | <0.001 | 798 |
| 02. Higher Incentives^a | 0.097 | 0.018 | <0.001 | 1,750 |
| 09. Free Audiobook Provided, Temptation Bundling Explained | 0.097 | 0.023 | <0.001 | 1,685 |
| 06. Planning Fallacy Described and Planning Revision Encouraged | 0.082 | 0.040 | 0.040 | 811 |
| 35. Reflecting on Workouts Encouraged | 0.080 | 0.024 | 0.001 | 517 |
| 01. Bonus for Returning after Missed Workouts^b | 0.079 | 0.022 | <0.001 | 1,633 |
| 11. Fitness Questionnaire with Decision Support & Cognitive Reappraisal Prompt | 0.076 | 0.029 | 0.008 | 825 |
| 05. Bonus for Returning after Missed Workouts^a | 0.074 | 0.025 | 0.004 | 1,719 |
| 13. Asked Questions about Workouts | 0.067 | 0.024 | 0.005 | 1,191 |
| 20. Exercise Social Norms Shared (Low) | 0.065 | 0.020 | 0.001 | 821 |
| 12. Values Affirmation | 0.054 | 0.026 | 0.037 | 824 |
| 36. Planning Workouts Rewarded | 0.054 | 0.024 | 0.026 | 1,466 |
| 10. Following Workout Plan Encouraged | 0.054 | 0.024 | 0.024 | 805 |
| 19. Planning Revision Encouraged | 0.053 | 0.022 | 0.017 | 860 |
| 21. Exercise Encouraged with Typed Pledge | 0.052 | 0.025 | 0.034 | 849 |
| 26. Values Affirmation Followed by Diagnosis as Gritty | 0.052 | 0.022 | 0.018 | 804 |
| 33. Planning Workouts Encouraged | 0.051 | 0.022 | 0.021 | 1,499 |
| 07. Choice of Gain- or Loss-Framed Micro-Incentives | 0.050 | 0.017 | 0.004 | 1,652 |
| 08. Exercise Commitment Contract Explained | 0.049 | 0.028 | 0.079 | 810 |
| 42. Exercise Encouraged with E-Signed Pledge | 0.045 | 0.027 | 0.099 | 878 |
| 04. Free Audiobook Provided | 0.043 | 0.036 | 0.225 | 1,604 |
| 14. Rigidity Rewarded^a | 0.040 | 0.023 | 0.083 | 1,816 |
| 34. Gym Routine Encouraged | 0.038 | 0.027 | 0.165 | 820 |
| 41. Mon-Fri Consistency Rewarded, Sat-Sun Consistency Rewarded | 0.037 | 0.019 | 0.056 | 564 |
| 24. Rigidity Rewarded^e | 0.035 | 0.027 | 0.188 | 548 |
| 28. Rigidity Rewarded^c | 0.034 | 0.024 | 0.155 | 1,701 |
| 18. Fitness Questionnaire | 0.033 | 0.021 | 0.113 | 799 |
| 46. Defaulted into 1 Weekly Workout | 0.032 | 0.023 | 0.152 | 455 |
| 17. Exercise Advice Solicited | 0.032 | 0.023 | 0.165 | 749 |
| 25. Exercise Encouraged with Signed Pledge | 0.032 | 0.029 | 0.275 | 802 |
| 39. Reflecting on Workouts Rewarded | 0.031 | 0.019 | 0.111 | 469 |
| 22. Gain-Framed Micro-Incentives | 0.029 | 0.025 | 0.235 | 783 |
| 30. Planning, Reminders & Micro-Incentives to Exercise | 0.029 | 0.015 | 0.055 | 3,503 |
| 32. Exercise Encouraged | 0.028 | 0.026 | 0.287 | 806 |
| 15. Defaulted into 3 Weekly Workouts | 0.028 | 0.020 | 0.170 | 477 |
| 48. Rigidity Rewarded^d | 0.025 | 0.022 | 0.242 | 1,613 |
| 37. Effective Workouts Encouraged | 0.022 | 0.020 | 0.267 | 852 |
| 52. Exercise Advice Solicited, Shared with Others | 0.020 | 0.029 | 0.488 | 707 |
| 47. Exercise Social Norms Shared (Low but Increasing) | 0.020 | 0.024 | 0.407 | 835 |
| 31. Fitness Questionnaire with Cognitive Reappraisal Prompt | 0.018 | 0.024 | 0.451 | 868 |
| 27. Bonus for Consistent Exercise Schedule | 0.016 | 0.025 | 0.527 | 798 |
| 43. Bonus for Variable Exercise Schedule | 0.012 | 0.024 | 0.605 | 865 |
| 16. Exercise Fun Facts Shared | 0.010 | 0.025 | 0.696 | 836 |
| 53. Exercise Social Norms Shared (High) | 0.007 | 0.021 | 0.727 | 841 |
| 40. Fun Workouts Encouraged | 0.006 | 0.024 | 0.796 | 770 |
| 23. Higher Incentives^b | 0.005 | 0.025 | 0.827 | 1,910 |
| 50. Fitness Questionnaire with Decision Support | 0.005 | 0.024 | 0.826 | 893 |
| 29. Loss-Framed Micro-Incentives | 0.004 | 0.022 | 0.858 | 872 |
| 38. Planning Benefits Explained | 0.004 | 0.034 | 0.914 | 859 |
| 49. Exercise Commitment Contract Encouraged | −0.002 | 0.028 | 0.953 | 812 |
| 45. Rewarded for Responding to Questions about Workouts | −0.007 | 0.026 | 0.800 | 1,199 |
| 51. Rigidity Rewarded^b | −0.013 | 0.030 | 0.669 | 1,850 |
| 44. Exercise Commitment Contract Explained Post-Intervention | −0.027 | 0.030 | 0.357 | 828 |

Number of observations: 2,397,729
Number of participants: 61,293
R²: 0.445

The table reports the results of an ordinary least squares regression predicting whether participants visited the gym in a given week during the four-week intervention period. Predictors were indicators for experimental condition during the four-week intervention period, participant fixed effects, and cohort-week interactions. Robust standard errors were clustered by participant. Observations in the regression were weighted to ensure that each condition was equally weighted within a cohort and each cohort was weighted proportionally to its length. The reference group was the Placebo Control condition. See Table S1 in the Supplementary Information for descriptions of each experimental condition.

a, b, c, d, e

These superscripts denote the different incentive amounts offered in different versions of the Bonus for Returning after Missed Workouts, Higher Incentives, and Rigidity Rewarded conditions, which are detailed in Table S1 in the Supplementary Information. In conditions with the same name, superscripts that come earlier in the alphabet indicate larger incentives.

Extended Data Table 8 |.

Regression-estimated effects of each experimental condition on total weekly gym visits during the four-week intervention period relative to the Planning, Reminders, and Micro-Incentives to Exercise condition

| Experimental Condition | b | SE | p-value | N |
| --- | --- | --- | --- | --- |
| 01. Bonus for Returning after Missed Workouts^b | 0.266 | 0.103 | 0.010 | 1,633 |
| 02. Higher Incentives^a | 0.229 | 0.098 | 0.020 | 1,750 |
| 03. Exercise Social Norms Shared (High and Increasing) | 0.209 | 0.090 | 0.020 | 798 |
| 04. Free Audiobook Provided | 0.206 | 0.128 | 0.106 | 1,604 |
| 05. Bonus for Returning after Missed Workouts^a | 0.200 | 0.087 | 0.022 | 1,719 |
| 06. Planning Fallacy Described and Planning Revision Encouraged | 0.188 | 0.126 | 0.135 | 811 |
| 07. Choice of Gain- or Loss-Framed Micro-Incentives | 0.147 | 0.064 | 0.021 | 1,652 |
| 08. Exercise Commitment Contract Explained | 0.143 | 0.101 | 0.156 | 810 |
| 09. Free Audiobook Provided, Temptation Bundling Explained | 0.141 | 0.084 | 0.092 | 1,685 |
| 10. Following Workout Plan Encouraged | 0.131 | 0.089 | 0.142 | 805 |
| 11. Fitness Questionnaire with Decision Support & Cognitive Reappraisal Prompt | 0.119 | 0.088 | 0.177 | 825 |
| 12. Values Affirmation | 0.106 | 0.100 | 0.290 | 824 |
| 13. Asked Questions about Workouts | 0.099 | 0.117 | 0.396 | 1,191 |
| 14. Rigidity Rewarded^a | 0.093 | 0.087 | 0.281 | 1,816 |
| 15. Defaulted into 3 Weekly Workouts | 0.076 | 0.091 | 0.400 | 477 |
| 16. Exercise Fun Facts Shared | 0.071 | 0.090 | 0.430 | 836 |
| 17. Exercise Advice Solicited | 0.071 | 0.090 | 0.433 | 749 |
| 18. Fitness Questionnaire | 0.070 | 0.086 | 0.416 | 799 |
| 19. Planning Revision Encouraged | 0.059 | 0.093 | 0.524 | 860 |
| 20. Exercise Social Norms Shared (Low) | 0.057 | 0.084 | 0.497 | 821 |
| 21. Exercise Encouraged with Typed Pledge | 0.055 | 0.113 | 0.626 | 849 |
| 22. Gain-Framed Micro-Incentives | 0.043 | 0.095 | 0.652 | 783 |
| 23. Higher Incentives^b | 0.038 | 0.085 | 0.653 | 1,910 |
| 24. Rigidity Rewarded^e | 0.031 | 0.089 | 0.727 | 548 |
| 25. Exercise Encouraged with Signed Pledge | 0.020 | 0.105 | 0.848 | 802 |
| 26. Values Affirmation Followed by Diagnosis as Gritty | 0.018 | 0.089 | 0.836 | 804 |
| 27. Bonus for Consistent Exercise Schedule | 0.015 | 0.094 | 0.876 | 798 |
| 28. Rigidity Rewarded^c | 0.006 | 0.082 | 0.945 | 1,701 |
| 29. Loss-Framed Micro-Incentives | 0.002 | 0.084 | 0.977 | 872 |
| 31. Fitness Questionnaire with Cognitive Reappraisal Prompt | −0.002 | 0.085 | 0.979 | 868 |
| 32. Exercise Encouraged | −0.004 | 0.094 | 0.962 | 806 |
| 33. Planning Workouts Encouraged | −0.005 | 0.078 | 0.947 | 1,499 |
| 34. Gym Routine Encouraged | −0.007 | 0.092 | 0.936 | 820 |
| 35. Reflecting on Workouts Encouraged | −0.014 | 0.090 | 0.875 | 517 |
| 36. Planning Workouts Rewarded | −0.018 | 0.084 | 0.828 | 1,466 |
| 37. Effective Workouts Encouraged | −0.024 | 0.076 | 0.749 | 852 |
| 38. Planning Benefits Explained | −0.025 | 0.102 | 0.805 | 859 |
| 39. Reflecting on Workouts Rewarded | −0.028 | 0.089 | 0.754 | 469 |
| 40. Fun Workouts Encouraged | −0.037 | 0.079 | 0.641 | 770 |
| 41. Mon-Fri Consistency Rewarded, Sat-Sun Consistency Rewarded | −0.041 | 0.082 | 0.613 | 564 |
| 42. Exercise Encouraged with E-Signed Pledge | −0.048 | 0.095 | 0.612 | 878 |
| 43. Bonus for Variable Exercise Schedule | −0.054 | 0.099 | 0.586 | 865 |
| 44. Exercise Commitment Contract Explained Post-Intervention | −0.060 | 0.087 | 0.489 | 828 |
| 45. Rewarded for Responding to Questions about Workouts | −0.070 | 0.091 | 0.438 | 1,199 |
| 46. Defaulted into 1 Weekly Workout | −0.075 | 0.099 | 0.453 | 455 |
| 47. Exercise Social Norms Shared (Low but Increasing) | −0.085 | 0.085 | 0.318 | 835 |
| 48. Rigidity Rewarded^d | −0.092 | 0.085 | 0.282 | 1,613 |
| 49. Exercise Commitment Contract Encouraged | −0.101 | 0.089 | 0.255 | 812 |
| 50. Fitness Questionnaire with Decision Support | −0.112 | 0.086 | 0.196 | 893 |
| 51. Rigidity Rewarded^b | −0.133 | 0.089 | 0.136 | 1,850 |
| 52. Exercise Advice Solicited, Shared with Others | −0.135 | 0.095 | 0.156 | 707 |
| 53. Exercise Social Norms Shared (High) | −0.166 | 0.141 | 0.237 | 841 |
| 54. Placebo Control | −0.136 | 0.049 | 0.006 | 4,992 |

Number of observations: 2,397,729
Number of participants: 61,293
R²: 0.574

The table reports the results of an ordinary least squares regression predicting participants’ weekly gym visits during the four-week intervention period. Predictors were indicators for experimental condition during the four-week intervention period, participant fixed effects, and cohort-week interactions. Robust standard errors were clustered by participant. Observations in the regression were weighted to ensure that each condition was equally weighted within a cohort and each cohort was weighted proportionally to its length. The reference group was the Planning, Reminders, and Micro-Incentives to Exercise condition. See Table S1 in the Supplementary Information for descriptions of each experimental condition.
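Because this regression differs from the one in Extended Data Table 6 only in its reference group, the point estimates here should equal the Table 6 coefficients minus the Table 6 coefficient for condition 30, up to rounding. The sketch below checks that on a few rows; the function name `rebase` is hypothetical, and the standard errors cannot be recovered this way because they depend on covariances that are not reported.

```python
def rebase(coefs, new_reference):
    """Re-express coefficients relative to a different reference condition."""
    shift = coefs[new_reference]
    return {name: value - shift for name, value in coefs.items()}


# Selected coefficients from Extended Data Table 6 (reference: Placebo Control = 0)
table6 = {"01": 0.403, "02": 0.365, "30": 0.136, "53": -0.030, "54": 0.0}

# Corresponding values reported in Extended Data Table 8 (reference: condition 30)
table8_reported = {"01": 0.266, "02": 0.229, "53": -0.166, "54": -0.136}

rebased = rebase(table6, "30")
for name, reported in table8_reported.items():
    print(name, round(rebased[name], 3), reported)
```

Differences of about 0.001 between the rebased and reported values are expected, since both tables round to three decimals.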

a, b, c, d, e

These superscripts denote the different incentive amounts offered in different versions of the Bonus for Returning after Missed Workouts, Higher Incentives, and Rigidity Rewarded conditions, which are detailed in Table S1 in the Supplementary Information. In conditions with the same name, superscripts that come earlier in the alphabet indicate larger incentives.

Extended Data Table 9 |.

Regression-estimated effects of each experimental condition on total weekly gym visits during the four-week post-intervention period relative to the Placebo Control condition

| Experimental Condition | b | SE | p-value | N |
| --- | --- | --- | --- | --- |
| 01. Bonus for Returning after Missed Workouts^b | 0.249 | 0.110 | 0.024 | 1,633 |
| 04. Free Audiobook Provided | 0.213 | 0.098 | 0.030 | 1,604 |
| 03. Exercise Social Norms Shared (High and Increasing) | 0.173 | 0.087 | 0.047 | 798 |
| 06. Planning Fallacy Described and Planning Revision Encouraged | 0.170 | 0.111 | 0.124 | 811 |
| 20. Exercise Social Norms Shared (Low) | 0.165 | 0.085 | 0.052 | 821 |
| 05. Bonus for Returning after Missed Workouts^a | 0.136 | 0.091 | 0.134 | 1,719 |
| 10. Following Workout Plan Encouraged | 0.131 | 0.086 | 0.125 | 805 |
| 09. Free Audiobook Provided, Temptation Bundling Explained | 0.130 | 0.075 | 0.084 | 1,685 |
| 33. Planning Workouts Encouraged | 0.129 | 0.062 | 0.038 | 1,499 |
| 43. Bonus for Variable Exercise Schedule | 0.121 | 0.082 | 0.137 | 865 |
| 26. Values Affirmation Followed by Diagnosis as Gritty | 0.120 | 0.080 | 0.136 | 804 |
| 22. Gain-Framed Micro-Incentives | 0.106 | 0.074 | 0.151 | 783 |
| 18. Fitness Questionnaire | 0.105 | 0.080 | 0.187 | 799 |
| 11. Fitness Questionnaire with Decision Support & Cognitive Reappraisal Prompt | 0.084 | 0.079 | 0.290 | 825 |
| 25. Exercise Encouraged with Signed Pledge | 0.083 | 0.080 | 0.299 | 802 |
| 12. Values Affirmation | 0.070 | 0.100 | 0.481 | 824 |
| 02. Higher Incentives^a | 0.052 | 0.091 | 0.569 | 1,750 |
| 17. Exercise Advice Solicited | 0.049 | 0.078 | 0.527 | 749 |
| 07. Choice of Gain- or Loss-Framed Micro-Incentives | 0.045 | 0.054 | 0.401 | 1,652 |
| 08. Exercise Commitment Contract Explained | 0.044 | 0.085 | 0.605 | 810 |
| 27. Bonus for Consistent Exercise Schedule | 0.040 | 0.086 | 0.644 | 798 |
| 45. Rewarded for Responding to Questions about Workouts | 0.039 | 0.070 | 0.581 | 1,199 |
| 15. Defaulted into 3 Weekly Workouts | 0.034 | 0.083 | 0.682 | 477 |
| 28. Rigidity Rewarded^c | 0.034 | 0.071 | 0.636 | 1,701 |
| 31. Fitness Questionnaire with Cognitive Reappraisal Prompt | 0.032 | 0.083 | 0.705 | 868 |
| 47. Exercise Social Norms Shared (Low but Increasing) | 0.030 | 0.099 | 0.760 | 835 |
| 41. Mon-Fri Consistency Rewarded, Sat-Sun Consistency Rewarded | 0.014 | 0.083 | 0.862 | 564 |
| 37. Effective Workouts Encouraged | 0.012 | 0.068 | 0.858 | 852 |
| 19. Planning Revision Encouraged | 0.012 | 0.091 | 0.896 | 860 |
| 16. Exercise Fun Facts Shared | 0.004 | 0.083 | 0.966 | 836 |
| 49. Exercise Commitment Contract Encouraged | −0.002 | 0.091 | 0.982 | 812 |
| 44. Exercise Commitment Contract Explained Post-Intervention | −0.004 | 0.073 | 0.954 | 828 |
| 52. Exercise Advice Solicited, Shared with Others | −0.019 | 0.122 | 0.875 | 707 |
| 24. Rigidity Rewarded^e | −0.023 | 0.080 | 0.773 | 548 |
| 51. Rigidity Rewarded^b | −0.029 | 0.074 | 0.699 | 1,850 |
| 23. Higher Incentives^b | −0.029 | 0.069 | 0.677 | 1,910 |
| 30. Planning, Reminders & Micro-Incentives to Exercise | −0.031 | 0.050 | 0.527 | 3,503 |
| 32. Exercise Encouraged | −0.032 | 0.070 | 0.642 | 806 |
| 50. Fitness Questionnaire with Decision Support | −0.041 | 0.071 | 0.557 | 893 |
| 36. Planning Workouts Rewarded | −0.050 | 0.085 | 0.557 | 1,466 |
| 13. Asked Questions about Workouts | −0.053 | 0.077 | 0.494 | 1,191 |
| 34. Gym Routine Encouraged | −0.068 | 0.073 | 0.352 | 820 |
| 40. Fun Workouts Encouraged | −0.069 | 0.076 | 0.365 | 770 |
| 46. Defaulted into 1 Weekly Workout | −0.070 | 0.090 | 0.435 | 455 |
| 14. Rigidity Rewarded^a | −0.078 | 0.081 | 0.337 | 1,816 |
| 35. Reflecting on Workouts Encouraged | −0.080 | 0.078 | 0.302 | 517 |
| 42. Exercise Encouraged with E-Signed Pledge | −0.081 | 0.074 | 0.274 | 878 |
| 29. Loss-Framed Micro-Incentives | −0.110 | 0.075 | 0.142 | 872 |
| 39. Reflecting on Workouts Rewarded | −0.123 | 0.079 | 0.117 | 469 |
| 48. Rigidity Rewarded^d | −0.124 | 0.077 | 0.105 | 1,613 |
| 21. Exercise Encouraged with Typed Pledge | −0.147 | 0.110 | 0.182 | 849 |
| 38. Planning Benefits Explained | −0.191 | 0.116 | 0.100 | 859 |
| 53. Exercise Social Norms Shared (High) | −0.377 | 0.213 | 0.077 | 841 |

Number of observations: 2,642,901
Number of participants: 61,293
R²: 0.553

The table reports the results of an ordinary least squares regression predicting participants’ weekly gym visits during the first four weeks after the intervention period. Predictors were indicators for experimental condition during the four-week intervention period, indicators for experimental condition during the first four weeks post-intervention, participant fixed effects, and cohort-week interactions. Robust standard errors were clustered by participant. Observations in the regression were weighted to ensure that each condition was equally weighted within a cohort and each cohort was weighted proportionally to its length. The reference group was the Placebo Control condition. See Table S1 in the Supplementary Information for descriptions of each experimental condition.

a, b, c, d, e

These superscripts denote the different incentive amounts offered in different versions of the Bonus for Returning after Missed Workouts, Higher Incentives, and Rigidity Rewarded conditions, which are detailed in Table S1 in the Supplementary Information. In conditions with the same name, superscripts that come earlier in the alphabet indicate larger incentives.

Supplementary Material

Supplementary information

Acknowledgements

Support for this research was provided in part by the Robert Wood Johnson Foundation, the AKO Foundation, J. Alexander, M. J. Leder, W. G. Lichtenstein, the Pershing Square Fund for Research on the Foundations of Human Behavior from Harvard University and by Roybal Center grants (P30AG034546 and 5P30AG034532) from the National Institute on Aging. The views expressed here do not necessarily reflect the views of any of these individuals or entities. We thank 24 Hour Fitness for partnering with the Behavior Change for Good Initiative at the University of Pennsylvania to make this research possible.

Footnotes

Online content

Any methods, additional references, Nature Research reporting summaries, source data, extended data, supplementary information, acknowledgements, peer review information; details of author contributions and competing interests; and statements of data and code availability are available at https://doi.org/10.1038/s41586-021-04128-4.

Competing interests: The authors declare no competing interests. The authors did not receive commercial benefits from the fitness chain or speaking/consulting fees related to any of the interventions presented here.

Supplementary information: The online version contains supplementary material available at https://doi.org/10.1038/s41586-021-04128-4.

Reporting summary

Further information on research design is available in the Nature Research Reporting Summary linked to this paper.

Code availability

The code to replicate the analyses and figures in the paper and Supplementary Information is available online (https://osf.io/9av87/?view_only=8bb9282111c24f81a19c2237e7d7eba3).

Data availability

The data analysed in this paper were provided by 24 Hour Fitness and we have their legal permission to share the deidentified data. We have therefore made deidentified data available at https://osf.io/9av87/?view_only=8bb9282111c24f81a19c2237e7d7eba3. Furthermore, tables of all of the preregistration links for each of the substudies with the interventions and the prediction studies are available in Supplementary Tables 2 and 30.

References

  • 1. Behavioural Insights and Public Policy: Lessons from Around the World (OECD Publishing, 2017).
  • 2. Benartzi S et al. Should governments invest more in nudging? Psychol. Sci. 28, 1041–1055 (2017).
  • 3. Charness G & Gneezy U. Incentives to exercise. Econometrica 77, 909–931 (2009).
  • 4. Acland D & Levy MR. Naiveté, projection bias, and habit formation in gym attendance. Manage. Sci. 61, 146–160 (2015).
  • 5. Royer H, Stehr M & Sydnor J. Incentives, commitments, and habit formation in exercise: evidence from a field experiment with workers at a Fortune-500 company. Am. Econ. J. Appl. Econ. 7, 51–84 (2015).
  • 6. Beshears J, Lee HN, Milkman KL, Mislavsky R & Wisdom J. Creating exercise habits using incentives: the tradeoff between flexibility and routinization. Manage. Sci. 67, 4139–4171 (2020).
  • 7. DellaVigna S & Linos E. RCTs to Scale: Comprehensive Evidence from Two Nudge Units 65 (National Bureau of Economic Research, 2020).
  • 8. DellaVigna S & Pope D. What motivates effort? Evidence and expert forecasts. Rev. Econ. Stud. 85, 1029–1069 (2018).
  • 9. DellaVigna S & Pope D. Predicting experimental results: who knows what? J. Polit. Econ. 126, 2410–2456 (2018).
  • 10. DellaVigna S, Pope D & Vivalt E. Predict science to improve science. Science 366, 428–429 (2019).
  • 11. Kristal AS & Whillans AV. What we can learn from five naturalistic field experiments that failed to shift commuter behaviour. Nat. Hum. Behav. 4, 169–176 (2020).
  • 12. Donoho D. 50 years of data science. J. Comput. Graph. Stat. 26, 745–766 (2017).
  • 13. Liberman M. Fred Jelinek. Comput. Linguist. 36, 595–599 (2010).
  • 14. Lai CK et al. Reducing implicit racial preferences: I. A comparative investigation of 17 interventions. J. Exp. Psychol. Gen. 143, 1765–1785 (2014).
  • 15. Lai CK et al. Reducing implicit racial preferences: II. Intervention effectiveness across time. J. Exp. Psychol. Gen. 145, 1001–1016 (2016).
  • 16. Mellers B et al. Psychological strategies for winning a geopolitical forecasting tournament. Psychol. Sci. 25, 1106–1115 (2014).
  • 17. Open Science Collaboration. Estimating the reproducibility of psychological science. Science 349, aac4716 (2015).
  • 18. Milkman KL, Minson JA & Volpp KGM. Holding the hunger games hostage at the gym: an evaluation of temptation bundling. Manage. Sci. 60, 283–299 (2014).
  • 19. Ward BW, Clarke TC, Nugent CN & Schiller JS. Early Release of Selected Estimates Based on Data From the 2015 National Health Interview Survey 120 (National Center for Health Statistics, 2015).
  • 20. Lee I-M et al. Effect of physical inactivity on major non-communicable diseases worldwide: an analysis of burden of disease and life expectancy. Lancet 380, 219–229 (2012).
  • 21. Gollwitzer PM. Implementation intentions: strong effects of simple plans. Am. Psychol. 54, 493–503 (1999).
  • 22. Milkman KL, Beshears J, Choi JJ, Laibson D & Madrian BC. Using implementation intentions prompts to enhance influenza vaccination rates. Proc. Natl Acad. Sci. USA 108, 10415–10420 (2011).
  • 23. Rogers T, Milkman KL, John LK & Norton MI. Beyond good intentions: prompting people to make plans improves follow-through on important tasks. Behav. Sci. Pol. 1, 33–41 (2015).
  • 24. Karlan D, McConnell M, Mullainathan S & Zinman J. Getting to the top of mind: how reminders increase saving. Manage. Sci. 62, 3393–3411 (2016).
  • 25. Homonoff TA. Can small incentives have large effects? The impact of taxes versus bonuses on disposable bag use. Am. Econ. J. Econ. Pol. 10, 177–210 (2018).
  • 26. Storey JD & Tibshirani R. Statistical significance for genomewide studies. Proc. Natl Acad. Sci. USA 100, 9440–9445 (2003).
  • 27. Allcott H. Social norms and energy conservation. J. Publ. Econ. 95, 1082–1095 (2011).
  • 28. Chapman GB, Li M, Colby H & Yoon H. Opting in vs opting out of influenza vaccination. JAMA 304, 43–44 (2010).
  • 29. Milkman KL et al. A megastudy of text-based nudges encouraging patients to get vaccinated at an upcoming doctor’s appointment. Proc. Natl Acad. Sci. USA 118, e2101165118 (2021).
  • 30. Lee MR & Shen M. Winner’s curse: bias estimation for total effects of features in online controlled experiments. In Proc. 24th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining 491–499 (ACM, 2018).
  • 31. White H. Asymptotic Theory for Econometricians (Elsevier, 1984).
  • 32. Dubner SJ. How goes the behavior-change revolution? (Ep. 382). Freakonomics https://freakonomics.com/podcast/live-philadelphia/ (2019).
