
Incremental efficacy systematic review and meta-analysis of psilocybin-for-depression RCTs
Nicholas C Borgogna
Tyler Owen
Dan Petrovitch
Jacob Vaughn
David A L Johnson
Louis A Pagano Jr
Stephen L Aita
Benjamin D Hill
Received 2024 Aug 27; Accepted 2025 Apr 5; Issue date 2025.
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
Abstract
Rationale
Psilocybin is a potentially paradigm-shifting depression intervention. We conducted a systematic review and meta-analysis of psilocybin-for-depression randomized controlled trials (RCTs).
Objectives
Systematically assess harm reporting, risk of bias, action mechanism specification, and incremental therapeutic effect sizes in the psilocybin-for-depression RCT literature.
Methods
Assessed databases included PsycINFO, CINAHL, Embase, Medline, Web of Science, and Scopus. Search terms “Psilocybin” or “Psychedelic” were paired with “Depression” and “Randomized Controlled Trial” or “RCT”.
Results
We identified k = 9 RCTs (k = 10 subgroups) involving n = 602 participants (56% psilocybin). Six studies had low/very low harm quality reporting, as opposed to two with high. Most studies demonstrated a high risk of bias. Therapeutic mechanisms of action (MoAs) were discussed in varying detail but rarely assessed in original publications. Psilocybin was moderately superior to controls at reducing depression (g = 0.62; 95% CI = 0.27, 0.98). Effects were heterogeneous (τ = .47). Smaller studies evidenced stronger effects that favored psilocybin (Egger’s b0 = 3.63, p = .014). Almost all studies documented financial conflicts of interest.
Conclusion
Psilocybin demonstrates significant depression reduction relative to controls. However, researchers, clinicians, and stakeholders should consider several contextual factors. Effects were moderate and attenuated in larger and better-controlled studies. Harm reporting quality was often low and risk of bias was high, though both were partly driven by unique challenges of psilocybin research. MoAs were variably specified but rarely assessed, suggesting it is unclear how depression is reduced. We advise that researchers conduct RCTs with active control conditions and larger samples, and include MoA assessments. Independent RCTs from researchers without financial conflicts of interest are needed.
Supplementary Information
The online version contains supplementary material available at 10.1007/s00213-025-06788-w.
Keywords: Depression, Psychedelics, Psilocybin, Randomized Controlled Trial, Harm Reporting, Risk of Bias, Effect Size, Mechanisms of Action
Depression affects an estimated 280 million individuals worldwide (World Health Organization [WHO], 2023). Despite decades of research and billions of dollars in funding, debate remains about the underlying etiology and best approach to treatment (Cai et al. 2020; Z. Li et al. 2021). Currently, the latent disease model is the established framework for diagnosing major depressive disorder (MDD) and related depressive disorders. All depression presentations are conceptualized as syndromes “characterized by a clinically significant disturbance in an individual’s cognition, emotion regulation, or behavior that reflects a dysfunction in the psychological, biological, or developmental processes underlying mental functioning” (American Psychiatric Association 2022). We use the broad term “depression” to describe the latent diseases cataloged in the Diagnostic and Statistical Manual of Mental Disorders, Fifth Edition, Text Revision (DSM-5-TR; American Psychiatric Association 2022) and International Classification of Diseases-11 (ICD-11; WHO, 2022). Depression typically involves prolonged sadness or anhedonia as core criteria along with other symptoms, such as changes in sleep or weight. However, others argue that depression has many features determined by individual and cultural differences (Juhasz et al. 2012).
Treating depression is an active area of clinical science. In the United States, the Food and Drug Administration (FDA) has currently approved 30 agents for depression treatment. These act on various neurotransmitter systems, with many affecting the serotonin system. Many agents not approved by the FDA for depression are also employed to treat it (e.g., quetiapine; Ignácio et al. 2018), especially when FDA-approved agents fail (Pappa et al. 2024; Reid et al. 2013). Multiple evidence-based behavioral interventions also exist (Butler et al. 2006; Feijo De Mello et al. 2005; Hayes et al. 2012).
While many interventions hold therapeutic potential, there are drawbacks. For example, psychotherapy interventions take a significant amount of time and are often expensive, and finding qualified therapists can be difficult. Similarly, pharmaceuticals, while relatively accessible, are often met with treatment resistance and/or aversive side effects (Cipriani et al. 2018; McIntyre et al. 2023). The FDA-approved antidepressants represent disparate pharmacological classes (e.g., esketamine and fluoxetine are both antidepressants yet pharmacologically very different), suggesting that neurobiological mechanisms underlying depression for any given person might differ drastically. Moreover, many clinicians initially take a “guess-and-check” approach with depression prescriptions (Zeier et al. 2018), as the optimal intervention for a given patient is often unclear. Recent research has also cast doubt on the validity of traditional neurotransmitter models, most notably the serotonin theory that underlies many traditional antidepressant interventions (Moncrieff et al. 2023). Some researchers have gone so far as to suggest that the true therapeutic mechanism underlying depression interventions is placebo (Cuijpers & Cristea 2015).
The problems associated with traditional interventions have led researchers to consider alternative frameworks. The psychedelic renaissance (Rhee et al. 2023; Schenberg 2018) represents one such paradigm shift. Prominent examples include classic psychedelics, such as lysergic acid diethylamide (LSD) and psilocybin that primarily affect serotonergic systems, and non-classic hallucinogens such as ketamine and 3,4-methylenedioxymethamphetamine (MDMA). Psychedelics occur as natural by-products of various organisms (e.g., mushrooms), but also can be synthetically derived (e.g., LSD). Various psychedelic compounds have been used for millennia as part of spiritual/religious ceremonies and for recreation. In Western science (Swanson 2018), psychedelics were first investigated by late 19th-century (Heffter 1898; Lewin 1888) and mid-twentieth century scholars (Eisner & Cohen 1958; Osmond 1957; Savage & McCabe 1973). However, by the 1970s, most psychedelic research was halted based on a confluence of media, political, and legal factors (Hall 2022). The past 20 years have been associated with decreasing stigmatization and increasing interest in potential psychedelic health benefits.
Psilocybin has received attention as a potential antidepressant (dos Santos et al. 2021; N. X. Li et al. 2022). Psilocybin is primarily a 5-HT1A/2A/2C agonist, with action on 5-HT2A explaining the well-documented hallucinogenic effects (Kometer et al. 2013; Nichols 2004). Initial modern clinical trials have generated interest in psilocybin as a possible depression intervention (Carhart-Harris et al. 2016; Grob et al. 2011), particularly for those suffering from treatment resistant depression (Carhart-Harris et al. 2016). More recent randomized controlled trials (RCTs) have also demonstrated hopeful results (Goodwin et al. 2022; von Rotz et al. 2023). In 2018, the FDA granted “breakthrough” status for psilocybin as an intervention for treatment resistant depression, and for MDD in 2019. Such results have led to widespread public interest and media coverage (Lamotte 2022). Scholarly opinion reports have also increased, suggesting a therapeutic potential for psilocybin across a diverse range of problems, including minority stress (Ortiz et al. 2022), dementia (Haniff et al. 2024), and compulsive sexual behavior disorder (Wizła et al. 2022).
While interest is high, several scholars have also voiced warnings that the therapeutic benefits of psychedelics (Yaden et al. 2022), including psilocybin (Rucker 2023), are highly preliminary, susceptible to blind penetration/expectancy effects, and that accompanying psychotherapy may drive the therapeutic effects (Meling et al. 2024; van Elk & Fried 2023). In a similar vein, psilocybin researchers have yet to identify definitive therapeutic action mechanisms between psilocybin and depression remission. Most psilocybin scholars have either not addressed therapeutic mechanisms (Gukasyan et al. 2022), posited mechanisms without assessing them (Grob et al. 2011), or acknowledged the therapeutic mechanisms are unknown (Johnson & Griffiths 2017). Indeed, the unique subjective effects produced by psilocybin render it challenging to blind clinical trials using traditional methods (Muthukumaraswamy et al. 2021; Nayak et al. 2023). Moreover, the accompanying hallucinations, as well as cognitive, emotional, and self-referential changes, also create pause regarding the functionality of widescale intervention implementation (Preller & Vollenweider 2016; Strickland & Johnson 2022).
Given the increased interest, coupled with the noted concerns, a critical review and meta-analysis of extant psilocybin-for-depression RCTs (i.e., psilocybin vs control) is warranted. To date, several meta-analyses have already evaluated the therapeutic effects of psilocybin on depression (Goldberg et al. 2020; Haikazian et al. 2023; N. X. Li et al. 2022; Metaxa & Clarke 2024; Perez et al. 2023; Yu et al. 2022). Many of these were conducted when the literature was notably smaller (e.g., Goldberg et al. 2020). Extant meta-analyses also tend to report net therapeutic effects relative to a statistical null (i.e., testing whether psilocybin intervention reduced depression scores relative to baseline). However, almost none of these meta-analyses considered incremental therapeutic effects in relation to control interventions. Even Yu et al. (2022), one of the few meta-analyses to examine standardized mean differences comparing psilocybin intervention to a control, only included four RCTs in their analyses (as this was the size of the psilocybin RCT literature at the time). Additionally, at least one recent meta-analysis (Metaxa & Clarke 2024) was criticized within days of release due to questionable reporting (Cristea et al. 2024). Given the relatively fast paced nature of psychedelic research, we believe it is important to re-review many of the formerly reviewed RCTs, in addition to the new studies, with a specific aim for incremental efficacy.
Concurrently, popular opinion pieces often describe the benefits of psilocybin with minimized consideration for potential harm. A recent meta-analysis demonstrated that researchers have broadly mischaracterized or underreported potential harms across published ketamine (a similar hallucinogenic agent) for depression trials (Taillefer de Laportalière et al. 2023). The current psilocybin enthusiasm described in popular media could be misleading regarding the risk/benefit trade-offs associated with psilocybin. Indeed, De Giorgi and Ede (2024) noted the contradictory messages pervasive within the psychedelic renaissance, such as psilocybin having “negligible side effects” including “confusional states, substance misuse, intentional self-harm, suicidal behaviour, and psychotic symptoms” (p. 1). Having estimates of bias risk and harm quality reporting would help stakeholders contextualize any observed incremental therapeutic effects.
Additionally, we were curious about the degree to which psilocybin scholars and practitioners specify and/or measure potential therapeutic mechanisms of action (MoA) within their RCTs. This is common practice in other areas of medicine, but relatively underutilized in psychiatry (e.g., confirming serotonin dysregulation before SSRI prescription). For example, psilocybin is associated with many well documented pharmacological processes, such as 5-HT1A/2A/2C agonist activity (Kometer et al. 2013; Nichols 2004). However, the degree to which such processes are measured and/or explained is less clear. As such, we wanted to synthesize how hypothesized MoAs are considered and/or measured by researchers conducting psilocybin-for-depression RCTs.
Current study
The current systematic review and meta-analysis aimed to critically evaluate the state of the psilocybin-for-depression RCT literature. Specifically, we aimed to:
Systematically assess harm reporting and bias risk within extant psilocybin-for-depression RCTs.
Review the degree to which theoretically derived therapeutic action mechanisms are discussed within the psilocybin-for-depression RCT literature.
Ascertain incremental efficacy of psilocybin relative to control conditions by comparing pooled effect sizes of depression reduction across RCT conditions.
Methods
Search strategy
The Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA; Page et al. 2021) guidelines were followed for this study (see Fig. 1 for study selection flowchart). The databases used to gather the articles were PsycINFO, CINAHL, Embase, Medline, Web of Science, and Scopus. Search terms “Psilocybin” or “Psychedelic” were paired with “Depression” and “Randomized Controlled Trial” or “RCT”. Search terms were identical across databases. The search included all published research through March 2024.
Fig. 1.
PRISMA 2020 flow diagram for new systematic reviews which included searches of databases, registers and other sources
Inclusion/Exclusion Criteria
The following inclusion criteria were utilized: a) human subjects research, b) psilocybin administration, c) inclusion of participants in a control condition, d) inclusion of at least one validated measure of depression, e) RCT design component (cross-over designs were included for data gathered prior to the cross over), and f) sufficient information available to extract effect sizes (pre/post between-subjects n’s, means, and standard deviations).
Exclusion criteria were a) secondary analyses of original RCT data, b) non-peer-reviewed publication outlets, and c) psilocybin-only control conditions (i.e., dosage studies). English language publication was not an exclusion criterion.
Data extraction and study selection criteria
After completing the initial literature search, titles and abstracts were assessed. If the abstract suggested the study would meet inclusion criteria, the full text was reviewed by a team member. Studies that met inclusion criteria had means (mean change when raw means were not available), standard deviations/errors, n’s, therapeutic and pharmacological mechanisms, and basic demographic information extracted for analyses. Any extraction/coding issues were resolved via group discussion. When data were unclear or unavailable, corresponding authors were emailed. Consistent with Yu et al. (2022), if we could not obtain specific estimates, we extracted estimates from figures using the Web Plot Digitizer tool (version 4) to estimate means and deviation estimates (Automeris LLC. 2024). We also tested the Web Plot Digitizer tool on our charts and observed the tool to be sufficiently accurate (largest error observed was within 0.02). To minimize researcher bias, the first author did not provide input on coding/extraction decisions besides determining initial criteria.
Harm reporting
We followed previous review work (Taillefer de Laportalière et al. 2023) by using the 21-item adverse reporting checklist (Ioannidis et al. 2004; Taillefer de Laportalière et al. 2023) informed by the Consolidated Standards of Reporting Trials (CONSORT) Extension of Harms checklist (Ioannidis et al. 2004). Questions were evaluated on a binary scale where “1” indicates they met criteria. Coding was conducted by two trained research assistants, then replicated by two co-authors (TO and JV), then re-evaluated by additional team members selected from the broader authorship list. Discrepancies were resolved through team discussion. Specific in-text evidence for a score of “1” for each criterion was extracted to aid in replication. Additionally, excerpts were extracted for evidence of partial criteria (these were still scored as “0” if full criteria were not met). Only original manuscript materials were considered during harm reporting review (i.e., supplemental materials were excluded). Scores for each item on the checklist were summed and described as high quality (17–21); moderate quality (12–16); low quality (7–11); or very low quality (0–6; Taillefer de Laportalière et al. 2023).
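The scoring rule described above can be sketched in a few lines. This is an illustration only, not the coding procedure the review team used; the function name `harm_quality` is hypothetical, and the example row is taken from the Goodwin et al. (2022) entries in Table 2.

```python
from typing import Sequence

# Quality bands from the text: very low 0-6, low 7-11, moderate 12-16, high 17-21.
BANDS = [(17, "High"), (12, "Moderate"), (7, "Low"), (0, "Very Low")]


def harm_quality(items: Sequence[int]) -> tuple[int, str]:
    """Sum 21 binary checklist items and map the total to a quality band."""
    if len(items) != 21 or any(v not in (0, 1) for v in items):
        raise ValueError("expected 21 items, each coded 0 or 1")
    total = sum(items)
    # BANDS is ordered from the highest floor down, so the first match wins.
    label = next(name for floor, name in BANDS if total >= floor)
    return total, label


# Goodwin et al. (2022) row of Table 2, criteria 1 through 10b.
goodwin = [1, 0, 1, 1, 1, 1, 1, 1, 0, 1, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1]
print(harm_quality(goodwin))
```

Because every item is weighted equally, a study can reach a given band by very different routes, which is why the item-level table is reported alongside the summed score.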
Risk of bias
Risk of bias was assessed via Cochrane’s Risk of Bias 2.0 tool (RoB 2.0; Higgins et al. 2019; Sterne et al. 2019). RoB 2.0 is the recommended tool for evaluating likelihood of bias in RCTs. It consists of five risk domains (risk from the randomization process, deviations from the intended intervention, missing outcome data, measurement of the outcome, and selection of the reported results). Each domain has multiple questions that are measured on a four-level metric from yes to no with a “no information provided” option. The risk of bias for each domain is qualitatively described as low, some concerns, or high, with an overall risk of bias for the study’s effect. The risk of bias by domain and by study is determined by an algorithm (Sterne et al. 2019), with question-level metrics informing domain metrics, which in turn inform study metrics. However, despite the algorithm being provided by the tool creators, it is appropriate to override the estimate of bias in instances where a particular risk of bias does not appear to be of concern or, conversely, when a particular risk of bias is concerning but not detected by the algorithm. Two coauthors (JV and DP) independently conducted risk of bias assessment using the RoB 2.0 tool. Initial interrater reliability for RoB coding was Cohen’s κ = 0.57 (moderate agreement). Ratings of “probable yes” and “probable no” were collapsed into their respective “yes” and “no” classifications. After discussion and re-review, 100% agreement was achieved.
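For readers unfamiliar with the agreement statistic, Cohen’s κ compares observed agreement between two raters with the agreement expected by chance from each rater’s marginal frequencies. The sketch below is illustrative: the ratings are hypothetical and are not the RoB codes assigned in this review, and the function name `cohens_kappa` is our own.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters assigning categorical codes to the same items."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("ratings must be non-empty and of equal length")
    n = len(rater_a)
    # Proportion of items on which the two raters gave the same code.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement from the product of each rater's marginal proportions.
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / n**2
    if expected == 1.0:
        raise ValueError("kappa is undefined when chance agreement equals 1")
    return (observed - expected) / (1 - expected)


# Hypothetical domain-level ratings from two assessors (not study data).
rater_1 = ["low", "low", "high", "some", "high", "low", "high", "some"]
rater_2 = ["low", "some", "high", "some", "high", "low", "low", "some"]
print(round(cohens_kappa(rater_1, rater_2), 2))
```

In this made-up example the raters agree on 6 of 8 items, yet κ is noticeably lower than 0.75 because part of that agreement is expected by chance.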
Action mechanisms synthesis
We completed a review of all posited therapeutic MoAs reported in the reviewed RCTs. One co-author (DP) reviewed the introduction and discussion sections of each manuscript and extracted the original author wordings that were used to describe how psilocybin reduced depression. Separate classifications/appraisals were afforded for neurophysiological (e.g., neurotransmitter mechanisms) and psychological mechanisms (e.g., spirituality mechanisms). Co-authors (TO and JV) then independently reviewed and coded each RCT as either “Highly Specific”, “Moderately Specific”, “Vague”, or “NA” based on each manuscript’s level of detail of the extracted MoA(s). The highest code received by a study’s MoA descriptions became the study’s overall classification for that type of mechanism. Discrepancies between the reviewers were settled by team discussion between DP, JV, and TO.
Incremental efficacy data analysis
Effect sizes were analyzed using Comprehensive Meta-Analysis (v4) software (Borenstein et al. 2013). All analyses were modeled under random effects. Hedges’ g was selected as the index of effect size to adjust for the relatively low n’s observed across many of the study conditions. Post-score standard deviation was used to standardize the effect sizes; this was chosen instead of pre-post correlations due to a lack of reporting of pre-post correlations. In all analyses, we examined the incremental effect sizes of psilocybin intervention relative to control. That is, we compared the within group change across timepoints between the groups. Incremental Hedges’ g represents the strength of the treatment group (psilocybin) relative to the control in reducing depression in standard deviation units. Consistent with Pizer et al. (2024), we adopted Ferguson’s (2009) recommendation of Hedges’ g ≥ 0.41 as the threshold for minimal “practical” significance.
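To make the incremental effect size concrete, the following is a minimal sketch under stated assumptions: the between-group difference in pre-to-post change is divided by the pooled post-score standard deviation and multiplied by the usual small-sample correction factor. The exact computation performed inside Comprehensive Meta-Analysis may differ, and the function name and input values are hypothetical.

```python
import math


def incremental_hedges_g(pre_t, post_t, sd_post_t, n_t,
                         pre_c, post_c, sd_post_c, n_c):
    """Difference in pre-to-post change between groups, standardized by the
    pooled post-score SD, with Hedges' small-sample correction applied."""
    change_t = pre_t - post_t  # positive when depression scores fell
    change_c = pre_c - post_c
    df = n_t + n_c - 2
    pooled_sd = math.sqrt(
        ((n_t - 1) * sd_post_t**2 + (n_c - 1) * sd_post_c**2) / df
    )
    d = (change_t - change_c) / pooled_sd
    j = 1 - 3 / (4 * df - 1)  # correction factor shrinks d in small samples
    return j * d


# Hypothetical trial: both arms start at 24; psilocybin ends at 12, control at 18.
g = incremental_hedges_g(24, 12, 8, 20, 24, 18, 8, 20)
print(round(g, 3), g >= 0.41)  # second value: meets the 0.41 practical threshold?
```

Positive values indicate greater depression reduction in the psilocybin arm, matching the coding direction used in the analyses.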
Indicators were coded such that positive values represented a greater therapeutic effect in the psilocybin conditions relative to controls (greater depression reduction). We conducted meta-analyses where all psilocybin conditions were compared to a novel control condition with time points and depression measures nested within study subgroups. Study subgroups were set as the unit of analysis. Publication bias was statistically assessed using Egger’s regression test, which regresses the standardized effect sizes (each effect divided by its standard error) on the inverse of the standard error and tests whether the intercept (b0) differs from zero. Duval and Tweedie’s trim-and-fill method was used to assess the possibility of missing studies due to publication bias (Duval & Tweedie 2000). Tau (τ) estimates were also calculated to provide heterogeneity context.
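The pooling and small-study checks described above can be illustrated as follows. This is a sketch, not a reproduction of the Comprehensive Meta-Analysis output: it assumes a DerSimonian-Laird (method of moments) estimate of τ, uses a plain least-squares fit for Egger’s intercept, and runs on hypothetical effect sizes and variances.

```python
import math


def dersimonian_laird(effects, variances):
    """Random-effects pooled estimate, its standard error, and tau."""
    w = [1 / v for v in variances]
    sw = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sw
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))
    c = sw - sum(wi**2 for wi in w) / sw
    tau2 = max(0.0, (q - (len(effects) - 1)) / c)  # truncated at zero
    w_star = [1 / (v + tau2) for v in variances]
    pooled = sum(wi * yi for wi, yi in zip(w_star, effects)) / sum(w_star)
    return pooled, math.sqrt(1 / sum(w_star)), math.sqrt(tau2)


def egger_intercept(effects, variances):
    """Regress standardized effects on precision; return intercept b0 and its t."""
    se = [math.sqrt(v) for v in variances]
    x = [1 / s for s in se]                   # precision
    y = [e / s for e, s in zip(effects, se)]  # standardized effect
    k = len(x)
    mx, my = sum(x) / k, sum(y) / k
    sxx = sum((xi - mx) ** 2 for xi in x)
    slope = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / sxx
    b0 = my - slope * mx
    resid_var = sum((yi - b0 - slope * xi) ** 2 for xi, yi in zip(x, y)) / (k - 2)
    se_b0 = math.sqrt(resid_var * (1 / k + mx**2 / sxx))
    return b0, b0 / se_b0  # compare t against a t distribution with k - 2 df


# Hypothetical data in which smaller (higher-variance) studies show larger effects.
effects = [1.2, 0.9, 0.6, 0.4, 0.3]
variances = [0.25, 0.16, 0.09, 0.04, 0.01]
print(dersimonian_laird(effects, variances))
print(egger_intercept(effects, variances))
```

A positive intercept, as in this made-up example, is the pattern expected when smaller studies report larger effects favoring the treatment.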
Data availability
CMA data sheets are available on the Open Science Framework at https://osf.io/p3fcx/.
Results
Study characteristics, including demographic data, are available in Table 1. Overall, k = 9 studies met inclusion criteria (see Fig. 1, also see Supplementary File 1 for exclusions). Because one study involved a “high” and a “moderate” psilocybin dose condition, k = 10 treatment conditions were evaluated against controls in meta-analyses. In total, a sample of n = 602 participants (n = 337, 56% in psilocybin conditions) were evaluated.
Table 1.
Study characteristics
| Reference | n (Psilocybin) | n (Control) | Age (Mean) | % Male | % White | Outcome Period** | Funding? | Sample | Financial COI | Psychotherapy Component |
|---|---|---|---|---|---|---|---|---|---|---|
| Back et al. (2024) | 15 | 15 | 38 | 50 | 25 | 28 days | Yes | Moderate-to-severe depression symptoms | Yes | Clinical Facilitation |
| Carhart-Harris et al. (2021) | 30 | 29 | 43.3 | 63 | 93 | 6 weeks | Yes | Moderate-to-severe MDD | Yes | Clinical facilitation |
| Davis et al. (2021) | 13 | 11 | 39.8 | 33 | 92 | 4 weeks | Yes | MDD | Yes | Clinical facilitation |
| Goodwin et al. (2022) | 154* | 79 | 39.8 | 48 | 92 | 3 weeks | Yes | Treatment resistant MDD | Yes | Clinical facilitation |
| Marschall et al. (2022) | 18 | 23 | 30.05b | 38.95b | NS | 3 weeks | No | Attendees of micro-dosing workshops | No | NS |
| Raison et al. (2023) | 51 | 53 | 41.1 | 50 | 91 | 43 days | Yes | Moderate-to-severe MDD | Yes | Clinical facilitation |
| Rosenblat et al. (2024)a | 16 | 14 | 44.4 | 61.3 | NS | 2 weeks | Yes | Treatment resistant MDD (or BDII) | Yes | Yes |
| Ross et al. (2016) | 14 | 14c | 56.28 | 38 | 90 | 7 weeks | Yes | Individuals with cancer suffering from AD or GAD | No | Yes |
| von Rotz et al. (2023) | 26 | 26 | 36.75 | 36.55 | 94.2 | 2 weeks | Yes | MDD | Yes | Clinical facilitation |
NS Not Specified, COI Conflict of Interest, MDD Major Depressive Disorder, BDII Bipolar Disorder II, ASD Acute Stress Disorder, AD Adjustment Disorder, GAD Generalized Anxiety Disorder. When overall sample demographics were not reported, the average between groups estimate was calculated (applies to the age, % male, and % White columns). *The 154 represents two groups (10 mg and 25 mg of psilocybin). a Demographics were collected after assignment, with 31 total participants; however, we include the sample sizes that completed the primary endpoint (30 participants). One participant dropped out prior to treatment, and an additional participant dropped out due to adverse events; however, it was not specified from which group (29 total participants completed the experiment to the primary endpoint). b These values were calculated using the full sample from the initial assignment and were retrieved from the OSF repository. c “They report an additional participant withdrew from the Niacin control group prior to the six-week post first dose assessment in their CONSORT diagram, however in their NCT they list the control group as 15 participants.” **Note also that many authors report gathering additional follow-up information beyond the specified outcome period.
Psilocybin dosage modestly varied across conditions with Back et al. (2024), Carhart-Harris et al. (2021), Goodwin et al. (2022), Rosenblat et al. (2024), and Raison et al. (2023) administering 25 mg of psilocybin. Control conditions were more variable in content and dosage. Three studies used Niacin (i.e., vitamin B3 [differing doses]) as a control (Back et al. 2024; Raison et al. 2023; Ross et al. 2016); one study utilized Mannitol (von Rotz et al. 2023); one study utilized non-psychedelic mushrooms (Marschall et al. 2022); two studies involved waitlist controls (Davis et al. 2021; Rosenblat et al. 2024); one study used an inert psilocybin control (1 mg psilocybin; Goodwin et al. 2022); and one study utilized escitalopram, but also included an inert psilocybin dose (1 mg) for control (Carhart-Harris et al. 2021). Seven of the studies had n’s ≤ 30 per condition. Extracted depression measures included: Beck Depression Inventory Amended (BDI-1A; Beck and Steer 1993), 17-item Hamilton Depression Rating Scale (Bech 2010), Montgomery-Åsberg Depression Rating Scale (Montgomery and Åsberg 1979), Quick Inventory of Depressive Symptomatology-Self-Report (Rush et al. 2003), Beck Depression Inventory-II (BDI-II; Beck et al. 1996), Grid Hamilton Rating Scale for Depression (Kalali et al. 2002), Depression Anxiety Stress Scale-21 (Lovibond and Lovibond 1995), Oxford Depression Questionnaire (Price et al. 2012), Hospital Anxiety and Depression Scale-Depression (Zigmond and Snaith 1983), and Symptom Checklist-90-Revised (Derogatis 1977). Time point assessments varied across studies. No study assessed between group outcomes beyond three months. Almost all studies (k = 7) reported a financial conflict of interest.
Harm reporting
Table 2 breaks down harm reporting criteria by each RCT/criterion. Quality was heterogeneous. Over half (k = 6) of the studies were classified as low or very low, one was classified as moderate, and two as high. Criterion 4d was universally missed (i.e., “Described the plan for monitoring for harms and rules for stopping the trial because of harms”). Most studies missed criteria 2 (“Information on AEs mentioned in the introduction”), 3a (“Definitions of AEs mentioned”), 3b (“If article mentioned all or selected sample of AE”), 3c (“If article mentioned the use of a validated instrument to report AEs severity”), 4c (“Description of how AE were attributed to trial drugs”), 5b (“Description of approach for the handling of recurrent AEs”), 7a (“Provided denominators for AEs”), 8a (“Reported results separately for each treatment arm”), 8c (“Provided both number of AEs and number of patients with AEs”), and 9 (“Described subgroup analysis and exploratory analysis for harms”). Criteria 4b (“Stated the timing of collection of AE data”) and 6b (“Reported deaths and serious AEs”) were almost always met. Marschall et al. (2022) had the most missed scores by a substantial margin. Goodwin et al. (2022) and Raison et al. (2023) had the best harm quality reporting. For a list of specific appraisal criteria see Supplementary File 2. Extracted excerpts from each reviewed manuscript are also provided in Supplementary Files 3–11. We also provide excerpts of when criteria were partially fulfilled, such as in the case of meeting one barrel of a double-barreled question (e.g., criteria 8b “Severity and grading of AEs”), and of when authors provided indirect harm reporting evidence (which was still considered insufficient for a favorable score, but could conceivably be linked to harm reporting).
Table 2.
Harm reporting criteria on the extension of harm checklist
| Study | 1 | 2 | 3a | 3b | 3c | 4a | 4b | 4c | 4d | 5a | 5b | 6a | 6b | 7a | 7b | 8a | 8b | 8c | 9 | 10a | 10b | k | % | Quality |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Back et al. 2024 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 1 | 1 | 1 | 1 | 1 | 1 | 0 | 0 | 0 | 0 | 7 | 33% | Low |
| Carhart-Harris et al. 2021 | 1 | 0 | 0 | 1 | 0 | 1 | 1 | 0 | 0 | 1 | 0 | 1 | 1 | 1 | 1 | 1 | 0 | 0 | 0 | 1 | 1 | 12 | 57% | Moderate |
| Davis et al. 2021 | 0 | 0 | 0 | 0 | 0 | 1 | 1 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 1 | 0 | 0 | 1 | 1 | 6 | 29% | Very Low |
| Goodwin et al. 2022 | 1 | 0 | 1 | 1 | 1 | 1 | 1 | 1 | 0 | 1 | 0 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 18 | 86% | High |
| Marschall et al. 2022 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0% | Very Low |
| Raison et al. 2023 | 1 | 1 | 0 | 1 | 1 | 0 | 1 | 1 | 0 | 1 | 0 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 0 | 17 | 81% | High |
| Rosenblat et al. 2024 | 1 | 0 | 0 | 1 | 0 | 1 | 1 | 1 | 0 | 1 | 1 | 0 | 1 | 0 | 1 | 0 | 0 | 0 | 0 | 1 | 1 | 11 | 52% | Low |
| Ross et al. 2016 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 1 | 0 | 1 | 0 | 1 | 1 | 0 | 0 | 0 | 1 | 0 | 0 | 1 | 1 | 8 | 38% | Low |
| von Rotz et al. 2023 | 1 | 0 | 0 | 0 | 0 | 1 | 1 | 0 | 0 | 0 | 0 | 1 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 6 | 29% | Very Low |
| Total | 56% | 11% | 11% | 44% | 22% | 56% | 89% | 44% | 0% | 56% | 11% | 67% | 89% | 44% | 56% | 44% | 56% | 22% | 22% | 78% | 56% | | | |
Risk of bias
Outcomes from the risk of bias review are available in Table 3. Full results including specific RCT-by-criteria rating are available in Supplementary File 12. RoB 2.0 algorithm results suggested “High” overall bias across all RCTs. Reviewer subjective appraisal concurred with the algorithm that overall risk of bias was high for six studies, though diverged from the algorithm for Back et al. (2024), Goodwin et al. (2022), and Marschall et al. (2022), which we thought would be better classified as having “Some Concerns”. Reasons for bias scores were largely clustered around the measurement of the outcome domain. Otherwise, risk of bias was relatively heterogeneous across domains/RCTs. Rosenblat et al. (2024) was determined to have the highest risk, being classified as “High” by both the algorithm and assessors across three different domains. In contrast, Goodwin et al. (2022) generally evidenced the lowest risk of bias.
Table 3.
Risk of bias
| Study | Randomization Process: Algorithm Result | Randomization Process: Assessor Judgement | Deviations from Intended Interventions: Algorithm Result | Deviations from Intended Interventions: Assessor Judgement | Missing Outcome Data: Algorithm Result | Missing Outcome Data: Assessor Judgement | Measurement of the Outcome: Algorithm Result | Measurement of the Outcome: Assessor Judgement | Selection of the Reported Results: Algorithm Result | Selection of the Reported Results: Assessor Judgement | Overall Bias: Algorithm Result | Overall Bias: Assessor Judgement |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Back et al. 2024 | Low | Low | Low | Low | Low | Low | High | Some concerns | Low | Low | High | Some concerns |
| Carhart-Harris et al. 2021 | Some concerns | Some concerns | Low | Low | Low | Low | High | High | Low | Low | High | High |
| Davis et al. 2021 | Low | Low | High | High | High | High | High | High | Low | Low | High | High |
| Goodwin et al. 2022 | Low | Low | Low | Low | Low | Low | High | High | Low | Low | High | Some concerns |
| Marschall et al. 2022 | Low | Some concerns | High | High | High | Some concerns | Some concerns | Some concerns | Low | Low | High | Some concerns |
| Raison et al. 2023 | Low | Low | High | High | Low | Low | High | High | Low | Low | High | High |
| Ross et al. 2016 | Some concerns | Some concerns | High | High | Some concerns | Some concerns | High | High | Some concerns | Low | High | High |
| von Rotz et al. 2023 | Low | Low | Low | Some concerns | Low | Low | High | High | High | Low | High | High |
| Rosenblat et al. 2024 | High | High | Some concerns | Some concerns | Low | Low | High | High | High | High | High | High |
See Supplementary File 12 for additional details
Action mechanisms
Table 4 illustrates results from the MoA synthesis. Overall, most RCTs provided at least some level of description for neurophysiological and/or psychological MoAs. Ross et al. (2016) and Marschall et al. (2022) were the only studies classified as “Highly Specific” in both domains. Three studies did not specify any neurophysiological mechanisms (Goodwin et al. 2022; Raison et al. 2023; Rosenblat et al. 2024), while Raison et al. (2023) was the only study not to specify a psychological mechanism. For a comprehensive list of extracted MoA statements with accompanying classification, see Supplementary File 13. The most commonly mentioned MoA was action on 5-HT2A, though detail regarding how 5-HT2A functioned in relation to depression reduction was variable. Additionally, MoA statements tended to occur in discussion sections (k = 24 extracted MoA statements), relative to introductions (k = 15; k = 6 of which came from Marschall et al. 2022).
Table 4.
Summary of descriptions of mechanisms of action for psilocybin therapy
| Reference | Neurophysiological Mechanism | Psychological Mechanism |
|---|---|---|
| Back et al. (2024) | Slightly Specific | Slightly Specific |
| Carhart-Harris et al. (2021) | Highly Specific | Vague |
| Davis et al. (2021) | Vague | Slightly Specific |
| Goodwin et al. (2022) | NA | Vague |
| Marschall et al. (2022) | Highly Specific | Highly Specific |
| Raison et al. (2023) | NA | NA |
| Rosenblat et al. (2024) | NA | Vague |
| Ross et al. (2016) | Highly Specific | Highly Specific |
| von Rotz et al. (2023) | Slightly Specific | Slightly Specific |
NA = Not mentioned. See Supplementary File 13 for the extracted language that we used to qualify the “Highly Specific”, “Slightly Specific”, and “Vague” labels
Incremental efficacy meta-analyses
Random effects meta-analysis of all available effect sizes of psilocybin intervention relative to control conditions suggested a significant incremental effect favoring psilocybin as statistically superior at reducing depression symptoms, g = 0.69; 95% CI = 0.34, 1.04. The results were heterogeneous, τ = 0.47, Q(9) = 39.25, p < 0.001; see Fig. 2 for a forest plot of effect sizes. Given the heterogeneity, we provide omnibus results but recommend evaluating each incremental effect size to obtain a comprehensive picture of the research body. Table 5 provides the incremental effects by study subgroup with relative weights. Supplementary File 14 provides a breakdown of each incremental effect size across all RCTs (note that the individual incremental effects in Supplementary File 14 are not adjusted for study-level dependence). Omnibus publication bias estimates indicated a significant correlation between Hedges’ g and standard error (Egger’s b0 = 3.64, SE = 1.36, t[8] = 2.68, p = 0.014), such that smaller studies tended to yield larger effects that favored psilocybin. The trim-and-fill procedure removed one study, giving an adjusted incremental g = 0.56; 95% CI = 0.16, 0.95. Subgroup analyses indicated the effect was driven by studies where psilocybin was compared against non-intervention controls, g = 0.76; 95% CI = 0.37, 1.14, relative to active controls (i.e., escitalopram/inert psilocybin; Carhart-Harris et al. 2021), which demonstrated a non-significant incremental effect, g = 0.21; 95% CI = −0.31, 0.73. Because Marschall et al. (2022) examined microdosing, we conducted a sensitivity analysis with their study removed. However, omnibus results did not measurably change, g = 0.76; 95% CI = 0.39, 1.14.
Fig. 2.
Forest Plot by Subgroup. Note: Goodwin et al. (2022) comparator group is not counted twice in the sample size overall
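As a rough illustration of how an omnibus estimate of this kind can be approximated from published summary values, the following Python sketch pools the subgroup effects reported in Table 5 with a DerSimonian-Laird random-effects estimator. It is a sketch under stated assumptions, not the analysis pipeline used for this review: the choice of estimator, the recovery of standard errors from the rounded 95% CIs (assuming normal-theory intervals), and the treatment of the two Goodwin et al. (2022) dose arms as independent subgroups are all assumptions, and the function and variable names are hypothetical.

```python
import math

# (label, Hedges' g, 95% CI lower, 95% CI upper), transcribed from Table 5.
SUBGROUPS = [
    ("Back 2024", 1.34, 0.57, 2.12),
    ("Carhart-Harris 2021", 0.19, -0.33, 0.71),
    ("Davis 2021", 3.08, 1.89, 4.27),
    ("Goodwin 2022, 10 mg", 0.12, -0.20, 0.43),
    ("Goodwin 2022, 25 mg", 0.40, 0.08, 0.71),
    ("Marschall 2022", 0.05, -0.55, 0.66),
    ("Raison 2023", 0.73, 0.34, 1.13),
    ("Rosenblat 2024", 1.33, 0.55, 2.12),
    ("Ross 2016", 0.34, -0.39, 1.08),
    ("von Rotz 2023", 0.82, 0.26, 1.38),
]

Z95 = 1.959964  # two-sided 95% normal quantile (assumed basis of the CIs)


def dersimonian_laird(rows):
    """Pool effect sizes with DerSimonian-Laird random-effects weights."""
    g = [row[1] for row in rows]
    # Back out each sampling variance from the width of its 95% CI.
    var = [((row[3] - row[2]) / (2 * Z95)) ** 2 for row in rows]

    # Fixed-effect step: inverse-variance weights and Cochran's Q.
    w = [1.0 / v for v in var]
    sum_w = sum(w)
    fixed = sum(wi * gi for wi, gi in zip(w, g)) / sum_w
    q = sum(wi * (gi - fixed) ** 2 for wi, gi in zip(w, g))

    # Method-of-moments between-study variance, floored at zero.
    c = sum_w - sum(wi ** 2 for wi in w) / sum_w
    tau2 = max(0.0, (q - (len(g) - 1)) / c)

    # Random-effects step: add tau^2 to every sampling variance.
    w_re = [1.0 / (v + tau2) for v in var]
    sum_w_re = sum(w_re)
    pooled = sum(wi * gi for wi, gi in zip(w_re, g)) / sum_w_re
    se = math.sqrt(1.0 / sum_w_re)
    return {
        "g": pooled,
        "ci": (pooled - Z95 * se, pooled + Z95 * se),
        "tau": math.sqrt(tau2),
        "Q": q,
        "weights_pct": [100.0 * wi / sum_w_re for wi in w_re],
    }


result = dersimonian_laird(SUBGROUPS)
print(f"pooled g = {result['g']:.2f}, "
      f"95% CI = {result['ci'][0]:.2f}, {result['ci'][1]:.2f}")
print(f"tau = {result['tau']:.2f}, Q({len(SUBGROUPS) - 1}) = {result['Q']:.2f}")
```

Under these assumptions the pooled estimate and τ should land close to the reported omnibus values, although rounding in the tabled CIs means exact agreement should not be expected. A leave-one-out check in the spirit of the microdosing sensitivity analysis can be run by filtering the input, for example `dersimonian_laird([r for r in SUBGROUPS if not r[0].startswith("Marschall")])`.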
Table 5.
Incremental effects by study subgroup
| Reference | Psilocybin Dose | Control Type | Measures Used | Time Points | n (Psilocybin) | n (Control) | Hedges’ g | 95% CI | Relative Weight |
|---|---|---|---|---|---|---|---|---|---|
| Back et al. (2024) | 25 mg | 100 mg Niacin | MADRS | 28 days | 15 | 15 | 1.34* | 0.57, 2.12 | 8.41 |
| Carhart-Harris et al. (2021) | 25 mg | 1 mg Psilocybin, 10 mg Escitalopram | BDI-1A, HAM-D-17, MADRS, QIDS-SR-16 | 3–42 days | 30 | 29 | 0.19 | −0.33, 0.71 | 10.88 |
| Davis et al. (2021) | 20 mg/70 kg (dose 1); 30 mg/70 kg (dose 2) | WLC | BDI-II, GRID-HAMD, QIDS-SR | 35–56 days | 13 | 11 | 3.08* | 1.89, 4.27 | 5.39 |
| Goodwin et al. (2022) | 10 mg; 25 mg | 1 mg Psilocybin | MADRS | 21 days | 79; 75 | 79 | 0.12; 0.40* | −0.20, 0.43; 0.08, 0.71 | 12.81; 12.82 |
| Marschall et al. (2022) | 0.7 g of psilocybin Galindoi truffles | Placebo (mushrooms) | DASS-21 | 21 days | 18 | 23 | 0.05 | −0.55, 0.66 | 10.02 |
| Raison et al. (2023) | 25 mg | 100 mg Niacin | MADRS, ODQ Symptoms | 1–43 days | 51 | 53 | 0.73* | 0.34, 1.13 | 12.08 |
| Rosenblat et al. (2024) | 25 mg | WLC | MADRS | 14 days | 16 | 14 | 1.33* | 0.55, 2.12 | 8.35 |
| Ross et al. (2016)^a | 0.3 mg/kg | 250 mg Niacin | HADS-D, BDI | 2–14 days | 14 | 15 | 0.34 | −0.39, 1.08 | 8.80 |
| von Rotz et al. (2023)^a | 0.215 mg/kg | 1 or 5 mg Mannitol | BDI, SCL-90-R, MADRS | 14 days | 26 | 26 | 0.82* | 0.26, 1.38 | 8.41 |
| Omnibus Effect | - | - | - | - | 337 | 265 | 0.69* | 0.34, 1.04 | 100 |
*Significant at the level of p < .05. ^a The baseline values that scores are referenced against were taken 1 day before psilocybin administration; both studies also included a baseline several weeks before psilocybin administration. WLC = waitlist control, BDI-1A = Beck Depression Inventory Amended, HAM-D-17 = 17-item Hamilton Depression Rating Scale, MADRS = Montgomery-Asberg Depression Rating Scale, QIDS-SR-16 = 16-item Quick Inventory of Depressive Symptomatology, Self-Report, BDI-II = Beck Depression Inventory II, GRID-HAMD = Grid Hamilton Rating Scale for Depression, QIDS-SR = Quick Inventory of Depressive Symptomatology, Self-Report, DASS-21 = Depression Anxiety Stress Scale-21, ODQ = Oxford Depression Questionnaire, HADS-D = Hospital Anxiety and Depression Scale-Depression, SCL-90-R = Symptom Checklist-90-Revised
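The small-study asymmetry check reported alongside the omnibus estimate can be illustrated in the same spirit. The Python sketch below runs a classical Egger regression (standardized effect regressed on precision, with the intercept as the asymmetry statistic) on the Table 5 values. As before, this is an assumption-laden sketch rather than the authors' procedure: standard errors are recovered from the rounded 95% CIs, an unweighted ordinary least squares fit is assumed, no p-value is computed (the standard library has no t distribution), and all names are hypothetical.

```python
import math

# (Hedges' g, 95% CI lower, 95% CI upper) per subgroup, from Table 5.
EFFECTS = [
    (1.34, 0.57, 2.12),    # Back 2024
    (0.19, -0.33, 0.71),   # Carhart-Harris 2021
    (3.08, 1.89, 4.27),    # Davis 2021
    (0.12, -0.20, 0.43),   # Goodwin 2022, 10 mg
    (0.40, 0.08, 0.71),    # Goodwin 2022, 25 mg
    (0.05, -0.55, 0.66),   # Marschall 2022
    (0.73, 0.34, 1.13),    # Raison 2023
    (1.33, 0.55, 2.12),    # Rosenblat 2024
    (0.34, -0.39, 1.08),   # Ross 2016
    (0.82, 0.26, 1.38),    # von Rotz 2023
]

Z95 = 1.959964  # two-sided 95% normal quantile (assumed basis of the CIs)


def egger_regression(effects):
    """Return (intercept, SE of intercept, t statistic, df) for Egger's test."""
    se = [(hi - lo) / (2 * Z95) for _, lo, hi in effects]
    precision = [1.0 / s for s in se]                         # predictor
    standardized = [g / s for (g, _, _), s in zip(effects, se)]  # outcome

    n = len(effects)
    mean_x = sum(precision) / n
    mean_y = sum(standardized) / n
    sxx = sum((x - mean_x) ** 2 for x in precision)
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(precision, standardized))

    # Ordinary least squares fit of standardized effect on precision.
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x

    # Residual variance and the usual OLS standard error of the intercept.
    residuals = [y - (intercept + slope * x)
                 for x, y in zip(precision, standardized)]
    resid_var = sum(r * r for r in residuals) / (n - 2)
    se_intercept = math.sqrt(resid_var * (1.0 / n + mean_x ** 2 / sxx))
    return intercept, se_intercept, intercept / se_intercept, n - 2


b0, se_b0, t_stat, df = egger_regression(EFFECTS)
print(f"Egger's b0 = {b0:.2f}, SE = {se_b0:.2f}, t[{df}] = {t_stat:.2f}")
```

A positive intercept in this parameterization indicates that less precise (typically smaller) studies report larger standardized effects, which is the pattern described for these trials. Small differences from the reported statistics would be expected given the rounded inputs.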
Discussion
The current systematic review and meta-analysis aimed to critically evaluate the state of the psilocybin-for-depression RCT literature. Our findings provide an overview of between-group RCTs designed to test the efficacy of psilocybin as a depression intervention. We specifically provide harm reporting, risk of bias, and MoA syntheses to help contextualize parametric incremental effect results.
Harm quality reporting was heterogeneous across RCTs. Overall, there is room for improvement in how psilocybin researchers conceptualize, measure, and report AEs. There was a general lack of discussion of AEs relative to potential benefits, AE information was often relegated to supplementary materials (hence lower quality scores on otherwise well-conducted trials), and there was a tendency to only vaguely report how AEs were assessed. Frequently, some AE-related information was provided, but it was often unclear how it fit into the broader study. For instance, there was a tendency to employ a suicidality assessment or blood pressure monitoring as part of the protocol without formally specifying (or at least explicating in publication) how additional potential AEs were being assessed, scored, and/or conceptualized.
Across studies, AE definitions were inconsistent. Indeed, this is an area of heterogeneity across the field, as researchers conceptualize AEs differently. For example, von Rotz et al. (2023) explicitly excluded “transient symptoms directly related to the well-known psychotropic effects of psilocybin” (see their supplementary materials), whereas Davis et al. (2021) included AEs such as “visual distortion” and “altered body sensations”. Echoing previous researchers (Breeksema et al. 2022), consensus over what is considered an AE in the context of psilocybin research is needed. This lack of consensus could explain the diversity of AE measures employed across trials. At times, well-established AE tools were employed, but authors failed to specify key elements of the CONSORT tool. For example, Carhart-Harris et al. (2021) employed the Medical Dictionary for Regulatory Activities in their study but did not specify how AE severity was assessed (e.g., criterion 3c). Additionally, most studies failed to adequately describe plans for monitoring AEs and/or criteria for ending a trial due to AEs. Most studies also failed to report both the number of AEs and the number of patients who experienced AEs. Ideally, idiographic data could be provided in which the frequency and description of AEs are reported for each participant (at least in supplementary contexts) to accompany full sample/subsample estimates.
Despite areas of concern, most studies did report some AE information in their main texts, and most attempted to provide a balanced view of harm versus benefits in their discussions. It is also somewhat understandable that AE information was relegated to supplementary contexts, as few studies explicated AE assessment as a targeted aim. Together, stakeholders could find AE information if they knew where to look. Additionally, the very low scores for Marschall et al. (2022) should be interpreted cautiously, as their study did not target a clinical population despite assessing depression and utilizing a placebo-controlled RCT design. They were also the only study to implement psychedelic mushrooms. Qualitatively, their trial was quite different from the others, being more directed toward basic science than a generalizable intervention.
Risk of bias was also a problem across the reviewed studies. Notably, our risk of bias findings converged with Hovmand et al. (2023), who also analyzed risk of bias for Ross et al. (2016), Carhart-Harris et al. (2021), and Goodwin et al. (2022). Both our team and Hovmand et al. (2023) classified Ross et al. (2016) and Carhart-Harris et al. (2021) as “High” overall risk of bias. They rated Goodwin et al. (2022) as “Low Risk,” whereas we rated them as “Some Concerns” (i.e., assessor’s judgement; algorithm result was high). Our absolute rating of Goodwin et al. (2022) diverged based on the “Measurement of the Outcome” domain. Participant de-blinding (due to psychedelic drug effects or lack thereof) could have influenced the assessment of the outcome despite Goodwin et al.’s (2022) deployment of rigorous outcome-assessment procedures (e.g., initial blinding, naïve raters).
Our outcome measurement classifications (which we almost universally categorized as a high bias source) were anchored in blinding concerns. This is a broader problem with psychedelic research. Although clinician-rated instruments such as the MADRS were employed, the information that the clinician has available to score is driven by participants’ reports of inner experiences (e.g., inner tension, concentration difficulties) or behaviors that cannot be observed in the clinic or over the phone (e.g., reduced sleep; see Montgomery & Åsberg 1979). Therefore, bias in the “Measurement of the Outcome” domain is susceptible to the compound influence of the patient’s self-report and the clinician’s interpretation. Similar to responding in a self-report format, responding during the MADRS (and other clinician-scored instruments that rely on participant self-report) is susceptible to de-blinding by the subjective effects of psilocybin. That is, even if the assessor was blind, the patient could be de-blinded by virtue of the psychedelic experience, in turn de-blinding the assessor. Therefore, for our RoB analysis, we diverged from considering the “assessor” as the “observer” for the rater-administered MADRS. We consider the participant to be the observer and encourage other evaluations of psychedelic substance clinical trials that rely on patient self-reported information to do the same.
Ideally, RoB analyses should be expanded to model the compound sources of bias. Participant de-blinding issues, which pervade psychedelic science (see Muthukumaraswamy et al. 2021; Nayak et al. 2023), imply that evidence produced from psychedelic RCTs using extant control methods (e.g., inert placebos, simple active placebos) may be no more rigorous than evidence produced from non-RCT trials (i.e., open-label studies). That is, a therapeutic effect is evident, but it is difficult to ascertain how much of that effect is due to de-blinding/expectancy/placebo excitement over a novel therapy. Extant RoB measures must be considered with such limitations in mind.
Conversely, we rated Marschall et al. (2022) as only having “Some Concerns” for the “Measurement of the Outcome” domain. In contrast to the problems with high-dose psychedelic trials outlined above, the reduced risk of bias was primarily driven by their aim of investigating microdosing. Because a paucity of perceived subjective effects is part of the definition of microdosing (Kuypers et al. 2019), it is easier to effectively blind microdose trials. Importantly, we are not suggesting that microdosing, as currently studied and practiced, is completely subperceptual (Fadiman & Korb 2019; Holze et al. 2021). However, we suggest it is easier to create non-psychedelic blinding conditions that mimic “improved energy” (Anderson et al. 2019) vs. “I approached the border where existence began, and on the other side of this border was nothing” (Noorani et al. 2018). We encourage investigators to further consider trials testing microdosing, as this may confer an easier scientific problem to initially solve. Should microdosing trials maintain blindness and find efficacy, it will be clearer that the treatment effects are not driven by de-blinding.
Researchers varied in the degree to which they specified/considered MoAs. Additionally, the nature of the reported MoAs differed by research team. This likely reflects the various MoAs posited in the psilocybin literature. Psilocybin’s agonist activity on serotonergic receptors (particularly 5-HT2A) was arguably the most commonly posited MoA (with varying degrees of specificity; see Supplementary File 13). Other mechanisms included the potential for increased interoceptive awareness, neural plasticity, and network-level changes, among others. Most researchers did not integrate multiple MoA theories, such as how 5-HT2A activity might yield changes in the default mode network (Gattuso et al. 2023; Smigielski et al. 2019), though exceptions were evident (Ross et al. 2016; see Supplementary File 13). Few of the research teams measured any a priori MoAs or conducted any meaningful mediation analyses within the reviewed RCTs.
One potential reason for the lack of MoA tests was that such investigations occurred in post hoc publications. We attempted to confirm all such cases. Of those, Daws et al. (2022) stands out as an example of how researchers are trying to answer MoA questions. They compared brain states using fMRI in participants from Carhart-Harris et al. (2021). Their results provided evidence that participant response to psilocybin was correlated with increased brain “network flexibility” relative to an absence of commensurate brain state changes in participants in the escitalopram condition. However, their relatively small sample sizes were each further reduced by attrition (e.g., “head motion”), meaning effect sizes were probably inflated and replication probability reduced (Marek et al. 2022). Other follow-up studies from the Carhart-Harris et al. (2021) sample include Zeifman et al. (2023), whose study suggested reductions in experiential avoidance as a potential mechanism. Murphy et al. (2022) also conducted a follow-up, Carhart-Harris (2016) and highlighted the importance of therapist rapport/therapeutic alliance in psilocybin intervention. Their results are consistent with a common factors model (Wampold 2015), which holds that if alliance is sound, therapeutic effects should follow.
Several other secondary analyses were also conducted on the reviewed RCTs. Broadly, these did not include traditional mediation-based MoA analyses. For example, Goodwin et al. (2023) published a follow-up analysis based on the Goodwin et al. (2022) sample, though this did not involve MoA/mediation analyses. Goodwin et al. (2025) also recently published a follow-up analysis suggesting dosage strength is correlated with psychedelic psychological experiences (e.g., “Oceanic Boundlessness”), which could potentially explain reductions in depression. Malone et al. (2018) published a follow-up to Ross et al. (2016); however, no formal MoA analyses were conducted, though they did reference the MoAs discussed in the original Ross et al. (2016) study. Gukasyan et al. (2022) conducted a follow-up on Davis et al. (2021), but again there were no formal MoA analyses, only bivariate correlations with various potential mechanisms such as spiritual significance and mystical experience (which could be conceptualized as “outcomes” instead of MoAs based on their analyses). Jungwirth et al. (2024) recently published a follow-up to von Rotz et al. (2023) with analyses suggesting psilocybin may increase empathy, thereby indirectly reducing depression. We were unable to identify any potential MoA secondary analyses that were specific to psilocybin’s effect on depression for the other reviewed RCTs (Back et al. 2024; Marschall et al. 2022; Raison et al. 2023; Rosenblat et al. 2024).
The general lack of MoA measurement signifies the neophyte nature of psilocybin intervention. We encourage authors to proactively specify MoAs and report MoA assessment in primary publications. The potential concern when MoA analyses are absent is that psilocybin is presented as a treatment without a clear argument for why it should reduce depression. This is consistent with our observation that researchers tended to discuss MoAs in their discussion sections rather than their introductions (i.e., backwards theorizing). Indeed, many researchers relied on citations of prior studies that indicate psilocybin can be associated with some degree of mood enhancement rather than trying to identify an explanatory process (see Supplementary File 13). By demonstrating the mediational mechanisms involved in psilocybin intervention, arguments for placebo and expectancy effects related to psilocybin as a novel (if not exciting) intervention become less compelling. Another concern is that researchers performing secondary analyses in subsequent publications do not appear to have conducted appropriate Bonferroni corrections to account for the number of analyses already performed. This problem is exacerbated by the fact that all of the RCTs had small samples. Said another way, there is significant Type I error risk. While the existing work is an understandable starting place, more empirically based testing of MoAs is needed in original trials to demonstrate that psilocybin, and not expectancy, is uniquely responsible for the reported depression reductions.
This could be done with a variety of approaches depending on the MoAs of interest. For example, if 5-HT2A activation is theorized to be causing the therapeutic effects, researchers should have evidence of baseline 5-HT2A problems (however defined) and post-test evidence of sustained 5-HT2A change corresponding to reduced depression scores. Alternatively, similar to the approach taken by Daws et al. (2022), participants could undergo imaging procedures before and after the intervention to see how neural changes correspond to depression changes. Further controls could be added, such as non-pathological participants. That is, would brain networks resemble healthy controls before/after treatment? Future researchers should also consider assigning patients to treatment groups based on MoA evidence. It is possible that psilocybin could have meaningful therapeutic effects for some, but not all, depression etiologies. If group assignment is based solely on latent disease classification (e.g., meeting DSM-5-TR criteria), then there is no way of knowing which patient might respond (positively or adversely), as the syndromal classification confounds etiology. Psilocybin researchers could categorize depressive participants with more specificity (ideally, by operationalized MoAs) to correct for such confounding issues. This recommendation is consistent with recent paradigm shifts within clinical science, such as the Research Domain Criteria (Insel et al. 2010).
To be sure, MoA limitations are an underlying problem in psychiatry research and are not necessarily specific to psilocybin intervention (Cuijpers & Cristea 2015). We also recognize that understanding the exact MoAs might not be essential if the therapeutic effects are robustly efficacious. Our results suggest some degree of incremental efficacy exists, but it is relatively modest and needs to be contextualized by the methodological limitations (i.e., small sample sizes biasing effect sizes), generalizability issues, and researcher bias (prominence of financial conflicts of interest). Thus, a robust explanation for why psilocybin could specifically reduce depression, notwithstanding confounding factors, would greatly enhance stakeholder trust in the intervention.
In terms of our incremental meta-analyses, psilocybin statistically reduced depression relative to controls, but only to a moderate degree. Taking the BDI-II (one of the most popular depression measures assessed across RCTs) as an example, psilocybin on average would reduce depressive symptoms by approximately 7.9 points more than the reduction achieved by controls (the BDI-II has a range from 0 to 63). This estimate is based on the BDI-II outpatient normative data (Beck et al. 1996). Thus, some unique potential is evident. However, almost all the studies used non-psychotropic agents as controls. Carhart-Harris et al. (2021) was the only exception, demonstrating a non-significant incremental effect. Observation of study-by-study effects suggests the strongest effects come from studies like Davis et al. (2021; incremental g = 3.08) and Rosenblat et al. (2024; incremental g = 1.33), which utilized waitlist controls and had extremely small samples (i.e., unstable metrics/effect size inflation). Conversely, Goodwin et al. (2022), a relatively larger study, demonstrated much more modest incremental effects (g = 0.19 for 10 mg psilocybin and g = 0.43 for 25 mg psilocybin). The Egger’s test demonstrated a significant correlation between standard error (small sample n) and effect size, favoring psilocybin. Taken together, our results indicate unique therapeutic effects are observed but also raise decline effect concerns. In other words, future large-scale (n > 100 per group) psilocybin-for-depression RCTs that utilize active controls (FDA-approved antidepressants) will likely be associated with a reduction in incremental effect. The primary question is how much of an effect size reduction will occur. Concurrently, how long will the incremental effect last? No studies followed between-group outcomes beyond three months.
That said, psychedelic-assisted therapy is a complete paradigm shift for psychiatry. By design, psychedelic-assisted therapies entail the de-emphasis of long-term pharmacological treatments and substitute an emphasis on the healing power of the pharmacologically induced acute experience. This shift implies concomitant new ways of thinking about psychiatric diagnosis, etiology, and therapeutic MoAs (Schenberg 2018). In other words, even if psilocybin only yields modest incrementally superior effects relative to standard treatments, it could still be globally superior if intervention was needed significantly less frequently than a daily prescription. If the theorized single session/single dose approach yields long-lasting reductions in depressive symptoms, thereby reducing side effects and other problems traditionally associated with typical psychopharmacological agents, psilocybin may hold substantially more incremental value than statistically demonstrated.
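The conversion of a standardized effect into BDI-II points described in the discussion of incremental effects can be sketched as a one-line calculation: the raw-score difference is approximately the standardized mean difference multiplied by the reference standard deviation. The Python sketch below makes that arithmetic explicit. The outpatient normative SD of 12.75 is an assumption on our part (the text cites the normative data but does not state the value), Hedges’ g is treated as a plain standardized mean difference, and the function name is hypothetical.

```python
# Assumed reference SD for BDI-II outpatient norms; not stated in the text.
ASSUMED_BDI_II_OUTPATIENT_SD = 12.75


def standardized_to_raw_points(effect_size, reference_sd):
    """Approximate raw-score difference implied by a standardized effect."""
    return effect_size * reference_sd


# Omnibus g as given in the abstract, and as given in the results section.
points_from_abstract_g = standardized_to_raw_points(
    0.62, ASSUMED_BDI_II_OUTPATIENT_SD)
points_from_results_g = standardized_to_raw_points(
    0.69, ASSUMED_BDI_II_OUTPATIENT_SD)

print(f"g = 0.62 -> {points_from_abstract_g:.1f} BDI-II points")
print(f"g = 0.69 -> {points_from_results_g:.1f} BDI-II points")
```

With the assumed SD, the abstract’s g = 0.62 corresponds to roughly 7.9 points on the 0 to 63 scale, in line with the figure quoted in the text; a larger omnibus g would imply a proportionally larger raw difference.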
Observations
There were a relatively high number of barriers to conducting the current meta-analysis. First, few of the relevant studies comprehensively provided needed statistical information in original texts. This is problematic, as clinicians and researchers need such information when trying to appraise this body of work. Underreporting in original texts is somewhat understandable, as comprehensive results must often be relegated to supplementary files (the current paper serves as a prime example). However, many of the supplementary files reviewed lacked necessary information such as depression outcome means at specific time points and deviation estimates. Many authors reported pieces of information in the form of charts, but not exact estimates, and some failed to provide necessary information (e.g., exclusion of Grob et al. 2011; see Supplementary File 1). In many cases, enough results were provided such that analyses were possible, but hints towards unreported data were evident. For instance, Back et al. (2024) collected between-group measurements at days “1, 8, 15, and 28” (p. 4). However, they only reported means and standard deviations at baseline and day 28. We attempted to contact many authors to obtain such information but received few responses, with only a subset of authors providing further data. Supplementary File 15 provides a record of our attempts to contact corresponding authors of the evaluated RCTs. Four teams failed to respond to multiple inquiries, and two responded after multiple prompts. Our efforts to contact authors included attempts to clarify important pieces of information. For example, we observed in the pre-registration for von Rotz et al. (2023; ClinicalTrials.gov NCT03715127) that they initially planned to report depression outcomes at Day 32, yet their publication only presents results up until Day 14. There is no documented explanation for this discrepancy.
We recommend all psilocybin researchers report means, deviation estimates, and n’s for participants in each treatment and control group for all time points in their studies, in addition to other useful statistical information (95% confidence intervals for respective point estimates, p-values).
Almost all of the RCTs had authors who reported financial conflicts of interest with companies sponsoring psilocybin intervention. We are concerned that such financial conflicts of interest may be adding an additional source of bias. Indeed, this is a well-established correlate of misleading findings that favor respective target pharmacologic agents (Bekelman et al. 2003; Ioannidis 2005; Mahase 2024; Nejstgaard et al. 2020). It would be preferable for analyses to be conducted and reported by researchers who are not funded by private companies interested in psychedelic therapies.
Limitations
Our study presents a critical evaluation of the psilocybin-for-depression research program. We offer this synthesis of the research in the spirit of improving psychedelic research. Our study is limited by the relatively small literature body. We observed many interesting trends that could be important moderators. For instance, the microdosing study by Marschall and colleagues (2022) failed to yield a significant incremental effect, but the absence of additional microdosing RCTs prevented us from conducting meaningful meta-regression analyses to evaluate this finding. We also observed that most of the RCTs predominantly included White participants. Presently, psychedelic science is grappling with multicultural, diversity, and social justice issues. Indeed, in contemporary psychedelic science, participants have been relatively demographically homogeneous (Michaels et al. 2018). This limits knowledge, makes the external validity of research findings uncertain (Ortiz et al. 2022), and stunts our understanding of the ethnopsychopharmacology of psychedelics and psychedelic-assisted therapies (Fogg et al. 2021). These inequities are exacerbated by the well-documented need for novel treatments in diverse communities, the rich history of psychedelic use in non-White, indigenous cultures, and the unethical practices that harmed vulnerable groups during prior waves of psychedelic research (Strauss et al. 2022). Future researchers should make greater efforts to include diverse participants in their RCTs.
We were unable to disentangle the role of psychotherapeutic facilitation in the intervention. This was one of many issues that recently precluded favorable ratings from the FDA for MDMA PTSD interventions. Some degree of facilitation was apparent in almost every reviewed study, with multiple studies acknowledging concurrent psychotherapy. Ostensibly, this issue should be controlled by the RCT methodologies. However, that is assuming all participants received equal degrees of psychotherapy quality (in terms of alliance, trust, experience, etc.) in each condition per study. As such, it is difficult to ascertain what effects are attributable to psychotherapy and which are attributable to psilocybin, as different therapists might have different skills and levels of patient alliance (that in turn differs by patient). This issue is exacerbated by the noted blinding problem, with therapists being de-blinded by participant reactions, thereby altering their therapeutic approaches between conditions. As noted by Murphy et al. (2022), therapist alliance plays a significant role in psilocybin’s therapeutic effectiveness.
Conclusions
Psilocybin intervention demonstrates a significant incremental therapeutic effect that is modestly superior to control conditions in treating depression. Decline effects linked to expectancy/placebo effects are a concern within the psilocybin-for-depression research program. Better controls and larger sample sizes could attenuate effect sizes in future RCTs. Harm reporting is highly variable and largely driven by how researchers define and report AEs, as well as their strategies for managing AEs. Risk of bias can be reduced by employing more objective outcome measurements that are not reliant on patient self-report. MoAs were highly variable, more often specified in discussions as opposed to introductions, and generally not measured in primary or even secondary analyses. We advise psilocybin researchers to specify mechanisms, a priori, that theoretically explain therapeutic effects in depression treatment. Similarly, psilocybin researchers are encouraged to measure mechanisms and have inclusion criteria based on hypothesized mechanisms (not necessarily latent disease classification). Psilocybin researchers are also encouraged to conduct biomarker-driven causal mediational analyses (Muthukumaraswamy 2023). This would involve: 1) assessing pre-trial expectancies; 2) capturing known objective correlates of outcomes associated with psychedelics (e.g., biomarkers); and 3) measuring treatment outcomes. These data can be combined to statistically parse the causal effects of psychedelics vs. expectancy (placebo) in a mediation model based on the logic that expectancies are more likely to influence self-report outcomes than observable biomarkers. We also encourage funding bodies to create accessible psychedelic-focused mechanisms for researchers not affiliated with for-profit psychedelic-promoting companies. This will allow for more teams to conduct rigorous psychedelic research, while minimizing bias associated with private funding.
It is hoped that by applying these recommendations the psilocybin research program can improve, such that it can be utilized in a way that maximizes beneficence and minimizes error.
Supplementary Information
Below is the link to the electronic supplementary material.
Author contributions
NCB: Conceptualization, writing, meta-analyses. TO: Management of coding teams, reviewing/writing, harm reporting review. DP: Writing, reviewing, risk-of-bias, MoA synthesis, harm reporting review. JV: Reviewing, risk-of-bias, harm reporting review. DJ: Reviewing, harm reporting, coding. LP: Reviewing harm reporting, coding. SA: Reviewing, harm reporting review. BH: Reviewing.
Funding
No funding was associated with this work.
Data availability
Data is available from the first author upon request.
Declarations
Ethical approval
This study did not involve living subjects and therefore was exempt from ethical board review.
Consent to participate
No consent to participate was administered as there were no living subjects.
Consent to publication
All authors consent to publication.
Competing interest
All authors declare no competing interests.
Footnotes
The original version of this article was revised: This article was originally published with an error in the author name and in the 3rd affiliation.
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Change history
5/17/2025
A Correction to this paper has been published: 10.1007/s00213-025-06818-7
References
- American Psychiatric Association (2022) Diagnostic and Statistical Manual of Mental Disorders, 5th Edition: DSM-5-TR. American Psychiatric Association
- Anderson T, Petranker R, Christopher A, Rosenbaum D, Weissman C, Dinh-Williams LA, Hui K, Hapke E (2019) Psychedelic microdosing benefits and challenges: An empirical codebook. Harm Reduct J 16:1–10. 10.1186/S12954-019-0308-4
- Automeris LLC (2024) Web Plot Digitizer. Automeris.io. https://automeris.io/WebPlotDigitizer.html
- Back AL, Freeman-Young TK, Morgan L, Sethi T, Baker KK, Myers S, McGregor BA, Harvey K, Tai M, Kollefrath A, Thomas BJ, Sorta D, Kaelen M, Kelmendi B, Gooley TA (2024) Psilocybin therapy for clinicians with symptoms of depression from frontline care during the COVID-19 pandemic: A randomized clinical trial. JAMA Network Open 7:e2449026. 10.1001/JAMANETWORKOPEN.2024.49026
- Bech P (2010) Struggle for subtypes in primary and secondary depression and their mode-specific treatment or healing. Psychother Psychosom 79:331–338. 10.1159/000320118
- Beck AT, Steer RA, Brown GK (1996) Beck Depression Inventory II (BDI-II). In: Manual for the Beck Depression Inventory–II. Psychological Corporation
- Beck AT, Steer RA (1993) Beck Depression Inventory amended. In: Manual for the Beck Depression Inventory. The Psychological Corporation Inc
- Bekelman JE, Li Y, Gross CP (2003) Scope and impact of financial conflicts of interest in biomedical research: A systematic review. JAMA 289:454–465. 10.1001/JAMA.289.4.454
- Borenstein M, Hedges L, Higgins J, Rothstein H (2013) Comprehensive Meta-Analysis. Englewood, NJ: Biostat
- Breeksema JJ, Kuin BW, Kamphuis J, van den Brink W, Vermetten E, Schoevers RA (2022) Adverse events in clinical treatments with serotonergic psychedelics and MDMA: A mixed-methods systematic review. J Psychopharmacol 36:1100–1117. 10.1177/02698811221116926
- Butler AC, Chapman JE, Forman EM, Beck AT (2006) The empirical status of cognitive-behavioral therapy: A review of meta-analyses. Clin Psychol Rev 26:17–31. 10.1016/J.CPR.2005.07.003
- Cai N, Choi KW, Fried EI (2020) Reviewing the genetics of heterogeneity in depression: operationalizations, manifestations and etiologies. Hum Mol Genet 29:R10–R18. 10.1093/HMG/DDAA115
- Carhart-Harris RL, Bolstridge M, Rucker J, Day CMJ, Erritzoe D, Kaelen M, Bloomfield M, Rickard JA, Forbes B, Feilding A, Taylor D, Pilling S, Curran VH, Nutt DJ (2016) Psilocybin with psychological support for treatment-resistant depression: an open-label feasibility study. The Lancet Psychiatry 3:619–627. 10.1016/S2215-0366(16)30065-7
- Carhart-Harris R, Giribaldi B, Watts R, Baker-Jones M, Murphy-Beiner A, Murphy R, Martell J, Blemings A, Erritzoe D, Nutt D (2021) Trial of psilocybin versus escitalopram for depression. N Engl J Med 384:1402–1411. 10.1056/NEJMoa2032994
- Cipriani A, Furukawa TA, Salanti G, Chaimani A, Atkinson LZ, Ogawa Y, Leucht S, Ruhe HG, Turner EH, Higgins JPT, Egger M, Takeshima N, Hayasaka Y, Imai H, Shinohara K, Tajika A, Ioannidis JPA, Geddes JR (2018) Comparative efficacy and acceptability of 21 antidepressant drugs for the acute treatment of adults with major depressive disorder: A systematic review and network meta-analysis. Lancet 16:420–429. 10.1016/S0140-6736(17)32802-7
- Cristea IA, Fried EI, Kaiser T, Hengartner MP, Turner EH, Naudet F (2024) Maintaining rational expectations about psychedelic interventions. BMJ 385 10.1136/bmj-2023-078084
- Cuijpers P, Cristea IA (2015) What if a placebo effect explained all the activity of depression treatments? World Psychiatry 14:311. 10.1002/WPS.20249
- Davis AK, Barrett FS, May DG, Cosimano MP, Sepeda ND, Johnson MW, Finan PH, Griffiths RR (2021) Effects of psilocybin-assisted therapy on major depressive disorder: A randomized clinical trial. JAMA Psychiat 78:481–489. 10.1001/JAMAPSYCHIATRY.2020.3285
- Daws RE, Timmermann C, Giribaldi B, Sexton JD, Wall MB, Erritzoe D, Roseman L, Nutt D, Carhart-Harris R (2022) Increased global integration in the brain after psilocybin therapy for depression. Nat Med 28:844–851. 10.1038/s41591-022-01744-z
- Derogatis LR (1977) SCL-90: Administration, scoring & procedures manual for the R(evised) version and other instruments of the psychopathology rating scale series. Johns Hopkins University School of Medicine
- dos Santos RG, Hallak JEC, Baker G, Dursun S (2021) Hallucinogenic/psychedelic 5HT2A receptor agonists as rapid antidepressant therapeutics: Evidence and mechanisms of action. J Psychopharmacol 35:453–458. 10.1177/0269881120986422
- Duval S, Tweedie R (2000) Trim and fill: A simple funnel-plot-based method of testing and adjusting for publication bias in meta-analysis. Biometrics 56:455–463. 10.1111/J.0006-341X.2000.00455.X
- Eisner B, Cohen S (1958) Psychotherapy with lysergic acid diethylamide. J Nerv Ment Dis 127:528–539
- Fadiman J, Korb S (2019) Might microdosing psychedelics be safe and beneficial? An initial exploration. J Psychoactive Drugs 51:118–122. 10.1080/02791072.2019.1593561
- Feijo De Mello M, De Jesus Mari J, Bacaltchuk J, Verdeli H, Neugebauer R (2005) A systematic review of research findings on the efficacy of interpersonal therapy for depressive disorders. Eur Arch Psychiatry Clin Neurosci 255:75–82. 10.1007/S00406-004-0542-X
- Ferguson CJ (2009) An effect size primer: A guide for clinicians and researchers. Prof Psychol Res Pract 40:532–538. 10.1037/a0015808
- Fogg C, Michaels TI, de la Salle S, Jahn ZW, Williams MT (2021) Ethnoracial health disparities and the ethnopsychopharmacology of psychedelic-assisted psychotherapies. Exp Clin Psychopharmacol 29:539–554. 10.1037/PHA0000490
- Gattuso JJ, Perkins D, Ruffell S, Lawrence AJ, Hoyer D, Jacobson LH, Timmermann C, Castle D, Rossell SL, Downey LA, Pagni BA, Galvão-Coelho NL, Nutt D, Sarris J (2023) Default mode network modulation by psychedelics: A systematic review. Int J Neuropsychopharmacol 26:155–188. 10.1093/IJNP/PYAC074
- Goldberg SB, Pace BT, Nicholas CR, Raison CL, Hutson PR (2020) The experimental effects of psilocybin on symptoms of anxiety and depression: A meta-analysis. Psychiatry Res 284:112749. 10.1016/J.PSYCHRES.2020.112749
- Goodwin GM, Aaronson ST, Alvarez O, Arden PC, Baker A, Bennett JC, Bird C, Blom RE, Brennan C, Brusch D, Burke L, Campbell-Coker K, Carhart-Harris R, Cattell J, Daniel A, DeBattista C, Dunlop BW, Eisen K, Feifel D, … Malievskaia E (2022) Single-Dose Psilocybin for a Treatment-Resistant Episode of Major Depression. N Engl J Med 387:1637–1648. 10.1056/NEJMOA2206443
- Goodwin GM, Aaronson ST, Alvarez O, Atli M, Bennett JC, Croal M, DeBattista C, Dunlop BW, Feifel D, Hellerstein DJ, Husain MI, Kelly JR, Lennard-Jones MR, Licht RW, Marwood L, Mistry S, Páleníček T, Redjep O, Repantis D, … Malievskaia E (2023) Single-dose psilocybin for a treatment-resistant episode of major depression: Impact on patient-reported depression severity, anxiety, function, and quality of life. J Affect Disord 327:120–127. 10.1016/J.JAD.2023.01.108
- Goodwin GM, Aaronson ST, Alvarez O, Carhart-Harris R, Chai-Rees J, Croal M, DeBattista C, Dunlop BW, Feifel D, Hellerstein DJ, Husain MI, Kelly JR, Kirlic N, Licht RW, Marwood L, Meyer TD, Mistry S, Nowakowska A, Páleníček T, … Malievskaia E (2025) The role of the psychedelic experience in psilocybin treatment for treatment-resistant depression. J Affect Disord 372:523–532. 10.1016/J.JAD.2024.12.061
- Grob CS, Danforth AL, Chopra GS, Hagerty M, McKay CR, Halberstadt AL, Greer GR (2011) Pilot study of psilocybin treatment for anxiety in patients with advanced-stage cancer. Arch Gen Psychiatry 68:71–78. 10.1001/ARCHGENPSYCHIATRY.2010.116
- Gukasyan N, Davis AK, Barrett FS, Cosimano MP, Sepeda ND, Johnson MW, Griffiths RR (2022) Efficacy and safety of psilocybin-assisted treatment for major depressive disorder: Prospective 12-month follow-up. J Psychopharmacol 36:151–158. 10.1177/02698811211073759
- Haikazian S, Chen-Li DCJ, Johnson DE, Fancy F, Levinta A, Husain MI, Mansur RB, McIntyre RS, Rosenblat JD (2023) Psilocybin-assisted therapy for depression: A systematic review and meta-analysis. Psychiatry Res 329:115531. 10.1016/J.PSYCHRES.2023.115531
- Hall W (2022) Why was early therapeutic research on psychedelic drugs abandoned? Psychol Med 52:26–31. 10.1017/S0033291721004207
- Haniff ZR, Bocharova M, Mantingh T, Rucker JJ, Velayudhan L, Taylor DM, Young AH, Aarsland D, Vernon AC, Thuret S (2024) Psilocybin for dementia prevention? The potential role of psilocybin to alter mechanisms associated with major depression and neurodegenerative diseases. Pharmacol Ther 258:108641. 10.1016/J.PHARMTHERA.2024.108641
- Hayes SC, Strosahl K, Wilson KG (2012) Acceptance and commitment therapy: An experiential approach to behavior change. Guilford Press
- Heffter A (1898) Ueber Pellote - Beiträge zur chemischen und pharmakologischen Kenntniss der Cacteen Zweite Mittheilung. Archiv Für Exp Pathol Pharmakol 40:385–429. 10.1007/BF01825267
- Higgins JPT, Savović J, Page MJ, Elbers RG, Sterne JAC (2019) Assessing risk of bias in a randomized trial. In: Higgins JP, Thomas J, Chandler J, Cumpston M, Li T, Page MJ, Welch VA (Eds.), Cochrane Handbook for Systematic Reviews of Interventions (2nd ed.). John Wiley & Sons, Ltd. 10.1002/9781119536604.CH8
- Holze F, Liechti ME, Hutten NRPW, Mason NL, Dolder PC, Theunissen EL, Duthaler U, Feilding A, Ramaekers JG, Kuypers KPC (2021) Pharmacokinetics and pharmacodynamics of lysergic acid diethylamide microdoses in healthy participants. Clin Pharmacol Ther 109:658–666. 10.1002/CPT.2057
- Hovmand OR, Poulsen ED, Arnfred S, Storebø OJ (2023) Risk of bias in randomized clinical trials on psychedelic medicine: A systematic review. J Psychopharmacol 37:649–659. 10.1177/02698811231180276
- Ignácio ZM, Calixto AV, da Silva RH, Quevedo J, Réus GZ (2018) The use of quetiapine in the treatment of major depressive disorder: Evidence from clinical and experimental studies. Neurosci Biobehav Rev 86:36–50. 10.1016/J.NEUBIOREV.2017.12.012
- Insel T, Cuthbert B, Garvey M, Heinssen R, Pine DS, Quinn K, Sanislow C, Wang P (2010) Research domain criteria (RDoC): Toward a new classification framework for research on mental disorders. Am J Psychiatry 167:748–751. 10.1176/APPI.AJP.2010.09091379
- Ioannidis JPA (2005) Why most published research findings are false. PLoS Medicine 2:e124. 10.1371/journal.pmed.0020124
- Ioannidis JPA, Evans SJW, Gøtzsche PC, O’Neill RT, Altman DG, Schulz K, Moher D (2004) Better reporting of harms in randomized trials: An extension of the CONSORT statement. Ann Intern Med 141:781–788. 10.7326/0003-4819-141-10-200411160-00009
- Johnson MW, Griffiths RR (2017) Potential therapeutic effects of psilocybin. Neurotherapeutics 14:734–740. 10.1007/S13311-017-0542-Y
- Juhasz G, Eszlari N, Pap D, Gonda X (2012) Cultural differences in the development and characteristics of depression. Neuropsychopharmacol Hung 14:259–265. 10.5706/nph201212007
- Jungwirth J, von Rotz R, Dziobek I, Vollenweider FX, Preller KH (2024) Psilocybin increases emotional empathy in patients with major depression. Mol Psychiatry 2024:1–8. 10.1038/s41380-024-02875-0
- Kalali A, Bech P, Williams JBW, Kobak KA, Lipschitz J, Engelhardt N, Evans K, Olin J, Pearson J, Rothman M (2002) The new GRID HAM-D: pilot testing and international field trials. Int J Neuropsychopharmacol 5:S147
- Kometer M, Schmidt A, Jäncke L, Vollenweider FX (2013) Activation of serotonin 2A receptors underlies the psilocybin-induced effects on α Oscillations, N170 visual-evoked potentials, and visual hallucinations. J Neurosci 33:10544–10551. 10.1523/JNEUROSCI.3007-12.2013
- Kuypers KPC, Ng L, Erritzoe D, Knudsen GM, Nichols CD, Nichols DE, Pani L, Soula A, Nutt D (2019) Microdosing psychedelics: More questions than answers? An overview and suggestions for future research. J Psychopharmacol 33:1039–1057. 10.1177/0269881119857204
- Lamotte S (2022) Severe depression eased by single dose of synthetic ‘magic mushroom.’ CNN. https://www.cnn.com/2022/11/02/health/psilocybin-magic-mushroom-depression-wellness/index.html
- Lewin L (1888) Ueber Anhalonium Lewinii. Archiv Für Exp Pathol Pharmakol 24:401–411. 10.1007/BF01923627
- Li Z, Ruan M, Chen J, Fang Y (2021) Major depressive disorder: Advances in neuroscience research and translational applications. Neurosci Bull 37:863–880. 10.1007/S12264-021-00638-3
- Li NX, Hu YR, Chen WN, Zhang B (2022) Dose effect of psilocybin on primary and secondary depression: a preliminary systematic review and meta-analysis. J Affect Disord 296:26–34. 10.1016/J.JAD.2021.09.041
- Lovibond SH, Lovibond PF (1995) The structure of negative emotional states: comparison of the depression anxiety stress scales (DASS) with the beck depression and anxiety inventories. Behav Res Ther 33:335–343. 10.1016/0005-7967(94)00075-U
- Mahase E (2024) MDMA assisted therapy: Three papers are retracted as FDA rejects PTSD application. BMJ 386:q1798. 10.1136/BMJ.Q1798
- Malone TC, Mennenga SE, Guss J, Podrebarac SK, Owens LT, Bossis AP, Belser AB, Agin-Liebes G, Bogenschutz MP, Ross S (2018) Individual experiences in four cancer patients following psilocybin-assisted psychotherapy. Front Pharmacol 9:335252. 10.3389/FPHAR.2018.00256
- Marek S, Tervo-Clemmens B, Calabro FJ, Montez DF, Kay BP, Hatoum AS, Donohue MR, Foran W, Miller RL, Hendrickson TJ, Malone SM, Kandala S, Feczko E, Miranda-Dominguez O, Graham AM, Earl EA, Perrone AJ, Cordova M, Doyle O, … Dosenbach NUF (2022) Reproducible brain-wide association studies require thousands of individuals. Nature 603:654–660. 10.1038/s41586-022-04492-9
- Marschall J, Fejer G, Lempe P, Prochazkova L, Kuchar M, Hajkova K, van Elk M (2022) Psilocybin microdosing does not affect emotion-related symptoms and processing: A preregistered field and lab-based study. J Psychopharmacol 36:97–113. 10.1177/02698811211050556
- McIntyre RS, Alsuwaidan M, Baune BT, Berk M, Demyttenaere K, Goldberg JF, Gorwood P, Ho R, Kasper S, Kennedy SH, Ly-Uson J, Mansur RB, McAllister-Williams RH, Murrough JW, Nemeroff CB, Nierenberg AA, Rosenblat JD, Sanacora G, Schatzberg AF, … Maj M (2023) Treatment-resistant depression: definition, prevalence, detection, management, and investigational interventions. World Psychiatry 22:394–412. 10.1002/WPS.21120
- Meling D, Ehrenkranz R, Nayak SM, Aicher HD, Funk X, van Elk M, Graziosi M, Bauer PR, Scheidegger M, Yaden DB (2024) Mind the psychedelic hype: Characterizing the risks and benefits of psychedelics for depression. Psychoactives 3:215–234. 10.3390/PSYCHOACTIVES3020014
- Metaxa AM, Clarke M (2024) Efficacy of psilocybin for treating symptoms of depression: systematic review and meta-analysis. BMJ 385:e078084. 10.1136/BMJ-2023-078084
- Michaels TI, Purdon J, Collins A, Williams MT (2018) Inclusion of people of color in psychedelic-assisted psychotherapy: A review of the literature. BMC Psychiatry 18:1–14. 10.1186/S12888-018-1824-6
- Moncrieff J, Cooper RE, Stockmann T, Amendola S, Hengartner MP, Horowitz MA (2023) The serotonin theory of depression: a systematic umbrella review of the evidence. Mol Psychiatry 28:3243–3256. 10.1038/s41380-022-01661-0
- Montgomery SA, Åsberg M (1979) A new depression scale designed to be sensitive to change. Br J Psychiatry 134:382–389. 10.1192/BJP.134.4.382
- Murphy R, Kettner H, Zeifman R, Giribaldi B, Kartner L, Martell J, Read T, Murphy-Beiner A, Baker-Jones M, Nutt D, Erritzoe D, Watts R, Carhart-Harris R (2022) Therapeutic alliance and rapport modulate responses to psilocybin assisted therapy for depression. Front Pharmacol 12:788155. 10.3389/FPHAR.2021.788155
- Muthukumaraswamy SD (2023) Overcoming blinding confounds in psychedelic randomized controlled trials using biomarker driven causal mediation analysis. Expert Rev Clin Pharmacol 16:1163–1173. 10.1080/17512433.2023.2279736
- Muthukumaraswamy SD, Forsyth A, Lumley T (2021) Blinding and expectancy confounds in psychedelic randomized controlled trials. Expert Rev Clin Pharmacol 14:1133–1152. 10.1080/17512433.2021.1933434
- Nayak SM, Bradley MK, Kleykamp BA, Strain EC, Dworkin RH, Johnson MW (2023) Control conditions in randomized trials of psychedelics: An ACTTION systematic review. J Clin Psychiatry 84:47000. 10.4088/JCP.22R14518
- Nejstgaard CH, Bero L, Hróbjartsson A, Jørgensen AW, Jørgensen KJ, Le M, Lundh A (2020) Association between conflicts of interest and favourable recommendations in clinical guidelines, advisory committee reports, opinion pieces, and narrative reviews: systematic review. BMJ 371. 10.1136/BMJ.M4234
- Nichols DE (2004) Hallucinogens. Pharmacol Ther 101:131–181. 10.1016/J.PHARMTHERA.2003.11.002
- Noorani T, Garcia-Romeu A, Swift TC, Griffiths RR, Johnson MW (2018) Psychedelic therapy for smoking cessation: Qualitative analysis of participant accounts. J Psychopharmacol 32:756–769. 10.1177/0269881118780612
- Ortiz CE, Dourron HM, Sweat NW, Garcia-Romeu A, MacCarthy S, Anderson BT, Hendricks PS (2022) Special considerations for evaluating psilocybin-facilitated psychotherapy in vulnerable populations. Neuropharmacology 214:109127. 10.1016/J.NEUROPHARM.2022.109127
- Osmond H (1957) A review of the clinical effects of psychotomimetic agents. Ann N Y Acad Sci 66:418–434. 10.1111/J.1749-6632.1957.TB40738.X
- Page MJ, McKenzie JE, Bossuyt PM, Boutron I, Hoffmann TC, Mulrow CD, Shamseer L, Tetzlaff JM, Akl EA, Brennan SE, Chou R, Glanville J, Grimshaw JM, Hróbjartsson A, Lalu MM, Li T, Loder EW, Mayo-Wilson E, McDonald S, … Moher D (2021) The PRISMA 2020 statement: An updated guideline for reporting systematic reviews. PLoS Med 18(3):1–15. 10.1371/JOURNAL.PMED.1003583
- Pappa S, Shah M, Young S, Anwar T, Background TM (2024) Care pathways, prescribing practices and treatment outcomes in major depressive disorder and treatment-resistant depression: retrospective, population-based cohort study. Bjpsych Open 10:e32. 10.1192/BJO.2023.627
- Perez N, Langlest F, Mallet L, De Pieri M, Sentissi O, Thorens G, Seragnoli F, Zullino D, Kirschner M, Kaiser S, Solmi M, Sabé M (2023) Psilocybin-assisted therapy for depression: A systematic review and dose-response meta-analysis of human studies. Eur Neuropsychopharmacol 76:61–76. 10.1016/J.EURONEURO.2023.07.011
- Pizer JH, Aita SL, Myers MA, Hawley NA, Ikonomou VC, Brasil KM, Hernandez KA, Pettway EC, Owen T, Borgogna NC, Smitherman TA, Hill BD (2024) Neuropsychological function in migraine headaches. Neurology 102(4):e208109. 10.1212/WNL.0000000000208109
- Preller KH, Vollenweider FX (2016) Phenomenology, Structure, and Dynamic of Psychedelic States. In Behavioral Neurobiology of Psychedelic Drugs (pp. 221–256). Springer Berlin Heidelberg. 10.1007/7854_2016_459
- Price J, Cole V, Doll H, Goodwin GM (2012) The Oxford questionnaire on the emotional side-effects of antidepressants (OQuESA): Development, validity, reliability and sensitivity to change. J Affect Disord 140:66–74. 10.1016/J.JAD.2012.01.030
- Raison CL, Sanacora G, Woolley J, Heinzerling K, Dunlop BW, Brown RT, Kakar R, Hassman M, Trivedi RP, Robison R, Gukasyan N, Nayak SM, Hu X, O’Donnell KC, Kelmendi B, Sloshower J, Penn AD, Bradley E, Kelly DF, … Griffiths RR (2023) Single-dose psilocybin treatment for major depressive disorder: A randomized clinical trial. JAMA 330:843–853. 10.1001/JAMA.2023.14530
- Reid JG, Gitlin MJ, Altshuler LL (2013) Lamotrigine in psychiatric disorders. J Clin Psychiatry 74:10187. 10.4088/JCP.12R08046
- Rhee TG, Davoudian PA, Sanacora G, Wilkinson ST (2023) Psychedelic renaissance: Revitalized potential therapies for psychiatric disorders. Drug Discovery Today 28:103818. 10.1016/J.DRUDIS.2023.103818
- Rosenblat JD, Meshkat S, Doyle Z, Kaczmarek E, Brudner RM, Kratiuk K, Mansur RB, Schulz-Quach C, Sethi R, Abate A, Ali S, Bawks J, Blainey MG, Brietzke E, Cronin V, Danilewitz J, Dhawan S, Di Fonzo A, Di Fonzo M, … McIntyre RS (2024) Psilocybin-assisted psychotherapy for treatment resistant depression: A randomized clinical trial evaluating repeated doses of psilocybin. Med 5:190–200.e5. 10.1016/j.medj.2024.01.005
- Ross S, Bossis A, Guss J, Agin-Liebes G, Malone T, Cohen B, Mennenga SE, Belser A, Kalliontzi K, Babb J, Su Z, Corby P, Schmidt BL (2016) Rapid and sustained symptom reduction following psilocybin treatment for anxiety and depression in patients with life-threatening cancer: A randomized controlled trial. J Psychopharmacol 30:1165–1180. 10.1177/0269881116675512
- Rucker JJ (2023) Evidence versus expectancy: the development of psilocybin therapy. BJPsych Bull 48(2):110–117. 10.1192/BJB.2023.28
- Rush AJ, Trivedi MH, Ibrahim HM, Carmody TJ, Arnow B, Klein DN, Markowitz JC, Ninan PT, Kornstein S, Manber R, Thase ME, Kocsis JH, Keller MB (2003) The 16-Item quick inventory of depressive symptomatology (QIDS), clinician rating (QIDS-C), and self-report (QIDS-SR): a psychometric evaluation in patients with chronic major depression. Biol Psychiat 54:573–583. 10.1016/S0006-3223(02)01866-8
- Savage C, Mccabe OL (1973) Residential psychedelic (LSD) therapy for the narcotic addict: A controlled study. Arch Gen Psychiatry 28:808–814. 10.1001/ARCHPSYC.1973.01750360040005
- Schenberg EE (2018) Psychedelic-assisted psychotherapy: A paradigm shift in psychiatric research and development. Front Pharmacol 9:323606. 10.3389/FPHAR.2018.00733
- Smigielski L, Scheidegger M, Kometer M, Vollenweider FX (2019) Psilocybin-assisted mindfulness training modulates self-consciousness and brain default mode network connectivity with lasting effects. Neuroimage 196:207–215. 10.1016/J.NEUROIMAGE.2019.04.009
- Sterne JAC, Savović J, Page MJ, Elbers RG, Blencowe NS, Boutron I, Cates CJ, Cheng HY, Corbett MS, Eldridge SM, Emberson JR, Hernán MA, Hopewell S, Hróbjartsson A, Junqueira DR, Jüni P, Kirkham JJ, Lasserson T, Li T, … Higgins JPT (2019) RoB 2: a revised tool for assessing risk of bias in randomised trials. BMJ 366. 10.1136/BMJ.L4898
- Strauss D, De La Salle S, Sloshower J, Williams MT (2022) Research abuses against people of colour and other vulnerable groups in early psychedelic research. J Med Ethics 48:728–737. 10.1136/MEDETHICS-2021-107262
- Strickland JC, Johnson MW (2022) Human behavioral pharmacology of psychedelics. In J. Li (Ed.), Behavioral Pharmacology of Drug Abuse: Current Status (pp. 105–132). Academic Press. 10.1016/bs.apha.2021.10.003
- Swanson LR (2018) Unifying theories of psychedelic drug effects. Front Pharmacol 9:344412. 10.3389/FPHAR.2018.00172
- Taillefer de Laportalière T, Jullien A, Yrondi A, Cestac P, Montastruc F (2023) Reporting of harms in clinical trials of esketamine in depression: a systematic review. Psychol Med 53(10):4305–4315. 10.1017/S0033291723001058
- van Elk M, Fried EI (2023) History repeating: guidelines to address common problems in psychedelic science. Ther Adv Psychopharmacol 13:20451253231198464. 10.1177/20451253231198466
- von Rotz R, Schindowski EM, Jungwirth J, Schuldt A, Rieser NM, Zahoranszky K, Seifritz E, Nowak A, Nowak P, Jäncke L, Preller KH, Vollenweider FX (2023) Single-dose psilocybin-assisted therapy in major depressive disorder: A placebo-controlled, double-blind, randomised clinical trial. eClinicalMedicine 56:101809. 10.1016/j.eclinm.2022.101809
- Wampold BE (2015) How important are the common factors in psychotherapy? An Update. World Psychiatry 14(3):270–277. 10.1002/WPS.20238
- Wizła M, Kraus SW, Lewczuk K (2022) Perspective: Can psychedelic-assisted therapy be a promising aid in compulsive sexual behavior disorder treatment? Compr Psychiatry 115:152303. 10.1016/J.COMPPSYCH.2022.152303
- Yaden DB, Potash JB, Griffiths RR (2022) Preparing for the bursting of the psychedelic hype bubble. JAMA Psychiat 79:943–944. 10.1001/JAMAPSYCHIATRY.2022.2546
- Yu CL, Liang CS, Yang FC, Tu YK, Hsu CW, Carvalho AF, Stubbs B, Thompson T, Tsai CK, Yeh TC, Yang SN, Shin JI, Chu CS, Tseng PT, Su KP (2022) Trajectory of antidepressant effects after single- or two-dose administration of psilocybin: A systematic review and multivariate meta-analysis. J Clin Med 11:938. 10.3390/JCM11040938
- Zeier Z, Carpenter LL, Kalin NH, Rodriguez CI, McDonald WM, Widge AS, Nemeroff CB (2018) Clinical implementation of pharmacogenetic decision support tools for antidepressant drug prescribing. Am J Psychiatry 175:873–886. 10.1176/appi.ajp.2018.17111282
- Zeifman RJ, Wagner AC, Monson CM, Carhart-Harris RL (2023) How does psilocybin therapy work? An exploration of experiential avoidance as a putative mechanism of change. J Affect Disord 334:100–112. 10.1016/J.JAD.2023.04.105
- Zigmond AS, Snaith RP (1983) The hospital anxiety and depression scale. Acta Psychiatr Scand 67:361–370. 10.1111/J.1600-0447.1983.TB09716.X
Data Availability Statement
CMA data sheets are available on the Open Science Framework https://osf.io/p3fcx/.

