FORWARD-4 (ClinicalTrials.gov ID: NCT02158533) and FORWARD-5 (NCT02218008) were two global, phase III, multicenter, randomized, double-blind, placebo-controlled, SPCD studies conducted at 54 and 57 sites, respectively, evaluating BUP/SAM plus continued ADT. The studies were identical in design (Fig. 1), except for the timing and requirement of the safety follow-up visit. Both studies evaluated a BUP/SAM 2 mg/2 mg dose. In addition, FORWARD-4 evaluated a 0.5 mg/0.5 mg dose and FORWARD-5 a 1 mg/1 mg dose. Treatment durations were 5 weeks for stage 1 and 6 weeks for stage 2. The same eligibility (inclusion/exclusion) criteria, efficacy assessments, and frequency of treatment visits were used in both studies (see Supplementary information for additional details). Both studies used enhanced blinding (masking), in which the overall study design, criteria for randomization, and points of randomization were blinded to the site investigators, study staff, and patients. Site investigators and study staff, including the statisticians, were blinded until the database was locked for each study. The sponsor designed the trial in collaboration with the authors and conducted the data analyses according to a statistical analysis plan (see the Statistical analysis section).

Fig. 1 FORWARD-4 and FORWARD-5 study design. ADT antidepressant therapy; BUP buprenorphine; SAM samidorphan.

The study protocols were reviewed by an independent ethics committee or institutional review board at each site, and the studies were conducted following the principles of Good Clinical Practice derived from the Declaration of Helsinki, and in accordance with local regulations and International Council for Harmonisation guidelines.

Patients

Female and male patients aged 18–70 years were eligible if they met Diagnostic and Statistical Manual of Mental Disorders, Fourth Edition, Text Revision criteria for MDD, their current major depressive episode (MDE) had lasted 8 weeks to 24 months, and they had experienced an inadequate response to one or two ADTs during the current MDE. Inadequate ADT response was defined as <50% reduction in symptom severity with an adequate dose of an FDA-approved ADT for ≥8 weeks (including up to 3 weeks for titration into the adequate dose range and a stable dose for ≥4 weeks), assessed by the Massachusetts General Hospital Antidepressant Treatment Response Questionnaire [31]. Inadequate responses were verified by remote raters reviewing historic records and/or prospectively collected response data. Additional study design details are described in the Supplementary information.

Randomization and treatment stages

The hallmark of SPCD is the presence of two double-blind, placebo-controlled stages. In FORWARD-4 and FORWARD-5, patients entering stage 1 were randomized (2:2:9) to receive BUP/SAM 2 mg/2 mg, BUP/SAM low-dose (0.5 mg/0.5 mg or 1 mg/1 mg), or placebo administered as a once-daily sublingual tablet for 5 weeks with continued ADT (SSRI, SNRI, or bupropion) (Fig. 1). Patients assigned to BUP/SAM 2 mg/2 mg initiated treatment with the following 1-week blinded titration period: BUP/SAM 0.5 mg/0.5 mg for the first 3 days, BUP/SAM 1 mg/1 mg on days 4–7, and BUP/SAM 2 mg/2 mg thereafter. Patients assigned to BUP/SAM 1 mg/1 mg initiated treatment with BUP/SAM 0.5 mg/0.5 mg for the first 3 days and BUP/SAM 1 mg/1 mg thereafter. Patients assigned to the 0.5 mg/0.5 mg dose did not undergo titration.
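The titration rules above can be summarized as a small lookup. The following is an illustrative sketch only, not study software; the function name `titrated_dose_mg` and the convention that day 1 is the day of the first dose are assumptions made for this example.

```python
def titrated_dose_mg(target_dose_mg: float, study_day: int) -> float:
    """Return the daily BUP dose in mg (SAM is given at the same dose).

    Assumption for this sketch: study_day 1 is the day of the first dose.
    """
    if target_dose_mg not in (0.5, 1.0, 2.0):
        raise ValueError("target dose must be 0.5, 1.0, or 2.0 mg")
    if study_day < 1:
        raise ValueError("study_day starts at 1")
    if study_day <= 3:
        # Days 1-3: every active arm receives 0.5 mg/0.5 mg.
        return 0.5
    if study_day <= 7:
        # Days 4-7: step up to 1 mg/1 mg unless the target is lower.
        return min(target_dose_mg, 1.0)
    # Day 8 onward: the assigned target dose.
    return target_dose_mg
```

For example, `titrated_dose_mg(2.0, 5)` returns `1.0`, whereas `titrated_dose_mg(0.5, 5)` returns `0.5` because the lowest-dose arm is not titrated.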

At the conclusion of stage 1, patients receiving placebo were blindly determined to be nonresponders if they had a Montgomery–Åsberg Depression Rating Scale (MADRS)-10 [32] score >15 at week 5 and a <50% reduction in MADRS-10 score from baseline to week 5. Placebo nonresponders were then rerandomized in stage 2 in a 1:1:1 ratio to BUP/SAM 2 mg/2 mg, low-dose BUP/SAM, or placebo for a 6-week treatment period. Placebo responders from stage 1 remained on placebo in stage 2. Patients receiving BUP/SAM in stage 1 remained on the same dose during stage 2 (Fig. 1).
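The placebo nonresponder criterion combines an absolute threshold with a relative one. A minimal sketch of that rule follows, assuming MADRS-10 totals at baseline and week 5 are available as plain numbers; the function name is hypothetical and the example scores in the usage note are invented.

```python
def is_placebo_nonresponder(baseline_madrs10: float, week5_madrs10: float) -> bool:
    """Apply the stage 1 nonresponder rule to a placebo-treated patient.

    Nonresponder: week 5 MADRS-10 score > 15 AND < 50% reduction from baseline.
    """
    if baseline_madrs10 <= 0:
        raise ValueError("baseline MADRS-10 score must be positive")
    pct_reduction = 100.0 * (baseline_madrs10 - week5_madrs10) / baseline_madrs10
    return week5_madrs10 > 15 and pct_reduction < 50.0
```

With invented scores, a patient going from 30 to 20 is a nonresponder (score above 15, reduction of about 33%), whereas a patient going from 40 to 18 is not (score above 15, but a 55% reduction).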

Patients could enter a long-term safety study of BUP/SAM (ClinicalTrials.gov ID: NCT02141399) immediately after the last treatment visit (FORWARD-5) or the safety follow-up visit (FORWARD-4). For both studies, treatment was stopped without dose tapering.

Efficacy assessments

Efficacy assessments included two scores derived from the MADRS. MADRS-10 was the sum of all ten items in the MADRS. MADRS-6 was the sum of six MADRS items representing core MDD symptoms as per Bech recommendations [33] (see Supplementary information). The MADRS was administered weekly. In both studies, longitudinal data were analyzed in statistical models to estimate the BUP/SAM versus placebo difference in change from baseline in MADRS score at each timepoint within each stage. These estimates were averaged across the two stages using prespecified equal weights. Using these combined-stage estimates, the primary endpoint in FORWARD-4 was change from baseline to week 5 in MADRS-10. In FORWARD-5, three primary endpoints were defined hierarchically and tested sequentially to control Type 1 error for multiplicity, as prespecified in the protocol and statistical analysis plan. Two of these endpoints analyzed the average difference between BUP/SAM and placebo in change from baseline to week 3 through end-of-treatment (EOT) using MADRS-6 and MADRS-10 scores. The third primary endpoint used change from baseline to EOT. To control for multiplicity due to multiple BUP/SAM doses, hypothesis tests were conducted in a prespecified fixed sequence in which the tests comparing BUP/SAM 2 mg/2 mg with placebo were conducted before those comparing lower BUP/SAM doses with placebo. These endpoints were also evaluated in FORWARD-4 as part of a post hoc analysis.
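The fixed-sequence logic described above can be illustrated with a short sketch. It assumes the general fixed-sequence procedure, in which each hypothesis is tested at the full alpha and formal testing stops at the first nonsignificant result; the use of a strict `p < alpha` comparison, the hypothesis labels, and the p-values in the usage note are placeholders for illustration and are not taken from the studies.

```python
from typing import List, Tuple


def fixed_sequence_test(
    ordered_p_values: List[Tuple[str, float]], alpha: float = 0.05
) -> List[Tuple[str, bool]]:
    """Test hypotheses in their prespecified order at the full alpha.

    Once one hypothesis fails to reach significance, all later hypotheses
    in the sequence are reported as not rejected, whatever their p-values.
    """
    results = []
    still_testing = True
    for name, p_value in ordered_p_values:
        rejected = still_testing and p_value < alpha
        results.append((name, rejected))
        if not rejected:
            still_testing = False  # the gate closes for everything downstream
    return results
```

For instance, `fixed_sequence_test([("H1", 0.01), ("H2", 0.20), ("H3", 0.001)])` rejects only `H1`: although `H3` has a small p-value, it sits behind the failed `H2` and cannot be declared significant.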

Secondary efficacy endpoints for both studies included MADRS response (≥50% reduction in MADRS-10 score from baseline to week 5 in FORWARD-4 and EOT in FORWARD-5) and remission (MADRS-10 score ≤10 at week 5 and EOT, respectively).

An additional endpoint was the change in the clinician-rated Hamilton Anxiety Rating Scale (HAM-A) [34].

Safety assessments

Safety and tolerability were assessed in all randomized patients who received at least one dose of study drug, based on adverse events (AEs), clinical laboratory parameters, and echocardiogram parameters. AEs of special interest (AESI) were assessed to evaluate abuse potential, dependence, and withdrawal, as well as AEs associated with suicidal ideation and/or behavior, sexual dysfunction, and hypomania/mania (see Supplementary information). Withdrawal was assessed objectively using the Clinical Opiate Withdrawal Scale (COWS). Suicidal ideation and behavior were assessed at each visit with the Columbia-Suicide Severity Rating Scale (C-SSRS).

Statistical analysis

Mixed models for repeated measures were used to assess change from baseline at each timepoint for all treatment arms as well as the BUP/SAM versus placebo difference during both stages of the SPCD study. As specified, some endpoints were based on single timepoint estimates and some on the average of estimates from multiple timepoints. Models included treatment group, visit, treatment group-by-visit interaction, site region, and site region-by-treatment interaction as categorical fixed effects, and baseline value and baseline-by-visit interaction as covariates. Random effects associated with patients were included as part of the marginal covariance matrix (specified as unstructured), as recommended for longitudinal data with continuous outcomes. Primary analyses were based on a weighted combined-stage analysis using equal weights for the BUP/SAM versus placebo differences derived from stage-specific models. Type 1 error due to multiplicity was controlled by testing each hypothesis (two-sided alpha = 0.05) in a prespecified fixed sequence for the primary endpoints. Efficacy analyses were performed using all randomized patients who received at least one dose of study drug and had at least one postbaseline MADRS measurement in the given stage. All statistical analyses on efficacy were conducted using SAS v9.4 (SAS Institute, Inc., Cary, NC). Sample size and power calculations were conducted using SAS v9.3 (SAS Institute, Inc., Cary, NC). A full description of the statistical analysis is provided in the Supplementary information.
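To make the equal-weight combination concrete, the sketch below combines two stage-specific treatment differences into a single estimate. It is not the study's SAS implementation: it assumes the two stage estimates can be treated as independent and uses a normal approximation for the two-sided p-value, and the input values in the usage note are invented.

```python
import math
from typing import Tuple


def combine_stages(
    diff1: float, se1: float, diff2: float, se2: float, w: float = 0.5
) -> Tuple[float, float, float, float]:
    """Combine stage 1 and stage 2 treatment differences with weight w.

    Assumptions of this sketch: the stage estimates are independent, and
    the test statistic is referred to a standard normal distribution.
    Returns (estimate, standard error, z statistic, two-sided p-value).
    """
    if not 0.0 <= w <= 1.0:
        raise ValueError("w must lie between 0 and 1")
    estimate = w * diff1 + (1.0 - w) * diff2
    std_err = math.sqrt((w * se1) ** 2 + ((1.0 - w) * se2) ** 2)
    z = estimate / std_err
    # Two-sided p-value for a standard normal: P(|Z| > |z|) = erfc(|z| / sqrt(2)).
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    return estimate, std_err, z, p_value
```

With invented inputs, `combine_stages(-2.0, 1.0, -1.0, 1.0)` gives a combined difference of -1.5 with a standard error of about 0.71. The default `w = 0.5` corresponds to the prespecified equal weights.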

In post hoc analyses, effect sizes (Hedges’ g) were calculated for each stage of the SPCD trial and for the stages combined as described in the Supplementary information.
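For readers unfamiliar with Hedges' g, the sketch below shows the common textbook form: a standardized mean difference based on the pooled standard deviation, multiplied by a small-sample correction factor. The exact computation used in these analyses is described in the Supplementary information and may differ (for example, by using model-based estimates); the function name and the inputs in the usage note are illustrative assumptions.

```python
import math


def hedges_g(
    mean1: float, sd1: float, n1: int, mean2: float, sd2: float, n2: int
) -> float:
    """Hedges' g for two independent groups (textbook form).

    g = J * (mean1 - mean2) / s_pooled, where J = 1 - 3 / (4 * (n1 + n2) - 9)
    is the usual approximate small-sample correction.
    """
    if n1 < 2 or n2 < 2:
        raise ValueError("each group needs at least two observations")
    pooled_var = ((n1 - 1) * sd1**2 + (n2 - 1) * sd2**2) / (n1 + n2 - 2)
    cohens_d = (mean1 - mean2) / math.sqrt(pooled_var)
    correction = 1.0 - 3.0 / (4.0 * (n1 + n2) - 9.0)
    return correction * cohens_d
```

With invented summary statistics (mean changes of -10 and -8, a standard deviation of 4 in each group, and 50 patients per group), the uncorrected standardized difference is -0.5 and the correction factor shrinks it slightly toward zero.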

The pooled analysis plan was prespecified following unblinding of FORWARD-4 and before unblinding of FORWARD-5, and utilized the same endpoints. The pooled efficacy analysis population included placebo and BUP/SAM 2 mg/2 mg treatment groups.

All safety assessments were summarized using descriptive statistics for each stage in the individual study and pooled safety populations.