Abstract

Objectives To reanalyse SmithKline Beecham’s Study 329 (published by Keller and colleagues in 2001), the primary objective of which was to compare the efficacy and safety of paroxetine and imipramine with placebo in the treatment of adolescents with unipolar major depression. The reanalysis under the restoring invisible and abandoned trials (RIAT) initiative was done to see whether access to and reanalysis of a full dataset from a randomised controlled trial would have clinically relevant implications for evidence based medicine.

Main outcome measures The prespecified primary efficacy variables were change from baseline to the end of the eight week acute treatment phase in total Hamilton depression scale (HAM-D) score and the proportion of responders (HAM-D score ≤8 or ≥50% reduction in baseline HAM-D) at acute endpoint. Prespecified secondary outcomes were changes from baseline to endpoint in depression items in K-SADS-L, clinical global impression, autonomous functioning checklist, self-perception profile, and sickness impact scale; predictors of response; and number of patients who relapse during the maintenance phase. Adverse experiences were to be compared primarily by using descriptive statistics. No coding dictionary was prespecified.

Results The efficacy of paroxetine and imipramine was not statistically or clinically significantly different from placebo for any prespecified primary or secondary efficacy outcome. HAM-D scores decreased by 10.7 (least squares mean) (95% confidence interval 9.1 to 12.3), 9.0 (7.4 to 10.5), and 9.1 (7.5 to 10.7) points, respectively, for the paroxetine, imipramine and placebo groups (P=0.20). There were clinically significant increases in harms, including suicidal ideation and behaviour and other serious adverse events in the paroxetine group and cardiovascular problems in the imipramine group.

Conclusions Neither paroxetine nor high dose imipramine showed efficacy for major depression in adolescents, and there was an increase in harms with both drugs. Access to primary data from trials has important implications for both clinical practice and research, including that published conclusions about efficacy and safety should not be read as authoritative. The reanalysis of Study 329 illustrates the necessity of making primary trial data and protocols available to increase the rigour of the evidence base.

Introduction

In 2013, in the face of the selective reporting of outcomes of randomised controlled trials, an international group of researchers called on funders and investigators of abandoned (unpublished) or misreported trials to publish undisclosed outcomes or correct misleading publications.1 This initiative was called “restoring invisible and abandoned trials” (RIAT). The researchers identified many trials requiring restoration and emailed the funders, asking them to signal their intention to publish the unpublished trials or publish corrected versions of misreported trials. If funders and investigators failed to undertake to correct a trial that had been identified as unpublished or misreported, independent groups were encouraged to publish an accurate representation of the clinical trial based on the relevant regulatory information.

The current article represents a RIAT publication of Study 329. The original study was funded by SmithKline Beecham (SKB; subsequently GlaxoSmithKline, GSK). We acknowledge the work of the original investigators. This double blinded randomised controlled trial to evaluate the efficacy and safety of paroxetine and imipramine compared with placebo for adolescents diagnosed with major depression was reported in the Journal of the American Academy of Child and Adolescent Psychiatry (JAACAP) in 2001, with Martin Keller as the primary author.2 The RIAT researchers identified Study 329 as an example of a misreported trial in need of restoration. The article by Keller and colleagues, which was largely ghostwritten,3 claimed efficacy and safety for paroxetine that was at odds with the data.4 This is problematic because the article has been influential in the literature supporting the use of antidepressants in adolescents.5

On 14 June 2013, the RIAT researchers asked GSK whether it had any intention to restore any of the trials it sponsored, including Study 329. GSK did not signal any intent to publish a corrected version of any of its trials. In later correspondence, GSK stated that the study by Keller and colleagues “accurately reflects the honestly-held views of the clinical investigator authors” and that GSK did “not agree that the article is false, fraudulent or misleading.”6

Study 329 was a multicentre eight week double blind randomised controlled trial (acute phase), followed by a six month continuation phase. SKB’s stated primary objective was to examine the efficacy and safety of imipramine and paroxetine compared with placebo in the treatment of adolescents with unipolar major depression. Secondary objectives were to identify predictors of treatment outcomes across clinical subtypes; to provide information on the safety profile of paroxetine and imipramine when these drugs were given for “an extended period of time”; and to estimate the rate of relapse among patients who responded to imipramine, paroxetine, and placebo and were maintained on treatment. Study enrolment took place between April 1994 and March 1997.

The first RIAT trial publication was a surgery trial that had been only partly published before.7 Few previously published randomised controlled trials have ever been subsequently reported in published papers by different teams of authors.8

Methods

We reanalysed the data from Study 329 according to the RIAT recommendations. To this end, we used the clinical study report (SKB’s “final clinical report”), including appendices A-G, which are publicly available on the GSK website,9 other publicly available documents,10 and the individual participant data accessed through the SAS Solutions OnDemand website,11 on which GSK subsequently also posted some Study 329 documents (available only to users approved by GSK). After negotiation,12 GSK posted about 77 000 pages of de-identified individual case report forms (appendix H) on that website. We used a tool for documenting the transformation from regulatory documents to journal publication, based on the CONSORT 2010 checklist of information to include when reporting a randomised trial. The audit record, including a table of sources of data consulted in preparing each part of this paper, is available in appendix 1.

Except where indicated, in accordance with RIAT recommendations, our methods are those set out in the 1994-96 protocol for Study 329.13 In cases where the methods used and published by Keller and colleagues diverged from the protocol, we followed the original protocol. Because the protocol-specified method of correcting for missing values (last observation carried forward) has been questioned in the intervening years, we also included a more modern method, multiple imputation, at the request of the BMJ peer reviewers. This is a post hoc method added for comparison only and is not part of our formal reanalysis. When the protocol was not specific, we chose by consensus standard methods that best presented the data. The original 1993 protocol had minor amendments in 1994 and 1996 (replacement of the Schedule for Affective Disorders and Schizophrenia for Adolescents-Present Version with the Lifetime Version (K-SADS-L) and reduction in the required sample size). Furthermore, the clinical study report (CSR) reported some procedures that varied from those specified in the protocol. We have noted the variations that we considered relevant.
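As an illustration of the difference between the two approaches, last observation carried forward can be sketched in a few lines; the weekly HAM-D values and the `locf` helper below are hypothetical, not taken from the study dataset:

```python
def locf(scores):
    """Carry the last observed value forward over missing (None) visits.

    `scores` holds one patient's weekly HAM-D totals in visit order; leading
    missing values stay missing because nothing precedes them.
    """
    filled, last = [], None
    for s in scores:
        if s is not None:
            last = s
        filled.append(last)
    return filled

# A hypothetical patient who dropped out after week 4
weekly = [22, 19, 17, 14, None, None, None, None]
print(locf(weekly))  # the week 4 score of 14 is carried to the week 8 endpoint
```

Multiple imputation, by contrast, replaces each missing value with draws from a model fitted to the observed data and pools the results across the imputed datasets, rather than freezing each dropout at the last observed score.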

Participants The original study recruited 275 adolescents aged 12-18 who met DSM-IV criteria14 for a current episode of major depression of at least eight weeks’ duration (the protocol specified DSM-III-R criteria, which are similar). Box 1 lists the eligibility criteria.

Box 1: Study eligibility criteria

Inclusion criteria

Adolescents aged 12-18 who met DSM-III-R criteria for major depression for at least 8 weeks

Severity score <60 on the children’s global assessment scale (CGAS)

Score ≥12 on the Hamilton depression scale (17 item) (HAM-D)

Medically healthy

IQ ≥80 (based on Peabody picture vocabulary test)

Exclusion criteria

Current or past DSM-III-R diagnosis of bipolar disorder, schizoaffective disorder, anorexia nervosa, bulimia, alcohol or drug abuse/dependence, obsessive-compulsive disorder, autism/pervasive mental disorder, or organic psychiatric disorder

Current (within 12 months) DSM-III-R diagnosis of post-traumatic stress disorder

Adequate trial of an antidepressant within six months (at least four weeks’ treatment with an adequate dose of antidepressant)

Suicidal ideation with a definite plan, suicide attempt during current depressive episode, or history of suicide attempt by drug overdose

Medical illness that contraindicates the use of heterocyclic antidepressants

Current use of psychotropic drugs (including anxiolytics, antipsychotics, mood stabilisers), or illicit drugs

Organic brain disease, epilepsy, or “mental retardation”

Pregnancy or lactation

Sexually active females not using reliable contraception

Use of an investigational drug within the previous 30 days or five half lives of the investigational drug

An unknown number of patients (not disclosed in the available documents) identified by telephone screening as potential participants were subsequently evaluated at the study site by a senior clinician (psychiatrist or psychologist). Multiple meetings and teleconferences were held by the sponsoring company with site study investigators to ensure standardisation across sites. Patients and parents were interviewed separately with the K-SADS-L. After this initial assessment, the patient and parent both signed the study informed consent form; there was no mention of a separate assent form in the protocol or in the CSR. A screening period of seven to ten days was used to obtain past clinical records and to document that the depressive symptoms were stable. At the end of the screening period, only patients continuing to meet the inclusion criteria (DSM-III-R major depression and a HAM-D total score ≥12) were randomised. There was no placebo lead-in phase. There were originally six study sites, but this number was increased to 12 (10 in the United States and two in Canada). The centres were affiliated with either a university or a hospital psychiatry department and had experience with adolescent patients. The investigators were selected for their interest in the study and their ability to recruit study patients. The recruitment period ran from 20 April 1994 until 15 March 1997, and the acute phase was completed on 7 May 1997. For a small number of patients who went into the continuation phase, 30 day follow-up data were collected into February 1998.

Patient involvement So far as we can ascertain, there was no patient involvement in SKB’s study design.

Interventions The study drug was provided to patients in weekly blister packs. Patients were instructed to take the drug twice daily. There were six dosing levels. Over the first four weeks, all patients were titrated to level four, corresponding to 20 mg paroxetine or 200 mg imipramine, regardless of response. Non-responders (those failing to reach responder criteria) could be titrated up to level five or six over the next four weeks. This corresponds to maximum doses of 60 mg paroxetine and 300 mg imipramine. Compliance with treatment was evaluated from the number of capsules dispensed, taken, and returned. Non-compliance was defined as taking less than 80% or more than 120% of the number of capsules, assessed from the numbers expected to be returned at two consecutive visits, and resulted in withdrawal. Any patient missing two consecutive visits was also withdrawn from the study. Patients were provided with 45 minute weekly sessions of supportive psychotherapy,15 primarily for the purpose of assessing the effects of treatment.
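The compliance rule can be expressed as a simple check; the capsule counts and function names below are illustrative, not the trial's actual data handling:

```python
def compliance_ratio(dispensed, returned, expected_taken):
    """Fraction of the expected number of capsules actually taken."""
    return (dispensed - returned) / expected_taken

def is_non_compliant(ratio):
    """Protocol rule: taking <80% or >120% of the expected number of capsules."""
    return ratio < 0.80 or ratio > 1.20

# A hypothetical week: 14 capsules dispensed, 2 returned, 14 expected to be taken
ratio = compliance_ratio(14, 2, 14)
print(round(ratio, 2), is_non_compliant(ratio))  # about 0.86 taken: compliant
```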

Sample size The acute phase of the trial was initially based on a power analysis that indicated that a sample size of 100 patients per treatment group was required for a statistical power of 80% with a two tailed α of 0.05 and an effect size of 0.40. This effect size entailed a difference of 4 in the HAM-D total score from baseline to endpoint, specified in the protocol to be large enough to be clinically meaningful, assuming a standard deviation of 10. No allowance was made in the power calculation for attrition (anticipated dropout rate) or non-compliance during the study. Recruitment was slower than expected, and supplies of treatment (mainly placebo) reportedly ran short because stock exceeded its expiry date. The researchers carried out a midcourse evaluation of 189 patients, without breaking the blinding, which showed less variability in HAM-D scores (SD 8) than expected. The recruitment target was therefore reduced to 275 on the grounds that this would have no negative impact on the 80% power required to detect a 4 point difference between placebo and active drug groups.
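The protocol's arithmetic can be reproduced with a standard normal-approximation sample size formula; the helper below is a sketch under the stated assumptions (two tailed α=0.05, 80% power), not the original power calculation:

```python
from math import ceil
from statistics import NormalDist

def n_per_group(effect_size, alpha=0.05, power=0.80):
    """Approximate per-group n for a two sample, two tailed comparison of means."""
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)  # ≈ 1.96 for alpha = 0.05
    z_beta = z.inv_cdf(power)           # ≈ 0.84 for 80% power
    return ceil(2 * ((z_alpha + z_beta) / effect_size) ** 2)

# Protocol assumptions: 4 point HAM-D difference with SD 10 → effect size 0.40
print(n_per_group(4 / 10))  # → 99, consistent with the planned 100 per group
# Midcourse estimate of SD 8 → effect size 0.50, supporting the reduced target
print(n_per_group(4 / 8))   # → 63 per group
```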

Randomisation A computer generated randomisation list of 360 numbers for the acute phase was produced and held by SKB. According to the CSR, treatments were balanced in blocks of six consecutive patients; however, there is an inconsistency: the randomisation code in appendix A details block sizes of both six and eight. Each investigator was allocated a block of consecutively numbered treatment packs, and patients were assigned treatment numbers in strict sequential order. Patients were randomised in a 1:1:1 ratio to treatment with paroxetine, imipramine, or placebo.
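The block scheme that the CSR describes can be sketched as follows; the treatment labels and seed are illustrative, and (given the block size inconsistency noted in appendix A) this is not a reconstruction of the trial's actual list:

```python
import random

def block_randomisation(n_blocks, block=("PAR", "PAR", "IMI", "IMI", "PBO", "PBO"), seed=0):
    """1:1:1 allocation balanced within blocks of six consecutive patients."""
    rng = random.Random(seed)
    allocations = []
    for _ in range(n_blocks):
        b = list(block)
        rng.shuffle(b)  # permute treatments within each block of six
        allocations.extend(b)
    return allocations

schedule = block_randomisation(60)  # 60 blocks of six → a 360 number list
```

Balancing within blocks guarantees that after every sixth patient the three arms are exactly equal in size, whatever the final recruitment total.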

Blinding Paroxetine was supplied as film coated, capsule shaped yellow (10 mg) and pink (20 mg) tablets. Imipramine was bought commercially and supplied as green film coated round 50 mg tablets. “Paroxetine placebos” matched the paroxetine 20 mg tablets, and “imipramine placebos” matched the imipramine tablets. All tablets were over-encapsulated in bluish-green capsules to preserve blinding. The blinding was to be broken only in the event of a serious adverse event that the investigator thought could not be adequately treated without knowing the identity of the allocated study treatment. The identity of the study treatment was not otherwise to be disclosed to the investigator or to SKB staff associated with the study.

Outcomes Patients were evaluated weekly for the following outcome variables during the eight week acute treatment phase.

Primary efficacy variables The prespecified primary efficacy variables were change in total score on HAM-D16 from the beginning of the treatment phase to the endpoint of the acute phase and the proportion of responders at the end of the eight week acute treatment phase (longer than many antidepressant trials). Responders were defined as patients who had ≥50% reduction in the HAM-D or a HAM-D score of ≤8. (Scores on the HAM-D can vary from 0 to 52.)

Secondary efficacy variables The prespecified secondary efficacy variables were:

Changes from baseline to endpoint in:

Depression items in K-SADS-L

Clinical global impression (CGI)

Autonomous functioning checklist17

Self perception profile

Sickness impact scale

Predictors of response (endogenous subtypes, age, previous episodes, duration and severity of present episode, comorbidity with separate anxiety, attention deficit, and conduct disorder)

The number of patients who relapsed during the maintenance phase (referred to in the CSR and in this paper as “continuation phase”). Both before and after breaking the blind, however, the sponsors made changes to the secondary outcomes as previously detailed.4 We could not find any document that provided any scientific rationale for these post hoc changes,18 and the outcomes are therefore not reported in this paper.
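The protocol's responder criterion (≥50% reduction in HAM-D or an endpoint score ≤8) reduces to a two clause test; the function below is an illustrative restatement, not code from the original analysis:

```python
def is_responder(baseline_hamd, endpoint_hamd):
    """Protocol responder criterion: endpoint HAM-D ≤8, or a ≥50% fall from baseline."""
    return endpoint_hamd <= 8 or endpoint_hamd <= 0.5 * baseline_hamd

# Hypothetical patients: (baseline, endpoint)
print(is_responder(20, 9))   # 55% reduction → responder
print(is_responder(14, 8))   # endpoint ≤8 → responder
print(is_responder(20, 11))  # 45% reduction, endpoint >8 → non-responder
```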

Challenges in carrying out RIAT To our knowledge this is the first RIAT analysis of a misreported trial by an external team of authors, so there are no clear precedents or guides. The challenges we encountered included the following.

Potential or perceived bias A RIAT report is not intended to be a critique of a previous publication. The point is rather to produce a thorough independent analysis of a trial that has remained unpublished or has been called into question. We acknowledge, however, that any RIAT team might be seen as having an intrinsic bias in that questioning the earlier published conclusions is what brought some members of the team together. Consequently, we took all appropriate procedural steps to avoid such putative bias. In addition, we have made the data available for others to analyse.

Correction for testing multiple variables We had multiple sources of information: the protocol; the published paper; the documents posted on the GSK website, including the CSR and individual patient data; and the raw primary data in the case report forms provided by GSK on a remote desktop for this project. The protocol declared two primary and six secondary variables for the three treatment groups in two differing datasets (observed case and last observation carried forward). The CSR contained statistical comparisons on 28 discrete variables using two comparisons (paroxetine v placebo and imipramine v placebo) in the two datasets (observed case and last observation carried forward). The published paper listed eight variables with two statistical comparisons each in one dataset (last observation carried forward). The authors of the original paper, however, did not deal with the need for corrections for multiple variables, a standard requirement when there are multiple outcome measures. In the final analysis, there were no statistically or clinically significant findings for any outcome variable, so corrections were not needed for this analysis.
Statistical testing The protocol called for ANOVA testing (generalised linear model) for continuous variables using a model that included the effects of site, treatment, and site × treatment interaction, with the latter dropped if P≥0.10. Logistic regression (2×3 χ2) was prescribed for categorical variables under the same model. Both methods begin with an omnibus statistic for the overall significance of the dataset, then progress to pairwise testing if, and only if, the omnibus statistic meets α=0.05. Yet all statistical outcomes in the CSR and published paper were reported only as the pairwise values for two of the three possible comparisons (paroxetine v placebo and imipramine v placebo), with no mention of the omnibus statistic. Therefore, we conducted the required omnibus analyses, with negative results as shown. The pairwise values are available in table A in appendix 2.

Missing values The protocol called for evaluation of the observed case and last observation carried forward datasets, with the latter being definitive. The last observation carried forward method for correcting missing values was the standard at the time the study was conducted. It continues to be widely used, although newer methods such as multiple imputation or mixed models are superior. We chose to adhere to the protocol and use the last observation carried forward method, adding multiple imputation for comparison only.

Outcome variables not specified in protocol There were four outcome variables in the CSR and in the published paper that were not specified in the protocol. These were the only outcome measures reported as significant. They were not included in any version of the protocol as amendments (despite other amendments), nor were they submitted to the institutional review board. The CSR (section 3.9.1) states they were part of an “analysis plan” developed some two months before the blinding was broken.
No such plan appears in the CSR, and we have no contemporaneous documentation of that claim, despite having repeatedly requested it from GSK.

Conclusions We decided that the best and most unbiased course of action was to analyse the efficacy data in the individual patient data based on the last guaranteed a priori version of SKB’s own protocol (1994, amended in 1996 to accept a reduced sample size). Although the protocol omitted a discussion of corrections that we would have thought necessary, correction for multiple variables is designed to prevent false positives, and there were no positives. We agreed with the statistical mandates of the protocol; although we regarded pairwise comparisons in the absence of overall significance as inappropriate, we recognise that this is not a universal opinion, so we included the data in table A in appendix 2. Finally, although investigators can explore the data however they want, additional outcome variables outside those in the protocol cannot legitimately be declared once the study is under way, except as “exploratory variables,” which are appropriate for the discussion or as material for further study but not for the main analysis. The a priori protocol and blinding are the bedrock of a randomised controlled trial, guaranteeing that there is not even the possibility of the HARK phenomenon (“hypothesising after the results are known”). Though we can readily show that none of the reportedly “positive” four non-protocol outcome variables stands up to scrutiny, the primary mandate of the RIAT enterprise is to reaffirm essential practices in randomised controlled trials, so we did not include these variables in our efficacy analysis.
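The omnibus-then-pairwise gating required by the protocol's statistical plan can be sketched as follows; the pure-Python F statistic, the tiny example groups, and the critical value of 3.03 (roughly the 5% point of an F distribution with 2 and ~270 degrees of freedom) are illustrative assumptions, not figures from the study documents:

```python
def omnibus_f(*groups):
    """One way ANOVA F statistic across the treatment groups (pure Python sketch)."""
    k = len(groups)
    n = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n
    means = [sum(g) / len(g) for g in groups]
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    ss_within = sum(sum((x - m) ** 2 for x in g) for g, m in zip(groups, means))
    return (ss_between / (k - 1)) / (ss_within / (n - k))

def pairwise_allowed(f_stat, f_critical=3.03):
    """Proceed to pairwise comparisons only if the omnibus F meets the 5% criterion."""
    return f_stat >= f_critical

# Hypothetical HAM-D change scores for three small groups
flat = omnibus_f([10, 11, 12], [10, 11, 12], [10, 11, 12])    # no group differences
separated = omnibus_f([1, 2, 3], [11, 12, 13], [21, 22, 23])  # large differences
print(flat, pairwise_allowed(flat))            # omnibus fails: stop here
print(separated, pairwise_allowed(separated))  # omnibus passes: pairwise tests allowed
```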

Harm endpoints An adverse experience/event was defined in the protocol (page 18) as “any noxious, pathologic or unintended change in anatomical, physiologic or metabolic functions as indicated by physical signs, symptoms and/or laboratory changes occurring in any phase of the clinical trial whether associated with drug or placebo and whether or not considered drug related. This includes an exacerbation of pre-existing conditions or events, intercurrent illnesses, drug interaction or the significant worsening of the disease under investigation that is not recorded elsewhere in the case report form under specific efficacy assessments.” Adverse events were to be elicited by the investigator asking a non-leading question such as: “Do you feel different in any way since starting the new treatment/the last assessment?” Details of adverse events that emerged with treatment were documented, including their severity, any change in study drug administration, investigator attribution to study drug, any corrective therapy given, and outcome status. Attribution or relation to study drug was judged by the investigator to be “unrelated,” “probably unrelated,” “possibly related,” “probably related,” or “related.” Vital signs and electrocardiograms were obtained at weekly visits. Patients with potentially concerning cardiovascular measures either had their drug dose reduced or were withdrawn from the study. In addition, if the combined serum concentration (obtained at weeks four and eight) of imipramine and desipramine exceeded 500 ng/mL the patient was to be withdrawn from the study. Clinical laboratory tests, including clinical chemistry, haematology, and urinalysis, were carried out at the screening visit and at the end of week eight. Clinically relevant laboratory abnormalities were to be included as adverse events.

Source of harms data The harms data in this paper cover the acute phase, a taper period, and a follow-up phase of up to 30 days for those who discontinued treatment because of adverse events. To ensure comparability with the report by Keller and colleagues, none of the tables contains data from the continuation phase. Data on adverse events come from the CSR lodged on GSK’s website,19 primarily in appendix D. Appendix B provides details of concomitant drugs. Additional information was available from the summary narratives in the body of the CSR for patients who had adverse events that were designated as serious or that led to withdrawal. (Of the 11 patients taking paroxetine who experienced adverse events designated as serious, nine discontinued treatment because of these events.) However, the many other patients who discontinued because of adverse events not regarded as serious, or who discontinued for lack of efficacy or protocol violations, did not generate patient narratives. The tables in appendix D of the CSR provide the verbatim terms used by the blinded investigators, along with preferred terms as coded by SKB using the adverse drug events coding system (ADECS) dictionary. Appendix D also includes ratings of severity and of relatedness. We used the Medical Dictionary for Regulatory Activities (MedDRA) to code the verbatim terms provided in appendix D of the CSR. MedDRA is the international medical terminology developed under the auspices of the International Conference on Harmonisation of Technical Requirements for Registration of Pharmaceuticals for Human Use (ICH; www.meddra.org), endorsed by the FDA, and now used by GSK.20 Several limitations of the ADECS coding became clear when we examined the preferred terms assigned to the verbatim terms in appendix D of the CSR. Firstly, several verbatim terms had been left uncoded in ADECS.
Secondly, several adverse events found in the patient narratives of serious adverse events that led to discontinuation from the trial were not transcribed into appendix D. We therefore approached GSK for access to the case report forms (appendix H of the CSR), which are not publicly available. GSK made available all 275 case report forms for patients entered into Study 329. These forms, however, which totalled about 77 000 pages, were available only through a remote desktop facility (SAS Solutions OnDemand Secure Portal),11 which made it difficult and extremely time consuming to inspect the records properly.21 Effectively only one person could undertake the task, with backup for ambiguous cases. Accordingly, we could not examine all case report forms. Instead we decided to focus on the 85 participants identified in appendices D and G of the CSR as withdrawn from the study, along with eight further participants who were known from our inspection of the CSR to have become suicidal. Of the case report forms that were checked, 31 were from the paroxetine group, 40 from the imipramine group, and 22 from the placebo group. All case report forms were reviewed by JLN, who was trained in the use of MedDRA. The second reviewer (JMN), a clinician, was not trained in the MedDRA system, but training is not necessary for coding of dropouts. These two reviewers agreed about reasons for discontinuation and coding of side effects (we did not use a quantitative indicator of agreement between raters). We scrutinised these 93 case report forms for all adverse events occurring during the acute, taper, and follow-up phases, and compared our totals for adverse events with the totals reported in appendix D of the CSR. This review process identified additional adverse events that had not been recorded as verbatim terms in appendix D of the CSR. It also led to the recoding of several of the reasons for discontinuation.
Tables B, C, and H in appendix 2 show the new adverse events and the reasons for changing the discontinuation category. At least 1000 pages were missing from the case report forms we reviewed, with no discernible pattern to missing information—for example, one form came with a page inserted stating that pages 114 to 223 were missing, without indicating reasons.

Coding of adverse events

Choice of coding dictionary for harms The protocol (page 25) indicates that adverse events were to be coded and compared by preferred term and body system by using descriptive statistics but does not prespecify a choice of coding dictionary for generating preferred terms from verbatim terms. The CSR (written after the study ended) specifies that the adverse events noted by clinical investigators in this trial were coded with ADECS, which was being used by SKB at the time. This system was derived from a coding system developed by the US Food and Drug Administration (FDA), Coding Symbols for a Thesaurus of Adverse Reaction Terms (COSTART), but ADECS is not itself a recognised system and is no longer available. We coded adverse events using MedDRA, which has replaced COSTART at the FDA and is by far the most commonly used coding system today. For coding purposes, we took the original terms used by the clinical investigators, as transcribed into appendix D of the CSR, and applied MedDRA codes to these descriptions. Information from appendix D was transcribed into spreadsheets (available at www.Study329.org). The verbatim terms and the ADECS coding terms were transcribed first into these sheets, allowing all coding to be done before the drug names were added in. The transcription was carried out by a research assistant who was a MedDRA trained coder but took no part in the actual coding. All coding was carried out by JLN and checked by DH, or vice versa. All of our coding from the verbatim terms in appendix D of the CSR was done blind, as was coding from the case report forms. We present results as SKB presented them in the CSR using the ADECS dictionary (table 14.2.1), and as coded by us using MedDRA. In general, MedDRA coding stays closer than ADECS to the original clinician description of the event.
For instance, MedDRA codes “sore throat” as “sore throat” but SKB, using ADECS, coded it as “pharyngitis” (inflammation of the throat). Sore throats can arise from pharyngitis, but in someone taking a selective serotonin reuptake inhibitor they can indicate a dystonic reaction in the oropharyngeal area.22 Classifying a problem as a “respiratory system disorder” (inflammation) rather than as a “dystonia” (a central nervous system disorder) can make a considerable difference to the apparent adverse event profile of a drug. In staying closer to the original description of events, MedDRA codes suicidal events as “suicidal ideation” or “self harm/attempted suicide” rather than the ADECS option of “emotional lability”; similarly, aggression is more clearly flagged as “aggressive events” rather than as “hostility.” Most coding was straightforward: nearly all the verbatim terms mapped directly onto coding terms in MedDRA. Coding challenges usually related to cases in which there were significant adverse events but the patients were designated by SKB as having discontinued for lack of efficacy; there was no patient narrative for such patients, in contrast to patients deemed to have discontinued because of an adverse event occurring at discontinuation. Appendix 3 shows our coding of cases in which suicidal and self injurious behaviours were considered.
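A fragment of the coding spreadsheets might look like the following; the verbatim terms and the mapping are hypothetical, assembled only from the examples discussed in the text, and are not entries from the actual Study 329 sheets:

```python
# Hypothetical extract: verbatim investigator term →
# (ADECS preferred term as in the CSR, MedDRA preferred term as recoded)
CODING = {
    "sore throat":         ("pharyngitis",        "sore throat"),
    "wants to hurt self":  ("emotional lability", "suicidal ideation"),
    "aggressive outburst": ("hostility",          "aggression"),
}

def recode(verbatim):
    """Return the MedDRA preferred term for a verbatim investigator description."""
    _adecs, meddra = CODING[verbatim.lower()]
    return meddra

print(recode("Sore throat"))  # MedDRA keeps the clinician's own description
```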

Analysis of harms data In analysing the harms data for the safety population, we firstly explored the discrepancies in the number of events between the case report forms and the CSR. Secondly, we presented all adverse events rather than only those occurring above a particular rate (as Keller and colleagues did). Thirdly, we grouped events into broader system organ class (SOC) groups: psychiatric, cardiovascular, gastrointestinal, respiratory, and other. Table D in appendix 2 summarises all adverse events by all MedDRA SOC groupings. Fourthly, we broke down events by severity, selecting adverse events coded as severe and using the listing in appendix G of the CSR of patients who discontinued for any reason. Fifthly, we analysed the effects of previous treatment, presenting the run-in phase profiles of drugs taken by patients entering each of the three arms of the study and comparing the adverse events experienced by patients taking concomitant drugs (from appendix B) with those of patients not taking other drugs. Finally, we extracted the events occurring during the taper and follow-up phases. We did not undertake statistical tests of harms data, as discussed below.
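The descriptive tallies underlying such tables amount to counting (arm, SOC) pairs; the records below are invented for illustration and do not reproduce any Study 329 counts:

```python
from collections import Counter

# Invented adverse event records: (treatment arm, MedDRA system organ class)
events = [
    ("paroxetine", "psychiatric"),
    ("paroxetine", "psychiatric"),
    ("imipramine", "cardiovascular"),
    ("placebo",    "gastrointestinal"),
]

def tally_by_arm_and_soc(records):
    """Descriptive tally of adverse events per treatment arm within each SOC group."""
    return Counter(records)

counts = tally_by_arm_and_soc(events)
print(counts[("paroxetine", "psychiatric")])  # → 2
```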

Patient withdrawal A study patient could withdraw or be withdrawn prematurely for “adverse experiences including intercurrent illness,” “insufficient therapeutic effect,” “deviation from protocol including non-compliance,” “loss to follow-up,” “termination by SB [SKB],” and “other (specify).” The CSR states that the primary reason for withdrawal was determined by the investigator. We reviewed the codes given for discontinuation from the study, which are found in appendix G of the CSR, and we made changes in a proportion of cases.