Using data from a large UK primary care database, The Health Improvement Network (THIN), we aimed to compare rates of stopping medication or add‐on of another psychotropic drug in individuals prescribed lithium, valproate, olanzapine or quetiapine as maintenance monotherapy for bipolar disorder. This outcome represents a combination of both effectiveness and tolerability of the study medication, and is similar to that used in many RCTs of maintenance treatment for bipolar disorder 9 , 10 .

The applicability of RCT results to people with bipolar disorder in the real world may be limited by the exclusion criteria adopted in those trials, and by diagnostic heterogeneity, diagnosis or treatment rejection, and complex presentations of the illness occurring over the life course 12 , 13 . These concerns have been raised when considering RCTs in other areas of medicine: applying their results to managing a lifelong illness of unpredictable course is not straightforward 14 , 15 . Necessary trials are also costly and difficult to run for sufficient periods in relation to the time course of bipolar disorder 16 . Electronic health records offer an opportunity to augment RCT findings with head‐to‐head comparison studies which include large numbers of patients, representative of real world clinical practice, and long follow‐up periods.

A number of drug treatments are recommended for maintenance in bipolar disorder. In the UK, the most commonly used medications are lithium, valproate, olanzapine and quetiapine 4 . This reflects previous National Institute for Health and Care Excellence (NICE) guidance on first‐line monotherapy maintenance treatment, which suggested equivalence of these drugs 5 . Globally, there is a range of prescribing advice, which includes additionally lamotrigine, carbamazepine, oxcarbazepine, aripiprazole and other second generation antipsychotics 6 - 8 . Recent meta‐analyses and network meta‐analyses have highlighted the superiority of lithium 9 , 10 , and these results have contributed to the change in NICE guidance in September 2014, where lithium is presented as first line 11 . However, no randomized controlled trial (RCT) has conclusively proved the benefit of lithium over other drugs, and there are no trials that compare valproate vs. olanzapine, valproate vs. quetiapine or olanzapine vs. quetiapine directly.

Bipolar disorder is a lifelong recurrent illness with high rates of hospitalization, suicide and comorbidity 1 . It is the sixth most common cause of disability in the world, responsible for the loss of more disability‐adjusted life years than all forms of cancer or major neurological conditions such as epilepsy and Alzheimer's disease 2 . Long‐term drug treatment is often required to prevent relapse or recurrence. Even with treatment, the proportion of people who remain in remission is low 3 .

Analysis using PS matching was then completed. Although matched analyses may include a non‐representative sample of patients receiving treatment, they may provide a more valid estimate of treatment effect as they compare patients with similar observed characteristics 35 , 37 . Each patient in the valproate, olanzapine and quetiapine groups was matched pairwise with an individual in the lithium‐treated group. Patients were matched on a one‐to‐one basis if their PS values were within 0.01 of each other; all unmatched patients were dropped from the analysis.
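As an illustration only, the one‐to‐one matching with a 0.01 caliper could be sketched as follows. The text does not state the matching algorithm, so greedy nearest‐neighbour matching without replacement is an assumption here, and the patient identifiers and scores are invented toy values.

```python
def match_one_to_one(treated, reference, caliper=0.01):
    """Greedy 1:1 nearest-neighbour matching on the propensity score.

    treated, reference: dicts mapping patient id -> propensity score.
    Returns a list of (treated_id, reference_id) pairs. Patients without
    a partner inside the caliper are dropped (assumed algorithm).
    """
    available = dict(reference)  # copy so the caller's dict is not modified
    pairs = []
    # Process the comparison-drug patients in order of increasing score.
    for t_id, t_ps in sorted(treated.items(), key=lambda kv: kv[1]):
        if not available:
            break
        # Closest remaining lithium patient by absolute score difference.
        r_id = min(available, key=lambda k: abs(available[k] - t_ps))
        if abs(available[r_id] - t_ps) <= caliper:
            pairs.append((t_id, r_id))
            del available[r_id]  # matching without replacement
    return pairs


# Toy scores, not taken from the study.
quetiapine = {"q1": 0.140, "q2": 0.300, "q3": 0.620}
lithium = {"l1": 0.145, "l2": 0.450, "l3": 0.295, "l4": 0.148}

pairs = match_one_to_one(quetiapine, lithium)
print(pairs)
```

In this toy example the third quetiapine patient has no lithium patient within the caliper and is dropped, which is how matching can leave a non‐representative sample.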

The PS was estimated by multinomial logistic regression, with the covariates described as independent variables and drug treatment as the dependent variable. The PS was then used as a linear term in a Cox regression analysis that also included age and calendar year 35 . Judged by the Akaike information criterion and Bayesian information criterion, this model was superior to stratifying on PS 36 , and it was a more efficient use of data than PS matching, because it uses all patients.
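For orientation, the two models can be written out in generic form. The notation is assumed for illustration and is not taken from the study: $T$ is the prescribed drug, $x$ the covariate vector, lithium the reference category, $e(x)$ the propensity score entered in the Cox model (the text does not specify which of the four estimated probabilities was used), $p$ the number of parameters, $n$ the sample size and $\hat{L}$ the maximized (partial) likelihood.

```latex
\begin{align}
  e_k(x) &= \Pr(T = k \mid X = x)
          = \frac{\exp(\beta_k^{\top} x)}{\sum_{j=1}^{4} \exp(\beta_j^{\top} x)},
          \qquad \beta_{\text{lithium}} = 0 \\
  h(t \mid T, x) &= h_0(t)\,
          \exp\Big( \sum_{k \neq \text{lithium}} \gamma_k\, \mathbf{1}[T = k]
          + \delta\, e(x) + \theta_1\, \text{age} + \theta_2\, \text{year} \Big) \\
  \mathrm{HR}_k &= \exp(\gamma_k) \\
  \mathrm{AIC} &= 2p - 2 \ln \hat{L},
          \qquad \mathrm{BIC} = p \ln n - 2 \ln \hat{L}
\end{align}
```

Under this formulation each hazard ratio compares a drug with lithium at a fixed PS, age and calendar year, and lower AIC or BIC values indicate the preferred model.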

Although PS estimation cannot remove all bias, it has been postulated to also reduce confounding from unmeasured variables, because of their association with measured covariates 31 , 32 . Therefore, in this study, for a given PS, exposure to lithium, valproate, olanzapine or quetiapine is presumed to have been at random 33 .

A propensity score (PS) for each individual was estimated using variables defined a priori, based on existing research 28 , 29 . The PS attempts to account for all of the covariates that predict receiving a particular study drug 29 , 30 . The PS was then checked by comparison of covariate balance across treatments, within strata. The included variables were: gender; age at start of treatment with the study drug; year of entry to the cohort; ethnicity (grouped as White, Black, Asian, mixed, other, with missing values coded as White); physical health history at baseline (ischemic heart disease, myocardial infarction, cerebrovascular event, hypertension, renal disease, thyroid disease, liver disease, type 2 diabetes mellitus, epilepsy, history of alcohol dependence, history of illicit drug use); smoking status (grouped as never‐smoker, ex‐smoker, current smoker); body mass index (BMI) (grouped as healthy weight, overweight (BMI 25 to 30), obese (BMI over 30)); mental health history at baseline (history of anxiety symptoms, hypomania as most proximal diagnosis code, history of depressive symptoms, sleep disturbance, previous treatment with the study drug before baseline, incident diagnosis of bipolar disorder); and clustering by GP practice. These variables were selected because they represent factors influencing prescribing choice (such as risk factors for adverse effects with a particular study medication) 11 .

Socio‐demographic, psychiatric and physical health characteristics at baseline were extracted from each patient's electronic health record. Psychiatric and physical health problems were considered present if referenced in the patient notes. If a patient had multiple entries of the same (or similar) Read codes, the start date of the condition was taken as the earliest date of entry.

Patients were considered to have a period of continuous prescribing if another prescription for the same drug was issued within three months of the predicted end date. If this did not occur, the date of stopping the study drug was the end date of the final prescription.
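The continuity rule can be sketched in a few lines. This is an illustration under stated assumptions rather than the study code: "three months" is approximated as 91 days, each prescription is represented as an (issue date, predicted end date) pair, and the dates are invented.

```python
from datetime import date, timedelta

GRACE = timedelta(days=91)  # assumption: "three months" taken as 91 days


def stop_date(prescriptions, grace=GRACE):
    """Return the end date of the first continuous prescribing period.

    prescriptions: list of (issue_date, predicted_end_date) tuples for one
    drug and one patient. A later prescription extends the period only if
    it was issued within `grace` of the current predicted end date.
    """
    ordered = sorted(prescriptions)
    current_end = ordered[0][1]
    for issued, predicted_end in ordered[1:]:
        if issued <= current_end + grace:
            current_end = max(current_end, predicted_end)
        else:
            break  # gap too long: treatment counts as stopped earlier
    return current_end


# Toy prescriptions: two close together, then a gap of more than three months.
rx = [
    (date(2010, 1, 1), date(2010, 1, 29)),
    (date(2010, 2, 20), date(2010, 3, 20)),
    (date(2010, 9, 1), date(2010, 9, 29)),
]
stopped = stop_date(rx)
print(stopped)
```

Here the September prescription falls outside the window, so the stop date is the end of the second prescription.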

Patients were followed up until they stopped the study drug, or had a mood stabilizer, an antipsychotic, an antidepressant or a benzodiazepine added to their treatment regimen. Date of first prescription was taken as the start of exposure time. The end of the prescription was calculated from the prescription length and prescribing instructions coded by the GP.

Patients with a diagnosis of bipolar disorder were included if they had at least one 28‐day prescription of lithium, valproate, olanzapine or quetiapine after January 1, 1995, or after the date at which the GP practice met quality assurance criteria for data entry (based on computer usage and mortality recording rates) 26 , 27 . Patients were excluded if they received a diagnosis of schizophrenia at any time. They were also excluded if they were prescribed another of the study drugs, or any other mood stabilizer, antipsychotic, antidepressant or benzodiazepine at the start of follow‐up, or in the month before this. The cohort was therefore one in which the intention was to treat with lithium, valproate, olanzapine or quetiapine monotherapy. Patients were censored at date of death, leaving the GP practice or the end of the study period (December 31, 2013).
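A minimal sketch of how follow‐up time and outcome status might be derived for one patient from these rules is given below. The function and argument names are hypothetical, the example dates are invented, and counting a failure that falls on the censoring date as a failure is an assumption.

```python
from datetime import date

STUDY_END = date(2013, 12, 31)  # end of the study period stated in the text


def follow_up(start, stop=None, add_on=None, death=None,
              left_practice=None, study_end=STUDY_END):
    """Return (years_of_follow_up, failed) for one patient.

    Failure is the earlier of stopping the study drug and the first add-on
    prescription. Censoring is the earliest of death, leaving the practice
    and the end of the study period.
    """
    failure_dates = [d for d in (stop, add_on) if d is not None]
    censor_dates = [d for d in (death, left_practice, study_end) if d is not None]
    failure = min(failure_dates) if failure_dates else None
    censor = min(censor_dates)
    if failure is not None and failure <= censor:
        end, failed = failure, True
    else:
        end, failed = censor, False
    return (end - start).days / 365.25, failed


# Add-on prescribed before the drug was stopped: failure at the add-on date.
years_a, failed_a = follow_up(date(2010, 1, 1), stop=date(2011, 1, 1),
                              add_on=date(2010, 7, 1))
# Still on monotherapy when leaving the practice: censored.
years_b, failed_b = follow_up(date(2012, 6, 1), left_practice=date(2013, 3, 1))
print(round(years_a, 2), failed_a, round(years_b, 2), failed_b)
```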

At the time this cohort was extracted, THIN contained records for over 11 million people 17 . Patients in the database have been shown to be broadly representative of the UK population, and GPs contributing data have been shown to be representative in terms of consultation and prescribing statistics 21 , 22 . Approximately 98% of the UK population is registered with a GP practice 23 . The incidence rate of bipolar disorder in THIN has been shown to be similar to other European cohorts 24 , and validity of severe mental illness diagnoses held in primary care has been established 25 .

THIN is a UK primary care database that contains anonymized patient information from routine clinical consultations 17 . General practitioners (GPs) use Read codes, a hierarchical coding system, to record information in THIN 18 . These codes include diagnoses (which map onto ICD‐10 codes), symptoms, examination findings, referrals, test results and information from hospital specialists, creating a longitudinal record for each patient 19 . In the UK, GPs are responsible for issuing all drug prescriptions if treatment is ongoing, following advice from a psychiatrist, and this information is also available 20 .

Supplementary analyses produced similar results to the primary analyses. If treatment failure was restricted to stopping the study drug or add‐on of a mood stabilizer or antipsychotic medication, PS adjusted HRs were elevated for all drugs compared to lithium (Table 3 ). The same was true if patients failing in the first three months of follow‐up were excluded from the analysis (Table 3 , Figure 2 ).

Individuals prescribed lithium or valproate were more likely to require antipsychotic add‐on (19.53% and 18.41%, respectively) than those prescribed olanzapine or quetiapine monotherapy (10.25% and 9.02%, respectively). Conversely, individuals prescribed olanzapine and quetiapine were more likely to require mood stabilizer add‐on (14.07% and 12.56%, respectively) compared to lithium and valproate (6.71% and 5.20%, respectively).

Lithium's superiority remained after adjustment for clustering by GP practice, age, gender, calendar year, and ethnicity. It also remained after adjusting for PS, age and calendar year, and after matching by PS (Table 2 ), with olanzapine having the least elevated hazard ratio (HR) (1.16, 95% CI: 1.05‐1.28). Compared to olanzapine, quetiapine had an increased rate of monotherapy failure (HR 1.12, 95% CI: 1.02‐1.23) in the PS adjusted model. Compared to valproate, olanzapine and quetiapine had similar rates of treatment failure (HR 0.97, 95% CI: 0.89‐1.06 and HR 1.09, 95% CI: 0.99‐1.19, respectively). The proportional hazards assumption held for all analyses. Before pairwise matching, PS scores were most different for lithium (median 0.45, interquartile range, IQR 0.25‐0.61) and quetiapine (median 0.14, IQR 0.08‐0.25). After matching, the median PS was 0.21 (IQR 0.13‐0.30) for lithium and 0.14 for quetiapine (IQR 0.08‐0.25).

In unadjusted analyses, the overall rate of treatment failure was increased for valproate, olanzapine and quetiapine when compared to lithium (Table 2 ). Treatment failure had occurred in 75% of those prescribed lithium by 2.05 years (95% CI: 1.63‐2.51), compared to 0.76 years (95% CI: 0.64‐0.84) for those prescribed quetiapine, 0.98 years (95% CI: 0.84‐1.18) for those prescribed valproate, and 1.13 years for those prescribed olanzapine (95% CI: 1.00‐1.31). The median time to treatment failure in the lithium monotherapy group was 0.28 years (95% CI: 0.23‐0.35), compared to 0.17 years (95% CI: 0.14‐0.21) in the quetiapine group, 0.22 years (95% CI: 0.19‐0.27) in the valproate group, and 0.24 years (95% CI: 0.21‐0.28) in the olanzapine group. The differences between treatments became more apparent the longer the duration of treatment (Figure 1 ).
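The times by which 50% and 75% of a group had failed are quantiles of the time‐to‐failure distribution. The text does not name the estimator, so the sketch below assumes the standard Kaplan‐Meier approach and uses invented toy data; it is meant only to show how such quantiles are read off in the presence of censoring.

```python
def km_failure_time(times, events, proportion):
    """Smallest time at which Kaplan-Meier cumulative failure reaches `proportion`.

    times: follow-up time per patient (years).
    events: 1 if treatment failure was observed, 0 if censored.
    Returns None if the proportion is never reached.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    i = 0
    while i < len(data):
        t = data[i][0]
        failures = 0
        removed = 0
        # Collect everyone whose follow-up ends at time t.
        while i < len(data) and data[i][0] == t:
            failures += data[i][1]
            removed += 1
            i += 1
        if failures:
            survival *= 1 - failures / at_risk
        at_risk -= removed
        if 1 - survival >= proportion:
            return t
    return None


# Toy data, not taken from the study.
times = [0.1, 0.2, 0.3, 0.5, 0.8, 1.2, 2.0, 3.0]
events = [1, 1, 0, 1, 1, 1, 0, 1]

median_time = km_failure_time(times, events, 0.50)
q75_time = km_failure_time(times, events, 0.75)
print(median_time, q75_time)
```

Censored patients leave the risk set without counting as failures, which is why the 75% point can lie well beyond the median when a few patients remain on monotherapy for a long time.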

A total of 14,396 individuals had a diagnosis of bipolar disorder. Of these, 5,089 were prescribed monotherapy with one of the study drugs at the start of cohort follow‐up: lithium was prescribed to 1,505 people, valproate to 1,173, olanzapine to 1,366 and quetiapine to 1,075 people. Individuals prescribed lithium tended to be older than other groups, with more years of follow‐up data and fewer GP practice contacts during this period. They were less likely to have a previous record of depression in their notes and less likely to be an incident case (Table 1 ).

DISCUSSION

As far as we are aware, this study represents the only head‐to‐head comparison of the four most common maintenance treatments for bipolar disorder, and has the longest follow‐up and largest cohort of any direct comparison of treatments for bipolar disorder. RCTs making these comparisons do not exist and are unlikely to be conducted.

The overall rate of treatment failure (represented by stopping index medication or requiring add‐on of a mood stabilizer, antipsychotic, antidepressant or benzodiazepine) was increased for valproate, olanzapine and quetiapine when compared to lithium. This was also true if failures within the first three months were excluded (i.e., once the patient had been stabilized on the prescribed drug). These results suggest that monotherapy with lithium may be more successful than the other recommended drugs. The rate of treatment failure was also elevated for quetiapine compared to olanzapine, while it was not possible to separate the other drugs from each other.

The use of contemporaneous, representative medical records avoided the risk of potential biases relating to selection into the study. Information bias should partially have been avoided by the use of prescribing data as exposure: in the UK, GPs are responsible for all ongoing prescribing within the national health system 20 , which is detailed and well recorded in THIN. However, exposure to the study drug was approximated through prescriptions issued to patients, and may not reflect how the patient used the medication. Poor adherence to prescribed drug regimens is a problem with all medications, and this is particularly true if side effects are unpleasant, as can be the case with all of the study drugs 39 , 40 . In this study, stopping the drug will be reflected in the outcome, but erratic adherence cannot be detected. It is possible that erratic adherence is more likely for drugs other than lithium (as this is more closely monitored through regular blood tests). This may have contributed to lithium's perceived superiority, but we found that patients prescribed lithium had fewer GP contacts, and other longitudinal cohort studies have not shown differential adherence 40 .

Treatment failure was defined as stopping the study drug or add‐on of any mood stabilizer, antipsychotic, antidepressant or benzodiazepine. It is likely that addition of a mood stabilizer or antipsychotic represents more serious treatment failure than addition of an antidepressant (which would only occur during a depressive relapse) or a benzodiazepine (which may be used short term to avoid a relapse). A supplementary analysis excluding addition of these drugs had similar results. It may be the case that both of these outcomes fail to capture what is important to patients in terms of relapse, recurrence, functioning and quality of life. However, through examining monotherapy treatment failure, we believe we have described a proxy for these important outcomes which captures both tolerability and effectiveness and highlights the very common need for adjunctive drug treatments. This outcome has also been used in a number of RCTs of maintenance drug treatment for bipolar disorder and therefore comparison with these results is possible. For example, the largest trial of lithium vs. valproate treatment had a primary outcome of “time to new intervention for an emerging mood episode” 41 . This trial found similar results to our study (HR 1.41, 95% CI: 1.00‐1.92), but was not powered to directly compare lithium and valproate.

A limitation of interpretation of data from cohort studies is the inability to rule out important confounding effects. We attempted to account for confounding by indication by building a PS model that included important clinical predictors of treatment allocation 30 . This included physical health variables which may lead a clinician to avoid a certain drug because of its side effect profile, e.g. renal disease with lithium or cardiovascular disease with olanzapine. Characteristics such as gender, age and BMI were also included, as valproate is contraindicated (though commonly prescribed) in women of childbearing potential 5 , and olanzapine has the potential to cause rapid weight gain 42 . Adjusting for the GP practice should account for physician preference for a particular drug. Once these covariates were adjusted for, there was a similar propensity for patients to be prescribed valproate, olanzapine or quetiapine, with patients prescribed lithium having slightly higher scores. Despite this, we cannot rule out the possibility that these confounders were imperfectly adjusted for, or that other important confounders were not included in the PS model.

Unfortunately, we were unable to separate treatment failure relating to emergent manic (or hypomanic) episodes from depressive episodes, and there is evidence that the study drugs may be differentially effective in preventing a particular polarity of illness 9 . However, an ideal “mood stabilizer” would protect against both polarities of relapse 43 , and this is what our study captures. We were also unable to examine the physician's reason for treatment initiation, and it may be that quetiapine's apparent inferiority is because in some patients it is prescribed not as maintenance treatment, but for shorter term indications (which we hoped to capture in the supplementary analysis). There were too few patients on monotherapy with other recommended maintenance treatments, such as lamotrigine or aripiprazole, to include these drugs in the analysis.

In conclusion, this study provides necessary supplementary and complementary evidence to RCT findings for maintenance treatments for bipolar disorder. In real world clinical practice, lithium appears to be the most effective treatment to prevent relapse or recurrence of bipolar disorder and may prolong the time before adjunctive prescribing is necessary. This finding echoes the results of recent meta‐analyses that suggest lithium is superior to other drugs in protecting against both manic and depressive relapse 9 , 10 . This is important as lithium is often avoided because of its side effect profile 44 , but monotherapy with valproate, olanzapine or quetiapine is more likely to fail sooner and may result in patients experiencing the additive side effects of multiple psychotropic drugs.