The main findings of this study are that treated patients with long-standing bipolar disorder (a duration of illness of about 20 years and at least five illness episodes in their history) did not differ significantly from controls in overall cognitive function or in verbal learning, recall, and recognition, regardless of whether lithium was part of their long-term treatment. The patients did, however, show impaired early visual information processing compared to healthy controls, with the lithium-treated patients performing worse than those not on lithium.

In contrast to the CVLT results from bipolar patients published by Martinez-Aran et al. ([2004]) and Fleck et al. ([2003]) (upon which we based our sample size estimation), our patients scored only about 6% lower overall than the healthy controls. Both patients and controls in the Martinez-Aran et al. study performed worse than ours in the total number of words remembered on list A (45.1 ± 11.4 vs. 58.6 ± 14.0 and 54.4 ± 9.6 vs. 59.2 ± 17.1, respectively). Patients in the Martinez-Aran study were roughly comparable to ours with regard to age (about 8 years younger), education, and duration of the disease (about 5 years shorter). Their controls were also roughly comparable to ours regarding age and education. In terms of medication status at testing, 83% of their patients were taking lithium (vs. 65% in our total patient sample), 30% carbamazepine (vs. 11%), and 18% valproate (vs. 15%). Twenty-eight percent of their patients received more than one mood-stabilizing drug (vs. 14%), 58% were treated with antipsychotics (38% with atypical ones vs. 22%), and 20% received antidepressants. Their patients’ medication status differs from ours both in the percentage use of drugs we allowed and in the use of medications on our exclusion list (20% received typical antipsychotics, and it is not known whether clozapine or tricyclic antidepressants were used). In the study by Fleck et al., the 14 euthymic bipolar patients performed about as poorly in the total number of words remembered on list A of the CVLT as the patients in the Martinez-Aran study (46.4 ± 14.3), whereas their controls were nearly as good as ours (58.9 ± 8.1). The Fleck study patients were about 15 years younger than ours and thus had a shorter duration of illness, but they were otherwise similar to ours with regard to education.
Of Fleck’s cohort, 79% were treated with mood stabilizers and 50% with antipsychotics; however, the authors do not provide greater detail on the type of medication administered, which hinders comparison with our patients. In the meta-analysis by Robinson et al. ([2006]) (which included the aforementioned studies), euthymic bipolar patients displayed moderate to severe impairment in verbal learning and memory (and executive functioning) compared to healthy controls (effect sizes of 0.7 to 0.9). Compared to patients with schizophrenia, bipolar patients (including those in our study) seem to be less severely impaired in cognitive performance, but they show a similar pattern of affected functions (see Daban et al. [2006] and Trivedi et al. [2007]). In a recent large individual patient data meta-analysis that included part of our sample, Bourne et al. found only moderate effect sizes for neurocognitive impairment (0.51 for the CVLT; Bourne et al. [2013]). They suggested that better control of the influence of age, gender, and IQ, as well as the inclusion of unpublished data, could explain the lower effect sizes at least in part.
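The CVLT comparisons above can be put on a common scale. The following is a minimal sketch, not part of the study's analysis, that converts the list A totals quoted for Martinez-Aran et al. and Fleck et al. into a relative deficit and a standardized mean difference. Because group sizes are not all given in the text, the pooled standard deviation assumes equal group sizes; this is an assumption made purely for illustration, and the function and variable names are hypothetical.

```python
from math import sqrt


def relative_deficit(patient_mean, control_mean):
    """Percentage by which the patient mean falls below the control mean."""
    return 100.0 * (control_mean - patient_mean) / control_mean


def cohens_d_equal_n(patient_mean, patient_sd, control_mean, control_sd):
    """Standardized mean difference, ASSUMING equal group sizes.

    With equal n, the pooled SD reduces to the root mean square of the
    two group SDs. Real group sizes would change the result slightly.
    """
    pooled_sd = sqrt((patient_sd ** 2 + control_sd ** 2) / 2.0)
    return (control_mean - patient_mean) / pooled_sd


# CVLT list A totals as (mean, SD), taken from the values quoted in the text:
# first tuple = patients, second tuple = controls of the same study.
studies = {
    "Martinez-Aran et al.": ((45.1, 11.4), (54.4, 9.6)),
    "Fleck et al.": ((46.4, 14.3), (58.9, 8.1)),
}

summary = {}
for name, ((p_mean, p_sd), (c_mean, c_sd)) in studies.items():
    deficit = relative_deficit(p_mean, c_mean)
    d = cohens_d_equal_n(p_mean, p_sd, c_mean, c_sd)
    summary[name] = (deficit, d)
    print(f"{name}: deficit {deficit:.1f}%, d (equal-n assumption) {d:.2f}")
```

These figures are illustrative only and are not the effect sizes reported in the cited meta-analyses, which pool many studies and use the actual group sizes.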

Our results are in line with published visual backward masking (VBM) results in that euthymic bipolar patients were significantly slower and made more errors than controls (see for instance Green et al. [1994a], Green et al. [1994b], MacQueen et al. [2001]). Contrary to our hypothesis, however, our lithium-treated patients demonstrated worse early information processing. This finding nevertheless seems to support the data from Fleming and Green, who observed that impairment in backward masking was partly associated with lithium treatment: bipolar patients on lithium presented a significantly higher critical inter-stimulus interval than patients not on lithium, while the latter still performed non-significantly worse than healthy controls (Fleming and Green [1995]).

We wish to scrutinize our initial hypothesis in several ways: (A) Was it reasonable to suggest that the neuroprotective effects identified in preclinical studies and the decreased dementia rates found in lithium-treated patients are associated phenomena, and that these can be examined by applying neurocognitive performance tests in this age group and with the duration of illness our patients presented? (B) Might lithium’s potential neuroprotective benefit compensate for the cognitive side effects seen with lithium treatment? We could also ask (C) whether our patient control group (non-Li) might have differed from the lithium group and whether its members might more frequently have received treatments that themselves exert a neuroprotective effect.

Regarding hypothesis A, and based on the results we present, we suggest that effective phase prophylaxis can reduce impairment in cognitive performance (independent of whether patients respond to lithium or to another mood-stabilizing medication). Rybakowski and Suwalska ([2010]) reported good cognitive functioning in conjunction with long-term lithium treatment in excellent responders. Lithium’s additional specific neuroprotective effects might compensate for the impairment in cognitive functioning caused by the illness’s neurodegenerative or toxic effects until brain areas are on the verge of substantial damage (and dementia). Drawing on data from the Danish registry, Kessing et al. ([2010]) demonstrated a reduced risk for dementia only in patients on lithium treatment, but not in those taking the other drugs studied (i.e., anticonvulsants, antipsychotics, antidepressants). Volumetric results from our study show, in line with our hypothesis, that the lithium-treated patients’ hippocampal volumes were larger than those of non-Li patients (who received the above-mentioned alternative mood-stabilizing medication) and were similar to those of healthy controls, independent of long-term treatment response including the number of episodes while on lithium (Hajek et al. [2013]). Moreover, our non-Li group patients’ prefrontal NAA levels were lower, while the Li group’s were similar to those in the healthy controls (Hajek et al. [2012]), with NAA assumed to be a marker of cellular integrity. Since general cognitive functioning and verbal learning and memory were not substantially impaired in our non-Li patients, the smaller hippocampal volumes might still suffice for functioning, with no substantial impact until a greater loss of hippocampal volume has occurred.

Regarding hypothesis B: We suggest that we still observed minor cognitive impairment in our long-term ill lithium-treated bipolar patients compared to healthy controls because lithium’s side effects were balanced but not outweighed by its potential neuroprotective effects. In their meta-analysis, Wingo et al. showed that lithium treatment was associated with rather minor cognitive impairment (Wingo et al. [2009]). In the two longitudinal studies included in this meta-analysis, cognitive performance was fairly stable over time in the lithium-treated patients (Smigan and Perris [1983], Engelsmann et al. [1988]). This is in line with our finding that duration of illness had only a negligible influence on overall cognitive performance, verbal learning, and memory as well as early visual information processing (keeping in mind that our inclusion criteria left little variance in duration of illness). Results from a recently published prospective study showed that the neurocognitive performance of bipolar patients on lithium (monotherapy in half, combination therapy with antidepressants or neuroleptics in the other half) was stable over the 6 years of follow-up (Mora et al. [2013]). However, as in the aforementioned studies by Martinez-Aran et al. and Fleck et al., their patients were more severely impaired than ours (CVLT total words remembered on list A: baseline 51.5 ± 10.0, 6-year follow-up 49.4 ± 12.1).

To question hypothesis C: Most clinicians would agree that patients who do not receive prophylactic lithium treatment, or do not respond sufficiently to it, differ in some characteristics from patients who tolerate lithium and respond adequately. Contraindications to lithium treatment were of course more prominent in the non-Li group; however, we did not detect major differences in the patient groups’ comorbidity profiles. We matched the groups for potential influencing characteristics and required the same duration of illness and minimum number of previous episodes. Regarding medication, however, the groups differed substantially with respect to the use of, for example, valproate (35% in the non-Li vs. 5% in the Li group), which may itself exert a neuroprotective effect (for instance by modulating the WNT pathway, also a major target of lithium; Sutton and Rushlow [2011]). However, the clinical evidence does not support this suggested effect, showing no increase in gray matter volume (Lyoo et al. [2010]) or even greater loss in hippocampal and whole-brain volume (Tariot et al. [2011]) with valproate. It is thus not clear whether the similar overall cognitive performance, and the similar performance in verbal learning and memory, could in part result from the use of protective medication in the patient control group.

Strengths and limitations

The present study applied strict inclusion and exclusion criteria. We only included patients in the Li group who had been taking lithium for at least the previous 2 years. In the non-Li group, less than 3 months of lifetime lithium use was allowed, and treatment had to have been discontinued more than 2 years prior to the study’s beginning. The reason for these criteria was that we are unsure how long the potential neuroprotective effect of lithium takes to unfold. The medication exclusion list we applied was intended to prevent acute impairing influences of drugs on cognitive performance. By allowing only a limited number of drugs as comedication, we aimed to be able to ascribe effects to lithium in the Li group and to circumvent problems with multiple drug treatment. The sample sizes of our patient groups and controls are larger than those in most comparable studies.

Limitations include the following. (a) The cross-sectional design (here including retrospective information) is valid for specifying hypotheses but carries a high potential for bias and does not allow causation to be established. (b) Recruiting long-term ill patients without a substantial history of lithium treatment was a major challenge, and in the end relatively few patients could be included in this group. This can be partially explained by the fact that the participating IGSLi centers are staffed by expert clinicians who specialize in the treatment of BD and are engaged in clinical and scientific research on the effects of lithium. (c) The study was conducted in five centers in different countries, where treatment and care settings as well as patient characteristics might well have varied; our analysis was therefore adjusted for center effects. (d) As mentioned above, potentially neuroprotective drugs, i.e., valproate, were used more frequently in the patient control group (non-Li) than in the lithium group. This circumstance cannot easily be avoided in a future study, since the patients included had been ill for a long time and needed prophylactic treatment. (e) When designing and conducting the present study, we assumed a somewhat homogeneous distribution of cognitive impairment among patients with bipolar disorder as a whole. However, results of a recently published cross-sectional study show that although patients with bipolar disorder as a group showed more cognitive impairment than healthy controls, only about 30% of the patients showed more severe impairment, whereas 70% either showed impairment of smaller effect or were indistinguishable from controls (Martino et al. [2014]).