Mendelian randomisation (MR) uses genetic variation to investigate causal relations between potentially modifiable risk factors and health outcomes. In this study we compared observational estimates from multivariable regression with those from MR analyses to make inferences about the likely causal effects of three sleep traits on breast cancer risk.

In multivariable regression analysis using data on breast cancer incidence in the UK Biobank study, morning preference was inversely associated with breast cancer, whereas there was little evidence for an association with sleep duration and insomnia. Using genetic variants associated with chronotype, sleep duration, and insomnia symptoms, one sample MR analysis in UK Biobank provided some evidence for a protective effect of morning preference but imprecise estimates for sleep duration and insomnia. Findings for a protective effect of morning preference and adverse effect of increased sleep duration on breast cancer (both oestrogen receptor positive and oestrogen receptor negative) were supported by two sample MR using data from the Breast Cancer Association Consortium (BCAC), whereas there was inconsistent evidence for insomnia symptoms. Results were largely robust to sensitivity analyses accounting for horizontal pleiotropy.

Overall, we found inconsistent evidence about the causal effect of insomnia symptoms on breast cancer risk in multivariable and MR analyses. A previous study of incident breast cancer in the Nord-Trøndelag Health Study (HUNT) revealed no strong evidence of an association with individual insomnia symptoms, 11 although people with multiple insomnia problems were found to be at increased risk. In our analysis, insomnia was defined based on self report of either difficulty initiating sleep or waking in the night. Further work is therefore required to investigate the effects of individual insomnia symptoms on breast cancer risk, and their potential cumulative effect. Interestingly, MR analysis provided some evidence for an adverse causal effect of the accelerometer derived number of nocturnal sleep episodes on breast cancer risk.

Evidence for an adverse effect of increased sleep duration on breast cancer risk contrasts with the observational findings in UK Biobank as well as much of the literature on circadian disruption and breast cancer risk, 6 and, unlike our findings for chronotype, is not aligned with the light-at-night hypothesis. However, recent studies implicate longer sleep duration as a risk factor for breast cancer. 9 Given previous reports of a J-shaped relation between sleep duration and breast cancer risk, 9 we investigated sleep duration as a continuous variable and also investigated the causal effects of both short and long sleep duration to explore non-linear effects. In line with our main findings, we found evidence for a protective effect of short sleep duration and an adverse effect of long sleep duration on breast cancer risk. Furthermore, using genetic variants associated with accelerometer derived nocturnal sleep duration, we found evidence for an adverse effect of sleep duration with a similar magnitude of effect.

Findings of an adverse effect of evening preference on breast cancer risk in all analyses performed go some way to supporting hypotheses around carcinogenic light-at-night 4 and findings of increased risk among night shift workers who might be exposed to artificial light at night. 1 In particular, the specificity of the causal effect of chronotype on breast cancer, which was not observed in relation to other cancers or all cause mortality, is consistent with the hormonal mechanisms implicated in the light-at-night hypothesis. However, findings when using an objective measure of chronotype, the timing of the least active five hours (L5 timing), did not reveal the same adverse effect. Although this last analysis might be limited by the number and strength of the genetic variants used to instrument L5 timing, the lack of consistency in estimates calls into question the mechanisms by which morning or evening preference (rather than actual activity) influences breast cancer risk. Further analysis using the single nucleotide polymorphisms (SNPs) identified in relation to chronotype as instruments for L5 timing was consistent with a protective effect of morning preference, suggesting a protective effect of activity as well as reported preference. As the pleiotropy robust tests were not consistent, however, more work is needed to distinguish the causal effect of morning preference from that of activity, for example, with the use of multivariable MR methods. 43

Previous studies have found an enrichment of circadian pathway genetic variants in breast cancer. 25 42 Nonetheless, these studies did not directly implicate modifiable sleep traits by which risk of breast cancer could be minimised and did not attempt to separate the effects of the genetic variants on breast cancer risk through circadian disruption from pleiotropic pathways.

Strengths and limitations of this study

Key strengths of the study are the integration of multiple approaches to assess the causal effect of sleep traits on breast cancer, the inclusion of data from two large epidemiological resources—UK Biobank and BCAC—as well as use of data derived from both self reported and objectively assessed measures of sleep. Furthermore, for MR analysis we used the largest number of SNPs identified in the genome-wide association studies (GWAS) literature, with full summary statistics available to obtain strong genetic instruments for MR analysis and to explore potential pleiotropic pathways.

The approaches of multivariable Cox regression of incident cases, multivariable logistic regression of prevalent and incident cases, one and two sample MR, each have different strengths and limitations in terms of key sources of bias (see supplementary file, table 28). In multivariable analysis, attempts were made to mitigate key sources of bias, including confounding and reverse causation, with the use of multivariable Cox regression analysis of incident cases of breast cancer and adjustment for several hypothesised confounders. Nonetheless, residual or unmeasured confounding, selection bias, and measurement error could also have distorted effect estimates. We used MR analysis to minimise the likelihood of bias due to measurement error, confounding, and reverse causation. In addition, we conducted a series of sensitivity analyses to assess the core assumptions that the genetic instruments are strongly associated with the exposures of interest, are not influenced by confounding factors, and do not directly influence the outcome other than through the exposure.
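As a rough illustration of how two sample MR combines SNP level summary statistics into a single causal estimate, the sketch below implements a fixed-effect inverse variance weighted (IVW) estimator, a commonly used approach. This section does not state which estimator was used, so the choice of IVW, the function name `ivw_estimate`, and all numbers are assumptions made purely for illustration.

```python
import math


def ivw_estimate(beta_exposure, beta_outcome, se_outcome):
    """Fixed-effect IVW estimate from per-SNP summary statistics.

    Each SNP contributes a Wald ratio (beta_outcome / beta_exposure),
    weighted by beta_exposure**2 / se_outcome**2. These first-order
    weights ignore uncertainty in the SNP-exposure estimates.
    """
    numerator = sum(
        bx * by / se ** 2
        for bx, by, se in zip(beta_exposure, beta_outcome, se_outcome)
    )
    denominator = sum(
        bx ** 2 / se ** 2 for bx, se in zip(beta_exposure, se_outcome)
    )
    estimate = numerator / denominator
    std_error = math.sqrt(1.0 / denominator)
    return estimate, std_error


# Hypothetical summary statistics for three SNPs (not taken from the study).
bx = [0.10, 0.20, 0.15]    # SNP-exposure effects
by = [0.05, 0.10, 0.075]   # SNP-outcome effects (log odds ratios)
se = [0.02, 0.02, 0.02]    # standard errors of the SNP-outcome effects

est, se_est = ivw_estimate(bx, by, se)
print(f"IVW estimate: {est:.3f} (SE {se_est:.3f})")
```

Because every hypothetical SNP-outcome effect here is exactly half the corresponding SNP-exposure effect, the estimate recovers a causal effect of 0.5; real data would show scatter around the fitted slope, which is what the pleiotropy robust methods examine.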

One limitation of this study relates to the self reported measures used in multivariable regression analyses and used to identify genetic variants for MR analysis. In particular, the measure of sleep duration might capture time spent napping, and the “any insomnia” variable is really a measure of insomnia symptoms and not necessarily clinical insomnia. However, both these measures have been validated with the use of objective measures from accelerometer data in the UK Biobank and concordance is good, particularly for the effects of the genetic variants identified. 15 16 17

Another limitation relates to the selection of participants. Analysis in the two large epidemiological studies included here (UK Biobank and BCAC) was restricted to women of European ancestry. Further work is required to investigate whether these findings translate to women in other ancestry groups. Although the UK Biobank represents a large and well characterised epidemiological resource, it is not representative of the UK population owing to low participation. 27 As well as influencing the generalisability of findings, selection into the study can lead to biased estimates of association through “collider bias.” 44 To minimise the influence of this, we also used genetic data from a large case-control study of breast cancer (BCAC), and we compared MR effect estimates across these datasets.

In all MR analyses, SNP exposure estimates were obtained from the UK Biobank as this has formed a major component of the GWAS of sleep traits conducted to date. 15 16 17 20 21 23 This could lead to winner’s curse, whereby the effect sizes for genetic variants identified within a discovery sample are likely to be larger than in the overall population. In one sample MR analysis, winner’s curse in the SNP exposure association can bias causal estimates towards the confounded observational estimate, whereas in two sample MR, winner’s curse can bias the causal estimate towards the null. To minimise the impact of winner’s curse in one sample MR analysis we derived an additional allele score for chronotype composed of SNPs that replicated beyond a Bonferroni correction threshold in an independent study (23andMe). 15 Similarly, for two sample MR analysis, we used SNP exposure estimates from this replication analysis in sensitivity analyses, and findings were consistent with the main analysis (see supplementary file, tables 14 and 22).
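The direction of the two sample bias described above can be seen in a small simulation. The following is a minimal sketch, not the study's analysis: the true effect, standard error, selection threshold, and causal effect are all hypothetical values chosen only to make the selection effect visible.

```python
import random
import statistics

rng = random.Random(0)  # fixed seed so the sketch is reproducible

TRUE_BETA = 0.02    # hypothetical true SNP-exposure effect, shared by all SNPs
SE = 0.01           # hypothetical standard error of each discovery estimate
Z_THRESHOLD = 3.0   # illustrative cut-off, far laxer than genome-wide significance
TRUE_CAUSAL = 0.5   # hypothetical causal effect of exposure on outcome

# Noisy discovery estimates for many SNPs that all share the same true effect.
discovery = [rng.gauss(TRUE_BETA, SE) for _ in range(10_000)]

# Keep only SNPs that pass the significance filter in the discovery sample.
selected = [b for b in discovery if b / SE > Z_THRESHOLD]

mean_all = statistics.fmean(discovery)
mean_selected = statistics.fmean(selected)

# A Wald ratio divides the SNP-outcome effect by the SNP-exposure effect.
# The SNP-outcome effect comes from an independent sample, so it is not
# inflated (its sampling noise is ignored here for simplicity).
beta_outcome = TRUE_CAUSAL * TRUE_BETA
wald_with_inflated_exposure = beta_outcome / mean_selected

print(f"mean estimate, all SNPs:      {mean_all:.4f}")
print(f"mean estimate, selected SNPs: {mean_selected:.4f}")
print(f"Wald ratio using selected SNPs: {wald_with_inflated_exposure:.3f}")
```

Selected SNPs necessarily have estimates above the threshold, so their mean exceeds the true effect, and dividing an uninflated SNP-outcome effect by this inflated denominator pulls the ratio below the true causal effect, that is, towards the null.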

We were unable to apply the same approach to investigate the impact of winner’s curse in the sleep duration and insomnia analysis owing to the relatively small sample size of the replication datasets in those studies, meaning genetic associations could be imprecise. Although we are aware of a large GWAS for insomnia that was conducted using data from both UK Biobank and 23andMe, full summary data for the top SNPs in the replication analysis are not freely available. 23 We used unweighted allele scores to minimise the contribution of potential weak instruments in the one sample MR analysis. We also applied a robust adjusted profile score method in the two sample MR analysis, which provides unbiased estimates in the presence of weak instruments, and this revealed similar causal estimates for chronotype, sleep duration, and insomnia as in the main analysis.
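For readers unfamiliar with the distinction, an unweighted allele score simply counts effect alleles, whereas a weighted score multiplies each count by an externally estimated SNP effect. The sketch below contrasts the two; the function names, genotypes, and weights are hypothetical and are not drawn from the study.

```python
def unweighted_allele_score(dosages):
    """Sum of effect-allele counts (0, 1, or 2 per SNP) for one participant."""
    return sum(dosages)


def weighted_allele_score(dosages, weights):
    """Sum of effect-allele counts, each scaled by an external SNP effect."""
    return sum(d * w for d, w in zip(dosages, weights))


# Hypothetical participant genotyped at four SNPs.
participant = [2, 0, 1, 1]
# Hypothetical per-allele effect estimates; in a one sample setting, weights
# estimated in the same data would themselves be subject to winner's curse.
weights = [0.03, 0.01, 0.02, 0.05]

unweighted = unweighted_allele_score(participant)
weighted = weighted_allele_score(participant, weights)
print(unweighted, round(weighted, 2))
```

The unweighted score does not depend on any effect estimates, which is why it avoids carrying estimation error from the discovery sample into the instrument.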

Although associations between the allele scores and confounders in UK Biobank imply violation of the MR assumption that genetic variants should not be associated with confounding factors, there are several explanations for these findings. Previous MR studies have identified causal effects of sleep traits on reproductive traits, body mass index, and activity levels, 15 16 17 23 suggesting that these factors might be mediators of the association between sleep traits and breast cancer rather than confounders. Furthermore, some of the genetic variants associated with chronotype and insomnia have been found to be adiposity related loci, 15 16 implying potential pleiotropic pathways. Nonetheless, we also applied a series of pleiotropy robust MR methods and outlier detection to rigorously assess whether findings of a causal effect of chronotype and sleep duration were biased as a result of pleiotropy.

As well as attempting to mitigate key sources of bias for each epidemiological approach applied, we also assessed the consistency in estimates between the approaches to provide the best inference about the causal effect of sleep traits on breast cancer. This is aligned with the practice of triangulation, which aims to obtain more reliable answers to research questions through the integration of results from different approaches, where each approach has different sources of potential bias that are unrelated to each other. 45 46 We also compared estimates based on self reported sleep with the use of genetic variants associated with accelerometer derived measures of sleep, 40 although we did not use female specific SNP estimates here given the smaller number of participants in UK Biobank with these data.