Collectively, our results confirm that novel, genome-wide PRS can predict CAD in French Canadians; with further improvements, this is likely to pave the way towards more targeted strategies to predict and prevent CAD-related adverse events.

Our results confirm the ability of both PRS to predict prevalent CAD with performance comparable to the original reports (area under the curve=0.72–0.89). Furthermore, the PRS identified about 6% to 7% of individuals at a CAD risk similar to that of carriers of the LDLR delta >15 kb mutation, consistent with previous estimates. However, the PRS did not perform as well in predicting incident or recurrent CAD (area under the curve=0.56–0.60), possibly because of confounding, as 76% of the participants were on statin treatment. This result suggests that additional work is warranted to better understand how ascertainment biases and study design impact PRS for CAD.

We calculated both PRS (GPS CAD and metaGRS CAD) in French-Canadian individuals from 3 cohorts totaling 3639 prevalent CAD cases and 7382 controls and tested their power to predict prevalent, incident, and recurrent CAD. We also estimated the impact of the founder French-Canadian familial hypercholesterolemia deletion (LDLR delta >15 kb deletion) on CAD risk in one of these cohorts and used this estimate to calibrate the impact of the PRS.

Coronary artery disease (CAD) represents one of the leading causes of morbidity and mortality worldwide. Given the healthcare risks and societal impacts associated with CAD, its clinical management would benefit from improved prevention and prediction tools. Polygenic risk scores (PRS) based on an individual’s genome sequence are emerging as potentially powerful biomarkers to predict the risk of developing CAD. Two recently derived genome-wide PRS have shown high specificity and sensitivity to identify CAD cases in European-ancestry participants from the UK Biobank. However, validation of the PRS predictive power and transferability in other populations is now required to support their clinical utility.

Introduction

Genome-wide association studies (GWAS) have shed light on the polygenic architecture of human quantitative traits, such as height and blood pressure, as well as common diseases, such as type 2 diabetes mellitus and coronary artery disease (CAD).1–4 These studies have shown that complex human phenotypes are controlled by hundreds of genetic variants, each with a small effect size. Although each variant individually contributes only a small fraction of the phenotypic variation, together they account for a relatively large fraction of the heritability.5 This observation has raised the possibility of using genetic variants distributed across the genome to calculate polygenic risk scores (PRS) and of using these scores to predict the risk of developing diseases.6 The availability of large human genetic data sets, such as the UK Biobank, now allows for calibration and validation of genome-wide PRS in >100 000 individuals.
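To make the idea of a PRS concrete, the following is a minimal sketch in Python, assuming the common additive form in which each individual's score is the sum of effect-allele dosages weighted by per-variant effect sizes, followed by standardization within a cohort. The function names, weights, and dosages are hypothetical and purely illustrative; they are not taken from GPS CAD or metaGRS CAD, which contain millions of variants.

```python
from statistics import mean, pstdev

def polygenic_score(dosages, weights):
    """Weighted sum of effect-allele dosages (0 to 2) for one individual."""
    if len(dosages) != len(weights):
        raise ValueError("dosages and weights must have the same length")
    return sum(d * w for d, w in zip(dosages, weights))

def standardize(scores):
    """Convert raw scores to z-scores (mean 0, SD 1) within a cohort."""
    mu = mean(scores)
    sd = pstdev(scores)
    return [(s - mu) / sd for s in scores]

# Hypothetical per-variant weights (log odds ratios) for 3 variants.
weights = [0.10, -0.05, 0.20]

# Hypothetical effect-allele dosages for 3 individuals (rows) at those variants.
cohort = [
    [0, 1, 2],
    [2, 2, 0],
    [1, 0, 1],
]

raw = [polygenic_score(person, weights) for person in cohort]
z = standardize(raw)
```

Standardizing the score is what allows an association to be expressed per SD increase of the PRS.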

CAD remains one of the main causes of morbidity and mortality worldwide.7 GWAS have already identified >100 loci associated with CAD, mostly in populations of European ancestry.2,8 Early prediction would benefit prevention, optimal management, and treatment strategies for CAD. Although CAD has high heritability (50%–60%),9,10 genetic testing is not readily used in the clinic, except in the context of Mendelian diseases such as familial hypercholesterolemia (FH). Two recently developed genome-wide PRS for CAD by Khera et al11 (GPS CAD) and Inouye et al12 (metaGRS CAD) suggest that genetic risk prediction for CAD is ready to be applied in the clinical setting. Khera et al11 used the LDpred algorithm to model linkage disequilibrium and variant effect sizes from a CAD GWAS in the UK Biobank to create GPS CAD, which includes >6 million genetic variants throughout the genome.13 In contrast, Inouye et al12 created a PRS termed metaGRS CAD with >1.7 million variants, themselves explaining 26% of CAD heritability, using a meta-analysis of association results from 3 large CAD GWAS.2,14,15 The conclusions from both studies were encouraging. Khera et al11 showed that GPS CAD can identify a significant portion of individuals in the general population with a polygenic CAD risk as high as that of individuals who carry mutations that cause FH. In the study by Inouye et al,12 the CAD risk estimated with metaGRS CAD was higher than the risk conferred by any single traditional risk factor, such as smoking or hypertension.12

Although these results are promising, the introduction of CAD PRS in clinical practice is likely to encounter resistance.16–18 In particular, whether PRS are sufficiently accurate to justify early interventions on their own, including pharmaceutical treatments, is an important debate. For this reason, it is critical to validate PRS in additional populations (GPS CAD and metaGRS CAD were initially only tested in European-ancestry participants from the UK Biobank) and determine whether ascertainment biases and study design impact their clinical utility. Khera et al11 recently tested the utility of GPS CAD in Americans from different ethnicities (white, black, Hispanic, and Asian) and compared the predicted risk to that of individuals with monogenic mutations in hypercholesterolemia genes.19 Their results indicate that GPS CAD can predict CAD risk in non-white individuals, although with lower accuracy. Here, we validated these 2 novel CAD PRS in individuals of French-Canadian descent recruited from population- and hospital-based cohorts. We evaluated the performance of these polygenic predictors on prevalent, incident, and recurrent CAD. Finally, we used whole-genome sequence data to identify participants who carry a known FH mutation and compared its impact on CAD risk with that due to the inheritance of millions of common variants with weak effects.

Methods

The data and materials used to perform the study cannot be made available because of ethical considerations. All analytical methods used are readily available and reported. All participants have provided written, informed consent and the project was approved by the ethics committee of the Montreal Heart Institute (MHI). The full methods are available as part of the Data Supplement.

Results

Genome-Wide PRS for Prevalent CAD in French Canadians

Using both models (GPS CAD and metaGRS CAD), we calculated PRS in French Canadians from 3 studies: 2 hospital-based cohorts from the MHI Biobank (phase 1, n=1964 and phase 2, n=3309),20,21 and 5762 participants from CARTaGENE, a public health research platform in the Province of Quebec, Canada.22 We present demographics and baseline clinical information for all participants in Table 1. After DNA genotyping and variant imputation (Data Supplement), most variants used to calculate GPS CAD and metaGRS CAD were present in our data sets (missingness range: 0.09%–6.96%; Table I in the Data Supplement), suggesting that our data sets can accurately capture the previously proposed CAD polygenic models. Both PRS were strongly correlated with each other in the French-Canadian data sets (Pearson r>0.73, P<2.2×10−16; Figure 1). We tested the association between the CAD PRS and prevalent CAD status in all 3 cohorts. The distributions of both GPS CAD and metaGRS CAD were shifted towards higher values in CAD cases when compared with controls (Figure 2). Combining results across the 3 cohorts, we found that a 1-SD increase in GPS CAD or metaGRS CAD was associated with increased odds of CAD of 1.61 (P=6.18×10−42) and 1.69 (P=3.28×10−49), respectively (Table 2). In terms of prediction of prevalent CAD in French Canadians, the area under the receiver operating characteristic curve for both PRS was 0.72 to 0.89, largely consistent with the original reports (Table 2).
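As an illustration of how per-cohort estimates can be combined, the sketch below pools the 3 GPS CAD odds ratios for prevalent CAD from Table 2 with a fixed-effect, inverse-variance meta-analysis on the log-odds scale. The choice of this method is an assumption made for illustration (the full methods are in the Data Supplement), and the standard errors are back-calculated from the rounded 95% CIs, so the result is only approximate. Function names are hypothetical.

```python
import math

Z_95 = 1.96  # normal quantile assumed for the reported 95% CIs

def se_from_ci(lower, upper):
    """Back-calculate the SE of a log odds ratio from its 95% CI."""
    return (math.log(upper) - math.log(lower)) / (2 * Z_95)

def fixed_effect_meta(estimates):
    """Inverse-variance pooling of (odds_ratio, ci_lower, ci_upper) tuples.

    Returns the pooled odds ratio with its 95% CI.
    """
    total_weight = 0.0
    weighted_sum = 0.0
    for odds_ratio, lower, upper in estimates:
        weight = 1.0 / se_from_ci(lower, upper) ** 2
        total_weight += weight
        weighted_sum += weight * math.log(odds_ratio)
    pooled_log = weighted_sum / total_weight
    pooled_se = math.sqrt(1.0 / total_weight)
    return (
        math.exp(pooled_log),
        math.exp(pooled_log - Z_95 * pooled_se),
        math.exp(pooled_log + Z_95 * pooled_se),
    )

# GPS CAD, prevalent CAD, per-cohort OR (95% CI) as listed in Table 2:
# MHI Biobank phase 1, MHI Biobank phase 2, CARTaGENE.
gps_prevalent = [
    (1.64, 1.48, 1.81),
    (1.55, 1.38, 1.73),
    (1.69, 1.44, 1.99),
]

pooled_or, pooled_lower, pooled_upper = fixed_effect_meta(gps_prevalent)
print(f"pooled OR = {pooled_or:.2f} ({pooled_lower:.2f}-{pooled_upper:.2f})")
```

Under these assumptions, the pooled estimate lands close to the meta-analysis value reported in Table 2 for GPS CAD (1.61; 95% CI, 1.51–1.73).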

Table 1. Demographics and Clinical Information for the Participants Involved in the Study

| Characteristic | MHI Biobank Phase 1, Controls | MHI Biobank Phase 1, Cases | MHI Biobank Phase 2, Controls | MHI Biobank Phase 2, Cases | CARTaGENE, Controls | CARTaGENE, Cases |
|---|---|---|---|---|---|---|
| Genotyping platform | Low-pass WGS (5X) | Low-pass WGS (5X) | Illumina MEGA | Illumina MEGA | Illumina GSA | Illumina GSA |
| Sample size, n (% women) | 976 (28) | 974 (27) | 817 (60) | 2492 (17) | 5589 (60) | 173 (17) |
| Mean age, y (SD) | 66.0 (10.1) | 66.9 (8.87) | 66.0 (10.7) | 66.7 (8.31) | 54.9 (7.78) | 60.5 (6.90) |
| Type 2 diabetes mellitus, n (%) | 195 (20) | 291 (30) | 59 (7) | 702 (28) | 336 (6) | 42 (24) |
| Hypertension, n (%) | 632 (65) | 751 (77) | 287 (35) | 1878 (75) | 1132 (20) | 94 (54) |
| Dyslipidemia, n (%) | 741 (76) | 923 (95) | 306 (37) | 2350 (94) | 1551 (28) | 121 (70) |
| Mean LDL-cholesterol, mmol/L (SD) | 2.65 (0.87) | 2.09 (0.7) | 3.12 (0.85) | 2.67 (0.8) | 3.03 (0.85) | 1.99 (0.76) |
| Follow-up, y, median (range) | 4.1 (2.8–6.6) | 4.1 (2.7–7) | 3.8 (0.1–7.2) | 3.7 (1.1–7) | NA | NA |
| Statin treatment, n (%) | 625 (64) | 878 (90) | 229 (28) | 2263 (91) | NA | NA |

Table 2. Association With and Prediction of CAD by Polygenic Risk Scores in 3 Cohorts

| Model | Cohort | Phenotype | Cases | Controls | P Value | Odds Ratio | OR (95% CI) | AUC | AUC (95% CI) |
|---|---|---|---|---|---|---|---|---|---|
| GPS CAD | MHI Biobank phase 1 | CAD prevalence | 974 | 976 | 3.82×10−21 | 1.64 | 1.48–1.81 | 0.72 | 0.70–0.74 |
| GPS CAD | MHI Biobank phase 2 | CAD prevalence | 2492 | 817 | 7.23×10−14 | 1.55 | 1.38–1.73 | 0.89 | 0.88–0.91 |
| GPS CAD | CARTaGENE | CAD prevalence | 173 | 5589 | 2.55×10−10 | 1.69 | 1.44–1.99 | 0.84 | 0.81–0.87 |
| GPS CAD | Meta-analysis* | CAD prevalence | 3639 | 7382 | 6.18×10−42 | 1.61 | 1.51–1.73 | NA | NA |
| GPS CAD | MHI Biobank phase 1 | Recurrent events | 446 | 416 | 0.026 | 1.17 | 1.02–1.35 | 0.58 | 0.54–0.62 |
| GPS CAD | MHI Biobank phase 2 | Recurrent events | 937 | 1396 | 7.99×10−03 | 1.12 | 1.03–1.22 | 0.57 | 0.55–0.59 |
| GPS CAD | Meta-analysis* | Recurrent events | 1383 | 1812 | 6.21×10−04 | 1.13 | 1.06–1.22 | NA | NA |
| metaGRS CAD | MHI Biobank phase 1 | CAD prevalence | 974 | 976 | 3.37×10−25 | 1.74 | 1.57–1.93 | 0.72 | 0.70–0.75 |
| metaGRS CAD | MHI Biobank phase 2 | CAD prevalence | 2492 | 817 | 9.49×10−16 | 1.60 | 1.43–1.80 | 0.89 | 0.88–0.91 |
| metaGRS CAD | CARTaGENE | CAD prevalence | 173 | 5589 | 8.55×10−12 | 1.75 | 1.49–2.05 | 0.84 | 0.81–0.87 |
| metaGRS CAD | Meta-analysis* | CAD prevalence | 3639 | 7382 | 3.28×10−49 | 1.69 | 1.58–1.81 | NA | NA |
| metaGRS CAD | MHI Biobank phase 1 | Recurrent events | 446 | 416 | 2.84×10−03 | 1.24 | 1.08–1.43 | 0.60 | 0.56–0.63 |
| metaGRS CAD | MHI Biobank phase 2 | Recurrent events | 937 | 1396 | 2.96×10−03 | 1.14 | 1.05–1.24 | 0.57 | 0.55–0.59 |
| metaGRS CAD | Meta-analysis* | Recurrent events | 1383 | 1812 | 4.33×10−05 | 1.17 | 1.08–1.26 | NA | NA |

Figure 1. Correlation between normalized GPS CAD and metaGRS CAD. The correlation between GPS CAD and metaGRS CAD in (A) the Montreal Heart Institute (MHI) Biobank phase 1 (Pearson r=0.75, P<2×10−16), (B) the MHI Biobank phase 2 (Pearson r=0.75, P<2×10−16), and (C) CARTaGENE (Pearson r=0.74, P<2×10−16).

Figure 2. Distributions of GPS CAD and metaGRS CAD in the Montreal Heart Institute (MHI) Biobank phase 2 cohort. Distributions of the normalized polygenic risk score from Khera et al11 (GPS CAD, left column) and Inouye et al12 (metaGRS CAD, right column) in the MHI Biobank phase 2 data for prevalent (A and B), incident (C and D), and recurrent (E and F) coronary artery disease (CAD) events.

Estimation of CAD Risk for LDLR Delta >15 kb Deletion Carriers

Approximately 60% of FH cases in the French-Canadian population of Quebec are due to the delta >15 kb deletion of the LDLR gene.23 To compare the predictive power of CAD PRS with the impact of penetrant FH mutations on CAD risk in this population, we used whole-genome sequence data available in 1964 MHI Biobank participants to call copy-number variants at the LDLR locus.20 We identified a total of 14 heterozygous carriers of the LDLR delta >15 kb deletion (breakpoints: chr19:11 188 403-11 204 295 [hg19]). The estimated allele frequency in this cohort is 0.36%, which is in the range of the reported frequency for this mutation (≈0.03%–0.38%).24,25 In our data set, the LDLR delta >15 kb deletion was associated with increased low-density lipoprotein–cholesterol levels (1.34 mmol/L increase per copy of the LDLR deletion, P=1.2×10−8). When combining baseline and follow-up data, we found that 12 of the 14 LDLR deletion carriers were CAD cases (odds ratio [OR]=3.30; 95% CI, 0.72–15.2; P=0.13). Although this result is not statistically significant owing to our limited sample size, it allows us to estimate that French Canadians who carry a strong FH mutation are at ≈3× higher risk of developing CAD. This provides a direct opportunity to identify the proportion of individuals at similar or increased risk for CAD based on their PRS. Using the distributions of GPS CAD and metaGRS CAD, we estimate that 6% to 7% of the French-Canadian population is at the same or higher risk for CAD than carriers of the FH LDLR delta >15 kb deletion. This result is consistent with the estimate by Khera et al11 that 8% of European-ancestry individuals in the UK Biobank have a PRS that confers comparable or higher CAD risk than rare FH mutations.
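The allele frequency quoted above follows from simple counting: each of the 1964 sequenced participants carries 2 copies of the LDLR locus, and each heterozygous carrier contributes 1 deleted allele. A minimal sketch of that arithmetic, with a hypothetical helper function and the assumption that there were no homozygous carriers (only heterozygous carriers are reported):

```python
def allele_frequency(het_carriers, hom_carriers, n_individuals):
    """Frequency of an allele in a diploid sample.

    Heterozygous carriers contribute 1 copy and homozygous carriers 2 copies,
    out of 2 * n_individuals chromosomes in total.
    """
    n_alleles = 2 * n_individuals
    return (het_carriers + 2 * hom_carriers) / n_alleles

# 14 heterozygous carriers among 1964 participants; 0 homozygotes assumed.
af = allele_frequency(het_carriers=14, hom_carriers=0, n_individuals=1964)
af_percent = round(100 * af, 2)
print(f"allele frequency = {af_percent}%")
```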

Prediction of Incident and Recurrent CAD

The MHI Biobank is a prospective hospital-based cohort with regular follow-up clinical information. We took advantage of this design to also test the CAD PRS against incident and recurrent CAD events. Because genetic variants are present at birth, it can be argued that PRS analyses of late-onset diseases, such as CAD, are always prospective. However, clinical information collected retrospectively is subject to selection biases, and analyses based on such information might affect the apparent accuracy of the PRS. Inouye et al12 had shown that metaGRS CAD can identify incident cases in the UK Biobank. Among the 1245 controls at baseline with follow-up available in the combined MHI Biobank cohorts, 402 had a first CAD event between recruitment and follow-up (median follow-up time=4 years [range=5 weeks to 7.2 years]). Importantly, we note that most participants in the MHI Biobank, including controls free of CAD, were taking statins at baseline, and this may confound our analyses of incident CAD events. With this important caveat in mind, we tested the CAD PRS against incident CAD events on statin treatment. GPS CAD was not associated with incident CAD (OR=1.11, P=0.071), whereas the association between metaGRS CAD and incident CAD was only modest (OR=1.15, P=0.022; Table II in the Data Supplement). The prediction of incident CAD by GPS CAD and metaGRS CAD was also markedly lower than for prevalent CAD (area under the receiver operating characteristic curve=0.57–0.60; Table II in the Data Supplement). Of the 1812 CAD cases at baseline with follow-up information available, 1382 had a recurrent CAD event during the follow-up period (median follow-up time=3.9 years [range=1.1–7 years]). We found that GPS CAD and metaGRS CAD, 2 PRS developed to predict primary CAD events, were also associated with recurrent CAD events (GPS CAD: OR=1.13; P=6.12×10−4; metaGRS CAD: OR=1.17; P=4.33×10−5), although the area under the receiver operating characteristic curve was relatively small (0.57–0.60; Table 2).
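For readers less familiar with the area under the receiver operating characteristic curve, it can be read as the probability that a randomly chosen case has a higher predicted value than a randomly chosen control, so that 0.5 corresponds to chance and values of 0.57 to 0.60 indicate weak discrimination. The sketch below computes it by direct pairwise comparison on toy data. The function name and data are hypothetical, and the sketch uses the score alone, whereas the reported values may come from models that include additional covariates (an assumption; see the Data Supplement).

```python
def auc(scores, labels):
    """Area under the ROC curve by pairwise comparison.

    Returns the fraction of (case, control) pairs in which the case has the
    higher score; tied pairs count as half.
    """
    cases = [s for s, y in zip(scores, labels) if y == 1]
    controls = [s for s, y in zip(scores, labels) if y == 0]
    if not cases or not controls:
        raise ValueError("need at least one case and one control")
    wins = 0.0
    for case_score in cases:
        for control_score in controls:
            if case_score > control_score:
                wins += 1.0
            elif case_score == control_score:
                wins += 0.5
    return wins / (len(cases) * len(controls))

# Toy standardized scores and case status (1 = case, 0 = control).
scores = [1.2, 0.4, -0.3, 0.9, -1.1, 0.1]
labels = [1, 1, 0, 0, 0, 1]

toy_auc = auc(scores, labels)  # 7 of the 9 case-control pairs are ordered correctly
```

The pairwise form is quadratic in sample size; for cohorts of thousands of participants, a rank-based implementation from a statistics library would be the usual choice.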

Discussion

Because PRS are simple and relatively inexpensive, their implementation in the clinical setting holds great promise. For CAD, in particular, early detection could lead to simple yet extremely efficacious therapeutic interventions (eg, statins and aspirin). Given this exciting possibility, we tested 2 recently developed CAD PRS in French Canadians recruited from population- and hospital-based cohorts. We validated previous findings that both GPS CAD and metaGRS CAD perform well for prevalent CAD cases. However, their performance was lower for incident and recurrent CAD in the MHI Biobank. Although neither PRS predicted incident CAD events well in the MHI Biobank, these analyses might be confounded given that the majority of participants were on statin treatment at baseline. Using the French-Canadian founder FH LDLR delta >15 kb mutation to calibrate CAD risk, we confirmed that PRS can identify about 6% to 7% of the population that is at equal or higher CAD risk than carriers of an FH monogenic mutation.

Our study raises a few interesting questions. Although it is appreciated that PRS do not transfer well between ancestral populations,26,27 little is known about the transferability of PRS across populations within the same ancestry. Our results indicate that CAD PRS developed in European-ancestry individuals perform quite well in the genetically and environmentally homogeneous French-Canadian population. How well these same PRS would predict CAD in a more diverse European-ancestry population, or in a population living in a different environment, remains a critical open question for further investigation.19 Another important result from our analyses is the lower accuracy of these PRS in predicting incident or recurrent CAD cases when compared with prevalent CAD cases, highlighting the importance of the method used to create the PRS. GPS CAD and metaGRS CAD were built using mainly GWAS for prevalent CAD and are, therefore, particularly suitable to predict prevalent CAD as opposed to incident or recurrent events. In addition, our analyses of incident and recurrent CAD were based on the MHI Biobank, which is a hospital-based cohort. Thus, it is possible that confounders such as the presence of comorbidities and medications (eg, antithrombotic, statin treatment at baseline [discussed above]) would impact PRS performance. Furthermore, because we matched cases and controls based on age at baseline, participants with incident or recurrent CAD were older at the time of their CAD events than prevalent cases. If the cause of CAD at an older age is less polygenic, as suggested,12 it might not be surprising that GPS CAD and metaGRS CAD do not perform as well on incident or recurrent CAD. It is important to clarify these differences to determine what factors in the study design and what ascertainment biases influence the PRS. Furthermore, an extension of our results implies that GWAS that aim to specifically identify the genetic architecture of incident or recurrent CAD events might yield improved predictive power to calibrate risk score models over PRS based on CAD prevalence alone.12

In conclusion, while it may still take some time before PRS become widely applicable in the clinic to predict CAD, their utility is likely to increase as the community continues to improve methods and gain access to large GWAS performed in populations of different ethnic backgrounds. But the true improvement in CAD prediction based on PRS will only occur if the scientific progress is mirrored by an effort to explain the strengths and limitations of this new biomarker to the medical community and the general population.

Acknowledgments

We thank all participants and staff of the André and France Desmarais Montreal Heart Institute (MHI) Biobank. Sequencing of the MHI Biobank samples (phase 1) was performed at the McGill University and Génome Québec Innovation Centre. Genotyping of the MHI Biobank samples (phase 2) was performed at the Université de Montréal Beaulieu–Saucier Pharmacogenomics Centre at the MHI. We would also like to thank the CARTaGENE staff for their support in validating phenotypes. We also thank Aikaterini Kritikou and Rafik Tadros for comments on an earlier version of this article. Web links: Genetic risk score model for GPS CAD Khera et al Nature Genetics 2018: http://www.broadcvdi.org/informational/data. Genetic risk score model for metaGRS CAD by Inouye et al JACC 2018: https://figshare.com/articles/Coronary_Artery_Disease_CAD_MetaGRS/5748096.

Sources of Funding

This work was funded by the Canadian Institutes of Health Research (MOP no. 136979), the Heart and Stroke Foundation of Canada (Grant no. G-18-0021604), the Canada Research Chair Program, Genome Quebec and Genome Canada, and the Montreal Heart Institute Foundation. F. Wünnemann holds a postdoctoral training scholarship from the Fonds de recherche Quebec Santé (FRQS).

Disclosures

None.

Footnotes