Abstract Background and Aims Proton pump inhibitors (PPIs) have been associated with adverse clinical outcomes amongst clopidogrel users after an acute coronary syndrome. Recent pre-clinical results suggest that this risk might extend to subjects without any prior history of cardiovascular disease. We explore this potential risk in the general population via data-mining approaches. Methods Using a novel approach for mining clinical data for pharmacovigilance, we queried over 16 million clinical documents on 2.9 million individuals to examine whether PPI usage was associated with cardiovascular risk in the general population. Results In multiple data sources, we found gastroesophageal reflux disease (GERD) patients exposed to PPIs to have a 1.16 fold increased association (95% CI 1.09–1.24) with myocardial infarction (MI). Survival analysis in a prospective cohort found a two-fold (HR = 2.00; 95% CI 1.07–3.78; P = 0.031) increase in association with cardiovascular mortality. We found that this association exists regardless of clopidogrel use. We also found that H2 blockers, an alternate treatment for GERD, were not associated with increased cardiovascular risk; had they been in place, such pharmacovigilance algorithms could have flagged this risk as early as the year 2000. Conclusions Consistent with our pre-clinical findings that PPIs may adversely impact vascular function, our data-mining study supports the association of PPI exposure with risk for MI in the general population. These data provide an example of how a combination of experimental studies and data-mining approaches can be applied to prioritize drug safety signals for further investigation.

Citation: Shah NH, LePendu P, Bauer-Mehren A, Ghebremariam YT, Iyer SV, Marcus J, et al. (2015) Proton Pump Inhibitor Usage and the Risk of Myocardial Infarction in the General Population. PLoS ONE 10(6): e0124653. https://doi.org/10.1371/journal.pone.0124653
Academic Editor: Yiru Guo, University of Louisville, UNITED STATES
Received: January 9, 2015; Accepted: March 17, 2015; Published: June 10, 2015
Copyright: © 2015 Shah et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Data Availability: The data in consideration are electronic medical records of patients at Stanford University, and medical records of a subset of patients at Practice Fusion. Current patient privacy rules do not allow sharing of electronic medical records without an explicit IRB review. The authors can make access to de-identified data available after appropriate approvals. Contact: Nigam Shah, nigam@stanford.edu.
Funding: PL, ABM, NHS and SVI acknowledge support from the NIH grant U54HG004028 for the National Center for Biomedical Ontology, NLM grant R01 LM011369, and NIGMS grant R01 GM101430. NHS also acknowledges research gift support from Apixio, Inc. This work was also supported in part by grants to JPC from the NIH (1U01HL100397), AHA (11IRG5180026), and the Stanford SPARK Translational Research Program. YTG was a recipient of the Stanford School of Medicine Dean’s fellowship (1049528-149-KAVFB) and the Tobacco-Related Disease Research Program (TRDRP) of the University of California (20FT-0090). YTG is currently supported by the NHLBI grant 5K01HL118683-04 and by intramural funding from the Houston Methodist Research Institute.
Practice Fusion provided support in the form of salaries for author JM, but did not have any additional role in the study design and analysis, decision to publish, or preparation of the manuscript. The specific roles of the authors are articulated in the ‘author contributions’ section. Apixio, Inc., supported the project by providing an unrestricted research gift to Stanford University, but had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.
Competing interests: PL, ABM, SVI, and NHS are inventors on technology disclosures and/or patents, name: Methods for Ontology based Analytics and numbers: US13/273,038, US13/420,402, US13/424,375, and US13/424,376, owned by Stanford University, that enable the use of clinical text for data-mining. JPC and YTG are inventors on patents titled: Dimethylarginine Dimethylaminohydrolase Inhibitors and Methods of Use Thereof and number: US13/766,336, owned by Stanford University, that protect the use of agents that therapeutically modulate the DDAH/NOS pathway. YTG and JPC are also founders of Altitude Pharma, Inc., a biotechnology company that is developing PPI-based products for airway diseases. Apixio, Inc. partly funded this study. JM is an employee of Practice Fusion, Inc., which owns inventions by JM. This does not alter the authors' adherence to PLOS ONE policies on sharing data and materials.

Introduction The primary indication for proton pump inhibitors (PPIs) is gastroesophageal reflux disease (GERD). Each year, it is estimated that over 113 million PPI prescriptions are filled globally. This, together with over-the-counter use, accounts for over $13 billion in sales worldwide [1] [2]. In the US alone, about 21 million people used one or more prescription PPIs in 2009, making it the third highest seller in the country [3][2]. The availability of PPIs over-the-counter is particularly worrisome due to the absence of medical supervision [1]. For individuals with a history of acute coronary syndrome (ACS), PPIs appear to reduce the efficacy of clopidogrel, an antiplatelet agent used to reduce the risk for subsequent ischemic events [4]. There are several competing theories about whether (and how) PPIs enhance the risk of major adverse cardiovascular events (MACE) amongst individuals with a history of ACS.[5–10] A leading hypothesis is that PPIs compete for and inhibit the clopidogrel-activating hepatic isoenzyme, CYP2C19, thereby interfering with clopidogrel’s capacity to prevent clot formation in subjects at risk for coronary thrombosis and myocardial infarction (MI).[11] However, some studies have associated PPI usage with adverse clinical outcomes in high-risk cardiovascular populations, independently of clopidogrel use.[7] For example, a reduction in therapeutic benefit has been reported in ACS patients treated with the antiplatelet agents aspirin and ticagrelor, neither of which requires activation by CYP2C19. [12, 13] While it is possible that PPIs may reduce the absorption of these drugs (a controversial hypothesis given that PPIs have been shown not to diminish the anti-platelet aggregation properties of aspirin [14, 15]), it is important to note that a similar reduction in gastric pH is achieved with H2 blockers (H2Bs), which have been shown not to increase cardiovascular risk [12, 13]. 
An alternative explanation is that the observed risk of PPIs is due to some unknown mechanistic pathway [12], and that this pathway may not be restricted to vasculopathic patients. In this regard, we recently reported that PPIs inhibit the enzymatic activity of dimethylarginine dimethylaminohydrolase (DDAH), [16] which is responsible for 80% of the clearance of asymmetric dimethylarginine (ADMA)—an endogenous molecule known to inhibit the enzymatic activity of nitric oxide synthase (NOS).[17] An impairment in endothelial NOS (eNOS) is well-known to increase vascular resistance, and promote inflammation and thrombosis.[18] ADMA is a potent disease marker and independent predictor of MACE in prior observational studies.[19–24] Our recent pre-clinical studies found that PPIs increase ADMA levels in human endothelial cells and in mice by about 20–30%.[16] To date, we are aware of only one study which has examined the cardiovascular risk association of PPIs outside of high-risk cohorts [25]. This is a concern given our translational data, which suggests that the risk of these drugs may apply to subjects not taking antiplatelet agents, and those without any vascular disease. Therefore, we employed a novel and recently validated [26, 27] data-mining approach for pharmacovigilance on multiple electronic medical record datasets as well as examined a prospectively followed clinical cohort [28, 29], to explore the possibility that PPIs may be associated with cardiovascular risk in the general US population.

Methods The data mining studies were deemed by the Stanford IRB not to involve human patients. The Stanford GenePAD study was approved by the Stanford Human Subjects Research Institutional Review Board and was conducted under the guidelines of the Declaration of Helsinki, with written informed consent obtained from all participants. Data sources We used two data sources for our data-mining analysis (a primary source from Stanford and a secondary source from Practice Fusion, Inc.) and one prospective source for the survival analysis. At Stanford University, all clinical notes (both inpatient and outpatient) have been transcribed and recorded electronically since 1994. These data are warehoused for research use in the Stanford Translational Research Integrated Database Environment (STRIDE).[30] STRIDE contains data from 1.8 million patients, 19 million encounters, 35 million coded International Classification of Disease (ICD-9) diagnoses, and a combination of pathology, radiology, and transcription reports totaling over 11 million unstructured clinical notes. Practice Fusion, Inc. (PF) provides a free, web-based Electronic Health Record (EHR) system for clinicians. The company’s users are primarily small practices providing outpatient care. Roughly half of these practices specialize in primary care, with 29% of users from the West, 13% from the Southwest, 14% from the Midwest, 27% from the Southeast, and 18% from the Northeast. The de-identified subset of PF data used in our analysis contained data on 1.1 million patients, 5.5 million coded diagnoses, 6.8 million prescriptions, and 5.5 million unstructured clinical notes dating back to 2007. Additionally, we examined the association of PPI use at enrollment with subsequent cardiovascular mortality in the GenePAD (the Genetic Determinants of Peripheral Arterial Disease) [28, 29] study. 
The GenePAD cohort comprises individuals who underwent an elective, non-emergent coronary angiogram for angina, shortness of breath or an abnormal stress test at Stanford University or Mount Sinai Medical Centers. Cardiovascular mortality was defined as that from myocardial infarction, cardiac arrest, stroke, heart failure or aneurysm rupture. Cardiovascular outcomes were assessed through medical record review and confirmed by contacting the patient or next of kin directly. This form of dual follow-up was specifically implemented to limit detection bias from differential frequencies in physician contact between groups. Finally, all deaths were confirmed and cross-referenced to the Social Security Death Index (SSDI) to minimize detection bias. The study cohort commenced in 2004 and included 1,503 individuals. Data-mining pipeline for pharmacovigilance We used a previously validated data-mining pipeline for pharmacovigilance using clinical data [26, 31] to screen whether the exposure to proton pump inhibitors is associated with an elevated risk of myocardial infarction in the general population. Note that such a data-mining procedure is not the same as performing an epidemiological study. The difference between performing an epidemiological study and a data-mining study is categorically described in [32]. Briefly, data-mining approaches focus on learning a valid function f(x), which is modeled as an algorithm that operates on variables (x) to predict the responses (y). The linking function f(x) in a data-mining study can be a regression, but cannot, and should not, be interpreted as a causal regression model, which is typically the goal of an epidemiological study. The validation of data-mining approaches is performed by measuring predictive accuracy and is widely adopted in computer science [33], and increasingly in economics [34]. 
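To make "measuring predictive accuracy" concrete, the following is a minimal Python sketch of how an overall accuracy follows from a sensitivity, a specificity, and the sizes of a labelled gold-standard set. The function name and its parameters are illustrative assumptions of ours and are not part of the published pipeline; the input values are the ones this section reports for the pipeline's gold standard.

```python
def expected_accuracy(sensitivity, specificity, n_true, n_false):
    """Overall accuracy implied by sensitivity and specificity on a labelled set.

    Hypothetical helper for illustration only (not from the published pipeline).
    """
    true_positives = sensitivity * n_true    # true associations correctly flagged
    true_negatives = specificity * n_false   # negative associations correctly rejected
    return (true_positives + true_negatives) / (n_true + n_false)


# Values reported in this section: 39% sensitivity, 97.5% specificity,
# 28 true positive and 165 negative associations in the gold standard.
acc = expected_accuracy(sensitivity=0.39, specificity=0.975, n_true=28, n_false=165)
print(f"accuracy = {acc:.2f}")  # prints "accuracy = 0.89"
```

Because negatives dominate this gold standard, the high specificity drives most of the overall accuracy even though the sensitivity is modest.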
Our data-mining approach, which aims to minimize false positives, has 97.5% specificity and 39% sensitivity in discerning a true association as determined using a gold standard set of 28 true positive and 165 negative associations spanning 78 drugs and 12 different outcomes [35]. This performance provides an accuracy of 89% and has a positive predictive value of 81% if we test an equal number of true and false associations. We summarize the approach briefly, and further details are provided in LePendu et al. [26]. The pipeline extracted positive-present mentions of drug, disease, device, and procedure concepts from all clinical notes, accounting for negation and other contexts, into a patient–feature matrix that we analyzed. Drug terms were normalized to active ingredients using RxNorm, and classified according to the Anatomical Therapeutic Chemical classification system. For example, “Prilosec” and “omeprazole” were treated equally, while omeprazole, rabeprazole, and so on were grouped together as the class of PPIs. Disease terms were normalized and aggregated according to the hierarchical relationships from the Unified Medical Language System Metathesaurus and BioPortal. Finally, we aligned records temporally based on the time at which each note was recorded and only kept positive-present–first mentions. The matrix (for STRIDE) comprises nearly a trillion pieces of data: roughly 1.8 million patients as rows, thousands of clinical concepts as columns, with time as the third dimension (see Fig 5 in LePendu et al. [26]). Patient population and outcome definition. GERD is the primary indication for PPIs, so we used the presence of this indication to define the baseline population in our pipeline. We excluded all patients under the age of 18 at their first GERD mention. 
We defined GERD by International Classification of Diseases, Ninth Revision (ICD-9) codes for esophageal reflux (530.81) and heartburn (787.1), and the UMLS code for gastroesophageal reflux disease (C0017168). The main outcome of interest, MI, was defined by acute myocardial infarction (ICD-9 code 410), and more than 18 different UMLS codes including myocardial infarction (C0027051) and silent myocardial infarction (C0340324). See S1 Table for full definitions. Study groups and study periods. The study period included all data from 1994 through 2011 in STRIDE and 2007 through 2012 in PF. We defined two study groups within the GERD baseline population in this period. The primary study group was the subset defined by patients taking PPIs, including a sub-group of those patients who were not on clopidogrel. We considered six PPIs (omeprazole, lansoprazole, pantoprazole, esomeprazole, rabeprazole, and dexlansoprazole) individually and as a class. We excluded dexlansoprazole from individual analysis because of insufficient exposure (<100 patients). As an alternative treatment for GERD we examined H2 blockers (H2Bs: cimetidine, famotidine, nizatidine, and ranitidine) as a separate association test. Association estimation. S1 Fig summarizes the decisions the data-mining pipeline uses to populate a contingency table for each of the associations tested. Each patient was counted according to the temporal ordering of concepts in the patient–feature matrix as described in LePendu et al. [26]. For example, a mention of PPI use after a GERD indication would be counted as an exposure. A subsequent mention of MI counts as an associated outcome. Our data-mining method works based on “beforeness” of treatments and events, and given the uncertainty in the exact times of treatment and the messiness of the EMR data used, we follow a two-step process for detecting drug safety signals (details in the methods of LePendu et al. [26]). 
First, we compute a raw association, followed by an adjustment which involves matching on age, gender, race, length of observation, and, as proxies for health status, the number of unique drug and disease concepts mentioned in the full record. The first step is useful for flagging putative signals, and the second step for reducing false alarms. As in prior work, we attempted to match up to 5 controls. In cases where there were not enough controls to draw from, we tried either 1:3 or finally 1:1 matching (Table 1). The balance of variables before and after matching for the PPI study group is shown in Table 2. The balance of variables for the H2Bs study group is shown in Table 3. Note that the purpose of this matching is to reuse our validated two-step data-mining approach from LePendu et al. [26] and not to emulate an epidemiological study from the EMR data. In each of the two steps, we compute the odds ratio as well as the confidence interval (CI) using logistic regression and use a significance cutoff of p-value < 0.01.
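The counting and estimation steps described above can be illustrated with a short Python sketch. It is a simplified stand-in under stated assumptions, not the authors' pipeline: the `classify` rule (a patient counts as exposed only when the first drug mention precedes the first event mention, or no event occurs) is our reading of the "beforeness" idea, the patient data are made up, and the odds ratio uses the closed-form Wald interval from a 2x2 table, which coincides with a logistic regression on a single binary exposure but omits the matching step.

```python
import math
from collections import Counter


def classify(first_drug_day, first_event_day):
    """Assign one patient to a 2x2 cell from the order of first mentions.

    Arguments are days since the indication, or None if never mentioned.
    Assumption (illustrative): drug mentions after the event do not count
    as exposure.
    """
    has_event = first_event_day is not None
    exposed = first_drug_day is not None and (
        not has_event or first_drug_day < first_event_day
    )
    return (exposed, has_event)


def odds_ratio_with_ci(cells, z=1.96):
    """Odds ratio and Wald confidence interval (95% by default) from cell counts."""
    a = cells[(True, True)]    # exposed, event
    b = cells[(True, False)]   # exposed, no event
    c = cells[(False, True)]   # unexposed, event
    d = cells[(False, False)]  # unexposed, no event
    log_or = math.log((a * d) / (b * c))
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    return math.exp(log_or), math.exp(log_or - z * se), math.exp(log_or + z * se)


# Made-up patients: (day of first drug mention, day of first MI mention).
patients = (
    [(10, 50)] * 20       # drug before MI
    + [(10, None)] * 80   # drug, no MI
    + [(None, 40)] * 10   # MI, never on drug
    + [(60, 30)] * 5      # drug only after MI: counted as unexposed
    + [(None, None)] * 90
)
cells = Counter(classify(drug, event) for drug, event in patients)
or_, lo, hi = odds_ratio_with_ci(cells)
print(f"OR = {or_:.2f} (95% CI {lo:.2f}-{hi:.2f})")
```

In this toy data the interval spans 1, so the association would not be flagged; the pipeline described in the text additionally re-estimates the odds ratio after matching before treating a signal as credible.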

Table 1. Study group populations for the STRIDE dataset, including 5:1 propensity matching. https://doi.org/10.1371/journal.pone.0124653.t001

Table 2. Balance of variables for patients on PPIs in the STRIDE dataset. https://doi.org/10.1371/journal.pone.0124653.t002

Table 3. Balance of variables in patients on H2 blockers in the STRIDE dataset. https://doi.org/10.1371/journal.pone.0124653.t003

Survival analysis in a prospective cohort For all survival analyses in the GenePAD cohort, the follow-up time was defined as the period between the enrollment interview and the last confirmed follow-up or date of death. Cox proportional hazards models were used to calculate adjusted and unadjusted hazard ratios (HR) and 95% CI for the association of PPI use with cardiovascular mortality. Adjusted models included age, gender, race, total cholesterol, high-density lipoprotein cholesterol, systolic blood pressure, use of anti-hypertension medications, and lifetime pack-years.
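For readers less familiar with the model, the standard textbook form of the Cox proportional hazards model is given below. The symbols are generic notation that we introduce for illustration and do not come from the study: h_0(t) is the unspecified baseline hazard, x_PPI indicates PPI use at enrollment, and the x_j are the adjustment covariates listed above.

```latex
h(t \mid x) = h_0(t)\,\exp\!\Big(\beta_{\mathrm{PPI}}\,x_{\mathrm{PPI}} + \sum_{j} \beta_j x_j\Big),
\qquad
\mathrm{HR}_{\mathrm{PPI}} = \exp\!\big(\beta_{\mathrm{PPI}}\big),
\qquad
95\%\ \mathrm{CI} = \exp\!\big(\hat\beta_{\mathrm{PPI}} \pm 1.96\,\mathrm{SE}(\hat\beta_{\mathrm{PPI}})\big)
```

In the unadjusted model the sum over j is empty, so the hazard ratio compares PPI users with non-users without accounting for the other covariates.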

Discussion Our results demonstrate that PPIs appear to be associated with elevated risk of MI in the general population; and H2 blockers show no such association. The associations are independent of clopidogrel use or age-related risks and are seen in two large independent datasets and a prospective cohort. In particular, the association is seen outside of the high-risk populations previously examined, such as the elderly [38] or patients with ACS [2]. Our results are consistent with findings in the extensively-studied cohort of subjects with coronary artery disease (CAD) [5, 7, 12, 36], where PPIs have repeatedly been associated with adverse outcomes amongst patients receiving clopidogrel. [15] While two prospective studies in the post-ACS population failed to detect an association between PPI use and an increased risk of cardiovascular death, MI, or stroke [9, 10], the authors acknowledged that their results do not rule out a clinically meaningful difference in cardiovascular events due to use of a PPI.[10] In fact both studies included patients at a higher risk of MI than the general population, which may eclipse any potential harm conferred by PPIs due to competing risks. [38, 39] Based on the concern that PPIs could reduce the metabolism of clopidogrel to its active form, the FDA issued a warning about this possible drug-drug interaction in 2009 [40]. The current study suggests that the risk of PPIs may extend beyond previously studied high risk individuals. These findings confirm and extend the findings of Shih and colleagues, which suggested that PPIs were associated with short term cardiovascular harm amongst Taiwanese individuals [25], and are consistent with studies which have shown that PPIs may diminish the cardioprotective effects of drugs that do not depend on CYP2C19 activation, such as ticagrelor [7, 12, 13]. 
While it has been argued that this phenomenon might result from PPI-induced changes in drug absorption, we view this as a less likely possibility given that H2 blockers induce a similar reduction in gastric pH without consistently increasing cardiovascular risk, as observed in each of the three datasets studied here.[12] Other potential explanations for the observed association are that PPIs might impair cardiovascular hemodynamics or promote nutritional deficiencies. For example, PPIs have been reported to induce negative inotropic effects on myocardial tissue ex vivo, [41, 42] and to potentially increase the cardiovascular risk factor, homocysteine, by impairing the absorption of vitamin B12. [43, 44] However, population-based cohort studies have demonstrated a lack of excess mortality in patients with both ischaemic and non-ischaemic heart failure prescribed PPIs, [45] and consensus opinion is that PPIs are unlikely to cause a clinically relevant reduction in B12 levels in people on a normal diet, with otherwise normal gastrointestinal function [43]. Our observation that PPI usage is associated with harm in the general population (including the young and those taking no antiplatelet agent) suggests that PPIs may promote risk via an unknown mechanism that does not directly involve platelet aggregation. Accordingly, our recent molecular, cellular, physiological, and in vivo data [16] demonstrating that PPIs inhibit DDAH activity may explain how PPIs promote cardiovascular risk, and do so even in individuals not taking clopidogrel. DDAH, an enzyme necessary for cardiovascular health, metabolizes ADMA, an endogenous and competitive inhibitor of nitric oxide synthase (NOS).[46] Increases in plasma ADMA levels of as little as 10% are associated with increased risk of major adverse cardiovascular events.[19–24] We previously confirmed that PPIs inhibit purified DDAH enzyme using orthogonal assays. 
As a result, PPIs increased intracellular ADMA in cultured human endothelial cells by approximately 30%, increased serum ADMA levels in mice by approximately 20%, impaired endothelium-dependent vasodilation of isolated mouse aortae, and reduced the generation of nitric oxide by human saphenous vein segments obtained at the time of coronary artery bypass.[16] Taken together, these results provide a plausible mechanism for how PPI usage can manifest with dysregulation of vascular NOS, and therefore explain the association with increased risk of MI in the general population. Our study is subject to several limitations. Most importantly, these observational data may be subject to confounding in multiple ways, and it is possible that PPI usage is merely a marker of a sicker patient population [13]. For example, we were unable to control for factors such as obesity and insulin resistance, and it may be that in some individuals PPIs were prescribed for angina that was misidentified as acid reflux. However, the observation that alternative heartburn medications such as H2 blockers were not associated with harm lends support to the concept that PPIs may specifically promote risk. Although our data-mining pipeline has high specificity and was validated to have high accuracy (89%), there is still a possibility that the association detected is a false positive. We also cannot account for over-the-counter PPI usage, or differences by drug dosage. We attempt to partially offset these limitations by including replication data from multiple sources (the community-based PF dataset, the tertiary-care Stanford dataset, and the prospective GenePAD study), and by adjusting for several cardiovascular covariates in the survival analysis. Nonetheless, we recognize that these findings are hypothesis generating, and a prospective randomized study in the general population (inclusive of both lean and obese individuals) is required before changing clinical practice. 
However, the number of subjects needed to detect harm among PPI users for MI is considerable, projected to be about 4,000 by Shih et al [25]. In conclusion, we use a novel analytical pipeline to associate PPI usage with risk of MI in the general population, independent of clopidogrel use. These findings, in conjunction with the preclinical results, necessitate additional investigation. Our work also puts forth an example use case of the learning health system on how multiple clinical data sources can be examined via data-mining to identify drug safety signals for further investigation. [47, 48]

Supporting Information S1 Table. Indication, Drug, and Event definitions. For each clinical concept, a set of seed concept unique identifiers (CUIs) is used to generate a list of strings used to search through the clinical text. https://doi.org/10.1371/journal.pone.0124653.s001 (PDF) S1 Fig. Summary of the data-mining pipeline. To construct a contingency table, patients with gastroesophageal reflux disease (GERD) who were over 18 years old at the time of indication were identified and used to form the baseline population. The drugs of interest were PPIs, clopidogrel, and H2 blockers. The outcome was MI. The temporal ordering of the drug and outcome determined into which cell of a 2x2 contingency table each patient would be counted. https://doi.org/10.1371/journal.pone.0124653.s002 (PDF) S2 Fig. Cumulative risk and exposure plots for PPI–MI. The plots reveal that pharmacovigilance algorithms could have flagged omeprazole and lansoprazole for monitoring as early as the year 2000. https://doi.org/10.1371/journal.pone.0124653.s003 (PDF)

Author Contributions Conceived and designed the experiments: NHS JPC NJL PL ABM SVI. Performed the experiments: YTG. Analyzed the data: JM KTN. Contributed reagents/materials/analysis tools: PL ABM SVI. Wrote the paper: NHS PL NJL. Participated in the editing of the manuscript: NHS PL ABM YTG SVI JM KTN JPC NJL.