The protocol of this systematic review and meta-analysis was registered in the PROSPERO database (registration number: CRD42017068278). The systematic review was performed in accordance with the recommendations in the Cochrane Handbook for Systematic Reviews of Interventions [15]. The findings were reported according to the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) guidelines [16]. The entire review process (study selection and risk of bias assessment) was conducted in Covidence, the Cochrane platform for systematic reviews, by EC, NF, SS and LS.

Eligibility Criteria

Randomized, double-blind, placebo-controlled, parallel-group trials that assessed the AEs associated with COX-2 inhibitors (celecoxib, rofecoxib, etoricoxib or valdecoxib; lumiracoxib was not considered because it never gained full FDA approval) in patients with OA were eligible for inclusion in this meta-analysis.

Studies that allowed concomitant anti-osteoarthritis treatments during the trial (other than rescue medication such as acetaminophen or aspirin) were excluded, as were animal trials.

Data Sources and Search Strategies

A comprehensive literature search was undertaken in the following databases: MEDLINE (via Ovid), Cochrane Central Register of Controlled Trials (Ovid CENTRAL) and Scopus. Each database was searched from inception up to 30 June 2017. We searched for randomized placebo-controlled trials of COX-2 inhibitors in OA, using a combination of study design-, treatment- and disease-specific key words and Medical Subject Heading (MeSH) terms.

Although adverse effects were the outcomes of interest for this study, we deliberately omitted outcome-specific key words from the search strategies, because a study focusing on the efficacy of a drug may not mention terms related to adverse events in its title, abstract or keywords. The search was limited to English and French publications and to human subjects. Detailed search strategies for the MEDLINE/CENTRAL and Scopus databases are reported as Electronic Supplementary Material (ESM1).

Two clinical trial registries, ClinicalTrials.gov (clinicaltrials.gov/) and the World Health Organization’s International Clinical Trials Registry Platform Search portal (apps.who.int/trialsearch/), were also checked for unpublished trial results. Finally, recent meta-analyses were screened for any additional relevant studies.

Study Selection

Two members of the review team independently evaluated each title and abstract against the predefined eligibility criteria, excluding only obviously irrelevant studies. At this step, the criterion related to adverse effects was not applied, as studies focusing on the efficacy of a treatment may not report data about adverse effects in the abstract; consequently, trials whose abstracts mentioned only efficacy information were retained for full-text review. The two investigators then independently reviewed the full text of each article not excluded during the initial screening to determine whether the study met all selection criteria. At this stage, studies were excluded for the following reasons: previously unidentified duplication, only a conference abstract being available, COX-2 medication being compared only against a non-placebo comparator, an indication other than OA, safety not being included as an outcome of the trial, a non-COX-2 intervention or an incorrect study design. All differences of opinion regarding the selection of articles were resolved through discussion and consensus between the two investigators; any persistent disagreement was resolved by a third person (another member of the review team).

Data Extraction

The full texts of the selected studies were screened by independent reviewers for extraction of relevant data, using a standard data extraction form. Outcome data were independently extracted by two investigators from the review team. For each study, the following data were extracted: characteristics of the manuscript, characteristics of the trial, objective and design of the study, characteristics of the patients, characteristics of the disease, characteristics of the treatments, AEs (outcomes) reported during the trial and the main conclusion of the study. When a trial included multiple dosage arms for COX-2 inhibitors, the maximum dose was used to categorize the study. When multiple follow-up times were reported, the longest follow-up time was used to categorize the study. The raw data (number of events in each group) were extracted for each outcome. We extracted the number of patients who experienced at least one AE in a given body system (e.g. nervous system, gastrointestinal system), as well as the number who experienced each specific AE within that body system (e.g. headache, abdominal pain). Whenever possible, data from the intention-to-treat (ITT) analysis were used.
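The two categorization rules above (maximum dose, longest follow-up) can be illustrated with a minimal Python sketch. The record type `ArmReport`, the function `select_report` and all numbers are hypothetical and serve only as an illustration; applying the dose rule before the follow-up rule is also an assumption, since the text does not state an order.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ArmReport:
    """One reported result for a COX-2 inhibitor arm (hypothetical record)."""
    dose_mg: float        # daily dose of the arm
    follow_up_weeks: int  # time point at which AEs were reported
    events: int           # participants with at least one event
    total: int            # participants in the arm


def select_report(reports):
    """Keep the maximum-dose arm, then its longest follow-up.

    Assumption: the dose rule is applied first, the follow-up rule second.
    """
    max_dose = max(r.dose_mg for r in reports)
    at_max_dose = [r for r in reports if r.dose_mg == max_dose]
    return max(at_max_dose, key=lambda r: r.follow_up_weeks)


# Illustrative values only, not data from any included trial.
reports = [
    ArmReport(dose_mg=100, follow_up_weeks=6, events=4, total=200),
    ArmReport(dose_mg=200, follow_up_weeks=6, events=7, total=198),
    ArmReport(dose_mg=200, follow_up_weeks=12, events=11, total=198),
]
chosen = select_report(reports)
```

In this example the 200 mg arm at 12 weeks is retained, and its raw counts (`events`, `total`) are the values that would enter the analysis.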

Outcomes of Interest

The main System Organ Classes (SOCs) that are likely to be affected by the use of COX-2 inhibitors in the treatment of OA were explored in this meta-analysis. The primary outcomes of interest were gastrointestinal disorders, cardiac disorders, vascular disorders, nervous system disorders, skin and subcutaneous tissue disorders, hepatobiliary disorders, renal and urinary disorders, as well as overall severe and serious AEs, drug-related AEs and mortality. Secondary outcomes were withdrawals because of AEs (i.e. the number of participants who stopped the treatment because of an AE) and total number of AEs (i.e. the number of patients who experienced any AE at least once).

Assessment of Risk of Bias in Included Studies

Two authors of the review team independently assessed the risk of bias in each study using the Cochrane Collaboration’s tool for risk of bias assessment [15]. The following characteristics were evaluated:

Random sequence generation: we assessed whether the allocation sequence was adequately generated.

Allocation concealment: we assessed the method used to conceal the allocation sequence, evaluating whether the intervention allocation could have been foreseen in advance.

Blinding of participants and personnel: we assessed the method used to blind study participants and personnel from knowledge of which intervention a participant received and whether the intended blinding was effective.

Blinding of outcome assessment: we assessed the method used to blind outcome assessors from knowledge of which intervention a participant received and whether the intended blinding was effective.

Incomplete outcome data: we assessed whether participants’ exclusions, attrition and incomplete outcome data were adequately addressed in the paper.

Selective outcome reporting: we checked whether there was evidence of selective reporting of adverse events.

Each of these items was categorized as ‘low risk of bias’, ‘high risk of bias’ or ‘unclear risk of bias’. ‘Low risk of bias’ or ‘high risk of bias’ was attributed to an item when there was sufficient information in the manuscript to judge the risk of bias as ‘low’ or ‘high’; otherwise, ‘unclear risk of bias’ was attributed to the item. Disagreements were resolved by discussion between the two reviewers during a consensus meeting; when necessary, another member of the review team was involved in the final decision.
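As an illustration of the dual-assessment procedure described above, the following Python sketch compares two reviewers' judgements per domain and lists the domains that would go to the consensus meeting. It is not the workflow implemented in Covidence; the function `reconcile` and the example judgements are hypothetical.

```python
DOMAINS = (
    "random sequence generation",
    "allocation concealment",
    "blinding of participants and personnel",
    "blinding of outcome assessment",
    "incomplete outcome data",
    "selective outcome reporting",
)
JUDGEMENTS = {"low", "high", "unclear"}


def reconcile(reviewer_a, reviewer_b):
    """Split domains into agreed judgements and domains needing consensus."""
    agreed, to_discuss = {}, []
    for domain in DOMAINS:
        a, b = reviewer_a[domain], reviewer_b[domain]
        if a not in JUDGEMENTS or b not in JUDGEMENTS:
            raise ValueError(f"invalid judgement for {domain!r}")
        if a == b:
            agreed[domain] = a
        else:
            to_discuss.append(domain)
    return agreed, to_discuss


# Illustrative judgements for one hypothetical study.
reviewer_a = dict.fromkeys(DOMAINS, "low")
reviewer_b = {**reviewer_a, "allocation concealment": "unclear"}
agreed, to_discuss = reconcile(reviewer_a, reviewer_b)
```

Here only 'allocation concealment' would be carried forward to discussion; the five other domains keep the shared judgement.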

Data Analysis

Analyses were performed using STATA 14.2 software. The unit of analysis was the number of participants experiencing a specific adverse event. We described harms associated with the treatment as risk ratios with 95% confidence intervals (95% CI). We computed an overall effect size for each primary or secondary outcome (AE). Anticipating substantial variability among trial results (i.e. inter-study variability), we assumed heterogeneity in the occurrence of the AEs; thus, we planned to use random-effects models for the meta-analyses. We estimated the overall effects and heterogeneity using the DerSimonian and Laird random-effects model [17]. As this method provides a biased estimate of the between-study variance when events are sparse [18, 19], we also performed the meta-analyses using the Restricted Maximum Likelihood (REML) method [20]. As rofecoxib was withdrawn from the market in 2004 because of thrombotic cardiovascular events, we performed a sensitivity analysis of AEs for the COX-2 inhibitor class excluding rofecoxib.
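For readers unfamiliar with the pooling step, the sketch below shows in plain Python how a risk ratio and a DerSimonian and Laird random-effects estimate can be computed from raw event counts. It is an illustration under stated assumptions, not the STATA code used for the analyses: the function names and trial counts are hypothetical, every arm is assumed to have at least one event (no continuity correction is applied), and the REML estimator is not shown.

```python
import math


def log_risk_ratio(events_trt, n_trt, events_ctl, n_ctl):
    """Log risk ratio and its large-sample variance for one trial.

    Assumes non-zero event counts in both arms.
    """
    log_rr = math.log((events_trt / n_trt) / (events_ctl / n_ctl))
    variance = 1 / events_trt - 1 / n_trt + 1 / events_ctl - 1 / n_ctl
    return log_rr, variance


def dersimonian_laird(effects, variances):
    """Pooled effect, its standard error and tau^2 (method of moments)."""
    weights = [1 / v for v in variances]
    sum_w = sum(weights)
    fixed = sum(w * y for w, y in zip(weights, effects)) / sum_w
    q = sum(w * (y - fixed) ** 2 for w, y in zip(weights, effects))
    c = sum_w - sum(w ** 2 for w in weights) / sum_w
    tau2 = max(0.0, (q - (len(effects) - 1)) / c)
    re_weights = [1 / (v + tau2) for v in variances]
    pooled = sum(w * y for w, y in zip(re_weights, effects)) / sum(re_weights)
    se = math.sqrt(1 / sum(re_weights))
    return pooled, se, tau2


# Hypothetical counts: (events, n) for COX-2 inhibitor, then for placebo.
trials = [(12, 150, 8, 148), (30, 400, 21, 395), (5, 90, 6, 92)]
effects, variances = zip(*(log_risk_ratio(*t) for t in trials))
pooled, se, tau2 = dersimonian_laird(effects, variances)

rr = math.exp(pooled)  # pooled risk ratio
ci = (math.exp(pooled - 1.96 * se), math.exp(pooled + 1.96 * se))  # 95% CI
```

Pooling is done on the log scale and the result is back-transformed, which is why the confidence interval is asymmetric around the pooled risk ratio.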

We tested heterogeneity using Cochran’s Q test. Because we performed random-effects meta-analyses, we used the tau-squared (τ²) estimate as the measure of the between-study variance. The I-squared (I²) statistic was used to quantify heterogeneity, measuring the percentage of total variation across studies that is due to heterogeneity rather than chance [21]. The quality of the evidence for each outcome was assessed using the GRADE approach [22] and a summary of findings table was prepared using GRADEpro online software [23].
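The three heterogeneity measures named above are closely related, as the following Python sketch illustrates. It assumes the method-of-moments (DerSimonian and Laird) form of τ² and the usual definition I² = max(0, (Q − df)/Q) × 100; the function name and input values are hypothetical, and the p-value of the Q test (from a chi-squared distribution with k − 1 degrees of freedom) is not computed here.

```python
def heterogeneity(effects, variances):
    """Return Cochran's Q, its degrees of freedom, tau^2 and I^2 (in %)."""
    weights = [1 / v for v in variances]
    sum_w = sum(weights)
    fixed = sum(w * y for w, y in zip(weights, effects)) / sum_w
    # Q: weighted squared deviations from the inverse-variance pooled effect
    q = sum(w * (y - fixed) ** 2 for w, y in zip(weights, effects))
    df = len(effects) - 1
    # tau^2: between-study variance, truncated at zero
    c = sum_w - sum(w ** 2 for w in weights) / sum_w
    tau2 = max(0.0, (q - df) / c)
    # I^2: share of total variation attributed to heterogeneity
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0
    return q, df, tau2, i2


# Illustrative log risk ratios and variances for two hypothetical trials.
q, df, tau2, i2 = heterogeneity([0.0, 2.0], [0.5, 0.5])
```

With these illustrative inputs, Q exceeds its degrees of freedom, so both τ² and I² are greater than zero; when Q is at or below its degrees of freedom, both are truncated to zero.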