Inclusion and exclusion criteria

The criteria for inclusion were: (i) case–control and cohort studies evaluating the relationship between alcohol consumption and prostate cancer; (ii) original articles published in English up to December 2014; (iii) articles that reported findings as odds ratios, hazard ratios, incidence ratios or standardized mortality ratios; and (iv) articles reporting at least three levels of alcohol consumption with drinking amounts, including the reference level. Articles with no abstainer group or a lowest drinking level greater than 0.33 g/d were excluded. Additionally, studies reporting total alcohol consumption were included, while studies based only on consumption of specific beverages such as wine, whiskey, vodka, sake or hard liquors were excluded. When the results of a study were published more than once, or the same dataset was used multiple times, only the most recent or most complete data were included in the analyses. The primary outcomes of interest were mortality and/or morbidity from prostate cancer (ICD–9: 185 or ICD–10: C61) [26].

While published and peer reviewed cohort or case–control studies were included in the review, all other article types were excluded, including narrative reviews, letters, editorials, commentaries, unpublished manuscripts, dissertations, government reports, books and book chapters, conference proceedings, meeting abstracts, lectures and addresses, and consensus development statements including guideline statements.

Search strategy

The systematic review follows the Preferred Reporting Items for Systematic Reviews and Meta–Analyses (PRISMA) guidelines [27]. We identified all potentially relevant articles published up to 31 December 2014 by searching PubMed and Web of Science and by cross–checking reference lists, including those of previous meta–analyses. Hand searches of cited references in the selected articles, reviews and meta–analyses published on the same topic were also performed. The following MeSH terms and text words were used: (“prostatic neoplasms” OR (“prostate” AND “neoplasms”) OR “prostate cancer” OR (“prostate” AND “Cancer”)) AND (“alcohol” OR (alcohol drinking) OR “alcohol consumption” OR “alcohol intake” OR (“alcohol” AND “consumption”)).

Study selection

Two reviewers trained and supervised by the PI read the titles and/or abstracts of all the citations retrieved from the electronic database searches and removed all citations that were clearly not related to studies of the relationship between prostate cancer and alcohol consumption. The screening further involved abstract review. Full–text articles were obtained for all abstracts except for those that clearly did not meet eligibility criteria. The investigators were consulted in the event of any disagreement. Two of the investigators independently evaluated all studies selected for inclusion. The initial search identified a total of 340 studies of which 27 studies [4–6, 9, 11, 12, 28–48] satisfied the criteria for the meta–analysis after removing 313 records for reasons identified in Fig. 1.

Fig. 1 Flowchart summarizing the systematic review of studies of prostate cancer morbidity or mortality and alcohol consumption, from literature search to inclusion in the meta–analysis

Data extraction

Two reviewers independently reviewed all eligible papers to extract and code data from all studies fulfilling the inclusion criteria, and any disagreements were resolved by discussion with the investigators. Each study was coded with reference to a standardized code–book (available from authors on request) and under the supervision of investigators. The coding of all variables in the meta–dataset was double–checked by the first two authors. The data to be extracted were: (1) outcome, mortality or morbidity of prostate cancer; (2) measures of alcohol consumption; (3) study characteristics; (4) types of misclassification error of alcohol measure; and (5) controlled variables in individual studies.

A multitude of different approaches are used for assessing alcohol consumption in this literature [49]. Problematic approaches include assessing some beverage types and not others, assessing quantity consumed on a drinking day but not frequency, assessing consumption over very short time periods (e.g. two days) and assessing frequency but not quantity of consumption. We coded alcohol measurement as ‘adequate’ if both quantity and frequency of consumption were assessed for all alcoholic beverages and for a period of at least one week.

The primary exposure variable was level of daily alcohol consumption in grams of ethanol, assessed at baseline and compared with a reference group of variously defined “non–drinkers” or “abstainers”. When studies did not define the grams of alcohol per unit or drink, we used 8 g/unit for the UK; 10 g/drink for Australia, Austria, France, Greece, Hungary, Ireland, Netherlands, New Zealand, Poland, Spain and Sweden; 11 g/drink for Finland; 12 g/drink for Denmark, Germany, Italy, South Africa and Switzerland; 13.45 g/drink for Canada; 14 g/drink for the US; 12.5 g/drink for China; 19.75 g/drink for Japan; and 12 g/drink for other countries [50, 51]. We converted alcohol intake into grams per day using the mid–points of reported categories to estimate mean values. Following practice in other meta–analyses involving self–reported alcohol consumption, the open–ended top categories (e.g. 6+ drinks/day) were coded by adding three–quarters of the range of the next lowest category to the lower bound (e.g. if the next lowest category is 3 to 5 drinks, this would be 6 + (5 – 3) × 0.75 = 7.5) [52]. Some estimate higher than the lower bound is needed for these open–ended categories because they have no fixed upper level (e.g. 7.5 in this case instead of 6 for 6+ drinks). We employed a predetermined definition of “low–volume” drinking (up to 20 g ethanol per day) based on the Australian NHMRC low risk drinking guidelines [53]. This was operationalised as up to 24 g per day, given that respondents in the studies reported whole drinks or units rather than grams, i.e. 24 g per day is closer to two than to three 10 g standard drinks per day. All data extracted from individual studies and analyzed during this study can be found in Additional file 1.
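The conversion rules above can be illustrated with a short Python sketch. It is not the authors' code (the analyses were run in SAS); the function names are hypothetical, and the lookup table reproduces only a subset of the country values listed in the text.

```python
# Grams of ethanol per standard drink (subset of the values listed in the text).
GRAMS_PER_DRINK = {
    "UK": 8.0, "Australia": 10.0, "Finland": 11.0, "Denmark": 12.0,
    "Canada": 13.45, "US": 14.0, "China": 12.5, "Japan": 19.75,
}
DEFAULT_GRAMS_PER_DRINK = 12.0  # value the text gives for "other countries"


def category_drinks_per_day(lower, upper=None, prev_lower=None, prev_upper=None):
    """Representative drinks/day for one reported consumption category.

    Closed categories use the midpoint. Open-ended top categories (upper is
    None) use the lower bound plus three-quarters of the range of the next
    lowest category.
    """
    if upper is not None:
        return (lower + upper) / 2.0
    if prev_lower is None or prev_upper is None:
        raise ValueError("open-ended category needs the next lowest category's bounds")
    return lower + 0.75 * (prev_upper - prev_lower)


def grams_per_day(drinks_per_day, country):
    """Convert drinks/day to grams of ethanol/day for the given country."""
    return drinks_per_day * GRAMS_PER_DRINK.get(country, DEFAULT_GRAMS_PER_DRINK)


top = category_drinks_per_day(6, prev_lower=3, prev_upper=5)  # 6 + (5 - 3) * 0.75
mid = category_drinks_per_day(3, 5)                           # midpoint of 3 to 5
print(top, mid, grams_per_day(top, "US"))
```

With the worked example from the text, the open-ended "6+ drinks" category is coded as 7.5 drinks per day, which at 14 g/drink for a US study corresponds to 105 g of ethanol per day.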

Studies were classified according to the presence or absence of two types of potential abstainer group bias: (i) including former drinkers and/or (ii) including occasional drinkers in the abstainer reference category. Studies were coded as having former drinker bias if a) results were not reported separately for former drinkers and b) there was no mention of removing former drinkers from the abstainer reference group. Following Fillmore et al. [16], lifetime abstention was strictly defined as zero consumption and did not include studies with any level of occasional lifetime or past year drinking (e.g. less than 12 drinks or “rarely” or “hardly ever” drinking). Our rationale for this strict criterion was that self–reported infrequent drinkers have been shown to greatly underreport their personal consumption [54, 55]. Studies were coded as having occasional drinker bias if a) results were not reported separately for occasional drinkers and b) frequency of drinking was assessed for a “usual” period or over less than 30 days. The rationale here is that if a person reports “usually” not drinking over the course of a month, persons drinking less than monthly may still be occasional drinkers. When a study used occasional drinkers as the reference category and risk for abstainers was independently assessed, the risk values were recalculated using the abstainer category as the reference group [16].
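As an illustration only, the two coding rules above can be written as a small Python function. The function and argument names are hypothetical and each argument stands for a yes/no judgement made by the coders; the four output labels mirror the combinations of former and occasional drinker bias described in the text.

```python
def code_abstainer_bias(former_reported_separately,
                        former_removed_from_reference,
                        occasional_reported_separately,
                        frequency_usual_or_under_30_days):
    """Classify a study's abstainer reference group by the two bias rules.

    Former drinker bias: results not reported separately for former drinkers
    AND no mention of removing them from the abstainer reference group.
    Occasional drinker bias: results not reported separately for occasional
    drinkers AND drinking frequency assessed for a "usual" period or over
    less than 30 days.
    """
    former_bias = (not former_reported_separately
                   and not former_removed_from_reference)
    occasional_bias = (not occasional_reported_separately
                       and frequency_usual_or_under_30_days)
    if former_bias and occasional_bias:
        return "both former and occasional drinkers"
    if former_bias:
        return "former drinkers only"
    if occasional_bias:
        return "occasional drinkers only"
    return "neither"


# Hypothetical study: former drinkers were removed from the reference group,
# but frequency was only asked about a "usual" week.
print(code_abstainer_bias(False, True, False, True))
```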

Strategy for data analysis

Where studies only reported mortality or incidence rates, these were converted to RR estimates [56]. Otherwise, hazard ratios in cohort studies and odds ratio estimates in case–control studies were entered as observations of the estimated risk relationships for meta–analysis. When odds ratios (ORs) estimated using logistic regression models in a case–control study are used as RR estimates, the OR tends to overestimate the RR when it is more than one and to underestimate the RR when it is less than one, and increasingly so as the outcome becomes more frequent [57]. Therefore, the formula below was used to correct the adjusted OR and its 95% CI obtained from logistic regression in these studies and to derive an estimate of association that better represents the true RR [57].

$$ RR=\frac{OR}{\left(1-{P}_0\right)+\left({P}_0\times OR\right)}, $$

where RR is the relative risk, OR is the odds ratio and $P_0$ is the incidence of the outcome of interest in the non–exposed group.
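A minimal Python sketch of this correction is given below. The function names and the numeric inputs are hypothetical and chosen only to show the direction of the adjustment. The same formula is applied to the point estimate and to each confidence limit, as the text describes.

```python
def or_to_rr(odds_ratio, p0):
    """Correct an odds ratio to an approximate relative risk.

    RR = OR / ((1 - P0) + P0 * OR), where P0 is the incidence of the
    outcome in the non-exposed group.
    """
    if odds_ratio <= 0:
        raise ValueError("odds ratio must be positive")
    if not 0.0 <= p0 <= 1.0:
        raise ValueError("P0 must lie between 0 and 1")
    return odds_ratio / ((1.0 - p0) + p0 * odds_ratio)


def correct_estimate(or_point, or_lower, or_upper, p0):
    """Apply the correction to an OR and both limits of its 95% CI."""
    return tuple(or_to_rr(value, p0) for value in (or_point, or_lower, or_upper))


# Hypothetical inputs: OR = 2.0 (95% CI 1.5 to 2.7), P0 = 0.10.
rr, rr_lower, rr_upper = correct_estimate(2.0, 1.5, 2.7, 0.10)
print(round(rr, 3), round(rr_lower, 3), round(rr_upper, 3))
```

The corrected value always lies between the OR and one, and it equals the OR when $P_0$ is zero, which is why the correction matters more as the outcome becomes more frequent.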

Publication bias was assessed through visual inspection of the funnel plot of the log–RR of morbidity or mortality of prostate cancer due to alcohol consumption against the inverse standard error of the log–RR [56] and with Egger’s linear regression method [58]. We drew a forest plot to examine how the RR estimate for any drinking in each study differed from the others [56]. We also assessed between–study heterogeneity of RRs overall and by drinking groups using Cochran’s Q [59] and the I² statistic [60]. As no heterogeneity was detected, fixed effects models were used to obtain the summarized RR estimates [56]. We also conducted sensitivity tests using random effects models, but the patterns of results were very similar and are not reported here.
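For readers unfamiliar with these statistics, the following Python sketch shows the standard textbook calculation of an inverse variance fixed effects summary together with Cochran's Q and I². It is not the authors' SAS code, the function name is hypothetical, the inputs are made up, and the chi-square P value for Q is omitted to keep the example dependency-free.

```python
import math


def fixed_effect_meta(log_rrs, variances):
    """Inverse variance fixed effects pooling with Cochran's Q and I-squared."""
    weights = [1.0 / v for v in variances]
    total_weight = sum(weights)
    pooled_log_rr = sum(w * y for w, y in zip(weights, log_rrs)) / total_weight
    pooled_se = math.sqrt(1.0 / total_weight)

    # Cochran's Q: weighted squared deviations from the pooled estimate.
    q = sum(w * (y - pooled_log_rr) ** 2 for w, y in zip(weights, log_rrs))
    df = len(log_rrs) - 1
    # I-squared: share of total variation attributed to heterogeneity (in %).
    i_squared = 100.0 * max(0.0, (q - df) / q) if q > 0 else 0.0

    return {
        "rr": math.exp(pooled_log_rr),
        "ci95": (math.exp(pooled_log_rr - 1.96 * pooled_se),
                 math.exp(pooled_log_rr + 1.96 * pooled_se)),
        "Q": q,
        "df": df,
        "I2": i_squared,
    }


# Made-up log-RRs and variances for three hypothetical studies.
result = fixed_effect_meta([0.05, 0.10, 0.08], [0.010, 0.020, 0.015])
print(result)
```

When Q is no larger than its degrees of freedom, I² is truncated at zero, which corresponds to the "no heterogeneity detected" situation in which a fixed effects summary is conventionally used.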

We used fixed effects models to estimate the weighted RRs of prostate cancer for any alcohol use and by drinking groups while adjusting for the potential effects of study–level covariates [56, 61–63]. Drinking level in each study group was examined in terms of pre–defined specific consumption levels. Drinking categories were defined and reclassified as: (1) lifetime occasional drinkers (0.02–0.33 g/day); (2) former drinkers now completely abstaining; (3) current occasional drinkers, up to one drink per week (<1.30 g per day); (4) low volume drinkers, up to 2 drinks or 1.30–24 g per day; (5) medium volume drinkers, up to 4 drinks or 25–44 g per day; (6) high volume drinkers, up to 6 drinks or 45–64 g per day; and (7) higher volume drinkers, 6 drinks or 65 g or more per day. All studies had an open–ended heavier drinking group, i.e. with no upper limit on the quantity consumed per day for responses accepted as valid. We investigated the dose–response relationship between the RR and alcohol consumption for those who drank one drink or more per week, using the midpoint of each exposure category and a t-test in multivariate linear regression analysis [56].
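The reclassification of current drinkers by volume can be sketched as a simple lookup in Python. This is an illustration under two assumptions that are not stated in the text: the bands are treated as half-open intervals so that values such as 24.5 g/day fall into a category, and the high volume band is taken to begin at 45 g/day, immediately above the medium band. The function name is hypothetical, and lifetime occasional and former drinkers are not handled because they are defined by drinking history rather than by current volume.

```python
import bisect

# Upper bounds (g/day, exclusive) of the first four current-drinker bands.
# ASSUMPTION: half-open bands; high volume starts at 45 g/day.
_UPPER_BOUNDS = [1.30, 25.0, 45.0, 65.0]
_LABELS = [
    "current occasional",  # < 1.30 g/day
    "low volume",          # 1.30 to < 25 g/day
    "medium volume",       # 25 to < 45 g/day
    "high volume",         # 45 to < 65 g/day
    "higher volume",       # 65 g/day or more
]


def classify_current_drinker(grams_per_day):
    """Return the volume category for a current drinker's mean grams/day."""
    if grams_per_day <= 0:
        raise ValueError("current drinkers must have positive consumption")
    return _LABELS[bisect.bisect_right(_UPPER_BOUNDS, grams_per_day)]


for grams in (1.0, 1.30, 24.0, 30.0, 50.0, 105.0):
    print(grams, classify_current_drinker(grams))
```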

We investigated the potential modification and confounding effects of study–level covariates using bivariate analysis of the RR of prostate cancer morbidity or mortality and any alcohol consumption [64]. According to the availability of data from the 27 included studies, the following study characteristics were investigated: (1) study design, i.e. cohort study, population–based case–control study or hospital–based case–control study; (2) outcome, i.e. morbidity or mortality of prostate cancer; (3) adequacy of the drinking measurement method, defined as whether both quantity and frequency of total alcohol consumption were assessed for at least one week; (4) mean or median age of individual study populations at baseline; (5) year at baseline, taking the midpoint if subjects were recruited over a number of years; (6) whether subjects with a history of cancer were excluded at baseline or prior to randomization (yes, no or unknown); (7) presence of misclassification errors, i.e. including both former and occasional drinkers, only former drinkers, only occasional drinkers or neither former nor occasional drinkers in the abstaining reference group; (8) whether or not the study controlled for social status (yes or no) using income or occupation measures; (9) whether or not the study controlled for racial identity or country of origin (yes or no); (10) whether or not the study controlled for smoking status (yes or no); and (11) whether or not the study was conducted in the US. We made stratified RR estimates for studies with different values for these characteristics and also examined the differences in the RR estimates between these same subgroups of studies [64].

The covariates above were selected for control in multivariate regression analyses on empirical grounds based on the P–value of bivariate tests of the log–RR of each covariate, and correlations with other covariates. Using all 27 studies, any variable whose bivariate test had a P–value <0.10 was considered as a candidate for the multivariate regression analyses of the log–RR of prostate cancer morbidity or mortality [65, 66]. If two or more covariates were moderately to highly correlated (coefficient > 0.30), the one with the lowest P–value from the bivariate test was included in the multivariate regression analyses. Abstainer bias was the main interest of the present study and thus its potential confounding effect was adjusted for in the pooled analysis (Table 3) and further examined in the stratified analysis (Table 4). On the basis of these criteria, two other covariates were included in the analyses: (i) whether or not the study was conducted in the US and (ii) whether smoking was controlled in the individual studies (Tables 3 and 4). Although the study design variable was not selected as a covariate for the final models on the basis of the bivariate analysis, study design was a concern because designs were unevenly distributed across studies with different abstainer biases and the RR estimates in case–control studies differed slightly from those in cohort studies [23]. We therefore examined the potential effect of the design variable in a sensitivity analysis, including and excluding it in the multivariate regression analyses (Tables 3 and 4); the estimates remained unchanged. We also conducted a correlation analysis of the study design variable and the other selected covariates. The design variable was highly correlated with the abstainer bias variable (coefficient = 0.48, P < 0.001) and was therefore not included in the final models.
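One way to read the screening rule above is as a greedy selection, sketched here in Python. This is an interpretation rather than the authors' procedure: candidates are ordered by bivariate P value, and a candidate is dropped if it is correlated above the threshold with a covariate already kept. The function name, covariate names and numbers are all hypothetical.

```python
def select_covariates(p_values, correlations, p_threshold=0.10, r_threshold=0.30):
    """Screen covariates by bivariate P value, then resolve correlated pairs.

    p_values: dict mapping covariate name to its bivariate P value.
    correlations: dict mapping frozenset({name_a, name_b}) to the correlation
        coefficient; missing pairs are treated as uncorrelated.
    """
    candidates = sorted(
        (name for name, p in p_values.items() if p < p_threshold),
        key=lambda name: p_values[name],
    )
    selected = []
    for name in candidates:
        # Keep the candidate only if it is not moderately or highly
        # correlated with a covariate that has a lower P value.
        clashes = any(
            abs(correlations.get(frozenset((name, kept)), 0.0)) > r_threshold
            for kept in selected
        )
        if not clashes:
            selected.append(name)
    return selected


# Hypothetical covariates and values, for illustration only.
p_values = {"cov_a": 0.02, "cov_b": 0.05, "cov_c": 0.08, "cov_d": 0.40}
correlations = {frozenset(("cov_a", "cov_c")): 0.48}
print(select_covariates(p_values, correlations))
```

In this made-up example `cov_d` fails the P < 0.10 screen and `cov_c` is dropped because it is correlated with `cov_a`, which has the lower P value.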

In the multivariate regression analysis, the dependent variable was the natural log of the RR, estimated using the rate ratio, hazard ratio or odds ratio of each drinking group in relation to the abstainer category. All analyses were weighted by the inverse of the estimated variance of the natural log RR. Variance was estimated from reported standard errors or confidence intervals. The weights for each individual study were created using the inverse variance weighting scheme of fixed effects regression analysis in order to obtain maximum precision for the main results of the meta–analysis [56]; such analyses may also adjust for confounding among the study characteristics [63].
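When only a confidence interval is reported, the variance of the log RR is commonly recovered by assuming the interval is symmetric on the log scale. The Python sketch below shows that conventional calculation; the text does not spell out the exact formula the authors used, so this is an assumption, and the function names and example interval are hypothetical.

```python
import math

Z_95 = 1.959964  # standard normal quantile for a two-sided 95% interval


def log_rr_variance(ci_lower, ci_upper, z=Z_95):
    """Variance of ln(RR) from a reported confidence interval.

    ASSUMPTION: the interval is ln(RR) +/- z * SE, i.e. symmetric on the
    log scale, so SE = (ln(upper) - ln(lower)) / (2 * z).
    """
    if not 0 < ci_lower < ci_upper:
        raise ValueError("need 0 < lower < upper")
    standard_error = (math.log(ci_upper) - math.log(ci_lower)) / (2.0 * z)
    return standard_error ** 2


def inverse_variance_weight(ci_lower, ci_upper):
    """Weight given to one risk estimate in the weighted regression."""
    return 1.0 / log_rr_variance(ci_lower, ci_upper)


# Hypothetical estimate: RR = 1.20 (95% CI 1.00 to 1.44).
print(log_rr_variance(1.00, 1.44), inverse_variance_weight(1.00, 1.44))
```

Narrower intervals give smaller variances and therefore larger weights, so more precise estimates contribute more to the pooled result.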

Studies with large or small estimates and/or variances can be highly influential. Univariate analysis [56, 67, 68] was performed to identify outliers. If a particular RR was more than twice the standard deviation of the RR estimates by drinking groups it was considered to be an outlier; five risk estimates were identified as outliers among 126 risk estimates. Sensitivity analyses were run after excluding outliers but no substantial changes in the risk estimates resulted [56]. A sensitivity analysis was also run after excluding one study by Putnam et al. [41] with markedly higher risk estimates but, again, the estimates remained unchanged. There was also no substantial effect on the RR estimates when each of the other studies was excluded in turn.
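The outlier rule is stated briefly, so the following Python sketch adopts one plausible reading: within a drinking group, an RR is flagged when it lies more than two sample standard deviations from the group mean. That reading, the function name and the RR values are all assumptions made for illustration.

```python
from statistics import mean, stdev


def flag_outliers(rrs, k=2.0):
    """Flag RR estimates lying more than k standard deviations from the mean.

    ASSUMPTION: deviations are measured from the mean of the RR estimates
    within one drinking group, using the sample standard deviation.
    """
    if len(rrs) < 2:
        return [False] * len(rrs)
    centre, spread = mean(rrs), stdev(rrs)
    return [abs(rr - centre) > k * spread for rr in rrs]


# Made-up RR estimates for one drinking group; the last one is extreme.
group_rrs = [1.0, 1.1, 0.9, 1.05, 0.95, 1.0, 1.02, 0.98, 3.0]
flags = flag_outliers(group_rrs)
print([rr for rr, is_outlier in zip(group_rrs, flags) if is_outlier])
```

A sensitivity analysis would then refit the models without the flagged estimates and compare the pooled RRs with the original results.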

All significance tests assumed two–tailed P values or 95% CIs. All statistical analyses were performed using SAS 9.3 and the SAS PROC MIXED procedure was used to model the log–transformed RR [69].

Role of the funding sources

The study funders had no role in study design, data collection, analysis or interpretation, report preparation or the decision to publish. All authors had full access to all the data and had final responsibility for the decision to submit for publication.