The present study protocol describes an ongoing systematic review and meta-analysis of prognostic factors for clinical outcomes following MDR. The protocol adheres to the requirements of the Preferred Reporting Items for Systematic Reviews and Meta-Analyses protocols (PRISMA-P) [18], with a populated PRISMA-P 2015 checklist included as an additional file [see Additional file 1], in the interest of transparency and completeness. The review will conform to the related PRISMA guidelines [19]. We registered this systematic review in PROSPERO, the International Prospective Register of Systematic Reviews, on February 5, 2016 (ref id: CRD42016025339), and the project is ongoing as of May 31, 2017.

Eligibility criteria

Included in this review are articles on longitudinal studies that report empirical data, either observational (cohort, case-control) or experimental/clinical trials (RCT), in which predictive factors are presented from baseline to follow-up. Articles need to be original research papers published in full text in peer-reviewed journals; commentaries, editorials, and articles written in languages other than English are therefore excluded. We use an explorative approach, aiming to reach all investigated factors potentially predictive of treatment outcome. Below, eligibility criteria are defined according to population, intervention or variable of interest (as in exposure or predictive factor), comparators/referents, and outcomes (PICO).

Population of interest

Adults aged 18–67 years (i.e., the working-age population) with chronic musculoskeletal pain who have taken part in multidisciplinary rehabilitation programmes following the biopsychosocial model. Chronic is defined as a duration of > 3 months, and musculoskeletal pain conditions are delimited to those not emanating from malignancies or systemic diseases (e.g., rheumatoid arthritis). The population thus consists of a wide range of common benign chronic pain diagnoses, i.e., patients with back pain, neck pain, and generalized pain syndromes (including fibromyalgia and general widespread pain).

Variable of interest (as in exposure or predictive factor)

Any independent variable investigated for potential predictive ability. We aim to evaluate all personal, work-based, and rehabilitation-based factors described in the scientific literature. Our broad study approach will consequently cover a variety of investigated prognostic factors, which will later be grouped into relevant domains. Predictive factors may be “personal”, e.g., demographic factors (sex, age, socioeconomic status, lifestyle), symptom-related factors (pain intensity, pain duration, comorbidity), physical functioning (self-rated, assessed), psychosocial factors (cognitive, emotional, or social), and work-related factors, or they may be “treatment-related” (MDR duration, intensity, or content). A variable of interest might, for example, be the association of high baseline depression level with treatment outcome.

Comparators

A comparator is the alternative exposure within the predictive factor. The comparators (referents) are, thus, those not exposed to the predictor of interest, for example, those with a low depression level (vs. high depression level) at baseline.

Outcome

Longitudinal follow-up according to what is recommended by IMMPACT as core outcome measures in subjects with chronic pain [20]. For the purpose of this review, we will primarily focus on pain and physical functioning including measures of health-related quality of life and work ability. Any additional outcomes falling within the IMMPACT recommended outcome domains (emotional functioning, participant ratings of global improvement, and satisfaction with treatment) that are found during the review process will be analyzed elsewhere. Only long-term follow-up data (6 months or more) will be included in the analyses.

Study identification

The search strategy was developed with the support of a research librarian at Karolinska Institutet University Library, to optimize structure and completeness and to cover all descriptors necessary for the topic definition [21, 22]. The search strategy adheres to the aforementioned PICO descriptors, but for purposeful recall, the search parameters were modified to appropriately define the intervention of interest and to filter for studies including prediction analyses. Thus, four search parameters (domains) were set: “chronic pain”–“multidisciplinary rehabilitation”–“treatment outcome”–“prediction”, joined with the Boolean operator “AND”. For each parameter, proper and exhaustive terms were used, identified through the screening of search strategies from previous systematic reviews on similar topics [10–13, 23, 24], from Medical Subject Headings (MeSH) indexing and search terms of known, relevant primary studies, complemented with browsing of the thesauri of selected databases for additional controlled vocabulary. Validation procedures of sensitivity and specificity for each search parameter were performed to ensure comprehensiveness and relevance of the search strategy [25–27], which was subsequently adapted to every other reference database and peer-reviewed by the research librarian in the final stage.
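The logic of combining the four search domains can be illustrated with a minimal Python sketch. The term lists below are hypothetical and heavily abbreviated (the actual strategy is the one given in Additional file 2), and joining the terms within a domain with "OR" is an assumption based on common practice; the protocol itself only states that the four domains are joined with "AND".

```python
# Hypothetical, abbreviated term lists for illustration only.
search_domains = {
    "chronic pain": ["chronic pain", "fibromyalgia", "low back pain"],
    "multidisciplinary rehabilitation": ["multidisciplinary", "interdisciplinary", "multimodal"],
    "treatment outcome": ["treatment outcome", "return to work"],
    "prediction": ["predict*", "prognos*"],
}


def build_query(domains):
    """OR the terms within each domain (assumed), then AND the domains together."""
    blocks = []
    for terms in domains.values():
        # Quote multi-word phrases so they are searched as phrases.
        quoted = [f'"{term}"' if " " in term else term for term in terms]
        blocks.append("(" + " OR ".join(quoted) + ")")
    return " AND ".join(blocks)


query = build_query(search_domains)
print(query)
```

A record is retrieved only if it matches at least one term in every domain, which is what restricts the result set to prediction analyses of treatment outcome after multidisciplinary rehabilitation for chronic pain.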

Information sources

The six electronic databases MEDLINE and PsycINFO (via Ovid), EMBASE (via Elsevier), CINAHL (via EBSCO), Web of Science (via Thomson Reuters), and the Cochrane Central Register of Controlled Trials (CENTRAL) were searched in September 2015 to identify studies published from 1980 until that date. The search was later updated to include additional studies published up to April 2017. To maximize recall, the search was unrestricted except for two limits: publication language and publication date. Our search strategy for MEDLINE (Ovid) is presented in an additional file [see Additional file 2]. The sample search strategy was translated into database-specific syntax for the other databases used. The reference lists of included studies and of related review papers will also be examined for additional records.

Study selection

An interdisciplinary review team, consisting of six members with expertise in multidisciplinary rehabilitation and chronic pain, was compiled for the selection process. The process of decision-making for inclusion based on the eligibility criteria was first piloted on a small sample of articles to validate the criteria and interpretation of studies.

The selection process will be performed in four steps:

1. Screening of titles: removal of clearly unrelated records, one reviewer (ET).
2. Screening of abstracts: independent assessment by two reviewers (BMS and KB), any conflicts to be resolved by a third reviewer (ET).
3. Screening of full texts for PICO eligibility: two pairs of reviewers (BMS and ET, KB and PE) will each screen half of the articles. Each article will be independently appraised by the two reviewers it is assigned to, and disagreements will be resolved through discussions with the full review team.
4. Screening of full texts for relevance according to study objective: all remaining articles will be assessed once more by three senior reviewers (BMS, KB, BG) for fulfillment of relevance criteria according to an objective-compliant protocol, which has been developed by the authors (available from the corresponding author on request). One senior reviewer (BG) will examine all studies, while the other two (BMS and KB) will re-examine half of the studies each.

To date, we have finalized the selection procedure for the first database search (Sept 2015) and have proceeded to step 3, screening of full texts, with the additional records retrieved from the second database search (April 2017).

Data management

A PRISMA flow diagram [19] will be used to document the selection process, along with the reasons for exclusion (Fig. 1). We use EndNote reference software to organize, collate, and deduplicate search results. All records are saved in EndNote subfolders according to selection stage, for future reference. The online systematic review production software Covidence (www.covidence.org) will be used throughout the study selection procedure, archiving the full review process. Records are allocated to the reviewers by randomization, and inter-rater agreement throughout the review process will be evaluated using appropriate analyses (kappa coefficients).
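As an illustration of the agreement statistic, the following minimal Python sketch computes Cohen's kappa for two reviewers' screening decisions. The decisions are invented example data, not results from this review, and the unweighted two-rater form of kappa is an assumption; the protocol does not specify which kappa variant will be used.

```python
from collections import Counter


def cohens_kappa(ratings_a, ratings_b):
    """Unweighted Cohen's kappa for two raters judging the same records."""
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("Both raters must rate the same non-empty set of records.")
    n = len(ratings_a)
    # Observed proportion of records on which the raters agree.
    observed = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n
    # Agreement expected by chance, from each rater's marginal frequencies.
    freq_a, freq_b = Counter(ratings_a), Counter(ratings_b)
    categories = freq_a.keys() | freq_b.keys()
    expected = sum(freq_a[c] * freq_b[c] for c in categories) / n**2
    if expected == 1.0:
        return 1.0  # both raters used one identical category throughout
    return (observed - expected) / (1 - expected)


# Invented screening decisions for ten abstracts.
reviewer_1 = ["include", "exclude", "exclude", "include", "exclude",
              "exclude", "include", "exclude", "exclude", "exclude"]
reviewer_2 = ["include", "exclude", "include", "include", "exclude",
              "exclude", "exclude", "exclude", "exclude", "exclude"]

kappa = cohens_kappa(reviewer_1, reviewer_2)
print(f"kappa = {kappa:.2f}")
```

In this example the reviewers agree on 8 of 10 records (observed agreement 0.80), chance agreement is 0.58, and kappa is therefore (0.80 - 0.58) / (1 - 0.58), about 0.52.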

Fig. 1 PRISMA flowchart illustrating the study selection process and planned structure of quantitative synthesis. From Moher D, Liberati A, Tetzlaff J, Altman DG, and the PRISMA group [19]

Quality assessment

Articles deemed relevant from the full-text screening will be assessed for internal validity using the Quality In Prognostic Studies (QUIPS) tool [28]. QUIPS is designed to assess potential bias in prognostic factor studies which, preferably, use a prospective cohort design. The tool has been successfully used in several review projects with moderate to substantial inter-rater reliability. Risk of bias (RoB) will be evaluated within six domains: (1) study participation, (2) study attrition, (3) prognostic factor measurement, (4) outcome measurement, (5) study confounding, and (6) statistical analysis and reporting, each rated individually as low, moderate, or high RoB. As recommended [22, 28], summary scores for overall study quality are avoided; thus, all assessed features will be presented in a complete RoB table, and a RoB summary figure will be compiled for each outcome. RoB will be evaluated primarily at study level, while comments on outcome-specific RoB will be noted for further detailing during data synthesis and sensitivity analyses. The RoB will also be incorporated in the judgment about the quality of evidence in the summary of findings and, if possible, in sensitivity analyses. All articles will be assessed independently by a senior epidemiologist (WG) and one of two reviewers (ET, PE) in accordance with the randomization scheme. Consensus on the final rating is reached through discussion. The process was piloted a priori on a small sample of studies for inter-rater agreement.

Data synthesis

A digital coding protocol was created and pilot tested for data extraction on the first 10% of the included articles, before final revision. From each included study, data will be collected for six data domains: (1) participant and sample characteristics (including age, diagnosis, duration of pain, and inclusion and exclusion criteria), (2) characteristics of intervention (including type, professions and modalities involved, dose, duration, frequency, and setting), (3) investigated independent variables (= predictor/s) and how the information was collected, (4) outcome domains (dependent variables): pain, physical functioning, work ability, health-related quality of life, and how they were measured, (5) research design, length of follow-up, and percentage of loss to follow-up, and (6) statistical analyses and outcomes.

Data from the first third of included studies was extracted in duplicate, independently by two reviewers (WG, ET), and then compared for data accuracy and consensus on the coding procedure. The remaining studies will be coded independently for low inference data, whereas high inference data, on statistical outcomes and effect sizes, will be extracted jointly. If deemed relevant, clarification of reported data may be requested from investigators.

Currently, 69 articles are included in our study and the material encompasses in excess of 200 investigated predictive factors and their associations with our four predefined outcome domains. In these, preliminary data management indicated that work-related outcomes were investigated in the majority of articles while QoL was investigated in only a limited number: work (n = 47), physical function (n = 30), pain (n = 22), and QoL (n = 11). In order to competently sort and condense the material, our interdisciplinary review team will jointly perform a “consensus grouping process”; synonymous variables will be identified and labeled alike. Thereafter, all related predictive factors will be sorted into their pertinent domains, e.g., personal and psychosocial predictors, upon which coherent “predictor groups,” suitable for synthesis, will be identified, e.g., emotional factors and cognitive behavioral factors. Following this, predictors will be assembled into their related outcome domain, resulting in four outcome tables (pain, physical function, work, and QoL), which will constitute the basis for data synthesis.

Narrative synthesis

We will perform a narrative synthesis of all included studies in which the direction of the associations between predictors and outcomes will be presented as positive, negative, or absent in a tabular summary for each group of predictive factors. Data from both univariate and multivariate analysis models will be reported and included in the analyses. Depending on how data was presented in the original studies, results will, if necessary, be reversed to fit the chosen reporting direction of synthesis, i.e., for “good outcome” in every domain.
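The reversal of reporting direction can be sketched in Python for the case where a study reports an odds ratio (OR) for a poor outcome. Inverting an OR also swaps its confidence limits. The function names and the rule of classifying an association as "absent" when the 95% CI includes 1 are illustrative assumptions; the protocol does not define how the three direction categories are assigned.

```python
def reverse_odds_ratio(or_value, ci_low, ci_high):
    """Re-express an OR for the opposite outcome direction.

    Taking the reciprocal swaps the bounds: the new lower limit is
    1 / old upper limit, and the new upper limit is 1 / old lower limit.
    """
    if min(or_value, ci_low, ci_high) <= 0:
        raise ValueError("Odds ratios and their limits must be positive.")
    return 1 / or_value, 1 / ci_high, 1 / ci_low


def direction(ci_low, ci_high):
    """Assumed coding rule: an interval that includes 1 is coded as 'absent'."""
    if ci_low > 1:
        return "positive"
    if ci_high < 1:
        return "negative"
    return "absent"


# Invented example: OR 2.0 (95% CI 1.25 to 4.0) for NOT returning to work
# among participants with high baseline depression.
reversed_or = reverse_odds_ratio(2.0, 1.25, 4.0)
print(reversed_or, direction(reversed_or[1], reversed_or[2]))
```

Expressed for the "good outcome" (returning to work), the same finding becomes an OR of 0.5 (95% CI 0.25 to 0.8), i.e., a negative association between high baseline depression and a good outcome.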

Quantitative synthesis

If the data prove to be appropriate for quantitative synthesis, meta-analyses will be performed. For pooling of predictor data pertaining to each outcome, at least two studies must provide data on the same predictor group, provided the judgment of study heterogeneity permits relevant summaries. Outcome data will be transformed to effect sizes and then to odds ratios (OR) with corresponding 95% confidence intervals (95% CI), if needed via standardized mean differences and standardized regression coefficients, as suggested by Cooper (2009) [21] on research synthesis and meta-analysis. The strength of the relationships between identified predictors and corresponding outcomes will then be quantified using weighted pooled ORs in a random effects model for each outcome domain. A random effects model will be preferred for the statistical analysis since we expect substantial variability between studies related to both clinical and methodological diversity (heterogeneity). In the event of incomplete data for standardization, results will be reported in the narrative synthesis only. To avoid double counting [29], we plan to pool the data within studies that report several predictive factors belonging to the same predictor group, so that each study contributes only one estimate per predictor group. Statistical analyses will be conducted using the generic inverse-variance method in Review Manager software (RevMan, version 5.3. Copenhagen: The Nordic Cochrane Centre, The Cochrane Collaboration, 2014). When necessary, a web-based effect size calculator will be used for the purpose of computing and converting effect sizes [30].
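The conversion and pooling steps can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the analysis code of the review: the conversion from a standardized mean difference uses the common logistic approximation ln(OR) = d × π/√3, the between-study variance uses the DerSimonian-Laird estimator (the protocol specifies only a random effects model with the generic inverse-variance method), and all study values are invented.

```python
import math


def smd_to_log_or(d, var_d):
    """Convert a standardized mean difference to ln(OR) and its variance."""
    factor = math.pi / math.sqrt(3)
    return d * factor, var_d * factor**2


def log_or_from_ci(or_value, ci_low, ci_high):
    """Recover ln(OR) and its variance from a reported OR with 95% CI."""
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * 1.96)
    return math.log(or_value), se**2


def pool_random_effects(estimates):
    """Inverse-variance random-effects pooling of (ln OR, variance) pairs.

    Returns the pooled OR, its 95% CI limits, and tau^2
    (DerSimonian-Laird estimate, assumed here).
    """
    if len(estimates) < 2:
        raise ValueError("At least two studies are needed for pooling.")
    y = [est for est, _ in estimates]
    v = [var for _, var in estimates]
    w = [1 / var for var in v]
    fixed = sum(wi * yi for wi, yi in zip(w, y)) / sum(w)
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, y))
    c = sum(w) - sum(wi**2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - (len(y) - 1)) / c)
    # Random-effects weights add tau^2 to each study's variance.
    w_re = [1 / (var + tau2) for var in v]
    pooled = sum(wi * yi for wi, yi in zip(w_re, y)) / sum(w_re)
    se = math.sqrt(1 / sum(w_re))
    return (math.exp(pooled), math.exp(pooled - 1.96 * se),
            math.exp(pooled + 1.96 * se), tau2)


# Invented estimates for one predictor group and one outcome domain.
studies = [
    log_or_from_ci(1.8, 1.1, 2.9),   # study reporting an OR with 95% CI
    log_or_from_ci(1.3, 0.9, 1.9),   # study reporting an OR with 95% CI
    smd_to_log_or(0.40, 0.04),       # study reporting an SMD and its variance
]
pooled_or, low, high, tau2 = pool_random_effects(studies)
print(f"Pooled OR {pooled_or:.2f} (95% CI {low:.2f} to {high:.2f}), tau^2 = {tau2:.3f}")
```

Pooling is done on the log scale because ln(OR) is approximately normally distributed; the result is exponentiated only for reporting.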

Sensitivity analyses

Heterogeneity will be explored through sensitivity analyses that test the robustness of our results. Characteristics of studies that may be examined as potential causes of heterogeneity are pain diagnosis/duration, MDR-intervention profile/duration, RoB, and follow-up time. Subgroup analyses (e.g., by diagnosis group) will be performed if possible. Funnel plots will be used to assess potential publication bias and heterogeneity of the included studies. Heterogeneity across studies will be quantified by the inconsistency index (I²). A summary table reporting any sensitivity analyses will be presented in the final report.
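For reference, the inconsistency index is derived from Cochran's Q as max(0, (Q - df) / Q) × 100%, where df is the number of studies minus one. The Python sketch below shows the calculation on invented log-scale effect estimates; the function name is hypothetical.

```python
def i_squared(effects, variances):
    """Inconsistency index (percent) from effect estimates and their variances."""
    if len(effects) != len(variances) or len(effects) < 2:
        raise ValueError("Need matching lists with at least two studies.")
    weights = [1 / v for v in variances]
    pooled = sum(w * y for w, y in zip(weights, effects)) / sum(weights)
    # Cochran's Q: weighted squared deviations from the fixed-effect mean.
    q = sum(w * (y - pooled) ** 2 for w, y in zip(weights, effects))
    df = len(effects) - 1
    if q <= 0:
        return 0.0  # identical estimates: no observed inconsistency
    return max(0.0, (q - df) / q) * 100


# Invented example: two studies with ln(OR) of 0.0 and 1.0, each with variance 0.1.
print(f"I^2 = {i_squared([0.0, 1.0], [0.1, 0.1]):.0f}%")
```

Here Q = 5 with 1 degree of freedom, so the index is (5 - 1) / 5 = 80%, meaning that most of the observed variation would be attributed to between-study differences rather than chance.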

Confidence in evidence

The strength of the emerging evidence will be evaluated using the Grading of Recommendations Assessment, Development, and Evaluation (GRADE) method [31]. The five GRADE domains (risk of bias, imprecision, inconsistency, indirectness, and publication bias) address confidence in estimates of treatment effect. GRADE can also readily be applied to bodies of evidence estimating longitudinal risks or prognosis of future events [32]. Applying the GRADE domains to our systematic review of prognostic studies will hence provide a useful approach to estimating confidence in the body of evidence included in our review. Any amendments to the stated procedure will be updated in PROSPERO and discussed in the final manuscript.