Considerations of complexity in rating certainty of evidence in systematic reviews: a primer on using the GRADE approach in global health

Paul Montgomery; Ani Movsisyan; Sean P Grant; Geraldine Macdonald; Eva Annette Rehfuess

doi:10.1136/bmjgh-2018-000848

Paul Montgomery¹ (ORCID: http://orcid.org/0000-0001-8008-1370),
Ani Movsisyan²,
Sean P Grant³,
Geraldine Macdonald⁴,
Eva Annette Rehfuess⁵

¹School of Social Policy, University of Birmingham, Birmingham, UK
²Centre for Evidence-Based Intervention, Department of Social Policy and Intervention, University of Oxford, Oxford, UK
³Pardee RAND Graduate School, RAND Corporation, Santa Monica, California, USA
⁴School for Policy Studies, University of Bristol, Bristol, UK
⁵Institute for Medical Information Processing, Biometry and Epidemiology, Pettenkofer School of Public Health, Ludwig Maximilian University, Munich, Germany

Correspondence to Dr Paul Montgomery; p.x.montgomery{at}bham.ac.uk

Abstract

Public health interventions and health technologies are commonly described as ‘complex’, as they involve multiple interacting components and outcomes, and their effects are largely influenced by contextual interactions and system-level processes. Systematic reviewers and guideline developers evaluating the effects of these complex interventions and technologies report difficulties in using existing methods and frameworks, such as the Grading of Recommendations Assessment, Development and Evaluation (GRADE). As part of a special series of papers on implications of complexity in the WHO guideline development, this paper serves as a primer on how to consider sources of complexity when using the GRADE approach to rate certainty of evidence. Relevant sources of complexity in systematic reviews, health technology assessments and guidelines of public health are outlined and mapped onto the reported difficulties in rating the estimates of the effect of these interventions. Recommendations on how to address these difficulties are further outlined, and the need for an integrated use of GRADE from the beginning of the review or guideline development is emphasised. The content of this paper is informed by the existing GRADE guidance, an ongoing research project on considering sources of complexity when applying the GRADE approach to rate certainty of evidence in systematic reviews and the review authors’ own experiences with using GRADE.

systematic review
public health

This is an open access article distributed under the terms of the Creative Commons Attribution-Non commercial IGO License (CC BY 3.0 IGO), which permits use, distribution, and reproduction for non-commercial purposes in any medium, provided the original work is properly cited. In any reproduction of this article there should not be any suggestion that WHO or this article endorse any specific organization or products. The use of the WHO logo is not permitted. This notice should be preserved along with the article’s original URL.

https://doi.org/10.1136/bmjgh-2018-000848

Key questions

What is already known?

The Grading of Recommendations Assessment, Development and Evaluation (GRADE) approach is an internationally prominent system for rating certainty of the evidence in systematic reviews estimating intervention effects.
Researchers conducting systematic reviews on public health and health system interventions report difficulties in using the GRADE approach.

What are the new findings?

Applying a ‘complexity perspective’ can help identify aspects of using the GRADE approach that require particular consideration when rating certainty in the estimates from systematic reviews estimating the effects of global health interventions.
These aspects include: sources of complexity when framing the review questions, such as important dimensions of context and implementation and other potential mediators and moderators of effect; a choice of thresholds or ranges for certainty of evidence ratings that matches the needs of intended users of the review; assessment of evidence from non-randomised study designs; the criteria within each GRADE domain for rating certainty of evidence and coherence of evidence across the hypothesised causal pathway of the intervention.

What do the new findings imply?

This primer can help systematic reviewers, health technology assessors and guideline developers better assess evidence relating to complex interventions and systems, which could enhance the use of such evidence in global health policy and practice decisions.

Introduction

Systematic reviews on the effects of interventions in global health

Systematic reviews that estimate the effects of interventions can have a significant influence on subsequent decisions to either implement or disinvest in an intervention. In biomedicine, there are established methods for synthesising and rating certainty in the effects of medications and other single-component interventions.1 Researchers in public health and health technology assessment (HTA) report difficulties in using these methods,2 3 largely stemming from failure to account for ‘what interventions work, for whom, and under what circumstances’,4 to further inform development of context-specific recommendations.5 6

WHO is the leading institution for producing evidence-informed guidelines at a global level. WHO recommendations for practice and policy are underpinned by systematic reviews of evidence on the effects of interventions and health technologies and aim to follow a transparent and evidence-based process.7 Reviews and HTAs that inform recommendations often need to consider a range of populations, interventions with different implementation strategies, multiple health and non-health outcomes and various contextual factors that may interact and modify intervention effects.8 Given the pressing needs and limited resources in low-income and middle-income countries (LMICs), appropriate application of evidence synthesis on a broad range of health interventions is crucial for optimal decisions about implementation.9

Using a complexity perspective in reviews of intervention effects

Recently, there has been increased attention to evaluating effects of complex health interventions implemented within complex systems.10 This complexity creates challenges for guideline development and HTAs, from the conceptualisation of the questions asked to synthesising diverse types of evidence, assessing or rating the evidence and formulating recommendations. This paper is one of a series exploring the implications of complexity for systematic reviews, HTAs and guideline development.

Using a complexity perspective in systematic reviews of the effects of interventions can facilitate the more nuanced conceptualisation and assessment of interventions ultimately needed for health decision making.11 An earlier paper in this series differentiates between two main perspectives on sources of complexity in the evaluation of interventions.10 A ‘complex interventions perspective’ locates sources of complexity in the features of interventions themselves, such as interventions with different components addressing different and multiple causes of problems.12 A ‘complex systems perspective’, on the other hand, locates sources of complexity in the properties of systems into which interventions are introduced, such as how the intervention interacts with, and impacts on, the system as a whole.13 We use the overarching term ‘complexity perspective’ to encompass both perspectives and acknowledge the many sources of complexity.10 Depending on the priority questions of a review, an HTA, or a guideline, either perspective or a combination may be adopted.

In planning and undertaking systematic reviews, a ‘complex systems perspective’ necessarily entails broadening the scope of a review to include evidence on how the wider system changes when the intervention, such as a complex technology, is introduced. This may involve collecting qualitative evidence on social norms and the dynamics of social networks to describe the broader system impact of the intervention. Not all sources of complexity are relevant to every systematic review and HTA. Researchers should take a pragmatic approach that focuses on the key aspects of interventions, their causal pathways and the levels of target relevant to the specific aims of the review and users’ needs. There is a growing body of literature and guidance, which can be helpful in deciding on the important sources of complexity to consider in systematic reviews, HTAs and guidelines, including approaches described in earlier papers in this series, by Booth et al,14 Petticrew et al10 and Rehfuess et al15 (see box 1).

Box 1

Examples of guidance and tools for addressing sources of complexity

Context and Implementation of Complex Interventions framework63
Guidance on the integrated assessment of complex health technologies: the INTEGRATE-HTA model64
Intervention Complexity Assessment Tool for Systematic Reviews65
Preferred Reporting Items for Complex Interventions for Systematic Reviews and Meta-analyses66
Template for Intervention Description and Replication tool6 67

The GRADE approach to rating certainty of evidence

The GRADE Working Group has taken a leading role in developing guidance and methods for using research evidence to inform healthcare recommendations. Grading of Recommendations Assessment, Development and Evaluation (GRADE) offers an explicit and transparent system for rating certainty in the body of evidence underpinning conclusions in a systematic review, an HTA or a guideline (box 2). In GRADE, certainty of the effect estimate for each outcome is ultimately assigned one of four categories: high, moderate, low or very low. The GRADE approach has been widely adopted by systematic reviewers, authors of HTA and guideline developers in healthcare, including over 100 organisations worldwide.16 Among these, WHO uses GRADE to inform global health recommendations,16 and the Cochrane Collaboration mandates use of GRADE in Cochrane intervention reviews.17

Box 2

Summary of the GRADE process for rating the certainty of evidence for intervention effects

The Grading of Recommendations Assessment, Development and Evaluation (GRADE) process starts with an initial certainty rating based on the design of studies included in the body of evidence: if the body of evidence contributing to an outcome consists of randomised controlled trials, certainty is initially rated as ‘high’, whereas a body of evidence consisting of observational or non-randomised studies (NRSs) is initially rated as ‘low’.
The assessing team then uses five domains for potential downgrading of the initial certainty rating: study limitations, indirectness, inconsistency, imprecision and publication bias.
Next, the team assesses three further domains for potential upgrading of the initial certainty rating: magnitude of the effect, dose–response relationship in the effect and counteracting plausible residual bias or confounding.43 These upgrading domains are primarily relevant to NRSs (eg, cohort, before–after and interrupted time series).
Evidence Profiles and Summary of Findings tables are used to summarise the effect estimates and the certainty ratings for those estimates for each main outcome in the assessment.
The GRADE ratings are further used as one of the criteria in the Evidence to Decision frameworks to inform recommendations about implementing interventions in practice, where high-certainty evidence is more likely to result in a strong recommendation compared with low-certainty evidence.68
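The steps summarised in this box can be sketched in a few lines of Python. This is only an illustration of the bookkeeping, not a GRADE tool: the function name `rate_certainty`, the domain labels and the use of whole-number steps per domain are assumptions made for the example, and in practice each step reflects a reasoned and documented judgement rather than arithmetic.

```python
from typing import Dict, Optional

# The four GRADE certainty categories, ordered from lowest to highest.
LEVELS = ["very low", "low", "moderate", "high"]

# Domain labels follow the box above; the exact strings are an assumption.
DOWNGRADE_DOMAINS = {"study limitations", "indirectness", "inconsistency",
                     "imprecision", "publication bias"}
UPGRADE_DOMAINS = {"magnitude of effect", "dose-response",
                   "plausible residual confounding"}


def rate_certainty(randomised: bool,
                   downgrades: Optional[Dict[str, int]] = None,
                   upgrades: Optional[Dict[str, int]] = None) -> str:
    """Return a certainty category for one outcome.

    `downgrades` and `upgrades` map a domain to the number of levels the
    team decided to move (an assumed representation of their judgement).
    """
    downgrades = downgrades or {}
    upgrades = upgrades or {}
    unknown = (set(downgrades) - DOWNGRADE_DOMAINS) | (set(upgrades) - UPGRADE_DOMAINS)
    if unknown:
        raise ValueError(f"unknown domain(s): {sorted(unknown)}")

    # Initial rating: 'high' for randomised trials, 'low' for NRSs.
    start = LEVELS.index("high" if randomised else "low")
    score = start - sum(downgrades.values()) + sum(upgrades.values())
    # Keep the result on the four-category scale.
    return LEVELS[max(0, min(score, len(LEVELS) - 1))]


# A body of randomised trials downgraded one level for inconsistency.
print(rate_certainty(True, downgrades={"inconsistency": 1}))
```

Keeping the per-domain decisions in a dictionary mirrors the way an Evidence Profile records the reason for each change alongside the final rating.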

Despite its wide uptake in biomedicine, systematic reviewers and guideline developers report difficulties applying GRADE in reviews of broader health technologies, health system and public health interventions.2 3 These challenges are frequently attributed to the complexity of these interventions, often requiring sophisticated consideration and analysis.3 13 For example, high levels and various sources of heterogeneity in reviews of public health interventions often lead to challenges in deciding how and whether to downgrade for inconsistency.3 18 Another common challenge results from the difficulty, if not impossibility, of using RCTs to evaluate policy-level and health system interventions.2 3 With all types of non-RCT evidence starting off at a ‘low’ certainty level, public and global health researchers have voiced concerns that GRADE may inadvertently produce ratings that steer decision-makers away from implementing important system-level interventions.19–21 Concerns have also been raised on how to conceptualise the construct of ‘certainty’ in reviews of global health interventions, and consensus is currently lacking.3 Consequently, global health researchers could benefit from targeted guidance on how to rate certainty when encountering these challenges.2 3 9 21 22

Objectives

As part of the overall series intended to stimulate thinking about how methods for reviewing and assessing evidence in guideline development can be enhanced to take account of complexity, this paper clarifies how a complexity perspective may be applied when using the GRADE approach to rate certainty of evidence. The GRADE Working Group is actively working to advance the GRADE methodology for different applications, including for diagnostic tests, prognostic studies and qualitative evidence.16 23 This paper focuses on using the GRADE approach for rating certainty in the evidence from systematic reviews estimating the effects of complex interventions and technologies in global health.

Methods

This paper is largely informed by an ongoing, mixed-methods research project, GRADE Guidance for Complex Interventions, involving five key studies that follow an established methodology on developing guidance for health research.24 In Study 1, we investigated GRADE certainty ratings in 24 ‘complex’ and 16 ‘simple’ systematic reviews18 and obtained feedback from review authors on 19 of these reviews about their process of applying GRADE.2 In Study 2, we compared domains and criteria across GRADE and 16 other systems for rating certainty of evidence in health and social interventions.25 In Study 3, we interviewed 10 Cochrane review authors and 5 GRADE methodologists on their views about the challenges of, and suggestions for, using GRADE in specific systematic reviews incorporating various sources and degrees of intervention and system complexity (Movsisyan et al, forthcoming). In Study 4, we conducted an online modified-Delphi process to explore areas of agreement and disagreement among 116 stakeholders about the importance of specific domains and criteria for rating certainty in systematic reviews of complex interventions (Grant et al, forthcoming). In Study 5, we held a 3-day consensus meeting to discuss proposals for the content of the new GRADE guidance for complex interventions with 28 stakeholders, purposively invited from the Delphi process, representing a range of subject areas (Movsisyan et al, forthcoming). All studies were approved by the Departmental Research Ethics Committee at the Department of Social Policy and Intervention, University of Oxford (SPI_C1A_16_009). This project draws suggestions from several sources–including the existing GRADE guidelines and conceptual papers, previous work considering complexity in systematic reviews, HTAs and guideline development and consultation with relevant stakeholders–to advise how to apply the GRADE approach using a complexity perspective in the context of global health.

Results

Several aspects of using GRADE require particular consideration when using a complexity perspective in systematic reviews and HTAs on the effects of interventions in global health (see table 1). Particularly important is that authors consider GRADE from the outset of the review or HTA and not at the end when evidence has already been synthesised. In this way, the totality of the evidence will become an integral part of the assessment from its inception. As systematic reviews represent an important source of evidence and are integral to most HTAs and guidelines in global health, below we describe how specific constructs and domains of GRADE can be used in systematic reviews using a complexity perspective.

Table 1

Mapping the main sources of complexity onto difficulties in rating estimates of the effect of interventions (data taken from Movsisyan et al, 2016; Petticrew et al, 2013; Petticrew et al, 2019; Rehfuess and Akl, 2013)2 3 10 69

Considering complexity and GRADE when framing the question(s) and conducting the systematic review(s)

The certainty ratings at the final stage of the review are inextricably linked to the purpose and key questions established at the beginning of each complex intervention review. Similar to the approach described by Petticrew et al,10 reviewers and guideline developers should identify sources and degrees of complexity inherent in the interventions themselves, as well as in the systems in which they are implemented and which they are intended to influence (see table 1). As emphasised in the WHO-INTEGRATE framework (see Rehfuess et al15), incorporation of sources of complexity into the review and, ultimately, into the GRADE ratings should be considered at the earliest stages of the review process: explicitly addressing sources of complexity when formulating review questions14 15 and structuring the proposed GRADE Evidence Profiles and Summary of Findings (SoF) tables.26 Thinking through all relevant sources of heterogeneity at the beginning of the systematic review process will influence the types of data extracted and syntheses conducted. For instance, for a Cochrane review of environmental interventions to reduce the consumption of sugar-sweetened beverages and their adverse effects on health,27 reviewers developed a system-based logic model to guide data extraction, analysis and interpretation. The frameworks to extract and report relevant data were prespecified, as were internal and external sources of heterogeneity for subgroup analysis (box 3). Such an approach is essential for capturing heterogeneity in the methodology and Population, Intervention, Comparison, Outcome (PICO) elements, which would otherwise remain unexplained and almost inevitably lead to downgrading of evidence.

Box 3

Consideration of sources of complexity in a protocol for a Cochrane review of environmental interventions to reduce consumption of sugar-sweetened beverages (von Philipsborn et al)27

von Philipsborn et al (2016) developed a system-based logic model taking into account:

Beverage choices and diet-related health and non-health outcomes
Physiological and psychological mechanisms linking sugar-sweetened beverages with health outcomes at an individual level
Interventions aimed at policy (macro) and settings (meso) and interpersonal and intrapersonal factors (micro level)
Determinants of diet-related outcomes and related interventions

This logic model was used to guide data extraction, analysis and interpretation.

Tools used for data collection:

Template for Intervention Description and Replication framework to extract and report relevant data related to the intervention6
Context and Implementation of Complex Interventions framework for contextual data63

Internal and external sources of heterogeneity were predefined at multiple levels, such as at policy or setting level:

With or without behavioural cointerventions
Targeted at: sugar-sweetened beverages, sugar-sweetened milk, beverages with non-nutritive sweeteners or beverages without added sweeteners
Implemented in high-income, middle-income or low-income countries
Targeted at the general population or at disadvantaged populations

A frequent challenge for authors of intervention reviews in global health arises from posing broad review questions on bundles of conceptually similar interventions (often referred to as a ‘lumping’28 or a ‘holistic’29 approach) that may actually have very different characteristics.30 31 In these cases, authors should carefully consider upfront which sources of complexity are critical to include in their research questions, such as the active or ‘prototypical’ components of an intervention that are most likely to modify intervention effects.32

Logic models may be particularly helpful in depicting intervention components and identifying potential effect modifiers.10 11 32 33 A recent example involves a systematic review by Welch et al (2016) aiming to estimate the effects of deworming interventions on the developmental health and well-being outcomes of children in LMICs.34 While evidence for deworming programmes had been debated,35–37 review authors made considerable efforts to describe the complexities of the programme. Specifically, they developed a logic model at the outset of the review to elucidate the entire causal chain from worm infection to nutritional status and educational outcomes, how deworming–in combination with other strategies (such as hygiene promotion and sanitation)–intervenes in the pathway and which factors might be important in mediating or moderating the effects (such as poverty, prevalence and intensity of infection and spill-over effects).34 This complexity perspective allowed them to add important questions to their overall question of ‘what works’, such as the effects of deworming according to the prevalence of infections, as well as the synergistic effects of cointerventions and treatment externalities for untreated children. They then constructed three separate GRADE SoF tables to provide certainty ratings depending on the levels of endemicity for which different strategies of mass deworming were relevant.34

Defining the thresholds or the ranges for certainty of evidence ratings

The GRADE Working Group conceptualises ‘certainty of evidence’ as confidence that the true effect of an intervention lies on one side of a specified threshold or within a chosen range (see table 2).38 In general, depending on the purpose of the assessment (ie, whether the systematic review informs a guideline or not), certainty of evidence ratings are presented as ‘non-contextualised’, ‘partly contextualised’ and ‘fully contextualised’. Non-contextualised ratings are relevant for assessments conducted outside of a guideline (eg, Cochrane and Campbell reviews): in these circumstances, authors may prioritise the threshold of the null effect and conceptualise certainty of evidence as confidence that a non-null effect is present, that is to say, that the effect of one intervention differs from another. Alternatively, the range approach may be chosen, and certainty of evidence may be conceptualised as confidence that the effect lies within a given range (eg, a 95% CI or prediction interval). Finally, authors may instead choose a partly contextualised rating, setting thresholds of specified magnitudes of effect (eg, what may be considered as a trivial, small, moderate or large effect). Fully contextualised ratings are relevant when systematic reviews are conducted as part of a specific guideline development or decision-making process, which enables integration of other considerations relevant for a health decision. In this case, authors could rate the certainty that the effect lies above a threshold that makes implementation of the intervention worthwhile.38

Table 2

Approaches for setting thresholds or ranges for certainty of evidence ratings (adapted from Hultcrantz et al, 2017)38

The non-null effect is likely the simplest and most feasible threshold for rating certainty on the effects of public health interventions. Since these intervention effects may vary depending on implementation factors, context and settings, it may be very challenging for a review group to define specific magnitudes of effect for various outcomes that are practically important for all potential contexts of application. Rating certainty in the non-null effect would inform the broad global readership about the general direction (positive or negative) of an intervention effect. The task will then be left for the end-users of the evidence at the local level to further contextualise the evidence and set the corresponding thresholds informing implementation of the intervention depending on their specific circumstances. Importantly, contextualising the evidence usually involves a broad range of considerations and decisions not solely driven by evidence of intervention effectiveness.15

The choice of the thresholds or ranges will have implications for how the domains of the GRADE approach are applied in a given review or guideline. For example, the criteria of imprecision and inconsistency are only marginally relevant when assessing certainty in the non-null effect (as long as there is consistency in the direction of effect across studies), but are highly relevant when assessing certainty in a specified magnitude of effect. It is therefore critical for systematic reviewers and guideline developers to make their choice explicit and to carry it through into rating the evidence (see table 3 for further details).

Table 3

Key considerations for rating certainty in systematic reviews on the effects of complex interventions

Initial certainty rating based on study design

Following definitions of review scope and thresholds or ranges for rating the certainty of evidence, the initial rating is based on study design. Given the practical impossibility of conducting RCTs for many complex interventions, global health researchers have struggled with the convention that a body of RCTs is initially rated as ‘high’ certainty in the GRADE approach and a body of NRSs as ‘low’ certainty.2 3 39 Moreover, there are clear differences in one’s ability to draw causal inferences about intervention effects from a cross-sectional study, a very weak design for that aim, compared with a much stronger design, such as a controlled interrupted time series; however, both study designs would start off as ‘low’ certainty in GRADE.

Partly in response to these concerns, the GRADE Working Group has recently suggested a second approach, in which any body of evidence may receive a ‘high’ initial certainty rating provided that a rigorous tool has been used to assess risk of bias.40 The new risk of bias in non-randomised studies (ROBINS-I) tool for cohort-type studies is the only tool recognised by the GRADE Working Group as meeting this standard for NRSs.40 41 Compared with other tools, ROBINS-I offers a rigorous process for assessing risk of bias through seven distinct domains and 34 signalling questions overall, but requires both significant human resources and substantial epidemiological expertise.41 42 This nuanced assessment of risk of bias in ROBINS-I, specifically regarding the domains of confounding and selection bias, allows for RCTs and NRSs to be placed on a common metric for risk of bias. While this approach enables one to initially rate evidence from NRS designs as ‘high’ certainty, it is expected that the certainty of evidence will eventually be downgraded for risk of bias, so that the final rating is the same no matter the starting point (ie, whether a body of evidence from NRSs was initially rated as ‘low’ certainty and subsequently rated up or down or whether it was initially rated as ‘high’ certainty and further downgraded). It is important to note that the current version of ROBINS-I is primarily designed for cohort studies. Although future initiatives may develop extensions to ROBINS-I for other types of NRS designs, following the original GRADE guidance for initial rating of evidence based on study design, a body of evidence comprising NRSs other than cohort studies should be initially rated as ‘low’ certainty.40 43 Authors should then explain their decisions for further downgrading or upgrading of the certainty of evidence (eg, subsequently downgrading the certainty of a body of evidence from cross-sectional studies for additional concerns over the risk of bias).
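The two starting-point conventions described above can be written down as a small decision rule. The sketch below is illustrative only: the function name `initial_certainty` and the design labels (`"rct"`, `"cohort"` and so on) are hypothetical, and the rule simply restates the text, namely that only cohort-type NRSs assessed with ROBINS-I may start at ‘high’.

```python
def initial_certainty(design: str, assessed_with_robins_i: bool = False) -> str:
    """Return the initial GRADE certainty rating for a body of evidence.

    `design` is a free-text label chosen for this example (an assumption),
    eg "rct", "cohort", "cross-sectional", "interrupted time series".
    """
    design = design.strip().lower()
    if design == "rct":
        return "high"
    # Second approach: a rigorous risk of bias tool allows a 'high' start.
    # Per the text, ROBINS-I currently covers cohort-type studies only.
    if design == "cohort" and assessed_with_robins_i:
        return "high"
    # Original convention for all other non-randomised designs.
    return "low"


print(initial_certainty("cohort", assessed_with_robins_i=True))
print(initial_certainty("cross-sectional"))
```

Whichever starting point is used, the text above expects the final rating to converge once risk of bias has been judged, so the rule affects the path of the assessment rather than its result.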

Applying GRADE domains for rating certainty

We further highlight how reviewers and guideline developers in global health may address sources of complexity when making judgements on specific domains of the GRADE approach.

Risk of bias

Handling performance bias in certainty ratings has proven challenging in reviews of interventions, where it is often impossible to blind participants and/or providers.44 45 A common source of complexity is the contingency of intervention effects on recipients’ and providers’ agency.46 The challenge therefore is to assess if the lack of blinding introduces a risk of bias that implies reduced confidence in the effect estimates. To do so, review authors should be careful to differentiate between ‘lack of blinding’ and the judgement for the potential of ‘performance bias’ typically associated with the lack of blinding.47 Lack of blinding does not always cause sufficient bias to warrant downgrading for risk of bias. Indeed, lack of blinding may be an essential aspect of the intervention of interest, particularly when knowledge of the presence of the intervention is an important aspect of its effectiveness, as in a traffic safety enforcement campaign. In these circumstances, other considerations become even more important, such as blinding of outcome assessors or the nature of the comparator.48 For example, for the outcome ‘quality of life’, authors conducting a review on rehabilitation for chronic obstructive pulmonary disease did not downgrade for lack of blinding of provider and participants because they judged the procedures used in included studies to blind outcome assessors sufficient to address any concerns about risk of bias.49 Furthermore, the potential for bias due to lack of blinding will also depend on the PICO question of the review.48 Subjective outcomes are more prone to bias than objectively measured outcomes, such as all-cause mortality. Lack of blinding would be a more important source of bias when the comparator in the review is ‘usual care’ than when it is an active intervention, such as when comparing two different educational interventions.

It is worth noting that assessment of performance bias has been revised in the new Cochrane tool to assess risk of bias in randomised trials (RoB V.2.0)42 and in ROBINS-I.41 In this revised version of the RoB tool, performance bias is assessed under the domain of ‘bias due to deviations from intended interventions’, which allows for assessing two different aims of the trial: either the effect of assignment to intervention or the effect of starting and adhering to intervention. When interest is in the effect of assignment to intervention (also known as ‘treatment offer’), lack of blinding of intervention recipients and providers may not warrant downgrading certainty of evidence, as the deviations from the intended intervention should not create a prognostic imbalance between the trial groups. However, in relation to starting and adhering to the intervention, deviations, such as poor adherence, poor implementation and cointerventions, may lead to risk of bias.42 Many interventions in global health, such as educational and behaviour change interventions, require tailoring to specific contexts. Accordingly, authors should exercise judgement on the level of differences in intervention implementation that are beyond what would otherwise be expected in a real-world context.

Inconsistency

Interventions examined in global health reviews often vary in how they are implemented in different contexts and in outcome measures used across settings.3 18 Consequently, reviews often find considerable heterogeneity in effect estimates. The proper consideration of sources of complexity when framing the review questions can facilitate assessments of whether to downgrade for inconsistency at this later stage of the review. Namely, review authors can group and synthesise the included studies according to the nature of relevant sources of complexity and, if these sources of complexity help explain heterogeneity, provide separate certainty ratings for each of these groupings.32 50
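As an illustration of grouping studies by a prespecified source of complexity before rating each grouping separately, the following minimal sketch partitions study records by one moderator. The study identifiers, field names and values are entirely hypothetical and serve only to show the data handling; the moderators echo those listed in box 3.

```python
from collections import defaultdict


def group_studies(studies, moderator):
    """Group study identifiers by the value of one prespecified moderator."""
    groups = defaultdict(list)
    for study in studies:
        groups[study[moderator]].append(study["id"])
    return dict(groups)


# Hypothetical study records; none of these correspond to real studies.
studies = [
    {"id": "Study A", "income_level": "high", "cointervention": True},
    {"id": "Study B", "income_level": "low", "cointervention": False},
    {"id": "Study C", "income_level": "high", "cointervention": False},
]

by_income = group_studies(studies, "income_level")
print(by_income)
```

Each resulting group could then be synthesised and given its own certainty rating, provided the grouping variable was specified in the protocol rather than chosen after seeing the results.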

Judgements of inconsistency in the magnitude or direction of effects should correspond with the chosen threshold or range that the review team adopts for rating certainty of evidence. For example, if the review team chooses the null effect as a threshold for rating certainty in the estimate of effect, then judging inconsistency in the direction of effect (ie, beneficial or harmful) would be a relevant approach to follow. In this case, variation in point estimates and statistically significant heterogeneity may not warrant downgrading for inconsistency if the effects across studies are consistently in the same direction with respect to the null effect.50 51 However, if reviewers are rating their certainty in whether the average effect lies within an estimated range, such as within the CIs, then authors should consider multiple criteria for inconsistency (eg, overlap of CIs, degree of variation with respect to chosen thresholds and I² and its p value), rather than using only a single statistical measure of heterogeneity.50
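The criteria named above can be computed side by side before a judgement is made. The following is a minimal sketch in Python, not part of the GRADE guidance: the function name `inconsistency_summary`, the fixed-effect inverse-variance weighting and the 1.96 multiplier for 95% CIs are illustrative assumptions. The p value for heterogeneity is left out because it needs a chi-square distribution that the Python standard library does not provide.

```python
def inconsistency_summary(effects, std_errors, null_value=0.0):
    """Summarise heterogeneity for study-level effect estimates.

    effects and std_errors are hypothetical inputs on a scale where
    null_value means 'no effect' (e.g. 0 for a mean difference).
    """
    # Inverse-variance weights (fixed-effect pooling, assumed for illustration)
    weights = [1.0 / se ** 2 for se in std_errors]
    pooled = sum(w * y for w, y in zip(weights, effects)) / sum(weights)

    # Cochran's Q and the I-squared statistic (percentage scale)
    q = sum(w * (y - pooled) ** 2 for w, y in zip(weights, effects))
    df = len(effects) - 1
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0

    # Direction of every point estimate relative to the null effect
    same_direction = (all(y > null_value for y in effects)
                      or all(y < null_value for y in effects))

    # All 95% CIs share a common region iff the largest lower bound
    # does not exceed the smallest upper bound
    lowers = [y - 1.96 * se for y, se in zip(effects, std_errors)]
    uppers = [y + 1.96 * se for y, se in zip(effects, std_errors)]
    cis_overlap = max(lowers) <= min(uppers)

    return {"pooled": pooled, "q": q, "i2": i2,
            "same_direction": same_direction, "cis_overlap": cis_overlap}


if __name__ == "__main__":
    print(inconsistency_summary([0.2, 0.5, 0.9], [0.1, 0.1, 0.1]))
```

With the null effect as the threshold, a result in which `same_direction` is true may not warrant downgrading even when `i2` is high. When certainty is rated for a range, `cis_overlap` and `i2` would be weighed together rather than relying on either alone.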

Imprecision

Judgements for imprecision are contingent on reviewers’ chosen thresholds for rating certainty. If the chosen threshold is the null effect, then imprecision will not warrant downgrading the evidence unless the confidence or prediction interval includes the null effect, in which case the evidence is either (1) imprecise (due to a small number of events or participants) or (2) precise, with the intervention having no effect relative to the comparator.38 For the latter, the confidence or prediction interval needs to be sufficiently narrow around the null effect to exclude a ‘meaningful’ effect established a priori.38 Authors rating certainty that the effect lies within the 95% CI may omit a precision assessment, provided they state this clearly.38 This approach, however, has not yet been used in any review and needs further testing on examples using complex health interventions (Montgomery et al, forthcoming).
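The decision logic for the null-effect threshold can be written out as a short sketch. This Python example is illustrative only: the function name `classify_precision`, the returned labels and the use of a symmetric margin around the null to represent the a priori ‘meaningful’ effect are assumptions made here, not rules taken from the GRADE guidance.

```python
def classify_precision(ci_low, ci_high, meaningful_effect, null_value=0.0):
    """Classify an interval when the null effect is the chosen threshold.

    meaningful_effect is the smallest effect judged meaningful a priori,
    treated here (an assumption) as a symmetric margin around the null.
    """
    if ci_low > ci_high:
        raise ValueError("ci_low must not exceed ci_high")

    # Interval excludes the null: imprecision is not a concern
    if not (ci_low <= null_value <= ci_high):
        return "not_imprecise"

    # Interval includes the null: it is precise only if it is narrow
    # enough to rule out a meaningful effect in either direction
    margin = abs(meaningful_effect)
    if null_value - margin < ci_low and ci_high < null_value + margin:
        return "precise_no_effect"

    return "imprecise"


if __name__ == "__main__":
    print(classify_precision(0.10, 0.50, meaningful_effect=0.2))
    print(classify_precision(-0.05, 0.05, meaningful_effect=0.2))
    print(classify_precision(-0.30, 0.40, meaningful_effect=0.2))
```

Only the last label would point towards downgrading for imprecision. The middle label corresponds to case (2) in the text, where the evidence is precise and indicates no effect relative to the comparator.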

Indirectness

Many of the reported challenges of judging indirectness can be addressed by specifying appropriate review questions. Important differences in the PICO elements beyond what is expected and specified in the review may weaken inferences regarding the directness of evidence. As outlined in the original GRADE guidance on indirectness, “it is however rare and usually unnecessary, for the intended populations and interventions to be identical to those in the studies, and evidence should be downgraded only if the differences are considered sufficient to make a difference in the outcome likely”.52

Another potential challenge relates to incompleteness of available evidence with regard to the review question. As questions for the global health audience might necessarily be broad (ie, often follow a ‘lumping’ approach), the available evidence might not address all elements in the PICO framework (eg, while the question may ask for evidence in both LMICs and high-income countries, the evidence may only be available for high-income countries). If authors suspect major differences in effect across the locations, rather than downgrading all evidence for indirectness, an alternative approach is to split the question to be able to provide direct evidence for a subset of conditions (eg, make separate certainty of evidence ratings for LMICs and high-income countries). In this case, authors may report a lack of evidence for the remaining subset of conditions (eg, LMICs) or extrapolate based on available data. In the latter case, however, reviewers may need to downgrade evidence for indirectness. Again, it is highly recommended that reviewers think about the factors that may modify intervention effects at the beginning of the review process, when scoping the review and formulating specific questions.

Publication bias

Many evaluations of global health interventions are published as reports, working papers or programme evaluations. If review authors suspect that eligible studies are likely published in this format (rather than in indexed scientific literature), a comprehensive multicomponent search that includes grey literature and contacting of experts is critical. In addition, authors should assess whether a substantial number of studies are sponsored by industries with vested interests (eg, intervention developers and representatives from industries benefiting from the status quo) or run by researchers with a potential ‘allegiance bias’, either of which may warrant downgrading for publication bias.53

Upgrading

The criteria for upgrading evidence of complex interventions should follow the guidance of the GRADE Working Group,54 including guidance on upgrading when all types of study design are initially rated at ‘high’ certainty.40 In line with the GRADE guidance, upgrading criteria commonly apply when there are no major limitations in the body of evidence (such as risk of bias, inconsistency or imprecision).54 One special case for complex interventions involves intervention fidelity: authors may upgrade their certainty rating if (1) larger effects are found in studies with better implementation (criterion of dose–response effect) or (2) positive results are found among studies with low implementation fidelity (counteracting plausible residual confounding).
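The bookkeeping behind a final rating can be sketched as moving up and down the four GRADE certainty levels. The Python example below is a simplified illustration: the function name `rate_certainty` is hypothetical, and treating ‘upgrading commonly applies when there are no major limitations’ as a strict rule is an assumption made for the sketch. In practice each step is a judgement rather than arithmetic.

```python
LEVELS = ["very low", "low", "moderate", "high"]


def rate_certainty(start, downgrades, upgrades=0):
    """Return a certainty level after applying downgrades and upgrades.

    start      -- starting level, one of LEVELS
    downgrades -- dict mapping a GRADE domain to levels removed (0, 1 or 2)
    upgrades   -- levels added for upgrading criteria (eg, dose-response)
    """
    total_down = sum(downgrades.values())

    # Simplifying assumption: upgrade only when no domain was downgraded
    applied_up = upgrades if total_down == 0 else 0

    index = LEVELS.index(start) - total_down + applied_up
    index = max(0, min(index, len(LEVELS) - 1))  # stay within the scale
    return LEVELS[index]


if __name__ == "__main__":
    print(rate_certainty("high", {"risk of bias": 1, "inconsistency": 1}))
    print(rate_certainty("low", {}, upgrades=1))
```

For the fidelity case described above, a dose–response pattern across implementation quality would be entered as `upgrades=1`, and it would only take effect in this sketch if no domain had been downgraded.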

Future work for rating certainty in reviews using a complexity perspective

Based on our project findings, we suggest several areas of future work. First, many interventions and complex technologies may have long and variable causal pathways. Our consultation with stakeholders suggests strong interest in developing a robust domain for rating certainty that is based on the ‘coherence of the causal pathway’ or ‘chain of evidence’.55 56 A similar approach is currently used by the US Preventive Services Task Force to describe different links in the causal chain of an intervention and inform what types of evidence should be searched for and synthesised (figure 1).57 If review authors manage to populate different links in the causal chain of an intervention with rigorous evidence (eg, links 4 and 7 in figure 1), then this may increase their certainty in its effects on distal outcomes (eg, link 5 in figure 1). As discussed in the previous example on deworming interventions, logic models (also known as analytical frameworks) that visually depict the links in these causal pathways can be useful in identifying the important items of evidence that should be searched for and synthesised in a systematic review.33 Authors could revisit their initial logic models at predefined stages of the review process, in particular at the end of the review, using the evidence collected and synthesised for each individual link in the pathway to assess coherence in the causal pathway originally proposed.58 This chain of evidence approach might be particularly informative in circumstances where direct evidence linking the intervention with the distal outcomes is unavailable. Work is ongoing both within the GRADE Working Group and beyond, for example, on how to conduct model-driven synthesis of evidence.59

Figure 1

Example chain of evidence approach: screening and interventions for overweight in childhood. Arrow 1: Is there direct evidence that screening (and intervention) for overweight in childhood improves age-appropriate behavioural or physiological measures or health outcomes? Arrow 2: (1) What are appropriate standards for overweight in childhood, and what is prevalence of overweight based on these? (2) What clinical screening tests for overweight in childhood are reliable and valid in predicting obesity in childhood? (3) What clinical screening tests for overweight in childhood are reliable and valid in predicting poor health outcomes in adulthood? Arrow 3: What are the adverse effects of screening, including labelling? Is screening acceptable to patients? Arrow 4: (1) Do weight control interventions lead to improved intermediate outcomes? (2) What are common behavioural and health system elements of efficacious interventions? (3) Are there differences in efficacy between patient subgroups? Arrow 5: Do weight control interventions lead to improved health outcome and/or improved functioning? Arrow 6: What are the adverse effects of interventions? Are interventions acceptable to patients? Arrow 7: Are improvements in intermediate outcomes associated with improved health outcomes? (Only evaluated if there is no direct evidence for link 1 or link 5 and if there is sufficient evidence for link 4). BMI, body mass index. Taken from Whitlock et al, 2005.57
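The conditional logic in the caption above can be expressed as a small sketch. The Python example below is illustrative only: the link labels paraphrase the caption, the function names are hypothetical, and reading ‘no direct evidence for link 1 or link 5’ as ‘neither link has direct evidence’ is an assumption.

```python
# Links of the analytic framework, numbered as in the figure caption
LINKS = {
    1: "screening (and intervention) -> outcomes, direct evidence",
    2: "standards and validity of screening tests",
    3: "adverse effects and acceptability of screening",
    4: "interventions -> intermediate outcomes",
    5: "interventions -> health outcomes or functioning",
    6: "adverse effects and acceptability of interventions",
    7: "intermediate outcomes -> health outcomes",
}


def should_evaluate_link7(evidence):
    """evidence maps a link number to True when sufficient evidence exists."""
    no_direct = not evidence.get(1, False) and not evidence.get(5, False)
    return no_direct and evidence.get(4, False)


def indirect_support_for_link5(evidence):
    """Links 4 and 7 together form an indirect chain towards link 5."""
    return evidence.get(4, False) and evidence.get(7, False)


if __name__ == "__main__":
    found = {1: False, 4: True, 5: False, 7: True}
    print(should_evaluate_link7(found))
    print(indirect_support_for_link5(found))
```

A review team could fill such a mapping in as evidence for each link is synthesised and revisit it at the end of the review when assessing coherence across the pathway.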

Second, several stakeholders are interested in more systematically examining whether there are specific NRS designs that are consistently sufficiently robust to start as ‘moderate’ rather than ‘low’ certainty in the traditional GRADE approach. Such an approach would allow for a ‘quick and dirty’ sorting of study designs, where stronger NRS designs would start off as ‘moderate’ certainty and weaker designs as ‘low’ certainty. Advantages would be a quicker and more user-friendly distinction between different levels of certainty, which is likely to be applicable by many Cochrane and non-Cochrane systematic reviewers; the initial rating could then be refined by risk of bias assessments, but would be less dependent on the use of a very sophisticated risk of bias tool, such as the ROBINS-I tool. The significant challenge, however, relates to which study designs and associated features merit starting in one category versus the other.60 61 Based on the results of our Delphi process and the consensus meeting, this option was generally supported by those concerned with inconsistent implementation of NRS risk of bias tools leading to overestimation of certainty, especially given the novelty of the tools and varying levels of expertise in using them.62 However, we do not recommend this approach until future research conclusively establishes such a set of acceptable NRS designs.

Conclusion

This primer provides a concise discussion of how to incorporate a complexity perspective when applying the GRADE approach in systematic reviews estimating the effects of interventions in global health. Key considerations include: sources of complexity when framing the review questions, such as important dimensions of context and implementation and other potential mediators and moderators of effect; the choice of a threshold or a range that matches the needs of intended users of the review; assessment of evidence from NRS designs; and the criteria within each GRADE domain for rating certainty (see table 3). Suggested future work involves investigating the feasibility of (1) a domain on the coherence of evidence across the hypothesised causal pathway of an intervention, which may not need to be integrated into the GRADE ratings as it will apply at a higher, systematic review level beyond assessment of certainty of evidence in specific outcomes and (2) the identification of specific NRS designs that could start as ‘moderate’ rather than ‘low’ certainty. Researchers, including systematic reviewers, authors of HTA and guideline developers in global health, should continue to report their experience using GRADE and this primer in reviews aiming to address sources of complexity. More examples of using GRADE are particularly needed for social interventions, interventions in LMIC contexts, assessments where meta-analysis may not be possible, bodies of evidence with rigorous NRSs (as well as mixed bodies of evidence), and by researchers with varying degrees of experience in systematic reviewing and evidence assessment.

There is wide interest in finding ways to assess which interventions are effective and, equally, which ones are relevant and appropriate in diverse contexts. Using a complexity perspective can contribute to this. This discussion on considering the ‘complexity perspective’ in GRADE ratings provides a primer for systematic reviewers, authors of HTA and guideline developers to better assess evidence relating to complex interventions and systems, which could ultimately enhance the use of such evidence in global health policy and practice decisions.

Acknowledgments

The authors would like to acknowledge the contribution of WHO and in particular the members of the group working on this linked series of papers. We also want to thank Elaine Kidney for her editing assistance.

References

1. Guyatt GH, Oxman AD, Vist GE, et al. GRADE: an emerging consensus on rating quality of evidence and strength of recommendations. BMJ 2008;336:924–6. doi:10.1136/bmj.39489.470347.AD
2. Movsisyan A, Melendez-Torres GJ, Montgomery P. Users identified challenges in applying GRADE to complex interventions and suggested an extension to GRADE. J Clin Epidemiol 2016;70:191–9. doi:10.1016/j.jclinepi.2015.09.010
3. Rehfuess EA, Akl EA. Current experience with applying the GRADE approach to public health interventions: an empirical study. BMC Public Health 2013;13:9. doi:10.1186/1471-2458-13-9
4. Petticrew M. Time to rethink the systematic review catechism? Moving from ‘what works’ to ‘what happens’. Syst Rev 2015;4:36. doi:10.1186/s13643-015-0027-1
5. Koplan JP, Bond TC, Merson MH, et al. Towards a common definition of global health. Lancet 2009;373:1993–5. doi:10.1016/S0140-6736(09)60332-9
6. Hoffmann TC, Glasziou PP, Boutron I, et al. Better reporting of interventions: template for intervention description and replication (TIDieR) checklist and guide. BMJ 2014;348:g1687. doi:10.1136/bmj.g1687
7. WHO. World Health Organization handbook for guideline development. 2nd ed. Geneva, Switzerland: WHO Press, 2014.
8. Doyle J, Waters E, Yach D, et al. Global priority setting for Cochrane systematic reviews of health promotion and public health research. J Epidemiol Community Health 2005;59:193–7. doi:10.1136/jech.2003.019547
9. Peters DH, Bennett S. Better guidance is welcome, but without blinders. PLoS Med 2012;9:e1001188. doi:10.1371/journal.pmed.1001188
10. Petticrew M, Knai C, Thomas J, et al. Implications of a complexity perspective for systematic reviews and guideline development in health decision making. BMJ Glob Health 2018;0:e000899.
11. Anderson LM, Petticrew M, Chandler J, et al. Introducing a series of methodological articles on considering complexity in systematic reviews of interventions. J Clin Epidemiol 2013;66:1205–8. doi:10.1016/j.jclinepi.2013.07.005
12. Craig P, Dieppe P, Macintyre S, et al. Developing and evaluating complex interventions: new guidance. Medical Research Council (MRC), 2008.
13. Hawe P, Shiell A, Riley T. Theorising interventions as events in systems. Am J Community Psychol 2009;43(3-4):267–76. doi:10.1007/s10464-009-9229-9
14. Booth A, Moore G, Flemming K, et al. Taking account of context in systematic reviews and guidelines considering a complexity perspective. BMJ Glob Health 2019;0:e000840. doi:10.1136/bmjgh-2018-000840
15. Rehfuess EA, Stratil JM, Scheel IB, et al. The WHO-INTEGRATE evidence to decision framework version 1.0: integrating WHO norms and values and a complexity perspective. BMJ Glob Health 2019;4:e000844. doi:10.1136/bmjgh-2018-000844
16. Grading of Recommendations Assessment, Development and Evaluation (GRADE) Working Group. 2017. http://gradeworkinggroup.org/
17. Higgins J, Lasserson T, Chandler J, et al. Methodological expectations of Cochrane interventions reviews. London: Cochrane, 2016.
18. Movsisyan A, Melendez-Torres GJ, Montgomery P. Outcomes in systematic reviews of complex interventions never reached "high" GRADE ratings when compared with those of simple interventions. J Clin Epidemiol 2016;78:22–33. doi:10.1016/j.jclinepi.2016.03.014
19. Ogilvie D, Egan M, Hamilton V, et al. Systematic reviews of health effects of social interventions: 2. Best available evidence: how low should you go? J Epidemiol Community Health 2005;59:886–92.
20. Parkhurst JO, Abeysinghe S. What constitutes ‘good’ evidence for public health and social policy making? From hierarchies to appropriateness. SERRC 2014;3:34–46.
21. Nasser SM, Cooke G, Kranzer K, et al. Strength of recommendations in WHO guidelines using GRADE was associated with uptake in national policy. J Clin Epidemiol 2015;68:703–7. doi:10.1016/j.jclinepi.2014.11.006
22. Lewin S, Bosch-Capblanch X, Oliver S, et al. Guidance for evidence-informed policies about health systems: assessing how much confidence to place in the research evidence. PLoS Med 2012;9:e1001187. doi:10.1371/journal.pmed.1001187
23. Lewin S, Booth A, Glenton C, et al. Applying GRADE-CERQual to qualitative evidence synthesis findings: introduction to the series. Implement Sci 2018;13(Suppl 1):2. doi:10.1186/s13012-017-0688-3
24. Moher D, Schulz KF, Simera I, et al. Guidance for developers of health research reporting guidelines. PLoS Med 2010;7:e1000217. doi:10.1371/journal.pmed.1000217
25. Movsisyan A, Dennis J, Rehfuess E, et al. Rating the quality of a body of evidence on the effectiveness of health and social interventions: A systematic review and mapping of evidence domains. Res Synth Methods 2018;9:224–42. doi:10.1002/jrsm.1290
26. Murad MH, Mustafa RA, Schünemann HJ, et al. Rating the certainty in evidence in the absence of a single estimate of effect. Evid Based Med 2017;22:85–7. doi:10.1136/ebmed-2017-110668
27. von Philipsborn P, Stratil JM, Burns J, et al. Environmental interventions to reduce the consumption of sugar-sweetened beverages and their effects on health. Cochrane Database Syst Rev 2016;135. doi:10.1002/14651858.CD012292
28. Caldwell DM, Welton NJ. Approaches for synthesising complex mental health interventions in meta-analysis. Evid Based Ment Health 2016;19:16–21. doi:10.1136/eb-2015-102275
29. Guise JM, Chang C, Viswanathan M, et al. Agency for Healthcare Research and Quality Evidence-based Practice Center methods for systematically reviewing complex multicomponent health care interventions. J Clin Epidemiol 2014;67:1181–91. doi:10.1016/j.jclinepi.2014.06.010
30. Turley R, Saith R, Bhan N, et al. Slum upgrading strategies involving physical environment and infrastructure interventions and their effects on health and socio-economic outcomes. Cochrane Database Syst Rev 2013;1:CD010067. doi:10.1002/14651858.CD010067.pub2
31. Weir MC, Grimshaw JM, Mayhew A, et al. Decisions about lumping vs. splitting of the scope of systematic reviews of complex interventions are not well justified: a case study in systematic reviews of health care professional reminders. J Clin Epidemiol 2012;65:756–63. doi:10.1016/j.jclinepi.2011.12.012
32. Squires JE, Valentine JC, Grimshaw JM. Systematic reviews of complex interventions: framing the review question. J Clin Epidemiol 2013;66:1215–22. doi:10.1016/j.jclinepi.2013.05.013
33. Rohwer A, Pfadenhauer L, Burns J, et al. Series: Clinical Epidemiology in South Africa. Paper 3: Logic models help make sense of complexity in systematic reviews and health technology assessments. J Clin Epidemiol 2017;83:37–47. doi:10.1016/j.jclinepi.2016.06.012
34. Welch VA, Ghogomu E, Hossain A, et al. Deworming and adjuvant interventions for improving the developmental health and well-being of children in low- and middle-income countries: a systematic review and network meta-analysis. Campbell Syst Rev 2016;7.
35. Hilton J, Tovey D. Debating the evidence for deworming programmes. Cochrane Database Syst Rev 2012;8:ED000045. doi:10.1002/14651858.ED000045
36. Andrews JR, Bogoch II, Utzinger J. The benefits of mass deworming on health outcomes: new evidence synthesis, the debate persists. Lancet Glob Health 2017;5:e4–e5. doi:10.1016/S2214-109X(16)30333-3
37. Taylor-Robinson DC, Jones AP, Garner P. Deworming drugs for treating soil-transmitted intestinal worms in children: effects on growth and school performance. Cochrane Database Syst Rev 2007;4:CD000371. doi:10.1002/14651858.CD000371.pub3
38. Hultcrantz M, Rind D, Akl EA, et al. The GRADE Working Group clarifies the construct of certainty of evidence. J Clin Epidemiol 2017;87:4–13. doi:10.1016/j.jclinepi.2017.05.006
39. Guyatt GH, Oxman AD, Vist G, et al. GRADE guidelines: 4. Rating the quality of evidence–study limitations (risk of bias). J Clin Epidemiol 2011;64:407–15. doi:10.1016/j.jclinepi.2010.07.017
40. Schünemann HJ, Cuello C, Akl EA, et al. GRADE guidelines: 18. How ROBINS-I and other tools to assess risk of bias in non-randomized studies should be used to rate the certainty of a body of evidence. J Clin Epidemiol 2018 [Epub ahead of print]. doi:10.1016/j.jclinepi.2018.01.012
41. Sterne JA, Hernán MA, Reeves BC, et al. ROBINS-I: a tool for assessing risk of bias in non-randomised studies of interventions. BMJ 2016;355:i4919. doi:10.1136/bmj.i4919
42. Risk of bias tools. http://www.riskofbias.info/
43. Balshem H, Helfand M, Schünemann HJ, et al. GRADE guidelines: 3. Rating the quality of evidence. J Clin Epidemiol 2011;64:401–6. doi:10.1016/j.jclinepi.2010.07.015
44. Grant S, Pedersen ER, Osilla KC, et al. Reviewing and interpreting the effects of brief alcohol interventions: comment on a Cochrane review about motivational interviewing for young adults. Addiction 2016;111:1521–7. doi:10.1111/add.13136
45. Foxcroft DR. We cannot ignore bias, especially if effects are small, but we need better methods for evaluating prevention systems. Addiction 2016;111:1532–3. doi:10.1111/add.13317
46. May C. Towards a general theory of implementation. Implement Sci 2013;8:18. doi:10.1186/1748-5908-8-18
47. Grant S, Pedersen ER, Osilla KC, et al. It is time to develop appropriate tools for assessing minimal clinically important differences, performance bias and quality of evidence in reviews of behavioral interventions. Addiction 2016;111:1533–5. doi:10.1111/add.13380
48. Schünemann HJ. Methodological idiosyncracies, frameworks and challenges of non-pharmaceutical and non-technical treatment interventions. Z Evid Fortbild Qual Gesundhwes 2013;107:214–20. doi:10.1016/j.zefq.2013.05.002
49. McCarthy B, Casey D, Devane D, et al. Pulmonary rehabilitation for chronic obstructive pulmonary disease. Cochrane Database Syst Rev 2015;2:CD003793.
50. Guyatt GH, Oxman AD, Kunz R, et al. GRADE guidelines: 7. Rating the quality of evidence–inconsistency. J Clin Epidemiol 2011;64:1294–302. doi:10.1016/j.jclinepi.2011.03.017
51. O’Connor AM, Stacey D, Entwistle V, et al. Decision aids for people facing health treatment or screening decisions. Cochrane Database Syst Rev 2003;2:CD001431.
52. Guyatt GH, Oxman AD, Kunz R, et al. GRADE guidelines: 8. Rating the quality of evidence–indirectness. J Clin Epidemiol 2011;64:1303–10. doi:10.1016/j.jclinepi.2011.04.014
53. Dragioti E, Dimoliatis I, Evangelou E. Disclosure of researcher allegiance in meta-analyses and randomised controlled trials of psychotherapy: a systematic appraisal. BMJ Open 2015;5:e007206. doi:10.1136/bmjopen-2014-007206
54. Guyatt GH, Oxman AD, Sultan S, et al. GRADE guidelines: 9. Rating up the quality of evidence. J Clin Epidemiol 2011;64:1311–6. doi:10.1016/j.jclinepi.2011.06.004
55. Sawaya GF, Guirguis-Blake J, LeFevre M, et al. Update on the methods of the U.S. Preventive Services Task Force: estimating certainty and magnitude of net benefit. Ann Intern Med 2007;147:871–5. doi:10.7326/0003-4819-147-12-200712180-00007
56. Harris RP, Helfand M, Woolf SH, et al. Current methods of the US Preventive Services Task Force: a review of the process. Am J Prev Med 2001;20(3 Suppl):21–35.
57. Whitlock EP, Williams SB, Gold R, et al. Screening and Interventions for Childhood Overweight. Rockville (MD): U.S. Preventive Services Task Force Evidence Syntheses, formerly Systematic Evidence Reviews, 2005.
58. Rehfuess EA, Booth A, Brereton L, et al. Towards a taxonomy of logic models in systematic reviews and health technology assessments: A priori, staged, and iterative approaches. Res Synth Methods 2018;9:13–24. doi:10.1002/jrsm.1254
59. Higgins JPT, López-López JA, Becker BJ, et al. Synthesising quantitative evidence in systematic reviews of complex health interventions. BMJ Glob Health 2019;0:e000858. doi:10.1136/bmjgh-2018-000858
60. Polus S, Pieper D, Burns J, et al. Heterogeneity in application, design, and analysis characteristics was found for controlled before-after and interrupted time series studies included in Cochrane reviews. J Clin Epidemiol 2017;91:56–69. doi:10.1016/j.jclinepi.2017.07.008
61. Higgins JP, Ramsay C, Reeves BC, et al. Issues relating to study design and risk of bias when including non-randomized studies in systematic reviews on the effects of interventions. Res Synth Methods 2013;4:12–25. doi:10.1002/jrsm.1056
62. Voss PH, Rehfuess EA. Quality appraisal in systematic reviews of public health interventions: an empirical study on the impact of choice of tool on meta-analysis. J Epidemiol Community Health 2013;67:98–104. doi:10.1136/jech-2011-200940
63. Pfadenhauer LM, Gerhardus A, Mozygemba K, et al. Making sense of complexity in context and implementation: the Context and Implementation of Complex Interventions (CICI) framework. Implement Sci 2017;12:21. doi:10.1186/s13012-017-0552-5
64. Wahlster P, Brereton L, Burns J, et al. Guidance on the integrated assessment of complex health technologies - the INTEGRATE-HTA Model. 2016. http://www.integrate-hta.eu/downloads/
65. Lewin S, Hendry M, Chandler J, et al. Assessing the complexity of interventions within systematic reviews: development, content and use of a new tool (iCAT_SR). BMC Med Res Methodol 2017;17:76. doi:10.1186/s12874-017-0349-x
66. Guise JM, Butler ME, Chang C, et al. AHRQ series on complex intervention systematic reviews-paper 6: PRISMA-CI extension statement and checklist. J Clin Epidemiol 2017;90:43–50. doi:10.1016/j.jclinepi.2017.06.016
67. Campbell M, Katikireddi SV, Hoffmann T, et al. TIDieR-PHP: a reporting guideline for population health and policy interventions. BMJ 2018;361:k1079. doi:10.1136/bmj.k1079
68. Alonso-Coello P, Schünemann HJ, Moberg J, et al. GRADE Evidence to Decision (EtD) frameworks: a systematic and transparent approach to making well informed healthcare choices. 1: Introduction. BMJ 2016;353:i2016.
69. Petticrew M, Anderson L, Elder R, et al. Complex interventions and their implications for systematic reviews: a pragmatic approach. J Clin Epidemiol 2013;66:1209–14. doi:10.1016/j.jclinepi.2013.06.004
70. Kneale D, Thomas J, Harris K. Developing and Optimising the Use of Logic Models in Systematic Reviews: Exploring Practice and Good Practice in the Use of Programme Theory in Reviews. PLoS One 2015;10:e0142187. doi:10.1371/journal.pone.0142187

Footnotes

Handling editor Seye Abimbola
Contributors Conceptualisation: all authors. Data curation: AM. Formal analysis: PM, AM, SPG and EAR. Funding acquisition: PM. Investigation: PM, AM and SPG. Methodology: all authors. Project administration: PM. Supervision: PM and EAR. Validation: all authors. Visualisation: PM, AM and SPG. Writing and original draft preparation: PM and AM. Writing, review and editing: all authors.
Funding This project was prepared as part of the GRADE Guidance for Complex Interventions, funded by the Economic and Social Research Council (ES/N012267/1). Additional funding was also provided by the World Health Organization Department of Maternal, Newborn, Child and Adolescent Health through grants received from the United States Agency for International Development and the Norwegian Agency for Development Cooperation.
Competing interests All authors are current members of the GRADE Working Group. SPG’s spouse is a salaried employee of Eli Lilly and Company and owns stock. SPG has accompanied his spouse on company-sponsored travels.
Patient consent Not required.
Provenance and peer review Not commissioned; externally peer reviewed.
Data sharing statement Data will be made available when the entire study is completed on the University of Birmingham repository.

[1] 1.↵
Guyatt GH,
Oxman AD,
Vist GE, et al
. GRADE: an emerging consensus on rating quality of evidence and strength of recommendations. BMJ 2008;336:924–6.doi:10.1136/bmj.39489.470347.AD
OpenUrl FREE Full Text

[2] Guyatt GH,

[3] Oxman AD,

[4] Vist GE, et al

[5] 2.↵
Movsisyan A,
Melendez-Torres GJ,
Montgomery P
. Users identified challenges in applying GRADE to complex interventions and suggested an extension to GRADE. J Clin Epidemiol 2016;70:191–9.doi:10.1016/j.jclinepi.2015.09.010
OpenUrl

[6] Movsisyan A,

[7] Melendez-Torres GJ,

[8] Montgomery P

[9] 3.↵
Rehfuess EA,
Akl EA
. Current experience with applying the GRADE approach to public health interventions: an empirical study. BMC Public Health 2013;13:9.doi:10.1186/1471-2458-13-9

[10] Rehfuess EA,

[11] Akl EA

[12] 4.↵
Petticrew M
. Time to rethink the systematic review catechism? Moving from ’what works' to ’what happens'. Syst Rev 2015;4:36.doi:10.1186/s13643-015-0027-1
OpenUrl

[13] Petticrew M

[14] 5.↵
Koplan JP,
Bond TC,
Merson MH, et al
. Towards a common definition of global health. Lancet 2009;373:1993–5.doi:10.1016/S0140-6736(09)60332-9
OpenUrl CrossRef PubMed Web of Science

[15] Koplan JP,

[16] Bond TC,

[17] Merson MH, et al

[18] 6.↵
Hoffmann TC,
Glasziou PP,
Boutron I, et al
. Better reporting of interventions: template for intervention description and replication (TIDieR) checklist and guide. BMJ 2014;348:g1687.doi:10.1136/bmj.g1687
OpenUrl Abstract/FREE Full Text

[19] Hoffmann TC,

[20] Glasziou PP,

[21] Boutron I, et al

[22] 7.↵
WHO. World Health Organization handbook for guideline development. 2nd ed. Geneva, Switzerland: WHO Press, 2014.

[23] 8.↵
Doyle J,
Waters E,
Yach D, et al
. Global priority setting for Cochrane systematic reviews of health promotion and public health research. J Epidemiol Community Health 2005;59:193–7.doi:10.1136/jech.2003.019547
OpenUrl Abstract/FREE Full Text

[24] Doyle J,

[25] Waters E,

[26] Yach D, et al

[27] 9.↵
Peters DH,
Bennett S
. Better guidance is welcome, but without blinders. PLoS Med 2012;9:e1001188.doi:10.1371/journal.pmed.1001188
OpenUrl PubMed

[28] Peters DH,

[29] Bennett S

[30] 10.↵
Petticrew M,
Knai C,
Thomas J, et al
. Implications of a complexity perspective for systematic reviews and guideline development in health decision making. BMJ Glob Health 2018;0:e000899.

[31] Petticrew M,

[32] Knai C,

[33] Thomas J, et al

11. Anderson LM, Petticrew M, Chandler J, et al. Introducing a series of methodological articles on considering complexity in systematic reviews of interventions. J Clin Epidemiol 2013;66:1205–8. doi:10.1016/j.jclinepi.2013.07.005

12. Craig P, Dieppe P, Macintyre S, et al. Developing and evaluating complex interventions: new guidance. Medical Research Council (MRC), 2008.

13. Hawe P, Shiell A, Riley T. Theorising interventions as events in systems. Am J Community Psychol 2009;43(3-4):267–76. doi:10.1007/s10464-009-9229-9

14. Booth A, Moore G, Flemming K, et al. Taking account of context in systematic reviews and guidelines considering a complexity perspective. BMJ Glob Health 2019;0:e000840. doi:10.1136/bmjgh-2018-000840

15. Rehfuess EA, Stratil JM, Scheel IB, et al. The WHO-INTEGRATE evidence to decision framework version 1.0: integrating WHO norms and values and a complexity perspective. BMJ Glob Health 2019;4:e000844. doi:10.1136/bmjgh-2018-000844

16. Grading of Recommendations Assessment, Development and Evaluation (GRADE) Working Group. 2017 http://gradeworkinggroup.org/

17. Higgins J, Lasserson T, Chandler J, et al. Methodological expectations of Cochrane interventions reviews. London: Cochrane, 2016.

18. Movsisyan A, Melendez-Torres GJ, Montgomery P. Outcomes in systematic reviews of complex interventions never reached "high" GRADE ratings when compared with those of simple interventions. J Clin Epidemiol 2016;78:22–33. doi:10.1016/j.jclinepi.2016.03.014

19. Ogilvie D, Egan M, Hamilton V, et al. Systematic reviews of health effects of social interventions: 2. Best available evidence: how low should you go? J Epidemiol Community Health 2005;59:886–92.

20. Parkhurst JO, Abeysinghe S. What constitutes ‘good’ evidence for public health and social policy making? From hierarchies to appropriateness. SERRC 2014;3:34–46.

21. Nasser SM, Cooke G, Kranzer K, et al. Strength of recommendations in WHO guidelines using GRADE was associated with uptake in national policy. J Clin Epidemiol 2015;68:703–7. doi:10.1016/j.jclinepi.2014.11.006

22. Lewin S, Bosch-Capblanch X, Oliver S, et al. Guidance for evidence-informed policies about health systems: assessing how much confidence to place in the research evidence. PLoS Med 2012;9:e1001187. doi:10.1371/journal.pmed.1001187

23. Lewin S, Booth A, Glenton C, et al. Applying GRADE-CERQual to qualitative evidence synthesis findings: introduction to the series. Implement Sci 2018;13(Suppl 1):2. doi:10.1186/s13012-017-0688-3

24. Moher D, Schulz KF, Simera I, et al. Guidance for developers of health research reporting guidelines. PLoS Med 2010;7:e1000217. doi:10.1371/journal.pmed.1000217

25. Movsisyan A, Dennis J, Rehfuess E, et al. Rating the quality of a body of evidence on the effectiveness of health and social interventions: A systematic review and mapping of evidence domains. Res Synth Methods 2018;9:224–42. doi:10.1002/jrsm.1290

26. Murad MH, Mustafa RA, Schünemann HJ, et al. Rating the certainty in evidence in the absence of a single estimate of effect. Evid Based Med 2017;22:85–7. doi:10.1136/ebmed-2017-110668

27. von Philipsborn P, Stratil JM, Burns J, et al. Environmental interventions to reduce the consumption of sugar-sweetened beverages and their effects on health. Cochrane Database Syst Rev 2016;135. doi:10.1002/14651858.CD012292

28. Caldwell DM, Welton NJ. Approaches for synthesising complex mental health interventions in meta-analysis. Evid Based Ment Health 2016;19:16–21. doi:10.1136/eb-2015-102275

29. Guise JM, Chang C, Viswanathan M, et al. Agency for Healthcare Research and Quality Evidence-based Practice Center methods for systematically reviewing complex multicomponent health care interventions. J Clin Epidemiol 2014;67:1181–91. doi:10.1016/j.jclinepi.2014.06.010

30. Turley R, Saith R, Bhan N, et al. Slum upgrading strategies involving physical environment and infrastructure interventions and their effects on health and socio-economic outcomes. Cochrane Database Syst Rev 2013;1:CD010067. doi:10.1002/14651858.CD010067.pub2

31. Weir MC, Grimshaw JM, Mayhew A, et al. Decisions about lumping vs. splitting of the scope of systematic reviews of complex interventions are not well justified: a case study in systematic reviews of health care professional reminders. J Clin Epidemiol 2012;65:756–63. doi:10.1016/j.jclinepi.2011.12.012

32. Squires JE, Valentine JC, Grimshaw JM. Systematic reviews of complex interventions: framing the review question. J Clin Epidemiol 2013;66:1215–22. doi:10.1016/j.jclinepi.2013.05.013

33. Rohwer A, Pfadenhauer L, Burns J, et al. Series: Clinical Epidemiology in South Africa. Paper 3: Logic models help make sense of complexity in systematic reviews and health technology assessments. J Clin Epidemiol 2017;83:37–47. doi:10.1016/j.jclinepi.2016.06.012

34. Welch VA, Ghogomu E, Hossain A, et al. Deworming and adjuvant interventions for improving the developmental health and well-being of children in low- and middle-income countries: a systematic review and network meta-analysis. Campbell Syst Rev 2016;7.

35. Hilton J, Tovey D. Debating the evidence for deworming programmes. Cochrane Database Syst Rev 2012;8:ED000045. doi:10.1002/14651858.ED000045

36. Andrews JR, Bogoch II, Utzinger J. The benefits of mass deworming on health outcomes: new evidence synthesis, the debate persists. Lancet Glob Health 2017;5:e4–e5. doi:10.1016/S2214-109X(16)30333-3

37. Taylor-Robinson DC, Jones AP, Garner P. Deworming drugs for treating soil-transmitted intestinal worms in children: effects on growth and school performance. Cochrane Database Syst Rev 2007;4:CD000371. doi:10.1002/14651858.CD000371.pub3

38. Hultcrantz M, Rind D, Akl EA, et al. The GRADE Working Group clarifies the construct of certainty of evidence. J Clin Epidemiol 2017;87:4–13. doi:10.1016/j.jclinepi.2017.05.006

39. Guyatt GH, Oxman AD, Vist G, et al. GRADE guidelines: 4. Rating the quality of evidence–study limitations (risk of bias). J Clin Epidemiol 2011;64:407–15. doi:10.1016/j.jclinepi.2010.07.017

40. Schünemann HJ, Cuello C, Akl EA, et al. GRADE guidelines: 18. How ROBINS-I and other tools to assess risk of bias in non-randomized studies should be used to rate the certainty of a body of evidence. J Clin Epidemiol 2018 [Epub ahead of print]. doi:10.1016/j.jclinepi.2018.01.012

41. Sterne JA, Hernán MA, Reeves BC, et al. ROBINS-I: a tool for assessing risk of bias in non-randomised studies of interventions. BMJ 2016;355:i4919. doi:10.1136/bmj.i4919

42. Risk of bias tools. http://www.riskofbias.info/

43. Balshem H, Helfand M, Schünemann HJ, et al. GRADE guidelines: 3. Rating the quality of evidence. J Clin Epidemiol 2011;64:401–6. doi:10.1016/j.jclinepi.2010.07.015

44. Grant S, Pedersen ER, Osilla KC, et al. Reviewing and interpreting the effects of brief alcohol interventions: comment on a Cochrane review about motivational interviewing for young adults. Addiction 2016;111:1521–7. doi:10.1111/add.13136

45. Foxcroft DR. We cannot ignore bias, especially if effects are small, but we need better methods for evaluating prevention systems. Addiction 2016;111:1532–3. doi:10.1111/add.13317

46. May C. Towards a general theory of implementation. Implement Sci 2013;8:18. doi:10.1186/1748-5908-8-18

47. Grant S, Pedersen ER, Osilla KC, et al. It is time to develop appropriate tools for assessing minimal clinically important differences, performance bias and quality of evidence in reviews of behavioral interventions. Addiction 2016;111:1533–5. doi:10.1111/add.13380

48. Schünemann HJ. Methodological idiosyncracies, frameworks and challenges of non-pharmaceutical and non-technical treatment interventions. Z Evid Fortbild Qual Gesundhwes 2013;107:214–20. doi:10.1016/j.zefq.2013.05.002

49. McCarthy B, Casey D, Devane D, et al. Pulmonary rehabilitation for chronic obstructive pulmonary disease. Cochrane Database Syst Rev 2015;2:CD003793.

50. Guyatt GH, Oxman AD, Kunz R, et al. GRADE guidelines: 7. Rating the quality of evidence–inconsistency. J Clin Epidemiol 2011;64:1294–302. doi:10.1016/j.jclinepi.2011.03.017

51. O’Connor AM, Stacey D, Entwistle V, et al. Decision aids for people facing health treatment or screening decisions. Cochrane Database Syst Rev 2003;2:CD001431.

52. Guyatt GH, Oxman AD, Kunz R, et al. GRADE guidelines: 8. Rating the quality of evidence–indirectness. J Clin Epidemiol 2011;64:1303–10. doi:10.1016/j.jclinepi.2011.04.014

53. Dragioti E, Dimoliatis I, Evangelou E. Disclosure of researcher allegiance in meta-analyses and randomised controlled trials of psychotherapy: a systematic appraisal. BMJ Open 2015;5:e007206. doi:10.1136/bmjopen-2014-007206

54. Guyatt GH, Oxman AD, Sultan S, et al. GRADE guidelines: 9. Rating up the quality of evidence. J Clin Epidemiol 2011;64:1311–6. doi:10.1016/j.jclinepi.2011.06.004

55. Sawaya GF, Guirguis-Blake J, LeFevre M, et al. Update on the methods of the U.S. Preventive Services Task Force: estimating certainty and magnitude of net benefit. Ann Intern Med 2007;147:871–5. doi:10.7326/0003-4819-147-12-200712180-00007

56. Harris RP, Helfand M, Woolf SH, et al. Current methods of the US Preventive Services Task Force: a review of the process. Am J Prev Med 2001;20(3 Suppl):21–35.

57. Whitlock EP, Williams SB, Gold R, et al. Screening and Interventions for Childhood Overweight. Rockville (MD): U.S. Preventive Services Task Force Evidence Syntheses, formerly Systematic Evidence Reviews, 2005.

58. Rehfuess EA, Booth A, Brereton L, et al. Towards a taxonomy of logic models in systematic reviews and health technology assessments: A priori, staged, and iterative approaches. Res Synth Methods 2018;9:13–24. doi:10.1002/jrsm.1254

59. Higgins JPT, López-López JA, Becker BJ, et al. Synthesising quantitative evidence in systematic reviews of complex health interventions. BMJ Glob Health 2019;0:e000858. doi:10.1136/bmjgh-2018-000858

60. Polus S, Pieper D, Burns J, et al. Heterogeneity in application, design, and analysis characteristics was found for controlled before-after and interrupted time series studies included in Cochrane reviews. J Clin Epidemiol 2017;91:56–69. doi:10.1016/j.jclinepi.2017.07.008

61. Higgins JP, Ramsay C, Reeves BC, et al. Issues relating to study design and risk of bias when including non-randomized studies in systematic reviews on the effects of interventions. Res Synth Methods 2013;4:12–25. doi:10.1002/jrsm.1056

62. Voss PH, Rehfuess EA. Quality appraisal in systematic reviews of public health interventions: an empirical study on the impact of choice of tool on meta-analysis. J Epidemiol Community Health 2013;67:98–104. doi:10.1136/jech-2011-200940

63. Pfadenhauer LM, Gerhardus A, Mozygemba K, et al. Making sense of complexity in context and implementation: the Context and Implementation of Complex Interventions (CICI) framework. Implement Sci 2017;12:21. doi:10.1186/s13012-017-0552-5

64. Wahlster P, Brereton L, Burns J, et al. Guidance on the integrated assessment of complex health technologies - the INTEGRATE-HTA Model. 2016 http://www.integrate-hta.eu/downloads/

65. Lewin S, Hendry M, Chandler J, et al. Assessing the complexity of interventions within systematic reviews: development, content and use of a new tool (iCAT_SR). BMC Med Res Methodol 2017;17:76. doi:10.1186/s12874-017-0349-x

66. Guise JM, Butler ME, Chang C, et al. AHRQ series on complex intervention systematic reviews-paper 6: PRISMA-CI extension statement and checklist. J Clin Epidemiol 2017;90:43–50. doi:10.1016/j.jclinepi.2017.06.016

67. Campbell M, Katikireddi SV, Hoffmann T, et al. TIDieR-PHP: a reporting guideline for population health and policy interventions. BMJ 2018;361:k1079. doi:10.1136/bmj.k1079

68. Alonso-Coello P, Schünemann HJ, Moberg J, et al. GRADE Evidence to Decision (EtD) frameworks: a systematic and transparent approach to making well informed healthcare choices. 1: Introduction. BMJ 2016;353:i2016.

69. Petticrew M, Anderson L, Elder R, et al. Complex interventions and their implications for systematic reviews: a pragmatic approach. J Clin Epidemiol 2013;66:1209–14. doi:10.1016/j.jclinepi.2013.06.004

70. Kneale D, Thomas J, Harris K. Developing and Optimising the Use of Logic Models in Systematic Reviews: Exploring Practice and Good Practice in the Use of Programme Theory in Reviews. PLoS One 2015;10:e0142187. doi:10.1371/journal.pone.0142187
