Introduction
Key points
- GRADE includes three criteria for rating up quality of evidence particularly applicable to observational studies.
- Rating up one or even two levels is possible when effects in observational studies are sufficiently large, particularly if they occur over short periods of time.
- A dose–response gradient, or a conclusion that plausible residual confounding would further support inferences regarding treatment effect, may also raise the quality of the evidence.
In prior papers in this series devoted to exploring GRADE's approach to rating the quality of evidence and grading strength of recommendations, we have dealt with issues of framing the question; introduced GRADE's conceptual approach to rating the quality of a body of evidence; and presented five reasons for rating down the quality of evidence: risk of bias, imprecision, inconsistency, indirectness, and publication bias. This ninth article in the series examines the criteria for rating up the quality of evidence.
The three primary reasons for rating up the quality of evidence (Table 1) are as follows:
1. When a large magnitude of effect exists,
2. When there is a dose–response gradient, and
3. When all plausible confounders or other biases increase our confidence in the estimated effect.
We have noted previously that GRADE is relevant to rating evidence regarding the impact of interventions on patient-important outcomes, rather than, for instance, prognostic studies that identify patient characteristics associated with desirable or adverse outcomes. Using the GRADE framework, evidence from observational studies is generally classified as low quality. Unsystematic clinical observations are usually at a high risk of bias and therefore generally receive a rating of very low quality evidence. There are times, however, when we have high confidence in the estimate of effect from such studies. GRADE has therefore suggested mechanisms for rating up the quality of evidence from observational studies.
The circumstances under which we may wish to rate up the quality of evidence for intervention studies will likely occur infrequently and are primarily relevant to observational studies (including cohort, case–control, before–after, and time series studies) and to nonrandomized experimental or interventional studies (e.g., providing treatment to one of two matched groups). Indeed, although it is theoretically possible to rate up results from randomized controlled trials (RCTs), we have yet to find a compelling example of such an instance.
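For readers who find a procedural view helpful, the rating-up logic described above can be sketched as a small function. This is an illustrative sketch only, not part of the GRADE guidance itself: the function name `rate_up` and its parameters are hypothetical, and the sketch assumes that a dose–response gradient and supportive residual confounding each raise the rating by one level, that a large effect raises it by one or two levels, and that no reasons for rating down apply. In practice each of these is a matter of judgment rather than arithmetic.

```python
# The four GRADE quality levels, ordered from lowest to highest.
GRADE_LEVELS = ["very low", "low", "moderate", "high"]


def rate_up(start, large_effect_levels=0, dose_response=False,
            confounding_supports_effect=False):
    """Return the quality level after applying rating-up criteria.

    start: initial level (e.g. "low" for observational studies).
    large_effect_levels: 0, 1, or 2 levels for magnitude of effect.
    dose_response: True if a dose-response gradient is present
        (assumed here to add one level).
    confounding_supports_effect: True if plausible residual confounding
        would strengthen the inference (assumed here to add one level).
    """
    if start not in GRADE_LEVELS:
        raise ValueError(f"unknown starting level: {start!r}")
    if large_effect_levels not in (0, 1, 2):
        raise ValueError("large_effect_levels must be 0, 1, or 2")

    steps = (large_effect_levels
             + int(dose_response)
             + int(confounding_supports_effect))
    # The rating cannot exceed "high", however many criteria are met.
    index = min(GRADE_LEVELS.index(start) + steps, len(GRADE_LEVELS) - 1)
    return GRADE_LEVELS[index]


# Observational evidence starts at "low"; a large effect rated up one level.
print(rate_up("low", large_effect_levels=1))
```

Under these assumptions, a body of observational evidence with a very large effect (two levels) would reach high quality, and further criteria could not raise it beyond that ceiling.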