
Drawing conclusions about causes from systematic reviews of risk factors: The Cambridge Quality Checklists

Published in the Journal of Experimental Criminology.

Abstract

Systematic reviews summarize evidence about the effects of social interventions on crime, health, education, and social welfare. Social scientists should also use systematic reviews to study risk factors, which are naturally occurring predictors of these outcomes. To do this, the quality of risk factor research needs to be evaluated. This paper presents three new methodological quality checklists to identify high-quality risk factor research. They are designed so that reviewers can separately summarize the best evidence about correlates, risk factors, and causal risk factors. Studies need appropriate samples and measures to draw valid conclusions about correlates. Studies need prospective longitudinal data to draw valid conclusions about risk factors. And, in the absence of experimental evidence, controlled studies need to compare changes in risk factors over time with changes in outcomes to draw valid conclusions about causal risk factors.

Fig. 1

Notes

  1. By ‘observational’ we do not mean research based on systematic observation. We mean research on naturally occurring events and experiences, without experimental manipulation of participants.

  2. ‘Factor’ is another word for ‘variable’. Factors can be measured on a continuous or interval scale, for example, intelligence quotient (IQ) scores. They can take nominal values with several categories, for example, social class. Factors can also be dichotomous, for example, living in a ‘broken’ or an ‘intact’ home.

  3. A distinction can be made between the time that a variable is measured and the time period that it refers to. For example, a measure of self-reported delinquency administered at age 18 years may refer to offending behavior between the ages of 14 years and 18 years. This would mean that, for example, joining a gang at age 16 years may not strictly be a risk factor for self-reported delinquency measured at age 18 years. The relevant time dimension of a variable is not when it was measured but the time period that it refers to.

  4. Some commentators argue that each score derived from a scale should only reflect one underlying construct (e.g., statistical conclusion validity). However, the aim of a quality scale is not to measure a common underlying trait but, rather, to measure a common effect—confidence that the study results are accurate.

  5. We use some simple quality cut-offs to make the checklist easy to use and understand, although we recognize that they do not capture all issues of methodological quality. For example, a measurement scale can be made more reliable by using many items, excluding inverse items, and presenting all items in the same sequence, although this would result in a poorer quality measure.

  6. Some people argue that fixed risk factors, like a person’s gender, cannot be causal because they cannot change within an individual over time (Holland 1986; Kraemer et al. 2005). Others disagree, based on a counterfactual view of causality (Glymour 1986) and possible mechanisms linking fixed characteristics and behavior (Rutter 2003b). The resolution to this debate lies in philosophical issues beyond the scope of this paper. However, we note that, even if fixed risk factors could be causal, the fact that they are fixed precludes empirical tests of whether changing them causes changes in outcomes. Therefore, for most practical purposes, fixed risk factors cannot be shown to be causal and cannot be targeted in interventions (see, e.g., Farrington 1988).

  7. As well as comparing risk-exposed individuals with an unexposed comparison group, researchers can study variation in risk exposure on an ordinal, interval or continuous scale. A completely unexposed ‘control group’ is not needed in risk research, so long as there is variation in risk exposure that can be compared with variation in the outcome. Hence, when we refer to studies needing to include comparison groups, this condition can also be met by the use of risk variables that capture variation in the level of risk exposure (for example, IQ scores, levels of social class, or intervals of family income).
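Substituting a graded risk variable for a control group, as described above, amounts to correlating variation in exposure with variation in the outcome. As a minimal sketch of this idea, the snippet below computes a Pearson correlation between an ordinal risk variable and an outcome score using only the Python standard library; the income bands and delinquency scores are invented for illustration.

```python
from statistics import mean

def pearson_r(xs, ys):
    """Pearson correlation between risk-exposure and outcome scores."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5

# Hypothetical data: family income band (1 = lowest) and a delinquency score.
income_band = [1, 2, 2, 3, 3, 4, 4, 5]
delinquency = [9, 8, 7, 6, 6, 4, 5, 2]

print(round(pearson_r(income_band, delinquency), 2))  # -0.98
```

A nonzero correlation here plays the same evidential role as a difference between risk-exposed and unexposed groups: what matters is that variation in exposure can be compared with variation in the outcome.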

  8. Most of the methodological findings that we cite here come from intervention research and may or may not be replicated in risk factor research. We encourage work on similar methodological issues (such as the relative validity of one group before–after studies and control group studies) in risk factor research, to evaluate whether the same findings would apply to risk factor research.

  9. Sometimes researchers test whether risk-exposed and comparison groups are similar on covariates without matching. If it is demonstrated that risk-exposed and comparison groups are similar on covariates, the groups can be treated as matched on those covariates. However, it is important to measure effect size as well as statistical significance in this comparison, to guard against the danger of concluding that there is no difference because of low statistical power.
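The balance check described above can be sketched as follows: compute a standardized mean difference (Cohen's d, using a pooled standard deviation) between the risk-exposed and comparison groups on a covariate, so that a reviewer can judge the size of any imbalance rather than relying on a significance test alone. The data and group labels are invented for illustration.

```python
import math
from statistics import mean, stdev

def cohens_d(group_a, group_b):
    """Standardized mean difference (Cohen's d) with a pooled sample SD."""
    na, nb = len(group_a), len(group_b)
    sa, sb = stdev(group_a), stdev(group_b)
    pooled = math.sqrt(((na - 1) * sa**2 + (nb - 1) * sb**2) / (na + nb - 2))
    return (mean(group_a) - mean(group_b)) / pooled

# Hypothetical covariate (e.g., IQ) for risk-exposed and comparison groups.
exposed = [102, 98, 95, 101, 97, 99, 96, 100]
comparison = [104, 103, 99, 105, 100, 102, 101, 106]

print(round(cohens_d(exposed, comparison), 2))  # -1.63
```

With samples this small, a significance test could easily miss a difference of this magnitude; reporting the effect size guards against concluding "no difference" from low statistical power.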

  10. Selection models are another statistical method used to adjust for covariates. They attempt to estimate both observed and unobserved bias caused by covariates (McCartney et al. 2006; Winship and Morgan 1999). However, if results from selection models are compared with those obtained by randomized experiments and other matching and statistical adjustment methods, it seems that selection models are not much more valid than conventional matching and modeling procedures are (Glazerman et al. 2003; Stolzenberg and Relles 1997).

  11. We note that controlled interrupted time–series studies and regression discontinuity designs also potentially provide strong grounds for inferences about causal relationships (Shadish et al. 2002), although they are not included in this checklist. Interrupted time–series studies examine change in a large number of outcome measures from before to after an intervention, in control and treatment conditions. Regression discontinuity designs use knowledge about how treatment assignment was made to draw causal conclusions about its effects. Because these methods were designed for investigating the effects of intervention programs and are not used in risk factor research, we do not include them in the checklist.

References

  • Altman, D. G. (2001). Systematic reviews in health care—systematic reviews of evaluations of prognostic variables. British Medical Journal, 323, 224–228.

  • Arceneaux, K., Gerber, A. S., & Green, D. P. (2006). Comparing experimental and matching methods using a large-scale voter mobilization experiment. Political Analysis, 14, 37–62.

  • Bloom, H. S., Michalopoulos, C., Hill, C. J., & Lei, Y. (2002). Can nonexperimental comparison group methods match the findings from a random assignment evaluation of mandatory welfare-to-work programs? (working paper). New York: Manpower Demonstration Research Corporation.

  • Boruch, R. F. (1997). Randomized experiments for planning and evaluation. Thousand Oaks, CA: Sage.

  • Carmines, E. G., & Zeller, R. A. (1979). Reliability and validity assessment (Sage university paper series on quantitative applications in the social sciences no. 17). Thousand Oaks, CA: Sage.

  • Chalmers, T. C., Smith, H., Blackburn, B., Silverman, B., Schroeder, B., Reitman, D., et al. (1981). A method for assessing the quality of a randomized control trial. Controlled Clinical Trials, 2, 31–49.

  • Christenfeld, N. J. S., Sloan, R. P., Carroll, D., & Greenland, S. (2004). Risk factors, confounding, and the illusion of statistical control. Psychosomatic Medicine, 66, 868–875.

  • Concato, J., Feinstein, A. R., & Holford, T. R. (1993). The risk of determining risk with multivariable models. Annals of Internal Medicine, 118, 201–210.

  • Conn, V. S., & Rantz, M. J. (2003). Research methods: managing primary study quality in meta-analyses. Research in Nursing and Health, 26, 322–333.

  • Cook, T. D., & Campbell, D. T. (1979). Quasi-experimentation: design and analysis issues for field settings. Chicago, IL: Rand-McNally.

  • Deeks, J. J., Dinnes, J., D’Amico, R., Sowden, A. J., Sakarovitch, C., Song, F., et al. (2003). Evaluating non-randomised intervention studies. Health Technology Assessment, 7(iii–x), 1–173.

  • Dehejia, R. H., & Wahba, S. (1999). Causal effects in nonexperimental studies: reevaluating the evaluation of training programs. Journal of the American Statistical Association, 94, 1053–1062.

  • Downs, S. H., & Black, N. (1998). The feasibility of creating a checklist for the assessment of the methodological quality both of randomised and non-randomised studies of health care interventions. Journal of Epidemiology and Community Health, 52, 377–384.

  • Egger, M., Davey Smith, G., & Altman, D. G. (Eds.) (2001). Systematic reviews in health care: meta-analysis in context (2nd ed.). London: BMJ.

  • Farrington, D. P. (1988). Studying changes within individuals: the causes of offending. In M. Rutter (Ed.), Studies of psychosocial risk: the power of longitudinal data (pp. 158–183). Cambridge: Cambridge University Press.

  • Farrington, D. P. (1989). Self-reported and official offending from adolescence to adulthood. In M. W. Klein (Ed.), Cross-national research on self-reported crime and delinquency (pp. 399–423). Dordrecht, Netherlands: Kluwer.

  • Farrington, D. P. (2000). Explaining and preventing crime: the globalization of knowledge—the American Society of Criminology 1999 presidential address. Criminology, 38, 1–24.

  • Farrington, D. P. (2003). Methodological quality standards for evaluation research. Annals of the American Academy of Political and Social Science, 587, 49–68.

  • Farrington, D. P., & Petrosino, A. (2001). The Campbell Collaboration Crime and Justice Group. Annals of the American Academy of Political and Social Science, 578, 35–49.

  • Farrington, D. P., Gottfredson, D. C., Sherman, L. W., & Welsh, B. C. (2002a). The Maryland scientific methods scale. In L. W. Sherman, D. P. Farrington, B. C. Welsh, & D. L. MacKenzie (Eds.), Evidence-based crime prevention (pp. 13–21). London: Routledge.

  • Farrington, D. P., Loeber, R., Yin, Y., & Anderson, S. J. (2002b). Are within-individual causes of delinquency the same as between-individual causes? Criminal Behaviour and Mental Health, 12, 53–68.

  • Ferriter, M., & Huband, N. (2005). Does the non-randomized controlled study have a place in the systematic review? A pilot study. Criminal Behaviour and Mental Health, 15, 111–120.

  • Fleiss, J. L. (1981). Statistical methods for rates and proportions. New York: Wiley.

  • Forgatch, M. S., & DeGarmo, D. S. (1999). Parenting through change: an effective prevention program for single mothers. Journal of Consulting and Clinical Psychology, 67, 711–724.

  • Glasziou, P., Vandenbroucke, J., & Chalmers, I. (2004). Assessing the quality of research. British Medical Journal, 328, 39–41.

  • Glazerman, S., Levy, D. M., & Myers, D. (2003). Nonexperimental versus experimental estimates of earnings impacts. Annals of the American Academy of Political and Social Science, 589, 63–93.

  • Glymour, C. (1986). Comment: statistics and metaphysics. Journal of the American Statistical Association, 81, 964–966.

  • Hardt, J., & Rutter, M. (2004). Validity of adult retrospective reports of adverse childhood experiences: review of the evidence. Journal of Child Psychology and Psychiatry, 45, 260–273.

  • Hawton, K., Sutton, L., Haw, C., Sinclair, J., & Deeks, J. J. (2005). Schizophrenia and suicide: systematic review of risk factors. British Journal of Psychiatry, 187, 9–20.

  • Henry, B., Moffitt, T. E., Caspi, A., & Silva, P. A. (1994). On the remembrance of things past: a longitudinal evaluation of the retrospective method. Psychological Assessment, 6, 92–101.

  • Higgins, J. P. T., & Green, S. (Eds.) (2006). Cochrane handbook for systematic reviews of interventions 4.2.6 (updated September 2006). In: The Cochrane Library, issue 4, 2006. Chichester, UK: Wiley.

  • Hill, A. B. (1965). The environment and disease: association or causation? Proceedings of the Royal Society of Medicine, 58, 295–300.

  • Holland, P. W. (1986). Statistics and causal inference. Journal of the American Statistical Association, 81, 945–960.

  • Jolliffe, D., & Farrington, D. P. (2004). Empathy and offending: a systematic review and meta-analysis. Aggression and Violent Behavior, 9, 441–476.

  • Jüni, P., Witschi, A., Bloch, R., & Egger, M. (1999). The hazards of scoring the quality of clinical trials for meta-analysis. Journal of the American Medical Association, 282, 1054–1060.

  • Kazdin, A. E., Kraemer, H. C., Kessler, R. C., Kupfer, D. J., & Offord, D. R. (1997). Contributions of risk-factor research to developmental psychopathology. Clinical Psychology Review, 17, 375–406.

  • Khan, K. S., ter Riet, G., Popay, J., Nixon, J., & Kleijnen, J. (2001). Study quality assessment. In Centre for Reviews and Dissemination (Ed.), Undertaking systematic reviews of research on effectiveness: CRD’s guidance for those carrying out or commissioning reviews (2nd ed.). York, England: York Publishing Services.

  • Kraemer, H. C., Kazdin, A. E., Offord, D., Kessler, R. C., Jensen, P. S., & Kupfer, D. J. (1997). Coming to terms with the terms of risk. Archives of General Psychiatry, 54, 337–343.

  • Kraemer, H. C., Lowe, K. K., & Kupfer, D. J. (2005). To your health: how to understand what research tells us about risk. New York: Oxford University Press.

  • Labouvie, E. W. (1986). Methodological issues in the prediction of psychopathology: a life span perspective. In L. Erlenmeyer-Kimling & N. E. Miller (Eds.), Life span research on the prediction of psychopathology (pp. 137–155). Hillsdale, NJ: Erlbaum.

  • Lalonde, R. J. (1986). Evaluating the econometric evaluations of training programs with experimental data. American Economic Review, 76, 604–620.

  • Lieberson, S. (1985). Making it count: the improvement of social research and theory. Berkeley, CA: University of California Press.

  • Lipsey, M. W., & Derzon, J. H. (1998). Predictors of violent or serious delinquency in adolescence and early adulthood: a synthesis of longitudinal research. In D. P. Farrington & R. Loeber (Eds.), Serious and violent juvenile offenders: risk factors and successful interventions (pp. 86–105). Thousand Oaks, CA: Sage.

  • Lipsey, M. W., & Landenberger, N. A. (2006). Cognitive-behavioral interventions. In B. C. Welsh & D. P. Farrington (Eds.), Preventing crime: what works for children, offenders, victims, and places (pp. 57–71). Dordrecht, Netherlands: Springer.

  • Lipsey, M. W., & Wilson, D. B. (1993). The efficacy of psychological, educational, and behavioral treatment—confirmation from meta-analysis. American Psychologist, 48, 1181–1209.

  • Lipsey, M. W., & Wilson, D. B. (2001). Practical meta-analysis. Thousand Oaks, CA: Sage.

  • Loeber, R., & Farrington, D. P. (2008). Advancing knowledge about causes in longitudinal studies: experimental and quasi-experimental methods. In A. M. Liberman (Ed.), The long view of crime: a synthesis of longitudinal research (pp. 257–279). New York: Springer.

  • Lösel, F., & Beelman, A. (2006). Child social skills training. In B. C. Welsh & D. P. Farrington (Eds.), Preventing crime: what works for children, offenders, victims, and places (pp. 33–54). Dordrecht, Netherlands: Springer.

  • Lösel, F., & Köferl, P. (1989). Evaluation research on correctional treatment in West Germany: a meta-analysis. In H. Wegener, F. Lösel, & J. Haisch (Eds.), Criminal behavior and the justice system: psychological perspectives (pp. 334–355). New York: Springer.

  • McCartney, K., Bub, K. L., & Burchinal, M. R. (2006). Selection, detection, and reflection. In K. McCartney, M. R. Burchinal, & K. L. Bub (Eds.), Best practices in quantitative methods for developmentalists. Monographs of the Society for Research in Child Development, Vol. 71, No. 3 (pp. 105–126). Boston, MA: Blackwell.

  • Moher, D., Jadad, A. R., Nichol, G., Penman, M., Tugwell, P., & Walsh, S. (1995). Assessing the quality of randomized controlled trials. Controlled Clinical Trials, 16, 62–73.

  • Pelz, D. C., & Andrews, F. M. (1964). Detecting causal priorities in panel study data. American Sociological Review, 29, 836–848.

  • Perry, A., & Johnson, M. (2008). Applying the consolidated standards of reporting trials (CONSORT) to studies of mental health provision for juvenile offenders: a research note. Journal of Experimental Criminology, 4, 165–185.

  • Petrosino, A. (2003). Estimates of randomized controlled trials across six areas of childhood intervention: a bibliometric analysis. Annals of the American Academy of Political and Social Science, 589, 190–202.

  • Petrosino, A., Boruch, R. F., Farrington, D. P., Sherman, L. W., & Weisburd, D. (2003a). Toward evidence-based criminology and criminal justice: systematic reviews, the Campbell Collaboration, and the Crime and Justice Group. International Journal of Comparative Criminology, 3, 42–61.

  • Petrosino, A., Turpin-Petrosino, C., & Buehler, J. (2003b). Scared Straight and other juvenile awareness programs for preventing juvenile delinquency: a systematic review of the randomized experimental evidence. Annals of the American Academy of Political and Social Science, 589, 41–62.

  • Petticrew, M., & Roberts, H. (2006). Systematic reviews in the social sciences: a practical guide. Oxford: Blackwell.

  • Pratt, T. C., McGloin, J. M., & Fearn, N. E. (2006). Maternal cigarette smoking during pregnancy and criminal/deviant behavior: a meta-analysis, 50, 672–690.

  • Rhee, S. H., & Waldman, I. D. (2002). Genetic and environmental influences on antisocial behavior: a meta-analysis of twin and adoption studies. Psychological Bulletin, 128, 490–529.

  • Robins, L. N. (1992). The role of prevention experiments in discovering causes of children’s antisocial behavior. In J. McCord & R. E. Tremblay (Eds.), Preventing antisocial behavior: interventions from birth through adolescence (pp. 3–18). New York: Guilford.

  • Rosenbaum, P. R., & Rubin, D. B. (1983). The central role of the propensity score in observational studies for causal effects. Biometrika, 70, 41–55.

  • Rubin, D. B., & Thomas, N. (1996). Matching using propensity scores: relating theory to practice. Biometrics, 52, 249–264.

  • Rutter, M. (1981). Epidemiological/longitudinal strategies and causal research in child psychiatry. Journal of the American Academy of Child and Adolescent Psychiatry, 20, 513–544.

  • Rutter, M. (1988). Longitudinal data in the study of causal processes: some uses and some pitfalls. In M. Rutter (Ed.), Studies of psychosocial risk: the power of longitudinal data (pp. 1–28). Cambridge: Cambridge University Press.

  • Rutter, M. (2003a). Crucial paths from risk indicator to causal mechanism. In B. B. Lahey, T. E. Moffitt, & A. Caspi (Eds.), Causes of conduct disorder and juvenile delinquency (pp. 3–24). New York: Guilford.

  • Rutter, M. (2003b). Using sex differences in psychopathology to study causal mechanisms: unifying issues and research strategies. Journal of Child Psychology and Psychiatry, 44, 1092–1115.

  • Sanderson, S., Tatt, I. D., & Higgins, J. P. T. (2007). Tools for assessing quality and susceptibility to bias in observational studies in epidemiology: a systematic review and annotated bibliography. International Journal of Epidemiology, 36, 666–676.

  • Shadish, W. R., & Ragsdale, K. (1996). Random versus nonrandom assignment in controlled experiments: do you get the same answer? Journal of Consulting and Clinical Psychology, 64, 1290–1305.

  • Shadish, W. R., Cook, T. D., & Campbell, D. T. (2002). Experimental and quasi-experimental designs for generalized causal inference. Boston, MA: Houghton Mifflin.

  • Shah, B. R., Laupacis, A., Hux, J. E., & Austin, P. C. (2005). Propensity score methods gave similar results to traditional regression modeling in observational studies: a systematic review. Journal of Clinical Epidemiology, 58, 550–559.

  • Sherman, L. W., Gottfredson, D., MacKenzie, D., Eck, J., Reuter, P., & Bushway, S. (1997). Preventing crime: what works, what doesn’t, what’s promising. Report to the U.S. Congress. Washington, DC: US Department of Justice.

  • Smith, J. A., & Todd, P. E. (2001). Reconciling conflicting evidence on the performance of propensity-score matching methods. American Economic Review, 91, 112–118.

  • Stolzenberg, R. M., & Relles, D. A. (1997). Tools for intuition about sample selection bias and its correction. American Sociological Review, 62, 494–507.

  • The Cochrane Collaboration (2007). The name behind the Cochrane Collaboration. Retrieved July, 2007, from http://www.cochrane.org/docs/archieco.htm.

  • Valentine, J. C., & Cooper, H. (2008). A systematic and transparent approach for assessing the methodological quality of intervention effectiveness research: the study design and implementation assessment device (Study DIAD). Psychological Methods, 13, 130–149.

  • Wakschlag, L. S., Pickett, K. E., Cook, E., Benowitz, N. L., & Leventhal, B. L. (2002). Maternal smoking during pregnancy and severe antisocial behavior in offspring: a review. American Journal of Public Health, 92, 966–974.

  • Weisburd, D., Lum, C. M., & Petrosino, A. (2001). Does research design affect study outcomes in criminal justice? Annals of the American Academy of Political and Social Science, 578, 50–70.

  • Wells, L. E., & Rankin, J. H. (1991). Families and delinquency: a meta-analysis of the impact of broken homes. Social Problems, 38, 71–93.

  • Wikström, P.-O. H. (2007). In search of causes and explanations of crime. In R. D. King & E. Wincup (Eds.), Doing research on crime and justice (2nd ed., pp. 117–139). Oxford: Oxford University Press.

  • Wilson, D. B., & Lipsey, M. W. (2001). The role of method in treatment effectiveness research: evidence from meta-analysis. Psychological Methods, 6, 413–429.

  • Winship, C., & Morgan, S. L. (1999). The estimation of causal effects from observational data. Annual Review of Sociology, 25, 659–706.

  • Yarrow, M. R., Campbell, J. D., & Burton, R. V. (1970). Recollections of childhood: a study of the retrospective method. Monographs of the Society for Research in Child Development, 35(iii–iv), 1–83.


Acknowledgments

The authors are grateful to David Humphreys for his help with this paper and to the British Academy and the UK Economic and Social Research Council (grant RES-000-22-2311) for financially supporting the research.

Author information

Correspondence to Joseph Murray.


Cite this article

Murray, J., Farrington, D.P. & Eisner, M.P. Drawing conclusions about causes from systematic reviews of risk factors: The Cambridge Quality Checklists. J Exp Criminol 5, 1–23 (2009). https://doi.org/10.1007/s11292-008-9066-0
