Review Article
Using data sources beyond PubMed has a modest impact on the results of systematic reviews of therapeutic interventions

https://doi.org/10.1016/j.jclinepi.2014.12.017

Abstract

Objectives

Searching multiple sources when conducting systematic reviews is considered good practice. We aimed to investigate the impact of using sources beyond PubMed in systematic reviews of therapeutic interventions.

Study Design and Setting

We randomly selected 50 Cochrane reviews that searched the PubMed (or MEDLINE) and EMBASE databases and included a meta-analysis of ≥10 studies. We checked whether each eligible record in each review (n = 2,700) was retrievable in PubMed and EMBASE. For the first-listed meta-analysis of ≥10 studies in each review, we examined whether excluding studies not found in PubMed affected results.

Results

A median of one record per review was indexed in EMBASE but not in PubMed; a median of four records per review was not indexed in PubMed or EMBASE. Meta-analyses included a median of 13.5 studies; a median of zero studies per meta-analysis was indexed in EMBASE but not in PubMed; a median of one study per meta-analysis was not indexed in PubMed or EMBASE. Meta-analysis using only PubMed-indexed vs. all available studies led to a different conclusion in a single case (on the basis of conventional criteria for statistical significance). In meta-regression analyses, effects in PubMed- vs. non–PubMed-indexed studies were statistically significantly different in a single data set.

Conclusion

For systematic reviews of the effects of therapeutic interventions, gains from searching sources beyond PubMed, and from searching EMBASE in particular, are modest.

Introduction

What is new?

  • For systematic reviews of therapeutic interventions, the gains from searching sources beyond PubMed, and from searching Embase in particular, are modest.

  • When additional relevant studies are identified from non-PubMed sources, they tend not to contribute substantial amounts of information to meta-analysis, and their omission does not appear to substantially influence meta-analysis results.

  • We provide empirical evidence to inform decisions about searching databases beyond PubMed. This evidence should be considered in view of the goals of a particular systematic review, the context in which the review is conducted, and the available resources.

  • When reviews are prepared under resource constraints and the expected number of relevant studies is not too small (e.g., when 10 or more studies are expected to be included in meta-analysis), systematic searching limited to PubMed can provide reliable inputs for subsequent decision and economic analyses.

Using multiple sources to identify relevant evidence is considered good practice when conducting systematic reviews [1], [2], [3]. Searching multiple electronic bibliographic databases will generally yield a greater number of abstracts to screen, thus increasing the probability that all relevant studies will be identified [4]. However, searching multiple databases is resource intensive: access to major bibliographic databases often requires a paid subscription, and retrieving more abstracts increases the screening effort. Furthermore, using multiple databases complicates the logistics of search and retrieval of studies and requires developing expertise in formulating sensitive search strategies for diverse databases (e.g., knowledge regarding indexing methods or the design of appropriate search filters is database specific). Moving beyond electronic bibliographic databases (e.g., hand searching journals, reviewing conference proceedings and dissertations, or directly contacting investigators who may have access to unpublished studies) imposes even more severe demands on resources, often for limited gains in yield [5], [6], [7], [8].

MEDLINE, a bibliographic database maintained by the U.S. National Library of Medicine, is the most commonly used electronic database in applied systematic reviews of biomedical research [9]. MEDLINE records from 5,600 biomedical journals are freely accessible via PubMed (http://www.ncbi.nlm.nih.gov/pubmed) [10]. PubMed also includes in-process citations, citations to life science articles from journals that are out of the scope of MEDLINE, author manuscripts reporting research funded by the National Institutes of Health, and some books, for a total of 23 million records.

EMBASE is a commercial bibliographic database maintained by Elsevier Science. It covers sources of biomedical citations similar to those covered by PubMed [11], indexes roughly 8,300 biomedical journals, and contains more than 28 million records. It covers roughly 90% of PubMed-indexed publications, resulting in large overlap between the two databases (10% of PubMed publications are considered to be out of EMBASE's scope). Additionally, EMBASE includes all MEDLINE and OldMEDLINE records. It also indexes over 2,500 journals not found in PubMed (many are journals published outside North America) and has been adding 300,000 conference abstracts per year since 2009.

Several empirical studies have evaluated the impact of conducting searches exclusively in MEDLINE as compared with developing additional search strategies for use in other electronic databases or using additional sources of studies [2], [4], [5], [12], [13]. These studies have reported that searching MEDLINE alone may miss up to 50% of all relevant randomized controlled trials; in recent investigations, this proportion has ranged from 32.2% to 45%. Two reasons, besides incomplete coverage by MEDLINE, may explain these results. First, some empirical assessments have relied on simple search strategies and have not taken full advantage of using medical subject heading terms or truncated keywords. Second, with few exceptions [5], [6], [14], studies have relied on comparisons of yield based on independent search strategies constructed for use in each database instead of looking for included articles in the databases being compared. Although such comparisons are reflective of actual review practice (where identical search strategies for use in different databases cannot always be constructed), they conflate the ability of review teams to construct sensitive search strategies with the actual coverage offered by the databases and render the interpretation of the yield from each database less straightforward [4], [5], [6], [12].

We sought to evaluate the impact of using sources of evidence beyond PubMed in applied systematic reviews of health care interventions, by focusing on the maximum possible yield that can be attained by each possible evidence source. In particular, we aimed to evaluate the incremental yield of using EMBASE and other non-PubMed/non-EMBASE sources of evidence, in addition to PubMed. Furthermore, we examined whether using PubMed alone or multiple sources of evidence affects the results of quantitative synthesis (meta-analysis).
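The comparison of meta-analysis results with and without non-PubMed studies can be illustrated with a minimal Python sketch. It assumes a DerSimonian-Laird random-effects model and a 95% confidence interval as the significance criterion; the text above does not state which model the authors used, and the study data, the function names, and the `indexed_in_pubmed` flag are all hypothetical.

```python
import math

def random_effects(effects, variances):
    """DerSimonian-Laird random-effects pooled estimate with a 95% CI."""
    k = len(effects)
    w = [1.0 / v for v in variances]                 # inverse-variance weights
    sw = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sw
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))  # Cochran's Q
    c = sw - sum(wi ** 2 for wi in w) / sw
    # Between-study variance, truncated at zero (and zero for a single study).
    tau2 = max(0.0, (q - (k - 1)) / c) if c > 0 else 0.0
    w_star = [1.0 / (v + tau2) for v in variances]
    pooled = sum(wi * yi for wi, yi in zip(w_star, effects)) / sum(w_star)
    se = math.sqrt(1.0 / sum(w_star))
    return pooled, pooled - 1.96 * se, pooled + 1.96 * se

def is_significant(ci_low, ci_high):
    """True when the 95% CI excludes the null value of 0 (log scale)."""
    return ci_low > 0 or ci_high < 0

def conclusion_changes(studies):
    """Compare significance using all studies vs. PubMed-indexed ones only."""
    all_fit = random_effects([s[0] for s in studies], [s[1] for s in studies])
    pubmed = [s for s in studies if s[2]]
    pm_fit = random_effects([s[0] for s in pubmed], [s[1] for s in pubmed])
    return is_significant(*all_fit[1:]) != is_significant(*pm_fit[1:])

# Hypothetical data: (log odds ratio, variance, indexed_in_pubmed)
studies = [
    (-0.35, 0.04, True), (-0.10, 0.09, True), (-0.42, 0.06, True),
    (-0.20, 0.05, True), (0.05, 0.12, True), (-0.31, 0.03, True),
    (-0.15, 0.08, True), (-0.28, 0.07, True), (-0.50, 0.20, False),
    (-0.05, 0.15, True),
]
print("Conclusion changes without non-PubMed studies:",
      conclusion_changes(studies))
```

Repeating such a comparison for one meta-analysis per review would give the kind of count reported in the abstract (a single changed conclusion), although the exact procedure used there may differ from this sketch.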

Selection of systematic reviews

We screened systematic reviews of therapeutic interventions from the Cochrane Database of Systematic Reviews (Issue 4, 2012) in random order until we obtained 50 reviews that met our inclusion criteria (I.T.S). Specifically, we assigned a random number between 0 and 1 (with a uniform distribution) to every review in the Cochrane database. We then ranked reviews by this random number and sequentially applied our selection criteria until we included 50 reviews. Eligible reviews were those that

Results

We screened 613 randomly selected Cochrane reviews (among 4,694 included in Issue 4 of the 2012 Cochrane Library) to identify 50 that had searched both the PubMed/MEDLINE and EMBASE databases and contained a meta-analysis with at least 10 primary studies. The remaining 563 reviews were not considered further (Fig. 1). The complete list of included Cochrane reviews is provided in Appendix Document 1 at www.jclinepi.com.
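The random-order screening procedure (assign each review a uniform random number, rank by it, and apply the criteria sequentially until 50 reviews are included) can be sketched in Python as follows. This is an illustration only: the seed, the integer review identifiers, and the stand-in eligibility rule are hypothetical, and the real criteria required human screening of each review.

```python
import random

def select_reviews(review_ids, is_eligible, target=50, seed=2012):
    """Screen reviews in random order until `target` eligible ones are found.

    Returns the included review IDs and the number of reviews screened.
    """
    rng = random.Random(seed)  # hypothetical seed, for reproducibility only
    # Assign each review a Uniform(0, 1) random number and rank by it.
    keyed = sorted((rng.random(), rid) for rid in review_ids)
    included, screened = [], 0
    for _, rid in keyed:
        if len(included) >= target:
            break
        screened += 1
        if is_eligible(rid):
            included.append(rid)
    return included, screened

# Hypothetical usage: 4,694 review IDs and a placeholder eligibility rule.
review_ids = list(range(4694))
included, screened = select_reviews(review_ids, lambda rid: rid % 12 == 0)
print(len(included), "included after screening", screened, "reviews")
```

Ranking by a random key is equivalent to shuffling the list, so the first 50 eligible reviews encountered form a simple random sample of all eligible reviews without the whole database having to be screened.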

Discussion

Systematic reviews aim to include all studies relevant to their guiding question, to minimize bias and increase precision [1], [5]. Reviews that search only a single database (e.g., PubMed) can therefore miss studies identifiable only from other sources. Missing relevant evidence can result in lower precision of estimates. It can also introduce bias, if the probability that a study is indexed in a database is dependent on its results, a phenomenon sometimes referred to as “database bias.”
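The precision argument can be made concrete under a simplifying assumption that is ours rather than the article's: a fixed-effect inverse-variance meta-analysis, in which study i contributes an estimate y_i with variance v_i. The symbols S, P, and A below are introduced only for this illustration, with P the set of PubMed-indexed studies and A the set of all eligible studies.

```latex
\hat{\theta}_S = \frac{\sum_{i \in S} w_i y_i}{\sum_{i \in S} w_i},
\qquad w_i = \frac{1}{v_i},
\qquad \operatorname{Var}\bigl(\hat{\theta}_S\bigr) = \frac{1}{\sum_{i \in S} w_i}

P \subseteq A
\;\Longrightarrow\;
\sum_{i \in P} w_i \le \sum_{i \in A} w_i
\;\Longrightarrow\;
\operatorname{Var}\bigl(\hat{\theta}_P\bigr) \ge \operatorname{Var}\bigl(\hat{\theta}_A\bigr)
```

Dropping studies can therefore only widen the confidence interval under this model, but the loss is small when the omitted studies carry little total weight. Bias is a separate matter: the pooled estimate shifts only if the omitted studies' effects differ systematically from those of the retained ones.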

Acknowledgments

Some of the results reported in this article were presented at the 2013 Annual Meeting for the Society for Medical Decision Making in Baltimore, Maryland. The first author (C.W.H.) received the Lee Lusted Award in the Health Services and Policy Research category for this work.

References (31)

  • J.M. Glanville et al. How to identify randomized controlled trials in MEDLINE: ten years on. J Med Libr Assoc (2006)
  • S. Hopewell et al. Grey literature in meta-analyses of randomized trials of health care interventions. Cochrane Database Syst Rev (2007)
  • MEDLINE® Fact Sheet. Available at: http://www.nlm.nih.gov/pubs/factsheets/medline.html. Accessed August 5,...
  • PubMed®: MEDLINE® retrieval on the World Wide Web Fact Sheet. Available at:...
  • I. Crowlesmith. Coverage of MEDLINE in Embase (2011)

Funding: None.

Conflict of interest: None.