Original Article
The binomial distribution of meta-analysis was preferred to model within-study variability

https://doi.org/10.1016/j.jclinepi.2007.03.016

Abstract

Objective

When studies report proportions such as sensitivity or specificity, it is customary to meta-analyze them using the DerSimonian and Laird random effects model. This method approximates the within-study variability of the proportion by a normal distribution, which may lead to bias for several reasons. Alternatively, an exact likelihood approach based on the binomial within-study distribution can be used. This method can easily be performed in standard statistical packages. We investigate the performance of the standard method and the alternative approach.

Study Design and Setting

We compare the two approaches through a simulation study, in terms of bias, mean-squared error, and coverage probabilities. We varied the size of the overall sensitivity or specificity, the between-studies variance, the within-study sample sizes, and the number of studies. The methods are illustrated using a published meta-analysis data set.

Results

The exact likelihood approach always performs better than the approximate approach and gives unbiased estimates. The coverage probability, in particular for the profile likelihood, is also reasonably acceptable. In contrast, the approximate approach shows substantial bias with very poor coverage probability in many cases.

Conclusion

The exact likelihood approach is the method of preference and should be used whenever feasible.

Introduction

In this paper we consider meta-analysis of proportions. The most frequently occurring examples of meta-analyzed proportions are the sensitivity and specificity of a diagnostic test. Therefore, this article is written from a diagnostic research perspective, though the results apply to meta-analysis of proportions in general, such as prevalences or incidences.

Meta-analytic methods for a diagnostic test depend on the type of data that are available from the different studies. In most medical articles, the commonly reported measures of diagnostic test accuracy are sensitivity and/or specificity. Alternatively, other measures such as diagnostic odds ratio (OR), predictive values, and area under the receiver operating characteristic (ROC) curve are reported.

Statistical methods to pool diagnostic test measures from different studies rest on different assumptions. For example, it might be assumed that the observed differences between individual study results are due only to sampling variation, leading to what is called a fixed effect analysis. When each study reports an estimate of the sensitivity or specificity, the simplest way to obtain a summary measure is to calculate the average sensitivity and/or specificity, possibly with weights depending on the within-study sample sizes or standard errors (SEs). However, this approach is usually inappropriate because it is likely that variability beyond chance can be attributed to between-study differences [1], [2]. Some of the between-study variability could be accounted for using explanatory variables in a regression analysis. But usually not all heterogeneity can be explained, and a random effects model that allows for between-studies heterogeneity is used in the statistical analysis [3], [4].
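Such a weighted average is simple arithmetic. A minimal sketch in Python, using hypothetical study counts (not from any study in this paper) and inverse-variance weights on the logit scale:

```python
import math

# Hypothetical per-study counts: true positives (tp) out of diseased patients (n).
tp = [45, 30, 88, 12]
n = [50, 36, 100, 15]

weights, logits = [], []
for t, m in zip(tp, n):
    logits.append(math.log(t / (m - t)))  # logit of the observed sensitivity
    var = 1 / t + 1 / (m - t)             # delta-method variance of the logit
    weights.append(1 / var)               # inverse-variance weight

pooled_logit = sum(w * l for w, l in zip(weights, logits)) / sum(weights)
pooled_sens = 1 / (1 + math.exp(-pooled_logit))  # back-transform to a proportion
print(round(pooled_sens, 3))                     # → 0.868
```

As the text notes, such a fixed effect summary is usually inappropriate when real between-study heterogeneity is present, because the weights then understate each study's total uncertainty.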

In the last decade, many random effects methods have been developed to relax the fixed effect assumptions in meta-analysis [5], [6], [7], [8] of diagnostic tests [9], [10]. Some of these methods enable analyzing sensitivity and specificity jointly. However, numerous meta-analyses are published in the medical literature in which one is interested in meta-analyzing only sensitivity or specificity, and in this paper we concentrate on this situation. The standard way of analysis is then the DerSimonian and Laird [6] random effects model. It is not well known that this method can be heavily biased when applied to proportions, such as specificities or sensitivities, though some authors have mentioned this [5], [11], [12]. Chang et al. [11] have proposed a method that repairs the bias. However, this article has been cited only once since the year 2001, showing that in practice the method is not used. This might be due to the difficulty of carrying out the method in standard statistical packages. The reason the standard method is biased is that the binomial within-study likelihood of the sensitivity or specificity is approximated by a normal likelihood. It is well known that this approximation can be poor if the proportion is close to one or zero, and/or the sample size is small, so bias can be expected if this is the case in a meta-analysis. Moreover, even if the normal approximation were good enough for ordinary applications, bias could still be introduced because the use of the normal approximation in meta-analysis ignores the correlation between the estimated proportion and its variance. We come back to this point in the next section. Nowadays standard statistical packages allow for fitting generalized linear mixed models (GLMMs), which makes it very easy to use the exact binomial within-study distribution of the estimated sensitivity or specificity instead of a normal approximation of it. In this article, we call the latter the approximate method and the former the exact method.
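For concreteness, the approximate method can be sketched in a few lines. The following Python fragment (with hypothetical counts, not data from this paper) computes the standard DerSimonian and Laird estimate on the logit scale:

```python
import math

# Hypothetical study data: true positives (tp) out of diseased patients (n).
tp = [18, 25, 9, 40, 14]
n = [20, 30, 10, 45, 16]

y = [math.log(t / (m - t)) for t, m in zip(tp, n)]  # observed logit sensitivities
v = [1 / t + 1 / (m - t) for t, m in zip(tp, n)]    # approximate within-study variances
w = [1 / vi for vi in v]

# Fixed-effect pooled logit and Cochran's Q statistic.
mu_fe = sum(wi * yi for wi, yi in zip(w, y)) / sum(w)
q = sum(wi * (yi - mu_fe) ** 2 for wi, yi in zip(w, y))

# DerSimonian and Laird moment estimator of the between-studies variance tau^2.
k = len(y)
c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
tau2 = max(0.0, (q - (k - 1)) / c)

# Random-effects pooled logit sensitivity, back-transformed to a proportion.
w_re = [1 / (vi + tau2) for vi in v]
mu_re = sum(wi * yi for wi, yi in zip(w_re, y)) / sum(w_re)
pooled = 1 / (1 + math.exp(-mu_re))
```

Note that each study's weight depends on its estimated variance `v`, which is itself a function of the observed proportion; this is exactly the proportion-variance correlation that the exact binomial likelihood avoids.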

The purpose of this article is to compare the performance of the two modeling approaches, approximate and exact, through a simulation study. In Section 2, both methods are discussed. In Section 3 we describe the design of the simulation study, and in Section 4 we present the results. In Section 5, we apply the methods on real meta-analysis data. We end with a discussion in Section 6. We used SAS software (version 9; SAS Institute, Cary, NC) to simulate the data and to estimate the parameters for the models discussed in Section 2.

Section snippets

Random effects model

In a situation where the interest is to meta-analyze sensitivities or specificities separately, the commonly used method is the DerSimonian and Laird [6] random effects model. In the remainder of this paper we will talk about meta-analyzing sensitivities, but all the results apply to specificities as well. In fact, the results apply to any meta-analysis where the target parameter is a proportion or probability and each study contributes a sample size and a number of “successes.” Unlike a fixed
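Written out with the notation used in the simulation section (study-specific logit sensitivity ηi, overall mean η, between-studies variance τ²), a plausible rendering of the two competing model formulations, consistent with the paper's description, is:

```latex
% Exact method: binomial within-study likelihood
x_i \mid \eta_i \sim \mathrm{Binomial}\bigl(n_i,\ \operatorname{logit}^{-1}(\eta_i)\bigr),
\qquad \eta_i \sim N(\eta, \tau^2)

% Approximate method: normal within-study likelihood on the logit scale
\hat{\eta}_i \mid \eta_i \sim N\bigl(\eta_i,\ \hat{\sigma}_i^2\bigr),
\qquad \hat{\sigma}_i^2 = \frac{1}{x_i} + \frac{1}{n_i - x_i}
```

The two formulations share the same between-studies normal distribution; they differ only in whether the within-study layer is the exact binomial likelihood or its normal approximation.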

Simulation study

A simulation study was carried out to compare the performance of the two methods, approximate and exact, discussed in Section 2. We investigated the effect of the number of studies included in the meta-analysis, the mean within-study sample size, the between-studies variability, and the true median sensitivity. The data were simulated in two steps. First, the true logit sensitivity, ηi, was simulated from a normal distribution with a given mean logit sensitivity η and between-studies variance τ2.
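The two-step data generation described above can be sketched as follows; the parameter values here are assumptions chosen for illustration, not the paper's simulation grid:

```python
import math
import random

random.seed(1)

# Assumed parameter values for illustration.
eta, tau2 = 1.5, 0.5     # mean logit sensitivity and between-studies variance
n_studies, n_i = 10, 40  # number of studies and within-study sample size

studies = []
for _ in range(n_studies):
    # Step 1: draw the study's true logit sensitivity from N(eta, tau2).
    eta_i = random.gauss(eta, math.sqrt(tau2))
    p_i = 1 / (1 + math.exp(-eta_i))
    # Step 2: draw the observed number of true positives from Binomial(n_i, p_i).
    x_i = sum(random.random() < p_i for _ in range(n_i))
    studies.append((x_i, n_i))
```

Each simulated meta-analysis is then a list of (successes, sample size) pairs, which both the approximate and the exact methods can be applied to.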

Simulation results

The results from the simulations are presented in Fig. 1, Fig. 2, and Table 2. Fig. 1 shows the biases and MSEs for the mean logit sensitivity η. It can be seen from Fig. 1a that the exact likelihood approach yields estimates of η that are essentially unbiased across the different scenarios; that is, the expected value of the estimated η under the exact method is almost equal to the true value, and always closer to the true value than under the approximate likelihood method. The bias in the

Data example

To illustrate the methods discussed in this article, we reanalyzed the data of a published meta-analysis [33]. Patwardhan et al. [33] present data from 15 studies to assess the operating characteristics of positron emission tomography (PET) using fluorine 18 fluorodeoxyglucose (FDG). They searched the MEDLINE, CINAHL, and HealthSTAR databases for articles published between 1989 and 2003. Articles were selected if FDG PET was performed with a dedicated scanner and the resolution was

Discussion

In numerous medical articles, sensitivities or specificities, or more generally proportions, are analyzed, nowadays almost invariably with the DerSimonian and Laird [6] random effects model. This model uses a normal distribution for the logit transformed true probabilities. Alternatively, one could assume a beta distribution for the true probabilities. The model can then be fitted in a statistical package such as EGRET. However, this model is not used in practice, may be due to the fact that many

References (40)

  • R. DerSimonian et al.

    Meta-analysis in clinical trials

    Control Clin Trials

    (1986)
  • J.B. Reitsma et al.

    Bivariate analysis of sensitivity and specificity produces informative summary measures in diagnostic reviews

    J Clin Epidemiol

    (2005)
  • L. Irwig et al.

    Guidelines for meta-analyses evaluating diagnostic tests

    Ann Intern Med

    (1994)
  • J.G. Lijmer et al.

    Exploring sources of heterogeneity in systematic reviews of diagnostic tests

    Stat Med

    (2002)
  • S.G. Thompson et al.

How should meta-regression analyses be undertaken and interpreted?

    Stat Med

    (2002)
  • R.J. Hardy et al.

    Detecting and describing heterogeneity in meta-analysis

    Stat Med

    (1998)
  • C.S. Berkey et al.

    A random-effect regression model for meta-analysis

    Stat Med

    (1995)
  • L.R. Arends et al.

    Combining multiple outcome measures in meta-analysis: an application

    Stat Med

    (2003)
  • H.C. Van Houwelingen et al.

    Advanced methods in meta-analysis: multivariate approach and meta-regression

    Stat Med

    (2002)
  • C.M. Rutter et al.

    A hierarchical regression approach to meta-analysis of diagnostic test accuracy evaluations

    Stat Med

    (2001)
  • B.H. Chang et al.

    Meta-analysis of binary data: which study variance estimate to use?

    Stat Med

    (2001)
  • R.W. Platt et al.

    Generalized linear mixed models for meta-analysis

Stat Med

    (1999)
  • D.R. Cox

    The analysis of binary data

    (1970)
  • L.E. Moses et al.

    Combining independent studies of a diagnostic test into a summary ROC curve: data-analytic approaches and some additional considerations

    Stat Med

    (1993)
  • R.M. Turner et al.

    A multilevel model framework for meta-analysis of clinical trials with binary outcomes

    Stat Med

    (2000)
  • G. Knapp et al.

    Assessing the amount of heterogeneity in random-effects meta-analysis

    Biom J

    (2006)
  • SAS Institute Inc.

    SAS/STAT 9.1 user's guide

    (2004)
  • J. Rasbash et al.

    A user's guide to MLwiN

    (2004)
  • B.W.J. Mol et al.

    The accuracy of single serum progesterone measurement in the diagnosis of ectopic pregnancy: a meta-analysis

Hum Reprod

    (1998)
  • M. Cruciani et al.

Systematic review of the accuracy of the ParaSight™-F test in the diagnosis of Plasmodium falciparum malaria

    Med Sci Monit

    (2004)