The binomial distribution of meta-analysis was preferred to model within-study variability
Introduction
In this paper we consider meta-analysis of proportions. Common examples of proportions being meta-analyzed are the sensitivity or specificity of a diagnostic test. This article is therefore written from a diagnostic research perspective, though the results apply to meta-analysis of proportions in general, such as prevalences or incidences.
Meta-analytic methods for a diagnostic test depend on the type of data that are available from the different studies. In most medical articles, the commonly reported measures of diagnostic test accuracy are sensitivity and/or specificity. Alternatively, other measures such as diagnostic odds ratio (OR), predictive values, and area under the receiver operating characteristic (ROC) curve are reported.
Statistical methods to pool diagnostic test measures from different studies rest on different assumptions. For example, it might be assumed that the observed differences between individual study results are due only to sampling variation, leading to what is called a fixed effect analysis. When each study reports an estimate of the sensitivity or specificity, the simplest way to obtain a summary measure is to calculate the average sensitivity and/or specificity, possibly with weights depending on the within-study sample sizes or standard errors (SEs). However, this approach is usually inappropriate, because it is likely that variability beyond chance can be attributed to between-study differences [1], [2]. Some of the between-study variability could be accounted for by explanatory variables in a regression analysis. Usually, however, not all heterogeneity can be explained, and a random effects model that allows for between-study heterogeneity is used in the statistical analysis [3], [4].
In the last decade, many random effects methods have been developed to relax the fixed effect assumptions in meta-analysis [5], [6], [7], [8] of diagnostic tests [9], [10]. Some of these methods allow sensitivity and specificity to be analyzed jointly. However, numerous meta-analyses published in the medical literature are interested in meta-analyzing only sensitivity or specificity, and in this paper we concentrate on that situation. The standard analysis is then the DerSimonian and Laird [6] random effects model. It is not well known that this method can be heavily biased when applied to proportions such as sensitivities or specificities, though some authors have mentioned this [5], [11], [12]. Chang et al. [11] proposed a method that repairs the bias, but that article has been cited only once since 2001, showing that the method is not used in practice. This may be because the method is difficult to carry out in standard statistical packages. The reason the standard method is biased is that the binomial within-study likelihood of the sensitivity or specificity is approximated by a normal likelihood. It is well known that this approximation can be poor if the proportion is close to zero or one and/or the sample size is small, so bias can be expected when this is the case in a meta-analysis. Moreover, even when the normal approximation is adequate for ordinary applications, bias can be introduced because its use in meta-analysis ignores the correlation between the estimated proportion and its estimated variance. We come back to this point in the next section. Nowadays, standard statistical packages allow generalized linear mixed models (GLMMs) to be fitted, which makes it easy to use the exact binomial within-study distribution of the estimated sensitivity or specificity instead of a normal approximation.
In this article, we call the normal-approximation approach the approximate method and the binomial-likelihood approach the exact method.
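The exact method amounts to maximizing the marginal likelihood of a binomial-normal model: y_i ~ Binomial(n_i, p_i) with logit(p_i) ~ N(η, τ²). The authors fitted this with SAS; purely as an illustration (not their code), a minimal Python sketch can integrate out the random effect by Gauss-Hermite quadrature. The function names and starting values here are our own choices.

```python
import numpy as np
from scipy.optimize import minimize
from scipy.special import expit, gammaln

def exact_loglik(params, y, n, nodes, weights):
    """Marginal log-likelihood of the binomial-normal model:
    y_i ~ Binomial(n_i, p_i), logit(p_i) ~ N(eta, tau^2),
    integrating over the random effect by Gauss-Hermite quadrature."""
    eta, log_tau = params
    tau = np.exp(log_tau)                        # keeps tau positive
    theta = eta + np.sqrt(2.0) * tau * nodes     # quadrature points on the logit scale
    p = np.clip(expit(theta), 1e-12, 1 - 1e-12)
    # binomial log-pmf: studies in rows, quadrature nodes in columns
    logpmf = (gammaln(n[:, None] + 1) - gammaln(y[:, None] + 1)
              - gammaln((n - y)[:, None] + 1)
              + y[:, None] * np.log(p) + (n - y)[:, None] * np.log1p(-p))
    # integrate per study: sum_k (w_k / sqrt(pi)) * pmf_ik, then log and sum
    return np.sum(np.log(np.exp(logpmf) @ (weights / np.sqrt(np.pi))))

def fit_exact(y, n, n_nodes=30):
    """Maximize the exact likelihood over (eta, log tau)."""
    y = np.asarray(y, float)
    n = np.asarray(n, float)
    nodes, weights = np.polynomial.hermite.hermgauss(n_nodes)
    res = minimize(lambda par: -exact_loglik(par, y, n, nodes, weights),
                   x0=np.array([0.0, np.log(0.5)]), method="Nelder-Mead")
    eta_hat, log_tau_hat = res.x
    return eta_hat, np.exp(log_tau_hat)
```

Because the binomial pmf is used directly, studies with zero or all "successes" contribute to the likelihood without any continuity correction, which is precisely what the normal approximation cannot do.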
The purpose of this article is to compare the performance of the two modeling approaches, approximate and exact, through a simulation study. In Section 2, both methods are discussed. In Section 3 we describe the design of the simulation study, and in Section 4 we present the results. In Section 5, we apply the methods to real meta-analysis data. We end with a discussion in Section 6. We used SAS software (version 9; SAS Institute, Cary, NC) to simulate the data and to estimate the parameters of the models discussed in Section 2.
Random effects model
In a situation where the interest is to meta-analyze sensitivities or specificities separately, the commonly used method is the DerSimonian and Laird [6] random effects model. In the remainder of this paper we will talk about meta-analyzing sensitivities, but all the results apply to specificities as well. In fact, the results apply to any meta-analysis where the target parameter is a proportion or probability and each study contributes a sample size and a number of “successes.” Unlike a fixed
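The DerSimonian and Laird approach named here is the approximate method: each study's observed logit sensitivity is treated as normal, with the delta-method variance 1/y_i + 1/(n_i - y_i), and τ² is estimated by the method of moments. As an illustrative sketch (in Python rather than the authors' SAS, with our own function name):

```python
import numpy as np

def approx_dl_logit(y, n):
    """Approximate method: DerSimonian-Laird pooling of logit-transformed
    proportions, within-study variance 1/y_i + 1/(n_i - y_i).
    Note: studies with y = 0 or y = n would need a 0.5 continuity
    correction, one source of the bias discussed in the text."""
    y = np.asarray(y, float)
    n = np.asarray(n, float)
    theta = np.log(y / (n - y))                # observed logit sensitivity
    v = 1.0 / y + 1.0 / (n - y)                # delta-method variance of theta
    w = 1.0 / v
    theta_fe = np.sum(w * theta) / np.sum(w)   # fixed-effect pooled estimate
    Q = np.sum(w * (theta - theta_fe) ** 2)    # Cochran's Q statistic
    c = np.sum(w) - np.sum(w ** 2) / np.sum(w)
    tau2 = max(0.0, (Q - (len(y) - 1)) / c)    # moment estimator of tau^2
    w_re = 1.0 / (v + tau2)                    # random-effects weights
    eta_hat = np.sum(w_re * theta) / np.sum(w_re)
    se = np.sqrt(1.0 / np.sum(w_re))
    return eta_hat, tau2, se
```

The estimated variance v depends on the observed proportion itself, which is the proportion-variance correlation mentioned in the Introduction as a source of bias.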
Simulation study
A simulation study was carried out to compare the performance of the two methods, approximate and exact, discussed in Section 2. We investigated the effect of the number of studies included in the meta-analysis, the mean within-study sample size, the between-study variability, and the true median sensitivity. The data were simulated in two steps. First, the true logit sensitivity, ηi, was simulated from a normal distribution with a given mean logit sensitivity η and between-studies variance τ2.
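The two-step generation scheme described above can be sketched as follows; this is an illustrative Python version under assumed parameter values (the paper's exact scheme for drawing study sizes may differ, and the function name is ours):

```python
import numpy as np

def simulate_meta(k, eta, tau, n_sizes, rng):
    """Two-step simulation of one meta-analysis data set:
    (1) draw true logit sensitivities eta_i ~ N(eta, tau^2);
    (2) draw true-positive counts y_i ~ Binomial(n_i, expit(eta_i))."""
    eta_i = rng.normal(eta, tau, size=k)
    p_i = 1.0 / (1.0 + np.exp(-eta_i))     # inverse logit: true sensitivities
    y = rng.binomial(n_sizes, p_i)
    return y, p_i

rng = np.random.default_rng(7)
n_sizes = np.full(20, 50)                  # 20 studies of size 50 (illustrative)
y, p_true = simulate_meta(k=20, eta=2.2, tau=0.5, n_sizes=n_sizes, rng=rng)
```

Repeating this over a grid of k, mean study size, τ², and median sensitivity, and fitting both methods to each replicate, yields the bias and MSE comparisons reported next.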
Simulation results
The results from the simulations are presented in Fig. 1, Fig. 2, and Table 2. Fig. 1 shows the biases and MSEs for the mean logit sensitivity η. It can be seen from Fig. 1a that the exact likelihood approach yields estimates of η that are nearly unbiased across the different scenarios; that is, the expected value of the estimated η using the exact method is almost equal to the true value, and always closer to the true value than with the approximate likelihood method. The bias in the
Data example
To illustrate the methods discussed in this article, we reanalyzed the data of a published meta-analysis [33]. Patwardhan et al. [33] present data from 15 studies assessing the operating characteristics of positron emission tomography (PET) using fluorine 18 fluorodeoxyglucose (FDG). They searched the MEDLINE, CINAHL, and HealthSTAR databases for articles published between 1989 and 2003. Articles were selected if FDG PET was performed with a dedicated scanner and the resolution was
Discussion
In numerous medical articles, sensitivities or specificities, or more generally proportions, are meta-analyzed, nowadays almost invariably with the DerSimonian and Laird [6] random effects model. This model uses a normal distribution for the logit-transformed true probabilities. Alternatively, one could assume a beta distribution for the true probabilities; the model can then be fitted in a statistical package such as EGRET. However, this model is not used in practice, perhaps because many
References (40)
- et al. Meta-analysis in clinical trials. Control Clin Trials (1986)
- et al. Bivariate analysis of sensitivity and specificity produces informative summary measures in diagnostic reviews. J Clin Epidemiol (2005)
- et al. Guidelines for meta-analyses evaluating diagnostic tests. Ann Intern Med (1994)
- et al. Exploring sources of heterogeneity in systematic reviews of diagnostic tests. Stat Med (2002)
- et al. How should meta-regression analyses be undertaken and interpreted? Stat Med (2002)
- et al. Detecting and describing heterogeneity in meta-analysis. Stat Med (1998)
- et al. A random-effect regression model for meta-analysis. Stat Med (1995)
- et al. Combining multiple outcome measures in meta-analysis: an application. Stat Med (2003)
- et al. Advanced methods in meta-analysis: multivariate approach and meta-regression. Stat Med (2002)
- et al. A hierarchical regression approach to meta-analysis of diagnostic test accuracy evaluations. Stat Med (2001)
- Meta-analysis of binary data: which study variance estimate to use? Stat Med
- Generalized linear mixed models for meta-analysis. Stat Med
- The analysis of binary data
- Combining independent studies of a diagnostic test into a summary ROC curve: data-analytic approaches and some additional considerations. Stat Med
- A multilevel model framework for meta-analysis of clinical trials with binary outcomes. Stat Med
- Assessing the amount of heterogeneity in random-effects meta-analysis. Biom J
- SAS/STAT 9.1 user's guide
- A user's guide to MLwiN
- The accuracy of single serum progesterone measurement in the diagnosis of ectopic pregnancy: a meta-analysis. Hum Reprod
- Systematic review of the accuracy of the ParaSight™-F test in the diagnosis of Plasmodium falciparum malaria. Med Sci Monit