Series: Multiplicity in randomised trials I: endpoints and treatments
The issue
Multiplicity portends troubles for researchers and readers alike for two main reasons. First, investigators should report all analytical comparisons implemented. Unfortunately, they sometimes hide the complete analysis, handicapping the reader's understanding of the results. Second, if researchers properly report all comparisons made, statisticians proffer statistical adjustments to account for multiple comparisons. Investigators need to know whether they should use such adjustments, and
A proposed statistical solution
Most statisticians would recommend reducing the number of comparisons as a solution to multiplicity. Given many tests, however, some statisticians recommend making adjustments such that the overall probability of a false-positive finding equals α after making d comparisons in the trial. Authors usually attribute the method to Bonferroni and simply state that to test d comparisons in a trial at α, all comparisons should be performed at the α/d significance level, not at the α level.5, 12 Thus, for
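The Bonferroni rule described above can be sketched in a few lines of Python. This is an illustrative sketch only, not code from the article; the function names and the example p-values are hypothetical.

```python
# Sketch of the Bonferroni adjustment: with d comparisons tested at an
# overall (familywise) level alpha, each individual comparison is tested
# at alpha / d instead of alpha.

def bonferroni_threshold(alpha: float, d: int) -> float:
    """Per-comparison significance level for d comparisons at overall level alpha."""
    return alpha / d

def significant_after_bonferroni(p_values, alpha=0.05):
    """Flag which p-values remain significant after the Bonferroni adjustment."""
    threshold = bonferroni_threshold(alpha, len(p_values))
    return [p < threshold for p in p_values]

# Hypothetical example: five comparisons at alpha = 0.05 are each tested at 0.01.
p_values = [0.003, 0.020, 0.048, 0.300, 0.008]
print(bonferroni_threshold(0.05, 5))            # 0.01
print(significant_after_bonferroni(p_values))   # [True, False, False, False, True]
```

Note that two comparisons (p = 0.020 and p = 0.048) that would pass at the unadjusted α = 0.05 level no longer count as significant at the adjusted α/d = 0.01 level, which is exactly the conservatism critics of the method point to.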
Multiple endpoints
Although the ideal approach for the design and analysis of randomised controlled trials relies on one primary endpoint, investigators typically examine more than one. The most egregious abuse with multiplicity arises in the data-dredging that happens behind the scenes and remains unreported. Investigators analyse many endpoints, but only report the favourable significant comparisons. Failure to note all the comparisons actually made is unscientific if unwitting and fraudulent if intentional.
Multiple treatments (multiarm trials)
Addressing multiplicity from multiple treatments is a more tractable problem than from multiple endpoints. First, investigators can avert multiple tests by one global test of significance across comparison groups13—eg, comparing A vs B vs C in a three-arm trial—or by modelling a dose-response relation.21 Second, and perhaps most importantly, researchers have less opportunity to data-dredge on many treatments and not report them. While they easily can add more endpoints for analysis, they would
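The "one global test" option mentioned above can be illustrated with a single Pearson chi-square test of a binary endpoint across three arms, replacing three pairwise tests. This is a hedged sketch under assumed, hypothetical trial data; the article itself does not supply this code or these numbers.

```python
# One global test across a three-arm trial (A vs B vs C) on a binary
# endpoint: a single Pearson chi-square test on the 2 x 3 table of
# events/non-events, instead of three pairwise comparisons.

def chi_square_statistic(events, totals):
    """Pearson chi-square statistic for a 2 x k table of events vs non-events."""
    grand_events = sum(events)
    grand_total = sum(totals)
    rate = grand_events / grand_total  # pooled event rate under the null
    stat = 0.0
    for e, n in zip(events, totals):
        for observed, expected in ((e, n * rate), (n - e, n * (1 - rate))):
            stat += (observed - expected) ** 2 / expected
    return stat

# Hypothetical three-arm trial: events out of participants per arm.
events = [30, 45, 60]    # arms A, B, C
totals = [100, 100, 100]

stat = chi_square_statistic(events, totals)
# Critical value for chi-square with k - 1 = 2 degrees of freedom at alpha = 0.05.
CRITICAL_2DF_05 = 5.991
print(round(stat, 2), stat > CRITICAL_2DF_05)   # 18.18 True
```

A single significant global result licenses the conclusion that the arms differ somewhere, at one overall α, without multiplying tests; only then would one descend to pairwise contrasts.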
The role of adjustments for multiplicity
Sometimes formal adjustments for multiplicity are inescapable. An obvious example would arise with certain decision-making criteria in submissions to a regulatory agency for drug approval. If the sponsor specifies more than one primary endpoint and proposes to claim treatment effect if one or more are significant, investigators should adjust for multiplicity.3 Furthermore, the same principle extends to all investigators whose decision-making intent is to claim an effect based on any one of a
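The arithmetic behind this principle is worth making explicit: if an effect is claimed whenever any one of d independent endpoints is significant at α, the chance of a false claim is 1 − (1 − α)^d, not α. The short sketch below (illustrative, not from the article) shows how quickly that probability inflates.

```python
# Why the decision rule "claim an effect if any one of d endpoints is
# significant" demands adjustment: with d independent endpoints each tested
# at level alpha under a true null, the probability of at least one
# false-positive is 1 - (1 - alpha)^d.

def familywise_error_rate(alpha: float, d: int) -> float:
    """Probability of >= 1 false-positive among d independent tests at level alpha."""
    return 1 - (1 - alpha) ** d

for d in (1, 2, 5, 10):
    print(d, round(familywise_error_rate(0.05, d), 3))
# The nominal 5% error rate grows to roughly 40% with 10 endpoints.
```

The independence assumption is a simplification; correlated endpoints inflate the error rate less than this formula suggests, which is one reason the plain Bonferroni correction is criticised as conservative.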
What readers should look for
Readers should expect the researchers to report all the endpoints analysed and treatments compared. Assessing whether they reported them all is usually difficult. Access to the protocol would be helpful but is usually impossible. We urge greater access to protocols. Poor, incomplete reporting, however, frequently renders readers helpless to know the complete analysis undertaken by the investigators. Reporting according to the CONSORT statement obviates these difficulties.16, 17
Readers should
References (25)

- P-value interpretation and alpha allocation in clinical trials. Ann Epidemiol (1998)
- et al. Sample size calculations in randomised trials: mandatory and mystical. Lancet (2005)
- et al. The CONSORT statement: revised recommendations for improving the quality of reports of parallel-group trials. Lancet (2001)
- Clinical trials with multiple outcomes: a statistical perspective on their design, analysis, and interpretation. Control Clin Trials (1997)
- What's wrong with Bonferroni adjustments. BMJ (1998)
- Schulz KF, Grimes DA. Multiplicity in randomised trials II: subgroup and interim analyses. Lancet (in press)
- et al. Efficacy endpoint selection and multiplicity adjustment methods in clinical trials with inherent multiple endpoint issues. Stat Med (2003)
- No adjustments are needed for multiple comparisons. Epidemiology (1990)
- et al. Multiple comparisons and related issues in the interpretation of epidemiologic data. Am J Epidemiol (1995)