Series: Multiplicity in randomised trials I: endpoints and treatments
The issue
Multiplicity portends troubles for researchers and readers alike for two main reasons. First, investigators should report all analytical comparisons implemented. Unfortunately, they sometimes hide the complete analysis, handicapping the reader's understanding of the results. Second, if researchers properly report all comparisons made, statisticians proffer statistical adjustments to account for multiple comparisons. Investigators need to know whether they should use such adjustments, and
A proposed statistical solution
Most statisticians would recommend reducing the number of comparisons as a solution to multiplicity. Given many tests, however, some statisticians recommend making adjustments such that the overall probability of a false-positive finding equals α after making d comparisons in the trial. Authors usually attribute the method to Bonferroni and simply state that to test d comparisons in a trial at α, all comparisons should be performed at the α/d significance level, not at the α level.5, 12 Thus, for
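The Bonferroni rule described above can be sketched in a few lines of Python. This is an illustrative sketch only, not code from the article; the function names and the example p-values are hypothetical.

```python
# Sketch of the Bonferroni adjustment: with d comparisons tested at an
# overall (familywise) level alpha, each individual comparison is tested
# at alpha / d instead of alpha.

def bonferroni_threshold(alpha: float, d: int) -> float:
    """Per-comparison significance level for d comparisons at overall level alpha."""
    return alpha / d

def significant_after_bonferroni(p_values, alpha=0.05):
    """Flag which p-values remain significant after the Bonferroni adjustment."""
    threshold = bonferroni_threshold(alpha, len(p_values))
    return [p < threshold for p in p_values]

# Hypothetical example: five comparisons at alpha = 0.05 are each tested at 0.01.
p_values = [0.003, 0.020, 0.048, 0.300, 0.008]
print(bonferroni_threshold(0.05, 5))            # 0.01
print(significant_after_bonferroni(p_values))   # [True, False, False, False, True]
```

Note that two comparisons (p = 0.020 and p = 0.048) that would pass at the unadjusted α = 0.05 level no longer count as significant at the adjusted α/d = 0.01 level, which is exactly the conservatism critics of the method point to.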
Multiple endpoints
Although the ideal approach for the design and analysis of randomised controlled trials relies on one primary endpoint, investigators typically examine more than one. The most egregious abuse with multiplicity arises in the data-dredging that happens behind the scenes and remains unreported. Investigators analyse many endpoints, but only report the favourable significant comparisons. Failure to note all the comparisons actually made is unscientific if unwitting and fraudulent if intentional.
Multiple treatments (multiarm trials)
Addressing multiplicity from multiple treatments is a more tractable problem than from multiple endpoints. First, investigators can avert multiple tests by one global test of significance across comparison groups13—eg, comparing A vs B vs C in a three-arm trial—or by modelling a dose-response relation.21 Second, and perhaps most importantly, researchers have less opportunity to data-dredge on many treatments and not report them. While they easily can add more endpoints for analysis, they would
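The "one global test" option mentioned above can be illustrated with a single Pearson chi-square test of a binary endpoint across three arms, replacing three pairwise tests. This is a hedged sketch under assumed, hypothetical trial data; the article itself does not supply this code or these numbers.

```python
# One global test across a three-arm trial (A vs B vs C) on a binary
# endpoint: a single Pearson chi-square test on the 2 x 3 table of
# events/non-events, instead of three pairwise comparisons.

def chi_square_statistic(events, totals):
    """Pearson chi-square statistic for a 2 x k table of events vs non-events."""
    grand_events = sum(events)
    grand_total = sum(totals)
    rate = grand_events / grand_total  # pooled event rate under the null
    stat = 0.0
    for e, n in zip(events, totals):
        for observed, expected in ((e, n * rate), (n - e, n * (1 - rate))):
            stat += (observed - expected) ** 2 / expected
    return stat

# Hypothetical three-arm trial: events out of participants per arm.
events = [30, 45, 60]    # arms A, B, C
totals = [100, 100, 100]

stat = chi_square_statistic(events, totals)
# Critical value for chi-square with k - 1 = 2 degrees of freedom at alpha = 0.05.
CRITICAL_2DF_05 = 5.991
print(round(stat, 2), stat > CRITICAL_2DF_05)   # 18.18 True
```

A single significant global result licenses the conclusion that the arms differ somewhere, at one overall α, without multiplying tests; only then would one descend to pairwise contrasts.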
The role of adjustments for multiplicity
Sometimes formal adjustments for multiplicity are inescapable. An obvious example would arise with certain decision-making criteria in submissions to a regulatory agency for drug approval. If the sponsor specifies more than one primary endpoint and proposes to claim treatment effect if one or more are significant, investigators should adjust for multiplicity.3 Furthermore, the same principle extends to all investigators whose decision-making intent is to claim an effect based on any one of a
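The arithmetic behind this principle is worth making explicit: if an effect is claimed whenever any one of d independent endpoints is significant at α, the chance of a false claim is 1 − (1 − α)^d, not α. The short sketch below (illustrative, not from the article) shows how quickly that probability inflates.

```python
# Why the decision rule "claim an effect if any one of d endpoints is
# significant" demands adjustment: with d independent endpoints each tested
# at level alpha under a true null, the probability of at least one
# false-positive is 1 - (1 - alpha)^d.

def familywise_error_rate(alpha: float, d: int) -> float:
    """Probability of >= 1 false-positive among d independent tests at level alpha."""
    return 1 - (1 - alpha) ** d

for d in (1, 2, 5, 10):
    print(d, round(familywise_error_rate(0.05, d), 3))
# The nominal 5% error rate grows to roughly 40% with 10 endpoints.
```

The independence assumption is a simplification; correlated endpoints inflate the error rate less than this formula suggests, which is one reason the plain Bonferroni correction is criticised as conservative.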
What readers should look for
Readers should expect the researchers to report all the endpoints analysed and treatments compared. Assessing whether they reported them all is usually difficult. Access to the protocol would be helpful but is usually impossible. We urge greater access to protocols. Poor, incomplete reporting, however, frequently renders readers helpless to know the complete analysis undertaken by the investigators. Reporting according to the CONSORT statement obviates these difficulties.16, 17
Readers should
References (25)

- P-value interpretation and alpha allocation in clinical trials. Ann Epidemiol (1998)
- et al. Sample size calculations in randomised trials: mandatory and mystical. Lancet (2005)
- et al. The CONSORT statement: revised recommendations for improving the quality of reports of parallel-group trials. Lancet (2001)
- Clinical trials with multiple outcomes: a statistical perspective on their design, analysis, and interpretation. Control Clin Trials (1997)
- What's wrong with Bonferroni adjustments. BMJ (1998)
- Schulz KF, Grimes DA. Multiplicity in randomised trials II: subgroup and interim analyses. Lancet (in press)
- et al. Efficacy endpoint selection and multiplicity adjustment methods in clinical trials with inherent multiple endpoint issues. Stat Med (2003)
- No adjustments are needed for multiple comparisons. Epidemiology (1990)
- et al. Multiple comparisons and related issues in the interpretation of epidemiologic data. Am J Epidemiol (1995)