

IJE Advance Access originally published online on December 12, 2008
International Journal of Epidemiology 2009 38(1):298-303; doi:10.1093/ije/dyn265

Published by Oxford University Press on behalf of the International Epidemiological Association © The Author 2008; all rights reserved.

Commentary: Which meta-analyses are conclusive?

Eveline Nüesch (1,2) and Peter Jüni (1,2,*)

(1) Institute of Social and Preventive Medicine, University of Bern, Switzerland.
(2) CTU Bern, Bern University Hospital, Switzerland.

* Corresponding author. Institute of Social and Preventive Medicine, University of Bern, Switzerland. E-mail: juni{at}ispm.unibe.ch

Accepted 11 November 2008

In 1991, a meta-analysis of seven small-scale trials of intravenous magnesium in a total of 1266 patients with suspected acute myocardial infarction indicated a >50% reduction in the risk of death associated with magnesium (relative risk 0.48, 95% CI 0.26–0.88) [1]. Yusuf et al. updated this meta-analysis in 1993 [2] to include LIMIT-2 [3], at the time the only adequately sized trial, with a power of 80% to detect a moderate to large relative reduction in the risk of death of 33% associated with magnesium. Based on a total of eight trials in 3617 patients with a pooled relative risk of 0.59 (95% CI 0.38–0.91), the authors concluded that ‘intravenous magnesium is a safe, effective, widely practicable and inexpensive intervention that has the potential of making an important impact on the management of patients with myocardial infarction’ [2]. In 1995, ISIS-4 became available [4], a large-scale trial in 58 050 patients, which had nearly 95% power to detect a small, but potentially clinically relevant reduction in the relative risk of death of 10% associated with magnesium. ISIS-4 clearly refuted the earlier meta-analyses and showed a trend towards more deaths in the patients allocated to magnesium, with the lower limit of the 95% CI excluding any relevant benefit of the intervention (relative risk 1.05, 95% CI 0.99–1.12).
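The relative risks and confidence intervals quoted above can be converted into approximate z-values, the scale on which sequential monitoring boundaries are expressed. The following is a minimal sketch rather than the method of any of the cited analyses: it assumes that each 95% CI is symmetric on the log scale and was built with a critical value of 1.96, and the function name `z_from_rr` is chosen here for illustration.

```python
import math

def z_from_rr(rr, ci_low, ci_high, z_crit=1.96):
    """Approximate z-value from a relative risk and its 95% CI.

    Assumption (not stated in the text): the interval is symmetric on
    the log scale, so its width equals 2 * z_crit * SE(log RR).
    """
    se_log_rr = (math.log(ci_high) - math.log(ci_low)) / (2 * z_crit)
    return math.log(rr) / se_log_rr

# Estimates as quoted in the paragraph above: (RR, lower limit, upper limit).
reported = {
    "1991 meta-analysis": (0.48, 0.26, 0.88),
    "1993 meta-analysis": (0.59, 0.38, 0.91),
    "ISIS-4": (1.05, 0.99, 1.12),
}

z_values = {name: z_from_rr(*values) for name, values in reported.items()}
for name, z in z_values.items():
    print(f"{name}: z = {z:+.2f}")
```

Under these assumptions both meta-analyses give a z-value of roughly -2.4 (in favour of magnesium), whereas ISIS-4 gives roughly +1.6 (in the direction of harm). Because the published estimates are rounded, the values are approximate.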

The case of magnesium in acute myocardial infarction cast serious doubts on the trustworthiness of meta-analyses. Which meta-analyses were conclusive and which were likely to be refuted by subsequent large-scale trials? Intrigued by the magnesium example, Egger and Davey Smith [5] suggested in 1995 that funnel plots, in which the estimates of treatment effect obtained in the trials included in the magnesium meta-analyses [1,2] are plotted against a measure of sample size or statistical precision, could have been used as a diagnostic tool to detect bias associated with small trials. In the absence of bias, the plot will typically resemble a symmetrical inverted funnel with the results of smaller trials more widely scattered than those of larger, more precise trials. Publication bias [6] and poor design, execution and analysis of small trials [7] may result in skewed funnel plots. Visual inspection of the funnel plot of magnesium trials and a formal statistical test of its asymmetry indicated that the funnel plot was clearly asymmetrical before ISIS-4 became available [5,7].
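The idea behind a regression-based check of funnel plot asymmetry can be sketched in a few lines. The example below is an Egger-type weighted regression of the log relative risk on its standard error, with weights equal to the inverse variance. It is an illustration only: the five data points are hypothetical, not the magnesium trials, and a formal test would also need the standard error of the slope, which is omitted here.

```python
def egger_type_regression(effects, std_errors):
    """Weighted least squares of effect on standard error (weights 1/SE^2).

    Returns (slope, intercept). A slope far from zero suggests that the
    effect estimate depends on trial precision, i.e. funnel asymmetry.
    The intercept is the effect predicted for a trial with SE = 0.
    """
    weights = [1.0 / se ** 2 for se in std_errors]
    w_sum = sum(weights)
    x_bar = sum(w * x for w, x in zip(weights, std_errors)) / w_sum
    y_bar = sum(w * y for w, y in zip(weights, effects)) / w_sum
    s_xx = sum(w * (x - x_bar) ** 2 for w, x in zip(weights, std_errors))
    s_xy = sum(
        w * (x - x_bar) * (y - y_bar)
        for w, x, y in zip(weights, std_errors, effects)
    )
    slope = s_xy / s_xx
    intercept = y_bar - slope * x_bar
    return slope, intercept

# Hypothetical data: smaller trials (larger SE) show larger apparent benefit.
toy_se = [0.05, 0.15, 0.30, 0.45, 0.60]
toy_log_rr = [0.02, -0.20, -0.45, -0.80, -1.10]

slope, intercept = egger_type_regression(toy_log_rr, toy_se)
print(f"slope = {slope:.2f}, predicted log RR at SE = 0: {intercept:.2f}")
```

In this toy data set the slope is clearly negative, meaning the apparent benefit grows as trials become smaller, while the effect predicted for a very large trial is close to the null.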

In 1997, Pogue and Yusuf [8,9] took a different approach and suggested that multiple looks in meta-analyses of randomized trials may be interpreted similarly to interim looks in a single trial. The problem of interim looks in a single trial was originally addressed by Armitage [10] and Pocock [11] by group sequential analysis. Lan and DeMets [12] extended the suggested concept with an alpha-spending function to allow flexible unplanned monitoring in a trial. They introduced the cumulative z-curve modelled as a Brownian motion and an alpha-spending function according to O’Brien and Fleming [13] for the construction of monitoring boundaries. If a treatment effect larger than expected occurs, a trial should be terminated early when the cumulative z-curve for this treatment effect crosses the constructed sequential monitoring boundary. In early stages of a trial when data are sparse, only very extreme results corresponding to extreme z-values are accepted to indicate premature termination of a trial. The monitoring boundaries become less stringent as more data accumulate and the planned sample size of the trial is approached. The same principle could be applied to meta-analyses to determine when a meta-analysis is conclusive. Only extreme results leading to z-values that cross highly stringent boundaries should be accepted if little information was accrued in a meta-analysis of few, small-scale trials. Boundaries should become less stringent as more information accumulates [8,9]. In a cumulative meta-analysis of 10 magnesium trials, Pogue and Yusuf [8] found that the cumulative z-curve of the meta-analysis did not cross the specified monitoring boundary for overall mortality before ISIS-4 [4] and suggested that the meta-analysis was not conclusive. However, Egger et al. identified 15 trials of magnesium in myocardial infarction published before ISIS-4 [4]. When based on all 15 trials, rather than the 10 trials selected by Pogue and Yusuf, the meta-analysis crossed the monitoring boundary and became conclusive, although the results were still contradicted by ISIS-4 [14]. Pogue and Yusuf's approach failed to become widely adopted.
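How a monitoring boundary relaxes as information accrues can be illustrated with the O'Brien-Fleming-type alpha-spending function of Lan and DeMets, alpha*(t) = 2 - 2 Phi(z_{alpha/2} / sqrt(t)), where t is the fraction of the planned information accrued so far. The sketch below is not the software used in any of the cited analyses. The function names are chosen here for illustration, and the boundary shape z_{alpha/2} / sqrt(t) is an approximation that ignores the small adjustment an exact group sequential computation would make.

```python
from statistics import NormalDist

_STD_NORMAL = NormalDist()

def obf_alpha_spent(t, alpha):
    """Cumulative two-sided alpha spent at information fraction t (0 < t <= 1)."""
    z_final = _STD_NORMAL.inv_cdf(1 - alpha / 2)
    return 2 * (1 - _STD_NORMAL.cdf(z_final / t ** 0.5))

def obf_boundary_shape(t, alpha):
    """Approximate O'Brien-Fleming boundary for |z| at information fraction t.

    Simplification: uses z_{alpha/2} / sqrt(t) directly instead of solving
    for exact boundaries at a given number of looks.
    """
    return _STD_NORMAL.inv_cdf(1 - alpha / 2) / t ** 0.5

# Two-sided alpha of 0.01, the level used for the analysis in Figure 2.
for t in (0.10, 0.25, 0.50, 1.00):
    print(
        f"t = {t:.2f}: alpha spent = {obf_alpha_spent(t, 0.01):.6f}, "
        f"approx. boundary |z| = {obf_boundary_shape(t, 0.01):.2f}"
    )
```

With a quarter of the information accrued the approximate boundary is about twice the final critical value, and at t = 1 it equals the conventional critical value of about 2.58 for a two-sided alpha of 0.01.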

Recently, Wetterslev et al. coined the term ‘trial sequential analysis’ for an extension of Pogue and Yusuf's approach, which reflects an increase in uncertainty if heterogeneity between trials is present in a meta-analysis [15]. In this issue, two articles by the same group use trial sequential analysis to determine whether results of published meta-analyses in neonatology [16] and across different fields [17] are conclusive. Accounting for the observed heterogeneity between trials, they find a substantial proportion of published meta-analyses to be potentially inconclusive. In both articles [16,17], the authors point out that trial sequential analysis does not deal with systematic errors resulting from the inclusion of flawed trials [18] and outcome reporting [19] or publication biases [20] and that these sources of systematic errors should be appropriately examined using funnel plots [21] and analyses stratified according to methodological characteristics of trials accompanied by appropriate tests for interaction between trial characteristic and effect estimates [22].

Here, we re-analyse the trials of intravenous magnesium in acute myocardial infarction to determine how the different diagnostic measures (funnel plots, stratified analyses according to methodological characteristics of trials and heterogeneity-adjusted trial sequential analysis) contribute to our understanding of bias and inconclusive results at four stages of the meta-analysis: (A) trials available until 1991, before LIMIT-2 [3]; (B) trials until 1995, before ISIS-4 [4] became available; (C) all trials until 1995, including ISIS-4 [4]; and (D) all trials available to date [14,24]. Figure 1 presents funnel plots of effect sizes on the horizontal axis against their standard errors on the vertical axis, displaying asymmetry as regression lines with 95% confidence bands derived from predicting the treatment effect from univariable meta-regression analysis with the standard error as the explanatory variable [21]. Visual inspection of the funnel plots and regression lines suggests asymmetry at all four stages A–D of the meta-analysis, but Egger's test for funnel plot asymmetry [23] first becomes positive at stage B, after the inclusion of LIMIT-2 [3], the only adequately sized trial at that time. In subsequent stages, the shape of the funnel plot remains essentially unchanged and Egger's test for asymmetry remains positive, suggesting bias.


 
Figure 1 Funnel plots. Funnel plots are presented (A) for trials published until 1991, before LIMIT-2 became available; (B) until 1995, before ISIS-4 became available; (C) until 1995, including ISIS-4; and (D) up to 2004. Dotted lines indicate predicted treatment effects (regression line) from univariable meta-regression by using standard error as explanatory variable; dashed lines represent 95% CI. Regression lines are truncated at standard errors typically found in adequately sized trials with sufficient power to detect a moderate to large relative risk reduction of 30–40% (stages A and B) and at the standard error found in the largest trial included in the meta-analysis (stages C and D). P-values are derived from Egger's test for funnel plot asymmetry [23].

 
Table 1 presents the results from corresponding stratified analyses according to concealment of allocation and sample size. At stage A, stratified analyses using fixed- and random-effects models indicate no relevant differences between trials with adequate concealment and the remaining trials, whereas no adequately sized trials with sample sizes of 2200 patients or more were available. At stage B, after LIMIT-2 [3] became available, differences become apparent between trials with and without concealment of allocation and between large and small trials, but pooled effects are statistically significant in all stratified analyses and interaction tests are positive only in fixed-effect meta-analyses. With the inclusion of ISIS-4 [4], the between-trial heterogeneity becomes prominent. As a result, random-effects models attribute considerably more weight to smaller studies than fixed-effect models, and results from fixed- and random-effects meta-analyses including all trials are discordant: there is still a clinically relevant mortality reduction according to the random-effects meta-analysis, but a clear-cut null result according to the fixed-effect meta-analysis. Even in the presence of high between-trial heterogeneity, random- and fixed-effect models show concordant results if stratified according to trial size: no effect in adequately sized trials and an unrealistically large beneficial effect of magnesium on overall mortality in small trials. Positive tests of interaction in both random- and fixed-effect analyses indicate that these differences between adequately sized and small trials are unlikely to have occurred by chance alone.
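Why the two models diverge once heterogeneity is large can be seen from the weights. A fixed-effect model weights each trial by 1/SE², whereas a random-effects model weights by 1/(SE² + τ²), so a large between-trial variance τ² flattens the weights and shifts influence towards small trials. The sketch below assumes the DerSimonian-Laird estimator of τ², which the text does not name, and uses hypothetical data (one very large near-null trial and four small trials showing benefit), not the magnesium trials.

```python
def pool(effects, std_errors):
    """Fixed-effect and DerSimonian-Laird random-effects pooling of log RRs."""
    w_fixed = [1.0 / se ** 2 for se in std_errors]
    fixed = sum(w * y for w, y in zip(w_fixed, effects)) / sum(w_fixed)

    # Cochran's Q and the method-of-moments estimate of tau^2.
    q = sum(w * (y - fixed) ** 2 for w, y in zip(w_fixed, effects))
    c = sum(w_fixed) - sum(w ** 2 for w in w_fixed) / sum(w_fixed)
    tau2 = max(0.0, (q - (len(effects) - 1)) / c)

    w_random = [1.0 / (se ** 2 + tau2) for se in std_errors]
    random_eff = sum(w * y for w, y in zip(w_random, effects)) / sum(w_random)

    return {
        "fixed": fixed,
        "random": random_eff,
        "tau2": tau2,
        # Share of total weight carried by the first (largest) trial.
        "first_trial_share_fixed": w_fixed[0] / sum(w_fixed),
        "first_trial_share_random": w_random[0] / sum(w_random),
    }

# Hypothetical log relative risks and standard errors; first trial is large.
toy_log_rr = [0.05, -0.50, -0.70, -0.60, -0.90]
toy_se = [0.03, 0.30, 0.40, 0.35, 0.50]

result = pool(toy_log_rr, toy_se)
for key, value in result.items():
    print(f"{key}: {value:.3f}")
```

In this toy example the large trial carries almost all of the weight in the fixed-effect analysis but well under half in the random-effects analysis, so the fixed-effect estimate sits near the null while the random-effects estimate still suggests a benefit.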



 
Table 1 Stratified analyses

 
Figure 2 presents results from trial sequential analysis using fixed-effect meta-analysis (top) and random-effects meta-analysis (bottom). The dashed horizontal line represents the monitoring boundaries to be reached by the z-value of a meta-analysis to indicate that results are conclusive before 24 899 patients have been accrued, the number necessary to detect a relative risk reduction of 15% with 80% power at a two-sided α of 0.01. As more patients accrue, the boundary becomes less stringent and converges to a z-value of 2.58, corresponding to the α-level of 0.01, which indicates conclusive results once sufficient numbers of patients have been accumulated. In neither random-effects nor fixed-effect meta-analyses does the z-curve cross the boundary before ISIS-4 becomes available and the necessary information size of nearly 25 000 patients is reached, suggesting that the results of both random- and fixed-effect meta-analyses were inconclusive. After inclusion of ISIS-4 [4], however, results are conflicting: evidence of a null effect according to the fixed-effect model, but evidence of a benefit of magnesium according to the random-effects model, which vanishes only after the analysis is restricted to trials with adequate sample size (data available on request).


 
Figure 2 Heterogeneity-adjusted trial sequential analysis. Trial sequential analysis of trials of intravenous magnesium using fixed-effect (top) and random-effects meta-analysis (bottom). The dashed vertical line indicates that the number of patients necessary to detect a relative risk reduction of 15% with 80% power at α = 0.01 is 24 899 if a baseline risk of 10% and a heterogeneity between trials of I² = 30% are assumed. The dashed horizontal line represents the monitoring boundaries to be reached by the z-value of a meta-analysis to indicate that results are conclusive before the necessary number of 24 899 patients is reached. The boundary becomes less stringent when more trials and patients are included and will converge to a z-value of 2.58, corresponding to the α-level of 0.01, to indicate conclusive results when sufficient numbers of patients are accumulated.
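The required number of patients quoted in the caption can be reproduced with a standard sample size formula for comparing two proportions, inflated by 1/(1 − I²) to account for heterogeneity. This is a sketch under stated assumptions: the formula and the function name `required_information_size` are chosen here for illustration, and the text does not spell out the exact calculation that was used.

```python
import math
from statistics import NormalDist

def required_information_size(control_risk, rrr, alpha, power, i_squared=0.0):
    """Total number of patients (both arms) for a two-sided test.

    Assumed formula: N = 4 * (z_{1-alpha/2} + z_{power})^2 * p(1-p) / delta^2,
    with p the mean of the two event risks and delta their difference,
    then divided by (1 - I^2) as a heterogeneity adjustment.
    """
    z = NormalDist().inv_cdf
    risk_treated = control_risk * (1 - rrr)
    p_mean = (control_risk + risk_treated) / 2
    delta = control_risk - risk_treated
    n_unadjusted = (
        4 * (z(1 - alpha / 2) + z(power)) ** 2 * p_mean * (1 - p_mean) / delta ** 2
    )
    return n_unadjusted / (1 - i_squared)

# Values from the caption: baseline risk 10%, relative risk reduction 15%,
# power 80%, two-sided alpha 0.01, I^2 = 30%.
n_adjusted = required_information_size(0.10, 0.15, 0.01, 0.80, i_squared=0.30)
n_plain = required_information_size(0.10, 0.15, 0.01, 0.80)
print(math.ceil(n_plain), math.ceil(n_adjusted))
```

Under these assumptions the unadjusted requirement is roughly 17 400 patients, and the heterogeneity adjustment raises it to 24 899, in line with the figure given in the caption.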

 
It is the overall pattern found in funnel plots, stratified analyses and heterogeneity-adjusted trial sequential analysis that provides clear-cut insight into the trustworthiness of the different stages of the meta-analysis of magnesium in acute myocardial infarction [1,2,14,24]. At stage A, formal tests of funnel plot asymmetry and interaction tests accompanying stratified analyses are still negative due to a lack of power and some would have concluded that the evidence accumulated was unbiased and trustworthy. Heterogeneity-adjusted trial sequential analysis unequivocally indicates, however, that the evidence was inconclusive at this stage. At stage B, trial sequential analysis suggests that the accumulated evidence is still unconvincing even though LIMIT-2 [3] was included. In addition, the test for funnel plot asymmetry becomes positive. At stages C and D, after the inclusion of ISIS-4 [4], heterogeneity-adjusted trial sequential analyses of random- and fixed-effect meta-analyses are discordant. Here, the appropriately powered tests of funnel plot asymmetry and tests of interaction between sample size and treatment effect indicate that the inclusion of trials of inadequate size leads to a severe distortion of results.

Egger and Davey Smith concluded in 1995 that ‘results of meta-analyses that are exclusively based on small trials should be distrusted - even if the combined effect is statistically highly significant. Several medium-sized trials of high quality seem necessary to render results trustworthy.’ [5] These conclusions still hold in 2009. If appropriately used and interpreted, funnel plots with formal statistical tests of asymmetry, stratified analyses accompanied by tests of interaction and heterogeneity-adjusted trial sequential analyses will all contribute to our understanding about when to consider a meta-analysis conclusive.


    Acknowledgements
 
We are grateful to Kristian Thorlund, Jørn Wetterslev and Christian Gluud for help with trial sequential analysis of the magnesium trials and for stimulating discussions.

Conflict of interest: None declared.


    References
 
1 Teo KK, Yusuf S, Collins R, Held PH, Peto R. Effects of intravenous magnesium in suspected acute myocardial infarction: overview of randomised trials. Br Med J (1991) 303:1499–503.

2 Yusuf S, Teo K, Woods K. Intravenous magnesium in acute myocardial infarction. An effective, safe, simple, and inexpensive intervention. Circulation (1993) 87:2043–46.

3 Woods KL, Fletcher S, Roffe C, Haider Y. Intravenous magnesium sulphate in suspected acute myocardial infarction: results of the second Leicester Intravenous Magnesium Intervention Trial (LIMIT-2). Lancet (1992) 339:1553–58.

4 ISIS-4 (Fourth International Study of Infarct Survival) Collaborative Group. ISIS-4: a randomised factorial trial assessing early oral captopril, oral mononitrate, and intravenous magnesium sulphate in 58 050 patients with suspected acute myocardial infarction. Lancet (1995) 345:669–85.

5 Egger M, Davey Smith G. Misleading meta-analysis. Br Med J (1995) 310:752–54.

6 Easterbrook PJ, Berlin JA, Gopalan R, Matthews DR. Publication bias in clinical research. Lancet (1991) 337:867–72.

7 Egger M, Davey Smith G, Schneider M, Minder C. Bias in meta-analysis detected by a simple, graphical test. BMJ (1997) 315:629–34.

8 Pogue J, Yusuf S. Overcoming the limitations of current meta-analysis of randomised controlled trials. Lancet (1998) 351:47–52.

9 Pogue JM, Yusuf S. Cumulating evidence from randomized trials: utilizing sequential monitoring boundaries for cumulative meta-analysis. Control Clin Trials (1997) 18:580–93; discussion 661–66.

10 Armitage P. Sequential analysis in therapeutic trials. Annu Rev Med (1969) 20:425–30.

11 Pocock S. Group sequential methods in the design and analysis of clinical trials. Biometrika (1977) 64:191–99.

12 Lan K, DeMets D. Discrete sequential boundaries for clinical trials. Biometrika (1983) 70:659–63.

13 O'Brien PC, Fleming TR. A multiple testing procedure for clinical trials. Biometrics (1979) 35:549–56.

14 Egger M, Davey Smith G, Sterne JA. Meta-analysis: is moving the goal post the answer? Lancet (1998) 351:1517.

15 Wetterslev J, Thorlund K, Brok J, Gluud C. Trial sequential analysis may establish when firm evidence is reached in cumulative meta-analysis. J Clin Epidemiol (2008) 61:64–75.

16 Brok J, Thorlund K, Wetterslev J, Gluud C. Apparently conclusive meta-analyses may be inconclusive – trial sequential analysis adjustment of random error risk due to repetitive testing of accumulating data in apparently conclusive neonatal meta-analyses. Int J Epidemiol (2009) 38:287–98.

17 Thorlund K, Devereaux PJ, Wetterslev J, et al. Can trial sequential monitoring boundaries reduce spurious inferences from meta-analyses? Int J Epidemiol (2009) 38:276–86.

18 Wood L, Egger M, Gluud LL, et al. Empirical evidence of bias in treatment effect estimates in controlled trials with different interventions and outcomes: meta-epidemiological study. Br Med J (2008) 336:601–05.

19 Chan AW, Hrobjartsson A, Haahr MT, Gøtzsche PC, Altman DG. Empirical evidence for selective reporting of outcomes in randomized trials: comparison of protocols to published articles. JAMA (2004) 291:2457–65.

20 Dwan K, Altman DG, Arnaiz JA, et al. Systematic review of the empirical evidence of study publication bias and outcome reporting bias. PLoS ONE (2008) 3:e3081.

21 Sterne JA, Egger M. Funnel plots for detecting bias in meta-analysis: guidelines on choice of axis. J Clin Epidemiol (2001) 54:1046–55.

22 Jüni P, Altman DG, Egger M. Systematic reviews in health care: assessing the quality of controlled clinical trials. Br Med J (2001) 323:42–46.

23 Harbord RM, Egger M, Sterne JA. A modified test for small-study effects in meta-analyses of controlled trials with binary endpoints. Stat Med (2006) 25:3443–57.

24 Li J, Zhang Q, Zhang M, Egger M. Intravenous magnesium for acute myocardial infarction. Cochrane Database Syst Rev (2007) CD002755.

