Commentary: On Bayesian perspectives for epidemiological research
London School of Hygiene & Tropical Medicine, Keppel Street, London WC1E 7HT, UK. E-mail: james.carpenter{at}lshtm.ac.uk
Prevention is better and cheaper than cure, and the contribution of epidemiology to public health has been immense. Yet, at a time when the number of practising epidemiologists is probably at an all-time high, the discipline suffers angst.1–3 Why is this, and what should be done?
Arguably, typical exposure effects are now likely to be smaller, so the potential for being mistaken about exposure risks is larger; at the same time, the number of studies and (fuelled by the pressure to publish) the number of articles they generate are increasing. The result is more false-positive results published, and an increasingly sceptical reception for all publications.4
Greenland lays into frequentist inference with relish and effectively tackles some common misunderstandings surrounding the Bayesian paradigm, arguing for its inclusion in epidemiological students' curricula as a matter of course.5 He describes approximate procedures for using sceptical priors to shrink estimates towards the null.
Would epidemiology be much better off without frequentist statistics? Certainly it is true that the Neyman–Pearson hypothesis testing paradigm, which forces a choice between two alternatives, is not appropriate when we are in the process of gathering evidence on possible risks. Further, we are all much more Bayesian than we often admit, with prior beliefs motivating research questions, designs, and interpretations. You do not have to be a card-carrying Bayesian to vaguely expect to see a horse, glimpse a donkey, and then convince yourself you have seen a mule;6 witness the peptic ulcer story.3
As Greenland says, raw data tells us nothing about the risk of an exposure without a statistical model. Even making inferences from a sample mean assumes an implicit model. Models are built on assumptions about the data (e.g. observations from different individuals are treated as independent) and relate the data to a parameter representing a quantity of interest. Suppose, as is common, we fit a model by maximum likelihood. Then the information about the parameter is in the likelihood; the question is how to interpret, or calibrate, this information.
Frequentist and pure likelihood7 inference both look at the ratio of the likelihood of the data at the maximum likelihood estimate to the likelihood of the data at the null value. The frequentist paradigm differs from the pure likelihood paradigm in its interpretation of this information. Frequentists ask: given there is really no exposure effect, what is the chance of seeing a likelihood ratio as extreme as this if we repeat our study lots of times? Roughly, pure likelihood inference interprets the evidence in the likelihood ratio by reference to likelihood ratios from familiar experiments in other settings. In this respect, I believe the pure likelihood approach has an implicit subjective element, and I disagree with Greenland's sweeping statement that the likelihood paradigm replicates the pretense of objectivity that renders frequentist methods so misleading. If the likelihood ratio is too large to be believed, I submit the model is quite badly wrong.
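To make the two calibrations concrete, here is a minimal Python sketch. It computes the likelihood ratio at the maximum likelihood estimate against the null for a single binomial proportion, then calibrates it in the frequentist way using the large-sample chi-squared approximation for twice the log likelihood ratio. The binomial setting, the function names, and the numbers (30 events in 100 subjects, null risk 0.2) are illustrative assumptions and do not come from Greenland's paper.

```python
import math


def binom_loglik(p, events, n):
    """Binomial log-likelihood up to an additive constant (needs 0 < p < 1)."""
    return events * math.log(p) + (n - events) * math.log(1 - p)


def likelihood_ratio(events, n, p_null):
    """Likelihood at the maximum likelihood estimate divided by likelihood at the null."""
    p_hat = events / n  # maximum likelihood estimate of the risk
    return math.exp(binom_loglik(p_hat, events, n) - binom_loglik(p_null, events, n))


def frequentist_p_value(lr):
    """Approximate chance, under the null, of a likelihood ratio at least this large.

    Uses the large-sample result that 2*log(LR) is roughly chi-squared on
    1 degree of freedom, whose upper tail at x equals erfc(sqrt(x / 2)).
    """
    statistic = 2 * math.log(lr)
    return math.erfc(math.sqrt(statistic / 2))


# Hypothetical data: 30 events among 100 subjects, null risk of 0.2.
lr = likelihood_ratio(events=30, n=100, p_null=0.2)
p_value = frequentist_p_value(lr)
print(f"likelihood ratio = {lr:.1f}, approximate P-value = {p_value:.3f}")
```

A pure likelihood reading would stop at `lr` and judge its size against ratios from familiar experiments, whereas the frequentist reading converts the same number into a repeated-sampling probability.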
Could this be the case here? As Greenland argues, issues about the appropriateness of additive or multiplicative models are secondary. But this should not let us move on, ignoring the elephants in the room. One is surely selection bias on the basis of the P-value, whose presence is established in the submission,8 publication,9,10 and citation11,12 of randomized controlled trials, and which surely bedevils epidemiology. Others are data contamination processes, from measurement error to missing data.
Frequentist inference may well coincide with poor modelling, or encourage its adherents not to think hard about the issues because they believe they are being objective. However, this does not mean that frequentist inference causes poor modelling; neither can a Bayesian approach average the effects of poor modelling away. Further, modelling can be improved without recourse to a Bayesian approach, although such an approach may often be convenient.
On one reading, then, this paper5 can be summarized as follows. It is hard to justify the data models often used in epidemiology. So, without quite knowing how these models are wrong, we are best advised to shrink the results by our vaguely formulated scepticism. In so doing, we dethrone the P-value. Better, though, to think hard in advance about what our priors should be, and inform them with concrete evidence. For measurement error, we need additional information to establish accuracy. For selection bias, a register of studies, to which new studies were added when they received ethical approval, would enable us to read off the number of published and unpublished studies about a particular exposure. A simple formula13 then gives us a bound on the publication bias. We can then decide if we wish to be less sceptical than this.
Unfortunately, studies are often poorly reported14,15; the further difficulty of accurate communication to the wider public is well known.16 With regard to public presentation, epidemiological findings are often given to the press in black and white terms,1 with predictable consequences. The truth is never clear-cut. Why not summarize the results by the odds of the relative risk being, say, 1.5 vs 1? Expressing uncertainty through odds is already familiar to a large proportion of the public. Moreover, using such odds a journal could bet on the studies it publishes: the degree to which it stays in the black (as opposed to the impact factor) is a measure of success much closer to the aims of epidemiology!
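One possible reading of this suggestion is sketched below in Python. It assumes the study is summarized by an estimated log relative risk with a standard error, treats the likelihood for the log relative risk as approximately normal, and compares just two candidate values, 1.5 and 1. With even prior odds, the odds are then simply the ratio of the two likelihoods. The function name, the normal approximation, and the numbers are illustrative assumptions.

```python
import math


def odds_rr(log_rr_hat, se, rr_a=1.5, rr_b=1.0, prior_odds=1.0):
    """Odds of relative risk rr_a versus rr_b, given an estimate and its standard error.

    Assumes an approximately normal likelihood for the log relative risk and
    considers only the two named values (an illustrative simplification).
    """
    def log_lik(rr):
        # Normal log-likelihood of the estimate if the true relative risk were rr.
        return -0.5 * ((log_rr_hat - math.log(rr)) / se) ** 2

    return prior_odds * math.exp(log_lik(rr_a) - log_lik(rr_b))


# Hypothetical study: estimated relative risk 1.4, standard error 0.15 on the log scale.
odds = odds_rr(math.log(1.4), 0.15)
print(f"odds of RR = 1.5 versus RR = 1: about {odds:.0f} to 1")

# An estimate exactly midway between the two values (on the log scale) favours neither.
even_odds = odds_rr(0.5 * math.log(1.5), 0.15)
```

Changing `prior_odds` shows how a sceptical reader, who starts out favouring a relative risk of 1, would arrive at shorter odds from the same study.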
I also think odds may be a more accessible way into a Bayesian approach for students. We do not need to think of priors as expressing prior bets and then turn our priors into pseudo data to include in a statistical analysis. The latter process is useful, but not always easy, and a prior does not always lead to unique prior data. Furthermore, if the model is badly wrong, can the prior data really fix the problem?
Turning to the methods presented in the paper, information-weighted averaging is a very useful tool and should be taught to epidemiological students. However, I do not share Greenland's views about the limited usefulness of MCMC methods in epidemiology. These provide a natural, and often computationally most feasible, general framework to model both the intrinsically interesting multilevel nature of much epidemiological data and data contamination processes [see e.g. Ref. (17) discussion].
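As a reminder of what information-weighted averaging involves, here is a minimal Python sketch under a normal approximation on the log relative risk scale: the prior and the data estimate are each weighted by the inverse of their variance, and the combined estimate has variance equal to the inverse of the total weight. The function names, the prior limits (0.25 to 4), and the study limits (1.2 to 7.5) are hypothetical values chosen for illustration.

```python
import math


def normal_from_limits(lower, upper, z=1.96):
    """Mean and variance of the log relative risk implied by 95% limits."""
    mean = (math.log(lower) + math.log(upper)) / 2
    sd = (math.log(upper) - math.log(lower)) / (2 * z)
    return mean, sd ** 2


def information_weighted_average(prior, data):
    """Combine two (mean, variance) pairs, weighting each by its inverse variance."""
    (m0, v0), (m1, v1) = prior, data
    w0, w1 = 1 / v0, 1 / v1  # information carried by the prior and by the data
    mean = (w0 * m0 + w1 * m1) / (w0 + w1)
    variance = 1 / (w0 + w1)
    return mean, variance


# Hypothetical sceptical prior centred on the null, and a hypothetical study result.
prior = normal_from_limits(0.25, 4.0)   # prior 95% limits for the relative risk
data = normal_from_limits(1.2, 7.5)     # study 95% limits (point estimate 3.0)
post_mean, post_var = information_weighted_average(prior, data)

post_rr = math.exp(post_mean)
half_width = 1.96 * math.sqrt(post_var)
post_limits = (math.exp(post_mean - half_width), math.exp(post_mean + half_width))
print(f"posterior RR = {post_rr:.2f}, 95% limits {post_limits[0]:.2f} to {post_limits[1]:.2f}")
```

The combined estimate lies between the null and the study estimate, which is the shrinkage towards the null that a sceptical prior is meant to produce.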
In summary, I believe teaching epidemiology students to think more deeply about inference, modelling, and their implicit assumptions is vital, and I endorse teaching and using the Bayesian paradigm as part of this. However, I believe that to think it alone will save epidemiology from the fate some have predicted is wishful thinking.
References
1 Taubes G. Epidemiology faces its limits. Science 1995;269:164–69.
2 Trichopoulos D. The future of epidemiology. BMJ 1996;313:436–37.
3 Davey Smith G, Ebrahim S. Epidemiology – is it time to call it a day? Int J Epidemiol 2001;30:1–11.
4 Ioannidis JPA. Why most published research findings are false. PLoS Med 2005;2:e124.
5 Greenland S. Bayesian perspectives for epidemiologic research. I. Foundations and basic methods. Int J Epidemiol 2006;35:765–75.
6 McPearson G. The Devil's drug development dictionary. Available at: www.senns.demon.co.uk/wdict.html (Accessed February 2, 2006).
7 Royall R. Statistical Inference: A Likelihood Paradigm. New York: Chapman and Hall, 1997.
8 Chan A, Hróbjartsson A, Haahr MT, Gøtzsche PC, Altman DG. Empirical evidence for selective reporting of outcomes in randomized trials. J Am Med Assoc 2004;291:2457–65.
9 Song F, Eastwood AJ, Gilbody S, Duley L, Sutton AJ. Publication and related biases. Health Technology Assess 2000;4:1–115.
10 Sterling TD, Rosenbaum WL, Weinkam JJ. Publication decision revisited: The effect of the outcome of statistical tests on the decision to publish and vice versa. Am Stat 1995;49:108–12.
11 Nieminen P, Rucker G, Miettunen J, Schumacher M. Empirical evidence for preferential citation based on statistical significance. Submitted for publication, 2006.
12 Ravnskov U. Cholesterol lowering trials in coronary heart disease: frequency of citation and outcome. BMJ 1992;305:15–19.
13 Copas J, Jackson D. A bound for publication bias based on the fraction of unpublished studies. Biometrics 2004;60:146–53.
14 Pocock SJ, Collier TJ, Dandreo KJ et al. Issues in the reporting of epidemiological studies: a survey of recent practice. BMJ 2004;329:883.
15 Altman D, Egger M, Pocock S, Vandenbroucke JP, von Elm E. Strengthening the Reporting of Observational studies in Epidemiology (STROBE). Available at: http://www.strobe-statement.org/ (Accessed February 6, 2006).
16 Cox DR, Darby SC. The communication of risk. J R Stat Soc [Ser A] 2003;166:203–04.
17 Greenland S. Multiple-bias modeling for analysis of observational data (with discussion). J R Stat Soc [Ser A] 2005;168:267–308.