IJE Advance Access originally published online on April 27, 2006
International Journal of Epidemiology 2006 35(3):777-778; doi:10.1093/ije/dyl081
Response: Bayesian perspectives for epidemiological research
1 Department of Epidemiology, University of California, Los Angeles 90095-1772, USA
2 Department of Statistics, University of California, Los Angeles 90095-1772, USA
E-mail: lesdomes{at}ucla.edu
Dr Carpenter1 and I agree on the value of Bayesian perspectives and the inappropriateness of Neyman–Pearsonian testing for epidemiology. Unfortunately, he misrepresents several of my positions and misunderstands the import of data priors.
To prevent misuse of Bayesian methods, the meaning of priors must be made clear. Shockingly to me, Carpenter dismisses data priors with "We don't need to think of priors as expressing prior bets, and then turn our priors into pseudo data to include in a statistical analysis. The latter process is useful, but not always easy, and a prior does not always lead to unique prior data." This passage overlooks every advantage of data priors:
- We need not think at all to apply any statistical method, and that is not a virtue of statistics: One can just cram data into software, give incorrect (if standard) interpretations of the output, and get those published. Translating priors into data makes painfully clear how much prior information is assumed by a Bayesian analysis, in a way odds do not. Data priors reveal the enormous knowledge claimed by some writers, as shown in the magnetic field/leukaemia example in my paper.
- Data priors demonstrate that the same approximate methods taught for frequentist analysis serve just as well for Bayesian analysis (this fact is apparently upsetting to Bayesians who have laboured over specialized Bayesian techniques).
- Creation of data priors is much less difficult than what is routinely done to torture data, such as logistic regression. It requires only a moment with a calculator to solve the equations given in my paper; whether the solution is unique is irrelevant, since all solutions lead to approximately the same posterior. It does not begin to approach the computing or convergence issues needed for Markov Chain Monte Carlo.
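The calculator step in the last point can be sketched as follows. This is a minimal illustration under stated assumptions, not a transcription of the equations in the paper: it assumes a normal prior on the log rate ratio ln(RR) specified by 95% prior limits, and person-time prior data with the same number of cases A in each group, so that the approximate variance of ln(RR) is 1/A + 1/A = 2/A. The function names and the prior limits of 1/4 and 4 are hypothetical.

```python
import math

Z_95 = 1.959964  # 97.5th percentile of the standard normal distribution


def normal_prior_from_limits(lower, upper):
    """Return (mean, variance) of a normal prior on ln(RR) whose
    95% prior limits for RR are (lower, upper)."""
    mean = (math.log(lower) + math.log(upper)) / 2
    sd = (math.log(upper) - math.log(lower)) / (2 * Z_95)
    return mean, sd ** 2


def prior_data_from_normal(mean, variance):
    """Return (cases per group, person-time ratio T0/T1) for pseudo-data
    whose estimated ln(RR) has the given mean and approximate variance.

    Assumption: A cases in each group, so var[ln(RR)] is about 2/A.
    """
    cases_per_group = 2 / variance
    # With equal case counts, RR = (A/T1)/(A/T0) = T0/T1.
    person_time_ratio = math.exp(mean)
    return cases_per_group, person_time_ratio


# Hypothetical prior: 95% prior probability that RR lies between 1/4 and 4.
prior_mean, prior_var = normal_prior_from_limits(0.25, 4.0)
cases, pt_ratio = prior_data_from_normal(prior_mean, prior_var)

print(f"prior ln(RR): mean {prior_mean:.3f}, variance {prior_var:.3f}")
print(f"equivalent prior data: about {cases:.1f} cases per group, "
      f"T0/T1 = {pt_ratio:.2f}")
```

Under these assumptions the prior is equivalent to roughly four cases in each group with equal person-time, which shows at a glance how much information the prior claims.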
As I will describe in part II of this series,2 posterior computation via data priors extends far more easily to regression and multilevel modelling than do other methods. It has the remarkable property of not increasing in complexity with the underlying model, because it involves only adding data records and using ordinary regression software, and can employ highly non-normal priors. In contrast, information weighting for regression requires normal priors and additional matrix computations.
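For a single rate ratio, the idea of adding data records can be sketched by treating the prior data as one more stratum and pooling with an ordinary inverse-variance (Woolf-type) stratified estimator. This covers only the simplest case, in which appending the prior stratum is numerically the same as information weighting; it does not show the regression extension. All counts below are hypothetical, and the prior stratum (4 cases per group, equal person-time) is an assumed stand-in for a normal prior on ln(RR) with mean 0 and variance 0.5.

```python
import math


def log_rr_and_variance(cases_exp, pt_exp, cases_unexp, pt_unexp):
    """Log rate ratio and its approximate variance for one stratum."""
    log_rr = math.log((cases_exp / pt_exp) / (cases_unexp / pt_unexp))
    variance = 1 / cases_exp + 1 / cases_unexp
    return log_rr, variance


def pool_strata(strata):
    """Inverse-variance pooled ln(RR) and its variance across strata."""
    total_weight = 0.0
    weighted_sum = 0.0
    for record in strata:
        log_rr, variance = log_rr_and_variance(*record)
        weight = 1 / variance
        total_weight += weight
        weighted_sum += weight * log_rr
    return weighted_sum / total_weight, 1 / total_weight


# Each record: (exposed cases, exposed person-time,
#               unexposed cases, unexposed person-time).
observed_stratum = (12, 1000.0, 4, 1000.0)  # hypothetical study data
prior_stratum = (4, 1000.0, 4, 1000.0)      # hypothetical prior data, RR = 1

pooled_log_rr, pooled_var = pool_strata([observed_stratum, prior_stratum])
half_width = 1.959964 * math.sqrt(pooled_var)
posterior_rr = math.exp(pooled_log_rr)
posterior_limits = (math.exp(pooled_log_rr - half_width),
                    math.exp(pooled_log_rr + half_width))

print(f"approximate posterior RR {posterior_rr:.2f}, 95% limits "
      f"{posterior_limits[0]:.2f} to {posterior_limits[1]:.2f}")
```

The only change from a conventional stratified analysis is the extra record, so the complexity of the computation does not depend on how the prior was derived.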
Dr Carpenter goes on to ask "if the model is badly wrong, can the prior data really fix the problem?" I never claimed data priors fix model problems, because they do not; neither do Markov chain Monte Carlo, propensity scores, inverse-probability weighting, or other modern tools, for they are all methods that assume models and compute from there. Model-expansion methods like bias modelling address model deficiencies; data priors address contextual understanding and computational transparency. Sadly, context and transparency are largely neglected by statistical research, which instead focuses on generality and precision beyond any relevance to most health or social scientists.
Dr Carpenter closes his commentary with "However, I believe that to hope [the Bayesian paradigm] alone will save epidemiology from the fate some have predicted is wishful thinking." I never said and do not believe that the Bayesian paradigm alone is sufficient for saving epidemiology or for epidemiological inference; I only argue that Bayesian thinking is necessary as part of a broad and well-rounded approach. A major flaw shared by most statistical methods (be they frequentist, likelihoodist, or Bayesian) is that they pretend the data came from an ideal study in which all important influences on the data (including those of the investigators and the subjects) can be approximated by a known model. In observational epidemiology this can be a fatal assumption, resulting in far too much certainty placed on statistical results. Others and I have discussed how statistics can approach this problem in a contextually informed manner by using bias models with explicit priors for bias parameters.3
Note well: There is no single Bayesian paradigm4 just as there is no single frequentist paradigm.5 I argue that a subjective Bayesian perspective is needed. Neither objective Bayesian methods6 nor pure-likelihood methods7 address the pseudo-objectivity that frequentism perpetuates, and like frequentist methods they contain hidden subjective elements. I do not hold that subjective Bayesian methods should replace these methods; instead I concluded that they should take their place alongside frequentist approaches3 in a Bayesian/frequentist dualism.4,8,9
A dualistic approach is needed because Bayesianism and frequentism address different questions. Bayesianism addresses questions of the form "Having seen the data, what odds should I place on this hypothesis versus another?" and seeks methods that use contextual information to improve the bet; it cares about the observed data, not counterfactual data as might arise under a hypothetical long run. In contrast, frequentism addresses questions of the form "If I applied this method to a hypothetical long run of studies like this one, how would it behave?" and seeks methods with desirable long-run behaviour; it does not care about odds of hypotheses given the data that were actually observed, or even whether a particular decision produced by applying its methods to those data is better than alternatives. Importantly, Bayesian methods exhibit desirable frequentist (long-run) properties when both they and the evaluation are well informed by the scientific context.10
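One narrow setting in which the last point can be checked by simulation is sketched below. It is assumed purely for illustration and is not taken from reference 10: the true ln(RR) varies across a hypothetical long run of studies according to the same normal distribution used as the prior, each study yields a normally distributed estimate with known variance, and the 95% posterior interval is computed by information weighting. All parameter values are hypothetical.

```python
import math
import random


def posterior_interval(estimate, est_var, prior_mean, prior_var, z=1.959964):
    """95% posterior interval for ln(RR) under a normal prior and a
    normal approximation to the likelihood (information weighting)."""
    post_var = 1 / (1 / prior_var + 1 / est_var)
    post_mean = post_var * (prior_mean / prior_var + estimate / est_var)
    half_width = z * math.sqrt(post_var)
    return post_mean - half_width, post_mean + half_width


def long_run_coverage(n_studies, prior_mean, prior_var, est_var, seed=0):
    """Fraction of simulated studies whose posterior interval contains
    the true ln(RR), when true values are drawn from the prior."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_studies):
        true_log_rr = rng.gauss(prior_mean, math.sqrt(prior_var))
        estimate = rng.gauss(true_log_rr, math.sqrt(est_var))
        lower, upper = posterior_interval(estimate, est_var,
                                          prior_mean, prior_var)
        hits += lower <= true_log_rr <= upper
    return hits / n_studies


coverage = long_run_coverage(n_studies=20000, prior_mean=0.0,
                             prior_var=0.5, est_var=1 / 3)
print(f"long-run coverage of 95% posterior intervals: {coverage:.3f}")
```

In this idealized setup the long-run coverage should be close to the nominal 95%; the agreement depends on the prior matching how the true values actually vary, which is the sense in which both the method and its evaluation must be informed by context.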
Most researchers are not interested in evaluating statistical methods, however, but instead are interested in contextual hypotheses. Smart researchers understand that an epidemiological study cannot by itself form a sound basis for accepting or rejecting hypotheses but can only provide evidence within a larger context. Contextual statements about hypotheses are sought and are what Bayesian statistics can provide. It is thus unsurprising that, having been given only frequentist methods, researchers consistently misinterpret frequentist outputs as if those were Bayesian (e.g. they write their discussions as if their two-sided P-value is the probability of the null hypothesis given the data).
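The distinction can be made concrete with a small sketch that computes both quantities from the same data. The estimate, its variance, and the normal prior below are all hypothetical, and the normal approximations are assumptions of the sketch. Because a continuous prior gives zero probability to the exact null RR = 1, the posterior quantity shown is the probability that RR is at most 1.

```python
import math


def normal_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1 + math.erf(x / math.sqrt(2)))


# Hypothetical study result on the log rate ratio scale.
estimate = math.log(3.0)   # observed ln(RR)
est_var = 1 / 12 + 1 / 4   # approximate variance of the estimate

# Hypothetical normal prior on ln(RR): 95% prior limits for RR of 1/4 to 4.
prior_mean, prior_var = 0.0, 0.5

# Frequentist output: two-sided P-value for the null hypothesis RR = 1.
z = estimate / math.sqrt(est_var)
p_value = 2 * (1 - normal_cdf(abs(z)))

# Bayesian output: posterior probability that RR <= 1, given data and prior.
post_var = 1 / (1 / prior_var + 1 / est_var)
post_mean = post_var * (prior_mean / prior_var + estimate / est_var)
prob_rr_at_most_1 = normal_cdf((0.0 - post_mean) / math.sqrt(post_var))

print(f"two-sided P-value for RR = 1:        {p_value:.3f}")
print(f"posterior probability that RR <= 1:  {prob_rr_at_most_1:.3f}")
```

The two numbers answer different questions: the first describes how the test statistic would behave over repeated sampling if RR = 1, while the second is a statement about the hypothesis given the data and changes when the prior changes.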
The mismatch between the methods researchers are taught and the questions they actually ask has produced a chronic psychosis in study reporting, in which nonsignificance is taken as evidence for the null, or (as Dr Carpenter notes) even as grounds to not report or cite the result, thus distorting entire literatures and reviews. It is technically correct that the blame for such nonsense is with the user, not with frequentism; emphasis on the width and limits of confidence intervals rather than on statistical significance might avoid such problems.11,12 But blaming users is like blaming consumers for eating junk food when that is what is most promoted to them. We should blame elementary statistics textbooks and teachers for giving users such poor conceptual foundations, and for providing only tools inappropriate for most users' questions. In the US at least, most statistics teaching for non-statisticians still fails to give Bayesian perspectives any meaningful time. Fortunately, applied statistics has rediscovered the importance of those perspectives for answering scientific questions.8,9 It is time for basic statistical education (and epidemiology) to catch up, and data priors are the simplest way to do so.
References
1 Carpenter JR. Commentary: On Bayesian perspectives for epidemiological research. Int J Epidemiol 2006;35:775–77.
2 Greenland S. Bayesian perspectives for epidemiologic research. II. Extensions to non-normal priors and regression analysis. Submitted.
3 Greenland S. Multiple-bias modeling for analysis of observational data (with discussion). J R Stat Soc Ser A 2005;168:267–308.
4 Good IJ. Good Thinking. Minneapolis: University of Minnesota Press, 1983.
5 Goodman SN. P values, hypothesis tests, and likelihood: implications for epidemiology of a neglected historical debate (with discussion). Am J Epidemiol 1993;137:485–501.
6 Berger JO. The case for objective Bayesian analysis. Bayesian Analysis, to appear 2006 (Available at: http://www.stat.cmu.edu/bayesworkshop/2005/panel.html).
7 Royall R. Statistical Inference: A Likelihood Paradigm. New York: Chapman and Hall, 1997.
8 Rubin DB. Practical implications of modes of statistical inference for causal effects and the critical role of the assignment mechanism. Biometrics 1991;47:1213–34.
9 Efron B. Bayesians, frequentists, and scientists. J Am Stat Assoc 2005;100:1–5.
10 Gustafson P, Greenland S. The performance of random coefficient regression in accounting for residual confounding. Biometrics 2006;62:In press.
11 Poole C. Low P-values or narrow confidence intervals: which are more durable? Epidemiology 2001;12:291–94.
12 Altman DG, Machin D, Bryant TN, Gardner MJ (eds). Statistics with Confidence, 2nd edn. London: BMJ Publishing Group, 2000.