

IJE Advance Access originally published online on April 27, 2006
International Journal of Epidemiology 2006 35(3):777-778; doi:10.1093/ije/dyl081

Published by Oxford University Press on behalf of the International Epidemiological Association © The Author 2006; all rights reserved.


Response: Bayesian perspectives for epidemiological research

Sander Greenland1,2

1 Department of Epidemiology, University of California, Los Angeles 90095-1772, USA
2 Department of Statistics, University of California, Los Angeles 90095-1772, USA

E-mail: lesdomes{at}ucla.edu

Dr Carpenter1 and I agree on the value of Bayesian perspectives and the inappropriateness of Neyman–Pearsonian testing for epidemiology. Unfortunately, he misrepresents several of my positions and misunderstands the import of data priors.

To prevent misuse of Bayesian methods, the meaning of priors must be made clear. Shockingly to me, Carpenter dismisses data priors with ‘We don't need to think of priors as expressing prior bets, and then turn our priors into pseudo data to include in a statistical analysis. The latter process is useful, but not always easy, and a prior does not always lead to unique prior data.’ This passage overlooks every advantage of data priors:

  1. We need not think at all to apply any statistical method, and that is not a virtue of statistics: One can just cram data into software, give incorrect (if standard) interpretations of the output, and get those published. Translating priors into data makes painfully clear how much prior information is assumed by a Bayesian analysis, in a way odds do not. Data priors reveal the enormous knowledge claimed by some writers, as shown in the magnetic field/leukaemia example in my paper.
  2. Data priors demonstrate that the same approximate methods taught for frequentist analysis serve just as well for Bayesian analysis (this fact is apparently upsetting to Bayesians who have laboured over specialized Bayesian techniques).
  3. Creation of data priors is much less difficult than what is routinely done to torture data, such as logistic regression. It requires only a moment with a calculator to solve the equations given in my paper; whether the solution is unique is irrelevant, since all solutions lead to approximately the same posterior. It does not begin to approach the computing demands or convergence issues of Markov chain Monte Carlo.
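
The translation from a prior to prior data described in the list above can be sketched in a few lines. This is a minimal illustration, not a reproduction of the equations in the paper: it assumes a normal prior for the log relative risk specified by its 95% limits, and it assumes the large-denominator approximation in which the variance of a log relative risk from A cases in each of two groups is about 2/A. The function name, the default denominator, and the example limits are hypothetical.

```python
import math

def prior_to_pseudo_data(rr_lower, rr_upper, n_exposed=100_000):
    """Convert 95% prior limits for a relative risk into pseudo-data.

    Assumes a normal prior on the log relative risk and the approximation
    var(log RR) ~ 2/A for A cases per group with very large denominators.
    """
    # Prior variance implied by the width of the 95% interval on the log scale.
    log_width = math.log(rr_upper) - math.log(rr_lower)
    prior_var = (log_width / (2 * 1.96)) ** 2
    # Prior median is the geometric mean of the limits.
    prior_median = math.sqrt(rr_lower * rr_upper)
    # Number of cases per group whose information matches the prior.
    cases_per_group = 2.0 / prior_var
    # With equal case counts, RR = n_unexposed / n_exposed, so scale the
    # unexposed denominator to reproduce the prior median.
    n_unexposed = n_exposed * prior_median
    return {
        "prior_median": prior_median,
        "prior_var": prior_var,
        "cases_per_group": cases_per_group,
        "n_exposed": n_exposed,
        "n_unexposed": n_unexposed,
    }

pseudo = prior_to_pseudo_data(0.25, 4.0)
print(pseudo)
```

Under these assumptions, prior limits of 1/4 and 4 correspond to only about 4 cases in each group, while much narrower limits would correspond to a far larger hypothetical study. Seeing the prior as a case count is what makes the amount of assumed information hard to overlook.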

As I will describe in part II of this series,2 posterior computation via data priors extends far more easily to regression and multilevel modelling than do other methods. It has the remarkable property of not increasing in complexity with the underlying model, because it involves only adding data records and using ordinary regression software, and can employ highly non-normal priors. In contrast, information weighting for regression requires normal priors and additional matrix computations.
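
For the simplest case of a single odds ratio, adding data records amounts to treating the prior as one more stratum and running an ordinary stratified analysis. The sketch below is an illustration under stated assumptions rather than the method of part II: it pools log odds ratios by inverse-variance weighting, the observed counts are hypothetical, and the prior stratum (4 cases per group with very large denominators) stands for a normal prior on the log odds ratio with median 1 and variance of about 0.5.

```python
import math

def log_or_and_var(a, b, c, d):
    """Log odds ratio and its approximate variance for one 2x2 stratum.

    a: exposed cases, b: unexposed cases,
    c: exposed noncases, d: unexposed noncases.
    """
    return math.log((a * d) / (b * c)), 1 / a + 1 / b + 1 / c + 1 / d

def pooled_odds_ratio(strata):
    """Inverse-variance weighted odds ratio with approximate 95% limits."""
    weighted_sum = 0.0
    total_weight = 0.0
    for stratum in strata:
        estimate, variance = log_or_and_var(*stratum)
        weighted_sum += estimate / variance
        total_weight += 1 / variance
    pooled = weighted_sum / total_weight
    se = math.sqrt(1 / total_weight)
    return (math.exp(pooled),
            math.exp(pooled - 1.96 * se),
            math.exp(pooled + 1.96 * se))

# Hypothetical observed data (not taken from any study).
observed = (12, 20, 60, 200)
# Hypothetical prior stratum: 4 cases per group, huge denominators.
prior = (4, 4, 100_000, 100_000)

data_only = pooled_odds_ratio([observed])
with_prior = pooled_odds_ratio([observed, prior])
print("data only :", data_only)
print("with prior:", with_prior)
```

The pooled estimate with the prior stratum is pulled from the observed odds ratio toward the prior median, and its interval is narrower than the interval from the data alone. The same software that produced the frequentist estimate produces the approximate posterior, with no change other than the extra records.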

Dr Carpenter goes on to ask ‘if the model is badly wrong, can the prior data really fix the problem?’ I never claimed data priors fix model problems, because they do not; neither do Markov chain Monte Carlo, propensity scores, inverse-probability weighting, or other modern tools, for they are all methods that assume models and compute from there. Model-expansion methods like bias modelling address model deficiencies; data priors address contextual understanding and computational transparency. Sadly, context and transparency are largely neglected by statistical research, which instead focuses on generality and precision beyond any relevance to most health or social scientists.

Dr Carpenter closes his commentary with ‘However, I believe that to hope [the Bayesian paradigm] alone will save epidemiology from the fate some have predicted is wishful thinking.’ I never said and do not believe that ‘the Bayesian paradigm alone’ is sufficient for ‘saving epidemiology’ or for epidemiological inference; I only argue that Bayesian thinking is necessary as part of a broad and well-rounded approach. A major flaw shared by most statistical methods (be they frequentist, likelihoodist, or Bayesian) is that they pretend the data came from an ideal study in which all important influences on the data (including those of the investigators and the subjects) can be approximated by a known model. In observational epidemiology this can be a fatal assumption, resulting in far too much certainty placed on statistical results. Others and I have discussed how statistics can approach this problem in a contextually informed manner by using bias models with explicit priors for bias parameters.3

Note well: There is no single Bayesian paradigm4 just as there is no single frequentist paradigm.5 I argue that a subjective Bayesian perspective is needed. Neither ‘objective’ Bayesian methods6 nor ‘pure-likelihood’ methods7 address the pseudo-objectivity that frequentism perpetuates, and like frequentist methods they contain hidden subjective elements. I do not hold that subjective Bayesian methods should replace these methods; instead I concluded that ‘they should take their place alongside frequentist approaches’3 in a Bayesian/frequentist dualism.4,8,9

A dualistic approach is needed because Bayesianism and frequentism address different questions. Bayesianism addresses questions of the form ‘Having seen the data, what odds should I place on this hypothesis versus another?’ and seeks methods that use contextual information to improve the bet; it cares about the observed data, not counterfactual data as might arise under a hypothetical long run. In contrast, frequentism addresses questions of the form ‘if I applied this method to a hypothetical long run of studies like this one, how would it behave?’ and seeks methods with desirable long-run behaviour; it does not care about odds of hypotheses given the data that were actually observed, or even whether a particular decision produced by applying its methods to those data is better than alternatives. Importantly, Bayesian methods exhibit desirable frequentist (long-run) properties when both they and the evaluation are well informed by the scientific context.10

Most researchers are not interested in evaluating statistical methods, however, but instead are interested in contextual hypotheses. Smart researchers understand that an epidemiological study cannot by itself form a sound basis for accepting or rejecting hypotheses but can only provide evidence within a larger context. Contextual statements about hypotheses are sought and are what Bayesian statistics can provide. It is thus unsurprising that, having been given only frequentist methods, researchers consistently misinterpret frequentist outputs as if those were Bayesian (e.g. they write their discussions as if their two-sided P-value is the probability of the null hypothesis given the data).

The mismatch between the methods researchers are taught and the questions they actually ask has produced a chronic psychosis in study reporting, in which ‘nonsignificance’ is taken as evidence for the null, or (as Dr Carpenter notes) even as grounds to not report or cite the result, thus distorting entire literatures and reviews. It is technically correct that the blame for such nonsense is with the user, not with frequentism; emphasis on the width and limits of confidence intervals rather than on statistical significance might avoid such problems.11,12 But blaming users is like blaming consumers for eating junk food when that is what is most promoted to them. We should blame elementary statistics textbooks and teachers for giving users such poor conceptual foundations, and for providing only tools inappropriate for most users' questions. In the US at least, most statistics teaching for non-statisticians still fails to give Bayesian perspectives any meaningful time. Fortunately, applied statistics has rediscovered the importance of those perspectives for answering scientific questions.8,9 It is time for basic statistical education, and epidemiology, to catch up, and data priors are the simplest way to do so.


    References
 
1 Carpenter JR. Commentary: On Bayesian perspectives for epidemiological research. Int J Epidemiol 2006;35:775–77.

2 Greenland S. Bayesian perspectives for epidemiologic research. II. Extensions to non-normal priors and regression analysis. Submitted.

3 Greenland S. Multiple-bias modeling for analysis of observational data (with discussion). J R Stat Soc Ser A 2005;168:267–308.

4 Good IJ. Good Thinking. Minneapolis: University of Minnesota Press, 1983.

5 Goodman SN. P values, hypothesis tests, and likelihood: implications for epidemiology of a neglected historical debate (with discussion). Am J Epidemiol 1993;137:485–501.

6 Berger JO. The case for objective Bayesian analysis. Bayesian Analysis, to appear 2006 (Available at: http://www.stat.cmu.edu/bayesworkshop/2005/panel.html).

7 Royall R. Statistical Inference: A Likelihood Paradigm. New York: Chapman and Hall, 1997.

8 Rubin DB. Practical implications of modes of statistical inference for causal effects and the critical role of the assignment mechanism. Biometrics 1991;47:1213–34.

9 Efron B. Bayesians, frequentists, and scientists. J Am Stat Assoc 2005;100:1–5.

10 Gustafson P, Greenland S. The performance of random coefficient regression in accounting for residual confounding. Biometrics 2006;62:In press.

11 Poole C. Low P-values or narrow confidence intervals: which are more durable? Epidemiology 2001;12:291–94.

12 Altman DG, Machin D, Bryant TN, Gardner MA (eds). Statistics with Confidence, 2nd edn. London: BMJ Publishing Group, 2000.

