IJE Advance Access originally published online on July 17, 2006
International Journal of Epidemiology 2006 35(4):1081-1082; doi:10.1093/ije/dyl139
Commentary: Dealing with measurement error: multiple imputation or regression calibration?
MRC Biostatistics Unit, Institute of Public Health, Cambridge CB2 2SR, UK
E-mail: ian.white{at}mrc-bsu.cam.ac.uk
Multiple imputation (MI) is a well-established method of handling missing data and is increasingly implemented in statistical software packages. Unlike other imputation methods, MI produces not one but several imputed datasets. This enables it to appropriately reflect the uncertainty due to missing data and, hence, to produce valid statistical inferences.1
Cole et al. in this issue2 propose that MI may also be useful in dealing with a second problem rife in epidemiology: exposure measurement error, which typically causes underestimation of exposure–disease associations (regression dilution bias).3 They coin the acronym MIME (multiple imputation for measurement error) and show that this method can indeed remove regression dilution bias. How widely should MIME be used?
Unfortunately, MIME is only appropriate for measurement error problems in which the true exposure is measured in a sub-sample (a validation study). This is because MIME involves fitting a regression model of true exposures on observed exposures, in order to impute the unobserved true exposures. Often, the degree of measurement error is assessed by taking repeat measurements (a repeatability study).4 In such cases, the true exposure is never observed, so MIME as described by Cole et al. would not be appropriate (and complex modifications would be required to make MIME work).
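The imputation step described above can be sketched in a deliberately simplified setting. The following is a minimal illustration, not Cole et al.'s implementation: it assumes a continuous outcome and a linear analysis model, classical additive measurement error, and a validation sub-sample formed by the first `n_val` subjects. It also ignores uncertainty in the imputation-model parameters, which a proper MI procedure would draw from their approximate posterior. All sample sizes, the true slope `beta` and the helper names are hypothetical.

```python
import random
import statistics

def ols2(x1, x2, y):
    """Least-squares fit of y on x1 and x2; returns (b0, b1, b2)."""
    n = len(y)
    m1, m2, my = sum(x1) / n, sum(x2) / n, sum(y) / n
    s11 = sum((a - m1) ** 2 for a in x1)
    s22 = sum((b - m2) ** 2 for b in x2)
    s12 = sum((a - m1) * (b - m2) for a, b in zip(x1, x2))
    s1y = sum((a - m1) * (c - my) for a, c in zip(x1, y))
    s2y = sum((b - m2) * (c - my) for b, c in zip(x2, y))
    det = s11 * s22 - s12 ** 2
    b1 = (s22 * s1y - s12 * s2y) / det
    b2 = (s11 * s2y - s12 * s1y) / det
    return my - b1 * m1 - b2 * m2, b1, b2

def slope_and_var(x, y):
    """Slope of y on x and the usual estimate of its sampling variance."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    slope = sum((a - mx) * (b - my) for a, b in zip(x, y)) / sxx
    sse = sum((b - my - slope * (a - mx)) ** 2 for a, b in zip(x, y))
    return slope, sse / (n - 2) / sxx

rng = random.Random(35)
n, n_val, beta, M = 10000, 2000, 1.0, 10  # hypothetical settings

x = [rng.gauss(0, 1) for _ in range(n)]        # true exposure (used only for i < n_val)
w = [xi + rng.gauss(0, 1) for xi in x]         # observed exposure, measured with error
y = [beta * xi + rng.gauss(0, 1) for xi in x]  # continuous outcome

# Imputation model: true exposure on observed exposure AND outcome,
# fitted in the validation sub-sample only.
b0, b1, b2 = ols2(w[:n_val], y[:n_val], x[:n_val])
resid = [x[i] - (b0 + b1 * w[i] + b2 * y[i]) for i in range(n_val)]
sigma = (sum(r * r for r in resid) / (n_val - 3)) ** 0.5

slopes, variances = [], []
for _ in range(M):
    # Keep the true exposure where it was observed; impute it elsewhere
    # as the model prediction plus a random residual draw.
    x_imp = x[:n_val] + [
        b0 + b1 * w[i] + b2 * y[i] + rng.gauss(0, sigma)
        for i in range(n_val, n)
    ]
    s, v = slope_and_var(x_imp, y)
    slopes.append(s)
    variances.append(v)

# Rubin's rules: pool the M estimates and combine the variance components.
pooled_slope = statistics.mean(slopes)
within_var = statistics.mean(variances)
between_var = statistics.variance(slopes)
total_var = within_var + (1 + 1 / M) * between_var

print(f"pooled slope = {pooled_slope:.3f}, SE = {total_var ** 0.5:.3f}")
```

The sketch makes the dependence on a validation study concrete: the imputation model cannot be fitted unless the true exposure is observed for some subjects.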
The main alternative to MIME is regression calibration (RC).5 RC replaces observed exposures by predicted values of the true exposure; using these predicted values as if they were the true exposure can yield valid estimates of the true exposure–disease relationship.6 Both MIME and RC are based on regression models for the true exposure (although disease status is included in the model for MIME and not for RC). But whereas MIME must create several possible values of true exposure, RC creates only a single predicted value: this is why RC works with a repeatability study.
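As a rough illustration of regression dilution and of RC with a validation study, the sketch below simulates a continuous outcome with classical additive measurement error. The linear outcome model, the sample sizes, the true slope `beta` and the choice of the first `n_val` subjects as the validation sub-sample are all assumptions made for the example, not details taken from Cole et al.

```python
import random

def ols(x, y):
    """Return (intercept, slope) of the least-squares line of y on x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    slope = sxy / sxx
    return my - slope * mx, slope

rng = random.Random(2006)
n, n_val, beta = 20000, 2000, 1.0  # hypothetical settings

x = [rng.gauss(0, 1) for _ in range(n)]        # true exposure
w = [xi + rng.gauss(0, 1) for xi in x]         # observed exposure, measured with error
y = [beta * xi + rng.gauss(0, 1) for xi in x]  # continuous outcome

# Naive analysis: the slope of outcome on observed exposure is diluted.
_, naive_slope = ols(w, y)

# Calibration model: true exposure on observed exposure, fitted in the
# validation sub-sample (the only place the true exposure is observed).
a, lam = ols(w[:n_val], x[:n_val])

# RC: replace every observed exposure by its predicted true exposure,
# then analyse the predictions as if they were the true exposure.
x_hat = [a + lam * wi for wi in w]
_, rc_slope = ols(x_hat, y)

print(f"naive slope = {naive_slope:.3f}")
print(f"calibration slope = {lam:.3f}")
print(f"RC slope = {rc_slope:.3f}")
```

With a single exposure and a linear calibration model, the RC estimate equals the naive slope divided by the calibration slope, which is why its confidence interval can be treated as that of a ratio.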
With validation studies, is MIME superior to the RC approach? Cole et al. report in their abstract that MIME was sometimes more powerful than misclassified or RC analyses. Although this is true, it is fair to point out that MIME was the most powerful of the three analyses in only two of eight scenarios considered (Table 3); in the other six scenarios it was the least powerful of the three. Surprisingly, Cole et al. also found that RC was uniformly less powerful than misclassified analyses: however, the difference in power would have disappeared if they had used asymmetric confidence intervals based on Fieller's theorem7 instead of symmetrical intervals based on Rosner's variance estimate.
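To make the contrast between asymmetric and symmetric intervals concrete, the sketch below computes a Fieller interval and a delta-method (symmetric) interval for a ratio of two estimates, such as a naive coefficient divided by a calibration slope. The function names and the numerical inputs are hypothetical, and the delta-method interval is a generic stand-in for a symmetric interval, not a reproduction of Rosner's variance formula.

```python
import math

def fieller_ci(a, b, var_a, var_b, cov_ab=0.0, z=1.96):
    """Fieller confidence limits for the ratio a / b.

    Solves (a - t*b)^2 = z^2 * (var_a - 2*t*cov_ab + t^2*var_b) for t.
    """
    qa = b * b - z * z * var_b
    qb = a * b - z * z * cov_ab
    qc = a * a - z * z * var_a
    if qa <= 0:
        # The denominator is not clearly non-zero, so the set is unbounded.
        raise ValueError("denominator not significantly different from zero")
    root = math.sqrt(max(qb * qb - qa * qc, 0.0))
    return (qb - root) / qa, (qb + root) / qa

def delta_ci(a, b, var_a, var_b, cov_ab=0.0, z=1.96):
    """Symmetric delta-method confidence limits for the ratio a / b."""
    ratio = a / b
    var = var_a / b ** 2 + a ** 2 * var_b / b ** 4 - 2 * a * cov_ab / b ** 3
    half = z * math.sqrt(var)
    return ratio - half, ratio + half

# Hypothetical inputs: naive coefficient 0.5, calibration slope 0.5,
# each with variance 0.01 and zero covariance.
f_lo, f_hi = fieller_ci(0.5, 0.5, 0.01, 0.01)
d_lo, d_hi = delta_ci(0.5, 0.5, 0.01, 0.01)

print(f"Fieller interval: ({f_lo:.3f}, {f_hi:.3f})")
print(f"delta interval:   ({d_lo:.3f}, {d_hi:.3f})")
```

Setting t = 0 in the defining inequality shows that zero lies inside the Fieller interval exactly when a² ≤ z²·var_a, that is, when the numerator is itself not significantly different from zero. Under the assumptions of this sketch, a test based on the Fieller interval therefore rejects in the same cases as the test of the uncorrected coefficient.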
There is one good theoretical reason for expecting MIME to perform better than RC. MIME uses the true exposure when it is available, rather than imputing a value, whereas RC always predicts the true exposure from the observed exposure. It is, therefore, surprising that MIME performed so badly in several cases in Cole et al.'s simulation studies. Possible explanations include the difficulty in performing multiple imputation with survival outcomes8 and extreme estimates in a minority of imputed datasets.
Finally, it is important to remember that the main problem in epidemiology is measurement error in confounders, not in exposures.9 Whereas measurement error in exposures dilutes exposure–disease associations, measurement error in confounders can lead to overestimation of associations. Fortunately, given adequate information about measurement error, both MIME and RC can be directly extended to handle measurement error in confounders.10
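The effect of confounder measurement error can be illustrated with a small simulation. In this sketch the exposure has no effect on the outcome at all; the outcome depends only on a confounder that is correlated with the exposure. The linear model, the unit variances and the sample size are assumptions chosen for simplicity, and the variable names are hypothetical.

```python
import random

def ols2(x1, x2, y):
    """Least-squares fit of y on x1 and x2; returns (b0, b1, b2)."""
    n = len(y)
    m1, m2, my = sum(x1) / n, sum(x2) / n, sum(y) / n
    s11 = sum((a - m1) ** 2 for a in x1)
    s22 = sum((b - m2) ** 2 for b in x2)
    s12 = sum((a - m1) * (b - m2) for a, b in zip(x1, x2))
    s1y = sum((a - m1) * (c - my) for a, c in zip(x1, y))
    s2y = sum((b - m2) * (c - my) for b, c in zip(x2, y))
    det = s11 * s22 - s12 ** 2
    b1 = (s22 * s1y - s12 * s2y) / det
    b2 = (s11 * s2y - s12 * s1y) / det
    return my - b1 * m1 - b2 * m2, b1, b2

rng = random.Random(9)
n = 20000  # hypothetical sample size

c = [rng.gauss(0, 1) for _ in range(n)]        # true confounder
x = [ci + rng.gauss(0, 1) for ci in c]         # exposure, correlated with the confounder
y = [ci + rng.gauss(0, 1) for ci in c]         # outcome: depends on c only, not on x
c_obs = [ci + rng.gauss(0, 1) for ci in c]     # confounder measured with error

# Adjusting for the true confounder recovers the null exposure effect.
_, effect_true_adj, _ = ols2(x, c, y)

# Adjusting for the mismeasured confounder leaves residual confounding,
# so the exposure appears to have an effect.
_, effect_err_adj, _ = ols2(x, c_obs, y)

print(f"exposure coefficient, adjusted for true confounder:        {effect_true_adj:.3f}")
print(f"exposure coefficient, adjusted for mismeasured confounder: {effect_err_adj:.3f}")
```

Under these assumed variances the spurious coefficient is about one third in expectation, even though the true exposure effect is zero.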
References
1 Rubin DB. Multiple Imputation for Nonresponse in Surveys. New York: John Wiley and Sons, 1987.
2 Cole SR, Chu H, Greenland S. Multiple-imputation for measurement-error correction. Int J Epidemiol 2006;35:1074–81.
3 Knuiman MW, Divitini ML, Buzas JS, Fitzgerald PEB. Adjustment for regression dilution in epidemiological regression analyses. Ann Epidemiol 1998;8:56–63.
4 Rosner B, Spiegelman D, Willett W. Correction of logistic regression relative risk estimates and confidence intervals for random within-person measurement error. Am J Epidemiol 1992;136:1400–13.
5 Rosner B, Willett WC, Spiegelman D. Correction of logistic regression relative risk estimates and confidence intervals for systematic within-person measurement error. Stat Med 1989;8:1051–69.
6 Carroll RJ, Ruppert D, Stefanski LA. Measurement Error in Nonlinear Models. London: Chapman and Hall, 1995.
7 Frost C, Thompson SG. Correcting for regression dilution bias: comparison of methods for a single predictor variable. J R Stat Soc A 2000;163:173–89.
8 van Buuren S, Boshuizen HC, Knook DL. Multiple imputation of missing blood pressure covariates in survival analysis. Stat Med 1999;18:681–94.
9 Phillips A, Davey Smith G. Bias in relative odds estimation due to imprecise measurement of correlated exposures. Stat Med 1992;11:953–61.
10 Rosner B, Spiegelman D, Willett W. Correction of logistic regression relative risk estimates and confidence intervals for measurement error: the case of multiple covariates measured with error. Am J Epidemiol 1990;132:734–45.