Commentary: Calculations of EPIC proportions
*Department of Epidemiology and Department of Biostatistics, Harvard School of Public Health, Boston, MA, USA. E-mail: stdls{at}channing.harvard.edu
Accepted 5 February 2008
Rosner et al.1 developed the necessary methods for the correction of bias due to measurement error in multivariate models in 1990. Although originally focussed on relative risk estimation and inference from logistic regression models, their method was extended in a completely straightforward manner to the Cox model for survival data analysis in a rare disease setting,2,3 as occurs in prospective studies of cancer incidence including the European Prospective Investigation into Cancer and Nutrition (EPIC) study,4 in the Pooling Project of Prospective Studies of Diet and Cancer in Men and Women5 and elsewhere. Smith-Warner and colleagues have successfully applied this method to numerous investigations from the Pooling Project.6–11 In this commentary, we will explore how the methods presented in Ferrari et al.'s12 paper differ from those of Rosner, Spiegelman and Willett1,2 and, where differences are apparent, what the advantages and disadvantages of the alternative approaches are.
Let us start with the point estimators. When regression calibration was first proposed as a means of correcting, to some approximation, for bias due to exposure measurement error in non-linear models, most importantly logistic regression and then Cox models, what appeared to be two distinct approaches were proposed simultaneously. One was Rosner et al.'s1,13 and the second was from Raymond J. Carroll's group.14–16 Rosner et al.'s method took the bias-correction approach, where the standard point and interval estimate of relative risk (also known as the naïve, uncorrected or unadjusted estimate) would be corrected for bias due to measurement error in a second step. Carroll et al.'s approach singly imputes an estimated true exposure variable from the measurement error model, and performs the standard analysis conditional on this estimated true exposure. Variance corrections can be performed post hoc, but standard software will underestimate the true variability when this procedure is applied. Both of these methods began to be called regression calibration methods, presumably because both use a regression model in the validation or reliability study to calibrate the usual, or surrogate, exposure to the true exposure (or an unbiased estimate of the true exposure17,18). In 2003, Thurston et al.19 showed that, in generalized linear models, the point and corrected interval estimates from the two approaches are algebraically identical. A straightforward comparison of Spiegelman et al.2 and Xie et al.3 shows that a similar result applies to Cox models, as used by Ferrari et al. in today's paper from the EPIC study.
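The equivalence of the two point estimators can be illustrated in the simplest possible case. The sketch below is only a toy: it assumes a single mis-measured exposure, a linear outcome model fitted by least squares rather than a logistic or Cox model, simulated data, and an external validation study. All variable names and numerical settings are hypothetical and are not taken from Ferrari et al. or from the SAS macro. It shows that dividing the naïve slope by the calibration slope (the bias-correction approach) gives the same number as regressing the outcome on the imputed exposure (the single imputation approach).

```python
import numpy as np

rng = np.random.default_rng(0)

def ols_slope(x, y):
    """Slope from a simple least-squares regression of y on x."""
    x_centered = x - x.mean()
    return float(x_centered @ (y - y.mean()) / (x_centered @ x_centered))

# Hypothetical validation study: true exposure X and surrogate Z both observed.
n_val = 500
x_val = rng.normal(0.0, 1.0, n_val)
z_val = x_val + rng.normal(0.0, 0.8, n_val)
gamma_hat = ols_slope(z_val, x_val)                  # calibration slope of X on Z
alpha_hat = x_val.mean() - gamma_hat * z_val.mean()  # calibration intercept

# Hypothetical main study: only the surrogate Z and the outcome Y are observed.
n_main = 5000
x_main = rng.normal(0.0, 1.0, n_main)
z_main = x_main + rng.normal(0.0, 0.8, n_main)
y_main = 0.5 * x_main + rng.normal(0.0, 1.0, n_main)

# Bias-correction approach: fit the naive model, then correct in a second step.
beta_naive = ols_slope(z_main, y_main)
beta_bias_corrected = beta_naive / gamma_hat

# Single imputation approach: replace Z by the estimated true exposure, then refit.
x_imputed = alpha_hat + gamma_hat * z_main
beta_imputed = ols_slope(x_imputed, y_main)

print(beta_naive, beta_bias_corrected, beta_imputed)
```

The agreement here is exact up to floating point error because the imputed exposure is a linear function of the surrogate. The sketch says nothing about the variance, which is where the two implementations differ in practice.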
Rosner et al.'s version of regression calibration for multivariate measurement error correction of point and interval estimates in logistic, linear and Cox models is supported by a user-friendly SAS macro accompanied by a detailed user's manual freely available on my website (http://www.hsph.harvard.edu/faculty/spiegelman/blinplus.html). Ferrari et al. instead used Carroll et al.'s single imputation regression calibration. They were then faced with a major drawback of this version of the method: standard software under-estimates the variance of the relative risk, leading to confidence intervals that are too narrow, with coverage probabilities below the nominal level, and to P-values that are too small. Although Carroll et al. offer a non-iterative asymptotic variance estimator for their version of regression calibration16 (Appendix B3), there is no software available, at least not in SAS and not in the case of the more general measurement error model given by Rosner et al. and utilized by Ferrari et al. [equation (1)]. Hence, computations of EPIC proportions, using the bootstrap to obtain the variance of their measurement-error corrected relative risks, were required.
The bootstrap, proposed by Efron in 1979,20 was a glorious advance in statistics: a means of constructing confidence intervals when sample sizes are too small to permit valid large sample (asymptotic) methods and when either exact calculations are numerically intractable or undue sensitivity to mis-specification of the assumed small sample (exact) distribution is to be avoided. To obtain bootstrap confidence intervals, one samples with replacement from the original study to create M new pseudo-studies of the same sample size as the original study, where M typically ranges between 100 and 1000. Each of these M new pseudo-studies is analysed as the original, and the empirical 2.5th and 97.5th percentiles of the relative risks obtained from these M pseudo-studies bound the 95% bootstrap confidence interval. Another version of the bootstrap uses re-sampling to obtain a non-parametric variance estimate [as in equation (3) of Ferrari et al.], and applies this to form asymptotic Wald-type confidence intervals. I prefer the first version: given the computational effort involved in bootstrapping to avoid possibly invalid exact or asymptotic distributional assumptions, why revert to asymptotics to form confidence intervals?
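The two versions of the bootstrap can be sketched side by side. The code below is a minimal illustration under stated assumptions, not the procedure used by Ferrari et al.: the function name, the toy data and the choice of the sample mean as the statistic are all hypothetical, and a real analysis would refit the full disease model in each pseudo-study. For a 95% interval, the percentile version takes the 2.5th and 97.5th percentiles of the M re-sampled estimates, whereas the Wald-type version uses only their standard deviation.

```python
from statistics import NormalDist

import numpy as np

def bootstrap_intervals(data, statistic, n_boot=1000, level=0.95, seed=0):
    """Return (percentile_ci, wald_ci) for statistic(data) by re-sampling rows."""
    rng = np.random.default_rng(seed)
    n = len(data)
    replicates = np.empty(n_boot)
    for b in range(n_boot):
        idx = rng.integers(0, n, size=n)      # sample n observations with replacement
        replicates[b] = statistic(data[idx])  # analyse the pseudo-study as the original

    tail = (1.0 - level) / 2.0

    # Version 1: percentile interval, read directly off the empirical distribution.
    lower, upper = np.quantile(replicates, [tail, 1.0 - tail])

    # Version 2: bootstrap variance plugged into an asymptotic Wald-type interval.
    estimate = statistic(data)
    standard_error = replicates.std(ddof=1)
    z = NormalDist().inv_cdf(1.0 - tail)
    wald = (estimate - z * standard_error, estimate + z * standard_error)

    return (float(lower), float(upper)), wald

# Toy data (assumption): 200 skewed observations, with the mean as the statistic.
toy = np.random.default_rng(1).lognormal(mean=0.0, sigma=0.5, size=200)
pct_ci, wald_ci = bootstrap_intervals(toy, np.mean, n_boot=1000)
print(pct_ci, wald_ci)
```

The percentile interval need not be symmetric about the point estimate, which is the property that is given up when the bootstrap variance is inserted into a Wald-type interval.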
Now, let us discuss the motivation for the use of the bootstrap in the EPIC study. The sample size of the main study is large (N = 478 040), the case count is high (1329), and the impressively large sizes of the center- and gender-specific validation studies are given by Slimani et al.21 as ranging from n = 200 to as high as 2000. Asymptotic methods are the norm in large sample settings such as these. Readers should know that to produce each of the confidence intervals given in Ferrari et al.'s Table 4, EPIC was re-analysed 300 times; that is, 300 Cox models were fit to these 1329 cases and 478 040 study participants after re-sampling the calibration studies. This was repeated for the three primary exposure variables (fish, energy from fat and energy from non-fat), for men and women separately. In total, 300 × 3 × 2 = 1800 Cox models were fit: indeed, calculations of EPIC proportions, all but six of which would have been unnecessary had the asymptotic variance formula been used.
As it turns out, the bootstrap variance was not appreciably different from the biased variance provided by the standard software, which treats the measurement error model parameters from (1) as known. Although Carroll et al. recommend bootstrapping both the main study and the calibration study to form bootstrapped confidence intervals in a measurement error setting16 (Appendix 9.5), Ferrari et al. appear to have bootstrapped only the calibration studies. The possibly incomplete bootstrap algorithm used could explain why the bootstrap variance was nearly the same as the biased variance from standard software in their study. On the other hand, Rosner et al.22 reported a similar finding in 1992 (Figure 1), where, in the case of random within-person variation such as occurs in the measurement of blood pressure and other biomarkers, once the sample size of the reliability study in which replicate measures are taken was greater than 100, variability due to estimation of the measurement-error model parameters had little influence on the overall variance of the measurement error-corrected relative risk. Nonetheless, I would not recommend ignoring this component, as in situations of large measurement error and multiple highly-correlated mis-measured variables, larger sample sizes are undoubtedly needed.
In the discussion, Ferrari et al. suggest that the bootstrap may be needed in their study because of the zero-inflated distribution of food consumption. Several authors have shown that the regression calibration approximation holds without any normality assumptions on the measurement-error model (1), as long as measurement error is moderate, as here.13,23 Homoscedasticity of the error variance in (1) is required, and is likely violated in zero-inflated data. However, there is quite a bit of evidence indicating that the standard linear regression calibration nevertheless gives a good approximation to the point and interval estimates of relative risk.24–26 Schmid and Rosner24 used a mixture model to treat mis-classification of the 0's in alcohol intake separately from measurement error in the non-zero values in the Nurses' Health Study, where the correlation between the absolute value of the residuals and the predicted values from the measurement error model was 0.44, indicating considerable heteroscedasticity. In an analysis of breast cancer incidence in relation to alcohol intake, the results from this more complex method of correction for measurement error were not materially different from the results that used standard linear regression calibration as given by Rosner et al.1 The mixture model gave a relative risk (95% CI) of 1.52 (1.23–1.87) compared to 1.62 (1.23–2.12) from the standard linear regression calibration. In work in progress by Spiegelman and colleagues,27 an extended regression calibration estimator was developed that uses a second order Taylor series expansion to obtain an estimator that allows for heteroscedasticity. In a simulation study with correlations of the residual variance with the predicted values of the measurement error model as large as 0.6, this estimator offered little improvement over the standard regression calibration estimator, which exhibited little bias and nominal coverage probability27 (Table 5).
Similarly, in two examples, including one from the Nurses' Health Study looking at alcohol intake in relation to breast cancer incidence as in Schmid and Rosner, the results given by the standard regression calibration estimator were similar to those obtained when the extended estimator was applied27 (Table 4). Both gave results very close to those given by the maximum likelihood estimator for mis-classification of alcohol intake, where no parametric assumptions are made at all.25
In conclusion, in the large sample settings of prospective cohort studies with reasonably sized validation studies, asymptotic methods are typically valid; application of the bootstrap is likely to be an unnecessary waste of CPU and a barrier to widespread application of regression calibration for measurement error correction. Ferrari et al. are to be commended for their efforts to explicitly adjust their findings on fish and fat intake in relation to colorectal cancer incidence for bias in point and interval estimates due to measurement error by using a main study/validation study design and methods which make full use of their extraordinarily detailed calibration studies. Because they did so, we are able to conclude with more confidence that higher levels of fish intake are related to a decrease in risk of colorectal cancer and, in particular, that for each 10 g/day increase in fish intake there is a 10% lowering of risk, and that even after adjusting for measurement error, there is no evidence for an effect of excess energy intake or increased fat intake on risk. By taking this extra step beyond standard methods, bias due to measurement error is largely eliminated as a source of uncertainty in the interpretation of these data. Investigators should be cautioned not to avoid the use of measurement error correction because of an incorrect perception that, to do so, computations of EPIC proportions are required.
References
1 Rosner B, Spiegelman D, Willett WC. Correction of logistic regression relative risk estimates and confidence intervals for measurement error: the case of multiple covariates measured with error. Am J Epidemiol (1990) 132:734–45.
2 Spiegelman D, McDermott A, Rosner B. Regression calibration method for correcting measurement-error bias in nutritional epidemiology. Am J Clin Nutr (1997) 65:1179S–86S.
3 Xie SX, Wang CY, Prentice RL. A risk set calibration method for failure time regression by using a covariate reliability sample. J Roy Stat Soc, Ser B (2001) 63:855–70.
4 Riboli E. Nutrition and cancer: background and rationale of the European Prospective Investigation into Cancer and Nutrition (EPIC). Ann Oncol (1992) 3:783–91.
5 Hunter DJ, Spiegelman D, Adami HO, et al. Cohort studies of fat intake and the risk of breast cancer–a pooled analysis. N Engl J Med (1996) 334:356–61.
6 Smith-Warner SA, Spiegelman D, Ritz J, et al. Methods for pooling results of epidemiologic studies: the Pooling Project of Prospective Studies of Diet and Cancer. Am J Epidemiol (2006) 163:1053–64.
7 Smith-Warner SA, Spiegelman D, Adami HO, et al. Types of dietary fat and breast cancer: a pooled analysis of cohort studies. Int J Cancer (2001) 92:767–74.
8 Freudenheim JL, Ritz J, Smith-Warner SA, et al. Alcohol consumption and risk of lung cancer: a pooled analysis of cohort studies. Am J Clin Nutr (2005) 82:657–67.
9 Park Y, Hunter D, Spiegelman D, et al. Dietary fiber intake and risk of colorectal cancer: a pooled analysis of prospective cohort studies. JAMA (2005) 294:2849–57.
10 Genkinger JM, Hunter DJ, Spiegelman D, et al. Dairy products and ovarian cancer: a pooled analysis of 12 cohort studies. Cancer Epidemiol Biomarkers Prev (2006) 15:364–72.
11 Genkinger JM, Hunter DJ, Spiegelman D, et al. A pooled analysis of 12 cohort studies of dietary fat, cholesterol and egg intake and ovarian cancer. Cancer Causes Control (2006) 17:273–85.
12 Ferrari P, Day NE, Boshuizen HC, et al. The evaluation of the diet/disease relation in the EPIC study: considerations for the calibration and the disease models. Int J Epidemiol (2008) 37:368–78.
13 Rosner B, Willett WC, Spiegelman D. Correction of logistic regression relative risk estimates and confidence intervals for systematic within-person measurement error. Stat Med (1989) 8:1051–69; discussion 1071–73.
14 Carroll RJ, Stefanski LA. Approximate quasi-likelihood estimation in models with surrogate predictors. J Am Stat Assoc (1990) 85:652–63.
15 Carroll RJ, Ruppert D, Stefanski LA. Measurement Error in Nonlinear Models. (1995) London: Chapman & Hall.
16 Carroll RJ, Ruppert D, Stefanski LA, Crainiceanu CM. Measurement Error in Nonlinear Models. Second edition. (2006) London: Chapman & Hall.
17 Wacholder S, Armstrong B, Hartge P. Validation Studies using an Alloyed Gold Standard. Am J Epidemiol (1993) 137:1251–58.
18 Spiegelman D, Schneeweiss S, McDermott A. Measurement error correction for logistic regression models with an "alloyed gold standard". Am J Epidemiol (1997) 145:184–96.
19 Thurston SW, Spiegelman D, Ruppert D. Equivalence of regression calibration methods for main study/external validation study designs. J Stat Plan Inference (2003) 113:527–39.
20 Efron B. 1977 Rietz lecture – bootstrap methods – another look at the Jackknife. Annals of Statistics (1979) 7:1–26.
21 Slimani N, Kaaks R, Ferrari P, et al. European Prospective Investigation into Cancer and Nutrition (EPIC) calibration study: rationale, design and population characteristics. Public Health Nutr (2002) 5:1125–45.
22 Rosner B, Spiegelman D, Willett WC. Correction of logistic regression relative risk estimates and confidence intervals for random within-person measurement error. Am J Epidemiol (1992) 136:1400–13.
23 Kuha J. Corrections for exposure measurement error in logistic regression models with an application to nutritional data. Stat Med (1994) 13:1135–48.
24 Schmid CH, Rosner B. A Bayesian approach to logistic regression models having measurement error following a mixture distribution. Stat Med (1993) 12:1141–53.
25 Spiegelman D, Rosner B, Logan R. Estimation and inference for logistic regression with covariate misclassification and measurement error in main study/validation study designs. J Am Stat Assoc (2000) 95:51–61.
26 Spiegelman D, Carroll RJ, Kipnis V. Efficient regression calibration for logistic regression in main study/internal validation study designs with an imperfect reference instrument. Stat Med (2001) 20:139–60.
27 Spiegelman D, Logan R, Grove D. Regression Calibration with Heteroscedastic Error Variance. Available at: http://www.hsph.harvard.edu/faculty/spiegelman/manuscripts/beta_RCH_2.pdf (last accessed 6 March 2008).