IJE Advance Access originally published online on January 9, 2008
International Journal of Epidemiology 2008 37(2):382-385; doi:10.1093/ije/dym291
Published by Oxford University Press on behalf of the International Epidemiological Association © The Author 2008; all rights reserved.

Brief Report

How far from non-differential does exposure or disease misclassification have to be to bias measures of association away from the null?

Anne M Jurek1,*, Sander Greenland2 and George Maldonado3

1 Department of Pediatrics, University of Minnesota, Minneapolis, MN, USA.
2 Department of Epidemiology and Department of Statistics, University of California, Los Angeles, CA, USA.
3 Division of Environmental Health Sciences, University of Minnesota, Minneapolis, MN, USA.

* Corresponding author. Department of Pediatrics, University of Minnesota, Mayo Mail Code 715, 420 Delaware St. SE, Minneapolis, MN 55455, USA. E-mail: jure0007{at}umn.edu


    Abstract
 Top
 Abstract
 Introduction
 Methods
 Results
 Discussion
 Acknowledgements
 References
 
A well-known heuristic in epidemiology is that non-differential exposure or disease misclassification biases the expected values of an estimator toward the null value. This heuristic works correctly only when additional conditions are met, such as independence of classification errors. We present examples to show that, even when the additional conditions are met, if the misclassification is only approximately non-differential, then bias is not guaranteed to be toward the null. In light of such examples, we advise that evaluation of misclassification should not be based on the assumption of exact non-differentiality unless the latter can be deduced logically from the facts of the situation.


Keywords Exposure measurement, misclassification, odds ratio, prevalence, sensitivity analysis

Accepted 17 December 2007


    Introduction
A well-known heuristic in epidemiology is that non-differential exposure or disease misclassification biases the expected values of an estimator toward the null value.1–6 A more precise version is that our estimate is probably closer to the null than it would be were there no misclassification.7–12 Even with this probabilistic modification, the rule works only when special conditions besides non-differentiality are met,13–18 e.g. that the misclassification in question is independent of other errors.16,17 Furthermore, many forms of differential error will also produce bias toward the null under the same conditions.19 Thus, non-differentiality is neither necessary nor sufficient for bias toward the null.

Even allowing some utility for the rule, a careful reading of the epidemiological literature reveals that ‘non-differential’ is not consistently defined. Some epidemiological textbooks state correctly that it means the error probabilities must be ‘the same for both groups compared’20 (p. 107) or ‘identical’21 (p. 192) in both groups. Following the latter definitions, most books use examples of non-differential misclassification in which the misclassification probabilities are exactly the same. Nonetheless, it is our impression that epidemiologists believe that approximate non-differentiality is sufficient for the rule to work, as reflected in books that say non-differential misclassification results when the classification errors ‘occur in similar proportions’22 (p. 169). The question is then how close to non-differential the classification error must be to produce bias toward the null, given that the other conditions necessary for the rule are satisfied.23

We present examples to demonstrate that, even if other conditions for bias toward the null are met, the bias is not guaranteed to be toward the null if the misclassification is only ‘approximately’ non-differential by certain ordinary judgments. Our examples will concern misclassification of an uncommon exposure (under 10% prevalence) using the odds ratio as the measure of association. Because of the parallel algebra, the points also apply to misclassification of an uncommon disease in a cohort or prevalence study using the odds ratio, or the ratio of rates or proportions. With the values of sensitivity and specificity reversed, they would also apply to misclassification using the odds ratio when non-exposure was uncommon.


    Methods
Table 1 gives a hypothetical 2 x 2 table of expected cell counts which we will use for illustration, identical to data from a study of the association of private pesticide-applicator exposure with circulatory and respiratory birth anomalies.24 Suppose for the moment that the expected counts are correctly classified on outcome status (case, non-case) but to some degree incorrectly classified on exposure status (exposed, unexposed). In this single-stratum set-up, with a binary exposure and no outcome misclassification, the impact of exactly non-differential misclassification is to produce bias toward the null and possibly beyond.2–5


Table 1 Hypothetical expected cell counts and odds ratio with exposure misclassification, taken from data reported by Garry et al.24

 
Our examples will be limited to less extreme cases in which the error probabilities are always less than the measured exposure prevalences. That is, within both the case and non-case groups, in our examples the false-negative probability (probability of being classified as unexposed if exposed; equal to 1 – sensitivity) is less than the measured non-exposure prevalence, and the false-positive probability (probability of being classified as exposed if unexposed; equal to 1 – specificity) is less than the measured exposure prevalence. These restrictions avoid negative corrections and allow the expected corrected odds ratio to be computed using the simple formula


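\[
\mathrm{OR}_{\mathrm{corrected}} = \frac{(P_1 - \mathit{Fp}_1)\,/\,(1 - \mathit{Fn}_1 - P_1)}{(P_0 - \mathit{Fp}_0)\,/\,(1 - \mathit{Fn}_0 - P_0)} \qquad (1)
\]

where i = 1 for cases and 0 for non-cases, $P_i$ is the expected proportion classified as exposed in group i, $\mathit{Fn}_i$ is the false-negative probability, and $\mathit{Fp}_i$ is the false-positive probability.25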
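A minimal Python sketch of this correction follows. It assumes formula (1) takes the usual back-calculation form implied by the definitions above: within each group the corrected exposure odds are (Pi − Fpi)/(1 − Fni − Pi), and the corrected odds ratio is the ratio of those odds for cases and non-cases. The function names and the input proportions are hypothetical and are not the Table 1 values.

```python
def observed_odds_ratio(p1, p0):
    """Odds ratio computed directly from the misclassified exposure proportions."""
    return (p1 / (1 - p1)) / (p0 / (1 - p0))


def corrected_odds_ratio(p1, p0, fn1, fn0, fp1, fp0):
    """Expected corrected odds ratio (assumed form of formula (1)).

    p1, p0   : proportion classified as exposed among cases / non-cases
    fn1, fn0 : false-negative probabilities (1 - sensitivity)
    fp1, fp0 : false-positive probabilities (1 - specificity)
    """
    for p, fn, fp in ((p1, fn1, fp1), (p0, fn0, fp0)):
        # Restrictions stated in the text: Fp < P and Fn < 1 - P,
        # which keep both corrected odds positive.
        if not (0 <= fp < p and 0 <= fn < 1 - p):
            raise ValueError("need 0 <= Fp < P and 0 <= Fn < 1 - P in each group")
    odds_cases = (p1 - fp1) / (1 - fn1 - p1)
    odds_noncases = (p0 - fp0) / (1 - fn0 - p0)
    return odds_cases / odds_noncases


# Hypothetical observed exposure proportions (cases, non-cases).
example = corrected_odds_ratio(0.08, 0.05, 0.20, 0.20, 0.01, 0.01)
print(round(example, 2))
```

With all error probabilities set to zero the function returns the observed odds ratio, which is a quick sanity check on the algebra.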


    Results
As an initial example, suppose that, for both cases and non-cases, the false-negative probability is 0.26 and the false-positive probability is 0.01 (i.e. Fn1 = Fn0 = 0.26 and Fp1 = Fp0 = 0.01). Applying formula (1) to Table 1, we obtain an expected corrected odds ratio of 2.06, which is farther from the null than the expected odds ratio with misclassification, 1.62.

Suppose now that Fn1 = Fn0 = 0.21, but Fp1 = 0.02 and Fp0 = 0.01. Although the latter two error probabilities are clearly very different on a relative scale, their absolute difference is very small and so they might be judged the same for all practical purposes. While the odds ratio from Table 1 (expected counts with exposure misclassification) is 1.62, the expected odds ratio after correction is 1.34. Thus the rather small difference in Fp between the cases and non-cases led to a large correction in the direction opposite to that expected from non-differential misclassification.
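The same reversal can be reproduced in a few lines of Python. This is only an illustration under stated assumptions: the observed exposure proportions (0.08 in cases, 0.05 in non-cases) are hypothetical rather than the Table 1 values, and the helper `odds` assumes the back-calculated exposure odds (P − Fp)/(1 − Fn − P) within each group.

```python
def odds(p, fn=0.0, fp=0.0):
    """Back-calculated exposure odds in one outcome group (assumed form)."""
    return (p - fp) / (1 - fn - p)


# Hypothetical observed exposure proportions (NOT the Table 1 values).
p_cases, p_noncases = 0.08, 0.05

# Odds ratio computed from the misclassified proportions.
observed = odds(p_cases) / odds(p_noncases)

# Exactly non-differential errors: Fn = 0.20 and Fp = 0.01 in both groups.
nondiff = odds(p_cases, fn=0.20, fp=0.01) / odds(p_noncases, fn=0.20, fp=0.01)

# Approximately non-differential: Fp is 0.02 in cases, 0.01 in non-cases.
near_nondiff = odds(p_cases, fn=0.20, fp=0.02) / odds(p_noncases, fn=0.20, fp=0.01)

print(f"observed {observed:.2f}, exactly non-differential {nondiff:.2f}, "
      f"approximately non-differential {near_nondiff:.2f}")
# prints: observed 1.65, exactly non-differential 1.82, approximately non-differential 1.56
```

In this hypothetical setting the exactly non-differential correction moves the odds ratio away from the null (the observed value was biased toward the null), whereas a 0.01 absolute difference in the false-positive probabilities yields a corrected value below the observed one, meaning the observed value was biased away from the null.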

Table 2 shows the expected corrected odds ratio for various combinations of false-negative and false-positive probabilities. Overall, in accord with the low exposure prevalence, the corrected odds ratios are strongly affected by small changes in the false-positive probabilities, whereas changes in the false-negative probabilities have comparatively little impact on the results. In rows 5 and 7 the correction more than quadruples the odds ratio, while in row 6 the correction halves the odds ratio, going beyond the null. In such extreme instances we would not deem the corrected odds ratio reliable, and we would recommend instead approaches to the problem that can easily handle such extremes, such as Bayesian or shrinkage methods.26–28


Table 2 Corrected odds ratios for Table 1
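The contrast between the two kinds of error can be explored with a small grid, sketched below in Python. The observed exposure proportions and the grid values are hypothetical and do not reproduce Table 2; `corrected_or` assumes the back-calculation form of the correction, with corrected exposure odds (P − Fp)/(1 − Fn − P) in each group.

```python
def corrected_or(p1, p0, fn1, fn0, fp1, fp0):
    """Corrected odds ratio by back-calculating exposure odds in each group."""
    return ((p1 - fp1) / (1 - fn1 - p1)) / ((p0 - fp0) / (1 - fn0 - p0))


# Hypothetical observed exposure proportions (cases, non-cases); not Table 1.
P1, P0 = 0.08, 0.05

# Vary the false-negative probability (same in both groups); Fp fixed at 0.01.
by_fn = {fn: corrected_or(P1, P0, fn, fn, 0.01, 0.01) for fn in (0.10, 0.20, 0.30)}

# Vary only the case false-positive probability; Fn fixed at 0.20, Fp0 = 0.01.
by_fp1 = {fp1: corrected_or(P1, P0, 0.20, 0.20, fp1, 0.01)
          for fp1 in (0.00, 0.01, 0.02, 0.03)}

for fn, value in by_fn.items():
    print(f"Fn = {fn:.2f}, Fp1 = Fp0 = 0.01 -> corrected OR {value:.2f}")
for fp1, value in by_fp1.items():
    print(f"Fn = 0.20, Fp1 = {fp1:.2f}, Fp0 = 0.01 -> corrected OR {value:.2f}")

# Range of corrected odds ratios across each grid.
spread_fn = max(by_fn.values()) - min(by_fn.values())
spread_fp = max(by_fp1.values()) - min(by_fp1.values())
```

In this illustration, moving Fn across 0.10 to 0.30 changes the corrected odds ratio by about 0.02, whereas moving Fp1 across 0.00 to 0.03 changes it by about 0.8.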

 

    Discussion
In our examples, the low prevalence of exposure led to extreme sensitivity of the results to the false-positive probabilities. This sensitivity is a manifestation of the well-known screening problem that low specificity for an uncommon condition can lead to huge errors in estimating prevalence. In our setting the problem translates into extreme sensitivity of misclassification corrections to violations of non-differentiality. We would take this problem as a good reason to avoid reliance on the non-differentiality assumption in drawing inferences.
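The screening analogy can be made concrete with a short worked calculation. The symbol e for the true prevalence and the numerical values are assumptions introduced here for illustration. With false-negative probability Fn and false-positive probability Fp, the expected measured prevalence P and its inversion are

\[
P = e\,(1 - \mathit{Fn}) + (1 - e)\,\mathit{Fp},
\qquad
e = \frac{P - \mathit{Fp}}{1 - \mathit{Fn} - \mathit{Fp}} .
\]

For a true prevalence of e = 0.02 with Fn = 0.10 and Fp = 0.05,

\[
P = 0.02 \times 0.90 + 0.98 \times 0.05 = 0.018 + 0.049 = 0.067,
\]

so the measured prevalence is more than three times the true value. The inversion is just as fragile: correcting P = 0.067 with a slightly wrong false-positive probability of 0.04 instead of 0.05 gives

\[
e = \frac{0.067 - 0.04}{1 - 0.10 - 0.04} = \frac{0.027}{0.86} \approx 0.031,
\]

which overstates the true prevalence of 0.02 by more than half. When the false-positive probability differs slightly between cases and non-cases, errors of this kind enter the two exposure odds unequally and carry over to the odds ratio.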

Even if exact non-differentiality holds, misclassification is guaranteed to produce bias toward the null only under certain conditions, which if sufficiently violated can lead to bias away from the null.13–17 Sometimes these conditions may be obviously correct, e.g. when the exposure variable is a binary state such as employment in an industry. Other conditions may not be so obvious, however, especially when the exposure is the result of categorizing a continuous variable.15 When the status of these conditions is uncertain or the exposure is rare, one should not be too certain that classification errors have produced bias toward the null, even if one is fairly sure that the classification probabilities are very similar in cases and non-cases.

We thus recommend that quantitative evaluation of misclassification such as sensitivity analysis25–27,29,30 be used in place of qualitative heuristics, especially when decisions based on the magnitude of effects are to be based on the data in question. Even better is to obtain data on replicate or alternative measures of exposure, so that data-based correction methods can be brought to bear on the problem.26,28,31–33 Regardless of whether such data are available, we advise that evaluation not be based on the assumption of exact non-differentiality unless the latter can be deduced logically from the facts of the situation.34
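As a rough illustration of what a quantitative evaluation might look like, the Python sketch below draws the error probabilities from ranges instead of fixing them, and reports how often the corrected odds ratio falls below the observed one. Everything here is an assumption made for illustration: the observed proportions, the uniform ranges, the independence of the draws, and the function names. It is not the procedure of any of the cited references.

```python
import random


def corrected_or(p1, p0, fn1, fn0, fp1, fp0):
    """Corrected odds ratio by back-calculating exposure odds in each group."""
    return ((p1 - fp1) / (1 - fn1 - p1)) / ((p0 - fp0) / (1 - fn0 - p0))


def fraction_corrected_below_observed(p1, p0, n_draws=10_000, seed=0):
    """Share of draws in which the corrected odds ratio is below the observed one.

    Hypothetical ranges: Fn is shared by both groups and drawn from [0.10, 0.30];
    Fp1 and Fp0 are drawn independently from [0.00, 0.03], so they never differ
    by more than 0.03 in absolute terms.
    """
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    observed = (p1 / (1 - p1)) / (p0 / (1 - p0))
    below = 0
    for _ in range(n_draws):
        fn = rng.uniform(0.10, 0.30)
        fp1 = rng.uniform(0.00, 0.03)
        fp0 = rng.uniform(0.00, 0.03)
        if corrected_or(p1, p0, fn, fn, fp1, fp0) < observed:
            below += 1
    return below / n_draws


# Hypothetical observed exposure proportions (cases, non-cases).
share = fraction_corrected_below_observed(0.08, 0.05)
print(f"corrected OR below observed OR in {share:.0%} of draws")
```

A corrected value below the observed one corresponds to an observed odds ratio that was biased away from the null, so a non-trivial share of such draws signals that the toward-the-null heuristic cannot be relied on under the assumed ranges.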


    Acknowledgements
This study was supported in part by the Children's Cancer Research Fund, Minneapolis, MN (to A.J.). We thank a reviewer for helpful comments on an earlier draft.

Conflict of interest: None declared.


KEY MESSAGES

  • Exact non-differential misclassification is guaranteed to produce bias toward the null only under certain conditions.
  • Even if those conditions are met, approximately non-differential misclassification does not guarantee that measures of association will be biased toward the null.
  • We thus recommend that quantitative evaluation of misclassification such as sensitivity analysis be used in place of qualitative heuristics.

 


    References
1 Bross I. Misclassification in 2 x 2 tables. Biometrics (1954) 10:478–86.

2 Newell DJ. Errors in the interpretation of errors in epidemiology. Am J Public Health Nations Health (1962) 52:1925–28.

3 Keys A, Kihlberg JK. Effect of misclassification on estimated relative prevalence of a characteristic. Part I. Two populations infallibly distinguished. Part II. Errors in two variables. Am J Public Health Nations Health (1963) 53:1656–65.

4 Gullen WH, Bearman JE, Johnson EA. Effects of misclassification in epidemiologic studies. Public Health Rep (1968) 83:914–18.

5 Goldberg JD. The effects of misclassification on the bias in the difference between two proportions and the relative odds in the fourfold table. J Am Stat Assoc (1975) 70:561–67.

6 Weinberg CR, Umbach DM, Greenland S. When will nondifferential misclassification of an exposure preserve the direction of a trend? Am J Epidemiol (1994) 140:565–71.

7 Thomas DC. RE: ‘When will nondifferential misclassification of an exposure preserve the direction of a trend?’. Am J Epidemiol (1995) 142:782–83.

8 Weinberg CR, Umbach DM, Greenland S. Weinberg et al. reply [letter]. Am J Epidemiol (1995) 142:784.

9 Sorahan T, Gilthorpe MS. Non-differential misclassification of exposure always leads to an underestimate: An incorrect conclusion. Occup Environ Med (1994) 51:839–40.

10 Wacholder S, Hartge P, Lubin JH, Dosemeci M. Non-differential misclassification and bias towards the null: a clarification. Occup Environ Med (1995) 52:557–58.

11 Sorahan T, Gilthorpe MS. Sorahan and Gilthorpe reply [letter]. Occup Environ Med (1995) 52:558.

12 Jurek AM, Greenland S, Maldonado G, Church TR. Proper interpretation of misclassification effects: expectations versus observations. Int J Epidemiol (2005) 34:680–87.

13 Dosemeci M, Wacholder S, Lubin JH. Does nondifferential misclassification of exposure always bias a true effect toward the null value? Am J Epidemiol (1990) 132:746–48.

14 Wacholder S, Dosemeci M, Lubin JH. Blind assignment of exposure does not always prevent differential misclassification. Am J Epidemiol (1991) 134:433–37.

15 Flegal KM, Keyl PM, Nieto FJ. Differential misclassification arising from nondifferential errors in exposure measurement. Am J Epidemiol (1991) 134:1233–44.

16 Kristensen P. Bias from nondifferential but dependent misclassification of exposure and outcome. Epidemiology (1992) 3:210–15.

17 Chavance M, Dellatolas G, Lellouch J. Correlated nondifferential misclassifications of disease and exposure: application to a cross-sectional study of the relation between handedness and immune disorders. Int J Epidemiol (1992) 21:537–46.

18 Greenland S, Gustafson P. Accounting for independent nondifferential misclassification does not increase certainty that an observed association is in the correct direction. Am J Epidemiol (2006) 164:63–68.

19 Drews CD, Greenland S. The impact of differential recall on the results of case-control studies. Int J Epidemiol (1990) 19:1107–12.

20 Checkoway H, Pearce N, Kriebel D. Research Methods in Occupational Epidemiology. (2004) New York: Oxford University Press.

21 Savitz DA. Interpreting Epidemiologic Evidence: Strategies for Study Design and Analysis. (2003) New York: Oxford University Press.

22 Hennekens CH, Buring JE. Epidemiology in Medicine. (1987) Boston: Little, Brown and Company.

23 Maldonado G, Greenland S, Phillips C. Approximately nondifferential exposure misclassification does not ensure bias toward the null [abstract]. Am J Epidemiol (2000) 151:S39.

24 Garry VF, Schreinemachers D, Harkins ME, Griffith J. Pesticide appliers, biocides, and birth defects in rural Minnesota. Env Health Perspect (1996) 104:394–99.

25 Greenland S, Lash TL. Bias analysis (Ch. 19). In: Rothman KJ, Greenland S, Lash TL, eds. Modern Epidemiology. 3rd edn. (2008) Philadelphia, PA: Lippincott-Raven.

26 Gustafson P. Measurement Error and Misclassification in Statistics and Epidemiology: Impacts and Bayesian Adjustments. (2004) Boca Raton, FL: Chapman & Hall/CRC.

27 Gustafson P, Greenland S. Curious phenomena in Bayesian adjustment for exposure misclassification. Stat Med (2006) 25:87–103.

28 Carroll RJ, Ruppert D, Stefanski LA, Crainiceanu CM. Measurement Error in Nonlinear Models: A Modern Perspective. 2nd edn. (2006) Boca Raton, FL: Chapman & Hall/CRC.

29 Greenland S. Multiple-bias modelling for analysis of observational data (with discussion). J R Stat Soc A (2005) 168:267–308.

30 Fox MP, Lash TL, Greenland S. A method to automate probabilistic sensitivity analyses of misclassified binary variables. Int J Epidemiol (2005) 34:1370–76.

31 Spiegelman D, Zhao B, Kim J. Correlated errors in biased surrogates: study designs and methods for measurement error correction. Stat Med (2005) 24:1657–82.

32 Cole S, Chu H, Greenland S. Multiple-imputation for measurement-error correction. Int J Epidemiol (2006) 35:1074–81.

33 Greenland S. Maximum-likelihood and closed-form estimators of epidemiologic measures under misclassification. J Stat Plan Inference (2007) 138:528–38.

34 Greenland S, Fox MP, Lash TL. Reply to Roger Marshall [letter]. Int J Epidemiol (2006) 35:1589–90.

