International Journal of Epidemiology, Vol 26, 651-656, Copyright © 1997 by International Epidemiological Association
P Boyle, R Flowerdew and A Williams
BACKGROUND: Epidemiological studies of rare events, which are common in the
medical literature, often involve modelling sparse data sets. Assessing the
fit of these models may be complicated by the large numbers of observed
zeros in the data set. METHODS: Poisson models, fitted as generalized
linear models, were used to investigate the referral patterns of patients
suffering from end-stage renal failure in south west Wales. The usual
method for assessing goodness of fit is to compare the deviance with a
χ² distribution with the appropriate degrees of freedom. However, this test
may be invalid when the data set is sparse, as the deviance may be
unusually low relative to the degrees of freedom. This would appear to
indicate underdispersion when, in fact, the large number of
zeros in the data set makes the comparison with the χ² distribution
unreliable. A simulation approach is advocated as an alternative method of
assessing model fit in these situations. RESULTS: Three models are
considered in detail here. The first modelled the total referrals in each
of the 245 wards in the study area and included two explanatory variables.
These observations were not unusually sparse, and both the χ² goodness of
fit test and the simulation methodology outlined here suggested that the
model did not fit. The second model included the population 'at risk' as an
offset, and the fit improved considerably. Both the χ² test and the
simulation approach suggested that this model did fit. Finally, the data
were disaggregated into five age groups, providing 1225 observations and a
very sparse data set. According to the χ² goodness of fit test, the
deviance was very low, suggesting that the model was underdispersed. Using
simulated data, it was shown that the deviance was not unusually low and
that the model fitted the data reasonably well. CONCLUSION: In cases where
the data set being modelled is sparse, it is useful to test the goodness of
fit of a Poisson model using a simulation approach, rather than relying on
the χ² test.
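
The abstract does not spell out how the simulation was carried out, so the following is only a minimal sketch of one common way to do it (a parametric bootstrap of the deviance), not the authors' procedure. To keep the fit in closed form it assumes an intercept-only Poisson model with the population at risk as an offset, whereas the models above also include explanatory variables. All function names and the example data are hypothetical.

```python
import math
import random


def poisson_draw(mu, rng):
    """Draw one Poisson(mu) variate with Knuth's multiplication method (suited to small mu)."""
    limit = math.exp(-mu)
    k, prod = 0, rng.random()
    while prod > limit:
        k += 1
        prod *= rng.random()
    return k


def fit_rate_model(counts, exposure):
    """Fit an intercept-only Poisson model with a log(exposure) offset.

    Assumption for this sketch: no explanatory variables, so the
    maximum-likelihood rate is total events / total exposure.
    """
    rate = sum(counts) / sum(exposure)
    return [rate * e for e in exposure]


def poisson_deviance(counts, fitted):
    """Residual deviance 2 * sum(y*log(y/mu) - (y - mu)), taking 0*log(0) as 0."""
    total = 0.0
    for y, mu in zip(counts, fitted):
        if y > 0:
            total += y * math.log(y / mu)
        total -= y - mu
    return 2.0 * total


def simulated_deviance_test(counts, exposure, n_sim=999, seed=0):
    """Compare the observed deviance with deviances of data simulated from the fitted model."""
    rng = random.Random(seed)
    fitted = fit_rate_model(counts, exposure)
    observed = poisson_deviance(counts, fitted)
    simulated = []
    for _ in range(n_sim):
        # Simulate counts from the fitted means, refit, and record the deviance.
        fake = [poisson_draw(mu, rng) for mu in fitted]
        simulated.append(poisson_deviance(fake, fit_rate_model(fake, exposure)))
    # Share of simulated deviances at least as large as the observed one.
    p_upper = (1 + sum(d >= observed for d in simulated)) / (n_sim + 1)
    return observed, simulated, p_upper


# Hypothetical sparse data: event counts and populations at risk for ten small areas.
counts = [0, 0, 1, 0, 0, 2, 0, 0, 0, 1]
population = [120, 90, 300, 150, 80, 410, 60, 200, 110, 260]
observed, simulated, p_upper = simulated_deviance_test(counts, population)
```

Under this sketch, an observed deviance that sits well inside the range of the simulated deviances gives no evidence of lack of fit, even if it looks unusually low against the χ² reference distribution.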
ARTICLES
Evaluating the goodness of fit in models of sparse medical data: a simulation approach
School of Geography, University of Leeds, UK.