IJE Advance Access originally published online on March 23, 2006
International Journal of Epidemiology 2006 35(3):643-647; doi:10.1093/ije/dyl054
Commentary: Advancing neighbourhood-effects research: selection, inferential support, and structural confounding
Division of Epidemiology and Community Health, University of Minnesota, Minneapolis, MN, USA. E-mail: oakes{at}epi.umn.edu
Everyone interested in how residential contexts independently shape health risks (i.e. neighbourhood effects) should study the new paper by Chaix, Rosvall, Lynch, and Merlo (hereinafter CRLM).1 This paper is one of the best to date on a topic fundamental to social epidemiology and related subdisciplines. I intend to include the paper in my doctoral-level seminars in social epidemiology and community trials and herewith encourage others to share it widely.
CRLM's paper is important because it is methodologically superior to most of the recent literature addressing the same phenomena. Distinguishing strengths include well-formed and clear a priori hypotheses, excellent exposure and outcome data permitting the assurance that study subjects actually resided in the areas of exposure, transparency of methodological choices, expressed interest in causal inference, fairly novel use of a shared-frailty model, simultaneous consideration of two factors, and most importantly, conservative conclusions.
Of course, no study is perfect, and while I am sympathetic to the constraints investigators face, progress clearly depends on constructive criticism. Among the many points that might be addressed, I aim to draw attention to two fundamental shortcomings from which the entire literature seems to suffer. Although others and I have tried to explain such issues before,2-6 it seems few epidemiologists appreciate the concerns.
| Critique |
|---|
It is easiest to see the issues by first considering the idealized experiment the investigators aim to mimic. CRLM wish to examine the effect of two factors, density and socioeconomic environment (SES), simultaneously. Within the counterfactual framework, they wish to examine the differential effects on a person of being assigned to each combination of their hypothetical two-factor intervention. In its simplest form, there would be two levels of each factor yielding four experimental conditions, called treatment combinations. In the real world there would be one observable outcome and three unobservable counterfactuals. Each counterfactual would be identical to, or exchangeable with, the others and the observable measure but for the exposure/treatment condition and, of course, any resultant outcome effect. Were it possible to observe the counterfactuals, simple arithmetic would yield the desired answer for both main and modifying effects.
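The "simple arithmetic" can be written out explicitly. The notation below is an assumption introduced for illustration and is not CRLM's: let $Y_i(d,s)$ be the outcome person $i$ would experience under density level $d \in \{0,1\}$ and SES level $s \in \{0,1\}$.

$$
\begin{aligned}
\text{density effect at } s = 0: &\quad Y_i(1,0) - Y_i(0,0) \\
\text{SES effect at } d = 0: &\quad Y_i(0,1) - Y_i(0,0) \\
\text{effect modification}: &\quad \bigl[Y_i(1,1) - Y_i(0,1)\bigr] - \bigl[Y_i(1,0) - Y_i(0,0)\bigr]
\end{aligned}
$$

For any one person only one of the four quantities $Y_i(0,0)$, $Y_i(1,0)$, $Y_i(0,1)$, $Y_i(1,1)$ is ever observed, so none of these contrasts can be computed directly; each must be approximated by comparing exchangeable persons.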
Within the experimental science framework, CRLM aim to mimic a factorial experiment in which all subjects are randomly assigned to predetermined levels of their two factors. With sufficient sample sizes, randomization ensures there is no confounding by either measured or unmeasured subject characteristics. It follows that the data from such an idealized experiment are easily analysed through regression of a chosen outcome (e.g. mortality) on each main factor and their interaction; no covariates need be added to the regression model, though a few may marginally increase precision. In other words, randomization assures the exchangeability of subjects between conditions and the (balanced) allocation of subjects to treatments assures that all inferences will be supported by actual observations. Unless we are forced to randomize whole neighbourhoods to the factors, there is no reason to be concerned with clustering, intraclass correlations, or fancy error structures such as those available in multilevel or spatial covariance models.7
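A small simulation may make the point concrete. The sketch below is only an illustration under invented assumptions: a continuous outcome rather than mortality, made-up effect sizes (0.5, -0.8, 0.3), and an unmeasured "frailty" that affects the outcome. Because assignment is random, simple differences of cell means, which are equivalent to the saturated regression on both factors and their interaction, recover the effects without any covariate adjustment.

```python
import random
from statistics import mean

def simulate_factorial(n=40000, seed=0):
    """Randomize n subjects to a 2x2 factorial; return the four cell means."""
    rng = random.Random(seed)
    cells = {(d, s): [] for d in (0, 1) for s in (0, 1)}
    for _ in range(n):
        frailty = rng.gauss(0, 1)   # unmeasured subject characteristic
        d = rng.randint(0, 1)       # randomized density level
        s = rng.randint(0, 1)       # randomized SES level
        # Hypothetical data-generating model (all coefficients invented)
        y = 1.0 + 0.5 * d - 0.8 * s + 0.3 * d * s + frailty + rng.gauss(0, 1)
        cells[(d, s)].append(y)
    return {cell: mean(values) for cell, values in cells.items()}

def factorial_contrasts(m):
    """Main effects at the reference level and the interaction, from cell means."""
    density_effect = m[(1, 0)] - m[(0, 0)]
    ses_effect = m[(0, 1)] - m[(0, 0)]
    interaction = (m[(1, 1)] - m[(0, 1)]) - (m[(1, 0)] - m[(0, 0)])
    return density_effect, ses_effect, interaction

if __name__ == "__main__":
    print(factorial_contrasts(simulate_factorial()))
```

The frailty term is never measured or adjusted for, yet the contrasts land close to the invented true values, because randomization makes the four groups exchangeable and every cell is populated.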
Problem 1: inferential support and structural confounding
Methodologists say inferences are on-support of the data when they are based on actual observations. Predicting future events (i.e. extrapolation) is necessarily off-support of one's data since, by definition, such data are not yet observed. Seemingly simple summary statistics such as means can yield off-support inferences too. Consider the average height of two men: one a strapping 2 m tall, the other a lamentable 1.7 m short. Although the average height of men in the group is 1.85 m, there is no observed data point at this value. Since the average man does not actually exist, the summary statistic is only as useful as the model that produced it.
The problem of off-support inference (a term I attribute to Manski,6 though others make this point clear) appears woefully under-appreciated in epidemiology. Among other things, a concern about off-support inference is a concern about the ease with which statistical models mask fundamental differences between treatment and control groups and thus the identification of effects through actual observations. The statistician Rosenbaum makes this point by considering efforts to assess the impact of the Head Start educational programme by comparing low-income students treated by the programme with wealthy students not treated by the programme. He correctly insists that effects calculated from such comparisons rest on pure extrapolation and are thus off-support of the data since wealthy children are excluded from Head Start programmes.8 Estimation of treatment effects between groups is only reasonable when subjects between groups are exchangeable and both groups have some non-zero probability of being treated/exposed. Notice, the problem of off-support inference is not one of asymptotic efficiency, consistency, effect modification, or related technical worries; the issue is simply about the availability of data to support inference.
I see two interrelated problems of inferential support in CRLM's paper. The first is rather straightforward; I have labelled it Problem 1.1. The second problem is quite slippery; I have labelled it 1.2.
Problem 1.1
Although they may not have intended to do so, it is to CRLM's credit that they presented a graphic displaying the scant support for their principal inferences. Their Figure 2 plots the frequency of parishes by all combinations of density and SES. While interesting, the fitted curve belies the problem at hand, which is that it is not possible to disentangle the effects of density and SES in this study, at least without restricting the meaning of the comparison or leaving support of the actual data.
To see why, examine Table 1 above, which I developed by counting the observations in CRLM's Figure 2. Obviously my numbers are approximations since I did not have CRLM's data (I did not request it), but I believe the figure is representative. Notice that there are literally no observations for low SES in the upper three quartiles of density. Simply put, there are no higher-density and low-SES areas. Indeed, there is scant data for all cells in the entire right-hand corner of the matrix. This means that any effort to disentangle the effects is not based on any data but instead on model-based extrapolations. To be sure, resultant inferences may not be wrong, but I do not think empirically oriented scientists should place much confidence in them. In fact, formal experimental literature would say that such effects are non-estimable and/or confounded.9
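The check involved is nothing more than inspecting a cross-tabulation for empty cells. The sketch below uses invented counts, chosen only to reproduce the pattern described (no low-SES parishes in the upper three density quartiles); they are NOT CRLM's data, and the three SES labels and function names are assumptions made for illustration.

```python
# Hypothetical parish counts keyed by (SES level, density quartile).
# Invented for illustration only; these are NOT CRLM's data.
counts = {
    ("low", 1): 14, ("low", 2): 0,  ("low", 3): 0,  ("low", 4): 0,
    ("mid", 1): 20, ("mid", 2): 16, ("mid", 3): 7,  ("mid", 4): 2,
    ("high", 1): 6, ("high", 2): 11, ("high", 3): 15, ("high", 4): 9,
}

def supported_density_levels(counts, ses):
    """Density quartiles with at least one observed parish at this SES level."""
    return sorted(d for (s, d), n in counts.items() if s == ses and n > 0)

def density_contrast_is_supported(counts, ses, d_from, d_to):
    """A within-SES density contrast rests on data only if both cells are populated."""
    levels = supported_density_levels(counts, ses)
    return d_from in levels and d_to in levels

if __name__ == "__main__":
    for ses in ("low", "mid", "high"):
        print(ses, supported_density_levels(counts, ses))
```

With counts like these, any estimate of a density effect among low-SES parishes comes entirely from the functional form of the model, since there is no second populated cell to compare against.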
Problem 1.2
As in nearly all neighbourhood-effect studies published to date, unsupported inferences also seem to emerge in CRLM's between-neighbourhood comparisons of risk, a problem I have tried to describe elsewhere.4,10 Simply put, because of social stratification, the better one controls for the selection of persons to neighbourhoods the less overlap there will be in the propensity for any subject to reside in any neighbourhood/parish other than their own. This result appears to be in contrast to the effect of adjustment in conventional data analyses wherein adjustment tends to increase exchangeability. Indeed, this apparent paradox seems unique to observational neighbourhood-effect studies wherein neighbourhoods are distinguished and ordered by various aspects of social stratification. In the absence of a better term, I call the phenomenon structural confounding since it cannot be overcome by the collection of more data.10
In order to see this problem one must first appreciate that the exchangeability between persons is necessary for proper inference in neighbourhood-effect studies (actually, in all studies). Random assignment, stratification, matching, and/or covariate adjustment are all techniques used to equate persons and permit the estimation of an effect of differing exposures on identical persons. In other words, non-exchangeable persons serve as poor substitutes for unobservable counterfactuals and, therefore, yield biased effect estimates.11 The next step is to recognize the relationship between exchangeability and idealized propensity scores. Persons with the same propensity (i.e. probability) to be exposed to some condition or treatment are exchangeable. In a simple two-arm randomized study, the propensity that subjects are exposed to treatment conditions is a constant, 0.50. This means persons between treatment groups are exchangeable, at least in the long run. In observational studies, a person's propensity to be exposed to some condition (i.e. their propensity score) is some unknown/unknowable function of their characteristics. Some may have a high propensity to be exposed, others a low propensity. The important point is that empirically based causal effect estimates are only defined where the propensity scores of subjects are equal across exposure/treatment conditions. Appreciating the uncertainty and error in the actual estimation of propensity scores, in practice we say that effects are only defined for approximately equal propensity scores. Nevertheless, comparison of two persons with widely divergent propensity scores is a comparison of non-exchangeables.
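In symbols, and using notation that is an assumption introduced here rather than anything in CRLM's paper, write $A \in \{0,1\}$ for exposure, $X$ for a person's characteristics, and $Y$ for the outcome. The propensity score and the condition under which a comparison is supported by data are then

$$
e(x) = \Pr(A = 1 \mid X = x), \qquad 0 < e(x) < 1 .
$$

In the two-arm randomized study $e(x) = 0.5$ for every $x$. In an observational study the contrast

$$
\mathrm{E}\bigl[Y \mid A = 1,\ e(X) = e\bigr] - \mathrm{E}\bigl[Y \mid A = 0,\ e(X) = e\bigr]
$$

can be computed from observations only at values of $e$ that actually occur among both exposed and unexposed persons; where $e(x)$ is at or near 0 or 1, one of the two terms has no data behind it.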
The paradox is obtained because social stratification of neighbourhoods means the background characteristics of a person i = 1 living in some neighbourhood j = 1 are quite different from the background characteristics of some other person i = 2 living in some different neighbourhood j = 2, provided the two neighbourhoods are reasonably distinct and thus identifiable. Clearly the differences in background characteristics potentially related to disease (i.e. confounding) make unadjusted comparisons between neighbourhoods inappropriate. The trouble is, upon fully adjusting for all these potential confounders, the propensity of person i = 1 living in some other neighbourhood such as j = 2 is often low, at best. This means persons 1 and 2 cannot be legitimately compared without strong model-imposed assumptions. In short, the full adjustment for background characteristics that select persons to one neighbourhood and not another in the typical observational neighbourhood-effect study yields divergent and non-overlapping propensities for exposure, and thus off-support effect estimates. Although masked by the inherent smoothing and extrapolation of (multilevel model) regression slopes, the upshot is that neighbourhood-effect estimates may only be calculated if we imagine a world where persons can live in any given neighbourhood (at least those under investigation). Thus, unless one is willing to imagine an alternative world with vastly different socioeconomic forces and constraints (which is a violation of the closest possible world rule in counterfactual reasoning12), the adjustment for all individual-level covariates necessary to obtain exchangeable subjects will often yield off-support effect estimates for any neighbourhoods with divergent socioeconomic profiles.
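The mechanism can be illustrated with a toy calculation. Everything in the sketch below is an assumption made for illustration: four equally likely binary background traits coded -1/+1, a logistic selection rule with an arbitrary coefficient of 1.5, and an arbitrary "overlap" band of propensities between 0.1 and 0.9. It enumerates all trait profiles exactly (no sampling) and asks what share of persons retain a non-extreme propensity to live in neighbourhood A as more of the traits that drive selection are adjusted for.

```python
from itertools import product
from math import exp

def sigmoid(z):
    return 1.0 / (1.0 + exp(-z))

def overlap_share(k_total=4, m_adjusted=0, beta=1.5, lo=0.1, hi=0.9):
    """Share of persons whose propensity to live in neighbourhood A,
    conditional on the first m_adjusted traits, lies within [lo, hi]."""
    profiles = list(product((-1, 1), repeat=k_total))  # equally likely profiles
    strata = {}
    for traits in profiles:
        key = traits[:m_adjusted]                       # traits adjusted for
        strata.setdefault(key, []).append(sigmoid(beta * sum(traits)))
    share = 0.0
    for probs in strata.values():
        propensity = sum(probs) / len(probs)            # Pr(A | adjusted traits)
        if lo <= propensity <= hi:
            share += len(probs) / len(profiles)
    return share

if __name__ == "__main__":
    for m in range(5):
        print(m, overlap_share(m_adjusted=m))
```

Under these made-up parameters the share of persons inside the overlap band is 1.0 with zero, one, or two traits adjusted for, 0.75 with three, and 0.375 with all four. More complete control for selection leaves fewer persons with a credible counterpart in the other neighbourhood, and no increase in sample size changes that.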
To be sure, it is not clear how this problem impacts CRLM's study. It may be that persons residing in the various parishes observed are more equal than those living in the highly stratified/segregated neighbourhoods of the US. Structural confounding would be mitigated to the extent this is true. Nevertheless, CRLM do not explain why they chose to control for their handful of individual covariates or the impact of doing so. Like so many others writing on this subject, they do not sufficiently address this fundamental problem of selection or its impact on necessary assumptions.
Problem 2: selection and biased random effects (frailties)
The second fundamental problem is related but concerns bias, an idea more familiar to epidemiologists. I believe CRLM's study suffers from biased random effect estimates and, thus, biased results including the reported parish-level variances, interquartile hazard ratios and changes in Akaike information criteria. How can I make such a bold claim without examining the actual data? Because, again, by only adjusting for age, gender, education, and income, CRLM have probably not fully accounted for the selection of persons to parishes. It seems clear that the reason why one resides in parish A instead of parish B is neither random nor as simple as the four measures employed, at least if A and B present noticeably different environments. If nothing else, it is clear that income is not a perfect measure of wealth, especially in the elderly. Such specification error typically induces correlation between one's primary exposure measure and a model's (unobservable) disturbances/errors. And while often under-emphasized in epidemiology, correlation between an exposure and disturbances will yield biased effect estimates for that exposure.
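A minimal simulation shows how an omitted selection variable produces a spurious contextual effect. The set-up is entirely hypothetical: a binary "wealth" indicator that is not captured by the measured covariates, residence probabilities of 0.8 versus 0.2, a continuous outcome, and a true parish effect of exactly zero.

```python
import random
from statistics import mean

def simulate_selection(n=40000, seed=1):
    """Return (wealth, lives_in_A, outcome) rows; the true parish effect is zero."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        wealth = rng.randint(0, 1)                # omitted selection variable (hypothetical)
        p_a = 0.8 if wealth else 0.2              # wealth drives residence in parish A
        in_a = 1 if rng.random() < p_a else 0
        y = 1.0 * wealth + rng.gauss(0, 1)        # outcome depends on wealth only
        rows.append((wealth, in_a, y))
    return rows

def naive_effect(rows):
    """Parish A minus parish B, ignoring wealth."""
    return (mean(y for _, a, y in rows if a == 1)
            - mean(y for _, a, y in rows if a == 0))

def stratified_effect(rows):
    """Parish contrast within wealth strata, weighted by stratum size."""
    total = 0.0
    for w in (0, 1):
        stratum = [(a, y) for ww, a, y in rows if ww == w]
        diff = (mean(y for a, y in stratum if a == 1)
                - mean(y for a, y in stratum if a == 0))
        total += diff * len(stratum) / len(rows)
    return total

if __name__ == "__main__":
    rows = simulate_selection()
    print(naive_effect(rows), stratified_effect(rows))
```

The naive contrast is far from zero because parish membership is correlated with the omitted wealth term sitting in the disturbance; the contrast within wealth strata is close to zero. In real data, of course, the analyst cannot stratify on what was never measured.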
The importance of the selection equation in contextual-effect studies cannot be overemphasized. Although it need not be so, increasing the number of measures in the selection equation will typically alter the between-neighbourhood (i.e. level-2) variance, often reducing it. This means that an analyst can produce various levels of neighbourhood-level variance by simply adding or subtracting selection variables. This is of concern since one may omit some measure and find greater between-neighbourhood variance, and then perhaps discover some statistically significant neighbourhood-level predictor that explains it. But such an explanation is more than a distraction; it hinders advancement. What is needed is a strong a priori rationale to adjust for some covariates and not others. Though incomplete, such a theory has been advanced by social scientists and revolves around choice under constraints.2,13 In short, the theory says that measures must be included such that the adjusted estimates mimic those from the ideal randomized experiment; that is, no confounding. This is no easy task, especially because unobservables such as personal values or preferences surely play a role. But things are not even this simple since it is not clear how to handle the many presumed confounders that have been influenced by the local contexts (i.e. are endogenous). The upshot is that in order to present meaningful results about the effect of neighbourhood contexts in observational studies we need a good theory to specify the selection or level-1 model. But there is no such theory available, and there probably will not be in future.
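The sensitivity of between-neighbourhood variance to the selection variables can be sketched in a few lines. This is a deliberately crude stand-in, not a multilevel or shared-frailty model: it uses the variance of neighbourhood mean outcomes, before and after removing a pooled least-squares adjustment for one individual covariate. The data are simulated under invented assumptions in which people sort into neighbourhoods on that covariate and there is no true neighbourhood effect at all.

```python
import random
from statistics import mean, pvariance

def simulate_neighbourhoods(n_hoods=50, n_per=200, seed=2):
    """Rows of (neighbourhood, selection covariate x, outcome y)."""
    rng = random.Random(seed)
    data = []
    for j in range(n_hoods):
        hood_level = rng.gauss(0, 1)              # sorting: neighbourhoods differ in x
        for _ in range(n_per):
            x = hood_level + rng.gauss(0, 1)      # individual selection covariate
            y = 1.0 * x + rng.gauss(0, 1)         # no true neighbourhood effect
            data.append((j, x, y))
    return data

def ols_slope(xs, ys):
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    return sxy / sxx

def between_variance(data, adjust):
    """Variance of neighbourhood mean outcomes, optionally after adjusting for x."""
    slope = ols_slope([x for _, x, _ in data], [y for _, _, y in data]) if adjust else 0.0
    by_hood = {}
    for j, x, y in data:
        by_hood.setdefault(j, []).append(y - slope * x)
    return pvariance([mean(values) for values in by_hood.values()])

if __name__ == "__main__":
    data = simulate_neighbourhoods()
    print(between_variance(data, adjust=False), between_variance(data, adjust=True))
```

Leaving the covariate out produces substantial apparent between-neighbourhood variance that a neighbourhood-level predictor could then seem to "explain"; including it makes nearly all of that variance disappear. Which of the two numbers is reported depends entirely on the analyst's choice of selection variables.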
To clarify, it does not appear as if CRLM exploited or gamed their data in the fashion just described, and I cannot imagine this group of distinguished scholars ever doing so. My point is that there is no way to determine which selection variables to adjust for and thus no way to know, or defend, which parameter estimates are meaningful (i.e. uncorrelated with disturbances).
| Conclusions |
|---|
I have great ambivalence about the foregoing critique. Although I firmly believe neighbourhood contexts shape and alter exposures and, thus, health outcomes, I do not know how to demonstrate this in an observational study. Further, while I think the recent attention (social) epidemiologists are paying to multilevel contexts is long overdue and a major contribution to the larger discipline if not the public's health, I have yet to encounter convincing effect estimates in an observational neighbourhood-effects study. All told, I worry that there will be few appreciable gains when the current but misplaced exuberance over multilevel regression modelling subsides.
I predict advancement will only come when we pay much more attention to counterfactual reasoning, idealized experiments, and the limitation of our data. In the meantime, it is troubling to observe so few neighbourhood/contextual-effect papers discussing the design or results from the US Moving To Opportunity experiment, wherein subjects were literally randomized to new neighbourhoods and observed over time.14,15 The same goes for the many community randomized trials that have been conducted,7,16,17 to say nothing of the grim conclusions of distinguished social scientists who have worked on these very same issues for years.2,5,6,13,18
My hope is that CRLM's paper turns out to be more than just another important contribution from this productive and influential group. It would be beneficial if this paper signalled a change towards a deeper appreciation for the enormous methodological obstacles (there are many more than discussed here) to drawing credible inferences in observational neighbourhood-effect studies.
| References |
|---|
1 Chaix B, Rosvall M, Lynch J, Merlo J. Disentangling contextual effects on cause-specific mortality in a longitudinal 23-year follow up study: impact of population density or socioeconomic environment. Int J Epidemiol 2006;35:633-43.
2 Jencks C, Mayer SE. The social consequences of growing up in a poor neighborhood. In: Lynn Jr. LE, McGeary MGH (eds). Inner-City Poverty in the United States. Washington, DC: National Academy Press, 1990.
3 Oakes JM. The (mis)estimation of neighborhood effects: causal inference for a practicable social epidemiology. Soc Sci Med 2004;58:1929-52.
4 Oakes JM. Causal inference and the relevance of social epidemiology. Soc Sci Med 2004;58:1969-71.
5 Durlauf SN. Neighborhood effects. In: Henderson JV, Thisse J-F (eds). Handbook of Regional and Urban Economics. Vol. 4. Amsterdam: North Holland, 2004, pp. 2173-242.
6 Manski CF. Identification Problems in the Social Sciences. In: Marsden PV (ed.). Sociological Methodology. San Francisco: Jossey-Bass, 1993.
7 Murray D. Design and Analysis of Group-Randomized Trials. New York: Oxford, 1998.
8 Rosenbaum PR. Observational Studies. 2nd edn. New York: Springer Verlag, 2002.
9 Hinkelmann K, Kempthorne O. Design and Analysis of Experiments. New York: Wiley, 1994.
10 Oakes JM, Johnson PJ. Propensity score matching methods for social epidemiologists. In: Oakes JM, Kaufman JS (eds). Methods in Social Epidemiology. San Francisco: Jossey-Bass/Wiley, 2006, pp. 364-86.
11 Maldonado G, Greenland S. Estimating causal effects. Int J Epidemiol 2002;31:422-38.
12 Lewis D. Counterfactuals. Malden, MA: Blackwell, 1973.
13 Tiebout CM. A Pure Theory of Local Expenditures. J Polit Econ 1956;64:416-24.
14 Katz LF, Kling JR, Liebman JB. Moving to Opportunity in Boston: Early Results of A Randomized Mobility Experiment. Q J Econ 2001;116:607-54.
15 Ludwig J, Duncan GJ, Pinkston JC. Housing mobility programs and economic self-sufficiency: Evidence from a randomized experiment. J Public Econ 2005;89:131-56.
16 Susser M. Editorial: The tribulations of trials: intervention in communities. Am J Public Health 1995;85:156-58.
17 Sorensen G, Emmons K, Hunt MK, Johnston D. Implications of the results of community intervention trials. Annu Rev Public Health 1998;19:379-416.
18 Sobel ME. Spatial Concentration and Social Stratification: Does the Clustering of Disadvantage "Beget" Bad Outcomes? In: Bowles S, Durlauf SN, Hoff K (eds). Poverty Traps. Princeton, NJ: Princeton University Press, 2006.