IJE Advance Access published online on September 21, 2007
International Journal of Epidemiology, doi:10.1093/ije/dym185
STROBE: new standards for reporting observational epidemiology, a chance to improve
1Co-Editor IJE and Coordinating Editor, Cochrane Heart Group, LSHTM.
2Director UK Cochrane Centre, Oxford.
*Corresponding author. London School of Hygiene & Tropical Medicine, Keppel Street, London WC1E 7HT. E-mail: Shah.ebrahim@lshtm.ac.uk
The shortfalls of observational epidemiology in generating contradictory and spurious findings have been highlighted by many commentators, perhaps most memorably by the pop-science journalist and general practitioner James Le Fanu, who stated that "the simple expedient of closing down most University departments of Epidemiology could both extinguish this endlessly fertile source of anxiety-mongering while simultaneously releasing funds for serious research".1 Serious critiques of observational epidemiology have highlighted its vulnerability to confounding, reverse causation, measurement error and selection bias.2,3 However, the closure of University departments and the abandonment of observational epidemiology might be premature without first trying some remedial steps.
One important step would be to improve the reporting of observational epidemiology studies. Lessons may be learned from the early days of randomized controlled trials, when it was obvious that not all trials were equal: some were clearly better done than others and consequently produced more trustworthy results. Exploration of the sources of bias in the conduct of trials identified characteristics that were linked with the validity of estimates of effect sizes, and led to criteria that could be used to improve trial quality and, importantly, to help users distinguish well conducted from poorly conducted trials.4–6 An interesting issue became apparent as more and more energy was put into abstracting data from published trials for systematic reviews: the quality of a trial is often difficult to assess from the published report of its findings. Published reports are not research protocols. Methods sections are often too brief for adequate understanding of how the trial was done. Editors, anxious to save space for the more exciting findings, often make large cuts to methodological detail, yet this detail is essential in assessing whether the findings are valid. With these concerns in mind the CONSORT (CONsolidated Standards Of Reporting Trials) statement was derived.6 CONSORT has required revisions to take into account reporting of adverse events in trials7 and non-inferiority and equivalence trials,8 and has been updated.9 Moreover, application of CONSORT appears to have improved the quality of reporting of trials.10,11
In an effort to improve reporting of observational epidemiology studies, the STROBE (STrengthening the Reporting of OBservational studies in Epidemiology12) initiative was established in 2004, including several of the people involved in CONSORT. The group will publish its main statement in several international journals simultaneously and a longer explanatory paper giving a detailed rationale will be published in Epidemiology. STROBE decided to limit its initial work to three major designs—cohort, case-control and cross-sectional. This was wise. If the approach proves applicable in practice and improves reporting of the major study designs, its application to other designs such as case series, audits and database studies can be considered in updates.
The authors of STROBE state many things that the checklist of 22 items is not. It is not intended to be prescriptive, to regulate terminology, to be used to assess study quality or to guide the planning or design of studies. So what is it actually for? It is meant to assist authors in writing up their work, help editors and reviewers as part of the peer review process, and make it easier for readers to critically appraise published papers. These purposes and some of the items do seem to overlap with assessing study quality as indicated by the similarity of items identified in a systematic review of tools to assess quality and susceptibility to bias.13
So how successfully has the STROBE group risen to the challenges posed by the varied nature of observational epidemiology? The effort to be non-prescriptive has resulted in rather general suggestions for authors, some of which will be found in any epidemiology text targeted at Masters students in the first term of their first year. Despite the companion long explanatory paper, the investigator seeking guidance may end up confused. For example, the item "State the scientific background and rationale for the investigation being reported" sounds unambiguous, but in a long-term cohort study the original rationale may have been to understand the causes of cardiovascular disease, whereas the purpose of the report is the study of a quite different condition, say incontinence of urine. The original purpose of the cohort will have a bearing on the types of covariates available for investigation and will illuminate what is feasible for the investigators to attempt in analyses, but ultimately it is the availability of those covariates that is the important factor.
The STROBE group recommend "cautious overall interpretation of results ...". If observational studies are largely incapable of reaching definitive conclusions on the basis of robust findings, perhaps Le Fanu is correct and it is now time to close the enterprise down! A better approach would be to give an interpretation of the results appropriate to the available data. Did Richard Doll and Ernest Wynder make only a tentative interpretation from the series of observational studies conducted from 1948 on smoking and lung cancer? Certainly conclusions from the initial studies were cautious, given that the associations between lung cancer and smoking were somewhat unexpected at the time, but stronger inferences were possible as evidence from both retrospective and prospective studies emerged and was confirmed by others. Doll's important statement, "if you find something that is unexpected and is going to be of social significance you have a responsibility to be sure that you're right before you publicize your results to the rest of the world. This does at least require repeating some of your observations",14 should find a place in the first update of STROBE.
Asking investigators to make clear which confounders were adjusted for, and why they were included, might deal with the curious case of the changing effects of folic acid on stroke risk published by one research group in the same year. In the first study, participants in the physicians health study enjoyed stroke protection with high folate levels.15 The second study, published later that year using data from the nurses health study, demonstrated no important effect of folate on stroke risk.16 One contribution to the apparent discrepancy may be that the "correct" null answer for the association of dietary folate and stroke, derived from a randomized controlled trial, was published between the two observational studies.17 The trial was actually cited in the second observational study in support of the negative finding. Both observational studies showed similar effects in unadjusted analyses, but in the second study the effects attenuated on adjustment for potential confounders. A major difference between the two studies was the approach taken to adjusting for confounders. The second study adjusted for more covariates, including vitamin E (which is strongly correlated with socio-economic position18), and might therefore have adjusted better for confounding by socio-economic position. A repeat analysis of the initial positive study with vitamin E adjustment would be interesting. It remains to be seen whether a further attempt to get the new "correct" answer will result from the recent meta-analysis of randomized trials of folic acid supplementation, which showed almost a 20% reduction in stroke risk.19
STROBE is perhaps missing the point in suggesting that a discussion of the existing external evidence is particularly important for studies reporting small increases in risk. Reports of small increases in risk generally evoke a "so what" response and are not nearly so dangerous as reports of large increases (or decreases) in risk. For example, the meta-analysis of observational epidemiological studies of the association between hormone replacement therapy (HRT) and coronary heart disease (CHD) showed a "too good to be true" reduction of about 50%, prompting the authors to conclude that "overall, the bulk of the evidence strongly supports a protective effect of estrogens that is unlikely to be explained by confounding factors".20 Women took heed and started taking or continuing HRT for its alleged cardio-protective effects. It is in precisely these circumstances that more external evidence is required in order to protect the public from false claims. In this case the clue had been reported earlier by Petitti and colleagues, who found a lack of specificity of effect of HRT: it protected against accidents and violent deaths too.21 The same confounding structure probably explained the associations with accidents and violent deaths and also with CHD, as indicated by the subsequent trials of HRT.22
One of the really useful aspects of CONSORT to us as Cochrane systematic reviewers has been the vast improvement in reporting, which makes assessing the quality of new randomized trials when updating a review so much easier than assessing the old trials was when the reviews were first done. With systematic reviewing in mind, it is disappointing that STROBE does not recommend that new findings are put into context by conducting a systematic review of other similar studies. This would alert us to the consistency of findings, allow exploration of sources of heterogeneity and would ensure, in the same way as it does for randomized trials, that research effort is not spent on rediscovering the same finding again and again.
STROBE will likely make a strong contribution to improving the quality of reporting of observational studies. In this issue Debbie Lawlor considers a different approach to improving the quality of epidemiological research by reducing publication bias—submitting papers for editorial appraisal and peer review without the results, which would permit papers to be assessed on the importance of the objectives and quality of the methods and statistical approaches to be used.23 In an accompanying commentary Sander Greenland offers some support for this suggestion and notes the error in assuming, as many epidemiologists do, that publication bias (in observational epidemiology or RCTs) is nearly always of null studies: trying to conceal positive findings from publication also introduces important bias to scientific understanding.24 We would like to see more discussion on how observational epidemiology can be made more robust—and on the feasibility of implementing these new options for authors and editors.
References
1 Le Fanu J. The Rise and Fall of Modern Medicine (1999) New York: Little, Brown.
2 Taubes G. Epidemiology faces its limits. Science (1995) 269:164–69.
3 Davey Smith G, Ebrahim S. Data dredging, bias or confounding. BMJ (2002) 325:1437–38.
4 Chalmers TC, Celano P, Sacks HS, Smith H Jr. Bias in treatment assignment in controlled clinical trials. N Engl J Med (1983) 309:1358–61.
5 Schulz KF, Chalmers I, Hayes RJ, Altman DG. Empirical evidence of bias. JAMA (1995) 273:408–12.
6 Moher D, Pham B, Jones A, et al. Does quality of reports of randomised trials affect estimates of intervention efficacy reported in meta-analyses? Lancet (1998) 352:609–13.
7 Ioannidis JP, Evans SJ, Gotzsche PC, et al. CONSORT Group. Better reporting of harms in randomized trials: an extension of the CONSORT statement. Ann Intern Med (2004) 141:781–88.
8 Piaggio G, Elbourne DR, Altman DG, Pocock SJ, Evans SJ. CONSORT Group. Reporting of noninferiority and equivalence randomized trials: an extension of the CONSORT statement. JAMA (2006) 295:1152–60.
9 Altman DG, Schulz KF, Moher D, et al. CONSORT Group. The revised CONSORT statement for reporting randomized trials: explanation and elaboration. Ann Intern Med (2001) 134:663–94.
10 Moher D, Jones A, Lepage L. CONSORT Group (Consolidated Standards for Reporting of Trials). Use of the CONSORT statement and quality of reports of randomized trials: a comparative before-and-after evaluation. JAMA (2001) 285:1992–95.
11 Plint AC, Moher D, Morrison A, et al. Does the CONSORT checklist improve the quality of reports of randomised controlled trials? A systematic review. Med J Aust (2006) 185:263–67.
12 http://www.strobe-statement.org/ (accessed 15 August 2007).
13 Sanderson S, Tatt ID, Higgins JP. Tools for assessing quality and susceptibility to bias in observational studies in epidemiology: a systematic review and annotated bibliography (in press). Int J Epidemiol (2007).
14 Richmond C. Sir Richard Doll, Obituary. BMJ (2005) 331:295.
15 He K, Merchant A, Rimm EB, et al. Folate, vitamin B6, and B12 intakes in relation to risk of stroke among men. Stroke (2004) 35:169–74.
16 Al-Delaimy WK, Rexrode KM, Hu FB, et al. Folate intake and risk of stroke among women. Stroke (2004) 35:1259–63.
17 Toole JF, Malinow MR, Chambless LE, et al. Lowering homocysteine in patients with ischemic stroke to prevent recurrent stroke, myocardial infarction, and death: the Vitamin Intervention for Stroke Prevention (VISP) randomized controlled trial. JAMA (2004) 291:565–75.
18 Lawlor DA, Davey Smith G, Bruckdorfer KR, et al. Those confounded vitamins: what can we learn from the differences between observational versus randomised trial evidence? Lancet (2004) 363:1724–27.
19 Wang X, Qin X, Demirtas H, et al. Efficacy of folic acid supplementation in stroke prevention: a meta-analysis. Lancet (2007) 369:1876–82.
20 Stampfer MJ, Colditz GA. Estrogen replacement therapy and coronary heart disease: a quantitative assessment of the epidemiologic evidence. Prev Med (1991) 20:47–63. (Reprinted Int J Epidemiol 2004;33:445–53).
21 Petitti DB, Perlman JA, Sidney S. Postmenopausal estrogen use and heart disease. N Engl J Med (1986) 315:131–32.
22 Lawlor DA, Davey Smith G, Ebrahim S. Commentary: The hormone replacement–coronary heart disease conundrum: is this the death of observational epidemiology? Int J Epidemiol (2004) 33:464–67.
23 Lawlor DA. Quality in epidemiological research: should we be submitting papers before we have the results and submitting more hypothesis generating research? Int J Epidemiol.
24 Greenland S. Commentary On "Quality in epidemiological research: should we be submitting papers before we have the results and submitting more hypothesis generating research?" Int J Epidemiol.