IJE Advance Access originally published online on September 4, 2009
International Journal of Epidemiology 2009 38(5):1364-1373; doi:10.1093/ije/dyp285
Ranking of genome-wide association scan signals by different measures
1 Department of Occupational and Environmental Medicine, Lund University Hospital, Lund, Sweden.
2 Competence Center for Clinical Research, Lund University Hospital, Lund, Sweden.
3 Department of Epidemiology and Public Health, Imperial College, London, UK.
4 Wellcome Trust Centre for Human Genetics, University of Oxford, Oxford, UK.
5 Wellcome Trust Sanger Institute, Hinxton, UK.
* Corresponding author. Department of Occupational and Environmental Medicine, Lund University Hospital, SE-221 85 Lund, Sweden. E-mail: ulf.stromberg{at}med.lu.se
Abstract
Background The P-value approach has been used to prioritize genome-wide association (GWA) scan signals, with genome-wide significance defined by a prior P-value threshold, although this is not ideal. One rationale put forward is that association signals should instead be expected to give less support for single nucleotide polymorphisms (SNPs) that are rare (with associated low-power tests) than for common SNPs with equivalent P-values, unless investigators believe, a priori, that rare causative variants contribute to the disease and have more pronounced effects.
Methods Using data from a GWA scan for type 2 diabetes (1924 cases, 2938 controls, 393 453 SNPs), we compared P-values with four alternative signal measures: likelihood ratio (LR), Bayes factor (BF; with a specified prior distribution for true effects), frequentist factor (FF; reflecting the ratio of estimated post-data power to P-value) and probability of pronounced effect size (PrPES).
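To make the signal measures more concrete, the following is a minimal sketch of how a P-value, an LR, a BF and an FF-style ratio might be computed for a single SNP from its estimated log odds ratio and standard error. It is not the authors' implementation. The normal approximation, the N(0, prior_sd²) prior, the maximized form of the LR, and the choice to evaluate power at a fixed assumed odds ratio are all assumptions made for illustration, as are the function and parameter names. PrPES is left out because its definition is not given here.

```python
import math
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal, used for the power calculation


def signal_measures(beta_hat, se, prior_sd=0.2, assumed_or=1.2, alpha=5e-7):
    """Return illustrative association signals for one SNP.

    beta_hat, se : estimated log odds ratio and its standard error
    prior_sd     : ASSUMED sd of a N(0, prior_sd^2) prior on true effects
    assumed_or   : ASSUMED odds ratio at which power is evaluated
    alpha        : two-sided significance threshold for the power calculation
    """
    z = beta_hat / se

    # Two-sided Wald P-value; erfc keeps precision in the far tail.
    p_value = math.erfc(abs(z) / math.sqrt(2.0))

    # Maximized likelihood ratio (estimate vs. null) under a normal likelihood.
    lr = math.exp(0.5 * z * z)

    # Bayes factor for H1 vs. H0 with a normal prior on the true effect:
    # beta_hat ~ N(0, v + w) under H1 and N(0, v) under H0.
    v, w = se ** 2, prior_sd ** 2
    shrink = w / (v + w)
    bf = math.sqrt(1.0 - shrink) * math.exp(0.5 * z * z * shrink)

    # FF-style ratio: power at the assumed odds ratio, using the observed
    # standard error, divided by the P-value (an assumed reading of "FF").
    z_crit = STD_NORMAL.inv_cdf(1.0 - alpha / 2.0)
    ncp = abs(math.log(assumed_or)) / se
    power = STD_NORMAL.cdf(ncp - z_crit) + STD_NORMAL.cdf(-ncp - z_crit)
    ff = power / p_value

    return {"p_value": p_value, "lr": lr, "bf": bf, "power": power, "ff": ff}


# Two hypothetical SNPs with the same z-score but different precision,
# mimicking a common SNP (small SE) and a rare SNP (large SE).
common = signal_measures(beta_hat=0.2, se=0.04)
rare = signal_measures(beta_hat=0.6, se=0.12)
for name, signals in (("common", common), ("rare", rare)):
    print(name, {key: f"{value:.3g}" for key, value in signals.items()})
```

Under these assumptions the two hypothetical SNPs share the same P-value and LR, while the FF-style ratio is smaller for the SNP with the larger standard error, because its test has lower power at the assumed odds ratio.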
Results The 19 common SNPs [minor allele frequency (MAF) among the controls >29%] yielding strong P-value signals (P < 5 × 10⁻⁷) were also top-ranked by the other approaches. The P-value, LR and BF signals ranked SNPs very similarly. In contrast, FF and PrPES signals down-weighted rare SNPs (control MAF <10%) with low P-values.
Conclusions For prioritization of signals that do not achieve compelling levels of evidence for association, the main driving force behind observed differences between the various association signals appears to be SNP MAF. The statistical power afforded by follow-up samples for establishing replication should be taken into account when tailoring the signal selection strategy.
Keywords Bayes factor, effect size, likelihood ratio, single nucleotide polymorphism, statistical power, statistics
Accepted 28 July 2009