Genome-wide association studies (GWAS) have identified 12 epithelial ovarian cancer (EOC) susceptibility alleles. The pattern of association at these loci is consistent in BRCA1 and BRCA2 mutation carriers who are at high risk of EOC. After imputation to 1000 Genomes Project data, we assessed associations of 11 million genetic variants with EOC risk from 15,437 cases unselected for family history and 30,845 controls and from 15,252 BRCA1 mutation carriers and 8,211 BRCA2 mutation carriers (3,096 with ovarian cancer), and we combined the results in a meta-analysis. This new study design yielded increased statistical power, leading to the discovery of six new EOC susceptibility loci. Variants at 1p36 (nearest gene, WNT4), 4q26 (SYNPO2), 9q34.2 (ABO) and 17q11.2 (ATAD5) were associated with EOC risk, and at 1p34.3 (RSPO1) and 6p22.1 (GPX6) variants were specifically associated with the serous EOC subtype, all with P < 5 × 10(-8). Incorporating these variants into risk assessment tools will improve clinical risk predictions for BRCA1 and BRCA2 mutation carriers.
Genome-wide association studies (GWAS) and large-scale replication studies have identified common variants in 79 loci associated with breast cancer, explaining ∼14% of the familial risk of the disease. To identify new susceptibility loci, we performed a meta-analysis of 11 GWAS, comprising 15,748 breast cancer cases and 18,084 controls together with 46,785 cases and 42,892 controls from 41 studies genotyped on a 211,155-marker custom array (iCOGS). Analyses were restricted to women of European ancestry. We generated genotypes for more than 11 million SNPs by imputation using the 1000 Genomes Project reference panel, and we identified 15 new loci associated with breast cancer at P < 5 × 10(-8). Combining association analysis with ChIP-seq chromatin binding data in mammary cell lines and ChIA-PET chromatin interaction data from ENCODE, we identified likely target genes in two regions: SETBP1 at 18q12.3 and RNF115 and PDZK1 at 1q21.1. One association appears to be driven by an amino acid substitution encoded in EXO1.
Allelic heterogeneity in disease-causing genes presents a substantial challenge to the translation of genomic variation into clinical practice. Few of the almost 2,000 variants in the cystic fibrosis transmembrane conductance regulator gene CFTR have empirical evidence that they cause cystic fibrosis. To address this gap, we collected both genotype and phenotype data for 39,696 individuals with cystic fibrosis in registries and clinics in North America and Europe. In these individuals, 159 CFTR variants had an allele frequency of ≥0.01%. These variants were evaluated for both clinical severity and functional consequence, with 127 (80%) meeting both clinical and functional criteria consistent with disease. Assessment of disease penetrance in 2,188 fathers of individuals with cystic fibrosis enabled assignment of 12 of the remaining 32 variants as neutral, whereas the other 20 variants remained of indeterminate effect. This study illustrates that sourcing data directly from well-phenotyped subjects can address the gap in our ability to interpret clinically relevant genomic variation.
Primary angle closure glaucoma (PACG) is a major cause of blindness worldwide. We conducted a genome-wide association study (GWAS) followed by replication in a combined total of 10,503 PACG cases and 29,567 controls drawn from 24 countries across Asia, Australia, Europe, North America, and South America. We observed significant evidence of disease association at five new genetic loci upon meta-analysis of all patient collections. These loci are at EPDR1 rs3816415 (odds ratio (OR) = 1.24, P = 5.94 × 10(-15)), CHAT rs1258267 (OR = 1.22, P = 2.85 × 10(-16)), GLIS3 rs736893 (OR = 1.18, P = 1.43 × 10(-14)), FERMT2 rs7494379 (OR = 1.14, P = 3.43 × 10(-11)), and DPM2-FAM102A rs3739821 (OR = 1.15, P = 8.32 × 10(-12)). We also confirmed significant association at three previously described loci (P < 5 × 10(-8) for each sentinel SNP at PLEKHA7, COL11A1, and PCMTD1-ST18), providing new insights into the biology of PACG.
Systemic lupus erythematosus (SLE) has a strong but incompletely understood genetic architecture. We conducted an association study with replication in 4,478 SLE cases and 12,656 controls from six East Asian cohorts to identify new SLE susceptibility loci and better localize known loci. We identified ten new loci and confirmed 20 known loci with genome-wide significance. Among the new loci, the most significant locus was GTF2IRD1-GTF2I at 7q11.23 (rs73366469, Pmeta = 3.75 × 10(-117), odds ratio (OR) = 2.38), followed by DEF6, IL12B, TCF7, TERT, CD226, PCNXL3, RASGRP1, SYNGR1 and SIGLEC6. We identified the most likely functional variants at each locus by analyzing epigenetic marks and gene expression data. Ten candidate variants are known to alter gene expression in cis or in trans. Enrichment analysis highlights the importance of these loci in B cell and T cell biology. The new loci, together with previously known loci, increase the explained heritability of SLE to 24%. The new loci share functional and ontological characteristics with previously reported loci and are possible drug targets for SLE therapeutics.
Pancreatic cancer is the fourth leading cause of cancer death in the developed world. Both inherited high-penetrance mutations in BRCA2 (ref. 2), ATM, PALB2 (ref. 4), BRCA1 (ref. 5), STK11 (ref. 6), CDKN2A and mismatch-repair genes and low-penetrance loci are associated with increased risk. To identify new risk loci, we performed a genome-wide association study on 9,925 pancreatic cancer cases and 11,569 controls, including 4,164 newly genotyped cases and 3,792 controls in 9 studies from North America, Central Europe and Australia. We identified three newly associated regions: 17q25.1 (LINC00673, rs11655237, odds ratio (OR) = 1.26, 95% confidence interval (CI) = 1.19-1.34, P = 1.42 × 10(-14)), 7p13 (SUGCT, rs17688601, OR = 0.88, 95% CI = 0.84-0.92, P = 1.41 × 10(-8)) and 3q29 (TP63, rs9854771, OR = 0.89, 95% CI = 0.85-0.93, P = 2.35 × 10(-8)). We detected significant association at 2p13.3 (ETAA1, rs1486134, OR = 1.14, 95% CI = 1.09-1.19, P = 3.36 × 10(-9)), a region with previous suggestive evidence in Han Chinese. We replicated previously reported associations at 9q34.2 (ABO), 13q22.1 (KLF5), 5p15.33 (TERT and CLPTM1), 13q12.2 (PDX1), 1q32.1 (NR5A2), 7q32.3 (LINC-PINT), 16q23.1 (BCAR1) and 22q12.1 (ZNRF3). Our study identifies new loci associated with pancreatic cancer risk.
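The odds ratios, confidence intervals and P values reported above are linked by the usual Wald approximation on the log-odds scale. The following is a minimal sketch, assuming a normal approximation and a symmetric 95% interval on the log scale; the function name `z_and_p_from_or_ci` is hypothetical and this is not presented as the authors' method. Because the published OR and CI are rounded, only the order of magnitude of the P value can be recovered this way.

```python
from math import log, sqrt, erfc


def z_and_p_from_or_ci(odds_ratio, ci_low, ci_high, z_crit=1.959964):
    """Approximate the Wald z statistic and two-sided P value from an OR and its 95% CI.

    Assumes the CI is symmetric on the log-odds scale (an assumption,
    not something stated in the abstract).
    """
    # Standard error of the log odds ratio, recovered from the CI width
    se = (log(ci_high) - log(ci_low)) / (2 * z_crit)
    z = log(odds_ratio) / se
    # Two-sided tail probability of the standard normal distribution
    p = erfc(abs(z) / sqrt(2))
    return z, p


# Values quoted for rs11655237 at 17q25.1 (LINC00673)
z, p = z_and_p_from_or_ci(1.26, 1.19, 1.34)
print(f"z = {z:.2f}, P = {p:.1e}")
```

The result lands within the same order of magnitude as the reported P = 1.42 × 10(-14); the remaining difference is expected from rounding of the published interval.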
Genome-wide association studies (GWAS) and fine-mapping efforts to date have identified more than 100 prostate cancer (PrCa)-susceptibility loci. We meta-analyzed genotype data from a custom high-density array of 46,939 PrCa cases and 27,910 controls of European ancestry with previously genotyped data of 32,255 PrCa cases and 33,202 controls of European ancestry. Our analysis identified 62 novel loci associated (P C, p.Pro1054Arg) in ATM and rs2066827 (OR = 1.06; P = 2.3 × 10(-9); T>G, p.Val109Gly) in CDKN1B. The combination of all loci captured 28.4% of the PrCa familial relative risk, and a polygenic risk score conferred an elevated PrCa risk for men in the ninetieth to ninety-ninth percentiles (relative risk = 2.69; 95% confidence interval (CI): 2.55-2.82) and first percentile (relative risk = 5.71; 95% CI: 5.04-6.48) risk stratum compared with the population average. These findings improve risk prediction, enhance fine-mapping, and provide insight into the underlying biology of PrCa.
In the version of this article initially published, the name of author Manuela Gago-Dominguez was misspelled as Manuela Gago Dominguez. The error has been corrected in the HTML and PDF version of the article.
To identify common alleles associated with different histotypes of epithelial ovarian cancer (EOC), we pooled data from multiple genome-wide genotyping projects totaling 25,509 EOC cases and 40,941 controls. We identified nine new susceptibility loci for different EOC histotypes: six for serous EOC histotypes (3q28, 4q32.3, 8q21.11, 10q24.33, 18q11.2 and 22q12.1), two for mucinous EOC (3q22.3 and 9q31.1) and one for endometrioid EOC (5q12.3). We then performed meta-analysis on the results for high-grade serous ovarian cancer with the results from analysis of 31,448 BRCA1 and BRCA2 mutation carriers, including 3,887 mutation carriers with EOC. This identified three additional susceptibility loci at 2q13, 8q24.1 and 12q24.31. Integrated analyses of genes and regulatory biofeatures at each locus predicted candidate susceptibility genes, including OBFC1, a new candidate susceptibility gene for low-grade and borderline serous EOC.
Most common breast cancer susceptibility variants have been identified through genome-wide association studies (GWAS) of predominantly estrogen receptor (ER)-positive disease. We conducted a GWAS using 21,468 ER-negative cases and 100,594 controls combined with 18,908 BRCA1 mutation carriers (9,414 with breast cancer), all of European origin. We identified independent associations at P < 5 × 10(-8) with ten variants at nine new loci. At P < 0.05, we replicated associations with 10 of 11 variants previously reported in ER-negative disease or BRCA1 mutation carrier GWAS and observed consistent associations with ER-negative disease for 105 susceptibility variants identified by other studies. These 125 variants explain approximately 16% of the familial risk of this breast cancer subtype. There was high genetic correlation (0.72) between risk of ER-negative breast cancer and breast cancer risk for BRCA1 mutation carriers. These findings may lead to improved risk prediction and inform further fine-mapping and functional work to better understand the biological basis of ER-negative breast cancer.
Noncoding repeat expansions cause various neuromuscular diseases, including myotonic dystrophies, fragile X tremor/ataxia syndrome, some spinocerebellar ataxias, amyotrophic lateral sclerosis and benign adult familial myoclonic epilepsies. Inspired by the striking similarities in the clinical and neuroimaging findings between neuronal intranuclear inclusion disease (NIID) and fragile X tremor/ataxia syndrome caused by noncoding CGG repeat expansions in FMR1, we directly searched for repeat expansion mutations and identified noncoding CGG repeat expansions in NBPF19 (NOTCH2NLC) as the causative mutations for NIID. Further prompted by the similarities in the clinical and neuroimaging findings with NIID, we identified similar noncoding CGG repeat expansions in two other diseases: oculopharyngeal myopathy with leukoencephalopathy and oculopharyngodistal myopathy, in LOC642361/NUTM2B-AS1 and LRP12, respectively. These findings expand our knowledge of the clinical spectra of diseases caused by expansions of the same repeat motif, and further highlight how directly searching for expanded repeats can help identify mutations underlying diseases.
Autism spectrum disorder (ASD) is a highly heritable and heterogeneous group of neurodevelopmental phenotypes diagnosed in more than 1% of children. Common genetic variants contribute substantially to ASD susceptibility, but to date no individual variants have been robustly associated with ASD. With a marked sample-size increase from a unique Danish population resource, we report a genome-wide association meta-analysis of 18,381 individuals with ASD and 27,969 controls that identified five genome-wide-significant loci. Leveraging GWAS results from three phenotypes with significantly overlapping genetic architectures (schizophrenia, major depression, and educational attainment), we identified seven additional loci shared with other traits at equally strict significance levels. Dissecting the polygenic architecture, we found both quantitative and qualitative polygenic heterogeneity across ASD subtypes. These results highlight biological insights, particularly relating to neuronal function and corticogenesis, and establish that GWAS performed at scale will be much more productive in the near term in ASD.
We conducted a combined genome-wide association study (GWAS) of 7,481 individuals with bipolar disorder (cases) and 9,250 controls as part of the Psychiatric GWAS Consortium. Our replication study tested 34 SNPs in 4,496 independent cases with bipolar disorder and 42,422 independent controls and found that 18 of 34 SNPs had P < 0.05, with 31 of 34 SNPs having signals with the same direction of effect (P = 3.8 × 10(-7)). An analysis of all 11,974 bipolar disorder cases and 51,792 controls confirmed genome-wide significant evidence of association for CACNA1C and identified a new intronic variant in ODZ4. We identified a pathway comprised of subunits of calcium channels enriched in bipolar disorder association intervals. Finally, a combined GWAS analysis of schizophrenia and bipolar disorder yielded strong association evidence for SNPs in CACNA1C and in the region of NEK4-ITIH1-ITIH3-ITIH4. Our replication results imply that increasing sample sizes in bipolar disorder will confirm many additional loci.
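The direction-of-effect result above (31 of 34 SNPs concordant) can be checked with a sign test. The abstract does not say which test produced P = 3.8 × 10(-7); the sketch below assumes a one-sided binomial test with a null concordance probability of 0.5, and the function name `sign_test_p` is hypothetical.

```python
from math import comb


def sign_test_p(concordant: int, n: int) -> float:
    """One-sided binomial probability of at least `concordant` same-direction
    effects among `n` SNPs when each direction is equally likely (p = 0.5)."""
    tail = sum(comb(n, k) for k in range(concordant, n + 1))
    return tail / 2 ** n


p = sign_test_p(31, 34)
print(f"{p:.2e}")  # 3.83e-07
```

Under these assumptions the tail probability is 6,580 / 2^34, which agrees with the reported value to the precision given.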
Coronary artery disease (CAD) is a leading cause of morbidity and mortality worldwide. Although 58 genomic regions have been associated with CAD thus far, most of the heritability is unexplained, indicating that additional susceptibility loci await identification. An efficient discovery strategy may be larger-scale evaluation of promising associations suggested by genome-wide association studies (GWAS). Hence, we genotyped 56,309 participants using a targeted gene array derived from earlier GWAS results and performed meta-analysis of results with 194,427 participants previously genotyped, totaling 88,192 CAD cases and 162,544 controls. We identified 25 new SNP-CAD associations (P < 5 × 10(-8), in fixed-effects meta-analysis) from 15 genomic regions, including SNPs in or near genes involved in cellular adhesion, leukocyte migration and atherosclerosis (PECAM1, rs1867624), coagulation and inflammation (PROCR, rs867186 (p.Ser219Gly)) and vascular smooth muscle cell differentiation (LMOD1, rs2820315). Correlation of these regions with cell-type-specific gene expression and plasma protein levels sheds light on potential disease mechanisms.
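For readers unfamiliar with the fixed-effects meta-analysis mentioned above, a common form is inverse-variance weighting of per-study log odds ratios. The sketch below assumes that form; the abstract does not give the exact procedure, and the function name `fixed_effects_meta` and the input values are hypothetical illustrations, not data from the study.

```python
from math import sqrt, erfc


def fixed_effects_meta(betas, ses):
    """Inverse-variance weighted fixed-effects meta-analysis.

    betas: per-study effect estimates (e.g. log odds ratios)
    ses:   matching standard errors
    Returns the pooled effect, its standard error, z and two-sided P.
    """
    weights = [1.0 / se ** 2 for se in ses]
    total_weight = sum(weights)
    pooled_beta = sum(w * b for w, b in zip(weights, betas)) / total_weight
    pooled_se = sqrt(1.0 / total_weight)
    z = pooled_beta / pooled_se
    p = erfc(abs(z) / sqrt(2))  # two-sided normal tail probability
    return pooled_beta, pooled_se, z, p


# Hypothetical log odds ratios and standard errors for one SNP in two studies
betas = [0.05, 0.04]
ses = [0.010, 0.012]
beta, se, z, p = fixed_effects_meta(betas, ses)
print(f"pooled beta = {beta:.4f}, SE = {se:.4f}, P = {p:.1e}")
print("genome-wide significant:", p < 5e-8)
```

Studies with smaller standard errors receive more weight, so the pooled estimate sits closer to the more precise study and its standard error is smaller than that of either input.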
We analyzed 3,872 common genetic variants across the ESR1 locus (encoding estrogen receptor α) in 118,816 subjects from three international consortia. We found evidence for at least five independent causal variants, each associated with different phenotype sets, including estrogen receptor (ER(+) or ER(-)) and human ERBB2 (HER2(+) or HER2(-)) tumor subtypes, mammographic density and tumor grade. The best candidate causal variants for ER(-) tumors lie in four separate enhancer elements, and their risk alleles reduce expression of ESR1, RMND1 and CCDC170, whereas the risk alleles of the strongest candidates for the remaining independent causal variant disrupt a silencer element and putatively increase ESR1 and RMND1 expression.
Genome-wide association studies have identified breast cancer risk variants in over 150 genomic regions, but the mechanisms underlying risk remain largely unknown. These regions were explored by combining association analysis with in silico genomic feature annotations. We defined 205 independent risk-associated signals with the set of credible causal variants in each one. In parallel, we used a Bayesian approach (PAINTOR) that combines genetic association, linkage disequilibrium and enriched genomic features to determine variants with high posterior probabilities of being causal. Potentially causal variants were significantly over-represented in active gene regulatory regions and transcription factor binding sites. We applied our INQUISIT pipeline for prioritizing genes as targets of those potentially causal variants, using gene expression (expression quantitative trait loci), chromatin interaction and functional annotations. Known cancer drivers, transcription factors and genes in the developmental, apoptosis, immune system and DNA integrity checkpoint gene ontology pathways were over-represented among the highest-confidence target genes.
In a three-stage genome-wide association study among East Asian women including 22,780 cases and 24,181 controls, we identified 3 genetic loci newly associated with breast cancer risk, including rs4951011 at 1q32.1 (in intron 2 of the ZC3H11A gene; P=8.82×10(-9)), rs10474352 at 5q14.3 (near the ARRDC3 gene; P=1.67×10(-9)) and rs2290203 at 15q26.1 (in intron 14 of the PRC1 gene; P=4.25×10(-8)). We replicated these associations in 16,003 cases and 41,335 controls of European ancestry (P=0.030, 0.004 and 0.010, respectively). Data from the ENCODE Project suggest that variants rs4951011 and rs10474352 might be located in an enhancer region and transcription factor binding sites, respectively. This study provides additional insights into the genetics and biology of breast cancer.
Breast cancer susceptibility variants frequently show heterogeneity in associations by tumor subtype (refs. 1-3). To identify novel loci, we performed a genome-wide association study including 133,384 breast cancer cases and 113,789 controls, plus 18,908 BRCA1 mutation carriers (9,414 with breast cancer) of European ancestry, using both standard and novel methodologies that account for underlying tumor heterogeneity by estrogen receptor, progesterone receptor and human epidermal growth factor receptor 2 status and tumor grade. We identified 32 novel susceptibility loci (P
Primary angle closure glaucoma (PACG) is a major cause of blindness worldwide. We conducted a genome-wide association study including 1,854 PACG cases and 9,608 controls across 5 sample collections in Asia. Replication experiments were conducted in 1,917 PACG cases and 8,943 controls collected from a further 6 sample collections. We report significant associations at three new loci: rs11024102 in PLEKHA7 (per-allele odds ratio (OR)=1.22; P=5.33×10(-12)), rs3753841 in COL11A1 (per-allele OR=1.20; P=9.22×10(-10)) and rs1015213 located between PCMTD1 and ST18 on chromosome 8q (per-allele OR=1.50; P=3.29×10(-9)). Our findings, accumulated across these independent worldwide collections, suggest possible mechanisms explaining the pathogenesis of PACG.