Genome-wide association studies (GWAS) and fine-mapping efforts to date have identified more than 100 prostate cancer (PrCa)-susceptibility loci. We meta-analyzed genotype data from a custom high-density array of 46,939 PrCa cases and 27,910 controls of European ancestry with previously genotyped data of 32,255 PrCa cases and 33,202 controls of European ancestry. Our analysis identified 62 novel loci associated (P C, p.Pro1054Arg) in ATM and rs2066827 (OR = 1.06; P = 2.3 × 10(-9); T>G, p.Val109Gly) in CDKN1B. The combination of all loci captured 28.4% of the PrCa familial relative risk, and a polygenic risk score conferred an elevated PrCa risk for men in the ninetieth to ninety-ninth percentiles (relative risk = 2.69; 95% confidence interval (CI): 2.55-2.82) and first percentile (relative risk = 5.71; 95% CI: 5.04-6.48) risk stratum compared with the population average. These findings improve risk prediction, enhance fine-mapping, and provide insight into the underlying biology of PrCa (ref. 1).
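As a minimal sketch of how a polygenic risk score of this kind is commonly formed, the Python below sums risk-allele dosages weighted by per-allele log odds ratios. The weighting scheme is an assumption (the abstract does not describe how the score was built), and the dosages and odds ratios in the example are hypothetical values, not results from the study. The last lines only add up the case and control counts quoted in the abstract.

```python
import math

def polygenic_risk_score(dosages, odds_ratios):
    """Weighted sum of risk-allele dosages (0, 1 or 2) using per-allele log(OR) weights.

    Assumption: a simple additive log-OR weighting; not confirmed as the study's method.
    """
    if len(dosages) != len(odds_ratios):
        raise ValueError("dosages and odds_ratios must have the same length")
    return sum(d * math.log(o) for d, o in zip(dosages, odds_ratios))

# Hypothetical individual carrying 0, 1 and 2 copies of three variants
example_dosages = [0, 1, 2]
example_odds_ratios = [1.06, 1.10, 0.95]  # illustrative per-allele ORs only
score = polygenic_risk_score(example_dosages, example_odds_ratios)

# Combined sample sizes implied by the counts quoted in the abstract
total_cases = 46_939 + 32_255
total_controls = 27_910 + 33_202
```

In practice such a score is compared against its distribution in controls to assign percentile strata like those reported above.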
In the version of this article initially published, the name of author Manuela Gago-Dominguez was misspelled as Manuela Gago Dominguez. The error has been corrected in the HTML and PDF versions of the article.
We analyzed 3,872 common genetic variants across the ESR1 locus (encoding estrogen receptor α) in 118,816 subjects from three international consortia. We found evidence for at least five independent causal variants, each associated with different phenotype sets, including estrogen receptor (ER(+) or ER(-)) and human ERBB2 (HER2(+) or HER2(-)) tumor subtypes, mammographic density and tumor grade. The best candidate causal variants for ER(-) tumors lie in four separate enhancer elements, and their risk alleles reduce expression of ESR1, RMND1 and CCDC170, whereas the risk alleles of the strongest candidates for the remaining independent causal variant disrupt a silencer element and putatively increase ESR1 and RMND1 expression.
The transferability and clinical value of genetic risk scores (GRSs) across populations remain limited due to an imbalance in genetic studies across ancestrally diverse populations. Here we conducted a multi-ancestry genome-wide association study of 156,319 prostate cancer cases and 788,443 controls of European, African, Asian and Hispanic men, reflecting a 57% increase in the number of non-European cases over previous prostate cancer genome-wide association studies. We identified 187 novel risk variants for prostate cancer, increasing the total number of risk variants to 451. An externally replicated multi-ancestry GRS was associated with risk that ranged from 1.8 (per standard deviation) in African ancestry men to 2.2 in European ancestry men. The GRS was associated with a greater risk of aggressive versus non-aggressive disease in men of African ancestry (P = 0.03). Our study presents novel prostate cancer susceptibility loci and a GRS with effective risk stratification across ancestry groups.
Pancreatic cancer is the fourth leading cause of cancer death in the developed world. Both inherited high-penetrance mutations in BRCA2 (ref. 2), ATM, PALB2 (ref. 4), BRCA1 (ref. 5), STK11 (ref. 6), CDKN2A and mismatch-repair genes and low-penetrance loci are associated with increased risk. To identify new risk loci, we performed a genome-wide association study on 9,925 pancreatic cancer cases and 11,569 controls, including 4,164 newly genotyped cases and 3,792 controls in 9 studies from North America, Central Europe and Australia. We identified three newly associated regions: 17q25.1 (LINC00673, rs11655237, odds ratio (OR) = 1.26, 95% confidence interval (CI) = 1.19-1.34, P = 1.42 × 10(-14)), 7p13 (SUGCT, rs17688601, OR = 0.88, 95% CI = 0.84-0.92, P = 1.41 × 10(-8)) and 3q29 (TP63, rs9854771, OR = 0.89, 95% CI = 0.85-0.93, P = 2.35 × 10(-8)). We detected significant association at 2p13.3 (ETAA1, rs1486134, OR = 1.14, 95% CI = 1.09-1.19, P = 3.36 × 10(-9)), a region with previous suggestive evidence in Han Chinese. We replicated previously reported associations at 9q34.2 (ABO), 13q22.1 (KLF5), 5p15.33 (TERT and CLPTM1), 13q12.2 (PDX1), 1q32.1 (NR5A2), 7q32.3 (LINC-PINT), 16q23.1 (BCAR1) and 22q12.1 (ZNRF3). Our study identifies new loci associated with pancreatic cancer risk.
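The relationship between the reported odds ratios, confidence intervals and P values can be sketched as follows. This assumes a Wald-type 95% interval that is symmetric on the log-odds scale and a two-sided normal test, which the abstract does not state explicitly. The function name is hypothetical. Because the published OR and CI are rounded, the recomputed P value will only approximate the reported one.

```python
import math

def z_and_p_from_or_ci(odds_ratio, ci_low, ci_high):
    """Recover log(OR), its standard error, the z statistic and a two-sided P value.

    Assumption: the 95% CI is log(OR) +/- 1.96 * SE (Wald interval).
    """
    beta = math.log(odds_ratio)
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * 1.96)
    z = beta / se
    p = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided P under a standard normal
    return beta, se, z, p

# Values quoted in the abstract for rs11655237 at 17q25.1 (LINC00673)
beta, se, z, p = z_and_p_from_or_ci(1.26, 1.19, 1.34)
is_genome_wide_significant = p < 5e-8
```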
Allelic heterogeneity in disease-causing genes presents a substantial challenge to the translation of genomic variation into clinical practice. Few of the almost 2,000 variants in the cystic fibrosis transmembrane conductance regulator gene CFTR have empirical evidence that they cause cystic fibrosis. To address this gap, we collected both genotype and phenotype data for 39,696 individuals with cystic fibrosis in registries and clinics in North America and Europe. In these individuals, 159 CFTR variants had an allele frequency of ≥0.01%. These variants were evaluated for both clinical severity and functional consequence, with 127 (80%) meeting both clinical and functional criteria consistent with disease. Assessment of disease penetrance in 2,188 fathers of individuals with cystic fibrosis enabled assignment of 12 of the remaining 32 variants as neutral, whereas the other 20 variants remained of indeterminate effect. This study illustrates that sourcing data directly from well-phenotyped subjects can address the gap in our ability to interpret clinically relevant genomic variation.
Linkage and candidate gene studies have identified several breast cancer susceptibility genes, but the overall contribution of coding variation to breast cancer is unclear. To evaluate the role of rare coding variants more comprehensively, we performed a meta-analysis across three large whole-exome sequencing datasets, containing 26,368 female cases and 217,673 female controls. Burden tests were performed for protein-truncating and rare missense variants in 15,616 and 18,601 genes, respectively. Associations between protein-truncating variants and breast cancer were identified for the following six genes at exome-wide significance (P
Coronary artery disease (CAD) is a leading cause of morbidity and mortality worldwide. Although 58 genomic regions have been associated with CAD thus far, most of the heritability is unexplained, indicating that additional susceptibility loci await identification. An efficient discovery strategy may be larger-scale evaluation of promising associations suggested by genome-wide association studies (GWAS). Hence, we genotyped 56,309 participants using a targeted gene array derived from earlier GWAS results and performed meta-analysis of results with 194,427 participants previously genotyped, totaling 88,192 CAD cases and 162,544 controls. We identified 25 new SNP-CAD associations (P < 5 × 10(-8), in fixed-effects meta-analysis) from 15 genomic regions, including SNPs in or near genes involved in cellular adhesion, leukocyte migration and atherosclerosis (PECAM1, rs1867624), coagulation and inflammation (PROCR, rs867186 (p.Ser219Gly)) and vascular smooth muscle cell differentiation (LMOD1, rs2820315). Correlation of these regions with cell-type-specific gene expression and plasma protein levels sheds light on potential disease mechanisms.
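A fixed-effects meta-analysis of the kind mentioned above is often implemented as inverse-variance weighting of per-study log odds ratios. The sketch below assumes that approach. The per-study estimates are hypothetical placeholders, not data from the study, and the function name is illustrative. The final check only confirms that the participant counts quoted in the abstract are internally consistent.

```python
import math

def fixed_effects_meta(betas, ses):
    """Inverse-variance-weighted fixed-effects pooling of per-study log odds ratios."""
    weights = [1.0 / se ** 2 for se in ses]
    pooled_beta = sum(w * b for w, b in zip(weights, betas)) / sum(weights)
    pooled_se = math.sqrt(1.0 / sum(weights))
    z = pooled_beta / pooled_se
    p = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided P under a standard normal
    return pooled_beta, pooled_se, z, p

# Hypothetical log(OR) estimates and standard errors from two genotyping stages
study_betas = [0.06, 0.05]
study_ses = [0.012, 0.010]
pooled_beta, pooled_se, z, p = fixed_effects_meta(study_betas, study_ses)
passes_threshold = p < 5e-8  # genome-wide significance threshold quoted in the abstract

# Newly genotyped plus previously genotyped participants equal cases plus controls
counts_consistent = (56_309 + 194_427) == (88_192 + 162_544)
```

The pooled standard error is always smaller than that of any single study, which is why combining the targeted array data with earlier GWAS results can push suggestive signals past the significance threshold.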
Genome-wide association studies have identified breast cancer risk variants in over 150 genomic regions, but the mechanisms underlying risk remain largely unknown. These regions were explored by combining association analysis with in silico genomic feature annotations. We defined 205 independent risk-associated signals with the set of credible causal variants in each one. In parallel, we used a Bayesian approach (PAINTOR) that combines genetic association, linkage disequilibrium and enriched genomic features to determine variants with high posterior probabilities of being causal. Potentially causal variants were significantly over-represented in active gene regulatory regions and transcription factor binding sites. We applied our INQUISIT pipeline for prioritizing genes as targets of those potentially causal variants, using gene expression (expression quantitative trait loci), chromatin interaction and functional annotations. Known cancer drivers, transcription factors and genes in the developmental, apoptosis, immune system and DNA integrity checkpoint gene ontology pathways were over-represented among the highest-confidence target genes.
We conducted a genome-wide association study of oral cavity and pharyngeal cancer in 6,034 cases and 6,585 controls from Europe, North America and South America. We detected eight significantly associated loci (P < 5 × 10(-8)), seven of which are new for these cancer sites. Oral and pharyngeal cancers combined were associated with loci at 6p21.32 (rs3828805, HLA-DQB1), 10q26.13 (rs201982221, LHPP) and 11p15.4 (rs1453414, OR52N2-TRIM5). Oral cancer was associated with two new regions, 2p23.3 (rs6547741, GPN1) and 9q34.12 (rs928674, LAMC3), and with known cancer-related loci: 9p21.3 (rs8181047, CDKN2B-AS1) and 5p15.33 (rs10462706, CLPTM1L). Oropharyngeal cancer associations were limited to the human leukocyte antigen (HLA) region, and classical HLA allele imputation showed a protective association with the class II haplotype HLA-DRB1*1301-HLA-DQA1*0103-HLA-DQB1*0603 (odds ratio (OR) = 0.59, P = 2.7 × 10(-9)). Stratified analyses on a subgroup of oropharyngeal cases with information available on human papillomavirus (HPV) status indicated that this association was considerably stronger in HPV-positive (OR = 0.23, P = 1.6 × 10(-6)) than in HPV-negative (OR = 0.75, P = 0.16) cancers.
Primary angle closure glaucoma (PACG) is a major cause of blindness worldwide. We conducted a genome-wide association study including 1,854 PACG cases and 9,608 controls across 5 sample collections in Asia. Replication experiments were conducted in 1,917 PACG cases and 8,943 controls collected from a further 6 sample collections. We report significant associations at three new loci: rs11024102 in PLEKHA7 (per-allele odds ratio (OR)=1.22; P=5.33×10(-12)), rs3753841 in COL11A1 (per-allele OR=1.20; P=9.22×10(-10)) and rs1015213 located between PCMTD1 and ST18 on chromosome 8q (per-allele OR=1.50; P=3.29×10(-9)). Our findings, accumulated across these independent worldwide collections, suggest possible mechanisms explaining the pathogenesis of PACG.
In a three-stage genome-wide association study among East Asian women including 22,780 cases and 24,181 controls, we identified 3 genetic loci newly associated with breast cancer risk, including rs4951011 at 1q32.1 (in intron 2 of the ZC3H11A gene; P=8.82×10(-9)), rs10474352 at 5q14.3 (near the ARRDC3 gene; P=1.67×10(-9)) and rs2290203 at 15q26.1 (in intron 14 of the PRC1 gene; P=4.25×10(-8)). We replicated these associations in 16,003 cases and 41,335 controls of European ancestry (P=0.030, 0.004 and 0.010, respectively). Data from the ENCODE Project suggest that variants rs4951011 and rs10474352 might be located in an enhancer region and transcription factor binding sites, respectively. This study provides additional insights into the genetics and biology of breast cancer.
Genome-wide association studies (GWAS) and large-scale replication studies have identified common variants in 79 loci associated with breast cancer, explaining ∼14% of the familial risk of the disease. To identify new susceptibility loci, we performed a meta-analysis of 11 GWAS, comprising 15,748 breast cancer cases and 18,084 controls together with 46,785 cases and 42,892 controls from 41 studies genotyped on a 211,155-marker custom array (iCOGS). Analyses were restricted to women of European ancestry. We generated genotypes for more than 11 million SNPs by imputation using the 1000 Genomes Project reference panel, and we identified 15 new loci associated with breast cancer at P < 5 × 10(-8). Combining association analysis with ChIP-seq chromatin binding data in mammary cell lines and ChIA-PET chromatin interaction data from ENCODE, we identified likely target genes in two regions: SETBP1 at 18q12.3 and RNF115 and PDZK1 at 1q21.1. One association appears to be driven by an amino acid substitution encoded in EXO1.
Breast cancer susceptibility variants frequently show heterogeneity in associations by tumor subtype (refs. 1-3). To identify novel loci, we performed a genome-wide association study including 133,384 breast cancer cases and 113,789 controls, plus 18,908 BRCA1 mutation carriers (9,414 with breast cancer) of European ancestry, using both standard and novel methodologies that account for underlying tumor heterogeneity by estrogen receptor, progesterone receptor and human epidermal growth factor receptor 2 status and tumor grade. We identified 32 novel susceptibility loci (P
Primary angle closure glaucoma (PACG) is a major cause of blindness worldwide. We conducted a genome-wide association study (GWAS) followed by replication in a combined total of 10,503 PACG cases and 29,567 controls drawn from 24 countries across Asia, Australia, Europe, North America, and South America. We observed significant evidence of disease association at five new genetic loci upon meta-analysis of all patient collections. These loci are at EPDR1 rs3816415 (odds ratio (OR) = 1.24, P = 5.94 × 10(-15)), CHAT rs1258267 (OR = 1.22, P = 2.85 × 10(-16)), GLIS3 rs736893 (OR = 1.18, P = 1.43 × 10(-14)), FERMT2 rs7494379 (OR = 1.14, P = 3.43 × 10(-11)), and DPM2-FAM102A rs3739821 (OR = 1.15, P = 8.32 × 10(-12)). We also confirmed significant association at three previously described loci (P < 5 × 10(-8) for each sentinel SNP at PLEKHA7, COL11A1, and PCMTD1-ST18), providing new insights into the biology of PACG.
The widespread distribution and relapsing nature of Plasmodium vivax infection present major challenges for the elimination of malaria. To characterize the genetic diversity of this parasite in individual infections and across the population, we performed deep genome sequencing of >200 clinical samples collected across the Asia-Pacific region and analyzed data on >300,000 SNPs and nine regions of the genome with large copy number variations. Individual infections showed complex patterns of genetic structure, with variation not only in the number of dominant clones but also in their level of relatedness and inbreeding. At the population level, we observed strong signals of recent evolutionary selection both in known drug resistance genes and at new loci, and these varied markedly between geographical locations. These findings demonstrate a dynamic landscape of local evolutionary adaptation in the parasite population and provide a foundation for genomic surveillance to guide effective strategies for control and elimination of P. vivax.
Systemic lupus erythematosus (SLE) has a strong but incompletely understood genetic architecture. We conducted an association study with replication in 4,478 SLE cases and 12,656 controls from six East Asian cohorts to identify new SLE susceptibility loci and better localize known loci. We identified ten new loci and confirmed 20 known loci with genome-wide significance. Among the new loci, the most significant locus was GTF2IRD1-GTF2I at 7q11.23 (rs73366469, Pmeta = 3.75 × 10(-117), odds ratio (OR) = 2.38), followed by DEF6, IL12B, TCF7, TERT, CD226, PCNXL3, RASGRP1, SYNGR1 and SIGLEC6. We identified the most likely functional variants at each locus by analyzing epigenetic marks and gene expression data. Ten candidate variants are known to alter gene expression in cis or in trans. Enrichment analysis highlights the importance of these loci in B cell and T cell biology. The new loci, together with previously known loci, increase the explained heritability of SLE to 24%. The new loci share functional and ontological characteristics with previously reported loci and are possible drug targets for SLE therapeutics.
Transforming growth factor (TGF)-β1 (encoded by TGFB1) is the prototypic member of the TGF-β family of 33 proteins that orchestrate embryogenesis, development and tissue homeostasis (refs. 1,2). Following its discovery (ref. 3), enormous interest and numerous controversies have emerged about the role of TGF-β in coordinating the balance of pro- and anti-oncogenic properties (refs. 4,5), pro- and anti-inflammatory effects (ref. 6), or pro- and anti-fibrinogenic characteristics (ref. 7). Here we describe three individuals from two pedigrees with biallelic loss-of-function mutations in the TGFB1 gene who presented with severe infantile inflammatory bowel disease (IBD) and central nervous system (CNS) disease associated with epilepsy, brain atrophy and posterior leukoencephalopathy. The proteins encoded by the mutated TGFB1 alleles were characterized by impaired secretion, function or stability of the TGF-β1-LAP complex, which is suggestive of perturbed bioavailability of TGF-β1. Our study shows that TGF-β1 has a critical and nonredundant role in the development and homeostasis of intestinal immunity and the CNS in humans.
To identify common alleles associated with different histotypes of epithelial ovarian cancer (EOC), we pooled data from multiple genome-wide genotyping projects totaling 25,509 EOC cases and 40,941 controls. We identified nine new susceptibility loci for different EOC histotypes: six for serous EOC histotypes (3q28, 4q32.3, 8q21.11, 10q24.33, 18q11.2 and 22q12.1), two for mucinous EOC (3q22.3 and 9q31.1) and one for endometrioid EOC (5q12.3). We then performed meta-analysis on the results for high-grade serous ovarian cancer with the results from analysis of 31,448 BRCA1 and BRCA2 mutation carriers, including 3,887 mutation carriers with EOC. This identified three additional susceptibility loci at 2q13, 8q24.1 and 12q24.31. Integrated analyses of genes and regulatory biofeatures at each locus predicted candidate susceptibility genes, including OBFC1, a new candidate susceptibility gene for low-grade and borderline serous EOC.