Primary open angle glaucoma (POAG), a major cause of blindness worldwide, is a complex disease with a significant genetic contribution. We performed Exome Array (Illumina) analysis on 3504 POAG cases and 9746 controls with replication of the most significant findings in 9173 POAG cases and 26 780 controls across 18 collections of Asian, African and European descent. Apart from confirming strong evidence of association at CDKN2B-AS1 (rs2157719 [G], odds ratio [OR] = 0.71, P = 2.81 × 10(-33)), we observed one SNP showing significant association to POAG (CDC7-TGFBR3 rs1192415, ORG-allele = 1.13, Pmeta = 1.60 × 10(-8)). This particular SNP has previously been shown to be strongly associated with optic disc area and vertical cup-to-disc ratio, which are regarded as glaucoma-related quantitative traits. Our study now extends this by directly implicating it in POAG disease pathogenesis.
Previous studies have suggested that polymorphisms in CASP8 on chromosome 2 are associated with breast cancer risk. To clarify the role of CASP8 in breast cancer susceptibility, we carried out dense genotyping of this region in the Breast Cancer Association Consortium (BCAC). Single-nucleotide polymorphisms (SNPs) spanning a 1 Mb region around CASP8 were genotyped in 46 450 breast cancer cases and 42 600 controls of European origin from 41 studies participating in the BCAC as part of a custom genotyping array experiment (iCOGS). Missing genotypes and SNPs were imputed and, after quality exclusions, 501 typed and 1232 imputed SNPs were included in logistic regression models adjusting for study and ancestry principal components. The SNPs retained in the final model were investigated further in data from nine genome-wide association studies (GWAS) comprising in total 10 052 case and 12 575 control subjects. The most significant association signal observed in European subjects was for the imputed intronic SNP rs1830298 in ALS2CR12 (telomeric to CASP8), with per allele odds ratio and 95% confidence interval [OR (95% confidence interval, CI)] for the minor allele of 1.05 (1.03-1.07), P = 1 × 10(-5). Three additional independent signals from intronic SNPs were identified, in CASP8 (rs36043647), ALS2CR11 (rs59278883) and CFLAR (rs7558475). The association with rs1830298 was replicated in the imputed results from the combined GWAS (P = 3 × 10(-6)), yielding a combined OR (95% CI) of 1.06 (1.04-1.08), P = 1 × 10(-9). Analyses of gene expression associations in peripheral blood and normal breast tissue indicate that CASP8 might be the target gene, suggesting a mechanism involving apoptosis.
We recently identified a novel susceptibility variant, rs865686, for estrogen-receptor positive breast cancer at 9q31.2. Here, we report a fine-mapping analysis of the 9q31.2 susceptibility locus using 43 160 cases and 42 600 controls of European ancestry ascertained from 52 studies and a further 5795 cases and 6624 controls of Asian ancestry from nine studies. Single nucleotide polymorphism (SNP) rs676256 was most strongly associated with risk in Europeans (odds ratios [OR] = 0.90 [0.88-0.92]; P-value = 1.58 × 10(-25)). This SNP is one of a cluster of highly correlated variants, including rs865686, that spans ∼14.5 kb. We identified two additional independent association signals demarcated by SNPs rs10816625 (OR = 1.12 [1.08-1.17]; P-value = 7.89 × 10(-09)) and rs13294895 (OR = 1.09 [1.06-1.12]; P-value = 2.97 × 10(-11)). SNP rs10816625, but not rs13294895, was also associated with risk of breast cancer in Asian individuals (OR = 1.12 [1.06-1.18]; P-value = 2.77 × 10(-05)). Functional genomic annotation using data derived from breast cancer cell-line models indicates that these SNPs localise to putative enhancer elements that bind known drivers of hormone-dependent breast cancer, including ER-α, FOXA1 and GATA-3. In vitro analyses indicate that rs10816625 and rs13294895 have allele-specific effects on enhancer activity and suggest chromatin interactions with the KLF4 gene locus. These results demonstrate the power of dense genotyping in large studies to identify independent susceptibility variants. Analysis of associations using subjects with different ancestry, combined with bioinformatic and genomic characterisation, can provide strong evidence for the likely causative alleles and their functional basis.
Central corneal thickness (CCT) is a risk factor of glaucoma, the most common cause of irreversible blindness worldwide. The identification of genetic determinants affecting CCT in the normal population will provide insights into the mechanisms underlying the association between CCT and glaucoma, as well as the pathogenesis of glaucoma itself. We conducted two genome-wide association studies for CCT in 5080 individuals drawn from two ethnic populations in Singapore (2538 Indian and 2542 Malays) and identified novel genetic loci significantly associated with CCT (COL8A2 rs96067, p(meta) = 5.40 × 10⁻¹³, interval of RXRA-COL5A1 rs1536478, p(meta) = 3.05 × 10⁻⁹). We confirmed the involvement of a previously reported gene for CCT and brittle cornea syndrome (ZNF469) [rs9938149 (p(meta) = 1.63 × 10⁻¹⁶) and rs12447690 (p(meta) = 1.92 × 10⁻¹⁴)]. Evidence of association exceeding the formal threshold for genome-wide significance was observed at rs7044529, an SNP located within COL5A1 when data from this study (n = 5080, P = 0.0012) were considered together with all published data (reflecting an additional 7349 individuals, p(Fisher) = 1.5 × 10⁻⁹). These findings implicate the involvement of collagen genes influencing CCT and thus, possibly the pathogenesis of glaucoma.
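The combined p(Fisher) value quoted above pools evidence across data sets. A minimal sketch of Fisher's method follows, assuming independent p-values; the function name is illustrative, and the second p-value in the usage lines is a hypothetical placeholder, since the abstract does not report the p-value from the published data on its own.

```python
import math

def fisher_combined_p(p_values):
    """Combine independent p-values with Fisher's method.

    The statistic X = -2 * sum(ln p_i) follows a chi-square distribution
    with 2k degrees of freedom under the null. For even degrees of freedom
    the upper-tail probability has a closed form:
        exp(-X/2) * sum_{i=0}^{k-1} (X/2)^i / i!
    """
    k = len(p_values)
    stat = -2.0 * sum(math.log(p) for p in p_values)
    half = stat / 2.0
    term = 1.0   # (X/2)^0 / 0!
    total = 1.0
    for i in range(1, k):
        term *= half / i
        total += term
    return stat, math.exp(-half) * total

# P = 0.0012 is taken from the abstract; 1e-6 is a HYPOTHETICAL value
# standing in for the published data, which the abstract does not give.
p_this_study = 0.0012
p_published_hypothetical = 1e-6
stat, p_combined = fisher_combined_p([p_this_study, p_published_hypothetical])
print(f"chi-square = {stat:.2f} on 4 df, combined p = {p_combined:.2e}")
```

Fisher's method uses only the p-values, so it ignores effect direction and sample size; it is shown here only because the abstract names it.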
Candidate variant association studies have been largely unsuccessful in identifying common breast cancer susceptibility variants, although most studies have been underpowered to detect associations of a realistic magnitude. We assessed 41 common non-synonymous single-nucleotide polymorphisms (nsSNPs) for which evidence of association with breast cancer risk had been previously reported. Case-control data were combined from 38 studies of white European women (46 450 cases and 42 600 controls) and analyzed using unconditional logistic regression. Strong evidence of association was observed for three nsSNPs: ATXN7-K264R at 3p21 [rs1053338, per allele OR = 1.07, 95% confidence interval (CI) = 1.04-1.10, P = 2.9 × 10(-6)], AKAP9-M463I at 7q21 (rs6964587, OR = 1.05, 95% CI = 1.03-1.07, P = 1.7 × 10(-6)) and NEK10-L513S at 3p24 (rs10510592, OR = 1.10, 95% CI = 1.07-1.12, P = 5.1 × 10(-17)). The first two associations reached genome-wide statistical significance in a combined analysis of available data, including independent data from nine genome-wide association studies (GWASs): for ATXN7-K264R, OR = 1.07 (95% CI = 1.05-1.10, P = 1.0 × 10(-8)); for AKAP9-M463I, OR = 1.05 (95% CI = 1.04-1.07, P = 2.0 × 10(-10)). Further analysis of other common variants in these two regions suggested that intronic SNPs nearby are more strongly associated with disease risk. We have thus identified a novel susceptibility locus at 3p21, and confirmed previous suggestive evidence that rs6964587 at 7q21 is associated with risk. The third locus, rs10510592, is located in an established breast cancer susceptibility region; the association was substantially attenuated after adjustment for the known GWAS hit. Thus, each of the associated nsSNPs is likely to be a marker for another, non-coding, variant causally related to breast cancer risk. Further fine-mapping and functional studies are required to identify the underlying risk-modifying variants and the genes through which they act.
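As a rough consistency check on the figures above, a per-allele odds ratio and its 95% confidence interval can be converted back to an approximate Wald z-score and two-sided P-value. This is a sketch under the assumption that the interval is symmetric on the log-odds scale (estimate ± 1.96 standard errors); the function name is illustrative, and because the reported ORs and CIs are rounded to two decimals, the recovered P-values should only match the reported ones to within about an order of magnitude.

```python
import math

def z_and_p_from_or_ci(or_est, ci_low, ci_high, z_crit=1.959964):
    """Approximate Wald z and two-sided p from an OR and its 95% CI.

    Assumes the CI was built as exp(log(OR) +/- z_crit * SE).
    """
    se = (math.log(ci_high) - math.log(ci_low)) / (2.0 * z_crit)
    z = math.log(or_est) / se
    p = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided normal tail
    return z, p

# OR and CI values as reported in the abstract (stage 1 analysis).
reported = {
    "ATXN7-K264R (rs1053338)": (1.07, 1.04, 1.10),
    "AKAP9-M463I (rs6964587)": (1.05, 1.03, 1.07),
    "NEK10-L513S (rs10510592)": (1.10, 1.07, 1.12),
}
for label, (or_est, lo, hi) in reported.items():
    z, p = z_and_p_from_or_ci(or_est, lo, hi)
    print(f"{label}: z = {z:.2f}, approximate p = {p:.1e}")
```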
Integrin alpha M (ITGAM; CD11b) is a component of the macrophage-1 antigen complex, which mediates leukocyte adhesion, migration and phagocytosis as part of the immune system. We previously identified a missense polymorphism, rs1143679 (R77H), strongly associated with systemic lupus erythematosus (SLE). However, the molecular mechanisms of this variant are incompletely understood. A meta-analysis of published and novel data on 28 439 individuals with European, African, Hispanic and Asian ancestries reinforces genetic association between rs1143679 and SLE [Pmeta = 3.60 × 10(-90), odds ratio (OR) = 1.76]. Since rs1143679 is in the most active region of chromatin regulation and transcription factor binding in ITGAM, we quantitated ITGAM RNA and surface protein levels in monocytes from patients with each rs1143679 genotype. We observed that transcript levels significantly decreased for the risk allele ('A') relative to the non-risk allele ('G'), in a dose-dependent fashion: ('AA' < 'AG' < 'GG'). CD11b protein levels in patients' monocytes were directly correlated with RNA levels. Strikingly, heterozygous individuals express much lower (average 10- to 15-fold reduction) amounts of the 'A' transcript than 'G' transcript. We found that the non-risk sequence surrounding rs1143679 exhibits transcriptional enhancer activity in vivo and binds to Ku70/80, NFKB1 and EBF1 in vitro, functions that are significantly reduced with the risk allele. Mutant CD11b protein shows significantly reduced binding to fibrinogen and vitronectin, relative to non-risk, both in purified protein and in cellular models. This two-pronged contribution (nucleic acid- and protein-level) of the rs1143679 risk allele to decreasing ITGAM activity provides insight into the molecular mechanisms of its potent association with SLE.
To evaluate the contribution of non-synonymous-coding variants of known familial and genome-wide association studies (GWAS)-linked genes for Parkinson's disease (PD) to PD risk in the East Asian population, we sequenced all the coding exons of 39 PD-related disease genes and evaluated the accumulation of rare non-synonymous-coding variants in 375 early-onset PD cases and 399 controls. We also genotyped 782 non-synonymous-coding variants of these genes in 710 late-onset PD cases and 9046 population controls. Significant enrichment of LRRK2 variants was observed in both early- and late-onset PD (odds ratio = 1.58; 95% confidence interval = 1.29-1.93; P = 8.05 × 10(-6)). Moderate enrichment was also observed in FGF20, MCCC1, GBA and ITGA8. Half of the rare variants anticipated to cause loss of function of these genes were present in healthy controls. Overall, non-synonymous-coding variants of known familial and GWAS-linked genes appear to make a limited contribution to PD risk, suggesting that clinical sequencing of these genes will provide limited information for risk prediction and molecular diagnosis.
Candidate gene and genome-wide association studies (GWAS) have identified 15 independent genomic regions associated with bladder cancer risk. In search of additional susceptibility variants, we followed up on four promising single-nucleotide polymorphisms (SNPs) that had not achieved genome-wide significance in 6911 cases and 11 814 controls (rs6104690, rs4510656, rs5003154 and rs4907479, P < 1 × 10(-6)), using additional data from existing GWAS datasets and targeted genotyping for studies that did not have GWAS data. In a combined analysis, which included data on up to 15 058 cases and 286 270 controls, two SNPs achieved genome-wide statistical significance: rs6104690 in a gene desert at 20p12.2 (P = 2.19 × 10(-11)) and rs4907479 within the MCF2L gene at 13q34 (P = 3.3 × 10(-10)). Imputation and fine-mapping analyses were performed in these two regions for a subset of 5551 bladder cancer cases and 10 242 controls. Analyses at the 13q34 region suggest a single signal marked by rs4907479. In contrast, we detected two signals in the 20p12.2 region: the first signal is marked by rs6104690, and the second signal is marked by two moderately correlated SNPs (r(2) = 0.53), rs6108803 and the previously reported rs62185668. The second 20p12.2 signal is more strongly associated with the risk of muscle-invasive (T2-T4 stage) than with non-muscle-invasive (Ta, T1 stage) bladder cancer (case-case P ≤ 0.02 for both rs62185668 and rs6108803). Functional analyses are needed to explore the biological mechanisms underlying these novel genetic associations with risk for bladder cancer.
Breast cancer is the most diagnosed malignancy and the second leading cause of cancer mortality in females. Previous association studies have identified variants on 2q35 associated with the risk of breast cancer. To identify functional susceptibility loci for breast cancer, we interrogated the 2q35 gene desert for chromatin architecture and functional variation correlated with gene expression. We report a novel intergenic breast cancer risk locus containing an enhancer copy number variation (enCNV; deletion) located approximately 400 kb upstream of IGFBP5, which overlaps an intergenic ERα-bound enhancer that loops to the IGFBP5 promoter. The enCNV is correlated with modified ERα binding and monoallelic repression of IGFBP5 following oestrogen treatment. We investigated the association of enCNV genotype with breast cancer in 1,182 cases and 1,362 controls, and replicated our findings in an independent set of 62,533 cases and 60,966 controls from 41 case-control studies and 11 GWAS. We report a dose-dependent inverse association of 2q35 enCNV genotype (per-copy OR = 0.68, 95% CI 0.55-0.83, P = 0.0002; replication OR = 0.77, 95% CI 0.73-0.82, P = 2.1 × 10(-19)) and identify 13 additional linked variants (r(2) > 0.8) in the 20 kb linkage block containing the enCNV (P = 3.2 × 10(-15) to 5.6 × 10(-17)). These associations were independent of previously reported 2q35 variants, rs13387042/rs4442975 and rs16857609, and were stronger for ER-positive than ER-negative disease. Together, these results suggest that 2q35 breast cancer risk loci may be mediating their effect through IGFBP5.
Rare and low-frequency variants are not well covered in most germline genotyping arrays and are understudied in relation to epithelial ovarian cancer (EOC) risk. To address this gap, we used genotyping arrays targeting rarer protein-coding variation in 8,165 EOC cases and 11,619 controls from the international Ovarian Cancer Association Consortium (OCAC). Pooled association analyses were conducted at the variant and gene level for 98,543 variants directly genotyped through two exome genotyping projects. Only common variants that represent or are in strong linkage disequilibrium (LD) with previously identified signals at established loci reached traditional thresholds for exome-wide significance (P P ≥ 5.0 × 10(-7)) were detected for rare and low-frequency variants at 16 novel loci. Four rare missense variants were identified (ACTBL2 rs73757391 (5q11.2), BTD rs200337373 (3p25.1), KRT13 rs150321809 (17q21.2) and MC2R rs104894658 (18p11.21)), but only MC2R rs104894668 had a large effect size (OR = 9.66). Genes most strongly associated with EOC risk included ACTBL2 (PAML = 3.23 × 10(-5); PSKAT-o = 9.23 × 10(-4)) and KRT13 (PAML = 1.67 × 10(-4); PSKAT-o = 1.07 × 10(-5)), reaffirming variant-level analysis. In summary, this large study identified several rare and low-frequency variants and genes that may contribute to EOC susceptibility, albeit with possible small effects. Future studies that integrate epidemiology, sequencing, and functional assays are needed to further unravel the unexplained heritability and biology of this disease.
We recently identified ten novel SLE susceptibility loci in Asians and uncovered several additional suggestive loci requiring further validation. This study aimed to replicate five of these suggestive loci in a Han Chinese cohort from Hong Kong, followed by meta-analysis (11,656 cases and 23,968 controls) on previously reported Asian and European populations, and to perform bioinformatic analyses on all 82 reported SLE loci to identify shared regulatory signatures. We performed a battery of analyses for these five loci, as well as joint analyses on all 82 SLE loci. All five loci passed genome-wide significance: MYNN (rs10936599, Pmeta = 1.92 × 10(-13), OR = 1.14), ATG16L2 (rs11235604, Pmeta = 8.87 × 10(-12), OR = 0.78), CCL22 (rs223881, Pmeta = 5.87 × 10(-16), OR = 0.87), ANKS1A (rs2762340, Pmeta = 4.93 × 10(-15), OR = 0.87) and RNASEH2C (rs1308020, Pmeta = 2.96 × 10(-19), OR = 0.84) and co-located with annotated gene regulatory elements. The novel loci share genetic signatures with other reported SLE loci, including effects on gene expression, transcription factor binding, and epigenetic characteristics. Most (56%) of the correlated (r(2) > 0.8) SNPs from the 82 SLE loci were implicated in differential expression (9.81 × 10(-198)
Human RBMY1 genes are located in four variable-sized clusters on the Y chromosome, expressed in male germ cells and possibly associated with sperm motility. We have re-investigated the mutational background and evolutionary history of the RBMY1 copy number distribution in worldwide samples and its relevance to sperm parameters in an Estonian cohort of idiopathic male factor infertility subjects. We estimated approximate RBMY1 copy numbers in 1218 males from the 1000 Genomes Project phase 3 using sequencing read-depth, then chose 14 for validation by multicolour fibre-FISH. These fibre-FISH samples provided accurate calibration standards for the entire panel and led to detailed insights into population variation and mutational mechanisms. RBMY1 copy number worldwide ranged from 3 to 13 with a mode of 8. The two larger proximal clusters were the most variable, and additional duplications, deletions and inversions were detected. Placing the copy number estimates onto the published Y-SNP-based phylogeny of the same samples suggested a minimum of 562 mutational changes, translating to a mutation rate of 2.20 × 10(-3) (95% CI 1.94 × 10(-3) to 2.48 × 10(-3)) per father-to-son Y-transmission, higher than that of many short tandem repeats (Y-STRs), and showed no evidence for selection for increased or decreased copy number, but possible copy number stabilizing selection. An analysis of RBMY1 copy numbers among 376 infertility subjects failed to replicate a previously reported association with sperm motility and showed no significant effect on sperm count and concentration, serum follicle stimulating hormone (FSH), luteinizing hormone (LH) and testosterone levels or testicular and semen volume. These results provide the first in-depth insights into the structural rearrangements underlying RBMY1 copy number variation across diverse human lineages.