FINDINGS: We present a draft genome assembly that includes 200 Gb of Illumina reads, 4 Gb of Moleculo synthetic long reads, and 108 Gb of Chicago libraries, with a final size matching the estimated genome size of 2.7 Gb and a scaffold N50 of 4.8 Mb. We also present an alternative assembly that includes 27 Gb of raw reads generated using the Pacific Biosciences platform. In addition, we sequenced the proteome of the same individual and RNA from 3 different tissue types from 3 other species of squid (Onychoteuthis banksii, Dosidicus gigas, and Sthenoteuthis oualaniensis) to assist genome annotation. We annotated 33,406 protein-coding genes supported by evidence, and the genome completeness estimated by BUSCO reached 92%. Repetitive regions cover 49.17% of the genome.
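The scaffold N50 quoted above is a standard contiguity statistic: the length of the scaffold at which the cumulative length, counted from the longest scaffold downward, first reaches half of the total assembly size. A minimal sketch of that calculation follows; the function name `n50` and the toy scaffold lengths are illustrative assumptions, not data from the assembly.

```python
def n50(lengths):
    """Return the N50 of a collection of scaffold (or contig) lengths."""
    ordered = sorted(lengths, reverse=True)  # longest scaffold first
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        # The first scaffold that pushes the running sum to half the
        # assembly size defines the N50.
        if running >= half_total:
            return length
    raise ValueError("lengths must be non-empty")


# Hypothetical toy lengths (arbitrary units), not taken from the assembly.
toy_scaffolds = [80, 70, 50, 40, 30, 20, 10]  # total 300, half 150
print(n50(toy_scaffolds))  # prints 70
```

In the toy example the two longest scaffolds (80 + 70) already cover half of the total, so the N50 is 70.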
CONCLUSIONS: This annotated draft genome of A. dux provides a critical resource to investigate the unique traits of this species, including its gigantism and key adaptations to deep-sea environments.
MATERIALS AND METHODS: The EGFR intron 1 polymorphism was analysed in healthy subjects from three distinct Asian populations, namely Chinese (N = 96), Malays (N = 98) and Indians (N = 100). Comparative genomic hybridisation was performed to investigate changes in DNA copy number in relation to the polymorphic CA dinucleotide repeats in breast tumor tissues (N = 22).
RESULTS: Short alleles with 14 and 15 CA repeats were the most common in the Asian populations, and their frequencies were significantly higher than those reported for Caucasians. The frequency of the allele with 20 CA repeats was 5%, almost 13-fold lower than previously reported. EGFR amplifications were detected in 23% and 11% of breast tumor tissues harboring short and long CA repeats, respectively.
CONCLUSION: Our results show that alleles with short CA dinucleotide repeats are common in Asian populations. EGFR expression and amplification levels were also higher in Asian breast tumor tissues with short CA dinucleotide repeats. These findings suggest that the EGFR intron 1 polymorphism may influence response to treatment with tyrosine kinase inhibitors in breast cancer patients, and further studies are warranted.
METHODS: To discover novel pancreatic cancer risk loci and possible causal genes, we performed a pancreatic cancer transcriptome-wide association study in Europeans using three approaches: FUSION, MetaXcan, and Summary-MulTiXcan. We integrated genome-wide association studies summary statistics from 9040 pancreatic cancer cases and 12 496 controls, with gene expression prediction models built using transcriptome data from histologically normal pancreatic tissue samples (NCI Laboratory of Translational Genomics [n = 95] and Genotype-Tissue Expression v7 [n = 174] datasets) and data from 48 different tissues (Genotype-Tissue Expression v7, n = 74-421 samples).
RESULTS: We identified 25 genes whose genetically predicted expression was statistically significantly associated with pancreatic cancer risk (false discovery rate < .05), including 14 candidate genes at 11 novel loci (1p36.12: CELA3B; 9q31.1: SMC2, SMC2-AS1; 10q23.31: RP11-80H5.9; 12q13.13: SMUG1; 14q32.33: BTBD6; 15q23: HEXA; 15q26.1: RCCD1; 17q12: PNMT, CDK12, PGAP3; 17q22: SUPT4H1; 18q11.22: RP11-888D10.3; and 19p13.11: PGPEP1) and 11 at six known risk loci (5p15.33: TERT, CLPTM1L, ZDHHC11B; 7p14.1: INHBA; 9q34.2: ABO; 13q12.2: PDX1; 13q22.1: KLF5; and 16q23.1: WDR59, CFDP1, BCAR1, TMEM170A). The association for 12 of these genes (CELA3B, SMC2, and PNMT at novel risk loci and TERT, CLPTM1L, INHBA, ABO, PDX1, KLF5, WDR59, CFDP1, and BCAR1 at known loci) remained statistically significant after Bonferroni correction.
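The results above report two significance thresholds: a false discovery rate below .05 and the stricter Bonferroni correction. The sketch below illustrates why fewer genes survive the second one. It assumes the Benjamini-Hochberg procedure for the false discovery rate, which the abstract does not specify, and the p-values are hypothetical.

```python
def bonferroni(pvalues):
    """Bonferroni-adjusted p-values: multiply by the number of tests, cap at 1."""
    m = len(pvalues)
    return [min(1.0, p * m) for p in pvalues]


def benjamini_hochberg(pvalues):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])  # indices by ascending p
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so that adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical gene-level p-values, not taken from the study.
toy_p = [0.001, 0.012, 0.02, 0.03, 0.5]
n_fdr = sum(q < 0.05 for q in benjamini_hochberg(toy_p))
n_bonf = sum(p < 0.05 for p in bonferroni(toy_p))
print(n_fdr, n_bonf)  # prints 4 1
```

With these toy values, four tests pass the false discovery rate threshold but only one remains significant after Bonferroni correction, mirroring the pattern of 25 versus 12 genes reported above.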
CONCLUSIONS: By integrating gene expression and genotype data, we identified novel pancreatic cancer risk loci and candidate functional genes that warrant further investigation.
RESULTS: First, the expression profiles of Na+/K+/2Cl- cotransporter, chloride channel protein 2, and ABC transporter suggested that 24 h was the time point most affected by short-term stress. Megalopae held at different salinities for 24 h were therefore collected and subjected to mRNA profiling. In total, 57.87 Gb of clean data were obtained. The comparative genomic analysis detected 342 differentially expressed genes (DEGs). The most significant DEGs included gamma-butyrobetaine dioxygenase-like, facilitated trehalose transporter Tret1, sodium/potassium-transporting ATPase subunit alpha, and rhodanese 1-like protein. The significantly enriched pathways were lysine degradation, choline metabolism in cancer, phospholipase D signaling pathway, Fc gamma R-mediated phagocytosis, and sphingolipid signaling pathway. The results indicate that under short-term salinity stress, the megalopa may regulate mechanisms such as metabolism, immune responses, and osmoregulation to adapt to changes in the environment.
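Pathway enrichment of a DEG list, as reported above, is commonly assessed with a hypergeometric (one-sided Fisher) test. The abstract does not state which test was used, so the following is only a sketch of that common approach. Apart from the 342 DEGs, every number below (background size, pathway size, overlap) is a hypothetical placeholder.

```python
from math import comb


def hypergeom_enrichment_p(n_background, n_pathway, n_degs, n_overlap):
    """P(X >= n_overlap) when n_degs genes are drawn without replacement
    from n_background genes, of which n_pathway belong to the pathway."""
    upper = min(n_pathway, n_degs)
    total = comb(n_background, n_degs)
    tail = sum(
        comb(n_pathway, k) * comb(n_background - n_pathway, n_degs - k)
        for k in range(n_overlap, upper + 1)
    )
    return tail / total


# Assumed values for illustration only: 20,000 background genes,
# a 50-gene pathway, and 6 of the 342 DEGs falling in that pathway.
p_value = hypergeom_enrichment_p(20000, 50, 342, 6)
print(f"enrichment p-value: {p_value:.3g}")
```

In practice the resulting p-values would also be corrected for the number of pathways tested.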
CONCLUSIONS: This study represents the first genome-wide transcriptome analysis of S. paramamosain megalopa for studying its stress adaptation mechanisms under different salinities. The results reveal a number of genes whose expression is altered by salinity stress, as well as some important pathways. These will provide valuable resources for discovering the molecular basis of salinity stress adaptation in S. paramamosain larvae and will further the understanding of the potential molecular mechanisms of salinity stress adaptation in crustacean species.
RESULTS: In this study, we propose the Context Based Dependency Network (CBDN), a method that infers gene regulatory networks, including the regulatory directions, from gene expression data alone. To determine the regulatory direction, CBDN computes the influence of a source gene on a target gene by evaluating how much the expression dependencies between the target gene and the other genes change when conditioning on the source gene. CBDN extends the data processing inequality by incorporating the dependency direction to distinguish between direct and transitive relationships between genes. We also define two types of important regulators, which can influence a majority of the genes in the network directly or indirectly. CBDN can detect both types by averaging the influence functions of a candidate regulator on the other genes. In our experiments with simulated and real data, even with the regulatory direction taken into account, CBDN outperforms the state-of-the-art approaches for inferring gene regulatory networks. CBDN identifies the important regulators in the predicted network: (1) TYROBP influences a batch of genes that are related to Alzheimer's disease; (2) ZNF329 and RB1 significantly regulate the genes of the 'mesenchymal' gene expression signature for brain tumors.
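The directional idea described above can be illustrated with a toy example: if conditioning on a source gene removes most of a target gene's dependencies on the other genes, the source is a plausible regulator of the target, while the reverse conditioning has a weaker effect. The sketch below uses Pearson and partial correlation as a stand-in dependency measure. This is an assumption made for illustration and not necessarily the measure or influence function used by CBDN; all function and gene names are hypothetical.

```python
import random
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def partial_corr(x, y, z):
    """First-order partial correlation of x and y given z."""
    rxy, rxz, ryz = pearson(x, y), pearson(x, z), pearson(y, z)
    return (rxy - rxz * ryz) / sqrt((1 - rxz ** 2) * (1 - ryz ** 2))


def influence(expr, source, target):
    """Average drop in |dependency| between the target and every other gene
    when conditioning on the source (illustrative stand-in, not CBDN itself)."""
    others = [g for g in expr if g not in (source, target)]
    drops = [
        abs(pearson(expr[target], expr[g]))
        - abs(partial_corr(expr[target], expr[g], expr[source]))
        for g in others
    ]
    return sum(drops) / len(drops)


# Simulated data (assumption): one hub gene drives three downstream genes.
rng = random.Random(0)
hub = [rng.gauss(0, 1) for _ in range(300)]
expr = {"HUB": hub}
for name in ("G1", "G2", "G3"):
    expr[name] = [h + 0.5 * rng.gauss(0, 1) for h in hub]

forward = influence(expr, "HUB", "G1")   # conditioning on the true regulator
backward = influence(expr, "G1", "HUB")  # conditioning on a downstream gene
print(f"HUB -> G1: {forward:.2f}, G1 -> HUB: {backward:.2f}")
```

In this simulation the influence of the hub on a downstream gene should exceed the influence in the reverse direction, which is the asymmetry that a direction-aware method exploits. Averaging `influence` over all targets for a fixed source would give a rough analogue of the regulator score described above.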
CONCLUSION: By leveraging gene expression data alone, CBDN can efficiently infer the existence of gene-gene interactions as well as their regulatory directions. The constructed networks are helpful in the identification of important regulators for complex diseases.