Eimeria tenella is an intracellular protozoan parasite that infects the intestinal tracts of domestic fowl and causes coccidiosis, a serious and sometimes lethal enteritis. Eimeria falls in the same phylum (Apicomplexa) as several human and animal parasites such as Cryptosporidium, Toxoplasma, and the malaria parasite, Plasmodium. Here we report the sequencing and analysis of the first chromosome of E. tenella, a chromosome believed to carry loci associated with drug resistance and known to differ between virulent and attenuated strains of the parasite. The chromosome, which appears to be representative of the genome, is gene-dense and rich in simple-sequence repeats, many of which appear to give rise to repetitive amino acid tracts in the predicted proteins. Most striking is the segmentation of the chromosome into repeat-rich regions peppered with transposon-like elements and telomere-like repeats, alternating with repeat-free regions. Predicted genes differ in character between the two types of segment, and the repeat-rich regions appear to be associated with strain-to-strain variation.
One of the causative agents of lymphatic filariasis is the nematode parasite Brugia malayi, which requires a competent mosquito vector for its development and transmission. Armigeres subalbatus mosquitoes rapidly destroy invading B. malayi microfilariae via a defense response known as melanotic encapsulation. We have constructed a genetic linkage map for this mosquito species using RFLP markers from Aedes aegypti. This heterologous approach was possible because of the conserved nature of the coding sequences used as markers and provided an experimental framework to evaluate the hypothesis that linkage and gene order are conserved between these mosquito species. Of the 56 Ae. aegypti markers tested, 77% hybridize to genomic DNA digests of Ar. subalbatus under stringent conditions, with 53% of these demonstrating strain-specific polymorphisms. Twenty-six Ae. aegypti markers have been mapped using an F2-segregating Ar. subalbatus population derived from a cross of strains originating in Japan and Malaysia. Linear order of these marker loci is highly conserved between the two species. Only one of these markers, LF92, was not linked in the manner predicted by the Ae. aegypti map. In addition, the autosomal sex-determination locus that occurs in linkage group 1 in Ae. aegypti resides in group 3 in Ar. subalbatus. The Ar. subalbatus map provides a basic genetic context that can be utilized in further genetic studies to clarify the genetic basis of parasite resistance in this mosquito and is a necessary precursor to the identification of genome regions that carry genes that determine the encapsulation phenotype. [The composite map and sequence database information for Ae. aegypti markers can be retrieved directly from the Ae. aegypti Genome Database through the World Wide Web: http://klab.agsci.colostate.edu.]
Pangolins, unique mammals with scales over most of their body, no teeth, poor vision, and an acute olfactory system, comprise the only placental order (Pholidota) without a whole-genome map. To investigate pangolin biology and evolution, we developed genome assemblies of the Malayan (Manis javanica) and Chinese (M. pentadactyla) pangolins. Strikingly, we found that interferon epsilon (IFNE), exclusively expressed in epithelial cells and important in skin and mucosal immunity, is pseudogenized in all African and Asian pangolin species that we examined, perhaps impacting resistance to infection. We propose that scale development was an innovation that provided protection against injuries or stress and reduced pangolin vulnerability to infection. Further evidence of specialized adaptations was evident from positively selected genes involving immunity-related pathways, inflammation, energy storage and metabolism, muscular and nervous systems, and scale/hair development. Olfactory receptor gene families are significantly expanded in pangolins, reflecting their well-developed olfaction system. This study provides insights into mammalian adaptation and functional diversification, new research tools and questions, and perhaps a new natural IFNE-deficient animal model for studying mammalian immunity.
Chromosomal translocations are a genomic hallmark of many hematologic malignancies. Often as initiating events, these structural abnormalities result in fusion proteins involving transcription factors important for hematopoietic differentiation and/or signaling molecules regulating cell proliferation and cell cycle. In contrast, epigenetic regulator genes are more frequently targeted by somatic sequence mutations, possibly as secondary events to further potentiate leukemogenesis. Through comprehensive whole-transcriptome sequencing of 231 children with acute lymphoblastic leukemia (ALL), we identified 58 putative functional and predominant fusion genes in 54.1% of patients (n = 125), 31 of which have not been reported previously. In particular, we described a distinct ALL subtype with a characteristic gene expression signature predominantly driven by chromosomal rearrangements of the ZNF384 gene with histone acetyltransferases EP300 and CREBBP. ZNF384-rearranged ALL showed significant up-regulation of CLCF1 and BTLA expression, and ZNF384 fusion proteins consistently showed higher activity to promote transcription of these target genes relative to wild-type ZNF384 in vitro. Ectopic expression of EP300-ZNF384 and CREBBP-ZNF384 fusion altered differentiation of mouse hematopoietic stem and progenitor cells and also potentiated oncogenic transformation in vitro. EP300- and CREBBP-ZNF384 fusions resulted in loss of histone lysine acetyltransferase activity in a dominant-negative fashion, with concomitant global reduction of histone acetylation and increased sensitivity of leukemia cells to histone deacetylase inhibitors. In conclusion, our results indicate that gene fusion is a common class of genomic abnormalities in childhood ALL and that recurrent translocations involving EP300 and CREBBP may cause epigenetic deregulation with potential for therapeutic targeting.
Global production of chickens has trebled in the past two decades and they are now the most important source of dietary animal protein worldwide. Chickens are subject to many infectious diseases that reduce their performance and productivity. Coccidiosis, caused by apicomplexan protozoa of the genus Eimeria, is one of the most important poultry diseases. Understanding the biology of Eimeria parasites underpins development of new drugs and vaccines needed to improve global food security. We have produced annotated genome sequences of all seven species of Eimeria that infect domestic chickens, which reveal the full extent of previously described repeat-rich and repeat-poor regions and show that these parasites possess the most repeat-rich proteomes ever described. Furthermore, while no other apicomplexan has been found to possess retrotransposons, Eimeria is home to a family of chromoviruses. Analysis of Eimeria genes involved in basic biology and host-parasite interaction highlights adaptations to a relatively simple developmental life cycle and a complex array of co-expressed surface proteins involved in host cell binding.
The Singapore Genome Variation Project (SGVP) provides a publicly available resource of 1.6 million single nucleotide polymorphisms (SNPs) genotyped in 268 individuals from the Chinese, Malay, and Indian population groups in Southeast Asia. This online database catalogs information and summaries on genotype and phased haplotype data, including allele frequencies, assessment of linkage disequilibrium (LD), and recombination rates in a format similar to the International HapMap Project. Here, we introduce this resource and describe the analysis of human genomic variation upon agglomerating data from the HapMap and the Human Genome Diversity Project, providing useful insights into the population structure of the three major population groups in Asia. In addition, we surveyed the genome for variation in regional patterns of LD between the HapMap and SGVP populations, and for signatures of positive natural selection using two well-established metrics: iHS and XP-EHH. The raw and processed genetic data, together with all population genetic summaries, are publicly available for download and browsing through a web browser modeled with the Generic Genome Browser.
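As a rough illustration of two of the summaries mentioned above, allele frequency and pairwise LD, the following minimal Python sketch computes the standard r-squared statistic from phased two-site haplotypes. It is not the SGVP pipeline: the function names, the 0/1 allele coding, and the toy haplotypes are all assumptions made for this example.

```python
from typing import Sequence, Tuple

Haplotype = Tuple[int, int]  # hypothetical coding: alleles at two SNPs, each 0 or 1


def allele_frequency(haplotypes: Sequence[Haplotype], site: int) -> float:
    """Frequency of the allele coded 1 at the given site (0 or 1)."""
    return sum(h[site] for h in haplotypes) / len(haplotypes)


def ld_r_squared(haplotypes: Sequence[Haplotype]) -> float:
    """Pairwise LD as r^2 = D^2 / (pA(1-pA) pB(1-pB)), with D = pAB - pA*pB."""
    p_a = allele_frequency(haplotypes, 0)
    p_b = allele_frequency(haplotypes, 1)
    # Frequency of haplotypes carrying allele 1 at both sites.
    p_ab = sum(1 for h in haplotypes if h == (1, 1)) / len(haplotypes)
    d = p_ab - p_a * p_b
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        raise ValueError("r^2 is undefined when either site is monomorphic")
    return d * d / denom


# Toy data (invented for illustration): complete LD versus no LD.
linked = [(1, 1), (1, 1), (0, 0), (0, 0)]
unlinked = [(1, 1), (1, 0), (0, 1), (0, 0)]
print(ld_r_squared(linked), ld_r_squared(unlinked))
```

In the toy data, the first set contains only two haplotype classes, so the sites are perfectly correlated; the second contains all four classes in equal proportion, so the sites are independent.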
We report an analysis of more than 240,000 loci genotyped using the Affymetrix SNP microarray in 554 individuals from 27 worldwide populations in Africa, Asia, and Europe. To provide a more extensive and complete sampling of human genetic variation, we have included caste and tribal samples from two states in South India, Daghestanis from eastern Europe, and the Iban from Malaysia. Consistent with observations made by Charles Darwin, our results highlight shared variation among human populations and demonstrate that much genetic variation is geographically continuous. At the same time, principal components analyses reveal discernible genetic differentiation among almost all identified populations in our sample, and in most cases, individuals can be clearly assigned to defined populations on the basis of SNP genotypes. All individuals are accurately classified into continental groups using a model-based clustering algorithm, but between closely related populations, genetic and self-classifications conflict for some individuals. The 250K data permitted high-level resolution of genetic variation among Indian caste and tribal populations and between highland and lowland Daghestani populations. In particular, upper-caste individuals from Tamil Nadu and Andhra Pradesh form one defined group, lower-caste individuals from these two states form another, and the tribal Irula samples form a third. Our results emphasize the correlation of genetic and geographic distances and highlight other elements, including social factors, that have contributed to population structure.
Circular RNAs (circRNAs) are abundantly expressed in cancer. Their resistance to exonucleases enables them to have potentially stable interactions with different types of biomolecules. Alternative splicing can create different circRNA isoforms that have different sequences and unequal interaction potentials. The study of circRNA function thus requires knowledge of complete circRNA sequences. Here we describe psirc, a method that can identify full-length circRNA isoforms and quantify their expression levels from RNA sequencing data. We confirm the effectiveness and computational efficiency of psirc using both simulated and actual experimental data. Applying psirc to transcriptome profiles from nasopharyngeal carcinoma and normal nasopharynx samples, we discover and validate circRNA isoforms differentially expressed between the two groups. Compared with the assumed circular isoforms derived from linear transcript annotations, some of the alternatively spliced circular isoforms have 100 times higher expression and contain substantially fewer microRNA response elements, showing the importance of quantifying full-length circRNA isoforms.