RESULTS: More than 15,000 partial sequences were generated from the 5' and 3' ends of clones randomly selected from an E. tenella second generation merozoite full-length cDNA library. Clustering of these sequences produced 1,529 unique transcripts (UTs). Based on the transcript assembly and subsequent primer walking, 433 full-length cDNA sequences were successfully generated. These sequences ranged in length from 441 bp to 3,083 bp, with an average size of 1,647 bp. Simple sequence repeat (SSR) analysis identified CAG as the most abundant trinucleotide motif, while codon usage analysis revealed that the ten most infrequently used codons in E. tenella are UAU, UGU, GUA, CAU, AUA, CGA, UUA, CUA, CGU and AGU. Subsequent analysis of the E. tenella complete coding sequences identified 25 putative secretory and 60 putative surface proteins, all of which are now rational candidates for development as recombinant vaccines or drug targets in the effort to control avian coccidiosis.
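The codon usage ranking mentioned above can be illustrated with a minimal sketch. It assumes in-frame coding sequences given as DNA strings and is not the procedure used in the study. The function names `codon_usage` and `rarest_codons` and the toy sequences are hypothetical. A real analysis would probably also exclude stop codons and report codons that never occur.

```python
from collections import Counter


def codon_usage(coding_sequences):
    """Count codons across in-frame coding sequences (DNA, frame starts at index 0)."""
    counts = Counter()
    for seq in coding_sequences:
        rna = seq.upper().replace("T", "U")
        # Step through the sequence in frame, ignoring any trailing partial codon.
        for i in range(0, len(rna) - len(rna) % 3, 3):
            counts[rna[i:i + 3]] += 1
    return counts


def rarest_codons(counts, n=10):
    """Return the n least-used codons among those observed, rarest first.

    Ties are broken alphabetically; codons with zero occurrences are not listed.
    """
    return sorted(counts, key=lambda codon: (counts[codon], codon))[:n]


# Toy sequences for illustration only (not E. tenella data).
toy_cds = ["ATGGCAGCAGCATAA", "ATGCAGCAGTGTTAA"]
usage = codon_usage(toy_cds)
print(rarest_codons(usage, n=3))
```

Breaking ties alphabetically keeps the ranking deterministic when several codons share the same count.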
CONCLUSIONS: This paper describes the generation and characterisation of full-length cDNA sequences from E. tenella second generation merozoites and provides new insights into the E. tenella transcriptome. The data generated will be useful for the development and validation of diagnostic and control strategies for coccidiosis and will be of value in annotation of the E. tenella genome sequence.
METHODS: The genome of RCMV ALL-03 was sequenced in order to identify its open reading frames (ORFs), compare their homology with those of other CMV strains, perform phylogenetic analysis, assign ORFs to their corresponding conserved genes, determine functional proteins, and group gene families, with the aim of obtaining fundamental knowledge of the genome.
RESULTS: The present study identified a total of 123 coding DNA sequences (CDS) in RCMV ALL-03, with 37 conserved ORF domains, as in all herpesvirus genomes. All the CDS have functions similar to those in RCMV-England, followed by RCMV-Berlin, RCMV-Maastricht, and human CMV. Phylogenetic analysis of RCMV ALL-03 based on conserved herpesvirus genes showed that the Malaysian RCMV isolate is closest to the RCMV-England and RCMV-Berlin strains, with 99% and 97% homology, respectively. It also demonstrated an evolutionary relationship between RCMV ALL-03 and other herpesvirus strains from all three subfamilies. Interestingly, the betaherpesvirus subfamily, which has been shown to be more closely related to gammaherpesviruses than to alphaherpesviruses, shares some of the functional ORFs. In addition, the arrangement of gene blocks that is conserved among herpesvirus family members was also observed in the RCMV ALL-03 genome.
CONCLUSION: Genomic analysis of RCMV ALL-03 provided an overall picture of the whole-genome organization and served as a good platform for further understanding of the divergence within the family Herpesviridae.
RESULTS: Here, we present draft genome information for five agriculturally, biologically, medicinally, and economically important underutilized plants native to Africa: Vigna subterranea, Lablab purpureus, Faidherbia albida, Sclerocarya birrea, and Moringa oleifera. Assembled genomes range in size from 217 to 654 Mb. In V. subterranea, L. purpureus, F. albida, S. birrea, and M. oleifera, we have predicted 31,707, 20,946, 28,979, 18,937, and 18,451 protein-coding genes, respectively. By further analyzing the expansion and contraction of selected gene families, we have characterized root nodule symbiosis genes, transcription factors, and starch biosynthesis-related genes in these genomes.
CONCLUSIONS: These genome data will be useful for identifying and characterizing agronomically important genes and understanding their modes of action, enabling genomics-based evolutionary studies and breeding strategies to design faster, more focused, and more predictable crop improvement programs.
FINDINGS: We optimized the assembly of a Hevea bark transcriptome based on 16 Gb of Illumina PE RNA-Seq reads using the Oases assembler across a range of k-mer sizes. We then assessed assembly quality based on transcript N50 length and transcript mapping statistics in relation to (a) known Hevea cDNAs with complete open reading frames, (b) a set of core eukaryotic genes and (c) Hevea genome scaffolds. This was followed by a systematic transcript mapping process in which sub-assemblies from a series of incremental amounts of bark transcripts were aligned to transcripts from the entire bark transcriptome assembly. The exercise served to relate read amounts to the level of transcript mapping, the latter being an indicator of the coverage of gene transcripts expressed in the sample. As the read amount (data size) increased toward 16 Gb, the number of transcripts mapped to the entire bark assembly approached saturation. A colour matrix was subsequently generated to illustrate the sequencing depth required in relation to the degree of coverage of total sample transcripts.
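For readers unfamiliar with the transcript N50 length used above as an assembly quality measure, the following minimal sketch computes it under the standard definition: the length of the shortest transcript in the smallest set of longest transcripts that together cover at least half of the total assembled bases. The function name `n50` and the example lengths are illustrative assumptions, not values from the study.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths.

    Sequences are taken from longest to shortest until their cumulative
    length reaches at least half of the total; the length of the last
    sequence taken is the N50.
    """
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length
    raise ValueError("n50() requires at least one sequence length")


# Illustrative lengths only (not Hevea data).
print(n50([10, 20, 30, 40]))
```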
CONCLUSIONS: We devised a procedure, the "transcript mapping saturation test", to estimate the amount of RNA-Seq reads needed for deep coverage of transcriptomes. For Hevea de novo assembly, we propose generating between 5 and 8 Gb of reads, with which around 90% transcript coverage could be achieved with optimized k-mers and transcript N50 length. The principle behind this methodology may also be applied to other non-model plants, or to reads from other second generation sequencing platforms.
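The final step of such a saturation test, choosing a read amount from the measured mapping levels, might be sketched as follows. This is only an illustration of the idea and assumes the mapping fraction has already been measured for each sub-assembly. The function name `smallest_sufficient_depth` is hypothetical, and the numbers are arbitrary placeholders, not results from the study.

```python
def smallest_sufficient_depth(points, target_fraction):
    """Return the smallest read amount whose mapped fraction meets the target.

    `points` is an iterable of (read_amount_gb, mapped_fraction) pairs, where
    mapped_fraction is the share of transcripts in the full assembly that are
    matched by the sub-assembly built from that read amount.
    Returns None if no sub-assembly reaches the target.
    """
    for read_amount_gb, mapped_fraction in sorted(points):
        if mapped_fraction >= target_fraction:
            return read_amount_gb
    return None


# Arbitrary placeholder values for illustration only (not measured data).
toy_points = [(8, 0.85), (1, 0.40), (4, 0.75), (2, 0.60)]
print(smallest_sufficient_depth(toy_points, target_fraction=0.70))
```

Returning `None` when the target is never reached makes it explicit that the deepest sub-assembly tested was still insufficient.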