DATA DESCRIPTION: The transcriptomes of trunk tissues from healthy A. malaccensis trees and from naturally and artificially induced trees were sequenced on the Illumina HiSeq™ 4000 platform, yielding a total of 38.4 Gb of clean reads with a Q30 rate of at least 91%. The transcriptome consists of 85,986 unigenes with an average length of 1,305 bases, which were annotated against several databases. Of these, 44,654 unigenes were mapped to 290 metabolic pathways in the Kyoto Encyclopedia of Genes and Genomes database. These data represent a considerable contribution to the available Aquilaria transcriptome data and enhance current understanding of the molecular mechanisms underlying agarwood formation in Aquilaria spp.
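For readers unfamiliar with the Q30 metric quoted above, the following is a minimal sketch of how such a rate can be computed from FASTQ quality strings. It assumes Phred+33 encoding, and the function name `q30_rate` and the example strings are hypothetical. It is not the pipeline used to produce the reported figure.

```python
def q30_rate(quality_strings, offset=33):
    """Return the fraction of bases whose Phred score is at least 30.

    Assumes Phred+33 encoded quality strings (an assumption, not taken
    from the data description).
    """
    total = 0
    passing = 0
    for qual in quality_strings:
        for ch in qual:
            total += 1
            # Decode the ASCII character into a Phred quality score.
            if ord(ch) - offset >= 30:
                passing += 1
    return passing / total if total else 0.0


# Illustrative quality strings only; 'I' is Q40, '?' is Q30, '5' is Q20, '#' is Q2.
example_rate = q30_rate(["IIII5", "??##I"])
print(f"Q30 rate: {example_rate:.0%}")
```

A rate of at least 91%, as reported, means that at least 91 of every 100 base calls have an estimated error probability of 1 in 1,000 or lower.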
METHODS: Two parental E. guineensis individuals and 23 of their F1 progeny were sampled and sequenced using next-generation sequencing (NGS) on the Illumina platform. Chloroplast genomes were assembled de novo from the cleaned reads and aligned to check for variation. The sequences were compared and analyzed using scripts and relevant bioinformatics software. Simple sequence repeat (SSR) loci were identified in the chloroplast genome.
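The SSR identification step can be illustrated with a short regular-expression scan. This is a minimal sketch only: the methods do not name the tool or the thresholds used, so the minimum repeat counts in `MIN_REPEATS` and the function name `find_ssrs` are assumptions chosen for illustration.

```python
import re

# Hypothetical thresholds: motif length -> minimum number of tandem repeats.
MIN_REPEATS = {1: 10, 2: 6, 3: 5, 4: 5, 5: 5, 6: 5}


def find_ssrs(seq, min_repeats=MIN_REPEATS):
    """Return a sorted list of (start, end, motif, repeat_count) tuples.

    Coordinates are 0-based and end-exclusive.
    """
    seq = seq.upper()
    hits = []
    for unit_len, min_rep in sorted(min_repeats.items()):
        # A motif of unit_len bases followed by at least min_rep - 1 copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (unit_len, min_rep - 1))
        for m in pattern.finditer(seq):
            motif = m.group(1)
            # Skip motifs that are repeats of a shorter unit (e.g. "AA"),
            # since those runs belong to the shorter motif class.
            if any(
                motif == motif[:k] * (unit_len // k)
                for k in range(1, unit_len)
                if unit_len % k == 0
            ):
                continue
            repeats = (m.end() - m.start()) // unit_len
            hits.append((m.start(), m.end(), motif, repeats))
    return sorted(hits)


# Toy sequence with one mononucleotide and one dinucleotide repeat.
toy = "GC" + "A" * 12 + "GCGC" + "AT" * 7 + "CCG"
for start, end, motif, repeats in find_ssrs(toy):
    print(start, end, motif, repeats)
```

Recording coordinates alongside each motif makes it possible to intersect the hits with gene annotations afterwards, for example to count how many SSRs fall in intergenic spacers.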
RESULTS: The assembled chloroplast genomes were 156,983 bp, 156,988 bp, 156,982 bp, and 156,984 bp in length. Gene content and arrangement were consistent with the reference genome published in the GenBank database. Seventy-eight SSRs were detected in the chloroplast genome, with most located in intergenic spacer regions. The chloroplast genomes of 17 F1 progeny were exact copies of that of the maternal parent, while six individuals showed a single sequence variation. Despite the significant variation displayed by the male parent, all the nucleotide variations were synonymous. This study shows highly conserved gene content and sequence in Elaeis guineensis chloroplast genomes. Maternal inheritance of the chloroplast genome among F1 progeny is robust, with a low possibility of mutations over generations. These findings can shed light on the inheritance pattern of the Elaeis guineensis chloroplast genome, especially for crop scientists who are considering using the chloroplast genome for agronomic trait modification.
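To make the term "synonymous" concrete, the sketch below classifies a single-base substitution in a coding sequence by comparing the amino acid encoded before and after the change. It uses the standard codon assignments; the function name `is_synonymous` and the toy sequence are hypothetical, and this is not the analysis script used in the study.

```python
# Standard codon table built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def is_synonymous(cds, pos, alt):
    """Return True if replacing cds[pos] with alt leaves the amino acid unchanged.

    cds is an in-frame coding sequence; pos is a 0-based offset into it.
    """
    cds = cds.upper()
    codon_start = (pos // 3) * 3
    ref_codon = cds[codon_start:codon_start + 3]
    # Rebuild the codon with the alternative base at the affected position.
    offset = pos - codon_start
    alt_codon = ref_codon[:offset] + alt.upper() + ref_codon[offset + 1:]
    return CODON_TABLE[ref_codon] == CODON_TABLE[alt_codon]


# Toy coding sequence: ATG GCT GAA encodes Met-Ala-Glu.
print(is_synonymous("ATGGCTGAA", 5, "C"))  # GCT -> GCC, both alanine
print(is_synonymous("ATGGCTGAA", 3, "A"))  # GCT -> ACT, alanine -> threonine
```

Third-position changes such as GCT to GCC are often synonymous because of codon degeneracy, which is why sequence variation at the nucleotide level need not alter the encoded proteins.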
RESULTS: One of the samples was sequenced with sufficient yield for further analysis. After reads mapping to host DNA were removed, the remaining reads were shown to map to Theileria orientalis using BLAST and OneCodex. Although reads also mapped to Clostridium botulinum, these were found to be artifacts derived from the cow genome. A consensus sequence was successfully constructed using a reference-based approach with Pomoxis. We therefore conclude that the asymptomatic cow may have been infected with T. orientalis, and we demonstrate the usefulness of sequencing technology, specifically the MinION platform, in a developing country.
METHODS: A total of 322 samples of mainly human origin were analysed using eight protocols, applying a wide variety of laboratory components. Several samples (60% of human specimens) were processed using different protocols. In total, 712 sequencing libraries were investigated for viral sequence contamination.
RESULTS: Among sequences showing similarity to viruses, 493 were significantly associated with the use of laboratory components. Each of these viral sequences appeared sporadically, being identified in only a subset of the samples treated with the linked laboratory component, and some were not identified in the non-template control samples. Remarkably, more than 65% of all viral sequences identified fell within viral clusters linked to the use of laboratory components.
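One way to test whether a viral sequence is associated with a laboratory component is a one-sided Fisher's exact test on a 2x2 table of libraries. The abstract does not state which statistical test was applied, so the choice of test, the function name `fisher_exact_greater`, and the counts below are all assumptions made for illustration.

```python
from math import comb


def fisher_exact_greater(a, b, c, d):
    """One-sided Fisher's exact p-value for the 2x2 table [[a, b], [c, d]].

    Rows: libraries prepared with / without the component.
    Columns: viral sequence detected / not detected.
    Returns P(X >= a) under the hypergeometric null of no association.
    """
    row1 = a + b          # libraries prepared with the component
    col1 = a + c          # libraries in which the sequence was detected
    n = a + b + c + d     # all libraries
    denom = comb(n, col1)
    upper = min(row1, col1)
    # Sum the probabilities of tables at least as extreme as the observed one.
    numer = sum(
        comb(row1, k) * comb(n - row1, col1 - k)
        for k in range(a, upper + 1)
    )
    return numer / denom


# Hypothetical counts: sequence found in 3 of 4 libraries made with the
# component and in 1 of 4 libraries made without it.
print(fisher_exact_greater(3, 1, 1, 3))
```

With many candidate sequences and components tested at once, p-values of this kind would normally need a multiple-testing correction before any association is called significant.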
CONCLUSIONS: We show that a high prevalence of contaminating viral sequences can be expected in HTS-based virome data and provide an extensive list of novel contaminating viral sequences that can be used to evaluate viral findings in future virome and metagenome studies. Moreover, we show that detection can be problematic because of stochastic appearance and limited non-template controls. Although the exact origin of these viral sequences requires further research, our results support laboratory-component-linked viral sequence contamination of both biological and synthetic origin.