DATA DESCRIPTION: The transcriptomes of trunk tissues from healthy A. malaccensis trees and from naturally and artificially induced trees were sequenced on the Illumina HiSeq™ 4000 platform, which yielded a total of 38.4 Gb of clean reads with a Q30 rate of at least 91%. The transcriptome consists of 85,986 unigenes with an average length of 1,305 bases, which were annotated against several databases. Of these, 44,654 unigenes were mapped to 290 metabolic pathways in the Kyoto Encyclopedia of Genes and Genomes database. These transcriptome data represent a considerable contribution to the available Aquilaria transcriptome data and enhance current understanding of the molecular mechanisms underlying agarwood formation in Aquilaria spp.
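As an illustration of the Q30 rate quoted above, the following minimal sketch computes the fraction of bases with a Phred quality score of at least 30 from FASTQ quality strings. It assumes the common Phred+33 encoding; the function name `q30_rate` and the example strings are hypothetical and are not data from the study.

```python
def q30_rate(quality_strings, offset=33):
    """Return the fraction of bases whose Phred score is at least 30.

    Assumes Phred+33 encoded quality strings (one string per read).
    """
    total = 0
    high = 0
    for qual in quality_strings:
        for ch in qual:
            total += 1
            # Decode the ASCII character into a Phred quality score.
            if ord(ch) - offset >= 30:
                high += 1
    return high / total if total else 0.0


# Hypothetical quality strings for illustration only.
example = ["IIII5", "??II#"]
print(f"Q30 rate: {q30_rate(example):.0%}")
```

A Phred score of 30 corresponds to an expected base-call error probability of 1 in 1,000, so a Q30 rate of 91% means that at least 91% of bases meet that threshold.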
RESULTS: One of the samples was successfully sequenced with sufficient yield for further analysis. After the reads that mapped to host DNA were depleted, the remaining reads were shown to map to Theileria orientalis using BLAST and OneCodex. Although reads also mapped to Clostridium botulinum, these were found to be artifacts derived from the cow genome. A consensus sequence was successfully constructed using a reference-based approach with Pomoxis. Hence, we concluded that the asymptomatic cow might be infected with T. orientalis and demonstrated the usefulness of sequencing technology, specifically the MinION platform, in a developing country.
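The host-depletion step described above can be sketched as follows. This is a simplified illustration, not the authors' pipeline: it assumes the reads were aligned to the host genome and that the alignments are available as SAM-format text, where bit 0x4 of the FLAG field marks an unmapped read. The function name `non_host_read_ids` and the example records are hypothetical.

```python
def non_host_read_ids(sam_lines):
    """Return sorted IDs of reads with no alignment to the host reference."""
    unmapped = set()
    mapped = set()
    for line in sam_lines:
        if line.startswith("@"):
            continue  # skip SAM header lines
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 2:
            continue  # skip malformed records
        read_id, flag = fields[0], int(fields[1])
        if flag & 0x4:  # 0x4: segment unmapped
            unmapped.add(read_id)
        else:
            mapped.add(read_id)
    # A read is kept only if none of its records aligned to the host.
    return sorted(unmapped - mapped)


# Hypothetical SAM records for illustration only.
example = [
    "@SQ\tSN:chr1\tLN:1000",
    "read1\t0\tchr1\t100\t60\t4M\t*\t0\t0\tACGT\tIIII",
    "read2\t4\t*\t0\t0\t*\t*\t0\t0\tACGT\tIIII",
    "read3\t16\tchr1\t200\t60\t4M\t*\t0\t0\tACGT\tIIII",
]
print(non_host_read_ids(example))
```

The retained read IDs would then be the ones passed on to classification, for example with BLAST.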
METHODS: A total of 322 samples of mainly human origin were analysed using eight protocols, applying a wide variety of laboratory components. Several samples (60% of human specimens) were processed using different protocols. In total, 712 sequencing libraries were investigated for viral sequence contamination.
RESULTS: Among sequences showing similarity to viruses, 493 were significantly associated with the use of laboratory components. Each of these viral sequences appeared sporadically, being identified only in a subset of the samples treated with the linked laboratory component, and some were not identified in the non-template control samples. Remarkably, more than 65% of all viral sequences identified were within viral clusters linked to the use of laboratory components.
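To illustrate what an association between a viral sequence and a laboratory component can look like statistically, the sketch below applies a one-sided Fisher's exact test to a 2x2 table of samples. The abstract does not state which test was used, so this is an assumed stand-in; the function name `fisher_exact_greater` and the counts are hypothetical.

```python
from math import comb


def fisher_exact_greater(a, b, c, d):
    """One-sided Fisher's exact test p-value for the 2x2 table [[a, b], [c, d]].

    a: samples processed with the component, sequence detected
    b: samples processed with the component, sequence not detected
    c: samples processed without the component, sequence detected
    d: samples processed without the component, sequence not detected
    Returns the probability of observing a or more detections in the
    component group under the hypergeometric null model.
    """
    row1 = a + b
    row2 = c + d
    col1 = a + c
    denom = comb(row1 + row2, col1)
    k_max = min(row1, col1)
    numer = sum(
        comb(row1, k) * comb(row2, col1 - k) for k in range(a, k_max + 1)
    )
    return numer / denom


# Hypothetical counts: the sequence is detected in only 8 of 20 samples
# processed with the component (sporadic), and in 1 of 30 other samples.
p_value = fisher_exact_greater(8, 12, 1, 29)
print(f"one-sided p-value: {p_value:.4g}")
```

The example reflects the sporadic pattern described above: a sequence can be absent from most samples treated with a component and still be significantly enriched in that group.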
CONCLUSIONS: We show that a high prevalence of contaminating viral sequences can be expected in HTS-based virome data and provide an extensive list of novel contaminating viral sequences that can be used to evaluate viral findings in future virome and metagenome studies. Moreover, we show that detection can be problematic due to their stochastic appearance and limited non-template controls. Although the exact origin of these viral sequences requires further research, our results support laboratory-component-linked viral sequence contamination of both biological and synthetic origin.
MATERIALS AND METHODS: The study material comprised samples of biological substrates (leukocyte mass, saliva, urine) taken from patients who had undergone liver or kidney transplantation. CMV DNA was detected by real-time PCR using commercial AmpliSense CMV-FL diagnostic test systems (Central Research Institute for Epidemiology, Moscow, Russia). DNA was extracted using DNA-sorb AM and DNA-sorb V kits (Central Research Institute for Epidemiology) in accordance with the manufacturer's manual. The quality of the DNA library prepared for sequencing was assessed with the QIAxcel Advanced capillary gel electrophoresis system (QIAGEN, Germany). Nucleotide sequences were aligned and assembled using CLC Genomics Workbench 5.5 software (CLC bio, USA). The sequencing results were analyzed using BLAST on the NCBI server.
RESULTS: CMV DNA samples were selected for genotyping. Two variable genes, UL55(gB) and UL73(gN), were used for CMV genotype determination, which was performed using NGS technology on a MiSeq sequencer (Illumina, USA). Based on exploratory studies and an analysis of the literature, primers for genotyping on the UL55(gB) and UL73(gN) genes were selected and the optimal PCR conditions were defined. Sequencing of the UL55(gB) and UL73(gN) gene fragments of CMV clinical isolates from solid organ recipients made it possible to determine the virus genotypes, among which gB2, gN4c, and gN4b were dominant. In some cases, an association of two or three CMV genotypes was revealed.
CONCLUSION: The application of NGS technology for genotyping cytomegalovirus strains can become one of the main methods in the molecular epidemiology of CMV infection, as it yields reliable results with a significant reduction in research time.