RESULTS: We present Seqping, an automated gene prediction pipeline that uses self-training HMMs and transcriptomic data. The pipeline processes the genome and transcriptome sequences of the target species using GlimmerHMM, SNAP, and AUGUSTUS, followed by the MAKER2 program, which combines the predictions from the three tools with the transcriptomic evidence. Seqping generates species-specific HMMs that are able to offer unbiased gene predictions. The pipeline was evaluated using the Oryza sativa and Arabidopsis thaliana genomes. Benchmarking Universal Single-Copy Orthologs (BUSCO) analysis showed that the pipeline was able to identify at least 95% of BUSCO's plantae dataset. Our evaluation shows that Seqping generated better gene predictions than three HMM-based programs (MAKER2, GlimmerHMM and AUGUSTUS) run with their respective available HMMs. Seqping had the highest accuracy in rice (0.5648 for CDS, 0.4468 for exon, and 0.6695 for nucleotide structure) and in A. thaliana (0.5808 for CDS, 0.5955 for exon, and 0.8839 for nucleotide structure).
CONCLUSIONS: Seqping provides researchers with a seamless pipeline to train species-specific HMMs and predict genes in newly sequenced or less-studied genomes. We conclude that the Seqping pipeline predictions are more accurate than gene predictions from the other three approaches run with their default or available HMMs.
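The accuracy figures reported above are not defined in this abstract. As an illustration only, the sketch below assumes a common gene-prediction convention: accuracy taken as the mean of sensitivity and specificity (specificity here meaning the fraction of predicted features that are correct), computed per nucleotide and per exactly matching exon. The function names and toy coordinates are hypothetical and are not taken from Seqping.

```python
# Illustrative sketch only: the metric definitions are an assumed convention,
# not necessarily the ones used in the Seqping evaluation.

def covered_positions(intervals):
    """Expand 1-based inclusive (start, end) intervals into a set of positions."""
    positions = set()
    for start, end in intervals:
        positions.update(range(start, end + 1))
    return positions


def nucleotide_metrics(predicted, reference):
    """Return (sensitivity, specificity, mean) at the nucleotide level."""
    pred = covered_positions(predicted)
    ref = covered_positions(reference)
    true_pos = len(pred & ref)  # positions annotated in both sets
    sensitivity = true_pos / len(ref) if ref else 0.0
    specificity = true_pos / len(pred) if pred else 0.0
    return sensitivity, specificity, (sensitivity + specificity) / 2


def exon_metrics(predicted, reference):
    """Return (sensitivity, specificity, mean) counting only exact exon matches."""
    pred = set(predicted)
    ref = set(reference)
    exact = len(pred & ref)  # exons with identical start and end
    sensitivity = exact / len(ref) if ref else 0.0
    specificity = exact / len(pred) if pred else 0.0
    return sensitivity, specificity, (sensitivity + specificity) / 2


# Hypothetical toy annotation: two reference exons, two predicted exons.
reference_exons = [(1, 100), (201, 300)]
predicted_exons = [(1, 100), (251, 350)]

nucleotide_result = nucleotide_metrics(predicted_exons, reference_exons)
exon_result = exon_metrics(predicted_exons, reference_exons)
```

In this toy case the second predicted exon overlaps the reference by half its length, so it contributes to the nucleotide-level score but not to the exon-level score. This is one way exon-level values can fall below nucleotide-level values, as they do in the figures reported above.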
RESULTS: Here, we have undertaken further analysis of the role of OsFAD2-1 in the developing rice grain. Illumina-based NGS transcriptomic analysis of developing rice grain revealed that knockdown of OsFAD2-1 gene expression was accompanied by the downregulation of a number of key genes in the lipid biosynthesis pathway in the HO rice line. A slightly higher level of oil accumulation was also observed in the HO-RBO.
CONCLUSION: Prominent among the downregulated genes were those encoding FatA, LACS, SAD2, SAD5, caleosin and steroleosin. It may be possible to further increase the oleic acid content of rice oil by altering the expression of the lipid biosynthetic genes that are affected in the HO line.