METHODS: Fourteen datasets extracted from three published papers were used in a meta-analysis to examine the cyclic behaviour of the Arabidopsis thaliana photosynthesis-related gene CAB2 and the clock oscillator genes TOC1 and LHY in T cycles and N-H cycles.
KEY RESULTS: Changes in the rhythms of CAB2, TOC1 and LHY in plants subjected to non-24-h light:dark cycles matched the hypothesized changes in their behaviour as predicted by the solar clock model, thus validating it. The analysis further showed that TOC1 expression peaked ∼5·5 h after mid-day, CAB2 peaked close to noon, while LHY peaked ∼7·5 h after midnight, regardless of the cycle period, the photoperiod or the light:dark period ratio. The solar clock model correctly predicted the zeitgeber timing of these genes under 11 different lighting regimes comprising combinations of seven light periods, nine dark periods, four cycle periods and four light:dark period ratios. In short cycles that terminated before LHY could be expressed, the solar clock correctly predicted zeitgeber timing of its expression in the following cycle.
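The timing rule reported above can be written out as a short calculation. The following is a minimal sketch, not the authors' implementation. It assumes that lights-on defines zeitgeber time (ZT) 0, that "mid-day" is the midpoint of the light period and "midnight" the midpoint of the dark period, and it uses the approximate offsets quoted above (5.5 h and 7.5 h). The function names `predicted_peaks_zt` and `cycle_and_zt` are hypothetical.

```python
def predicted_peaks_zt(light_h, dark_h):
    """Return approximate peak times in hours after lights-on.

    Assumptions (not stated in the source): ZT 0 is lights-on, mid-day is
    the midpoint of the light period, and midnight is the midpoint of the
    dark period that follows it.
    """
    midday = light_h / 2.0
    midnight = light_h + dark_h / 2.0
    return {
        "CAB2": midday,         # peaks close to noon
        "TOC1": midday + 5.5,   # about 5.5 h after mid-day
        "LHY": midnight + 7.5,  # about 7.5 h after midnight
    }


def cycle_and_zt(hours_after_lights_on, light_h, dark_h):
    """Split a time into (number of whole cycles elapsed, ZT within the cycle)."""
    period = light_h + dark_h
    cycles, zt = divmod(hours_after_lights_on, period)
    return int(cycles), zt
```

Under these assumptions, a cycle index of 1 returned by `cycle_and_zt` means the predicted peak falls in the cycle after the one in which the reference midnight occurred, which is the situation described for LHY in short cycles.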
CONCLUSIONS: Regulation of gene phases by the solar clock enables the plant to tell the time, by which means a large number of genes are regulated. This facilitates the initiation of gene expression even before the arrival of sunrise, sunset or noon, thus allowing the plant to 'anticipate' dawn, dusk or mid-day respectively, independently of the photoperiod.
FINDINGS: We optimized the assembly of a Hevea bark transcriptome from 16 Gb of Illumina paired-end (PE) RNA-Seq reads using the Oases assembler across a range of k-mer sizes. We then assessed assembly quality using transcript N50 length and transcript mapping statistics in relation to (a) known Hevea cDNAs with complete open reading frames, (b) a set of core eukaryotic genes and (c) Hevea genome scaffolds. This was followed by a systematic transcript mapping process in which sub-assemblies from a series of incremental amounts of bark transcripts were aligned to transcripts from the entire bark transcriptome assembly. The exercise served to relate the amount of reads to the degree of transcript mapping, the latter being an indicator of the coverage of gene transcripts expressed in the sample. As the amount of reads (data size) increased toward 16 Gb, the number of transcripts mapped to the entire bark assembly approached saturation. A colour matrix was subsequently generated to illustrate the sequencing depth required in relation to the degree of coverage of total sample transcripts.
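For readers unfamiliar with the N50 statistic used above as a quality measure, the following is a minimal sketch of the standard definition. It is an illustration only: the function name `n50` is hypothetical, and it is not the tool used in the study.

```python
def n50(lengths):
    """Return the N50 of a collection of transcript lengths.

    N50 is the length L such that transcripts of length >= L together
    contain at least half of the total assembled bases.
    """
    ordered = sorted(lengths, reverse=True)
    if not ordered:
        raise ValueError("no transcript lengths given")
    half_total = sum(ordered) / 2.0
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length
```

Comparing `n50` across assemblies built with different k-mer sizes is one simple way to rank them, although a higher N50 alone does not guarantee a more accurate assembly, which is why mapping statistics are assessed alongside it.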
CONCLUSIONS: We devised a procedure, the "transcript mapping saturation test", to estimate the amount of RNA-Seq reads needed for deep coverage of transcriptomes. For Hevea de novo assembly, we propose generating between 5 and 8 Gb of reads, with which around 90% transcript coverage could be achieved with optimized k-mers and transcript N50 length. The principle behind this methodology may also be applied to other non-model plants, or to reads from other second-generation sequencing platforms.
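The core bookkeeping of such a saturation test can be sketched as follows. This is a hedged illustration under stated assumptions, not the authors' pipeline: it assumes the alignment step has already been run and that, for each sub-assembly, we have the IDs of full-assembly transcripts matched by at least one sub-assembly transcript. How a match is defined (aligner, identity and length thresholds) is not specified here. The function names are hypothetical.

```python
def mapping_fraction(sub_assembly_hits, reference_ids):
    """Fraction of full-assembly transcripts matched by a sub-assembly."""
    reference = set(reference_ids)
    if not reference:
        raise ValueError("reference transcript set is empty")
    return len(reference & set(sub_assembly_hits)) / len(reference)


def smallest_sufficient_size(fractions_by_gb, target=0.90):
    """Smallest data size (Gb) whose mapping fraction reaches the target.

    Returns None if no tested data size reaches the target.
    """
    for size_gb in sorted(fractions_by_gb):
        if fractions_by_gb[size_gb] >= target:
            return size_gb
    return None
```

Given mapping fractions computed for a series of increasing data sizes, `smallest_sufficient_size` picks out the point at which a chosen coverage level (90% by default, matching the figure quoted above) is first reached.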
RESULTS: Using two independent gene-prediction pipelines, Fgenesh++ and Seqping, 26,059 oil palm genes with transcriptome and RefSeq support were identified from the oil palm genome. These coding regions of the genome have a characteristic broad distribution of GC3 (fraction of cytosine and guanine in the third position of a codon), with over half the GC3-rich genes (GC3 ≥ 0.75286) being intronless. In comparison, only one-seventh of the oil palm genes identified are intronless. Using comparative genomics analysis, characterization of conserved domains and active sites, and expression analysis, 42 key genes involved in fatty acid (FA) biosynthesis in oil palm were identified. For three of them, namely EgFABF, EgFABH and EgFAD3, segmental duplication events were detected. Our analysis also identified 210 candidate resistance genes in six classes, grouped by their protein domain structures.
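The GC3 measure defined above is straightforward to compute from a coding sequence. The following is a minimal sketch, assuming the input is an in-frame coding sequence starting at the first codon position; how the study handled stop codons or ambiguous bases is not stated, so here every complete codon is counted and any base other than G or C is treated as non-GC. The function name `gc3` is hypothetical.

```python
def gc3(cds):
    """Fraction of G or C at third codon positions of a coding sequence."""
    seq = cds.upper()
    usable = len(seq) - len(seq) % 3  # ignore a trailing partial codon
    third = seq[2:usable:3]           # every third base, starting at position 3
    if not third:
        raise ValueError("sequence contains no complete codon")
    return sum(base in "GC" for base in third) / len(third)
```

A gene could then be labelled GC3-rich with a check such as `gc3(cds) >= 0.75286`, using the threshold quoted above.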
CONCLUSIONS: We present an accurate and comprehensive annotation of the oil palm genome, focusing on analysis of important categories of genes (GC3-rich and intronless), as well as those associated with important functions, such as FA biosynthesis and disease resistance. The study demonstrated the advantages of having an integrated approach to gene prediction and developed a computational framework for combining multiple genome annotations. These results, available in the oil palm annotation database (http://palmxplore.mpob.gov.my), will provide important resources for studies on the genomes of oil palm and related crops.
REVIEWERS: This article was reviewed by Alexander Kel, Igor Rogozin, and Vladimir A. Kuznetsov.