Displaying publications 1 - 20 of 51 in total

  1. Cai L, Xi Z, Amorim AM, Sugumaran M, Rest JS, Liu L, et al.
    New Phytol, 2019 01;221(1):565-576.
    PMID: 30030969 DOI: 10.1111/nph.15357
    Whole-genome duplications (WGDs) are widespread and prevalent in vascular plants and frequently coincide with major episodes of global and climatic upheaval, including the mass extinction at the Cretaceous-Tertiary boundary (c. 65 Ma) and during more recent periods of global aridification in the Miocene (c. 10-5 Ma). Here, we explore WGDs in the diverse flowering plant clade Malpighiales. Using transcriptomes and complete genomes from 42 species, we applied a multipronged phylogenomic pipeline to identify, locate, and determine the age of WGDs in Malpighiales using three means of inference: distributions of synonymous substitutions per synonymous site (Ks) among paralogs, phylogenomic (gene tree) reconciliation, and a likelihood-based gene-count method. We conservatively identify 22 ancient WGDs, widely distributed across Malpighiales subclades. Importantly, these events are clustered around the Eocene-Paleocene transition (c. 54 Ma), during which time the planet was warmer and wetter than any period in the Cenozoic. These results establish that the Eocene Climatic Optimum likely represents a previously unrecognized period of prolific WGDs in plants, and lend further support to the hypothesis that polyploidization promotes adaptation and enhances plant survival during episodes of global change, especially for tropical organisms like Malpighiales, which have tight thermal tolerances.
    Matched MeSH terms: Genome, Plant*
  2. Ji YT, Xiu Z, Chen CH, Wang Y, Yang JX, Sui JJ, et al.
    Mol Ecol Resour, 2021 May;21(4):1243-1255.
    PMID: 33421343 DOI: 10.1111/1755-0998.13318
    Chinese mahogany (Toona sinensis) is a woody plant that is widely cultivated in China and Malaysia. Toona sinensis is important economically, including as a nutritious food source, as material for traditional Chinese medicine and as a high-quality hardwood. However, the absence of a reference genome has hindered in-depth molecular and evolutionary studies of this plant. In this study, we report a high-quality T. sinensis genome assembly, with scaffolds anchored to 28 chromosomes and a total assembled length of 596 Mb (contig N50 = 1.5 Mb and scaffold N50 = 21.5 Mb). A total of 34,345 genes were predicted in the genome after homology-based and de novo annotation analyses. Evolutionary analysis showed that the genomes of T. sinensis and Populus trichocarpa diverged ~99.1-103.1 million years ago, and the T. sinensis genome underwent a recent genome-wide duplication event at ~7.8 million years and one more ancient whole genome duplication event at ~71.5 million years. These results provide a high-quality chromosome-level reference genome for T. sinensis and confirm its evolutionary position at the genomic level. Such information will offer genomic resources to study the molecular mechanism of terpenoid biosynthesis and the formation of flavour compounds, which will further facilitate its molecular breeding. As the first chromosome-level genome assembled in the family Meliaceae, it will provide unique insights into the evolution of members of the Meliaceae.
    Matched MeSH terms: Genome, Plant*
  3. Ng KKS, Kobayashi MJ, Fawcett JA, Hatakeyama M, Paape T, Ng CH, et al.
    Commun Biol, 2021 10 07;4(1):1166.
    PMID: 34620991 DOI: 10.1038/s42003-021-02682-1
    Hyperdiverse tropical rainforests, such as the aseasonal forests in Southeast Asia, are supported by high annual rainfall. Its canopy is dominated by the species-rich tree family of Dipterocarpaceae (Asian dipterocarps), which has both ecological (e.g., supports flora and fauna) and economical (e.g., timber production) importance. Recent ecological studies suggested that rare irregular drought events may be an environmental stress and signal for the tropical trees. We assembled the genome of a widespread but near threatened dipterocarp, Shorea leprosula, and analyzed the transcriptome sequences of ten dipterocarp species representing seven genera. Comparative genomic and molecular dating analyses suggested a whole-genome duplication close to the Cretaceous-Paleogene extinction event followed by the diversification of major dipterocarp lineages (i.e. Dipterocarpoideae). Interestingly, the retained duplicated genes were enriched for genes upregulated by no-irrigation treatment. These findings provide molecular support for the relevance of drought for tropical trees despite the lack of an annual dry season.
    Matched MeSH terms: Genome, Plant*
  4. Cai L, Arnold BJ, Xi Z, Khost DE, Patel N, Hartmann CB, et al.
    Curr Biol, 2021 03 08;31(5):1002-1011.e9.
    PMID: 33485466 DOI: 10.1016/j.cub.2020.12.045
    Despite more than 2,000-fold variation in genome size, key features of genome architecture are largely conserved across angiosperms. Parasitic plants have elucidated the many ways in which genomes can be modified, yet we still lack comprehensive genome data for species that represent the most extreme form of parasitism. Here, we present the highly modified genome of the iconic endophytic parasite Sapria himalayana Griff. (Rafflesiaceae), which lacks a typical plant body. First, 44% of the genes conserved in eurosids are lost in Sapria, dwarfing previously reported levels of gene loss in vascular plants. These losses demonstrate remarkable functional convergence with other parasitic plants, suggesting a common genetic roadmap underlying the evolution of plant parasitism. Second, we identified extreme disparity in intron size among retained genes. This includes a category of genes with introns longer than any so far observed in angiosperms, nearing 100 kb in some cases, and a second category of genes with exceptionally short or absent introns. Finally, at least 1.2% of the Sapria genome, including both genic and intergenic content, is inferred to be derived from host-to-parasite horizontal gene transfers (HGTs) and includes genes potentially adaptive for parasitism. Focused phylogenomic reconstruction of HGTs reveals a hidden history of former host-parasite associations involving close relatives of Sapria's modern hosts in the grapevine family. Our findings offer a unique perspective into how deeply angiosperm genomes can be altered to fit an extreme form of plant parasitism and demonstrate the value of HGTs as DNA fossils to investigate extinct symbioses.
    Matched MeSH terms: Genome, Plant/genetics*
  5. Zhao H, Zhao S, International Network for Bamboo and Rattan, Fei B, Liu H, Yang H, et al.
    Gigascience, 2017 07 01;6(7):1-7.
    PMID: 28637269 DOI: 10.1093/gigascience/gix046
    Bamboo and rattan are widely grown for manufacturing, horticulture, and agroforestry. Bamboo and rattan production might help reduce poverty, boost economic growth, mitigate climate change, and protect the natural environment. Despite progress in research, sufficient molecular and genomic resources to study these species are lacking. We launched the Genome Atlas of Bamboo and Rattan (GABR) project, a comprehensive, coordinated international effort to accelerate understanding of bamboo and rattan genetics through genome analysis. GABR includes 2 core subprojects: Bamboo-T1K (Transcriptomes of 1000 Bamboos) and Rattan-G5 (Genomes of 5 Rattans), and several other subprojects. Here we describe the organization, directions, and status of GABR.
    Matched MeSH terms: Genome, Plant*
  6. Chan KL, Tatarinova TV, Rosli R, Amiruddin N, Azizi N, Halim MAA, et al.
    Biol. Direct, 2017 Sep 08;12(1):21.
    PMID: 28886750 DOI: 10.1186/s13062-017-0191-4
    BACKGROUND: Oil palm is an important source of edible oil. The importance of the crop, as well as its long breeding cycle (10-12 years) has led to the sequencing of its genome in 2013 to pave the way for genomics-guided breeding. Nevertheless, the first set of gene predictions, although useful, had many fragmented genes. Classification and characterization of genes associated with traits of interest, such as those for fatty acid biosynthesis and disease resistance, were also limited. Lipid-, especially fatty acid (FA)-related genes are of particular interest for the oil palm as they specify oil yields and quality. This paper presents the characterization of the oil palm genome using different gene prediction methods and comparative genomics analysis, identification of FA biosynthesis and disease resistance genes, and the development of an annotation database and bioinformatics tools.

    RESULTS: Using two independent gene-prediction pipelines, Fgenesh++ and Seqping, 26,059 oil palm genes with transcriptome and RefSeq support were identified from the oil palm genome. These coding regions of the genome have a characteristic broad distribution of GC3 (fraction of cytosine and guanine in the third position of a codon) with over half the GC3-rich genes (GC3 ≥ 0.75286) being intronless. In comparison, only one-seventh of the oil palm genes identified are intronless. Using comparative genomics analysis, characterization of conserved domains and active sites, and expression analysis, 42 key genes involved in FA biosynthesis in oil palm were identified. For three of them, namely EgFABF, EgFABH and EgFAD3, segmental duplication events were detected. Our analysis also identified 210 candidate resistance genes in six classes, grouped by their protein domain structures.

    CONCLUSIONS: We present an accurate and comprehensive annotation of the oil palm genome, focusing on analysis of important categories of genes (GC3-rich and intronless), as well as those associated with important functions, such as FA biosynthesis and disease resistance. The study demonstrated the advantages of having an integrated approach to gene prediction and developed a computational framework for combining multiple genome annotations. These results, available in the oil palm annotation database ( http://palmxplore.mpob.gov.my ), will provide important resources for studies on the genomes of oil palm and related crops.

    REVIEWERS: This article was reviewed by Alexander Kel, Igor Rogozin, and Vladimir A. Kuznetsov.

    Matched MeSH terms: Genome, Plant*
  7. Mazumdar P, Binti Othman R, Mebus K, Ramakrishnan N, Ann Harikrishna J
    Ann Bot, 2017 Nov 28;120(6):893-909.
    PMID: 29155926 DOI: 10.1093/aob/mcx112
    Background and Aims: Studies on codon usage in monocots have focused on grasses, and observed patterns of this taxon were generalized to all monocot species. Here, non-grass monocot species were analysed to investigate the differences between grass and non-grass monocots.

    Methods: First, studies of codon usage in monocots were reviewed. The current information was then extended regarding codon usage, as well as codon-pair context bias, using four completely sequenced non-grass monocot genomes (Musa acuminata, Musa balbisiana, Phoenix dactylifera and Spirodela polyrhiza) for which comparable transcriptome datasets are available. Measurements were taken regarding relative synonymous codon usage, effective number of codons, derived optimal codon and GC content and then the relationships investigated to infer the underlying evolutionary forces.

    Key Results: The research identified optimal codons, rare codons and preferred codon-pair context in the non-grass monocot species studied. In contrast to the bimodal distribution of GC3 (GC content in third codon position) in grasses, non-grass monocots showed a unimodal distribution. Disproportionate use of G and C (and of A and T) in two- and four-codon amino acids detected in the analysis rules out the mutational bias hypothesis as an explanation of genomic variation in GC content. There was found to be a positive relationship between CAI (codon adaptation index; predicts the level of expression of a gene) and GC3. In addition, a strong correlation was observed between coding and genomic GC content and negative correlation of GC3 with gene length, indicating a strong impact of GC-biased gene conversion (gBGC) in shaping codon usage and nucleotide composition in non-grass monocots.

    Conclusion: Optimal codons in these non-grass monocots show a preference for G/C in the third codon position. These results support the concept that codon usage and nucleotide composition in non-grass monocots are mainly driven by gBGC.

    Matched MeSH terms: Genome, Plant*
  8. Sahebi M, Hanafi MM, van Wijnen AJ, Rice D, Rafii MY, Azizi P, et al.
    Gene, 2018 Jul 30;665:155-166.
    PMID: 29684486 DOI: 10.1016/j.gene.2018.04.050
    Plants maintain extensive growth flexibility under different environmental conditions, allowing them to continuously and rapidly adapt to alterations in their environment. A large portion of many plant genomes consists of transposable elements (TEs) that create new genetic variations within plant species. Different types of mutations may be created by TEs in plants. Many TEs can avoid the host's defense mechanisms and survive alterations in transposition activity, internal sequence and target site. Thus, plant genomes are expected to utilize a variety of mechanisms to tolerate TEs that are near or within genes. TEs affect the expression of not only nearby genes but also unlinked inserted genes. TEs can create new promoters, leading to novel expression patterns or alternative coding regions to generate alternate transcripts in plant species. TEs can also provide novel cis-acting regulatory elements that act as enhancers or inserts within original enhancers that are required for transcription. Thus, the regulation of plant gene expression is strongly managed by the insertion of TEs into nearby genes. TEs can also lead to chromatin modifications and thereby affect gene expression in plants. TEs are able to generate new genes and modify existing gene structures by duplicating, mobilizing and recombining gene fragments. They can also facilitate cellular functions by sharing their transposase-coding regions. Hence, TE insertions can not only act as simple mutagens but can also alter the elementary functions of the plant genome. Here, we review recent discoveries concerning the contribution of TEs to gene expression in plant genomes and discuss the different mechanisms by which TEs can affect plant gene expression and reduce host defense mechanisms.
    Matched MeSH terms: Genome, Plant/physiology*
  9. Teh CK, Lee HL, Abidin H, Ong AL, Mayes S, Chew FT, et al.
    BMC Plant Biol, 2019 Nov 05;19(1):470.
    PMID: 31690276 DOI: 10.1186/s12870-019-2062-x
    BACKGROUND: Legitimacy in breeding and commercial crop production depends on optimised protocols to ensure purity of crosses and correct field planting of material. In oil palm, the presence of three fruit forms permits these assumptions to be tested, although only after field planting. The presence of incorrect fruit forms in a cross is a clear sign of illegitimacy. Given that tenera forms produce 30% more oil for the same weight of fruit as dura, the presence of low levels of dura contamination can have a major effect during the economic lifespan of an oil palm, which is around 25 years. We evaluated two methods for testing the legitimacy of a cross: 1) the use of SHELL markers for the gene that determines the shell-thickness trait, and 2) the use of SNP markers.

    RESULTS: Our results indicate that the SHELL markers can theoretically reduce the major losses due to dura contamination of tenera planting material. However, these markers cannot distinguish illegitimate tenera, which reduces the value of having bred elite tenera for commercial planting and in the breeding programme, where fruit form is of limited utility, and incorrect identity could lead to significant problems. We propose an optimised approach using SNPs for routine quality control.

    CONCLUSIONS: Both dura and tenera contamination can be identified and removed at or before the nursery stage. An optimised legitimacy assay using SNP markers coupled with a suitable sampling scheme is now ready to be deployed as a standard control for seed production and breeding in oil palm. The same approach will also be an effective solution for other perennial crops, such as coconut and date palm.

    Matched MeSH terms: Genome, Plant*
  10. Adedze YMN, Lu X, Xia Y, Sun Q, Nchongboh CG, Alam MA, et al.
    Sci Rep, 2021 02 16;11(1):3872.
    PMID: 33594240 DOI: 10.1038/s41598-021-83313-x
    Insertion and Deletion (InDel) are common features in genomes and are associated with genetic variation. The whole-genome re-sequencing data from two parents (X1 and X2) of the elite cucumber (Cucumis sativus) hybrid variety Lvmei No.1 was used for genome-wide InDel polymorphisms analysis. Obtained sequence reads were mapped to the genome reference sequence of Chinese fresh market type inbred line '9930' and gaps conforming to InDel were pinpointed. Further, the level of cross-parents polymorphism among five pairs of cucumber breeding parents and their corresponding hybrid varieties were used for evaluating hybrid seeds purity test efficiency of InDel markers. A panel of 48 cucumber breeding lines was utilized for PCR amplification versatility and phylogenetic analysis of these markers. In total, 10,470 candidate InDel markers were identified for X1 and X2. Among these, 385 markers with more than 30 nucleotide difference were arbitrarily chosen. These markers were selected for experimental resolvability through electrophoresis on an Agarose gel. Two hundred and eleven (211) accounting for 54.81% of markers could be validated as single and clear polymorphic pattern while 174 (45.19%) showed unclear or monomorphic genetic bands between X1 and X2. Cross-parents polymorphism evaluation recorded 68 (32.23%) of these markers, which were designated as cross-parents transferable (CPT) InDel markers. Interestingly, the marker InDel114 presented experimental transferability between cucumber and melon. A panel of 48 cucumber breeding lines including parents of Lvmei No. 1 subjected to PCR amplification versatility using CPT InDel markers successfully clustered them into fruit and common cucumber varieties based on phylogenetic analysis. It is worth noting that 16 of these markers were predominately associated to enzymatic activities in cucumber. These agarose-based InDel markers could constitute a valuable resource for hybrid seeds purity testing, germplasm classification and marker-assisted breeding in cucumber.
    Matched MeSH terms: Genome, Plant*
  11. Goh HH
    Adv Exp Med Biol, 2018 11 2;1102:69-80.
    PMID: 30382569 DOI: 10.1007/978-3-319-98758-3_5
    This chapter introduces different aspects of bioinformatics with a brief discussion in the systems biology context. Example applications in network pharmacology of traditional Chinese medicine, systems metabolic engineering, and plant genome-scale modelling are described. Lastly, this chapter concludes on how bioinformatics helps to integrate omics data derived from various studies described in previous chapters for a holistic understanding of secondary metabolite production in P. minus.
    Matched MeSH terms: Genome, Plant
  12. Keong BP, Harikrishna JA
    Biochem Genet, 2012 Feb;50(1-2):135-45.
    PMID: 22089543 DOI: 10.1007/s10528-011-9479-8
    A preliminary screening was conducted on BC3F1 and BC4F1 backcross families developed from crossing Oryza sativa (MR219) and O. rufipogon (IRGC105491). Despite earlier results showing that O. rufipogon alleles (wild introgression) contributed to both number of panicles (qPPL-2) and tillers (qTPL-2) at loci RM250, RM208, and RM48 in line A20 of the BC2F2 population, we observed that wild introgression was lost at loci RM250 and RM208 but retained at locus RM48 in BC3F1 and BC4F1. Progeny tests conducted utilizing genotype and phenotype data on both BC4F1 and a reference population, BC2F7 (A20 line), did not show significant differences between groups having the MR219 allele and wild introgression at locus RM48. This suggests that there is no additive and transgressive effect of wild introgression in the BC3F1 and BC4F1 generated. The presence of wild introgression was largely due to gene contamination by cross-pollination during field breeding practices.
    Matched MeSH terms: Genome, Plant
  13. Sablok G, Pérez-Pulido AJ, Do T, Seong TY, Casimiro-Soriguer CS, La Porta N, et al.
    Front Plant Sci, 2016;7:878.
    PMID: 27446111 DOI: 10.3389/fpls.2016.00878
    Analysis of repetitive DNA sequence content and divergence among the repetitive functional classes is a well-accepted approach for estimation of inter- and intra-generic differences in plant genomes. Among these elements, microsatellites, or Simple Sequence Repeats (SSRs), have been widely demonstrated as powerful genetic markers for species and varieties discrimination. We present PlantFuncSSRs platform having more than 364 plant species with more than 2 million functional SSRs. They are provided with detailed annotations for easy functional browsing of SSRs and with information on primer pairs and associated functional domains. PlantFuncSSRs can be leveraged to identify functional-based genic variability among the species of interest, which might be of particular interest in developing functional markers in plants. This comprehensive on-line portal unifies mining of SSRs from first and next generation sequencing datasets, corresponding primer pairs and associated in-depth functional annotation such as gene ontology annotation, gene interactions and its identification from reference protein databases. PlantFuncSSRs is freely accessible at: http://www.bioinfocabd.upo.es/plantssr.
    Matched MeSH terms: Genome, Plant
  14. Izan S, Esselink D, Visser RGF, Smulders MJM, Borm T
    Front Plant Sci, 2017;8:1271.
    PMID: 28824658 DOI: 10.3389/fpls.2017.01271
    Whole Genome Shotgun (WGS) sequences of plant species often contain an abundance of reads that are derived from the chloroplast genome. Up to now these reads have generally been identified and assembled into chloroplast genomes based on homology to chloroplasts from related species. This re-sequencing approach may select against structural differences between the genomes especially in non-model species for which no close relatives have been sequenced before. The alternative approach is to de novo assemble the chloroplast genome from total genomic DNA sequences. In this study, we used k-mer frequency tables to identify and extract the chloroplast reads from the WGS reads and assemble these using a highly integrated and automated custom pipeline. Our strategy includes steps aimed at optimizing assemblies and filling gaps which are left due to coverage variation in the WGS dataset. We have successfully de novo assembled three complete chloroplast genomes from plant species with a range of nuclear genome sizes to demonstrate the universality of our approach: Solanum lycopersicum (0.9 Gb), Aegilops tauschii (4 Gb) and Paphiopedilum henryanum (25 Gb). We also highlight the need to optimize the choice of k and the amount of data used. This new and cost-effective method for de novo short read assembly will facilitate the study of complete chloroplast genomes with more accurate analyses and inferences, especially in non-model plant genomes.
    Matched MeSH terms: Genome, Plant
  15. Mohd-Yusoff NF, Ruperao P, Tomoyoshi NE, Edwards D, Gresshoff PM, Biswas B, et al.
    G3 (Bethesda), 2015 Apr;5(4):559-67.
    PMID: 25660167 DOI: 10.1534/g3.114.014571
    Genetic structure can be altered by chemical mutagenesis, which is a common method applied in molecular biology and genetics. Second-generation sequencing provides a platform to reveal base alterations occurring in the whole genome due to mutagenesis. A model legume, Lotus japonicus ecotype Miyakojima, was chemically mutated with alkylating ethyl methanesulfonate (EMS) for the scanning of DNA lesions throughout the genome. Using second-generation sequencing, two individually mutated third-generation progeny (M3, named AM and AS) were sequenced and analyzed to identify single nucleotide polymorphisms and reveal the effects of EMS on nucleotide sequences in these mutant genomes. Single-nucleotide polymorphisms were found in every 208 kb (AS) and 202 kb (AM) with a bias mutation of G/C-to-A/T changes at low percentage. Most mutations were intergenic. The mutation spectrum of the genomes was comparable in their individual chromosomes; however, each mutated genome has unique alterations, which are useful to identify causal mutations for their phenotypic changes. The data obtained demonstrate that whole genomic sequencing is applicable as a high-throughput tool to investigate genomic changes due to mutagenesis. The identification of these single-point mutations will facilitate the identification of phenotypically causative mutations in EMS-mutated germplasm.
    Matched MeSH terms: Genome, Plant*
  16. Singh R, Ong-Abdullah M, Low ET, Manaf MA, Rosli R, Nookiah R, et al.
    Nature, 2013 Aug 15;500(7462):335-9.
    PMID: 23883927 DOI: 10.1038/nature12309
    Oil palm is the most productive oil-bearing crop. Although it is planted on only 5% of the total world vegetable oil acreage, palm oil accounts for 33% of vegetable oil and 45% of edible oil worldwide, but increased cultivation competes with dwindling rainforest reserves. We report the 1.8-gigabase (Gb) genome sequence of the African oil palm Elaeis guineensis, the predominant source of worldwide oil production. A total of 1.535 Gb of assembled sequence and transcriptome data from 30 tissue types were used to predict at least 34,802 genes, including oil biosynthesis genes and homologues of WRINKLED1 (WRI1), and other transcriptional regulators, which are highly expressed in the kernel. We also report the draft sequence of the South American oil palm Elaeis oleifera, which has the same number of chromosomes (2n = 32) and produces fertile interspecific hybrids with E. guineensis but seems to have diverged in the New World. Segmental duplications of chromosome arms define the palaeotetraploid origin of palm trees. The oil palm sequence enables the discovery of genes for important traits as well as somaclonal epigenetic alterations that restrict the use of clones in commercial plantings, and should therefore help to achieve sustainability for biofuels and edible oils, reducing the rainforest footprint of this tropical plantation crop.
    Matched MeSH terms: Genome, Plant/genetics*
  17. Kwong QB, Teh CK, Ong AL, Heng HY, Lee HL, Mohamed M, et al.
    Mol Plant, 2016 Aug 01;9(8):1132-1141.
    PMID: 27112659 DOI: 10.1016/j.molp.2016.04.010
    High-density single nucleotide polymorphism (SNP) genotyping arrays are powerful tools that can measure the level of genetic polymorphism within a population. To develop a whole-genome SNP array for oil palms, SNP discovery was performed using deep resequencing of eight libraries derived from 132 Elaeis guineensis and Elaeis oleifera palms belonging to 59 origins, resulting in the discovery of >3 million putative SNPs. After SNP filtering, the Illumina OP200K custom array was built with 170 860 successful probes. Phenetic clustering analysis revealed that the array could distinguish between palms of different origins in a way consistent with pedigree records. Genome-wide linkage disequilibrium declined more slowly for the commercial populations (ranging from 120 kb at r² = 0.43 to 146 kb at r² = 0.50) when compared with the semi-wild populations (19.5 kb at r² = 0.22). Genetic fixation mapping comparing the semi-wild and commercial population identified 321 selective sweeps. A genome-wide association study (GWAS) detected a significant peak on chromosome 2 associated with the polygenic component of the shell thickness trait (based on the trait shell-to-fruit; S/F %) in tenera palms. Testing of a genomic selection model on the same trait resulted in good prediction accuracy (r = 0.65) with 42% of the S/F % variation explained. The first high-density SNP genotyping array for oil palm has been developed and shown to be robust for use in genetic studies and with potential for developing early trait prediction to shorten the oil palm breeding cycle.
    Matched MeSH terms: Genome, Plant/genetics
  18. Lau NS, Makita Y, Kawashima M, Taylor TD, Kondo S, Othman AS, et al.
    Sci Rep, 2016 06 24;6:28594.
    PMID: 27339202 DOI: 10.1038/srep28594
    Hevea brasiliensis Muell. Arg, a member of the family Euphorbiaceae, is the sole natural resource exploited for commercial production of high-quality natural rubber. The properties of natural rubber latex are almost irreplaceable by synthetic counterparts for many industrial applications. A paucity of knowledge on the molecular mechanisms of rubber biosynthesis in high yield traits still persists. Here we report the comprehensive genome-wide analysis of the widely planted H. brasiliensis clone, RRIM 600. The genome was assembled based on ~155-fold combined coverage with Illumina and PacBio sequence data and has a total length of 1.55 Gb with 72.5% comprising repetitive DNA sequences. A total of 84,440 high-confidence protein-coding genes were predicted. Comparative genomic analysis revealed strong synteny between H. brasiliensis and other Euphorbiaceae genomes. Our data suggest that H. brasiliensis's capacity to produce high levels of latex can be attributed to the expansion of rubber biosynthesis-related genes in its genome and the high expression of these genes in latex. Using cap analysis gene expression data, we illustrate the tissue-specific transcription profiles of rubber biosynthesis-related genes, revealing alternative means of transcriptional regulation. Our study adds to the understanding of H. brasiliensis biology and provides valuable genomic resources for future agronomic-related improvement of the rubber tree.
    Matched MeSH terms: Genome, Plant/genetics*
  19. Chan KL, Rosli R, Tatarinova TV, Hogan M, Firdaus-Raih M, Low EL
    BMC Bioinformatics, 2017 Jan 27;18(Suppl 1):1426.
    PMID: 28466793 DOI: 10.1186/s12859-016-1426-6
    BACKGROUND: Gene prediction is one of the most important steps in the genome annotation process. A large number of software tools and pipelines developed by various computing techniques are available for gene prediction. However, these systems have yet to accurately predict all or even most of the protein-coding regions. Furthermore, none of the currently available gene-finders has a universal Hidden Markov Model (HMM) that can perform gene prediction for all organisms equally well in an automatic fashion.

    RESULTS: We present an automated gene prediction pipeline, Seqping that uses self-training HMM models and transcriptomic data. The pipeline processes the genome and transcriptome sequences of the target species using GlimmerHMM, SNAP, and AUGUSTUS pipelines, followed by MAKER2 program to combine predictions from the three tools in association with the transcriptomic evidence. Seqping generates species-specific HMMs that are able to offer unbiased gene predictions. The pipeline was evaluated using the Oryza sativa and Arabidopsis thaliana genomes. Benchmarking Universal Single-Copy Orthologs (BUSCO) analysis showed that the pipeline was able to identify at least 95% of BUSCO's plantae dataset. Our evaluation shows that Seqping was able to generate better gene predictions compared to three HMM-based programs (MAKER2, GlimmerHMM and AUGUSTUS) using their respective available HMMs. Seqping had the highest accuracy in rice (0.5648 for CDS, 0.4468 for exon, and 0.6695 nucleotide structure) and A. thaliana (0.5808 for CDS, 0.5955 for exon, and 0.8839 nucleotide structure).

    CONCLUSIONS: Seqping provides researchers a seamless pipeline to train species-specific HMMs and predict genes in newly sequenced or less-studied genomes. We conclude that the Seqping pipeline predictions are more accurate than gene predictions using the other three approaches with the default or available HMMs.

    Matched MeSH terms: Genome, Plant/genetics*
  20. Kwong QB, Teh CK, Ong AL, Chew FT, Mayes S, Kulaveerasingam H, et al.
    BMC Genet, 2017 Dec 11;18(1):107.
    PMID: 29228905 DOI: 10.1186/s12863-017-0576-5
    BACKGROUND: Genomic selection (GS) uses genome-wide markers as an attempt to accelerate genetic gain in breeding programs of both animals and plants. This approach is particularly useful for perennial crops such as oil palm, which have long breeding cycles, and for which the optimal method for GS is still under debate. In this study, we evaluated the effect of different marker systems and modeling methods for implementing GS in an introgressed dura family derived from a Deli dura x Nigerian dura (Deli x Nigerian) with 112 individuals. This family is an important breeding source for developing new mother palms for superior oil yield and bunch characters. The traits of interest selected for this study were fruit-to-bunch (F/B), shell-to-fruit (S/F), kernel-to-fruit (K/F), mesocarp-to-fruit (M/F), oil per palm (O/P) and oil-to-dry mesocarp (O/DM). The marker systems evaluated were simple sequence repeats (SSRs) and single nucleotide polymorphisms (SNPs). RR-BLUP, Bayesian A, B, Cπ, LASSO, Ridge Regression and two machine learning methods (SVM and Random Forest) were used to evaluate GS accuracy of the traits.

    RESULTS: The kinship coefficient between individuals in this family ranged from 0.35 to 0.62. S/F and O/DM had the highest genomic heritability, whereas F/B and O/P had the lowest. The accuracies using 135 SSRs were low, with accuracies of the traits around 0.20. The average accuracy of machine learning methods was 0.24, as compared to 0.20 achieved by other methods. The trait with the highest mean accuracy was F/B (0.28), while the lowest were both M/F and O/P (0.18). By using whole genomic SNPs, the accuracies for all traits, especially for O/DM (0.43), S/F (0.39) and M/F (0.30) were improved. The average accuracy of machine learning methods was 0.32, compared to 0.31 achieved by other methods.

    CONCLUSION: Due to high genomic resolution, the use of whole-genome SNPs improved the efficiency of GS dramatically for oil palm and is recommended for dura breeding programs. Machine learning slightly outperformed other methods, but required parameters optimization for GS implementation.

    Matched MeSH terms: Genome, Plant*