Affiliations 

  • 1 Biotechnology & Breeding Department, Sime Darby Plantation R&D Centre, 43400, Serdang, Selangor, Malaysia. kwong.qi.bin@simedarbyplantation.com
  • 2 Biotechnology & Breeding Department, Sime Darby Plantation R&D Centre, 43400, Serdang, Selangor, Malaysia
  • 3 Department of Biological Sciences, National University of Singapore, Singapore, 117543, Singapore
  • 4 School of Biosciences, University of Nottingham, Sutton Bonington Campus, Nr Loughborough, LE12 5RD, UK
  • 5 Institute of Biological Sciences, University Malaya, 50603, Kuala Lumpur, Malaysia
  • 6 Institute of Biological Sciences, University Malaya, 50603, Kuala Lumpur, Malaysia. jennihari@um.edu.my
BMC Genet, 2017 Dec 11;18(1):107.
PMID: 29228905 DOI: 10.1186/s12863-017-0576-5

Abstract

BACKGROUND: Genomic selection (GS) uses genome-wide markers to accelerate genetic gain in plant and animal breeding programs. The approach is particularly useful for perennial crops such as oil palm, which have long breeding cycles and for which the optimal GS method is still under debate. In this study, we evaluated the effect of different marker systems and modeling methods on GS in an introgressed dura family of 112 individuals derived from a Deli dura x Nigerian dura (Deli x Nigerian) cross. This family is an important breeding source for developing new mother palms with superior oil yield and bunch characters. The traits selected for this study were fruit-to-bunch (F/B), shell-to-fruit (S/F), kernel-to-fruit (K/F), mesocarp-to-fruit (M/F), oil per palm (O/P) and oil-to-dry mesocarp (O/DM). The marker systems evaluated were simple sequence repeats (SSRs) and single nucleotide polymorphisms (SNPs). GS accuracy for each trait was evaluated with RR-BLUP, Bayesian A, B, Cπ, LASSO, Ridge Regression and two machine learning methods (SVM and Random Forest).
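To make the notion of GS accuracy concrete, the following is a minimal sketch of one common way to estimate it: fit a ridge-regression marker model (RR-BLUP corresponds to ridge regression with the penalty set to the ratio of residual to marker-effect variance) and report the mean Pearson correlation between predicted and observed phenotypes over cross-validation folds. The abstract does not state how accuracy was defined or which validation scheme was used, so the 5-fold scheme, the fixed penalty `lam`, the 0/1/2 marker coding, the simulated data and all function names below are assumptions for illustration only. Only the family size of 112 is taken from the study.

```python
import random
from statistics import fmean


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def solve(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def ridge_predict(x_train, y_train, x_test, lam):
    """Ridge regression in dual form: y_hat = mean + K_test (K + lam*I)^-1 (y - mean)."""
    n_markers = len(x_train[0])
    # Centre markers on the training means so the intercept is the phenotype mean.
    means = [fmean(row[j] for row in x_train) for j in range(n_markers)]
    xt = [[g - mu for g, mu in zip(row, means)] for row in x_train]
    xs = [[g - mu for g, mu in zip(row, means)] for row in x_test]
    y_mean = fmean(y_train)
    k = [[dot(a, b) + (lam if i == j else 0.0) for j, b in enumerate(xt)]
         for i, a in enumerate(xt)]
    alpha = solve(k, [y - y_mean for y in y_train])
    return [y_mean + sum(dot(s, t) * w for t, w in zip(xt, alpha)) for s in xs]


def pearson(u, v):
    mu, mv = fmean(u), fmean(v)
    cov = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    var_u = sum((a - mu) ** 2 for a in u)
    var_v = sum((b - mv) ** 2 for b in v)
    return cov / (var_u * var_v) ** 0.5


def cv_accuracy(genotypes, phenotypes, lam, n_folds=5, seed=0):
    """Mean Pearson correlation between predicted and observed values over folds."""
    order = list(range(len(phenotypes)))
    random.Random(seed).shuffle(order)
    folds = [order[i::n_folds] for i in range(n_folds)]
    scores = []
    for test_idx in folds:
        held_out = set(test_idx)
        train_idx = [i for i in order if i not in held_out]
        preds = ridge_predict([genotypes[i] for i in train_idx],
                              [phenotypes[i] for i in train_idx],
                              [genotypes[i] for i in test_idx], lam)
        scores.append(pearson(preds, [phenotypes[i] for i in test_idx]))
    return fmean(scores)


def simulate(n_ind=112, n_markers=100, noise_sd=2.0, seed=1):
    """Hypothetical data: random 0/1/2 genotypes and an additive trait plus noise."""
    rng = random.Random(seed)
    geno = [[rng.choice((0, 1, 2)) for _ in range(n_markers)] for _ in range(n_ind)]
    effects = [rng.gauss(0.0, 1.0) for _ in range(n_markers)]
    pheno = [dot(row, effects) + rng.gauss(0.0, noise_sd) for row in geno]
    return geno, pheno


if __name__ == "__main__":
    geno, pheno = simulate()
    print(f"cross-validated accuracy: {cv_accuracy(geno, pheno, lam=4.0):.2f}")
```

In practice the penalty would be estimated from the data (for example from variance components) rather than fixed, and the simulated trait here is far more heritable than the traits reported in the study, so the printed value says nothing about oil palm.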

RESULTS: The kinship coefficient between individuals in this family ranged from 0.35 to 0.62. S/F and O/DM had the highest genomic heritability, whereas F/B and O/P had the lowest. Accuracies using 135 SSRs were low, at around 0.20 across traits. The average accuracy of the machine learning methods was 0.24, compared with 0.20 for the other methods. The trait with the highest mean accuracy was F/B (0.28), while the lowest were M/F and O/P (both 0.18). Using whole-genome SNPs improved the accuracies for all traits, especially O/DM (0.43), S/F (0.39) and M/F (0.30). The average accuracy of the machine learning methods was 0.32, compared with 0.31 for the other methods.

CONCLUSION: Owing to their high genomic resolution, whole-genome SNPs dramatically improved the efficiency of GS for oil palm and are recommended for dura breeding programs. Machine learning methods slightly outperformed the other methods but required parameter optimization for GS implementation.
