RESULTS: Here, we analyzed genetic data from 230 B. flabellifer accessions across Thailand using 17 EST-SSR and 12 gSSR polymorphic markers. Clustering analysis revealed that the population consists of two genetic clusters (STRUCTURE K = 2). Cluster I is found mainly in southern Thailand, while Cluster II is found mainly in the northeast. Accessions from central Thailand are an extensive mix of the two. The two clusters show moderate differentiation (FST = 0.066 and Nm = 3.532) and have low genetic diversity (HO = 0.371 and 0.416; AR = 2.99 and 3.19, for Clusters I and II, respectively). The minimum number of founders for each genetic group ranges from 3 to 4 individuals, based on simulations using different allele frequency assumptions. These numbers are consistent with B. flabellifer being dioecious, which means that several seeds had to be introduced simultaneously to obtain both male and female founders.
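The differentiation and gene-flow figures reported above (FST and Nm) are often linked through Wright's island-model approximation, Nm ≈ (1/FST − 1)/4. The abstract does not say how Nm was obtained, so the sketch below assumes that approximation; the function name `nm_from_fst` is hypothetical and only illustrates the arithmetic.

```python
def nm_from_fst(fst: float) -> float:
    """Estimate the number of migrants per generation (Nm) from FST.

    Assumes Wright's island-model approximation Nm ~ (1/FST - 1) / 4.
    This is an illustrative assumption, not necessarily the method
    used in the study.
    """
    if not 0.0 < fst < 1.0:
        raise ValueError("FST must lie strictly between 0 and 1")
    return (1.0 / fst - 1.0) / 4.0


# FST as reported in the abstract (already rounded to three decimals).
print(round(nm_from_fst(0.066), 3))  # 3.538
```

Under this assumption, the rounded FST of 0.066 gives roughly 3.54, close to the reported 3.532. The small gap could come from rounding of FST, although the abstract does not confirm this.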
CONCLUSIONS: From these data, together with geographical and historical evidence, we hypothesize that there were at least two different invasive events of B. flabellifer in Thailand. In one event, B. flabellifer was likely brought through the Straits of Malacca and propagated in southern Thailand before spreading to central Thailand. The second event likely occurred in the Khmer Empire, present-day Cambodia, before spreading to northeastern Thailand.
RESULTS: Our results indicate that the SHELL markers can theoretically reduce the major losses due to dura contamination of tenera planting material. However, these markers cannot identify illegitimate tenera, which reduces the value of having bred elite tenera for commercial planting. The limitation also affects the breeding programme, where fruit form is of limited utility and incorrect identity could lead to significant problems. We propose an optimised approach using SNPs for routine quality control.
CONCLUSIONS: Both dura and tenera contamination can be identified and removed at or before the nursery stage. An optimised legitimacy assay using SNP markers coupled with a suitable sampling scheme is now ready to be deployed as a standard control for seed production and breeding in oil palm. The same approach will also be an effective solution for other perennial crops, such as coconut and date palm.
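The abstract gives no details of the SNP legitimacy assay, so the following is only a minimal sketch of one common way such a check can work: a Mendelian-compatibility test of a seedling against its recorded parents. The genotype encoding, the function names and the `max_mismatches` tolerance are all hypothetical and are not taken from the study.

```python
from typing import Sequence

# Hypothetical encoding: one two-letter string per SNP, e.g. "AG".
Genotype = str


def mendelian_compatible(child: Genotype, mother: Genotype, father: Genotype) -> bool:
    """True if the child could carry one allele from each parent at this SNP."""
    a, b = child[0], child[1]
    return (a in mother and b in father) or (b in mother and a in father)


def count_mismatches(
    child: Sequence[Genotype],
    mother: Sequence[Genotype],
    father: Sequence[Genotype],
) -> int:
    """Count SNPs at which the child is incompatible with the recorded parents."""
    return sum(
        not mendelian_compatible(c, m, f)
        for c, m, f in zip(child, mother, father)
    )


def is_legitimate(child, mother, father, max_mismatches: int = 1) -> bool:
    """Accept the cross if mismatches stay within an assumed error tolerance."""
    return count_mismatches(child, mother, father) <= max_mismatches


# Toy genotypes, invented purely for illustration.
mother = ["AA", "AG", "CC", "TT"]
father = ["GG", "GG", "CT", "TT"]
seedling_ok = ["AG", "AG", "CT", "TT"]
seedling_bad = ["AA", "AA", "TT", "CC"]

print(count_mismatches(seedling_ok, mother, father))   # 0
print(count_mismatches(seedling_bad, mother, father))  # 4
```

A small tolerance for mismatches is a design choice that allows for occasional genotyping errors. A real assay would need a marker panel and threshold chosen for the population being tested.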
BIOLOGICAL SIGNIFICANCE: In this study, proteomic analysis was used to identify abundant proteins from total protein extracts. PEG fractionation was used to reveal lower-abundance proteins from both high and low proliferation embryogenic lines of oil palm samples in tissue culture. A total of 40 protein spots were found to differ significantly in abundance, and the mRNA levels of 12 of these were assessed using real-time PCR. Three proteins, namely triosephosphate isomerase, l-ascorbate peroxidase and superoxide dismutase, were found to be concordant in their mRNA expression and protein abundance. Triosephosphate isomerase is a key enzyme in glycolysis. Both l-ascorbate peroxidase and superoxide dismutase play a role in anti-oxidative scavenging defense systems. These proteins have potential for use as biomarkers to screen for high and low embryogenic oil palm samples.
RESULTS: The kinship coefficient between individuals in this family ranged from 0.35 to 0.62. S/F and O/DM had the highest genomic heritability, whereas F/B and O/P had the lowest. Prediction accuracies using 135 SSRs were low, at around 0.20 across traits. The average accuracy of machine learning methods was 0.24, compared to 0.20 achieved by other methods. The trait with the highest mean accuracy was F/B (0.28), while the lowest were M/F and O/P (both 0.18). Using whole-genome SNPs improved the accuracies for all traits, especially for O/DM (0.43), S/F (0.39) and M/F (0.30). The average accuracy of machine learning methods was 0.32, compared to 0.31 achieved by other methods.
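The abstract does not define "accuracy". In genomic selection it is commonly taken as the Pearson correlation between predicted and observed phenotypes in validation sets, averaged over cross-validation folds. The sketch below assumes that definition; the function names and the toy numbers are hypothetical and are not data from the study.

```python
import math
from typing import Sequence, Tuple


def pearson_r(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / math.sqrt(sxx * syy)


def mean_accuracy(folds: Sequence[Tuple[Sequence[float], Sequence[float]]]) -> float:
    """Average the per-fold correlation of (observed, predicted) pairs."""
    return sum(pearson_r(obs, pred) for obs, pred in folds) / len(folds)


# Toy (observed, predicted) values for two validation folds; invented numbers.
folds = [
    ([0.52, 0.61, 0.47, 0.58], [0.55, 0.57, 0.50, 0.54]),
    ([0.49, 0.63, 0.55, 0.60], [0.53, 0.56, 0.57, 0.52]),
]
print(round(mean_accuracy(folds), 2))
```

Under this definition, values such as 0.20 or 0.43 mean that predicted breeding values explain only a modest share of the variation in observed phenotypes.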
CONCLUSION: Owing to their high genomic resolution, whole-genome SNPs improved the efficiency of GS dramatically for oil palm, and their use is recommended for dura breeding programs. Machine learning slightly outperformed other methods, but required parameter optimization for GS implementation.