RESULTS: We analyzed deep whole-genome sequencing data (~30×) of five native trios from Peninsular Malaysia and North Borneo, and characterized their genomic variants, including single nucleotide variants (SNVs), small insertions and deletions (indels), and copy number variants (CNVs). We discovered approximately 6.9 million SNVs, 1.2 million indels, and 9000 CNVs in the 15 samples, of which 2.7% of SNVs, 2.3% of indels, and 22% of CNVs were novel, implying insufficient coverage of population diversity in existing databases. We identified a higher proportion of novel variants in the Orang Asli (OA) samples, i.e., the indigenous people of Peninsular Malaysia, than in the North Bornean (NB) samples, likely due to the more complex demographic history and long-term isolation of the OA groups. We used the pedigree information to identify de novo variants and estimated the autosomal mutation rates to be 0.81 × 10⁻⁸ to 1.33 × 10⁻⁸, 1.0 × 10⁻⁹ to 2.9 × 10⁻⁹, and ~0.001 per site per generation for SNVs, indels, and CNVs, respectively. The trio genomes also allowed haplotype phasing with high accuracy, providing a reference for future genomic studies of OA and NB populations. In addition, high-frequency inherited CNVs specific to OA or NB were identified. One example is a 50-kb duplication in DEFA1B detected only in the Negrito trios, which may plausibly affect host defense against the diverse microbes to which these hunter-gatherers are exposed in their tropical rainforest environment. The CNVs shared between the OA and NB groups were far fewer than those specific to each group. Nevertheless, we identified a 142-kb duplication in AMY1A, a gene associated with a high-starch diet, in all 15 samples. Moreover, novel insertions shared with archaic hominids were identified in our samples.
CONCLUSION: Our study presents a full catalogue of the genome variants of the native Malaysian populations, which complements the known genome diversity of Southeast Asians. It implies a specific population history of the native inhabitants and demonstrates the need for more genome sequencing efforts on the multi-ethnic native groups of Malaysia and Southeast Asia.
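The per-site, per-generation rates quoted in the results are of the kind usually obtained by dividing the number of de novo variants in a child by the number of transmitted sites at which such a variant could have been called. The following is a minimal sketch of that arithmetic only, not the study's actual pipeline: the function name `mutation_rate`, the factor of two (one haploid genome from each parent), and the input numbers are illustrative assumptions and are not values from the study.

```python
def mutation_rate(n_de_novo: int, callable_sites: int, n_trios: int = 1) -> float:
    """Estimate a per-site, per-generation mutation rate from trio data.

    Assumption: each child receives two haploid genomes (one per parent),
    so the denominator is 2 * callable_sites * n_trios.
    """
    if n_de_novo < 0:
        raise ValueError("n_de_novo must be non-negative")
    if callable_sites <= 0 or n_trios <= 0:
        raise ValueError("callable_sites and n_trios must be positive")
    return n_de_novo / (2 * callable_sites * n_trios)


# Hypothetical inputs chosen only to illustrate the calculation:
# 60 de novo SNVs in one child over 2.5 billion callable autosomal sites.
rate = mutation_rate(n_de_novo=60, callable_sites=2_500_000_000)
print(f"{rate:.2e}")  # 1.20e-08
```

In practice the callable-site count and the de novo call set both depend on filtering choices, so an estimate like this is typically reported as a range across trios.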
Methods: In the present study, displacement loop (D-loop) sequences were used to evaluate the genetic relationships and diversity of seven tilapia populations that are widely cultured in China, with the specific aim of inferring the likely maternal ancestry of the red tilapia strains. Three red tilapia strains of Oreochromis spp. (Taiwan, TW; Israel, IL; and Malaysia, MY) and four other populations, O. aureus (AR), O. niloticus (NL), O. mossambicus (MS), and the GIFT strain of O. niloticus, were collected and analyzed.
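Diversity in aligned D-loop sequences of this kind is commonly summarized by haplotype diversity (Hd) and nucleotide diversity (π). The sketch below uses the standard definitions, Hd = n/(n − 1) · (1 − Σ pᵢ²) and π as the mean number of pairwise differences per site; it is an illustration under those assumptions and not necessarily the software or settings used in the study. The function names and the toy sequences are hypothetical.

```python
from collections import Counter
from itertools import combinations


def haplotype_diversity(seqs: list[str]) -> float:
    """Hd = n / (n - 1) * (1 - sum of squared haplotype frequencies)."""
    n = len(seqs)
    if n < 2:
        raise ValueError("need at least two sequences")
    freqs = [count / n for count in Counter(seqs).values()]
    return n / (n - 1) * (1 - sum(f * f for f in freqs))


def nucleotide_diversity(seqs: list[str]) -> float:
    """Mean pairwise differences per site over all sequence pairs.

    Assumes the sequences are already aligned to equal length, with no gaps.
    """
    if len(seqs) < 2:
        raise ValueError("need at least two sequences")
    length = len(seqs[0])
    if length == 0 or any(len(s) != length for s in seqs):
        raise ValueError("sequences must be aligned, non-empty, and equal in length")
    pairs = list(combinations(seqs, 2))
    total_diffs = sum(sum(a != b for a, b in zip(s, t)) for s, t in pairs)
    return total_diffs / (len(pairs) * length)


# Toy alignment (hypothetical, not data from the study).
toy = ["ACGT", "ACGT", "ACGA", "TCGA"]
print(round(haplotype_diversity(toy), 3))
print(round(nucleotide_diversity(toy), 3))
```

Both functions compare every pair of sequences exactly, which is adequate for a few hundred short sequences; larger data sets would normally go through a dedicated population genetics package.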
Results: A total of 146 polymorphic sites and 32 haplotypes of D-loop sequences were detected among 332 fish, and four major haplotypes were shared among the populations. The TW and NL populations had a greater number of haplotypes (20 and 8, respectively). The haplotype diversity (Hd) and nucleotide diversity (π) of each population ranged from 0.234 to 0.826 and from 0 to 0.060, respectively. Significant positive Tajima's D values in the neutrality test were detected in the NL, IL, and MY populations (P < 0.05). The smallest K2P genetic distance (D = 0.014) was detected between the MS and TW populations, whereas the largest (D = 0.101) was found between the GIFT and AR populations. The analysis of molecular variance (AMOVA) showed extremely significant genetic variation among the populations (P