MyMedR

Displaying all 8 publications

A survey of recently emerged genome-wide computational enhancer predictor tools

Lim LWK, Chung HH, Chong YL, Lee NK

Comput Biol Chem, 2018 Jun;74:132-141.
PMID: 29602043 DOI: 10.1016/j.compbiolchem.2018.03.019

The race to discover enhancers at a genome-wide scale has been on since the advent of next-generation sequencing, decades after the discovery of the first enhancer, SV40. A few enhancer-predicting features, such as chromatin features, histone modifications and sequence features, have been implemented with varying success rates. However, to date there is no consensus on a single enhancer marker that can be used to distinguish and uncover enhancers across the enormous genomic regions. Many supervised, unsupervised and semi-supervised computational approaches have emerged to complement and facilitate experimental approaches to enhancer discovery. In this review, we focus on recently emerged enhancer predictor tools that work on general enhancer features such as sequences, chromatin states and histone modifications, and eRNA, as well as on multiple-feature approaches. Their prediction methods and outcomes are compared across functionally similar counterparts. We provide some recommendations and insights for the future development of more comprehensive and robust tools.
The first transcriptome sequencing and data analysis of the Javan mahseer (Tor tambra)

Lau MML, Lim LWK, Chung HH, Gan HM

Data Brief, 2021 Dec;39:107481.
PMID: 34712757 DOI: 10.1016/j.dib.2021.107481

The Javan mahseer (Tor tambra) is one of the most valuable freshwater fish among Tor species. To date, other than mitogenomic data (BioProject: PRJNA422829), genomic and transcriptomic resources for this species are still lacking, although they are crucial for understanding the molecular mechanisms associated with important traits such as growth, immune response, reproduction and sex determination. For the first time, we sequenced the transcriptome of a whole juvenile fish using the Illumina NovaSEQ6000, generating raw paired-end reads. De novo transcriptome assembly generated a draft transcriptome (BUSCO5 completeness of 91.2% [Actinopterygii_odb10 database]) consisting of 259,403 putative transcripts with a total length of 333,881,215 bp and an N50 length of 2283 bp. A total of 77,503 non-redundant protein-coding sequences were predicted from the transcripts and used for functional annotation. We mapped the predicted proteins to 304 known KEGG pathways, with the signal transduction cluster having the highest representation, followed by the immune system and endocrine system. In addition, transcripts exhibiting significant similarity to previously published growth- and immune-related genes were identified, which will facilitate future molecular breeding of Tor tambra.
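The N50 length quoted in the abstract above is a standard assembly statistic: the length of the shortest sequence in the smallest set of longest sequences that together cover at least half of the total assembly length. A minimal sketch of that calculation follows. The function name `n50` and the toy lengths are illustrative assumptions, not the authors' pipeline or data.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths (in bp)."""
    ordered = sorted(lengths, reverse=True)  # longest sequences first
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        # The first length at which the running sum reaches half of the
        # total assembly length is the N50.
        if running >= half_total:
            return length
    raise ValueError("lengths must be non-empty")


# Hypothetical toy transcript lengths, for illustration only.
toy_lengths = [100, 200, 300, 400, 500]
print(n50(toy_lengths))
```

With these toy values the total is 1,500 bp, and the two longest sequences (500 + 400) are the first to reach the 750 bp halfway point, so the function returns 400.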
Improving the phylogenetic resolution of Malaysian and Javan mahseer (Cyprinidae), Tor tambroides and Tor tambra: Whole mitogenomes sequencing, phylogeny and potential mitogenome markers

Lim LWK, Chung HH, Lau MML, Aziz F, Gan HM

Gene, 2021 Jul 30;791:145708.
PMID: 33984441 DOI: 10.1016/j.gene.2021.145708

The true mahseer (Tor spp.) is one of the most highly valued fish in the world due to its high nutritional value and unique taste. Nevertheless, morphological characterization and single mitochondrial gene phylogeny have so far failed to resolve the ambiguity in its taxonomic classification. In this study, we sequenced and assembled 11 complete mahseer mitogenomes from samples collected from Java in Indonesia, Pahang and Terengganu in Peninsular Malaysia, and Sarawak in East Malaysia. The mitogenome evolutionary relationships among closely related Tor spp. samples were investigated based on maximum likelihood phylogenetic tree construction. Compared to the commonly used COX1 gene fragment, the complete COX1, Cytb, ND2, ND4 and ND5 genes appear to be better phylogenetic markers for genetic differentiation at the population level. In addition, a total of six population-specific mitolineage haplotypes were identified among the mahseer samples analyzed, which offers hints towards its taxonomic landscape.
First high-quality genome assembly data of sago palm (Metroxylon sagu Rottboll)

Lim LWK, Lau MML, Chung HH, Hussain H, Gan HM

Data Brief, 2022 Feb;40:107800.
PMID: 35059482 DOI: 10.1016/j.dib.2022.107800

The sago palm (Metroxylon sagu Rottboll) is a tropical, halophytic, starch-producing and economically important crop palm found mainly in Southeast Asian countries. Recently, a genome survey was conducted on this palm using the Illumina sequencing platform, which yielded a very low BUSCO genome completeness score (21.5%), with most BUSCOs (∼78%) either fragmented or missing. Thus, in this study, the sago palm genome completeness was further improved by using the Nanopore sequencing platform, which produces longer reads. A hybrid genome assembly was conducted, and the outcome was a much more complete sago palm genome with BUSCO completeness as high as 97.9%, with only ∼2% of BUSCOs either fragmented or missing. The estimated genome size of the sago palm is 509,812,790 bp in this study. A total of 33,242 protein-coding genes were identified in the sago palm genome, and around 96.39% of them were functionally annotated. An investigation of the carbohydrate metabolism KEGG pathways also revealed that starch synthesis is one of the major sago palm activities. The genome data obtained from this work are indispensable for future molecular evolutionary and genome-wide association studies on the economically important sago palm.
Sequencing and Characterisation of Complete Mitochondrial DNA Genome for Trigonopoma pauciperforatum (Cypriniformes: Cyprinidae: Danioninae) with Phylogenetic Consideration

Chung HH, Lim LWK, Liao Y, Lam TT, Chong YL

Trop Life Sci Res, 2020 Apr;31(1):107-121.
PMID: 32963714 DOI: 10.21315/tlsr2020.31.1.7

Trigonopoma pauciperforatum, or the redstripe rasbora, is a cyprinid commonly found in marshes and swampy areas with slightly acidic, tannin-stained water in the tropics. In this study, the complete mitogenome sequence of T. pauciperforatum was first amplified in two parts using two pairs of overlapping primers and then sequenced. The size of the mitogenome is 16,707 bp, encompassing 22 transfer RNA genes, 13 protein-coding genes, two ribosomal RNA genes and a putative control region. Identical gene organisation was detected between this species and other family members. The heavy strand accommodates 28 genes while the light strand houses the remaining nine genes. Most protein-coding genes utilise ATG as the start codon, except for the COI gene, which uses GTG instead. The terminal associated sequence (TAS), central conserved sequence blocks (CSB-F, CSB-D and CSB-E) and variable sequence blocks (CSB-1, CSB-2 and CSB-3) are conserved in the control region. The maximum likelihood phylogenetic tree revealed the divergence of T. pauciperforatum from the basal region of the major clade, where its evolutionary relationships with Boraras maculatus, Rasbora cephalotaenia and R. daniconius are poorly resolved, as suggested by the low bootstrap values. This work contributes to the enrichment of genetic resources for peat swamp conservation and to comprehensive, in-depth comparisons with other phylogenetic studies of Rasbora-related genera.
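The gene counts in the abstract above can be cross-checked against each other: the genes listed by class and the genes listed by strand should give the same total. The short sketch below does that arithmetic. The dictionary names are illustrative assumptions; only the numbers come from the abstract.

```python
# Gene counts by class, as reported in the abstract.
gene_classes = {"tRNA": 22, "protein_coding": 13, "rRNA": 2}

# Gene counts by strand, as reported in the abstract.
strand_counts = {"heavy": 28, "light": 9}

total_by_class = sum(gene_classes.values())
total_by_strand = sum(strand_counts.values())

print(total_by_class, total_by_strand)
```

Both totals come to 37 genes, so the two breakdowns in the abstract are mutually consistent.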
The first engkabang jantong (Rubroshorea macrophylla) genome survey data

Chung HH, Soh AAL, Lau MML, Gan HM, Sim SF, Lim LWK

Data Brief, 2025 Feb;58:111248.
PMID: 39830615 DOI: 10.1016/j.dib.2024.111248

The engkabang jantong (Rubroshorea macrophylla) is one of the most indispensable tree species for reforestation due to its high survival rate and rapid growth rate. Because this tree species has attracted relatively little genetic interest, genomic resources for it remain scarce, which impedes further elucidation of the genes involved in expressing these superior properties. In this study, we performed a genome survey and microsatellite analysis of engkabang jantong. Based on the results, the estimated genome size of this species is 312,071,515 bp, with 18.43% repeated sequences and 1.16% heterozygosity. BUSCO analysis revealed that 83.5% of the contigs are single-copy genes whereas 12.7% of them are duplicated. Only 2.8% and 1% of them are fragmented and missing, respectively. The short-read sequencing results obtained from the Illumina platform in this study will be essential to complement Nanopore long-read sequencing results in hybrid genome assembly endeavors in the near future.
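The four BUSCO percentages in the abstract above can be combined into an overall completeness figure. The sketch below assumes they correspond to the usual BUSCO categories (complete single-copy, complete duplicated, fragmented, missing). That mapping and the variable names are assumptions made for illustration; the abstract does not label the categories in exactly these terms.

```python
# Percentages as reported in the abstract; the category labels are assumed.
busco_percent = {
    "complete_single_copy": 83.5,
    "complete_duplicated": 12.7,
    "fragmented": 2.8,
    "missing": 1.0,
}

# Under this reading, overall completeness is single-copy plus duplicated.
complete = round(
    busco_percent["complete_single_copy"] + busco_percent["complete_duplicated"], 1
)

# The four categories should account for the whole BUSCO set.
total = round(sum(busco_percent.values()), 1)

print(complete, total)
```

Under this reading, overall completeness is 96.2% and the four categories sum to 100.0%.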
Sequencing and characterisation of complete mitogenome DNA for Rasbora sarawakensis (Cypriniformes: Cyprinidae: Rasbora) with phylogenetic consideration

Lim LWK, Kamar CKA, Roja JS, Chung HH, Liao Y, Lam TT, et al.

Comput Biol Chem, 2020 Dec;89:107403.
PMID: 33120127 DOI: 10.1016/j.compbiolchem.2020.107403

The Blueline Rasbora (Rasbora sarawakensis) is a small ray-finned fish categorized under the genus Rasbora in the Cyprinidae family. In this study, the complete mitogenome of R. sarawakensis was sequenced using four primers targeting overlapping regions. The mitogenome is 16,709 bp in size, accommodating 22 transfer RNA genes, 13 protein-coding genes, two ribosomal RNA genes and a putative control region. Identical gene organisation was detected between this species and its genus counterparts. The heavy strand houses 28 genes while the light strand houses the other nine genes. Most protein-coding genes employ ATG as the start codon, except for the COI gene, which utilizes GTG instead. The central conserved sequence blocks (CSB-F, CSB-E and CSB-D), variable sequence blocks (CSB-3, CSB-2 and CSB-1) and the terminal associated sequence (TAS) are conserved in the control region. The maximum likelihood phylogenetic tree revealed the divergence of R. sarawakensis from the basal region of the Rasbora clade, where its evolutionary relationships with R. maculatus and R. pauciperforata are poorly resolved, as indicated by the low bootstrap values. This work acts as a stepping stone towards further molecular evolution and population genetics studies of the genus Rasbora.
Complete chloroplast genome data of Shorea macrophylla (Engkabang): Structural features, comparative and phylogenetic analysis

Chew IYY, Chung HH, Lim LWK, Lau MML, Gan HM, Wee BS, et al.

Data Brief, 2023 Apr;47:109029.
PMID: 36936629 DOI: 10.1016/j.dib.2023.109029

Shorea macrophylla belongs to the genus Shorea in the Dipterocarpaceae family. It is a woody tree that grows in the rainforests of Southeast Asia. The complete chloroplast (cp) genome sequence of S. macrophylla is reported here. The S. macrophylla chloroplast genome is 150,778 bp in size and has a circular structure comprising conserved large single-copy (LSC, 83,681 bp) and small single-copy (SSC, 19,813 bp) regions, as well as a pair of inverted repeats, each 23,642 bp long. It has 112 unique genes, including 78 protein-coding genes, 30 tRNA genes, and four rRNA genes. The genome exhibits a similar GC content, gene order, structure, and codon usage when compared to previously reported chloroplast genomes from other plant species. The chloroplast genome of S. macrophylla contains 262 SSRs, the most prevalent of which is A/T, followed by AAT/ATT. Furthermore, the genome contains 43 long repeat sequences, most of which are forward or palindromic repeats. The genome structure of S. macrophylla was compared to the genomic structures of closely related species from the same family, and eight mutational hotspots were discovered. The phylogenetic analysis demonstrated a close relationship between Shorea and Parashorea species, indicating that Shorea is not monophyletic. The complete chloroplast genome sequence analysis of S. macrophylla reported in this paper will contribute to further studies in molecular identification, genetic diversity, and phylogenetic research.
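The region lengths in the abstract above can be checked against the reported genome size. In a typical quadripartite chloroplast genome the total length is the LSC plus the SSC plus two copies of the inverted repeat. The sketch below applies that arithmetic to the abstract's figures. The function and variable names are illustrative assumptions; the numbers are taken from the abstract.

```python
def quadripartite_length(lsc, ssc, inverted_repeat):
    """Total length of a quadripartite chloroplast genome, in bp.

    The inverted repeat is counted twice because it occurs as a pair.
    """
    return lsc + ssc + 2 * inverted_repeat


# Region lengths (bp) as reported in the abstract.
LSC = 83_681
SSC = 19_813
IR = 23_642

genome_size = quadripartite_length(LSC, SSC, IR)

# Unique gene counts by class, as reported in the abstract.
unique_genes = 78 + 30 + 4  # protein-coding + tRNA + rRNA

print(genome_size, unique_genes)
```

The regions sum to 150,778 bp and the gene classes to 112 unique genes, both matching the totals stated in the abstract. This also confirms that the 23,642 bp figure is the length of each inverted repeat rather than of the pair.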