Displaying publications 1 - 20 of 104 in total

  1. Halim-Fikri H, Etemad A, Abdul Latif AZ, Merican AF, Baig AA, Annuar AA, et al.
    BMC Res Notes, 2015;8:176.
    PMID: 25925844 DOI: 10.1186/s13104-015-1123-y
    The Malaysian Node of the Human Variome Project (MyHVP) is one of the eighteen official Human Variome Project (HVP) country-specific nodes. Since its inception on 9 October 2010, MyHVP has attracted a significant number of Malaysian clinicians and researchers to participate and contribute their data to this project. MyHVP also acts as the center of coordination for genotypic and phenotypic variation studies of the Malaysian population. A specialized database was developed to store and manage data on genetic variations that are associated with health and disease in Malaysian ethnic groups. This ethnic-specific database is called the Malaysian Node of the Human Variome Project database (MyHVPDb).
    Matched MeSH terms: Databases, Genetic*
  2. Yahya P, Sulong S, Harun A, Wangkumhang P, Wilantho A, Ngamphiw C, et al.
    Int J Legal Med, 2020 Jan;134(1):123-134.
    PMID: 31760471 DOI: 10.1007/s00414-019-02184-0
    Ancestry-informative markers (AIMs) can be used to infer the ancestry of an individual to minimize the inaccuracy of self-reported ethnicity in biomedical research. In this study, we describe three methods for selecting AIM SNPs for the Malay population (Malay AIM panel) using different approaches based on pairwise FST, informativeness for assignment (In), and PCA-correlated SNPs (PCAIMs). These Malay AIM panels were extracted from genotype data stored in SNP arrays hosted by the Malaysian node of the Human Variome Project (MyHVP) and the Singapore Genome Variation Project (SGVP). In particular, genotype data from a total of 165 Malay individuals were analyzed, comprising data on 117 individual genotypes from the Affymetrix SNP-6 SNP array platform and data on 48 individual genotypes from the OMNI 2.5 Illumina SNP array platform. The HapMap phase 3 database (1397 individuals from 11 populations) was used as a reference for comparison with the Malay genotype data. The accuracy of each resulting Malay AIM panel was evaluated using a machine learning "ancestry-predictive model" constructed by using WEKA, a comprehensive machine learning platform written in Java. A total of 1250 SNPs were finally selected, which successfully identified Malay individuals from other world populations with an accuracy of 90%, but the accuracy decreased to 80% using 157 SNPs according to the pairwise FST method, while a panel of 200 SNPs selected using In and PCAIMs could be used to identify Malay individuals with an accuracy of approximately 80%.
    Matched MeSH terms: Databases, Genetic*
  3. Zeng C, Guo X, Long J, Kuchenbaecker KB, Droit A, Michailidou K, et al.
    Breast Cancer Res, 2016 06 21;18(1):64.
    PMID: 27459855 DOI: 10.1186/s13058-016-0718-0
    BACKGROUND: Multiple recent genome-wide association studies (GWAS) have identified a single nucleotide polymorphism (SNP), rs10771399, at 12p11 that is associated with breast cancer risk.

    METHOD: We performed a fine-scale mapping study of a 700 kb region including 441 genotyped and more than 1300 imputed genetic variants in 48,155 cases and 43,612 controls of European descent, 6269 cases and 6624 controls of East Asian descent and 1116 cases and 932 controls of African descent in the Breast Cancer Association Consortium (BCAC; http://bcac.ccge.medschl.cam.ac.uk/ ), and in 15,252 BRCA1 mutation carriers in the Consortium of Investigators of Modifiers of BRCA1/2 (CIMBA). Stepwise regression analyses were performed to identify independent association signals. Data from the Encyclopedia of DNA Elements project (ENCODE) and the Cancer Genome Atlas (TCGA) were used for functional annotation.

    RESULTS: Analysis of data from European descendants found evidence for four independent association signals at 12p11, represented by rs7297051 (odds ratio (OR) = 1.09, 95 % confidence interval (CI) = 1.06-1.12; P = 3 × 10(-9)), rs805510 (OR = 1.08, 95 % CI = 1.04-1.12, P = 2 × 10(-5)), and rs1871152 (OR = 1.04, 95 % CI = 1.02-1.06; P = 2 × 10(-4)) identified in the general populations, and rs113824616 (P = 7 × 10(-5)) identified in the meta-analysis of BCAC ER-negative cases and BRCA1 mutation carriers. SNPs rs7297051, rs805510 and rs113824616 were also associated with breast cancer risk at P 

    Matched MeSH terms: Databases, Genetic
  4. Roslan R, Othman RM, Shah ZA, Kasim S, Asmuni H, Taliba J, et al.
    Comput Biol Med, 2010 Jun;40(6):555-64.
    PMID: 20417930 DOI: 10.1016/j.compbiomed.2010.03.009
    Protein-protein interactions (PPIs) play a significant role in many crucial cellular operations such as metabolism, signaling and regulation. Computational methods for predicting PPIs have shown tremendous growth in recent years, but problems such as high false-positive rates have contributed to the lack of solid PPI information. We aimed to enhance the overlap between computational predictions and experimental results in an effort to partially remove falsely predicted PPIs. A protein function predictor named PFP(), which is based on shared interacting domain patterns, is introduced in this study to complement the Gene Ontology Annotations (GOA). We used GOA and PFP() as agents in a filtering process to reduce false-positive pairs in the computationally predicted PPI datasets. The functions predicted by PFP() were extracted from cross-species PPI data in order to assign novel functional annotations to uncharacterized proteins and additional functions to those already characterized by the GO (Gene Ontology). The implementation of PFP() increased the chances of finding a matching function annotation for the first rule in the filtration process by as much as 20%. To assess the capability of the proposed framework in filtering false PPIs, we applied it to the available S. cerevisiae PPIs and measured the performance in two aspects: the improvement made, indicated as Signal-to-Noise Ratio (SNR), and the strength of improvement. The proposed filtering framework achieved significantly better performance than the unfiltered predictions in both metrics.
    Matched MeSH terms: Databases, Genetic
  5. Ng CH, Ng KKS, Lee SL, Tnah LH, Lee CT, Zakaria NF
    Forensic Sci Int Genet, 2020 01;44:102188.
    PMID: 31648150 DOI: 10.1016/j.fsigen.2019.102188
    To inform product users about the origin of timber, the implementation of a traceability system is necessary for the forestry industry. In this study, we developed a comprehensive genetic database for the important tropical timber species Merbau, Intsia palembanica, to trace its geographic origin within peninsular Malaysia. A total of 1373 individual trees representing 39 geographically distinct populations of I. palembanica were sampled throughout peninsular Malaysia. We analyzed the samples using a combination of four chloroplast DNA (cpDNA) markers and 14 short tandem repeat (STR) markers to establish both cpDNA haplotype and STR allele frequency databases. A haplotype map was generated through cpDNA sequencing for population identification, resulting in six unique haplotypes based on 10 informative intraspecifically variable sites. Subsequently, an STR allele frequency database was developed from 14 STRs allowing individual identification. Bayesian cluster analysis divided the individuals into two genetic clusters corresponding to the northern and southern regions of peninsular Malaysia. Tests of conservativeness showed that the databases were conservative after the adjustment of the θ values to 0.2000 and 0.2900 for the northern (f = 0.0163) and southern (f = 0.0285) regions, respectively. Using self-assignment tests, we observed that individuals were correctly assigned to populations at rates of 40.54-94.12% and to the identified regions at rates of 79.80-80.62%. Both the cpDNA and STR markers appear to be useful for tracking Merbau timber originating from peninsular Malaysia. The use of these forensic tools in addition to the existing paper-based timber tracking system will help to verify the legality of the origin of I. palembanica and to combat illegal logging issues associated with the species.
    Matched MeSH terms: Databases, Genetic*
  6. Abdulrauf Sharifai G, Zainol Z
    Genes (Basel), 2020 06 27;11(7).
    PMID: 32605144 DOI: 10.3390/genes11070717
    Training a machine learning algorithm on an imbalanced data set is an inherently challenging task. It becomes more demanding with limited samples but a massive number of features (high dimensionality). High dimensional and imbalanced data sets have posed severe challenges in many real-world applications, such as biomedical data sets. Numerous researchers have investigated either imbalanced class or high dimensional data sets and come up with various methods. Nonetheless, few approaches reported in the literature have addressed the intersection of the high dimensional and imbalanced class problems due to their complicated interactions. Lately, feature selection has become a well-known technique for overcoming this problem by selecting discriminative features that represent the minority and majority classes. This paper proposes a new method called Robust Correlation Based Redundancy and Binary Grasshopper Optimization Algorithm (rCBR-BGOA); rCBR-BGOA employs an ensemble of multi-filters coupled with the Correlation-Based Redundancy method to select optimal feature subsets. A binary Grasshopper optimisation algorithm (BGOA) is used to construct the feature selection process as an optimisation problem to select the best (near-optimal) combination of features from the majority and minority classes. The obtained results, supported by the proper statistical analysis, indicate that rCBR-BGOA can improve the classification performance for high dimensional and imbalanced datasets in terms of G-mean and the Area Under the Curve (AUC) performance metrics.
    Matched MeSH terms: Databases, Genetic/standards*
  7. Mohamad MS, Omatu S, Deris S, Yoshioka M
    IEEE Trans Inf Technol Biomed, 2011 Nov;15(6):813-22.
    PMID: 21914573 DOI: 10.1109/TITB.2011.2167756
    Gene expression data are expected to be of significant help in the development of efficient cancer diagnosis and classification platforms. In order to select a small subset of informative genes from the data for cancer classification, many researchers have recently been analyzing gene expression data using various computational intelligence methods. However, due to the small number of samples compared to the huge number of genes (high dimension), irrelevant genes, and noisy genes, many of the computational methods have difficulty selecting the small subset. Thus, we propose an improved (modified) binary particle swarm optimization to select the small subset of informative genes that is relevant for the cancer classification. In this proposed method, we introduce particles' speed for giving the rate at which a particle changes its position, and we propose a rule for updating particles' positions. By performing experiments on ten different gene expression datasets, we have found that the performance of the proposed method is superior to other previous related works, including the conventional version of binary particle swarm optimization (BPSO), in terms of classification accuracy and the number of selected genes. The proposed method also produces lower running times compared to BPSO.
    Matched MeSH terms: Databases, Genetic*
  8. Choo SW, Heydari H, Tan TK, Siow CC, Beh CY, Wee WY, et al.
    ScientificWorldJournal, 2014;2014:569324.
    PMID: 25243218 DOI: 10.1155/2014/569324
    To facilitate the ongoing research of Vibrio spp., a dedicated platform for the Vibrio research community is needed to host the fast-growing amount of genomic data and facilitate the analysis of these data. We present VibrioBase, a useful resource platform, providing all basic features of a sequence database with the addition of unique analysis tools which could be valuable for the Vibrio research community. VibrioBase currently houses a total of 252 Vibrio genomes developed in a user-friendly manner and useful to enable the analysis of these genomic data, particularly in the field of comparative genomics. Besides general data browsing features, VibrioBase offers analysis tools such as BLAST interfaces and JBrowse genome browser. Other important features of this platform include our newly developed in-house tools, the pairwise genome comparison (PGC) tool, and pathogenomics profiling tool (PathoProT). The PGC tool is useful in the identification and comparative analysis of two genomes, whereas PathoProT is designed for comparative pathogenomics analysis of Vibrio strains. Both of these tools will enable researchers with little experience in bioinformatics to get meaningful information from Vibrio genomes with ease. We have tested the validity and suitability of these tools and features for use in the next-generation database development.
    Matched MeSH terms: Databases, Genetic/trends*
  9. Feng B, Wang XH, Ratkowsky D, Gates G, Lee SS, Grebenc T, et al.
    Sci Rep, 2016 May 06;6:25586.
    PMID: 27151256 DOI: 10.1038/srep25586
    Hydnum is a fungal genus proposed by Linnaeus in the early time of modern taxonomy. It contains several ectomycorrhizal species which are commonly consumed worldwide. However, Hydnum is one of the most understudied fungal genera, especially from a molecular phylogenetic view. In this study, we extensively gathered specimens of Hydnum from Asia, Europe, America and Australasia, and analyzed them by using sequences of four gene fragments (ITS, nrLSU, tef1α and rpb1). Our phylogenetic analyses recognized at least 31 phylogenetic species within Hydnum, 15 of which were reported for the first time. Most Australasian species were recognized as strongly divergent old relics, but recent migration between Australasia and the Northern Hemisphere was also detected. Within the Northern Hemisphere, frequent historical biota exchanges between the Old World and the New World via both the North Atlantic Land Bridge and the Bering Land Bridge could be elucidated. Our study also revealed that most Hydnum species found in subalpine areas of the Hengduan Mountains in southwestern China occur in northeastern/northern China and Europe, indicating that the composition of the mycobiota in the Hengduan Mountains region is more complicated than what we have known before.
    Matched MeSH terms: Databases, Genetic
  10. Zhang C, Gao Y, Ning Z, Lu Y, Zhang X, Liu J, et al.
    Genome Biol, 2019 10 22;20(1):215.
    PMID: 31640808 DOI: 10.1186/s13059-019-1838-5
    Despite the tremendous growth of the DNA sequencing data in the last decade, our understanding of the human genome is still in its infancy. To understand the implications of genetic variants in the light of population genetics and molecular evolution, we developed a database, PGG.SNV ( https://www.pggsnv.org ), which gives much higher weight to previously under-investigated indigenous populations in Asia. PGG.SNV archives 265 million SNVs across 220,147 present-day genomes and 1018 ancient genomes, including 1009 newly sequenced genomes, representing 977 global populations. Moreover, estimation of population genetic diversity and evolutionary parameters is available in PGG.SNV, a unique feature compared with other databases.
    Matched MeSH terms: Databases, Genetic*
  11. Choo SW, Wee WY, Ngeow YF, Mitchell W, Tan JL, Wong GJ, et al.
    Sci Rep, 2014;4:4061.
    PMID: 24515248 DOI: 10.1038/srep04061
    Mycobacterium abscessus (Ma) is an emerging human pathogen that causes both soft tissue infections and systemic disease. We present the first comparative whole-genome study of Ma strains isolated from patients of wide geographical origin. We found a high proportion of accessory strain-specific genes indicating an open, non-conservative pan-genome structure, and clear evidence of rapid phage-mediated evolution. Although we found fewer virulence factors in Ma compared to M. tuberculosis, our data indicated that Ma evolves rapidly and therefore should be monitored closely for the acquisition of more pathogenic traits. This comparative study provides a better understanding of Ma and forms the basis for future functional work on this important pathogen.
    Matched MeSH terms: Databases, Genetic
  12. Choo SW, Ang MY, Dutta A, Tan SY, Siow CC, Heydari H, et al.
    Sci Rep, 2015 Dec 15;5:18227.
    PMID: 26666970 DOI: 10.1038/srep18227
    Mycobacterium spp. are renowned for being the causative agent of diseases like leprosy, Buruli ulcer and tuberculosis in human beings. With more and more mycobacterial genomes being sequenced, any knowledge generated from comparative genomic analysis would provide better insights into the biology, evolution, phylogeny and pathogenicity of this genus, thus helping in better management of diseases caused by Mycobacterium spp. With this motivation, we constructed MycoCAP, a new comparative analysis platform dedicated to the important genus Mycobacterium. This platform currently provides information on 2108 genome sequences of at least 55 Mycobacterium spp. A number of intuitive web-based tools have been integrated in MycoCAP, particularly for comparative analysis, including the PGC tool for comparison between two genomes, PathoProT for comparing the virulence genes among the Mycobacterium strains, the SuperClassification tool for the phylogenic classification of the Mycobacterium strains, and a specialized classification system for strains of Mycobacterium abscessus. We hope the broad range of functions and easy-to-use tools provided in MycoCAP makes it an invaluable analysis platform to speed up the research discovery on mycobacteria for researchers. Database URL: http://mycobacterium.um.edu.my.
    Matched MeSH terms: Databases, Genetic
  13. Choo SW, Ang MY, Fouladi H, Tan SY, Siow CC, Mutha NV, et al.
    BMC Genomics, 2014;15:600.
    PMID: 25030426 DOI: 10.1186/1471-2164-15-600
    Helicobacter is a genus of Gram-negative bacteria, possessing a characteristic helical shape that has been associated with a wide spectrum of human diseases. Although much research has been done on Helicobacter and many genomes have been sequenced, currently there is no specialized Helicobacter genomic resource and analysis platform to facilitate analysis of these genomes. With the increasing number of Helicobacter genomes being sequenced, comparative genomic analysis on members of this species will provide further insights on their taxonomy, phylogeny, pathogenicity and other information that may contribute to better management of diseases caused by Helicobacter pathogens.
    Matched MeSH terms: Databases, Genetic*
  14. Cheah BH, Nadarajah K, Divate MD, Wickneswari R
    BMC Genomics, 2015;16:692.
    PMID: 26369665 DOI: 10.1186/s12864-015-1851-3
    Developing drought-tolerant rice varieties with higher yield under water-stressed conditions provides a viable solution to the serious yield-reduction impact of drought. Understanding the molecular regulation of this polygenic trait is crucial for the eventual success of rice molecular breeding programmes. microRNAs have received tremendous attention recently due to their importance in negative regulation. In plants, apart from regulating developmental and physiological processes, microRNAs have also been associated with different biotic and abiotic stresses. Hence, we chose to analyze the differential expression profiles of microRNAs in three drought-treated rice varieties: Vandana (drought-tolerant), Aday Sel (drought-tolerant) and IR64 (drought-susceptible) in greenhouse conditions via high-throughput sequencing.
    Matched MeSH terms: Databases, Genetic
  15. Ummu Atiqah Mohd Roslan
    MATEMATIKA, 2018;34(1):13-21.
    MyJurnal
    Markov map is one example of interval maps where it is a piecewise expanding map and obeys the Markov property. One well-known example of Markov map is the doubling map, a map which has two subintervals with equal partitions. In this paper, we are interested to investigate another type of Markov map, the so-called skewed doubling map. This map is a more generalized map than the doubling map. Thus, the aims of this paper are to find the fixed points as well as the periodic points for the skewed doubling map and to investigate the sensitive dependence on initial conditions of this map. The method considered here is the cobweb diagram. Numerical results suggest that there exist dense of periodic orbits for this map. The sensitivity of this map to initial conditions is also verified where small differences in initial conditions give different behaviour of the orbits in the map.
    Matched MeSH terms: Databases, Genetic
  16. Chuon C, Takahashi K, Matsuo J, Katayama K, Yamamoto C, Ko K, et al.
    Sci Rep, 2019 08 21;9(1):12186.
    PMID: 31434918 DOI: 10.1038/s41598-019-48304-z
    Approximately 75% of hepatocellular carcinomas (HCC) occur in Asia; core promoter mutations are associated with HCC in HBV genotype C, the dominant genotype in Cambodia. We analyzed these mutations in Cambodian residents and compared them with HBV full genomes registered in GenBank. We investigated the characteristics of 26 full-length HBV genomes among 35 residents positive for hepatitis B surface antigen in Siem Reap province, Cambodia. Genotype C1 was dominant (92.3%, 24/26), with one case of B2 and B4 each. Multiple mutations were confirmed in 24 Cambodian C1 isolates, especially double mutation at A1762T/G1764A in 18 isolates (75.0%), and combination mutation at C1653T and/or T1753V and A1762T/G1764A in 14 isolates (58.3%). In phylogenetic analysis, 16 of 24 isolates were located in the cluster with Laos, Thailand, and Malaysia. In 340 GenBank-registered C1 strains, 113 (33.2%) had combination mutation amongst which 16.5%, 34.2%, and 95.2% were found in ASC, chronic hepatitis, and liver cirrhosis (LC)/HCC respectively (P 
    Matched MeSH terms: Databases, Genetic
  17. Tan CH, Tan KY
    Toxins (Basel), 2021 02 09;13(2).
    PMID: 33572266 DOI: 10.3390/toxins13020127
    Envenomation resulting from sea snake bite is a highly lethal health hazard in Southeast Asia. Although commonly caused by sea snakes of Hydrophiinae, each species is evolutionarily distinct and thus, unveiling the toxin gene diversity within individual species is important. Applying next-generation sequencing, this study investigated the venom-gland transcriptome of Hydrophis curtus (spine-bellied sea snake) from Penang, West Malaysia. The transcriptome was de novo assembled, followed by gene annotation and sequence analyses. Transcripts with toxin annotation were only 96 in number but highly expressed, constituting 48.18% of total FPKM in the overall transcriptome. Of the 21 toxin families, three-finger toxins (3FTX) were the most abundantly expressed and functionally diverse, followed by phospholipases A2. Lh_FTX001 (short neurotoxin) and Lh_FTX013 (long neurotoxin) were the most dominant 3FTXs expressed, consistent with the pathophysiology of envenomation. Lh_FTX001 and Lh_FTX013 were variable in amino acid compositions and predicted epitopes, while Lh_FTX001 showed high sequence similarity with the short neurotoxin from Hydrophis schistosus, supporting the cross-neutralization effect of Sea Snake Antivenom. Other toxins of low gene expression, for example, snake venom metalloproteinases and L-amino acid oxidases not commonly studied in sea snake venom, were also identified, enriching the knowledge base of sea snake toxins for future study.
    Matched MeSH terms: Databases, Genetic
  18. Low JZB, Khang TF, Tammi MT
    BMC Bioinformatics, 2017 12 28;18(Suppl 16):575.
    PMID: 29297307 DOI: 10.1186/s12859-017-1974-4
    BACKGROUND: In current statistical methods for calling differentially expressed genes in RNA-Seq experiments, the assumption is that an adjusted observed gene count represents an unknown true gene count. This adjustment usually consists of a normalization step to account for heterogeneous sample library sizes, and then the resulting normalized gene counts are used as input for parametric or non-parametric differential gene expression tests. A distribution of true gene counts, each with a different probability, can result in the same observed gene count. Importantly, sequencing coverage information is currently not explicitly incorporated into any of the statistical models used for RNA-Seq analysis.

    RESULTS: We developed a fast Bayesian method which uses the sequencing coverage information determined from the concentration of an RNA sample to estimate the posterior distribution of a true gene count. Our method has better or comparable performance compared to NOISeq and GFOLD, according to the results from simulations and experiments with real unreplicated data. We incorporated a previously unused sequencing coverage parameter into a procedure for differential gene expression analysis with RNA-Seq data.

    CONCLUSIONS: Our results suggest that our method can be used to overcome analytical bottlenecks in experiments with limited number of replicates and low sequencing coverage. The method is implemented in CORNAS (Coverage-dependent RNA-Seq), and is available at https://github.com/joel-lzb/CORNAS .

    Matched MeSH terms: Databases, Genetic*
  19. Behrooz Gharleghi, Abu Hassan Shaari Md Nor, Tamat Sarmidi
    Sains Malaysiana, 2014;43:1609-1622.
    Linear time series models are not able to capture the behaviour of many financial time series, as in the cases of exchange rates and stock market data. Some phenomena, such as volatility and structural breaks in time series data, cannot be modelled implicitly using linear time series models. Therefore, nonlinear time series models are typically designed to accommodate such nonlinear features. In the present study, a nonlinearity test and a structural change test are used to detect the nonlinearity and the break date in three ASEAN currencies, namely the Indonesian Rupiah (IDR), the Malaysian Ringgit (MYR) and the Thai Baht (THB). The study finds that the null hypothesis of linearity is rejected and that evidence of structural breaks exists in the exchange rate series. Therefore, the decision to use the self-exciting threshold autoregressive (SETAR) model in the present study is justified. The results showed that the SETAR model, as a regime switching model, can explain abrupt changes in a time series. To evaluate the prediction performance of the SETAR model, an Autoregressive Integrated Moving Average (ARIMA) model is used as a benchmark. In order to increase the accuracy of prediction, both models are combined with an exponential generalised autoregressive conditional heteroscedasticity (EGARCH) model. The prediction results showed that the combined SETAR-EGARCH model performs better than the ARIMA model and the combined ARIMA and EGARCH model. The results indicated that nonlinear models give a better fit than linear models.
    Matched MeSH terms: Databases, Genetic
  20. Chen J, Teo YY, Toh DS, Sung C
    Pharmacogenomics, 2010 Aug;11(8):1077-94.
    PMID: 20712526 DOI: 10.2217/pgs.10.79
    The frequencies of alleles implicated in drug-response variability provide vital information for public health management. Differences in frequencies between genetically diverse groups of individuals can hamper drug assessments, particularly in populations where clinical data are not readily available.
    Matched MeSH terms: Databases, Genetic*
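
Illustrative note on entry 15 (skewed doubling map): the abstract describes a two-branch piecewise expanding map and its sensitive dependence on initial conditions, but does not give the formula. The sketch below assumes one common form, f(x) = x/p on [0, p) and f(x) = (x - p)/(1 - p) on [p, 1), which reduces to the ordinary doubling map when p = 0.5. The function names, the value p = 0.3, and the starting points are hypothetical choices for illustration, not taken from the paper.

```python
def skewed_doubling_map(x: float, p: float = 0.3) -> float:
    """Assumed form of the skewed doubling map on [0, 1).

    Left branch:  x / p             for 0 <= x < p
    Right branch: (x - p) / (1 - p) for p <= x < 1
    With p = 0.5 this is the usual doubling map 2x mod 1.
    """
    if not 0.0 < p < 1.0:
        raise ValueError("p must lie strictly between 0 and 1")
    if x < p:
        return x / p
    return (x - p) / (1.0 - p)


def orbit(x0: float, steps: int, p: float = 0.3) -> list[float]:
    """Return [x0, f(x0), f(f(x0)), ...] with `steps` iterations."""
    points = [x0]
    for _ in range(steps):
        points.append(skewed_doubling_map(points[-1], p))
    return points


def max_separation(x0: float, delta: float, steps: int, p: float = 0.3) -> float:
    """Largest gap between the orbits started at x0 and x0 + delta."""
    first = orbit(x0, steps, p)
    second = orbit(x0 + delta, steps, p)
    return max(abs(a - b) for a, b in zip(first, second))


if __name__ == "__main__":
    # x = 0 is a fixed point of the assumed map for every p.
    print(skewed_doubling_map(0.0))
    # Two starting points 1e-9 apart end up far apart within 60 steps,
    # because both branches have slope greater than 1.
    print(max_separation(0.2, 1e-9, 60))
```

Because both branches stretch distances (slopes 1/p and 1/(1 - p) exceed 1), a tiny initial difference grows at each step until the two orbits fall on different branches, which is the behaviour the abstract reports from its cobweb diagrams.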