MyMedR

Displaying publications 1 - 20 of 35 in total

Fulltext Oral cancer prognosis based on clinicopathologic and genomic markers using a hybrid of feature selection and machine learning methods

Chang SW, Abdul-Kareem S, Merican AF, Zain RB

BMC Bioinformatics, 2013;14:170.
PMID: 23725313 DOI: 10.1186/1471-2105-14-170

Machine learning techniques are becoming useful as an alternative approach to conventional medical diagnosis or prognosis as they are good at handling noisy and incomplete data, and significant results can be attained despite a small sample size. Traditionally, clinicians make prognostic decisions based on clinicopathologic markers. However, it is not easy for even the most skilful clinician to arrive at an accurate prognosis using these markers alone. Thus, there is a need to use genomic markers to improve the accuracy of prognosis. The main aim of this research is to apply a hybrid of feature selection and machine learning methods in oral cancer prognosis based on the parameters of the correlation of clinicopathologic and genomic markers.
Fulltext In-vitro diagnosis of single and poly microbial species targeted for diabetic foot infection using e-nose technology

Yusuf N, Zakaria A, Omar MI, Shakaff AY, Masnan MJ, Kamarudin LM, et al.

BMC Bioinformatics, 2015;16:158.
PMID: 25971258 DOI: 10.1186/s12859-015-0601-5

Effective management of patients with diabetic foot infection is a crucial concern. A delay in prescribing an appropriate antimicrobial agent can lead to amputation or life-threatening complications. Thus, this electronic nose (e-nose) technique will provide a diagnostic tool that will allow for rapid and accurate identification of a pathogen.
Fulltext Revealing the functionality of hypothetical protein KPN00728 from Klebsiella pneumoniae MGH78578: molecular dynamics simulation approaches

Choi SB, Normi YM, Wahab HA

BMC Bioinformatics, 2011;12 Suppl 13:S11.
PMID: 22372825 DOI: 10.1186/1471-2105-12-S13-S11

Previously, the hypothetical protein KPN00728 from Klebsiella pneumoniae MGH78578 was identified as the succinate dehydrogenase (SDH) chain C subunit via structural prediction and molecular docking simulation studies. However, due to the limitations of docking simulation, an in-depth understanding of how SDH interaction occurs across the transmembrane of mitochondria could not be provided.
Fulltext Structure-based and ligand-based virtual screening of novel methyltransferase inhibitors of the dengue virus

Lim SV, Rahman MB, Tejo BA

BMC Bioinformatics, 2011;12 Suppl 13:S24.
PMID: 22373153 DOI: 10.1186/1471-2105-12-S13-S24

The dengue virus is the most significant arthropod-borne human pathogen, and an increasing number of cases have been reported over the last few decades. Currently neither vaccines nor drugs against the dengue virus are available. NS5 methyltransferase (MTase), which is located on the surface of the dengue virus and assists in viral attachment to the host cell, is a promising antiviral target. In order to search for novel inhibitors of NS5 MTase, we performed a computer-aided virtual screening of more than 5 million commercially available chemical compounds using two approaches: i) structure-based screening using the crystal structure of NS5 MTase and ii) ligand-based screening using active ligands of NS5 MTase. Structure-based screening was performed using the LIDAEUS (LIgand Discovery At Edinburgh UniverSity) program. The ligand-based screening was carried out using the EDULISS (EDinburgh University LIgand Selection System) program.
Fulltext Discovery of a new class of inhibitors for the protein arginine deiminase type 4 (PAD4) by structure-based virtual screening

Teo CY, Shave S, Chor AL, Salleh AB, Rahman MB, Walkinshaw MD, et al.

BMC Bioinformatics, 2012;13 Suppl 17:S4.
PMID: 23282142 DOI: 10.1186/1471-2105-13-S17-S4

BACKGROUND: Rheumatoid arthritis (RA) is an autoimmune disease with unknown etiology. Anticitrullinated protein autoantibody has been documented as a highly specific autoantibody associated with RA. Protein arginine deiminase type 4 (PAD4) is the enzyme responsible for catalyzing the conversion of peptidylarginine into peptidylcitrulline. PAD4 is a new therapeutic target for RA treatment. In order to search for inhibitors of PAD4, structure-based virtual screening was performed using LIDAEUS (LIgand Discovery At Edinburgh UniverSity). Potential inhibitors were screened experimentally by inhibition assays.
RESULTS: Twenty-two of the top-ranked water-soluble compounds were selected for inhibitory screening against PAD4. Three compounds showed significant inhibition of PAD4 and their IC50 values were investigated. The structures of the three compounds show no resemblance to previously discovered PAD4 inhibitors, nor to existing drugs for RA treatment.
CONCLUSION: Three compounds were discovered as potential inhibitors of PAD4 by virtual screening. The compounds are commercially available and can be used as scaffolds to design more potent inhibitors against PAD4.
Fulltext AnkPlex: algorithmic structure for refinement of near-native ankyrin-protein docking

Wisitponchai T, Shoombuatong W, Lee VS, Kitidee K, Tayapiwatana C

BMC Bioinformatics, 2017 Apr 19;18(1):220.
PMID: 28424069 DOI: 10.1186/s12859-017-1628-6

BACKGROUND: Computational analysis of protein-protein interactions provides crucial information for increasing binding affinity without a change in basic conformation. Several docking programs have been used to predict near-native poses of the protein-protein complex within the top 10 rankings. Universal criteria for discriminating the near-native pose are not available because there are several classes of recognition proteins. Explicit criteria for identifying the near-native pose of ankyrin-protein complexes (APKs) have not yet been reported.
RESULTS: In this study, we established an ensemble computational model for discriminating the near-native docking pose of APKs named "AnkPlex". A dataset of APKs was generated from seven X-ray APKs, which consisted of 3 internal domains, using the reliable docking tool ZDOCK. The dataset was composed of 669 near-native and 44,334 non-near-native poses, and it was used to generate eleven informative features. Subsequently, a re-scoring rank was generated by AnkPlex using a combination of a decision tree algorithm and logistic regression. AnkPlex achieved superior efficiency, placing ≥1 near-native complex in the top 10 rankings for nine X-ray complexes, compared to ZDOCK, which did so for only six X-ray complexes. In addition, feature analysis demonstrated that the van der Waals feature was the dominant feature for distinguishing the near-native pose among the potential ankyrin-protein docking poses.
CONCLUSION: The AnkPlex model succeeded in predicting near-native docking poses and led to the discovery of informative characteristics that could further improve our understanding of the ankyrin-protein complex. Our computational study could be useful for predicting the near-native poses of binding proteins and desired targets, especially for ankyrin-protein complexes. The AnkPlex web server is freely accessible at http://ankplex.ams.cmu.ac.th .
Fulltext Towards big data science in the decade ahead from ten years of InCoB and the 1st ISCB-Asia Joint Conference

Ranganathan S, Schönbach C, Kelso J, Rost B, Nathan S, Tan TW

BMC Bioinformatics, 2011;12 Suppl 13:S1.
PMID: 22372736 DOI: 10.1186/1471-2105-12-S13-S1

The 2011 International Conference on Bioinformatics (InCoB), the annual scientific conference of the Asia-Pacific Bioinformatics Network (APBioNet), is hosted in Kuala Lumpur, Malaysia, and is co-organized with the first ISCB-Asia conference of the International Society for Computational Biology (ISCB). InCoB and the sequencing of the human genome are both celebrating their tenth anniversaries, and InCoB's goalposts for the next decade, implementing standards in bioinformatics and globally distributed computational networks, will be discussed and adopted at this conference. Of the 49 manuscripts (selected from 104 submissions) accepted to BMC Genomics and BMC Bioinformatics conference supplements, 24 are featured in this issue, covering software tools, genome/proteome analysis, systems biology (networks, pathways, bioimaging) and drug discovery and design.
Fulltext Emerging strengths in Asia Pacific bioinformatics

Ranganathan S, Hsu WL, Yang UC, Tan TW

BMC Bioinformatics, 2008;9 Suppl 12:S1.
PMID: 19091008 DOI: 10.1186/1471-2105-9-S12-S1

The 2008 annual conference of the Asia Pacific Bioinformatics Network (APBioNet), Asia's oldest bioinformatics organisation set up in 1998, was organized as the 7th International Conference on Bioinformatics (InCoB), jointly with the Bioinformatics and Systems Biology in Taiwan (BIT 2008) Conference, Oct. 20-23, 2008 at Taipei, Taiwan. Besides bringing together scientists from the field of bioinformatics in this region, InCoB is actively involving researchers from the area of systems biology, to facilitate greater synergy between these two groups. Marking the 10th Anniversary of APBioNet, this InCoB 2008 meeting followed on from a series of successful annual events in Bangkok (Thailand), Penang (Malaysia), Auckland (New Zealand), Busan (South Korea), New Delhi (India) and Hong Kong. Additionally, tutorials and the Workshop on Education in Bioinformatics and Computational Biology (WEBCB) immediately prior to the 20th Federation of Asian and Oceanian Biochemists and Molecular Biologists (FAOBMB) Taipei Conference provided ample opportunity for inducting mainstream biochemists and molecular biologists from the region into a greater level of awareness of the importance of bioinformatics in their craft. In this editorial, we provide a brief overview of the peer-reviewed manuscripts accepted for publication herein, grouped into thematic areas. As the regional research expertise in bioinformatics matures, the papers fall into thematic areas, illustrating the specific contributions made by APBioNet to global bioinformatics efforts.
Fulltext Bioinformatics research in the Asia Pacific: a 2007 update

Ranganathan S, Gribskov M, Tan TW

BMC Bioinformatics, 2008;9 Suppl 1:S1.
PMID: 18315840 DOI: 10.1186/1471-2105-9-S1-S1

We provide a 2007 update on bioinformatics research in the Asia-Pacific from the Asia Pacific Bioinformatics Network (APBioNet), Asia's oldest bioinformatics organisation, set up in 1998. Since 2002, APBioNet has organized the annual International Conference on Bioinformatics (InCoB), bringing together scientists working in the field of bioinformatics in the region. This year, the InCoB2007 Conference was organized as the 6th annual conference of the Asia-Pacific Bioinformatics Network, on Aug. 27-30, 2007 in Hong Kong, following a series of successful events in Bangkok (Thailand), Penang (Malaysia), Auckland (New Zealand), Busan (South Korea) and New Delhi (India). Besides the scientific meeting in Hong Kong, the satellite events organized were a pre-conference training workshop in Hanoi, Vietnam and a post-conference workshop in Nansha, China. This Introduction provides a brief overview of the peer-reviewed manuscripts accepted for publication in this Supplement. We have organized the papers into thematic areas, highlighting the growing contribution of research excellence from this region to global bioinformatics endeavours.
Fulltext CORNAS: coverage-dependent RNA-Seq analysis of gene expression data without biological replicates

Low JZB, Khang TF, Tammi MT

BMC Bioinformatics, 2017 Dec 28;18(Suppl 16):575.
PMID: 29297307 DOI: 10.1186/s12859-017-1974-4

BACKGROUND: In current statistical methods for calling differentially expressed genes in RNA-Seq experiments, the assumption is that an adjusted observed gene count represents an unknown true gene count. This adjustment usually consists of a normalization step to account for heterogeneous sample library sizes, and then the resulting normalized gene counts are used as input for parametric or non-parametric differential gene expression tests. A distribution of true gene counts, each with a different probability, can result in the same observed gene count. Importantly, sequencing coverage information is currently not explicitly incorporated into any of the statistical models used for RNA-Seq analysis.
RESULTS: We developed a fast Bayesian method which uses the sequencing coverage information determined from the concentration of an RNA sample to estimate the posterior distribution of a true gene count. Our method has better or comparable performance compared to NOISeq and GFOLD, according to the results from simulations and experiments with real unreplicated data. We incorporated a previously unused sequencing coverage parameter into a procedure for differential gene expression analysis with RNA-Seq data.
CONCLUSIONS: Our results suggest that our method can be used to overcome analytical bottlenecks in experiments with a limited number of replicates and low sequencing coverage. The method is implemented in CORNAS (Coverage-dependent RNA-Seq), and is available at https://github.com/joel-lzb/CORNAS .
Fulltext A comparative study of the SVM and K-nn machine learning algorithms for the diagnosis of respiratory pathologies using pulmonary acoustic signals

Palaniappan R, Sundaraj K, Sundaraj S

BMC Bioinformatics, 2014;15:223.
PMID: 24970564 DOI: 10.1186/1471-2105-15-223

Pulmonary acoustic parameters extracted from recorded respiratory sounds provide valuable information for the detection of respiratory pathologies. The automated analysis of pulmonary acoustic signals can serve as a differential diagnosis tool for medical professionals, a learning tool for medical students, and a self-management tool for patients. In this context, we intend to evaluate and compare the performance of the support vector machine (SVM) and K-nearest neighbour (K-nn) classifiers in diagnosing respiratory pathologies using respiratory sounds from the R.A.L.E database.
Fulltext Predicting probable Alzheimer's disease using linguistic deficits and biomarkers

Orimaye SO, Wong JS, Golden KJ, Wong CP, Soyiri IN

BMC Bioinformatics, 2017 Jan 14;18(1):34.
PMID: 28088191 DOI: 10.1186/s12859-016-1456-0

BACKGROUND: The manual diagnosis of neurodegenerative disorders such as Alzheimer's disease (AD) and related Dementias has been a challenge. Currently, these disorders are diagnosed using specific clinical diagnostic criteria and neuropsychological examinations. The use of several Machine Learning algorithms to build automated diagnostic models using low-level linguistic features resulting from verbal utterances could aid diagnosis of patients with probable AD from a large population. For this purpose, we developed different Machine Learning models on the DementiaBank language transcript clinical dataset, consisting of 99 patients with probable AD and 99 healthy controls.
RESULTS: Our models learned several syntactic, lexical, and n-gram linguistic biomarkers to distinguish the probable AD group from the healthy group. In contrast to the healthy group, we found that the probable AD patients had significantly less usage of syntactic components and significantly higher usage of lexical components in their language. Also, we observed a significant difference in the use of n-grams, as the healthy group were able to identify and make sense of more objects in their n-grams than the probable AD group. As such, our best diagnostic model, built using Support Vector Machines (SVM), significantly distinguished the probable AD group from the healthy elderly group with a better Area Under the Receiver Operating Characteristic Curve (AUC).
CONCLUSIONS: Experimental and statistical evaluations suggest that using ML algorithms for learning linguistic biomarkers from the verbal utterances of elderly individuals could help the clinical diagnosis of probable AD. We emphasise that the best ML model for predicting the disease group combines significant syntactic, lexical and top n-gram features. However, there is a need to train the diagnostic models on larger datasets, which could lead to a better AUC and clinical diagnosis of probable AD.
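Note on the AUC metric used above: the following is a minimal sketch of how an area under the ROC curve can be computed from classifier scores, using the rank-based (Mann-Whitney) formulation. It is not the authors' code; the function name roc_auc and the example labels and scores are hypothetical and chosen only for illustration.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison.

    labels: iterable of 0/1 class labels (1 = positive, e.g. probable AD).
    scores: iterable of classifier scores, higher meaning "more positive".
    AUC equals the probability that a randomly chosen positive case
    scores higher than a randomly chosen negative case (ties count 0.5).
    """
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    # Hypothetical scores for two positive and two negative cases.
    print(roc_auc([1, 1, 0, 0], [0.9, 0.3, 0.4, 0.1]))
```

The pairwise loop is quadratic in the number of cases, which is acceptable for a dataset of the size described (99 patients and 99 controls) but would normally be replaced by a sort-based computation for larger data.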
Robustness evaluations of pathway activity inference methods on gene expression data

Hui TX, Kasim S, Aziz IA, Fudzee MFM, Haron NS, Sutikno T, et al.

BMC Bioinformatics, 2024 Jan 12;25(1):23.
PMID: 38216898 DOI: 10.1186/s12859-024-05632-w

BACKGROUND: With the exponential growth of high-throughput technologies, multiple pathway analysis methods have been proposed to estimate pathway activities from gene expression profiles. These pathway activity inference methods can be divided into two main categories: non-Topology-Based (non-TB) and Pathway Topology-Based (PTB) methods. Although some review and survey articles discussed the topic from different aspects, there is a lack of systematic assessment and comparisons on the robustness of these approaches.
RESULTS: Thus, this study presents comprehensive robustness evaluations of seven widely used pathway activity inference methods using six cancer datasets based on two assessments. The first assessment seeks to investigate the robustness of pathway activity in pathway activity inference methods, while the second assessment aims to assess the robustness of risk-active pathways and genes predicted by these methods. The mean reproducibility power and total number of identified informative pathways and genes were evaluated. Based on the first assessment, the mean reproducibility power of pathway activity inference methods generally decreased as the number of pathway selections increased. Entropy-based Directed Random Walk (e-DRW) distinctly outperformed other methods in exhibiting the greatest reproducibility power across all cancer datasets. On the other hand, the second assessment shows that no methods provide satisfactory results across datasets.
CONCLUSION: However, PTB methods generally appear to perform better in producing greater reproducibility power and identifying potential cancer markers compared to non-TB methods.
Fulltext A review of machine learning methods to predict the solubility of overexpressed recombinant proteins in Escherichia coli

Habibi N, Mohd Hashim SZ, Norouzi A, Samian MR

BMC Bioinformatics, 2014;15:134.
PMID: 24885721 DOI: 10.1186/1471-2105-15-134

Over the last 20 years in biotechnology, the production of recombinant proteins has been a crucial bioprocess in both the biopharmaceutical and research arenas in terms of human health, scientific impact and economic volume. Although logical strategies of genetic engineering have been established, protein overexpression is still an art. In particular, heterologous expression is often hindered by low production levels and frequent failure for opaque reasons. The problem is accentuated because there is no generic solution available to enhance heterologous overexpression. For a given protein, the extent of its solubility can indicate the quality of its function. Over 30% of synthesized proteins are not soluble. Under given experimental conditions (temperature, expression host, etc.), protein solubility is ultimately determined by the protein's sequence. To date, numerous machine learning methods have been proposed to predict the solubility of a protein merely from its amino acid sequence. Despite 20 years of research on the matter, no comprehensive review of the published methods is available.
Fulltext Assessment of predictive models for chlorophyll-a concentration of a tropical lake

Malek S, Syed Ahmad SM, Singh SK, Milow P, Salleh A

BMC Bioinformatics, 2011;12 Suppl 13:S12.
PMID: 22372859 DOI: 10.1186/1471-2105-12-S13-S12

This study assesses four predictive ecological models: Fuzzy Logic (FL), Recurrent Artificial Neural Network (RANN), Hybrid Evolutionary Algorithm (HEA) and multiple linear regression (MLR), to forecast chlorophyll-a concentration using limnological data from 2001 through 2004 of the unstratified, shallow, oligotrophic to mesotrophic tropical Putrajaya Lake (Malaysia). The performance of the models is assessed using Root Mean Square Error (RMSE), correlation coefficient (r), and Area under the Receiver Operating Characteristic (ROC) curve (AUC). Chlorophyll-a has been used to estimate algal biomass in aquatic ecosystems as it is common to most algae. Algal biomass indicates the trophic status of a water body. Chlorophyll-a is therefore an effective indicator for monitoring eutrophication, which is a common problem of lakes and reservoirs all over the world. Assessments of these predictive models are necessary for developing a reliable algorithm to estimate chlorophyll-a concentration for eutrophication management of tropical lakes.
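Note on the evaluation metrics named above: the sketch below shows the standard definitions of RMSE and the Pearson correlation coefficient (r) for comparing observed and predicted values. It is illustrative only and is not taken from the paper; the function names and the sample values are assumptions.

```python
import math


def rmse(observed, predicted):
    """Root Mean Square Error between paired observed and predicted values."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("inputs must have equal length of at least 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)


if __name__ == "__main__":
    # Hypothetical chlorophyll-a values (arbitrary units), not real data.
    observed = [2.1, 3.4, 5.0, 4.2]
    predicted = [2.0, 3.9, 4.6, 4.4]
    print(rmse(observed, predicted), pearson_r(observed, predicted))
```

A lower RMSE indicates smaller prediction error, while r closer to 1 indicates that predictions track the observed trend; the two can disagree, which is one reason to report both.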
Fulltext A preliminary study on automated freshwater algae recognition and classification system

Mosleh MA, Manssor H, Malek S, Milow P, Salleh A

BMC Bioinformatics, 2012;13 Suppl 17:S25.
PMID: 23282059 DOI: 10.1186/1471-2105-13-S17-S25

Freshwater algae can be used as indicators to monitor freshwater ecosystem condition. Algae react quickly and predictably to a broad range of pollutants and thus provide early signals of a worsening environment. This study was carried out to develop a computer-based image processing technique to automatically detect, recognize, and identify algae genera from the divisions Bacillariophyta, Chlorophyta and Cyanobacteria in Putrajaya Lake. The literature shows that most automated analysis and identification of algae images has been limited to only one type of algae. An automated identification system for tropical freshwater algae does not yet exist, and this study aims in part to fill this gap.
An analysis of simple computational strategies to facilitate the design of functional molecular information processors

Lee Y, Roslan R, Azizan S, Firdaus-Raih M, Ramlan EI

BMC Bioinformatics, 2016 Oct 28;17(1):438.
PMID: 27793081

BACKGROUND: Biological macromolecules (DNA, RNA and proteins) are capable of processing physical or chemical inputs to generate outputs that parallel conventional Boolean logical operators. However, the design of functional modules that will enable these macromolecules to operate as synthetic molecular computing devices is challenging.
RESULTS: Using three simple heuristics, we designed RNA sensors that can mimic the function of a seven-segment display (SSD). Ten independent and orthogonal sensors representing the numerals 0 to 9 are designed and constructed. Each sensor has its own unique oligonucleotide binding site region that is activated uniquely by a specific input. Each operator was subjected to a stringent in silico filtering. Random sensors were selected and functionally validated via ribozyme self cleavage assays that were visualized via electrophoresis.
CONCLUSIONS: By utilising simple permutation and randomisation in the sequence design phase, we have developed functional RNA sensors thus demonstrating that even the simplest of computational methods can greatly aid the design phase for constructing functional molecular devices.
Fulltext Automated craniofacial landmarks detection on 3D image using geometry characteristics information

Abu A, Ngo CG, Abu-Hassan NIA, Othman SA

BMC Bioinformatics, 2019 Feb 04;19(Suppl 13):548.
PMID: 30717658 DOI: 10.1186/s12859-018-2548-9

BACKGROUND: Indirect anthropometry (IA) is a craniofacial anthropometry method in which measurements are performed on digital facial images. In order to get the linear measurements, a few definable points on the structures of individual facial images have to be plotted as landmark points. Currently, most anthropometric studies use landmark points that are manually plotted on a 3D facial image by the examiner. This method is time-consuming and leads to human biases, which vary both within and between examiners when large data sets are involved. Biased judgment also leads to a wider gap in measurement error. Thus, this work aims to automate the process of landmark detection to help enhance the accuracy of measurement. In this work, an automated craniofacial landmarks (ACL) system for 3D facial images was developed using geometry characteristics information to identify the nasion (n), pronasale (prn), subnasale (sn), alare (al), labiale superius (ls), stomion (sto), labiale inferius (li), and chelion (ch). These landmarks were detected on the 3D facial image in .obj file format. The IA was also performed by manually plotting the craniofacial landmarks using Mirror software. In both methods, once all landmarks were detected, the eight linear measurements were then extracted. A paired t-test was performed to check the validity of ACL (i) between the subjects and (ii) between the two methods, by comparing the linear measurements extracted from both ACL and IA. The tests were performed on 60 subjects (30 males and 30 females).
RESULTS: The results on the validity of the ACL against IA between the subjects show accurate detection of n, sn, prn, sto, ls and li landmarks. The paired t-test showed that the seven linear measurements were statistically significant when p
Fulltext Classification of Suncus murinus species complex (Soricidae: Crocidurinae) in Peninsular Malaysia using image analysis and machine learning approaches

Abu A, Leow LK, Ramli R, Omar H

BMC Bioinformatics, 2016 Dec 22;17(Suppl 19):505.
PMID: 28155645 DOI: 10.1186/s12859-016-1362-5

BACKGROUND: Taxonomists frequently identify specimens from various populations based on morphological characteristics and molecular data. This study looks into another invasive process in identification of house shrew (Suncus murinus) using image analysis and machine learning approaches. Thus, an automated identification system is developed to assist and simplify this task. In this study, seven shape-based descriptors, namely area, convex area, major axis length, minor axis length, perimeter, equivalent diameter and extent, are used as features to represent digital images of the skull, comprising dorsal, lateral and jaw views for each specimen. An Artificial Neural Network (ANN) is used as the classifier to classify the skulls of S. murinus based on region (northern and southern populations of Peninsular Malaysia) and sex (adult male and female). Thus, specimen classification using the training data set and identification using the testing data set were performed through two stages of ANNs.
RESULTS: At present, the classifier used has achieved an accuracy of 100% based on skulls' views. Classification and identification to regions and sexes have also attained 72.5%, 87.5% and 80.0% accuracy for dorsal, lateral, and jaw views, respectively. These results show that the shape characteristic features used are substantial because they can differentiate the specimens based on regions and sexes up to an accuracy of 80% and above. Finally, an application was developed and can be used by the scientific community.
CONCLUSIONS: This automated system demonstrates the practicability of using computer-assisted systems in providing an interesting alternative approach for quick and easy identification of unknown species.
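Note on the shape descriptors listed above: the sketch below computes three of the seven (area, extent and equivalent diameter) from a binary image mask, following the definitions commonly used by region-property tools (extent as region area divided by bounding-box area; equivalent diameter as the diameter of a circle with the same area). The abstract does not state which software was used, so this is an assumption, and the function name and the toy mask are hypothetical.

```python
import math


def shape_descriptors(mask):
    """Compute simple shape descriptors from a binary mask.

    mask: list of rows, each a list of 0/1 values; 1 marks object pixels.
    Returns a dict with:
      area                - number of object pixels
      extent              - area / area of the tight bounding box
      equivalent_diameter - diameter of a circle with the same area
    """
    coords = [(r, c)
              for r, row in enumerate(mask)
              for c, value in enumerate(row)
              if value]
    if not coords:
        raise ValueError("mask contains no object pixels")

    area = len(coords)
    rows = [r for r, _ in coords]
    cols = [c for _, c in coords]
    bbox_area = (max(rows) - min(rows) + 1) * (max(cols) - min(cols) + 1)

    return {
        "area": area,
        "extent": area / bbox_area,
        "equivalent_diameter": math.sqrt(4.0 * area / math.pi),
    }


if __name__ == "__main__":
    # Toy L-shaped region standing in for a segmented skull outline.
    toy_mask = [
        [1, 0],
        [1, 1],
    ]
    print(shape_descriptors(toy_mask))
```

The remaining descriptors (convex area, major and minor axis lengths, perimeter) require a convex hull, second-order moments and boundary tracing respectively, and are omitted here for brevity.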
Fulltext Simulation of a Petri net-based model of the terpenoid biosynthesis pathway

Hawari AH, Mohamed-Hussein ZA

BMC Bioinformatics, 2010;11:83.
PMID: 20144236 DOI: 10.1186/1471-2105-11-83

The development and simulation of dynamic models of terpenoid biosynthesis has yielded a systems perspective that provides new insights into how the structure of this biochemical pathway affects compound synthesis. These insights may eventually help identify reactions that could be experimentally manipulated to amplify terpenoid production. In this study, a dynamic model of the terpenoid biosynthesis pathway was constructed based on the Hybrid Functional Petri Net (HFPN) technique. This technique is a fusion of three other extended Petri net techniques, namely Hybrid Petri Net (HPN), Hybrid Dynamic Net (HDN) and Functional Petri Net (FPN).