MyMedR

Random forest for gene selection and microarray data classification

Moorthy K, Mohamad MS

Bioinformation, 2011;7(3):142-6.
PMID: 22125385

A random forest method was selected to perform both gene selection and classification of microarray data. In this embedded method, selecting the smallest possible sets of genes with the lowest error rates is the key factor in achieving the highest classification accuracy. Hence, an improved gene selection method using random forest is proposed to obtain the smallest subset of genes as well as the biggest subset of genes prior to classification. The option for biggest-subset selection assists researchers who intend to use the informative genes for further research. The enhanced random forest gene selection performed better in selecting both the smallest and the biggest subsets of informative genes with the lowest out-of-bag error rates. Furthermore, classification performed on the selected subset of genes using random forest led to lower prediction error rates than the existing method and other similar available methods.
Samira-VP: A simple protein alignment method with rechecking the alphabet vector positions

Fotoohifiroozabadi S, Mohamad MS, Deris S

J Bioinform Comput Biol, 2017 Apr;15(2):1750004.
PMID: 28274174 DOI: 10.1142/S0219720017500044

Protein structure alignment and comparisons that are based on an alphabetical demonstration of protein structure are simpler to run and have faster evaluation processes, but their accuracy is not as reliable as that of three-dimensional (3D) tools. As a 1D method candidate, TS-AMIR used the alphabetic demonstration of secondary-structure elements (SSE) of proteins and compared the assigned letters to each SSE using the [Formula: see text]-gram method. Although the results were comparable to those obtained via geometrical methods, the SSE length and accuracy of adjacency between SSEs were not considered in the comparison process. Therefore, to obtain further information on accuracy of adjacency between SSE vectors, the new approach of assigning text to vectors was adopted according to the spherical coordinate system in the present study. Moreover, dynamic programming was applied in order to account for the length of SSE vectors. Five common datasets were selected for method evaluation. The first three datasets were small, but difficult to align, and the remaining two datasets were used to compare the capability of the proposed method with that of other methods on a large protein dataset. The results showed that the proposed method, as a text-based alignment approach, obtained results comparable to both 1D and 3D methods. It outperformed 1D methods in terms of accuracy and 3D methods in terms of runtime.
A practical illustration of spatial smoothing methods for disconnected regions with INLA: spatial survey on overweight and obesity in Malaysia

Mohamad MS, Abdul Maulud KN, Faes C

Int J Health Geogr, 2023 Jun 21;22(1):14.
PMID: 37344913 DOI: 10.1186/s12942-023-00336-5

BACKGROUND: National prevalence could mask subnational heterogeneity in disease occurrence, and disease mapping is an important tool to illustrate the spatial pattern of disease. However, there is limited information on techniques for the specification of conditional autoregressive models in disease mapping involving disconnected regions. This study explores available techniques for producing district-level prevalence estimates for disconnected regions, using as an example childhood overweight in Malaysia, which consists of the Peninsular and Borneo regions separated by the South China Sea. We used data from Malaysia National Health and Morbidity Survey conducted in 2015. We adopted Bayesian hierarchical modelling using the integrated nested Laplace approximation (INLA) program in R-software to model the spatial distribution of overweight among 6301 children aged 5-17 years across 144 districts located in two disconnected regions. We illustrate different types of spatial models for prevalence mapping across disconnected regions, taking into account the survey design and adjusting for district-level demographic and socioeconomic covariates.
RESULTS: The spatial model with split random effects and a common intercept has the lowest Deviance and Watanabe Information Criteria. There was evidence of a spatial pattern in the prevalence of childhood overweight across districts. An increasing trend in smoothed prevalence of overweight was observed when moving from the east to the west of the Peninsular and Borneo regions. The proportion of Bumiputera ethnicity in the district had a significant negative association with childhood overweight: the higher the proportion of Bumiputera ethnicity in the district, the lower the prevalence of childhood overweight.
CONCLUSION: This study illustrates different available techniques for mapping prevalence across districts in disconnected regions using survey data. These techniques can be utilized to produce reliable subnational estimates for any areas that comprise disconnected regions. Through the example, we learned that the best-fit model was the one that considered the separate variations of the individual regions. We discovered that the occurrence of childhood overweight in Malaysia followed a spatial pattern with an east-west gradient, and we identified districts with high prevalence of overweight. This information could help policy makers make informed decisions for targeted public health interventions in high-risk areas.
A synchronous-asynchronous particle swarm optimisation algorithm

Ab Aziz NA, Mubin M, Mohamad MS, Ab Aziz K

ScientificWorldJournal, 2014;2014:123019.
PMID: 25121109 DOI: 10.1155/2014/123019

In the original particle swarm optimisation (PSO) algorithm, the particles' velocities and positions are updated after the whole swarm performance is evaluated. This algorithm is also known as synchronous PSO (S-PSO). The strength of this update method is in the exploitation of the information. Asynchronous update PSO (A-PSO) has been proposed as an alternative to S-PSO. A particle in A-PSO updates its velocity and position as soon as its own performance has been evaluated. Hence, particles are updated using partial information, leading to stronger exploration. In this paper, we attempt to improve PSO by merging both update methods to utilise the strengths of both methods. The proposed synchronous-asynchronous PSO (SA-PSO) algorithm divides the particles into smaller groups. The best member of a group and the swarm's best are chosen to lead the search. Members within a group are updated synchronously, while the groups themselves are asynchronously updated. Five well-known unimodal functions, four multimodal functions, and a real world optimisation problem are used to study the performance of SA-PSO, which is compared with the performances of S-PSO and A-PSO. The results are statistically analysed and show that the proposed SA-PSO has performed consistently well.
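The difference between the two update orders described in this abstract can be sketched in a few lines. This is a minimal illustration only, not the authors' SA-PSO: the inertia and acceleration coefficients (`w`, `c1`, `c2`), the swarm size, the search range, and the sphere test function are all assumptions chosen for the example, and the grouping step of SA-PSO is not implemented.

```python
import random


def sphere(x):
    """Simple unimodal test function; its minimum is 0 at the origin."""
    return sum(v * v for v in x)


def pso(f, dim=2, n=10, iters=50, synchronous=True, seed=0,
        w=0.7, c1=1.5, c2=1.5):
    """Return (initial best fitness, final best fitness).

    All parameter values are illustrative assumptions.
    """
    rng = random.Random(seed)
    pos = [[rng.uniform(-5, 5) for _ in range(dim)] for _ in range(n)]
    vel = [[0.0] * dim for _ in range(n)]
    pbest = [p[:] for p in pos]
    pbest_f = [f(p) for p in pos]
    g = min(range(n), key=lambda i: pbest_f[i])
    gbest, gbest_f = pbest[g][:], pbest_f[g]
    initial_best = gbest_f

    def evaluate(i):
        # Update the personal best and the swarm best from particle i.
        nonlocal gbest, gbest_f
        fi = f(pos[i])
        if fi < pbest_f[i]:
            pbest[i], pbest_f[i] = pos[i][:], fi
        if fi < gbest_f:
            gbest, gbest_f = pos[i][:], fi

    def move(i):
        # Standard velocity and position update for particle i.
        for d in range(dim):
            vel[i][d] = (w * vel[i][d]
                         + c1 * rng.random() * (pbest[i][d] - pos[i][d])
                         + c2 * rng.random() * (gbest[d] - pos[i][d]))
            pos[i][d] += vel[i][d]

    for _ in range(iters):
        if synchronous:
            # S-PSO: evaluate the whole swarm, then move every particle.
            for i in range(n):
                evaluate(i)
            for i in range(n):
                move(i)
        else:
            # A-PSO: each particle moves as soon as it has been evaluated,
            # so it may use a swarm best built from partial information.
            for i in range(n):
                evaluate(i)
                move(i)
    return initial_best, gbest_f
```

Calling `pso(sphere, synchronous=True)` and `pso(sphere, synchronous=False)` with the same seed runs both variants from the same starting swarm, which makes the effect of the update order easy to compare.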
Optimising the production of succinate and lactate in Escherichia coli using a hybrid of artificial bee colony algorithm and minimisation of metabolic adjustment

Tang PW, Choon YW, Mohamad MS, Deris S, Napis S

J Biosci Bioeng, 2015 Mar;119(3):363-8.
PMID: 25216804 DOI: 10.1016/j.jbiosc.2014.08.004

Metabolic engineering is a research field that focuses on the design of models for metabolism, and uses computational procedures to suggest genetic manipulation. It aims to improve the yield of particular chemical or biochemical products. Several traditional metabolic engineering methods are commonly used to increase the production of a desired target, but the products are always far below their theoretical maximums. Using numerical optimisation algorithms to identify gene knockouts may stall at a local minimum in a multivariable function. This paper proposes a hybrid of the artificial bee colony (ABC) algorithm and the minimisation of metabolic adjustment (MOMA) to predict an optimal set of solutions in order to optimise the production rate of succinate and lactate. The dataset used in this work was from the iJO1366 Escherichia coli metabolic network. The experimental results include the production rate, growth rate and a list of knockout genes. In the comparative analysis, ABCMOMA produced better results than previous works, showing potential for solving genetic engineering problems.
A review for detecting gene-gene interactions using machine learning methods in genetic epidemiology

Koo CL, Liew MJ, Mohamad MS, Salleh AH

Biomed Res Int, 2013;2013:432375.
PMID: 24228248 DOI: 10.1155/2013/432375

A major statistical and computational challenge in genetic epidemiology is to identify and characterize genes that interact with other genes and with environmental factors to influence complex multifactorial diseases. These gene-gene interactions, also known as epistasis, cannot be resolved by traditional statistical methods because of the high dimensionality of the data and the occurrence of multiple polymorphisms. Several machine learning methods have therefore been applied to identify susceptibility genes in common multifactorial diseases, including neural networks (NNs), support vector machines (SVMs), and random forests (RFs). This paper gives an overview of these machine learning methods, describing the methodology of each and its application in detecting gene-gene and gene-environment interactions. Lastly, it discusses the strengths and weaknesses of each method in detecting gene-gene interactions in complex human disease.
A modified binary particle swarm optimization for selecting the small subset of informative genes from gene expression data

Mohamad MS, Omatu S, Deris S, Yoshioka M

IEEE Trans Inf Technol Biomed, 2011 Nov;15(6):813-22.
PMID: 21914573 DOI: 10.1109/TITB.2011.2167756

Gene expression data are expected to be of significant help in the development of efficient cancer diagnosis and classification platforms. Many researchers have recently analyzed gene expression data using various computational intelligence methods in order to select a small subset of informative genes for cancer classification. However, due to the small number of samples compared to the huge number of genes (high dimension), irrelevant genes, and noisy genes, many of the computational methods have difficulty selecting the small subset. Thus, we propose an improved (modified) binary particle swarm optimization to select the small subset of informative genes that is relevant for cancer classification. In this proposed method, we introduce particles' speed to give the rate at which a particle changes its position, and we propose a rule for updating particles' positions. By performing experiments on ten different gene expression datasets, we have found that the performance of the proposed method is superior to previous related works, including the conventional version of binary particle swarm optimization (BPSO), in terms of classification accuracy and the number of selected genes. The proposed method also produces lower running times compared to BPSO.
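For orientation, the conventional BPSO that this abstract uses as a baseline can be sketched as follows. This is the commonly described sigmoid-based update, not the authors' modified rule, which the abstract does not spell out. Each bit marks whether a gene is selected; the coefficients `w`, `c1`, `c2` and the velocity bound `vmax` are illustrative assumptions.

```python
import math
import random


def bpso_update(position, velocity, pbest, gbest, rng,
                w=1.0, c1=2.0, c2=2.0, vmax=4.0):
    """One conventional BPSO step for a single particle.

    position, pbest, gbest: lists of 0/1 flags (1 = gene selected).
    velocity: list of floats. All coefficient values are assumptions.
    """
    new_position, new_velocity = [], []
    for x, v, p, g in zip(position, velocity, pbest, gbest):
        v = w * v + c1 * rng.random() * (p - x) + c2 * rng.random() * (g - x)
        v = max(-vmax, min(vmax, v))        # clamp the velocity
        prob = 1.0 / (1.0 + math.exp(-v))   # sigmoid: probability the bit is 1
        new_velocity.append(v)
        new_position.append(1 if rng.random() < prob else 0)
    return new_position, new_velocity
```

With a zero velocity the sigmoid gives a probability of 0.5, so each gene is as likely to be selected as not. That tendency to keep many genes is one reason a rule that favours small subsets is of interest for gene selection.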
A newton cooperative genetic algorithm method for in silico optimization of metabolic pathway production

Ismail MA, Deris S, Mohamad MS, Abdullah A

PLoS One, 2015;10(5):e0126199.
PMID: 25961295 DOI: 10.1371/journal.pone.0126199

This paper presents an in silico optimization method of metabolic pathway production. The metabolic pathway can be represented by a mathematical model known as the generalized mass action model, which leads to a complex system of nonlinear equations. The optimization process becomes difficult when the steady state and the constraints of the components in the metabolic pathway are involved. To deal with this situation, this paper presents an in silico optimization method, namely the Newton Cooperative Genetic Algorithm (NCGA). The NCGA uses the Newton method to deal with the metabolic pathway, and then integrates a genetic algorithm and a cooperative co-evolutionary algorithm. The proposed method was experimentally applied to benchmark metabolic pathways, and the results showed that the NCGA achieved better results than the existing methods.
In silico gene knockout prediction using a hybrid of Bat algorithm and minimization of metabolic adjustment

Man MY, Mohamad MS, Choon YW, Ismail MA

J Integr Bioinform, 2021 Aug 04;18(3).
PMID: 34348418 DOI: 10.1515/jib-2020-0037

Microorganisms commonly produce many high-demand industrial products like fuels, food, vitamins, and other chemicals. Microbial strains are the strains of microorganisms, which can be optimized to improve their technological properties through metabolic engineering. Metabolic engineering is the process of overcoming cellular regulation in order to achieve a desired product or to generate a new product that the host cells do not usually need to produce. The prediction of genetic manipulations such as gene knockout is part of metabolic engineering. Gene knockout can be used to optimize the microbial strains, such as to maximize the production rate of chemicals of interest. Metabolic and genetic engineering is important in producing the chemicals of interest as, without them, the product yields of many microorganisms are normally low. As a result, the aim of this paper is to propose a combination of the Bat algorithm and the minimization of metabolic adjustment (BATMOMA) to predict which genes to knock out in order to increase the succinate and lactate production rates in Escherichia coli (E. coli).
An improved swarm optimization for parameter estimation and biological model selection

Abdullah A, Deris S, Mohamad MS, Anwar S

PLoS One, 2013;8(4):e61258.
PMID: 23593445 DOI: 10.1371/journal.pone.0061258

One of the key aspects of computational systems biology is the investigation on the dynamic biological processes within cells. Computational models are often required to elucidate the mechanisms and principles driving the processes because of the nonlinearity and complexity. The models usually incorporate a set of parameters that signify the physical properties of the actual biological systems. In most cases, these parameters are estimated by fitting the model outputs with the corresponding experimental data. However, this is a challenging task because the available experimental data are frequently noisy and incomplete. In this paper, a new hybrid optimization method is proposed to estimate these parameters from the noisy and incomplete experimental data. The proposed method, called Swarm-based Chemical Reaction Optimization, integrates the evolutionary searching strategy employed by the Chemical Reaction Optimization, into the neighbouring searching strategy of the Firefly Algorithm method. The effectiveness of the method was evaluated using a simulated nonlinear model and two biological models: synthetic transcriptional oscillators, and extracellular protease production models. The results showed that the accuracy and computational speed of the proposed method were better than the existing Differential Evolution, Firefly Algorithm and Chemical Reaction Optimization methods. The reliability of the estimated parameters was statistically validated, which suggests that the model outputs produced by these parameters were valid even when noisy and incomplete experimental data were used. Additionally, Akaike Information Criterion was employed to evaluate the model selection, which highlighted the capability of the proposed method in choosing a plausible model based on the experimental data. In conclusion, this paper presents the effectiveness of the proposed method for parameter estimation and model selection problems using noisy and incomplete experimental data. 
It is hoped that this study will provide new insight into developing more accurate and reliable biological models based on limited and low-quality experimental data.
Identification of gene knockout strategies using a hybrid of an ant colony optimization algorithm and flux balance analysis to optimize microbial strains

Lu SJ, Salleh AH, Mohamad MS, Deris S, Omatu S, Yoshioka M

Comput Biol Chem, 2014 Dec;53PB:175-183.
PMID: 25462325 DOI: 10.1016/j.compbiolchem.2014.09.008

Reconstructions of genome-scale metabolic networks from different organisms have become popular in recent years. Metabolic engineering can simulate the reconstruction process to obtain desirable phenotypes. In previous studies, optimization algorithms have been implemented to identify the near-optimal sets of knockout genes for improving metabolite production. However, previous approaches suffered from premature convergence, and their stopping criteria were not clear for each case. Therefore, this study proposes an algorithm that is a hybrid of the ant colony optimization algorithm and flux balance analysis (ACOFBA) to predict near-optimal sets of gene knockouts in an effort to maximize growth rates and the production of certain metabolites. Here, we present a case study that uses Baker's yeast, also known as Saccharomyces cerevisiae, as the model organism and targets the rate of vanillin production for optimization. The results of this study are the growth rate of the model organism after gene deletion and a list of knockout genes. The ACOFBA algorithm was found to improve the yield of vanillin in terms of growth rate and production compared with the previous algorithms.
Feature selection and classifier parameters estimation for EEG signals peak detection using particle swarm optimization

Adam A, Shapiai MI, Tumari MZ, Mohamad MS, Mubin M

ScientificWorldJournal, 2014;2014:973063.
PMID: 25243236 DOI: 10.1155/2014/973063

Electroencephalogram (EEG) signal peak detection is widely used in clinical applications. The peak point can be detected using several approaches, including time, frequency, time-frequency, and nonlinear domains, depending on various peak features from several models. However, no study has established the importance of each peak feature in contributing to a good and generalized model. In this study, feature selection and classifier parameter estimation based on particle swarm optimization (PSO) are proposed as a framework for peak detection on EEG signals in time domain analysis. Two versions of PSO are used in the study: (1) standard PSO and (2) random asynchronous particle swarm optimization (RA-PSO). The proposed framework tries to find the best combination of all the available features that offers good peak detection and a high classification rate in the conducted experiments. The evaluation results indicate that the accuracy of the peak detection can be improved up to 99.90% and 98.59% for training and testing, respectively, as compared to the framework without feature selection adaptation. Additionally, the proposed framework based on RA-PSO offers a better and more reliable classification rate than standard PSO, as it produces a low-variance model.
A review of feature extraction software for microarray gene expression data

Tan CS, Ting WS, Mohamad MS, Chan WH, Deris S, Shah ZA

Biomed Res Int, 2014;2014:213656.
PMID: 25250315 DOI: 10.1155/2014/213656

When gene expression data are too large to be processed, they are transformed into a reduced representation set of genes. Transforming large-scale gene expression data into a set of genes is called feature extraction. If the genes extracted are carefully chosen, this gene set can extract the relevant information from the large-scale gene expression data, allowing further analysis by using this reduced representation instead of the full size data. In this paper, we review numerous software applications that can be used for feature extraction. The software reviewed is mainly for Principal Component Analysis (PCA), Independent Component Analysis (ICA), Partial Least Squares (PLS), and Local Linear Embedding (LLE). A summary and sources of the software are provided in the last section for each feature extraction method.
A hybrid of ant colony optimization and minimization of metabolic adjustment to improve the production of succinic acid in Escherichia coli

Chong SK, Mohamad MS, Mohamed Salleh AH, Choon YW, Chong CK, Deris S

Comput Biol Med, 2014 Jun;49:74-82.
PMID: 24763079 DOI: 10.1016/j.compbiomed.2014.03.011

This paper presents a study on gene knockout strategies to identify candidate genes to be knocked out for improving the production of succinic acid in Escherichia coli. Succinic acid is widely used as a precursor for many chemicals, for example in the production of antibiotics, therapeutic proteins and food. However, chemical synthesis of succinic acid using traditional methods usually results in production that is far below the theoretical maximum. In silico gene knockout strategies are commonly implemented to delete genes in E. coli to overcome this problem. In this paper, a hybrid of Ant Colony Optimization (ACO) and Minimization of Metabolic Adjustment (MoMA) is proposed to identify gene knockout strategies to improve the production of succinic acid in E. coli. The hybrid algorithm generated a list of knockout genes, together with the succinic acid production rate and growth rate for E. coli after gene knockout. The results of the hybrid algorithm were compared with the previous methods, OptKnock and MOMAKnock. It was found that the hybrid algorithm performed better than OptKnock and MOMAKnock in terms of the production rate. The results produced by the hybrid algorithm can be used in wet laboratory experiments to increase the production of succinic acid in E. coli.
A review on the computational approaches for gene regulatory network construction

Chai LE, Loh SK, Low ST, Mohamad MS, Deris S, Zakaria Z

Comput Biol Med, 2014 May;48:55-65.
PMID: 24637147 DOI: 10.1016/j.compbiomed.2014.02.011

Many biological research areas such as drug design require gene regulatory networks to provide clear insight and understanding of the cellular process in living cells. This is because interactions among the genes and their products play an important role in many molecular processes. A gene regulatory network can act as a blueprint for the researchers to observe the relationships among genes. Due to its importance, several computational approaches have been proposed to infer gene regulatory networks from gene expression data. In this review, six inference approaches are discussed: Boolean network, probabilistic Boolean network, ordinary differential equation, neural network, Bayesian network, and dynamic Bayesian network. These approaches are discussed in terms of introduction, methodology and recent applications of these approaches in gene regulatory network construction. These approaches are also compared in the discussion section. Furthermore, the strengths and weaknesses of these computational approaches are described.
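The simplest of the six approaches listed in this abstract, the Boolean network, can be illustrated with a toy example. The three genes and their update rules below are hypothetical and chosen only to show the mechanics: each gene is either on (1) or off (0), and all genes are updated at once from the previous state.

```python
def step(state):
    """Synchronously update a hypothetical three-gene Boolean network.

    Assumed rules (for illustration only):
      A is repressed by C        -> A' = NOT C
      B is activated by A        -> B' = A
      C requires both A and B    -> C' = A AND B
    """
    a, b, c = state
    return (int(not c), a, int(a and b))


def trajectory(state, steps):
    """Return the list of states visited, starting from `state`."""
    states = [state]
    for _ in range(steps):
        state = step(state)
        states.append(state)
    return states
```

Starting from `(1, 0, 0)`, `trajectory((1, 0, 0), 5)` visits `(1, 1, 0)`, `(1, 1, 1)`, `(0, 1, 1)` and `(0, 0, 0)` before returning to `(1, 0, 0)`. Such a repeating sequence of states is what the Boolean-network literature calls an attractor.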
A hybrid of bees algorithm and flux balance analysis with OptKnock as a platform for in silico optimization of microbial strains

Choon YW, Mohamad MS, Deris S, Illias RM, Chong CK, Chai LE

Bioprocess Biosyst Eng, 2014 Mar;37(3):521-32.
PMID: 23892659 DOI: 10.1007/s00449-013-1019-y

Microbial strain optimization focuses on improving the technological properties of a strain of microorganisms. However, the complexities of the metabolic networks, which lead to data ambiguity, often make the effects of genetic modification on the desirable phenotypes difficult to predict. Furthermore, the vast number of reactions in cellular metabolism leads to a combinatorial problem in obtaining the optimal gene deletion strategy. Consequently, the computation time increases exponentially with the size of the problem. Hence, we propose an extension of a hybrid of Bees Algorithm and Flux Balance Analysis (BAFBA) by integrating OptKnock into BAFBA to validate the result. This paper presents a number of computational experiments to test the performance and capability of BAFBA. Escherichia coli, Bacillus subtilis and Clostridium thermocellum are the model organisms in this paper. Also included is the identification of potential reactions to improve the production of succinic acid, lactic acid and ethanol, plus a discussion of the changes in the flux distribution of the predicted mutants. BAFBA shows potential in suggesting non-intuitive gene knockout strategies and low variability among several runs. The results show that BAFBA is suitable, reliable and applicable in predicting the optimal gene knockout strategy.
An enhancement of binary particle swarm optimization for gene selection in classifying cancer classes

Mohamad MS, Omatu S, Deris S, Yoshioka M, Abdullah A, Ibrahim Z

Algorithms Mol Biol, 2013;8(1):15.
PMID: 23617960 DOI: 10.1186/1748-7188-8-15

Gene expression data are likely to be of great help in developing efficient cancer diagnosis and classification platforms. Recently, many researchers have analyzed gene expression data using diverse computational intelligence methods to select a small subset of informative genes for cancer classification. Many computational methods have difficulty selecting small subsets because of the small number of samples compared to the huge number of genes (high dimension), irrelevant genes, and noisy genes.
Shortened activated partial thromboplastin time, a hemostatic marker for hypercoagulable state during acute coronary event

Abdullah WZ, Moufak SK, Yusof Z, Mohamad MS, Kamarul IM

Transl Res, 2010 Jun;155(6):315-9.
PMID: 20478546 DOI: 10.1016/j.trsl.2010.02.001

Various factors may contribute to a hypercoagulable state and acute vascular thrombosis. A prospective study was conducted involving 165 coronary heart disease (CHD) patients from the Cardiology Unit, Hospital Universiti Sains Malaysia. The purpose of this study was to investigate the relationship among factor VIII (FVIII), prothrombin time (PT), activated partial thromboplastin time (APTT), and activated protein C resistance (APC-R) state among CHD patients and to look for potential clinical applications from these laboratory findings. There were 110 cases diagnosed as acute coronary syndrome (ACS), whereas another 55 were stable coronary artery disease (SCAD) patients. PT, APTT, FVIII, and APC-R assays were performed on all subjects. There was a significant difference between the FVIII level and the APTT results (P value < 0.0001). A negative relationship was found between the FVIII level and the APTT from linear regression analysis (R² = 10%, P value < 0.0001). For each 1% increase in the FVIII level, the APTT was reduced by 0.013 s (95% confidence interval (CI) between -0.019 and -0.007). Interestingly, none of the SCAD patients had abnormally short APTT. Approximately 68.4% of cases with a positive APC-R assay were found to have a high FVIII level. In conclusion, the APTT test is a potential hemostatic marker for hypercoagulable state including in arterial thrombosis.
Study site: Cardiology unit (outpatient and inpatient), Hospital Universiti Sains Malaysia (HUSM), Kelantan, Malaysia
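To make the reported regression slope concrete, the sketch below rescales it to a larger change in FVIII. It uses only the slope (-0.013 s per 1% FVIII) and the 95% CI bounds quoted in the abstract; the function name and the linear extrapolation are assumptions for illustration. With an R² of 10%, FVIII explains only a small share of the variation in APTT, so this is a reading aid and not a predictive tool.

```python
SLOPE_S_PER_PCT = -0.013           # APTT change (s) per 1% FVIII, from the abstract
CI_S_PER_PCT = (-0.019, -0.007)    # 95% CI of the slope, from the abstract


def aptt_change(delta_fviii_pct):
    """Expected APTT change (s) for a given FVIII change, assuming linearity.

    Returns (point estimate, (lower bound, upper bound)).
    """
    point = SLOPE_S_PER_PCT * delta_fviii_pct
    bounds = tuple(sorted(b * delta_fviii_pct for b in CI_S_PER_PCT))
    return point, bounds
```

For example, `aptt_change(50)` gives a point estimate of about -0.65 s, with bounds of about -0.95 s and -0.35 s.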
Gene knockout identification using an extension of Bees Hill Flux Balance Analysis

Choon YW, Mohamad MS, Deris S, Chong CK, Omatu S, Corchado JM

Biomed Res Int, 2015;2015:124537.
PMID: 25874200 DOI: 10.1155/2015/124537

Microbial strain optimisation for the overproduction of a desired phenotype has been a popular topic in recent years. Gene knockout is a genetic engineering technique that can modify the metabolism of microbial cells to obtain desirable phenotypes. Optimisation algorithms have been developed to identify the effects of gene knockout. However, the complexities of metabolic networks have made the process of identifying the effects of genetic modification on desirable phenotypes challenging. Furthermore, a vast number of reactions in cellular metabolism often lead to a combinatorial problem in obtaining optimal gene knockout. The computational time increases exponentially as the size of the problem increases. This work reports an extension of Bees Hill Flux Balance Analysis (BHFBA) to identify optimal gene knockouts to maximise the production yield of desired phenotypes while sustaining the growth rate. This proposed method functions by integrating OptKnock into BHFBA for validating the results automatically. The results show that the extension of BHFBA is suitable, reliable, and applicable in predicting gene knockout. Through several experiments conducted on Escherichia coli, Bacillus subtilis, and Clostridium thermocellum as model organisms, extension of BHFBA has shown better performance in terms of computational time, stability, growth rate, and production yield of desired phenotypes.
Modelling the longevity of dental restorations by means of a CBR system

Aliaga IJ, Vera V, De Paz JF, García AE, Mohamad MS

Biomed Res Int, 2015;2015:540306.
PMID: 25866792 DOI: 10.1155/2015/540306

The lifespan of dental restorations is limited. Longevity depends on the material used and the different characteristics of the dental piece. However, it is not always the case that the best and longest lasting material is used since patients may prefer different treatments according to how noticeable the material is. Over the last 100 years, the most commonly used material has been silver amalgam, which, while very durable, is somewhat aesthetically displeasing. Our study is based on the collection of data from the charts, notes, and radiographic information of restorative treatments performed by Dr. Vera in 1993, the analysis of the information by computer artificial intelligence to determine the most appropriate restoration, and the monitoring of the evolution of the dental restoration. The data will be treated confidentially according to the Organic Law 15/1999 on 13 December on the Protection of Personal Data. This paper also presents a clustering technique capable of identifying the most significant cases with which to instantiate the case-base. In order to classify the cases, a mixture of experts is used which incorporates a Bayesian network and a multilayer perceptron; the combination of both classifiers is performed with a neural network.