Displaying all 8 publications

  1. Wasito I, Hashim SZ, Sukmaningrum S
    Bioinformation, 2007 Dec 30;2(5):175-81.
    PMID: 18305825
    Gene expression profiling plays an important role in identifying the biological and clinical properties of human solid tumors such as colorectal carcinoma. Profiling is required to reveal underlying molecular features for diagnostic and therapeutic purposes. A non-parametric, density-estimation-based approach called iterative local Gaussian clustering (ILGC) was used to identify clusters of expressed genes. We used experimental data from a previous study by Muro and others consisting of 1,536 genes in 100 colorectal cancer and 11 normal tissues. In this dataset, ILGC finds three gene clusters, two large and one small, similar to the results of that study, which used Gaussian mixture clustering. The correlation between each gene cluster and the clinical properties of malignancy in human colorectal cancer was analysed with respect to tumor versus normal tissue, the presence of distant metastasis and the presence of lymph node metastasis.
  2. Habibi N, Mohd Hashim SZ, Norouzi A, Samian MR
    BMC Bioinformatics, 2014;15:134.
    PMID: 24885721 DOI: 10.1186/1471-2105-15-134
    Over the last 20 years in biotechnology, the production of recombinant proteins has been a crucial bioprocess in both the biopharmaceutical and research arenas in terms of human health, scientific impact and economic volume. Although logical strategies of genetic engineering have been established, protein overexpression is still an art. In particular, heterologous expression is often hindered by low production levels and frequent failure for unclear reasons. The problem is accentuated because no generic solution is available to enhance heterologous overexpression. For a given protein, the extent of its solubility can indicate the quality of its function. Over 30% of synthesized proteins are not soluble. Under certain experimental conditions, including temperature, expression host, etc., protein solubility is a feature ultimately defined by its sequence. To date, numerous machine learning methods have been proposed to predict the solubility of a protein merely from its amino acid sequence. Despite 20 years of research on the matter, no comprehensive review of the published methods is available.
  3. Habibi N, Samian MR, Hashim SZ, Norouzi A
    Protein Expr Purif, 2014 Mar;95:92-5.
    PMID: 24333540 DOI: 10.1016/j.pep.2013.11.014
    Recombinant protein production is a significant biotechnological process as it allows researchers to produce a specific protein in desired quantities. Escherichia coli (E. coli) is the most popular heterologous expression host for the production of recombinant proteins due to advantages such as low cost, high productivity, well-characterized genetics, simple growth requirements and rapid growth. Several factors influence the expression level of a recombinant protein in E. coli: the gene to be expressed, the expression vector, the expression host, and the culture conditions. The major motivation for developing our database, EcoliOverExpressionDB, is to provide a means for researchers to quickly locate key factors in the overexpression of certain proteins. Such information would be a useful guide for the overexpression of similar proteins in E. coli. To the best of our knowledge, EcoliOverExpressionDB is the first database of recombinant protein expression experiments, in general and for E. coli specifically, that gathers the parameters influencing protein overexpression and the results in one place.
  4. Habibi N, Norouzi A, Mohd Hashim SZ, Shamsir MS, Samian R
    Comput Biol Med, 2015 Nov 1;66:330-6.
    PMID: 26476414 DOI: 10.1016/j.compbiomed.2015.09.015
    Recombinant protein overexpression, an important biotechnological process, is governed by complex biological rules that are mostly unknown. An intelligent algorithm is therefore needed to determine the expression level of a recombinant protein without resource-intensive, lab-based trial-and-error experiments. The purpose of this study is to propose, for the first time in the literature, a predictive model that estimates the level of recombinant protein overexpression using a machine learning approach based on the sequence, expression vector, and expression host. The expression host was confined to Escherichia coli, which is the most popular bacterial host for overexpressing recombinant proteins. To make the problem tractable, the overexpression level was categorized as low, medium and high. A set of features likely to affect the overexpression level was generated based on known facts (e.g. gene length) and knowledge gathered from related literature. Then, a representative subset of the features generated in the previous step was determined using feature selection techniques. Finally, a predictive model was developed using a random forest classifier, which was able to adequately classify the small, imbalanced, multi-class dataset constructed. The results showed that the predictive model achieved a promising average accuracy of 80% in estimating the overexpression level of a recombinant protein.
  5. Majeed Alneamy JS, A Hameed Alnaish Z, Mohd Hashim SZ, Hamed Alnaish RA
    Comput Biol Med, 2019 09;112:103348.
    PMID: 31356992 DOI: 10.1016/j.compbiomed.2019.103348
    Accurate medical disease diagnosis is considered to be an important classification problem. The main goal of the classification process is to determine the class to which a certain pattern belongs. In this article, a new classification technique based on a combination of the Teaching Learning-Based Optimization (TLBO) algorithm and a Fuzzy Wavelet Neural Network (FWNN) with a Functional Link Neural Network (FLNN) is proposed. The TLBO algorithm is used to train the new hybrid Functional Fuzzy Wavelet Neural Network (FFWNN) and to optimize the learning parameters, namely weights, dilation and translation. To evaluate the performance of the proposed method, five standard medical datasets were used: Breast Cancer, Heart Disease, Hepatitis, Pima-Indian diabetes and Appendicitis. The efficiency of the proposed method is evaluated using 5-fold cross-validation and 10-fold cross-validation in terms of mean square error (MSE), classification accuracy, running time, sensitivity, specificity and kappa. The experimental results show that the proposed method achieves accuracies of 98.309%, 91.1%, 91.39%, 88.67% and 93.51% for the Breast Cancer, Heart Disease, Hepatitis, Pima-Indian diabetes and Appendicitis datasets, respectively, after 30 runs for each dataset with low computational complexity. In addition, the proposed method performs well compared with other methods reported in related previous studies.
  6. Gharaei N, Abu Bakar K, Mohd Hashim SZ, Hosseingholi Pourasl A, Siraj M, Darwish T
    Sensors (Basel), 2017 Aug 11;17(8).
    PMID: 28800121 DOI: 10.3390/s17081858
    Network lifetime and energy efficiency are crucial performance metrics used to evaluate wireless sensor networks (WSNs). Decreasing and balancing the energy consumption of nodes can be employed to increase network lifetime. In cluster-based WSNs, one objective of applying clustering is to decrease the energy consumption of the network. A clustering technique is considered effective if the energy consumed by sensor nodes decreases after clustering is applied; however, this aim will not be achieved if the cluster size is not properly chosen. Therefore, in this paper, the energy consumption of nodes before clustering is considered to determine the optimal cluster size. A two-stage Genetic Algorithm (GA) is employed to determine the optimal interval of cluster size and derive the exact value from the interval. Furthermore, the energy hole is an inherent problem which leads to a remarkable decrease in the network's lifespan. This problem stems from the asynchronous energy depletion of nodes located in different layers of the network. For this reason, we propose the Circular Motion of Mobile-Sink with Varied Velocity Algorithm (CM2SV2) to balance the energy consumption ratio of cluster heads (CHs). According to the results, these strategies could substantially increase the network's lifetime by decreasing the energy consumption of sensors and balancing the energy consumption among CHs.
  7. Misman MF, Mohamad MS, Deris S, Abdullah A, Hashim SZ
    Bioinformation, 2011;7(4):169-75.
    PMID: 22102773
    Pathway analysis has led to a new era in genomic research by providing more information on biological processes than traditional single-gene analysis. Besides this advantage, pathway analysis poses some challenges to researchers, one of which is the quality of the pathway data itself. Pathway data are usually defined without a specific biological context; in a specific biological context (e.g. lung cancer), typically only several genes within a pathway are responsible for the corresponding cellular process. Some pathways may also include uninformative genes, or informative genes may have been excluded. Moreover, many pathway analysis algorithms neglect these limitations by treating all the genes within pathways as significant. In a previous study, a hybrid of support vector machines and smoothly clipped absolute deviation with group-specific tuning parameters (gSVM-SCAD) was proposed to identify and select the informative genes before the pathway evaluation process. However, gSVM-SCAD showed a limitation in classification accuracy. To deal with this limitation, we enhanced the tuning parameter method for gSVM-SCAD by applying B-type generalized approximate cross validation (BGACV). Experimental analyses using one simulated dataset and two gene expression datasets have shown that the proposed method obtains significant results in identifying biologically significant genes and pathways, and in classification accuracy.
  8. Ismail AM, Mohamad MS, Abdul Majid H, Abas KH, Deris S, Zaki N, et al.
    Biosystems, 2017 Dec;162:81-89.
    PMID: 28951204 DOI: 10.1016/j.biosystems.2017.09.013
    Mathematical modelling is fundamental to understanding the dynamic behavior and regulation of the biochemical metabolisms and pathways that are found in biological systems. Pathways are used to describe complex processes that involve many parameters. It is important to have an accurate and complete set of parameters that describe the characteristics of a given model. However, measuring these parameters is typically difficult and in some cases impossible. Furthermore, the experimental data are often incomplete and also suffer from experimental noise. These shortcomings make it challenging to identify the best-fit parameters that can represent the actual biological processes involved in biological systems. Computational approaches are required to estimate these parameters. The estimation is converted into multimodal optimization problems that require a global optimization algorithm that can avoid local solutions. These local solutions can lead to a bad fit when calibrating with a model. Although the model itself can potentially match a set of experimental data, a high-performance estimation algorithm is required to improve the quality of the solutions. This paper describes an improved hybrid of particle swarm optimization and the gravitational search algorithm (IPSOGSA) to improve the efficiency of a global optimum (the best set of kinetic parameter values) search. The findings suggest that the proposed algorithm is capable of narrowing down the search space by exploiting the feasible solution areas. Hence, the proposed algorithm is able to achieve a near-optimal set of parameters at a fast convergence speed. The proposed algorithm was tested and evaluated based on two aspartate pathways that were obtained from the BioModels Database. The results show that the proposed algorithm outperformed other standard optimization algorithms in terms of accuracy and near-optimal kinetic parameter estimation. Nevertheless, the proposed algorithm is only expected to work well in small scale systems. In addition, the results of this study can be used to estimate kinetic parameter values in the stage of model selection for different experimental conditions.
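
The parameter estimation idea in entry 8 can be illustrated with a minimal sketch. It uses plain particle swarm optimization, not the IPSOGSA hybrid from the paper, and fits a hypothetical two-parameter decay model y(t) = A·exp(-k·t) to noise-free synthetic data. The model, the function names (simulate, sse, pso), the bounds and the swarm settings are all assumptions chosen for illustration; none of them come from the publication.

```python
import math
import random


def simulate(amplitude, rate, times):
    """Toy kinetic model (an assumption): first-order decay A * exp(-k * t)."""
    return [amplitude * math.exp(-rate * t) for t in times]


def sse(params, times, observed):
    """Sum of squared errors between the model output and the observations."""
    predicted = simulate(params[0], params[1], times)
    return sum((p - o) ** 2 for p, o in zip(predicted, observed))


def pso(loss, bounds, n_particles=30, n_iters=200, w=0.7, c1=1.5, c2=1.5, seed=0):
    """Plain global-best PSO; returns (best_position, best_loss)."""
    rng = random.Random(seed)
    dim = len(bounds)
    pos = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    best_pos = [p[:] for p in pos]          # personal best positions
    best_val = [loss(p) for p in pos]       # personal best losses
    g = min(range(n_particles), key=lambda i: best_val[i])
    gbest_pos, gbest_val = best_pos[g][:], best_val[g]

    for _ in range(n_iters):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                # Inertia + pull toward personal best + pull toward global best.
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (best_pos[i][d] - pos[i][d])
                             + c2 * r2 * (gbest_pos[d] - pos[i][d]))
                lo, hi = bounds[d]
                # Keep the particle inside the search bounds.
                pos[i][d] = min(max(pos[i][d] + vel[i][d], lo), hi)
            val = loss(pos[i])
            if val < best_val[i]:
                best_pos[i], best_val[i] = pos[i][:], val
                if val < gbest_val:
                    gbest_pos, gbest_val = pos[i][:], val
    return gbest_pos, gbest_val


# Synthetic "experimental" data generated from known parameters A = 2.0, k = 0.7.
times = [0.5 * i for i in range(11)]
observed = simulate(2.0, 0.7, times)
estimate, final_loss = pso(lambda p: sse(p, times, observed),
                           bounds=[(0.0, 5.0), (0.0, 5.0)])
print("estimated A, k:", estimate, "loss:", final_loss)
```

Because the synthetic data are noise-free and the loss surface has a single minimum, this toy problem does not exercise the multimodal, noisy setting that motivates the hybrid algorithm in the abstract; it only shows how a swarm searches for best-fit kinetic parameters.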