Displaying publications 1 - 20 of 62 in total

  1. Eltyeb S, Salim N
    J Cheminform, 2014;6:17.
    PMID: 24834132 DOI: 10.1186/1758-2946-6-17
    The rapid increase in the flow rate of published digital information in all disciplines has resulted in a pressing need for techniques that can simplify the use of this information. The chemistry literature is very rich with information about chemical entities. Extracting molecules and their related properties and activities from the scientific literature to "text mine" these extracted data and determine contextual relationships helps research scientists, particularly those in drug development. One of the most important challenges in chemical text mining is the recognition of chemical entities mentioned in the texts. In this review, the authors briefly introduce the fundamental concepts of chemical literature mining, the textual contents of chemical documents, and the methods of naming chemicals in documents. We sketch out dictionary-based, rule-based and machine learning, as well as hybrid chemical named entity recognition approaches with their applied solutions. We end with an outlook on the pros and cons of these approaches and the types of chemical entities extracted.
  2. Abdo A, Salim N
    J Chem Inf Model, 2011 Jan 24;51(1):25-32.
    PMID: 21155550 DOI: 10.1021/ci100232h
    Many of the conventional similarity methods assume that molecular fragments that do not relate to biological activity carry the same weight as the important ones. One possible approach to this problem is to use the Bayesian inference network (BIN), which models molecules and reference structures as probabilistic inference networks. The relationships between molecules and reference structures in the Bayesian network are encoded using a set of conditional probability distributions, which can be estimated by the fragment weighting function, a function of the frequencies of the fragments in the molecule or the reference structure as well as throughout the collection. The weighting function combines one or more fragment weighting schemes. In this paper, we have investigated five different weighting functions and present a new fragment weighting scheme. Later on, these functions were modified to incorporate the new weighting scheme. Simulated virtual screening experiments with the MDL Drug Data Report (23) and maximum unbiased validation data sets show that the use of the new weighting scheme can provide significantly more effective screening when compared with the use of current weighting schemes.
  3. Abdo A, Salim N
    ChemMedChem, 2009 Feb;4(2):210-8.
    PMID: 19072820 DOI: 10.1002/cmdc.200800290
    Many methods have been developed to capture the biological similarity between two compounds for use in drug discovery. A variety of similarity metrics have been introduced, the Tanimoto coefficient being the most prominent. Many of the approaches assume that molecular features or descriptors that do not relate to the biological activity carry the same weight as the important aspects in terms of biological similarity. Herein, a novel similarity searching approach using a Bayesian inference network is discussed. Similarity searching is regarded as an inference or evidential reasoning process in which the probability that a given compound has biological similarity with the query is estimated and used as evidence. Our experiments demonstrate that the similarity approach based on Bayesian inference networks is likely to outperform the Tanimoto similarity search and offer a promising alternative to existing similarity search approaches.
  4. Da'u A, Salim N
    PeerJ Comput Sci, 2019;5:e191.
    PMID: 33816844 DOI: 10.7717/peerj-cs.191
    Aspect extraction is a subtask of sentiment analysis that deals with identifying opinion targets in an opinionated text. Existing approaches to aspect extraction typically rely on using handcrafted features, linear and integrated network architectures. Although these methods can achieve good performances, they are time-consuming and often very complicated. In real-life systems, a simple model with competitive results is generally more effective and preferable over complicated models. In this paper, we present a multichannel convolutional neural network for aspect extraction. The model consists of a deep convolutional neural network with two input channels: a word embedding channel which aims to encode semantic information of the words and a part of speech (POS) tag embedding channel to facilitate the sequential tagging process. To get the vector representation of words, we initialized the word embedding channel and the POS channel using pretrained word2vec and one-hot-vector of POS tags, respectively. Both the word embedding and the POS embedding vectors were fed into the convolutional layer and concatenated to a one-dimensional vector, which is finally pooled and processed using a Softmax function for sequence labeling. We finally conducted a series of experiments using four different datasets. The results indicated better performance compared to the baseline models.
  5. Altalib MK, Salim N
    Biomolecules, 2022 Nov 20;12(11).
    PMID: 36421733 DOI: 10.3390/biom12111719
    Information technology has become an integral aspect of the drug development process. The virtual screening process (VS) is a computational technique for screening chemical compounds in a reasonable amount of time and cost. The similarity search is one of the primary tasks in VS that estimates a molecule's similarity. It is predicated on the idea that molecules with similar structures may also have similar activities. Many techniques for comparing the biological similarity between a target compound and each compound in the database have been established. Although these approaches perform strongly, particularly when dealing with molecules whose active structures are homogeneous, they do not perform well enough when dealing with structurally heterogeneous compounds. Previous works examined many deep learning methods in the enhanced Siamese similarity model and demonstrated that the Enhanced Siamese Multi-Layer Perceptron similarity model (SMLP) and the Siamese Convolutional Neural Network-one dimension similarity model (SCNN1D) have good outcomes when dealing with structurally heterogeneous molecules. To further improve the retrieval effectiveness of the similarity model, we incorporate the best two models in one hybrid model. The reason is that each method gives good results in some classes, so combining them in one hybrid model may improve the retrieval recall. Many designs of the hybrid models are tested in this study. Several experiments on real-world data sets were conducted, and the findings demonstrated that the new approaches outperformed the previous method.
  6. Altalib MK, Salim N
    Molecules, 2021 Nov 03;26(21).
    PMID: 34771076 DOI: 10.3390/molecules26216669
    Traditional drug development is a slow and costly process that leads to the production of new drugs. Virtual screening (VS) is a computational procedure that measures the similarity of molecules as one of its primary tasks. Many techniques for capturing the biological similarity between a test compound and a known target ligand have been established in ligand-based virtual screens (LBVSs). However, despite the good performances of the above methods compared to their predecessors, especially when dealing with molecules that have structurally homogenous active elements, they are not satisfactory when dealing with molecules that are structurally heterogeneous. The main aim of this study is to improve the performance of similarity searching, especially with molecules that are structurally heterogeneous. The Siamese network will be used due to its capability to deal with complicated data samples in many fields. The Siamese multi-layer perceptron architecture will be enhanced by using two similarity distance layers with one fused layer, then multiple layers will be added after the fusion layer, and then the nodes of the model that contribute less or nothing during inference according to their signal-to-noise ratio values will be pruned. Several benchmark datasets will be used, which are: the MDL Drug Data Report (MDDR-DS1, MDDR-DS2, and MDDR-DS3), the Maximum Unbiased Validation (MUV), and the Directory of Useful Decoys (DUD). The results show that the proposed method outperforms the standard Tanimoto coefficient (TAN) and other methods. Additionally, it is possible to reduce the number of nodes in the Siamese multilayer perceptron model while still keeping the effectiveness of recall on the same level.
  7. Ahmed A, Abdo A, Salim N
    ScientificWorldJournal, 2012;2012:410914.
    PMID: 22623895 DOI: 10.1100/2012/410914
    Many of the similarity-based virtual screening approaches assume that molecular fragments that are not related to the biological activity carry the same weight as the important ones. This was the reason that led to the use of Bayesian networks as an alternative to existing tools for similarity-based virtual screening. In our recent work, the retrieval performance of the Bayesian inference network (BIN) was observed to improve significantly when molecular fragments were reweighted using the relevance feedback information. In this paper, a set of active reference structures were used to reweight the fragments in the reference structure. In this approach, higher weights were assigned to those fragments that occur more frequently in the set of active reference structures while others were penalized. Simulated virtual screening experiments with MDL Drug Data Report datasets showed that the proposed approach significantly improved the retrieval effectiveness of ligand-based virtual screening, especially when the active molecules being sought had a high degree of structural heterogeneity.
  8. Abdo A, Salim N, Ahmed A
    J Biomol Screen, 2011 Oct;16(9):1081-8.
    PMID: 21862688 DOI: 10.1177/1087057111416658
    Recently, the use of the Bayesian network as an alternative to existing tools for similarity-based virtual screening has received noticeable attention from researchers in the chemoinformatics field. The main aim of the Bayesian network model is to improve the retrieval effectiveness of similarity-based virtual screening. To this end, different models of the Bayesian network have been developed. In our previous works, the retrieval performance of the Bayesian network was observed to improve significantly when multiple reference structures or fragment weightings were used. In this article, the authors enhance the Bayesian inference network (BIN) using the relevance feedback information. In this approach, a few high-ranking structures of unknown activity were filtered from the outputs of BIN, based on a single active reference structure, to form a set of active reference structures. This set of active reference structures was used in two distinct techniques for carrying out such BIN searching: reweighting the fragments in the reference structures and group fusion techniques. Simulated virtual screening experiments with three MDL Drug Data Report data sets showed that the proposed techniques provide simple ways of enhancing the cost-effectiveness of ligand-based virtual screening searches, especially for higher diversity data sets.
  9. Hentabli H, Saeed F, Abdo A, Salim N
    ScientificWorldJournal, 2014;2014:286974.
    PMID: 25140330 DOI: 10.1155/2014/286974
    Molecular similarity is a pervasive concept in drug design. The basic idea underlying molecular similarity is the similar property principle, which states that structurally similar molecules will exhibit similar physicochemical and biological properties. In this paper, a new graph-based molecular descriptor (GBMD) is introduced. The GBMD is a new method of obtaining a rough description of 2D molecular structure in textual form based on the canonical representations of the molecule outline shape and it allows rigorous structure specification using small and natural grammars. Simulated virtual screening experiments with the MDDR database show clearly the superiority of the graph-based descriptor compared to many standard descriptors (ALOGP, MACCS, EPFP4, CDKFP, PCFP, and SMILE) using the Tanimoto coefficient (TAN) and the basic local alignment search tool (BLAST) when searches were carried out.
  10. Ahmed A, Saeed F, Salim N, Abdo A
    J Cheminform, 2014;6:19.
    PMID: 24883114 DOI: 10.1186/1758-2946-6-19
    BACKGROUND: It is known that any individual similarity measure will not always give the best recall of active molecule structure for all types of activity classes. Recently, data fusion has been used to enhance the effectiveness of ligand-based virtual screening approaches. Data fusion can be implemented using two different approaches: group fusion and similarity fusion. Similarity fusion involves searching using multiple similarity measures. The similarity scores, or ranking, for each similarity measure are combined to obtain the final ranking of the compounds in the database.

    RESULTS: The Condorcet fusion method was examined. This approach combines the outputs of similarity searches from eleven association and distance similarity coefficients, and then the winner measure for each class of molecules, based on Condorcet fusion, was chosen to be the best method of searching. The recall of active molecules retrieved in the top 5% and a significance test were used to evaluate the proposed method. The MDL drug data report (MDDR), maximum unbiased validation (MUV) and Directory of Useful Decoys (DUD) data sets were used for experiments and were represented by 2D fingerprints.

    CONCLUSIONS: Simulated virtual screening experiments with the two standard data sets show that the use of Condorcet fusion provides a very simple way of improving ligand-based virtual screening, especially when the active molecules being sought have the lowest degree of structural heterogeneity. However, the effectiveness of the Condorcet fusion was increased slightly when structural sets of high diversity activities were being sought.

  11. Saeed F, Salim N, Abdo A
    Int J Comput Biol Drug Des, 2014 01 09;7(1):31-44.
    PMID: 24429501 DOI: 10.1504/IJCBDD.2014.058584
    Many types of clustering techniques for chemical structures have been used in the literature, but it is known that any single method will not always give the best results for all types of applications. Recent work on consensus clustering methods is motivated by the successes of combining multiple classifiers in many areas and the ability of consensus clustering to improve the robustness, novelty, consistency and stability of individual clusterings. In this paper, the Cluster-based Similarity Partitioning Algorithm (CSPA) was examined for improving the quality of chemical structures clustering. The effectiveness of clustering was evaluated based on the ability to separate active from inactive molecules in each cluster and the results were compared with Ward's clustering method. The MDL Drug Data Report (MDDR) chemical database was used for experiments. The results, obtained by combining multiple clusterings, showed that the consensus clustering method can improve the robustness, novelty and stability of chemical structures clustering.
  12. Saeed F, Salim N, Abdo A
    J Chem Inf Model, 2013 May 24;53(5):1026-34.
    PMID: 23581471 DOI: 10.1021/ci300442u
    The goal of consensus clustering methods is to find a consensus partition that optimally summarizes an ensemble and improves the quality of clustering compared with single clustering algorithms. In this paper, an enhanced voting-based consensus method was introduced and compared with other consensus clustering methods, including co-association-based, graph-based, and voting-based consensus methods. The MDDR and MUV data sets were used for the experiments and were represented by three 2D fingerprints: ALOGP, ECFP_4, and ECFC_4. The results were evaluated based on the ability of the clustering method to separate active from inactive molecules in each cluster using four criteria: F-measure, Quality Partition Index (QPI), Rand Index (RI), and Fowlkes-Mallows Index (FMI). The experiments suggest that the consensus methods can deliver significant improvements for the effectiveness of chemical structures clustering.
  13. Salim N, Abdullah S, Sapuan J, Haflah NH
    J Hand Surg Eur Vol, 2012 Jan;37(1):27-34.
    PMID: 21816888 DOI: 10.1177/1753193411415343
    We compared the effectiveness of physiotherapy and corticosteroid injection treatment in the management of mild trigger fingers. Mild trigger fingers are those with mild crepitus, uneven finger movements and actively correctable triggering. This is a single-centred, prospective, block randomized study with 74 patients; 39 patients for steroid injection and 35 patients for physiotherapy. The study duration was from June 2009 until August 2010. Evaluation was done at 6 weeks, 3 months and 6 months post-treatment. At 3 months, the success rate (absence of pain and triggering) for those receiving steroid injection was 97.4% and physiotherapy 68.6%. The group receiving steroid injection also had a lower pain score, a higher rate of satisfaction, stronger grip strength and earlier recovery to near normal function (findings were all significant, p 
  14. Reafee W, Salim N, Khan A
    PLoS One, 2016;11(5):e0154848.
    PMID: 27152663 DOI: 10.1371/journal.pone.0154848
    The explosive growth of social networks in recent times has presented a powerful source of information to be utilized as an extra source for assisting in the social recommendation problems. The social recommendation methods that are based on probabilistic matrix factorization improved the recommendation accuracy and partly solved the cold-start and data sparsity problems. However, these methods only exploited the explicit social relations and almost completely ignored the implicit social relations. In this article, we firstly propose an algorithm to extract the implicit relation in the undirected graphs of social networks by exploiting the link prediction techniques. Furthermore, we propose a new probabilistic matrix factorization method to alleviate the data sparsity problem through incorporating explicit friendship and implicit friendship. We evaluate our proposed approach on two real datasets, Last.Fm and Douban. The experimental results show that our method performs much better than the state-of-the-art approaches, which indicates the importance of incorporating implicit social relations in the recommendation process to address the poor prediction accuracy.
  15. Jalila A, Dorny P, Sani R, Salim NB, Vercruysse J
    Vet Parasitol, 1998 Jan 31;74(2-4):165-72.
    PMID: 9561704
    Coccidial infections were studied in goats in the state of Selangor (peninsular Malaysia) during a 12-month period. The study included 10 smallholder farms on which kids were monitored for faecal oocyst counts from birth until 1-year old. Eimeria oocysts were found in 725 (89%) of 815 faecal samples examined. Nine species of Eimeria were identified. The most prevalent were E. arloingi, found in 71% of the samples, E. ninakohlyakimovae (67%), E. christenseni (63%) and E. alijevi (61%). The other species found were, E. hirci, E. jolchijevi, E. caprovina, E. caprina and E. pallida, present in 34, 22, 12, 9 and 4% of the samples, respectively. Oocyst counts were significantly higher in animals of less than 4-months old (P < 0.05). High oocyst counts were mainly caused by non-pathogenic species. Poor hygienic conditions were found to be associated with a higher intensity of coccidial infections. Mortality rates in kids could not be related to the intensity of coccidial infections.
  16. Saeed F, Salim N, Abdo A
    Mol Inform, 2013 Jul;32(7):591-8.
    PMID: 27481767 DOI: 10.1002/minf.201300004
    Many consensus clustering methods have been applied in different areas such as pattern recognition, machine learning, information theory and bioinformatics. However, few methods have been used for chemical compounds clustering. In this paper, an information theory and voting based algorithm (Adaptive Cumulative Voting-based Aggregation Algorithm A-CVAA) was examined for combining multiple clusterings of chemical structures. The effectiveness of clusterings was evaluated based on the ability of the clustering method to separate active from inactive molecules in each cluster, and the results were compared with Ward's method. The chemical dataset MDL Drug Data Report (MDDR) and the Maximum Unbiased Validation (MUV) dataset were used. Experiments suggest that the adaptive cumulative voting-based consensus method can improve the effectiveness of combining multiple clusterings of chemical structures.
  17. Saeed F, Salim N, Abdo A, Hentabli H
    Mol Inform, 2013 Feb;32(2):165-78.
    PMID: 27481278 DOI: 10.1002/minf.201200110
    Consensus clustering methods have been successfully used for combining multiple classifiers in many areas such as machine learning, applied statistics, pattern recognition and bioinformatics. In this paper, consensus clustering is used for combining the clusterings of chemical structures to enhance the ability of separating biologically active molecules from inactive ones in each cluster. Two graph-based consensus clustering methods were examined. The Quality Partition Index method (QPI) was used to evaluate the clusterings and the results were compared to the Ward's clustering method. Two homogeneous and heterogeneous subsets DS1-DS2 of MDL Drug Data Report database (MDDR) were used for experiments and represented by two 2D fingerprints. The results, obtained by a combination of multiple runs of an individual clustering and a single run of multiple individual clusterings, showed that graph-based consensus clustering methods can improve the effectiveness of chemical structures clusterings.
  18. Osman A, Salim N, Saeed F
    PLoS One, 2019;14(5):e0215516.
    PMID: 31091242 DOI: 10.1371/journal.pone.0215516
    The Text Forum Threads (TFThs) contain a large amount of Initial-Posts Replies pairs (IPR pairs) which are related to information exchange and discussion amongst the forum users with similar interests. Generally, some user replies in the discussion thread are off-topic and irrelevant. Hence, the content varies in quality. It is important to identify the quality of the IPR pairs in a discussion thread in order to extract relevant information and helpful replies because a higher frequency of irrelevant replies in the thread could take the discussion in a different direction and the genuine users would lose interest in this discussion thread. In this study, the authors present an approach for identifying the high-quality user replies to the Initial-Post, using a set of quality dimension features for their extraction. Moreover, crowdsourcing platforms were used for judging the quality of the replies and classifying them into high-quality, low-quality or non-quality replies to the Initial-Posts. Then, the high-quality IPR pairs were extracted and identified based on their quality, and they were ranked using three classifiers, i.e., Support Vector Machine, Naïve Bayes, and Decision Trees, according to their quality dimensions of relevancy, author activeness, timeliness, ease-of-understanding, politeness, and amount-of-data. In conclusion, the experimental results for the TFThs showed that the proposed approach could improve the extraction of the quality replies and identify the quality features that can be used for the Text Forum Thread Summarization.
  19. Saeed F, Salim N, Abdo A
    J Cheminform, 2012 Dec 17;4(1):37.
    PMID: 23244782 DOI: 10.1186/1758-2946-4-37
    BACKGROUND: Although many consensus clustering methods have been successfully used for combining multiple classifiers in many areas such as machine learning, applied statistics, pattern recognition and bioinformatics, few consensus clustering methods have been applied for combining multiple clusterings of chemical structures. It is known that any individual clustering method will not always give the best results for all types of applications. So, in this paper, three voting and graph-based consensus clusterings were used for combining multiple clusterings of chemical structures to enhance the ability of separating biologically active molecules from inactive ones in each cluster.

    RESULTS: The cumulative voting-based aggregation algorithm (CVAA), cluster-based similarity partitioning algorithm (CSPA) and hyper-graph partitioning algorithm (HGPA) were examined. The F-measure and Quality Partition Index method (QPI) were used to evaluate the clusterings and the results were compared to the Ward's clustering method. The MDL Drug Data Report (MDDR) dataset was used for experiments and was represented by two 2D fingerprints, ALOGP and ECFP_4. The performance of voting-based consensus clustering method outperformed the Ward's method using F-measure and QPI method for both ALOGP and ECFP_4 fingerprints, while the graph-based consensus clustering methods outperformed the Ward's method only for ALOGP using QPI. The Jaccard and Euclidean distance measures were the methods of choice to generate the ensembles, which give the highest values for both criteria.

    CONCLUSIONS: The results of the experiments show that consensus clustering methods can improve the effectiveness of chemical structures clusterings. The cumulative voting-based aggregation algorithm (CVAA) was the method of choice among consensus clustering methods.

  20. Saeed F, Ahmed A, Shamsir MS, Salim N
    J Comput Aided Mol Des, 2014 Jun;28(6):675-84.
    PMID: 24830925 DOI: 10.1007/s10822-014-9750-2
    The cluster-based compound selection is used in the lead identification process of drug discovery and design. Many clustering methods have been used for chemical databases, but there is no clustering method that can obtain the best results under all circumstances. However, little attention has been focused on the use of combination methods for chemical structure clustering, which is known as consensus clustering. Recently, consensus clustering has been used in many areas including bioinformatics, machine learning and information theory. This process can improve the robustness, stability, consistency and novelty of clustering. For chemical databases, different consensus clustering methods have been used including the co-association matrix-based, graph-based, hypergraph-based and voting-based methods. In this paper, a weighted cumulative voting-based aggregation algorithm (W-CVAA) was developed. The MDL Drug Data Report (MDDR) benchmark chemical dataset was used in the experiments and represented by the AlogP and ECFP_4 descriptors. The results from the clustering methods were evaluated by the ability of the clustering to separate biologically active molecules in each cluster from inactive ones using different criteria, and the effectiveness of the consensus clustering was compared to that of Ward's method, which is the current standard clustering method in chemoinformatics. This study indicated that weighted voting-based consensus clustering can overcome the limitations of the existing voting-based methods and improve the effectiveness of combining multiple clusterings of chemical structures.
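
Note: several of the abstracts above use the Tanimoto coefficient (TAN) as the baseline similarity measure for fingerprint-based virtual screening. As a minimal sketch only, assuming each 2D fingerprint is represented as the set of its on-bit indices, the coefficient and a simple ranked similarity search could look like the following. The function names, fingerprints and compound labels are hypothetical illustrations and are not taken from any of the listed papers.

```python
def tanimoto(fp_a, fp_b):
    """Tanimoto coefficient of two binary fingerprints.

    Each fingerprint is given as an iterable of on-bit indices
    (an assumed representation for this sketch).
    """
    a, b = set(fp_a), set(fp_b)
    union = len(a | b)
    if union == 0:
        # Convention chosen here: two empty fingerprints score 0.0.
        return 0.0
    return len(a & b) / union


def rank_by_similarity(query_fp, database):
    """Rank database compounds by decreasing Tanimoto similarity to the query.

    `database` maps a compound label to its fingerprint.
    """
    scored = [(name, tanimoto(query_fp, fp)) for name, fp in database.items()]
    return sorted(scored, key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    # Hypothetical toy fingerprints, for illustration only.
    query = {1, 2, 3}
    db = {"cpd_A": {2, 3, 4}, "cpd_B": {1, 2, 3}, "cpd_C": {7, 8}}
    for name, score in rank_by_similarity(query, db):
        print(name, round(score, 3))
```

In a simulated virtual screening experiment of the kind the abstracts describe, the top fraction of such a ranking (for example the top 5%) would be inspected for known actives to estimate recall.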