Displaying all 5 publications

  1. Mujtaba G, Shuib L, Raj RG, Rajandram R, Shaikh K
    J Forensic Leg Med, 2018 Jul;57:41-50.
    PMID: 29801951 DOI: 10.1016/j.jflm.2017.07.001
    OBJECTIVES: Automatic text classification techniques are useful for classifying plaintext medical documents. This study aims to automatically predict the cause of death from free-text forensic autopsy reports by comparing various schemes for feature extraction, term weighting (feature value representation), text classification, and feature reduction.

    METHODS: For the experiments, autopsy reports belonging to eight different causes of death were collected, preprocessed, and converted into 43 master feature vectors using various schemes for feature extraction, representation, and reduction. Six different text classification techniques were applied to these 43 master feature vectors to construct a classification model that can predict the cause of death. Finally, classification model performance was evaluated using four performance measures, i.e., overall accuracy, macro precision, macro F-measure, and macro recall.

    RESULTS: The experiments showed that unigram features obtained the highest performance compared with bigram, trigram, and hybrid-gram features. Among feature representation schemes, term frequency and term frequency with inverse document frequency obtained similar results, and both performed better than binary frequency and normalized term frequency with inverse document frequency. Furthermore, the chi-square feature reduction approach outperformed the Pearson correlation and information gain approaches. Finally, among text classification algorithms, the support vector machine classifier outperformed the random forest, Naive Bayes, k-nearest neighbor, decision tree, and ensemble-voted classifiers.

    CONCLUSION: Our results and comparisons hold practical importance and serve as references for future work. Moreover, the comparison outputs will serve as state-of-the-art baselines against which future proposals for automated text classification can be compared.
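
    The feature representation schemes compared in this abstract (binary frequency, term frequency, and term frequency with inverse document frequency) can be illustrated with a minimal sketch over unigram features. This is not the authors' code: the whitespace tokenizer, the unsmoothed IDF formula log(N / df), the function names, and the three toy sentences are all assumptions made for illustration.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase whitespace tokenizer producing unigram features (an assumption)."""
    return text.lower().split()


def binary_weights(tokens):
    """Binary frequency: 1 if the term occurs in the document at all."""
    return {term: 1 for term in set(tokens)}


def tf_weights(tokens):
    """Term frequency: raw count of each term in the document."""
    return dict(Counter(tokens))


def idf_table(tokenized_docs):
    """Inverse document frequency, here the unsmoothed form log(N / df)."""
    n_docs = len(tokenized_docs)
    doc_freq = Counter(term for doc in tokenized_docs for term in set(doc))
    return {term: math.log(n_docs / df) for term, df in doc_freq.items()}


def tfidf_weights(tokens, idf):
    """TF-IDF: term count scaled by the term's inverse document frequency."""
    return {term: count * idf.get(term, 0.0) for term, count in Counter(tokens).items()}


# Hypothetical toy sentences, not taken from any autopsy report.
docs = [
    "fracture of skull with brain haemorrhage",
    "coronary artery occlusion with myocardial scarring",
    "skull fracture and rib fracture",
]
tokenized = [tokenize(d) for d in docs]
idf = idf_table(tokenized)
third = tokenized[2]

print(binary_weights(third)["fracture"])
print(tf_weights(third)["fracture"])
print(tfidf_weights(third, idf)["fracture"])
```

    In the third toy sentence, "fracture" occurs twice, so its binary weight and its term frequency differ, and the TF-IDF weight scales the count down because the term also appears in another document.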

  2. Al-Garadi MA, Khan MS, Varathan KD, Mujtaba G, Al-Kabsi AM
    J Biomed Inform, 2016 08;62:1-11.
    PMID: 27224846 DOI: 10.1016/j.jbi.2016.05.005
    BACKGROUND: The popularity and proliferation of online social networks (OSNs) have created massive social interaction among users, which generates an extensive amount of data. OSNs offer a unique opportunity for studying and understanding social interaction and communication among far larger populations than ever before. Recently, OSNs have received considerable attention as a possible tool to track a pandemic because they can provide an almost real-time surveillance system at a lower cost than traditional surveillance systems.

    METHODS: A systematic literature search for studies with the primary aim of using OSNs to detect and track a pandemic was conducted. We conducted an electronic literature search for eligible English articles published between 2004 and 2015 using PubMed, IEEE Xplore, ACM Digital Library, Google Scholar, and Web of Science. First, the articles were screened on the basis of titles and abstracts. Second, the full texts were reviewed. All included studies were subjected to quality assessment.

    RESULT: OSNs have rich information that can be utilized to develop an almost real-time pandemic surveillance system. The outcomes of OSN surveillance systems have demonstrated high correlations with the findings of official surveillance systems. However, the limitation in using OSNs to track a pandemic lies in collecting representative data with sufficient population coverage. This challenge is related to the characteristics of OSN data. The data are dynamic, large-sized, and unstructured, thus requiring advanced algorithms and computational linguistics.

    CONCLUSIONS: OSN data contain significant information that can be used to track a pandemic. Unlike traditional surveys and clinical reports, for which data collection is time consuming and costly, OSN data can be collected almost in real time at a lower cost. Additionally, the geographical and temporal information can support exploratory analysis of the spatiotemporal dynamics of infectious disease spread. However, an OSN-based surveillance system requires comprehensive adoption, an enhanced geographical identification system, and advanced algorithms and computational linguistics to overcome its limitations and challenges. Moreover, OSNs will probably never replace traditional surveillance, but they can offer complementary data that work best when integrated with traditional data.

  3. Mujtaba G, Shuib L, Raj RG, Rajandram R, Shaikh K, Al-Garadi MA
    J Biomed Inform, 2018 06;82:88-105.
    PMID: 29738820 DOI: 10.1016/j.jbi.2018.04.013
    Text categorization has been used extensively in recent years to classify plain-text clinical reports. This study employs text categorization techniques for the classification of open narrative forensic autopsy reports. One of the key steps in text classification is document representation. In document representation, a clinical report is transformed into a format that is suitable for classification. The traditional document representation technique for text categorization is the bag-of-words (BoW) technique. In this study, the traditional BoW technique is ineffective in classifying forensic autopsy reports because it merely extracts frequent but discriminative features from clinical reports. Moreover, this technique fails to capture word inversion, as well as word-level synonymy and polysemy, when classifying autopsy reports. Hence, the BoW technique suffers from low accuracy and low robustness unless it is improved with contextual and application-specific information. To overcome the aforementioned limitations of the BoW technique, this research aims to develop an effective conceptual graph-based document representation (CGDR) technique to classify 1500 forensic autopsy reports from four (4) manners of death (MoD) and sixteen (16) causes of death (CoD). Term-based and Systematized Nomenclature of Medicine-Clinical Terms (SNOMED CT) based conceptual features were extracted and represented through graphs. These features were then used to train a two-level text classifier. The first level classifier was responsible for predicting MoD. In addition, the second level classifier was responsible for predicting CoD using the proposed conceptual graph-based document representation technique. To demonstrate the significance of the proposed technique, its results were compared with those of six (6) state-of-the-art document representation techniques. 
    Lastly, this study compared the effects of one-level classification and two-level classification on the experimental results. The experimental results indicated that the CGDR technique achieved 12% to 15% improvement in accuracy compared with fully automated document representation baseline techniques. Moreover, two-level classification obtained better results compared with one-level classification. The promising results of the proposed conceptual graph-based document representation technique suggest that pathologists can adopt the proposed system as their basis for second opinion, thereby supporting them in effectively determining CoD.
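
    The two-level scheme described in this abstract (a first classifier for MoD, then a second for CoD) can be sketched as a simple routing function. The abstract does not say how the second level is wired to the first, so this sketch assumes one CoD classifier per predicted MoD. The names `make_two_level`, `toy_mod`, and `toy_cod`, the keyword rules, and all labels are hypothetical stand-ins for trained models.

```python
from typing import Callable, Dict, Tuple

# A classifier here is any function mapping report text to a label.
Classifier = Callable[[str], str]


def make_two_level(
    mod_clf: Classifier, cod_clfs: Dict[str, Classifier]
) -> Callable[[str], Tuple[str, str]]:
    """Build a predictor that first picks the MoD, then the CoD.

    Assumption: the CoD classifier is chosen by the predicted MoD.
    """

    def predict(report: str) -> Tuple[str, str]:
        mod = mod_clf(report)        # level 1: manner of death
        cod = cod_clfs[mod](report)  # level 2: cause of death within that manner
        return mod, cod

    return predict


# Hypothetical keyword rules standing in for trained classifiers.
def toy_mod(report: str) -> str:
    return "accident" if "collision" in report.lower() else "natural"


toy_cod: Dict[str, Classifier] = {
    "accident": lambda r: "head injury" if "skull" in r.lower() else "other accident",
    "natural": lambda r: "cardiac" if "coronary" in r.lower() else "other natural",
}

classify = make_two_level(toy_mod, toy_cod)
print(classify("Collision with skull fracture"))
print(classify("Coronary artery occlusion"))
```

    Under this assumption each second-level classifier only has to separate the causes that belong to one manner of death, which is one plausible reading of why two-level classification could outperform a single flat classifier.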
  4. Mujtaba G, Shuib L, Raj RG, Rajandram R, Shaikh K, Al-Garadi MA
    PLoS One, 2017;12(2):e0170242.
    PMID: 28166263 DOI: 10.1371/journal.pone.0170242
    OBJECTIVES: Widespread implementation of electronic databases has improved the accessibility of plaintext clinical information for supplementary use. Numerous machine learning techniques, such as supervised machine learning approaches or ontology-based approaches, have been employed to obtain useful information from plaintext clinical data. This study proposes an automatic multi-class classification system to predict accident-related causes of death from plaintext autopsy reports through expert-driven feature selection with supervised automatic text classification decision models.

    METHODS: Accident-related autopsy reports were obtained from one of the largest hospitals in Kuala Lumpur. These reports belong to nine different accident-related causes of death. A master feature vector was prepared by extracting features from the collected autopsy reports using unigrams with lexical categorization. This master feature vector was used to detect the cause of death [according to the International Classification of Diseases, version 10 (ICD-10)] through five automated feature selection schemes, the proposed expert-driven approach, five feature subset sizes, and five machine learning classifiers. Model performance was evaluated using macro-averaged precision, recall, and F-measure (precisionM, recallM, F-measureM), accuracy, and area under the ROC curve. Four baselines were used to compare the results with the proposed system.

    RESULTS: Random forest and J48 decision models parameterized using expert-driven feature selection yielded the highest evaluation measures, approaching 85% to 90% for most metrics, with a feature subset size of 30. The proposed system also showed approximately 14% to 16% improvement in overall accuracy compared with the existing techniques and the four baselines.

    CONCLUSION: The proposed system is feasible and practical to use for automatic classification of ICD-10-related causes of death from autopsy reports. The proposed system helps pathologists accurately and rapidly determine the underlying cause of death based on autopsy findings. Furthermore, the proposed expert-driven feature selection approach and the findings are generally applicable to other kinds of plaintext clinical reports.
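
    The macro-averaged measures named in METHODS above can be sketched as follows: each measure is computed per class and then averaged with equal weight per class. This is an illustrative sketch, not the authors' evaluation code. In particular, macro F-measure has two common definitions; this sketch averages the per-class F-measures, whereas some papers instead compute F from macro precision and macro recall, and the abstract does not say which was used. The function name and toy labels are assumptions.

```python
def macro_scores(y_true, y_pred):
    """Return accuracy and macro-averaged precision, recall, and F-measure."""
    labels = sorted(set(y_true) | set(y_pred))
    pairs = list(zip(y_true, y_pred))
    precisions, recalls, f_measures = [], [], []
    for label in labels:
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        # Guard against division by zero for classes never predicted or never seen.
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        denom = precision + recall
        f_measure = 2 * precision * recall / denom if denom else 0.0
        precisions.append(precision)
        recalls.append(recall)
        f_measures.append(f_measure)
    n = len(labels)
    return {
        "accuracy": sum(1 for t, p in pairs if t == p) / len(pairs),
        "macro_precision": sum(precisions) / n,
        "macro_recall": sum(recalls) / n,
        "macro_f": sum(f_measures) / n,  # mean of per-class F (an assumption)
    }


# Hypothetical labels for three classes, two reports each.
y_true = ["a", "a", "b", "b", "c", "c"]
y_pred = ["a", "b", "b", "b", "c", "a"]
scores = macro_scores(y_true, y_pred)
print(scores)
```

    Because every class contributes equally to a macro average, a rare cause of death that is classified poorly lowers these scores as much as a common one would, which overall accuracy alone does not show.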

  5. Iqbal U, Wah TY, Habib Ur Rehman M, Mujtaba G, Imran M, Shoaib M
    J Med Syst, 2018 Nov 05;42(12):252.
    PMID: 30397730 DOI: 10.1007/s10916-018-1107-2
    Electrocardiography (ECG) sensors play a vital role in the Internet of Medical Things, and these sensors help in monitoring the electrical activity of the heart. ECG signal analysis can improve human life in many ways, from diagnosing diseases among cardiac patients to managing the lifestyles of diabetic patients. Abnormalities in heart activities lead to different cardiac diseases and arrhythmia. However, some cardiac diseases, such as myocardial infarction (MI) and atrial fibrillation (Af), require special attention due to their direct impact on human life. The classification of flattened T wave cases of MI in ECG signals, and how closely these cases resemble ST-T changes in MI, remain open issues for researchers. This article presents a novel contribution to classify MI and Af. To this end, we propose a new approach called deep deterministic learning (DDL), which works by combining predefined heart activities with fused datasets. In this research, we used two datasets. The first dataset, Massachusetts Institute of Technology-Beth Israel Hospital, is publicly available, and we exclusively obtained the second dataset from the University of Malaya Medical Center, Kuala Lumpur, Malaysia. We first initiated predefined activities on each individual dataset to recognize patterns between the ST-T change and flattened T wave cases and then used the data fusion approach to merge both datasets in a manner that delivers the most accurate pattern recognition results. The proposed DDL approach is a systematic stage-wise methodology that relies on accurate detection of R peaks in ECG signals, time domain features of ECG signals, and fine-tuning of artificial neural networks. The empirical evaluation shows high accuracy (up to 99.97%) in pattern matching ST-T changes and flattened T waves using the proposed DDL approach. The proposed pattern recognition approach is a significant contribution to the diagnosis of special cases of MI.