Displaying publications 21 - 40 of 325 in total

  1. Asim Shahid M, Alam MM, Mohd Su'ud M
    PLoS One, 2023;18(4):e0284209.
    PMID: 37053173 DOI: 10.1371/journal.pone.0284209
    Cloud computing is among the fastest-growing technologies in the computer industry, offering benefits and opportunities while also raising difficulties and issues that affect how readily users accept and adopt the technology. The proposed research compares machine learning (ML) algorithms, namely Naïve Bayes (NB), Library Support Vector Machine (LibSVM), Multinomial Logistic Regression (MLR), Sequential Minimal Optimization (SMO), K-Nearest Neighbor (KNN), and Random Forest (RF), to determine which classifier gives higher accuracy and fewer fault prediction errors. In this research, the secondary data results show that the NB classifier gives the highest accuracy and lowest fault prediction on CPU-Mem Mono, with 80/20 (77.01%), 70/30 (76.05%), and 5-fold cross-validation (74.88%), and on CPU-Mem Multi, with 80/20 (89.72%), 70/30 (90.28%), and 5-fold cross-validation (92.83%). On HDD Mono, the SMO classifier gives the highest accuracy and lowest fault prediction, with 80/20 (87.72%), 70/30 (89.41%), and 5-fold cross-validation (88.38%), and on HDD Multi, with 80/20 (93.64%), 70/30 (90.91%), and 5-fold cross-validation (88.20%). In the primary data results, the RF classifier gives the highest accuracy and lowest fault prediction, with 80/20 (97.14%), 70/30 (96.19%), and 5-fold cross-validation (95.85%), but its algorithm complexity (0.17 seconds) is not good. SMO has the second-highest accuracy and lowest fault prediction, with 80/20 (95.71%), 70/30 (95.71%), and 5-fold cross-validation (95.71%), but good algorithm complexity (0.3 seconds). The difference between RF and SMO in accuracy and fault prediction is only (.13%), and the difference in time complexity is (14 seconds). We therefore decided to modify SMO. Finally, the Modified Sequential Minimal Optimization (MSMO) algorithm is proposed, which achieves the highest accuracy and lowest fault prediction error, with 80/20 (96.42%), 70/30 (96.42%), and 5-fold cross-validation (96.50%).
    Matched MeSH terms: Machine Learning*
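    Several entries in this listing, this one included, report results under 80/20 and 70/30 hold-out splits and 5-fold cross-validation. As a minimal illustrative sketch of those evaluation protocols only (plain Python with synthetic indices; not the study's data, classifiers, or code):

    ```python
    import random

    def holdout_split(indices, train_frac, seed=0):
        """Shuffle indices and split them into train/test by the given fraction."""
        rng = random.Random(seed)
        shuffled = indices[:]
        rng.shuffle(shuffled)
        cut = int(len(shuffled) * train_frac)
        return shuffled[:cut], shuffled[cut:]

    def kfold_splits(indices, k=5, seed=0):
        """Yield (train, test) index lists for k-fold cross-validation."""
        rng = random.Random(seed)
        shuffled = indices[:]
        rng.shuffle(shuffled)
        folds = [shuffled[i::k] for i in range(k)]
        for i in range(k):
            test = folds[i]
            train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
            yield train, test

    data = list(range(100))
    train, test = holdout_split(data, 0.8)   # the 80/20 protocol
    print(len(train), len(test))             # 80 20
    for train_idx, test_idx in kfold_splits(data, k=5):
        assert len(train_idx) + len(test_idx) == 100
    ```

    Each classifier is then fitted on every training portion and scored on the corresponding held-out portion, which is how the per-split accuracy figures above would be produced.
    
    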
  2. Teo BG, Dhillon SK
    BMC Bioinformatics, 2019 Dec 24;20(Suppl 19):658.
    PMID: 31870297 DOI: 10.1186/s12859-019-3210-x
    BACKGROUND: Studying the structural and functional morphology of small organisms such as monogeneans is difficult due to the lack of visualization in three dimensions. One possible way to resolve this visualization issue is to create digital 3D models, which may aid researchers in studying the morphology and function of the monogenean. However, the development of 3D models is a tedious procedure, as one has to repeat an entire complicated modelling process for every new target 3D shape using comprehensive 3D modelling software. This study was designed to develop an alternative 3D modelling approach to build 3D models of monogenean anchors, which can be used to understand these morphological structures in three dimensions. This alternative approach aims to avoid repeating the tedious modelling procedure for every single target 3D model from scratch.

    RESULT: An automated 3D modeling pipeline empowered by an Artificial Neural Network (ANN) was developed. This automated 3D modelling pipeline enables automated deformation of a generic 3D model of monogenean anchor into another target 3D anchor. The 3D modelling pipeline empowered by ANN has managed to automate the generation of the 8 target 3D models (representing 8 species: Dactylogyrus primaries, Pellucidhaptor merus, Dactylogyrus falcatus, Dactylogyrus vastator, Dactylogyrus pterocleidus, Dactylogyrus falciunguis, Chauhanellus auriculatum and Chauhanellus caelatus) of monogenean anchor from the respective 2D illustrations input without repeating the tedious modelling procedure.

    CONCLUSIONS: Despite some constraints and limitations, the automated 3D modelling pipeline developed in this study demonstrates a working application of a machine learning approach to 3D modelling work. This study has not only developed an automated 3D modelling pipeline but has also demonstrated a cross-disciplinary research design that integrates machine learning into a specific domain of study, such as 3D modelling of biological structures.

    Matched MeSH terms: Machine Learning*
  3. Ong SQ, Isawasan P, Ngesom AMM, Shahar H, Lasim AM, Nair G
    Sci Rep, 2023 Nov 05;13(1):19129.
    PMID: 37926755 DOI: 10.1038/s41598-023-46342-2
    Machine learning (ML) algorithms are receiving a lot of attention in the development of predictive models for monitoring dengue transmission rates. Previous work has focused only on specific weather variables and algorithms, and there is still a need for a model that uses more variables and algorithms with higher performance. In this study, we use vector indices and meteorological data as predictors to develop the ML models. We trained and validated seven ML algorithms, including an ensemble ML method, and compared their performance using the receiver operating characteristic (ROC) with the area under the curve (AUC), accuracy, and F1 score. Our results show that ensemble ML methods such as XGBoost, AdaBoost, and Random Forest perform better than logistic regression, Naïve Bayes, decision tree, and support vector machine (SVM), with XGBoost having the highest AUC, accuracy, and F1 score. Analysis of variable importance showed that the container index was the least important. By removing this variable, the ML models improved their performance by at least 6% in AUC and F1 score. Our results provide a framework for future studies on the use of predictive models in the development of an early warning system.
    Matched MeSH terms: Machine Learning*
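    The AUC reported by this entry has a simple rank-statistic interpretation: the probability that a randomly chosen positive case is scored above a randomly chosen negative case. A minimal sketch of that computation (toy labels and scores, not the study's data):

    ```python
    def roc_auc(labels, scores):
        """AUC as the probability that a random positive outscores a random
        negative (ties count half): the rank-statistic view of ROC AUC."""
        pos = [s for y, s in zip(labels, scores) if y == 1]
        neg = [s for y, s in zip(labels, scores) if y == 0]
        wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
                   for p in pos for n in neg)
        return wins / (len(pos) * len(neg))

    # Toy example: one positive is outscored by one negative,
    # so 3 of 4 positive/negative pairs are ordered correctly.
    labels = [1, 1, 0, 0]
    scores = [0.9, 0.4, 0.3, 0.7]
    print(roc_auc(labels, scores))  # 0.75
    ```
    
    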
  4. Yeow MYH, Chong CY, Lim MK, Yee Yen Y
    PLoS One, 2025;20(2):e0314512.
    PMID: 39946354 DOI: 10.1371/journal.pone.0314512
    Software reuse is an essential practice to increase efficiency and reduce costs in software production. Software reuse practices include reusing artifacts, libraries, components, packages, and APIs. Identifying suitable software for reuse requires pinpointing potential candidates. However, there are no objective methods in place to measure software reuse. This makes it challenging to identify highly reusable software. Software reuse research mainly addresses two hurdles: 1) identifying reusable candidates effectively and efficiently, and 2) selecting high-quality software components that improve maintainability and extensibility. This paper proposes automating software reuse prediction by leveraging machine learning (ML) algorithms, enabling future researchers and practitioners to better identify highly reusable software. Our approach uses cross-project code clone detection to establish the ground truth for software reuse, identifying code clones across popular GitHub projects as indicators of potential reuse candidates. Software metrics were extracted from Maven artifacts and used to train classification and regression models to predict and estimate software reuse. The average F1-score of the ML classification models is 77.19%. The best-performing model, Ridge Regression, achieved an F1-score of 79.17%. Additionally, this research aims to assist developers by identifying key metrics that significantly impact software reuse. Our findings suggest that the file-level PUA (Public Undocumented API) metric is the most important factor influencing software reuse. We also present suitable value ranges for the top five important metrics that developers can follow to create highly reusable software. Furthermore, we developed a tool that utilizes the trained models to predict the reuse potential of existing GitHub projects and rank Maven artifacts by their domain.
    Matched MeSH terms: Machine Learning*
  5. Hassan MK, Syed Ariffin SH, Ghazali NE, Hamad M, Hamdan M, Hamdi M, et al.
    Sensors (Basel), 2022 May 09;22(9).
    PMID: 35591282 DOI: 10.3390/s22093592
    Recently, there has been an increasing need for new applications and services such as big data, blockchains, vehicle-to-everything (V2X), the Internet of things, 5G, and beyond. Therefore, to maintain quality of service (QoS), accurate network resource planning and forecasting are essential steps for resource allocation. This study proposes a reliable hybrid dynamic bandwidth slice forecasting framework that combines the long short-term memory (LSTM) neural network and local smoothing methods to improve the network forecasting model. Moreover, the proposed framework can dynamically react to all the changes occurring in the data series. Backbone traffic was used to validate the proposed method. As a result, the forecasting accuracy improved significantly with the proposed framework and with minimal data loss from the smoothing process. The results showed that the hybrid moving average LSTM (MLSTM) achieved the most remarkable improvement in the training and testing forecasts, with 28% and 24% for long-term evolution (LTE) time series and with 35% and 32% for the multiprotocol label switching (MPLS) time series, respectively, while robust locally weighted scatter plot smoothing and LSTM (RLWLSTM) achieved the most significant improvement for upstream traffic with 45%; moreover, the dynamic learning framework achieved improvement percentages that can reach up to 100%.
    Matched MeSH terms: Machine Learning*
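    The hybrid framework in this entry pairs an LSTM forecaster with local smoothing of the traffic series. The sketch below shows only the smoothing stage, a trailing moving average in plain Python over made-up traffic values; the LSTM component and the study's actual smoothing methods are not reproduced here:

    ```python
    def moving_average(series, window):
        """Trailing moving average; the first window-1 points are dropped,
        which is the small data loss a smoothing pass introduces."""
        if window < 1 or window > len(series):
            raise ValueError("window must be between 1 and len(series)")
        out = []
        acc = sum(series[:window])
        out.append(acc / window)
        for i in range(window, len(series)):
            acc += series[i] - series[i - window]   # slide the window in O(1)
            out.append(acc / window)
        return out

    traffic = [10, 12, 11, 40, 12, 13, 11, 12]   # a spike at index 3
    print(moving_average(traffic, 3))            # spike is flattened out
    ```

    The smoothed series, rather than the raw one, would then be fed to the forecaster, which is the design choice the abstract credits for the accuracy gain.
    
    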
  6. Rosa D, Elya B, Hanafi M, Khatib A, Budiarto E, Nur S, et al.
    PLoS One, 2025;20(1):e0313592.
    PMID: 39752479 DOI: 10.1371/journal.pone.0313592
    One way to treat type II diabetes mellitus is by using an α-glucosidase inhibitor, which slows down postprandial glucose intake. Metabolomics analysis of Artabotrys sumatranus leaf extract was used in this research to predict the active compounds acting as α-glucosidase inhibitors in this extract. Both multivariate statistical analysis and machine learning approaches were used to improve the confidence of the predictions. After performance comparisons with other machine learning methods, random forest was chosen to build the predictive model for the activity of the extract samples. Feature importance analysis (using random feature permutation and Shapley score calculation) was used to identify the predicted active compounds as the important features that influenced the activity prediction of the extract samples. The combined multivariate statistical analysis and machine learning predicted 9 active compounds, 6 of which were identified as mangiferin, neomangiferin, norisocorydine, apigenin-7-O-galactopyranoside, lirioferine, and 15,16-dihydrotanshinone I. The activities of norisocorydine, apigenin-7-O-galactopyranoside, and lirioferine as α-glucosidase inhibitors have not been reported before. Molecular docking simulation, both to 3A4A (the α-glucosidase enzyme from Saccharomyces cerevisiae, usually used in bioassay tests) and to 3TOP (a part of the α-glucosidase enzyme in the human gut), showed strong to very strong binding of the identified predicted active compounds to both receptors, with the exception of neomangiferin, which only showed strong binding to the 3TOP receptor. Isolation based on bioassay-guided fractionation further verified the metabolomics prediction by successfully isolating mangiferin from the extract, which showed strong α-glucosidase activity when subjected to a bioassay test. The correlation analysis also showed a possibility of 3 groups in the predicted active compounds, which might be related to the biosynthesis pathway (further research is needed for verification). Another result from the correlation analysis was that, in general, the α-glucosidase inhibition activity in the extract had a strong correlation with antioxidant activity, which was also reflected in the predicted active compounds. Only one predicted compound had a very low positive correlation with antioxidant activity.
    Matched MeSH terms: Machine Learning*
  7. Hasan RI, Yusuf SM, Alzubaidi L
    Plants (Basel), 2020 Oct 01;9(10).
    PMID: 33019765 DOI: 10.3390/plants9101302
    Deep learning (DL) represents the golden era in the machine learning (ML) domain, and it has gradually become the leading approach in many fields. It is currently playing a vital role in the early detection and classification of plant diseases. The use of ML techniques in this field is viewed as having brought considerable improvement in cultivation productivity sectors, particularly with the recent emergence of DL, which seems to have increased accuracy levels. Recently, many DL architectures have been implemented accompanying visualisation techniques that are essential for determining symptoms and classifying plant diseases. This review investigates and analyses the most recent methods, developed over three years leading up to 2020, for training, augmentation, feature fusion and extraction, recognising and counting crops, and detecting plant diseases, including how these methods can be harnessed to feed deep classifiers and their effects on classifier accuracy.
    Matched MeSH terms: Machine Learning
  8. Rahman MM, Khatun F, Uzzaman A, Sami SI, Bhuiyan MA, Kiong TS
    Int J Health Serv, 2021 10;51(4):446-461.
    PMID: 33999732 DOI: 10.1177/00207314211017469
    The novel coronavirus disease (COVID-19) has spread over 219 countries of the globe as a pandemic, creating alarming impacts on health care, socioeconomic environments, and international relationships. The principal objective of the study is to provide the current technological aspects of artificial intelligence (AI) and other relevant technologies and their implications for confronting COVID-19 and preventing the pandemic's dreadful effects. This article presents AI approaches that have significant contributions in the fields of health care, then highlights and categorizes their applications in confronting COVID-19, such as detection and diagnosis, data analysis and treatment procedures, research and drug development, social control and services, and the prediction of outbreaks. The study addresses the link between the technologies and the epidemics as well as the potential impacts of technology in health care with the introduction of machine learning and natural language processing tools. It is expected that this comprehensive study will support researchers in modeling health care systems and drive further studies in advanced technologies. Finally, we propose future directions in research and conclude that persuasive AI strategies, probabilistic models, and supervised learning are required to tackle future pandemic challenges.
    Matched MeSH terms: Machine Learning
  9. Melisa Anak Adeh, Mohd Ibrahim Shapiai, Ayman Maliha, Muhammad Hafiz Md Zaini
    MyJurnal
    Nowadays, applications that track moving objects are commonly used in various areas, especially in computer vision. Many tracking algorithms have been introduced, and they are divided into three groups: generative trackers, discriminative trackers, and hybrid trackers. One such method is the Tracking-Learning-Detection (TLD) framework, an example of a hybrid tracker that combines generative and discriminative trackers. In TLD, the detector consists of three stages: patch variance, an ensemble classifier, and a K-Nearest Neighbor classifier. In the second stage, the ensemble classifier depends on simple pixel comparison, so it is likely to fail to offer a good generalization of the appearances of the target object in the detection process. In this paper, an Online Sequential Extreme Learning Machine (OS-ELM) was used to replace the ensemble classifier in the TLD framework. In addition, different types of Haar-like features were used for the feature extraction process instead of using raw pixel values as the features. The objectives of this study are to improve the classifier in the second stage of the detector in the TLD framework by using Haar-like features as input to the classifier, and to obtain a more generalized detector in the TLD framework by using an OS-ELM-based detector. The results showed that the proposed method performs better on Pedestrian 1 in terms of F-measure and also offers good performance in terms of precision in four out of six videos.
    Matched MeSH terms: Machine Learning
  10. Zafar R, Qayyum A, Mumtaz W
    J Integr Neurosci, 2019 Sep 30;18(3):217-229.
    PMID: 31601069 DOI: 10.31083/j.jin.2019.03.164
    Electroencephalogram recordings are often confounded with artifacts, especially eye blinks. Different methods for artifact detection and removal are discussed in the literature, including automatic detection and removal. Here, an automatic method of eye blink detection and correction is proposed in which sparse coding is applied to an electroencephalogram dataset. In this method, a hybrid dictionary based on a ridgelet transformation is used to capture prominent features by analyzing independent components extracted from a different number of electroencephalogram channels. In this study, the proposed method has been tested and validated with five different datasets for artifact detection and correction. Results show that the proposed technique is promising, as it successfully extracted the exact locations of eye blink artifacts. The accuracy of the method (automatic detection) is 89.6%, which represents a better estimate than that obtained by an extreme learning machine classifier.
    Matched MeSH terms: Machine Learning
  11. Almaleeh AA, Zakaria A, Kamarudin LM, Rahiman MHF, Ndzi DL, Ismail I
    Sensors (Basel), 2022 Jan 05;22(1).
    PMID: 35009947 DOI: 10.3390/s22010405
    The moisture content of stored rice is dependent on the surrounding and environmental factors which in turn affect the quality and economic value of the grains. Therefore, the moisture content of grains needs to be measured frequently to ensure that optimum conditions that preserve their quality are maintained. The current state of the art for moisture measurement of rice in a silo is based on grab sampling or relies on single rod sensors placed randomly into the grain. The sensors that are currently used are very localized and are, therefore, unable to provide continuous measurement of the moisture distribution in the silo. To the authors' knowledge, there is no commercially available 3D volumetric measurement system for rice moisture content in a silo. Hence, this paper presents results of work carried out using low-cost wireless devices that can be placed around the silo to measure changes in the moisture content of rice. This paper proposes a novel technique based on radio frequency tomographic imaging using low-cost wireless devices and regression-based machine learning to provide contactless non-destructive 3D volumetric moisture content distribution in stored rice grain. This proposed technique can detect multiple levels of localized moisture distributions in the silo with accuracies greater than or equal to 83.7%, depending on the size and shape of the sample under test. Unlike other approaches proposed in open literature or employed in the sector, the proposed system can be deployed to provide continuous monitoring of the moisture distribution in silos.
    Matched MeSH terms: Machine Learning
  12. Singh OP, Vallejo M, El-Badawy IM, Aysha A, Madhanagopal J, Mohd Faudzi AA
    Comput Biol Med, 2021 Sep;136:104650.
    PMID: 34329865 DOI: 10.1016/j.compbiomed.2021.104650
    Due to the continued evolution of the SARS-CoV-2 pandemic, researchers worldwide are working to mitigate and suppress its spread and to better understand it by deploying digital signal processing (DSP) and machine learning approaches. This study presents an alignment-free approach to classify SARS-CoV-2 using complementary DNA, which is DNA synthesized from the single-stranded RNA virus. Herein, a total of 1582 samples, with different lengths of genome sequences from different regions, were collected from various data sources and divided into a SARS-CoV-2 and a non-SARS-CoV-2 group. We extracted eight biomarkers based on three-base periodicity, using DSP techniques, and ranked those based on a filter-based feature selection. The ranked biomarkers were fed into k-nearest neighbor, support vector machine, decision tree, and random forest classifiers for the classification of SARS-CoV-2 from other coronaviruses. The training dataset was used to test the performance of the classifiers based on accuracy and F-measure via 10-fold cross-validation. Kappa-scores were estimated to check the influence of unbalanced data. Further, a 10 × 10 cross-validation paired t-test was utilized to test the best model with unseen data. Random forest was selected as the best model, differentiating the SARS-CoV-2 coronavirus from other coronaviruses and a control group with an accuracy of 97.4%, sensitivity of 96.2%, and specificity of 98.2%, when tested with unseen samples. Moreover, the proposed algorithm was computationally efficient, taking only 0.31 s to compute the genome biomarkers, outperforming previous studies.
    Matched MeSH terms: Machine Learning
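    Three-base periodicity, the signal this entry's biomarkers build on, is classically measured as the power of a base-indicator sequence's DFT at the N/3 frequency bin. A minimal sketch of that single measurement (toy sequence; the study's eight actual biomarkers and feature ranking are not reproduced here):

    ```python
    import cmath

    def indicator(seq, base):
        """Binary indicator: 1.0 where the sequence has the given base."""
        return [1.0 if b == base else 0.0 for b in seq]

    def dft_power_at(u, k):
        """Power of the length-N discrete Fourier transform of u at bin k."""
        n = len(u)
        x = sum(u[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
        return abs(x) ** 2

    # A perfectly 3-periodic toy sequence: 'A' appears at every 3rd site,
    # so the indicator puts all its energy at bin N/3.
    seq = "ACGACGACG"          # N = 9, period 3
    u = indicator(seq, "A")
    print(round(dft_power_at(u, len(seq) // 3), 6))  # 9.0, i.e. (N/3)**2
    ```

    On real genomes the peak at N/3 is partial rather than total, and its strength relative to the rest of the spectrum is what makes it usable as a classification feature.
    
    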
  13. Huqh MZU, Abdullah JY, Wong LS, Jamayet NB, Alam MK, Rashid QF, et al.
    Int J Environ Res Public Health, 2022 Aug 31;19(17).
    PMID: 36078576 DOI: 10.3390/ijerph191710860
    OBJECTIVE: The objective of this systematic review was (a) to explore the current clinical applications of AI/ML (Artificial intelligence and Machine learning) techniques in diagnosis and treatment prediction in children with CLP (Cleft lip and palate), (b) to create a qualitative summary of results of the studies retrieved.

    MATERIALS AND METHODS: An electronic search was carried out using databases such as PubMed, Scopus, and the Web of Science Core Collection. Two reviewers searched the databases separately and concurrently. The initial search was conducted on 6 July 2021. The publishing period was unrestricted; however, the search was limited to articles involving human participants and published in English. Combinations of Medical Subject Headings (MeSH) phrases and free-text terms were used as search keywords in each database. The following data were taken from the methods and results sections of the selected papers: the number of AI training datasets used to train the intelligent system, along with their conditional properties, and the problems studied, which include unilateral CLP, bilateral CLP, unilateral cleft lip and alveolus, unilateral cleft lip, hypernasality, dental characteristics, and the sagittal jaw relationship in children with CLP.

    RESULTS: Based on the predefined search strings with accompanying database keywords, a total of 44 articles were found in Scopus, PubMed, and Web of Science search results. After reading the full articles, 12 papers were included for systematic analysis.

    CONCLUSIONS: Artificial intelligence provides an advanced technology that can be employed in AI-enabled computerized programming software for accurate landmark detection, rapid digital cephalometric analysis, clinical decision-making, and treatment prediction. In children with corrected unilateral cleft lip and palate, ML can help detect cephalometric predictors of future need for orthognathic surgery.

    Matched MeSH terms: Machine Learning
  14. Nilashi M, Abumalloh RA, Yusuf SYM, Thi HH, Alsulami M, Abosaq H, et al.
    Comput Biol Chem, 2023 Feb;102:107788.
    PMID: 36410240 DOI: 10.1016/j.compbiolchem.2022.107788
    Predicting the Unified Parkinson's Disease Rating Scale (UPDRS) on the Total-UPDRS and Motor-UPDRS clinical scales is an important part of controlling PD. Computational intelligence approaches have been used effectively in the early diagnosis of PD by predicting UPDRS. In this research, we aim to present a combined approach for PD diagnosis using an ensemble learning approach with the ability of online learning from large clinical datasets. The method is developed using Deep Belief Network (DBN) and Neuro-Fuzzy approaches. A clustering approach, Expectation-Maximization (EM), is used to handle large datasets. The Principal Component Analysis (PCA) technique is employed for noise removal from the data. The UPDRS prediction models are constructed for PD diagnosis. To handle missing data, K-NN is used in the proposed method. We use incremental machine learning approaches to improve the efficiency of the proposed method. We assess our approach on a real-world PD dataset, and the findings are compared with other PD diagnosis approaches developed with machine learning techniques. The findings revealed that the approach can improve UPDRS prediction accuracy and the time complexity of previous methods in handling large datasets.
    Matched MeSH terms: Machine Learning
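    This entry handles missing data with K-NN, an approach several other entries in this listing also lean on. A minimal sketch of nearest-neighbour imputation in plain Python (tiny made-up table, distances over shared observed columns only; not the study's implementation):

    ```python
    def knn_impute(rows, k=2):
        """Fill None entries using the mean of that column over the k complete
        rows nearest in squared distance on mutually observed columns."""
        def dist(a, b):
            shared = [(x, y) for x, y in zip(a, b)
                      if x is not None and y is not None]
            return sum((x - y) ** 2 for x, y in shared) / len(shared)

        complete = [r for r in rows if None not in r]
        filled = []
        for r in rows:
            if None not in r:
                filled.append(r[:])
                continue
            neighbours = sorted(complete, key=lambda c: dist(r, c))[:k]
            filled.append([sum(n[j] for n in neighbours) / k if v is None else v
                           for j, v in enumerate(r)])
        return filled

    rows = [[1.0, 2.0], [1.1, 2.2], [9.0, 9.5], [1.05, None]]
    print(knn_impute(rows, k=2))   # last row's gap filled from its 2 neighbours
    ```
    
    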
  15. Sharma V, Singh A, Chauhan S, Sharma PK, Chaudhary S, Sharma A, et al.
    Curr Drug Deliv, 2024;21(6):870-886.
    PMID: 37670704 DOI: 10.2174/1567201821666230905090621
    Drug discovery and development (DDD) is a highly complex process that necessitates precise monitoring and extensive data analysis at each stage. Furthermore, the DDD process is both time-consuming and costly. To tackle these concerns, artificial intelligence (AI) technology can be used, which facilitates rapid and precise analysis of extensive datasets within a limited timeframe. The pathophysiology of cancer is complicated and requires extensive research for novel drug discovery and development. The first stage in the process of drug discovery and development involves identifying targets. Cell structure and molecular functioning are complex due to the vast number of molecules that function constantly, performing various roles. Furthermore, scientists are continually discovering novel cellular mechanisms and molecules, expanding the range of potential targets. Accurately identifying the correct target is a crucial step in the preparation of a treatment strategy. Various forms of AI, such as machine learning, neural-based learning, deep learning, and network-based learning, are currently being utilised in applications, online services, and databases. These technologies facilitate the identification and validation of targets, ultimately contributing to the success of projects. This review focuses on the different types and subcategories of AI databases utilised in the field of drug discovery and target identification for cancer.
    Matched MeSH terms: Machine Learning
  16. Masseran N, Safari MAM, Tajuddin RRM
    Environ Monit Assess, 2024 May 08;196(6):523.
    PMID: 38717514 DOI: 10.1007/s10661-024-12700-4
    Air pollution events can be categorized as extreme or non-extreme on the basis of their magnitude of severity. High-risk extreme air pollution events will exert a disastrous effect on the environment. Therefore, public health and policy-making authorities must be able to determine the characteristics of these events. This study proposes a probabilistic machine learning technique for predicting the classification of extreme and non-extreme events on the basis of data features to address the above issue. The use of the naïve Bayes model in the prediction of air pollution classes is proposed to leverage its simplicity as well as high accuracy and efficiency. A case study was conducted on the air pollution index data of Klang, Malaysia, for the period of January 01, 1997, to August 31, 2020. The trained naïve Bayes model achieves high accuracy, sensitivity, and specificity on the training and test datasets. Therefore, the naïve Bayes model can be easily applied in air pollution analysis while providing a promising solution for the accurate and efficient prediction of extreme or non-extreme air pollution events. The findings of this study provide reliable information to public authorities for monitoring and managing sustainable air quality over time.
    Matched MeSH terms: Machine Learning
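    The naïve Bayes model this entry proposes is simple enough to sketch end to end. Below is a from-scratch Gaussian naïve Bayes on a single feature, with entirely hypothetical air-pollution-index values standing in for the Klang data (which is not reproduced here):

    ```python
    import math
    from collections import defaultdict

    def fit_gnb(xs, ys):
        """Per-class mean/variance of one feature plus log-priors."""
        groups = defaultdict(list)
        for x, y in zip(xs, ys):
            groups[y].append(x)
        model = {}
        for y, vals in groups.items():
            mu = sum(vals) / len(vals)
            var = sum((v - mu) ** 2 for v in vals) / len(vals) + 1e-9
            model[y] = (mu, var, math.log(len(vals) / len(xs)))
        return model

    def predict_gnb(model, x):
        """Class with the highest log posterior under a Gaussian likelihood."""
        def log_post(y):
            mu, var, log_prior = model[y]
            return (log_prior - 0.5 * math.log(2 * math.pi * var)
                    - (x - mu) ** 2 / (2 * var))
        return max(model, key=log_post)

    # Hypothetical air pollution index values: extreme days cluster high.
    api = [45, 50, 55, 60, 150, 160, 170]
    label = ["non-extreme"] * 4 + ["extreme"] * 3
    model = fit_gnb(api, label)
    print(predict_gnb(model, 48), predict_gnb(model, 155))  # non-extreme extreme
    ```

    The model's appeal, as the abstract notes, is exactly this: training reduces to per-class summary statistics, so it stays fast and easy to apply.
    
    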
  17. Kaleem S, Sohail A, Tariq MU, Babar M, Qureshi B
    PLoS One, 2023;18(10):e0292587.
    PMID: 37819992 DOI: 10.1371/journal.pone.0292587
    Coronavirus disease (COVID-19), which has caused a global pandemic, continues to have severe effects on human lives worldwide. Characterized by symptoms similar to pneumonia, its rapid spread requires innovative strategies for its early detection and management. In response to this crisis, data science and machine learning (ML) offer crucial solutions to complex problems, including those posed by COVID-19. One cost-effective approach to detect the disease is the use of chest X-rays, which is a common initial testing method. Although existing techniques are useful for detecting COVID-19 using X-rays, there is a need for further improvement in efficiency, particularly in terms of training and execution time. This article introduces an advanced architecture that leverages an ensemble learning technique for COVID-19 detection from chest X-ray images. Using a parallel and distributed framework, the proposed model integrates ensemble learning with big data analytics to facilitate parallel processing. This approach aims to enhance both execution and training times, ensuring a more effective detection process. The model's efficacy was validated through a comprehensive analysis of predicted and actual values, and its performance was meticulously evaluated for accuracy, precision, recall, and F-measure, and compared to state-of-the-art models. The work presented here not only contributes to the ongoing fight against COVID-19 but also showcases the wider applicability and potential of ensemble learning techniques in healthcare.
    Matched MeSH terms: Machine Learning
  18. Alabsi BA, Anbar M, Rihan SDA
    Sensors (Basel), 2023 Jun 16;23(12).
    PMID: 37420810 DOI: 10.3390/s23125644
    The increasing use of Internet of Things (IoT) devices has led to a rise in Distributed Denial of Service (DDoS) and Denial of Service (DoS) attacks on these networks. These attacks can have severe consequences, resulting in the unavailability of critical services and financial losses. In this paper, we propose an Intrusion Detection System (IDS) based on a Conditional Tabular Generative Adversarial Network (CTGAN) for detecting DDoS and DoS attacks on IoT networks. Our CTGAN-based IDS utilizes a generator network to produce synthetic traffic that mimics legitimate traffic patterns, while the discriminator network learns to differentiate between legitimate and malicious traffic. The synthetic tabular data generated by CTGAN is employed to train multiple shallow machine-learning and deep-learning classifiers, enhancing their detection performance. The proposed approach is evaluated using the Bot-IoT dataset, measuring detection accuracy, precision, recall, and F1 measure. Our experimental results demonstrate the accurate detection of DDoS and DoS attacks on IoT networks using the proposed approach. Furthermore, the results highlight the significant contribution of CTGAN in improving the performance of detection models in machine learning and deep learning classifiers.
    Matched MeSH terms: Machine Learning
  19. Letchumanan N, Wong JHD, Tan LK, Ab Mumin N, Ng WL, Chan WY, et al.
    J Digit Imaging, 2023 Aug;36(4):1533-1540.
    PMID: 37253893 DOI: 10.1007/s10278-022-00753-1
    This study investigates the feasibility of using texture radiomics features extracted from mammography images to distinguish between benign and malignant breast lesions and to classify benign lesions into different categories, and determines the best machine learning (ML) model to perform the tasks. Six hundred and twenty-two breast lesions from 200 retrospective patients' data were segmented and analysed. Three hundred fifty radiomics features were extracted using the Standardized Environment for Radiomics Analysis (SERA) library, one of the radiomics implementations endorsed by the Image Biomarker Standardisation Initiative (IBSI). The radiomics features and selected patient characteristics were used to train selected machine learning models to classify the breast lesions. A fivefold cross-validation was used to evaluate the performance of the ML models, and the top 10 most important features were identified. The random forest (RF) ensemble gave the highest accuracy (89.3%), positive predictive value (66%), and likelihood ratio (13.5) in categorising benign and malignant lesions. For the classification of benign lesions, the RF model again gave the highest likelihood ratio (3.4) compared to the other models. Morphological and textural radiomics features were identified as the top 10 most important features from the random forest models. Patient age was also identified as one of the significant features in the RF model. We concluded that machine learning models trained against texture-based radiomics features and patient features give reasonable performance in differentiating benign versus malignant breast lesions. Our study also demonstrated that the radiomics-based machine learning models were able to emulate the visual assessment of mammography lesions typically used by radiologists, leading to a better understanding of how the machine learning models arrive at their decisions.
    Matched MeSH terms: Machine Learning
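    The texture features this entry relies on include simple first-order statistics of the intensity values inside a region of interest. A minimal sketch of three such radiomics-style features (a made-up 3x3 patch; the SERA library's 350 standardized features are far richer than this):

    ```python
    import math

    def first_order_texture(patch):
        """First-order texture statistics of a 2D intensity patch: mean,
        variance, and skewness, the simplest radiomics-style features."""
        vals = [v for row in patch for v in row]
        n = len(vals)
        mean = sum(vals) / n
        var = sum((v - mean) ** 2 for v in vals) / n
        sd = math.sqrt(var)
        skew = 0.0 if sd == 0 else sum((v - mean) ** 3 for v in vals) / (n * sd ** 3)
        return {"mean": mean, "variance": var, "skewness": skew}

    # Hypothetical 3x3 region of interest with one bright spot (arbitrary units).
    roi = [[10, 12, 11],
           [13, 50, 12],
           [11, 12, 10]]
    print(first_order_texture(roi))   # large positive skewness from the spot
    ```

    Feature vectors like this one, computed per lesion, are what get fed to the classifiers described above.
    
    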
  20. Sayeed S, Ahmad AF, Peng TC
    F1000Res, 2022;11:17.
    PMID: 38269303 DOI: 10.12688/f1000research.73613.1
    The Internet of Things (IoT) is leading the physical and digital worlds of technology to converge. Real-time and massive-scale connections produce a large amount of versatile data, which is where Big Data comes into the picture. Big Data refers to large, diverse sets of information with dimensions that go beyond the capabilities of widely used database management systems or standard data processing software tools to manage within a given limit. Almost every big dataset is dirty and may contain missing data, mistyping, inaccuracies, and many more issues that impact Big Data analytics performance. One of the biggest challenges in Big Data analytics is to discover and repair dirty data; failure to do this can lead to inaccurate analytics results and unpredictable conclusions. We experimented with different missing value imputation techniques and compared machine learning (ML) model performances under different imputation methods. We propose a hybrid model for missing value imputation combining ML and sample-based statistical techniques. Furthermore, we continued with the best imputed dataset, chosen based on ML model performance, for feature engineering and hyperparameter tuning. We used k-means clustering and principal component analysis. The evaluated outcome improved dramatically, with the XGBoost model achieving very high accuracy at around 0.125 root mean squared logarithmic error (RMSLE). To overcome overfitting, we used k-fold cross-validation.
    Matched MeSH terms: Machine Learning
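    Two pieces of this entry are small enough to sketch directly: the simplest statistical imputation baseline (mean substitution, one candidate among the techniques compared) and the RMSLE metric the XGBoost score is quoted in. Plain Python on made-up values, not the study's pipeline:

    ```python
    import math

    def impute_mean(values):
        """Replace None entries with the mean of the observed entries:
        the baseline statistical imputation technique."""
        observed = [v for v in values if v is not None]
        mean = sum(observed) / len(observed)
        return [mean if v is None else v for v in values]

    def rmsle(actual, predicted):
        """Root mean squared logarithmic error; log1p keeps zeros valid."""
        sq = [(math.log1p(p) - math.log1p(a)) ** 2
              for a, p in zip(actual, predicted)]
        return math.sqrt(sum(sq) / len(sq))

    column = [3.0, None, 5.0, None, 4.0]
    print(impute_mean(column))            # [3.0, 4.0, 5.0, 4.0, 4.0]
    print(rmsle([100, 200], [100, 200]))  # 0.0 for a perfect prediction
    ```

    Because RMSLE compares log-transformed values, it penalizes relative rather than absolute errors, which is why it is a common choice for targets spanning several orders of magnitude.
    
    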