Affiliations 

  • 1 Department of Electrical and Electronics Engineering, University of Dhaka, Dhaka, 1000, Bangladesh
  • 2 Department of Physiology, Faculty of Medicine, Universiti Kebangsaan Malaysia, Kuala Lumpur, 56000, Malaysia
  • 3 Department of Emergency Medicine, Faculty of Medicine, Universiti Kebangsaan Malaysia, Kuala Lumpur, 56000, Malaysia
  • 4 Department of Electrical and Electronic Engineering, Independent University, Bangladesh, Bashundhara, Dhaka, Bangladesh
  • 5 Department of Biomedical Engineering, Johns Hopkins University, Baltimore, MD, USA
  • 6 Department of Basic Medical Sciences, College of Medicine, QU Health, Qatar University, Doha, 2713, Qatar
  • 7 Department of Electrical Engineering, Qatar University, Doha, 2713, Qatar
  • 8 Intelligent Signal Processing (ISP) Research Lab, Department of Electronics and Communication Engineering, Kuwait College of Science and Technology, Block 4, Doha, Kuwait; Department of Electronics and Communication Engineering, Vels Institute of Sciences, Technology, and Advanced Studies, Chennai, Tamilnadu, India. Electronic address: m.murugappan@kcst.edu.kw
  • 9 Department of Electrical Engineering, Qatar University, Doha, 2713, Qatar. Electronic address: mchowdhury@qu.edu.qa
Comput Biol Med, 2025 Jan;184:109284.
PMID: 39579661 DOI: 10.1016/j.compbiomed.2024.109284

Abstract

Sepsis, a life-threatening condition triggered by the body's response to infection, remains a significant global health challenge, affecting millions of people annually in the United States alone and carrying substantial mortality and healthcare costs. Early prediction of sepsis is critical for timely intervention and improved patient outcomes. This study introduces a predictive model that applies machine learning (ML) techniques and a specific data-splitting approach to highly imbalanced electronic health records (EHRs). The model was developed on PhysioNet/CinC Challenge 2019 data from 40,336 patients, including vital signs, lab values, and demographics. Preliminary assessments using classical and stacked ML models with Synthetic Minority Oversampling Technique (SMOTE) augmentation showed improved performance. Stacking ML models enhanced overall accuracy but showed limitations in precision, recall, and F1 score for positive-class prediction. A novel data-splitting approach with 5-fold cross-validation and SMOTE and COPULA augmentation techniques demonstrated promise, with F1 scores ranging from 93% to 94% using the COPULA technique. COPULA outperformed SMOTE when predicting sepsis onset at different hour horizons. The proposed model outperformed existing studies, suggesting clinical viability for early sepsis prediction.
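
To illustrate two ideas the abstract relies on, the following is a minimal sketch, not the authors' implementation. It shows the core interpolation step of SMOTE (a synthetic minority sample is placed on the segment between a minority sample and one of its nearest minority neighbours) and why the F1 score for the positive class is more informative than accuracy on imbalanced data. The function names `smote` and `f1_score`, the parameter `k`, and the toy data are assumptions made for illustration; the abstract does not specify the study's configuration. In a cross-validation setting, oversampling is commonly applied to the training folds only so that synthetic samples do not leak into the evaluation fold; whether the study does this is not stated here.

```python
import math
import random


def smote(minority, n_new, k=3, seed=0):
    """Return n_new synthetic points, each interpolated between a minority
    sample and one of its k nearest minority neighbours (illustrative only)."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest minority neighbours of the chosen sample, excluding itself
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: math.dist(base, p),
        )[:k]
        other = rng.choice(neighbours)
        gap = rng.random()  # position along the segment base -> other
        synthetic.append(tuple(b + gap * (o - b) for b, o in zip(base, other)))
    return synthetic


def f1_score(y_true, y_pred, positive=1):
    """F1 score for the positive class: 2*TP / (2*TP + FP + FN)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    return 2 * tp / (2 * tp + fp + fn) if tp else 0.0


# Toy minority class (hypothetical two-feature samples)
minority = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
augmented = minority + smote(minority, n_new=6, k=2)

# A classifier that always predicts "no sepsis" is 75% accurate on this toy
# label vector, yet its F1 score for the positive class is zero.
y_true = [1, 0, 0, 0]
y_pred = [0, 0, 0, 0]
accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
positive_f1 = f1_score(y_true, y_pred)
```

Because each synthetic point lies between two real minority samples, the augmented set stays inside the region the minority class already occupies, which is the property that distinguishes SMOTE from simple duplication.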
