MyMedR

Enhancing malware detection with feature selection and scaling techniques using machine learning models

Hasan R, Biswas B, Samiun M, Saleh MA, Prabha M, Akter J, et al.

Sci Rep, 2025 Mar 17;15(1):9122.
PMID: 40097688 DOI: 10.1038/s41598-025-93447-x

The increasing prevalence of malware presents a critical challenge to cybersecurity, emphasizing the need for robust detection methods. This study uses a binary tabular classification dataset to evaluate the impact of feature selection, feature scaling, and machine learning (ML) models on malware detection. The methodology involves experimenting with three feature scaling techniques (no scaling, normalization, and min-max scaling), three feature selection methods (no selection, Linear Discriminant Analysis (LDA), and Principal Component Analysis (PCA)), and twelve ML models, including traditional algorithms and ensemble methods. A publicly available dataset with 11,598 samples and 139 features is utilized, and model performance is assessed using metrics such as accuracy, precision, recall, F1-score, and AUC-ROC. Results reveal that the Light Gradient Boosting Machine (LGBM) achieves the highest accuracy of 97.16% when PCA and either min-max scaling or normalization are applied. Additionally, ensemble models consistently outperform traditional ML models, demonstrating their effectiveness in enhancing malware detection. These findings offer valuable insights into optimizing preprocessing and model selection strategies for developing reliable and efficient malware detection systems.
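
The experimental design described in the abstract (three scaling options, three selection options, twelve models) can be illustrated with a minimal, standard-library-only sketch. It is not the authors' code. It assumes the options are fully crossed, which would give 3 x 3 x 12 = 108 configurations; the abstract does not confirm this. The abstract names only LGBM, so the other eleven model labels below are hypothetical placeholders, and the function names `experiment_grid`, `min_max_scale`, and `binary_metrics` are invented for illustration. The sketch also shows min-max scaling of one feature column and the usual definitions of accuracy, precision, recall, and F1-score for binary labels.

```python
from itertools import product

# Options as listed in the abstract.
SCALERS = ("none", "normalization", "min-max")
SELECTORS = ("none", "LDA", "PCA")
# Only LGBM is named in the abstract; "model_2" .. "model_12" are placeholders.
MODELS = ("LGBM",) + tuple(f"model_{i}" for i in range(2, 13))


def experiment_grid():
    """Return every (scaler, selector, model) combination, assuming a full cross."""
    return list(product(SCALERS, SELECTORS, MODELS))


def min_max_scale(column):
    """Rescale one feature column to the range [0, 1]."""
    lo, hi = min(column), max(column)
    if hi == lo:
        # A constant column carries no information; map it to zeros.
        return [0.0 for _ in column]
    return [(x - lo) / (hi - lo) for x in column]


def binary_metrics(y_true, y_pred):
    """Compute accuracy, precision, recall, and F1 for 0/1 labels (1 = malware)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    accuracy = (tp + tn) / len(pairs)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


if __name__ == "__main__":
    print(len(experiment_grid()), "configurations")
    print(min_max_scale([2, 4, 6]))
    print(binary_metrics([1, 1, 0, 0], [1, 0, 0, 1]))
```

In a real replication, each configuration in the grid would be fitted and scored on held-out data; the scaling and metric helpers here only make the definitions concrete.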
