Coronavirus disease 2019 (COVID-19), a respiratory illness, has infected and killed millions of people and wreaked havoc on the economy of every nation. The disease has taken a toll on human life worldwide, and no definitive cure has yet been developed. Its control therefore depends mainly on limiting transmission through early detection, which allows infected patients to be isolated and given dedicated care. This paper proposes a novel data mining system that combines an ensemble feature selection method with a machine learning classifier for the effective identification of COVID-19 infection. Several feature selection approaches, including the chi-square test, recursive feature elimination (RFE), genetic algorithm (GA), particle swarm optimization (PSO), and random forest, are evaluated for their effectiveness in improving the classification accuracy of the machine learning classifiers. The classifiers considered are decision tree, naïve Bayes, K-nearest neighbor (KNN), multilayer perceptron (MLP), and support vector machine (SVM). Two COVID-19 datasets were used for testing, and the proposed system extracted the best features from each dataset. The performance of the machine learning classifiers when paired with the ensemble feature selection methods is analyzed.
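The abstract does not state how the outputs of the individual feature selection methods are combined into an ensemble. One common option, assumed here purely for illustration, is majority voting: a feature is kept if enough of the selectors chose it. The sketch below shows that idea in plain Python. The function name `ensemble_select`, the `min_votes` threshold, and the feature names are all hypothetical and are not taken from the paper or its datasets.

```python
from collections import Counter


def ensemble_select(selections, min_votes):
    """Keep features chosen by at least `min_votes` of the selectors.

    `selections` maps a selector name to the list of features it picked.
    Majority voting is an assumed combination rule, not the paper's.
    """
    votes = Counter()
    for chosen in selections.values():
        votes.update(set(chosen))  # each selector votes once per feature
    # Order by vote count (descending), then by name, for a stable result.
    ranked = sorted(votes.items(), key=lambda kv: (-kv[1], kv[0]))
    return [name for name, count in ranked if count >= min_votes]


# Hypothetical per-method selections; these are NOT results from the paper.
selections = {
    "chi_square": ["fever", "dry_cough", "age"],
    "rfe": ["fever", "dry_cough", "breathing_difficulty"],
    "genetic_algorithm": ["fever", "age", "breathing_difficulty"],
    "pso": ["fever", "dry_cough", "sore_throat"],
    "random_forest": ["fever", "breathing_difficulty", "dry_cough"],
}

selected = ensemble_select(selections, min_votes=3)
print(selected)  # ['fever', 'dry_cough', 'breathing_difficulty']
```

In a full pipeline, the reduced feature set would then be passed to each classifier (decision tree, naïve Bayes, KNN, MLP, SVM) so that accuracy with and without feature selection can be compared. Raising `min_votes` gives a smaller, more conservative feature set.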