Feature selection has been widely applied in areas such as the classification of spam emails, cancer cells, fraudulent claims and credit risk, as well as text categorisation and DNA microarray analysis. Classification involves building a model that predicts a target variable from several input variables (features). This study compares filter and wrapper feature selection methods with the aim of maximising classifier accuracy. Logistic regression was used as the classifier, and the feature selection methods were evaluated on classification accuracy, the Akaike information criterion (AIC), the Bayesian information criterion (BIC), the area under the receiver operating characteristic curve (AUC), and the sensitivity and specificity of the classifier. The simulation study generated data with continuous features and one binary dependent variable for different sample sizes. The filter methods used were correlation-based feature selection and information gain, while the wrapper methods were sequential forward selection and sequential backward elimination. The simulation was carried out using R, an open-source programming language. Simulation results showed that the wrapper methods (sequential forward selection and sequential backward elimination) were better than the filter methods at selecting the correct features.
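As an illustration of how a filter method scores features, the sketch below computes information gain for one informative and one uninformative continuous feature against a simulated binary outcome. It is a minimal sketch in Python rather than the R code used in the study. The median split, the sample size, the logistic coefficient of 2 and all function names are assumptions made for illustration, not details taken from the study.

```python
import math
import random
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a sequence of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def information_gain(feature, labels):
    """Entropy reduction in `labels` after splitting `feature` at its median.

    The median split is an assumed discretisation for continuous features.
    """
    cut = sorted(feature)[len(feature) // 2]
    low = [y for x, y in zip(feature, labels) if x < cut]
    high = [y for x, y in zip(feature, labels) if x >= cut]
    n = len(labels)
    conditional = sum(len(part) / n * entropy(part) for part in (low, high) if part)
    return entropy(labels) - conditional


def simulate(n=500, seed=1):
    """Simulate one informative feature, one noise feature and a binary outcome."""
    rng = random.Random(seed)
    signal = [rng.gauss(0, 1) for _ in range(n)]
    noise = [rng.gauss(0, 1) for _ in range(n)]
    # The outcome follows a logistic model that depends on `signal` only
    # (the coefficient 2 is an arbitrary illustrative choice).
    labels = [1 if rng.random() < 1 / (1 + math.exp(-2 * s)) else 0 for s in signal]
    return signal, noise, labels


signal, noise, labels = simulate()
scores = {
    "signal": information_gain(signal, labels),
    "noise": information_gain(noise, labels),
}
# A filter method keeps the top-ranked features without fitting the classifier.
ranking = sorted(scores, key=scores.get, reverse=True)
print(ranking, scores)
```

A wrapper method would instead refit the classifier on candidate feature subsets and keep the subset with the best fit criterion, which is more expensive but takes the classifier itself into account.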
AIMS: Health-Related Quality of Life (HRQoL) has been receiving increasing attention in health outcome studies. Factors that individually influence HRQoL, diabetes self-care behaviors, and medication adherence have been widely investigated; however, most previous studies have not tested an integrated association between multiple health outcomes. The purpose of this study was to formulate a hypothetical structural equation model linking HRQoL, diabetes distress, diabetes self-care activities, medication adherence and diabetes-dependent QoL in patients with Type 2 Diabetes Mellitus (T2DM).
METHODS: A cross-sectional study design was employed, and 497 patients with T2DM were recruited from outpatient clinics in three public hospitals and one government clinic. The patients completed a series of questionnaires. The hypothetical model was tested using Structural Equation Modeling (SEM) analysis.
RESULTS: The values of the multiple fit indices indicated that the proposed model provided a good fit to the data. SEM results showed that medication adherence (MMAS) had a significant direct effect on diabetes distress (PAID) (Beta = -0.20). The self-care activities (SDSCA) construct was significantly related to PAID (Beta = -0.24). SDSCA was found to have a significant relationship with HRQoL (SF-36) (Beta = 0.11). Additionally, diabetes distress had a significant effect (Beta = -0.11) on patients' HRQoL. Finally, diabetes-dependent QoL (ADDQoL) had a significant effect on HRQoL (Beta = 0.12).
CONCLUSIONS: Health outcome indicators such as self-care behaviors, diabetes distress, medication adherence and diabetes-dependent QoL need to be considered in clinical practice to enhance HRQoL in patients with T2DM.
Study site: Hospital Tuanku Ampuan Rahimah, Hospital Sungai Buloh and Hospital Serdang; Klinik Kesihatan Botanic, Kelang, Selangor, Malaysia
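The path coefficients reported in the RESULTS can be combined by the usual path-tracing rule, in which an indirect effect is the product of the coefficients along a mediated path. The sketch below assumes that the reported Beta values are standardized and that diabetes distress (PAID) is the only mediator. The resulting indirect and total effects are illustrative arithmetic and are not reported in the study.

```python
# Path coefficients as reported in the RESULTS (assumed to be standardized).
paths = {
    ("MMAS", "PAID"): -0.20,
    ("SDSCA", "PAID"): -0.24,
    ("SDSCA", "SF36"): 0.11,
    ("PAID", "SF36"): -0.11,
    ("ADDQoL", "SF36"): 0.12,
}


def direct_effect(source, target):
    """Reported direct path from source to target, or 0.0 if none was reported."""
    return paths.get((source, target), 0.0)


def indirect_effect(source, mediator, target):
    """Product of the two coefficients along source -> mediator -> target."""
    return direct_effect(source, mediator) * direct_effect(mediator, target)


def total_effect(source, target, mediator="PAID"):
    """Direct effect plus the single indirect effect through `mediator`."""
    return direct_effect(source, target) + indirect_effect(source, mediator, target)


# Illustrative only: these derived values do not appear in the abstract.
print(round(indirect_effect("MMAS", "PAID", "SF36"), 4))   # medication adherence via distress
print(round(total_effect("SDSCA", "SF36"), 4))             # self-care, direct plus via distress
```

Under these assumptions, both adherence and self-care are associated with better HRQoL partly through lower diabetes distress, because the product of two negative coefficients is positive.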
Dengue fever is a mosquito-borne disease that affects nearly 3.9 billion people globally. Dengue has remained endemic in Malaysia since its outbreak in the 1980s, with the highest concentration of cases in the state of Selangor. Predictors of dengue fever outbreaks could provide timely information for health officials to implement preventative actions. In this study, the five districts in Selangor, Malaysia, with the highest incidence of dengue fever from 2013 to 2017 were evaluated to find the best machine learning model for predicting dengue outbreaks. Climate variables such as temperature, wind speed, humidity and rainfall were used in each model. Based on the results, the SVM (linear kernel) exhibited the best prediction performance (Accuracy = 70%, Sensitivity = 14%, Specificity = 95%, Precision = 56%). However, the sensitivity of the SVM (linear) on the testing sample increased to 63.54% once the data were balanced, compared with 14.4% for the original imbalanced data. The week of the year was the most important predictor in the SVM model. This study shows that machine learning has promising potential for the prediction of dengue outbreaks. Future research should consider boosting or nature-inspired algorithms to develop a dengue prediction model.
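The gap between accuracy and sensitivity reported above is typical of imbalanced data, and a small worked example makes the arithmetic explicit. The confusion-matrix counts below are hypothetical: they were chosen only so that the resulting rates roughly match the reported percentages, and they are not the study's data.

```python
def classification_metrics(tp, fn, fp, tn):
    """Standard rates from confusion-matrix counts (outbreak = positive class)."""
    return {
        "accuracy": (tp + tn) / (tp + fn + fp + tn),
        "sensitivity": tp / (tp + fn),   # share of true outbreaks detected
        "specificity": tn / (tn + fp),   # share of non-outbreaks correctly ruled out
        "precision": tp / (tp + fp),     # share of predicted outbreaks that were real
    }


# Hypothetical counts: 100 outbreak weeks and 220 non-outbreak weeks.
metrics = classification_metrics(tp=14, fn=86, fp=11, tn=209)
for name, value in metrics.items():
    print(f"{name}: {value:.2%}")
```

With these assumed counts, the classifier is right about 70% of the time while detecting only 14 of 100 outbreaks, because the majority (non-outbreak) class dominates the accuracy figure. This is why sensitivity, rather than accuracy alone, is the more informative measure for outbreak prediction.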