Affiliations 

  • 1 Centre for Community Health Studies, Faculty of Health Sciences, Universiti Kebangsaan Malaysia, Kuala Lumpur, Malaysia
  • 2 National Institute of Health and Nutrition, National Institutes of Biomedical Innovation, Health and Nutrition, Tokyo, Japan
JMIR Form Res, 2022 Dec 07;6(12):e40404.
PMID: 36476813 DOI: 10.2196/40404

Abstract

BACKGROUND: Overweight or obesity is a major health concern that leads to a significant burden of noncommunicable disease and threatens national productivity and economic growth. Given the complex etiology of overweight or obesity, machine learning (ML) algorithms offer a promising alternative approach for disentangling interdependent factors when predicting overweight or obesity status.

OBJECTIVE: This study examined the performance of 3 ML algorithms in comparison with logistic regression (LR) to predict overweight or obesity status among working adults in Malaysia.

METHODS: Data came from 16,860 participants (mean age 34.2, SD 9.0 years; n=6904, 41% male; n=7048, 41.8% with overweight or obesity) in the Malaysia's Healthiest Workplace by AIA Vitality 2019 survey. Predictor variables, including sociodemographic characteristics, job characteristics, health and weight perceptions, and lifestyle-related factors, were modeled using the extreme gradient boosting (XGBoost), random forest (RF), and support vector machine (SVM) algorithms, as well as LR, to predict overweight or obesity status based on a BMI cutoff of 25 kg/m².
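
As an illustration of the outcome definition only, the following minimal Python sketch derives a binary overweight or obesity label from weight and height. The function names are hypothetical, and treating the cutoff as inclusive (BMI of 25 kg/m² or more) is an assumption based on the usual WHO convention; the abstract does not state how the boundary case was handled.

```python
def bmi(weight_kg: float, height_m: float) -> float:
    """Return body mass index in kg/m^2."""
    if weight_kg <= 0 or height_m <= 0:
        raise ValueError("weight and height must be positive")
    return weight_kg / height_m ** 2


def has_overweight_or_obesity(weight_kg: float, height_m: float,
                              cutoff: float = 25.0) -> bool:
    """Binary outcome label at the given BMI cutoff.

    Assumption: the cutoff is inclusive (BMI >= cutoff), as in the
    usual WHO convention; the source does not specify this.
    """
    return bmi(weight_kg, height_m) >= cutoff


if __name__ == "__main__":
    # Hypothetical participants: (weight in kg, height in m).
    for weight, height in [(70.0, 1.75), (80.0, 1.75)]:
        label = has_overweight_or_obesity(weight, height)
        print(f"BMI {bmi(weight, height):.1f} -> overweight or obesity: {label}")
```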

RESULTS: The area under the receiver operating characteristic curve was 0.81 (95% CI 0.79-0.82), 0.80 (95% CI 0.79-0.81), 0.80 (95% CI 0.78-0.81), and 0.78 (95% CI 0.77-0.80) for the XGBoost, RF, SVM, and LR models, respectively. Weight satisfaction was the top predictor, and ethnicity, age, and gender were also consistent predictor variables of overweight or obesity status in all models.
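
To make the reported metric concrete, the sketch below computes the area under the receiver operating characteristic curve as the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case, and attaches a percentile bootstrap 95% CI. The abstract does not say how the intervals were obtained, so the bootstrap is an assumption, and `auc` and `bootstrap_ci` are hypothetical helper names. The pairwise loop is quadratic in sample size and is meant for small illustrations, not for a data set of 16,860 participants.

```python
import random


def auc(labels, scores):
    """AUROC via pairwise comparison; tied scores count as 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def bootstrap_ci(labels, scores, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap CI for the AUROC (assumed method, not the authors')."""
    if len(set(labels)) < 2:
        raise ValueError("need both classes to resample")
    rng = random.Random(seed)
    n = len(labels)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        if len(set(ys)) < 2:
            continue  # skip resamples that contain only one class
        stats.append(auc(ys, [scores[i] for i in idx]))
    stats.sort()
    lower = stats[int((alpha / 2) * n_boot)]
    upper = stats[min(n_boot - 1, int((1 - alpha / 2) * n_boot))]
    return lower, upper


if __name__ == "__main__":
    # Toy labels and predicted scores, invented for illustration.
    y = [0, 0, 1, 1]
    s = [0.1, 0.4, 0.35, 0.8]
    print(auc(y, s), bootstrap_ci(y, s, n_boot=200))
```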

CONCLUSIONS: Based on multi-domain online workplace survey data, this study produced predictive models that identified overweight or obesity status with moderate to high accuracy. The performance of the ML-based and logistic regression models was comparable when predicting obesity among working adults in Malaysia.
