Affiliations 

  • 1 Faculty of Civil Engineering, Ton Duc Thang University, Ho Chi Minh City, Viet Nam. Electronic address: tiyasha.st@tdtu.edu.vn
  • 2 Faculty of Civil Engineering, Ton Duc Thang University, Ho Chi Minh City, Viet Nam. Electronic address: tranminhtung@tdtu.edu.vn
  • 3 Faculty of Civil Engineering, Ton Duc Thang University, Ho Chi Minh City, Viet Nam. Electronic address: surajenv@gmail.com
  • 4 GeoInformatic Unit, Geography Section, School of Humanities, Universiti Sains Malaysia, 11800, Pulau Pinang, Malaysia
  • 5 Faculty of Applied Sciences, Universiti Teknologi MARA, 40450 Shah Alam, Selangor, Malaysia
  • 6 Department of Civil Engineering, Faculty of Engineering and Built Environment, Universiti Kebangsaan Malaysia, 43600 UKM Bangi, Selangor, Malaysia
  • 7 Faculty of Civil Engineering, Ton Duc Thang University, Ho Chi Minh City, Viet Nam; New Era and Development in Civil Engineering Research Group, Scientific Research Center, Al-Ayen University, Thi-Qar 64001, Iraq; College of Creative Design, Asia University, Taichung City, Taiwan. Electronic address: yaseen@alayen.edu.iq
Mar Pollut Bull, 2021 Sep;170:112639.
PMID: 34273614 DOI: 10.1016/j.marpolbul.2021.112639

Abstract

Dissolved oxygen (DO) is an important indicator that environmental engineers and ecological scientists use to understand the state of river health. This study evaluates the reliability of four feature selection algorithms, i.e., Boruta, genetic algorithm (GA), multivariate adaptive regression splines (MARS), and extreme gradient boosting (XGBoost), for selecting the best-suited predictors among the applied water quality (WQ) parameters, and compares four tree-based predictive models, namely random forest (RF), conditional random forests (cForest), RANdom forest GEneRator (Ranger), and XGBoost, for predicting changes in DO in the Klang River, Malaysia. The full feature set comprised 15 WQ parameters from monitoring site data and 7 hydrological components from remote sensing data. All predictive models performed well, in terms of the applied statistical evaluators, with the features selected by the XGBoost and MARS algorithms. The best performance among all applied predictive models was obtained by the XGBoost model when the features were selected by the MARS and XGBoost algorithms, with coefficient of determination (R²) values of 0.84 and 0.85, respectively; the Boruta-XGBoost model, by contrast, showed only marginal performance in this scenario.
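
For readers unfamiliar with the coefficient of determination reported above, the following is a minimal sketch of one common definition, 1 - SS_res / SS_tot, in plain Python. The abstract does not state which formulation the authors used (some studies report the squared Pearson correlation instead), so this is an assumption. The function name `r_squared` and the DO values are hypothetical and invented purely for illustration; they are not data from the study.

```python
from typing import Sequence


def r_squared(observed: Sequence[float], predicted: Sequence[float]) -> float:
    """Coefficient of determination, computed as 1 - SS_res / SS_tot.

    Assumption: this is the residual-based definition; the study itself
    may have used a different formulation.
    """
    if len(observed) != len(predicted) or len(observed) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    mean_obs = sum(observed) / len(observed)
    # Residual sum of squares: squared error between observation and prediction.
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    # Total sum of squares: spread of the observations around their mean.
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    if ss_tot == 0:
        raise ValueError("observed values are constant; the score is undefined")
    return 1.0 - ss_res / ss_tot


# Hypothetical DO values in mg/L (illustrative only, not from the study).
observed_do = [4.0, 5.0, 6.0, 7.0]
predicted_do = [4.5, 5.0, 5.5, 7.0]
score = r_squared(observed_do, predicted_do)
print(f"R2 = {score:.2f}")
```

Under this definition, a value of 0.85 would mean the model's squared error is 15% of the variance-scaled spread of the observed DO values around their mean.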
