Affiliations 

  • 1 Computer Science, New York University Abu Dhabi, Abu Dhabi, United Arab Emirates
  • 2 Fleet Management Systems & Technologies, Istanbul, Turkey
  • 3 Faculty of Engineering & Quantity Surveying, INTI International University (INTI-IU), Persiaran Perdana BBN, Putra Nilai, 71800, Nilai, Negeri Sembilan, Malaysia
  • 4 Department of Civil Engineering, College of Engineering, Universiti Tenaga Nasional, 43000, Kajang, Selangor, Malaysia. mahfoodh@uniten.edu.my
  • 5 Department of Civil Engineering, Lee Kong Chian Faculty of Engineering and Science, Universiti Tunku Abdul Rahman, Jalan Sg. Long, Bandar Sg. Long, 43000, Kajang, Selangor, Malaysia
  • 6 National Water and Energy Center, United Arab Emirates University, P.O. Box 15551, Al Ain, United Arab Emirates
  • 7 Department of Civil Engineering, Faculty of Engineering, University of Malaya (UM), 50603, Kuala Lumpur, Malaysia
Sci Rep, 2023 Sep 04;13(1):14574.
PMID: 37666880 DOI: 10.1038/s41598-023-41735-9

Abstract

Due to excessive streamflow (SF), Peninsular Malaysia has historically experienced floods and droughts. Forecasting streamflow to mitigate municipal and environmental damage is therefore crucial. The literature has extensively treated streamflow prediction as the estimation of continuous streamflow values. However, predicting continuous values is unnecessary in several applications and is also a very challenging task because of uncertainty. Predicting a streamflow category is better suited to handling the uncertainty of numerical point forecasts, because each prediction is tied to a propensity to belong to one of the pre-defined classes. Here, we formulate streamflow prediction as a time series classification problem over discrete ranges of values, each representing a class, and classify streamflow into five or ten classes using machine learning approaches for various rivers in Malaysia. The findings reveal that several models, specifically LSTM, outperform others in predicting the following n time steps of streamflow, because LSTM learns the mapping from the streamflow time series to streamflow 2 or 3 days ahead better than support vector machine (SVM) and gradient boosting (GB). LSTM produces a higher F1 score in various rivers (by 5% in Johor, 2% in Kelantan, Melaka, and Selangor, and 4% in Perlis) in the 2-days-ahead scenario. Furthermore, the ensemble stacking of SVM and GB achieves high performance in terms of F1 score and quadratic weighted kappa. Ensemble stacking gives a 3% higher F1 score in the Perak river compared with SVM and GB.
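The classification framing in the abstract can be illustrated with a minimal sketch. It is not the authors' pipeline. The abstract does not say how the class ranges are chosen, so quantile-based edges are an assumption here. The function names (quantile_edges, to_class, make_windows, quadratic_weighted_kappa) are hypothetical, and the flow values are made up for illustration only. The sketch discretizes a series into five classes, pairs a window of past classes with the class a given number of steps ahead, and computes quadratic weighted kappa, one of the two metrics the abstract reports.

```python
from bisect import bisect_right


def quantile_edges(values, n_classes):
    """Interior bin edges chosen so each class holds roughly equal counts.

    Assumption: the abstract does not state how ranges are defined.
    """
    ordered = sorted(values)
    n = len(ordered)
    return [ordered[(i * n) // n_classes] for i in range(1, n_classes)]


def to_class(value, edges):
    """Map a continuous value to a class index in 0..len(edges)."""
    return bisect_right(edges, value)


def make_windows(labels, lookback, horizon):
    """Pair `lookback` past classes with the class `horizon` steps ahead."""
    pairs = []
    for t in range(lookback, len(labels) - horizon + 1):
        pairs.append((labels[t - lookback:t], labels[t + horizon - 1]))
    return pairs


def quadratic_weighted_kappa(y_true, y_pred, n_classes):
    """Agreement between ordinal labels; larger class errors cost more."""
    n = len(y_true)
    observed = [[0] * n_classes for _ in range(n_classes)]
    for a, b in zip(y_true, y_pred):
        observed[a][b] += 1
    true_hist = [sum(row) for row in observed]
    pred_hist = [sum(observed[i][j] for i in range(n_classes))
                 for j in range(n_classes)]
    num = 0.0
    den = 0.0
    for i in range(n_classes):
        for j in range(n_classes):
            weight = (i - j) ** 2 / (n_classes - 1) ** 2
            num += weight * observed[i][j]
            den += weight * true_hist[i] * pred_hist[j] / n
    if den == 0.0:
        # Both label sets sit in one and the same class: treat as agreement.
        return 1.0
    return 1.0 - num / den


# Made-up daily flow values, for illustration only.
flows = [12.0, 15.5, 9.8, 30.2, 45.1, 22.3, 18.7, 60.4, 55.0, 27.9]
edges = quantile_edges(flows, n_classes=5)
labels = [to_class(v, edges) for v in flows]
samples = make_windows(labels, lookback=3, horizon=2)  # 2 steps ahead
print(edges, labels, len(samples))
```

Any classifier could then be trained on the (window, target) pairs in samples. Quadratic weighting suits ordered classes like these because a prediction that misses by one class is penalized less than one that misses by several.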