This study investigated the relationships between a water quality index (WQI) and multiple water quality variables (WQVs), explored variability in water quality over time and space, and established linear and non-linear models that predict the WQI from raw WQVs. Data were processed using Spearman's rank correlation analysis, multiple linear regression, and artificial neural network modeling. From a temporal perspective, correlation analysis indicated that the WQI, temperature, and the concentrations of zinc, arsenic, chemical oxygen demand, sodium, and dissolved oxygen increased with year, whereas turbidity and the concentrations of suspended solids, total solids, nitrate nitrogen (NO3-N), and biochemical oxygen demand decreased. From a spatial perspective, 10 WQVs increased with distance of the sampling station from the headwater: magnesium, calcium, dissolved solids, electrical conductivity, temperature, NO3-N, arsenic, chloride, potassium, and sodium. In contrast, the WQI, Escherichia coli bacteria counts, and the concentrations of suspended solids, total solids, and dissolved oxygen decreased with distance from the headwater. Lastly, regression and artificial neural network models with high predictive power (81.2% and 91.4%, respectively) were developed and are discussed.
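As an illustration of the rank correlation named above, the following is a minimal sketch of Spearman's coefficient computed as the Pearson correlation of average ranks. The function names and the yearly turbidity values are hypothetical and are not taken from the study; they only show how a monotonic decrease with year produces a strongly negative coefficient.

```python
from statistics import mean


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def spearman(x, y):
    """Spearman's rho: Pearson correlation applied to the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical data (NOT from the study): yearly mean turbidity at one station.
years = [2001, 2002, 2003, 2004, 2005, 2006]
turbidity = [48.0, 45.5, 46.0, 40.2, 38.9, 35.1]

rho = spearman(years, turbidity)
print(f"Spearman rho (turbidity vs. year) = {rho:.3f}")
```

A negative coefficient here corresponds to the kind of "decreased with year" statement made in the abstract; a significance test would still be needed before drawing that conclusion from real data.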
This study employed three chemometric data mining techniques (factor analysis (FA), cluster analysis (CA), and discriminant analysis (DA)) to identify the latent structure of a water quality (WQ) dataset for the Kinta River (Malaysia) and to classify eight WQ monitoring stations along the river into groups with similar WQ characteristics. FA identified the WQ parameters responsible for variations in the Kinta River's WQ and accentuated the roles of weathering and surface runoff in determining the river's WQ. CA grouped the monitoring locations into one cluster with low levels of water pollution (the two uppermost monitoring stations) and another with relatively high levels of river pollution (the mid- and downstream stations). DA confirmed these clusters and produced a discriminant function that can predict the cluster membership of new or unknown samples. These chemometric techniques highlight the potential for reasonably reducing the number of WQ variables and monitoring stations for long-term monitoring purposes.
This paper describes the design of an artificial neural network (ANN) model to predict the water quality index (WQI) using land use areas as predictors. Ten-year records of land use statistics and water quality data for the Kinta River (Malaysia) were employed in the modeling process. The most accurate WQI predictions were obtained with a 7-23-1 network architecture, the backpropagation training algorithm, and a learning rate of 0.02. The WQI forecasts of this model had a significant (p < 0.01), positive, very high correlation (ρs = 0.882) with the measured WQI values. Sensitivity analysis revealed that the relative importance of the land use classes to WQI predictions followed the order: mining > rubber > forest > logging > urban areas > agriculture > oil palm. These findings show that ANNs are a highly reliable means of relating water quality to land use and thus of integrating land use development with river water quality management.
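To make the 7-23-1 notation concrete, the sketch below builds a network with 7 inputs (one per land use class), 23 hidden neurons, and 1 output (the WQI estimate) and runs a single forward pass. Several details are assumptions for illustration only, since the abstract does not state them: the sigmoid hidden activation, the linear output, the weight initialization, and the land use fractions. The weights are random and untrained, so the printed value has no physical meaning.

```python
import math
import random


def init_layer(n_in, n_out, rng):
    """Random weights and zero biases for one fully connected layer."""
    weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)]
    biases = [0.0] * n_out
    return weights, biases


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def dense(inputs, weights, biases, activation):
    """Apply one fully connected layer followed by an activation."""
    return [
        activation(sum(w * x for w, x in zip(row, inputs)) + b)
        for row, b in zip(weights, biases)
    ]


def forward(land_use, hidden, output):
    """Forward pass: sigmoid hidden layer, linear output (both assumed)."""
    h = dense(land_use, hidden[0], hidden[1], sigmoid)
    return dense(h, output[0], output[1], lambda z: z)[0]


rng = random.Random(0)
hidden = init_layer(7, 23, rng)   # 7 inputs -> 23 hidden neurons
output = init_layer(23, 1, rng)   # 23 hidden neurons -> 1 output

# Hypothetical land use fractions (NOT from the study), in the order:
# mining, rubber, forest, logging, urban, agriculture, oil palm.
land_use = [0.05, 0.10, 0.45, 0.05, 0.10, 0.15, 0.10]

wqi_estimate = forward(land_use, hidden, output)
n_params = 7 * 23 + 23 + 23 * 1 + 1  # weights plus one bias per neuron
print(f"untrained output = {wqi_estimate:.4f}, parameters = {n_params}")
```

Training such a network with backpropagation at the reported learning rate of 0.02 would adjust these weights against measured WQI values; that step is omitted here.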
This article describes the design and application of a feed-forward, fully connected, three-layer perceptron neural network model for computing the water quality index (WQI) for the Kinta River (Malaysia). The modeling efforts showed that the optimal network architecture was 23-34-1 and that the best WQI predictions were associated with the quick propagation (QP) training algorithm, a learning rate of 0.06, and a QP coefficient of 1.75. The WQI predictions of this model had a significant, positive, very high correlation (r = 0.977, p < 0.01) with the measured WQI values, implying that the model predictions explain around 95.4% of the variation in the measured WQI values. The approach presented in this article offers a useful and powerful alternative for WQI computation and prediction, especially for WQI calculation methods that involve lengthy computations and the use of various sub-index formulae for each value, or range of values, of the constituent water quality variables.
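The step from r = 0.977 to "around 95.4%" is the squared correlation coefficient, and the 23-34-1 notation fixes the size of the network. The short sketch below reproduces both figures; the helper names are hypothetical, and the parameter count assumes one bias per hidden and output neuron, which the abstract does not state.

```python
def explained_variance(r):
    """Share of variance explained, implied by a Pearson correlation r."""
    return r ** 2


def count_parameters(layer_sizes):
    """Weights plus biases of a fully connected network such as [23, 34, 1]."""
    return sum(
        n_in * n_out + n_out
        for n_in, n_out in zip(layer_sizes, layer_sizes[1:])
    )


r_squared = explained_variance(0.977)
n_params = count_parameters([23, 34, 1])

print(f"r^2 = {r_squared:.4f}")
print(f"trainable parameters in a 23-34-1 network = {n_params}")
```

Under these assumptions the squared correlation is about 0.9545, consistent with the roughly 95.4% quoted above.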