This study evaluates state-of-the-art machine learning models for predicting the most sustainable arsenic mitigation preference. A Gaussian distribution-based Naïve Bayes (NB) classifier scored the highest Area Under the Curve (AUC) of the Receiver Operating Characteristic curve (0.82), followed by Nu Support Vector Classification (0.80) and K-Neighbors (0.79). Ensemble classifiers scored an AUC above 0.70, with Random Forest being the top performer (0.77), and the Decision Tree model ranked fourth with an AUC of 0.77. The multilayer perceptron model also achieved high performance (AUC = 0.75). Most linear classifiers underperformed, with the Ridge classifier at the top (AUC = 0.73) and the perceptron at the bottom (AUC = 0.57). A Bernoulli distribution-based Naïve Bayes classifier was the poorest model (AUC = 0.50). The Gaussian NB was also the most robust ML model, with the smallest variation in Kappa score between training (0.58) and test data (0.64). The results suggest that nonlinear or ensemble classifiers could more accurately capture the complex relationships in socio-environmental data and help develop accurate and robust prediction models of sustainable arsenic mitigation. Furthermore, Gaussian NB is the best option when data are scarce.
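The robustness comparison above rests on the gap between training and test Kappa scores (for Gaussian NB, |0.58 - 0.64| = 0.06). The following is a minimal sketch of how such a gap could be computed, assuming Cohen's kappa on hard class labels; the function names `cohen_kappa` and `kappa_gap` and the toy labels are illustrative assumptions, not taken from the study.

```python
from collections import Counter


def cohen_kappa(y_true, y_pred):
    """Cohen's kappa: observed agreement corrected for chance agreement."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(y_true)
    # Fraction of samples where the prediction matches the label.
    observed = sum(t == p for t, p in zip(y_true, y_pred)) / n
    # Agreement expected by chance from the marginal class frequencies.
    true_counts = Counter(y_true)
    pred_counts = Counter(y_pred)
    expected = sum(true_counts[c] * pred_counts[c] for c in true_counts) / (n * n)
    if expected == 1.0:
        return 1.0  # degenerate case: a single class in both label sets
    return (observed - expected) / (1.0 - expected)


def kappa_gap(train_true, train_pred, test_true, test_pred):
    """Absolute difference between training and test kappa (smaller = more stable)."""
    return abs(cohen_kappa(train_true, train_pred) - cohen_kappa(test_true, test_pred))


# Hypothetical toy labels, for illustration only.
print(cohen_kappa([1, 1, 0, 0], [1, 0, 0, 0]))
```

A small gap only indicates stability between the two data splits; it says nothing about how high either score is, so it is best read alongside the AUC ranking.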
We used AdaBoost (AB), alternating decision tree (ADTree), and their combination as an ensemble model (AB-ADTree) to spatially predict landslides in the Cameron Highlands, Malaysia. The models were trained with a database of 152 landslides compiled using Synthetic Aperture Radar Interferometry, Google Earth images, and field surveys, and 17 conditioning factors (slope, aspect, elevation, distance to road, distance to river, proximity to fault, road density, river density, normalized difference vegetation index, rainfall, land cover, lithology, soil types, curvature, profile curvature, stream power index, and topographic wetness index). We carried out the validation process using the area under the receiver operating characteristic curve (AUC) and several parametric and non-parametric performance metrics, including positive predictive value, negative predictive value, sensitivity, specificity, accuracy, root mean square error, and the Friedman and Wilcoxon signed-rank tests. The AB model (AUC = 0.96) performed better than the ensemble AB-ADTree model (AUC = 0.94) and outperformed the ADTree model (AUC = 0.59) in predicting landslide susceptibility. Our findings provide insights into the development of more efficient and accurate landslide predictive models that can be used by decision makers and land-use managers to mitigate landslide hazards.
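The threshold-based metrics listed above all derive from a binary confusion matrix. The sketch below shows one way they could be computed, assuming labels coded as 1 (landslide) and 0 (non-landslide), predicted probabilities, and a cut-off of 0.5; the function name `binary_metrics` and the cut-off are assumptions for illustration, since the abstract does not state them.

```python
import math


def binary_metrics(y_true, y_prob, threshold=0.5):
    """Confusion-matrix metrics plus RMSE for a binary susceptibility model."""
    if len(y_true) != len(y_prob) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    tp = fp = tn = fn = 0
    for label, prob in zip(y_true, y_prob):
        predicted = 1 if prob >= threshold else 0
        if predicted == 1 and label == 1:
            tp += 1
        elif predicted == 1 and label == 0:
            fp += 1
        elif predicted == 0 and label == 0:
            tn += 1
        else:
            fn += 1

    def ratio(numerator, denominator):
        # Undefined ratios (empty denominator) are reported as NaN.
        return numerator / denominator if denominator else float("nan")

    n = len(y_true)
    squared_error = sum((prob - label) ** 2 for label, prob in zip(y_true, y_prob))
    return {
        "ppv": ratio(tp, tp + fp),          # positive predictive value
        "npv": ratio(tn, tn + fn),          # negative predictive value
        "sensitivity": ratio(tp, tp + fn),  # true positive rate
        "specificity": ratio(tn, tn + fp),  # true negative rate
        "accuracy": ratio(tp + tn, n),
        "rmse": math.sqrt(squared_error / n),
    }
```

Unlike the AUC, every metric here except RMSE depends on the chosen cut-off, which is one reason such studies usually report both kinds of measure.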
This study presents two novel GIS-based ensemble artificial intelligence approaches that combine the adaptive neuro-fuzzy inference system (ANFIS) with the imperialistic competitive algorithm (ICA) and the firefly algorithm (FA). The resulting ANFIS-ICA and ANFIS-FA models were applied to flood spatial modelling and mapping in the Haraz watershed in the northern province of Mazandaran, Iran. Ten influential factors, including slope angle, elevation, stream power index (SPI), curvature, topographic wetness index (TWI), lithology, rainfall, land use, stream density, and distance to river, were selected for flood modelling. The validity of the models was assessed using statistical error indices (RMSE and MSE), statistical tests (Friedman and Wilcoxon signed-rank tests), and the area under the curve (AUC) of success. The prediction accuracy of the models was compared with that of several state-of-the-art machine learning techniques that had previously been successfully tested in the study area. The results confirmed the goodness of fit and appropriate prediction accuracy of the two ensemble models. Among the compared models, ANFIS-ICA (AUC = 0.947) performed better than the Bagging-LMT (AUC = 0.940), BLR (AUC = 0.936), LMT (AUC = 0.934), ANFIS-FA (AUC = 0.917), LR (AUC = 0.885) and RF (AUC = 0.806) models. Therefore, the ANFIS-ICA model can be introduced as a promising method for the sustainable management of flood-prone areas.
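The Wilcoxon signed-rank test mentioned above compares two models on paired outputs. The sketch below computes only the rank sums W+ and W- that the test is built on, assuming paired per-location values from two models; the abstract does not say which quantities were paired, and the function name `wilcoxon_signed_rank` is illustrative.

```python
def wilcoxon_signed_rank(a, b):
    """Return (W_plus, W_minus) for paired samples a and b.

    Zero differences are dropped and tied absolute differences receive
    the average of the ranks they span.
    """
    if len(a) != len(b):
        raise ValueError("samples must be paired (equal length)")
    diffs = [x - y for x, y in zip(a, b) if x != y]
    # Indices of the differences, ordered by absolute size.
    order = sorted(range(len(diffs)), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * len(diffs)
    i = 0
    while i < len(order):
        j = i
        # Extend j across a run of tied absolute differences.
        while j + 1 < len(order) and abs(diffs[order[j + 1]]) == abs(diffs[order[i]]):
            j += 1
        average_rank = (i + j) / 2 + 1  # mean of the 1-based ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        i = j + 1
    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    w_minus = sum(r for r, d in zip(ranks, diffs) if d < 0)
    return w_plus, w_minus
```

Ties are detected by exact equality, so floating-point inputs may need rounding first. Turning the rank sums into a p-value requires the exact null distribution or a normal approximation, which is left to a statistics library.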
In this study, land subsidence susceptibility was assessed for a study area in South Korea using four machine learning models: Bayesian Logistic Regression (BLR), Support Vector Machine (SVM), Logistic Model Tree (LMT) and Alternate Decision Tree (ADTree). Eight conditioning factors identified as the most important influences on land subsidence in the Jeong-am area (slope angle, distance to drift, drift density, geology, distance to lineament, lineament density, land use and rock-mass rating (RMR)) were applied in the modelling. About 24 past land subsidence events were surveyed and divided into a training dataset (70% of data) and a validation dataset (30% of data) for the modelling process. Each studied model generated a land subsidence susceptibility map (LSSM). The maps were verified using several appropriate tools, including statistical indices, the area under the receiver operating characteristic (AUROC) curve, and success rate (SR) and prediction rate (PR) curves. The results indicated that the BLR model produced an LSSM with higher accuracy and reliability than the other applied models, although the other models also gave reasonable results.
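The 70%/30% division of the subsidence inventory can be illustrated with a seeded random split. This is a sketch under assumptions: the abstract does not say how the split was drawn or rounded, and `split_inventory`, the seed, and the use of simple random sampling are all hypothetical choices.

```python
import random


def split_inventory(points, train_fraction=0.7, seed=42):
    """Shuffle inventory points reproducibly and split them into train/validation."""
    if not 0.0 < train_fraction < 1.0:
        raise ValueError("train_fraction must be strictly between 0 and 1")
    shuffled = list(points)                    # copy so the caller's list is untouched
    random.Random(seed).shuffle(shuffled)      # local RNG keeps the split repeatable
    n_train = round(len(shuffled) * train_fraction)
    return shuffled[:n_train], shuffled[n_train:]


# With 24 hypothetical point IDs, rounding gives 17 training and 7 validation points.
train_ids, validation_ids = split_inventory(range(24))
print(len(train_ids), len(validation_ids))
```

With an inventory this small, a single split leaves only a handful of validation points, so the resulting accuracy estimates can vary considerably from one random seed to another.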
Landslides are major hazards for human activities, often causing great damage to human lives and infrastructure. Therefore, the main aim of the present study is to evaluate and compare three machine learning algorithms (MLAs), namely Naïve Bayes (NB), radial basis function (RBF) Classifier, and RBF Network, for landslide susceptibility mapping (LSM) in the Longhai area of China. A total of 14 landslide conditioning factors were obtained from various data sources; the frequency ratio (FR) and support vector machine (SVM) methods were then used, respectively, for correlation analysis and for selection of the most important factors for the modelling process. Subsequently, the three resulting models were validated and compared using statistical metrics including the area under the receiver operating characteristic (AUROC) curve and the Friedman and Wilcoxon signed-rank tests. The results indicated that the RBF Classifier model had the highest goodness-of-fit and performance on the training and validation datasets. The RBF Classifier model (AUROC = 0.881) outperformed the NB (AUROC = 0.872) and RBF Network (AUROC = 0.854) models. These results indicate that the RBF Classifier model is a promising method for the spatial prediction of landslides worldwide.
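The frequency ratio used above for correlation analysis is commonly defined, per class of a conditioning factor, as the share of landslides falling in that class divided by the share of the study area the class occupies. The sketch below follows that common definition; the function name `frequency_ratio`, the slope classes, and the pixel counts are hypothetical and not taken from the study.

```python
def frequency_ratio(class_pixels, class_landslides):
    """Frequency ratio per factor class.

    FR = (landslides in class / all landslides) / (pixels in class / all pixels).
    FR > 1 suggests the class is more landslide-prone than average.
    """
    total_pixels = sum(class_pixels.values())
    total_landslides = sum(class_landslides.values())
    if total_landslides == 0:
        raise ValueError("at least one landslide is required")
    ratios = {}
    for name, pixels in class_pixels.items():
        if pixels <= 0:
            raise ValueError(f"class {name!r} has no pixels")
        landslide_share = class_landslides.get(name, 0) / total_landslides
        area_share = pixels / total_pixels
        ratios[name] = landslide_share / area_share
    return ratios


# Hypothetical slope classes (degrees) with made-up counts, for illustration only.
pixels = {"0-10": 500, "10-20": 300, ">20": 200}
landslides = {"0-10": 5, "10-20": 15, ">20": 30}
print(frequency_ratio(pixels, landslides))
```

In this toy example the steepest class covers 20% of the area but holds 60% of the landslides, giving a ratio of about 3, while the flattest class scores well below 1.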
The declining water level in Lake Urmia has become a significant issue for Iranian policy and decision makers. The lake has been experiencing an abrupt decrease in water level and is at real risk of turning entirely into saline land. Because of its position, assessment of changes in Lake Urmia is essential. This study aims to evaluate changes in the water level of Lake Urmia using space-borne remote sensing and GIS techniques. To this end, multispectral Landsat 7 ETM+ images for the years 2000, 2010, and 2017 were acquired. In addition, precipitation and temperature data for 31 years between 1986 and 2017 were collected for further analysis. Results indicate that a temperature increase of 19%, a rainfall decrease of about 62%, and excessive damming in the Urmia Basin, along with mismanagement of water resources, are the key factors in the declining water level of Lake Urmia. Furthermore, the current research predicts a potential environmental crisis as a result of the lake shrinking and suggests a few possible alternatives. The insights provided by this study can be beneficial for environmentalists and related organizations working on this and similar topics.
The main objective of this research was to introduce a novel machine learning approach that combines the alternating decision tree (ADTree) with the multiboost (MB), bagging (BA), rotation forest (RF) and random subspace (RS) ensemble algorithms, under two scenarios of different sample sizes and raster resolutions, for spatial prediction of shallow landslides around Bijar City, Kurdistan Province, Iran. The modeling process was evaluated using several statistical measures and the area under the receiver operating characteristic curve (AUROC). Results show that, for sample sizes of 60%/40% and 70%/30% with a raster resolution of 10 m, the RS model obtained a high goodness-of-fit and prediction accuracy, whereas for 80%/20% and 90%/10% with a raster resolution of 20 m, the MB model did. The RS-ADTree and MB-ADTree ensemble models outperformed the ADTree model in the two scenarios. Overall, MB-ADTree had the highest prediction accuracy with a sample size of 80%/20% and a resolution of 20 m (area under the curve (AUC) = 0.942) and the lowest with a sample size of 60%/40% and a resolution of 10 m (AUC = 0.845). The findings confirm that the newly proposed models are very promising alternative tools to assist planners and decision makers in the task of managing landslide-prone areas.
In this study, we introduced a novel hybrid artificial intelligence approach, called RF-ADTree, which uses rotation forest (RF) as a meta/ensemble classifier and alternating decision tree (ADTree) as a base classifier, to spatially predict gully erosion in the Klocheh watershed of Kurdistan province, Iran. A total of 915 gully erosion locations along with 22 gully conditioning factors were used to construct a database. Several soft computing benchmark models (SCBM), including the ADTree, the Support Vector Machine with two kernel functions, Polynomial and Radial Basis Function (SVM-Polynomial and SVM-RBF), the Logistic Regression (LR), and the Naïve Bayes Multinomial Updatable (NBMU) models, were used for comparison with the designed model. Results indicated that 19 conditioning factors were effective, among which distance to river, geomorphology, land use, hydrological group, lithology and slope angle were the most important for the gully modeling process. Additionally, the modeling results showed that the RF-ADTree ensemble model could significantly improve (area under the curve (AUC) = 0.906) the prediction accuracy of the ADTree model (AUC = 0.882). The newly proposed model also had the highest performance (AUC = 0.913) in comparison to the SVM-Polynomial model (AUC = 0.879), the SVM-RBF model (AUC = 0.867), the LR model (AUC = 0.75), the ADTree model (AUC = 0.861) and the NBMU model (AUC = 0.811).
Shallow landslides damage buildings and other infrastructure, disrupt agricultural practices, and can cause social upheaval and loss of life. As a result, many scientists study the phenomenon, and some of them have focused on producing landslide susceptibility maps that can be used by land-use managers to reduce injury and damage. This paper contributes to this effort by comparing the power and effectiveness of five benchmark machine learning algorithms (Logistic Model Tree, Logistic Regression, Naïve Bayes Tree, Artificial Neural Network, and Support Vector Machine) in creating a reliable shallow landslide susceptibility map for Bijar City in Kurdistan province, Iran. Twenty conditioning factors were applied to 111 shallow landslides and tested using the One-R attribute evaluation (ORAE) technique for modeling and validation processes. The performance of the models was assessed by statistical indices including sensitivity, specificity, accuracy, mean absolute error (MAE), root mean square error (RMSE), and area under the receiver operating characteristic curve (AUC). Results indicate that all five machine learning models performed well for shallow landslide susceptibility assessment, but the Logistic Model Tree model (AUC = 0.932) had the highest goodness-of-fit and prediction accuracy, followed by the Logistic Regression (AUC = 0.932), Naïve Bayes Tree (AUC = 0.864), ANN (AUC = 0.860), and Support Vector Machine (AUC = 0.834) models. Therefore, we recommend the use of the Logistic Model Tree model in shallow landslide mapping programs in semi-arid regions to help decision makers, planners, land-use managers, and government agencies mitigate the hazard and risk.
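The AUC values that rank the models above can be read as the probability that a randomly chosen landslide location receives a higher susceptibility score than a randomly chosen non-landslide location. The sketch below computes the AUC directly from that pairwise definition, assuming labels coded 1 and 0; the function name `roc_auc` is illustrative, and this brute-force form is meant for small samples rather than as the procedure used in the study.

```python
def roc_auc(y_true, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Tied scores count as half a correct pair. Runtime is O(P * N), which is
    acceptable for small inventories such as a few hundred points.
    """
    positives = [s for label, s in zip(y_true, scores) if label == 1]
    negatives = [s for label, s in zip(y_true, scores) if label == 0]
    if not positives or not negatives:
        raise ValueError("both classes must be present")
    correct = 0.0
    for p in positives:
        for q in negatives:
            if p > q:
                correct += 1.0
            elif p == q:
                correct += 0.5
    return correct / (len(positives) * len(negatives))


# Hypothetical scores: three of the four positive/negative pairs are ordered correctly.
print(roc_auc([1, 1, 0, 0], [0.9, 0.4, 0.6, 0.2]))  # 0.75
```

Because the measure depends only on the ordering of scores, two models with equal AUC can still differ in calibration or in threshold-based metrics such as sensitivity and specificity.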