Affiliations 

  • 1 Department of Biomedical Engineering, Faculty of Engineering, Universiti Malaya, Kuala Lumpur, Malaysia
  • 2 Department of Science and Technology Studies, Faculty of Sciences, Universiti Malaya, Kuala Lumpur, Malaysia
  • 3 Department of Electrical and Electronic Engineering, Faculty of Engineering and Built Environment, Universiti Sains Islam Malaysia, Nilai, Negeri Sembilan, Malaysia
  • 4 Department of Chemical Engineering, Faculty of Engineering, Universiti Malaya, Kuala Lumpur, Malaysia
  • 5 Institute of Biological Science, Faculty of Science, Universiti Malaya, Kuala Lumpur, Malaysia
  • 6 Civil Engineering Department, Jenderal Soedirman University, Purwokerto, Indonesia
PeerJ Comput Sci, 2023;9:e1306.
PMID: 37346549 DOI: 10.7717/peerj-cs.1306

Abstract

BACKGROUND: Rapid urbanization has significantly affected the environment, creating a need to improve climate change and pollution indicators. The Fourth Industrial Revolution (4IR) offers a potential way to manage these impacts efficiently. Smart city ecosystems can provide well-designed, sustainable, and safe cities that enable holistic solutions to climate change and global warming through community-centred initiatives, including smart planning techniques, smart environment monitoring, and smart governance. An air quality intelligence platform, which operates as a complete measurement site for monitoring and governing air quality, has shown promising results in providing actionable insights. This article highlights the potential of machine learning models to predict air quality and to provide data-driven, strategic, and sustainable solutions for smart cities.

METHODS: This study proposed an end-to-end air quality predictive model for smart city applications using four machine learning techniques and two deep learning techniques: AdaBoost, support vector regression (SVR), random forest (RF), k-nearest neighbours (KNN), a multilayer perceptron (MLP) regressor, and long short-term memory (LSTM). The study was conducted in four urban areas of Selangor, Malaysia: Petaling Jaya, Banting, Klang, and Shah Alam. The model used air quality data for pollution markers such as PM2.5, PM10, O3, and CO. Meteorological data, including wind speed and wind direction, were also considered, and their interactions with the pollution markers were quantified. The study aimed to determine the correlation variance of the dependent variable in predicting air pollution and proposed a feature optimization process that reduces dimensionality and removes irrelevant features to enhance the prediction of PM2.5, improving on the existing LSTM model. The study estimated the concentration of pollutants in the air based on training and highlighted the contribution of feature optimization, through feature dimension reduction, to air quality prediction.
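The feature optimization step described above can be illustrated with a simple correlation filter. This is only a minimal sketch under stated assumptions: the abstract does not specify the selection method, so Pearson correlation with the PM2.5 target, the 0.3 threshold, the function names, and the toy readings are all hypothetical and are not taken from the study.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant column carries no linear signal
    return cov / math.sqrt(var_x * var_y)


def select_features(columns, target, threshold=0.3):
    """Keep features whose absolute correlation with the target meets the threshold.

    The threshold value is an assumption chosen for illustration only.
    """
    scores = {name: pearson(values, target) for name, values in columns.items()}
    kept = [name for name, r in scores.items() if abs(r) >= threshold]
    return kept, scores


# Made-up toy readings, purely illustrative (not data from the study).
pm25 = [12.0, 18.0, 25.0, 31.0, 40.0, 47.0]
features = {
    "pm10": [20.0, 29.0, 41.0, 50.0, 66.0, 75.0],
    "wind_speed": [5.1, 4.6, 3.9, 3.5, 2.8, 2.2],
    "sensor_id": [7.0, 7.0, 7.0, 7.0, 7.0, 7.0],  # constant, so irrelevant
}

kept, scores = select_features(features, pm25)
print(kept)
```

In this toy input the constant column is dropped while the two columns that move with PM2.5 are retained, which is the kind of dimensionality reduction the paragraph describes. A real pipeline would apply the filter to training data only, before fitting any of the six models.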

RESULTS: Prediction results for the concentration of pollutants (PM2.5, PM10, O3, and CO) in the air are reported as R2 and RMSE. In predicting PM10 and PM2.5 concentration, LSTM performed best overall, with R2 values of 0.998, 0.995, 0.918, and 0.993 at the Banting, Petaling, Klang, and Shah Alam stations, respectively. The study indicated that, among the studied markers, PM2.5, PM10, NO2, wind speed, and humidity are the most important elements to monitor. By reducing the number of features used in the model, the proposed feature optimization process can make the model more interpretable and provide insight into the most critical factors affecting air quality. These findings can help policymakers understand the underlying causes of air pollution and develop more effective smart strategies for reducing pollution levels.
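For readers unfamiliar with the two reported metrics, the sketch below computes R2 (coefficient of determination) and RMSE (root mean squared error) from their standard definitions. The function names and the observed and predicted values are made up for illustration and are unrelated to the study's data or code.

```python
import math


def rmse(y_true, y_pred):
    """Root mean squared error: square root of the mean squared residual."""
    squared = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared) / len(squared))


def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    if ss_tot == 0:
        raise ValueError("R2 is undefined when the observed values are constant")
    return 1.0 - ss_res / ss_tot


# Hypothetical observed and predicted concentrations, for illustration only.
observed = [10.0, 20.0, 30.0, 40.0]
predicted = [12.0, 18.0, 33.0, 39.0]

error = rmse(observed, predicted)
fit = r2_score(observed, predicted)
print(round(error, 4), round(fit, 3))
```

RMSE is expressed in the same units as the pollutant concentration, whereas R2 is unitless, so an R2 close to 1 (as reported for LSTM) indicates that the predictions explain nearly all the variance in the observations.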
