APD-BayTM: Jakarta Air Quality Index Prediction using Bayesian Optimized LSTM

Raey Faldo
Satria Mandala
Mohd Shahrizal Sunar
Salim M. Zaki

Keywords

Deep learning, LSTM, air pollution, air quality index (AQI)

Abstract

The Air Quality Index (AQI) is a metric for evaluating air quality in a region. Jakarta ranks fifth globally in air pollution. Several studies have attempted to forecast pollution levels in Jakarta, but they share limitations: outdated datasets, no data normalization, no tuning of machine learning parameters, no k-fold cross-validation, and no use of deep learning algorithms for pollution detection. This study introduces APD-BayTM, an air quality detection system that addresses these issues. The system uses a Long Short-Term Memory (LSTM) model and applies Bayesian Optimization (BO) to improve air pollution detection performance. The methodology has four key steps: data preprocessing, LSTM model development, hyperparameter tuning through BO, and performance assessment using 5-fold cross-validation. APD-BayTM performs robustly and is comparable to previous research outcomes. On the training dataset, the LSTM model in APD-BayTM achieved average precision, recall, F1 score, and accuracy of 93.29%, 91.41%, 91.89%, and 95.90%, respectively. On the test dataset, these metrics rose to 97.44%, 99.71%, 98.52%, and 99.34%, respectively. These findings indicate that APD-BayTM is robust across both large and small datasets.
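
The evaluation step named in the abstract (5-fold cross-validation reporting average precision, recall, F1 score, and accuracy) can be illustrated with a minimal pure-Python sketch. This is not the authors' implementation. The function names are hypothetical, the folds are contiguous and unshuffled, the metrics are macro-averaged over classes (the abstract does not state the averaging scheme), and a majority-class baseline stands in for the Bayesian-optimized LSTM.

```python
from collections import Counter


def k_fold_indices(n_samples, k=5):
    """Return k (train_indices, test_indices) pairs over range(n_samples).

    Assumes n_samples >= k so that no test fold is empty.
    """
    folds, start = [], 0
    for i in range(k):
        # Spread the remainder over the first n_samples % k folds.
        size = n_samples // k + (1 if i < n_samples % k else 0)
        test = list(range(start, start + size))
        train = list(range(start)) + list(range(start + size, n_samples))
        folds.append((train, test))
        start += size
    return folds


def macro_scores(y_true, y_pred):
    """Return accuracy and macro-averaged precision, recall, and F1."""
    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1
        else:
            fp[p] += 1  # predicted class p, but it was wrong
            fn[t] += 1  # true class t was missed
    precisions, recalls, f1s = [], [], []
    for c in labels:
        prec = tp[c] / (tp[c] + fp[c]) if tp[c] + fp[c] else 0.0
        rec = tp[c] / (tp[c] + fn[c]) if tp[c] + fn[c] else 0.0
        f1 = 2 * prec * rec / (prec + rec) if prec + rec else 0.0
        precisions.append(prec)
        recalls.append(rec)
        f1s.append(f1)
    n = len(labels)
    return {
        "precision": sum(precisions) / n,
        "recall": sum(recalls) / n,
        "f1": sum(f1s) / n,
        "accuracy": sum(tp.values()) / len(y_true),
    }


def majority_baseline(train_labels):
    """Stand-in for a trained model: always predicts the most common label."""
    most_common = Counter(train_labels).most_common(1)[0][0]
    return lambda n: [most_common] * n


def cross_validate(labels, k=5):
    """Average each metric over k folds, using the stand-in baseline."""
    per_fold = []
    for train, test in k_fold_indices(len(labels), k):
        predict = majority_baseline([labels[i] for i in train])
        y_true = [labels[i] for i in test]
        per_fold.append(macro_scores(y_true, predict(len(y_true))))
    return {m: sum(f[m] for f in per_fold) / k for m in per_fold[0]}


if __name__ == "__main__":
    # Hypothetical AQI category labels, for illustration only.
    sample = ["good", "moderate", "moderate", "unhealthy", "moderate"] * 4
    print(cross_validate(sample, k=5))
```

In practice the baseline would be replaced by the tuned model's predictions for each fold, and a shuffled or stratified split may be preferable when class frequencies are uneven.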
