Implementation of GridSearchCV to Find the Best Hyperparameter Combination for Classification Model Algorithm in Predicting Water Potability
DOI:
https://doi.org/10.59934/jaiea.v4i2.844
Keywords:
Potability, GridSearchCV, SVM, Random Forest, Logistic Regression
Abstract
Drinking water quality is an important factor in public health, so an accurate method is needed to determine water potability. This research builds a water potability prediction model using machine learning, with a focus on model accuracy and evaluation. The dataset includes various chemical parameters, as well as one radiological and acceptability parameter. The Support Vector Machine (SVM), Random Forest (RF), and Logistic Regression algorithms were tuned using GridSearchCV and their performance was compared. Models were evaluated using accuracy, precision, recall, F1-score, and the confusion matrix, with cross-validation to check generalizability. SVM performed best with an accuracy of 70.43%, followed by Random Forest and Logistic Regression with accuracies of 70.12% and 62.20%, respectively. The SVM-based model is able to provide reliable predictions and can be used as a tool to support decision-making in water quality management.
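The workflow summarized in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' code: the data is synthetic (generated with `make_classification` as a stand-in for the water-quality table), the hyperparameter grids are hypothetical, and the scaling step and 5-fold setting are assumptions that the abstract does not specify.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import (accuracy_score, confusion_matrix,
                             precision_recall_fscore_support)
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in for the water-quality table (NOT the study's data).
X, y = make_classification(n_samples=400, n_features=9, n_informative=5,
                           random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0)

# Hypothetical grids; the grids actually searched in the paper are not given here.
candidates = {
    "SVM": (SVC(),
            {"clf__C": [0.1, 1, 10], "clf__kernel": ["linear", "rbf"]}),
    "Random Forest": (RandomForestClassifier(random_state=0),
                      {"clf__n_estimators": [50, 100],
                       "clf__max_depth": [None, 5]}),
    "Logistic Regression": (LogisticRegression(max_iter=1000),
                            {"clf__C": [0.1, 1, 10]}),
}

results = {}
for name, (estimator, grid) in candidates.items():
    # Scaling inside the pipeline is an assumption; it keeps each CV fold
    # from seeing statistics computed on its own validation split.
    pipe = Pipeline([("scale", StandardScaler()), ("clf", estimator)])
    search = GridSearchCV(pipe, grid, cv=5, scoring="accuracy")
    search.fit(X_train, y_train)

    # Evaluate the refitted best model on the held-out test split.
    y_pred = search.predict(X_test)
    precision, recall, f1, _ = precision_recall_fscore_support(
        y_test, y_pred, average="binary")
    results[name] = {
        "best_params": search.best_params_,
        "cv_accuracy": search.best_score_,
        "test_accuracy": accuracy_score(y_test, y_pred),
        "precision": precision,
        "recall": recall,
        "f1": f1,
        "confusion_matrix": confusion_matrix(y_test, y_pred),
    }

for name, r in results.items():
    print(f"{name}: test accuracy={r['test_accuracy']:.3f}, "
          f"best params={r['best_params']}")
```

Because the data here is synthetic, the printed scores will not match the accuracies reported in the abstract; the sketch only shows how a grid search per algorithm and a shared set of evaluation metrics might fit together.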
Copyright (c) 2025 Journal of Artificial Intelligence and Engineering Applications (JAIEA)

This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.