Application of Diabetes Risk Prediction Using Machine Learning Algorithms
DOI: https://doi.org/10.59934/jaiea.v5i2.2040
Keywords: Diabetes mellitus, Machine learning, Risk prediction, Streamlit, Support vector machine
Abstract
Diabetes mellitus is a chronic disease that poses a significant global health burden and requires effective early detection to reduce complications and mortality. In recent years, machine learning techniques have been widely applied to support medical decision-making, particularly disease risk prediction. This study compares the performance of several machine learning algorithms for diabetes risk prediction and deploys the best-performing model in a web-based application. The study used the PIMA Indians Diabetes Dataset, with data preprocessing to address class imbalance and improve model performance. Five classification algorithms were evaluated: Logistic Regression, Support Vector Machine (SVM), Random Forest, K-Nearest Neighbors (KNN), and Naive Bayes. Model performance was assessed using accuracy, recall, F1-score, and Area Under the Curve (AUC), with particular emphasis on recall and F1-score because missed positive cases are especially costly in medical screening. Experimental results show that the SVM model outperformed the other algorithms, achieving higher recall, F1-score, and AUC values. The selected model was then deployed in a web-based application built with the Streamlit framework, which lets users enter clinical parameters and obtain real-time diabetes risk predictions. The results indicate that machine learning models, particularly SVM, can effectively support diabetes risk prediction, and they demonstrate the potential of integrating predictive models into practical healthcare applications.
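The four evaluation metrics named in the abstract can be illustrated with a minimal, standard-library-only sketch. This is not the authors' code: the function names (`confusion_counts`, `screening_metrics`, `roc_auc`), the 0.5 decision threshold, and the toy labels and scores are all assumptions made for illustration, and none of the numbers come from the PIMA Indians Diabetes Dataset.

```python
from typing import Dict, Sequence, Tuple


def confusion_counts(y_true: Sequence[int], y_pred: Sequence[int]) -> Tuple[int, int, int, int]:
    """Return (tp, fp, fn, tn) for binary labels, where 1 means 'diabetic'."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    return tp, fp, fn, tn


def screening_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Accuracy, recall, precision and F1-score from hard 0/1 predictions."""
    tp, fp, fn, tn = confusion_counts(y_true, y_pred)
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    # Recall: share of true positive cases the model catches.
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    # F1: harmonic mean of precision and recall.
    f1 = (2 * precision * recall / (precision + recall)) if (precision + recall) else 0.0
    return {"accuracy": accuracy, "recall": recall, "precision": precision, "f1": f1}


def roc_auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a random positive outranks a random negative.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative label")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    # Toy data (assumed, illustrative only): 3 positive and 5 negative cases.
    y_true = [1, 1, 1, 0, 0, 0, 0, 0]
    scores = [0.9, 0.6, 0.3, 0.7, 0.2, 0.1, 0.4, 0.05]
    y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 threshold
    print(screening_metrics(y_true, y_pred))
    print("auc:", roc_auc(y_true, scores))
```

On imbalanced data, accuracy can look acceptable while recall stays low, because correctly classified negatives dominate the count. That is one reason a screening study might weight recall and F1-score more heavily, as the abstract describes.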
License
Copyright (c) 2026 Journal of Artificial Intelligence and Engineering Applications (JAIEA)

This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.







