Comparison of Logistic Regression and XGBoost Model Performance in Predicting Credit Scores

Authors

  • Stacyana Jesika Surianto, Universitas Negeri Medan
  • Chairunisah, Universitas Negeri Medan

DOI:

https://doi.org/10.59934/jaiea.v5i2.1877

Keywords:

Credit Scoring, Logistic Regression, XGBoost, Machine Learning, Non-Performing Loan, SMOTE

Abstract

Credit scoring is a mathematical approach used to assess the creditworthiness of individuals or companies by classifying debtors into categories based on their risk profiles. This study compares the performance of the Logistic Regression and XGBoost machine learning algorithms in predicting credit scores, with the aim of reducing Non-Performing Loan (NPL) risk at PT Graha Mazindo Mandiri. The secondary dataset contains records for 1,533 car loan debtors with 17 variables: 1 dependent variable and 16 independent variables. The research process comprises data preprocessing (cleaning, outlier handling, encoding, normalization, and class balancing with SMOTE), modeling, and evaluation using the Accuracy, Precision, Recall, F1-score, and ROC-AUC metrics. The results show that XGBoost outperforms Logistic Regression, with 96% accuracy and a ROC-AUC of 0.99 versus 88% accuracy and a ROC-AUC of 0.94, a difference attributed to XGBoost's ability to capture non-linear patterns and handle data imbalance. This study provides insights into credit risk factors and supports more accurate credit decision-making, with recommendations for hyperparameter optimization and model integration into operational systems.
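
To make the evaluation step concrete, the following is a minimal sketch of how the five reported metrics can be computed for a binary classifier. It is not the authors' code: the function names, the toy labels and scores, the 0.5 decision threshold, and the convention that label 1 marks the positive (for example, non-performing) class are all assumptions made for illustration.

```python
from typing import Dict, Sequence


def confusion_counts(y_true: Sequence[int], y_pred: Sequence[int]):
    """Return (tp, fp, fn, tn), treating label 1 as the positive class (assumption)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    return tp, fp, fn, tn


def classification_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Accuracy, precision, recall and F1 from hard 0/1 predictions."""
    tp, fp, fn, tn = confusion_counts(y_true, y_pred)
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)) if (precision + recall) else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


def roc_auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """ROC-AUC as the probability that a random positive outscores a random negative.

    Ties count as half a win. This pairwise form is O(n_pos * n_neg), which is
    fine for a small illustration but slow on large datasets.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("ROC-AUC needs at least one positive and one negative example")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy data (hypothetical, not taken from the study).
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
scores = [0.9, 0.2, 0.7, 0.4, 0.6, 0.1, 0.8, 0.3]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 threshold

metrics = classification_metrics(y_true, y_pred)
auc = roc_auc(y_true, scores)
print(metrics, auc)
```

Accuracy, Precision, Recall, and F1-score depend on the chosen threshold, while ROC-AUC is computed from the raw scores and summarizes ranking quality across all thresholds. This is why a study on imbalanced credit data would typically report both kinds of metric.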

Published

2026-02-15

How to Cite

Surianto, S. J., & Chairunisah. (2026). Comparison of Logistic Regression and XGBoost Model Performance in Predicting Credit Scores. Journal of Artificial Intelligence and Engineering Applications (JAIEA), 5(2), 2427–2434. https://doi.org/10.59934/jaiea.v5i2.1877