Optimization of XGBoost hyperparameters using grid search and random search for credit card default prediction

Authors

  • Eryan Ahmad Firdaus Universitas Pertahanan Republik Indonesia, Bogor, Indonesia
  • Jonson Manurung Universitas Pertahanan Republik Indonesia, Bogor, Indonesia
  • Hondor Saragih Universitas Pertahanan Republik Indonesia, Bogor, Indonesia
  • Muhammad Azhar Prabukusumo Universitas Pertahanan Republik Indonesia, Bogor, Indonesia

DOI:

https://doi.org/10.35335/mandiri.v14i2.468

Keywords:

Credit Card Default Prediction, Grid Search, Hyperparameter Optimization, Machine Learning, XGBoost

Abstract

This study explores the optimization of the Extreme Gradient Boosting (XGBoost) algorithm for credit card default prediction through systematic hyperparameter tuning using Grid Search and Random Search methodologies. Utilizing the publicly available Default of Credit Card Clients dataset from the UCI Machine Learning Repository, the research focuses on enhancing model performance by fine-tuning critical parameters such as learning rate, maximum tree depth, number of estimators, subsample ratio, and column sampling rate. The baseline XGBoost model achieved an accuracy of 0.8118, while the tuned models using Grid Search and Random Search improved the accuracy to 0.8183 and 0.8188, respectively. Although the improvement appears modest, the optimized models exhibited enhanced balance between precision and recall, particularly in identifying defaulters within an imbalanced dataset—an essential aspect in credit risk assessment. The results demonstrate that systematic hyperparameter optimization not only improves predictive performance but also contributes to model stability and generalization. Moreover, Random Search proved to be more computationally efficient, achieving near-optimal performance with fewer evaluations than Grid Search, thereby emphasizing its practicality for large-scale financial risk modeling applications. The novelty of this study lies in the comparative evaluation of two optimization techniques within the context of financial risk prediction, providing practical insights into how efficient hyperparameter tuning can enhance the reliability and scalability of machine learning models used in real-world credit risk management systems.
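The abstract names the tuned parameters (learning rate, maximum tree depth, number of estimators, subsample ratio, column sampling rate) and the two search strategies. The sketch below illustrates that workflow; it is not the authors' code. It uses scikit-learn's `GridSearchCV` and `RandomizedSearchCV` on a synthetic imbalanced dataset, with `GradientBoostingClassifier` standing in for XGBoost so the example stays self-contained (XGBoost's `XGBClassifier` would slot in the same way, adding `colsample_bytree` to the search space).

```python
# Hedged sketch of the tuning setup described in the abstract.
# Assumptions: synthetic data in place of the UCI credit card dataset,
# and scikit-learn's GradientBoostingClassifier in place of XGBoost.
from scipy.stats import randint, uniform
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import GridSearchCV, RandomizedSearchCV, train_test_split

# Imbalanced binary target (~22% positives, roughly the UCI default rate)
X, y = make_classification(n_samples=1000, n_features=20, weights=[0.78],
                           random_state=42)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=42)

# Grid Search: exhaustive sweep over a small grid (2*2*2*2 = 16 candidates)
grid = GridSearchCV(
    GradientBoostingClassifier(random_state=42),
    param_grid={"learning_rate": [0.05, 0.1],
                "max_depth": [3, 5],
                "n_estimators": [50, 100],
                "subsample": [0.8, 1.0]},
    scoring="f1", cv=3, n_jobs=-1,
).fit(X_tr, y_tr)

# Random Search: samples the same space with far fewer evaluations (8)
rand = RandomizedSearchCV(
    GradientBoostingClassifier(random_state=42),
    param_distributions={"learning_rate": uniform(0.01, 0.2),
                         "max_depth": randint(3, 7),
                         "n_estimators": randint(50, 151),
                         "subsample": uniform(0.7, 0.3)},
    n_iter=8, scoring="f1", cv=3, random_state=42, n_jobs=-1,
).fit(X_tr, y_tr)

print("grid best F1:", round(grid.best_score_, 4), grid.best_params_)
print("random best F1:", round(rand.best_score_, 4), rand.best_params_)
```

F1 is used as the selection metric here because, as the abstract notes, the precision/recall balance on the minority (defaulter) class matters more than raw accuracy on an imbalanced dataset; the 16-versus-8 candidate counts mirror the paper's point that Random Search reaches near-optimal settings with fewer evaluations.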

References

Addy, W. A., Ugochukwu, C. E., Oyewole, A. T., Ofodile, O. C., Adeoye, O. B., & Okoye, C. C. (2024). Predictive analytics in credit risk management for banks: A comprehensive review. GSC Advanced Research and Reviews, 18(2), 434–449. https://doi.org/10.30574/gscarr.2024.18.2.0077

Alanazi, B. S. (2025). A comparative study of traditional statistical methods and machine learning techniques for improved predictive models. International Journal of Analysis and Applications, 23, 18. https://doi.org/10.28924/2291-8639-23-2025-18

Bello, O. A. (2023). Machine learning algorithms for credit risk assessment: An economic and financial analysis. International Journal of Management, 10(1), 109–133. https://doi.org/10.37745/ijmt.2013/vol10n1109133

Cho, Y., Demmel, J., Dereziński, M., Li, H., Luo, H., Mahoney, M., & Murray, R. (2025). Surrogate-based autotuning for randomized sketching algorithms in regression problems. SIAM Journal on Matrix Analysis and Applications, 46(2), 1247–1279. https://doi.org/10.1137/23M1597526

Demir, S., & Sahin, E. K. (2023). An investigation of feature selection methods for soil liquefaction prediction based on tree-based ensemble algorithms using AdaBoost, gradient boosting, and XGBoost. Neural Computing and Applications, 35(4), 3173–3190. https://doi.org/10.1007/s00521-022-07856-4

Dong, J., Chen, Y., Yao, B., Zhang, X., & Zeng, N. (2022). A neural network boosting regression model based on XGBoost. Applied Soft Computing, 125, 109067. https://doi.org/10.1016/j.asoc.2022.109067

Eckman, D. J., Henderson, S. G., & Shashaani, S. (2023). SimOpt: A testbed for simulation-optimization experiments. INFORMS Journal on Computing, 35(2), 495–508. https://doi.org/10.1287/ijoc.2023.1273

Hassanali, M., Soltanaghaei, M., Javdani Gandomani, T., & Zamani Boroujeni, F. (2024). Software development effort estimation using boosting algorithms and automatic tuning of hyperparameters with Optuna. Journal of Software: Evolution and Process, 36(9), e2665. https://doi.org/10.1002/smr.2665

Liao, L., Li, H., Shang, W., & Ma, L. (2022). An empirical study of the impact of hyperparameter tuning and model optimization on the performance properties of deep neural networks. ACM Transactions on Software Engineering and Methodology (TOSEM), 31(3), 1–40. https://doi.org/10.1145/3506695

Machireddy, J. R. (2023). Data science and business analytics approaches to financial wellbeing: Modeling consumer habits and identifying at-risk individuals in financial services. Journal of Applied Big Data Analytics, Decision-Making, and Predictive Modelling Systems, 7(12), 1–18.

Phorah, K., Sumbwanyambe, M., & Sibiya, M. (2024). Systematic literature review on data preprocessing for improved water potability prediction: A study of data cleaning, feature engineering, and dimensionality reduction techniques. Nanotechnology Perceptions, 20(S11), 133–151.

Plevris, V., Bakas, N. P., & Solorzano, G. (2021). Pure random orthogonal search (PROS): A plain and elegant parameterless algorithm for global optimization. Applied Sciences, 11(11), 5053. https://doi.org/10.3390/app11115053

Rahaman, M. M., Rani, S., Islam, M. R., & Bhuiyan, M. M. R. (2023). Machine learning in business analytics: Advancing statistical methods for data-driven innovation. Journal of Computer Science and Technology Studies, 5(3), 104–111. https://doi.org/10.32996/jcsts.2023.5.3.8

Sahin, E. K. (2020). Assessing the predictive capability of ensemble tree methods for landslide susceptibility mapping using XGBoost, gradient boosting machine, and random forest. SN Applied Sciences, 2(7), 1308. https://doi.org/10.1007/s42452-020-3060-1

Sarker, I. H. (2021). Machine learning: Algorithms, real-world applications and research directions. SN Computer Science, 2(3), 160. https://doi.org/10.1007/s42979-021-00592-x

Shams, M. Y., Elshewey, A. M., El-Kenawy, E.-S. M., Ibrahim, A., Talaat, F. M., & Tarek, Z. (2024). Water quality prediction using machine learning models based on grid search method. Multimedia Tools and Applications, 83(12), 35307–35334. https://doi.org/10.1007/s11042-023-16737-4

Yu, T., & Zhu, H. (2020). Hyper-parameter optimization: A review of algorithms and applications. arXiv preprint arXiv:2003.05689. https://doi.org/10.48550/arXiv.2003.05689

Zabinsky, Z. B. (2009). Random search algorithms. Department of Industrial and Systems Engineering, University of Washington, USA, 1–16. https://doi.org/10.1002/9781119515326

Zhang, L., & Jánošík, D. (2024). Enhanced short-term load forecasting with hybrid machine learning models: CatBoost and XGBoost approaches. Expert Systems with Applications, 241, 122686. https://doi.org/10.1016/j.eswa.2023.122686

Zlobin, M., & Bazylevych, V. (2025). Bayesian optimization for tuning hyperparameters of machine learning models: A performance analysis in XGBoost. Computer Systems and Information Technologies, 1, 141–146. https://doi.org/10.31891/csit-2025-1-16

Published

2025-10-28

How to Cite

Firdaus, E. A., Manurung, J., Saragih, H., & Prabukusumo, M. A. (2025). Optimization of XGBoost hyperparameters using grid search and random search for credit card default prediction. Jurnal Mandiri IT, 14(2), 269–280. https://doi.org/10.35335/mandiri.v14i2.468
