Optimization of XGBoost hyperparameters using grid search and random search for credit card default prediction
DOI: https://doi.org/10.35335/mandiri.v14i2.468

Keywords: Credit Card Default Prediction, Grid Search, Hyperparameter Optimization, Machine Learning, XGBoost

Abstract
This study explores the optimization of the Extreme Gradient Boosting (XGBoost) algorithm for credit card default prediction through systematic hyperparameter tuning using Grid Search and Random Search methodologies. Utilizing the publicly available Default of Credit Card Clients dataset from the UCI Machine Learning Repository, the research focuses on enhancing model performance by fine-tuning critical parameters such as learning rate, maximum tree depth, number of estimators, subsample ratio, and column sampling rate. The baseline XGBoost model achieved an accuracy of 0.8118, while the tuned models using Grid Search and Random Search improved the accuracy to 0.8183 and 0.8188, respectively. Although the improvement appears modest, the optimized models exhibited enhanced balance between precision and recall, particularly in identifying defaulters within an imbalanced dataset—an essential aspect in credit risk assessment. The results demonstrate that systematic hyperparameter optimization not only improves predictive performance but also contributes to model stability and generalization. Moreover, Random Search proved to be more computationally efficient, achieving near-optimal performance with fewer evaluations than Grid Search, thereby emphasizing its practicality for large-scale financial risk modeling applications. The novelty of this study lies in the comparative evaluation of two optimization techniques within the context of financial risk prediction, providing practical insights into how efficient hyperparameter tuning can enhance the reliability and scalability of machine learning models used in real-world credit risk management systems.
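The efficiency argument above — Random Search reaching near-optimal performance with far fewer evaluations than Grid Search — comes down to how each method enumerates the hyperparameter space. A minimal, library-free sketch of that difference is shown below; the parameter names and value ranges are illustrative assumptions (they mirror XGBoost's `learning_rate`, `max_depth`, `n_estimators`, `subsample`, and `colsample_bytree`, but are not the grids used in the study):

```python
import itertools
import random

# Hypothetical search space over the five XGBoost parameters the study tunes.
space = {
    "learning_rate": [0.01, 0.05, 0.1, 0.3],
    "max_depth": [3, 5, 7, 9],
    "n_estimators": [100, 200, 400],
    "subsample": [0.6, 0.8, 1.0],
    "colsample_bytree": [0.6, 0.8, 1.0],
}

def grid_search_candidates(space):
    """Grid Search: exhaustively enumerate the Cartesian product,
    so the cost multiplies across every parameter axis."""
    keys = list(space)
    return [dict(zip(keys, combo))
            for combo in itertools.product(*space.values())]

def random_search_candidates(space, n_iter, seed=42):
    """Random Search: draw a fixed budget of configurations uniformly,
    so the cost is n_iter regardless of how many axes the space has."""
    rng = random.Random(seed)
    return [{k: rng.choice(v) for k, v in space.items()}
            for _ in range(n_iter)]

full_grid = grid_search_candidates(space)
sampled = random_search_candidates(space, n_iter=50)
print(len(full_grid))  # 4 * 4 * 3 * 3 * 3 = 432 model fits required
print(len(sampled))    # 50 model fits, fixed in advance
```

Each candidate dictionary would then be passed to a model fit and scored by cross-validation; only the enumeration strategy differs. This is why Random Search scales better as parameters are added: the grid grows multiplicatively, while the random budget stays constant.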
Copyright (c) 2025 Eryan Ahmad Firdaus, Jonson Manurung, Hondor Saragih, Muhammad Azhar Prabukusumo

This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.