Swarm driven automatic feature selection and classification framework for parkinson voice data

Authors

  • Muhammad Azhar Prabukusumo, Universitas Pertahanan Republik Indonesia, Bogor, Indonesia
  • Hondor Saragih, Universitas Pertahanan Republik Indonesia, Bogor, Indonesia
  • Jonson Manurung, Universitas Pertahanan Republik Indonesia, Bogor, Indonesia

DOI:

https://doi.org/10.35335/mandiri.v14i2.470

Keywords:

Feature Selection, Machine Learning, Parkinson’s Disease, Particle Swarm Optimization, Voice Analysis

Abstract

Parkinson’s disease (PD) severely impairs motor and vocal functions, and early detection is crucial for effective intervention. Conventional diagnostic procedures remain subjective and time-consuming, which highlights the need for automated, data-driven approaches. This study develops a fully automated framework that integrates Particle Swarm Optimization (PSO)-based feature selection with ensemble machine learning classifiers for PD detection from voice data. The proposed Swarm-Driven Automatic Feature Selection and Classification Framework (SAFSCF) automates data preprocessing, adaptive feature optimization, and classification within a unified pipeline. The framework was evaluated on the Parkinson’s Speech Dataset, which comprises 743 numerical features. Baseline models achieved accuracies of 0.7738 (Logistic Regression), 0.8651 (Random Forest), and 0.8690 (Gradient Boosting). PSO optimization reduced the feature set by nearly 50%, from 743 to 382 attributes, and the reduced model reached a test accuracy of 0.8421, slightly higher than the full-feature model (0.8355). Convergence plots showed that PSO minimized the fitness function while classification performance remained stable. Feature importance analysis revealed that the most discriminative attributes were derived from log energy, Teager-Kaiser energy operators (TKEO), MFCCs, shimmer, and entropy-based features, all biomarkers known to reflect Parkinsonian speech degradation. These findings indicate that the proposed framework improves computational efficiency and interpretability, offering a reproducible and scalable approach to non-invasive, voice-based PD diagnosis.
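The abstract does not state the PSO variant, its hyperparameters, or the fitness function used in SAFSCF, so the following is only a minimal sketch of wrapper-style binary PSO feature selection under assumptions of our own. The synthetic data, the nearest-centroid classifier (a stand-in for the ensemble models), the weight `alpha`, and every function name are hypothetical and are not taken from the paper. The fitness combines validation error with the fraction of features kept, so lower values favor masks that are both accurate and small.

```python
import math
import random

def make_data(rng, n=120, n_features=20, n_informative=4):
    """Synthetic two-class data (assumption): only the first features carry signal."""
    X, y = [], []
    for i in range(n):
        label = i % 2
        X.append([rng.gauss(1.5 * label if j < n_informative else 0.0, 1.0)
                  for j in range(n_features)])
        y.append(label)
    return X, y

def centroid_accuracy(X_tr, y_tr, X_va, y_va, mask):
    """Accuracy of a nearest-centroid classifier restricted to the selected features."""
    idx = [j for j, keep in enumerate(mask) if keep]
    if not idx:
        return 0.0  # an empty feature subset cannot classify anything
    cents = {}
    for c in (0, 1):
        rows = [x for x, t in zip(X_tr, y_tr) if t == c]
        cents[c] = [sum(r[j] for r in rows) / len(rows) for j in idx]
    hits = 0
    for x, t in zip(X_va, y_va):
        dist = {c: sum((x[j] - m) ** 2 for j, m in zip(idx, cents[c])) for c in cents}
        hits += (min(dist, key=dist.get) == t)
    return hits / len(y_va)

def fitness(mask, data, alpha=0.9):
    """Lower is better: weighted validation error plus fraction of features kept."""
    X_tr, y_tr, X_va, y_va = data
    acc = centroid_accuracy(X_tr, y_tr, X_va, y_va, mask)
    return alpha * (1.0 - acc) + (1.0 - alpha) * sum(mask) / len(mask)

def binary_pso(data, n_features, rng, n_particles=12, n_iter=30,
               w=0.7, c1=1.5, c2=1.5):
    """Binary PSO: velocities pass through a sigmoid to give bit probabilities."""
    pos = [[int(rng.random() < 0.5) for _ in range(n_features)]
           for _ in range(n_particles)]
    vel = [[0.0] * n_features for _ in range(n_particles)]
    pbest = [p[:] for p in pos]
    pbest_fit = [fitness(p, data) for p in pos]
    g = min(range(n_particles), key=pbest_fit.__getitem__)
    gbest, gbest_fit = pbest[g][:], pbest_fit[g]
    history = [gbest_fit]  # best fitness so far, one entry per iteration
    for _ in range(n_iter):
        for i in range(n_particles):
            for j in range(n_features):
                v = (w * vel[i][j]
                     + c1 * rng.random() * (pbest[i][j] - pos[i][j])
                     + c2 * rng.random() * (gbest[j] - pos[i][j]))
                vel[i][j] = max(-6.0, min(6.0, v))  # clamp to keep the sigmoid stable
                prob = 1.0 / (1.0 + math.exp(-vel[i][j]))
                pos[i][j] = int(rng.random() < prob)
            f = fitness(pos[i], data)
            if f < pbest_fit[i]:
                pbest[i], pbest_fit[i] = pos[i][:], f
                if f < gbest_fit:
                    gbest, gbest_fit = pos[i][:], f
        history.append(gbest_fit)
    return gbest, gbest_fit, history

rng = random.Random(0)
X, y = make_data(rng)
data = (X[:80], y[:80], X[80:], y[80:])  # train / validation split
mask, best_fit, history = binary_pso(data, n_features=20, rng=rng)
print(sum(mask), "of", len(mask), "features kept; fitness", round(best_fit, 4))
```

The `history` list plays the role of a convergence curve: because the global best is only ever replaced by a strictly better mask, the recorded fitness never increases. In a fuller pipeline the selected mask would be applied to a held-out test set that the optimizer never sees, which is presumably how the reported 0.8421 and 0.8355 test accuracies should be compared.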

Published

2025-10-31

How to Cite

Prabukusumo, M. A., Saragih, H., & Manurung, J. (2025). Swarm driven automatic feature selection and classification framework for parkinson voice data. Jurnal Mandiri IT, 14(2), 292–300. https://doi.org/10.35335/mandiri.v14i2.470
