Alphabet SIBI sign language recognition using YOLOv11 for real-time gesture detection
DOI: https://doi.org/10.35335/mandiri.v14i1.408

Keywords: Backbone optimization, Data augmentation, Gesture recognition, Real-time detection, SIBI alphabet

Abstract
Modern gesture recognition systems for sign language face challenges in balancing computational efficiency with detection accuracy in complex, dynamic environments. To address this, this study proposes a recognition framework for the SIBI (Sistem Isyarat Bahasa Indonesia) alphabet based on YOLOv11, optimized for real-time applications. The architecture integrates a modified, efficient YOLOv11 backbone that enables accurate hand gesture feature extraction with minimal latency. A custom SIBI dataset comprising alphabet signs and essential vocabulary is used to train the model, supported by data augmentation techniques that improve robustness to variations in position, lighting, and background. Experimental results show that the model achieves high detection accuracy, with an mAP50 of 97%, while significantly reducing computational complexity. These findings offer a meaningful scientific contribution by showing how a lightweight yet highly accurate deep learning model can be applied effectively to sign language recognition, particularly SIBI in the Indonesian context. From a practical standpoint, the framework provides a real-time gesture detection solution suitable for deployment on resource-constrained devices, making it accessible for mobile and embedded systems. It can replace or complement traditional communication aids, especially in inclusive education, public services, and healthcare. Furthermore, the proposed method can be adapted for gesture-based interaction in other domains such as athletic training, physical education, and app-based fitness programs, where accurate real-time motion recognition is essential.
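The reported mAP50 metric counts a predicted bounding box as a true positive when its intersection-over-union (IoU) with a ground-truth box is at least 0.5, averaged over all classes. A minimal sketch of the IoU check underlying this threshold (the box coordinates below are illustrative, not taken from the paper's experiments):

```python
def iou(box_a, box_b):
    """Intersection-over-union of two axis-aligned boxes given as (x1, y1, x2, y2)."""
    # Overlap rectangle: max of the top-left corners, min of the bottom-right corners
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    return inter / (area_a + area_b - inter)

pred = (12, 10, 52, 48)   # predicted hand-sign box (illustrative)
truth = (10, 10, 50, 50)  # annotated ground-truth box
# A detection counts toward mAP50 when IoU >= 0.5
print(iou(pred, truth) >= 0.5)  # → True
```

At stricter thresholds such as mAP50-95, the same IoU value is compared against a sweep of cutoffs from 0.5 to 0.95, which is why that metric is typically lower than mAP50.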
License
Copyright (c) 2025 Salsabilla Azahra Putri, Murinto Murinto, Sunardi Sunardi

This article is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.