Artificial intelligence-based hand gesture recognition for sign language interpretation

Authors

  • M. Fazil Rais Universitas Pertahanan Republik Indonesia, Bogor, Indonesia
  • M. Ilham AlFatrah Universitas Pertahanan Republik Indonesia, Bogor, Indonesia
  • Chadafa Zulti Noorta Universitas Pertahanan Republik Indonesia, Bogor, Indonesia
  • H.A Danang Rimbawa Universitas Pertahanan Republik Indonesia, Bogor, Indonesia
  • Abdurrosyid Atturoybi Universitas Pertahanan Republik Indonesia, Bogor, Indonesia

DOI:

https://doi.org/10.35335/mandiri.v14i1.395

Keywords:

Artificial Intelligence, Computer Vision, Convolutional Neural Network, Hand Gesture Recognition, Sign Language

Abstract

This paper presents an artificial intelligence-based system for real-time hand gesture recognition to support sign language interpretation for the deaf and hard-of-hearing community. The system integrates computer vision techniques with deep learning to accurately identify static hand gestures representing alphabetic signs. The MediaPipe framework detects and tracks hand landmarks from live video input, which are then processed and classified by a Convolutional Neural Network (CNN). The model is trained on a publicly available BISINDO (Bahasa Isyarat Indonesia) gesture dataset retrieved from Kaggle, comprising 312 images of 26 hand gestures captured under multiple background conditions. Preprocessing includes resizing, grayscale conversion, data augmentation, and landmark extraction; in particular, advanced augmentation methods and landmark normalization significantly improve gesture identification accuracy and model robustness. Experimental results show that the system achieves an average classification accuracy of 88.03% and maintains stable performance in real-time applications. Despite these promising results, the system has limitations, including difficulty with dynamic gesture recognition, background interference, and complex hand movements, all of which future research can address to improve accuracy and generalization. These findings highlight the system’s potential as an inclusive communication tool to bridge language barriers between deaf individuals and non-signers. This research contributes to the development of accessible assistive technologies by demonstrating a non-intrusive, vision-based approach to sign language interpretation. Future development may involve dynamic gesture translation, sentence-level recognition, and deployment on mobile platforms.
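The abstract mentions landmark normalization as a key preprocessing step but does not specify the exact scheme. A common approach with MediaPipe's 21-point hand model, sketched below as an assumption rather than the authors' exact method, is to translate the landmarks so the wrist is at the origin and scale them so the farthest landmark lies at unit distance, making the feature vector invariant to hand position and size in the frame:

```python
import numpy as np

def normalize_landmarks(landmarks):
    """Normalize 21 (x, y) hand landmarks.

    Translates so the wrist (index 0 in MediaPipe's landmark
    convention) sits at the origin, then scales so the farthest
    landmark is at unit distance. Returns a flat 42-dim vector
    suitable as input to a classifier such as a CNN.
    """
    pts = np.asarray(landmarks, dtype=np.float32).reshape(21, 2)
    pts -= pts[0]                              # wrist at origin
    scale = np.linalg.norm(pts, axis=1).max()  # farthest landmark distance
    if scale > 0:
        pts /= scale                           # scale-invariant coordinates
    return pts.flatten()
```

Normalizing in landmark space rather than pixel space is what lets a vision-based system of this kind tolerate varying hand sizes and positions without collecting training data for every camera geometry.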


Published

2025-07-17

How to Cite

Rais, M. F., AlFatrah, M. I., Noorta, C. Z., Rimbawa, H. D., & Atturoybi, A. (2025). Artificial intelligence-based hand gesture recognition for sign language interpretation. Jurnal Mandiri IT, 14(1), 76–86. https://doi.org/10.35335/mandiri.v14i1.395