Attention-based convolutional neural networks for interpretable classification of maritime equipment

Authors

  • Luky Fabrianto Nusa Mandiri University, Jakarta, Indonesia
  • Tiwuk Wahyuli Prihandayani Mercu Buana University, Jakarta, Indonesia
  • Rasenda Rasenda Pembangunan Nasional "Veteran" University, Jakarta, Indonesia
  • Novianti Madhona Faizah Tama Jagakarsa University, Jakarta, Indonesia

DOI:

https://doi.org/10.35335/mandiri.v14i1.426

Keywords:

Attention Mechanism, CNN, Explainability, Grad-CAM, Ship Components

Abstract

This study introduces a Convolutional Neural Network with an Attention Mechanism (CNN-AM), built around the Squeeze-and-Excitation (SE) block, to classify critical ship components: generators, engines, and oil-water separators (OWS). The SE block strengthens the model's focus on discriminative features, thereby improving classification performance. To compensate for the small original dataset of only 199 images, extensive data augmentation was applied, expanding the dataset to 2,648 images, which were then split into training (70%), validation (15%), and testing (15%) sets for reliable evaluation. Experimental results show that the CNN-AM achieved an accuracy of 72.39%, surpassing the baseline CNN, which reached 68.16%. These findings indicate that the attention mechanism improves generalization and the ability to differentiate visually similar classes. Furthermore, integrating an interpretability tool, Gradient-weighted Class Activation Mapping (Grad-CAM), provides visual explanations of model predictions, increasing trust and reliability for safety-critical maritime applications. The proposed approach shows strong potential for real-time ship component monitoring, contributing to predictive maintenance and operational safety in the maritime industry.
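
As a reading aid, the sketch below illustrates how a Squeeze-and-Excitation block of the kind described in the abstract can be inserted into a small convolutional classifier for a three-class task. It is a minimal sketch only: the framework (TensorFlow/Keras), input size, reduction ratio, and layer widths are assumptions chosen for illustration, not the configuration reported in the paper.

```python
# Minimal sketch of a CNN with a Squeeze-and-Excitation (SE) block.
# Assumed settings (not taken from the paper): Keras/TensorFlow backend,
# 224x224 RGB inputs, reduction ratio r=16, and a 3-class softmax head
# (generator, engine, oil-water separator).
import tensorflow as tf
from tensorflow.keras import layers, models


def se_block(x, reduction=16):
    """Squeeze-and-Excitation: global-pool ('squeeze'), two dense layers
    ('excitation'), then channel-wise rescaling of the feature map."""
    channels = x.shape[-1]
    s = layers.GlobalAveragePooling2D()(x)                # squeeze: B x C
    s = layers.Dense(channels // reduction, activation="relu")(s)
    s = layers.Dense(channels, activation="sigmoid")(s)   # per-channel weights
    s = layers.Reshape((1, 1, channels))(s)
    return layers.Multiply()([x, s])                       # recalibrate channels


def build_cnn_am(input_shape=(224, 224, 3), num_classes=3):
    inputs = layers.Input(shape=input_shape)
    x = layers.Conv2D(32, 3, padding="same", activation="relu")(inputs)
    x = layers.MaxPooling2D()(x)
    x = layers.Conv2D(64, 3, padding="same", activation="relu")(x)
    x = se_block(x)                                        # attention mechanism
    x = layers.MaxPooling2D()(x)
    x = layers.GlobalAveragePooling2D()(x)
    outputs = layers.Dense(num_classes, activation="softmax")(x)
    return models.Model(inputs, outputs)


model = build_cnn_am()
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
```

Because the SE block preserves the spatial layout of the final convolutional features, Grad-CAM can afterwards be applied to that layer to produce the kind of visual explanations the abstract refers to.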

References

Akhtar, N., & Ragavendran, U. (2020). Interpretation of intelligence in CNN-pooling processes: a methodological survey. Neural Computing and Applications, 32(3), 879–898. https://doi.org/10.1007/s00521-019-04296-5

Beyer, L., Zhai, X., Royer, A., Markeeva, L., Anil, R., & Kolesnikov, A. (2022). Knowledge distillation: A good teacher is patient and consistent. Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 2022-June, 10915–10924. https://doi.org/10.1109/CVPR52688.2022.01065

Chakraborty, T., Trehan, U., Mallat, K., & Dugelay, J. L. (2022). Generalizing Adversarial Explanations with Grad-CAM. IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops, 2022-June, 186–192. https://doi.org/10.1109/CVPRW56347.2022.00031

Chen, L., Chen, J., Hajimirsadeghi, H., Mori, G., & Ai, B. (n.d.). Adapting Grad-CAM for Embedding Networks.

da Costa, A. Z., Figueroa, H. E. H., & Fracarolli, J. A. (2020). Computer vision based detection of external defects on tomatoes using deep learning. Biosystems Engineering, 190, 131–144. https://doi.org/10.1016/j.biosystemseng.2019.12.003

Damrich, S., & Hamprecht, F. A. (n.d.). On UMAP’s True Loss Function.

Gholamalinezhad, H., & Khosravi, H. (n.d.). Pooling Methods in Deep Neural Networks, a Review.

Hayou, S., Doucet, A., & Rousseau, J. (n.d.). On the Impact of the Activation Function on Deep Neural Networks Training.

Kossaifi, J., Kolbeinsson, A., Khanna, A., Furlanello, T., & Anandkumar, A. (2020). Tensor Regression Networks. Journal of Machine Learning Research, 21, 1–21. http://jmlr.org/papers/v21/18-503.html.

Krstinić, D., Braović, M., Šerić, L., & Božić-Štulić, D. (2020). Multi-label classifier performance evaluation with confusion matrix. 1–14. https://doi.org/10.5121/csit.2020.100801

Lee, C. M., Jang, H. J., & Jung, B. G. (2023). Development of an Automated Spare-Part Management Device for Ship Controlled by Raspberry-Pi Microcomputer Based on Image-Progressing & Transfer-Learning. Journal of Marine Science and Engineering, 11(5). https://doi.org/10.3390/jmse11051015

Li, Z., Ji, J., Ge, Y., & Zhang, Y. (n.d.). AutoLossGen: Automatic Loss Function Generation for Recommender Systems. https://doi.org/10.1145/3477495.3531941

Mahadevkar, S. V., Khemani, B., Patil, S., Kotecha, K., Vora, D. R., Abraham, A., & Gabralla, L. A. (2022). A Review on Machine Learning Styles in Computer Vision - Techniques and Future Directions. IEEE Access, 10(September), 107293–107329. https://doi.org/10.1109/ACCESS.2022.3209825

Markoulidakis, I., & Kopsiaftis, G. (2021). Multi-Class Confusion Matrix Reduction method and its application on Net Promoter Score classification problem. https://doi.org/10.1145/3453892.3461323

Mohiuddin, K., Welke, P., Alam, M. A., Martin, M., Alam, M. M., Lehmann, J., & Vahdati, S. (2023). Retention Is All You Need. International Conference on Information and Knowledge Management, Proceedings, 4752–4758. https://doi.org/10.1145/3583780.3615497

Morbidelli, P., Carrera, D., Rossi, B., Fragneto, P., & Boracchi, G. (2020). Augmented Grad-CAM: Heat-Maps Super Resolution Through Augmentation. ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings, 2020-May, 4067–4071. https://doi.org/10.1109/ICASSP40776.2020.9054416

Moujahid, H., Cherradi, B., Al-Sarem, M., Bahatti, L., Eljialy, A. B. A. M. Y., Alsaeedi, A., & Saeed, F. (2022). Combining cnn and grad-cam for covid-19 disease prediction and visual explanation. Intelligent Automation and Soft Computing, 32(2), 723–745. https://doi.org/10.32604/iasc.2022.022179

Mumuni, A., & Mumuni, F. (2022). Data augmentation: A comprehensive survey of modern approaches. Array, 16(November), 100258. https://doi.org/10.1016/j.array.2022.100258

Sardar, A. (2024). Improving safety and efficiency in the maritime industry: a multi-disciplinary approach. https://doi.org/10.25959/26011102.V1

Sarvamangala, D. R., & Kulkarni, R. V. (2022). Convolutional neural networks in medical image understanding: a survey. Evolutionary Intelligence, 15, 1–22. https://doi.org/10.1007/s12065-020-00540-3

Selvaraju, R. R., Cogswell, M., Das, A., Vedantam, R., Parikh, D., & Batra, D. (2016). Grad-CAM: Visual Explanations from Deep Networks via Gradient-Based Localization. http://arxiv.org/abs/1610.02391

Shang, D., Zhang, J., Zhou, K., Wang, T., & Qi, J. (2022). Research on the Application of Visual Recognition in the Engine Room of Intelligent Ships. Sensors, 22(19). https://doi.org/10.3390/s22197261

Sharma, H., & Kumar, H. (2024). A computer vision-based system for real-time component identification from waste printed circuit boards. Journal of Environmental Management, 351(December 2023), 119779. https://doi.org/10.1016/j.jenvman.2023.119779

Theodoropoulos, P., Spandonidis, C. C., Giannopoulos, F., & Fassois, S. (2021). A deep learning-based fault detection model for optimization of shipping operations and enhancement of maritime safety. Sensors, 21(16). https://doi.org/10.3390/s21165658

Wang, Y., Zhang, J., Zhu, J., Ge, Y., & Zhai, G. (2023). Research on the Visual Perception of Ship Engine Rooms Based on Deep Learning. Journal of Marine Science and Engineering, 11(7). https://doi.org/10.3390/jmse11071450

Xu, M., Yoon, S., Fuentes, A., & Park, D. S. (2023). A Comprehensive Survey of Image Augmentation Techniques for Deep Learning. Pattern Recognition, 137, 109347. https://doi.org/10.1016/j.patcog.2023.109347

Zafar, A., Aamir, M., Mohd Nawi, N., Arshad, A., Riaz, S., Alruban, A., Dutta, A. K., & Almotairi, S. (2022). A Comparison of Pooling Methods for Convolutional Neural Networks. Applied Sciences, 12(17), 8643. https://doi.org/10.3390/APP12178643

Zhang, M., Gao, H., Liao, X., Ning, B., Gu, H., & Yu, B. (n.d.). DBGRU-SE: predicting drug-drug interactions based on double BiGRU and squeeze-and-excitation attention mechanism. https://doi.org/10.1093/bib/bbad184

Zhang, Y., Li, K., Li, K., & Fu, Y. (n.d.). MR Image Super-Resolution with Squeeze and Excitation Reasoning Attention Network.

Zhang, Z., & Peng, H. (n.d.). Deeper and Wider Siamese Networks for Real-Time Visual Tracking.

Published

2025-07-25

How to Cite

Fabrianto, L., Prihandayani, T. W., Rasenda, R., & Faizah, N. M. (2025). Attention-based convolutional neural networks for interpretable classification of maritime equipment. Jurnal Mandiri IT, 14(1), 157–168. https://doi.org/10.35335/mandiri.v14i1.426