Attention-based convolutional neural networks for interpretable classification of maritime equipment
DOI:
https://doi.org/10.35335/mandiri.v14i1.426
Keywords:
Attention Mechanism, CNN, Explainability, Grad-CAM, Ship Components
Abstract
This study introduces a Convolutional Neural Network with an Attention Mechanism (CNN-AM), which uses a Squeeze-and-Excitation (SE) block, to classify three critical ship components: generators, engines, and oil-water separators (OWS). The SE block enhances the model's ability to focus on discriminative features, thereby improving classification performance. Because the original dataset contained only 199 images, extensive data augmentation was applied, expanding it to 2,648 images. The augmented dataset was divided into training (70%), validation (15%), and testing (15%) sets to ensure reliable evaluation. Experimental results show that the CNN-AM achieved an accuracy of 72.39%, compared with 68.16% for the baseline CNN. These findings indicate that the attention mechanism improves generalization and the ability to differentiate visually similar classes. Furthermore, integrating interpretability tools such as Gradient-weighted Class Activation Mapping (Grad-CAM) provides visual explanations of model predictions, increasing trust and reliability for safety-critical maritime applications. The proposed approach shows potential for real-time ship component monitoring and could contribute to predictive maintenance and operational safety in the maritime industry.
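To make the role of the SE block concrete, the following is a minimal sketch of the standard squeeze, excitation, and rescaling steps in plain Python. It is not the authors' implementation: the abstract does not state the framework, layer sizes, or reduction ratio, so the function names, the 4-channel 3x3 input, the reduction ratio of 2, and the random (untrained) weights are all assumptions chosen for illustration.

```python
import math
import random


def squeeze(feature_maps):
    """Squeeze: global average pooling, one scalar per channel."""
    return [sum(sum(row) for row in fmap) / (len(fmap) * len(fmap[0]))
            for fmap in feature_maps]


def dense(vector, weights, biases):
    """Fully connected layer; weights is laid out as [out][in]."""
    return [sum(w * x for w, x in zip(row, vector)) + b
            for row, b in zip(weights, biases)]


def excite(descriptor, w1, b1, w2, b2):
    """Excitation: bottleneck MLP (reduce -> ReLU -> expand -> sigmoid)."""
    hidden = [max(0.0, h) for h in dense(descriptor, w1, b1)]
    return [1.0 / (1.0 + math.exp(-z)) for z in dense(hidden, w2, b2)]


def se_block(feature_maps, w1, b1, w2, b2):
    """Rescale every channel by its gate, a value strictly between 0 and 1."""
    gates = excite(squeeze(feature_maps), w1, b1, w2, b2)
    scaled = [[[g * v for v in row] for row in fmap]
              for fmap, g in zip(feature_maps, gates)]
    return scaled, gates


# Hypothetical toy setup: 4 channels of 3x3 activations, reduction ratio 2.
rng = random.Random(0)
channels, height, width, reduction = 4, 3, 3, 2
hidden_units = channels // reduction

x = [[[rng.random() for _ in range(width)] for _ in range(height)]
     for _ in range(channels)]

# Random weights stand in for parameters that would normally be learned.
w1 = [[rng.uniform(-1, 1) for _ in range(channels)] for _ in range(hidden_units)]
b1 = [0.0] * hidden_units
w2 = [[rng.uniform(-1, 1) for _ in range(hidden_units)] for _ in range(channels)]
b2 = [0.0] * channels

y, gates = se_block(x, w1, b1, w2, b2)
print("channel gates:", [round(g, 3) for g in gates])
```

Each gate scales a whole channel, so channels the network learns to treat as informative keep most of their activation while the others are damped. In a trained model the two dense layers would be optimized jointly with the convolutional layers rather than drawn at random as they are here.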
Copyright (c) 2025 Luky Fabrianto, Tiwuk Wahyuli Prihandayani, Rasenda Rasenda, Novianti Madhona Faizah

This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.
						
							



