Comparison of naïve bayes and KNN for herbal leaf classification


  • Bangkit Indarmawan Nugroho, STMIK YMI Tegal, Indonesia
  • Muhammad Wazid Khusni, STMIK YMI Tegal, Indonesia
  • Pingky Septiana Ananda, STMIK YMI Tegal, Indonesia
  • Gunawan Gunawan, STMIK YMI Tegal, Indonesia



Keywords: Classification, GLCM, Herbal Leaf, K-Nearest Neighbor, Naïve Bayes


This study compares the effectiveness of two classification algorithms, the Naïve Bayes classifier and K-Nearest Neighbor (KNN), in classifying herbal leaves. The research design uses a quantitative approach with experimental analysis and model validation. The dataset consists of images of papaya, pandanus, cat's whiskers, and betel nut leaves taken under different lighting conditions. The methodology comprises pre-processing the data by converting the images to grayscale, extracting features with the Gray Level Co-occurrence Matrix (GLCM), and applying the Naïve Bayes and KNN algorithms. KNN achieved 90.00% accuracy, with precision, recall, and F1-score each at 88.33%, outperforming Naïve Bayes, which reached 82.50% accuracy, 81.46% precision, 85.83% recall, and an 82.27% F1-score. In conclusion, KNN classifies herbal leaves more accurately than Naïve Bayes, although it requires more computation time. Further research is recommended to optimize the algorithm parameters and to explore integrating deep learning techniques to improve classification accuracy and efficiency.
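The pre-processing and feature-extraction steps named in the abstract can be illustrated with a minimal sketch. The abstract does not state the pixel offset, the number of gray levels, or which GLCM statistics the study used, so the horizontal offset, the symmetric normalisation, and the three features below (contrast, angular second moment, homogeneity) are assumptions chosen for illustration, and all function names are hypothetical.

```python
from collections import Counter


def to_grayscale(rgb_image):
    """Convert rows of (R, G, B) tuples to gray levels 0-255.

    Uses the common ITU-R BT.601 luma weights (an assumption; the
    abstract does not say which conversion was applied).
    """
    return [[int(round(0.299 * r + 0.587 * g + 0.114 * b)) for r, g, b in row]
            for row in rgb_image]


def glcm(gray, levels, dx=1, dy=0):
    """Normalised, symmetric co-occurrence matrix for the offset (dx, dy)."""
    counts = Counter()
    rows, cols = len(gray), len(gray[0])
    for y in range(rows):
        for x in range(cols):
            ny, nx = y + dy, x + dx
            if 0 <= ny < rows and 0 <= nx < cols:
                i, j = gray[y][x], gray[ny][nx]
                counts[(i, j)] += 1
                counts[(j, i)] += 1  # count the pair in both directions
    total = sum(counts.values())
    return [[counts[(i, j)] / total for j in range(levels)]
            for i in range(levels)]


def glcm_features(p):
    """Texture statistics computed from a normalised GLCM."""
    n = len(p)
    cells = [(i, j, p[i][j]) for i in range(n) for j in range(n)]
    return {
        # Large when neighbouring pixels differ strongly in gray level.
        "contrast": sum(v * (i - j) ** 2 for i, j, v in cells),
        # Angular second moment: large for uniform, repetitive texture.
        "asm": sum(v * v for _, _, v in cells),
        # Large when most mass lies near the matrix diagonal.
        "homogeneity": sum(v / (1 + (i - j) ** 2) for i, j, v in cells),
    }


# Toy 4-level image standing in for a quantised grayscale leaf photo.
sample = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 2, 2, 2],
    [2, 2, 3, 3],
]
features = glcm_features(glcm(sample, levels=4))
```

Each image would yield one such feature dictionary, and the resulting vectors are what a Naïve Bayes or KNN classifier would be trained on. In practice the gray levels are usually quantised to a small number of bins first, since a 256 × 256 matrix is sparse for small images.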






How to Cite

Nugroho, B. I., Khusni, M. W., Ananda, P. S., & Gunawan, G. (2024). Comparison of naïve bayes and KNN for herbal leaf classification. Jurnal Mandiri IT, 13(1), 18–27.
