Lemon and Orange Disease Classification using CNN-Extracted Features and Machine Learning Classifier

Khandoker Nosiba Arifin, Sayma Akter Rupa, Md Musfique Anwar, Israt Jahan

TL;DR

This work tackles the automated classification of lemon and orange diseases by extracting features with pretrained CNNs (VGG16, VGG19, ResNet50) and classifying them with traditional ML algorithms (KNN, RF, NB, LR). The main finding is that using ResNet50 as the feature extractor combined with Logistic Regression yields the highest accuracies: $95\%$ for lemon and $99.69\%$ for orange, significantly outperforming single CNN classifiers and other baselines. The approach leverages public Kaggle datasets with four disease classes per fruit, demonstrating a practical, hybrid pipeline that improves early disease detection and potential agricultural decision-making. Limitations include small lemon sample size and reliance on external datasets; future work aims to collect larger, in-field data, extend class coverage, and explore disease severity assessment.

Abstract

Lemons and oranges are among the most economically significant citrus fruits globally. Their production is severely affected by diseases during the growth stages, and the resulting defects degrade fruit quality. Accurate diagnosis is therefore necessary to avoid major losses of lemons and oranges. To improve citrus farming, we propose a disease classification approach for lemons and oranges that enables early disease detection and intervention, reduces yield losses, and optimizes resource allocation. For the initial modeling, the research uses the pretrained deep learning architectures VGG16, VGG19 and ResNet50. To achieve better accuracy, the classification step then uses standard machine learning algorithms: Random Forest, Naive Bayes, K-Nearest Neighbors (KNN) and Logistic Regression. The best model classifies lemon and orange diseases with accuracies of 95.0% and 99.69%, respectively. Its base features are extracted by the pretrained ResNet50 model and classified by Logistic Regression, a combination that outperforms VGG16 and VGG19 features paired with the other classifiers. Experimental results show that the proposed model also outperforms existing models, most of which classify diseases with a Softmax classifier rather than a separate classifier.
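
The hybrid pipeline described in the abstract (pretrained CNN features fed to a classical classifier) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the function names `extract_features` and `compare_classifiers`, the 224x224 input size, the 80/20 stratified split, and all classifier hyperparameters are assumptions, and the demo at the bottom uses synthetic vectors in place of real image features.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier


def extract_features(images, weights="imagenet"):
    """Return one globally pooled ResNet50 feature vector per image.

    Assumption: `images` has shape (n, 224, 224, 3), RGB, values in 0-255.
    TensorFlow is imported lazily so the classifier code runs without it.
    """
    from tensorflow.keras.applications.resnet50 import ResNet50, preprocess_input

    # include_top=False drops the Softmax head; pooling="avg" yields flat vectors.
    backbone = ResNet50(weights=weights, include_top=False, pooling="avg")
    batch = preprocess_input(np.array(images, dtype="float32"))
    return backbone.predict(batch, verbose=0)


def compare_classifiers(features, labels, seed=0):
    """Fit KNN, RF, NB and LR on the same features; return test accuracy per model."""
    X_train, X_test, y_train, y_test = train_test_split(
        features, labels, test_size=0.2, stratify=labels, random_state=seed
    )
    # Hyperparameters are illustrative defaults, not values from the paper.
    classifiers = {
        "KNN": KNeighborsClassifier(n_neighbors=5),
        "RF": RandomForestClassifier(n_estimators=100, random_state=seed),
        "NB": GaussianNB(),
        "LR": LogisticRegression(max_iter=1000),
    }
    return {
        name: clf.fit(X_train, y_train).score(X_test, y_test)
        for name, clf in classifiers.items()
    }


if __name__ == "__main__":
    # Synthetic stand-in for CNN features: four classes, 50 samples each.
    rng = np.random.default_rng(0)
    labels = np.repeat(np.arange(4), 50)
    centers = rng.normal(scale=3.0, size=(4, 32))
    features = centers[labels] + rng.normal(size=(200, 32))
    for name, accuracy in compare_classifiers(features, labels).items():
        print(f"{name}: {accuracy:.3f}")
```

With real data, the output of `extract_features` for each fruit's images would replace the synthetic `features` array, and swapping `ResNet50` for `VGG16` or `VGG19` from `tensorflow.keras.applications` would give the other extractor variants for comparison.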

Paper Structure (12 sections, 9 figures, 8 tables)

Figures (9)

  • Figure 1: Proposed Model for Disease classification Approach.
  • Figure 2: Fruit Disease classes.
  • Figure 3: Visualization of lemon images with class label.
  • Figure 4: Visualization of orange images with class label.
  • Figure 5: Structure of Pretrained Model.
  • ...and 4 more figures