Identification of Traditional Medicinal Plant Leaves Using an effective Deep Learning model and Self-Curated Dataset

Deepjyoti Chetia, Sanjib Kr Kalita, Prof Partha Pratim Baruah, Debasish Dutta, Tanaz Akhter

TL;DR

The study tackles automated identification of medicinal plant leaves, many of which look alike, by proposing a custom CNN trained from scratch. It evaluates the model on three datasets (the Indian Medicinal Leaves Image Dataset, MED117, and a self-curated collection) using data augmentation and standardized preprocessing, with three optimizers (Adam, RMSprop, and SGD with momentum). The model achieves high accuracy on all three datasets (up to 99.7% on the self-curated set), with RMSprop and Adam highlighted as effective for convergence. A significant contribution is the large, curated dataset of 42,250 images across 50 plants, together with the demonstration that a compact, from-scratch CNN can surpass or match transfer-learning approaches in this domain, with potential for mobile applications and future transformer-based enhancements.

Abstract

Medicinal plants have been a key component in producing traditional and modern medicines, especially in Ayurveda, an ancient Indian medical system. Collecting and extracting the right plant is a crucial step in producing these medicines, because some plants are visually similar. Distinguishing medicinal plants from non-medicinal ones requires human expert intervention. Computer vision methods offer an efficient way to identify plants accurately and to reduce the need for a human expert in the collection process. In this paper, we propose a model that addresses these issues. The proposed model is a custom convolutional neural network (CNN) architecture with 6 convolution layers, max-pooling layers, and dense layers. The model was tested on three datasets: the Indian Medicinal Leaves Image Dataset, the MED117 Medicinal Plant Leaf Dataset, and a dataset self-curated by the authors. It achieved accuracies of 99.5%, 98.4%, and 99.7%, respectively, using various optimizers including Adam, RMSprop, and SGD with momentum.
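The abstract names three optimizers but does not spell out their update rules or hyperparameters. As a rough illustration only, the sketch below writes the textbook update rules for SGD with momentum, RMSprop, and Adam in plain Python for a single scalar parameter, and runs each on a toy quadratic. The function names, the learning rate of 0.05, and the toy objective are illustrative assumptions, not the authors' training setup.

```python
import math

# Textbook update rules for one scalar parameter w with gradient g.
# All hyperparameter defaults are common choices, assumed for illustration.

def sgd_momentum_step(w, g, state, lr=0.01, momentum=0.9):
    """SGD with momentum: accumulate a velocity, then move along it."""
    v = momentum * state.get("v", 0.0) - lr * g
    state["v"] = v
    return w + v

def rmsprop_step(w, g, state, lr=0.001, rho=0.9, eps=1e-8):
    """RMSprop: scale the step by a running average of squared gradients."""
    s = rho * state.get("s", 0.0) + (1 - rho) * g * g
    state["s"] = s
    return w - lr * g / (math.sqrt(s) + eps)

def adam_step(w, g, state, lr=0.001, beta1=0.9, beta2=0.999, eps=1e-8):
    """Adam: bias-corrected running averages of the gradient and its square."""
    t = state.get("t", 0) + 1
    m = beta1 * state.get("m", 0.0) + (1 - beta1) * g
    v = beta2 * state.get("v", 0.0) + (1 - beta2) * g * g
    state.update(t=t, m=m, v=v)
    m_hat = m / (1 - beta1 ** t)
    v_hat = v / (1 - beta2 ** t)
    return w - lr * m_hat / (math.sqrt(v_hat) + eps)

def minimize(step, w0=5.0, steps=2000, **kwargs):
    """Minimize the toy objective f(w) = w**2 (gradient 2*w) starting at w0."""
    w, state = w0, {}
    for _ in range(steps):
        w = step(w, 2.0 * w, state, **kwargs)
    return w

if __name__ == "__main__":
    for name, step in [("SGD+momentum", sgd_momentum_step),
                       ("RMSprop", rmsprop_step),
                       ("Adam", adam_step)]:
        print(name, minimize(step, lr=0.05))
```

On this toy problem all three rules drive `w` toward the minimum at 0. RMSprop with a constant learning rate tends to hover within roughly a learning-rate-sized band around it rather than settling exactly. Behaviour on a real CNN loss surface can differ considerably.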

Paper Structure (23 sections, 5 figures, 3 tables)

Figures (5)

  • Figure 1: Proposed methodology
  • Figure 2: Images of the self-curated dataset
  • Figure 3: Image data augmentation process
  • Figure 4: Proposed CNN architecture
  • Figure 5: Training and validation accuracy vs. epoch of the model on three datasets