Deep-Wide Learning Assistance for Insect Pest Classification

Toan Nguyen; Huy Nguyen; Huy Ung; Hieu Ung; Binh Nguyen

TL;DR

DeWi introduces a Deep-Wide learning framework for insect pest classification that alternates between a Deep step (triplet-margin loss with a multi-level feature extractor) and a Wide step (Mixup augmentation) to jointly boost discrimination and generalization. The Deep step forms 8192-d embeddings from dual projectors and optimizes a combined loss $L = \beta_1 L_{CE} + \beta_2 L_T$, where $L_T$ is the batch-hard triplet margin loss with distance $D$ and margin $m$, while the Wide step uses Mixup to generate augmented samples with mixed labels. Empirically, DeWi achieves state-of-the-art results on IP102 (Acc $=76.44\%$, $mF1=69.46$, $GM=65.07$) and D0 (Acc $=99.79\%$), and extensive ablations validate the importance of each component, the chosen margin, and the superiority of the one-stage approach over self-supervised pretraining. The framework is backbone-friendly, efficient, and scalable, with potential extensions to Vision Transformers and additional augmentation and contrastive losses, offering practical value for smart agriculture deployments.
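The Deep-step objective summarized above can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: it assumes that $D$ is the Euclidean distance and that "batch-hard" means each anchor is paired with its farthest same-class sample and its nearest different-class sample in the batch. The function names, the default margin, and the weights `beta_ce` and `beta_t` (standing in for $\beta_1$ and $\beta_2$) are hypothetical placeholders, not values from the paper.

```python
import numpy as np


def batch_hard_triplet_loss(embeddings, labels, margin=1.0):
    """Batch-hard triplet margin loss (assumes Euclidean distance for D)."""
    emb = np.asarray(embeddings, dtype=np.float64)
    labels = np.asarray(labels)

    # Pairwise Euclidean distances between all embeddings in the batch.
    diff = emb[:, None, :] - emb[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))

    same = labels[:, None] == labels[None, :]
    pos_mask = same & ~np.eye(len(labels), dtype=bool)  # same class, not self
    neg_mask = ~same                                    # different class

    # Hardest positive: farthest same-class sample.
    hardest_pos = np.where(pos_mask, dist, -np.inf).max(axis=1)
    # Hardest negative: nearest different-class sample.
    hardest_neg = np.where(neg_mask, dist, np.inf).min(axis=1)

    # Anchors need at least one positive and one negative in the batch.
    valid = pos_mask.any(axis=1) & neg_mask.any(axis=1)
    if not valid.any():
        return 0.0
    per_anchor = np.maximum(0.0, margin + hardest_pos[valid] - hardest_neg[valid])
    return float(per_anchor.mean())


def cross_entropy(logits, labels):
    """Mean cross-entropy from raw logits, using a stable log-softmax."""
    logits = np.asarray(logits, dtype=np.float64)
    shifted = logits - logits.max(axis=1, keepdims=True)
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    return float(-log_probs[np.arange(len(labels)), np.asarray(labels)].mean())


def combined_loss(logits, embeddings, labels, beta_ce=1.0, beta_t=1.0, margin=1.0):
    """L = beta_ce * L_CE + beta_t * L_T (weights are hypothetical defaults)."""
    return (beta_ce * cross_entropy(logits, labels)
            + beta_t * batch_hard_triplet_loss(embeddings, labels, margin))
```

In this sketch, anchors without any same-class or different-class partner in the batch are skipped, which is one common convention; the paper may handle such anchors differently.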

Abstract

Accurate insect pest recognition plays a critical role in agriculture. It is a challenging problem due to the intricate characteristics of insects. In this paper, we present DeWi, a novel learning assistance method for insect pest classification. With a one-stage, alternating training strategy, DeWi simultaneously improves several Convolutional Neural Networks from two perspectives: discrimination (by optimizing a triplet margin loss in a supervised manner) and generalization (via data augmentation). As a result, DeWi learns discriminative, in-depth features of insect pests (deep) yet still generalizes well to a large number of insect categories (wide). Experimental results show that DeWi achieves the highest performance on two insect pest classification benchmarks: 76.44\% accuracy on the IP102 dataset and 99.79\% accuracy on the D0 dataset. In addition, extensive evaluations and ablation studies thoroughly investigate DeWi and demonstrate its superiority. Our source code is available at https://github.com/toannguyen1904/DeWi.

Paper Structure (19 sections, 10 equations, 10 figures, 6 tables, 1 algorithm)

Introduction
Related Works
Insect Pest Classification
Contrastive Learning
Image Augmentation
Methodology
Deep step
Wide step
Experiments
Datasets
Evaluation metrics
Experimental settings
Compare with other state-of-the-art methods
Compare with baseline residual networks
Activation map
...and 4 more sections

Figures (10)

Figure 1: Overview of our proposed method. During training, the Deep step and the Wide step are applied alternately within an epoch. We use the same network architecture in both steps. The triplet margin loss is computed in the Deep step, while the Mixup data augmentation is employed in the Wide step. At inference, the input image is fed through the entire network, from the multi-level feature extractor $F(\cdot)$ to the linear layer and the softmax layer, to get the final prediction.
Figure 2: The detailed architecture of our projector network. Both the high-level and low-level projectors produce 4096-dim vectors.
Figure 3: Examples of images from IP102 dataset augmented by Mixup. Top: mix of two images of different classes. Bottom: mix of two images of the same class.
Figure 4: Examples of six insect pests in D0 dataset.
Figure 5: Samples of IP102 dataset. Column (a) presents three different morphologies corresponding to three developmental stages of the same worm species. Column (b) shows examples of three different butterfly species, but their appearances are particularly hard to distinguish. Column (c) gives examples of damaged crop fields without the appearance of insect pests. Column (d) shows images of small-scale insects on noisy backgrounds.
...and 5 more figures
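
The Mixup augmentation referenced in the captions of Figures 1 and 3 can be sketched as follows. This is a minimal illustration of standard Mixup (a convex combination of two images and their one-hot labels, with the mixing coefficient drawn from a Beta distribution), not the authors' code. The function name `mixup_batch` and the default `alpha=1.0` are hypothetical; the paper's actual hyperparameters are not given in this excerpt.

```python
import numpy as np


def mixup_batch(images, labels, num_classes, alpha=1.0, rng=None):
    """Mix each sample with a randomly chosen partner from the same batch.

    Returns the mixed images, the mixed (soft) labels, and the coefficient lam.
    alpha is a hypothetical default for the Beta(alpha, alpha) distribution.
    """
    rng = np.random.default_rng() if rng is None else rng
    images = np.asarray(images, dtype=np.float64)
    one_hot = np.eye(num_classes)[np.asarray(labels)]

    lam = rng.beta(alpha, alpha)         # mixing coefficient in [0, 1]
    perm = rng.permutation(len(images))  # partner index for every sample

    mixed_images = lam * images + (1.0 - lam) * images[perm]
    mixed_labels = lam * one_hot + (1.0 - lam) * one_hot[perm]
    return mixed_images, mixed_labels, lam
```

When both partners share a class, as in the bottom row of Figure 3, the mixed label remains one-hot; when they differ, the label mass is split between the two classes in proportion to `lam`.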
