ConMamba: Contrastive Vision Mamba for Plant Disease Detection
Abdullah Al Mamun, Miaohua Zhang, David Ahmedt-Aristizabal, Zeeshan Hayder, Mohammad Awrangjeb
TL;DR
ConMamba addresses plant disease detection (PDD) when labels are scarce by combining a Vision Mamba encoder, built on a bidirectional State Space Model, with a dual-level contrastive loss whose weights are adjusted dynamically. The encoder models long-range context efficiently, and the loss aligns local and global features, which improves the representations learned from unlabeled plant images. On PlantVillage, PlantDoc, and Citrus, the method reports state-of-the-art accuracy and F1 scores, along with strong qualitative localization in class activation maps (CAMs). The approach targets scalable, real-world PDD in precision agriculture, with attention to deployment efficiency and robustness to class imbalance.
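To make the idea of a bidirectional State Space Model concrete, the following is a minimal sketch, not the authors' encoder. It assumes a scalar linear recurrence with fixed, hypothetical parameters `a`, `b`, and `c`, runs it once left-to-right and once right-to-left over a token sequence, and sums the two outputs so that every position sees context from both directions at a cost linear in sequence length. Mamba-style models use learned, input-dependent parameters and vector-valued states, which are omitted here.

```python
from typing import List


def ssm_scan(x: List[float], a: float, b: float, c: float) -> List[float]:
    """Linear recurrence h_t = a * h_{t-1} + b * x_t with output y_t = c * h_t."""
    h = 0.0
    ys = []
    for x_t in x:
        h = a * h + b * x_t  # state carries a decaying summary of earlier inputs
        ys.append(c * h)
    return ys


def bidirectional_scan(x: List[float], a: float = 0.5, b: float = 1.0, c: float = 1.0) -> List[float]:
    """Sum of a forward scan and a backward scan (illustrative parameters only)."""
    forward = ssm_scan(x, a, b, c)
    # Scan the reversed sequence, then flip the result back to the original order.
    backward = ssm_scan(x[::-1], a, b, c)[::-1]
    return [f + g for f, g in zip(forward, backward)]
```

For the impulse input `[1.0, 0.0, 0.0]`, the forward pass spreads the signal to later positions, while the backward pass contributes only at the first position, since nothing precedes it in reversed order.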
Abstract
Plant Disease Detection (PDD) is a key aspect of precision agriculture. However, existing deep learning methods often rely on extensively annotated datasets, which are time-consuming and costly to produce. Self-supervised Learning (SSL) offers a promising alternative by exploiting the abundance of unlabeled data. Most existing SSL approaches, though, incur high computational costs because they are built on convolutional neural networks or transformer-based architectures. They also struggle to capture long-range dependencies in visual representations and rely on static loss functions that fail to align local and global features effectively. To address these challenges, we propose ConMamba, a novel SSL framework specifically designed for PDD. ConMamba integrates the Vision Mamba Encoder (VME), which employs a bidirectional State Space Model (SSM) to capture long-range dependencies efficiently. Furthermore, we introduce a dual-level contrastive loss with dynamic weight adjustment to optimize local-global feature alignment. Experimental results on three benchmark datasets demonstrate that ConMamba significantly outperforms state-of-the-art methods across multiple evaluation metrics, providing an efficient and robust solution for PDD.
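The abstract does not give the form of the dual-level contrastive loss, so the sketch below is only one plausible reading under stated assumptions: each level uses a standard InfoNCE loss over two augmented views, the global level compares image-level embeddings, the local level compares patch-level embeddings, and the dynamic weight is a hypothetical rule that emphasizes whichever level currently has the larger loss. The function names, the temperature value, and the weighting rule are all illustrative and are not taken from the paper.

```python
import math
from typing import Sequence

Vector = Sequence[float]


def cosine(u: Vector, v: Vector) -> float:
    """Cosine similarity with a small epsilon to avoid division by zero."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v + 1e-12)


def info_nce(anchors: Sequence[Vector], positives: Sequence[Vector], temperature: float = 0.1) -> float:
    """Mean InfoNCE loss: anchors[i] should match positives[i]; the rest act as negatives."""
    total = 0.0
    for i, anchor in enumerate(anchors):
        logits = [cosine(anchor, p) / temperature for p in positives]
        peak = max(logits)  # subtract the max for a numerically stable log-sum-exp
        log_denominator = peak + math.log(sum(math.exp(l - peak) for l in logits))
        total += log_denominator - logits[i]
    return total / len(anchors)


def dual_level_loss(global_a, global_b, local_a, local_b, temperature: float = 0.1) -> float:
    """Combine global and local InfoNCE terms with a loss-dependent weight (assumed rule)."""
    loss_global = info_nce(global_a, global_b, temperature)
    loss_local = info_nce(local_a, local_b, temperature)
    # Hypothetical dynamic weight: shift emphasis toward the level with the larger loss.
    weight_local = loss_local / (loss_local + loss_global + 1e-12)
    return (1.0 - weight_local) * loss_global + weight_local * loss_local
```

With a static loss, `weight_local` would be a fixed hyperparameter; recomputing it from the current losses at every step is what makes the weighting dynamic in this sketch.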
