AgriPath: A Systematic Exploration of Architectural Trade-offs for Crop Disease Classification

Hamza Mooraj; George Pantazopoulos; Alessandro Suglia

Abstract

Reliable crop disease detection requires models that perform consistently across diverse acquisition conditions, yet existing evaluations often focus on single architectural families or lab-generated datasets. This work presents a systematic empirical comparison of three model paradigms for fine-grained crop disease classification: Convolutional Neural Networks (CNNs), contrastive Vision-Language Models (VLMs), and generative VLMs. To enable controlled analysis of domain effects, we introduce AgriPath-LF16, a benchmark containing 111k images spanning 16 crops and 41 diseases with explicit separation between laboratory and field imagery, alongside a balanced 30k subset for standardized training and evaluation. All models are trained and evaluated under unified protocols across full, lab-only, and field-only training regimes using macro-F1 and Parse Success Rate (PSR) to account for generative reliability. The results reveal distinct performance profiles. CNNs achieve the highest accuracy on lab imagery but degrade under domain shift. Contrastive VLMs provide a robust and parameter-efficient alternative with competitive cross-domain performance. Generative VLMs demonstrate the strongest resilience to distributional variation, albeit with additional failure modes stemming from free-text generation. These findings highlight that architectural choice should be guided by deployment context rather than aggregate accuracy alone.
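The two metrics named in the abstract can be illustrated with a minimal sketch. It rests on assumptions not stated in this section: PSR is taken to be the fraction of generated outputs that parse into a valid class label, unparsed outputs are counted as misclassifications when computing macro-F1, and the exact-match parser `parse_label` and the two example labels are hypothetical stand-ins rather than the authors' implementation.

```python
def parse_label(text, labels):
    """Hypothetical parser: exact match after trimming and lowercasing, else None."""
    cleaned = text.strip().lower()
    return cleaned if cleaned in labels else None


def parse_success_rate(outputs, labels):
    """Assumed PSR definition: share of outputs that parse into a known label."""
    parsed = [parse_label(o, labels) for o in outputs]
    return sum(p is not None for p in parsed) / len(parsed)


def macro_f1(y_true, y_pred, labels):
    """Unweighted mean of per-class F1; a None prediction never matches a class."""
    scores = []
    for c in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(t == c and p == c for t, p in pairs)
        fp = sum(t != c and p == c for t, p in pairs)
        fn = sum(t == c and p != c for t, p in pairs)
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Illustrative data only (not taken from the paper).
LABELS = ["tomato bacterial spot", "corn common rust"]
y_true = ["tomato bacterial spot", "corn common rust",
          "corn common rust", "tomato bacterial spot"]
outputs = ["Tomato Bacterial Spot", "corn common rust ",
           "I think it is rust", "corn common rust"]

y_pred = [parse_label(o, LABELS) for o in outputs]
print("PSR:", parse_success_rate(outputs, LABELS))
print("macro-F1:", macro_f1(y_true, y_pred, LABELS))
```

Reporting the two numbers together separates classification errors from formatting failures: a generative model can recognize the disease correctly yet still lose macro-F1 because its free-text answer does not parse.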

Paper Structure (48 sections, 6 figures, 10 tables)

Introduction
Related Work
Datasets for Crop Disease Classification
CNN-Based Approaches
Vision–Language Models for Agricultural Tasks
Building VLMs for Crop Disease Classification
Dataset
CNN Baseline
Contrastive VLM Experiments
Generative VLM Experiments
Evaluation & Analysis
Evaluation Setup
CNN-Based Models under Domain Shift
Contrastive Vision-Language Models
Generative Vision-Language Models
...and 33 more sections

Figures (6)

Figure 1: Example of source difference in AgriPath-LF16. Left: Lab-sourced image of Tomato with Bacterial Spot. Right: Field-sourced image of the same disease, illustrating background clutter and lighting variation.
Figure 2: Class and source distribution of AgriPath-LF16-30k across 65 crop-disease pairs. Blue denotes lab-sourced samples and orange denotes field-sourced samples.
Figure 3: A field image of corn with common rust, shown with the predicted probabilities from a CNN and from CLIP.
Figure 4: A field image of potato with an ambiguous blight, for which the CLIP prediction is uncertain.
Figure 5: Top confusion pairs are discussed in \ref{fail_cnn}. Actual labels are on the y-axis and predicted labels are on the x-axis.
...and 1 more figure
