BBoxCut: A Targeted Data Augmentation Technique for Enhancing Wheat Head Detection Under Occlusions
Yasashwini Sai Gowri P, Karthik Seemakurthy, Andrews Agyemang Opoku, Sita Devi Bharatula
TL;DR
The paper addresses robust wheat head detection under occlusion in field imagery and proposes BBoxCut, a targeted data augmentation that masks regions inside non-overlapping bounding boxes with a histogram-based dominant color to simulate realistic occluders. Evaluated with Faster R-CNN, FCOS, and DETR on the GWHD 2021 dataset, BBoxCut improves mAP by $2.76$, $3.26$, and $1.90$ percentage points, respectively, and outperforms other masking-based augmentation methods. The approach is detector-agnostic and relies on three domain-specific steps: identifying non-overlapping boxes, estimating a representative mask color, and applying controlled masking. Together, these steps produce diverse occlusion scenarios that improve generalization. The method has practical impact for phenotyping workflows, enabling more accurate yield estimation and selection of wheat varieties under real-world occlusion conditions, with potential applicability to other crops and occlusion-heavy detection tasks.
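The three steps named above (finding non-overlapping boxes, estimating a dominant color from a histogram, and masking inside the selected boxes) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. The function names, the coarse joint RGB histogram computed over the whole image, the rectangular mask shape, and the parameters `bins`, `p`, and `max_frac` are all assumptions made for illustration; boxes are assumed to be integer `(x1, y1, x2, y2)` tuples with positive width and height.

```python
import numpy as np


def boxes_overlap(a, b):
    """True if two (x1, y1, x2, y2) boxes share a region of positive area."""
    return min(a[2], b[2]) > max(a[0], b[0]) and min(a[3], b[3]) > max(a[1], b[1])


def non_overlapping_boxes(boxes):
    """Indices of boxes that do not intersect any other box in the list."""
    return [
        i for i, a in enumerate(boxes)
        if not any(boxes_overlap(a, b) for j, b in enumerate(boxes) if j != i)
    ]


def dominant_color(image, bins=8):
    """Centre of the most populated cell of a coarse joint RGB histogram.

    Assumption: the histogram is taken over the whole uint8 image.
    """
    step = 256 // bins
    q = (image.reshape(-1, 3) // step).astype(np.int64)
    flat = (q[:, 0] * bins + q[:, 1]) * bins + q[:, 2]
    top = int(np.bincount(flat, minlength=bins ** 3).argmax())
    idx = np.array([top // (bins * bins), (top // bins) % bins, top % bins])
    return (idx * step + step // 2).astype(np.uint8)


def bbox_cut(image, boxes, rng, p=0.5, max_frac=0.5):
    """Return a copy of `image` with random patches masked inside isolated boxes.

    p        -- probability of masking each non-overlapping box (assumed)
    max_frac -- maximum patch size as a fraction of box width/height (assumed)
    """
    out = image.copy()
    color = dominant_color(image)
    for i in non_overlapping_boxes(boxes):
        if rng.random() >= p:
            continue
        x1, y1, x2, y2 = boxes[i]
        w, h = x2 - x1, y2 - y1
        # Patch size is at least 1 pixel and at most max_frac of the box side.
        mw = int(rng.integers(1, max(2, int(w * max_frac) + 1)))
        mh = int(rng.integers(1, max(2, int(h * max_frac) + 1)))
        # Place the patch so that it stays entirely inside the box.
        mx = int(rng.integers(x1, x2 - mw + 1))
        my = int(rng.integers(y1, y2 - mh + 1))
        out[my:my + mh, mx:mx + mw] = color
    return out


# Example call on a synthetic image; only the third box is isolated.
image = np.full((64, 64, 3), (40, 160, 40), dtype=np.uint8)
boxes = [(2, 2, 20, 20), (10, 10, 30, 30), (40, 40, 60, 60)]
augmented = bbox_cut(image, boxes, np.random.default_rng(0), p=1.0)
```

Because only pixel values change and the box annotations are left untouched, a sketch like this could sit in front of any detector's data loader, which is consistent with the detector-agnostic claim above.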
Abstract
Wheat plays a critical role in global food security, making it one of the most extensively studied crops. Accurate identification and measurement of key characteristics of wheat heads are essential for breeders to select varieties for cross-breeding, with the goal of developing nutrient-dense, resilient, and sustainable cultivars. Traditionally, these measurements are performed manually, which is time-consuming and inefficient. Advances in digital technologies have paved the way for automating this process. However, field conditions pose significant challenges, such as occlusion by leaves, overlapping wheat heads, varying lighting conditions, and motion blur. In this paper, we propose a novel data augmentation technique, BBoxCut, which uses random localized masking to simulate occlusions caused by leaves and neighboring wheat heads. We evaluated our approach using three state-of-the-art object detectors and observed mean average precision (mAP) gains of 2.76, 3.26, and 1.90 percentage points for Faster R-CNN, FCOS, and DETR, respectively. Our augmentation technique led to significant improvements both qualitatively and quantitatively. The improvements were most evident in scenarios involving occluded wheat heads, demonstrating the robustness of our method in challenging field conditions.
