CitDet: A Benchmark Dataset for Citrus Fruit Detection
Jordan A. James, Heather K. Manching, Matthew R. Mattia, Kim D. Bowman, Amanda M. Hulse-Kemp, William J. Beksi
TL;DR
CitDet introduces a high-resolution benchmark dataset for citrus fruit detection in orchards affected by Huanglongbing (HLB), featuring bounding-box annotations for fruit on trees and on the ground across 579 images and 32,448 boxes, plus tree-level yield metadata. The authors benchmark several state-of-the-art detectors on full-resolution and tiled-image variants, finding that tiling improves detection performance and that YOLOv7 generally outperforms the other detectors, though Faster R-CNN remains strong for small objects. They demonstrate yield estimation from detections using front/back views, reporting substantial correlations with ground-truth yields, especially with the filter-detect-count method (per-tree $R^2$ up to $0.793$). CitDet thus provides a resource for citrus detection, yield mapping, and HLB impact assessment in real orchard conditions, enabling fairer cross-method comparisons and supporting progress toward automated harvesting and monitoring.
Abstract
In this letter, we present a new dataset to advance the state of the art in detecting citrus fruit and accurately estimating yield on trees affected by the Huanglongbing (HLB) disease in orchard environments via imaging. Although significant progress has been made in solving the fruit detection problem, the lack of publicly available datasets has complicated direct comparison of results. For instance, citrus detection has long been of interest to the agricultural research community, yet there is an absence of work, particularly work involving public datasets of citrus affected by HLB. To address this issue, we enhance state-of-the-art object detection methods for use in typical orchard settings. Concretely, we provide high-resolution images of citrus trees located in an area known to be highly affected by HLB, along with high-quality bounding box annotations of citrus fruit. Fruit on both the trees and the ground are labeled to allow for identification of fruit location, which contributes to advancements in yield estimation and offers a potential measure of HLB impact via fruit drop. The dataset consists of over 32,000 bounding box annotations for fruit instances contained in 579 high-resolution images. In summary, our contributions are the following: (i) we introduce a novel dataset along with baseline performance benchmarks on multiple contemporary object detection algorithms, (ii) we show the ability to accurately capture fruit location on the tree or on the ground, and finally (iii) we present a correlation of our results with yield estimations.
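To make the correlation in contribution (iii) concrete, the following is a minimal sketch of how per-tree detection counts could be compared against ground-truth yields using the coefficient of determination of a simple linear fit. It is an illustration under stated assumptions, not the authors' pipeline: the function names (`combine_views`, `fit_line`, `r_squared`) are hypothetical, summing front-view and back-view counts is one assumed way to obtain a per-tree count, and the numbers in the usage example are synthetic placeholders rather than values from the dataset.

```python
def combine_views(front_counts, back_counts):
    """Assumed aggregation: per-tree count = front-view + back-view detections."""
    if len(front_counts) != len(back_counts):
        raise ValueError("front and back lists must have one entry per tree")
    return [f + b for f, b in zip(front_counts, back_counts)]


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need at least two paired observations")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("all x values are identical; slope is undefined")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def r_squared(xs, ys):
    """R^2 of the least-squares line predicting ys from xs."""
    slope, intercept = fit_line(xs, ys)
    mean_y = sum(ys) / len(ys)
    ss_res = sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    if ss_tot == 0:
        raise ValueError("all y values are identical; R^2 is undefined")
    return 1.0 - ss_res / ss_tot


if __name__ == "__main__":
    # Synthetic toy values for illustration only (not from CitDet).
    front = [40, 55, 30, 62]
    back = [35, 50, 28, 70]
    true_yield = [160, 215, 120, 270]
    counts = combine_views(front, back)
    print(f"R^2 = {r_squared(counts, true_yield):.3f}")
```

For a single-predictor linear fit, this $R^2$ equals the squared Pearson correlation between the counts and the yields, so either quantity could be reported.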
