Dual-band feature selection for maturity classification of specialty crops by hyperspectral imaging
Usman A. Zahidi, Krystian Łukasik, Grzegorz Cielniak
TL;DR
This work tackles strawberry and tomato maturity classification using VNIR hyperspectral imaging, proposing a dual-band feature-extraction approach that targets pigment-band ($510$--$670$ nm) and chlorophyll-band ($671$--$790$ nm) extrema. By performing an exhaustive search over fixed-width subbands and computing a compact feature vector of extrema values and their wavelengths, the method achieves high accuracy (strawberry ≈ $98.7\%$, tomato ≈ $96.8\%$) without per-prediction preprocessing such as dimensionality reduction. Compared with full-spectrum CNNs and SVMs, the proposed features improve accuracy and run roughly 11 times faster (≈13 FPS vs ≈1.16 FPS for the full-spectrum SVM), making real-time deployment practical. A public dataset of over 1100 VNIR hyperspectral images is released together with the accompanying code, supporting benchmarking and further research. The results indicate that targeted spectral features that include extremum-position information can drive fast, cost-effective maturity classification for selective harvesting and QC in packaging facilities.
Abstract
The maturity classification of specialty crops such as strawberries and tomatoes is an essential downstream agricultural activity for selective harvesting and quality control (QC) at production and packaging sites. Recent advances in Deep Learning (DL) have produced encouraging results for maturity classification from color images. However, hyperspectral imaging (HSI) outperforms methods based on color vision. Multivariate analysis methods and Convolutional Neural Networks (CNNs) deliver promising results; however, the large amount of input data and the associated preprocessing requirements hinder practical application. Conventionally, the reflectance intensity in a given region of the electromagnetic spectrum is used to estimate fruit maturity. We present a feature extraction method and demonstrate empirically that two pairs of quantities are easy to compute yet distinctive for maturity classification: the peak reflectance within a subband such as 500-670 nm (pigment band) together with the wavelength at which that peak occurs, and, conversely, the trough reflectance within 671-790 nm (chlorophyll band) together with its wavelength. The feature set is designed to capture these traits. The proposed feature selection method is beneficial because preprocessing, such as dimensionality reduction, is not required before every prediction. The best SOTA methods, among 3D-CNN, 1D-CNN, and SVM, achieve at most 90.0% accuracy for strawberries and 92.0% for tomatoes on our dataset. Results show that the proposed method outperforms the SOTA, yielding an accuracy above 98.0% in strawberry and above 96.0% in tomato classification. A comparative analysis of time efficiency shows that the proposed method performs prediction at 13 Frames Per Second (FPS), compared with the maximum of 1.16 FPS attained by the full-spectrum SVM classifier.
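
To make the four features described above concrete, the following is a minimal sketch of how they could be computed for a single reflectance spectrum. It is an illustration under stated assumptions, not the authors' released implementation: the function name `dual_band_features`, the plain-list input format, and the tie-breaking rule (first extremum wins) are hypothetical, and the band limits are the ones quoted in the abstract. The exhaustive subband search and the downstream classifier are not shown.

```python
def dual_band_features(wavelengths_nm, reflectance,
                       pigment_band=(500.0, 670.0),
                       chlorophyll_band=(671.0, 790.0)):
    """Return [peak value, peak wavelength, trough value, trough wavelength].

    Hypothetical helper: band limits follow the abstract; everything else
    (input format, tie handling) is an assumption made for illustration.
    """
    if len(wavelengths_nm) != len(reflectance):
        raise ValueError("wavelengths and reflectance must have equal length")

    def indices_in(band):
        # Collect indices of samples whose wavelength lies inside the band.
        low, high = band
        idx = [i for i, w in enumerate(wavelengths_nm) if low <= w <= high]
        if not idx:
            raise ValueError(f"no samples fall inside band {band}")
        return idx

    # Peak (maximum reflectance) inside the pigment band.
    peak_i = max(indices_in(pigment_band), key=lambda i: reflectance[i])
    # Trough (minimum reflectance) inside the chlorophyll band.
    trough_i = min(indices_in(chlorophyll_band), key=lambda i: reflectance[i])

    return [reflectance[peak_i], wavelengths_nm[peak_i],
            reflectance[trough_i], wavelengths_nm[trough_i]]


# Synthetic spectrum, invented for illustration only (not measured data).
wavelengths = [480, 520, 560, 600, 640, 680, 720, 760, 800]
spectrum = [0.10, 0.15, 0.30, 0.22, 0.18, 0.05, 0.12, 0.40, 0.55]

features = dual_band_features(wavelengths, spectrum)
print(features)  # [0.3, 560, 0.05, 680]
```

Because each spectrum reduces to four numbers found by a single pass over two subbands, no dimensionality-reduction step is needed before prediction, which is consistent with the speed advantage reported in the abstract.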
