Deep Feature Optimization for Enhanced Fish Freshness Assessment
Phi-Hung Hoang, Nam-Thuan Trinh, Van-Manh Tran, Thi-Thu-Hong Phan
TL;DR
This work tackles automated fish freshness assessment by introducing a three-stage deep feature optimization framework that combines fine-tuned backbones, multi-level feature extraction, embedded feature selection, and classical classifiers, with Grad-CAM for interpretability. The study systematically evaluates ResNet-50, DenseNet-121, EfficientNet-B0, Swin-Tiny, and ConvNeXt-Base, demonstrating that high-level features and transformer-based representations yield superior discriminative power when paired with ensemble machine learning (ML) methods, particularly Extra Trees (ET). LightGBM-driven feature selection can further reduce dimensionality with minimal or even positive impact on accuracy, achieving a peak accuracy of 85.99% on the Freshness of the Fish Eyes (FFE) dataset using Swin-Tiny features and an ET classifier. Overall, the framework delivers improved accuracy over existing methods, enhanced interpretability, and practical deployment potential for visual quality evaluation in seafood workflows, while acknowledging dataset limitations and proposing future feature-fusion directions.
Abstract
Assessing fish freshness is vital for ensuring food safety and minimizing economic losses in the seafood industry. However, traditional sensory evaluation remains subjective, time-consuming, and inconsistent. Although recent advances in deep learning have automated visual freshness prediction, challenges related to accuracy and feature transparency persist. This study introduces a unified three-stage framework that refines and leverages deep visual representations for reliable fish freshness assessment. First, five state-of-the-art vision architectures (ResNet-50, DenseNet-121, EfficientNet-B0, ConvNeXt-Base, and Swin-Tiny) are fine-tuned to establish a strong baseline. Next, multi-level deep features extracted from these backbones are used to train seven classical machine learning classifiers, integrating deep and traditional decision mechanisms. Finally, feature selection methods based on Light Gradient Boosting Machine (LGBM), Random Forest, and Lasso identify a compact and informative subset of features. Experiments on the Freshness of the Fish Eyes (FFE) dataset demonstrate that the best configuration, which combines Swin-Tiny features, an Extra Trees classifier, and LGBM-based feature selection, achieves an accuracy of 85.99%, outperforming recent studies on the same dataset by 8.69% to 22.78%. These findings confirm the effectiveness and generalizability of the proposed framework for visual quality evaluation tasks.
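
The second and third stages described above (classical classifiers trained on deep features, followed by embedded feature selection) might be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the features are synthetic stand-ins generated with scikit-learn rather than real backbone outputs, the selector is Random Forest-based (one of the three selection methods named above) to keep the example free of extra dependencies, and the function name `select_and_classify`, the feature dimension, the class count, and all hyperparameters are hypothetical choices.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import ExtraTreesClassifier, RandomForestClassifier
from sklearn.feature_selection import SelectFromModel
from sklearn.model_selection import train_test_split


def select_and_classify(X_train, y_train, X_test, y_test, seed=0):
    """Embedded feature selection followed by an Extra Trees classifier.

    Hypothetical helper: hyperparameters are illustrative, not from the paper.
    """
    # Keep features whose tree-based importance is at least the median importance.
    selector = SelectFromModel(
        RandomForestClassifier(n_estimators=100, random_state=seed),
        threshold="median",
    )
    X_train_sel = selector.fit_transform(X_train, y_train)
    X_test_sel = selector.transform(X_test)

    # Train the classical classifier on the reduced feature set only.
    clf = ExtraTreesClassifier(n_estimators=200, random_state=seed)
    clf.fit(X_train_sel, y_train)
    return selector, clf, clf.score(X_test_sel, y_test)


# Synthetic stand-in for deep features (assumption: 64-dim vectors, 3 classes).
X, y = make_classification(
    n_samples=300, n_features=64, n_informative=10, n_classes=3, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

selector, clf, acc = select_and_classify(X_train, y_train, X_test, y_test)
n_kept = int(selector.get_support().sum())
print(f"kept {n_kept} of {X.shape[1]} features, test accuracy = {acc:.3f}")
```

In a real pipeline, `X` would instead hold feature vectors extracted from a fine-tuned backbone, and the selector's base estimator could be swapped for an LGBM or Lasso model to mirror the other selection methods.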
