ELMF4EggQ: Ensemble Learning with Multimodal Feature Fusion for Non-Destructive Egg Quality Assessment
Md Zahim Hassan, Md. Osama, Muhammad Ashad Kabir, Md. Saiful Islam, Zannatul Naim
TL;DR
This work targets non-destructive egg quality assessment by predicting internal grade and freshness from external attributes. It introduces ELMF4EggQ, a multimodal ensemble framework that fuses CNN-derived image features with shape and weight data, augmented by SMOTE and PCA, and evaluated via 10-fold cross-validation. A new 186-sample dataset with HU- and YI-based labels is released to support supervised learning and reproducibility. Results show the multimodal ensemble achieves 86.57% accuracy for grade and 70.83% for freshness, demonstrating the added value of combining visual cues with physical attributes. The study highlights practical potential for real-world processing facilities and outlines future work on larger datasets and advanced fusion techniques.
Abstract
Accurate, non-destructive assessment of egg quality is critical for ensuring food safety, maintaining product standards, and operational efficiency in commercial poultry production. This paper introduces ELMF4EggQ, an ensemble learning framework that employs multimodal feature fusion to classify egg grade and freshness using only external attributes - image, shape, and weight. A novel, publicly available dataset of 186 brown-shelled eggs was constructed, with egg grade and freshness levels determined through laboratory-based expert assessments involving internal quality measurements, such as yolk index and Haugh unit. To the best of our knowledge, this is the first study to apply machine learning methods for internal egg quality assessment using only external, non-invasive features, and the first to release a corresponding labeled dataset. The proposed framework integrates deep features extracted from external egg images with structural characteristics such as egg shape and weight, enabling a comprehensive representation of each egg. Image feature extraction is performed using top-performing pre-trained CNN models (ResNet152, DenseNet169, and ResNet152V2), followed by PCA-based dimensionality reduction, SMOTE augmentation, and classification using multiple machine learning algorithms. An ensemble voting mechanism combines predictions from the best-performing classifiers to enhance overall accuracy. Experimental results demonstrate that the multimodal approach significantly outperforms image-only and tabular (shape and weight) only baselines, with the multimodal ensemble approach achieving 86.57% accuracy in grade classification and 70.83% in freshness prediction. All code and data are publicly available at https://github.com/Kenshin-Keeps/Egg_Quality_Prediction_ELMF4EggQ, promoting transparency, reproducibility, and further research in this domain.
