Vision-Based Approach for Food Weight Estimation from 2D Images
Chathura Wimalasiri, Prasan Kumar Sahoo
TL;DR
This paper estimates food weight from 2D images to enable non-invasive dietary assessment. It combines Faster R-CNN for food detection with a MobileNetV3-based regressor for weight estimation, trained on a 2380-image dataset covering 14 food types. For detection, the reported results are $mAP=83.41\%$, average $IoU=91.82\%$, and a classification accuracy of $100\%$; for weight estimation, they are $RMSE=6.3204$, $MAPE=0.0640\%$, and $R^2=0.9865$. The authors present this as a practical framework for nutrition counseling, wellness monitoring, and food-waste reduction.
Abstract
To meet the growing demand for efficient, non-invasive methods of estimating food weight, this paper presents a vision-based approach that uses 2D images. The study uses a dataset of 2380 images covering fourteen food types in various portions, orientations, and containers. The proposed method combines Faster R-CNN for food detection with MobileNetV3 for weight estimation. The detection model achieved a mean average precision (mAP) of 83.41\%, an average Intersection over Union (IoU) of 91.82\%, and a classification accuracy of 100\%. The weight estimation model achieved a root mean squared error (RMSE) of 6.3204, a mean absolute percentage error (MAPE) of 0.0640\%, and an R-squared value of 98.65\%. Potential applications include nutrition counseling in healthcare, dietary intake assessment in fitness and wellness, and smart food storage that reduces waste. The results indicate that combining Faster R-CNN with MobileNetV3 provides a robust framework for accurate food weight estimation from 2D images.
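The metrics quoted above follow standard definitions, and a minimal sketch of how they are typically computed may help when reading the numbers. This is not the authors' evaluation code. The function names, the `(x1, y1, x2, y2)` box layout, and the sample weights are assumptions made for illustration. Note that `mape` here returns a fraction, which must be multiplied by 100 to express it as a percentage.

```python
import math


def rmse(y_true, y_pred):
    """Root mean squared error, in the same unit as the weights."""
    n = len(y_true)
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n)


def mape(y_true, y_pred):
    """Mean absolute percentage error as a fraction (0.05 means 5%)."""
    if any(t == 0 for t in y_true):
        raise ValueError("MAPE is undefined when a true value is zero")
    return sum(abs((t - p) / t) for t, p in zip(y_true, y_pred)) / len(y_true)


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    if ss_tot == 0:
        raise ValueError("R^2 is undefined when all true values are equal")
    return 1.0 - ss_res / ss_tot


def iou(box_a, box_b):
    """Intersection over Union of two axis-aligned boxes (x1, y1, x2, y2)."""
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    # Clamp at zero so that non-overlapping boxes give an empty intersection.
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


if __name__ == "__main__":
    # Hypothetical ground-truth and predicted weights, for illustration only.
    true_w = [100.0, 200.0, 300.0]
    pred_w = [110.0, 190.0, 300.0]
    print(rmse(true_w, pred_w), mape(true_w, pred_w), r_squared(true_w, pred_w))
    print(iou((0, 0, 10, 10), (5, 5, 15, 15)))
```

RMSE is expressed in the unit of the target variable, so comparing it across datasets only makes sense when the weight units and ranges match. MAPE and $R^2$ are unit-free and easier to compare.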
