NutritionVerse-Real: An Open Access Manually Collected 2D Food Scene Dataset for Dietary Intake Estimation

Chi-en Amy Tai, Saeejith Nair, Olivia Markham, Matthew Keller, Yifan Wu, Yuhao Chen, Alexander Wong

TL;DR

NutritionVerse-Real addresses the need for openly accessible, manually curated 2D food scene data to support dietary intake estimation. The authors created 889 images covering 251 dishes and 45 food types by manually collecting real-life scenes, weighing each ingredient, and computing nutritional content using packaging data or the Canada Nutrient File; segmentation masks were generated through manual labelling with Roboflow. They analyze data diversity to disclose biases and distributional characteristics, highlighting challenges for model robustness in real-world dietary sensing. The dataset is publicly available on Kaggle to accelerate ML-based dietary assessment.

Abstract

Dietary intake estimation plays a crucial role in understanding the nutritional habits of individuals and populations, aiding in the prevention and management of diet-related health issues. Accurate estimation requires comprehensive datasets of food scenes, including images, segmentation masks, and accompanying dietary intake metadata. In this paper, we introduce NutritionVerse-Real, an open access manually collected 2D food scene dataset for dietary intake estimation with 889 images of 251 distinct dishes and 45 unique food types. The NutritionVerse-Real dataset was created by manually collecting images of food scenes in real life, measuring the weight of every ingredient and computing the associated dietary content of each dish using the ingredient weights and nutritional information from the food packaging or the Canada Nutrient File. Segmentation masks were then generated through human labelling of the images. We provide further analysis on the data diversity to highlight potential biases when using this data to develop models for dietary intake estimation. NutritionVerse-Real is publicly available at https://www.kaggle.com/datasets/nutritionverse/nutritionverse-real as part of an open initiative to accelerate machine learning for dietary sensing.
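The abstract describes computing each dish's dietary content from measured ingredient weights and reference nutritional information. A minimal sketch of that arithmetic follows, assuming the reference values are expressed per 100 g. The function name, dictionary layout, nutrient keys, and all numeric values are hypothetical illustrations, not taken from the dataset or from the Canada Nutrient File.

```python
# Nutrient fields tracked in this sketch (hypothetical key names).
NUTRIENTS = ("calories_kcal", "protein_g", "fat_g", "carbs_g")


def dish_nutrition(weights_g, per_100g):
    """Sum nutrient totals for a dish.

    weights_g: mapping of ingredient name -> measured weight in grams.
    per_100g:  mapping of ingredient name -> nutrient values per 100 g.
    """
    totals = dict.fromkeys(NUTRIENTS, 0.0)
    for ingredient, weight in weights_g.items():
        if weight < 0:
            raise ValueError(f"negative weight for {ingredient!r}")
        reference = per_100g[ingredient]  # KeyError if the ingredient is unknown
        for nutrient in NUTRIENTS:
            # Scale the per-100 g reference value by the measured weight.
            totals[nutrient] += reference[nutrient] * weight / 100.0
    return totals


# Illustrative placeholder values only; not real dataset entries.
example_table = {
    "rice": {"calories_kcal": 130.0, "protein_g": 2.7, "fat_g": 0.3, "carbs_g": 28.0},
    "chicken": {"calories_kcal": 165.0, "protein_g": 31.0, "fat_g": 3.6, "carbs_g": 0.0},
}
example_dish = {"rice": 150.0, "chicken": 100.0}
example_totals = dish_nutrition(example_dish, example_table)
```

Keeping the reference table separate from the measured weights mirrors the two data sources named in the abstract (weighed ingredients, and packaging or database values), so either can be swapped without touching the other.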

Paper Structure (4 sections, 4 figures)

Figures (4)

  • Figure 1: Example dishes from the NutritionVerse-Real dataset.
  • Figure 2: Examples of the segmentation mask for scenes labelled using Roboflow in the NutritionVerse-Real dataset.
  • Figure 3: Distribution of the number of ingredients per dish.
  • Figure 4: Distribution of the dataset across various macronutrients as a percent of the daily value (DV), with DV obtained from [percent-daily-value, osilla2018calories].
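
Figure 4 expresses each macronutrient as a percent of a daily value. A minimal sketch of that normalization is below. The reference amounts are commonly cited US label values for a 2,000 kcal diet and are an assumption here; the paper takes its daily values from its own cited sources, which may differ. The function and key names are hypothetical.

```python
# Assumed reference daily values (not taken from the paper).
ASSUMED_DAILY_VALUES = {
    "calories_kcal": 2000.0,
    "protein_g": 50.0,
    "fat_g": 78.0,
    "carbs_g": 275.0,
}


def percent_daily_value(amounts, daily_values=ASSUMED_DAILY_VALUES):
    """Convert absolute nutrient amounts to percent of daily value.

    Nutrients without a reference daily value are skipped.
    """
    return {
        nutrient: 100.0 * amount / daily_values[nutrient]
        for nutrient, amount in amounts.items()
        if nutrient in daily_values
    }


# Illustrative dish totals; placeholder numbers, not dataset entries.
example_percent = percent_daily_value({"calories_kcal": 500.0, "protein_g": 25.0})
```

Passing the daily values as a parameter keeps the normalization independent of any one reference table, so a different set of reference amounts can be substituted without changing the function.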