SnappyMeal: Design and Longitudinal Evaluation of a Multimodal AI Food Logging Application

Liam Bakar, Zachary Englhardt, Vidya Srinivas, Girish Narayanswamy, Dilini Nissanka, Shwetak Patel, Vikram Iyer

TL;DR

This work addresses the rigidity and inaccuracies of traditional food logging by introducing SnappyMeal, a multimodal AI-powered system for flexible dietary tracking. It combines image, text, and audio inputs with retrieval-augmented context from receipts and nutritional databases, plus goal-driven follow-up questions to fill in missing details. The approach is validated through public nutrition benchmarks and a 3-week longitudinal deployment (n>500 logs) that shows high user engagement and perceived accuracy, while highlighting trade-offs where follow-up prompts can introduce cognitive load. The study demonstrates the value of context-aware, restrained AI in self-tracking, laying groundwork for intelligent, user-centric nutrition logging tools and informing design principles for future health-domain applications.

Abstract

Food logging, both self-directed and prescribed, plays a critical role in uncovering correlations between diet and medical, fitness, and health outcomes. Through conversations with nutritional experts and individuals who practice dietary tracking, we find that current logging methods, such as handwritten and app-based journaling, are inflexible and result in low adherence and potentially inaccurate nutritional summaries. These findings, corroborated by prior literature, emphasize the urgent need for improved food logging methods. In response, we propose SnappyMeal, an AI-powered dietary tracking system that leverages multimodal inputs to let users log their food intake more flexibly. To improve accuracy, SnappyMeal introduces goal-dependent follow-up questions that seek missing context from the user, together with information retrieval from user grocery receipts and nutritional databases. We evaluate SnappyMeal through publicly available nutrition benchmarks and a multi-user, 3-week, in-the-wild deployment capturing over 500 logged food instances. Users strongly praised the multiple available input methods and reported strong perceived accuracy. These insights suggest that multimodal AI systems can be leveraged to significantly improve dietary tracking flexibility and context-awareness, laying the groundwork for a new class of intelligent self-tracking applications.

Paper Structure (72 sections, 5 equations, 9 figures, 7 tables)

Figures (9)

  • Figure 1: Traditional food logging, whether with handwritten diaries or mobile applications, relies heavily on manual data entry or suffers from inaccurate estimation techniques. We instead develop a smartphone-based multimodal AI system that combines diverse context, from food and receipt images to natural-language text and audio, with interactive follow-up questions to improve tracking flexibility and contextual awareness.
  • Figure 2: System overview: Users input multimodal food logs, which are processed along with relevant context to extract nutritional information. This data is fed into an LLM to validate the nutritional information and determine whether any information is missing. Finally, all the information is sent to Gemini to generate food log data. The resulting data is displayed both as individual food logs that can be examined and as aggregate graphical visualizations.
  • Figure 3: Overview of the application's main interface screens.
  • Figure 4: User logs over time.
  • Figure 5: One-day timeline showing the mix of modalities used throughout the day.
  • ...and 4 more figures