What's on Your Plate? Inferring Chinese Cuisine Intake from Wearable IMUs

Jiaxi Yin, Pengcheng Wang, Han Ding, Fei Wang

TL;DR

This work tackles unobtrusive dietary monitoring by addressing the limitations of self-report and camera-based methods, particularly for the diverse landscape of Chinese cuisine. It introduces CuisineSense, a two-stage system that fuses IMU data from a smartwatch and smart glasses to first detect eating actions via a reconstruction-based anomaly detector and then classify the specific Chinese foods using a 1D Swin Transformer. Key contributions include (1) accurate recognition of 11 Chinese food categories with a wearable setup, (2) a reconstruction-based eating-state detector that robustly filters non-eating activities, and (3) a 27.5-hour, richly annotated dataset spanning 11 foods collected from 10 participants. The approach enables real-time, privacy-preserving dietary monitoring with practical implications for health management and chronic disease prevention.

Abstract

Accurate food intake detection is vital for dietary monitoring and chronic disease prevention. Traditional self-report methods are prone to recall bias, while camera-based approaches raise concerns about privacy. Furthermore, existing wearable-based methods primarily focus on a limited number of food types, such as hamburgers and pizza, failing to address the vast diversity of Chinese cuisine. To bridge this gap, we propose CuisineSense, a system that classifies Chinese food types by integrating hand motion cues from a smartwatch with head dynamics from smart glasses. To filter out irrelevant daily activities, we design a two-stage detection pipeline. The first stage identifies eating states by distinguishing characteristic temporal patterns from non-eating behaviors. The second stage then conducts fine-grained food type recognition based on the motions captured during food intake. To evaluate CuisineSense, we construct a dataset comprising 27.5 hours of IMU recordings across 11 food categories and 10 participants. Experiments demonstrate that CuisineSense achieves high accuracy in both eating state detection and food classification, offering a practical solution for unobtrusive, wearable-based dietary monitoring. The system code is publicly available at https://github.com/joeeeeyin/CuisineSense.git.
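
The two-stage gating described in the abstract can be sketched roughly as follows. This is a minimal illustration of the control flow only, not the authors' implementation. The names `two_stage_predict`, `toy_is_eating`, and `toy_classify_food`, the window layout, the 0.5 amplitude cutoff, and the label `class_0` are all hypothetical stand-ins for the real detector and classifier.

```python
from typing import Callable, Optional, Sequence

# One window of fused IMU samples: time steps x channels (hypothetical layout).
Window = Sequence[Sequence[float]]


def two_stage_predict(
    window: Window,
    is_eating: Callable[[Window], bool],
    classify_food: Callable[[Window], str],
) -> Optional[str]:
    """Run stage 2 only when stage 1 flags the window as eating."""
    if not is_eating(window):
        return None  # non-eating activity is filtered out here
    return classify_food(window)


# Placeholder stage-1 and stage-2 models, used only to exercise the control flow.
def toy_is_eating(window: Window) -> bool:
    # Assumption for the demo: "eating" means mean absolute amplitude above 0.5.
    total = sum(abs(v) for row in window for v in row)
    count = sum(len(row) for row in window)
    return total / count > 0.5


def toy_classify_food(window: Window) -> str:
    return "class_0"  # a real classifier would return one of the 11 categories


still = [[0.0, 0.1], [0.1, 0.0]]
moving = [[1.0, -2.0], [1.5, 0.5]]
labels = [two_stage_predict(w, toy_is_eating, toy_classify_food) for w in (still, moving)]
```

Passing the two stages in as callables keeps the gate independent of any particular model, so the food classifier is never invoked on windows that the first stage rejects.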

Paper Structure

This paper contains 11 sections, 2 equations, 3 figures, 2 tables.

Figures (3)

  • Figure 1: CuisineSense pipeline. A user wears a smartwatch and glasses, whose embedded IMUs capture hand and head motions, respectively. The IMU signals are first fed into the Eating State Detection module to determine if a food intake event is occurring. If yes, the signal segment is then passed to the second stage to classify the specific food being consumed.
  • Figure 2: Hyperparameter search for the eating state detection module. The plot shows the top 20 masking ratio–MSE threshold combinations, with circle size proportional to accuracy. Best performance was achieved with a masking ratio of 0.15 and an MSE threshold set at the 80th percentile of the losses.
  • Figure 3: Confusion matrix of food type recognition.
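
To make the two hyperparameters in Figure 2 concrete, the sketch below shows one way a masking ratio and a percentile-based MSE threshold could interact. It rests on assumptions not stated in the captions: that the reconstruction model is fit on eating data, so a low masked-reconstruction error suggests eating, and that the threshold is taken over a set of calibration losses. The helper names, the `mean_fill` placeholder reconstructor, and the sample values are hypothetical.

```python
import random
from typing import Callable, List, Sequence

Reconstructor = Callable[[List[float], List[int]], List[float]]


def percentile(values: Sequence[float], q: float) -> float:
    """Linear-interpolation percentile for q in [0, 100]."""
    xs = sorted(values)
    if not xs:
        raise ValueError("values must be non-empty")
    pos = (len(xs) - 1) * q / 100.0
    lo = int(pos)
    hi = min(lo + 1, len(xs) - 1)
    return xs[lo] + (xs[hi] - xs[lo]) * (pos - lo)


def masked_mse(
    signal: Sequence[float],
    reconstruct: Reconstructor,
    mask_ratio: float = 0.15,
    seed: int = 0,
) -> float:
    """Hide a random fraction of time steps and score the reconstruction on them."""
    n = len(signal)
    k = max(1, round(mask_ratio * n))
    masked_idx = sorted(random.Random(seed).sample(range(n), k))
    masked = list(signal)
    for i in masked_idx:
        masked[i] = 0.0  # masked steps are zeroed before reconstruction
    recon = reconstruct(masked, masked_idx)
    return sum((recon[i] - signal[i]) ** 2 for i in masked_idx) / k


def mean_fill(masked: List[float], masked_idx: List[int]) -> List[float]:
    """Placeholder reconstructor: fill hidden steps with the mean of visible ones."""
    hidden = set(masked_idx)
    visible = [v for i, v in enumerate(masked) if i not in hidden]
    fill = sum(visible) / len(visible)
    return [fill if i in hidden else v for i, v in enumerate(masked)]


def is_eating(loss: float, threshold: float) -> bool:
    # Assumed direction: errors at or below the threshold count as eating.
    return loss <= threshold


# Hypothetical calibration losses; the threshold is their 80th percentile.
calibration_losses = [float(i) for i in range(1, 12)]
threshold = percentile(calibration_losses, 80)

steady = [1.0] * 20                                          # easy to reconstruct
jittery = [0.0 if i % 2 == 0 else 10.0 for i in range(20)]   # hard to reconstruct
steady_flag = is_eating(masked_mse(steady, mean_fill), threshold)
jittery_flag = is_eating(masked_mse(jittery, mean_fill), threshold)
```

In this toy setup the steady signal is reconstructed exactly and falls below the threshold, while the alternating signal produces a large masked error and is rejected. A learned reconstruction model would take the place of `mean_fill`.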