MunchSonic: Tracking Fine-grained Dietary Actions through Active Acoustic Sensing on Eyeglasses

Saif Mahmud, Devansh Agarwal, Ashwin Ajit, Qikang Liang, Thalia Viranda, Francois Guimbretiere, Cheng Zhang

TL;DR

MunchSonic integrates active acoustic sensing into eyeglasses to track fine-grained dietary actions, addressing a limitation of earlier wearables, which mainly detect eating episodes rather than the individual actions within them. The system emits ultrasonic chirps, cross-correlates the received signal with the transmitted chirp (C-FMCW), and feeds the resulting differential echo profile to a MobileNetV2-based deep learning classifier. This lets it distinguish hand-to-mouth intake, chewing, drinking, talking, and face touching in unconstrained settings. In a 12-participant study, it achieved a macro F1-score of 0.935 at 2-second frame resolution and also performed well at detecting eating episodes, counting intakes, and estimating chewing time. The work demonstrates the feasibility of continuous, objective dietary monitoring with potential applications in nutrition management and clinical health, while highlighting considerations around comfort, safety, and real-world deployment.
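
The echo-profile idea mentioned above can be illustrated with a minimal sketch. It is not the authors' pipeline: the sampling rate, sweep band, and frame length (`FS`, `F0`, `F1`, `FRAME`) and all function names are assumptions chosen only for illustration. Each received frame is cross-correlated with the transmitted chirp, so that the lag axis corresponds to echo delay, and consecutive frames are then subtracted so that static reflections cancel and only movement remains.

```python
import numpy as np

# All constants below are illustrative assumptions, not values from the paper.
FS = 50_000                # assumed sampling rate in Hz
FRAME = 600                # assumed number of samples per chirp
F0, F1 = 18_000, 21_000    # assumed ultrasonic sweep band in Hz


def make_chirp(n=FRAME, fs=FS, f0=F0, f1=F1):
    """Linear frequency sweep from f0 to f1 over n samples."""
    t = np.arange(n) / fs
    sweep_rate = (f1 - f0) / (n / fs)
    return np.sin(2 * np.pi * (f0 * t + 0.5 * sweep_rate * t**2))


def echo_profile(received, chirp):
    """Circular cross-correlation of each received frame with the chirp.

    Returns an array of shape (lags, frames); the lag of a peak reflects
    the delay (and hence distance) of a reflecting surface.
    """
    n = len(chirp)
    n_frames = len(received) // n
    frames = received[: n_frames * n].reshape(n_frames, n)
    chirp_spectrum = np.conj(np.fft.fft(chirp))
    corr = np.fft.ifft(np.fft.fft(frames, axis=1) * chirp_spectrum, axis=1).real
    return corr.T


def differential_echo_profile(profile):
    """Frame-to-frame difference: static reflections cancel, motion remains."""
    return np.diff(profile, axis=1)


# Synthetic check: a single static reflector delayed by 40 samples.
chirp = make_chirp()
received = np.roll(np.tile(chirp, 5), 40)
profile = echo_profile(received, chirp)
diff_profile = differential_echo_profile(profile)
```

In this synthetic case the correlation peak sits at the same lag in every frame, so the differential profile is essentially zero; a moving jaw or hand would instead leave a visible trace along the lag axis over time, which is the kind of image a convolutional classifier can consume.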

Abstract

We introduce MunchSonic, an AI-powered active acoustic sensing system integrated into eyeglasses to track fine-grained dietary actions. MunchSonic emits inaudible ultrasonic waves from the eyeglass frame, with the reflected signals capturing detailed positions and movements of body parts, including the mouth, jaw, arms, and hands involved in eating. These signals are processed by a deep learning pipeline to classify six actions: hand-to-mouth movements for food intake, chewing, drinking, talking, face-hand touching, and other activities (null). In an unconstrained study with 12 participants, MunchSonic achieved a 93.5% macro F1-score in a user-independent evaluation with a 2-second resolution in tracking these actions, also demonstrating its effectiveness in tracking eating episodes and food intake frequency within those episodes.
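
To make the reported metric concrete, the following sketch shows one way a macro F1-score and a user-independent (leave-one-participant-out) split can be computed. It is an illustration under stated assumptions, not the authors' evaluation code: the function names are hypothetical, and the confusion matrix is assumed to have true classes on rows and predicted classes on columns.

```python
import numpy as np


def macro_f1(confusion):
    """Unweighted mean of per-class F1 scores.

    Assumes rows are true classes and columns are predicted classes.
    """
    confusion = np.asarray(confusion, dtype=float)
    tp = np.diag(confusion)
    fp = confusion.sum(axis=0) - tp   # predicted as the class, but wrong
    fn = confusion.sum(axis=1) - tp   # instances of the class that were missed
    denom = 2 * tp + fp + fn
    # Classes with no true or predicted instances get an F1 of 0.
    f1 = np.divide(2 * tp, denom, out=np.zeros_like(tp), where=denom > 0)
    return f1.mean()


def leave_one_participant_out(participant_ids):
    """Yield (held_out_id, train_indices, test_indices) for each participant."""
    ids = np.asarray(participant_ids)
    for held_out in np.unique(ids):
        is_test = ids == held_out
        yield held_out, np.flatnonzero(~is_test), np.flatnonzero(is_test)


# Toy usage with two classes and three hypothetical participants.
toy_confusion = [[2, 0],
                 [1, 1]]
toy_score = macro_f1(toy_confusion)
folds = list(leave_one_participant_out([1, 1, 2, 3, 3]))
```

Because every class contributes equally to the mean, a macro average does not let a frequent class such as the null activity dominate the score, and holding out all data from one participant at a time means the model is always tested on a wearer it has not seen.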

Paper Structure (17 sections, 7 figures, 1 table)

Figures (7)

  • Figure 1: MunchSonic Hardware and Form Factor: (a) User wearing eyeglasses form factor, (b) Customized controller unit with nRF52840 microcontroller, (c) Top view of the eyeglasses form factor, (d) MunchSonic transceiver for active acoustic sensing housing one speaker (top) and one microphone (bottom).
  • Figure 2: User study in unconstrained conditions: (a) Foods consumed by user study participants, (b) Sample images of the chest-mounted camera view of MunchSonic data collection pipeline.
  • Figure 3: Precision and recall of leave-one-participant-out evaluation of MunchSonic, where data from each participant on the $x$-axis serves as the test set.
  • Figure 4: Normalized confusion matrix from the leave-one-participant-out evaluation across 12 participants in the user study of MunchSonic. The values in parentheses represent the total number of instances for each cell.
  • Figure 5: Episode-level evaluation of MunchSonic: (a) Segmentation of user study data into 4.5-minute-long episodes, (b) Detection of food intake within each episode.
  • ...and 2 more figures