WatchGuardian: Enabling User-Defined Personalized Just-in-Time Intervention on Smartwatch

Ying Lei, Yancheng Cao, Will Wang, Yuanzhe Dong, Changchang Yin, Weidan Cao, Ping Zhang, Jingzhen Yang, Bingsheng Yao, Yifan Peng, Chunhua Weng, Randy Auerbach, Lena Mamykina, Dakuo Wang, Yuntao Wang, Xuhai Xu

TL;DR

WatchGuardian addresses personal, idiosyncratic undesirable actions by enabling user-defined JITIs on a smartwatch. It introduces a three-stage few-shot pipeline: it starts from a pre-trained self-supervised learning (SSL) model for IMU data, fine-tunes it on fine-grained gestures, and then applies heavy data augmentation so that new actions can be learned from a few examples. The system runs in real time with a server-assisted engine. Offline evaluation shows robust action recognition with 1–10 shots (up to 87.7% accuracy and 87.2% F1), and a four-hour intervention study shows a significant 64.0% reduction in the duration of undesired actions, outperforming a rule-based baseline, along with insights into human-AI interaction. The work points to a practical path toward personalized, AI-powered JITIs, with broader implications for patient- and behavior-focused wearables, while acknowledging perceptual and ethical considerations in human-AI collaboration.

Abstract

While just-in-time interventions (JITIs) have effectively targeted common health behaviors, individuals often have unique needs to intervene in personal undesirable actions that can negatively affect physical, mental, and social well-being. We present WatchGuardian, a smartwatch-based JITI system that empowers users to define custom interventions for these personal actions with a small number of samples. For the model to detect new actions based on limited new data samples, we developed a few-shot learning pipeline that fine-tuned a pre-trained inertial measurement unit (IMU) model on public hand-gesture datasets. We then designed a data augmentation and synthesis process to train additional classification layers for customization. Our offline evaluation with 26 participants showed that with three, five, and ten examples, our approach achieved an average accuracy of 76.8%, 84.7%, and 87.7%, and an F1 score of 74.8%, 84.2%, and 87.2%, respectively. We then conducted a four-hour intervention study to compare WatchGuardian against a rule-based intervention. Our results demonstrated that our system led to a significant reduction of 64.0 ± 22.6% in undesirable actions, substantially outperforming the baseline by 29.0%. Our findings underscore the effectiveness of a customizable, AI-driven JITI system for individuals in need of behavioral intervention in personal undesirable actions. We envision that our work can inspire broader applications of user-defined personalized intervention with advanced AI solutions.
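
To make the idea of training "additional classification layers" on a handful of samples more concrete, the following is a minimal sketch rather than the authors' implementation. It assumes a frozen feature extractor and trains a single logistic layer on top of it. The `embed` function (per-axis mean and standard deviation) is only a stand-in for the pre-trained IMU model, and `make_window`, `train_head`, `predict`, the synthetic signals, and all hyperparameters are hypothetical.

```python
import math
import random


def make_window(active, rng, n=60):
    """Synthetic (x, y, z) accelerometer window; a toy stand-in for real data."""
    amp = 1.0 if active else 0.0
    window = []
    for t in range(n):
        # Assumed 30 Hz sampling with a 2 Hz wrist oscillation for the target action.
        x = amp * math.sin(2 * math.pi * 2.0 * t / 30.0) + rng.gauss(0, 0.05)
        y = rng.gauss(0, 0.05)
        z = 1.0 + rng.gauss(0, 0.05)  # gravity component plus sensor noise
        window.append((x, y, z))
    return window


def embed(window):
    """Placeholder for a frozen pre-trained encoder: per-axis mean and std."""
    n = len(window)
    feats = []
    for axis in range(3):
        vals = [row[axis] for row in window]
        mean = sum(vals) / n
        var = sum((v - mean) ** 2 for v in vals) / n
        feats += [mean, math.sqrt(var)]
    return feats


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def train_head(features, labels, epochs=300, lr=0.5):
    """Fit one logistic layer with full-batch gradient descent."""
    dim = len(features[0])
    w, b, n = [0.0] * dim, 0.0, len(features)
    for _ in range(epochs):
        gw, gb = [0.0] * dim, 0.0
        for x, y in zip(features, labels):
            err = sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b) - y
            for i in range(dim):
                gw[i] += err * x[i]
            gb += err
        w = [wi - lr * gi / n for wi, gi in zip(w, gw)]
        b -= lr * gb / n
    return w, b


def predict(w, b, x):
    """Probability that the embedded window x is the target action."""
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)


rng = random.Random(0)
shots = [make_window(True, rng) for _ in range(5)]       # five target examples
negatives = [make_window(False, rng) for _ in range(5)]  # background motion
X = [embed(win) for win in shots + negatives]
y = [1] * 5 + [0] * 5
w, b = train_head(X, y)
```

Only the small head is trained here, which is why a few examples can suffice; in a real system the quality of the frozen encoder would matter far more than this toy feature set suggests.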

Paper Structure

This paper contains 39 sections, 9 figures, 3 tables.

Figures (9)

  • Figure 1: WatchGuardian empowers users to easily define personal actions for which they want to receive just-in-time intervention (JITI) from a smartwatch. The user journey is as follows: (1) Users determine one or more custom target actions. (2) They follow the instructions on the smartwatch to collect a small set of samples with the accelerometer sensor. (3) WatchGuardian applies multiple data augmentation and data synthesis techniques to expand the training dataset. (4) WatchGuardian adapts a pre-trained model through fine-tuning and personal customization. (5) WatchGuardian leverages the custom model to provide a JITI system for real-time action recognition and intervention delivery.
  • Figure 2: Three-stage Few-shot Pipeline for Model Customization. (A) Stage 1: We adopted a pre-trained SSL model for human activity recognition that takes 30 Hz tri-axis accelerometer data streams. (B) Stage 2: We fine-tuned the pre-trained model on two human activity recognition datasets with more fine-grained gestures, together with additional negative data collected by us. (C) Stage 3: Given the data sequence of a few samples of the new target action, we designed a series of data augmentation and synthesis techniques to enable robust model training for customization.
  • Figure 3: Smartwatch Interface Designs. (a) Few-shot data collection interface, where a user can define the target behavior and the number of shots. The user can name the gesture once the collection is finished. (b) Intervention reminder interface, which is shown when the system detects undesirable target actions.
  • Figure 4: Target Actions for Evaluation. (1-5) present the five pre-determined actions. (6-17) visualize new target behaviors defined by participants. Only identical actions are grouped as one. Actions with minor differences are counted separately, as each of them could be highly personal.
  • Figure 5: Few-shot Learning Pipeline Performance of Accuracy and F1 Score. We varied the number of shots from 1 to 10 samples when training a custom model. We also experimented with adding more than one target action simultaneously (i.e., multi-class classification). Error bars indicate standard error; the same applies to the figures below.
  • ...and 4 more figures
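
Stage 3 in the Figure 2 caption expands a few recorded samples through data augmentation. The captions do not list the exact techniques, so the sketch below uses four transformations that are common for accelerometer data (time shift, small rotation, amplitude scaling, and Gaussian jitter) purely as an illustration. All function names, parameter values, and the synthetic input are assumptions, not the authors' settings.

```python
import math
import random


def jitter(window, sigma=0.05, rng=random):
    """Add Gaussian sensor noise to every reading."""
    return [tuple(v + rng.gauss(0.0, sigma) for v in row) for row in window]


def scale(window, low=0.9, high=1.1, rng=random):
    """Multiply the whole window by one random factor (motion intensity)."""
    k = rng.uniform(low, high)
    return [tuple(v * k for v in row) for row in window]


def rotate_z(window, max_deg=15.0, rng=random):
    """Rotate readings about the z axis to mimic a slightly shifted watch."""
    theta = math.radians(rng.uniform(-max_deg, max_deg))
    c, s = math.cos(theta), math.sin(theta)
    return [(c * x - s * y, s * x + c * y, z) for x, y, z in window]


def time_shift(window, max_shift=5, rng=random):
    """Circularly shift the window by a few samples."""
    k = rng.randint(-max_shift, max_shift) % len(window)
    return window[-k:] + window[:-k] if k else list(window)


def augment(shots, copies=20, seed=0):
    """Return the original windows plus `copies` augmented variants of each."""
    rng = random.Random(seed)
    out = [list(w) for w in shots]
    for w in shots:
        for _ in range(copies):
            a = time_shift(w, rng=rng)
            a = rotate_z(a, rng=rng)
            a = scale(a, rng=rng)
            a = jitter(a, rng=rng)
            out.append(a)
    return out


# Three synthetic "shots" of 60 samples each (2 s at the 30 Hz rate from Figure 2).
shots = [[(math.sin(t / 5.0), 0.0, 1.0) for t in range(60)] for _ in range(3)]
training_set = augment(shots, copies=20)
print(len(training_set))  # 3 originals + 3 * 20 augmented copies = 63 windows
```

Chaining the transformations keeps each synthetic window close to the recorded motion while varying timing, orientation, and intensity, which is the usual motivation for augmenting when only a few examples are available.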