GazeFlow: Personalized Ambient Soundscape Generation for Passive Strabismus Self-Monitoring

Joydeep Chandra, Satyam Kumar Navneet, Yong Zhang

TL;DR

GazeFlow addresses the need for passive self-monitoring of binocular coordination by combining a personalized autoencoder with a three-pronged methodological framework: Binocular Temporal-Frequency Disentanglement (BTFD) to separate gaze dynamics, Contrastive Biometric Pre-training (CBP) to enable cross-resolution transfer, and Gaze-MAML for rapid 5-shot personalization. The ambient sonification design maps disentangled anomaly factors to gradual musical changes, supporting peripheral awareness without disruption. Empirical results show strong drift detection performance (F1≈0.84) on 1000Hz-to-30Hz transfer benchmarks and favorable user feedback, with participants reporting increased awareness and a preference for ambient feedback over alerts. While preliminary, the work demonstrates a scalable, privacy-conscious approach to continuous eye-health monitoring that could complement clinical validation and broader self-management tools.

Abstract

Strabismus affects 2-4% of the population, yet individuals recovering from corrective surgery lack accessible tools for monitoring eye alignment. Dichoptic therapies require active engagement and clinical supervision, limiting their adoption for passive self-awareness. We present GazeFlow, a browser-based self-monitoring system that uses a personalized temporal autoencoder to detect eye drift patterns from webcam-based gaze tracking and provides ambient audio feedback. Unlike alert-based systems, GazeFlow operates according to calm computing principles, morphing musical parameters in proportion to drift severity while remaining in peripheral awareness. We address the challenges of inter-individual variability and domain transfer (1000 Hz research-grade data to 30 Hz webcam data) by introducing Binocular Temporal-Frequency Disentanglement (BTFD), Contrastive Biometric Pre-training (CBP), and Gaze-MAML. We validate our approach on the GazeBase dataset (N=50), achieving F1=0.84 for drift detection, and conduct a preliminary user study (N=6) with participants who have intermittent strabismus. Participants reported increased awareness of their eye behaviour (M=5.8/7) and a preference for ambient feedback over alerts (M=6.2/7). We discuss the system's potential for self-awareness applications and outline directions for clinical validation.

Paper Structure (13 sections, 1 equation, 1 figure, 4 tables, 1 algorithm)

This paper contains 13 sections, 1 equation, 1 figure, 4 tables, 1 algorithm.

Figures (1)

  • Figure 1: GazeFlow architecture. Phase 1: Extracts 6D gaze features (velocity, H/V deviation, vergence, pupil ratio, fixation stability) from webcam via MediaPipe at 30 Hz, windowed into 3-second segments. Phase 2: Applies wavelet decomposition (DWT) separating fast (7-15 Hz) and slow (<2 Hz) dynamics; dual-pathway encoder processes temporal and frequency representations; β-VAE produces disentangled 24D latent space with interpretable factors. Phase 3: Biometric pre-training learns cross-resolution invariant representations; Gaze-MAML enables 5-shot personalization via single gradient step. Phase 4: Maps per-factor anomaly scores to graduated ambient audio parameters (scale, rhythm, filter, reverb) via continuous morphing.
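
The windowing in Phase 1 and the continuous morphing in Phase 4 can be illustrated with a minimal Python sketch. It is not the authors' implementation. The 30 Hz rate and 3-second window come from the caption; the non-overlapping hop, the parameter names `filter_cutoff_hz` and `reverb_mix`, their numeric ranges, the assumption that anomaly scores lie in [0, 1], and the smoothing rate are all hypothetical choices made for illustration.

```python
from typing import Dict, List, Sequence

SAMPLE_RATE_HZ = 30   # webcam rate stated in the caption
WINDOW_SECONDS = 3    # segment length stated in the caption
WINDOW_SIZE = SAMPLE_RATE_HZ * WINDOW_SECONDS  # 90 frames per segment


def segment_windows(frames: Sequence[Sequence[float]],
                    hop: int = WINDOW_SIZE) -> List[List[Sequence[float]]]:
    """Split a stream of 6D feature frames into fixed-length windows.

    The hop size is an assumption: the caption does not say whether
    windows overlap. Trailing frames that do not fill a window are dropped.
    """
    last_start = len(frames) - WINDOW_SIZE
    return [list(frames[start:start + WINDOW_SIZE])
            for start in range(0, last_start + 1, hop)]


# Hypothetical (calm, drifted) endpoints for two of the audio parameters.
AUDIO_RANGES = {
    "filter_cutoff_hz": (8000.0, 800.0),
    "reverb_mix": (0.1, 0.6),
}


def target_audio_params(anomaly: float) -> Dict[str, float]:
    """Linearly map an anomaly score (assumed to lie in [0, 1]) to targets."""
    t = min(max(anomaly, 0.0), 1.0)  # clamp out-of-range scores
    return {name: calm + (drifted - calm) * t
            for name, (calm, drifted) in AUDIO_RANGES.items()}


def morph(current: Dict[str, float], target: Dict[str, float],
          rate: float = 0.1) -> Dict[str, float]:
    """Move each parameter a fraction `rate` of the way toward its target."""
    return {name: current[name] + rate * (target[name] - current[name])
            for name in target}
```

Calling `morph` once per audio update moves the soundscape toward its target in small steps rather than jumping, which is one simple way to keep the feedback gradual; a real system might instead use the ramping facilities of its audio engine.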