Quantifying the Privacy-Utility Trade-off in GPS-based Daily Stress Recognition using Semantic Features
Hoang Khang Phan, Nhat Tan Le
TL;DR
This work addresses the privacy risks of GPS-based stress recognition with a privacy-aware framework that semantically encodes locations using a self-hosted OpenStreetMap (OSM) reverse-geocoder and an LLM-bootstrapped static map. It quantifies the privacy-utility trade-off using re-identification attacks, mutual information (MI), and multiple validation schemes, and reports that the proposed Privacy-Aware (PA) encoding can match non-private baselines in leave-one-subject-out (LOSO) stress prediction while substantially reducing identity leakage. With Random Forest and XGBoost classifiers, the PA configuration reaches competitive scores (approximately 67% accuracy and 64% F1) and shows that eight of the top ten MI features can be preserved without compromising privacy. Ablation and feature analyses indicate that time-based features, especially deadlines and recreational activity, drive stress predictions, while the privacy-preserving transformations remove highly identifying signals such as class schedules, enabling safer deployment in educational contexts. The findings point to a viable path toward end-to-end privacy-preserving mobile mental-health monitoring that generalizes across subjects, with practical implications for students and educators.
Abstract
Psychological stress is a widespread issue that significantly impacts student well-being and academic performance. Effective remote stress recognition is crucial, yet existing methods often rely on wearable devices or GPS-based clustering techniques that pose privacy risks. In this study, we introduce a novel, end-to-end privacy-enhanced framework for semantic location encoding using a self-hosted OpenStreetMap (OSM) engine and an LLM-bootstrapped static map. We quantify the privacy-utility trade-off and show, under leave-one-subject-out (LOSO) validation, that our Privacy-Aware (PA) model achieves performance statistically indistinguishable from that of a non-private model, indicating that utility need not come at the cost of privacy. Feature importance analysis highlights that recreational activity time, working time, and travel time play a significant role in stress recognition.
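
As a rough illustration of two measurement tools named above, the following minimal Python sketch builds leave-one-subject-out (LOSO) folds and computes mutual information between a location feature and subject identity. It is not the authors' implementation: the function names, the toy data, and the choice to measure MI against subject identity as a leakage proxy are assumptions made here for illustration only, and the paper may compute these quantities differently.

```python
import math
from collections import Counter


def loso_splits(subject_ids):
    """Yield (held_out_subject, train_idx, test_idx), one fold per subject."""
    for held_out in sorted(set(subject_ids)):
        train_idx = [i for i, s in enumerate(subject_ids) if s != held_out]
        test_idx = [i for i, s in enumerate(subject_ids) if s == held_out]
        yield held_out, train_idx, test_idx


def mutual_information(xs, ys):
    """Mutual information in bits between two equally long discrete sequences."""
    n = len(xs)
    joint = Counter(zip(xs, ys))
    px = Counter(xs)
    py = Counter(ys)
    mi = 0.0
    for (x, y), c in joint.items():
        # p(x, y) * log2( p(x, y) / (p(x) * p(y)) ), written with raw counts
        mi += (c / n) * math.log2(c * n / (px[x] * py[y]))
    return mi


# Hypothetical toy data: one entry per sample.
subjects = ["u1", "u1", "u2", "u2", "u3", "u3"]
raw_place = ["dorm_A", "dorm_A", "dorm_B", "dorm_B", "dorm_C", "dorm_C"]
semantic_place = ["residence"] * 6  # coarse semantic label replaces the exact place

folds = list(loso_splits(subjects))
leak_raw = mutual_information(raw_place, subjects)
leak_semantic = mutual_information(semantic_place, subjects)
```

In this toy setup the exact place names identify each subject perfectly, so their mutual information with identity equals the entropy of the subject labels, whereas the coarse semantic label carries none. Real features would fall between these extremes, which is the range in which a privacy-utility trade-off has to be measured.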
