Towards Understanding Human Emotional Fluctuations with Sparse Check-In Data
Sagar Paresh Shah, Ga Wu, Sean W. Kortschot, Samuel Daviau
TL;DR
The paper addresses data sparsity in predicting emotional shifts from infrequent user check-ins by proposing MSPSC, a probabilistic framework that combines core affect psychology with personalized, environment-aware learning. It models a user's mood state as $e^u \in \mathcal{E}$ with $|\mathcal{E}|=64$ and uses a hidden factor $\mathbf{z}$ to capture the joint effect of historical check-ins and current environmental factors, together with a personalization vector $\mathbf{w}^u$ and a utility-driven feedback loop. Clustering of influential environmental factors and a top-$k$ similarity strategy retrieve the most relevant past moments, while the utility function updates $\mathbf{w}^u$ according to the agreement between predictions and self-reported emotions. Empirically, MSPSC outperforms baselines, reaching about $60\%$ accuracy on the $64$-class mood grid (chance is $1/64 \approx 1.6\%$) and remaining robust for users with limited activity, which suggests potential for real-world deployment in UpBeing and similar self-report-driven systems.
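To make the retrieval and feedback ideas concrete, the following is a minimal illustrative sketch, not the MSPSC implementation. It assumes each check-in is a feature vector of environmental factors scaled to $[0, 1]$, uses cosine similarity for the top-$k$ step, votes over the retrieved moods, and applies a simple multiplicative update to a personalization vector. The function names (`top_k_similar`, `predict_mood`, `update_weights`), the similarity measure, the voting rule, and the update rule are all assumptions chosen for illustration; the summary does not specify them.

```python
import numpy as np


def top_k_similar(history, current, k):
    """Indices and cosine similarities of the k past check-ins closest to `current`."""
    history = np.asarray(history, dtype=float)
    current = np.asarray(current, dtype=float)
    norms = np.linalg.norm(history, axis=1) * np.linalg.norm(current)
    # Guard against zero-norm vectors to avoid division by zero.
    sims = history @ current / np.where(norms == 0, 1.0, norms)
    order = np.argsort(-sims, kind="stable")[:k]
    return order, sims[order]


def predict_mood(history_feats, history_moods, current_feats, w_u, k=3, n_moods=64):
    """Predict a mood index in [0, n_moods) by similarity-weighted voting (assumed rule)."""
    hist = np.asarray(history_feats, dtype=float) * w_u
    cur = np.asarray(current_feats, dtype=float) * w_u
    idx, sims = top_k_similar(hist, cur, k)
    scores = np.zeros(n_moods)
    for i, s in zip(idx, sims):
        scores[history_moods[i]] += max(s, 0.0)  # ignore negative similarities
    return int(np.argmax(scores)), idx


def update_weights(w_u, history_feats, current_feats, idx, agreed, lr=0.1):
    """Hypothetical utility-driven update of the personalization vector.

    Utility is +1 when the prediction matched the self-report and -1 otherwise.
    Features on which the retrieved moments were close to the current context
    are up-weighted on agreement and down-weighted on disagreement.
    """
    hist = np.asarray(history_feats, dtype=float)[idx]
    cur = np.asarray(current_feats, dtype=float)
    closeness = 1.0 - np.abs(hist - cur).mean(axis=0)  # assumes features in [0, 1]
    utility = 1.0 if agreed else -1.0
    new_w = np.asarray(w_u, dtype=float) * np.exp(lr * utility * closeness)
    return new_w * len(new_w) / new_w.sum()  # keep the mean weight at 1
```

In this sketch the weights stay positive and are renormalized after every update, so one round of feedback shifts the relative importance of features without changing the overall scale of the similarity computation.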
Abstract
Data sparsity is a key challenge limiting the power of AI tools across various domains. The problem is especially pronounced in domains that require active user input rather than measurements derived from automated sensors, such as self-reported mood check-ins, where capturing a continuous picture of emotional states is essential. In this context, sparse data can hinder efforts to capture the nuances of individual emotional experiences, such as causes, triggers, and contributing factors. Existing methods for addressing data scarcity often rely on heuristics or large established datasets, favoring deep learning models that lack adaptability to new domains. This paper proposes a novel probabilistic framework that integrates user-centric feedback-based learning, allowing for personalized predictions despite limited data. The framework achieves 60% accuracy in predicting user states among 64 options (chance level of 1/64, about 1.6%), showing that it effectively mitigates data sparsity. It is versatile across various applications, bridging the gap between theoretical AI research and practical deployment.
