Disentangling Emotional Bases and Transient Fluctuations: A Low-Rank Sparse Decomposition Approach for Video Affective Analysis

Feng-Qi Cui, Jinyang Huang, Ziyu Jia, Xinyu Li, Xin Yan, Xiaokang Zhou, Meng Wang

TL;DR

This work addresses instability and entanglement in video-based affective computing by proposing LSEF, a hierarchical framework that uses a low-rank sparse decomposition to separate stable emotional bases from transient fluctuations. The method combines three plug-and-play modules, Stability Encoding (SEM), Dynamic Decoupling (DDM), and Consistency Integration (CIM), with Rank Aware Optimization (RAO), which balances smoothness and sensitivity. Results on DFEW, FERV39k, and VEATIC show state-of-the-art performance on both discrete emotion recognition and continuous valence-arousal estimation, with improved robustness and dynamic discrimination. The approach offers a principled, interpretable foundation for modeling affective dynamics and may generalize to varied real-world conditions and multi-modal extensions.

Abstract

Video-based Affective Computing (VAC), vital for emotion analysis and human-computer interaction, suffers from model instability and representational degradation due to complex emotional dynamics. Because the same emotional fluctuation can carry a different meaning in a different emotional context, the core limitation is the lack of a hierarchical structural mechanism to disentangle distinct affective components, i.e., emotional bases (the long-term emotional tone) and transient fluctuations (short-term deviations from that tone). To address this, we propose the Low-Rank Sparse Emotion Understanding Framework (LSEF), a unified model grounded in the Low-Rank Sparse Principle, which theoretically reframes affective dynamics as a hierarchical low-rank sparse compositional process. LSEF employs three plug-and-play modules: the Stability Encoding Module (SEM), which captures low-rank emotional bases; the Dynamic Decoupling Module (DDM), which isolates sparse transient signals; and the Consistency Integration Module (CIM), which reconstructs multi-scale stability and reactivity coherence. The framework is optimized with a Rank Aware Optimization (RAO) strategy that adaptively balances gradient smoothness and sensitivity. Extensive experiments across multiple datasets confirm that LSEF significantly enhances robustness and dynamic discrimination, which further validates the effectiveness and generality of hierarchical low-rank sparse modeling for understanding affective dynamics.
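To make the low-rank sparse idea in the abstract concrete, the following is a minimal sketch of a generic low-rank plus sparse split. It is not the LSEF architecture or any of its modules. It assumes per-frame features are stacked into a frames-by-features matrix X, that the stable base can be approximated by a single rank-1 component L, and that transients are recovered by soft-thresholding the residual into S. The function names (`rank1_approx`, `soft_threshold`, `decompose`), the threshold `tau`, and the toy data are all hypothetical and chosen only for illustration.

```python
import math

def rank1_approx(A, n_iter=50):
    """Approximate the best rank-1 fit of A (a list of rows) by power iteration."""
    n_rows, n_cols = len(A), len(A[0])
    v = [1.0 / math.sqrt(n_cols)] * n_cols  # initial right vector
    for _ in range(n_iter):
        u = [sum(a * x for a, x in zip(row, v)) for row in A]  # u = A v
        w = [sum(A[t][d] * u[t] for t in range(n_rows)) for d in range(n_cols)]  # w = A^T u
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:  # A is all zeros
            return [[0.0] * n_cols for _ in range(n_rows)]
        v = [x / norm for x in w]
    proj = [sum(a * x for a, x in zip(row, v)) for row in A]  # per-frame coefficient
    return [[p * x for x in v] for p in proj]  # L = (A v) v^T

def soft_threshold(x, tau):
    """Shrink x toward zero by tau; values within [-tau, tau] become zero."""
    return math.copysign(max(abs(x) - tau, 0.0), x)

def decompose(X, tau, n_outer=30):
    """Alternate a rank-1 fit (stable base L) with soft-thresholding (sparse S)."""
    n_rows, n_cols = len(X), len(X[0])
    S = [[0.0] * n_cols for _ in range(n_rows)]
    for _ in range(n_outer):
        rest = [[X[t][d] - S[t][d] for d in range(n_cols)] for t in range(n_rows)]
        L = rank1_approx(rest)
        S = [[soft_threshold(X[t][d] - L[t][d], tau) for d in range(n_cols)]
             for t in range(n_rows)]
    return L, S

# Toy data (assumed): one shared feature pattern with slowly varying intensity,
# plus a single injected transient at frame 3, feature 2.
base = [1.0, 2.0, 0.5, 1.5]
gains = [1.0, 1.1, 0.9, 1.0, 1.05, 0.95, 1.0, 1.02]
X = [[g * b for b in base] for g in gains]
X[3][2] += 3.0

L, S = decompose(X, tau=0.5)
t_max, d_max = max(
    ((t, d) for t in range(len(S)) for d in range(len(S[0]))),
    key=lambda td: abs(S[td[0]][td[1]]),
)
print("largest transient at frame", t_max, "feature", d_max)
```

In this toy setting the injected spike should surface as the dominant entry of S, while L keeps the shared pattern. A learned model such as the one described above would presumably replace both the fixed rank and the fixed threshold with trained components; this excerpt does not specify how.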

Paper Structure

This paper contains 23 sections, 14 equations, 4 figures, 4 tables, and 1 algorithm.

Figures (4)

  • Figure 1: Illustration of the inherent low-rank sparse structure of affective dynamics. The smooth low-rank curve (blue) represents stable emotional states, while the sparse curve (red) highlights transient expressive surges.
  • Figure 2: An overview of the proposed Low-Rank Sparse Emotion Understanding Framework (LSEF).
  • Figure 3: Visualization of the learned feature maps.
  • Figure 4: