BriMA: Bridged Modality Adaptation for Multi-Modal Continual Action Quality Assessment

Kanglei Zhou, Chang Li, Qingyi Pan, Liyuan Wang

TL;DR

BriMA introduces Bridged Modality Adaptation to tackle non-stationary modality imbalance in multi-modal continual action quality assessment. It couples Memory-Guided Bridging Imputation, which retrieves structurally aligned exemplars and applies a residual correction to reconstruct missing modalities, with Modality-Aware Replay Optimization, which curates and prioritizes replay samples to counter distribution drift. Across RG, Fis-V, and FS1000, BriMA yields consistent improvements in SRCC and reductions in MSE/RL2 under varying missing-modality rates, while maintaining efficient computation. The framework demonstrates practical robustness for real-world deployments where sensor failures and annotation gaps create evolving modality availability. BriMA’s bridging-and-replay paradigm also shows promise for generalization to other multi-modal regression tasks beyond AQA.

Abstract

Action Quality Assessment (AQA) aims to score how well an action is performed and is widely used in sports analysis, rehabilitation assessment, and human skill evaluation. Multi-modal AQA has recently achieved strong progress by leveraging complementary visual and kinematic cues, yet real-world deployments often suffer from non-stationary modality imbalance, where certain modalities become missing or intermittently available due to sensor failures or annotation gaps. Existing continual AQA methods overlook this issue and assume that all modalities remain complete and stable throughout training, which restricts their practicality. To address this challenge, we introduce Bridged Modality Adaptation (BriMA), an innovative approach to multi-modal continual AQA under modality-missing conditions. BriMA consists of a memory-guided bridging imputation module that reconstructs missing modalities using both task-agnostic and task-specific representations, and a modality-aware replay mechanism that prioritizes informative samples based on modality distortion and distribution drift. Experiments on three representative multi-modal AQA datasets (RG, Fis-V, and FS1000) show that BriMA consistently improves performance under different modality-missing conditions, achieving 6-8% higher correlation and 12-15% lower error on average. These results demonstrate a step toward robust multi-modal AQA systems under real-world deployment constraints.

Paper Structure (18 sections, 12 equations, 9 figures, 9 tables, 1 algorithm)

Figures (9)

  • Figure 1: Our motivation: non-stationary modality imbalance (a) significantly challenges continual AQA performance (b).
  • Figure 2: Overview of BriMA. At each session, incomplete multi-modal inputs are encoded and completed via memory-guided bridging imputation, then scored by the AQA model. The modality-aware replay optimization selects and updates representative prototypes in the memory bank to maintain consistency in continual adaptation across tasks.
  • Figure 3: Results of different candidates.
  • Figure 4: Results of different imputation strategies.
  • Figure 5: Task-wise SRCC performance comparison.
  • ...and 4 more figures