BriMA: Bridged Modality Adaptation for Multi-Modal Continual Action Quality Assessment
Kanglei Zhou, Chang Li, Qingyi Pan, Liyuan Wang
TL;DR
BriMA introduces Bridged Modality Adaptation to tackle non-stationary modality imbalance in multi-modal continual action quality assessment. It couples Memory-Guided Bridging Imputation, which retrieves structurally aligned exemplars and applies a residual correction to reconstruct missing modalities, with Modality-Aware Replay Optimization, which curates and prioritizes replay samples to counter distribution drift. Across RG, Fis-V, and FS1000, BriMA yields consistent improvements in SRCC and reductions in MSE and RL2 under varying missing-modality rates, while keeping computation efficient. The framework is robust in real-world deployments where sensor failures and annotation gaps cause modality availability to change over time. BriMA’s bridging-and-replay paradigm also shows promise for generalization to multi-modal regression tasks beyond AQA.
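As a rough illustration of the retrieve-then-correct idea behind bridging imputation, the sketch below finds the memory exemplars whose available-modality features are most similar to the query, averages their stored features for the missing modality, and adds a residual correction. This is a minimal sketch under stated assumptions, not the authors' implementation: the names `bridge_impute`, `memory`, `k`, and `residual_fn` are hypothetical, cosine similarity stands in for whatever structural alignment BriMA actually uses, and the residual is passed in as a plain function, whereas in BriMA it would presumably be learned.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def bridge_impute(query_avail, memory, k=2, residual_fn=None):
    """Estimate the missing-modality feature for one sample.

    query_avail: feature vector of the modality that is present.
    memory:      list of (available_feat, missing_feat) exemplar pairs
                 (hypothetical layout, assumed for this sketch).
    k:           number of exemplars to retrieve.
    residual_fn: optional callable (query_avail, base) -> correction vector;
                 a stand-in for a learned residual module.
    """
    # Retrieve the k exemplars closest to the query in the available modality.
    ranked = sorted(memory, key=lambda pair: cosine(query_avail, pair[0]), reverse=True)
    top = ranked[:k]

    # Base estimate: mean of the retrieved exemplars' missing-modality features.
    dim = len(top[0][1])
    base = [sum(missing[i] for _, missing in top) / len(top) for i in range(dim)]

    if residual_fn is None:
        return base

    # Residual correction added on top of the retrieved estimate.
    correction = residual_fn(query_avail, base)
    return [b + c for b, c in zip(base, correction)]
```

Separating retrieval from correction means the retrieved average gives a usable estimate even before any residual module is available, and the correction only has to account for the gap between that estimate and the true feature.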
Abstract
Action Quality Assessment (AQA) aims to score how well an action is performed and is widely used in sports analysis, rehabilitation assessment, and human skill evaluation. Multi-modal AQA has recently achieved strong progress by leveraging complementary visual and kinematic cues, yet real-world deployments often suffer from non-stationary modality imbalance, where certain modalities become missing or intermittently available due to sensor failures or annotation gaps. Existing continual AQA methods overlook this issue and assume that all modalities remain complete and stable throughout training, which restricts their practicality. To address this challenge, we introduce Bridged Modality Adaptation (BriMA), an innovative approach to multi-modal continual AQA under modality-missing conditions. BriMA consists of a memory-guided bridging imputation module that reconstructs missing modalities using both task-agnostic and task-specific representations, and a modality-aware replay mechanism that prioritizes informative samples based on modality distortion and distribution drift. Experiments on three representative multi-modal AQA datasets (RG, Fis-V, and FS1000) show that BriMA consistently improves performance under different modality-missing conditions, achieving 6-8% higher correlation and 12-15% lower error on average. These results demonstrate a step toward robust multi-modal AQA systems under real-world deployment constraints.
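For readers unfamiliar with the reported metrics, the following is a minimal sketch of how correlation (SRCC) and error (MSE, RL2) could be computed from predicted and ground-truth scores. It assumes RL2 follows the relative L2 distance commonly used in the AQA literature, in which each error is normalized by the dataset's score range and the mean squared result is scaled by 100; the text here does not define it, so treat that formula, and the function names, as assumptions.

```python
import math


def average_ranks(values):
    """1-based ranks, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for t in range(i, j + 1):
            ranks[order[t]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation; assumes neither input is constant."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def srcc(pred, true):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(pred), average_ranks(true))


def mse(pred, true):
    """Mean squared error between predicted and true scores."""
    return sum((p - t) ** 2 for p, t in zip(pred, true)) / len(pred)


def rl2(pred, true, s_min, s_max):
    """Relative L2 distance (assumed definition), scaled by 100."""
    span = s_max - s_min
    return 100 * sum(((p - t) / span) ** 2 for p, t in zip(pred, true)) / len(pred)
```

Because SRCC depends only on rank order, it rewards getting the relative ordering of performances right, while MSE and RL2 penalize the magnitude of score errors; reporting both captures complementary aspects of quality.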
