MCE: Towards a General Framework for Handling Missing Modalities under Imbalanced Missing Rates

Binyu Zhao; Wei Zhang; Zhaonian Zou

TL;DR

This work tackles imbalanced missing rates in multi-modal learning, where modalities that are missing more often are both under-sampled and under-learned. It introduces Modality Capability Enhancement (MCE), a two-component framework: Learning Capability Enhancement (LCE) balances learning dynamics with dataset-level and batch-level incentives, and Representation Capability Enhancement (RCE) improves feature semantics via subset prediction and cross-modal completion. A Shapley-value-based mechanism drives the adaptive incentives, and a Transformer-based cross-modal reconstruction module complements it, yielding robust representations across arbitrary modality subsets. Across four benchmarks, MCE consistently outperforms state-of-the-art baselines and remains resilient under severe missingness, offering a general approach to incomplete multi-modal data.
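To make the Shapley-value idea concrete, the sketch below computes the exact Shapley value of each modality from a utility defined over modality subsets. It shows only the standard Shapley definition, not the paper's incentive mechanism, whose details are not given here. The function name `shapley_values`, the modality names, and the accuracy numbers are all hypothetical and chosen purely for illustration.

```python
from itertools import combinations
from math import factorial


def shapley_values(modalities, utility):
    """Exact Shapley value of each modality for a utility over modality subsets.

    `utility` maps a frozenset of modality names to a score. This is the
    textbook definition, not the paper's specific incentive computation.
    """
    n = len(modalities)
    values = {}
    for m in modalities:
        others = [x for x in modalities if x != m]
        total = 0.0
        for k in range(n):
            # Weight of a coalition of size k that does not contain m.
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for subset in combinations(others, k):
                s = frozenset(subset)
                # Marginal gain from adding modality m to coalition s.
                total += weight * (utility(s | {m}) - utility(s))
        values[m] = total
    return values


# Hypothetical accuracies per available-modality subset (made-up numbers).
toy_accuracy = {
    frozenset(): 0.50,
    frozenset({"audio"}): 0.60,
    frozenset({"video"}): 0.70,
    frozenset({"audio", "video"}): 0.80,
}

phi = shapley_values(["audio", "video"], toy_accuracy.__getitem__)
```

In this toy setting the values sum to the gap between the full-set utility and the empty-set utility, which is the efficiency property that makes Shapley values a natural way to attribute a joint prediction to individual modalities.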

Abstract

Multi-modal learning has made significant advances across diverse pattern recognition applications. However, handling missing modalities, especially under imbalanced missing rates, remains a major challenge. This imbalance triggers a vicious cycle: modalities with higher missing rates receive fewer updates, leading to inconsistent learning progress and representational degradation that further diminishes their contribution. Existing methods typically focus on global dataset-level balancing, often overlooking critical sample-level variations in modality utility and the underlying issue of degraded feature quality. We propose Modality Capability Enhancement (MCE) to tackle these limitations. MCE includes two synergistic components: i) Learning Capability Enhancement (LCE), which introduces multi-level factors to dynamically balance modality-specific learning progress, and ii) Representation Capability Enhancement (RCE), which improves feature semantics and robustness through subset prediction and cross-modal completion tasks. Comprehensive evaluations on four multi-modal benchmarks show that MCE consistently outperforms state-of-the-art methods under various missing configurations. The final published version is now available at https://doi.org/10.1016/j.patcog.2025.112591. Our code is available at https://github.com/byzhaoAI/MCE.
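The first step of the vicious cycle described above, where a modality with a higher missing rate receives fewer updates, can be illustrated with a small simulation. This is a minimal sketch under simplifying assumptions: each modality is dropped independently per sample, and every sample in which a modality is present counts as one update for it. The function name `count_modality_updates` and the missing rates are hypothetical, not taken from the paper.

```python
import random


def count_modality_updates(missing_rates, num_samples, seed=0):
    """Count how many samples contribute an update to each modality.

    Assumes each modality is missing independently with its own rate;
    a modality receives an update only when it is present in a sample.
    """
    rng = random.Random(seed)
    counts = {m: 0 for m in missing_rates}
    for _ in range(num_samples):
        for m, rate in missing_rates.items():
            if rng.random() >= rate:  # modality is present in this sample
                counts[m] += 1
    return counts


# Hypothetical imbalanced missing rates: audio is missing far more often.
rates = {"audio": 0.8, "video": 0.2}
counts = count_modality_updates(rates, num_samples=10_000)
```

With these rates, the audio encoder sees roughly a quarter as many samples as the video encoder, which is the kind of gap in learning progress that the balancing factors in LCE are meant to counteract.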
