Multimodal Learning with Uncertainty Quantification based on Discounted Belief Fusion
Grigor Bezirganyan, Sana Sellami, Laure Berti-Équille, Sébastien Fournier
TL;DR
Multimodal learning often faces uncertainty from noise and conflicts between modalities, which can lead to overconfident incorrect predictions. The authors propose Discounted Belief Fusion (DBF), an order-invariant fusion scheme built on subjective logic. DBF uses a conflict-based discounting mechanism that reallocates mass toward uncertainty when modalities conflict. It relies on generalized belief averaging to scale to $V$ modalities and on a per-sample discounting factor $\eta^v$ derived from a conflict-controlled agreement matrix, with a hyperparameter $\lambda$ controlling the discount strength. The authors also train multimodal evidential networks that emit Dirichlet-based evidence and uncertainties via an exponential output activation, using a three-term loss that includes KL-divergence regularization and a consistency term. Experiments on five benchmarks show improved separation between the uncertainties of conflicting and non-conflicting samples while maintaining accuracy, which indicates gains in reliability and interpretability for safety-critical multimodal tasks.
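To make the idea of reallocating mass toward uncertainty concrete, here is a minimal sketch. It assumes the standard evidential mapping from subjective logic (Dirichlet parameters $\alpha_k = e_k + 1$, belief $b_k = e_k / S$, uncertainty $u = K / S$ with $S = \sum_k \alpha_k$) and the classical discounting operation, in which a factor $\eta \in [0, 1]$ scales the beliefs and the removed mass goes to uncertainty. The function names are hypothetical, and the way DBF actually computes $\eta^v$ from the agreement matrix and $\lambda$ is not reproduced here.

```python
def opinion_from_evidence(evidence):
    """Map non-negative per-class evidence to (beliefs, uncertainty).

    Assumes Dirichlet parameters alpha_k = e_k + 1 (standard evidential setup).
    """
    num_classes = len(evidence)
    strength = sum(evidence) + num_classes  # Dirichlet strength S = sum(alpha)
    beliefs = [e / strength for e in evidence]
    uncertainty = num_classes / strength
    return beliefs, uncertainty


def discount(beliefs, uncertainty, eta):
    """Classical discounting: keep a fraction eta of each belief mass.

    eta = 1 leaves the opinion unchanged; eta = 0 yields full uncertainty.
    (Assumed convention; the paper's eta^v may be defined differently.)
    """
    if not 0.0 <= eta <= 1.0:
        raise ValueError("eta must lie in [0, 1]")
    discounted = [eta * b for b in beliefs]
    # Beliefs and uncertainty must still sum to one after discounting.
    new_uncertainty = 1.0 - eta * (1.0 - uncertainty)
    return discounted, new_uncertainty


if __name__ == "__main__":
    b, u = opinion_from_evidence([8.0, 0.0, 0.0])
    print(b, u)
    print(discount(b, u, eta=0.5))
```

With this convention, a modality judged unreliable (small $\eta$) contributes little belief and mostly uncertainty to any subsequent fusion step.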
Abstract
Multimodal AI models are increasingly used in fields like healthcare, finance, and autonomous driving, where information is drawn from multiple sources or modalities such as images, text, audio, and video. However, effectively managing uncertainty, whether it arises from noise, insufficient evidence, or conflicts between modalities, is crucial for reliable decision-making. Current uncertainty-aware machine learning methods that rely on, for example, evidence averaging or evidence accumulation underestimate uncertainty in high-conflict scenarios. Moreover, the state-of-the-art evidence averaging strategy is not order-invariant and fails to scale to multiple modalities. To address these challenges, we propose a novel multimodal learning method with order-invariant evidence fusion and introduce a conflict-based discounting mechanism that reallocates uncertain mass when unreliable modalities are detected. We provide both theoretical analysis and experimental validation, demonstrating that, unlike previous work, the proposed approach effectively distinguishes between conflicting and non-conflicting samples based on the provided uncertainty estimates and outperforms previous models in uncertainty-based conflict detection.
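The order-invariance issue can be illustrated with a deliberately simplified example. The sketch below operates on raw evidence vectors rather than on subjective-logic opinions, so it is not the averaging operator of the prior work or of the proposed method; the function names are hypothetical. It only shows why chaining a two-input average over more than two modalities depends on the order of the inputs, whereas a single joint average does not.

```python
def pairwise_average(evidences):
    """Fuse modalities by repeatedly averaging the running result with the next one."""
    fused = list(evidences[0])
    for evidence in evidences[1:]:
        fused = [(f + e) / 2 for f, e in zip(fused, evidence)]
    return fused


def joint_average(evidences):
    """Fuse all modalities at once with a single arithmetic mean per class."""
    count = len(evidences)
    return [sum(per_class) / count for per_class in zip(*evidences)]


if __name__ == "__main__":
    a = [8.0, 0.0]  # modality strongly supporting class 0
    b = [0.0, 8.0]  # modality strongly supporting class 1
    c = [0.0, 0.0]  # uninformative modality

    # Chained averaging weights later inputs more heavily, so order matters.
    print(pairwise_average([a, b, c]))
    print(pairwise_average([c, a, b]))

    # The joint mean gives the same result for any ordering.
    print(joint_average([a, b, c]))
    print(joint_average([c, a, b]))
```

In the chained version, the last modality always receives weight 1/2 while earlier ones are progressively down-weighted, which is one way a two-input operator can fail to generalize to $V$ modalities.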
