Credibility-Aware Multi-Modal Fusion Using Probabilistic Circuits

Sahil Sidheekh; Pranuthi Tenali; Saurabh Mathur; Erik Blasch; Kristian Kersting; Sriraam Natarajan

TL;DR

The paper tackles credibility-aware late fusion for noisy, multi-modal data by modeling the joint distribution of unimodal predictions and the target with Probabilistic Circuits (PCs). It defines a principled credibility measure based on divergence and conditional entropy, and introduces two fusion variants: Direct-PC (DPC) and Credibility-Weighted Mean (CWM). The authors establish that PCs enable tractable inference for predictive and credibility queries and demonstrate competitive performance across multiple datasets (AV-MNIST, CUB, NYUD, SUNRGBD) while providing reliable modality credibility estimates. The work offers a principled, robust, and scalable approach to multi-modal fusion with explicit uncertainty and source reliability considerations, with potential impact on safety-critical applications.
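The idea behind a credibility-weighted mean can be sketched in a few lines. This is a minimal illustration, not the paper's exact CWM formulation: it assumes each modality already has a non-negative credibility score (in the paper these come from inference queries over the PC) and simply normalizes the scores into weights. The function name `credibility_weighted_mean` and the example numbers are hypothetical.

```python
from typing import List, Sequence


def credibility_weighted_mean(
    preds: Sequence[Sequence[float]],
    credibilities: Sequence[float],
) -> List[float]:
    """Fuse unimodal predictive distributions with credibility weights.

    preds[i][k] is modality i's predicted probability of class k.
    credibilities[i] is an assumed non-negative credibility score for
    modality i; how it is obtained is outside this sketch.
    """
    if not preds or len(preds) != len(credibilities):
        raise ValueError("need one credibility score per modality")
    total = sum(credibilities)
    if total <= 0:
        raise ValueError("credibility scores must sum to a positive value")

    # Normalize the scores so the weights sum to 1.
    weights = [c / total for c in credibilities]
    n_classes = len(preds[0])

    # Weighted average of the per-modality class probabilities.
    return [
        sum(w * p[k] for w, p in zip(weights, preds))
        for k in range(n_classes)
    ]


# Hypothetical example: a confident modality trusted three times as much
# as an uninformative one.
fused = credibility_weighted_mean([[0.9, 0.1], [0.5, 0.5]], [3.0, 1.0])
```

Because the weights form a convex combination, the fused vector remains a valid probability distribution whenever each input is one.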

Abstract

We consider the problem of late multi-modal fusion for discriminative learning. Motivated by noisy, multi-source domains that require understanding the reliability of each data source, we explore the notion of credibility in the context of multi-modal fusion. We propose a combination function that uses probabilistic circuits (PCs) to combine predictive distributions over individual modalities. We also define a probabilistic measure to evaluate the credibility of each modality via inference queries over the PC. Our experimental evaluation demonstrates that our fusion method can reliably infer credibility while maintaining competitive performance with the state-of-the-art.

Paper Structure (15 sections, 4 theorems, 22 equations, 4 figures, 5 tables)

Introduction
Background
Multi-Modal Fusion
Credibility
Probabilistic Circuits (PCs)
Multimodal fusion via PCs
PCs as Combination Functions
Empirical Evaluation
Performance Evaluation
Credibility Evaluation
Robustness to Noise
Conclusion
Theorems and Proofs
Implementation Details
Experimental Setup

Key Result

Theorem 3.1

The expected credibility $\mathcal{C}^{j}$ of a modality $j$ in predicting the target $Y$ under a Marginal Dominant distribution is lower bounded by the negative of the conditional entropy $(\mathbb{H})$ of the unimodal predictive distribution of modality $j$ over $Y$, given the predictive distribu
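The bound is stated in terms of a conditional entropy. As a reminder of that quantity only, here is a minimal sketch that computes $\mathbb{H}(A \mid B)$ for two discrete variables from a joint probability table. It does not reproduce the theorem or the paper's credibility computation; the function name `conditional_entropy` and the example tables are hypothetical.

```python
import math
from typing import Sequence


def conditional_entropy(joint: Sequence[Sequence[float]]) -> float:
    """Return H(A | B) in nats, where joint[a][b] = P(A=a, B=b)."""
    n_b = len(joint[0])
    h = 0.0
    for b in range(n_b):
        # Marginal P(B=b), obtained by summing the joint over A.
        p_b = sum(row[b] for row in joint)
        for row in joint:
            p_ab = row[b]
            if p_ab > 0:
                # Accumulate -P(a, b) * log P(a | b); zero cells add nothing.
                h -= p_ab * math.log(p_ab / p_b)
    return h


# Hypothetical tables: independent fair coins versus perfectly coupled ones.
h_independent = conditional_entropy([[0.25, 0.25], [0.25, 0.25]])
h_coupled = conditional_entropy([[0.5, 0.0], [0.0, 0.5]])
```

When $B$ determines $A$ the conditional entropy is zero, and when the two are independent it equals the entropy of $A$, which is the range the two example tables span.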

Figures (4)

Figure 1: Model Diagram for our proposed PC-based fusion method. Each input modality $\mathbf{X}_i$ is processed by a unimodal predictor $\mathcal{M}_{\phi_i}$ to get the corresponding predictive distribution $\mathbf{p}_i$ over the target $Y$. A probabilistic circuit $\theta$ is used to model the joint distribution over the unimodal predictive distributions and $Y$, and the final prediction is obtained by running an inference routine over it, governed by the form of fusion function employed ($\mathcal{M}_{\theta}$).
Figure 2: Mean Validation Relative Credibility obtained using a PC for the two modalities of the AV-MNIST dataset across training epochs. Varying degrees of noise (controlled by $\lambda$) are introduced into the audio modality.
Figure 3: Mean Test Relative Credibility output by a PC for the two modalities of the AV-MNIST dataset across varying degrees of noise (controlled by $\lambda$) introduced into each modality.
Figure 4: Robustness to Noise. Mean test performance of late fusion methods across varying degrees of noise.

Theorems & Definitions (10)

Definition 1
Definition 2
Theorem 3.1
proof
Theorem 3.2
proof
Theorem A.1
proof
Theorem A.2
proof
