Towards an Incremental Unified Multimodal Anomaly Detection: Augmenting Multimodal Denoising From an Information Bottleneck Perspective

Kaifang Long; Lianbo Ma; Jiaqi Liu; Liming Liu; Guoyang Xie

TL;DR

IB-IUMAD is a denoising framework that combines a Mamba decoder with an information bottleneck fusion module: the decoder disentangles inter-object feature coupling, and the fusion module filters redundant features out of the fused features, explicitly preserving discriminative information.

Abstract

Incremental unified multimodal anomaly detection aims to give a single model the ability to detect anomalies across all categories while learning incrementally to accommodate emerging objects and categories. Central to this goal is resolving catastrophic forgetting, that is, acquiring new knowledge while preserving previously learned knowledge. Despite some efforts to address this problem, a key oversight persists: the potential impact of spurious and redundant features on catastrophic forgetting has been ignored. In this paper, we examine the negative effect of spurious and redundant features on forgetting in incremental unified frameworks, and show that, under similar conditions, a multimodal framework built by naively aggregating unimodal architectures is more prone to forgetting. To address this issue, we introduce a novel denoising framework called IB-IUMAD, which exploits the complementary benefits of a Mamba decoder and an information bottleneck fusion module: the former disentangles inter-object feature coupling, preventing spurious feature interference between objects; the latter filters redundant features out of the fused features, explicitly preserving discriminative information. Theoretical analyses and experiments on the MVTec 3D-AD and Eyecandies datasets demonstrate the effectiveness and competitive performance of IB-IUMAD.
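To make the idea of filtering redundant features from a fused representation concrete, the following is a minimal sketch of a generic variational information bottleneck applied to two concatenated modality features. It is an illustration under stated assumptions, not the paper's information bottleneck fusion module: the function name `ib_fuse`, the linear projections `w_mu` and `w_logvar`, and the Gaussian prior are all hypothetical choices made here for the example.

```python
import numpy as np

def ib_fuse(feat_a, feat_b, w_mu, w_logvar, rng):
    """Hypothetical fusion step with a stochastic Gaussian bottleneck.

    feat_a, feat_b: (batch, d_a) and (batch, d_b) features from two modalities.
    w_mu, w_logvar: (d_a + d_b, d_z) projections (assumed linear for simplicity).
    Returns the compressed code z and the KL compression penalty.
    """
    fused = np.concatenate([feat_a, feat_b], axis=-1)  # naive concatenation
    mu = fused @ w_mu                                  # mean of q(z | fused)
    logvar = fused @ w_logvar                          # log-variance of q(z | fused)
    eps = rng.standard_normal(mu.shape)
    z = mu + np.exp(0.5 * logvar) * eps                # reparameterized sample
    # KL(N(mu, sigma^2) || N(0, I)), summed over latent dims, averaged over batch.
    kl = 0.5 * np.sum(np.exp(logvar) + mu**2 - 1.0 - logvar, axis=-1).mean()
    return z, kl

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    feat_a = rng.standard_normal((4, 3))               # stand-in for one modality
    feat_b = rng.standard_normal((4, 5))               # stand-in for another modality
    w_mu = 0.1 * rng.standard_normal((8, 2))
    w_logvar = np.zeros((8, 2))
    z, kl = ib_fuse(feat_a, feat_b, w_mu, w_logvar, rng)
    print(z.shape, float(kl))
```

In a training loop, a penalty of this kind would typically be added to the task loss with a weight (often written as beta), so that the code `z` keeps only as much of the fused input as the task needs. How IB-IUMAD actually formulates and optimizes its bottleneck is described in the paper's method section, not here.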

Paper Structure (16 sections, 11 equations, 4 figures, 7 tables)

Introduction
Spurious and Redundant Features Impact
Problem Formulation
Empirical Study Settings
Empirical Observations
Method
Overview
Mamba Decoder
Information Bottleneck Fusion Module
Theoretical Analysis
Evaluation
Dataset and Experimental Setting
Quantitative Evaluation
Ablation Study
Related Work
...and 1 more section

Figures (4)

Figure 1: Top: performance on the IUMAD task on the MVTec 3D-AD dataset, where significant catastrophic forgetting occurs. Bottom: comparison of our incremental unified multimodal framework with previous paradigms.
Figure 2: The impact of spurious and redundant features on catastrophic forgetting in incremental unified frameworks.
Figure 3: The overall framework of IB-IUMAD. The Mamba decoder and the information bottleneck fusion module (IBFM) are the core designs of this work; they aim to mitigate the effects of spurious and redundant features on catastrophic forgetting.
Figure 4: Visualization results on MVTec 3D-AD.
