SoK: The Security-Safety Continuum of Multimodal Foundation Models through Information Flow and Global Game-Theoretic Analysis of Asymmetric Threats

Ruoxi Sun, Jiamin Chang, Hammond Pearce, Chaowei Xiao, Bo Li, Qi Wu, Surya Nepal, Minhui Xue

TL;DR

This paper addresses the intertwined safety and security challenges of multimodal foundation models by introducing an information-theoretic SoK that maps information flow to channel concepts. It develops a deterministic minimax defense framework and a Defense Coverage Index (DCI) to evaluate 15 defenses against a broad taxonomy of model- and system-level threats, framed around six information flows. The study shows that system-level bandwidth constraints and architectural compartmentalization provide more general and robust protection than model-only defenses, and it formalizes a self-destructive circuit-breaker as a last-resort safeguard. Overall, the work establishes principled foundations for analyzing MFM vulnerabilities and guiding future defenses, highlighting the need for cross-cutting, architecture-aware protections in high-stakes deployments.

Abstract

Multimodal foundation models (MFMs) integrate diverse data modalities to support complex and wide-ranging tasks. However, this integration also introduces distinct safety and security challenges. In this paper, we unify the concepts of safety and security in the context of MFMs by identifying critical threats that arise from both model behavior and system-level interactions. We propose a taxonomy grounded in information theory, evaluating risks through the concepts of channel capacity, signal, noise, and bandwidth. This perspective provides a principled way to analyze how information flows through MFMs and how vulnerabilities can emerge across modalities. Building on this foundation, we introduce a deterministic minimax formulation to analyze defense mechanisms and expose structural vulnerabilities in multimodal systems. Our framework projects attacks onto the noise, signal, and bandwidth axes, collapsing the defense search space and mitigating defender asymmetry. Across 15 defenses, we find that system-level bandwidth and behavior constraints generalize substantially better than brittle model-only methods. Finally, we formalize an MFM "self-destruction threshold" that specifies when termination should be triggered, providing a concrete activation rule for circuit-breaker safeguards within multimodal systems.
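
To make the deterministic minimax idea concrete, the following is a minimal sketch of how a defender might pick the defense with the smallest worst-case risk once attacks are projected onto the noise, signal, and bandwidth axes. It is not the paper's formulation: the defense names, the `RISK` scores, and the function `minimax_defense` are all hypothetical and chosen only for illustration.

```python
from typing import Dict, List, Tuple

# The three axes named in the abstract.
AXES: List[str] = ["noise", "signal", "bandwidth"]

# Hypothetical residual-risk scores in [0, 1] (lower is better for the
# defender). Rows are defenses, columns follow AXES. These values are
# illustrative assumptions, not results from the paper.
RISK: Dict[str, List[float]] = {
    "model_only_filter": [0.2, 0.7, 0.9],
    "bandwidth_limit": [0.5, 0.4, 0.3],
    "compartmentalization": [0.4, 0.3, 0.35],
}


def minimax_defense(
    risk: Dict[str, List[float]], axes: List[str]
) -> Tuple[str, str, float]:
    """Return (defense, worst_axis, worst_risk) minimizing worst-case risk.

    Inner step: the attacker picks the axis that maximizes risk against
    each defense. Outer step: the defender picks the defense whose
    worst-case risk is smallest.
    """
    worst: Dict[str, Tuple[str, float]] = {}
    for name, row in risk.items():
        # Attacker's best response against this defense.
        idx = max(range(len(row)), key=row.__getitem__)
        worst[name] = (axes[idx], row[idx])
    # Defender's choice: smallest worst-case risk.
    best = min(worst, key=lambda n: worst[n][1])
    return best, worst[best][0], worst[best][1]


if __name__ == "__main__":
    defense, axis, value = minimax_defense(RISK, AXES)
    print(f"minimax defense: {defense} (worst axis: {axis}, risk: {value})")
```

Because the attacker's choice is reduced to three axes rather than an open-ended list of individual attacks, the inner maximization is a small enumeration, which is one way to read the abstract's claim about collapsing the defense search space.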

Paper Structure

This paper contains 40 sections, 16 equations, 7 figures, 9 tables.

Figures (7)

  • Figure 1: We propose an information-theoretic framework that unifies safety and security in MFMs, use it to categorize threats at the model and system levels, and analyze defenses as a minimax game between attackers and defenders, revealing critical gaps in current research.
  • Figure 2: An illustration of information flows in MFM systems (represented by arrows).
  • Figure 3: The taxonomy of threats at the model level.
  • Figure 4: The taxonomy of threats at the system level.
  • Figure 5: An illustration of multimodal learning.
  • ...and 2 more figures