Intrinsic Barriers to Explaining Deep Foundation Models

Zhen Tan; Huan Liu

TL;DR

This paper probes whether satisfactory explanations of deep foundation models are feasible and considers what the answer implies for how these powerful technologies should be verified and governed.

Abstract

Deep Foundation Models (DFMs) offer unprecedented capabilities, but their increasing complexity presents profound challenges to understanding their internal workings, a critical need for ensuring trust, safety, and accountability. As we grapple with explaining these systems, a fundamental question emerges: are the difficulties we face merely temporary hurdles, awaiting more sophisticated analytical techniques, or do they stem from intrinsic barriers deeply rooted in the nature of these large-scale models themselves? This paper examines this question by analyzing the fundamental characteristics of DFMs and scrutinizing the limitations that current explainability methods encounter when applied to them. We probe the feasibility of achieving satisfactory explanations and consider the implications for how we must approach the verification and governance of these powerful technologies.
