On Catastrophic Inheritance of Large Foundation Models

Hao Chen, Bhiksha Raj, Xing Xie, Jindong Wang

TL;DR

This position paper discusses the challenges behind catastrophic inheritance and proposes UIM, a framework to Understand the catastrophic inheritance of LFMs from both pre-training and downstream adaptation, Interpret its implications on downstream tasks, and Mitigate it.

Abstract

Large foundation models (LFMs) achieve remarkable performance. Yet serious concerns have been raised about their opaque and uninterpreted potential, not only in machine learning but also in various other disciplines. In this position paper, we identify a neglected issue deeply rooted in LFMs: Catastrophic Inheritance, which describes the weaknesses and limitations that LFMs inherit from biased large-scale pre-training data and carry into their behavior on downstream tasks. Such data includes samples that are corrupted, long-tailed, noisy, or out-of-distribution, to name a few. This inheritance can potentially cause catastrophes in downstream applications, such as bias, lack of generalization, deteriorated performance, security vulnerability, privacy leakage, and value misalignment. We discuss the challenges behind this issue and propose UIM, a framework to Understand the catastrophic inheritance of LFMs from both pre-training and downstream adaptation, Interpret the implications of catastrophic inheritance on downstream tasks, and Mitigate it. UIM aims to unite the machine learning and social sciences communities for more responsible and promising AI development and deployment.

Paper Structure (13 sections, 1 equation, 2 figures, 2 tables)

Figures (2)

  • Figure 1: Illustration of catastrophic inheritance. Large foundation models pre-trained on biased datasets may cause significant harmful consequences for various downstream tasks (cf. Table \ref{tb-examples}).
  • Figure 2: The UIM framework, which addresses catastrophic inheritance through understanding, interpretation, and mitigation.

Theorems & Definitions (1)

  • Definition 1