Upper Bound of Bayesian Generalization Error in Partial Concept Bottleneck Model (CBM): Partial CBM outperforms naive CBM

Naoki Hayashi; Yoshihide Sawada

TL;DR

The result indicates that the structure of partially observed concepts decreases the Bayesian generalization error compared with that of CBM (fully observed concepts).

Abstract

Concept Bottleneck Model (CBM) is a method for explaining neural networks. In CBM, concepts that correspond to the reasons for the outputs are inserted in the last intermediate layer as observed values. It is expected that we can interpret the relationship between the output and the concepts in a manner similar to linear regression. However, this interpretation requires observing all concepts and decreases the generalization performance of neural networks. Partial CBM (PCBM), which uses partially observed concepts, has been devised to resolve these difficulties. Although some numerical experiments suggest that the generalization performance of PCBMs is almost as high as that of the original neural networks, the theoretical behavior of its generalization error has not yet been clarified, since PCBM is a singular statistical model. In this paper, we reveal the Bayesian generalization error in PCBM with a three-layered and linear architecture. The result indicates that the structure of partially observed concepts decreases the Bayesian generalization error compared with that of CBM (fully observed concepts).

Paper Structure (9 sections, 4 theorems, 42 equations, 1 figure)

Introduction
Related Works
Preliminaries
Framework of Bayesian Inference
Singular Learning Theory
Main Theorem
Discussion
Conclusion
Proof of Main Theorem

Key Result

Theorem 3.1

Let $\lambda$ be the RLCT with regard to $K(w)$. The free energy $F_n$ and the Bayesian generalization error $G_n$ satisfy
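The displayed equations of this theorem are not included above. As a hedged sketch, the standard asymptotic form of this result in singular learning theory (due to Watanabe) is given below; it may differ in detail from the paper's exact statement. The symbols $n$ (sample size), $S_n$ (empirical entropy of the true distribution), and $\mathbb{E}[\cdot]$ (expectation over training samples) are assumptions not defined in this summary.

$$
F_n = n S_n + \lambda \log n + O_p(\log \log n), \qquad
\mathbb{E}[G_n] = \frac{\lambda}{n} + o\!\left(\frac{1}{n}\right).
$$

Under this form, a smaller RLCT $\lambda$ means a smaller leading term of the expected generalization error, which is why an upper bound on $\lambda$ for PCBM translates into an upper bound on its Bayesian generalization error.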

Figures (1)

Figure 1: Schematics of CBM and PCBM architectures.

Theorems & Definitions (8)

Definition 3.1: RLCT
Theorem 3.1
Definition 3.2: RLCT of Reduced Rank Regression
Theorem 3.2: Aoyagi
Definition 4.1: RLCT of PCBM
Theorem 4.1: Main Theorem
Theorem 4.2: Bayesian Generalization Error in PCBM
proof
