Explaining Generalization Power of a DNN Using Interactive Concepts

Huilin Zhou, Hao Zhang, Huiqi Deng, Dongrui Liu, Wen Shen, Shih-Han Chan, Quanshi Zhang

TL;DR

This work tackles the lack of a formal notion of concepts inside DNNs by adopting a sparse interactive-concept framework based on $I(S|\boldsymbol{x})$, a Harsanyi interaction that represents an AND relationship among input variables. It shows that only a small subset $\Omega_{salient}$ of all possible interactions matters, and that low-order concepts generalize better while high-order concepts are prone to inconsistency and over-fitting, with their variance growing roughly exponentially with order. The authors provide both empirical and analytic evidence: (i) distributions of concepts across orders, (ii) a distribution-based measure of generalization for $m$-order concepts, (iii) an inconsistency ratio under adversarial perturbations that rises with order, and (iv) a Taylor-expansion analytic argument coupled with Gaussian noise to explain high-order instability. They also demonstrate detouring learning dynamics where high-order concepts are learned after lower-order ones and often as mixtures, especially in over-fitted models trained with label noise $\rho$, offering a principled lens to understand and potentially improve generalization. Overall, the paper advances a concept-powered explanation of DNN generalization with implications for model evaluation and training strategies.
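To make the AND-style interaction concrete, here is a minimal sketch that computes Harsanyi interactions by the standard Harsanyi-dividend formula, $I(S|\boldsymbol{x}) = \sum_{T \subseteq S} (-1)^{|S|-|T|}\, v(\boldsymbol{x}_T)$, where $v(\boldsymbol{x}_T)$ is the model output when only the variables in $T$ are kept. The function `toy_model` is a hypothetical stand-in for a DNN on three variables, and the helper names are illustrative assumptions rather than the authors' code.

```python
from itertools import combinations


def subsets(items):
    """Yield every subset of `items` as a frozenset, smallest first."""
    items = tuple(items)
    for size in range(len(items) + 1):
        for combo in combinations(items, size):
            yield frozenset(combo)


def harsanyi_interaction(v, S):
    """Harsanyi dividend: sum over T subset of S of (-1)^(|S|-|T|) * v(T).

    `v(T)` is the model output when only the variables in T are kept and
    every other variable is masked.
    """
    S = frozenset(S)
    return sum((-1) ** (len(S) - len(T)) * v(T) for T in subsets(S))


def toy_model(T):
    """Hypothetical stand-in for a DNN output v(x_T) over n = 3 variables.

    Contributes 2.0 only when variables 0 AND 1 are both present, plus 0.5
    whenever variable 2 is present.
    """
    return 2.0 * ({0, 1} <= T) + 0.5 * (2 in T)


N = (0, 1, 2)
interactions = {S: harsanyi_interaction(toy_model, S) for S in subsets(N)}

# Keep only the non-zero interactions: 2 out of 2^3 = 8 candidate subsets.
salient = {tuple(sorted(S)): val for S, val in interactions.items() if abs(val) > 1e-12}
print(salient)  # {(2,): 0.5, (0, 1): 2.0}
```

In this toy case only two of the eight subsets carry a non-zero interaction, which mirrors (on a very small scale) the sparsity the summary describes. Enumerating all $2^n$ subsets is only feasible for small $n$.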

Abstract

This paper explains the generalization power of a deep neural network (DNN) from the perspective of interactions. Although there is no universally accepted definition of the concepts encoded by a DNN, the sparsity of interactions in a DNN has been proved, i.e., the output score of a DNN can be well explained by a small number of interactions between input variables. In this way, to some extent, we can consider such interactions as interactive concepts encoded by the DNN. Therefore, in this paper, we derive an analytic explanation of the inconsistency of concepts of different complexities. This may shed new light on using the generalization power of concepts to explain the generalization power of the entire DNN. Besides, we discover that a DNN with stronger generalization power usually learns simple concepts more quickly and encodes fewer complex concepts. We also discover the detouring dynamics of learning complex concepts, which explain both the high learning difficulty and the low generalization power of complex concepts. The code will be released when the paper is accepted.
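As a toy illustration of why instability can grow with complexity, consider an order-$m$ AND-like term written as a product of $m$ perturbed factors. This is not the paper's own derivation: the product form, the unit baseline, and the i.i.d. Gaussian noise $\epsilon_i \sim \mathcal{N}(0, \sigma^2)$ are simplifying assumptions made here.

$$J_m = \prod_{i=1}^{m} (1 + \epsilon_i), \qquad \mathbb{E}[J_m] = \prod_{i=1}^{m} \mathbb{E}[1 + \epsilon_i] = 1,$$

$$\mathbb{E}[J_m^2] = \prod_{i=1}^{m} \mathbb{E}\big[(1 + \epsilon_i)^2\big] = (1 + \sigma^2)^m, \qquad \operatorname{Var}[J_m] = (1 + \sigma^2)^m - 1.$$

Under these assumptions the variance grows exponentially with the order $m$. For example, with $\sigma = 0.5$ it is $0.25$ at $m = 1$ and about $8.31$ at $m = 10$.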

Paper Structure

This paper contains 11 sections, 3 theorems, 6 equations, and 9 figures.

Key Result

Theorem 1

An input sample $\boldsymbol{x}$ with $n$ input variables, indexed by the set $N$, can be masked in $2^n$ ways by choosing different subsets $T \subseteq N$. For any randomly masked sample $\boldsymbol{x}_T$, Ren et al. (2021) have proved that

Figures (9)

  • Figure 1: Interactions encoded by the DNN. Each interaction $S$ represents an AND relationship among a set of input variables (e.g., image regions). Masking any patch in $S_{\text{face}}$ deactivates the interaction, making $I(S_{\text{face}} \vert \boldsymbol{x})=0$.
  • Figure 2: Interactive concepts encoded by a DNN are usually very sparse. This phenomenon exists in various DNNs trained on different datasets. We sort the interactive concepts in decreasing order of interaction strength $\vert I(S \vert \boldsymbol{x}) \vert$.
  • Figure 3: (a) Histogram of salient concepts of different orders, $\vert \Omega_{\tau\text{-salient}} ^{(s)} \vert$. (b) Visualization of salient concepts of different orders. Salient concepts usually consist of image patches that contain discriminative parts of the object.
  • Figure 4: Average similarity between interactive concepts from training samples and those extracted from testing samples.
  • Figure 5: Comparison of the ratio $r^{(m)}$ of inconsistent concepts over different orders. High-order interactive concepts are usually more likely to have inconsistent effects on noisy data, which verifies Theorem \ref{theorem1}.
  • ...and 4 more figures
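The per-order histogram described for Figure 3(a) can be sketched as follows. This is a minimal sketch that assumes a concept $S$ counts as salient when $\vert I(S \vert \boldsymbol{x}) \vert \ge \tau$; the exact threshold rule is not given in this summary. The interaction values and the function name `salient_order_histogram` are hypothetical, chosen only for illustration.

```python
from collections import Counter


def salient_order_histogram(interactions, tau):
    """Count concepts S with |I(S|x)| >= tau, grouped by their order |S|.

    `interactions` maps each concept (a frozenset of variable indices)
    to its interaction value. The >= tau rule is an assumption.
    """
    return Counter(len(S) for S, value in interactions.items() if abs(value) >= tau)


# Hypothetical interaction values for a single input sample.
interactions = {
    frozenset({0}): 0.90,
    frozenset({1}): -0.40,
    frozenset({0, 1}): 0.75,
    frozenset({1, 2}): 0.02,
    frozenset({0, 1, 2}): -0.30,
    frozenset({0, 2, 3}): 0.01,
}

histogram = salient_order_histogram(interactions, tau=0.25)
print(sorted(histogram.items()))  # [(1, 2), (2, 1), (3, 1)]
```

Here two order-1 concepts, one order-2 concept, and one order-3 concept pass the threshold, while the two near-zero interactions are discarded.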

Theorems & Definitions (3)

  • Theorem 1
  • Lemma 1
  • Theorem 2