Searching for internal symbols underlying deep learning

Jung H. Lee; Sujith Vijayan

TL;DR

The authors hypothesize that DNNs can develop abstract codes that can augment their decision-making. They combine foundation segmentation models with unsupervised learning to extract internal codes and to identify ways in which abstract codes could make DL's decision-making more reliable and safer.

Abstract

Deep learning (DL) enables deep neural networks (DNNs) to learn complex tasks or rules automatically from given examples, without instructions or guiding principles. Because we do not engineer DNNs' functions, their decisions are extremely difficult to diagnose, and multiple lines of research have sought to explain the principles of their operation. Notably, one line of research suggests that DNNs may learn concepts, that is, high-level features that are recognizable to humans. In this study, we extend this line of research and hypothesize that DNNs can develop abstract codes that can be used to augment their decision-making. To test this hypothesis, we combine foundation segmentation models and unsupervised learning to extract internal codes and to identify potential uses of abstract codes for making DL's decision-making more reliable and safer.
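
The extraction pipeline named in the abstract and in the paper outline (ROI identification, hidden layer activity vectors, clustering) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the ROI mask is assumed to come from some segmentation model, the activity vector is assumed to be a per-channel average over the ROI, plain k-means stands in for the unspecified unsupervised learning step, and the names `roi_activity_vector` and `kmeans` are hypothetical.

```python
import random


def roi_activity_vector(feature_map, mask):
    """Average each channel of a feature map over the pixels inside an ROI.

    feature_map: list of channels, each an H x W list of floats.
    mask: H x W list of 0/1 values marking the ROI (assumed to come from
          a segmentation model; the pooling rule itself is an assumption).
    """
    n = sum(sum(row) for row in mask)
    if n == 0:
        raise ValueError("empty ROI mask")
    return [
        sum(v * m for frow, mrow in zip(channel, mask)
            for v, m in zip(frow, mrow)) / n
        for channel in feature_map
    ]


def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest(vec, centroids):
    """Index of the centroid closest to vec."""
    return min(range(len(centroids)),
               key=lambda k: squared_distance(vec, centroids[k]))


def kmeans(vectors, k, iters=20, seed=0):
    """Plain k-means; each cluster index plays the role of one 'symbol'."""
    rng = random.Random(seed)
    centroids = [list(v) for v in rng.sample(vectors, k)]
    for _ in range(iters):
        labels = [nearest(v, centroids) for v in vectors]
        for j in range(k):
            members = [v for v, lab in zip(vectors, labels) if lab == j]
            if members:  # keep the old centroid if a cluster is empty
                centroids[j] = [sum(col) / len(members)
                                for col in zip(*members)]
    labels = [nearest(v, centroids) for v in vectors]
    return centroids, labels


# Toy usage: two channels on a 2 x 2 feature map, ROI on the diagonal.
fmap = [[[1.0, 2.0], [3.0, 4.0]], [[0.0, 0.0], [2.0, 2.0]]]
roi = [[1, 0], [0, 1]]
vec = roi_activity_vector(fmap, roi)

# Toy activity vectors forming two well-separated groups.
vectors = [[0.0, 0.0], [0.1, 0.0], [10.0, 10.0], [10.0, 10.1]]
centroids, symbols = kmeans(vectors, k=2)
```

In this sketch a new ROI would be mapped to a symbol by calling `nearest` on its activity vector with the fitted centroids. The number of clusters, the layer analyzed, and the pooling rule are free choices here and may differ from those in the paper.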

Paper Structure (17 sections, 5 equations, 10 figures, 6 tables)

Introduction
Extracting symbols underlying DNNs' operations
ROIs Identification
Generating hidden layer activity vectors
Clustering analysis of hidden layer activity vectors
Symbols Analysis
Links between symbols and semantic meanings of inputs
How confident are you, DNNs?
Detecting out-of-distribution (OOD) examples
Symbols used to address the vulnerabilities to adversarial perturbations
Temporary learning
Symbols associated with internal features
Discussion
Limitations
Appendix and Supplementary Material
...and 2 more sections

Figures (10)

Figure 1: Schematics of STCert [lee2023having].
Figure 2: The correlations between the symbols of ResNet18 and classes. We show the correlation maps $CM(i,j)$ between 100 randomly chosen symbols and all 78 classes of Mixed_13. The $y$-axis denotes the indices of symbols, and the $x$-axis denotes the class. (A)-(D), the correlations observed in layers 1-4, respectively.
Figure 3: The correlations between symbols and classes. The top, middle and bottom rows show the correlations observed from ResNet50, VGG19 and DenseNet121. $y$-axis denotes the indices of symbols, $x$-axis denotes the class. (A)-(D), the correlations observed in layers 1-4, respectively.
Figure 4: The correlations between symbols and classes in ResNet18. These plots show the correlation maps between 200 consecutive symbols and the labels of inputs observed in layer 1 (A), layer 2 (B), layer 3 (C) and layer 4 (D). We note that some consecutive symbols are correlated with the same classes and thus can be merged together, especially in layer 4.
Figure 5: Symbol-based predictions. (A), Symbol-based prediction of the labels of test examples of Mixed_13. (B), Symbol-based prediction of the DNNs' answers. PET denotes ResNet50 trained on Oxford-IIIT Pet. L1, L2, L3 and L4 denote the analyzed layers. For ViT, they denote the $3^{rd}$, $6^{th}$, $9^{th}$ and $12^{th}$ embedding layers. For VGG19, they denote the $2^{nd}$, $3^{rd}$, $4^{th}$ and $5^{th}$ max-pooling layers. For ResNet and DenseNet, they denote the four layer blocks.
...and 5 more figures
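
The captions of Figures 2-4 refer to a correlation map $CM(i,j)$ between symbol $i$ and class $j$, but its definition is not included in this listing. As a hedged illustration only, the sketch below assumes one plausible definition: the Pearson correlation between the binary indicator "this ROI was assigned symbol $i$" and the indicator "this input belongs to class $j$". The function name `correlation_map` and this definition are assumptions, not the paper's stated method.

```python
import math


def correlation_map(symbols, classes, n_symbols, n_classes):
    """Assumed CM(i, j): Pearson correlation of two binary indicators.

    symbols: symbol index assigned to each ROI.
    classes: class label of the input each ROI came from.
    Returns an n_symbols x n_classes nested list; an entry is 0.0 when
    either indicator is constant (correlation undefined).
    """
    n = len(symbols)
    cm = [[0.0] * n_classes for _ in range(n_symbols)]
    for i in range(n_symbols):
        p_i = sum(1 for s in symbols if s == i) / n
        for j in range(n_classes):
            p_j = sum(1 for c in classes if c == j) / n
            p_ij = sum(1 for s, c in zip(symbols, classes)
                       if s == i and c == j) / n
            denom = math.sqrt(p_i * (1 - p_i) * p_j * (1 - p_j))
            if denom > 0:
                cm[i][j] = (p_ij - p_i * p_j) / denom
    return cm


# Toy usage: symbol 0 always co-occurs with class 0, symbol 1 with class 1.
toy_cm = correlation_map([0, 0, 1, 1], [0, 0, 1, 1], n_symbols=2, n_classes=2)
```

Under this assumed definition, a row with a single large entry would indicate a symbol tied to one class, which is the kind of pattern the captions describe for deeper layers.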
