Normalized Space Alignment: A Versatile Metric for Representation Analysis

Danish Ebadulla, Aditya Gulati, Ambuj Singh

TL;DR

Normalized Space Alignment can act as both an analytical tool and a differentiable loss function, providing a robust means of comparing and aligning representations across different layers and models, and can approximate the global structural discrepancy during mini-batching.

Abstract

We introduce a manifold analysis technique for neural network representations. Normalized Space Alignment (NSA) compares pairwise distances between two point clouds derived from the same source and having the same size, while potentially possessing differing dimensionalities. NSA can act as both an analytical tool and a differentiable loss function, providing a robust means of comparing and aligning representations across different layers and models. It satisfies the criteria necessary for both a similarity metric and a neural network loss function. We showcase NSA's versatility by illustrating its utility as a representation space analysis metric, a structure-preserving loss function, and a robustness analysis tool. NSA is not only computationally efficient but it can also approximate the global structural discrepancy during mini-batching, facilitating its use in a wide variety of neural network training paradigms.
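The comparison described above can be illustrated with a minimal sketch. It assumes Euclidean distances and assumes that each cloud's pairwise distances are normalized by the largest distance of any of its points from the origin; the function names are hypothetical, and Definition 3.1 in the paper remains the authoritative formulation.

```python
import numpy as np


def pairwise_distances(points):
    """Return the (N, N) matrix of Euclidean distances between rows of `points`."""
    diff = points[:, None, :] - points[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))


def normalized_distance_discrepancy(x, y):
    """Mean absolute difference between the normalized pairwise distances of two clouds.

    `x` has shape (N, d1) and `y` has shape (N, d2); row i of each array is
    assumed to come from the same source item. The dimensionalities may differ.
    """
    if x.shape[0] != y.shape[0]:
        raise ValueError("both point clouds must contain the same number of points")

    # Assumed normalization: largest distance of any point from the origin.
    scale_x = np.linalg.norm(x, axis=1).max()
    scale_y = np.linalg.norm(y, axis=1).max()
    if scale_x == 0 or scale_y == 0:
        raise ValueError("a point cloud with all points at the origin cannot be normalized")

    dist_x = pairwise_distances(x) / scale_x
    dist_y = pairwise_distances(y) / scale_y
    return float(np.abs(dist_x - dist_y).mean())


rng = np.random.default_rng(0)
layer_a = rng.normal(size=(50, 8))  # 50 points in 8 dimensions
layer_b = rng.normal(size=(50, 3))  # the same 50 items in 3 dimensions
print(normalized_distance_discrepancy(layer_a, layer_b))
```

Because the sketch uses only array operations, the same computation could be ported to an automatic differentiation framework to serve as a loss term, and it can be evaluated on a mini-batch by passing the corresponding rows of both clouds.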

Paper Structure

This paper contains 65 sections, 18 theorems, 59 equations, 25 figures, 11 tables.

Key Result

Lemma B.1

Let $X$ be a point cloud over some space, then $\mathsf{GNSA}(X,X)=0$.

Figures (25)

  • Figure 1: Specificity (Left) and Sensitivity (Right) Tests. Left (from top to bottom): CKA', RTD and NSA pairwise distances between each layer of two differently initialized networks. Right: Dissimilarities between a layer's representation and its low-rank approximation. Principal components are deleted in order of least variance.
  • Figure 2: Expectation of subset metrics over a large number of trials. (a) Mean subset GlobalNSA variation over increasing trials. (b) Mean subset RTD variation over increasing trials.
  • Figure 3: Robustness tests on GNN architectures with NSA. (a) Misclassification Rate against Data Perturbation Rate under global evasion attack. (b) NSA against perturbation rate under global evasion attack. (c) Misclassification Rate against Data Perturbation Rate under global poisoning attack. (d) NSA against perturbation rate under global poisoning attack.
  • Figure 4: Analyzing node vulnerability with NSA. (a) Nodewise NSA of the 50 nodes with the greatest decline in Classification Confidence. SVD-GCN has the highest nodewise NSA variations. (b) Distribution of Classification Confidence before and after an evasion attack on various Graph Neural Network architectures. A suffix of 'C' after the architecture name refers to clean dataset results and a suffix of 'E' refers to confidence on the poisoned dataset. (c) Increase in the number of boundary nodes for each model post attack and its correlation with the NSA of the boundary nodes.
  • Figure 5: Running time of NSA over increasing batch sizes. (a) Running time of GlobalNSA over increasing batch size. (b) Running time of NSA over increasing batch size.
  • ...and 20 more figures

Theorems & Definitions (31)

  • Definition 3.1
  • Lemma B.1: Identity
  • proof
  • Lemma B.2: Symmetry
  • proof
  • Lemma B.3: Non-negativity
  • proof
  • Lemma B.4: Triangle inequality
  • proof
  • Lemma C.1: Invariance to Isotropic scaling in the first coordinate
  • ...and 21 more