Reliability of CKA as a Similarity Measure in Deep Learning

MohammadReza Davari, Stefan Horoi, Amine Natik, Guillaume Lajoie, Guy Wolf, Eugene Belilovsky

TL;DR

The paper investigates the reliability of Centered Kernel Alignment (CKA), particularly its linear variant, as a representation similarity measure in deep learning. It provides a theoretical characterization showing that linear CKA can be highly sensitive to translations of a subset of the representations, which explains its sensitivity to outliers and to directions of high variance, and that its value can decrease even under transformations that preserve linear separability. Empirically, the work shows that CKA values and layer-wise CKA maps can be manipulated with minimal impact on network functionality, which challenges prior conclusions drawn from CKA heatmaps. The authors advocate using multiple similarity metrics and visual analyses to interpret neural representations, especially when comparing diverse architectures or open pretrained models.

Abstract

Comparing learned neural representations in neural networks is a challenging but important problem, which has been approached in different ways. The Centered Kernel Alignment (CKA) similarity metric, particularly its linear variant, has recently become a popular approach and has been widely used to compare representations of a network's different layers, of architecturally similar networks trained differently, or of models with different architectures trained on the same data. A wide variety of conclusions about similarity and dissimilarity of these various representations have been made using CKA. In this work we present analysis that formally characterizes CKA sensitivity to a large class of simple transformations, which can naturally occur in the context of modern machine learning. This provides a concrete explanation of CKA sensitivity to outliers, which has been observed in past works, and to transformations that preserve the linear separability of the data, an important generalization attribute. We empirically investigate several weaknesses of the CKA similarity metric, demonstrating situations in which it gives unexpected or counter-intuitive results. Finally we study approaches for modifying representations to maintain functional behaviour while changing the CKA value. Our results illustrate that, in many cases, the CKA value can be easily manipulated without substantial changes to the functional behaviour of the models, and call for caution when leveraging activation alignment metrics.
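
For reference, the linear variant mentioned above is usually written as follows. This is the standard formulation from the CKA literature rather than something stated in this summary; it assumes $X\in\mathbb{R}^{n\times p}$ and $Y\in\mathbb{R}^{n\times q}$ hold the representations of the same $n$ inputs and have both been centered column-wise:

$$\mathrm{CKA}_{lin}(X,Y)=\frac{\|Y^{\top}X\|_F^{2}}{\|X^{\top}X\|_F\,\|Y^{\top}Y\|_F}$$

The value lies in $[0,1]$ and equals $1$ when $Y=X$.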

Paper Structure

This paper contains 31 sections, 4 theorems, 24 equations, 17 figures, 1 table, 1 algorithm.

Key Result

Theorem 1

Consider a set of $n$ internal representations in $p$ dimensions $X\in\mathbb{R}^{n\times p}$ that have been centered column-wise, let $S\subset X$ such that $\rho=\frac{|S|}{|X|}\leq\frac{1}{2}$ and $\bm{v}$ such that $\|\bm{v}\|=1$. We define $X_{S,\bm{v},c}=S\cup\{x+c\bm{v}: \dots\}$ ... where $\Gamma(\rho) = \frac{\rho}{1-\rho}\in(0,1]$, and $\dim_{PR}(X) \triangleq \dots$
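
The kind of transformation the theorem describes can be explored numerically. The following is a minimal sketch, not the authors' code: it assumes the standard linear CKA formula, uses synthetic Gaussian data in place of real network activations, and the helper names `linear_cka` and `translate_complement` are hypothetical. It keeps a subset $S$ fixed, translates the remaining points by $c\bm{v}$, and reports linear CKA against the original representations.

```python
import numpy as np


def linear_cka(X, Y):
    """Linear CKA between (n, p) and (n, q) matrices (standard formula, assumed)."""
    X = X - X.mean(axis=0, keepdims=True)  # column-wise centering
    Y = Y - Y.mean(axis=0, keepdims=True)
    cross = np.linalg.norm(Y.T @ X, ord="fro") ** 2
    norm_x = np.linalg.norm(X.T @ X, ord="fro")
    norm_y = np.linalg.norm(Y.T @ Y, ord="fro")
    return cross / (norm_x * norm_y)


def translate_complement(X, subset_mask, v, c):
    """Keep the rows in S (subset_mask == True) fixed; shift all other rows by c * v."""
    v = v / np.linalg.norm(v)  # enforce ||v|| = 1
    Y = X.copy()
    Y[~subset_mask] += c * v
    return Y


# Synthetic stand-in for internal representations (an assumption for illustration).
rng = np.random.default_rng(0)
n, p = 1000, 50
X = rng.standard_normal((n, p))
X -= X.mean(axis=0)

subset_mask = np.zeros(n, dtype=bool)
subset_mask[: n // 4] = True  # |S| / |X| = 0.25 <= 1/2
v = rng.standard_normal(p)

cka_by_distance = {
    c: linear_cka(X, translate_complement(X, subset_mask, v, c))
    for c in (0.0, 1.0, 10.0, 100.0, 1000.0)
}
for c, value in cka_by_distance.items():
    print(f"c = {c:7.1f}  linear CKA = {value:.4f}")
```

With no translation the two sets coincide and the value is 1. For this synthetic setup the value falls sharply once $c$ is large, even though the geometry within $S$ and within its complement is untouched.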

Figures (17)

  • Figure 1: Visual representations of the transformations considered in the theoretical results. a) Thm. 1: The original set of neural representations $X$ contains subsets $S$ (red) and $X\backslash S$ (green). We can then build $X_{S,\bm{v},c}$ as a copy of $X$, where the points in $X\backslash S$ are translated a distance $c$ in direction $\bm{v}$. The linear CKA value between $X$ and $X_{S,\bm{v},c}$ is then computed. b) Corollary on outliers: $X$ and $X_{S,\bm{v},c}$ differ by a single point, which has been translated by $c\bm{v}$ in $X_{S,\bm{v},c}$. c) Corollary on linear separability: $S$ and $X\backslash S$ are linearly separable (red line with orange margins), and the transformation made to obtain $X_{S,\bm{v},c}$ preserves the linear separability of the data as well as the margins.
  • Figure 2: A layer-wise comparison based on the value of the CKA between a generalized, memorized, and randomly populated network. This comparison reveals that early layers of these networks achieve relatively high CKA values. Mean and standard deviation across 5 seeds are shown.
  • Figure 3: The convolution filters within the first two layers of a generalized, a memorized, and a randomly initialized network show that the features are (1) drastically different and (2) not equally useful, despite the CKA results in Fig. 2.
  • Figure 4: a) Linear and RBF CKA values between the artificial representations $X$ and the subset-translated version $Y$ as a function of the translation distance. b) CKA value between a CNN's internal representations of the CIFAR10 training set and modified versions in which either a class or a single point is translated, as a function of the translation distance.
  • Figure 5: Original Map is the CKA map of a network trained on CIFAR10. We manipulate this network to produce CKA maps which: (1) maximize the CKA similarity between the $1^{\text{st}}$ and last layer, (2) maximize the CKA similarity between all layers, and (3) minimize the CKA similarity between all layers. In cases (1) and (2), the network experiences only a slight loss in performance, and the strong CKA similarity achieved between early and late layers counters previous findings. We find similar results in the kernel CKA case.
  • ...and 12 more figures
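
The CKA maps referred to in these captions are layer-by-layer similarity matrices. As a rough illustration of how such a map can be assembled (a sketch under stated assumptions, not the authors' pipeline), the snippet below computes pairwise linear CKA between the layers of a small randomly initialized ReLU network applied to random inputs. The network, the data, and the helper names `linear_cka` and `cka_map` are all hypothetical stand-ins.

```python
import numpy as np


def linear_cka(X, Y):
    """Linear CKA between (n, p) and (n, q) matrices (standard formula, assumed)."""
    X = X - X.mean(axis=0, keepdims=True)
    Y = Y - Y.mean(axis=0, keepdims=True)
    cross = np.linalg.norm(Y.T @ X, ord="fro") ** 2
    return cross / (
        np.linalg.norm(X.T @ X, ord="fro") * np.linalg.norm(Y.T @ Y, ord="fro")
    )


def cka_map(layer_activations):
    """Symmetric matrix whose (i, j) entry is linear CKA between layers i and j."""
    k = len(layer_activations)
    heatmap = np.empty((k, k))
    for i in range(k):
        for j in range(i, k):
            value = linear_cka(layer_activations[i], layer_activations[j])
            heatmap[i, j] = heatmap[j, i] = value
    return heatmap


# Hypothetical stand-in for real activations: random inputs pushed through
# a randomly initialized 4-layer ReLU MLP, recording each layer's output.
rng = np.random.default_rng(0)
hidden = rng.standard_normal((256, 32))
layers = []
for width in (64, 64, 32, 10):
    weights = rng.standard_normal((hidden.shape[1], width)) / np.sqrt(hidden.shape[1])
    hidden = np.maximum(hidden @ weights, 0.0)
    layers.append(hidden)

heatmap = cka_map(layers)
print(np.round(heatmap, 3))
```

Layers may have different widths because linear CKA only requires that both matrices describe the same $n$ inputs. In practice the activations would come from a trained model evaluated on a fixed dataset.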

Theorems & Definitions (7)

  • Theorem 1
  • Corollary 2
  • Corollary 3
  • Corollary 4
  • proof
  • proof
  • proof