On the Internal Representations of Graph Metanetworks

Taesun Yeom, Jaeho Lee

TL;DR

This paper asks what Graph Metanetworks (GMNs) learn from neural-network parameters, studying their internal representations with centered kernel alignment (CKA). It compares GMNs with conventional data-driven networks (MLPs and CNNs) on implicit neural representation (INR) classification tasks built from MNIST, Fashion-MNIST, and CIFAR-10. GMN representations turn out to be highly sensitive to random initialization and to differ from those learned by standard NNs. Cross-architecture similarity, as measured by CKA, is low between GMNs and general NNs despite similar accuracies, and their predictions can diverge: in some cases GMNs are correct while general NNs fail. The findings suggest that weight-space learning captures representations complementary to those of image-based learning, with implications for metanetwork design and task selection.

Abstract

Weight space learning is an emerging paradigm in the deep learning community. The primary goal of weight space learning is to extract informative features from a set of parameters using specially designed neural networks, often referred to as metanetworks. However, it remains unclear how these metanetworks learn solely from parameters. To address this, we take the first step toward understanding representations of metanetworks, specifically graph metanetworks (GMNs), which achieve state-of-the-art results in this field, using centered kernel alignment (CKA). Through various experiments, we reveal that GMNs and general neural networks (e.g., multi-layer perceptrons (MLPs) and convolutional neural networks (CNNs)) differ in terms of their representation space.
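For readers unfamiliar with CKA, the following is a minimal sketch of how a similarity score between two sets of layer activations could be computed. It assumes the linear variant of CKA; the abstract does not state which kernel the paper uses, and the function name `linear_cka`, the array shapes, and the random activations are illustrative only.

```python
import numpy as np


def linear_cka(x, y):
    """Linear CKA between two activation matrices.

    x: (n_samples, dim_x) activations from one layer or model.
    y: (n_samples, dim_y) activations from another layer or model,
       computed on the same n_samples inputs.
    """
    # Center every feature (column) so the score ignores mean offsets.
    x = x - x.mean(axis=0, keepdims=True)
    y = y - y.mean(axis=0, keepdims=True)

    # Squared Frobenius norm of the cross-covariance, normalized by
    # the Frobenius norms of the two self-covariances.
    cross = np.linalg.norm(y.T @ x, ord="fro") ** 2
    norm_x = np.linalg.norm(x.T @ x, ord="fro")
    norm_y = np.linalg.norm(y.T @ y, ord="fro")
    return float(cross / (norm_x * norm_y))


# Hypothetical activations; in practice these would come from two trained models.
rng = np.random.default_rng(0)
acts_a = rng.normal(size=(200, 32))

# A rotated and rescaled copy of acts_a: linear CKA should not change.
q, _ = np.linalg.qr(rng.normal(size=(32, 32)))
acts_b = 3.0 * acts_a @ q

# Independent random activations with a different width.
acts_c = rng.normal(size=(200, 16))

same = linear_cka(acts_a, acts_a)
rotated = linear_cka(acts_a, acts_b)
unrelated = linear_cka(acts_a, acts_c)
```

Linear CKA is invariant to orthogonal transformations and isotropic scaling of the features, so `same` and `rotated` are both close to 1, while `unrelated` is much lower. The two matrices may have different widths, which is what allows comparing layers across different architectures.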

Paper Structure

This paper contains 13 sections, 1 equation, 4 figures, 2 tables.

Figures (4)

  • Figure 1: GMN vs. General NN.
  • Figure 2: Averaged CKA over five pairs of random GMNs, MLPs, and CNNs in MNIST classification. The skyblue plot represents the averaged $\overline{\text{CKA}}(\textcolor{skyblue}{3},\textbf{a})$, while the blue plot represents the averaged $\overline{\text{CKA}}(\textcolor{blue}{4},\textbf{a})$ for each architecture.
  • Figure 3: Averaged CKA over five pairs of random GMNs, MLPs, and CNNs in Fashion-MNIST and CIFAR-10 classification. The skyblue plot represents the averaged $\overline{\text{CKA}}(\textcolor{skyblue}{3},\textbf{a})$, while the blue plot represents the averaged $\overline{\text{CKA}}(\textcolor{blue}{4},\textbf{a})$ for each architecture. Note that the range of $\overline{\text{CKA}}$ varies significantly across datasets.
  • Figure 4: Sample analysis. A sample for which the prediction of general NNs is always incorrect, while that of GMNs is always correct, across different initializations.