HyperAggregation: Aggregating over Graph Edges with Hypernetworks

Nicolas Lell; Ansgar Scherp

TL;DR

HyperAggregation introduces a hypernetwork-based aggregation that dynamically generates neighborhood-specific weights to aggregate variable-sized graph neighborhoods, enabling GNNs to adapt to each vertex's local structure. Implemented in two architectures, GraphHyperConv (GHC) and GraphHyperMixer (GHM), the approach extends typical message passing with a two-layer hypernetwork that outputs a target weight matrix used for channel mixing across the neighborhood. Empirical results across vertex classification (both homophilic and heterophilic), graph classification, and graph regression show that GHC generally outperforms GHM and reaches a new state of the art on the heterophilic Roman-Empire dataset; on graph-level tasks, both models perform in line with similarly sized models. Ablation studies investigate robustness to several hyperparameter choices, and the code for all experiments is publicly available for reproducibility.

Abstract

HyperAggregation is a hypernetwork-based aggregation function for Graph Neural Networks. It uses a hypernetwork to dynamically generate weights whose size matches the current neighborhood, and these weights are then used to aggregate that neighborhood. The aggregation with the generated weights works like MLP-Mixer channel mixing, applied over variable-sized vertex neighborhoods. We demonstrate HyperAggregation in two models: GraphHyperMixer is based on MLP-Mixer, while GraphHyperConv is derived from a GCN but uses a hypernetwork-based aggregation function. We perform experiments on diverse benchmark datasets for the vertex classification, graph classification, and graph regression tasks. The results show that HyperAggregation can be used effectively for homophilic and heterophilic datasets in both inductive and transductive settings. GraphHyperConv performs better than GraphHyperMixer and is especially strong in the transductive setting. On the heterophilic dataset Roman-Empire it reaches a new state of the art. On the graph-level tasks our models perform in line with similarly sized models. Ablation studies investigate the robustness against various hyperparameter choices. The implementation of HyperAggregation as well as code to reproduce all experiments is available at https://github.com/Foisunt/HyperAggregation .
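To make the idea of neighborhood-sized generated weights concrete, the following is a minimal NumPy sketch, not the authors' implementation. It assumes a two-layer hypernetwork with trainable matrices `W_A` and `W_B` (named as in Figure 1) that maps the neighborhood features `X` to a target weight matrix `W_tar` with one row per neighbor. It further assumes that the same `W_tar` is used for mixing into and out of the mixing dimension, that ReLU is the activation, and that the result is mean-pooled; the function name `hyper_aggregate` and all dimensions are hypothetical. Dropout and normalization are omitted.

```python
import numpy as np

def relu(x):
    # Assumed activation; the activation choice here is illustrative only.
    return np.maximum(x, 0.0)

def hyper_aggregate(X, W_A, W_B):
    """Aggregate one neighborhood with hypernetwork-generated weights.

    X:   (n, h) features of the n vertices in the neighborhood
    W_A: (h, k) first hypernetwork layer (trainable, size independent of n)
    W_B: (k, m) second hypernetwork layer, m is the mixing dimension
    Returns an (n, h) array of mixed features.
    """
    # Hypernetwork: predicts target weights with one row per neighbor.
    W_tar = relu(X @ W_A) @ W_B          # (n, m)
    # Target MLP: mix across the neighborhood into m slots ...
    mixed = relu(W_tar.T @ X)            # (m, h)
    # ... and back to one row per neighbor (tied weights are an assumption).
    return W_tar @ mixed                 # (n, h)

rng = np.random.default_rng(0)
h, k, m = 8, 16, 4                       # hypothetical sizes
W_A = rng.normal(scale=0.1, size=(h, k))
W_B = rng.normal(scale=0.1, size=(k, m))

# The same parameters handle neighborhoods of different sizes.
for n in (3, 7):
    X = rng.normal(size=(n, h))
    out = hyper_aggregate(X, W_A, W_B)
    pooled = out.mean(axis=0)            # assumed readout: mean over neighbors
    print(n, out.shape, pooled.shape)
```

Because `W_tar` is computed row-wise from `X`, reordering the neighbors only reorders the output rows, so the mean-pooled result in this sketch does not depend on neighbor order.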

Paper Structure (15 sections, 3 equations, 3 figures, 5 tables)

Introduction
Related Work
Hypernetworks
Classical GNNs
Alternative GNNs
HyperAggregation
Aggregating using Hypernetworks
Models using HyperAggregation
Experimental Apparatus
Datasets
Procedure
Hyperparameter
Results and Discussion
Ablation Studies
Conclusion and Future Work

Figures (3)

Figure 1: Proposed HyperAggregation. The left MLP is the hypernetwork that predicts the target MLP's weights; the size of the predicted weights depends on the vertex's neighborhood size. The target MLP uses those weights to perform the aggregation, i.e., channel mixing across the neighborhood. Arrows are labeled with activation shapes. $W_A$ and $W_B$ are trainable parameters of the hypernetwork. Dropout and normalization layers are not shown, and the use of the leftmost activation $\sigma$ is optional.
Figure 2: Depiction of a single a) GCN, b) GraphHyperConv, and c) GraphHyperMixer layer.
Figure 3: Ablation studies of the mixing and hidden dimension of GraphHyperConv.
