GaAN: Gated Attention Networks for Learning on Large and Spatiotemporal Graphs
Jiani Zhang, Xingjian Shi, Junyuan Xie, Hao Ma, Irwin King, Dit-Yan Yeung
TL;DR
The paper introduces GaAN, a gated attention network that assigns a learnable gate to each attention head in a graph aggregator, so that each node can selectively use the information gathered from its neighbors. It also provides a unified framework for converting graph aggregators into graph recurrent units, illustrated by the Graph GRU (GGRU) for spatiotemporal forecasting. Empirical results on inductive node classification (PPI, Reddit) and traffic speed forecasting (METR-LA) show state-of-the-art performance, with ablations confirming the benefit of head gates and sampling strategies. The approach is scalable and applies to both static and dynamic graph tasks, with potential extensions to edge features and NLP applications.
Abstract
We propose a new network architecture, Gated Attention Networks (GaAN), for learning on graphs. Unlike the traditional multi-head attention mechanism, which equally consumes all attention heads, GaAN uses a convolutional sub-network to control each attention head's importance. We demonstrate the effectiveness of GaAN on the inductive node classification problem. Moreover, with GaAN as a building block, we construct the Graph Gated Recurrent Unit (GGRU) to address the traffic speed forecasting problem. Extensive experiments on three real-world datasets show that our GaAN framework achieves state-of-the-art results on both tasks.
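To make the idea of per-head gating concrete, the following is a minimal NumPy sketch of a gated multi-head attention aggregator for a single center node. It is an illustration under stated assumptions, not the authors' implementation: the function name `gated_attention_aggregate`, the weight shapes, dot-product attention scores, and the gate input (the center feature concatenated with the max-pooled and mean-pooled neighbor features, passed through one linear layer and a sigmoid) are all hypothetical stand-ins for the gate sub-network described in the abstract.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gated_attention_aggregate(x_center, z_neighbors, Wq, Wk, Wv, Wg):
    """Aggregate neighbor features with K attention heads, each scaled by a gate.

    x_center:    (d,)       feature of the center node
    z_neighbors: (n, d)     features of its n neighbors
    Wq, Wk:      (K, d, a)  per-head query/key projections (assumed shapes)
    Wv:          (K, d, v)  per-head value projections (assumed shapes)
    Wg:          (3*d, K)   hypothetical gate weights, one output per head
    Returns the gated, concatenated head outputs (K*v,) and the gates (K,).
    """
    q = np.einsum("d,kda->ka", x_center, Wq)        # (K, a)
    k = np.einsum("nd,kda->kna", z_neighbors, Wk)   # (K, n, a)
    v = np.einsum("nd,kdv->knv", z_neighbors, Wv)   # (K, n, v)

    # Dot-product attention over neighbors, computed separately per head.
    scores = np.einsum("ka,kna->kn", q, k)          # (K, n)
    attn = softmax(scores, axis=-1)
    heads = np.einsum("kn,knv->kv", attn, v)        # (K, v)

    # Assumed gate input: center feature plus max- and mean-pooled neighbors.
    pooled = np.concatenate(
        [x_center, z_neighbors.max(axis=0), z_neighbors.mean(axis=0)]
    )                                               # (3*d,)
    gates = sigmoid(pooled @ Wg)                    # (K,), each in (0, 1)

    # Scale every head by its own gate before concatenating.
    return (gates[:, None] * heads).reshape(-1), gates

# Example with arbitrary sizes: d=8 features, n=5 neighbors, K=4 heads.
rng = np.random.default_rng(0)
d, n, K, a, v = 8, 5, 4, 6, 3
out, gates = gated_attention_aggregate(
    rng.normal(size=d),
    rng.normal(size=(n, d)),
    rng.normal(size=(K, d, a)),
    rng.normal(size=(K, d, a)),
    rng.normal(size=(K, d, v)),
    rng.normal(size=(3 * d, K)),
)
```

Setting every gate to 1 recovers ordinary multi-head attention, where all heads contribute equally; a gate near 0 suppresses its head for that particular node.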
