Representation Learning on Graphs: Methods and Applications
William L. Hamilton, Rex Ying, Jure Leskovec
TL;DR
This survey synthesizes the landscape of representation learning on graphs, covering node and subgraph embedding methods that range from shallow matrix-factorization and random-walk approaches to deep neighborhood-aggregation methods and graph neural networks. It presents a unified encoder–decoder framework that clarifies how embeddings are learned and what graph information they capture, including extensions to heterogeneous and multi-layer graphs. The authors discuss supervised and unsupervised strategies, highlight practical applications such as visualization, classification, and link prediction, and outline open problems, particularly around scalability, temporal dynamics, higher-order motifs, and interpretability. Overall, the work clarifies core design choices, connects diverse methods, and identifies directions for theoretical and practical advances in graph representation learning.
Abstract
Machine learning on graphs is an important and ubiquitous task with applications ranging from drug design to friendship recommendation in social networks. The primary challenge in this domain is finding a way to represent, or encode, graph structure so that it can be easily exploited by machine learning models. Traditionally, machine learning approaches relied on user-defined heuristics to extract features encoding structural information about a graph (e.g., degree statistics or kernel functions). However, recent years have seen a surge in approaches that automatically learn to encode graph structure into low-dimensional embeddings, using techniques based on deep learning and nonlinear dimensionality reduction. Here we provide a conceptual review of key advancements in this area of representation learning on graphs, including matrix-factorization-based methods, random-walk-based algorithms, and graph neural networks. We review methods to embed individual nodes as well as approaches to embed entire (sub)graphs. In doing so, we develop a unified framework to describe these recent approaches, and we highlight a number of important applications and directions for future work.
