GLEMOS: Benchmark for Instantaneous Graph Learning Model Selection
Namyong Park, Ryan Rossi, Xing Wang, Antoine Simoulin, Nesreen Ahmed, Christos Faloutsos
TL;DR
GLEMOS tackles the challenge of instantly selecting effective graph learning (GL) models for downstream tasks by introducing the first comprehensive benchmark environment for instantaneous GL model selection. It provides extensive performance data for 366 GL models across 457 graphs over two core tasks (node classification and link prediction), coupled with diverse meta-graph features and ten evaluation testbeds to assess selection methods. Experimental results show that meta-graph features generally improve model-selection performance, especially in sparse or out-of-domain settings, though effectiveness varies by method and task; some simple baselines remain competitive in certain regimes while optimizable approaches offer transfer capabilities. The benchmark is designed to be extensible and open-source, enabling ongoing growth of graphs, models, and tasks, and aiming to spur future research directions in near-instantaneous GL model selection and performance estimation.
Abstract
The choice of a graph learning (GL) model (i.e., a GL algorithm and its hyperparameter settings) has a significant impact on the performance of downstream tasks. However, selecting the right GL model becomes increasingly difficult and time-consuming as more GL models are developed. It is therefore of great practical value to equip GL users with the ability to select an effective GL model near-instantaneously, without manual intervention. Despite recent attempts to tackle this problem, there has been no comprehensive benchmark environment for evaluating GL model selection methods. To bridge this gap, we present GLEMOS, a comprehensive benchmark for instantaneous GL model selection that makes the following contributions. (i) GLEMOS provides extensive benchmark data for fundamental GL tasks, i.e., link prediction and node classification, including the performances of 366 models on 457 graphs on these tasks. (ii) GLEMOS designs multiple evaluation settings and assesses how effectively representative model selection techniques perform in each of them. (iii) GLEMOS is designed to be easily extended with new models, new graphs, and new performance records. (iv) Based on the experimental results, we discuss the limitations of existing approaches and highlight future research directions. To promote research on this significant problem, we make the benchmark data and code publicly available at https://github.com/facebookresearch/glemos.
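
To make the setting concrete, the following is a minimal sketch of what instantaneous model selection from prior performance records could look like. It is not the GLEMOS API: the function names (`global_best`, `knn_select`), the toy performance matrix, and the two-dimensional meta-graph features are all hypothetical and chosen only for illustration. The sketch contrasts a simple baseline that ignores the new graph with a nearest-neighbor selector that uses meta-graph features; neither requires training any model on the new graph.

```python
import math

def column_means(rows):
    """Mean of each column (one column per model) over the given rows (graphs)."""
    return [sum(col) / len(col) for col in zip(*rows)]

def argmax(values):
    """Index of the largest value."""
    return max(range(len(values)), key=values.__getitem__)

def global_best(perf):
    """Baseline: pick the model with the highest average performance
    over all previously observed graphs, ignoring the new graph."""
    return argmax(column_means(perf))

def knn_select(perf, meta, new_meta, k=2):
    """Pick the model with the highest average performance on the k
    observed graphs whose meta-graph features are closest to new_meta."""
    order = sorted(range(len(meta)), key=lambda i: math.dist(meta[i], new_meta))
    nearest = order[:k]
    return argmax(column_means([perf[i] for i in nearest]))

# Hypothetical toy data: 4 observed graphs x 3 candidate models.
perf = [
    [0.90, 0.60, 0.50],
    [0.85, 0.65, 0.55],
    [0.40, 0.50, 0.80],
    [0.45, 0.55, 0.90],
]
# Hypothetical 2-dimensional meta-graph features, one row per observed graph.
meta = [
    [0.1, 0.2],
    [0.2, 0.1],
    [0.9, 0.8],
    [0.8, 0.9],
]
# Meta-graph features of a new, unseen graph.
new_meta = [0.15, 0.15]

print(global_best(perf))                      # 2
print(knn_select(perf, meta, new_meta, k=2))  # 0
```

In this toy example the two selectors disagree: model 2 is best on average, but the new graph resembles the first two observed graphs, on which model 0 performs best. Real selection methods evaluated in a benchmark of this kind would operate on far larger and sparser performance data than this illustration assumes.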
