One for All: Towards Training One Graph Model for All Classification Tasks

Hao Liu, Jiarui Feng, Lecheng Kong, Ningyue Liang, Dacheng Tao, Yixin Chen, Muhan Zhang

TL;DR

This work tackles the challenge of a single foundation model for diverse graph tasks across domains. It introduces OFA, a framework that uses text-attributed graphs (TAGs), Nodes-of-Interest (NOI), and a Graph Prompting Paradigm (GPP) to enable cross-domain, cross-task, and in-context learning without parameter fine-tuning. OFA is trained on nine heterogeneous graph datasets and leverages LLM encodings of textual node/edge descriptions to unify representations, with a GNN-based readout that processes a prompted graph to predict class nodes. The results demonstrate competitive performance in supervised, few-shot, and zero-shot settings, including zero-shot capabilities across unseen classes, highlighting strong cross-domain generalization. Limitations include the lack of regression support and limited cross-domain data, suggesting future work to extend tasks, datasets, and training strategies.

Abstract

Designing a single model to address multiple tasks has been a long-standing objective in artificial intelligence. Recently, large language models have demonstrated exceptional capability in solving different tasks within the language domain. However, a unified model for various graph tasks remains underexplored, primarily due to the challenges unique to the graph learning domain. First, graph data from different areas carry distinct attributes and follow different distributions. Such discrepancy makes it hard to represent graphs in a single representation space. Second, tasks on graphs diversify into node, link, and graph tasks, requiring distinct embedding strategies. Finally, an appropriate graph prompting paradigm for in-context learning is unclear. We propose One for All (OFA), the first general framework that can use a single graph model to address the above challenges. Specifically, OFA proposes text-attributed graphs to unify different graph data by describing nodes and edges with natural language and uses language models to encode the diverse and possibly cross-domain text attributes to feature vectors in the same embedding space. Furthermore, OFA introduces the concept of nodes-of-interest to standardize different tasks with a single task representation. For in-context learning on graphs, OFA introduces a novel graph prompting paradigm that appends prompting substructures to the input graph, which enables it to address varied tasks without fine-tuning. We train the OFA model using graph data from multiple domains (including citation networks, molecular graphs, knowledge graphs, etc.) simultaneously and evaluate its ability in supervised, few-shot, and zero-shot learning scenarios. OFA performs well across different tasks, making it the first general-purpose across-domains classification model on graphs.
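
To make the nodes-of-interest and graph prompting ideas in the abstract more concrete, here is a minimal Python sketch. It is not the authors' implementation. The abstract only states that prompting substructures are appended to the input graph; the specific wiring used here (each node of interest links to a task prompt node, which in turn links to one node per candidate class) is an assumption for illustration. `TextGraph` and `build_prompted_graph` are hypothetical names, and the language-model encoder and the GNN are left out entirely.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class TextGraph:
    """A text-attributed graph: every node carries a natural-language description."""
    node_text: list[str] = field(default_factory=list)
    edges: list[tuple[int, int]] = field(default_factory=list)

    def add_node(self, text: str) -> int:
        """Append a node and return its integer id."""
        self.node_text.append(text)
        return len(self.node_text) - 1

    def add_edge(self, src: int, dst: int) -> None:
        self.edges.append((src, dst))


def build_prompted_graph(graph, nodes_of_interest, task_text, class_texts):
    """Return a copy of `graph` with a prompt substructure appended.

    Assumed wiring (illustrative only):
      node of interest -> prompt node -> class node
    Returns (prompted_graph, prompt_node_id, class_node_ids).
    """
    # Copy so the input graph is left untouched.
    prompted = TextGraph(list(graph.node_text), list(graph.edges))

    # One prompt node carries the task description as text.
    prompt_node = prompted.add_node(task_text)
    for n in nodes_of_interest:
        prompted.add_edge(n, prompt_node)

    # One class node per candidate label, each described in text.
    class_nodes = []
    for text in class_texts:
        c = prompted.add_node(text)
        prompted.add_edge(prompt_node, c)
        class_nodes.append(c)
    return prompted, prompt_node, class_nodes


# Toy citation graph with two papers and one citation edge.
citation = TextGraph()
a = citation.add_node("paper: graph neural networks for molecules")
b = citation.add_node("paper: attention is all you need")
citation.add_edge(a, b)

# Node-level task: a single node of interest.
node_task, p1, c1 = build_prompted_graph(
    citation, [a],
    "task: classify the topic of the paper",
    ["class: machine learning", "class: chemistry"],
)

# Link-level task: the two endpoints are the nodes of interest.
link_task, p2, c2 = build_prompted_graph(
    citation, [a, b],
    "task: decide whether the two papers are related",
    ["class: related", "class: unrelated"],
)
```

Under these assumptions, the node-level and link-level tasks differ only in how many nodes of interest are passed in, which is the sense in which a single task representation can cover both. In a full pipeline, every text in `node_text` would be embedded by a language model and a GNN would then score the class nodes.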

Paper Structure

This paper contains 30 sections, 12 equations, 4 figures, and 18 tables.

Figures (4)

  • Figure 1: The pipeline of OFA. An input to the model contains a text-attributed graph and a task description. Cross-domain texts in graphs and task descriptions can be co-embedded in the same space by an LLM. OFA's graph prompting paradigm converts the input with embedded features to prompted graphs with a unified task representation, which allows adaptive downstream prediction.
  • Figure 2: In-context learning design in OFA.
  • Figure 3: Output embedding space of NOI prompt nodes on all datasets for OFA-joint-st.
  • Figure 4: Embedded node features from all OFA datasets (sentence transformer).