Graph In-Context Operator Networks for Generalizable Spatiotemporal Prediction

Chenghan Wu; Zongmin Yu; Boai Sun; Liu Yang

Abstract

In-context operator learning enables neural networks to infer solution operators from contextual examples without weight updates. While prior work has demonstrated the effectiveness of this paradigm in leveraging vast datasets, a systematic comparison against single-operator learning using identical training data has been absent. We address this gap through controlled experiments comparing in-context operator learning against classical operator learning (single-operator models trained without contextual examples), under the same training steps and dataset. To enable this investigation on real-world spatiotemporal systems, we propose GICON (Graph In-Context Operator Network), combining graph message passing for geometric generalization with example-aware positional encoding for cardinality generalization. Experiments on air quality prediction across two Chinese regions show that in-context operator learning outperforms classical operator learning on complex tasks, generalizing across spatial domains and scaling robustly from at most 5 examples during training to 100 at inference.

Paper Structure (23 sections, 5 equations, 11 figures, 2 tables, 1 algorithm)

Introduction
Related Work
Operator Learning
Graph Neural Networks for Operator Learning
In-Context Operator Networks
Method
Problem Setup
Graph In-Context Operator Networks
Retrieval of Examples
Network Architecture
Experiments
Experimental Setup
Datasets
Training and Evaluation Protocol
Example Cardinality Generalization
...and 8 more sections

Figures (11)

Figure 1: Illustration of In-Context Operator Networks (ICON). Given $k$ contextual examples, each consisting of a key-value pair related by the governing operator $\mathcal{F}$, ICON infers the underlying operator from these in-context examples and applies it to map a new query input to its predicted output, all in a single forward pass without any weight updates.
Figure 2: Overview of the Graph In-Context Operator Network (GICON) architecture. Left: The overall pipeline selects contextual examples from historical frames, producing an interleaved sequence of keys and values along with the query. Each frame is processed by separate key and value encoders, passed through $N$ GICON layers, and decoded at key and query positions to produce predictions (during inference, only the query prediction is used). Right: Detailed structure of a single GICON layer, consisting of (1) spatial update via message passing that aggregates information from neighboring nodes within each graph, and (2) contextual per-node update using transformers that perform in-context learning across the example sequence with positional encodings.
Figure 3: Example cardinality generalization on BTHSA for simple to moderate operators. Top: PM$_{2.5}$. Bottom: O$_3$. Left to right: $\Delta t = 1, 4, 12$h. Classical single-operator learning achieves lower RMSE for simple operators ($\Delta t = 1, 4$h), while ICON with operator diversity outperforms the baseline at $\Delta t = 12$h given sufficient examples. All models are evaluated with up to 100 examples despite training with at most 5.
Figure 4: Example cardinality generalization on BTHSA at $\Delta t = 24$h. Left: PM$_{2.5}$. Right: O$_3$. For this complex operator, ICON with operator diversity surpasses the single-operator baseline with sufficient examples, with error decreasing for PM$_{2.5}$ and a sharp initial drop followed by stable performance for O$_3$.
Figure 5: Operator extrapolation to $\Delta t = 48$h (out-of-distribution) on BTHSA. Left: PM$_{2.5}$. Right: O$_3$. The single-operator baseline shows flat curves, while example-trained ICON models improve with more examples. Models with $k = 5$ achieve the best extrapolation, with sustained improvement for PM$_{2.5}$ and a sharp initial drop for O$_3$.
...and 6 more figures
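
The captions of Figures 1 and 2 describe the model input as an interleaved sequence of keys and values followed by the query, with examples selected from historical frames. The sketch below shows one way such a sequence could be assembled. It is a minimal illustration, not the paper's retrieval procedure: the names `build_context` and `interleave`, the frame layout (one value per graph node), and the selection rule (the most recent pairs whose value frame is already observed at query time, with key at time $s$ and value at time $s + \Delta t$) are all assumptions made for this example.

```python
from typing import List, Sequence, Tuple

Frame = Sequence[float]  # hypothetical layout: one value per graph node


def build_context(
    frames: Sequence[Frame], query_t: int, delta_t: int, k: int
) -> Tuple[List[Tuple[Frame, Frame]], Frame]:
    """Return up to k (key, value) example pairs plus the query key.

    Assumption: a key is the frame at time s and its value is the frame at
    s + delta_t. Only pairs whose value is already observed at query time
    (s + delta_t <= query_t) are eligible, taken from most recent backwards.
    """
    if delta_t <= 0 or k < 0:
        raise ValueError("delta_t must be positive and k non-negative")
    if not 0 <= query_t < len(frames):
        raise IndexError("query_t out of range")
    examples: List[Tuple[Frame, Frame]] = []
    s = query_t - delta_t  # latest key whose value is already known
    while s >= 0 and len(examples) < k:
        examples.append((frames[s], frames[s + delta_t]))
        s -= 1
    examples.reverse()  # oldest example first, so the query comes last
    return examples, frames[query_t]


def interleave(examples: List[Tuple[Frame, Frame]], query: Frame) -> List[Frame]:
    """Flatten to [key_1, value_1, ..., key_k, value_k, query]."""
    seq: List[Frame] = []
    for key, value in examples:
        seq.extend([key, value])
    seq.append(query)
    return seq


# Toy usage: a single-node series of 10 hourly frames, operator delta_t = 4.
frames = [[float(t)] for t in range(10)]
examples, query = build_context(frames, query_t=9, delta_t=4, k=3)
sequence = interleave(examples, query)
```

Because the number of examples only changes the length of the sequence, the same function can build a 5-example context for training and a 100-example context at inference; fewer than `k` pairs are returned when the history is too short.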
