Graph Your Own Prompt

Xi Ding, Lei Wang, Piotr Koniusz, Yongsheng Gao

TL;DR

Graph Your Own Prompt introduces Graph Consistency Regularization (GCR), a lightweight, model-agnostic framework that enforces semantic coherence by aligning batch-level feature graphs with class-aware prediction graphs across multiple depths. It achieves this via parameter-free Graph Consistency Layers (GCLs) that compute $\mathbf{F}^{(l)}$ from features and $\mathbf{P}$ from predictions, minimizing $\mathcal{L}_{\text{GCR}}$ across selected layers with adaptive weights. Theoretical analysis connects GCR to reduced hypothesis space, spectral alignment of graph Laplacians, and PAC-Bayesian generalization bounds. Empirically, GCR improves intra-class cohesion and generalization across CNNs and transformers on CIFAR, Tiny ImageNet, and ImageNet-1K, with modest compute overhead. This work provides a practical, scalable form of semantic regularization that leverages the model's own prediction structure.

Abstract

We propose Graph Consistency Regularization (GCR), a novel framework that injects relational graph structures, derived from model predictions, into the learning process to promote class-aware, semantically meaningful feature representations. Functioning as a form of self-prompting, GCR enables the model to refine its internal structure using its own outputs. While deep networks learn rich representations, these often capture noisy inter-class similarities that contradict the model's predicted semantics. GCR addresses this issue by introducing parameter-free Graph Consistency Layers (GCLs) at arbitrary depths. Each GCL builds a batch-level feature similarity graph and aligns it with a global, class-aware masked prediction graph, derived by modulating softmax prediction similarities with intra-class indicators. This alignment enforces that feature-level relationships reflect class-consistent prediction behavior, acting as a semantic regularizer throughout the network. Unlike prior work, GCR introduces a multi-layer, cross-space graph alignment mechanism with adaptive weighting, where layer importance is learned from graph discrepancy magnitudes. This allows the model to prioritize semantically reliable layers and suppress noisy ones, enhancing feature quality without modifying the architecture or training procedure. GCR is model-agnostic, lightweight, and improves semantic structure across various networks and datasets. Experiments show that GCR promotes cleaner feature structure, stronger intra-class cohesion, and improved generalization, offering a new perspective on learning from prediction structure. [Project website](https://darcyddx.github.io/gcr/) [Code](https://github.com/Darcyddx/graph-prompt)
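The mechanism described in the abstract can be sketched roughly as follows. This is a minimal, dependency-free illustration and not the authors' implementation: the function names, the use of cosine similarity for both graphs, the mean squared difference as the graph discrepancy, the use of ground-truth labels for the intra-class mask, and the softmax-over-negative-discrepancy weighting are all assumptions made here for illustration. The abstract states only that feature graphs are aligned with a class-masked prediction graph and that layer weights are derived from discrepancy magnitudes. In actual training these quantities would be computed with a differentiable framework so the loss can be backpropagated.

```python
import math

def cosine_graph(rows):
    """Pairwise cosine similarity between the row vectors of a batch."""
    norms = [math.sqrt(sum(v * v for v in r)) or 1.0 for r in rows]  # guard zero vectors
    n = len(rows)
    return [[sum(a * b for a, b in zip(rows[i], rows[j])) / (norms[i] * norms[j])
             for j in range(n)] for i in range(n)]

def softmax(values):
    """Numerically stable softmax over a list of floats."""
    m = max(values)
    exps = [math.exp(v - m) for v in values]
    total = sum(exps)
    return [e / total for e in exps]

def masked_prediction_graph(logits, labels):
    """Similarity of softmax outputs, kept only for same-class pairs.

    Assumption: the intra-class indicator is built from the labels.
    """
    sim = cosine_graph([softmax(z) for z in logits])
    n = len(labels)
    return [[sim[i][j] if labels[i] == labels[j] else 0.0 for j in range(n)]
            for i in range(n)]

def graph_discrepancy(F, P):
    """Mean squared difference between two graphs (an assumed distance)."""
    n = len(F)
    return sum((F[i][j] - P[i][j]) ** 2 for i in range(n) for j in range(n)) / (n * n)

def gcr_loss(layer_features, logits, labels):
    """Weighted sum of per-layer discrepancies against one shared prediction graph.

    Assumption: layers with a smaller discrepancy receive a larger weight.
    Returns (loss, weights).
    """
    P = masked_prediction_graph(logits, labels)
    gaps = [graph_discrepancy(cosine_graph(f), P) for f in layer_features]
    weights = softmax([-g for g in gaps])
    return sum(w * g for w, g in zip(weights, gaps)), weights

if __name__ == "__main__":
    # Toy batch of four samples from two classes (hypothetical data).
    labels = [0, 0, 1, 1]
    logits = [[5.0, 0.0], [5.0, 0.0], [0.0, 5.0], [0.0, 5.0]]
    aligned_layer = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 1.0]]
    noisy_layer = [[1.0, 1.0]] * 4  # every sample looks alike
    loss, weights = gcr_loss([aligned_layer, noisy_layer], logits, labels)
    print("loss:", loss, "layer weights:", weights)
```

In this toy batch, the layer whose features already separate the two classes receives the larger weight, while the layer that treats all samples as similar contributes most of the discrepancy. In practice the resulting term would be added to the primary loss (e.g., cross-entropy) with a scaling coefficient.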

Paper Structure

This paper contains 38 sections, 14 theorems, 63 equations, 8 figures, 17 tables.

Key Result

Theorem 1

Let $\ell(f(x), y)$ be a $\gamma$-Lipschitz loss function (e.g., cross-entropy), and let $\mathcal{F}_L$ be the class of functions at layer $l$ such that each function $f^{(l)}$ satisfies the $\ell_2$-bounded constraint $\|f^{(l)}(x)\|_2 \leq B$. Suppose $\mathcal{F}_\epsilon \subseteq \mathcal{F}_L$ ... where $\mathbf{P}_{ij}$ is the target alignment between the normalized feature vectors ...

Figures (8)

  • Figure 1: Relational graph visualization using a batch of 64 samples from CIFAR-10 on (left) DenseNet-121 and (right) MobileNet. We compare the baselines with their counterparts augmented by our GCLs. Our method promotes richer, class-aware semantic representations by acting as a form of self-prompting. For DenseNet-121, the baseline feature relational graph tends to connect samples based on superficial visual similarity (e.g., deer, horse, and automobile), often ignoring semantic boundaries. In contrast, our GCL-enhanced model produces more semantically coherent groupings, clearly separating animals from vehicles. On MobileNet, the prediction relational graph further highlights the strength of our method, demonstrating cleaner, more distinct class relationships compared to the baseline. These improvements reflect the effectiveness of our model in aligning feature and prediction spaces with semantic structure, despite being lightweight and parameter-free.
  • Figure 2: Our parameter-free Graph Consistency Layer (GCL), highlighted in red, can be inserted after any micro-network block (e.g., Inception) or specific layer (e.g., fully connected). Each GCL constructs a relational graph from batch-level features using a similarity metric (e.g., cosine). A reference graph is generated from softmax predictions and masked by intra-class indicators: binary masks identifying semantically consistent pairs. Each GCL enforces alignment between the masked prediction graph and the feature-level graph. The resulting consistency signals are adaptively weighted, forming the Graph Consistency Regularization (GCR) framework, which integrates with the primary loss (e.g., cross-entropy), acting as a semantic regularizer to guide learning.
  • Figure 3: Feature map visualizations from models trained on identical data batches: (top) baseline and (bottom) our GCL-augmented model. Brighter red regions indicate stronger feature activations. Compared to the baseline, GCL-enhanced maps more clearly emphasize class-discriminative cues, e.g., cat faces, ears, and eyes, and for dogs, tongues, noses, and facial contours, reflecting improved focus and interpretability. GCL also yields higher classification accuracy (98.1% $\rightarrow$ 99.8%).
  • Figure 4: Relational graph visualization on Kaggle cats vs. dogs. We compare the best baseline model and our GCL-augmented model using the same batch of 32 samples (red = cat, blue = dog). The baseline consists of four convolutional blocks and two fully connected layers; our method inserts a Graph Consistency Layer (GCL) after each, totaling six GCLs. The top row shows the baseline (without GCLs); the bottom row shows our GCL-enhanced model. Each column visualizes the relational graph at a specific layer, from early features (left) to final predictions (right). Early layers exhibit weak connectivity, as low-level features poorly capture class semantics. As depth increases, both models shift toward more structured, class-separable relationships. GCLs amplify this effect by attenuating low-similarity inter-class edges and reinforcing intra-class coherence, leading to improved accuracy (98.1% vs. 99.8%). For clarity, edges with similarity $<$ 0.4 are omitted.
  • Figure 5: Relational graph comparison across five models on the same batch. Top row: baselines; bottom row: GCL-augmented versions, showing sparser inter-class connections and stronger class-aware structure, highlighting GCL's effectiveness in enhancing relational representations.
  • ...and 3 more figures
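The relational graph figures above omit weak edges (Figure 4 drops edges with similarity below 0.4). A minimal sketch of that filtering step, assuming the graph is given as a symmetric similarity matrix stored as nested lists; the function name `graph_edges` is hypothetical and not taken from the paper's code:

```python
def graph_edges(sim, threshold=0.4):
    """List undirected edges (i, j, weight) whose similarity is at least `threshold`.

    Only the upper triangle is scanned, so each pair appears once and
    self-loops on the diagonal are skipped.
    """
    n = len(sim)
    return [(i, j, sim[i][j])
            for i in range(n)
            for j in range(i + 1, n)
            if sim[i][j] >= threshold]

if __name__ == "__main__":
    # Hypothetical 3-sample similarity matrix.
    sim = [[1.0, 0.9, 0.1],
           [0.9, 1.0, 0.4],
           [0.1, 0.4, 1.0]]
    for edge in graph_edges(sim):
        print(edge)
```

The resulting edge list can be passed to any graph-drawing tool; the threshold only affects what is displayed, not the graphs used during training.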

Theorems & Definitions (36)

  • Theorem 1: Generalization via Dudley's entropy integral
  • proof
  • Remark 1
  • Proposition 1: Spectral alignment
  • proof
  • Corollary 1
  • Theorem 2: PAC-Bayes generalization bound with GCR
  • Proposition 2: Structure-induced KL complexity
  • proof
  • Remark 2
  • ...and 26 more