FoGE: Fock Space inspired encoding for graph prompting

Sotirios Panagiotis Chytas, Rudrasis Chakraborty, Vikas Singh

TL;DR

FoGE addresses graph-grounded reasoning for LLMs by introducing a parameter-free Fock-space–inspired encoding that maps graphs to fixed-dimensional embeddings. It builds a pipeline from graphs to Clifford algebras and into a graded Fock space, using a Dirac operator with $D^2 = \mathcal{L}$ and a spinor/Fock-space isomorphism to implement practical operations via $E_k \simeq a_k + a_k^*$. The embeddings are integrated with a frozen LLM through simple linear adapters for prefix-tuning, enabling graph questions from simple graphs to hypergraphs and proteins with minimal task-specific training. Across diverse datasets, FoGE achieves rich graph representations, competitive performance against specialized graph models, and favorable runtime, with open-source code enabling broad adoption.
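The relation $E_k \simeq a_k + a_k^*$ can be checked numerically on a small example. The sketch below is illustrative and not taken from the FoGE code: it assumes a Jordan-Wigner matrix representation of the fermionic annihilation operators $a_k$ and the positive-signature convention $E_k^2 = 1$, and the function names are hypothetical. Under those assumptions, the operators $a_k + a_k^*$ square to the identity and anticommute pairwise, which are the defining relations of Clifford generators.

```python
import numpy as np


def annihilation_operators(n):
    """Jordan-Wigner matrices a_0..a_{n-1} on a 2**n-dimensional Fock space.

    Assumption: this is one standard matrix representation, chosen for
    illustration; it is not claimed to be the representation used by FoGE.
    """
    identity = np.eye(2)
    parity = np.diag([1.0, -1.0])       # sign string that enforces anticommutation
    lower = np.array([[0.0, 1.0],
                      [0.0, 0.0]])      # single-mode annihilator: |1> -> |0>
    ops = []
    for k in range(n):
        factors = [parity] * k + [lower] + [identity] * (n - k - 1)
        op = factors[0]
        for factor in factors[1:]:
            op = np.kron(op, factor)
        ops.append(op)
    return ops


def clifford_generators(n):
    """Candidate generators E_k = a_k + a_k^* (matrices are real, so * is transpose)."""
    return [a + a.T for a in annihilation_operators(n)]


def satisfies_clifford_relations(gens, tol=1e-12):
    """Check E_j E_k + E_k E_j = 2 * delta_jk * I for every pair of generators."""
    dim = gens[0].shape[0]
    for j, e_j in enumerate(gens):
        for k, e_k in enumerate(gens):
            anticommutator = e_j @ e_k + e_k @ e_j
            target = 2.0 * np.eye(dim) if j == k else np.zeros((dim, dim))
            if not np.allclose(anticommutator, target, atol=tol):
                return False
    return True


if __name__ == "__main__":
    generators = clifford_generators(3)
    print(satisfies_clifford_relations(generators))
```

The matrices grow as $2^n \times 2^n$, so this check is only practical for a handful of modes; it is meant to make the algebraic identity concrete, not to suggest how the encoder is implemented.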

Abstract

Recent results show that modern Large Language Models (LLMs) are capable of understanding and answering questions about structured data such as graphs. This new paradigm can lead to solutions that require less supervision while also providing a model that can generalize and answer questions beyond the training labels. Existing proposals often use some description of the graph to create an "augmented" prompt fed to the LLM. For a chosen class of graphs, if a well-tailored graph encoder is deployed alongside a pre-trained LLM, the model can answer graph-related questions well. Existing solutions to graph-based prompts range from graph serialization to graph transformers. In this work, we show that a parameter-free graph encoder based on Fock space representations, a concept borrowed from mathematical physics, is remarkably versatile in this problem setting. The simple construction, inherited directly from the theory with a few small adjustments, provides rich and informative graph encodings for a wide range of graphs. We investigate the use of this idea for prefix-tuned prompts that leverage the capabilities of a pre-trained, frozen LLM. The modifications lead to a model that can answer graph-related questions, from simple graphs to proteins to hypergraphs, effectively and with minimal, if any, adjustments to the architecture. Our work significantly simplifies existing solutions and generalizes well to multiple graph-based structures.

Paper Structure

This paper contains 27 sections, 6 equations, 9 figures, 12 tables.

Figures (9)

  • Figure 1: Augmenting LLMs' capabilities by prompting them with carefully encoded graphs.
  • Figure 2: Single, Bi- and Tri-vectors in Clifford Algebra with wedge products.
  • Figure 3: From graph to Fock space representations.
  • Figure 4: Graphs, Hypergraphs, Attributed graphs, Proteins. All these types can be efficiently encoded using FoGE.
  • Figure 5: FoGE-LLM overview. A parameter-free graph encoder produces graph embeddings for a range of different graphs. Linear adapters then feed these embeddings to a frozen LLM for prefix tuning.
  • ...and 4 more figures

Theorems & Definitions (1)

  • Definition 2.1