SAFT: Structure-Aware Fine-Tuning of LLMs for AMR-to-Text Generation

Rafiq Kamel, Filippo Guerranti, Simon Geisler, Stephan Günnemann

TL;DR

This paper introduces SAFT, a lightweight, architecture-agnostic method for injecting graph structure into decoder-only LLMs for AMR-to-text generation. By transforming AMR graphs into semantics-preserving graphs and deriving AMR positional encodings (AmrPEs) from the magnetic Laplacian, SAFT injects structure-aware guidance directly into the input embeddings during fine-tuning. Empirically, SAFT achieves state-of-the-art results on AMR 3.0, with notable advantages on structurally complex and document-level inputs, and shows that the benefits scale with graph complexity while the computational overhead stays modest. The approach offers a general pathway for bridging structured data and language models, with potential extensions to other graph-structured inputs beyond AMR.

Abstract

Large Language Models (LLMs) are increasingly applied to tasks involving structured inputs such as graphs. Abstract Meaning Representations (AMRs), which encode rich semantics as directed graphs, offer a rigorous testbed for evaluating LLMs on text generation from such structures. Yet, current methods often arbitrarily linearize AMRs, discarding key structural cues, or rely on architectures incompatible with standard LLMs. We introduce SAFT, a structure-aware fine-tuning approach that injects graph topology into pretrained LLMs without architectural changes. We compute direction-sensitive positional encodings from the magnetic Laplacian of transformed AMRs and project them into the embedding space of the LLM. While possibly applicable to any graph-structured inputs, we focus on AMR-to-text generation as a representative and challenging benchmark. SAFT sets a new state-of-the-art on AMR 3.0 with a 3.5 BLEU improvement over baselines. Gains scale with graph complexity, highlighting the value of structure-aware representations in enhancing LLM performance. SAFT offers a general and effective pathway for bridging structured data and language models.
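To make the phrase "direction-sensitive positional encodings from the magnetic Laplacian" more concrete, here is a minimal NumPy sketch. It is not the authors' implementation: the symmetrization $A_s = (A + A^\top)/2$, the potential `q`, the number of eigenvectors `k`, and the function names are all assumptions chosen for illustration, following one common convention for the magnetic Laplacian $L_q = D_s - A_s \odot \exp(i\,2\pi q\,(A - A^\top))$.

```python
import numpy as np

def magnetic_laplacian(adj, q=0.25):
    """Build a magnetic Laplacian for a directed graph (one common convention).

    adj[u, v] = 1 means there is an edge u -> v. The value of q is an
    illustrative assumption, not a setting taken from the paper.
    """
    adj = np.asarray(adj, dtype=float)
    a_sym = 0.5 * (adj + adj.T)                 # symmetrized adjacency A_s
    deg = np.diag(a_sym.sum(axis=1))            # degree matrix D_s of A_s
    theta = 2.0 * np.pi * q * (adj - adj.T)     # antisymmetric phase: encodes direction
    return deg - a_sym * np.exp(1j * theta)     # Hermitian by construction

def magnetic_laplacian_pe(adj, q=0.25, k=2):
    """Return the k smallest eigenvalues and their complex eigenvectors.

    Row u of the eigenvector matrix can serve as a positional encoding of node u.
    """
    lap = magnetic_laplacian(adj, q)
    eigvals, eigvecs = np.linalg.eigh(lap)      # ascending real eigenvalues
    return eigvals[:k], eigvecs[:, :k]

# Hypothetical toy graph: a directed path 0 -> 1 -> 2.
toy_adj = np.array([[0, 1, 0],
                    [0, 0, 1],
                    [0, 0, 0]])
toy_vals, toy_vecs = magnetic_laplacian_pe(toy_adj, q=0.25, k=2)
```

Because the phase term changes sign when an edge is reversed, the eigenvectors are complex and carry direction information that the Laplacian of the undirected graph would discard. With `q=0` the construction reduces to the ordinary Laplacian of the symmetrized graph.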

Paper Structure

This paper contains 59 sections, 13 equations, 17 figures, 12 tables, 1 algorithm.

Figures (17)

  • Figure 1: Overview of SAFT. An AMR graph ${\mathcal{A}}$ is first linearized into a token sequence ${\mathcal{L}}_{\mathcal{A}}$. We then construct a graph transformation ${\mathcal{G}}_{\mathcal{A}}$ and compute structure-aware positional encodings from its magnetic Laplacian. These encodings are combined with standard token positions to form AMR-specific embeddings (AmrPE). A simple MLP $f_\vartheta$ aligns them with the embedding space of the LLM $\pi_\theta$, after which they are injected into the token embeddings $X$. The model is fine-tuned to generate text ${\mathcal{S}}$, enabling structure-aware AMR-to-text generation without altering the LLM architecture.
  • Figure 2: BLEU score improvements of structure-aware fine-tuned (SAFT) models over conventionally fine-tuned (FT) counterparts, on AMRs of depth $\delta(\mathcal{A}) \geq z$. (a) Absolute improvement ($\Delta_\text{BLEU}$): differences in BLEU between SAFT and FT models across graph depths and model families. (b) Relative improvement (${\Delta^{1}_\text{BLEU}}$): differences in BLEU between SAFT and FT models across graph depths and model families, normalized by performance on depth-1 graphs. Both plots reveal an increasing advantage of SAFT as structural complexity grows, demonstrating its effectiveness in leveraging graph topology for improved generation. Lines are 2nd-degree polynomial fits.
  • Figure 3: SAFT demonstrates increasing gains over standard fine-tuning (FT) as document complexity increases. Performance on the DocAMR test set: Each plot shows the BLEU score improvement of SAFT over FT models, evaluated cumulatively on document-level AMRs with $\#_{\mathrm{AMR}} \leq z$, where $\#_{\mathrm{AMR}}$ denotes the number of AMR graphs contained in a document. This bottom-up stratified evaluation reveals how SAFT performs on increasingly complex document structures. On average across document sizes, SAFT outperforms FT models by +6.16, +5.24, and +4.50 BLEU for Qwen 2.5 3B, LLaMA 3.2 1B, and LLaMA 3.2 3B, respectively.
  • Figure 4: Effect of AMR positional embeddings (AmrPE) on token representation geometry. (a) AmrPE has a larger magnitude than ${\bm{X}}$, though of the same order. (b) It is injected along directions largely orthogonal to ${\bm{X}}$. (c) Its variance concentrates on a small number of PCA directions of ${\bm{X}}$. (d) In this PCA space, ${\bm{H}}={\bm{X}}+\textsc{AmrPE}$ shifts coherently along the dominant axis. (e) ${\bm{H}}$ exhibits a flatter explained-variance spectrum, indicating increased intrinsic dimensionality.
  • Figure 5: Three aligned representations of the sentence "The child wants the parent to believe them.": (a) a graph-based AMR structure, (b) its corresponding Penman notation, and (c) a BFS linearization used for sequence-based processing.
  • ...and 12 more figures
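
The injection step described in the captions of Figures 1 and 4 (an MLP $f_\vartheta$ maps the positional features into the embedding space, and the result is added to the token embeddings, ${\bm{H}}={\bm{X}}+\textsc{AmrPE}$) can be sketched as follows. This is an illustration under stated assumptions rather than the paper's code: the two-layer ReLU MLP, the layer sizes, the initialization, and the names `init_params`, `mlp_project`, and `inject` are hypothetical, and the per-token positional features are taken as given real-valued inputs.

```python
import numpy as np

def init_params(d_pe, d_hidden, d_model, rng):
    """Randomly initialize a two-layer MLP (architecture is an assumption)."""
    w1 = rng.normal(scale=0.02, size=(d_pe, d_hidden))
    b1 = np.zeros(d_hidden)
    w2 = rng.normal(scale=0.02, size=(d_hidden, d_model))
    b2 = np.zeros(d_model)
    return w1, b1, w2, b2

def mlp_project(pe, w1, b1, w2, b2):
    """Map per-token positional features of shape (T, d_pe) to (T, d_model)."""
    hidden = np.maximum(pe @ w1 + b1, 0.0)      # ReLU nonlinearity
    return hidden @ w2 + b2

def inject(token_emb, pe, params):
    """Additive injection H = X + f(pe); the LLM itself is left unchanged."""
    return token_emb + mlp_project(pe, *params)

# Hypothetical sizes: 5 tokens, 4 positional features, embedding width 8.
rng = np.random.default_rng(0)
token_emb = rng.normal(size=(5, 8))             # stands in for X
pos_feats = rng.normal(size=(5, 4))             # stands in for raw positional features
params = init_params(d_pe=4, d_hidden=16, d_model=8, rng=rng)
hidden_states = inject(token_emb, pos_feats, params)
```

Since the encodings only modify the input embeddings, a design of this kind leaves every transformer layer untouched, which is consistent with the captions' claim that the LLM architecture is not altered.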