Iterative Graph Alignment

Fangyuan Yu; Hardeep Singh Arora; Matt Johnson

TL;DR

This work tackles representation gaps that impede reliable rule-based alignment in large language models. It introduces Iterative Graph Alignment (IGA), an annotation-free framework that combines Iterative Graph Prompting (IGP) for graph-based reasoning with Self-Aligned Incremental Learning (SAIL) for adaptive, diverse data augmentation and iterative fine-tuning. Using RuleAlign, a 1.5K-query dataset spanning five rule-based tasks, the authors report substantial gains: IGP yields up to 73.12% relative improvement in rule-based alignment on Claude Sonnet 3.5, and IGA fine-tuning of Llama3-8B-Instruct achieves up to 86.20% improvement, matching or exceeding proprietary baselines. The approach reduces reliance on human annotation, patches representation gaps, and offers a scalable path toward more robust, rule-consistent LLMs through a multi-agent curriculum learning paradigm.

Abstract

By compressing diverse narratives, LLMs go beyond memorization, achieving intelligence by capturing generalizable causal relationships. However, they suffer from local 'representation gaps' due to insufficient training data diversity, limiting their real-world utility, especially in tasks requiring strict alignment to rules. Traditional alignment methods relying on heavy human annotations are inefficient and unscalable. Recent self-alignment techniques also fall short, as they often depend on self-selection-based prompting and memorization-based learning. To address these issues, we introduce Iterative Graph Alignment (IGA), an annotation-free rule-based alignment algorithm. A teacher model (VLM) employs Iterative Graph Prompting (IGP) to create logical graphs and reference answers. The student model (LLM) identifies local knowledge gaps by attempting to align its responses with these references, collaborating with helper models to generate diverse answers. These aligned responses are then used for iterative supervised fine-tuning (SFT). Our evaluations across five rule-based scenarios demonstrate IGP's effectiveness, with a 73.12% alignment improvement in Claude Sonnet 3.5, and Llama3-8B-Instruct achieving an 86.20% improvement, outperforming Claude Sonnet 3.5 in rule-based alignment.
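The loop described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: `iga_round`, the callable signatures, and the toy teacher, student, judge, and helpers are all hypothetical stand-ins, and the supervised fine-tuning step that would follow each round is omitted.

```python
from typing import Callable, List, Tuple

# Hypothetical interfaces (assumptions for illustration, not from the paper):
Teacher = Callable[[str], Tuple[str, str]]  # query -> (logical graph, reference answer)
Student = Callable[[str], str]              # query -> response
Judge = Callable[[str, str], bool]          # (response, reference) -> aligned?
Helper = Callable[[str, str, str], str]     # (query, graph, reference) -> alternative answer


def iga_round(
    queries: List[str],
    teacher: Teacher,
    student: Student,
    is_aligned: Judge,
    helpers: List[Helper],
) -> List[Tuple[str, str]]:
    """Run one round and return (query, answer) pairs for hard cases only."""
    sft_pairs: List[Tuple[str, str]] = []
    for query in queries:
        graph, reference = teacher(query)        # stands in for IGP
        if is_aligned(student(query), reference):
            continue                             # no gap detected for this query
        for helper in helpers:                   # collect diverse aligned answers
            candidate = helper(query, graph, reference)
            if is_aligned(candidate, reference):
                sft_pairs.append((query, candidate))
    return sft_pairs


# Toy stand-ins so the sketch runs end to end (purely illustrative).
def toy_teacher(query: str) -> Tuple[str, str]:
    if "forbidden" in query:
        return ("rule -> refuse", "REFUSE")
    return ("rule -> answer", "ANSWER")


def toy_student(query: str) -> str:
    return "ANSWER"  # this toy student never refuses


def toy_judge(response: str, reference: str) -> bool:
    return response.startswith(reference)


toy_helpers: List[Helper] = [
    lambda q, g, ref: f"{ref} (polite)",
    lambda q, g, ref: f"{ref} (brief)",
]

pairs = iga_round(
    ["forbidden request", "ordinary request"],
    toy_teacher, toy_student, toy_judge, toy_helpers,
)
```

In this toy run only the query the student gets wrong contributes training pairs, which mirrors the idea of spending augmentation effort on cases where a representation gap shows up. A full iteration would fine-tune the student on `pairs` and repeat.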

Paper Structure (20 sections, 3 equations, 4 figures, 2 tables, 4 algorithms)

Introduction
Thinking: Iterative Graph Prompting
Learning: Self-Aligned Incremental Learning
Iterative Graph Alignment
Background and Related work
Alignment Algorithm and Preference Optimization
Iterative self-improving LLM system
Thinking mechanism
Efficient Alignment in LLMs
Methodology
Iterative Graph Prompting
Self Evaluation
Iterative Refinement
Visual Prompting
Self-Aligned Incremental Learning
...and 5 more sections

Figures (4)

Figure 1: Customer Roleplay Issue
Figure 2: Iterative Graph Alignment (IGA). A teacher model (VLM) iteratively generates logical graphs and reference answers using Iterative Graph Prompting (IGP). A student model (LLM) reviews its responses against these reference answers to identify hard cases where representation gaps exist. The student then collaborates with helper models to explore diverse ways of responding to these challenging queries, taking hints from the logical graphs and reference answers, before fine-tuning on the collected insights and proceeding to the next iteration.
Figure 3: Iterative Graph Prompting (IGP)
Figure 4: Self-Aligned Incremental Learning (SAIL)
