Enhancing Logical Reasoning in Large Language Models through Graph-based Synthetic Data

Jiaming Zhou, Abbas Ghaddar, Ge Zhang, Liheng Ma, Yaochen Hu, Soumyasundar Pal, Mark Coates, Bin Wang, Yingxue Zhang, Jianye Hao

TL;DR

This work tackles the difficulty of long-chain logical reasoning in LLMs by introducing a graph-centric data augmentation framework. It represents reasoning tasks as relational graphs and generates synthetic, diverse subgraphs through random-walk sampling, then tunes models with supervised fine-tuning using a novel Extract-Then-Answer prompting strategy. Across CLUTRR and StepGame, the approach yields consistent reasoning gains, especially at higher complexity, while maintaining performance on standard benchmarks and preserving open-domain knowledge. The findings demonstrate that controllable graph-based data, when paired with task-specific prompting, provides an effective, data-efficient path to enhance reasoning in open LLMs.
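To make the random-walk sampling idea concrete, here is a minimal sketch of drawing one reasoning chain from a relational graph. It is an illustration under stated assumptions, not the authors' algorithm: the toy kinship triples, the function names `build_adjacency` and `sample_chain`, and the no-revisit rule are all hypothetical choices made for this example.

```python
import random

# Hypothetical toy kinship graph as (head, relation, tail) triples.
# These entities and relations are illustrative, not taken from CLUTRR.
EDGES = [
    ("Alice", "mother_of", "Bob"),
    ("Bob", "father_of", "Carol"),
    ("Carol", "sister_of", "Dave"),
    ("Dave", "husband_of", "Eve"),
    ("Alice", "wife_of", "Frank"),
]


def build_adjacency(edges):
    """Map each head entity to its outgoing (relation, tail) pairs."""
    adj = {}
    for head, rel, tail in edges:
        adj.setdefault(head, []).append((rel, tail))
    return adj


def sample_chain(adj, start, max_hops, rng):
    """Random walk from `start` that never revisits a node.

    Returns the visited edges as a list of (head, relation, tail) triples.
    The walk stops early when the current node has no unvisited neighbor.
    """
    chain, visited, node = [], {start}, start
    for _ in range(max_hops):
        options = [(r, t) for r, t in adj.get(node, []) if t not in visited]
        if not options:
            break
        rel, nxt = rng.choice(options)
        chain.append((node, rel, nxt))
        visited.add(nxt)
        node = nxt
    return chain


rng = random.Random(0)  # fixed seed so the sketch is reproducible
adj = build_adjacency(EDGES)
chain = sample_chain(adj, "Alice", max_hops=3, rng=rng)
for head, rel, tail in chain:
    print(f"{head} --{rel}--> {tail}")
```

In a setup like this, the first and last entities of a sampled chain could serve as the query pair, and `max_hops` would control the chain length, which is the kind of complexity knob the summary refers to.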

Abstract

Despite recent advances in training and prompting strategies for Large Language Models (LLMs), these models continue to face challenges with complex logical reasoning tasks that involve long reasoning chains. In this work, we explore the potential and limitations of using graph-based synthetic reasoning data as training signals to enhance LLMs' reasoning capabilities. Our extensive experiments, conducted on two established natural language reasoning tasks -- inductive reasoning and spatial reasoning -- demonstrate that supervised fine-tuning (SFT) with synthetic graph-based reasoning data effectively enhances LLMs' reasoning performance without compromising their effectiveness on other standard evaluation benchmarks.

Paper Structure (23 sections, 3 figures, 13 tables, 1 algorithm)

Figures (3)

  • Figure 1: Illustration of a kinship graph highlighting a reasoning chain sampled by our algorithm (green) for LLM adaptation, and an ignored simpler chain (red).
  • Figure 2: System performance on the CLUTRR (top) and StepGame (bottom) datasets for 2, 6, and 10 hops.
  • Figure 3: Mistral-2-7B performance on the CLUTRR (left) and StepGame (right) datasets under the FS and SFT-S settings when using STD-P and ETA-P prompting.