GraphIC: A Graph-Based In-Context Example Retrieval Model for Multi-Step Reasoning

Jiale Fu; Yaqing Wang; Simeng Han; Jiaming Fan; Xu Yang

TL;DR

GraphIC presents a reasoning-aware, graph-based approach to retrieving in-context examples (ICEs) for multi-step reasoning. By constructing thought graphs and comparing them with a tailored asymmetric similarity metric, it filters out superficial semantics and aligns the retrieved ICEs with the underlying reasoning process. Across mathematical, coding, and logical tasks, GraphIC outperforms both training-free and training-based baselines, underscoring the value of representing reasoning steps in ICE selection. The work highlights a promising direction for robust, reasoning-guided retrieval in LLM-driven problem solving.

Abstract

In-context learning (ICL) enhances large language models (LLMs) by incorporating demonstration examples, yet its effectiveness depends heavily on the quality of the selected examples. Current methods typically use text embeddings to measure semantic similarity, which often introduces bias in multi-step reasoning tasks. This occurs because text embeddings contain irrelevant semantic information and lack deeper reasoning structures. To address this, we propose GraphIC, a graph-based retrieval model that leverages a reasoning-aware representation and a specialized similarity metric for in-context example retrieval. GraphIC first constructs thought graphs (directed, node-attributed graphs that explicitly model reasoning steps and their dependencies) for candidate examples and queries. This approach filters out superficial semantics while preserving essential reasoning processes. Next, GraphIC retrieves examples using a novel similarity metric tailored for these graphs, capturing sequential reasoning patterns and asymmetry between examples. Comprehensive evaluations across mathematical reasoning, code generation, and logical reasoning tasks demonstrate that GraphIC outperforms 10 baseline methods. Our results highlight the importance of reasoning-aware retrieval in ICL, offering a robust solution for enhancing LLM performance in multi-step reasoning scenarios.
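
To make the idea of a directed, node-attributed thought graph and an asymmetric comparison more concrete, the following is a minimal sketch. It is not the similarity metric defined in the paper: the class `ThoughtGraph`, the functions `coverage` and `retrieve`, the string labels used as node attributes, and the edge-coverage score are all hypothetical stand-ins chosen only to illustrate why a score between two graphs need not be symmetric.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set, Tuple


@dataclass
class ThoughtGraph:
    """Directed graph whose nodes carry an attribute (here, a reasoning-step label)."""
    attrs: Dict[str, str] = field(default_factory=dict)        # node id -> attribute
    edges: Set[Tuple[str, str]] = field(default_factory=set)   # (src, dst): dst depends on src

    def add_step(self, node: str, attr: str, depends_on: Tuple[str, ...] = ()) -> None:
        """Add a reasoning step and the earlier steps it depends on."""
        self.attrs[node] = attr
        for src in depends_on:
            self.edges.add((src, node))

    def labeled_edges(self) -> Set[Tuple[str, str]]:
        """Dependencies expressed by step labels instead of node ids."""
        return {(self.attrs[s], self.attrs[d]) for s, d in self.edges}


def coverage(query: ThoughtGraph, candidate: ThoughtGraph) -> float:
    """Toy asymmetric score (an assumption, not the paper's metric):
    fraction of the query's dependencies that also occur in the candidate."""
    q = query.labeled_edges()
    if not q:
        return 0.0
    return len(q & candidate.labeled_edges()) / len(q)


def retrieve(query: ThoughtGraph, candidates: Dict[str, ThoughtGraph], k: int) -> List[str]:
    """Return the names of the k candidates with the highest coverage of the query."""
    ranked = sorted(candidates, key=lambda name: coverage(query, candidates[name]), reverse=True)
    return ranked[:k]


# Query: add, then multiply.
query = ThoughtGraph()
query.add_step("q1", "add")
query.add_step("q2", "multiply", depends_on=("q1",))

# Candidate 1: add, multiply, divide (contains the query's pattern).
ex1 = ThoughtGraph()
ex1.add_step("a", "add")
ex1.add_step("b", "multiply", depends_on=("a",))
ex1.add_step("c", "divide", depends_on=("b",))

# Candidate 2: subtract, then divide (no shared dependency).
ex2 = ThoughtGraph()
ex2.add_step("a", "subtract")
ex2.add_step("b", "divide", depends_on=("a",))

best = retrieve(query, {"ex1": ex1, "ex2": ex2}, k=1)
```

In this toy setup, `ex1` fully covers the query's single dependency, while the query covers only half of `ex1`'s dependencies, so swapping the arguments of `coverage` changes the score. That is the sense of asymmetry the sketch is meant to convey: a candidate that helps solve the query is not necessarily helped by it to the same degree.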

Paper Structure (13 sections, 5 equations, 4 figures, 2 tables)

Introduction
Related Work
The Proposed GraphIC
Thought Graphs
Similarity Measure for Thought Graphs
Example Retrieval
Experiments
Experimental Setup
Baselines
Main Results
Ablation Study
Analysis
Conclusion

Figures (4)

Figure 1: ICL with different ICE retrieval mechanisms. The left panel shows examples retrieved via BERT embeddings (Devlin et al., 2019), while the right panel displays examples retrieved via GraphIC. Semantically related terms are highlighted in blue, and quantities or variables needing resolution are highlighted in green.
Figure 2: An example of a thought graph (a) and its corresponding FRR (b).
Figure 3: Performance of GraphIC and Top Training-based/Training-free Baselines (DQ-LoRe and Complex-CoT) across 1–8 Shot Settings.
Figure 4: Ground-truth matrix and score matrices of various models. The matrix values have been linearly scaled to the range [0,1], with darker shades representing values closer to 1, and the diagonal elements have been set to 1.
