IESR: Efficient MCTS-Based Modular Reasoning for Text-to-SQL with Large Language Models

Tao Liu; Jiafan Lu; Bohan Yu; Pengcheng Wu; Liu Haixin; Guoyu Xu; Li Xiangheng; Lixiao Li; Jiaming Hou; Zhao Shijun; Xinglin Lyu; Kunli Zhang; Yuxiang Jia; Hongyin Zan

TL;DR

IESR tackles Text-to-SQL under complex cross-domain reasoning by decoupling numerical computation from SQL construction and leveraging Monte Carlo Tree Search with trajectory-level verification. The three-stage pipeline (Question Information Understanding, MCTS-based Reasoning, and Trajectory Selection with a discriminator-based check) enables robust, executable SQL with lightweight 7B–8B LLMs without instruction tuning. It achieves state-of-the-art results on LogicCat ($EX=24.28$) and strong Archer results ($EX=37.28$), while revealing biases in current coder models. The work highlights directions for integrating explicit math and domain knowledge into scalable Text-to-SQL systems.

Abstract

Text-to-SQL is a key natural language processing task that maps natural language questions to SQL queries, enabling intuitive interaction with web-based databases. Although current methods perform well on benchmarks like BIRD and Spider, they struggle with complex reasoning, domain knowledge, and hypothetical queries, and remain costly in enterprise deployment. To address these issues, we propose IESR (Information Enhanced Structured Reasoning), a framework for lightweight large language models that (i) leverages LLMs for key information understanding and schema linking while decoupling mathematical computation from SQL generation, (ii) integrates a multi-path reasoning mechanism based on Monte Carlo Tree Search (MCTS) with majority voting, and (iii) introduces a trajectory consistency verification module with a discriminator model to ensure accuracy and consistency. Experimental results demonstrate that IESR achieves state-of-the-art performance on the complex reasoning benchmark LogicCat (24.28 EX) and the Archer dataset (37.28 EX) using only compact models without fine-tuning. Furthermore, our analysis reveals that current coder models exhibit notable biases and deficiencies in physical knowledge, mathematical computation, and common-sense reasoning, highlighting important directions for future research. We release our code at https://github.com/Ffunkytao/IESR-SLM.
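The majority-voting step mentioned in the abstract can be illustrated with a small sketch. The abstract does not say how votes are tallied, so this example assumes that candidate SQL queries are grouped by their execution result and that the most frequent result wins. The function names (`execute`, `majority_vote`), the `parts` table, and the use of SQLite are all hypothetical choices made for illustration. They are not taken from the IESR code base.

```python
import sqlite3
from collections import Counter


def execute(conn, sql):
    """Run a query and return an order-insensitive, hashable result.

    Returns None when the query fails, so broken candidates cast no vote.
    """
    try:
        rows = conn.execute(sql).fetchall()
    except sqlite3.Error:
        return None
    return tuple(sorted(rows, key=repr))


def majority_vote(conn, candidates):
    """Return the first candidate whose execution result is the most common.

    Assumption: votes are counted per execution result, not per SQL string,
    so syntactically different but equivalent queries reinforce each other.
    """
    results = {}
    for sql in candidates:
        res = execute(conn, sql)
        if res is not None:
            results[sql] = res
    if not results:
        return None
    winning_result, _ = Counter(results.values()).most_common(1)[0]
    for sql in candidates:
        if results.get(sql) == winning_result:
            return sql


# Hypothetical toy database and candidate trajectories.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE parts (name TEXT, mass_kg REAL)")
conn.executemany(
    "INSERT INTO parts VALUES (?, ?)",
    [("bolt", 0.5), ("gear", 2.0), ("axle", 7.5)],
)
candidates = [
    "SELECT name FROM parts WHERE mass_kg > 1.0",
    "SELECT name FROM parts WHERE mass_kg >= 2.0",
    "SELECT name FROM parts WHERE mass_kg > 5.0",
    "SELECT nme FROM parts",  # invalid column, ignored by the vote
]
best = majority_vote(conn, candidates)
print(best)
```

In this toy run the first two candidates return the same rows and therefore outvote the third, while the invalid query is skipped. A discriminator-based consistency check, as described in the abstract, would be a separate filtering step and is not modeled here.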

Paper Structure (40 sections, 9 equations, 16 figures, 8 tables, 3 algorithms)

This paper contains 40 sections, 9 equations, 16 figures, 8 tables, 3 algorithms.

Introduction
Related Work
Text-to-SQL with Decomposition and Search-based Reasoning
Structured Reasoning and Optimization for Text-to-SQL
Methodology
Question Information Understanding
Intent and Information Understanding.
Constraint-aware Relation Filtering.
Soft Consistency Scoring.
Schema Linking and Compression.
MCTS-based CoT Reasoning
Problem Formulation.
Human-inspired Reasoning Actions.
Reward-based Node Evaluation.
MCTS Search and Backpropagation.
...and 25 more sections

Figures (16)

Figure 1: Motivation for Decoupling Mathematical Computation and SQL Generation in Text-to-SQL.
Figure 2: The comprehensive workflow of IESR, comprising three stages: Question Understanding with Schema Linking, Monte Carlo Tree Search (MCTS)-based Reasoning, and Trajectory Selection with Mutual Reasoning Consistency.
Figure 3: A visual illustration of heterogeneous MCTS actions (A1–A6) for SQL generation and reasoning.
Figure 4: Performance heatmap of different methods on the LogicCat dataset across three difficulty levels (Easy, Medium, Hard) and different reasoning types. The heatmap reveals that IESR methods consistently outperform all standard prompting baselines across difficulty levels and reasoning types.
Figure 5: Ablation study of $N_\text{rollout}$ across four backbone models on the LogicCat dataset.
...and 11 more figures
