LLMSR@XLLM25: Less is More: Enhancing Structured Multi-Agent Reasoning via Quality-Guided Distillation

Jiahao Yuan, Xingzhe Sun, Xing Yu, Jingwen Wang, Dehui Du, Zhiqing Cui, Zixiang Di

TL;DR

This work addresses structured multi-agent reasoning under severe data scarcity by introducing Less is More, a pipeline that builds high-quality supervision from 24 labeled examples through prompt induction, retrieval-augmented synthesis, and reward-guided filtering. It deploys a three-agent inference flow (Parser, Decomposer, Verifier) and a three-stage distillation process to generate robust, interpretable reasoning traces. Experiments on LLMSR@XLLM25 show that data quality-driven signals, especially when combining few-shot and zero-shot reward prompts, substantially improve parsing, reasoning, and verification metrics, with an emergent boost in overall question understanding. The study highlights the value of controllable distillation for scalable structured inference in data-scarce domains and provides a practical, open-source approach for low-resource reasoning tasks.

Abstract

LLMSR@XLLM25 formulates a low-resource structural reasoning task that challenges LLMs to generate interpretable, step-by-step rationales with minimal labeled data. We present Less is More, the third-place winning approach in LLMSR@XLLM25, which focuses on structured reasoning from only 24 labeled examples. Our approach leverages a multi-agent framework with reverse-prompt induction, retrieval-augmented reasoning synthesis via GPT-4o, and dual-stage reward-guided filtering to distill high-quality supervision across three subtasks: question parsing, CoT parsing, and step-level verification. All modules are fine-tuned from Meta-Llama-3-8B-Instruct under a unified LoRA+ setup. By combining structure validation with reward filtering across few-shot and zero-shot prompts, our pipeline consistently improves structured reasoning quality. These results underscore the value of controllable data distillation in enhancing structured inference under low-resource constraints. Our code is available at https://github.com/JhCircle/Less-is-More.
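
The combination of structure validation and reward filtering described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the required JSON keys, the two reward callables, the averaging of their scores, and the 0.5 threshold are all assumptions made for illustration. In practice the reward callables would wrap a reward model prompted in few-shot and zero-shot mode.

```python
import json
from typing import Callable, List

# Hypothetical schema: the keys a synthesized sample is assumed to contain.
REQUIRED_KEYS = {"question_parsing", "cot_parsing"}


def is_structurally_valid(raw: str) -> bool:
    """Structure check: the candidate must be a JSON object with the expected keys."""
    try:
        obj = json.loads(raw)
    except json.JSONDecodeError:
        return False
    return isinstance(obj, dict) and REQUIRED_KEYS.issubset(obj)


def filter_candidates(
    candidates: List[str],
    few_shot_reward: Callable[[str], float],
    zero_shot_reward: Callable[[str], float],
    threshold: float = 0.5,  # assumed cut-off, not taken from the paper
) -> List[str]:
    """Keep candidates that pass the structure check and the reward check.

    The two reward scores are averaged here; that combination rule is an
    assumption, since the abstract only says both prompt styles are used.
    """
    kept = []
    for raw in candidates:
        if not is_structurally_valid(raw):
            continue  # malformed samples never reach the reward stage
        score = (few_shot_reward(raw) + zero_shot_reward(raw)) / 2.0
        if score >= threshold:
            kept.append(raw)
    return kept
```

Running the structure check first means the more expensive reward scoring is only spent on well-formed samples.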

Paper Structure

This paper contains 14 sections, 5 equations, 6 figures, 2 tables.

Figures (6)

  • Figure 1: Overview of the Less is More reasoning framework. Training includes reverse-prompt induction, GPT-4o-based synthesis, and reward filtering. Inference deploys fine-tuned agents for question parsing and structured CoT generation.
  • Figure 2: Question Parsing Prompt $\mathcal{P}_{QP}$
  • Figure 3: Unified CoT Reasoning Prompt $\mathcal{P}_{UCoT}$
  • Figure 4: CoT Statement Prompt $\mathcal{P}_{CP}$
  • Figure 5: CoT Evidence Prompt $\mathcal{P}_{CV}^{evidence}$
  • ...and 1 more figure