Boost, Disentangle, and Customize: A Robust System2-to-System1 Pipeline for Code Generation

Kounianhua Du, Hanjing Wang, Jianxing Liu, Jizheng Chen, Xinyi Dai, Yasheng Wang, Ruiming Tang, Yong Yu, Jun Wang, Weinan Zhang

TL;DR

The paper tackles how to translate System2 reasoning into robust System1 code generation by addressing hidden reasoning complexity and heterogeneous data distributions. It introduces the BDC pipeline, combining mutual verification among multiple LLMs with a Monte-Carlo Tree Search framework, disentangling heterogeneous data into composable LoRA experts (DisenLoRA), and using an input-aware hypernetwork to customize problem solvers per instance. The approach integrates reflection-based pruning, thought-to-solution simulation, and rank-wise expert weighting to produce robust, adaptable solvers for code tasks. Experimental results on APPS and CodeContest show superior performance versus baselines and demonstrate cross-dataset generalization, highlighting practical impact for scalable, automated code generation. Overall, the work provides a modular, data-driven pathway to harness System2 capabilities for System1-style code generation with enhanced robustness and flexibility.

Abstract

Large language models (LLMs) have demonstrated remarkable capabilities in various domains, particularly in System 1 tasks, yet the intricacies of their problem-solving mechanisms in System 2 tasks remain insufficiently explored. Research on System2-to-System1 methods has recently surged; these methods explore System 2 reasoning knowledge via inference-time computation and compress the explored knowledge into a System 1 process. In this paper, we focus on code generation, a representative System 2 task, and identify two primary challenges: (1) the complex hidden reasoning processes and (2) the heterogeneous data distributions that complicate the exploration and training of robust LLM solvers. To tackle these issues, we propose a novel BDC framework that explores insightful System 2 knowledge of LLMs using an MC-Tree-Of-Agents algorithm with mutual Boosting (B), Disentangles (D) the heterogeneous training data into composable LoRA experts, and obtains a Customized (C) problem solver for each data instance with an input-aware hypernetwork that weights the LoRA experts, offering effectiveness, flexibility, and robustness. The framework leverages multiple LLMs through mutual verification and boosting, integrated into a Monte-Carlo Tree Search process enhanced by reflection-based pruning and refinement. Additionally, we introduce the DisenLoRA algorithm, which clusters heterogeneous data to fine-tune LLMs into composable LoRA experts, enabling the adaptive generation of customized problem solvers through an input-aware hypernetwork. This work lays the groundwork for advancing LLM capabilities in complex reasoning tasks, offering a novel System2-to-System1 solution.
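To make the "input-aware hypernetwork that weights the LoRA experts" idea more concrete, the following is a minimal sketch in plain Python, not the authors' implementation. It assumes each expert is a low-rank pair (B, A) with update B·A, that the hypernetwork is a single linear map from a problem embedding to one logit per (expert, rank index), and that weights are normalized with a softmax across experts for each rank index. All names (`hyper_weights`, `compose_delta`, `gate`) and dimensions are hypothetical and chosen only for illustration.

```python
import math
import random


def matmul(X, Y):
    """Plain-Python matrix product of X (n x k) and Y (k x m)."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def rand_matrix(rows, cols, rng):
    """Small random matrix standing in for trained parameters."""
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]


def hyper_weights(embedding, gate, num_experts, rank):
    """Hypothetical hypernetwork: one linear layer, then a softmax over experts
    for each rank index. Returns weights[expert][rank_index]."""
    logits = [sum(g * e for g, e in zip(row, embedding)) for row in gate]
    weights = [[0.0] * rank for _ in range(num_experts)]
    for i in range(rank):
        column = softmax([logits[k * rank + i] for k in range(num_experts)])
        for k in range(num_experts):
            weights[k][i] = column[k]
    return weights


def compose_delta(experts, weights):
    """Combine experts rank-wise: delta = sum_k B_k @ diag(w_k) @ A_k."""
    d_out = len(experts[0][0])
    d_in = len(experts[0][1][0])
    delta = [[0.0] * d_in for _ in range(d_out)]
    for (B, A), w in zip(experts, weights):
        # Scaling row i of A by w[i] is the same as inserting diag(w) between B and A.
        scaled_A = [[w_i * a for a in row] for w_i, row in zip(w, A)]
        update = matmul(B, scaled_A)
        for r in range(d_out):
            for c in range(d_in):
                delta[r][c] += update[r][c]
    return delta


# Toy dimensions (assumptions, not values from the paper).
rng = random.Random(0)
NUM_EXPERTS, RANK, D_IN, D_OUT, D_EMB = 3, 2, 4, 5, 6
experts = [
    (rand_matrix(D_OUT, RANK, rng), rand_matrix(RANK, D_IN, rng))
    for _ in range(NUM_EXPERTS)
]
gate = rand_matrix(NUM_EXPERTS * RANK, D_EMB, rng)
embedding = [rng.uniform(-1, 1) for _ in range(D_EMB)]  # stand-in problem encoding

weights = hyper_weights(embedding, gate, NUM_EXPERTS, RANK)
delta = compose_delta(experts, weights)  # per-instance weight update, D_OUT x D_IN
```

In this sketch the per-instance update `delta` would be added to a frozen base weight matrix, so each input problem effectively gets its own solver without storing a separate set of parameters per problem. Whether the paper normalizes across experts, across ranks, or not at all is not stated in the abstract; the softmax here is one plausible choice.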

Paper Structure

This paper contains 24 sections, 16 equations, 4 figures, 3 tables.

Figures (4)

  • Figure 1: Illustration of the motivation.
  • Figure 2: Illustration of the overall framework of BDC.
  • Figure 3: Pruning and refinement operations.
  • Figure 4: t-SNE visualization of the APPS data encoding.