Enhancing Code Generation Performance of Smaller Models by Distilling the Reasoning Ability of LLMs

Zhihong Sun, Chen Lyu, Bolun Li, Yao Wan, Hongyu Zhang, Ge Li, Zhi Jin

TL;DR

The CodePLAN framework transfers LLMs’ reasoning capabilities to smaller models through distillation. It adopts a multi-task learning approach, jointly training on code generation and solution plan generation, to enhance the code generation capabilities of smaller models.

Abstract

Large Language Models (LLMs) have recently made significant advances in code generation through the 'Chain-of-Thought' prompting technique. This technique empowers the model to autonomously devise "solution plans" to tackle intricate programming challenges, thereby improving its performance in code generation. Smaller models, however, struggle to keep up with LLMs in deducing these plans, which hurts their code generation capabilities. Given the considerable size of LLMs and their associated deployment costs, along with concerns about data security, many teams opt to deploy smaller models for code generation. Consequently, there is a compelling need to transfer LLMs' code generation reasoning abilities to smaller models. In this paper, we propose the CodePLAN framework, which aims to transfer LLMs' reasoning capabilities to smaller models through distillation. We adopt a multi-task learning approach, jointly undertaking code generation and solution plan generation tasks, to enhance the code generation capabilities of the smaller model. To ensure the quality of the solution plans, we advocate the use of backward reasoning and plan sampling strategies. Our experiments show that, compared with the conventional fine-tuning approach, our approach improves the smaller model's code generation performance (measured by the pass@1 metric) by over 130% on the challenging APPS benchmark.
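To make the reported metric concrete, the sketch below computes pass@k with the commonly used unbiased estimator and shows how a relative improvement is derived from two scores. Whether the paper uses exactly this estimator is an assumption; the function names and the numbers in the usage note are hypothetical and are not results from the paper.

```python
from math import comb


def pass_at_k(n: int, c: int, k: int) -> float:
    """Estimate pass@k from n sampled programs, c of which pass all tests.

    Uses the standard unbiased estimator 1 - C(n - c, k) / C(n, k).
    For k = 1 this reduces to the fraction of correct samples, c / n.
    """
    if not (0 <= c <= n) or not (1 <= k <= n):
        raise ValueError("require 0 <= c <= n and 1 <= k <= n")
    if n - c < k:
        # Every size-k subset must contain at least one correct program.
        return 1.0
    return 1.0 - comb(n - c, k) / comb(n, k)


def relative_improvement(baseline: float, improved: float) -> float:
    """Relative gain of `improved` over `baseline`, in percent."""
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return (improved - baseline) / baseline * 100.0
```

As an illustration with made-up scores, a baseline pass@1 of 2.0 and an improved pass@1 of 4.6 give `relative_improvement(2.0, 4.6)` of roughly 130, which is the kind of relative gain the abstract describes.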

Paper Structure

This paper contains 27 sections, 4 equations, 11 figures, and 6 tables.

Figures (11)

  • Figure 1: Comparison of different models prompted with no solution plan, with a greedily generated LLM solution plan appended to the prompt, and with the best-quality solution plan appended to the prompt. All models were fine-tuned on the APPS training set.
  • Figure 2: The prompt used to have the LLM reason backward from the code written by the programmer (highlighted in green) to a solution plan.
  • Figure 3: Our framework for the training phase of CodePLAN: an LLM reasons backward from each solution to the solution plan the programmer followed when solving the programming problem, and these solution plans and solutions are then used to fine-tune the code generation model in an alternating multi-task fashion.
  • Figure 4: The inference phase schematic comprises three stages: ① Initially, the model formulates candidate solution plans based on the provided problem description. ② Subsequently, as indicated by the dashed line, the solution plans generated in Stage ① are integrated with the problem description for code generation. Candidate solution plans are selected according to how the code generated from them performs on the example unit tests. ③ Ultimately, the selected high-quality solution plan is used as a prompt, integrated with the problem description, for a new cycle of code generation.
  • Figure 5: Effect of the number of solution plans on the number of correct programs generated.
  • ...and 6 more figures
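
The three inference stages described in the Figure 4 caption can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the callables `gen_plans`, `gen_code`, and `run` stand in for the fine-tuned model and a sandboxed executor, the prompt layout with a "Solution plan:" header is invented for the example, and scoring a plan by counting the sampled programs that pass the example tests is one plausible reading of the selection step.

```python
from typing import Callable, List, Optional, Sequence, Tuple

# Hypothetical interfaces (assumptions, not from the paper):
PlanGenerator = Callable[[str, int], List[str]]  # (problem, n) -> candidate plans
CodeGenerator = Callable[[str, int], List[str]]  # (prompt, n) -> candidate programs
Runner = Callable[[str, str], str]               # (program, stdin) -> stdout
Example = Tuple[str, str]                        # (input, expected output)


def build_prompt(problem: str, plan: Optional[str]) -> str:
    """Join the problem description with a plan (assumed layout)."""
    if plan is None:
        return problem
    return f"{problem}\n\nSolution plan:\n{plan}"


def passes_examples(code: str, examples: Sequence[Example], run: Runner) -> bool:
    """True if the program reproduces every example output."""
    return all(run(code, inp).strip() == out.strip() for inp, out in examples)


def select_plan(problem: str, examples: Sequence[Example],
                gen_plans: PlanGenerator, gen_code: CodeGenerator, run: Runner,
                n_plans: int = 5, codes_per_plan: int = 4) -> Tuple[Optional[str], int]:
    """Stages 1 and 2: sample plans, then keep the plan whose programs
    pass the example unit tests most often."""
    best_plan, best_score = None, -1
    for plan in gen_plans(problem, n_plans):                       # stage 1
        codes = gen_code(build_prompt(problem, plan), codes_per_plan)
        score = sum(passes_examples(c, examples, run) for c in codes)  # stage 2
        if score > best_score:
            best_plan, best_score = plan, score
    return best_plan, best_score


def infer(problem: str, examples: Sequence[Example],
          gen_plans: PlanGenerator, gen_code: CodeGenerator, run: Runner,
          n_final: int = 1) -> List[str]:
    """Stage 3: regenerate code with the selected plan in the prompt."""
    plan, _ = select_plan(problem, examples, gen_plans, gen_code, run)
    return gen_code(build_prompt(problem, plan), n_final)
```

Passing the model and the executor in as callables keeps the sketch independent of any particular model API; if no plan is produced, `infer` falls back to the problem description alone.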