Multi-Agent Collaboration for Multilingual Code Instruction Tuning

Jian Yang, Wei Zhang, Jiaxi Yang, Yibo Miao, Shanghaoran Quan, Zhenhe Wu, Qiyao Peng, Liqun Yang, Tianyu Liu, Zeyu Cui, Binyuan Hui, Junyang Lin

TL;DR

The paper tackles cross-language interference in code-focused LLMs by introducing a multilingual multi-agent framework that generates high-quality multilingual instruction data (x-Instruct) for fine-tuning Qwen2.5-xCoder. Each language-specific agent maintains generation memory and participates in centralized or parallel discussions to synthesize instructions and solutions applicable across languages, enabling efficient cross-language knowledge transfer. The authors construct seed data from code snippets, augment it through multi-agent collaboration and memory reflection, and train with supervised fine-tuning and preference-based optimization (DPO) across languages. Experiments on Python benchmarks and the multi-language MultiPL-E benchmark show that Qwen2.5-xCoder achieves strong cross-language performance, narrowing the cross-language gap and demonstrating effective multilingual code generation and understanding. The framework offers a scalable approach to multilingual code instruction tuning, with potential implications for broader multilingual software engineering tasks.

Abstract

Recent advances in code understanding and generation demonstrate that code LLMs fine-tuned on a high-quality instruction dataset can gain powerful capabilities to address wide-ranging code-related tasks. However, most existing methods view each programming language in isolation and ignore knowledge transfer among different programming languages. To bridge the gap among different programming languages, we introduce a novel multi-agent collaboration framework to enhance multilingual instruction tuning for code LLMs, where multiple language-specific intelligent agent components with generation memory work together to transfer knowledge from one language to another efficiently and effectively. Specifically, we first generate language-specific instruction data from code snippets and then provide the generated data as seed data for the language-specific agents. Multiple language-specific agents discuss and collaborate to formulate a new instruction and its corresponding solution (in either a new or an existing programming language). To further encourage cross-lingual transfer, each agent stores its generation history as memory and then summarizes its merits and faults. Finally, the high-quality multilingual instruction data is used to train Qwen2.5-xCoder, encouraging knowledge transfer among different programming languages. Experimental results on multilingual programming benchmarks demonstrate the superior performance of Qwen2.5-xCoder in sharing common knowledge, highlighting its potential to reduce the cross-lingual gap.

Paper Structure

This paper contains 45 sections, 2 equations, 6 figures, 2 tables.

Figures (6)

  • Figure 1: An example of Qwen2.5-xCoder. The Code LLM solves the code generation question by "translating" the pseudocode description (Universal Code) into executable code of the target programming language.
  • Figure 2: Overview of the multilingual multi-agent data generation framework. We first construct the multilingual instruction dataset from the code snippets. We introduce a multi-agent framework, with each agent possessing expertise in a different programming language, allowing for efficient knowledge transfer across various languages. "R-Generator" generates the responses based on the instruction while "I-Generator" generates the instruction based on the responses. Each snippet is assigned to a language-specific agent, which uses it to create individual instructions. The agents then collaborate, using their specialized knowledge to create new instructions that can be applied to either a new or existing programming language, along with the appropriate solutions. To improve cross-lingual learning, agents maintain a history of their generated instructions, allowing them to identify their strengths and areas for improvement. Through this collaborative process, we produce high-quality multilingual instruction data for instruction tuning.
  • Figure 3: Prompt of the multilingual multi-agent framework.
  • Figure 4: Prompt of evaluation.
  • Figure 5: Evaluation results (average scores of 8 programming languages) of Pass@1 on the MultiPL-E with different training sizes by randomly down-sampling.
  • ...and 1 more figures
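
The data-generation loop described in the Figure 2 caption can be sketched roughly as follows. This is a minimal illustration rather than the authors' implementation: the names `LanguageAgent`, `collaborate`, `fake_llm`, the prompt strings, and the use of a single `coordinator` model to merge proposals are all assumptions made for the example, and the real prompts (Figure 3) are not reproduced here.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

# Hypothetical stand-in for an LLM call: prompt in, text out.
LLM = Callable[[str], str]


@dataclass
class LanguageAgent:
    """One language-specific agent with a generation memory (illustrative only)."""
    language: str
    llm: LLM
    memory: List[str] = field(default_factory=list)

    def instruction_from_snippet(self, snippet: str) -> str:
        # "I-Generator" role: derive an instruction from existing code.
        return self.llm(f"[{self.language}] Write an instruction for:\n{snippet}")

    def respond(self, instruction: str) -> str:
        # "R-Generator" role: produce a solution for an instruction.
        return self.llm(f"[{self.language}] Solve:\n{instruction}")

    def propose(self, seed_instruction: str) -> str:
        # Contribute to the discussion, conditioned on the latest reflection.
        hint = self.memory[-1] if self.memory else "none"
        return self.llm(
            f"[{self.language}] Past reflection: {hint}\nAdapt: {seed_instruction}"
        )

    def reflect(self, generated: str) -> None:
        # Summarize merits and faults of the latest generation into memory.
        self.memory.append(
            self.llm(f"[{self.language}] Merits and faults of:\n{generated}")
        )


def collaborate(
    agents: List[LanguageAgent],
    seed_instruction: str,
    target_language: str,
    coordinator: LLM,
) -> Dict[str, str]:
    """One discussion round; the target may be a new or an existing language."""
    proposals = [agent.propose(seed_instruction) for agent in agents]
    # Assumption: a single coordinator merges proposals (a centralized variant).
    instruction = coordinator(
        f"[{target_language}] Merge into one instruction:\n" + "\n".join(proposals)
    )
    response = coordinator(f"[{target_language}] Solve:\n{instruction}")
    for agent in agents:
        agent.reflect(instruction)
    return {
        "language": target_language,
        "instruction": instruction,
        "response": response,
    }


def fake_llm(prompt: str) -> str:
    # Deterministic placeholder so the sketch runs without any model.
    return "reply to: " + prompt.splitlines()[0][:40]
```

To try it, build a few agents with `fake_llm`, create a seed with `instruction_from_snippet`, and call `collaborate` with a target language; swapping `fake_llm` for a real model client is the only change needed to generate actual text. Each round leaves one new reflection in every agent's `memory`, which the next `propose` call reads.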