
RA-Gen: A Controllable Code Generation Framework Using ReAct for Multi-Agent Task Execution

Aofan Liu, Haoxuan Li, Bin Wang, Ao Yang, Hui Li

TL;DR

RA-Gen introduces a ReAct-based multi-agent framework for controllable code generation. It coordinates four specialized agents (Planner, Searcher, CodeGen, and Extractor) to decompose tasks, reason with external tools, generate code, and extract actionable knowledge. The Searcher uses reasoning traces $R$ and actions $A$ within an MDP $\mathcal{M}=\langle S,A,P,R\rangle$ and dynamically integrates external resources, enabling transparent, audit-friendly decision making. On the CWE-focused SVEN dataset, RA-Gen reaches a security rate (Sec.Rate) of $94.8\%$ as measured with CodeQL, outperforming several baselines while preserving multi-language support and interpretability through documented reasoning trajectories. Overall, RA-Gen demonstrates that coordinated, tool-assisted reasoning in a multi-agent setting can enhance safety, reliability, and user trust in automated code generation, while highlighting practical trade-offs in computation and tool integration.

Abstract

Code generation models based on large language models (LLMs) have gained wide adoption, but challenges remain in ensuring safety, accuracy, and controllability, especially for complex tasks. Existing methods often lack dynamic integration of external tools, transparent reasoning, and user control over safety. To address these issues, we propose a controllable code generation framework utilizing the ReAct paradigm for multi-agent task execution. This framework is a multi-agent system designed to enable efficient, precise, and interpretable code generation through dynamic interactions between LLMs and external resources. The framework adopts a collaborative architecture comprising four specialized agents: a Planner for task decomposition, a Searcher that leverages the ReAct framework for reasoning and tool integration, a CodeGen agent for accurate code generation, and an Extractor for structured data retrieval. The ReAct-based Searcher alternates between generating reasoning traces and executing actions, facilitating seamless integration of internal knowledge with external tools (such as search engines) to enhance accuracy and user control. Experimental results show the framework's effectiveness across multiple languages, achieving a 94.8% security rate on the SVEN dataset with CodeQL, outperforming existing approaches. Its transparent reasoning process fosters user trust and improves controllability.
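
The four-agent flow described in the abstract can be sketched as a small pipeline. This is an illustrative sketch only, not the authors' implementation: the function names, the `Step` and `Trajectory` types, the `BEGIN_CODE`/`END_CODE` markers, and the stubbed tool are all assumptions made for the example, and no LLM is called. It shows the Planner splitting a task, the Searcher alternating a reasoning trace with a tool action and recording both, the CodeGen step producing output, and the Extractor pulling out and validating the code.

```python
import ast
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple


@dataclass
class Step:
    """One ReAct iteration: reasoning trace, action taken, observation returned."""
    thought: str
    action: str
    observation: str


@dataclass
class Trajectory:
    """Audit trail for one subtask (hypothetical structure)."""
    task: str
    steps: List[Step] = field(default_factory=list)


def planner(task: str) -> List[str]:
    # Stand-in for LLM task decomposition: split the request on ';'.
    return [part.strip() for part in task.split(";") if part.strip()]


def searcher(subtask: str, tools: Dict[str, Callable[[str], str]],
             max_steps: int = 3) -> Trajectory:
    # Alternate "reason" and "act", logging every step so the run can be audited.
    traj = Trajectory(task=subtask)
    for name, tool in list(tools.items())[:max_steps]:
        thought = f"I need more context on '{subtask}', so I will call {name}."
        observation = tool(subtask)
        traj.steps.append(Step(thought, name, observation))
        if observation:  # stop as soon as a tool returns something usable
            break
    return traj


def codegen(traj: Trajectory) -> str:
    # Stand-in for the code-generating LLM: wraps a stub in free-form text.
    notes = "\n".join(f"# note: {s.observation}" for s in traj.steps)
    code = f"{notes}\ndef solve():\n    pass  # placeholder for: {traj.task}\n"
    return f"Explanation text.\nBEGIN_CODE\n{code}END_CODE\nClosing remarks."


def extractor(raw: str) -> str:
    # Pull the snippet out of the surrounding text and check that it parses.
    start = raw.index("BEGIN_CODE") + len("BEGIN_CODE")
    end = raw.index("END_CODE")
    code = raw[start:end].strip()
    ast.parse(code)  # raises SyntaxError if the snippet is not valid Python
    return code


def run_pipeline(task: str,
                 tools: Dict[str, Callable[[str], str]]) -> List[Tuple[Trajectory, str]]:
    results = []
    for subtask in planner(task):
        traj = searcher(subtask, tools)
        results.append((traj, extractor(codegen(traj))))
    return results


def fake_search(query: str) -> str:
    # Hypothetical tool; a real system would query a search engine here.
    return f"prefer bounded copies when you {query}"


for traj, code in run_pipeline("copy a string safely; validate input length",
                               {"online_search": fake_search}):
    print(traj.steps[0].thought)
    print(code)
```

Keeping the thought, action, and observation together in each `Step` is what makes the trajectory reviewable after the fact, which is the property the abstract refers to as transparent reasoning.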

Paper Structure

This paper contains 17 sections, 9 equations, 3 figures, 2 tables, 1 algorithm.

Figures (3)

  • Figure 1: Architecture of the multi-agent framework for secure code generation. The framework comprises four key components: the Planner, which decomposes tasks and generates initial reasoning trajectories; the Searcher, which refines trajectories by combining reasoning and external tools; the CodeGen, which generates secure code patches; and the Extractor, which validates and extracts functional code snippets. This collaborative process ensures the generation of high-quality, secure code.
  • Figure 2: Case example illustrating secure string copying to buffers in C/C++ to prevent buffer overflow, using the "Online Search" tool. The process involves the RA-Gen agents: the Planner identifies the problem of string manipulation and buffer safety, the Searcher retrieves relevant information on secure string operations, the Extractor validates and extracts functional code, and the CodeGen agent generates safe implementation examples.
  • Figure 3: Evaluation of RA-Gen's effectiveness in addressing various Common Weakness Enumeration (CWE) types, demonstrating its ability to mitigate specific security vulnerabilities.
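
A per-CWE breakdown like the one in Figure 3 can be computed from analyzer verdicts along these lines. This is a minimal sketch under a stated assumption: the security rate is taken to be the percentage of analyzed samples that the static analyzer does not flag for the target weakness, which this page does not define explicitly. The function name, the CWE identifiers, and the verdicts are placeholders, not results from the paper.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple


def security_rate(results: Iterable[Tuple[str, bool]]) -> Dict[str, float]:
    """Return the percentage of secure samples per CWE.

    `results` holds one (cwe_id, is_secure) pair per analyzed sample.
    """
    totals: Dict[str, int] = defaultdict(int)
    secure: Dict[str, int] = defaultdict(int)
    for cwe, is_secure in results:
        totals[cwe] += 1
        secure[cwe] += int(is_secure)
    return {cwe: 100.0 * secure[cwe] / totals[cwe] for cwe in totals}


# Made-up verdicts for illustration only; these are not the paper's data.
samples = [
    ("CWE-787", True),
    ("CWE-787", True),
    ("CWE-787", False),
    ("CWE-089", True),
]
rates = security_rate(samples)
for cwe, rate in sorted(rates.items()):
    print(f"{cwe}: {rate:.1f}%")
```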