Guided Code Generation with LLMs: A Multi-Agent Framework for Complex Code Tasks
Amr Almorsi, Mohanned Ahmed, Walid Gomaa
TL;DR
The paper addresses the limitations of large language models in handling long-context, compositional code tasks by introducing a guided, multi-agent framework. It combines hierarchical problem decomposition, bottom-up code generation, and a multi-agent validation system to manage complexity, test components, and iteratively refine solutions. Empirical evaluation on HumanEval with a relatively small model (Llama 3.1 8B, int4) shows Pass@1 rising from 45.4% to 56.2% (+10.8 percentage points, a 23.79% relative improvement), illustrating the practical gains of structured decomposition and validation over direct generation. The authors also present a theoretical framework that separates leaf-level information retrieval from interface-based composition, offering a principled approach to robust code construction and suggesting scalability and paradigm diversification as avenues for future work.
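The two ways of stating the gain are easy to confuse, so a quick check of the arithmetic may help. The snippet below uses only the two Pass@1 figures reported above; the variable names are illustrative.

```python
# Pass@1 figures as reported for HumanEval with Llama 3.1 8B (int4).
baseline = 45.4  # direct one-shot generation, in percent
guided = 56.2    # guided multi-agent framework, in percent

# Absolute gain: the difference between the two rates, in percentage points.
absolute_gain_pp = guided - baseline

# Relative gain: the difference as a fraction of the baseline rate.
relative_gain_pct = (guided - baseline) / baseline * 100

print(f"absolute gain: {absolute_gain_pp:.1f} percentage points")
print(f"relative gain: {relative_gain_pct:.2f} %")
```

The 23.79% figure is therefore the relative improvement over the baseline, while the absolute difference is 10.8 percentage points.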
Abstract
Large Language Models (LLMs) have shown remarkable capabilities in code generation tasks, yet they face significant limitations in handling complex, long-context programming challenges and in demonstrating complex compositional reasoning. This paper introduces a novel agentic framework for "guided code generation" that aims to address these limitations through a deliberately structured, fine-grained approach to code generation tasks. Our framework leverages LLMs' strengths as fuzzy searchers and approximate information retrievers while mitigating their weaknesses in long sequential reasoning and long-context understanding. Empirical evaluation using OpenAI's HumanEval benchmark with Meta's Llama 3.1 8B model (int4 precision) demonstrates a 23.79% relative improvement in solution accuracy compared to direct one-shot generation. Our results indicate that structured, guided approaches to code generation can significantly enhance the practical utility of LLMs in software development while mitigating their inherent limitations in compositional reasoning and context handling.
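The general shape of decomposition followed by bottom-up generation and validation can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' implementation: the `Task` type, `build_bottom_up`, and the two stub agents are hypothetical names, and the stubs stand in for LLM-backed generator and validator agents.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Task:
    """A node in a hypothetical decomposition tree."""
    name: str
    spec: str
    children: List["Task"] = field(default_factory=list)


def build_bottom_up(
    task: Task,
    generate: Callable[[Task, List[str]], str],
    validate: Callable[[Task, str], bool],
    built: Dict[str, str],
    max_attempts: int = 3,
) -> None:
    """Generate code for every child first, then for `task` itself."""
    for child in task.children:
        build_bottom_up(child, generate, validate, built, max_attempts)

    # The parent sees only its children's names (their interfaces), not their bodies.
    interfaces = [child.name for child in task.children]
    for _ in range(max_attempts):
        code = generate(task, interfaces)
        if validate(task, code):
            built[task.name] = code
            return
    raise RuntimeError(f"no valid code for {task.name!r} after {max_attempts} attempts")


# Assumption: stand-ins for the agents; a real system would call a model here.
def stub_generate(task: Task, interfaces: List[str]) -> str:
    body = " + ".join(f"{name}(x)" for name in interfaces) or "x"
    return f"def {task.name}(x):\n    return {body}\n"


def stub_validate(task: Task, code: str) -> bool:
    # Only checks that the candidate parses; a real validator would run tests.
    try:
        compile(code, task.name, "exec")
    except SyntaxError:
        return False
    return True


tree = Task("solve", "top-level problem", [
    Task("parse", "leaf: parse the input"),
    Task("score", "leaf: score a parsed item"),
])
built: Dict[str, str] = {}
build_bottom_up(tree, stub_generate, stub_validate, built)
print(list(built))  # ['parse', 'score', 'solve']
```

The leaves are generated and validated before the root, and the root is composed only from its children's interfaces, which keeps each generation step short and locally checkable.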
