Large Language Models for Code Generation: A Comprehensive Survey of Challenges, Techniques, Evaluation, and Applications
Nam Huynh, Beiyu Lin
TL;DR
This survey examines how Large Language Models (LLMs) enable automatic code generation from natural language, focusing on limitations, fine-tuning strategies, evaluation metrics, and applications. It moves from foundational LLM architectures and code-generation workflows to domain-specific tuning, feedback-driven improvement, and prompting techniques, and reviews benchmarks and metrics such as HumanEval, CodeBLEU, and ICE-Score. The paper also reviews practical applications and tools (e.g., CodeLlama, GitHub Copilot, ToolGen) and discusses concerns around resource demands, errors, biases, and security, offering a roadmap toward reliable, efficient code-generation systems. Overall, the work highlights that combining domain-focused fine-tuning, execution-based feedback, and advanced prompting can substantially improve code-generation performance, while underscoring the need for rigorous evaluation and secure deployment in real-world development tasks.
Abstract
Large Language Models (LLMs) have demonstrated remarkable capabilities in numerous fields. This survey focuses on how LLMs empower users, regardless of their technical background, to generate executable code automatically from natural-language descriptions. We begin by examining the limitations and challenges of LLMs in automated code generation. We then review fine-tuning techniques designed to enhance both the performance and the adaptability of LLMs in code generation tasks, followed by the existing metrics and benchmarks used to assess the performance of models trained with these techniques. Finally, we explore applications of LLM-based models and tools (e.g., CodeLlama, GitHub Copilot, ToolGen) in code generation tasks to illustrate their roles and functionalities. This survey provides a comprehensive overview of LLMs for code generation, helps researchers in diverse fields better understand the current state of the art, and shows how LLMs can be leveraged effectively for code generation tasks.
