Empowering AI to Generate Better AI Code: Guided Generation of Deep Learning Projects with LLMs

Chen Xie, Mingsheng Jiao, Xiaodong Gu, Beijun Shen

TL;DR

DLCodeGen introduces a planning-guided framework for generating deep learning projects by first predicting a structured solution plan, then using dual retrieval-augmented prompts (Code RAG and Template RAG) guided by a comparative learning mechanism to synthesize final code. The plan predictor is a fine-tuned GPT-2 model trained on a purpose-built DLPlanData corpus, and the approach benefits from both concrete code references and generalized templates to improve coherence and domain alignment. Experimental results on a dedicated DLCodeEval benchmark show DLCodeGen achieving superior CodeBLEU scores and human-evaluated quality, with ablations demonstrating the critical roles of the planning step, dual RAG strategies, and comparative learning. The work provides a scalable blueprint for domain-aware, end-to-end code generation in deep learning, with practical impact for automating complex ML project construction and facilitating reproducible research.

Abstract

While large language models (LLMs) have been widely applied to code generation, they struggle with generating entire deep learning projects, which are characterized by complex structures, longer functions, and stronger reliance on domain knowledge than general-purpose code. An open-domain LLM often lacks coherent contextual guidance and domain expertise for specific projects, making it challenging to produce complete code that fully meets user requirements. In this paper, we propose a novel planning-guided code generation method, DLCodeGen, tailored for generating deep learning projects. DLCodeGen predicts a structured solution plan, offering global guidance for LLMs to generate the project. The generated plan is then leveraged to retrieve semantically analogous code samples and subsequently abstract a code template. To effectively integrate these multiple retrieval-augmented techniques, a comparative learning mechanism is designed to generate the final code. We validate the effectiveness of our approach on a dataset we built for deep learning code generation. Experimental results demonstrate that DLCodeGen outperforms the baselines, achieving improvements of 9.7% in CodeBLEU and 3.6% in human evaluation metrics.
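
The pipeline described in the abstract can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation: the names `Sample`, `token_overlap`, `retrieve`, and `generate_project`, the prompt wording, and the use of Jaccard token overlap in place of semantic retrieval are all assumptions made for the example. The plan predictor and the LLM are passed in as plain callables, so any model could be substituted.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Sample:
    """A (solution plan, project code) pair in a hypothetical retrieval corpus."""
    plan: str
    code: str


def token_overlap(a: str, b: str) -> float:
    """Jaccard similarity over lowercase tokens (a stand-in for semantic similarity)."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    union = ta | tb
    return len(ta & tb) / len(union) if union else 0.0


def retrieve(plan: str, corpus: List[Sample], k: int = 2) -> List[Sample]:
    """Return the k corpus samples whose plans are most similar to the given plan."""
    ranked = sorted(corpus, key=lambda s: token_overlap(plan, s.plan), reverse=True)
    return ranked[:k]


def generate_project(
    requirement: str,
    predict_plan: Callable[[str], str],  # assumed: the plan predictor
    llm: Callable[[str], str],           # assumed: any text-in, text-out LLM
    corpus: List[Sample],
    k: int = 2,
) -> str:
    # 1. Predict a structured solution plan from the user requirement.
    plan = predict_plan(requirement)

    # 2. Use the plan to retrieve analogous code samples.
    examples = retrieve(plan, corpus, k)
    example_text = "\n\n".join(s.code for s in examples)

    # 3. Abstract a code template from the retrieved samples.
    template = llm("Abstract a reusable code template from these samples:\n" + example_text)

    # 4. Produce one candidate per retrieval-augmented prompt.
    from_code = llm(f"Plan:\n{plan}\n\nReference code:\n{example_text}\n\nWrite the project.")
    from_template = llm(f"Plan:\n{plan}\n\nTemplate:\n{template}\n\nWrite the project.")

    # 5. Comparative step (assumed form): compare both candidates and merge them.
    return llm(
        f"Plan:\n{plan}\n\nCandidate A:\n{from_code}\n\nCandidate B:\n{from_template}\n\n"
        "Compare the candidates and write the final project code."
    )
```

Under these assumptions, one project generation costs four LLM calls: one to abstract the template, one per retrieval-augmented candidate, and one for the final comparison.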

Paper Structure

This paper contains 26 sections, 5 figures, 5 tables.

Figures (5)

  • Figure 1: Overview of DLCodeGen
  • Figure 2: An example of a solution plan for a deep learning project
  • Figure 3: An Illustration of Comparative Generation
  • Figure 4: Performance of Solution Plan Predictor
  • Figure 5: The Trend of Scores with Temperature Changes