Language Models for Code Optimization: Survey, Challenges and Future Directions

Jingzhi Gong, Vardan Voskanyan, Paul Brookes, Fan Wu, Wei Jie, Jie Xu, Rafail Giavrimis, Mike Basios, Leslie Kanthan, Zheng Wang

TL;DR

This survey addresses the gap in systematic understanding of LM-based code optimization by aggregating 53 recent studies. It identifies four primary research questions to characterize LMs, their applications, problem definitions, and evaluation practices, revealing a landscape dominated by very large general-purpose LMs and a strong reliance on off-the-shelf models with selective fine-tuning. The study highlights five open challenges—model complexity versus practicality, limited external-system interaction, generalizability across languages/metrics, real-world evaluation gaps, and trust in AI-driven optimization—and outlines eight future directions, including agentic LMs, cross-language multi-objective optimization, and standardized real-world benchmarks. Collectively, the findings provide a roadmap for researchers and practitioners to develop more scalable, robust, and trustworthy LM-based code optimization methods and to bridge the gap between synthetic benchmarks and production software.

Abstract

Language models (LMs) built upon deep neural networks (DNNs) have recently demonstrated breakthrough effectiveness in software engineering tasks such as code generation, completion, and repair. This has paved the way for LM-based code optimization techniques, which are crucial for improving the performance of existing programs, for example by reducing execution time. However, a comprehensive survey dedicated to this specific application has been lacking. To fill this gap, we present a systematic literature review of over 50 primary studies, identifying emerging trends and addressing 11 specialized questions. Our findings reveal five critical open challenges, such as balancing model complexity with practical usability, generalizing across languages and performance metrics, and building trust in AI-driven solutions. Furthermore, we provide eight future research directions to facilitate more efficient, robust, and reliable LM-based code optimization. This study thus aims to provide actionable insights and foundational references for both researchers and practitioners in this rapidly evolving field.

Paper Structure

This paper contains 64 sections, 11 figures, and 7 tables.

Figures (11)

  • Figure 1: Visualization of the survey scope.
  • Figure 2: Two Python implementations for calculating the sum of the first $n$ natural numbers.
  • Figure 3: Development of code optimization methods: strengths and weaknesses.
  • Figure 4: Overview of the survey methodology used in this study.
  • Figure 5: Overview of the taxonomy for all RQs (one study might be in multiple categories).
  • ...and 6 more figures
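
The caption of Figure 2 refers to two Python implementations that compute the sum of the first $n$ natural numbers, but the figure itself is not reproduced here. As a rough illustration of the kind of optimization the survey discusses, the sketch below assumes the pair consists of a loop-based version and a closed-form version. The function names `sum_loop` and `sum_closed_form` are hypothetical and are not taken from the paper.

```python
def sum_loop(n: int) -> int:
    """Add 1..n one term at a time: O(n) additions. Assumes n >= 0."""
    total = 0
    for i in range(1, n + 1):
        total += i
    return total


def sum_closed_form(n: int) -> int:
    """Use the identity 1 + 2 + ... + n = n(n + 1) / 2. Assumes n >= 0."""
    # n * (n + 1) is always even, so integer division is exact.
    return n * (n + 1) // 2


if __name__ == "__main__":
    # Both versions agree on non-negative inputs; only the cost differs.
    for n in (0, 1, 10, 1000):
        assert sum_loop(n) == sum_closed_form(n)
    print(sum_loop(10), sum_closed_form(10))  # prints "55 55"
```

Both functions return the same result for non-negative `n`, while the second replaces a linear number of additions with a constant amount of arithmetic. A functionally equivalent but faster rewrite of this kind is what a code optimization technique is expected to produce.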