A Problem-Oriented Perspective and Anchor Verification for Code Optimization

Tong Ye, Tengfei Ma, Xuhong Zhang, Hang Yu, Jianwei Yin, Wenhai Wang

TL;DR

This work addresses time-focused code optimization, an underexplored area, by moving beyond local, user-specific improvements to a problem-oriented paradigm that aggregates solutions from multiple programmers for the same problem. It introduces an anchor verification framework to mitigate the "optimization tax", so that performance gains do not come at the cost of correctness. Empirical results across multiple LLM backbones show that problem-oriented optimization (PCO) significantly improves the optimization ratio and speedup over the prior user-oriented approach (PIE), and that anchor verification further improves correctness and robustness. The approach is also notably data-efficient and shows practical potential for real-world code optimization; the code is publicly available to support reproducibility.

Abstract

Large language models (LLMs) have shown remarkable capabilities in solving various programming tasks, such as code generation. However, their potential for code optimization, particularly in performance enhancement, remains largely unexplored. This paper investigates the capabilities of LLMs in optimizing code for minimal execution time, addressing a critical gap in current research. The recently proposed code optimization dataset constructs program optimization pairs based on iterative submissions from the same programmer for the same problem. However, this approach limits LLMs to local performance improvements, neglecting global algorithmic innovation. To overcome this limitation, we adopt a completely different perspective by reconstructing the optimization pairs into a problem-oriented approach. This allows for the integration of various ideas from multiple programmers tackling the same problem. Experimental results demonstrate that adapting LLMs to problem-oriented optimization pairs significantly enhances their optimization capabilities. Furthermore, recognizing the inherent trade-offs in code optimization, we introduce an anchor verification mechanism to mitigate the "optimization tax". Ultimately, our approach elevates both the optimization ratio and speedup to new levels.
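
The difference between the two pairing perspectives described in the abstract can be illustrated with a minimal Python sketch. It is an illustration under stated assumptions, not the authors' pipeline: the `Submission` record, its fields (`order`, `runtime_ms`), and both function names are hypothetical, and pairing every slower solution with every faster one is a simplification of how the paper builds its trajectories.

```python
from collections import defaultdict
from itertools import combinations
from typing import NamedTuple


class Submission(NamedTuple):
    # Hypothetical record layout, assumed for illustration only.
    problem_id: str
    user_id: str
    order: int          # position in that user's submission history
    runtime_ms: float   # measured execution time
    code: str


def user_oriented_pairs(subs):
    """Pair a submission with a later, faster one by the SAME user."""
    by_user = defaultdict(list)
    for s in subs:
        by_user[(s.problem_id, s.user_id)].append(s)
    pairs = []
    for group in by_user.values():
        group.sort(key=lambda s: s.order)  # chronological per user
        for earlier, later in combinations(group, 2):
            if later.runtime_ms < earlier.runtime_ms:
                pairs.append((earlier, later))
    return pairs


def problem_oriented_pairs(subs):
    """Pair any slower solution with any faster one for the SAME problem,
    regardless of who wrote them (simplified: all such pairs are kept)."""
    by_problem = defaultdict(list)
    for s in subs:
        by_problem[s.problem_id].append(s)
    pairs = []
    for group in by_problem.values():
        group.sort(key=lambda s: s.runtime_ms, reverse=True)  # slowest first
        for slow, fast in combinations(group, 2):
            if fast.runtime_ms < slow.runtime_ms:
                pairs.append((slow, fast))
    return pairs
```

In this sketch the problem-oriented view can pair one programmer's slow solution with another programmer's fast one, which is how ideas from different authors end up in the same optimization pair; the user-oriented view never crosses authors.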

Paper Structure (38 sections, 3 equations, 21 figures, 4 tables)

Figures (21)

  • Figure 1: For a given problem, different users submit and iterate on their code solutions. The user-oriented perspective constructs optimization pairs based on the submission trajectories of individual users. In contrast, the problem-oriented perspective analyzes all solutions for the problem to build trajectories and form optimization pairs.
  • Figure 2: Structural Analysis of the Disparities between Problem-oriented and User-oriented Optimization Pairs.
  • Figure 3: Semantic Representation Analysis of Problem-oriented and User-oriented Optimization Pairs.
  • Figure 4: Human Analysis of the Optimization Types between Problem-oriented and User-oriented Pairs.
  • Figure 5: Impact of using varying percentages of PCO optimization pairs on %Opt, Speedup, and Correct. The blue line represents the original PCO datasets, while the yellow line represents the original PIE datasets.
  • ...and 16 more figures