Rethinking Code Refinement: Learning to Judge Code Efficiency

Minju Seo, Jinheon Baek, Sung Ju Hwang

TL;DR

This work proposes a method, based on a code language model, that is trained to judge the relative efficiency of two versions of code by either classifying the more efficient one or predicting the relative improvement.

Abstract

Large Language Models (LLMs) have demonstrated impressive capabilities in understanding and generating code. Building on these capabilities, many recent methods use LLMs to refine code automatically. However, refined code (whether from LLMs or from humans) is not always more efficient than its original version. At the same time, running both versions and comparing them every time is time-consuming. Therefore, in this work, we propose a novel method, based on a code language model, that is trained to judge the relative efficiency of two versions of code (produced by humans and machines) by either classifying the superior one or predicting the relative improvement. We validate our method on multiple programming languages with multiple refinement steps, demonstrating that it can effectively distinguish between more and less efficient versions of code.

Paper Structure

This paper contains 29 sections, 7 figures, 11 tables.

Figures (7)

  • Figure 1: (A) Existing code refinement approaches sometimes generate code that is less efficient than the original. (B) Our proposed approach identifies the more efficient of two versions of code (before and after modification) and further predicts the relative improvement. (C) We categorize the refined code according to its efficiency gain (%) compared to the original into three classes: Degradation (less than 0.9), Non-Improvement (0.9 to 1.1), and Improvement (greater than 1.1).
  • Figure 2: Results from bucketing the code pairs by their absolute relative improvement in efficiency, on Python.
  • Figure 3: Visualization of the Spearman’s rank correlation between the ranks of the actual relative improvements and the predicted relative improvements of code pairs, for our model.
  • Figure 4: Generated Python and C++ samples for the question "For an integer N, we will choose a permutation $\{P_1, P_2, ..., P_N\}$ of $\{1, 2, ..., N\}$. Then, for each $i=1,2,...,N,$ let $M_i$ be the remainder when i is divided by $P_i$. Find the maximum possible value of $M_1 + M_2 + \cdots + M_N$. Constraints $N$ is an integer satisfying $1 \leq N \leq 10^9$".
  • Figure 5: Generated Python and C++ code for the question "Takahashi has a deposit of 100 yen (the currency of Japan) in AtCoder Bank. The bank pays an annual interest rate of 1% compounded annually. (A fraction of less than one yen is discarded.) Assuming that nothing other than the interest affects Takahashi's balance, in how many years does the balance reach X yen or above for the first time?".
  • ...and 2 more figures
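
The three classes in the Figure 1 (C) caption can be illustrated with a minimal sketch. It assumes the relative improvement is the ratio of the original runtime to the refined runtime, so that values above 1 mean the refined code is faster. The listing above does not state this definition, and the function names `relative_improvement` and `efficiency_class` are hypothetical. Only the thresholds 0.9 and 1.1 come from the caption.

```python
def relative_improvement(original_time: float, refined_time: float) -> float:
    """Assumed definition: original runtime divided by refined runtime.

    A value above 1 means the refined code ran faster than the original.
    """
    if original_time <= 0 or refined_time <= 0:
        raise ValueError("runtimes must be positive")
    return original_time / refined_time


def efficiency_class(ratio: float) -> str:
    """Map a relative improvement ratio to the three classes of Figure 1 (C)."""
    if ratio < 0.9:
        return "Degradation"
    if ratio > 1.1:
        return "Improvement"
    # Everything from 0.9 to 1.1 (inclusive) counts as no meaningful change.
    return "Non-Improvement"


if __name__ == "__main__":
    # Hypothetical runtimes in seconds, purely for illustration.
    for original, refined in [(2.0, 1.0), (1.0, 1.0), (1.0, 2.0)]:
        ratio = relative_improvement(original, refined)
        print(f"{original} -> {refined}: ratio={ratio:.2f}, {efficiency_class(ratio)}")
```

Under this assumed definition, the classification task described in the abstract corresponds to predicting the class label for a code pair, and the regression task corresponds to predicting the ratio itself, in both cases without executing either version.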