Performance-Aligned LLMs for Generating Fast Code
Daniel Nichols, Pranav Polasam, Harshitha Menon, Aniruddha Marathe, Todd Gamblin, Abhinav Bhatele
TL;DR
The paper addresses the poor runtime performance of code produced by large language models by introducing performance-aware fine-tuning. It combines a structured CodeContests-Perf dataset with synthetic data and proposes two methods, reinforcement learning with performance feedback (RLPF) and direct performance alignment (DPA), to align LLM outputs with faster code while preserving correctness. Across code generation and optimization tasks, the fine-tuned model raises the expected speedup of generated code over the base models from 0.9 to 1.6 for serial code and from 1.9 to 4.5 for OpenMP code, alongside strong correctness metrics, validating the effectiveness of performance-oriented fine-tuning. The work demonstrates a practical path to integrating performance considerations into AI-assisted software development for both serial and parallel HPC workloads.
Abstract
Optimizing scientific software is a difficult task because codebases are often large and complex, and performance can depend on several factors, including the algorithm, its implementation, and the hardware. Causes of poor performance can originate from disparate sources and be difficult to diagnose. Recent years have seen a multitude of work that uses large language models (LLMs) to assist in software development tasks. However, these tools are trained to model the distribution of code as text, and are not specifically designed to understand performance aspects of code. In this work, we introduce a reinforcement learning based methodology to align the outputs of code LLMs with performance. This allows us to build upon the current code modeling capabilities of LLMs and extend them to generate better performing code. We demonstrate that our fine-tuned model improves the expected speedup of generated code over base models for a set of benchmark tasks from 0.9 to 1.6 for serial code and from 1.9 to 4.5 for OpenMP code.
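To make the notion of expected speedup concrete, the following is a minimal sketch and not the paper's exact metric, which this section does not define. It assumes that speedup is the ratio of a baseline runtime to the runtime of a generated sample, and that an incorrect sample (marked here with `None`) contributes a speedup of 0. The function names `speedup` and `expected_speedup` are hypothetical.

```python
from statistics import mean
from typing import Optional, Sequence


def speedup(baseline_seconds: float, generated_seconds: float) -> float:
    """Ratio of baseline runtime to generated-code runtime (>1 means faster)."""
    if baseline_seconds <= 0 or generated_seconds <= 0:
        raise ValueError("runtimes must be positive")
    return baseline_seconds / generated_seconds


def expected_speedup(
    baseline_seconds: float,
    sample_runtimes: Sequence[Optional[float]],
) -> float:
    """Average speedup over generated samples for one task.

    Assumption for illustration: a runtime of None marks a sample that
    failed to build or produced wrong output, and it is scored as 0.0.
    """
    if not sample_runtimes:
        raise ValueError("need at least one sample")
    return mean(
        0.0 if t is None else speedup(baseline_seconds, t)
        for t in sample_runtimes
    )
```

Under these assumptions, a value below 1 (such as the 0.9 reported for base models on serial code) means the generated code is, on average, slower than the baseline, while a value above 1 means it is faster.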
