Meta Large Language Model Compiler: Foundation Models of Compiler Optimization

Chris Cummins, Volker Seeker, Dejan Grubisic, Baptiste Roziere, Jonas Gehring, Gabriel Synnaeve, Hugh Leather

TL;DR

LLM Compiler is a family of foundation models built on Code Llama and further pretrained on a large compiler-centric corpus of LLVM-IR and assembly code, then instruction fine-tuned to emulate compiler optimizations. The authors fine-tune the models on two downstream tasks, optimization flag tuning and disassembly, producing LLM Compiler FTD models that achieve about 77% of the optimizing potential of an autotuning search and a 45% disassembly round-trip rate (14% exact match). The evaluation spans flag tuning, disassembly, foundation-model tasks, and software-engineering benchmark suites, showing improvements over baselines such as Code Llama and GPT-4 Turbo while highlighting trade-offs in general programming tasks. The models are openly available in two sizes (7B and 13B) under a bespoke license to accelerate research and practical development in compiler optimization and IR understanding.

Abstract

Large Language Models (LLMs) have demonstrated remarkable capabilities across a variety of software engineering and coding tasks. However, their application in the domain of code and compiler optimization remains underexplored. Training LLMs is resource-intensive, requiring substantial GPU hours and extensive data collection, which can be prohibitive. To address this gap, we introduce Meta Large Language Model Compiler (LLM Compiler), a suite of robust, openly available, pre-trained models specifically designed for code optimization tasks. Built on the foundation of Code Llama, LLM Compiler enhances the understanding of compiler intermediate representations (IRs), assembly language, and optimization techniques. The model has been trained on a vast corpus of 546 billion tokens of LLVM-IR and assembly code and has undergone instruction fine-tuning to interpret compiler behavior. LLM Compiler is released under a bespoke commercial license to allow wide reuse and is available in two sizes: 7 billion and 13 billion parameters. We also present fine-tuned versions of the model, demonstrating its enhanced capabilities in optimizing code size and disassembling from x86_64 and ARM assembly back into LLVM-IR. These achieve 77% of the optimising potential of an autotuning search, and 45% disassembly round trip (14% exact match). This release aims to provide a scalable, cost-effective foundation for further research and development in compiler optimization by both academic researchers and industry practitioners.

Paper Structure (54 sections, 18 figures, 12 tables)

This paper contains 54 sections, 18 figures, 12 tables.

Figures (18)

  • Figure 1: LLM Compiler models are specialized from Code Llama by training on 546 billion tokens of compiler-centric data in two stages. In the first stage the models are trained predominantly on unlabelled compiler IRs and assembly code. In the next stage the models are instruction fine-tuned to predict the output and effect of optimizations. LLM Compiler FTD models are then further fine-tuned on 164 billion tokens of downstream flag tuning and disassembly task datasets, for a total of 710 billion training tokens. During each of the four stages of training, 15% of data from the previous tasks is retained.
  • Figure 2: To give the model an understanding of how compiler optimizations work, we use compiler emulation. Unoptimized code samples and random pass lists are given to the compiler to generate optimized code (IR or assembly). The pass list and input code together form the prompt, while the generated output code is used as the label.
  • Figure 3: Commands used to apply an optimization pipeline, comprising -Oz passes followed by IR outlining, to an unoptimized IR file input.bc. Binary size is the sum of the .TEXT and .DATA section sizes of the lowered object file as reported by size.
  • Figure 4: Overview of our approach, showing the model input (Prompt) and output (Label) during training ① and inference ②. The prompt contains unoptimized code. The label contains an optimization pass list, binary size, and the optimized code. To generate the label for the training prompt, the unoptimized code is compiled against multiple random pass lists. The pass list achieving the minimum binary size is selected, minimized and checked for correctness with PassListEval. The final pass list together with its corresponding optimized IR are used as label during training. In a last step, the top 100 most often selected pass lists are broadcast among all programs. For deployment we generate only the optimization pass list which we feed into the compiler, ensuring that the optimized code is correct.
  • Figure 5: Validating a candidate list of optimization passes using PassListEval. The candidate pass list is applied to the reference solutions for all 164 programs in HumanEval-X. The unit tests for these reference solutions are optimized using a conservative -O2 pass pipeline to ensure correctness, and then linked against the reference solutions. The resulting binaries are executed and if any of the binaries crash during execution, or if any of the compiler invocations fail, the pass list is rejected.
  • ...and 13 more figures
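
The binary-size metric in the Figure 3 caption and the minimum-size selection step in the Figure 4 caption can be illustrated with a minimal Python sketch. This is not the authors' code. It assumes the `size` tool prints Berkeley-style output (a header row containing `text` and `data` columns followed by one data row), and the names `binary_size`, `Candidate`, and `select_smallest` are hypothetical. The pass-list minimization and the PassListEval correctness check described in the captions are omitted here.

```python
from dataclasses import dataclass
from typing import Iterable


def binary_size(size_output: str) -> int:
    """Return text + data bytes from Berkeley-style `size` output (assumed format)."""
    rows = [line.split() for line in size_output.splitlines() if line.strip()]
    if len(rows) < 2:
        raise ValueError("expected a header row and one data row")
    header, values = rows[0], rows[1]
    # Map column names (text, data, bss, ...) to the values reported for the object file.
    columns = dict(zip(header, values))
    return int(columns["text"]) + int(columns["data"])


@dataclass(frozen=True)
class Candidate:
    pass_list: str    # hypothetical encoding: comma-separated pass names
    size_output: str  # captured stdout of `size` on the lowered object file


def select_smallest(candidates: Iterable[Candidate]) -> Candidate:
    """Pick the candidate whose object file has the smallest binary size."""
    # Ties resolve to the first candidate seen; an empty input raises ValueError.
    return min(candidates, key=lambda c: binary_size(c.size_output))
```

In this sketch each candidate carries the already-captured output of `size`, so the selection logic can be exercised without invoking a compiler. A real pipeline would produce that output by compiling the unoptimized code against each random pass list first.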