Enhancing Translation Validation of Compiler Transformations with Large Language Models

Yanzhao Wang, Fei Xie

TL;DR

The paper tackles the challenge of validating compiler transformations in LLVM IR when formal verification tools such as Alive2 struggle with unbounded loops and external calls. It proposes a hybrid framework that first applies SMT-based formal checking and, when that check is inconclusive, uses fine-tuned Large Language Models to predict soundness; transformations predicted unsound are then fuzzed in search of counterexamples. The approach is validated on LLVM IR transformations and deep-learning accelerator designs, showing that fine-tuned small models (e.g., GPT-3.5, Llama2-7B) can outperform larger models in this task, with GPT-3.5 achieving up to 88% prediction accuracy. This work demonstrates the practical potential of combining formal verification with targeted learning-based prediction to enhance the reliability and scalability of compiler transformation validation, especially in domains with complex computations.

Abstract

This paper presents a framework that integrates Large Language Models (LLMs) into translation validation, targeting LLVM compiler transformations where formal verification tools fall short. Our framework first utilizes existing formal verification tools for translation validation. In this work, we use Alive2, a well-known tool in LLVM compiler verification, as an example. When formal verification tools are unable to confirm a transformation's soundness, our framework employs fine-tuned LLMs for prediction. It then applies fuzzing to transformations that the LLMs predict as potentially unsound due to return-value or memory inconsistencies, aiming to find counterexamples. When a transformation is predicted to be sound or unsound for other reasons, or when fuzzing finds no counterexample, the framework reports the outcome directly without further fuzzing. This methodology has shown effectiveness in complex applications such as deep-learning accelerator designs, where traditional formal verification tools struggle.
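
The staged workflow described in the abstract can be sketched as a small dispatcher. This is an illustrative sketch only, not the authors' implementation: the names `Verdict`, `Prediction`, `validate`, and `fuzz_return_values` are hypothetical, the formal checker and the LLM are passed in as plain callables rather than real Alive2 or model calls, and the toy fuzzer compares Python functions on random integers as a stand-in for executing LLVM IR.

```python
import random
from enum import Enum


class Verdict(Enum):
    """Outcome of the formal verification stage (hypothetical labels)."""
    SOUND = "sound"
    UNSOUND = "unsound"
    INCONCLUSIVE = "inconclusive"


class Prediction(Enum):
    """LLM prediction categories, named after the cases in the abstract."""
    SOUND = "sound"
    UNSOUND_RETURN_VALUE = "unsound: return value"
    UNSOUND_MEMORY = "unsound: memory"
    UNSOUND_OTHER = "unsound: other"


# Only these predicted categories are handed to the fuzzer.
FUZZABLE = {Prediction.UNSOUND_RETURN_VALUE, Prediction.UNSOUND_MEMORY}


def validate(src, tgt, formal_check, predict, fuzz):
    """Return (report, counterexample) for one source/target pair."""
    # Stage 1: trust the formal tool whenever it reaches a verdict.
    verdict = formal_check(src, tgt)
    if verdict is not Verdict.INCONCLUSIVE:
        return f"formal: {verdict.value}", None

    # Stage 2: fall back to the learned predictor.
    prediction = predict(src, tgt)
    if prediction not in FUZZABLE:
        return f"predicted: {prediction.value}", None

    # Stage 3: fuzz only return-value / memory predictions.
    counterexample = fuzz(src, tgt)
    if counterexample is None:
        return f"predicted: {prediction.value} (no counterexample found)", None
    return "unsound: counterexample found", counterexample


def fuzz_return_values(src, tgt, trials=1000, seed=0):
    """Toy differential fuzzer: look for an input where the outputs differ."""
    rng = random.Random(seed)
    for _ in range(trials):
        x = rng.randint(-2**31, 2**31 - 1)
        if src(x) != tgt(x):
            return x
    return None
```

For example, calling `validate` with a checker that always returns `Verdict.INCONCLUSIVE`, a predictor that returns `Prediction.UNSOUND_RETURN_VALUE`, and `fuzz_return_values` on `lambda x: x * 2` versus `lambda x: x * 2 + (x % 2)` should surface an odd input as a counterexample. Passing the stages as callables keeps the sketch independent of any particular verifier or model.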

Paper Structure

This paper contains 14 sections, 2 equations, and 6 figures.

Figures (6)

  • Figure 1: Workflow of the Translation Validation Approach
  • Figure 2: Schematic Diagram of the Translation Validation Framework
  • Figure 3: Example of LLVM programs
  • Figure 4: Evaluation Results on LLVM Transformations
  • Figure 5: load_2d module from hVTA
  • ...and 1 more figure