Towards Semantics Lifting for Scientific Computing: A Case Study on FFT

Naifeng Zhang, Sanil Rao, Mike Franusich, Franz Franchetti

TL;DR

This work addresses the difficulty of ensuring correctness and performance for LLM-generated scientific kernels, focusing on numerical stability and floating-point accuracy when runtime sizes are unknown. It introduces stepwise semantics lifting, an approach that reverses SPIRAL's lowering by using symbolic execution and theorem proving to statically derive high-level semantics from kernel code, demonstrated by lifting a GPT-generated FFT implementation to a high-level specification. The authors present an end-to-end pipeline including an LLVM-to-SPIRAL parser and a sequence of lifting steps that convert C code into SPL- and mathematical specifications, ultimately confirming a Cooley–Tukey FFT representation. This approach enables early bug detection and performance bottleneck identification, and offers a pathway to feed lifted semantics back into neural code generation for improved scientific computing software.

Abstract

The rise of automated code generation tools, such as large language models (LLMs), has introduced new challenges in ensuring the correctness and efficiency of scientific software, particularly in complex kernels, where numerical stability, domain-specific optimizations, and precise floating-point arithmetic are critical. We propose a stepwise semantics lifting approach using an extended SPIRAL framework with symbolic execution and theorem proving to statically derive high-level code semantics from LLM-generated kernels. This method establishes a structured path for verifying the source code's correctness via a step-by-step lifting procedure to high-level specification. We conducted preliminary tests on the feasibility of this approach by successfully lifting GPT-generated fast Fourier transform code to high-level specifications.
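
As a rough illustration of the gap between a generated FFT kernel and its high-level specification, the sketch below pairs a direct implementation of the DFT definition with a recursive radix-2 Cooley–Tukey routine of the kind a code generator might produce. It is not the authors' pipeline: the function names `dft_spec`, `fft_radix2`, and `max_abs_error` are hypothetical, and the comparison is a dynamic test on one input, whereas the approach described above derives the semantics statically.

```python
import cmath


def dft_spec(x):
    """Reference specification: y[k] = sum_j x[j] * exp(-2*pi*i*j*k/n)."""
    n = len(x)
    return [
        sum(x[j] * cmath.exp(-2j * cmath.pi * j * k / n) for j in range(n))
        for k in range(n)
    ]


def fft_radix2(x):
    """Recursive radix-2 Cooley-Tukey FFT (hypothetical kernel under test)."""
    n = len(x)
    # Reject empty inputs and lengths that are not a power of two.
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    if n == 1:
        return list(x)
    even = fft_radix2(x[0::2])  # DFT of the even-indexed samples
    odd = fft_radix2(x[1::2])   # DFT of the odd-indexed samples
    half = n // 2
    out = [0j] * n
    for k in range(half):
        # Twiddle factor applied to the odd half, then a butterfly.
        t = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + t
        out[k + half] = even[k] - t
    return out


def max_abs_error(a, b):
    """Largest element-wise absolute difference between two sequences."""
    return max(abs(p - q) for p, q in zip(a, b))


if __name__ == "__main__":
    signal = [complex(i % 3, -i) for i in range(8)]
    err = max_abs_error(fft_radix2(signal), dft_spec(signal))
    print("kernel matches specification:", err < 1e-9)
```

Agreement on sample inputs only shows that the kernel matches the specification for those inputs and sizes. Lifting the code to a specification is meant to establish the correspondence without depending on particular runtime inputs.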

Paper Structure

This paper contains 16 sections, 10 equations, and 1 figure.

Figures (1)

  • Figure 1: Overview of the proposed semantics lifting procedure via SPIRAL. We propose to reverse the well-established code generation (i.e., lowering) process [puschel2005spiral, franchetti2018spiral, xiong2001spl, franchetti2005formal, low2017high] to lift the semantics of the source code step by step.