Table of Contents
Fetching ...

Tenspiler: A Verified Lifting-Based Compiler for Tensor Operations (Extended Version)

Jie Qiu, Colin Cai, Sahil Bhatia, Niranjan Hasabnis, Sanjit A. Seshia, Alvin Cheung

TL;DR

Tenspiler addresses the burden of migrating legacy sequential code to tensor-oriented DSLs by employing a verified lifting workflow built around a novel Tensor algebra–based IR called Tensir. The approach uses SyGuS-based synthesis to produce a Tensir program, SMT-based verification to guarantee semantic equivalence, and a simple pattern-matching backend to generate executable code for six tensor backends, achieving substantial speedups across 10 real-world benchmarks ($105\times$ kernel and $9.65\times$ end-to-end on average). Tensir’s design emphasizes extensibility and verification, offering a flexible pathway to add new backends while maintaining correctness guarantees. Experimental results demonstrate robust performance across multiple hardware targets, with notable gains on Gaudi2 and detailed ablations confirming the effectiveness of the synthesis optimizations. Overall, Tenspiler provides a practical, verifiable route to harness tensor processing infrastructures without reimplementing domain-specific compilers for each backend.

Abstract

Tensor processing infrastructures such as deep learning frameworks and specialized hardware accelerators have revolutionized how computationally intensive code from domains such as deep learning and image processing is executed and optimized. These infrastructures provide powerful and expressive abstractions while ensuring high performance. However, to utilize them, code must be written specifically using the APIs / ISAs of such software frameworks or hardware accelerators. Importantly, given the fast pace of innovation in these domains, code written today quickly becomes legacy as new frameworks and accelerators are developed, and migrating such legacy code manually is a considerable effort. To enable developers in leveraging such DSLs while preserving their current programming paradigm, we introduce Tenspiler, a verified lifting-based compiler that uses program synthesis to translate sequential programs written in general-purpose programming languages (e.g., C++ or Python code) into tensor operations. Central to Tenspiler is our carefully crafted yet simple intermediate language, named TensIR, that expresses tensor operations. TensIR enables efficient lifting, verification, and code generation. Currently, Tenspiler already supports $\textbf{six}$ DSLs, spanning a broad spectrum of software and hardware environments. Furthermore, we show that new backends can be easily supported by Tenspiler by adding simple pattern-matching rules for TensIR. Using 10 real-world code benchmark suites, our experimental evaluation shows that by translating code to be executed on $\textbf{6}$ different software frameworks and hardware devices, Tenspiler offers on average 105$\times$ kernel and 9.65$\times$ end-to-end execution time improvement over the fully-optimized sequential implementation of the same benchmarks.

Tenspiler: A Verified Lifting-Based Compiler for Tensor Operations (Extended Version)

TL;DR

Tenspiler addresses the burden of migrating legacy sequential code to tensor-oriented DSLs by employing a verified lifting workflow built around a novel Tensor algebra–based IR called Tensir. The approach uses SyGuS-based synthesis to produce a Tensir program, SMT-based verification to guarantee semantic equivalence, and a simple pattern-matching backend to generate executable code for six tensor backends, achieving substantial speedups across 10 real-world benchmarks ( kernel and end-to-end on average). Tensir’s design emphasizes extensibility and verification, offering a flexible pathway to add new backends while maintaining correctness guarantees. Experimental results demonstrate robust performance across multiple hardware targets, with notable gains on Gaudi2 and detailed ablations confirming the effectiveness of the synthesis optimizations. Overall, Tenspiler provides a practical, verifiable route to harness tensor processing infrastructures without reimplementing domain-specific compilers for each backend.

Abstract

Tensor processing infrastructures such as deep learning frameworks and specialized hardware accelerators have revolutionized how computationally intensive code from domains such as deep learning and image processing is executed and optimized. These infrastructures provide powerful and expressive abstractions while ensuring high performance. However, to utilize them, code must be written specifically using the APIs / ISAs of such software frameworks or hardware accelerators. Importantly, given the fast pace of innovation in these domains, code written today quickly becomes legacy as new frameworks and accelerators are developed, and migrating such legacy code manually is a considerable effort. To enable developers in leveraging such DSLs while preserving their current programming paradigm, we introduce Tenspiler, a verified lifting-based compiler that uses program synthesis to translate sequential programs written in general-purpose programming languages (e.g., C++ or Python code) into tensor operations. Central to Tenspiler is our carefully crafted yet simple intermediate language, named TensIR, that expresses tensor operations. TensIR enables efficient lifting, verification, and code generation. Currently, Tenspiler already supports DSLs, spanning a broad spectrum of software and hardware environments. Furthermore, we show that new backends can be easily supported by Tenspiler by adding simple pattern-matching rules for TensIR. Using 10 real-world code benchmark suites, our experimental evaluation shows that by translating code to be executed on different software frameworks and hardware devices, Tenspiler offers on average 105 kernel and 9.65 end-to-end execution time improvement over the fully-optimized sequential implementation of the same benchmarks.
Paper Structure (30 sections, 4 equations, 15 figures, 1 table)

This paper contains 30 sections, 4 equations, 15 figures, 1 table.

Figures (15)

  • Figure 1: End-to-End example of using Tenspiler to transpile code.
  • Figure 2: Tensir grammar.
  • Figure 3: An overview of the Tenspiler Framework.
  • Figure 4: Example of an inductive axiom for the tensor_scalar operator in Tensir described using SMT-LIB. "+" corresponds to the concat operator.
  • Figure 5: Code generation for the element-wise add operator to different targets.
  • ...and 10 more figures