TensorRight: Automated Verification of Tensor Graph Rewrites

Jai Arora, Sirui Lu, Devansh Jain, Tianfan Xu, Farzin Houshmand, Phitchaya Mangpo Phothilimthana, Mohsen Lesani, Praveen Narayanan, Karthik Srinivasa Murthy, Rastislav Bodik, Amit Sabne, Charith Mendis

TL;DR

TensorRight introduces the first automatic verifier for tensor graph rewrites that are valid for input tensors of arbitrary rank and size. It defines TensorRight DSL, whose aggregated-axis construct captures rank- and size-polymorphic rules, and proves that a finite rank bound suffices for unbounded verification, so that bounded symbolic execution and SMT-based proofs cover all instances. The system represents XLA-HLO-like operators, gives them denotational semantics, and uses a bound-inference algorithm to generate a finite set of bounded-verification obligations, which are discharged by Z3 (via Grisette). Evaluation on XLA’s algebraic simplifier shows that TensorRight can express 121 rules and verify 115 of them in the unbounded setting (most within a second), outperforming prior bounded-verification approaches (TASO, PET), the closest of which can express only 18 of these rules. This work enables rapid, counterexample-guided refinement of complex rewrites and offers public tooling to help compiler developers broaden and validate rank- and size-polymorphic optimizations.

Abstract

Tensor compilers, essential for generating efficient code for deep learning models across various applications, employ tensor graph rewrites as one of the key optimizations. These rewrites optimize tensor computational graphs with the expectation of preserving semantics for tensors of arbitrary rank and size. Despite this expectation, to the best of our knowledge, there does not exist a fully automated verification system to prove the soundness of these rewrites for tensors of arbitrary rank and size. Previous works, while successful in verifying rewrites with tensors of concrete rank, do not provide guarantees in the unbounded setting. To fill this gap, we introduce TensorRight, the first automatic verification system that can verify tensor graph rewrites for input tensors of arbitrary rank and size. We introduce a core language, TensorRight DSL, to represent rewrite rules using a novel axis definition, called aggregated-axis, which allows us to reason about an unbounded number of axes. We achieve unbounded verification by proving that there exists a bound on tensor ranks, under which bounded verification of all instances implies the correctness of the rewrite rule in the unbounded setting. We derive an algorithm to compute this rank using the denotational semantics of TensorRight DSL. TensorRight employs this algorithm to generate a finite number of bounded-verification proof obligations, which are then dispatched to an SMT solver using symbolic execution to automatically verify the correctness of the rewrite rules. We evaluate TensorRight's verification capabilities by implementing rewrite rules present in XLA's algebraic simplifier. The results demonstrate that TensorRight can prove the correctness of 115 out of 175 rules in their full generality, while the closest automatic, bounded-verification system can express only 18 of these rules.
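To make the idea of a bounded-verification obligation concrete, the following is a minimal sketch in Python and not TensorRight's implementation. It models a tensor as a map from accesses (index tuples) to values and checks an illustrative rule, `slice(slice(X))` equals a single `slice` with composed bounds, for every rank and axis size up to small limits. Several things here are assumptions made for illustration: the rule itself, the helper names (`make_tensor`, `slice_`, `find_counterexample`), and the limits `max_rank` and `max_size`, which are picked by hand rather than inferred by a bound-inference algorithm. The check is a brute-force enumeration of concrete instances, whereas TensorRight discharges its obligations symbolically with an SMT solver.

```python
from itertools import product

def make_tensor(shape):
    """Model a tensor as a map from access (index tuple) to a distinct value."""
    return {idx: n for n, idx in enumerate(product(*(range(d) for d in shape)))}

def slice_(t, start, end):
    """Keep accesses with start <= idx < end on every axis, re-based to zero."""
    return {
        tuple(i - s for i, s in zip(idx, start)): v
        for idx, v in t.items()
        if all(s <= i < e for i, s, e in zip(idx, start, end))
    }

def lhs(t, s1, e1, s2, e2):
    # Two nested unit-stride slices.
    return slice_(slice_(t, s1, e1), s2, e2)

def rhs(t, s1, e1, s2, e2):
    # One slice whose bounds are the inner bounds shifted by the outer start.
    start = tuple(a + b for a, b in zip(s1, s2))
    end = tuple(a + b for a, b in zip(s1, e2))
    return slice_(t, start, end)

def rhs_buggy(t, s1, e1, s2, e2):
    # Deliberately wrong: forgets to shift the end bound by the outer start.
    start = tuple(a + b for a, b in zip(s1, s2))
    return slice_(t, start, e2)

def axis_params(d):
    """All (s1, e1, s2, e2) on one axis of size d satisfying the precondition
    0 <= s1 <= e1 <= d and 0 <= s2 <= e2 <= e1 - s1."""
    return [
        (s1, e1, s2, e2)
        for s1 in range(d + 1)
        for e1 in range(s1, d + 1)
        for s2 in range(e1 - s1 + 1)
        for e2 in range(s2, e1 - s1 + 1)
    ]

def find_counterexample(rhs_fn, max_rank, max_size):
    """Return the first instance where LHS and RHS differ, or None."""
    for rank in range(1, max_rank + 1):
        for shape in product(range(1, max_size + 1), repeat=rank):
            t = make_tensor(shape)
            for params in product(*(axis_params(d) for d in shape)):
                s1, e1, s2, e2 = zip(*params)  # one tuple per parameter
                if lhs(t, s1, e1, s2, e2) != rhs_fn(t, s1, e1, s2, e2):
                    return shape, s1, e1, s2, e2
    return None

if __name__ == "__main__":
    print("correct rule:", find_counterexample(rhs, max_rank=2, max_size=2))
    print("buggy rule:  ", find_counterexample(rhs_buggy, max_rank=2, max_size=2))
```

Passing this enumeration says nothing about larger ranks or sizes on its own. The abstract's contribution is precisely the proof that, for rules in TensorRight DSL, a computable rank bound exists under which bounded verification of all instances implies correctness in the unbounded setting. The buggy variant illustrates how a failing instance can serve as a counterexample for refining a rule.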

Paper Structure

This paper contains 96 sections, 10 theorems, 77 equations, 14 figures, 1 table, 2 algorithms.

Key Result

Lemma 1

Let $R$ be any rewrite rule written in our DSL. Let $m$ be a map containing ranks of RClasses in $R$. For any RClass $c$ in the rule, if $k = \textsc{InferBound}(R, c)$, then

Figures (14)

  • Figure 1: FoldConvInputPad rule taken from XLA's Algebraic Simplifier.
  • Figure 2: Illustration for SliceDyUpSlice rule depicting various regions in the input tensor for ranks 1 and 2. The leftmost element is shown as crossed out. The green and blue regions indicate the elements extracted by $\mathsf{slice}$ in LHS and RHS, respectively. The zeroed out region after the $\mathsf{dyup{\hbox{-}}slice}$ is indicated by 0-elements.
  • Figure 3: The SliceDyUpSlice specialized for rank-3 inputs. The LHS and RHS expressions are presented using 2-dimensional cross-sections along axis 2. The access $A^3 = [0,0,1]$ is highlighted in LHS and RHS.
  • Figure 4: Illustration of $\mathsf{dy{\hbox{-}}slice}$ and $\mathsf{slice}$ operators. The shaded regions denote the operator outputs.
  • Figure 5: TensorRight Overview and Workflow.
  • ...and 9 more figures

Theorems & Definitions (20)

  • Definition 1
  • Definition 2
  • Definition 3
  • Definition 4
  • Definition 5
  • Lemma 1
  • Theorem 2
  • Definition 6: Valid Rewrite Rule
  • Definition 7: Invalid Rewrite Rule
  • Definition 8: RClass-rank Map
  • ...and 10 more