CoNST: Code Generator for Sparse Tensor Networks

Saurabh Raje; Yufan Xu; Atanas Rountev; Edward F. Valeev; Saday Sadayappan

TL;DR

CoNST generates efficient code for sparse tensor contraction trees by formulating a single constraint-based optimization that jointly selects loop fusion, tensor mode layouts, and contraction order. It encodes these decisions as constraints for the Z3 SMT solver; the solution yields a legal fused loop structure and CSF layouts, which are then lowered to the IR of the Tensor Algebra Compiler (TACO) for code generation. The approach yields significant performance gains, sometimes orders of magnitude, over state-of-the-art sparse tensor systems on benchmarks from quantum chemistry and tensor decompositions. This integrated framework enables high-performance sparse tensor networks and shows strong potential for parallelization and GPU acceleration in future work. $E$ and $F$ are treated as tensors of dynamic extents, and the method emphasizes reducing intermediate tensor orders to improve memory and compute efficiency.

Abstract

Sparse tensor networks are commonly used to represent contractions over sparse tensors. Tensor contractions are higher-order analogs of matrix multiplication. Tensor networks arise commonly in many domains of scientific computing and data science. After a transformation into a tree of binary contractions, the network is implemented as a sequence of individual contractions. Several critical aspects must be considered in the generation of efficient code for a contraction tree, including sparse tensor layout mode order, loop fusion to reduce intermediate tensors, and the interdependence of loop order, mode order, and contraction order. We propose CoNST, a novel approach that considers these factors in an integrated manner using a single formulation. Our approach creates a constraint system that encodes these decisions and their interdependence, while aiming to produce reduced-order intermediate tensors via fusion. The constraint system is solved by the Z3 SMT solver and the result is used to create the desired fused loop structure and tensor mode layouts for the entire contraction tree. This structure is lowered to the IR of the TACO compiler, which is then used to generate executable code. Our experimental evaluation demonstrates very significant (sometimes orders of magnitude) performance improvements over current state-of-the-art sparse tensor compiler/library alternatives.
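To make the abstract's point about fusion and reduced-order intermediates concrete, the following is a minimal Python sketch. It is not CoNST's generated code. It assumes a hypothetical three-matrix chain $R_{il} = \sum_{j,k} A_{ij} B_{jk} C_{kl}$ evaluated as two binary contractions, with sparse tensors stored as dictionaries from coordinate tuples to values. The names `rows`, `contract_unfused`, and `contract_fused` are illustrative only.

```python
from collections import defaultdict


def rows(M):
    """Group a sparse matrix {(r, c): value} by its first index."""
    out = defaultdict(dict)
    for (r, c), v in M.items():
        out[r][c] = v
    return out


def contract_unfused(A, B, C):
    """Two separate contractions: T[i,k] is fully materialized (order 2)."""
    B_rows, C_rows = rows(B), rows(C)
    T = defaultdict(float)
    for (i, j), a in A.items():
        for k, b in B_rows.get(j, {}).items():
            T[(i, k)] += a * b
    R = defaultdict(float)
    for (i, k), t in T.items():
        for l, c in C_rows.get(k, {}).items():
            R[(i, l)] += t * c
    return dict(R), len(T)  # result and number of stored intermediate entries


def contract_fused(A, B, C):
    """Both contractions fused under a common i loop: only T_i[k] (order 1) is live."""
    A_rows, B_rows, C_rows = rows(A), rows(B), rows(C)
    R = defaultdict(float)
    peak = 0
    for i, a_row in A_rows.items():
        t = defaultdict(float)  # order-1 slice of the intermediate for this i
        for j, a in a_row.items():
            for k, b in B_rows.get(j, {}).items():
                t[k] += a * b
        peak = max(peak, len(t))
        for k, tv in t.items():
            for l, c in C_rows.get(k, {}).items():
                R[(i, l)] += tv * c
    return dict(R), peak  # result and peak size of the live intermediate
```

Both functions compute the same result; the fused variant only ever holds one row of the intermediate at a time. Fusing under `i` works here because `A` is grouped with `i` outermost, which is a small instance of the interdependence between loop order and tensor mode order that the abstract describes.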

Paper Structure (30 sections, 11 equations, 12 figures, 2 tables)

Introduction
Background and Overview
Tensor Networks
Sparse tensors
Tensor references
CSF representation
Tensor contractions
Tensor networks
Contraction tree
Challenges and Overview of Solution
Constraint-Based Integrated Fusion and Data Layout Selection
Input and Output
Constraint Formulation
Ordering of assignments
Ordering of tensor modes
...and 15 more sections

Figures (12)

Figure 1: The CSF format for representing an order-$4$ sparse tensor in memory. The table on the left shows the indices of non-zero elements. The tree on the right shows the CSF representation (root node is not shown).
Figure 2: Tensor network and code for direct $n$-ary contraction for expression $R_{i j k} = A_{ i p q } \times B_{ j p r } \times C_{ k q r } \times D_{ j k r }$
Figure 3: Contraction tree for a tensor network
Figure 4: Reduction of size of intermediate tensors via loop fusion
Figure 5: Fused code structure
...and 7 more figures
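
The CSF layout described in the Figure 1 caption can be sketched in a few lines of Python. This is an illustrative sketch under stated assumptions, not the paper's or TACO's implementation: it stores only the index structure (no values), assumes a non-empty list of coordinate tuples, and uses hypothetical names `build_csf` and `csf_coords`. The optional `mode_order` argument permutes the modes before building, which is the layout choice the paper refers to as mode order.

```python
def build_csf(coords, mode_order=None):
    """Build per-level index and pointer arrays for a sparse tensor.

    idx[l] lists the index stored at each tree node of level l;
    ptr[l][n] .. ptr[l][n+1] is the range of children of node n in level l+1.
    """
    order = len(coords[0])
    if mode_order is None:
        mode_order = list(range(order))
    # Permute modes, drop duplicates, and sort lexicographically.
    coords = sorted({tuple(c[m] for m in mode_order) for c in coords})
    idx = [[] for _ in range(order)]
    ptr = [[0] for _ in range(order - 1)]
    prev = None
    for c in coords:
        # First level at which this coordinate leaves the previous tree path.
        d = 0
        if prev is not None:
            while c[d] == prev[d]:
                d += 1
        for l in range(d, order):
            idx[l].append(c[l])
            if l < order - 1:
                ptr[l].append(0)  # placeholder, fixed just below
        # The last node of each level is an ancestor of the newest leaf,
        # so its child range ends at the current end of the next level.
        for l in range(order - 1):
            ptr[l][-1] = len(idx[l + 1])
        prev = c
    return idx, ptr


def csf_coords(idx, ptr):
    """Walk the CSF tree and return the coordinates in stored order."""
    order = len(idx)
    out = []

    def walk(level, node, prefix):
        prefix = prefix + (idx[level][node],)
        if level == order - 1:
            out.append(prefix)
            return
        for child in range(ptr[level][node], ptr[level][node + 1]):
            walk(level + 1, child, prefix)

    for root in range(len(idx[0])):
        walk(0, root, ())
    return out
```

Shared index prefixes are stored once per level, which is why the number of nodes at the upper levels is smaller than the number of non-zeros. Iterating the tree is efficient only in the stored mode order, so a loop nest that visits modes in a different order needs a different layout.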
