Compressing Structured Tensor Algebra
Mahdi Ghorbani, Emilien Bauer, Tobias Grosser, Amir Shaikhha
TL;DR
DASTAC addresses the challenge of efficiently computing structured tensor algebra by propagating high-level structure into a densely packed data layout and low-level, structure-aware code. It combines StructTensor-based structure inference with a novel symbolic indexing compression and progressive MLIR-based code generation that leverages the polyhedral model. The approach delivers speedups of 1–2 orders of magnitude over state-of-the-art sparse and structured tensor compilers with a substantially lower memory footprint, and it scales effectively on multi-core CPUs. This work suggests a practical path toward high-performance structured tensor computations on CPUs and future GPU targets by unifying dense and sparse optimization techniques through polyhedral and MLIR pipelines.
Abstract
Tensor algebra is a crucial component for data-intensive workloads such as machine learning and scientific computing. As the complexity of data grows, scientists often encounter a dilemma between highly specialized dense tensor algebra and the efficient structure-aware algorithms provided by sparse tensor algebra. In this paper, we introduce DASTAC, a framework that propagates a tensor's captured high-level structure down to low-level code generation by incorporating techniques such as automatic data layout compression, polyhedral analysis, and affine code generation. Our methodology reduces memory footprint by automatically detecting the best data layout, benefits heavily from polyhedral optimizations, leverages further optimizations, and enables parallelization through MLIR. Through extensive experimentation, we show that DASTAC achieves 1 to 2 orders of magnitude speedup over TACO, a state-of-the-art sparse tensor compiler, and StructTensor, a state-of-the-art structured tensor algebra compiler, with a significantly lower memory footprint.
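To make the idea of a densely packed layout for a structured tensor concrete, the following is a minimal hand-written sketch for one familiar structure, an upper-triangular matrix. It is not DASTAC's compression algorithm or its generated code; the function names (`packed_index`, `pack_upper`, `packed_matvec`) and the row-major packing order are assumptions chosen purely for illustration.

```python
def packed_index(i, j, n):
    """Offset of element (i, j), with i <= j < n, in a row-major packed
    upper-triangular buffer (hypothetical layout, for illustration only)."""
    # Rows 0..i-1 store n, n-1, ..., n-i+1 elements: i*n - i*(i-1)/2 in total.
    return i * n - i * (i - 1) // 2 + (j - i)


def pack_upper(dense):
    """Copy only the upper triangle of a square matrix into a flat list."""
    n = len(dense)
    return [dense[i][j] for i in range(n) for j in range(i, n)]


def packed_matvec(packed, x):
    """Compute y = A @ x, where A is upper triangular and stored packed.
    The loop bounds skip the zero region, and the index map reads the
    compressed buffer directly, so no zeros are stored or multiplied."""
    n = len(x)
    y = [0.0] * n
    for i in range(n):
        for j in range(i, n):
            y[i] += packed[packed_index(i, j, n)] * x[j]
    return y


dense = [[1.0, 2.0, 3.0],
         [0.0, 4.0, 5.0],
         [0.0, 0.0, 6.0]]
packed = pack_upper(dense)            # n*(n+1)/2 = 6 values instead of 9
result = packed_matvec(packed, [1.0, 1.0, 1.0])
```

In this sketch the structure appears in two places: the iteration space (`j` starts at `i`) and the affine-plus-quadratic index map into the compressed buffer. The abstract describes deriving both automatically from the inferred structure, whereas here they are written by hand for a single case.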
