Latent Idiom Recognition for a Minimalist Functional Array Language using Equality Saturation
Jonathan Van der Cruysse, Christophe Dubach
TL;DR
LIAR addresses idiom recognition for high-performance libraries by encoding programs and library idioms in a minimalist functional array IR and applying equality saturation. The approach uses a target-independent IR with a small set of core rewrite rules plus target-specific extractors and idiom rules to expose library calls such as BLAS and PyTorch functions. Evaluation on PolyBench kernels shows a geometric mean speedup of 1.46x over reference C kernels (excluding gemver) and demonstrates how idioms evolve over saturation steps toward complete library coverage. The results indicate that automated idiom discovery using equality saturation is robust to input variations and adaptable to multiple libraries, reducing the need for hand-crafted analyses.
Abstract
Accelerating programs is typically done by recognizing code idioms matching high-performance libraries or hardware interfaces. However, recognizing such idioms automatically is challenging. The idiom recognition machinery is difficult to write and requires expert knowledge. In addition, slight variations in the input program might hide the idiom and defeat the recognizer. This paper advocates for the use of a minimalist functional array language supporting a small, but expressive, set of operators. The minimalist design leads to a tiny set of rewrite rules, which encode the language semantics. Crucially, the same minimalist language is also used to encode idioms. This removes the need for hand-crafted analysis passes, or for having to learn a complex domain-specific language to define the idioms. Coupled with equality saturation, this approach is able to match the core functions from the BLAS and PyTorch libraries on a set of computational kernels. Compared to reference C kernel implementations, C programs that use BLAS and are generated from the high-level minimalist language achieve a geometric mean speedup of 1.46x.
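To make the idea concrete, the following is a minimal sketch of idiom recognition by rewriting, written in Python for illustration only. It is not the paper's implementation: it enumerates a plain set of equivalent terms up to a step bound instead of using an e-graph, so it is a naive stand-in for equality saturation. The operator names `ifold`, `get`, and `dot`, the variable names `i` and `acc`, and the size-based cost are all assumptions made for this example. The idiom is stated as a rewrite rule in the same term language as the program, and commutativity rules let it fire even though the input is written in a different order.

```python
# Terms are nested tuples (operator, child, ...); leaves are strings or numbers.
# Pattern variables are written ("?", name). All operator names are hypothetical.

def var(name):
    return ("?", name)

def match(pattern, term, env):
    """Return bindings making `pattern` equal to `term`, or None on failure."""
    if isinstance(pattern, tuple) and pattern[0] == "?":
        name = pattern[1]
        if name in env:
            return env if env[name] == term else None
        return {**env, name: term}
    if isinstance(pattern, tuple):
        if (not isinstance(term, tuple) or len(term) != len(pattern)
                or term[0] != pattern[0]):
            return None
        for p, t in zip(pattern[1:], term[1:]):
            env = match(p, t, env)
            if env is None:
                return None
        return env
    return env if pattern == term else None

def substitute(pattern, env):
    """Instantiate a rule's right-hand side with the matched bindings."""
    if isinstance(pattern, tuple) and pattern[0] == "?":
        return env[pattern[1]]
    if isinstance(pattern, tuple):
        return (pattern[0],) + tuple(substitute(p, env) for p in pattern[1:])
    return pattern

def rewrite_everywhere(term, rules):
    """Yield every term obtained by applying one rule at one position."""
    for lhs, rhs in rules:
        env = match(lhs, term, {})
        if env is not None:
            yield substitute(rhs, env)
    if isinstance(term, tuple):
        for k in range(1, len(term)):
            for child in rewrite_everywhere(term[k], rules):
                yield term[:k] + (child,) + term[k + 1:]

def saturate(term, rules, max_steps=5):
    """Collect all terms reachable within `max_steps` rounds of rewriting."""
    seen, frontier = {term}, {term}
    for _ in range(max_steps):
        new = {t for f in frontier for t in rewrite_everywhere(f, rules)} - seen
        if not new:
            break  # fixed point: no rule produces an unseen term
        seen |= new
        frontier = new
    return seen

def size(term):
    """Cost model (an assumption): smaller terms are cheaper."""
    return 1 + sum(size(c) for c in term[1:]) if isinstance(term, tuple) else 1

A, B, N, X, Y = (var(n) for n in "ABNXY")
RULES = [
    (("+", X, Y), ("+", Y, X)),  # commutativity of addition
    (("*", X, Y), ("*", Y, X)),  # commutativity of multiplication
    # Idiom rule: a fold accumulating A[i] * B[i] is a dot product.
    # This sketch assumes A and B do not themselves mention i or acc.
    (("ifold", N, 0, ("+", ("*", ("get", A, "i"), ("get", B, "i")), "acc")),
     ("dot", A, B)),
]

# The input accumulates in the "wrong" order, so the idiom does not match directly.
program = ("ifold", "n", 0,
           ("+", "acc", ("*", ("get", "ys", "i"), ("get", "xs", "i"))))
equivalent = saturate(program, RULES)
# Pick the cheapest term; ties are broken by repr to keep the choice deterministic.
best = min(equivalent, key=lambda t: (size(t), repr(t)))
print(best)
```

Here the idiom only becomes visible after one commutativity step, which illustrates why exploring equivalent forms makes recognition less sensitive to how the input is written. An e-graph represents the same set of equivalent terms compactly by sharing subterms, which this set-based sketch does not attempt.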
