Generalized cyclic symmetric decompositions for the matrix multiplication tensor
Charlotte Vermeylen, Marc Van Barel
TL;DR
The paper tackles the discovery of fast matrix multiplication (FMM) algorithms by casting matrix multiplication as a canonical polyadic decomposition (CPD) of the matrix multiplication tensor $T_{mpn}$. It addresses the non-uniqueness of such decompositions and the large parameter search space by imposing a cyclic symmetric (CS) structure on the factor matrices. This structure is integrated into an augmented Lagrangian solver and extended to tensors with non-equal mode sizes through a generalized CS formulation, with recursive and subblock CS strategies that reduce the number of parameters further. Empirical results across multiple tensors show that the generalized CS approach finds more exact decompositions, and more practically useful ones, including sparse polyadic decompositions (PDs) with entries in $\{0, \pm 1\}$. The work thus improves convergence, expands the set of known practical CPDs, and supports the discovery of efficient FMM algorithms with potentially hardware-friendly properties.
Abstract
A new generalized cyclic symmetric structure in the factor matrices of polyadic decompositions of matrix multiplication tensors for non-square matrix multiplication is proposed. The structure reduces the number of variables in the optimization problem and in this way improves convergence. It is implemented in an existing numerical optimization algorithm. Extensive numerical experiments show that the proposed structure indeed finds more (practical) decompositions.
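To make the setting concrete, the following minimal NumPy sketch builds the matrix multiplication tensor and checks that Strassen's classical rank-7 algorithm is an exact polyadic decomposition of $T_{222}$ with all factor entries in $\{0, \pm 1\}$. It illustrates the general idea only, not the generalized cyclic symmetric structure or the solver of the paper. The function names (`matmul_tensor`, `cpd_to_tensor`, `cpd_multiply`), the row-major vectorization, and the choice to index the third mode by the entries of the product $C = AB$ are assumptions made for this illustration; other conventions, such as indexing the third mode by $C^T$, are also common.

```python
import numpy as np


def matmul_tensor(m, p, n):
    """Tensor of (m x p) times (p x n), with row-major vectorization (assumed).

    T[a, b, c] = 1 exactly when a = (i, j), b = (j, k), c = (i, k).
    """
    T = np.zeros((m * p, p * n, m * n), dtype=int)
    for i in range(m):
        for j in range(p):
            for k in range(n):
                T[i * p + j, j * n + k, i * n + k] = 1
    return T


def cpd_to_tensor(U, V, W):
    """Sum of R rank-1 terms u_r (x) v_r (x) w_r from the factor matrices."""
    return np.einsum("ar,br,cr->abc", U, V, W)


def cpd_multiply(A, B, U, V, W):
    """Multiply A and B using R scalar products, one per rank-1 term."""
    products = (U.T @ A.ravel()) * (V.T @ B.ravel())
    return (W @ products).reshape(A.shape[0], B.shape[1])


# Strassen's algorithm: each row below is one of the 7 products M_1..M_7.
# Columns of the rows are ordered (11, 12, 21, 22) for A, B and C.
U = np.array([[1, 0, 0, 1], [0, 0, 1, 1], [1, 0, 0, 0], [0, 0, 0, 1],
              [1, 1, 0, 0], [-1, 0, 1, 0], [0, 1, 0, -1]]).T
V = np.array([[1, 0, 0, 1], [1, 0, 0, 0], [0, 1, 0, -1], [-1, 0, 1, 0],
              [0, 0, 0, 1], [1, 1, 0, 0], [0, 0, 1, 1]]).T
W = np.array([[1, 0, 0, 1], [0, 0, 1, -1], [0, 1, 0, 1], [1, 0, 1, 0],
              [-1, 1, 0, 0], [0, 0, 0, 1], [1, 0, 0, 0]]).T

T222 = matmul_tensor(2, 2, 2)
print("exact PD of T_222:", np.array_equal(cpd_to_tensor(U, V, W), T222))

A = np.array([[1, 2], [3, 4]])
B = np.array([[5, 6], [7, 8]])
print("product matches A @ B:", np.array_equal(cpd_multiply(A, B, U, V, W), A @ B))
```

An unconstrained rank-$R$ decomposition of $T_{mpn}$ has $R(mp + pn + nm)$ unknowns in the three factor matrices, which is 84 for the example above. This count is what a structured ansatz on the factor matrices aims to reduce.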
