Improving the Leading Constant of Matrix Multiplication

Josh Alman; Hantao Yu

Abstract

Algebraic matrix multiplication algorithms are designed by bounding the rank of matrix multiplication tensors, and then using a recursive method. However, designing algorithms in this way quickly leads to large constant factors: if one proves that the tensor for multiplying $n \times n$ matrices has rank $\leq t$, then the resulting recurrence shows that $M \times M$ matrices can be multiplied using $O(n^2 \cdot M^{\log_n t})$ operations, where the leading constant scales proportionally to $n^2$. Even modest increases in $n$ can blow up the leading constant too much to be worth the slight decrease in the exponent of $M$. Meanwhile, the asymptotically best algorithms use very large $n$, such that $n^2$ is larger than the number of atoms in the visible universe! In this paper, we give new ways to use tensor rank bounds to design matrix multiplication algorithms, which lead to smaller leading constants than the standard recursive method. Our main result shows that, if the tensor for multiplying $n \times n$ matrices has rank $\leq t$, then $M \times M$ matrices can be multiplied using only $n^{O(1/(\log n)^{0.33})} \cdot M^{\log_n t}$ operations. In other words, we improve the leading constant in general from $O(n^2)$ to $n^{O(1/(\log n)^{0.33})} < n^{o(1)}$. We then apply this and further improve the leading constant in a number of situations of interest. We show that, in the popularly conjectured case where $\omega = 2$, a new, different recursive approach can lead to an improvement. We also show that the leading constant of the current asymptotically fastest matrix multiplication algorithm, and any algorithm designed using the group-theoretic method, can be further improved by taking advantage of additional structure of the underlying tensor identities.
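To make the recurrence behind the standard recursive method concrete, here is a minimal sketch of the operation count. It illustrates only the baseline the abstract describes, not the paper's new constructions. It assumes unit cost per scalar operation, a base case of one multiplication for $1 \times 1$ matrices, and a hypothetical parameter `additions` for the number of block additions per recursion level. The function names are illustrative and do not come from the paper.

```python
def recursive_op_count(n: int, t: int, additions: int, levels: int) -> int:
    """Count scalar operations for M x M matrices with M = n**levels.

    Uses the recurrence T(M) = t * T(M/n) + additions * (M/n)**2, T(1) = 1.
    `additions` (an assumed input) is the number of block additions that the
    rank-t identity for n x n matrices performs at each level.
    """
    count = 1  # T(1): a single scalar multiplication
    block = 1  # side length M/n of the blocks at the current level
    for _ in range(levels):
        count = t * count + additions * block * block
        block *= n
    return count


def leading_constant(n: int, t: int, additions: int) -> float:
    """Coefficient of M**log_n(t) in the closed form of the recurrence.

    Solving the recurrence gives T(M) = c * M**log_n(t) - (c - 1) * M**2
    with c = 1 + additions / (t - n**2); this assumes t > n**2.
    """
    return 1 + additions / (t - n * n)


if __name__ == "__main__":
    # Strassen's identity: n = 2, t = 7, with 18 block additions per level.
    for k in range(1, 6):
        print(2 ** k, recursive_op_count(2, 7, 18, k))
    print(leading_constant(2, 7, 18))
```

For Strassen's algorithm the recurrence solves to $7 \cdot M^{\log_2 7} - 6 M^2$, so the leading constant is 7. For a general rank-$t$ identity, the number of additions per level can grow with $t \cdot n^2$, which is one way to see where the $O(n^2)$ leading constant of the standard method comes from.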
