OpenTensor: Reproducing Faster Matrix Multiplication Discovering Algorithms
Yiwen Sun, Wenye Li
TL;DR
OpenTensor reproduces and clarifies AlphaTensor's framework for discovering faster matrix multiplication (MM) algorithms, in which Monte Carlo Tree Search guided by a neural network operates on a tensor-decomposition view of MM. It casts MM as finding a low-rank decomposition $T = \sum_{i=1}^{r} u_i \otimes v_i \otimes w_i$ of the MM tensor $T$, where the rank $r$ equals the number of scalar multiplications, and extracts one factor triple $(u_i, v_i, w_i)$ at a time, guided by a learned policy and rank estimator. The paper introduces practical improvements (Change of Basis, Action Canonicalization, Order Shuffling, and a no-redundancy filter for synthetic demonstrations) to reduce rank overestimation and accelerate convergence. On standard hardware with synthetic data, OpenTensor converges faster and can reproduce and improve upon the original AlphaTensor algorithm, including discovering a $2\times 2$ MM algorithm requiring $7$ multiplications instead of the $8$ used by the standard method.
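To make the decomposition view concrete, the sketch below builds the $2\times 2$ MM tensor and subtracts rank-one terms $u_i \otimes v_i \otimes w_i$ one at a time until the residual is zero. It is illustrative only and is not taken from the OpenTensor code. The helper names (`matmul_tensor`, `subtract_rank_one`, `is_zero`) and the row-major indexing of the entries of $A$, $B$, and $C$ are assumptions made for this example, and the factors are those of Strassen's classical algorithm, not the output of a learned policy.

```python
def matmul_tensor(n):
    """Return the n*n x n*n x n*n MM tensor as nested lists.

    Assumed convention: T[a][b][c] = 1 iff A-entry a times B-entry b
    contributes to C-entry c, with all matrices flattened row-major.
    """
    size = n * n
    T = [[[0] * size for _ in range(size)] for _ in range(size)]
    for i in range(n):
        for j in range(n):
            for k in range(n):
                # C[i][j] += A[i][k] * B[k][j]
                T[i * n + k][k * n + j][i * n + j] = 1
    return T


def subtract_rank_one(T, u, v, w):
    """Return T - u (x) v (x) w, i.e. one factor-extraction step."""
    return [
        [[T[a][b][c] - u[a] * v[b] * w[c] for c in range(len(w))]
         for b in range(len(v))]
        for a in range(len(u))
    ]


def is_zero(T):
    """True when every entry of the tensor is zero."""
    return all(x == 0 for plane in T for row in plane for x in row)


# Strassen's seven products as (u, v, w) triples.
# Entry order within each vector: [11, 12, 21, 22].
STRASSEN = [
    ([1, 0, 0, 1], [1, 0, 0, 1], [1, 0, 0, 1]),    # (A11 + A22)(B11 + B22)
    ([0, 0, 1, 1], [1, 0, 0, 0], [0, 0, 1, -1]),   # (A21 + A22) B11
    ([1, 0, 0, 0], [0, 1, 0, -1], [0, 1, 0, 1]),   # A11 (B12 - B22)
    ([0, 0, 0, 1], [-1, 0, 1, 0], [1, 0, 1, 0]),   # A22 (B21 - B11)
    ([1, 1, 0, 0], [0, 0, 0, 1], [-1, 1, 0, 0]),   # (A11 + A12) B22
    ([-1, 0, 1, 0], [1, 1, 0, 0], [0, 0, 0, 1]),   # (A21 - A11)(B11 + B12)
    ([0, 1, 0, -1], [0, 0, 1, 1], [1, 0, 0, 0]),   # (A12 - A22)(B21 + B22)
]

residual = matmul_tensor(2)
for u, v, w in STRASSEN:
    residual = subtract_rank_one(residual, u, v, w)

# A zero residual after r steps certifies an algorithm with r multiplications.
print(len(STRASSEN), is_zero(residual))
```

In this picture, each search step chooses one triple $(u, v, w)$, and reaching the zero tensor after $r$ steps corresponds to an MM algorithm with $r$ scalar multiplications.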
Abstract
OpenTensor is a reproduction of AlphaTensor, which used Deep Reinforcement Learning (DRL) to discover new matrix multiplication algorithms that outperform the state-of-the-art methods. While AlphaTensor provides a promising framework for solving scientific problems, it is hard to reproduce because it relies on many implementation tricks and its source code is not available. In this paper, we clean up the algorithm pipeline, clarify the technical details, and make some improvements to the training process. Computational results show that OpenTensor can successfully find efficient matrix multiplication algorithms.
