OpenTensor: Reproducing Faster Matrix Multiplication Discovering Algorithms

Yiwen Sun; Wenye Li

TL;DR

OpenTensor reproduces and clarifies AlphaTensor's framework for discovering faster matrix multiplication (MM) algorithms, in which a neural policy guided by Monte Carlo Tree Search operates on a tensor-decomposition view of MM. The problem is cast as finding a decomposition $T = \sum_{i=1}^{r} u_i \otimes v_i \otimes w_i$ of the MM tensor $T$ with as few rank-one terms $r$ as possible; the factors are extracted one at a time, guided by a learned policy and rank estimator. The paper introduces practical improvements (Change of Basis, Action Canonicalization, Order Shuffling, and a no-redundancy filter for synthetic demonstrations) to reduce rank overestimation and accelerate convergence. On standard hardware with synthetic data, OpenTensor converges faster and can reproduce and improve upon the original AlphaTensor algorithm, including discovering a $2\times 2$ MM algorithm requiring $8$ multiplications.
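To make the tensor-decomposition view concrete, the following is a minimal sketch, not the OpenTensor implementation. It builds the $2\times 2$ MM tensor and removes rank-one terms $u \otimes v \otimes w$ one at a time until the residual is zero, which is the environment step that the learned policy would be choosing. The factors used here are those of the textbook algorithm (one term per scalar product), not a decomposition found by OpenTensor. The function names and the row-major flattening of matrix entries are assumptions made for illustration.

```python
from itertools import product


def matmul_tensor(n):
    """Return the n x n matrix multiplication tensor as a sparse dict.

    Assumed convention: index a flattens entry (i, j) of A, b flattens
    (j, k) of B, and c flattens (i, k) of C = A @ B, each as row * n + col.
    """
    T = {}
    for i, j, k in product(range(n), repeat=3):
        T[(i * n + j, j * n + k, i * n + k)] = 1
    return T


def subtract_rank_one(T, u, v, w):
    """Return T - u (x) v (x) w, keeping only nonzero entries."""
    out = dict(T)
    for a, b, c in product(range(len(u)), range(len(v)), range(len(w))):
        val = u[a] * v[b] * w[c]
        if val:
            new = out.get((a, b, c), 0) - val
            if new:
                out[(a, b, c)] = new
            else:
                out.pop((a, b, c), None)
    return out


def one_hot(size, idx):
    """Return a list of length `size` with a single 1 at position `idx`."""
    return [1 if p == idx else 0 for p in range(size)]


def textbook_factors(n):
    """Factors of the textbook algorithm: one triple per product A[i][j] * B[j][k]."""
    s = n * n
    return [
        (one_hot(s, i * n + j), one_hot(s, j * n + k), one_hot(s, i * n + k))
        for i, j, k in product(range(n), repeat=3)
    ]


# Extract factors one at a time; an empty residual means the triples
# form a valid decomposition, i.e. a correct multiplication algorithm.
residual = matmul_tensor(2)
steps = 0
for u, v, w in textbook_factors(2):
    residual = subtract_rank_one(residual, u, v, w)
    steps += 1

print(len(residual), steps)  # prints "0 8"
```

Each triple corresponds to one scalar multiplication, so the number of extraction steps needed to reach the zero tensor is the multiplication count of the resulting algorithm.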

Abstract

OpenTensor is a reproduction of AlphaTensor, which used Deep Reinforcement Learning (DRL) to discover a new algorithm that outperforms the state-of-the-art methods for matrix multiplication. While AlphaTensor provides a promising framework for solving scientific problems, it is hard to reproduce because it relies on a large number of tricks and its source code is not available. In this paper, we clean up the algorithm pipeline, clarify the technical details, and make some improvements to the training process. Computational results show that OpenTensor can successfully find efficient matrix multiplication algorithms.


Paper Structure (12 sections, 1 theorem, 6 equations, 1 figure)


Introduction
Algorithm and Training Details
  Problem Formulation and Algorithm Pipeline
    Problem Formulation
    Algorithm
    Training
  Training Details
    Change of Basis
    Action Canonicalization
    Order Shuffling
    Synthetic Demonstrations
Results and Summary

Key Result

Theorem 2.1

Assume $\mathcal{M}$ is the full set of vectors $m \in \mathbb{R}^S$ used to generate tensors, and let the index set be $I = \{u, v, w\}$. For an arbitrary $i \in I$, if the collection of mode-$i$ factors contains repeated elements, then the decomposition of $T$ has redundancy.
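One way to read the theorem as a filter for synthetic demonstrations is sketched below, assuming demonstrations are generated by sampling factor triples $(u, v, w)$ at random. A sample is rejected whenever the same vector appears twice in any one mode. The entry values $\{-1, 0, 1\}$, the function names, and the rejection-sampling loop are assumptions for illustration; the source does not specify them. Because the theorem gives only a necessary condition for no redundancy, a sample that passes this check is not guaranteed to be redundancy-free.

```python
import random


def has_repeated_factor(factors):
    """Return True if some mode (u, v, or w) contains the same vector twice."""
    for mode in range(3):
        seen = set()
        for triple in factors:
            key = tuple(triple[mode])
            if key in seen:
                return True
            seen.add(key)
    return False


def sample_demonstration(rank, size, values=(-1, 0, 1), rng=None, max_tries=1000):
    """Sample `rank` factor triples, rejecting draws with a repeated vector.

    `values` is an assumed entry alphabet, not one taken from the paper.
    """
    rng = rng or random.Random()
    for _ in range(max_tries):
        factors = [
            tuple(
                tuple(rng.choice(values) for _ in range(size))
                for _ in range(3)
            )
            for _ in range(rank)
        ]
        if not has_repeated_factor(factors):
            return factors
    raise RuntimeError("no admissible sample found within max_tries")


# Example: three factor triples with vectors of length 4.
demo = sample_demonstration(rank=3, size=4, rng=random.Random(0))
```

Summing the outer products of the returned triples would give the target tensor for that demonstration, with `rank` as its rank label.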

Figures (1)

Figure 1: Results of OpenTensor and Decomposition Solution. Left: the matrix multiplication algorithm found by OpenTensor (a $2 \times 2$ matrix product with 8 multiplications). Right: the loss of three different methods. The original algorithm (red) uses action canonicalization, change of basis, and the original synthetic demonstrations technique. OpenTensor with augmentation (green) adds order shuffling, and the final OpenTensor (purple) uses all of the techniques described in the paper.

Theorems & Definitions (1)

Theorem 2.1: Necessary Condition of No Redundancy
