A Reinforcement Learning Environment for Automatic Code Optimization in the MLIR Compiler

Mohammed Tirichine; Nassim Ameur; Nazim Bendib; Iheb Nassim Aouadj; Bouchama Djad; Rafik Bouloudene; Riyadh Baghdadi

TL;DR

The paper addresses automatic optimization of MLIR code by exposing loop-nest transformations through a reinforcement learning (RL) environment. It introduces MLIR RL, which uses a multi-discrete action space and a level-pointers mechanism to shrink the large search space of loop interchange. The agent is trained with PPO in an actor-critic framework and targets the MLIR Linalg dialect. The authors evaluate MLIR RL on deep-learning (DL) and Lattice Quantum Chromodynamics (LQCD) workloads, comparing against PyTorch, PyTorch JIT, Halide RL, and the Halide autoscheduler, and report favorable results in select domains, along with an ablation study that informs the design choices. They also provide a public artifact that enables reproduction and further RL-driven exploration of loop-nest optimization within MLIR, positioning it as research infrastructure for ML-driven compiler optimization. The work contributes a specialized, MLIR-integrated RL environment that may generalize across domains and dialects, potentially reducing manual tuning and opening up new optimization strategies.

Abstract

Code optimization is a crucial task that aims to enhance code performance. However, this process is often tedious and complex, highlighting the necessity for automatic code optimization techniques. Reinforcement Learning (RL) has emerged as a promising approach for tackling such complex optimization problems. In this project, we introduce MLIR RL, an RL environment for the MLIR compiler, dedicated to facilitating MLIR compiler research and enabling automatic code optimization. We propose a multi-discrete formulation of the action space where the action space is the Cartesian product of simpler action subspaces. We also propose a new method, called level pointers, to reduce the size of the action space related to the loop interchange transformation. This enables more efficient and effective learning of the policy. To demonstrate the effectiveness of MLIR RL, we train an RL agent to optimize MLIR Linalg code, targeting CPU. The code is generated from two domain-specific frameworks: deep-learning models generated from PyTorch, and LQCD (Lattice Quantum Chromodynamics) code generated from an LQCD compiler. The result of this work is a research environment that allows the community to experiment with novel ideas in RL-driven loop-nest optimization.
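To make the multi-discrete formulation concrete, the following is a minimal sketch of how the size of a flat action space compares with a Cartesian product of smaller subspaces. The subspace names and sizes are hypothetical and are not taken from the paper; the sketch only illustrates the counting argument, not the authors' implementation. It also shows how quickly the number of loop orderings grows, which is the search space that an interchange transformation has to cover.

```python
from math import factorial, prod

# Hypothetical sub-action sizes, chosen for illustration only
# (these are assumptions, not values from the paper).
SUBSPACES = {
    "transformation": 4,  # which transformation to apply
    "param_loop_0": 8,    # a parameter choice for loop 0
    "param_loop_1": 8,    # a parameter choice for loop 1
    "param_loop_2": 8,    # a parameter choice for loop 2
}


def flat_size(subspaces):
    """Number of actions if every combination is a separate flat action."""
    return prod(subspaces.values())


def multi_discrete_outputs(subspaces):
    """Number of policy outputs if each subspace has its own distribution."""
    return sum(subspaces.values())


def decode_flat(index, subspaces):
    """Map a flat action index to one choice per subspace (mixed radix)."""
    choices = {}
    # The last subspace is the least significant digit.
    for name, size in reversed(list(subspaces.items())):
        index, choices[name] = divmod(index, size)
    # Restore the original subspace order for readability.
    return {name: choices[name] for name in subspaces}


def interchange_permutations(n_loops):
    """Number of distinct orderings of a loop nest with n_loops loops."""
    return factorial(n_loops)


if __name__ == "__main__":
    print("flat actions:", flat_size(SUBSPACES))
    print("multi-discrete outputs:", multi_discrete_outputs(SUBSPACES))
    print("last flat action decodes to:", decode_flat(flat_size(SUBSPACES) - 1, SUBSPACES))
    print("orderings of 6 loops:", interchange_permutations(6))
```

Under these assumed sizes, a flat policy head would need one output per joint action, while a multi-discrete head needs only the sum of the subspace sizes. The factorial growth of loop orderings is the reason a naive enumeration of interchange actions becomes impractical for deeper loop nests.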

Paper Structure (57 sections, 7 figures, 5 tables)

Introduction
Background and Related Work
MLIR
Machine Learning for Compilers
Reinforcement Learning for Compilers
MLIR-based Compilers for Machine Learning
IREE
ONNX-MLIR
XLA
Overview of MLIR RL
Reinforcement Learning Environment
Action Space
Multi-Discrete Formulation
Action Mask
States and Observations
...and 42 more sections

Figures (7)

Figure 1: The pipeline of extracting the features from a Linalg operation and building the representation vector.
Figure 3: The RL agent’s policy network architecture consists of a backbone that processes the input representation vector into a feature vector that is then passed to the subnetworks to predict the transformation to apply and its parameters.
Figure 4: The detailed architecture of the networks used in the policy network: a) The backbone; b) The transformation selection network; c) Tiled transformations network. The interchange network varies in size depending on the interchange method, but it is always a dense layer outputting one distribution.
Figure 5: Speedups over MLIR baseline for each method across neural network operators, comparing our system (MLIR RL), Halide RL, PyTorch, and the PyTorch compiler.
Figure 6: Comparison between the speedups achieved by training with a Flat Action Space and a Multi-Discrete Action Space.
...and 2 more figures
