Reinforcement Learning for Graph Coloring: Understanding the Power and Limits of Non-Label Invariant Representations
Chase Cummins, Richard Veras
TL;DR
This work investigates reinforcement learning for graph coloring by casting register allocation as a $k$-coloring problem and evaluating model-free methods (DQN and PPO) within a GraphColoring Gym environment. It introduces a progression of reward designs, culminating in a PPO-based approach that can color small graphs. It also reveals a critical sensitivity to graph labeling: representations that are not label invariant hinder consistent performance. While PPO can achieve optimal or near-optimal colorings on small graphs, performance degrades with graph size and under relabeling, highlighting the need for invariant graph representations such as those offered by Graph Neural Networks. The findings have practical implications for compiler optimization pipelines and suggest a path for improving ML-based graph coloring through invariant representations and scalable architectures.
Abstract
Register allocation is one of the most important problems for modern compilers. With a practically unlimited number of user variables and a small number of CPU registers, assigning variables to registers without conflicts is a complex task. This work casts the register allocation problem as a graph coloring problem. Using PyTorch and OpenAI Gymnasium environments, we show that a Proximal Policy Optimization (PPO) model can learn to solve the graph coloring problem. We also show that the labeling of a graph is critical to the performance of the model. To do so, we take the matrix representation of a graph and permute it, producing relabelings of the same graph, and then test the model on each permutation. The model is not effective when given a relabeling of a graph it can otherwise color. Our main contribution lies in showing that machine learning models need graph representations that are invariant to label reordering in order to achieve consistent performance.
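The relabeling experiment described above can be illustrated with a minimal sketch. It assumes the graph is given as a dense 0/1 adjacency matrix and that a coloring is a list of color indices, one per vertex. The helper names `permute_adjacency` and `is_valid_coloring`, the 4-vertex path graph, and the chosen permutation are all hypothetical and are not taken from the paper's environment. The sketch shows that a relabeling changes the matrix a model would observe, and that a coloring tied to the original labels can become invalid on the relabeled graph even though the graph itself is unchanged.

```python
def permute_adjacency(adj, perm):
    """Relabel vertices so that new vertex i corresponds to old vertex perm[i]."""
    n = len(adj)
    return [[adj[perm[i]][perm[j]] for j in range(n)] for i in range(n)]


def is_valid_coloring(adj, colors):
    """Return True if no edge joins two vertices of the same color."""
    n = len(adj)
    return all(
        colors[i] != colors[j]
        for i in range(n)
        for j in range(i + 1, n)
        if adj[i][j]
    )


# Hypothetical interference graph: the path 0 - 1 - 2 - 3.
adj = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
colors = [0, 1, 0, 1]  # a valid 2-coloring under the original labels

perm = [2, 0, 3, 1]  # an arbitrary relabeling chosen for illustration
permuted = permute_adjacency(adj, perm)

# The same coloring, carried through the relabeling, is still valid.
relabeled_colors = [colors[perm[i]] for i in range(len(perm))]

print(permuted != adj)                                 # the observed matrix changed
print(is_valid_coloring(permuted, relabeled_colors))   # the graph is still 2-colorable
print(is_valid_coloring(permuted, colors))             # label-tied output no longer works
```

A model that consumes the flattened matrix sees `adj` and `permuted` as different inputs, although they describe the same graph. This is the sensitivity that motivates label-invariant representations.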
