Picasso: Memory-Efficient Graph Coloring Using Palettes With Applications in Quantum Computing
S M Ferdous, Reece Neff, Bo Peng, Salman Shuvo, Marco Minutoli, Sayak Mukherjee, Karol Kowalski, Michela Becchi, Mahantesh Halappanavar
TL;DR
Picasso presents a memory-efficient, palette-based graph coloring approach tailored for large, dense graphs arising from Pauli-string representations in quantum computing. By iteratively constructing a conflict graph on-the-fly and coloring via dynamic list-coloring, Picasso achieves sublinear space usage under practical assumptions and leverages GPU acceleration to scale to inputs with millions of vertices and trillions of edges. The authors introduce a machine-learning predictor to choose palette sizes and list parameters, balancing coloring quality and resource consumption, and demonstrate substantial memory savings and competitive performance against state-of-the-art methods. This work enables scalable clique-partitioning for unitary representations in quantum simulations, with potential impact on qubit tapering and broader memory-constrained graph-coloring tasks.
Abstract
A coloring of a graph is an assignment of colors to vertices such that no two neighboring vertices have the same color. The need for memory-efficient coloring algorithms is motivated by their application in computing clique partitions of graphs arising in quantum computations where the objective is to map a large set of Pauli strings into a compact set of unitaries. We present Picasso, a randomized memory-efficient iterative parallel graph coloring algorithm with theoretical sublinear space guarantees under practical assumptions. The parameters of our algorithm provide a trade-off between coloring quality and resource consumption. To assist the user, we also propose a machine learning model to predict the coloring algorithm's parameters considering these trade-offs. We provide a sequential and a parallel implementation of the proposed algorithm. We perform an experimental evaluation on a 64-core AMD CPU equipped with 512 GB of memory and an Nvidia A100 GPU with 40 GB of memory. For a small dataset where existing coloring algorithms can be executed within the 512 GB memory budget, we show up to 68x memory savings. On massive datasets we demonstrate that GPU-accelerated Picasso can process inputs with 49.5x more Pauli strings (vertex set in our graph) and 2,478x more edges than state-of-the-art parallel approaches.
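The iterative palette idea summarized above can be illustrated with a minimal sequential sketch. It is not the authors' implementation: the function names, the plain greedy list-coloring step, and the fresh palette per round are illustrative assumptions. The sketch also assumes that Pauli strings grouped into one unitary must pairwise anticommute, so commuting pairs are treated as edges of the graph being colored. Only the conflict graph (edges whose endpoints share a candidate color) is held in memory; the full graph is queried through an edge oracle.

```python
import random
from itertools import combinations


def anticommute(p, q):
    """Two Pauli strings anticommute iff they differ on an odd number of
    positions where both are non-identity."""
    clashes = sum(a != "I" and b != "I" and a != b for a, b in zip(p, q))
    return clashes % 2 == 1


def palette_coloring(n, is_edge, palette_size, list_size, seed=0):
    """Color vertices 0..n-1 using an edge oracle is_edge(u, v).

    palette_size and list_size are hypothetical names for the two
    parameters that trade coloring quality against memory.
    """
    if palette_size < 1 or list_size < 1:
        raise ValueError("palette_size and list_size must be positive")
    rng = random.Random(seed)
    k = min(list_size, palette_size)
    color = {}
    uncolored = list(range(n))
    base = 0  # each round draws from a fresh, disjoint range of colors
    while uncolored:
        palette = range(base, base + palette_size)
        lists = {v: set(rng.sample(palette, k)) for v in uncolored}
        # Conflict graph: keep an edge only if the candidate lists intersect.
        conflict = {v: [] for v in uncolored}
        for u, v in combinations(uncolored, 2):
            if lists[u] & lists[v] and is_edge(u, v):
                conflict[u].append(v)
                conflict[v].append(u)
        # Greedy list coloring (a simplification); stuck vertices are deferred.
        this_round = {}
        deferred = []
        for v in uncolored:
            taken = {this_round[w] for w in conflict[v] if w in this_round}
            free = lists[v] - taken
            if free:
                this_round[v] = min(free)
            else:
                deferred.append(v)
        color.update(this_round)
        uncolored = deferred
        base += palette_size
    return color


# Usage: commuting pairs are edges, so each color class pairwise anticommutes.
strings = ["XXI", "YYI", "ZZI", "XIZ", "IZX", "ZXY"]
coloring = palette_coloring(
    len(strings),
    lambda u, v: not anticommute(strings[u], strings[v]),
    palette_size=3,
    list_size=2,
)
groups = {}
for v, c in coloring.items():
    groups.setdefault(c, []).append(strings[v])
print(groups)
```

In this sketch a larger palette or shorter lists makes the conflict graph sparser (less memory) but tends to use more colors, which mirrors the quality versus resource trade-off the abstract describes. The pairwise scan still takes quadratic time; only the stored edges are reduced.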
