Picasso: Memory-Efficient Graph Coloring Using Palettes With Applications in Quantum Computing

S M Ferdous; Reece Neff; Bo Peng; Salman Shuvo; Marco Minutoli; Sayak Mukherjee; Karol Kowalski; Michela Becchi; Mahantesh Halappanavar

TL;DR

Picasso presents a memory-efficient, palette-based graph coloring approach tailored for large, dense graphs arising from Pauli-string representations in quantum computing. By iteratively constructing a conflict graph on-the-fly and coloring via dynamic list-coloring, Picasso achieves sublinear space usage under practical assumptions and leverages GPU acceleration to scale to inputs with millions of vertices and trillions of edges. The authors introduce a machine-learning predictor to choose palette sizes and list parameters, balancing coloring quality and resource consumption, and demonstrate substantial memory savings and competitive performance against state-of-the-art methods. This work enables scalable clique-partitioning for unitary representations in quantum simulations, with potential impact on qubit tapering and broader memory-constrained graph-coloring tasks.
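The palette idea described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names `palette_color` and `iterative_color`, the parameters `palette_size` and `list_size`, the greedy vertex order, and the `has_edge` callback are all hypothetical choices made for the example. Each vertex samples a short list of candidate colors, only edges whose endpoints share a candidate are kept as the conflict graph, and vertices left uncolored are retried with a fresh palette.

```python
import random

def palette_color(n, has_edge, palette_size, list_size, seed=0):
    """One round of palette-based list coloring (illustrative sketch).

    Assumes 1 <= list_size <= palette_size.
    """
    rng = random.Random(seed)
    # Each vertex samples a small list of candidate colors from the palette.
    lists = [set(rng.sample(range(palette_size), list_size)) for _ in range(n)]
    # Conflict graph: keep an edge only if the endpoints share a candidate color.
    conflict = {v: set() for v in range(n)}
    for u in range(n):
        for v in range(u + 1, n):
            if has_edge(u, v) and lists[u] & lists[v]:
                conflict[u].add(v)
                conflict[v].add(u)
    # Greedy list coloring of the conflict graph in vertex order.
    color, uncolored = {}, []
    for v in range(n):
        taken = {color[u] for u in conflict[v] if u in color}
        free = lists[v] - taken
        if free:
            color[v] = min(free)
        else:
            uncolored.append(v)
    return color, uncolored

def iterative_color(n, has_edge, palette_size, list_size, seed=0):
    """Repeat rounds on the uncolored vertices, using a fresh palette each time."""
    final, remaining, offset, rnd = {}, list(range(n)), 0, 0
    while remaining:
        idx = remaining
        sub_edge = lambda a, b: has_edge(idx[a], idx[b])
        color, unc = palette_color(len(idx), sub_edge, palette_size,
                                   list_size, seed + rnd)
        for a, c in color.items():
            final[idx[a]] = c + offset  # shift so rounds never share colors
        remaining = [idx[a] for a in unc]
        offset += palette_size
        rnd += 1
    return final
```

Only the conflict edges are stored, which is where the memory saving in this sketch comes from: two adjacent vertices with disjoint lists can never receive the same color, so their edge does not need to be kept. In this sketch, larger lists tend to leave fewer vertices uncolored per round at the cost of more conflict edges.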

Abstract

A coloring of a graph is an assignment of colors to vertices such that no two neighboring vertices have the same color. The need for memory-efficient coloring algorithms is motivated by their application in computing clique partitions of graphs arising in quantum computations where the objective is to map a large set of Pauli strings into a compact set of unitaries. We present Picasso, a randomized memory-efficient iterative parallel graph coloring algorithm with theoretical sublinear space guarantees under practical assumptions. The parameters of our algorithm provide a trade-off between coloring quality and resource consumption. To assist the user, we also propose a machine learning model to predict the coloring algorithm's parameters considering these trade-offs. We provide a sequential and a parallel implementation of the proposed algorithm. We perform an experimental evaluation on a 64-core AMD CPU equipped with 512 GB of memory and an NVIDIA A100 GPU with 40 GB of memory. For a small dataset where existing coloring algorithms can be executed within the 512 GB memory budget, we show up to 68x memory savings. On massive datasets we demonstrate that GPU-accelerated Picasso can process inputs with 49.5x more Pauli strings (vertex set in our graph) and 2,478x more edges than state-of-the-art parallel approaches.
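The link between clique partitioning and coloring mentioned in the abstract rests on a standard fact: a partition of a graph into cliques is the same thing as a proper coloring of its complement. The sketch below illustrates this on a toy graph. The function name `clique_partition`, the `compatible` callback, and the first-fit greedy strategy are assumptions for illustration only; which pairs of Pauli strings count as compatible is left abstract here.

```python
def clique_partition(n, compatible):
    """Greedy clique partition obtained by coloring the complement graph.

    compatible(u, v) is True when u and v may share a group, i.e. when
    they are adjacent in the graph whose cliques we want.
    """
    color = []
    for v in range(n):
        # Colors held by earlier vertices that are NOT compatible with v,
        # i.e. by v's neighbors in the complement graph.
        blocked = {color[u] for u in range(v) if not compatible(u, v)}
        c = 0
        while c in blocked:
            c += 1
        color.append(c)
    # Each color class is a clique of the compatibility graph.
    groups = {}
    for v, c in enumerate(color):
        groups.setdefault(c, []).append(v)
    return list(groups.values())

# Toy compatibility graph: a triangle 0-1-2 plus a pendant edge 2-3.
edges = {(0, 1), (0, 2), (1, 2), (2, 3)}
compat = lambda u, v: (min(u, v), max(u, v)) in edges
print(clique_partition(4, compat))  # prints [[0, 1, 2], [3]]
```

Fewer colors in the complement graph means fewer groups, which is why coloring quality matters for the application described above.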

Paper Structure (21 sections, 2 theorems, 7 equations, 5 figures, 5 tables, 3 algorithms)

Introduction
Problem Formulation
Quantum computing problem
Connection to Clique partitioning and Graph coloring
Related Work
Our Algorithm
Conflict Graph Construction
Coloring the Conflict Graph
Analysis of the Algorithm
Parallel GPU implementation
Prediction of Palette Size
Experimental Evaluation
Quality and Memory Comparisons
Small Dataset
Medium and Large Dataset
...and 6 more sections

Key Result

Lemma 1

Let $X_1,\ldots, X_m$ be $m$ independent binary random variables such that $\Pr(X_i = 1) = p_i$. Let $X = \sum_{i=1}^m X_i$ and $\mu = \mathbb{E}[X]$. Then the following holds for $0<\gamma\leq1$:
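
The inequality itself is not shown in this listing. As a hedged reference point only, a multiplicative form of the Chernoff bound commonly given in textbooks for exactly these hypotheses is the following; whether the lemma uses this variant and these constants is an assumption, not a quotation:

$$
\Pr\big(X \geq (1+\gamma)\mu\big) \leq \exp\!\left(-\frac{\mu\gamma^2}{3}\right),
\qquad
\Pr\big(X \leq (1-\gamma)\mu\big) \leq \exp\!\left(-\frac{\mu\gamma^2}{2}\right).
$$

Combining the two tails by a union bound gives $\Pr\big(|X-\mu| \geq \gamma\mu\big) \leq 2\exp(-\mu\gamma^2/3)$.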

Figures (5)

Figure 1: An overview of the mapping problem, solved as a clique partition using graph coloring of the conflict graph, for the H2 molecule with the sto-3g basis function.
Figure 2: Input dataset scaling on the iterative GPU implementation up to 2 million vertices. $\alpha=2$, $\mathcal{P}=12.5\%$. The black dashed line in the top plot denotes the maximum conflicting edge ratio supported by a 40GB NVIDIA A100.
Figure 3: Input dataset scaling on the iterative GPU implementation up to 2 million vertices. $\alpha=2$, $\mathcal{P}=12.5\%$.
Figure 4: Performance comparison of Picasso and Kokkos-EB on the small datasets relative to ECL-GC-R execution time. For Picasso runs, $\mathcal{P}$ is varied and $\alpha=4.5$.
Figure 5: Impact of $\mathcal{P}$ and $\alpha$ using H4_2D_6311g on final colors, number of conflicting edges and runtime for different inputs (lighter color is better).

Theorems & Definitions (3)

Definition 1: Clique partitioning
Lemma 1: Chernoff-Hoeffding bound [Mitzenmacher-book-05]
Lemma 2
