Picasso: Memory-Efficient Graph Coloring Using Palettes With Applications in Quantum Computing

S M Ferdous, Reece Neff, Bo Peng, Salman Shuvo, Marco Minutoli, Sayak Mukherjee, Karol Kowalski, Michela Becchi, Mahantesh Halappanavar

TL;DR

Picasso presents a memory-efficient, palette-based graph coloring approach tailored for large, dense graphs arising from Pauli-string representations in quantum computing. By iteratively constructing a conflict graph on the fly and coloring it via dynamic list-coloring, Picasso achieves sublinear space usage under practical assumptions and leverages GPU acceleration to scale to inputs with millions of vertices and trillions of edges. The authors introduce a machine-learning predictor to choose palette sizes and list parameters, balancing coloring quality and resource consumption, and demonstrate substantial memory savings and competitive performance against state-of-the-art methods. This work enables scalable clique-partitioning for unitary representations in quantum simulations, with potential impact on qubit tapering and broader memory-constrained graph-coloring tasks.

Abstract

A coloring of a graph is an assignment of colors to vertices such that no two neighboring vertices have the same color. The need for memory-efficient coloring algorithms is motivated by their application in computing clique partitions of graphs arising in quantum computations, where the objective is to map a large set of Pauli strings into a compact set of unitaries. We present Picasso, a randomized memory-efficient iterative parallel graph coloring algorithm with theoretical sublinear space guarantees under practical assumptions. The parameters of our algorithm provide a trade-off between coloring quality and resource consumption. To assist the user, we also propose a machine learning model to predict the coloring algorithm's parameters considering these trade-offs. We provide a sequential and a parallel implementation of the proposed algorithm. We perform an experimental evaluation on a 64-core AMD CPU equipped with 512 GB of memory and an NVIDIA A100 GPU with 40 GB of memory. For a small dataset where existing coloring algorithms can be executed within the 512 GB memory budget, we show up to 68x memory savings. On massive datasets, we demonstrate that GPU-accelerated Picasso can process inputs with 49.5x more Pauli strings (the vertex set in our graph) and 2,478x more edges than state-of-the-art parallel approaches.
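
The palette idea described above can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `palette_color`, the adjacency callback `is_edge`, and the parameters `palette_size` and `list_size` are hypothetical, and a simple greedy pass stands in for the list-coloring step. Each iteration gives every uncolored vertex a random list of candidate colors, keeps only the edges whose endpoints share a candidate (the conflict graph), colors what it can, and retries the rest with a fresh palette.

```python
import random
from itertools import combinations


def palette_color(n, is_edge, palette_size, list_size, seed=0):
    """Color vertices 0..n-1; is_edge(u, v) answers adjacency on demand.

    Hypothetical sketch: the full graph is never stored, only the
    conflict graph of the current iteration.
    """
    rng = random.Random(seed)
    color = {}
    uncolored = list(range(n))
    base = 0  # first color id of the current iteration's palette
    k = max(1, min(list_size, palette_size))
    while uncolored:
        palette = range(base, base + palette_size)
        # Each uncolored vertex draws a random list of candidate colors.
        lists = {v: set(rng.sample(palette, k)) for v in uncolored}
        # Conflict graph: edges whose endpoints share a candidate color.
        conflict = {v: [] for v in uncolored}
        for u, v in combinations(uncolored, 2):
            if lists[u] & lists[v] and is_edge(u, v):
                conflict[u].append(v)
                conflict[v].append(u)
        # Greedy list coloring (an assumption; other strategies are possible).
        remaining = []
        for v in uncolored:
            taken = {color[w] for w in conflict[v] if w in color}
            free = lists[v] - taken
            if free:
                color[v] = min(free)
            else:
                remaining.append(v)  # retry with a fresh palette
        uncolored = remaining
        base += palette_size  # palettes of different iterations are disjoint
    return color


if __name__ == "__main__":
    # Toy input: vertices u, v are adjacent when (u + v) is not divisible by 3.
    coloring = palette_color(12, lambda u, v: (u + v) % 3 != 0,
                             palette_size=4, list_size=2)
    print(len(set(coloring.values())), "colors used")
```

Smaller palettes and lists tend to use fewer colors per iteration but leave more vertices for later rounds, which is one way to picture the quality versus resource trade-off the abstract mentions.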

Paper Structure

This paper contains 21 sections, 2 theorems, 7 equations, 5 figures, 5 tables, 3 algorithms.

Key Result

Lemma 1

Let $X_1,\ldots, X_m$ be $m$ independent binary random variables such that $\Pr(X_i = 1) = p_i$. Let $X = \sum_{i=1}^m X_i$ and $\mu = \mathbb{E}[X]$. Then the following holds for $0<\gamma\leq1$:
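
For reference, a standard two-sided form of the Chernoff-Hoeffding bound in this setting is given below. It is the textbook statement for sums of independent 0-1 variables, and its constants may differ from those in the paper's own statement of the lemma.

$$\Pr\bigl(|X - \mu| \geq \gamma\mu\bigr) \leq 2\exp\!\left(-\frac{\gamma^{2}\mu}{3}\right)$$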

Figures (5)

  • Figure 1: An overview of the mapping problem solved as clique partition using graph coloring of the conflict graph for the H2 molecule with the sto-3g basis set.
  • Figure 2: Input dataset scaling on the iterative GPU implementation up to 2 million vertices. $\alpha=2$, $\mathcal{P}=12.5\%$. The black dashed line in the top plot denotes the maximum conflicting edge ratio supported by a 40GB NVIDIA A100.
  • Figure 3: Input dataset scaling on the iterative GPU implementation up to 2 million vertices. $\alpha=2$, $\mathcal{P}=12.5\%$.
  • Figure 4: Performance comparison of Picasso and Kokkos-EB on the small datasets relative to ECL-GC-R execution time. For Picasso runs, $\mathcal{P}$ is varied and $\alpha=4.5$.
  • Figure 5: Impact of $\mathcal{P}$ and $\alpha$ using H4_2D_6311g on final colors, number of conflicting edges and runtime for different inputs (lighter color is better).

Theorems & Definitions (3)

  • Definition 1: Clique partitioning
  • Lemma 1: Chernoff-Hoeffding bound [Mitzenmacher-book-05]
  • Lemma 2
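
The link between clique partitioning (Definition 1) and coloring can be sketched with the standard reduction: vertices that receive the same color in a proper coloring of the complement graph are pairwise adjacent in the original graph, so each color class is a clique. The snippet below is a minimal illustration of that general fact, not the paper's construction; the function name `clique_partition` and the first-fit greedy strategy are assumptions.

```python
def clique_partition(n, edges):
    """Partition vertices 0..n-1 of a graph into cliques.

    Hypothetical sketch: greedily color the complement graph, then
    group vertices by color.
    """
    adj = [set() for _ in range(n)]
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)

    color = [None] * n
    for v in range(n):
        # Complement neighbors of v are the vertices NOT adjacent to v.
        used = {color[w] for w in range(v) if w not in adj[v]}
        c = 0
        while c in used:  # first-fit: smallest color unused by those vertices
            c += 1
        color[v] = c

    groups = {}
    for v, c in enumerate(color):
        groups.setdefault(c, []).append(v)
    return list(groups.values())


if __name__ == "__main__":
    # A triangle {0, 1, 2} plus a separate edge {3, 4}.
    print(clique_partition(5, [(0, 1), (1, 2), (0, 2), (3, 4)]))
```

Fewer colors in the complement coloring mean fewer cliques in the partition, which is why coloring quality matters for the compactness of the resulting grouping.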