Probabilistic Sampling of Balanced K-Means using Adiabatic Quantum Computing

Jan-Nico Zaech, Martin Danelljan, Tolga Birdal, Luc Van Gool

TL;DR

The paper addresses calibrated uncertainty in balanced $K$-means clustering by using adiabatic quantum computing (AQC) to sample high-probability binary cluster assignments from an energy-based model. It formulates the clustering task as a QUBO with one-hot encoding, enforces balance through Lagrangian penalties, and relies on D-Wave hardware to obtain samples that approximate a Boltzmann distribution. Posterior probabilities are then recomputed from the measured solutions and calibrated, which identifies ambiguous points and alternative clusterings. The probabilistic clustering framework includes a Gaussian mixture data model, posterior recomputation from samples, and coresets for scalability. Experiments on synthetic data, IRIS, and high-dimensional image features show well-calibrated uncertainties and competitive clustering performance. The work highlights the potential of quantum sampling to enrich clustering with informative uncertainty measures and alternative solutions, while acknowledging current hardware constraints and the need for further refinement to scale and optimize problem formulations for real-world applications.
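To make the QUBO formulation more concrete, the following is a minimal sketch of one common way to encode balanced $K$-means as a QUBO. It is not taken from the paper: the function name `balanced_kmeans_qubo`, the pairwise squared-distance objective, the quadratic form of the one-hot and balance penalties, and the penalty weights `lam_onehot` and `lam_balance` are all assumptions chosen for illustration. The tiny problem is solved by exhaustive enumeration rather than on quantum hardware.

```python
import itertools
import numpy as np

def balanced_kmeans_qubo(points, k, lam_onehot, lam_balance):
    """Illustrative QUBO (an assumption, not the paper's exact formulation).

    Returns (Q, offset) with energy(z) = z @ Q @ z + offset for binary z,
    where z[i * k + c] = 1 means point i is assigned to cluster c.
    """
    n = len(points)
    size = n / k  # target number of points per cluster
    # Pairwise squared Euclidean distances between all points.
    d = ((points[:, None, :] - points[None, :, :]) ** 2).sum(-1)
    Q = np.zeros((n * k, n * k))
    for c in range(k):
        idx = np.arange(n) * k + c
        # Objective: sum of pairwise distances inside cluster c.
        Q[np.ix_(idx, idx)] += d
        # Balance penalty: lam_balance * (sum_i z_ic - size)^2.
        Q[np.ix_(idx, idx)] += lam_balance
        Q[idx, idx] += -2.0 * lam_balance * size
    for i in range(n):
        idx = i * k + np.arange(k)
        # One-hot penalty: lam_onehot * (sum_c z_ic - 1)^2.
        Q[np.ix_(idx, idx)] += lam_onehot
        Q[idx, idx] += -2.0 * lam_onehot
    # Constant terms of the two squared penalties.
    offset = lam_balance * k * size**2 + lam_onehot * n
    return Q, offset

# Toy problem: two well-separated pairs of points, two clusters.
points = np.array([[0.0, 0.0], [0.0, 1.0], [5.0, 0.0], [5.0, 1.0]])
k = 2
Q, offset = balanced_kmeans_qubo(points, k, lam_onehot=50.0, lam_balance=50.0)

# Exhaustive search over all 2^(n*k) bit strings (feasible only for tiny problems).
candidates = np.array(list(itertools.product([0, 1], repeat=len(points) * k)))
energies = np.einsum("si,ij,sj->s", candidates, Q, candidates) + offset
best = candidates[np.argmin(energies)]
best_energy = energies.min()
labels = best.reshape(len(points), k).argmax(axis=1)
```

In this sketch the penalties vanish for every assignment that is both one-hot and balanced, so the minimum-energy bit string is the balanced clustering with the smallest within-cluster distances. The penalty weights must be large enough that no infeasible bit string has lower energy; the value 50 used here is an arbitrary choice that works for this toy data.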

Abstract

Adiabatic quantum computing (AQC) is a promising approach for discrete and often NP-hard optimization problems. Current AQCs make it possible to implement problems of research interest, which has sparked the development of quantum representations for many computer vision tasks. Despite requiring multiple measurements from the noisy AQC, current approaches only utilize the best measurement, discarding the information contained in the remaining ones. In this work, we explore the potential of using this information for probabilistic balanced k-means clustering. Instead of discarding non-optimal solutions, we propose to use them to compute calibrated posterior probabilities at little additional compute cost. This allows us to identify ambiguous solutions and data points, which we demonstrate on a D-Wave AQC on synthetic tasks and real visual data.
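The idea of reusing non-optimal measurements can be illustrated with a small sketch. It assumes that the probability of each distinct measured solution is recomputed from its energy with a Boltzmann weight and normalized over the set of solutions that were actually observed. The function name `posterior_from_samples`, the `temperature` parameter, and the toy energy function are hypothetical; the paper's own reparametrization and calibration procedure is not reproduced here.

```python
import numpy as np

def posterior_from_samples(samples, energy_fn, temperature=1.0):
    """Recompute probabilities for the distinct solutions among the samples.

    Assumption: p(z) is proportional to exp(-E(z) / temperature), normalized
    over the unique observed solutions only (not over all bit strings).
    """
    unique = np.unique(np.asarray(samples), axis=0)  # drop repeated measurements
    energies = np.array([energy_fn(z) for z in unique], dtype=float)
    # Subtract the minimum energy before exponentiating for numerical stability.
    weights = np.exp(-(energies - energies.min()) / temperature)
    return unique, weights / weights.sum()

# Stand-in for annealer measurements: repeated, partly suboptimal bit strings.
samples = [[0, 1], [0, 1], [1, 0], [1, 1]]
# Toy energy for illustration only: number of bits set to 1.
solutions, probs = posterior_from_samples(samples, lambda z: z.sum())
```

Because the probabilities are recomputed from energies, how often a solution was measured does not enter the estimate directly; the measurements only determine which solutions are included in the normalization.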

Paper Structure

This paper contains 27 sections, 15 equations, 8 figures, 1 table, 1 algorithm.

Figures (8)

  • Figure 1: The proposed approach uses an adiabatic quantum computer to sample solutions of a balanced k-means problem. By using an energy-based formulation, likely solutions are drawn from a Boltzmann distribution. By reparametrizing the distribution, the calibrated posterior probability of each solution can be estimated.
  • Figure 2: Evaluation of the calibration and distribution for QA, SIM and exhaustive search in tasks with 2 to 4 clusters, each with 5 points. All results are generated with 1,000 problems in each scenario and 5,000 measurements for each clustering problem.
  • Figure 3: Sparsification plots of clustering metrics.
  • Figure 4: Visualization of coresets for synthetic data with the probability for each of the determined pointsets.
  • Figure 5: Qualitative results on the IRIS dataset.
  • ...and 3 more figures