Big data applications on small quantum computers

Boniface Yogendran, Daniel Charlton, Miriam Beddig, Ioannis Kolotouros, Petros Wallden

TL;DR

The coreset method is applied to three well-studied classical machine learning problems, namely Divisive Clustering, 3-means Clustering, and Gaussian Mixture Model Clustering, with Hamiltonian formulations whose qubit count scales linearly with the size of the coreset.

Abstract

Current quantum hardware prohibits any direct use of large classical datasets. Coresets provide a succinct description of such datasets, and the solution of a computational task on a coreset is competitive with the solution on the original dataset. The method of combining coresets with small quantum computers to solve a task that involves a large number of data points was first introduced by Harrow [arXiv:2004.00026]. In this paper, we apply the coreset method to three well-studied classical machine learning problems, namely Divisive Clustering, 3-means Clustering, and Gaussian Mixture Model Clustering. We provide Hamiltonian formulations of these problems for which the number of qubits scales linearly with the size of the coreset. We then evaluate how the variational quantum eigensolver (VQE) performs on these problems and demonstrate the practical efficiency of coresets when used with a small quantum computer. We perform noiseless simulations of instances of up to 25 qubits on CUDA Quantum and show that our approach achieves performance comparable to classical solvers.
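
To make the linear qubit scaling concrete, the following is a minimal sketch of how a two-way split of a weighted coreset can be written as an Ising-type cost with one binary variable (one qubit) per coreset point. It is an illustration under stated assumptions, not the Hamiltonian derived in the paper: the cost used here is a generic weighted MaxCut-style objective, the function names `ising_energy` and `brute_force_bipartition` are hypothetical, the toy coreset and its weights are made up, and an exhaustive search stands in for the VQE.

```python
from itertools import product
from math import dist


def ising_energy(spins, points, weights):
    """Weighted cut-style energy: sum over i<j of w_i * w_j * d(x_i, x_j) * z_i * z_j.

    Each spin z_i in {+1, -1} assigns coreset point i to one of two clusters,
    so the number of binary variables equals the coreset size.
    """
    n = len(points)
    return sum(
        weights[i] * weights[j] * dist(points[i], points[j]) * spins[i] * spins[j]
        for i in range(n)
        for j in range(i + 1, n)
    )


def brute_force_bipartition(points, weights):
    """Exhaustively find the lowest-energy assignment (classical stand-in for VQE)."""
    best_energy, best_spins = None, None
    # Fix the first spin to +1: flipping every spin leaves the energy unchanged.
    for tail in product((1, -1), repeat=len(points) - 1):
        spins = (1,) + tail
        energy = ising_energy(spins, points, weights)
        if best_energy is None or energy < best_energy:
            best_energy, best_spins = energy, spins
    return best_energy, best_spins


# Hypothetical toy coreset: five weighted 2D points forming two well-separated groups.
points = [(0.0, 0.0), (0.1, 0.2), (0.2, 0.0), (5.0, 5.0), (5.1, 4.9)]
weights = [1.0, 2.0, 1.0, 1.5, 1.5]

energy, spins = brute_force_bipartition(points, weights)
print(spins)  # the sign of each entry is the cluster label of that coreset point
```

Distant pairs lower the energy when their spins differ, so the minimum separates the two groups. The exhaustive search costs 2^(m-1) energy evaluations for a coreset of size m, which is only feasible for very small instances; in the setting described above, a variational algorithm would approximate the ground state of the corresponding Hamiltonian instead.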

Paper Structure

This paper contains 29 sections, 32 equations, 5 figures, 1 table, and 3 algorithms.

Figures (5)

  • Figure 1: General framework of a variational quantum algorithm. The quantum computer iteratively prepares and measures quantum states, and the classical computer employs a classical optimization algorithm to update the parameters (following the direction that minimizes the loss). When the optimization terminates, the algorithm returns a ground state approximation.
  • Figure 2: Hierarchical clustering analysis of synthetic data points presented as a dendrogram (left). By drawing perpendicular lines across the dendrogram, the hierarchical relationships are transformed into data clusters. The number of clusters is determined by the intersections with dendrogram branches. The green and blue horizontal lines (right image) create 2 and 4 clusters, respectively, as indicated. Visual confirmation of data points with the same colors grouped together validates the clustering outcome.
  • Figure 3: Parameterized quantum circuit used for the numerical simulations.
  • Figure 4: Left: the dendrogram, with a vertical line drawn to create 6 clusters. Right: the 6 clusters created by this line. The quantum computer clusters the data points correctly.
  • Figure 5: (A) All data points are positive. The clustering outcome is not acceptable as it fails to follow any logic. (B) The outcome of (A) after normalizing the coreset vector. The outcome is acceptable as it follows a logical pattern.