Quantum Visual Feature Encoding Revisited

Xuan-Bac Nguyen; Hoang-Quan Nguyen; Hugh Churchill; Samee U. Khan; Khoa Luu

TL;DR

This work identifies a fundamental information-preservation gap, termed the Quantum Information Gap (QIG), between classical features and their quantum encodings in vision tasks, which impedes learning on quantum machines. It proposes a simple yet effective Quantum Information Preserving (QIP) loss that regularizes the feature extractor to align classical representations with their quantum counterparts, minimizing the gap via a KL-divergence term between classical logits and quantum logits. Through extensive experiments on MSCeleb-1M and Google Landmarks, the approach yields state-of-the-art performance for quantum clustering, notably improving QClusformer metrics and recovering much of the classical baseline accuracy. The method demonstrates that suitable encoding design and a targeted loss can substantially enhance quantum machine learning performance in large-scale vision tasks, with implications for near-term quantum hardware and hybrid quantum-classical pipelines.
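The KL-divergence alignment described above can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions, not the authors' implementation: the function names (`softmax`, `kl_alignment`, `total_loss`), the direction of the KL term (classical distribution as the reference), the default weight `lam`, and the way the term is added to a task loss are all hypothetical choices made for the example.

```python
import numpy as np

def softmax(z, axis=-1):
    """Numerically stable softmax over the given axis."""
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def kl_alignment(classical_logits, quantum_logits, eps=1e-12):
    """Mean KL(p_classical || p_quantum) over a batch of logit rows.

    Assumption: the classical distribution is the reference; the source
    only states that a KL term links the two sets of logits.
    """
    p = softmax(np.asarray(classical_logits, dtype=float))
    q = softmax(np.asarray(quantum_logits, dtype=float))
    kl = np.sum(p * (np.log(p + eps) - np.log(q + eps)), axis=-1)
    return float(np.mean(kl))

def total_loss(task_loss, classical_logits, quantum_logits, lam=0.1):
    """Hypothetical combination: task loss plus a weighted alignment term."""
    return task_loss + lam * kl_alignment(classical_logits, quantum_logits)

# Toy batch: two samples, three classes (made-up numbers).
classical = np.array([[2.0, 0.5, -1.0], [0.1, 0.2, 3.0]])
quantum = np.array([[1.5, 0.7, -0.5], [0.0, 0.5, 2.0]])
gap = kl_alignment(classical, quantum)      # positive: distributions differ
no_gap = kl_alignment(classical, classical)  # zero: distributions identical
```

The alignment term is zero when the two sets of logits induce the same distribution and grows as they diverge, which is the behavior a regularizer of this kind relies on.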

Abstract

Although quantum machine learning has been studied for some time, its applications in computer vision remain limited. This paper therefore revisits quantum visual encoding strategies, the initial step in quantum machine learning. Investigating the root cause, we find that existing quantum encoding designs fail to preserve the information in visual features after encoding, which complicates the learning process of quantum machine learning models. This problem, termed the "Quantum Information Gap" (QIG), leads to a gap in information between classical features and their corresponding quantum features. We provide a theoretical proof and practical demonstrations of this finding and underscore the significance of QIG, as it directly impacts the performance of quantum machine learning algorithms. To tackle this challenge, we introduce a simple but efficient new loss function named Quantum Information Preserving (QIP) to minimize this gap, resulting in enhanced performance of quantum machine learning algorithms. Extensive experiments validate the effectiveness of our approach, showing superior performance compared to current methods and consistently achieving state-of-the-art results in quantum modeling.

Paper Structure (23 sections, 1 theorem, 13 equations, 6 figures, 6 tables, 1 algorithm)

Introduction
Related Work
Quantum Computer Vision
Hybrid Classical-Quantum Machine Learning
Background
Quantum Basics
Limitations in Current Quantum Encoding Methods
Theoretical Analysis and Problem Visualization
Our Proposed Approach
Problem Formulation
Quantum Information Preserving Loss
Experiment Setup and Implementation
Experiment Setup
Implementation Details
Datasets and Metrics
...and 8 more sections

Key Result

Proposition 1

Consider two different quantum state vectors, denoted $|\psi_1\rangle$ and $|\psi_2\rangle$, and their corresponding quantum information vectors $\mathbf{q}_1$ and $\mathbf{q}_2$. We have $\langle\psi_1|\psi_2\rangle \neq \mathbf{q}_1^\mathsf{T} \mathbf{q}_2$ for any Pauli observable and quantum
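A single-qubit example makes the mismatch in Proposition 1 concrete. The sketch below rests on assumptions that the text here does not confirm: the quantum information vector $\mathbf{q}$ is taken to be the vector of Pauli expectation values $(\langle X\rangle, \langle Y\rangle, \langle Z\rangle)$, and the states are prepared by a hypothetical angle encoding $R_Y(\theta)|0\rangle$. The helper names `ry_state` and `pauli_vector` are illustrative only.

```python
import numpy as np

# Single-qubit Pauli matrices.
X = np.array([[0, 1], [1, 0]], dtype=complex)
Y = np.array([[0, -1j], [1j, 0]], dtype=complex)
Z = np.array([[1, 0], [0, -1]], dtype=complex)

def ry_state(theta):
    """State R_Y(theta)|0> = [cos(theta/2), sin(theta/2)] (assumed encoding)."""
    return np.array([np.cos(theta / 2), np.sin(theta / 2)], dtype=complex)

def pauli_vector(psi):
    """Expectation values (<X>, <Y>, <Z>); assumed form of the vector q."""
    return np.array([np.real(np.vdot(psi, P @ psi)) for P in (X, Y, Z)])

theta1, theta2 = 0.3, 1.2
psi1, psi2 = ry_state(theta1), ry_state(theta2)

# State overlap: equals cos((theta1 - theta2) / 2) for this encoding.
overlap = np.real(np.vdot(psi1, psi2))

# Dot product of measured vectors: equals cos(theta1 - theta2).
q_dot = float(pauli_vector(psi1) @ pauli_vector(psi2))

mismatch = abs(overlap - q_dot)  # nonzero whenever theta1 != theta2 here
```

For this encoding the overlap depends on half the angle difference while the dot product of the measured vectors depends on the full difference, so similarity computed from measurements does not reproduce similarity between the underlying states.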

Figures (6)

Figure 1: Limitations of current quantum encoding strategies, which result in non-robust feature representations in the quantum feature space, and our proposed QIP solution. Figure (b) showcases encoded quantum features. Figure (c) presents our proposed method for enhancing the discriminability of quantum features.
Figure 2: Overview of the hybrid quantum system. The red components run on the classical machine. The yellow box includes components running on the quantum machine. The dashed orange box indicates the focus of this paper. Best viewed in color.
Figure 3: Experiment setup and objective of the clustering problem. Figure (a) depicts the typical experiment setup used by nguyen2021clusformer for the classical machine. Figure (b) shows a similar setup in which only the deep model $\mathcal{M}(x)$ remains on the classical machine, while the rest of the modules are redesigned to run on the quantum computer.
Figure 4: Samples from the MSCeleb-1M and Google Landmark datasets. Each row represents either a subject (for MSCeleb-1M) or a location (for Google Landmark). The first image in each row denotes the center of a cluster $\mathbf{\Phi}_i$, while the subsequent images are its nearest neighbors, identified by the K-NN algorithm using quantum features. Images bordered in red belong to a different class than the first image in the row, whereas those bordered in green share its class. The clusters obtained without QIP loss in (a) contain more noisy samples than those in (b), which are obtained with QIP loss. Best viewed in color.
Figure 5: Ablation studies on different QIP Loss factor $\lambda$
...and 1 more figures

Theorems & Definitions (2)

Proposition 1
proof
