Quantum-Inspired Fine-Tuning for Few-Shot AIGC Detection via Phase-Structured Reparameterization

Kaiyang Xing, Han Fang, Zhaoyun Chen, Zhonghui Li, Yang Yang, Weiming Zhang, Guoping Guo

TL;DR

This paper proposes Q-LoRA, a quantum-enhanced fine-tuning scheme that integrates lightweight QNNs into the low-rank adaptation (LoRA) adapter, and introduces H-LoRA, a fully classical variant that applies the Hilbert transform within the LoRA adapter to retain similar phase structure and constraints.

Abstract

Recent studies show that quantum neural networks (QNNs) generalize well in few-shot regimes. To extend this advantage to large-scale tasks, we propose Q-LoRA, a quantum-enhanced fine-tuning scheme that integrates lightweight QNNs into the low-rank adaptation (LoRA) adapter. Applied to AI-generated content (AIGC) detection, Q-LoRA consistently outperforms standard LoRA under few-shot settings. We analyze the source of this improvement and identify two possible structural inductive biases from QNNs: (i) phase-aware representations, which encode richer information across orthogonal amplitude-phase components, and (ii) norm-constrained transformations, which stabilize optimization via inherent orthogonality. However, Q-LoRA incurs non-trivial overhead due to quantum simulation. Motivated by our analysis, we further introduce H-LoRA, a fully classical variant that applies the Hilbert transform within the LoRA adapter to retain similar phase structure and constraints. Experiments on few-shot AIGC detection show that both Q-LoRA and H-LoRA outperform standard LoRA by over 5% accuracy, with H-LoRA achieving comparable accuracy at significantly lower cost in this task.
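
For orientation, the low-rank update that both variants build on can be written as below. The first line is the standard LoRA formulation, with the pretrained weight $W_0$ frozen and only $A$ and $B$ trained. The second line is only a schematic reading of the abstract: $\phi$ is a hypothetical placeholder for the QNN (Q-LoRA) or the Hilbert-transform block (H-LoRA), and its exact placement inside the adapter is an assumption, not something this summary specifies.

```latex
\begin{align}
h &= W_0 x + \frac{\alpha}{r}\, B A x,
  \qquad A \in \mathbb{R}^{r \times d_{\mathrm{in}}},\;
         B \in \mathbb{R}^{d_{\mathrm{out}} \times r},\;
         r \ll \min(d_{\mathrm{in}}, d_{\mathrm{out}}) \\
h &= W_0 x + \frac{\alpha}{r}\, B\, \phi(A x)
  \qquad \text{(schematic; placement of } \phi \text{ assumed)}
\end{align}
```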

Paper Structure

This paper contains 20 sections, 12 equations, 2 figures, and 1 table.

Figures (2)

  • Figure 1: LoRA and Q-LoRA applied to the CLIP model, and their accuracy on AI-generated image detection for different numbers of training samples.
  • Figure 2: AI-generated image detection framework. (a) Q-LoRA: a QNN placed after the CLIP features extracts high-dimensional features through quantum state encoding and processing; its output is fused with the backbone features via a bypass LoRA module. (b) H-LoRA: the features extracted by CLIP undergo a Hilbert transform, which separates and enhances the amplitude and phase characteristics of the signal and thereby enriches the model representation; the result is likewise fused with the backbone features through a bypass LoRA module. FFT and IFFT denote the Fast Fourier Transform and Inverse Fast Fourier Transform, respectively. (c) Overall framework: the pipeline for detecting AI-generated images, built on a CLIP visual encoder and integrating multiple efficient fine-tuning adapters.
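
The amplitude/phase separation mentioned in Figure 2(b) can be illustrated with the textbook analytic-signal construction: transform to the frequency domain, zero the negative frequencies, double the positive ones, and transform back. The sketch below covers that step only and is not the paper's implementation; how H-LoRA wires it into the adapter is not described in this summary. The function names (`dft`, `analytic_signal`, `amplitude_phase`) are illustrative, and a naive DFT stands in for FFT/IFFT so the example needs no dependencies.

```python
import cmath
import math


def dft(x, inverse=False):
    """Naive O(n^2) discrete Fourier transform (stand-in for FFT/IFFT)."""
    n = len(x)
    sign = 1 if inverse else -1
    out = []
    for k in range(n):
        acc = 0j
        for t in range(n):
            acc += x[t] * cmath.exp(sign * 2j * math.pi * k * t / n)
        out.append(acc / n if inverse else acc)
    return out


def analytic_signal(x):
    """Return x + i*H[x], where H is the discrete Hilbert transform."""
    n = len(x)
    spectrum = dft(x)
    # Keep DC (and Nyquist when n is even), double the positive
    # frequencies, and zero the negative frequencies.
    gain = [0.0] * n
    gain[0] = 1.0
    if n % 2 == 0:
        gain[n // 2] = 1.0
        for k in range(1, n // 2):
            gain[k] = 2.0
    else:
        for k in range(1, (n + 1) // 2):
            gain[k] = 2.0
    return dft([g * s for g, s in zip(gain, spectrum)], inverse=True)


def amplitude_phase(x):
    """Split a real feature vector into instantaneous amplitude and phase."""
    z = analytic_signal(x)
    return [abs(v) for v in z], [cmath.phase(v) for v in z]


# Toy "feature vector": a cosine with two cycles over 16 samples.
features = [math.cos(2 * math.pi * 2 * t / 16) for t in range(16)]
amp, phase = amplitude_phase(features)
```

For this pure cosine the amplitude is constant and the phase advances linearly (wrapped to (-π, π]), which is the sense in which the two components carry separate information. In practice a library FFT would replace the naive transform.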