Diffusion-based Aesthetic QR Code Generation via Scanning-Robust Perceptual Guidance

Jia-Wei Liao; Winston Wang; Tzu-Sian Wang; Li-Xuan Peng; Cheng-Fu Chou; Jun-Cheng Chen

TL;DR

The paper proposes a diffusion-model-based pipeline for generating aesthetic QR codes. It uses a pre-trained ControlNet together with guided iterative refinement via a novel classifier guidance, Scanning-Robust Guidance (SRG), which is built on the proposed Scanning-Robust Loss (SRL) tailored to QR code mechanisms, so that the output is both aesthetic and scannable.

Abstract

QR codes, prevalent in daily applications, lack visual appeal due to their conventional black-and-white design. Integrating aesthetics while maintaining scannability poses a challenge. In this paper, we introduce a novel diffusion-model-based aesthetic QR code generation pipeline that uses a pre-trained ControlNet and guided iterative refinement via a novel classifier guidance (SRG). The guidance is based on the proposed Scanning-Robust Loss (SRL), which is tailored to QR code mechanisms and ensures both aesthetics and scannability. To further improve scannability while preserving aesthetics, we propose a two-stage pipeline with Scanning-Robust Perceptual Guidance (SRPG). Moreover, we can further enhance the scannability of the generated QR code by post-processing it with the proposed Scanning-Robust Projected Gradient Descent (SRPGD), an SRL-based technique with proven convergence. Extensive quantitative, qualitative, and subjective experiments demonstrate that the proposed approach can generate diverse aesthetic QR codes with flexibility in detail. In addition, our pipeline outperforms existing models in terms of Scanning Success Rate (SSR), reaching 86.67% (+40%) with comparable aesthetic scores. The pipeline combined with SRPGD further achieves 96.67% (+50%). Our code will be available at https://github.com/jwliao1209/DiffQRCode.

Paper Structure (30 sections, 11 equations, 10 figures, 3 tables)

Introduction
Related Work
Image Diffusion Models
Aesthetic QR Codes
Method
Scanning-Robust Loss
Pixel-wise Error.
Error Re-weighting by Gaussian Kernel.
Early-stopping Mechanism.
One-stage Generation with Iterative Refinement via Scanning-Robust Guidance
Two-stage Generation with Scanning-Robust Perceptual Guidance
Post-processing via Scanning-Robust Projected Gradient Descent (SRPGD)
Experiment
Comparison with Other Aesthetic QR Code Methods
Implementation Details.
...and 15 more sections

Figures (10)

Figure 1: By leveraging the Latent Diffusion Model (LDM) and ControlNet as prior knowledge of aesthetic QR code images, coupled with our proposed Scanning-Robust (Perceptual) Guidance, we can generate custom-styled QR codes that conform to user prompts while ensuring both scannability and aesthetics.
Figure 2: The scannability and aesthetics dilemma of prevailing methods. QR Code Monster [qrcodemonster2023] and QRBTF [qrbtf2023] can generate visually appealing QR codes, but their scannability is uncertain. QR Code AI Art [qrcodeai] and QR Diffusion [qrdiffusion] can generate scannable QR codes, but with limited aesthetics. Our proposed one-stage generation pipeline generates QR codes that are both aesthetic and scanning-robust, and our proposed two-stage generation pipeline further improves visual quality by harmoniously merging the QR code alignment pattern into prompt-specific semantics. Red frames indicate unscannable codes and green frames indicate scannable ones. Zoom in for details.
Figure 3: An overview of our proposed iterative refinement with Scanning-Robust Guidance (SRG). First, we leverage a pre-trained ControlNet to obtain the initial score prediction conditioned on the target QR code and the user-specified prompt. During each denoising step, we approximate $\mathbf{z}_{0|t}$ following the DDIM formulation, then apply the VAE decoder to get $\mathbf{x}_{0|t}$ for the $\mathcal{L}_\text{SR}$ calculation. We use the gradient of $\mathcal{L}_\text{SR}$ as a guidance term to update the predicted score. This iterative refinement is repeated until convergence.
Figure 4: An illustration of our proposed Scanning-Robust Loss (SRL). Without loss of generality, we demonstrate it on a small region. We emulate the scanning process using module pixel extraction and binarization to calculate the pixel-wise error matrix and the module-wise optimization decision mask. We then apply a Gaussian kernel to re-weight the error matrix. Finally, we mask the error matrix with the decision mask via the Hadamard product and take the average to form our SRL.
Figure 5: An overview of our proposed two-stage generation pipeline with Scanning-Robust Perceptual Guidance (SRPG). In Stage 1, we use the pre-trained plain ControlNet to generate an aesthetic yet unscannable, sub-optimal QR code. In Stage 2, we first perform SDEdit to convert the sub-optimal QR code to latent space, then leverage Qart to merge it with the target QR code. Finally, we apply our proposed iterative refinement to produce an aesthetic and scannable QR code.
...and 5 more figures
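
The Scanning-Robust Loss steps listed in the Figure 4 caption (pixel-wise error, Gaussian re-weighting, module-wise decision mask, Hadamard product, average) can be illustrated with a small NumPy sketch. This is not the authors' implementation: the hinge-style pixel error, the 0.5 binarization threshold, the default kernel width, and the names `gaussian_kernel` and `scanning_robust_loss` are all assumptions made for illustration, and the sketch works on a grayscale array rather than on differentiable tensors.

```python
import numpy as np


def gaussian_kernel(size: int, sigma: float) -> np.ndarray:
    """Return a normalized (size, size) Gaussian kernel centered on a module."""
    ax = np.arange(size) - (size - 1) / 2.0
    g = np.exp(-(ax ** 2) / (2.0 * sigma ** 2))
    k = np.outer(g, g)
    return k / k.sum()


def scanning_robust_loss(image, target_modules, module_size,
                         threshold=0.5, sigma=None):
    """Illustrative SRL-style loss (assumed form, not the paper's exact one).

    image:          (n * module_size, n * module_size) grayscale in [0, 1]
    target_modules: (n, n) array, 1 = white module, 0 = black module
    """
    n = target_modules.shape[0]
    assert image.shape == (n * module_size, n * module_size)
    if sigma is None:
        sigma = module_size / 4.0  # assumed default width

    kernel = gaussian_kernel(module_size, sigma)
    weights = np.tile(kernel, (n, n))  # the same kernel over every module

    # Pixel-wise error (assumed hinge form): distance by which a pixel sits
    # on the wrong side of the binarization threshold for its target module.
    target_pixels = np.kron(target_modules, np.ones((module_size, module_size)))
    sign = 2.0 * target_pixels - 1.0  # +1 for white, -1 for black
    error = np.maximum(0.0, -sign * (image - threshold))

    # Re-weight the error so pixels near the module center count more.
    weighted_error = error * weights

    # Emulate scanning: kernel-weighted mean intensity per module, binarized.
    blocks = (image * weights).reshape(n, module_size, n, module_size)
    module_value = blocks.sum(axis=(1, 3))
    decoded = (module_value > threshold).astype(target_modules.dtype)

    # Decision mask: only modules that decode incorrectly keep contributing.
    mask_modules = (decoded != target_modules).astype(float)
    mask = np.kron(mask_modules, np.ones((module_size, module_size)))

    # Hadamard product with the mask, then average over all pixels.
    return float((weighted_error * mask).mean())


if __name__ == "__main__":
    target = np.array([[1, 0], [0, 1]])
    clean = np.kron(target, np.ones((4, 4))).astype(float)
    print("loss on a clean code:   ", scanning_robust_loss(clean, target, 4))
    print("loss on an inverted code:", scanning_robust_loss(1.0 - clean, target, 4))
```

In this sketch the mask is what lets aesthetics survive: once a module already decodes to the right bit, its pixels stop contributing to the loss, so no further pressure is applied to that region.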
