License Plate Super-Resolution Using Diffusion Models

Sawsan AlHalawani; Bilel Benjdira; Adel Ammar; Anis Koubaa; Anas M. Ali

TL;DR

This work tackles license plate (LP) super-resolution for low-resolution surveillance imagery. It uses a conditional denoising diffusion probabilistic model (DDPM) with a U-Net backbone to reconstruct high-resolution plate images from downsampled inputs. Trained on a Saudi license plate dataset, the diffusion-based approach outperforms state-of-the-art SR methods: PSNR improves by 12.55% over SwinIR and 37.32% over ESRGAN, and SSIM improves by 4.89% and 17.66%, respectively. In addition, 92% of human evaluators preferred its outputs. The study shows that diffusion models preserve structural details and histogram characteristics in LP images well, which could substantially benefit license plate recognition systems. The primary limitation is computational cost, so future work should optimize efficiency for real-time or large-scale surveillance deployments.

Abstract

In surveillance, license plates are often captured at low quality and small size, which reduces recognition accuracy. Despite advances in AI-based image super-resolution, methods such as Convolutional Neural Networks (CNNs) and Generative Adversarial Networks (GANs) still fall short in enhancing license plate images. This study leverages the diffusion model, which has outperformed other deep learning techniques in image restoration. We trained the model on a curated dataset of Saudi license plates in both low and high resolutions and found that it outperforms existing methods. It achieves 12.55% and 37.32% improvements in Peak Signal-to-Noise Ratio (PSNR) over SwinIR and ESRGAN, respectively. It also surpasses these techniques in Structural Similarity Index (SSIM), with 4.89% and 17.66% improvements over SwinIR and ESRGAN, respectively. Furthermore, 92% of human evaluators preferred our images over those from other algorithms. In essence, this research presents a pioneering solution for license plate super-resolution, with tangible potential for surveillance systems.
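
To make the reported metric concrete, the following is a minimal sketch of how PSNR and a percentage gain over a baseline could be computed. It assumes 8-bit images flattened to pixel sequences and assumes that the reported percentages are relative gains of the form (ours - baseline) / baseline; the abstract does not state the exact formula. The function names `psnr` and `relative_improvement` are illustrative and do not come from the paper.

```python
import math


def psnr(reference, estimate, max_value=255.0):
    """Return PSNR in dB between two equally sized images (flat pixel sequences)."""
    if len(reference) != len(estimate) or len(reference) == 0:
        raise ValueError("images must be non-empty and have the same size")
    # Mean squared error over all pixels.
    mse = sum((r - e) ** 2 for r, e in zip(reference, estimate)) / len(reference)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(max_value ** 2 / mse)


def relative_improvement(ours, baseline):
    """Return the percentage gain of `ours` over `baseline` (assumed definition)."""
    return 100.0 * (ours - baseline) / baseline
```

Under this assumed definition, a method scoring 11 dB against a baseline of 10 dB would count as a 10% improvement. SSIM needs local window statistics and is usually taken from an image-processing library rather than reimplemented.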

Paper Structure (15 sections, 22 equations, 9 figures, 4 tables)

Introduction
Related Works
Proposed Method
Experiments
Datasets
Evaluation Metrics
Implementation Details
Discussion
Visualization of the Reconstruction Steps
Results
Comparison with State-of-the-art Methods
Quantitative Results
Qualitative Results: Human Evaluation
Qualitative Results: Visual Details Evaluation
Conclusions

Figures (9)

Figure 1: Comparison between the ground truth and the super-resolved images produced by SwinIR and ESRGAN.
Figure 2: An HR license plate image $z_0^{hr}$ is either corrupted with noise or downscaled to produce $x_0^{lr}$. The forward process then adds further noise until the image becomes isotropic noise at step $T$.
Figure 3: From the fully degraded image $x_T^{lr}$, the backward process reconstructs the HR image $y_0^{hr}$ using prior knowledge about $x_0^{lr}$.
Figure 4: Intermediate steps of the HR image reconstruction. The process starts from the LR image (a) and goes through the denoising steps (b) to produce the output HR image (c), which resembles the ground truth image (d).
Figure 5: Top row: original HR images at $192\times192$ resolution. Middle row: downscaled LR images at $48\times48$ resolution. Bottom row: images super-resolved by the diffusion model at $192\times192$ resolution.
...and 4 more figures
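
The degradation and forward process described in the captions of Figures 2 and 5 can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation. It uses the standard DDPM closed form $x_t = \sqrt{\bar{\alpha}_t}\,x_0 + \sqrt{1-\bar{\alpha}_t}\,\epsilon$ with $\epsilon \sim \mathcal{N}(0, I)$. The linear noise schedule, its endpoints, the 1000-step count, and average pooling for the $192\times192$ to $48\times48$ downscale are all assumptions, and every function name is hypothetical.

```python
import math
import random


def linear_beta_schedule(num_steps, beta_start=1e-4, beta_end=0.02):
    """Linearly spaced noise variances (assumed schedule, not from the paper)."""
    if num_steps == 1:
        return [beta_start]
    step = (beta_end - beta_start) / (num_steps - 1)
    return [beta_start + i * step for i in range(num_steps)]


def cumulative_alphas(betas):
    """Return alpha_bar_t, the running product of (1 - beta_s) for s <= t."""
    alpha_bars, product = [], 1.0
    for beta in betas:
        product *= 1.0 - beta
        alpha_bars.append(product)
    return alpha_bars


def forward_noise(x0, t, alpha_bars, rng):
    """Sample x_t from q(x_t | x_0) for a flat list of pixel values."""
    a = alpha_bars[t]
    signal, noise = math.sqrt(a), math.sqrt(1.0 - a)
    return [signal * v + noise * rng.gauss(0.0, 1.0) for v in x0]


def downscale(image, factor):
    """Average-pool a 2D image (list of rows) by an integer factor (assumed method)."""
    height, width = len(image), len(image[0])
    return [
        [
            sum(
                image[y * factor + dy][x * factor + dx]
                for dy in range(factor)
                for dx in range(factor)
            ) / factor ** 2
            for x in range(width // factor)
        ]
        for y in range(height // factor)
    ]
```

With these assumed settings, `downscale(hr, 4)` maps a $192\times192$ image to $48\times48$, and `cumulative_alphas(linear_beta_schedule(1000))[-1]` is close to zero, so the sample at the final step is almost pure Gaussian noise. This corresponds to the isotropic state at step $T$ in Figure 2.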
