Global License Plate Dataset

Siddharth Agrawal

TL;DR

The paper introduces the Global License Plate Dataset (GLPD), a large-scale, multinational resource with over 5 million images from 74 countries, designed to benchmark license plate recognition under real-world, diverse conditions. It details data collection primarily from Platesmania and supplementary open sources, extensive annotations (plate text, four-vertex corners, segmentation maps, and vehicle attributes), and ancillary COCO-style labels for a subset, all under a defined 60/20/20 train/validation/test split with near-duplicate avoidance using Normalised Edit Distance $NED(a,b) = \frac{dist(a,b)}{\max(len(a),len(b))}$. The paper presents end-to-end evaluation strategies, including detection via YOLOv5m and recognition with CRNN and PARSeq, reporting strong cross-country performance and highlighting PARSeq’s superior accuracy. Ethical considerations are discussed, emphasizing privacy protections and controlled sampling to mitigate bias, while stressing GLPD’s potential to improve generalization and enable country-specific fine-tuning for license plate recognition systems.
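The Normalised Edit Distance in the formula above can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: it assumes that $dist(a,b)$ is the Levenshtein (insert/delete/substitute) distance between two plate strings, and the function names `levenshtein` and `ned` are hypothetical. The summary does not state which threshold is used to decide that two plates are near-duplicates, so none is assumed here.

```python
def levenshtein(a: str, b: str) -> int:
    """Levenshtein distance: minimum number of single-character
    insertions, deletions, or substitutions turning a into b.
    (Assumed meaning of dist(a, b); the source does not specify it.)"""
    # Keep the shorter string as the inner loop to bound row length.
    if len(a) < len(b):
        a, b = b, a
    # previous[j] holds the distance between a[:i-1] and b[:j].
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]  # distance between a[:i] and the empty prefix of b
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(
                previous[j] + 1,         # delete ca
                current[j - 1] + 1,      # insert cb
                previous[j - 1] + cost,  # substitute (or match)
            ))
        previous = current
    return previous[-1]


def ned(a: str, b: str) -> float:
    """Normalised Edit Distance: dist(a, b) / max(len(a), len(b))."""
    longest = max(len(a), len(b))
    if longest == 0:
        # Two empty strings are identical; avoid dividing by zero.
        return 0.0
    return levenshtein(a, b) / longest
```

Under this definition, identical plate strings score 0 and strings with no characters in common score 1, so a lower value means the two plates are closer to being duplicates.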

Abstract

To advance the state of the art (SOTA) in road safety, traffic monitoring, surveillance, and logistics automation, we introduce the Global License Plate Dataset (GLPD). The dataset consists of over 5 million images, with diverse samples captured in 74 countries and meticulously annotated with license plate characters, license plate segmentation masks, license plate corner vertices, and vehicle make, colour, and model. We also include annotations for additional classes, such as pedestrians, vehicles, and roads. We include a statistical analysis of the dataset and provide efficient and accurate baseline models. The GLPD aims to be the primary benchmark dataset for model development and fine-tuning for license plate recognition.

Paper Structure (17 sections, 1 equation, 6 figures, 2 tables)

Introduction
Related Work
License Plate Recognition
Existing Datasets
Scene Text Recognition
Motivation
Novel Dataset Challenges:
Methodology
Data Collection
Dataset Labeling and Annotations
Dataset Splits
Dataset Statistics
Evaluation Metrics
Model Overview
License Plate Recognition
...and 2 more sections

Figures (6)

Figure 1: License Plate Samples from the Dataset
Figure 4: Example Annotations
Figure 5: Example of Labels: Car make, model, year, color, and other information
Figure 6: Example of multi-class Annotation labels: Bounding Boxes and Instance Segmentations
Figure 7: The Number of License Plate Images for each Country in the Dataset
...and 1 more figure
