SamLP: A Customized Segment Anything Model for License Plate Detection
Haoxuan Ding, Junyu Gao, Yuan Yuan, Qi Wang
TL;DR
This work tackles license plate detection under diverse styles and limited data by repurposing the Segment Anything Model (SAM) as SamLP, a vision foundation-model-based LP detector. It combines parameter-efficient LoRA fine-tuning of SAM's image encoder and mask decoder with a promptable fine-tuning stage to preserve SAM's prompt-driven segmentation capability, enabling effective few-shot and zero-shot transfer. Across UFPR-ALPR, CCPD, CRPD, and AOLP datasets, SamLP and its promptable variant SamLP_P outperform traditional LP detectors and demonstrate strong robustness to domain shifts, while requiring only modest amounts of task-specific data. The approach highlights the practical potential of vision foundation models for cross-domain LP detection and points to future speed-ups via distillation or compression.
Abstract
The emergence of foundation models, a new paradigm in deep learning, has driven many strong results in natural language processing and computer vision. Foundation models offer several advantages that benefit vision tasks, including powerful feature extraction, strong generalization, and good few-shot and zero-shot learning capacity. License plates (LPs) serve as the unique identity of a vehicle, yet their styles and appearances differ across countries and regions, and even across vehicle types. However, recent deep-learning-based LP detectors are mainly trained on specific datasets, and these limited datasets constrain their effectiveness and robustness. To alleviate the negative impact of limited data, this paper attempts to exploit the advantages of foundation models. We customize a vision foundation model, the Segment Anything Model (SAM), for the LP detection task and propose the first LP detector based on a vision foundation model, named SamLP. Specifically, we design a Low-Rank Adaptation (LoRA) fine-tuning strategy that injects extra parameters into SAM and transfers SAM to the LP detection task. We then propose a promptable fine-tuning step that gives SamLP promptable segmentation capacity. Experiments show that SamLP achieves promising detection performance compared with other LP detectors. SamLP also shows strong few-shot and zero-shot learning ability, which demonstrates the potential of transferring vision foundation models. The code is available at https://github.com/Dinghaoxuan/SamLP
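
To make the idea of injecting extra low-rank parameters concrete, the following is a minimal, dependency-free sketch of a generic LoRA layer. It is not the authors' implementation: the class name `LoRALinear`, the helper `matmul`, and the toy 8x8 weight are hypothetical, and the `alpha / rank` scaling with a zero-initialized `B` is assumed from the standard LoRA formulation rather than taken from this paper. The pretrained weight stays frozen while only the two small matrices `A` and `B` would be trained.

```python
import random


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


class LoRALinear:
    """Frozen weight W (out x in) plus a low-rank update (alpha / rank) * B @ A."""

    def __init__(self, weight, rank, alpha=1.0, seed=0):
        rng = random.Random(seed)
        out_dim, in_dim = len(weight), len(weight[0])
        self.weight = weight          # frozen pretrained weight, never updated
        self.scale = alpha / rank     # assumed standard LoRA scaling
        # A gets a small random init; B starts at zero, so the adapted layer
        # initially behaves exactly like the pretrained one.
        self.A = [[rng.gauss(0.0, 0.01) for _ in range(in_dim)] for _ in range(rank)]
        self.B = [[0.0] * rank for _ in range(out_dim)]

    def effective_weight(self):
        """Return W + scale * (B @ A)."""
        delta = matmul(self.B, self.A)
        return [
            [w + self.scale * d for w, d in zip(w_row, d_row)]
            for w_row, d_row in zip(self.weight, delta)
        ]

    def forward(self, x):
        """x is a batch of input vectors (batch x in); returns batch x out."""
        w = self.effective_weight()
        w_t = [list(col) for col in zip(*w)]  # transpose so that we compute x @ W^T
        return matmul(x, w_t)

    def num_trainable(self):
        """Count the parameters in A and B, the only ones that would be trained."""
        return sum(len(r) for r in self.A) + sum(len(r) for r in self.B)


# Toy example: an 8x8 identity matrix stands in for a pretrained projection.
identity = [[1.0 if i == j else 0.0 for j in range(8)] for i in range(8)]
layer = LoRALinear(identity, rank=2)
sample = [[float(i) for i in range(8)]]
initial_output = layer.forward(sample)
```

In this toy setting the adapter adds 32 trainable values next to 64 frozen ones; the saving grows with layer width, since the adapter scales with `rank * (in + out)` while the full weight scales with `in * out`.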
