Efficient Video-Based ALPR System Using YOLO and Visual Rhythm
Victor Nascimento Ribeiro, Nina S. T. Hirata
TL;DR
This work tackles the inefficiency of video-based ALPR by removing the need to process every frame. It combines YOLO-based detection with Visual Rhythm (VR) to select exactly one frame per vehicle, taken when the vehicle crosses the VR line, and then applies OCR to read the license plate in that frame. The approach uses a 600-frame VR window, fine-tuned detectors, and EasyOCR-based recognition, and reports preliminary results with a character error rate (CER) of about 15.8% on a Brazilian-plate dataset. The study demonstrates potential computational savings and practical viability, with clear paths for improvement in detector training and OCR tuning to raise accuracy and throughput.
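To make the reported CER figure concrete, the following is a minimal sketch of how a character error rate is commonly computed: the total Levenshtein (edit) distance between predicted and reference strings, divided by the total number of reference characters. The source does not state the exact formula it used, so this definition is an assumption, and the function names and example plate strings are hypothetical.

```python
from typing import Sequence


def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character insertions, deletions,
    and substitutions needed to turn string a into string b."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def cer(predictions: Sequence[str], references: Sequence[str]) -> float:
    """Character error rate over a set of plates (assumed definition):
    total edit distance divided by total reference length."""
    total_edits = sum(levenshtein(p, r) for p, r in zip(predictions, references))
    total_chars = sum(len(r) for r in references)
    return total_edits / total_chars


# Hypothetical example: one of seven characters is misread.
example_cer = cer(["ABC1023"], ["ABC1D23"])
```

Under this definition, a CER of about 15.8% would correspond to roughly one wrong character per seven-character plate on average.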
Abstract
Automatic License Plate Recognition (ALPR) involves extracting vehicle license plate information from an image or a video capture. These systems have gained popularity due to the wide availability of low-cost surveillance cameras and advances in Deep Learning. Typically, video-based ALPR systems rely on multiple frames to detect the vehicle and recognize the license plates. In contrast, we propose a system capable of extracting exactly one frame per vehicle and recognizing its license plate characters from this single image using an Optical Character Recognition (OCR) model. Early experiments show that this methodology is viable.
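The one-frame-per-vehicle idea rests on the Visual Rhythm named in the title. A Visual Rhythm image is generally built by taking the same line of pixels from every frame and stacking those lines over time, so that a vehicle crossing the line leaves a mark whose position along the time axis identifies a frame. The sketch below illustrates this with NumPy on synthetic grayscale frames. The horizontal line, the time-along-rows orientation, the function names, and the brightest-row lookup that stands in for a trained detector are all assumptions made for illustration, not the authors' implementation.

```python
import numpy as np


def build_visual_rhythm(frames: np.ndarray, line_y: int) -> np.ndarray:
    """Stack the pixel row at line_y from each frame.

    frames has shape (T, H, W); the result has shape (T, W),
    and row t of the result comes from frame t.
    """
    return frames[:, line_y, :].copy()


def mark_row_to_frame_index(window_start: int, mark_row: int) -> int:
    """Map a mark's row in the VR image back to an absolute frame index
    (hypothetical helper; assumes one VR row per frame)."""
    return window_start + mark_row


# Synthetic 600-frame window of 48x64 grayscale frames (hypothetical sizes).
frames = np.zeros((600, 48, 64), dtype=np.uint8)
frames[120, 30, 20:40] = 255  # a bright "vehicle" touches the line in frame 120

vr = build_visual_rhythm(frames, line_y=30)

# Stand-in for a detector: pick the row with the brightest pixel.
mark_row = int(np.argmax(vr.max(axis=1)))
frame_index = mark_row_to_frame_index(window_start=0, mark_row=mark_row)
```

In this toy setup only the selected frame would then be passed on to plate detection and OCR, which is where the computational saving over per-frame processing would come from.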
