Smart Parking with Pixel-Wise ROI Selection for Vehicle Detection Using YOLOv8, YOLOv9, YOLOv10, and YOLOv11
Gustavo P. C. P. da Luz, Gabriel Massuyoshi Sato, Luis Fernando Gomez Gonzalez, Juliana Freitag Borin
TL;DR
This work proposes a pixel‑wise ROI post‑processing approach to vehicle detection in smart parking using YOLOv8–YOLOv11, evaluated on edge versus cloud deployments. It demonstrates that pixel‑level ROI masks significantly improve counting accuracy, achieving a top balanced accuracy of 99.68% with YOLOv9e on a 3,484‑image dataset, while edge inference times range from 1 to 92 seconds depending on hardware. The study also provides a comprehensive cost analysis, showing that a low‑cost edge‑based camera solution can be more economical than sensor‑based approaches for moderate to large lots, while preserving data privacy. Overall, the results highlight the practicality of deploying modern YOLO models at the edge with flexible ROI definitions to enable scalable, privacy‑friendly smart parking systems. Future work points to further model optimization and broader hardware benchmarking.
Abstract
Increasing urbanization and the growing number of vehicles in cities have underscored the need for efficient parking management systems. Traditional smart parking solutions often rely on sensors or cameras for occupancy detection, each with its own limitations. Recent advances in deep learning have introduced new YOLO models (YOLOv8, YOLOv9, YOLOv10, and YOLOv11), but these models have not been extensively evaluated in the context of smart parking systems, particularly when combined with Region of Interest (ROI) selection for object detection. Existing methods still rely on fixed polygonal ROI selections or simple pixel-based modifications, which limit flexibility and precision. This work introduces a novel approach that integrates Internet of Things, Edge Computing, and Deep Learning concepts by using the latest YOLO models for vehicle detection. Experiments on both edge and cloud deployments showed that inference times on edge devices ranged from 1 to 92 seconds, depending on the hardware and model version. Additionally, a new pixel-wise post-processing ROI selection method is proposed for accurately identifying regions of interest to count vehicles in parking lot images. The proposed system achieved 99.68% balanced accuracy on a custom dataset of 3,484 images, offering a cost-effective smart parking solution that ensures precise vehicle detection while preserving data privacy.
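To make the idea of pixel-wise post-processing concrete, the following is a minimal sketch of one plausible way to filter detections with an arbitrary-shaped ROI mask. It is not taken from the paper: the function name `count_in_roi`, the `(x1, y1, x2, y2)` box format, and the overlap threshold `min_fraction` are all illustrative assumptions, and the detector that produces the boxes is left out entirely.

```python
import numpy as np


def count_in_roi(boxes, roi_mask, min_fraction=0.5):
    """Count boxes whose area lies mostly inside a pixel-wise ROI mask.

    boxes        -- iterable of (x1, y1, x2, y2) pixel coordinates (assumed format)
    roi_mask     -- 2-D boolean array, True where a pixel belongs to the ROI
    min_fraction -- hypothetical threshold: share of box pixels that must be in the ROI
    """
    height, width = roi_mask.shape
    count = 0
    for x1, y1, x2, y2 in boxes:
        # Clip the box to the image and convert to integer pixel indices.
        x1, x2 = int(max(0, x1)), int(min(width, x2))
        y1, y2 = int(max(0, y1)), int(min(height, y2))
        if x2 <= x1 or y2 <= y1:
            continue  # box is empty or lies outside the image
        # Fraction of the box's pixels that are marked as ROI.
        patch = roi_mask[y1:y2, x1:x2]
        if patch.mean() >= min_fraction:
            count += 1
    return count


# Toy example: the lower half of a 100x100 image is the ROI.
roi = np.zeros((100, 100), dtype=bool)
roi[50:, :] = True
boxes = [
    (10, 60, 30, 80),  # fully inside the ROI
    (10, 10, 30, 30),  # fully outside the ROI
    (40, 45, 60, 65),  # 75% of its pixels inside the ROI
]
print(count_in_roi(boxes, roi))  # 2
```

Because the mask is an ordinary per-pixel array, the ROI can take any shape rather than being restricted to a polygon, and the filter runs after detection without modifying the input image or the model.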
