Enhancing Road Safety Through Multi-Camera Image Segmentation with Post-Encroachment Time Analysis

Shounak Ray Chaudhuri, Arash Jahangiri, Christopher Paolini

TL;DR

This work addresses the sparsity and latency of crash data by introducing a real-time, pixel-level Post-Encroachment Time (PET) framework that fuses four synchronized camera views on edge devices to quantify intersection safety. The method combines YOLOv11-seg vehicle detection, per-camera homography-based bird's-eye mapping, and a per-pixel PET metric to produce high-resolution hazard heatmaps (≈3.3 cm per pixel) at the H Street and Broadway intersection, with data logged in SQL for long-term monitoring. A stopwatch-based PET computation and multi-camera overlap enable fine-grained identification of high-risk zones. The pipeline runs at about 2.68 FPS on edge hardware, which supports its use as a scalable ITS safety analytics pipeline. The resulting heatmaps offer spatial insights that can guide road design and potential adaptive signal strategies, providing a low-cost, decentralized alternative to sensor-heavy infrastructure.

Abstract

Traffic safety analysis at signalized intersections is vital for reducing vehicle and pedestrian collisions, yet traditional crash-based studies are limited by data sparsity and latency. This paper presents a novel multi-camera computer vision framework for real-time safety assessment through Post-Encroachment Time (PET) computation, demonstrated at the intersection of H Street and Broadway in Chula Vista, California. Four synchronized cameras provide continuous visual coverage, with each frame processed on NVIDIA Jetson AGX Xavier devices using YOLOv11 segmentation for vehicle detection. Detected vehicle polygons are transformed into a unified bird's-eye map using homography matrices, enabling alignment across overlapping camera views. A novel pixel-level PET algorithm measures vehicle position without reliance on fixed cells, allowing fine-grained hazard visualization via dynamic heatmaps, accurate to 3.3 sq-cm. Timestamped vehicle and PET data are stored in an SQL database for long-term monitoring. Results over various time intervals demonstrate the framework's ability to identify high-risk regions with sub-second precision and real-time throughput on edge devices, producing data for an 800 × 800 pixel logarithmic heatmap at an average of 2.68 FPS. This study validates the feasibility of decentralized vision-based PET analysis for intelligent transportation systems, offering a replicable methodology for high-resolution, real-time, and scalable intersection safety evaluation.
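
To make the pixel-level PET idea concrete, the following is a minimal Python sketch, not the authors' implementation. It assumes that PET at a pixel is the time between one vehicle vacating that pixel and a different vehicle entering it, and that each frame supplies one boolean bird's-eye mask per tracked vehicle. The class name `PixelPET`, the `update` method, the integer vehicle IDs, and the tiny 4 x 4 grid are all hypothetical choices made for illustration.

```python
import numpy as np


class PixelPET:
    """Per-pixel PET tracker on a bird's-eye grid (illustrative sketch only)."""

    def __init__(self, height, width):
        shape = (height, width)
        # ID of the vehicle currently covering each pixel, -1 if the pixel is free.
        self.occupant = np.full(shape, -1, dtype=np.int64)
        # ID of the vehicle that most recently left each pixel, -1 if none yet.
        self.last_vehicle = np.full(shape, -1, dtype=np.int64)
        # Time at which that vehicle left (the per-pixel "stopwatch" start).
        self.exit_time = np.full(shape, np.nan)
        # Smallest PET observed so far at each pixel (inf means no conflict yet).
        self.min_pet = np.full(shape, np.inf)

    def update(self, t, masks):
        """Advance to time t; masks maps vehicle ID -> boolean (H, W) array."""
        new_occupant = np.full_like(self.occupant, -1)
        for vehicle_id, mask in masks.items():
            new_occupant[mask] = vehicle_id

        # A pixel is vacated when its previous occupant is no longer on it.
        vacated = (self.occupant >= 0) & (new_occupant != self.occupant)
        self.last_vehicle[vacated] = self.occupant[vacated]
        self.exit_time[vacated] = t

        # A pixel is entered when a vehicle arrives that differs from the
        # vehicle that last left it; the elapsed time is the PET sample.
        entered = (
            (new_occupant >= 0)
            & (new_occupant != self.occupant)
            & (self.last_vehicle >= 0)
            & (new_occupant != self.last_vehicle)
        )
        pet = t - self.exit_time[entered]
        self.min_pet[entered] = np.minimum(self.min_pet[entered], pet)

        self.occupant = new_occupant
        return self.min_pet


# Usage: vehicle 1 covers pixel (1, 1) at t = 0.0 s, leaves at t = 0.5 s,
# and vehicle 2 reaches the same pixel at t = 2.0 s.
cell = np.zeros((4, 4), dtype=bool)
cell[1, 1] = True

tracker = PixelPET(4, 4)
tracker.update(0.0, {1: cell})
tracker.update(0.5, {})
pet_map = tracker.update(2.0, {2: cell})
# pet_map[1, 1] is 1.5 (seconds); every other pixel is still inf.
```

Keeping the minimum PET per pixel is one possible aggregation; counting how often PET falls below a threshold would suit a logarithmic heatmap equally well, and the abstract does not say which is used.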

Paper Structure

This paper contains 25 sections, 1 equation, 10 figures.

Figures (10)

  • Figure 1: Intersection of H St and Broadway in Chula Vista, CA, with annotated points of interest. Annotations are Cartesian pixel values in global coordinates.
  • Figure 2: Camera perspective view of H St and Broadway in Chula Vista, CA, with annotated points of interest using camera coordinates.
  • Figure 3: Dataset of annotated point of interest coordinates across each coordinate system. Camera X and Y values are HD image coordinates paired with global coordinates found from a satellite view.
  • Figure 4: Visual result of applying the homography matrix of one of the cameras to convert coordinates.
  • Figure 5: Rectangle fitting code operating on four layered camera views.
  • ...and 5 more figures
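
Figures 1 to 4 describe pairing annotated camera pixels with global map pixels and then applying a per-camera homography. The sketch below shows one way such a matrix could be estimated from point pairs, using a plain NumPy direct linear transform (DLT). This summary does not state how the authors computed their matrices, so the method, the function names `estimate_homography` and `apply_homography`, and every coordinate value here are assumptions for illustration only.

```python
import numpy as np


def estimate_homography(src, dst):
    """Estimate a 3x3 H with dst ~ H @ src from at least 4 point pairs (DLT)."""
    src = np.asarray(src, dtype=float)
    dst = np.asarray(dst, dtype=float)
    rows = []
    for (x, y), (u, v) in zip(src, dst):
        # Each correspondence contributes two linear constraints on H.
        rows.append([-x, -y, -1, 0, 0, 0, u * x, u * y, u])
        rows.append([0, 0, 0, -x, -y, -1, v * x, v * y, v])
    a = np.array(rows)
    # The solution is the right singular vector with the smallest singular value.
    _, _, vt = np.linalg.svd(a)
    # H is defined only up to scale, so it is returned with unit norm.
    return vt[-1].reshape(3, 3)


def apply_homography(h, pts):
    """Map (N, 2) points through H and divide by the homogeneous coordinate."""
    pts = np.asarray(pts, dtype=float)
    homog = np.hstack([pts, np.ones((len(pts), 1))])
    mapped = homog @ h.T
    return mapped[:, :2] / mapped[:, 2:3]


# Made-up example: four HD camera pixels and their bird's-eye map pixels.
camera_pts = [(100, 600), (1800, 620), (1400, 300), (500, 310)]
map_pts = [(200, 700), (600, 700), (600, 100), (200, 100)]

H = estimate_homography(camera_pts, map_pts)
projected = apply_homography(H, camera_pts)  # reproduces map_pts closely
```

With more than four annotated points, the same function returns a least-squares fit in the algebraic sense; normalizing the coordinates before the SVD would improve conditioning for real HD data.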