Typography-Based Monocular Distance Estimation Framework for Vehicle Safety Systems

Manognya Lokesh Reddy, Zheng Liu

Abstract

Accurate inter-vehicle distance estimation is a cornerstone of advanced driver assistance systems and autonomous driving. While LiDAR and radar provide high precision, their cost prohibits widespread adoption in mass-market vehicles. Monocular vision offers a low-cost alternative but suffers from scale ambiguity and sensitivity to environmental disturbances. This paper introduces a typography-based monocular distance estimation (T-MDE) framework that exploits the standardized typography of license plates as passive fiducial markers for metric distance estimation. The core geometric module uses robust plate detection and character segmentation to measure character height, then computes distance via the pinhole camera model. The system incorporates interactive calibration, adaptive detection with strict and permissive modes, and multi-method character segmentation that combines adaptive and global thresholding. To enhance robustness, the framework further includes camera pose compensation using lane-based horizon estimation, hybrid deep-learning fusion, temporal Kalman filtering for velocity estimation, and multi-feature fusion that exploits additional typographic cues such as stroke width, character spacing, and plate border thickness. Experimental validation with a calibrated monocular camera in a controlled indoor setup achieved a coefficient of variation of 2.3% in character height across consecutive frames and a mean absolute error of 7.7%. The framework operates without GPU acceleration, demonstrating real-time feasibility. A comparison with a plate-width-based method shows that character-based ranging reduces the standard deviation of estimates by 35%. This yields smoother, more consistent distance readings, which matters in practice because erratic estimates could trigger unnecessary braking or acceleration.
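
The pinhole relation mentioned in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The function names, the one-shot calibration from a single known distance, and the use of the sample standard deviation for the coefficient of variation are all assumptions made for illustration, and any numeric values passed in are hypothetical.

```python
import statistics


def distance_from_char_height(focal_px: float, char_height_m: float,
                              char_height_px: float) -> float:
    """Pinhole model: Z = f * H / h.

    focal_px       -- focal length in pixels (from calibration)
    char_height_m  -- physical character height in metres (set by the plate standard)
    char_height_px -- measured character height in the image, in pixels
    """
    if char_height_px <= 0:
        raise ValueError("measured character height must be positive")
    return focal_px * char_height_m / char_height_px


def calibrate_focal_px(known_distance_m: float, char_height_m: float,
                       char_height_px: float) -> float:
    """Invert the pinhole relation at one known distance: f = h * Z / H.

    Assumption: a single reference measurement is enough for this sketch.
    """
    if char_height_m <= 0:
        raise ValueError("physical character height must be positive")
    return char_height_px * known_distance_m / char_height_m


def coefficient_of_variation(samples: list[float]) -> float:
    """Ratio of sample standard deviation to mean, e.g. for per-frame heights.

    Assumption: the sample (n - 1) standard deviation is used here.
    """
    return statistics.stdev(samples) / statistics.mean(samples)


if __name__ == "__main__":
    # Hypothetical numbers, chosen only to exercise the functions.
    f = calibrate_focal_px(known_distance_m=3.5, char_height_m=0.07,
                           char_height_px=20.0)
    print(distance_from_char_height(f, 0.07, 14.0))
    print(coefficient_of_variation([20.1, 19.8, 20.3, 19.9]))
```

Because distance is inversely proportional to the measured pixel height, a small relative error in character height produces a similar relative error in distance, which is why frame-to-frame stability of the height measurement matters.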

Paper Structure (32 sections, 10 equations, 11 figures, 3 tables, 2 algorithms)

Figures (11)

  • Figure 1: T-MDE geometric core pipeline: plate detection, perspective correction, character segmentation, and distance calculation. Each stage can be independently evaluated and improved.
  • Figure 2: Camera pose compensation module. Lane markings are detected and analyzed to estimate pitch and roll angles, which correct character height measurements before distance calculation.
  • Figure 3: Hybrid geometric and deep-learning fusion architecture. A monocular depth network provides an independent estimate that is scale-aligned using the geometric estimate, and both are fused through a Kalman filter with uncertainty weighting.
  • Figure 4: Temporal processing with optical flow tracking and Kalman filtering. Optical flow provides tracking through brief occlusions, while the Kalman filter smooths distance estimates and computes relative velocity.
  • Figure 5: License plate used in the validation experiment.
  • ...and 6 more figures
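
The temporal stage described for Figure 4 (smoothing distance estimates and computing relative velocity with a Kalman filter) can be sketched as a constant-velocity filter over the state [distance, velocity]. This is a hedged illustration, not the paper's filter. The class name, the diagonal process noise, and all default noise values are assumptions, and the optical flow tracking is omitted.

```python
class ConstantVelocityKalman:
    """Two-state Kalman filter: x = [distance (m), relative velocity (m/s)].

    Only distance is measured. All noise parameters are illustrative
    assumptions, not values from the paper.
    """

    def __init__(self, initial_distance_m: float,
                 meas_var: float = 0.05 ** 2,
                 q_dist: float = 1e-4, q_vel: float = 1e-3):
        self.d = initial_distance_m
        self.v = 0.0
        # Covariance entries; velocity starts out highly uncertain.
        self.p00, self.p01, self.p10, self.p11 = 1.0, 0.0, 0.0, 10.0
        self.r = meas_var
        self.q_dist, self.q_vel = q_dist, q_vel

    def predict(self, dt: float) -> None:
        # State transition F = [[1, dt], [0, 1]].
        self.d += self.v * dt
        p00, p01, p10, p11 = self.p00, self.p01, self.p10, self.p11
        # P = F P F^T + Q, with Q assumed diagonal.
        self.p00 = p00 + dt * (p01 + p10) + dt * dt * p11 + self.q_dist
        self.p01 = p01 + dt * p11
        self.p10 = p10 + dt * p11
        self.p11 = p11 + self.q_vel

    def update(self, measured_distance_m: float) -> None:
        # Measurement matrix H = [1, 0]: the innovation is on distance only.
        innovation = measured_distance_m - self.d
        s = self.p00 + self.r
        k0, k1 = self.p00 / s, self.p10 / s
        self.d += k0 * innovation
        self.v += k1 * innovation
        p00, p01, p10, p11 = self.p00, self.p01, self.p10, self.p11
        # P = (I - K H) P
        self.p00 = (1.0 - k0) * p00
        self.p01 = (1.0 - k0) * p01
        self.p10 = p10 - k1 * p00
        self.p11 = p11 - k1 * p01

    def step(self, measured_distance_m: float, dt: float) -> tuple[float, float]:
        """Run one predict/update cycle and return (distance, velocity)."""
        self.predict(dt)
        self.update(measured_distance_m)
        return self.d, self.v


if __name__ == "__main__":
    # Synthetic target receding at 1 m/s, sampled every 0.1 s.
    kf = ConstantVelocityKalman(initial_distance_m=10.0)
    for k in range(1, 31):
        distance, velocity = kf.step(10.0 + 0.1 * k, dt=0.1)
    print(distance, velocity)
```

During a brief occlusion, one option under this sketch is to call `predict` without `update`, so the estimate coasts on the last velocity while its covariance grows.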