Camera-Based Localization and Enhanced Normalized Mutual Information

Vishnu Teja Kunde, Jean-Francois Chamberland, Siddharth Agarwal

TL;DR

This work addresses fine-grained, image-based localization for autonomous vehicles using inexpensive cameras by accounting for the nonuniform noise introduced by perspective projection. It introduces a weighted generalized inner product (GIP) formulation for maximum likelihood detection and extends normalized mutual information (NMI) to Enhanced Normalized Mutual Information (ENMI), which incorporates noise in the posterior distribution. Through simulations, the authors show that GIP$_{2D}$ and ENMI$_{2D}$ outperform the traditional standard inner product (SIP) and NMI, with ENMI$_{2D}$ offering the strongest robustness to noise and illumination variations. The findings suggest practical, software-only upgrades to localization pipelines, with potential extensions to multi-camera systems, LiDAR, and sequential imagery for improved real-world autonomous driving performance.

Abstract

Robust and fine localization algorithms are crucial for autonomous driving. For the production of such vehicles as a commodity, affordable sensing solutions and reliable localization algorithms must be designed. This work considers scenarios where the sensor data comes from images captured by an inexpensive camera mounted on the vehicle and where the vehicle contains a fine global map. Such localization algorithms typically involve finding the section in the global map that best matches the captured image. In harsh environments, both the global map and the captured image can be noisy. Because of physical constraints on camera placement, the image captured by the camera can be viewed as a noisy, perspective-transformed version of the road in the global map. Thus, an optimal algorithm should take into account the unequal noise power in various regions of the captured image and the intrinsic uncertainty in the global map due to environmental variations. This article briefly reviews two matching methods: (i) standard inner product (SIP) and (ii) normalized mutual information (NMI). It then proposes novel and principled modifications that significantly improve the performance of these algorithms in noisy environments. These enhancements are inspired by the physical constraints associated with autonomous vehicles. They are grounded in statistical signal processing and, in some contexts, are provably better. Numerical simulations demonstrate the effectiveness of such modifications.
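
As a minimal sketch of the matching step described above, the following Python compares a plain inner-product score with a noise-weighted score on a one-dimensional toy map. The weighted score is the textbook maximum likelihood rule for independent Gaussian noise with known, unequal per-pixel variances. It is meant only to illustrate why unequal noise power matters and is not the paper's GIP formula, which this summary does not reproduce. The function names, the toy map, the image values, and the variances are all hypothetical.

```python
# Illustrative sketch only: names, toy data, and the noise model are assumptions.

def sip_score(patch, image):
    """Plain inner product between a map section and the captured image."""
    return sum(p * y for p, y in zip(patch, image))


def weighted_ml_score(patch, image, noise_var):
    """Negative variance-weighted squared error (larger is better).

    For independent Gaussian noise with per-pixel variance noise_var[i],
    maximizing this score is the maximum likelihood choice when every
    candidate location is equally likely a priori.
    """
    return -sum((y - p) ** 2 / v for p, y, v in zip(patch, image, noise_var))


def localize(global_map, image, score):
    """Return the offset of the map section with the highest score."""
    n = len(image)
    offsets = range(len(global_map) - n + 1)
    return max(offsets, key=lambda l: score(global_map[l:l + n], image))


# Toy data: the true section starts at offset 3, i.e. [0, 1, 1, 0].
global_map = [0, 1, 0, 0, 1, 1, 0, 1, 0, 0]
image = [0.0, 1.1, 0.9, 1.8]        # last pixel is badly corrupted
noise_var = [0.01, 0.01, 0.01, 4.0]  # last pixel is known to be unreliable

est_sip = localize(global_map, image, sip_score)
est_ml = localize(
    global_map, image, lambda patch, y: weighted_ml_score(patch, y, noise_var)
)
print("SIP estimate:", est_sip)
print("weighted ML estimate:", est_ml)
```

In this toy case the unweighted score is drawn to offset 4 by the corrupted last pixel, while the weighted score discounts that pixel and recovers offset 3.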

Paper Structure

This paper contains 12 sections, 3 theorems, 41 equations, 11 figures, 1 table.

Key Result

Proposition 1

Let the a priori probability of the true location be uniformly distributed on $[L]$. Then, the maximum likelihood location estimate is given by

Figures (11)

  • Figure 1: This notional diagram shows the (standard) alignment between the external frame of reference of the camera (right) whose origin is at the pinhole, and the internal focal plane of the camera (left).
  • Figure 2: The above figure illustrates the coordinate system of the view in front of the camera $(x, y, z)$ and the (planar) 2D coordinate system of the road $(\bar{x}, \bar{y})$.
  • Figure 3: The fundamental triangles that govern the projection of points on the road to the focal plane of the camera. The two right triangles share a hypotenuse, which can be leveraged to compute pertinent quantities.
  • Figure 4: The left diagram shows an arbitrary grid of equally sized squares on the road in front of the autonomous vehicle. The right diagrams show the grid pattern's image on the focal plane of the camera. Squares with the same physical area do not project onto equally sized sections of the focal plane. The two diagrams on the right are for different values of $\theta$.
  • Figure 5: This diagram shows how the probability mass is added to the empirical joint distribution in NMI. In the example above, the selected tiles yield $(1,2)$ and, hence, the corresponding location is incremented. Once all the tiles are accounted for, the matrix is normalized and the empirical joint distribution is obtained. For ease of exposition, we restricted the space of possibilities to $\mathcal{V} = \{0, 1, 2, 3\}$.
  • ...and 6 more figures
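
The counting procedure described in the Figure 5 caption can be sketched in Python as follows. Each pair of co-located tile values (map value, image value) increments one cell of a count matrix, which is then normalized into an empirical joint distribution. The final score uses one common definition of normalized mutual information, $(H(X) + H(Y)) / H(X, Y)$. This summary does not state which normalization the paper adopts, so that choice, together with the function names and the toy tiles, is an assumption.

```python
import math

# Illustrative sketch only: names, toy tiles, and the NMI normalization
# (H(X) + H(Y)) / H(X, Y) are assumptions, not taken from the paper.

def empirical_joint(map_tiles, image_tiles, num_values):
    """Count co-located (map, image) value pairs, then normalize."""
    if len(map_tiles) != len(image_tiles):
        raise ValueError("tile sequences must have equal length")
    counts = [[0] * num_values for _ in range(num_values)]
    for u, v in zip(map_tiles, image_tiles):
        counts[u][v] += 1  # e.g. the pair (1, 2) increments cell [1][2]
    total = len(map_tiles)
    return [[c / total for c in row] for row in counts]


def entropy(probs):
    """Shannon entropy in bits, skipping zero-probability entries."""
    return -sum(p * math.log2(p) for p in probs if p > 0)


def nmi(joint):
    """Normalized mutual information under the assumed definition."""
    px = [sum(row) for row in joint]          # marginal of map values
    py = [sum(col) for col in zip(*joint)]    # marginal of image values
    hxy = entropy([p for row in joint for p in row])
    if hxy == 0:
        raise ValueError("NMI is undefined for constant tiles")
    return (entropy(px) + entropy(py)) / hxy


# Toy tiles over V = {0, 1, 2, 3}.
map_tiles = [0, 1, 2, 3, 1, 2, 0, 3]
inverted = [3, 2, 1, 0, 2, 1, 3, 0]    # same scene, intensities inverted
unrelated = [0, 0, 1, 1, 2, 2, 3, 3]   # a section that does not match

joint_match = empirical_joint(map_tiles, inverted, 4)
nmi_match = nmi(joint_match)
nmi_mismatch = nmi(empirical_joint(map_tiles, unrelated, 4))
print(nmi_match, nmi_mismatch)
```

Because the score depends only on how consistently values co-occur, the intensity-inverted tiles score higher than the unrelated ones even though no pixel values agree, which is the kind of illumination robustness that motivates mutual-information matching.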

Theorems & Definitions (4)

  • Proposition 1
  • Corollary 2
  • Corollary 3
  • Remark 4