SAMIRO: Spatial Attention Mutual Information Regularization with a Pre-trained Model as Oracle for Lane Detection

Hyunjong Lee, Jangho Lee, Jaekoo Lee

TL;DR

SAMIRO tackles lane-detection generalization under challenging real-world conditions by transferring domain-agnostic spatial cues from a large pre-trained model to a target detector. It combines CBAM-based spatial attention with a mutual-information regularization against an MIM-based oracle, enforcing alignment of normalized intermediate features via a learnable projection while maintaining low inference cost. The approach is plug-and-play across architectures and backbones, demonstrated by consistent gains on CULane, TuSimple, and LLAMAS and supported by targeted ablations showing the contribution of each component. This yields robust lane perception in clutter, varying illumination, and occlusion, with practical impact for autonomous driving systems.

Abstract

Lane detection is an important topic in future mobility solutions. Real-world environmental challenges such as background clutter, varying illumination, and occlusions pose significant obstacles to effective lane detection, particularly for data-driven approaches, which require substantial effort and cost for data collection and annotation. To address these issues, lane detection methods must leverage contextual and global information from surrounding lanes and objects. In this paper, we propose Spatial Attention Mutual Information Regularization with a pre-trained model as an Oracle, called SAMIRO. SAMIRO enhances lane detection performance by transferring knowledge from a pre-trained model while preserving domain-agnostic spatial information. Because SAMIRO is plug-and-play, we integrate it into various state-of-the-art lane detection approaches and conduct extensive experiments on the major benchmarks CULane, TuSimple, and LLAMAS. The results demonstrate that SAMIRO consistently improves performance across different models and datasets. The code will be made available upon publication.
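To make the idea of aligning normalized features through a learnable projection more concrete, here is a minimal sketch of one way such a regularizer could look. It is not the authors' implementation: the text above does not say which mutual-information estimator SAMIRO uses, so the sketch assumes an InfoNCE-style contrastive loss, a common stand-in whose minimization raises a lower bound on mutual information. The names `l2_normalize`, `project`, `info_nce_alignment_loss`, the `temperature` value, and the toy feature vectors are all hypothetical, and the CBAM spatial-attention step is omitted.

```python
import math
import random


def l2_normalize(vec, eps=1e-12):
    """Scale a feature vector to (approximately) unit L2 norm."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / (norm + eps) for x in vec]


def project(weights, vec):
    """Apply a linear projection (one output per row of `weights`)."""
    return [sum(w * x for w, x in zip(row, vec)) for row in weights]


def dot(a, b):
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def info_nce_alignment_loss(detector_feats, oracle_feats, weights, temperature=0.1):
    """InfoNCE-style loss between projected detector features and oracle features.

    ASSUMPTION: InfoNCE is an illustrative choice, not the estimator confirmed
    by the paper. For a batch of N pairs, log(N) minus this loss is a lower
    bound on the mutual information between the two feature sets.
    """
    # Project the detector features, then normalize both sides.
    z_det = [l2_normalize(project(weights, f)) for f in detector_feats]
    z_orc = [l2_normalize(f) for f in oracle_feats]

    total = 0.0
    for i, z in enumerate(z_det):
        # Similarity of sample i to every oracle feature; index i is the positive.
        logits = [dot(z, o) / temperature for o in z_orc]
        peak = max(logits)  # subtract the max for a numerically stable log-sum-exp
        log_sum_exp = peak + math.log(sum(math.exp(l - peak) for l in logits))
        total += log_sum_exp - logits[i]
    return total / len(z_det)


if __name__ == "__main__":
    random.seed(0)
    dim, batch = 8, 16
    # Toy stand-ins for oracle features (hypothetical data, not real activations).
    oracle = [[random.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(batch)]
    # A fixed identity projection; in training this matrix would be learned.
    identity = [[1.0 if r == c else 0.0 for c in range(dim)] for r in range(dim)]

    matched = info_nce_alignment_loss(oracle, oracle, identity)
    mismatched = info_nce_alignment_loss(oracle, oracle[1:] + oracle[:1], identity)
    print("loss with matched pairs:   ", matched)
    print("loss with mismatched pairs:", mismatched)
```

In this sketch the projection only touches the detector side and the oracle features are used as-is, which mirrors the general pattern of regularizing a trainable model against a frozen reference. Whether SAMIRO freezes the oracle or updates it is not stated in the text above.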

Paper Structure

This paper contains 11 sections, 3 figures, 6 tables.

Figures (3)

  • Figure 1: Illustration of transfer learning. (a) A transfer learning scenario. (b) Naïve transfer learning, which fails to address the domain gap between source and target data. (c) Continuous interaction between source and target data through SAMIRO, which leads to learning domain-agnostic knowledge.
  • Figure 2: Overall framework. (a) pre-training with masked image modeling, (b) transfer learning with SAMIRO.
  • Figure 3: Visual comparison on the CULane dataset. "Base" denotes the results of CLRerNet. The examples show challenging situations for lane recognition, including viewpoint variation, background clutter, illumination changes, and occlusions, arising from vehicles, reflections, sunlight, and nighttime conditions.