LINEA: Fast and Accurate Line Detection Using Scalable Transformers
Sebastian Janampa, Marios Pattichis
TL;DR
LINEA targets fast, accurate line segment detection for real-time use by introducing Deformable Line Attention (DLA) within a hybrid-encoder transformer. It removes the need to pretrain the attention mechanism on large datasets while achieving low latency and competitive accuracy across benchmarks, with its clearest gains in out-of-distribution tests. The approach combines a compact parameter footprint, a query-selection mechanism, and a deformable attention scheme that samples along candidate line segments to estimate their endpoints efficiently. Ablation results confirm that DLA and the related design choices are critical to both the performance gains and the rapid convergence, making LINEA suitable for time-sensitive imaging tasks such as SLAM and autonomous navigation.
Abstract
Line detection is a basic digital image processing operation used by higher-level processing methods. Recently, transformer-based methods for line detection have proven to be more accurate than methods based on CNNs, at the expense of significantly lower inference speeds. As a result, video analysis methods that require low latencies cannot benefit from current transformer-based methods for line detection. In addition, current transformer-based models require pretraining their attention mechanisms on large datasets (e.g., COCO or Objects365). This paper develops a new transformer-based method that is significantly faster and does not require pretraining the attention mechanism on large datasets. We eliminate the need for pretraining by introducing a new attention mechanism, Deformable Line Attention (DLA). We use the term LINEA to refer to our new transformer-based method based on DLA. Extensive experiments show that LINEA is significantly faster than previous models and outperforms them on sAP in out-of-distribution dataset testing.
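To make the idea of attention that samples along a line more concrete, the following is a minimal illustrative sketch, not the DLA implementation from the paper. It assumes a single (H, W, C) feature map, evenly spaced sampling points between two endpoints, and attention weights given directly as logits. The function names (`sample_points_on_segment`, `bilinear_sample`, `line_attention`) and the parameter `num_points` are hypothetical and chosen only for this example.

```python
import numpy as np

def sample_points_on_segment(p_start, p_end, num_points=4):
    """Return num_points (x, y) locations evenly spaced from p_start to p_end.
    Even spacing is an assumption made for this sketch."""
    p_start = np.asarray(p_start, dtype=float)
    p_end = np.asarray(p_end, dtype=float)
    t = np.linspace(0.0, 1.0, num_points)[:, None]  # interpolation weights in [0, 1]
    return (1.0 - t) * p_start + t * p_end

def bilinear_sample(feature_map, points):
    """Bilinearly sample an (H, W, C) feature map at fractional (x, y) points."""
    h, w, _ = feature_map.shape
    x = np.clip(points[:, 0], 0, w - 1)
    y = np.clip(points[:, 1], 0, h - 1)
    x0 = np.floor(x).astype(int)
    y0 = np.floor(y).astype(int)
    x1 = np.minimum(x0 + 1, w - 1)
    y1 = np.minimum(y0 + 1, h - 1)
    wx = (x - x0)[:, None]
    wy = (y - y0)[:, None]
    top = (1 - wx) * feature_map[y0, x0] + wx * feature_map[y0, x1]
    bottom = (1 - wx) * feature_map[y1, x0] + wx * feature_map[y1, x1]
    return (1 - wy) * top + wy * bottom

def line_attention(feature_map, p_start, p_end, logits):
    """Aggregate features sampled along a segment using softmax-normalized weights.
    In a trained model the logits would be predicted from a query; here they are inputs."""
    logits = np.asarray(logits, dtype=float)
    points = sample_points_on_segment(p_start, p_end, num_points=len(logits))
    feats = bilinear_sample(feature_map, points)      # (num_points, C)
    weights = np.exp(logits - logits.max())
    weights /= weights.sum()                          # softmax over sampling points
    return weights @ feats                            # (C,)

if __name__ == "__main__":
    # Toy feature map whose single channel stores the x coordinate of each pixel.
    fmap = np.tile(np.arange(4, dtype=float)[None, :, None], (4, 1, 1))
    print(line_attention(fmap, (0, 0), (3, 3), logits=[0, 0, 0, 0]))
```

In this toy setup, the sampling locations are tied to the geometry of a candidate segment rather than to a free-form set of offsets, which is the intuition the abstract attributes to DLA. How the actual method parameterizes offsets, heads, and feature scales is not specified here.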
