Keypoint Detection and Description for Raw Bayer Images
Jiakai Lin, Jinchang Zhang, Guoyu Lu
TL;DR
This paper removes the dependence of keypoint detection and local feature description on the Image Signal Processor (ISP) by operating directly on raw Bayer images. It proposes two specialized Bayer convolution kernels and a two-branch encoder that produce a 256-dimensional pixel-wise descriptor and per-pixel scores without demosaicing, enabling matching directly on raw data. In HPatches-based experiments on raw Bayer inputs, the method achieves higher repeatability and homography-estimation accuracy than state-of-the-art RGB-based methods, particularly under rotation and scale changes. The work highlights the practical value of ISP-free, resource-efficient feature extraction for robotics and lays the groundwork for real-time raw-image pipelines in constrained environments.
Abstract
Keypoint detection and local feature description are fundamental tasks in robotic perception, critical for applications such as SLAM, robot localization, feature matching, pose estimation, and 3D mapping. While existing methods predominantly operate on RGB images, we propose a novel network that directly processes raw images, bypassing the need for the Image Signal Processor (ISP). This approach significantly reduces hardware requirements and memory consumption, which is crucial for robotic vision systems. Our method introduces two custom-designed convolutional kernels capable of performing convolutions directly on raw images, preserving inter-channel information without converting to RGB. Experimental results show that our network outperforms existing algorithms on raw images, achieving higher accuracy and stability under large rotations and scale variations. This work represents the first attempt to develop a keypoint detection and feature description network specifically for raw images, offering a more efficient solution for resource-constrained environments.
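To illustrate what operating on a raw mosaic without converting to RGB can look like at the data level, the following is a minimal sketch that separates a single-channel Bayer frame into its four colour-site planes. It is not the paper's custom convolution kernels, which the abstract does not specify. The function name `split_bayer`, the default RGGB layout, and the list-of-rows image format are assumptions made for illustration only.

```python
def split_bayer(mosaic, pattern="RGGB"):
    """Split a single-channel Bayer mosaic into four half-resolution planes.

    `mosaic` is a list of equal-length rows with even height and width.
    `pattern` names the colours of the 2x2 tile in row-major order
    (assumed RGGB here; real sensors may use BGGR, GRBG, or GBRG).
    """
    if len(pattern) != 4:
        raise ValueError("pattern must name four sites")
    height, width = len(mosaic), len(mosaic[0])
    if height % 2 or width % 2:
        raise ValueError("mosaic height and width must be even")

    planes = {}
    seen = {}
    for index, colour in enumerate(pattern):
        # Offset of this site inside the repeating 2x2 tile.
        dy, dx = divmod(index, 2)
        # Number repeated colours (the two greens) so keys stay unique.
        seen[colour] = seen.get(colour, 0) + 1
        if pattern.count(colour) == 1:
            key = colour
        else:
            key = f"{colour}{seen[colour]}"
        # Take every second row and column starting at the site offset.
        planes[key] = [row[dx::2] for row in mosaic[dy::2]]
    return planes


# Hypothetical 4x4 raw frame; the values are placeholders, not sensor data.
raw = [
    [1, 2, 3, 4],
    [5, 6, 7, 8],
    [9, 10, 11, 12],
    [13, 14, 15, 16],
]
planes = split_bayer(raw)
```

Keeping the four sites as separate planes means each colour sample stays tied to the filter it was measured through, so no interpolated RGB values are introduced. A network working on raw data still has to account for the spatial offset between the planes, which is the kind of problem Bayer-specific kernels are meant to handle.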
