DRKF: Distilled Rotated Kernel Fusion for Efficient Rotation Invariant Descriptors in Local Feature Matching
Ranran Huang, Jiancheng Cai, Chao Li, Zhuoyuan Wu, Xinmin Liu, Zhenhua Chai
TL;DR
This work tackles rotation variation in local feature matching. It introduces Rotated Kernel Fusion (RKF), which rotates and fuses convolution kernels to build rotation robustness directly into the CNN, and Multi-oriented Feature Aggregation (MOFA), which serves as a training-time teacher to further improve robustness. A knowledge-distillation framework yields the distilled model, DRKF; because re-parameterization fuses the rotated kernels, inference costs the same as a standard CNN. The method shows strong rotation robustness on rotated HPatches and achieves state-of-the-art Mean Average Accuracy on the DiverseBEV aerial dataset, while remaining efficient on embedded hardware. Overall, DRKF is a practical approach to learning rotation-invariant descriptors that handles viewpoint and rotation changes and deploys efficiently.
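The claim that fused kernels add no inference cost rests on the linearity of convolution: averaging the outputs of several branches, each using a rotated copy of one kernel, gives the same result as a single convolution with the averaged kernel. The sketch below illustrates only that identity. It assumes rotations by multiples of 90 degrees and simple averaging as the fusion rule; neither is confirmed by the text above, and the helper names `correlate2d_valid` and `fuse_rotated_kernels` are hypothetical.

```python
import numpy as np


def correlate2d_valid(image, kernel):
    """Plain 'valid' 2-D cross-correlation, the operation a CNN conv layer applies."""
    kh, kw = kernel.shape
    oh, ow = image.shape[0] - kh + 1, image.shape[1] - kw + 1
    out = np.empty((oh, ow))
    for i in range(oh):
        for j in range(ow):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out


def fuse_rotated_kernels(kernel):
    """Average a kernel with its 90, 180 and 270 degree rotations (assumed fusion rule)."""
    return np.mean([np.rot90(kernel, k) for k in range(4)], axis=0)


rng = np.random.default_rng(0)
image = rng.standard_normal((8, 8))
kernel = rng.standard_normal((3, 3))

# Training-time view: four parallel branches, one per rotated kernel, outputs averaged.
multi_branch = np.mean(
    [correlate2d_valid(image, np.rot90(kernel, k)) for k in range(4)], axis=0
)

# Inference-time view: one convolution with the single fused kernel.
fused_kernel = fuse_rotated_kernels(kernel)
single_branch = correlate2d_valid(image, fused_kernel)

# The two views agree up to floating-point error.
branches_match = np.allclose(multi_branch, single_branch)
```

Under these assumptions the fused kernel has the same shape as the original one, so the deployed layer performs exactly one convolution per kernel, as an ordinary CNN layer would.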
Abstract
The performance of local feature descriptors degrades under large rotation variations. To address this issue, we present an efficient approach to learning rotation-invariant descriptors. Specifically, we propose Rotated Kernel Fusion (RKF), which applies rotations to the convolution kernels to improve the inherent rotation robustness of the CNN. Since RKF can be folded away by subsequent re-parameterization, it introduces no extra computational cost at inference. Moreover, we present Multi-oriented Feature Aggregation (MOFA), which aggregates features extracted from multiple rotated versions of the input image and provides auxiliary knowledge for training RKF through distillation. We refer to the distilled RKF model as DRKF. In addition to evaluating on a rotation-augmented version of the public HPatches dataset, we contribute a new dataset, DiverseBEV, collected during drone flights and consisting of bird's-eye-view images with large viewpoint changes and camera rotations. Extensive experiments show that our method outperforms other state-of-the-art techniques under large rotation variations.
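The idea of aggregating features over rotated inputs and using the result as a distillation target can be sketched as follows. This is a minimal illustration, not the paper's architecture: it assumes four rotations by multiples of 90 degrees, averaging as the aggregation rule, and a mean-squared-error distillation loss, and it uses a hypothetical `toy_extractor` in place of a real descriptor network.

```python
import numpy as np


def toy_extractor(image):
    """Stand-in feature extractor (hypothetical): absolute horizontal differences."""
    return np.abs(image - np.roll(image, 1, axis=1))


def multi_oriented_features(image, extractor):
    """Extract features from four rotated copies, undo each rotation, then average."""
    aligned = []
    for k in range(4):
        feat = extractor(np.rot90(image, k))
        aligned.append(np.rot90(feat, -k))  # bring features back to the input frame
    return np.mean(aligned, axis=0)


def distillation_loss(student, teacher):
    """Mean-squared error between student and teacher feature maps (assumed loss)."""
    return float(np.mean((student - teacher) ** 2))


rng = np.random.default_rng(0)
image = rng.standard_normal((16, 16))
rotated = np.rot90(image)

# The plain extractor is direction-dependent, so rotating the input changes its output
# by more than a rotation.
plain_equivariant = np.allclose(
    toy_extractor(rotated), np.rot90(toy_extractor(image))
)

# The aggregated "teacher" features rotate together with the input.
teacher = multi_oriented_features(image, toy_extractor)
teacher_equivariant = np.allclose(
    multi_oriented_features(rotated, toy_extractor), np.rot90(teacher)
)

# Training signal: pull the single-pass student features toward the teacher's.
student = toy_extractor(image)
loss = distillation_loss(student, teacher)
```

The teacher needs one forward pass per orientation, which is why a scheme of this kind is attractive only at training time; the student it supervises runs a single pass.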
