BEBLID: Boosted efficient binary local image descriptor

Iago Suárez, Ghesn Sfeir, José M. Buenaposada, Luis Baumela

TL;DR

The paper addresses the need for fast, accurate local feature descriptors on resource-constrained devices by introducing BEBLID, a binary descriptor learned with AdaBoost using unbalanced training and a fast Thresholded Average Box weak learner. BEBLID builds on BELID by binarizing a real-valued precursor and enforcing equal weak-learner weights, achieving accuracy close to SIFT with substantially better efficiency than ORB. Extensive HPatches evaluations across verification, matching, and retrieval show BEBLID outperforming other binary descriptors while offering real-time performance on mobile and embedded platforms. The work demonstrates a practical, scalable approach for real-time matching in mobile robotics, SLAM, and related vision tasks.

Abstract

Efficient matching of local image features is a fundamental task in many computer vision applications. However, the real-time performance of top matching algorithms is compromised in computationally limited devices, such as mobile phones or drones, due to the simplicity of their hardware and their finite energy supply. In this paper we introduce BEBLID, an efficient learned binary image descriptor. It improves our previous real-valued descriptor, BELID, making it both more efficient for matching and more accurate. To this end we use AdaBoost with an improved weak-learner training scheme that produces better local descriptions. Further, we binarize our descriptor by forcing all weak-learners to have the same weight in the strong learner combination and train it in an unbalanced data set to address the asymmetries arising in matching and retrieval tasks. In our experiments BEBLID achieves an accuracy close to SIFT and better computational efficiency than ORB, the fastest algorithm in the literature.

Paper Structure (14 sections, 6 equations, 5 figures, 3 tables, 1 algorithm)

Figures (5)

  • Figure 1: Visualization of BELID and BEBLID pixel location sampling pairs (left) and spatial weight heat maps (right) trained on the Liberty patches data set. Both learn a well distributed set of point pairs giving more importance to the center area.
  • Figure 2: BEBLID descriptor extraction workflow. To describe an image patch, BEBLID efficiently calculates the mean gray value of the pixels in the red and blue boxes. For each pair of red-blue boxes it subtracts their average values obtaining $\mathbf{f}(\mathbf{x})$, the WL. It then thresholds $\mathbf{f}(\mathbf{x})$ to obtain $\mathbf{h}(\mathbf{x})$ and the binary descriptor $\mathbf{D}(\mathbf{x})=\mathbf{h}(\mathbf{x})\geq 0$.
  • Figure 3: ROC Curve for the verification task in the Brown data sets. We compare BoostedSSC with AdaBoost selecting in each iteration the best WL of a random selection (BELID-U-ADA-Rand), or by exhaustively searching for some of the WL parameters (BELID-U-ADA), or further normalizing the weights of the positive and negative classes.
  • Figure 4: BEBLID learning rate selection experiment. We show the mAP for verification, matching and retrieval in the "full" split of HPatches for models trained with different learning rates, $\gamma$, in the Liberty data set.
  • Figure 5: Comparison of the state-of-the-art descriptors in the "full" split of the HPatches data set. The marker color indicates the noise level: EASY, HARD, and TOUGH. INTRA markers in the Patch Verification task correspond to patch pairs obtained from the same sequence, whereas INTER markers correspond to pairs from different sequences. In the Image Matching task, the VIEWP markers refer to scenes with viewpoint distortions and the ILLUM markers to scenes with illumination changes. The bar length represents the mean of the six variants of each task.
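
The extraction workflow described in the Figure 2 caption can be sketched in a few lines of Python. This is an illustrative sketch, not the authors' implementation. The function names, the top-left box parametrization, and the sign convention of the threshold (a weak learner outputs +1 when the box difference is at or below its threshold, and -1 otherwise) are assumptions made for the example. An integral image is used here as one common way to obtain box means in constant time.

```python
import numpy as np

def integral_image(patch):
    """Summed-area table with an extra zero row and column at the top-left."""
    ii = np.zeros((patch.shape[0] + 1, patch.shape[1] + 1), dtype=np.float64)
    ii[1:, 1:] = patch.astype(np.float64).cumsum(axis=0).cumsum(axis=1)
    return ii

def box_mean(ii, x, y, s):
    """Mean gray value of the s-by-s box whose top-left pixel is (x, y).

    The top-left parametrization is an assumption made for this sketch.
    """
    total = ii[y + s, x + s] - ii[y, x + s] - ii[y + s, x] + ii[y, x]
    return total / (s * s)

def describe(patch, weak_learners):
    """Compute one descriptor bit per weak learner.

    Each weak learner is a hypothetical tuple ((x1, y1), (x2, y2), s, t):
    two box positions, a shared box side s, and a threshold t.
    """
    ii = integral_image(patch)
    bits = []
    for (x1, y1), (x2, y2), s, t in weak_learners:
        # f(x): difference between the mean gray values of the two boxes.
        f = box_mean(ii, x1, y1, s) - box_mean(ii, x2, y2, s)
        # h(x): thresholded response in {-1, +1} (assumed sign convention).
        h = 1 if f <= t else -1
        # D(x) = h(x) >= 0, stored as a 0/1 bit.
        bits.append(1 if h >= 0 else 0)
    return np.array(bits, dtype=np.uint8)

# Toy 4x4 patch: dark left half, bright right half.
patch = np.zeros((4, 4))
patch[:, 2:] = 10
weak_learners = [
    ((0, 0), (2, 0), 2, 0.0),  # dark box minus bright box: f = -10
    ((2, 2), (0, 2), 2, 0.0),  # bright box minus dark box: f = +10
]
descriptor = describe(patch, weak_learners)
```

With the integral image computed once per patch, each bit costs a fixed number of lookups regardless of box size, which is in keeping with the efficiency the captions emphasize. The learned box positions, sizes, and thresholds would come from the boosting stage, which this sketch does not cover.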