Region Feature Descriptor Adapted to High Affine Transformations
Shaojie Zhang, Yinghui Wang, Bin Nan, Wei Li, Jinlong Yang, Tao Yan, Yukai Wang, Liangyi Huang, Mingfeng Wang, Ibragim R. Atadjanov
TL;DR
The paper tackles the decline in grayscale feature descriptor performance under high affine transformations, particularly tilt, which degrades matching accuracy. It introduces a region-based descriptor that classifies images by affine degree to simulate affine changes and generate multiple views. Region information is added through MSER-based segmentation, CLAHE and bilateral-filter enhancement, a grayscale region histogram, and coordinates normalized relative to the region centroid, and is fused with the original descriptor via weights $\alpha_1$ and $\alpha_2$. The method employs an ASIFT-inspired affine decomposition focusing on non-uniform scaling parameters $\phi$ and tilt $\theta$, and is validated on standard and simulated datasets using both homography and fundamental matrix criteria. Results show improved precision and robustness, especially for larger affine variations, and the descriptor can be integrated with existing descriptors such as SIFT, SURF, ORB, AKAZE, and BRISK, at an additional computational cost that comes primarily from region segmentation. Overall, the work presents a practical framework for improving descriptor reliability under affine transformations, with guidance for parameter settings and demonstrated benefits for 3D reconstruction workflows.
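As a rough illustration of the ASIFT-style decomposition mentioned above, the sketch below builds the 2x2 matrix $T_t R(\phi)$ with tilt $t = 1/\cos\theta$ and enumerates a grid of simulated views. It follows the conventions of the original ASIFT method (rotation angle $\phi$, latitude $\theta$, angular step $72^\circ/t$), which are assumptions here and may not match the paper's own use of the symbols. The function names `affine_matrix` and `sample_views` are hypothetical.

```python
import math

def affine_matrix(theta_deg, phi_deg):
    """Return the 2x2 matrix T_t @ R(phi), with tilt t = 1 / cos(theta).

    Assumes the ASIFT convention: R(phi) is an in-plane rotation and
    T_t = diag(t, 1) is the non-uniform (tilt) scaling.
    """
    theta = math.radians(theta_deg)
    phi = math.radians(phi_deg)
    t = 1.0 / math.cos(theta)
    c, s = math.cos(phi), math.sin(phi)
    # diag(t, 1) applied to [[c, -s], [s, c]]
    return [[t * c, -t * s],
            [s, c]]

def sample_views(tilts, base_step_deg=72.0):
    """Enumerate (tilt, phi_deg) pairs on an ASIFT-like grid.

    For each tilt t > 1, phi runs over 0, b/t, 2b/t, ... below 180 degrees,
    where b is base_step_deg. The untilted view (t == 1) needs no rotation.
    """
    views = []
    for t in tilts:
        if t == 1.0:
            views.append((t, 0.0))
            continue
        step = base_step_deg / t
        k = 0
        while k * step < 180.0:
            views.append((t, k * step))
            k += 1
    return views
```

For example, `sample_views((1.0, 2.0))` yields the untilted view plus five rotations at tilt 2, and the determinant of `affine_matrix(theta, phi)` equals the tilt, since the rotation preserves area.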
Abstract
Feature descriptors represent grayscale feature information poorly when images undergo high affine transformations, which causes feature matching accuracy to decline rapidly. To address this, this paper proposes a region feature descriptor based on classification-driven simulation of affine transformations. The method first categorizes images by their degree of affine distortion, simulates the corresponding affine transformations, and generates a new set of images. It then computes neighborhood information for the feature points on this new image set. Finally, the descriptor is generated by combining the grayscale histogram of the maximally stable extremal region (MSER) to which the feature point belongs with the feature point's normalized position relative to the grayscale centroid of that region. Experiments comparing feature matching metrics under affine transformation scenarios show that the proposed descriptor achieves higher precision and robustness than existing classical descriptors. It also remains robust when integrated with other descriptors.
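The final descriptor step can be pictured with a minimal sketch: given the pixels of the region containing a feature point, compute a normalized grayscale histogram, the intensity-weighted centroid, and the feature point's offset from that centroid, then join the result to a base descriptor using the weights $\alpha_1$ and $\alpha_2$ from the TL;DR. Several details are assumptions made for illustration only: the region is passed in as a non-empty list of `(x, y, gray)` tuples rather than extracted with MSER, the offset is normalized by the region's bounding-box size, and fusion is a weighted concatenation. All function names are hypothetical.

```python
def region_histogram(grays, bins=8):
    """Normalized histogram of integer gray values in [0, 255]."""
    hist = [0.0] * bins
    for g in grays:
        hist[min(g * bins // 256, bins - 1)] += 1.0
    total = sum(hist)
    return [h / total for h in hist] if total else hist

def gray_centroid(pixels):
    """Intensity-weighted centroid of (x, y, gray) pixels.

    Falls back to the plain geometric centroid if all gray values are 0.
    """
    weight = sum(g for _, _, g in pixels)
    if weight == 0:
        n = len(pixels)
        return (sum(x for x, _, _ in pixels) / n,
                sum(y for _, y, _ in pixels) / n)
    return (sum(x * g for x, _, g in pixels) / weight,
            sum(y * g for _, y, g in pixels) / weight)

def normalized_position(keypoint, pixels):
    """Keypoint offset from the gray centroid, scaled by bounding-box size
    (the scaling choice is an assumption, not taken from the paper)."""
    cx, cy = gray_centroid(pixels)
    xs = [x for x, _, _ in pixels]
    ys = [y for _, y, _ in pixels]
    width = max(xs) - min(xs) + 1
    height = max(ys) - min(ys) + 1
    return ((keypoint[0] - cx) / width, (keypoint[1] - cy) / height)

def fuse(base, pixels, keypoint, alpha1=0.5, alpha2=0.5, bins=8):
    """Weighted concatenation of a base descriptor and the region part
    (concatenation is an assumed fusion rule)."""
    hist = region_histogram([g for _, _, g in pixels], bins)
    pos = normalized_position(keypoint, pixels)
    region_part = hist + list(pos)
    return [alpha1 * v for v in base] + [alpha2 * v for v in region_part]
```

In practice the base vector would come from an existing descriptor such as SIFT or ORB, and the region pixels from an MSER detector. The sketch only shows how the histogram and centroid-relative position could be combined into one vector.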
