Complementary Frequency-Varying Awareness Network for Open-Set Fine-Grained Image Recognition
Qiulei Dong, Jiayin Sun, Mengyu Gao
TL;DR
The paper tackles open-set fine-grained image recognition by designing CFAN, a three-module network that captures both high- and low-frequency information through a frequency-adjustable filtering mechanism and dual LSTM-based temporal fusion. CFAN-OSFGR applies this frequency-aware feature extraction to open-set recognition, achieving superior performance across multiple fine- and coarse-grained datasets and settings. The key contributions include a novel frequency-adjustable filter, a CFAN architecture for frequency-aware feature learning, and strong empirical results plus comprehensive ablations. This work demonstrates that balancing frequency content in features improves robustness to unknown classes and enhances discriminability in fine-grained open-set scenarios.
Abstract
Open-set image recognition is a challenging topic in computer vision. Most of the existing works in literature focus on learning more discriminative features from the input images, however, they are usually insensitive to the high- or low-frequency components in features, resulting in a decreasing performance on fine-grained image recognition. To address this problem, we propose a Complementary Frequency-varying Awareness Network that could better capture both high-frequency and low-frequency information, called CFAN. The proposed CFAN consists of three sequential modules: (i) a feature extraction module is introduced for learning preliminary features from the input images; (ii) a frequency-varying filtering module is designed to separate out both high- and low-frequency components from the preliminary features in the frequency domain via a frequency-adjustable filter; (iii) a complementary temporal aggregation module is designed for aggregating the high- and low-frequency components via two Long Short-Term Memory networks into discriminative features. Based on CFAN, we further propose an open-set fine-grained image recognition method, called CFAN-OSFGR, which learns image features via CFAN and classifies them via a linear classifier. Experimental results on 3 fine-grained datasets and 2 coarse-grained datasets demonstrate that CFAN-OSFGR performs significantly better than 9 state-of-the-art methods in most cases.
