Adaptive Confidence Multi-View Hashing for Multimedia Retrieval
Jian Zhu, Yu Cui, Zhangmin Huang, Xingyu Li, Lei Liu, Lingfang Zeng, Li-Rong Dai
TL;DR
This paper addresses noisy single-view features and the lack of confidence-aware fusion in multi-view multimedia retrieval by introducing Adaptive Confidence Multi-View Hashing (ACMVH). ACMVH combines per-view confidence networks, an adaptive confidence fusion mechanism, and a dilation network to produce $k$-bit hash codes. It is trained with the joint objective $L_{total} = L_{sim} + \mu L_{clf}$, which adds a classification loss, weighted by $\mu$, to a similarity loss. On MIR-Flickr25K and NUS-WIDE, ACMVH improves mean Average Precision by up to $3.24\%$ over state-of-the-art baselines, and ablation studies confirm that the confidence and fusion modules are critical to these gains. Convergence analysis shows stable training and generalization, which supports the method's use in noise-robust, semantically expressive multimedia retrieval. Future work aims to sustain the gains with longer hash codes and to further improve cross-view representation learning, building on the confidence-driven fusion framework.
Abstract
Multi-view hashing converts heterogeneous data from multiple views into binary hash codes and is one of the critical technologies in multimedia retrieval. However, current methods mainly explore the complementarity among views and lack confidence learning and fusion. Moreover, in practical application scenarios, single-view data contain redundant noise. To perform confidence learning and eliminate unnecessary noise, we propose a novel Adaptive Confidence Multi-View Hashing (ACMVH) method. First, a confidence network is developed to extract useful information from each single-view feature and remove noise. Second, an adaptive confidence multi-view network measures the confidence of each view and then fuses the multi-view features through a weighted summation. Lastly, a dilation network is designed to further enhance the representation of the fused features. To the best of our knowledge, we pioneer the application of confidence learning to the field of multimedia retrieval. Extensive experiments on two public datasets show that the proposed ACMVH outperforms state-of-the-art methods (maximum increase of 3.24%). The source code is available at https://github.com/HackerHyper/ACMVH.
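
The weighted-summation fusion described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say how confidences are computed or normalized, so the per-view scores below are given as fixed inputs and normalized with a softmax, both of which are assumptions. The function names (`fuse_views`, `binarize`), the toy feature values, and the sign-based binarization are likewise hypothetical and only show how a confidence-weighted sum of view features can be turned into a binary code.

```python
import math


def softmax(scores):
    """Normalize raw scores into positive weights that sum to 1."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def fuse_views(view_features, view_scores):
    """Confidence-weighted sum of per-view feature vectors for one sample.

    view_features: list of equal-length feature vectors, one per view.
    view_scores:   one raw confidence score per view (assumed to come from
                   some learned scorer; here they are simply given).
    """
    weights = softmax(view_scores)  # assumption: softmax normalization
    dim = len(view_features[0])
    fused = [
        sum(w * feat[i] for w, feat in zip(weights, view_features))
        for i in range(dim)
    ]
    return fused, weights


def binarize(vector):
    """Map a real-valued vector to a {-1, +1} code by taking the sign."""
    return [1 if x >= 0 else -1 for x in vector]


# Toy example: two views (e.g. image and text) with 4-dimensional features.
image_feat = [0.8, -0.2, 0.5, -0.9]
text_feat = [-0.4, 0.6, 0.1, -0.3]

# The image view gets the higher raw score, so it dominates the fusion.
fused, weights = fuse_views([image_feat, text_feat], view_scores=[2.0, 0.5])
codes = binarize(fused)
```

In this toy run the image view receives the larger weight, so the fused vector, and therefore the 4-bit code, follows the image features wherever the two views disagree in sign.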
