Potential Field Based Deep Metric Learning
Shubhang Bhatnagar, Narendra Ahuja
TL;DR
PFML introduces a novel potential-field framework for deep metric learning, in which each sample acts as a charge generating attraction and repulsion fields that decay with distance. By superposing fields from embeddings and learnable proxies, PFML models global, all-pair interactions while mitigating noise via distance decay, and it is optimized by minimizing a total potential energy. Theoretical results (Proposition 1 and Corollary 1) and extensive experiments show improved robustness to label noise and closer proxy-data alignment compared to non-decaying proxy-based or tuple-based methods, yielding state-of-the-art image retrieval performance on Cars-196, CUB-200-2011, and SOP. These findings position PFML as a scalable, robust alternative to tuple mining and proxy-only losses for fine-grained recognition and retrieval tasks.
Abstract
Deep metric learning (DML) involves training a network to learn a semantically meaningful representation space. Many current approaches mine n-tuples of examples and model interactions within each tuple. We present a novel, compositional DML model that, instead of modeling interactions within tuples, represents the influence of each example (embedding) by a continuous potential field and superposes these fields to obtain their combined global potential field. We use attractive/repulsive potential fields to represent interactions among embeddings from images of the same/different classes. Contrary to typical learning methods, where the mutual influence of samples is proportional to their distance, we enforce a reduction in such influence with distance, leading to a decaying field. We show that such decay helps improve performance on real-world datasets with large intra-class variations and label noise. Like other proxy-based methods, we also use proxies to succinctly represent sub-populations of examples. We evaluate our method on three standard DML benchmarks (Cars-196, CUB-200-2011, and SOP), where it outperforms state-of-the-art baselines.
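To make the idea of superposed, decaying fields concrete, the following is a minimal NumPy sketch and not the authors' implementation. The abstract does not state the functional form of the potential, so the inverse-power law 1/r^alpha, the clamp radius `delta`, and the names `total_potential_energy` and `pair_force_magnitude` are all assumptions chosen for illustration. Same-class pairs contribute a negative (attractive) potential and different-class pairs a positive (repulsive) one, so minimizing the total pulls same-class embeddings together and pushes different-class embeddings apart.

```python
import numpy as np


def total_potential_energy(embeddings, labels, alpha=1.0, delta=0.1):
    """Sum of pairwise potentials over all unordered pairs of embeddings.

    Assumed form (not taken from the paper): each pair contributes
    -1/r^alpha if the labels match (attraction) and +1/r^alpha otherwise
    (repulsion), where r is the Euclidean distance clamped below by delta
    so the potential stays finite for coincident points.
    """
    z = np.asarray(embeddings, dtype=float)
    y = np.asarray(labels)
    # Pairwise Euclidean distances via broadcasting, shape (n, n).
    dist = np.linalg.norm(z[:, None, :] - z[None, :, :], axis=-1)
    r = np.maximum(dist, delta)
    field = 1.0 / r**alpha
    # -1 for same-class pairs, +1 for different-class pairs.
    sign = np.where(y[:, None] == y[None, :], -1.0, 1.0)
    # Count each unordered pair once and skip self-interactions.
    iu = np.triu_indices(len(z), k=1)
    return float(np.sum(sign[iu] * field[iu]))


def pair_force_magnitude(r, alpha=1.0, delta=0.1):
    """Magnitude of d/dr (1/r^alpha); zero inside the clamp radius."""
    return alpha / r ** (alpha + 1) if r > delta else 0.0


if __name__ == "__main__":
    z = [[0.0, 0.0], [1.0, 0.0], [4.0, 0.0]]
    print(total_potential_energy(z, labels=[0, 0, 1]))
    # The pull or push between two samples weakens as they move apart.
    print(pair_force_magnitude(1.0), pair_force_magnitude(4.0))
```

Under these assumptions, the influence of one sample on another shrinks with distance, so a distant (possibly mislabeled) sample perturbs an embedding less than a nearby one. Learnable proxies, as described above, could enter the same sum as additional rows with their own class labels.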
