Explainable Metric Learning for Deflating Data Bias
Emma Andrews, Prabhat Mishra
TL;DR
This work addresses the lack of interpretable similarity in deep metric learning by introducing Hierarchical Explainable Metric Learning (HEML), which decomposes images into semantic segments and learns segment-specific metrics in a bottom-up fashion. A metric tree then aggregates these local decisions into a global similarity, and the distance $d_{SNR}$ serves as the primary explainability metric for interpreting each segment's contribution. Key contributions are a segmentation-based, memory-efficient framework that offers a lightweight alternative to saliency-driven approaches, and empirical results on CelebA, Human Parsing, and SceneParse150 showing competitive accuracy with substantially reduced GPU memory. The approach supports bias reduction by generating segment-informed samples and provides intrinsic explainability through hierarchical, segment-level decisions.
Abstract
Image classification is an essential part of computer vision: it assigns an input image to a category by evaluating similarity under given criteria. While deep learning models yield promising classifiers, they lack explainability, and their classification results are hard to interpret in a human-understandable way. In this paper, we present an explainable metric learning framework that constructs hierarchical levels of semantic segments of an image for better interpretability. The key methodology is a bottom-up learning strategy: we first train a local metric learning model for each individual segment and then combine segments to compose comprehensive metrics in a tree. Our approach thus enables a more human-understandable similarity measurement between two images based on their semantic segments, which can be used to generate new samples that reduce bias in a training dataset. Extensive experimental evaluation demonstrates that the proposed approach can drastically improve model accuracy compared with state-of-the-art methods.
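To make the bottom-up idea concrete, the following is a minimal sketch of how segment-level distances might be aggregated in a tree. It is not the paper's implementation, and several parts are assumptions: $d_{SNR}$ is taken to be the common signal-to-noise form, the variance of the embedding difference divided by the variance of the anchor embedding; inner nodes simply average their children; and the segment names, the tree shape, and the hand-written embeddings are hypothetical stand-ins for learned per-segment models.

```python
from dataclasses import dataclass, field
from statistics import pvariance
from typing import Dict, List, Sequence

Embeddings = Dict[str, Sequence[float]]


def snr_distance(anchor: Sequence[float], other: Sequence[float]) -> float:
    """Assumed d_SNR: variance of (other - anchor) over variance of anchor."""
    noise = [b - a for a, b in zip(anchor, other)]
    signal_var = pvariance(anchor)
    if signal_var == 0.0:
        raise ValueError("anchor embedding has zero variance")
    return pvariance(noise) / signal_var


@dataclass
class MetricNode:
    """A leaf names one semantic segment; an inner node groups child nodes."""
    name: str
    children: List["MetricNode"] = field(default_factory=list)

    def distance(self, emb_a: Embeddings, emb_b: Embeddings,
                 report: Dict[str, float]) -> float:
        if not self.children:
            # Leaf: compare the two images on this segment only.
            d = snr_distance(emb_a[self.name], emb_b[self.name])
        else:
            # Inner node: unweighted mean of the children (an assumption;
            # a learned combination could be substituted here).
            child_d = [c.distance(emb_a, emb_b, report) for c in self.children]
            d = sum(child_d) / len(child_d)
        report[self.name] = d  # keep every node's distance for inspection
        return d


# Hypothetical segment hierarchy and toy per-segment embeddings.
tree = MetricNode("face", [
    MetricNode("upper", [MetricNode("hair"), MetricNode("eyes")]),
    MetricNode("mouth"),
])
image_a = {"hair": [1.0, 2.0, 3.0], "eyes": [0.0, 1.0, 0.0], "mouth": [2.0, 0.0, 2.0]}
image_b = {"hair": [1.0, 2.0, 3.0], "eyes": [1.0, 0.0, 1.0], "mouth": [2.0, 0.0, 1.0]}

report: Dict[str, float] = {}
total = tree.distance(image_a, image_b, report)
for name, d in report.items():
    print(f"{name}: {d:.3f}")
```

Because each node records its own distance in `report`, the global dissimilarity can be traced back to the segments that drive it; in this toy input the "eyes" segment contributes the most. That per-segment trace is the sense in which the similarity is explainable here.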
