DP-Net: Learning Discriminative Parts for image recognition
Ronan Sicre, Hanwei Zhang, Julien Dejasmin, Chiheb Daaloul, Stéphane Ayache, Thierry Artières
TL;DR
DP-Net addresses interpretable, scalable image recognition by learning discriminative parts without fine-tuning a pretrained CNN. It couples a fixed backbone with a learnable part layer that computes part scores $S = U X$ and classifies from the resulting bag-of-parts representation $b$, while enforcing constraints that encourage part diversity, decisive region-to-part assignment, and class-specific part usage. The approach produces interpretable heatmaps and CAM-like explanations at both image and category levels. Experiments show strong performance gains over global representations on several datasets, along with clear visualizations of the discriminative parts. Because the backbone stays fixed, the method is suited to large-scale datasets such as ImageNet, with practical implications for model transparency and debugging.
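To make the notation $S = U X$ and $b$ concrete, here is a minimal NumPy sketch of one plausible reading. It is not the authors' implementation. The summary does not specify shapes or pooling, so the following are assumptions made for illustration: $X$ holds one $d$-dimensional descriptor per region as columns, $U$ holds one part filter per row, $b$ is obtained by max-pooling each part's scores over regions, and the classifier is a linear map `W` (a hypothetical name).

```python
import numpy as np

def bag_of_parts(U, X):
    """Compute part-to-region scores and a pooled bag-of-parts vector.

    U: (P, d) array, one part filter per row (assumed layout).
    X: (d, R) array, one region descriptor per column (assumed layout).
    """
    S = U @ X          # (P, R): score of every part on every region
    b = S.max(axis=1)  # (P,): assumed max-pooling over regions
    return S, b

def classify(b, W):
    """Linear classifier on the bag-of-parts vector (W is hypothetical)."""
    return W @ b       # (C,): one logit per class

# Toy example with random data, only to show the shapes involved.
rng = np.random.default_rng(0)
d, R, P, C = 8, 5, 4, 3
X = rng.standard_normal((d, R))   # stand-in for frozen CNN region features
U = rng.standard_normal((P, d))   # learnable part filters
W = rng.standard_normal((C, P))   # learnable classifier weights
S, b = bag_of_parts(U, X)
logits = classify(b, W)
```

Under these assumptions, only `U` and `W` would be trained while the features in `X` come from the frozen backbone, which is what keeps the approach cheap compared with fine-tuning the CNN.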
Abstract
This paper presents Discriminative Part Network (DP-Net), a deep architecture with strong interpretation capabilities, which exploits a pretrained Convolutional Neural Network (CNN) combined with a part-based recognition module. The system learns and detects parts in images that are discriminative among categories, without the need to fine-tune the CNN, making it more scalable than other part-based models. While part-based approaches naturally offer interpretable representations, we propose explanations at image and category levels and introduce specific constraints on the part learning process to make the learned parts more discriminative.
