DP-Net: Learning Discriminative Parts for image recognition

Ronan Sicre; Hanwei Zhang; Julien Dejasmin; Chiheb Daaloul; Stéphane Ayache; Thierry Artières

TL;DR

DP-Net addresses interpretable, scalable image recognition by learning discriminative parts without fine-tuning a pretrained CNN. It couples a fixed backbone with a learnable part layer: part scores are computed as $S = U X$, a bag-of-parts representation $b$ is built from them, and classification is performed on $b$. Constraints on the part layer encourage part diversity, decisive region-to-part assignment, and class-specific part usage. The approach produces heatmaps and CAM-like explanations at both image and category levels. Experiments report performance gains over global representations on several datasets, along with visualizations of the discriminative parts. Because the backbone stays fixed, the method scales to large datasets such as ImageNet and has practical value for model transparency and debugging.
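The forward pass summarized above can be illustrated with a minimal NumPy sketch. Only $S = U X$ and the use of a bag-of-parts vector $b$ come from the text; everything else here is an assumption made for illustration: the shapes, the softmax over parts for each region (one possible way to obtain a decisive region-to-part assignment), the sum pooling over regions, and the use of $V$ as a linear classifier on $b$. The function names `bag_of_parts` and `classify` are hypothetical.

```python
import numpy as np


def bag_of_parts(X, U):
    """Build a bag-of-parts vector from region descriptors.

    X: (d, R) array, one d-dimensional descriptor per region (assumed layout).
    U: (P, d) array, one row per learned part (assumed layout).
    Returns b of shape (P,).
    """
    S = U @ X  # (P, R) part-to-region scores, S = U X
    # Assumption: softmax over parts for each region, so every region
    # distributes a total weight of 1 across the parts.
    S = S - S.max(axis=0, keepdims=True)  # numerical stability
    A = np.exp(S)
    A /= A.sum(axis=0, keepdims=True)
    # Assumption: sum pooling over regions gives the bag-of-parts vector.
    return A.sum(axis=1)


def classify(b, V):
    """Assumed linear classifier: V is (C, P), returns C class scores."""
    return V @ b


rng = np.random.default_rng(0)
d, R, P, C = 8, 5, 4, 3  # hypothetical sizes
X = rng.standard_normal((d, R))
U = rng.standard_normal((P, d))
V = rng.standard_normal((C, P))

b = bag_of_parts(X, U)
scores = classify(b, V)
predicted_class = int(np.argmax(scores))
```

With this choice of pooling, the entries of `b` sum to the number of regions, which makes each entry readable as the soft count of regions assigned to that part.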

Abstract

This paper presents the Discriminative Part Network (DP-Net), a deep architecture with strong interpretation capabilities, which combines a pretrained Convolutional Neural Network (CNN) with a part-based recognition module. The system learns and detects parts in images that are discriminative among categories, without the need for fine-tuning the CNN, making it more scalable than other part-based models. While part-based approaches naturally offer interpretable representations, we propose explanations at image and category levels and introduce specific constraints on the part learning process to make the parts more discriminative.

Paper Structure (8 sections, 2 equations, 3 figures, 2 tables)

Introduction
Previous works
Discriminative Parts NN (DP-Net)
DP-Net architecture
Learning
Interpretability strategies
Experiments
Conclusion

Figures (3)

Figure 1: The proposed architecture and its learned parameters $U$ and $V$.
Figure 2: Three most important parts for the class Casino.
Figure 3: Heatmap illustrations on test images of the classes Artstudio, Computer room, Casino and Kindergarden (top row), and the most discriminative region used to classify bird test images (second row).
