DP-Net: Learning Discriminative Parts for image recognition

Ronan Sicre, Hanwei Zhang, Julien Dejasmin, Chiheb Daaloul, Stéphane Ayache, Thierry Artières

TL;DR

DP-Net addresses interpretable, scalable image recognition by learning discriminative parts without fine-tuning a pretrained CNN. It couples a fixed backbone with a learnable part layer: region features $X$ are scored against the part matrix $U$ via $S = U X$, the scores are aggregated into a bag-of-parts representation $b$, and classification is performed on $b$. Constraints on the part layer encourage part diversity, decisive region-to-part assignment, and class-specific part usage. The approach produces interpretable heatmaps and CAM-like explanations at both image and category levels; experiments show strong performance gains over global representations on several datasets, along with clear visualizations of the discriminative parts. Because the backbone stays frozen, the method scales to large datasets such as ImageNet, with practical implications for model transparency and debugging.
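The pipeline summarized above can be sketched in a few lines of NumPy. This is only an illustrative sketch under stated assumptions, not the authors' implementation: the shapes (`d`, `R`, `P`, `C`), max-pooling over regions to obtain $b$, and a linear classifier $V$ followed by a softmax are all assumptions made for the example. Only $S = U X$ and the parameter names $U$ and $V$ come from the text.

```python
import numpy as np

def bag_of_parts(X, U):
    """Score regions against parts and pool into a bag-of-parts vector.

    X: (d, R) region descriptors from a frozen, pretrained CNN.
    U: (P, d) learnable part matrix, one row per part.
    """
    S = U @ X          # (P, R): score of every part on every region
    b = S.max(axis=1)  # ASSUMPTION: max-pool over regions to get b
    return S, b

def classify(b, V):
    """ASSUMPTION: linear classifier V of shape (C, P) plus softmax."""
    logits = V @ b
    e = np.exp(logits - logits.max())  # numerically stable softmax
    return e / e.sum()

# Hypothetical sizes, chosen only for illustration.
d, R, P, C = 8, 5, 4, 3
rng = np.random.default_rng(0)
X = rng.standard_normal((d, R))  # stand-in for CNN region features
U = rng.standard_normal((P, d))  # stand-in for learned parts
V = rng.standard_normal((C, P))  # stand-in for learned classifier

S, b = bag_of_parts(X, U)
probs = classify(b, V)
best_region_per_part = S.argmax(axis=1)  # region that triggers each part
```

In this sketch, `best_region_per_part` indicates which region responds most strongly to each part, which is the kind of information a part-based heatmap would be built from.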

Abstract

This paper presents Discriminative Part Network (DP-Net), a deep architecture with strong interpretation capabilities, which exploits a pretrained Convolutional Neural Network (CNN) combined with a part-based recognition module. This system learns and detects parts in the images that are discriminative among categories, without the need for fine-tuning the CNN, making it more scalable than other part-based models. While part-based approaches naturally offer interpretable representations, we propose explanations at image and category levels and introduce specific constraints on the part learning process to make the parts more discriminative.

Paper Structure (8 sections, 2 equations, 3 figures, 2 tables)

Figures (3)

  • Figure 1: Overview of the proposed architecture and its learned parameters $U$ and $V$.
  • Figure 2: Three most important parts for the class Casino.
  • Figure 3: Top row: heatmaps for test images of the classes Artstudio, Computer room, Casino and Kindergarden. Second row: the most discriminative region used to classify bird test images.