Key Patch Proposer: Key Patches Contain Rich Information

Jing Xu, Beiwen Tian, Hao Zhao

TL;DR

This work tackles patch-level active learning for semantic segmentation by proposing Key Patch Proposer (KPP), a non-learning-based method that selects informative image patches using a greedy approximation to a submodular objective. The patch-set objective is $P^* = \arg\min_{P_s \subseteq P, |P_s| = r|P|} L(P_s)$ with $L$ as the MAE reconstruction error, solved greedily starting from the central patch. Empirically, KPP improves reconstruction quality and downstream ViT-B/16 classification accuracy across various patch ratios on ImageNette and NYU-Depth V2, suggesting strong potential for active learning and patch-based representation learning. The work provides code and lays groundwork for efficient, training-free patch selection in semantic segmentation tasks.
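The greedy procedure described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: `toy_reconstruction_loss` is a hypothetical stand-in for the MAE reconstruction error $L(P_s)$, and the function names, the `grid_shape` parameter, and the one-patch-per-step loop are assumptions made for the example.

```python
import numpy as np


def toy_reconstruction_loss(patches, selected):
    """Hypothetical stand-in for the MAE reconstruction error L(P_s).

    Every unselected patch is "reconstructed" as the mean of the selected
    patches; the loss is the mean squared error over all patches.
    """
    visible = patches[selected]
    recon = np.tile(visible.mean(axis=0), (len(patches), 1))
    recon[selected] = visible  # selected patches are reproduced exactly
    return float(np.mean((recon - patches) ** 2))


def greedy_key_patches(patches, grid_shape, ratio, loss_fn=toy_reconstruction_loss):
    """Greedily pick round(ratio * |P|) patch indices, starting from the centre.

    patches:    array of shape (num_patches, patch_dim), row-major over the grid
    grid_shape: (rows, cols) of the patch grid (assumed layout)
    """
    rows, cols = grid_shape
    n = len(patches)
    assert n == rows * cols, "patches must cover the whole grid"
    target = max(1, round(ratio * n))

    # Start from the central patch of the grid.
    center = (rows // 2) * cols + (cols // 2)
    selected = [center]
    remaining = set(range(n)) - {center}

    # Add, one at a time, the patch whose inclusion gives the lowest loss.
    while len(selected) < target:
        best = min(sorted(remaining), key=lambda i: loss_fn(patches, selected + [i]))
        selected.append(best)
        remaining.remove(best)
    return selected


rng = np.random.default_rng(0)
demo_patches = rng.normal(size=(16, 8))  # a 4x4 grid of 8-dimensional patches
key_patches = greedy_key_patches(demo_patches, grid_shape=(4, 4), ratio=0.25)
print(key_patches)
```

Each step evaluates the loss once per remaining candidate, so selecting $r|P|$ patches costs on the order of $r|P|^2$ loss evaluations. With a real MAE in place of the toy loss, each evaluation would presumably be a forward pass through the pretrained model.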

Abstract

In this paper, we introduce a novel algorithm named Key Patch Proposer (KPP), designed to select key patches in an image without additional training. Our experiments demonstrate KPP's robust capacity to capture semantic information on both reconstruction and classification tasks. The efficacy of KPP suggests its potential application in active learning for semantic segmentation. Our source code is publicly available at https://github.com/CA-TT-AC/key-patch-proposer.

Paper Structure

This paper contains 9 sections, 3 equations, 3 figures, 1 table, 1 algorithm.

Figures (3)

  • Figure 1: Left: Reconstruction loss of KPP versus random patch selection across various selection percentages. Right: The upper row shows random selection, while the lower row shows the KPP algorithm. (a): ground truth; (b): selected patches; (c): images reconstructed from the selected patches.
  • Figure 2: Ablation study of reconstruction loss with/without initial patch.
  • Figure 3: (a): Original images; (b): 10% patches selected by KPP; (c): Images reconstructed from KPP-selected patches; (d): 10% patches selected randomly; (e): Images reconstructed from randomly selected patches.