Key Patch Proposer: Key Patches Contain Rich Information
Jing Xu, Beiwen Tian, Hao Zhao
TL;DR
Motivated by patch-level active learning for semantic segmentation, this work proposes Key Patch Proposer (KPP), a non-learning-based method that selects informative image patches using a greedy approximation to a submodular objective. The patch-set objective is $P^* = \arg\min_{P_s \subseteq P,\, |P_s| = r|P|} L(P_s)$, where $P$ is the set of all patches in the image, $r$ is the target patch ratio, and $L$ is the MAE reconstruction error; it is solved greedily starting from the central patch. Empirically, KPP improves reconstruction quality and downstream ViT-B/16 classification accuracy across various patch ratios on ImageNette and NYU-Depth V2, suggesting potential for active learning and patch-based representation learning. The work provides code and lays groundwork for efficient, training-free patch selection in semantic segmentation tasks.
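The greedy procedure described above can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `greedy_key_patches`, the callable `recon_error`, and the toy error below are all hypothetical stand-ins. In KPP the error would come from MAE reconstruction; here a simple distance-based proxy is used so the example runs without a model.

```python
from typing import Callable, FrozenSet, List


def greedy_key_patches(
    grid_h: int,
    grid_w: int,
    ratio: float,
    recon_error: Callable[[FrozenSet[int]], float],
) -> List[int]:
    """Greedily select about ratio * (grid_h * grid_w) patch indices.

    recon_error is a hypothetical callable that scores a set of visible
    patch indices (lower is better). It stands in for the reconstruction
    error L(P_s) in the objective.
    """
    n_patches = grid_h * grid_w
    budget = max(1, round(ratio * n_patches))

    # Start from the central patch (row-major index).
    center = (grid_h // 2) * grid_w + (grid_w // 2)
    selected = [center]
    remaining = set(range(n_patches)) - {center}

    while len(selected) < budget:
        # Add the candidate whose inclusion yields the lowest error.
        # sorted() makes tie-breaking deterministic.
        best = min(
            sorted(remaining),
            key=lambda i: recon_error(frozenset(selected + [i])),
        )
        selected.append(best)
        remaining.remove(best)
    return selected


def toy_error(visible: FrozenSet[int], grid_w: int = 4, n: int = 16) -> float:
    """Toy proxy (an assumption, not MAE): total Manhattan distance from
    every patch to its nearest visible patch."""
    coords = [(i // grid_w, i % grid_w) for i in range(n)]
    return float(
        sum(
            min(abs(r - coords[v][0]) + abs(c - coords[v][1]) for v in visible)
            for r, c in coords
        )
    )


picks = greedy_key_patches(4, 4, 0.25, toy_error)
print(picks)  # four distinct indices; the first is the central patch
```

Each greedy step evaluates the error once per remaining candidate, so a real reconstruction model would be called many times per image. Batching the candidate evaluations is one way to keep that cost manageable.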
Abstract
In this paper, we introduce a novel algorithm named Key Patch Proposer (KPP) designed to select key patches in an image without additional training. Our experiments showcase KPP's robust capacity to capture semantic information on both reconstruction and classification tasks. The efficacy of KPP suggests its potential application in active learning for semantic segmentation. Our source code is publicly available at https://github.com/CA-TT-AC/key-patch-proposer.
