Learning Clustering-based Prototypes for Compositional Zero-shot Learning

Hongyu Qu, Jianan Wei, Xiangbo Shu, Wenguan Wang

TL;DR

ClusPro is a robust clustering-based prototype mining framework for CZSL that defines the conceptual boundaries of primitives through a set of diversified prototypes. It outperforms leading CZSL solutions under both closed-world and open-world settings.

Abstract

Learning primitive (i.e., attribute and object) concepts from seen compositions is the primary challenge of Compositional Zero-Shot Learning (CZSL). Existing CZSL solutions typically rely on oversimplified data assumptions, e.g., modeling each primitive with a single centroid primitive representation, ignoring the natural diversity of an attribute (resp. object) when coupled with different objects (resp. attributes). In this work, we develop ClusPro, a robust clustering-based prototype mining framework for CZSL that defines the conceptual boundaries of primitives through a set of diversified prototypes. Specifically, ClusPro conducts within-primitive clustering in the embedding space to automatically discover and dynamically update prototypes. These representative prototypes are subsequently used to shape a well-structured and independent primitive embedding space, ensuring intra-primitive separation and inter-primitive decorrelation through prototype-based contrastive learning and decorrelation learning. Moreover, ClusPro performs prototype clustering efficiently in a non-parametric fashion, without introducing additional learnable parameters or computational cost at test time. Experiments on three benchmarks demonstrate that ClusPro outperforms leading CZSL solutions under both closed-world and open-world settings.
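
To make the idea of within-primitive clustering concrete, the following is a minimal sketch of one clustering step for a single primitive: each feature is assigned to its nearest prototype, and each prototype is then moved toward the mean of its assigned features. It is an illustration under stated assumptions, not the authors' implementation: the function names, the hard nearest-prototype assignment, the cosine similarity, and the momentum value of 0.9 are all hypothetical choices that the abstract does not specify.

```python
import math

def normalize(v):
    """Scale a vector to unit L2 norm (returned unchanged if it is all zeros)."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v] if norm > 0 else list(v)

def cosine(u, v):
    """For unit-norm vectors, cosine similarity is just the dot product."""
    return sum(a * b for a, b in zip(u, v))

def assign(features, prototypes):
    """Hard-assign each feature to the index of its most similar prototype."""
    return [max(range(len(prototypes)), key=lambda k: cosine(f, prototypes[k]))
            for f in features]

def update_prototypes(features, prototypes, momentum=0.9):
    """One clustering step for one primitive (assumed momentum update rule)."""
    labels = assign(features, prototypes)
    new_prototypes = []
    for k, proto in enumerate(prototypes):
        members = [f for f, lab in zip(features, labels) if lab == k]
        if not members:
            # No feature was assigned to this prototype: keep it as is.
            new_prototypes.append(proto)
            continue
        mean = [sum(col) / len(members) for col in zip(*members)]
        mixed = [momentum * p + (1 - momentum) * m for p, m in zip(proto, mean)]
        new_prototypes.append(normalize(mixed))
    return new_prototypes, labels

# Toy features of one attribute that appear in two visually distinct modes,
# e.g. the same attribute coupled with two different objects.
features = [normalize(f) for f in [[1.0, 0.1], [0.9, 0.2], [0.1, 1.0], [0.2, 0.9]]]
prototypes = [normalize([1.0, 0.0]), normalize([0.0, 1.0])]
prototypes, labels = update_prototypes(features, prototypes)
```

Because the prototypes here are plain vectors updated from feature statistics, the sketch adds no learnable parameters, which is in the spirit of the non-parametric clustering the abstract describes.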

Paper Structure

This paper contains 22 sections, 15 equations, 7 figures, 11 tables, 1 algorithm.

Figures (7)

  • Figure 1: (a) Previous CZSL methods model all samples of each primitive concept with only one centroid primitive representation, neglecting feature divergence within each primitive when involved in different compositions. (b) Our method represents each primitive as a set of prototypes to capture primitive diversities.
  • Figure 2: The overview of ClusPro. (a) ClusPro is built upon a three-path paradigm to jointly recognize attribute, object, and attribute-object composition. (b) To capture the diversity within each primitive, ClusPro describes each primitive with a set of prototypes, and conducts within-primitive clustering across training data for prototype assignment and updating. (c) ClusPro imposes two constraints based on these constructed prototypes to promote intra-primitive separation and inter-primitive decorrelation.
  • Figure 3: Case study on UT-Zappos [yu2014fine] and C-GQA [naeem2021learning]. We compare ClusPro with a baseline without primitive-wise prototype clustering. Correct and incorrect predictions are marked in green and red, respectively.
  • Figure 4: Visualization of attribute and object features learned by the baseline and ClusPro on UT-Zappos [yu2014fine].
  • Figure 5: More case studies on MIT-States [isola2015discovering]. We compare ClusPro with a baseline without primitive-wise prototype clustering. Correct and incorrect predictions are marked in green and red, respectively.
  • ...and 2 more figures
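
The two prototype-based constraints named in the Figure 2 caption (intra-primitive separation and inter-primitive decorrelation) can be sketched roughly as follows. This is a hedged illustration only: the captions do not give the loss formulas, so an InfoNCE-style loss over prototypes stands in for the contrastive constraint, and a squared cross-covariance penalty is swapped in as a simple stand-in for the decorrelation constraint. All function names and the temperature value are hypothetical.

```python
import math

def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))

def prototype_contrastive_loss(feature, prototypes, positive_index, temperature=0.1):
    """InfoNCE-style loss: negative log-softmax of the positive prototype's similarity."""
    logits = [dot(feature, p) / temperature for p in prototypes]
    max_logit = max(logits)  # subtract the max for numerical stability
    log_denominator = max_logit + math.log(
        sum(math.exp(logit - max_logit) for logit in logits))
    return log_denominator - logits[positive_index]

def cross_covariance_penalty(attr_feats, obj_feats):
    """Sum of squared entries of the attribute-object cross-covariance matrix.

    Assumed stand-in for a decorrelation term: it is zero when every attribute
    dimension is linearly uncorrelated with every object dimension.
    """
    n = len(attr_feats)
    attr_mean = [sum(col) / n for col in zip(*attr_feats)]
    obj_mean = [sum(col) / n for col in zip(*obj_feats)]
    penalty = 0.0
    for i in range(len(attr_mean)):
        for j in range(len(obj_mean)):
            cov = sum((a[i] - attr_mean[i]) * (o[j] - obj_mean[j])
                      for a, o in zip(attr_feats, obj_feats)) / n
            penalty += cov * cov
    return penalty

# Contrastive term: a feature aligned with prototype 0 is cheap to match to
# prototype 0 and expensive to match to prototype 1.
prototypes = [[1.0, 0.0], [0.0, 1.0]]
feature = [1.0, 0.0]
loss_match = prototype_contrastive_loss(feature, prototypes, positive_index=0)
loss_mismatch = prototype_contrastive_loss(feature, prototypes, positive_index=1)

# Decorrelation term: uncorrelated attribute/object features incur no penalty,
# while identical (fully correlated) features are penalized.
attr_feats = [[1.0], [-1.0], [1.0], [-1.0]]
obj_uncorrelated = [[1.0], [1.0], [-1.0], [-1.0]]
penalty_uncorrelated = cross_covariance_penalty(attr_feats, obj_uncorrelated)
penalty_correlated = cross_covariance_penalty(attr_feats, attr_feats)
```

In a training loop, terms like these would typically be added to the recognition loss with weighting coefficients; the weights and the exact form of each term would need to be taken from the paper itself.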