How to Handle Sketch-Abstraction in Sketch-Based Image Retrieval?

Subhadeep Koley; Ayan Kumar Bhunia; Aneeshan Sain; Pinaki Nath Chowdhury; Tao Xiang; Yi-Zhe Song

TL;DR

This paper proposes a novel abstraction-aware sketch-based image retrieval framework that handles sketch abstraction at varied levels. It introduces feature-level and retrieval granularity-level designs so that the means to interpret abstraction are built into the system itself.

Abstract

In this paper, we propose a novel abstraction-aware sketch-based image retrieval (SBIR) framework capable of handling sketch abstraction at varied levels. Whereas prior works mainly focused on tackling sub-factors such as drawing style and order, we instead attempt to model abstraction as a whole, and propose feature-level and retrieval granularity-level designs so that the system builds into its DNA the necessary means to interpret abstraction. To learn abstraction-aware features, we for the first time harness the rich semantic embedding of a pre-trained StyleGAN model, together with a novel abstraction-level mapper that deciphers the level of abstraction and dynamically selects the corresponding dimensions of the feature matrix, to construct a feature matrix embedding that can be freely traversed to accommodate different levels of abstraction. For granularity-level abstraction understanding, we dictate that the retrieval model should not treat all abstraction levels equally, and introduce a differentiable surrogate Acc.@q loss to inject that understanding into the system. Unlike the gold-standard triplet loss, our Acc.@q loss uniquely allows a sketch to narrow or broaden its focus in terms of how stringent the evaluation should be: the more abstract a sketch, the less stringent (higher q). Extensive experiments show that our method outperforms existing state-of-the-art methods in standard SBIR tasks as well as in challenging scenarios such as early retrieval, forensic sketch-photo matching, and style-invariant retrieval.
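The idea of a differentiable surrogate for Acc.@q can be illustrated with a small sketch. The abstract does not give the loss formula, so everything below is an assumption: a generic sigmoid relaxation of the rank of the true match, followed by a soft test of whether that rank falls within the top q. The function names (`soft_rank`, `soft_acc_at_q`, `acc_at_q_loss`) and the `temperature` parameter are hypothetical and are not taken from the paper.

```python
import math


def sigmoid(x, temperature=1.0):
    """Numerically stable logistic function of x / temperature."""
    z = x / temperature
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def soft_rank(distances, positive_index, temperature=0.05):
    """Smooth approximation of the 1-based rank of the true match.

    distances[j] is the sketch-to-photo distance for gallery item j.
    Each gallery item closer than the true match adds roughly 1 to the rank.
    """
    d_pos = distances[positive_index]
    rank = 1.0
    for j, d in enumerate(distances):
        if j != positive_index:
            rank += sigmoid(d_pos - d, temperature)
    return rank


def soft_acc_at_q(distances, positive_index, q, temperature=0.05):
    """Soft indicator that the true match is ranked within the top q."""
    rank = soft_rank(distances, positive_index, temperature)
    return sigmoid(q + 0.5 - rank, temperature)


def acc_at_q_loss(distances, positive_index, q, temperature=0.05):
    """Assumed surrogate loss: 1 minus the soft Acc.@q."""
    return 1.0 - soft_acc_at_q(distances, positive_index, q, temperature)


# The true match (index 2) has exactly one gallery item closer to the sketch,
# so its rank is about 2.
distances = [0.9, 0.2, 0.5, 0.7]
strict = acc_at_q_loss(distances, positive_index=2, q=1)   # stringent evaluation
lenient = acc_at_q_loss(distances, positive_index=2, q=3)  # lenient evaluation
```

Under these assumptions, the same ranking is penalised heavily when q is small and barely at all when q is large, which mirrors the abstract's statement that a more abstract sketch should be evaluated less stringently (higher q).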

Paper Structure (13 sections, 6 equations, 16 figures, 7 tables, 1 algorithm)

This paper contains 13 sections, 6 equations, 16 figures, 7 tables, 1 algorithm.

Introduction
Related Works
Backgrounds
Pilot Study: Problems and Solutions
Proposed Methodology
Model Architecture
Loss Objectives
Experiments
Quantitative Analysis
Sketch Abstraction Analysis
Ablation
Extension to Forensic Sketch-Photo Matching
Conclusion and Future Works

Figures (16)

Figure 1: Pilot Study I: StyleGAN latent-disentanglement via optimising different groups of latent codes (coarse, medium, and fine).
Figure 2: Pilot Study II: Comparison of retrieval consistency via the entropy of separation in the embedding space, evaluated over successive stages of sketch completion. Inset images show how our method directs the query to a single gallery image (blue) while pushing others away as sketching progresses.
Figure 3: Our method learns a feature matrix representation in the joint embedding space, regularised by a pre-trained StyleGAN, trained with a weighted summation of reconstruction, abstraction identification, and Acc.@q losses. $\theta$: $\texttt{flip}(\texttt{cumsum}(\texttt{flip}(\cdot)))$; $\phi$: $\texttt{repeat}$ and $\texttt{flatten}$ operations (more in text).
Figure 4: Efficacy of the proposed method (blue) over Triplet-SN [yu2016sketch] (green) across different sketching styles of the same shoe (red-bordered). Zoom in for the best view. (More in § Supplementary.)
Figure 5: Quantitative results on ShoeV2 for early retrieval setup, visualised via the percentage of sketch. A higher area under the curve indicates better early retrieval performance.
...and 11 more figures
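The operator $\theta$ named in the Figure 3 caption, flip(cumsum(flip(·))), is a reverse cumulative sum. The following minimal sketch shows what that composition computes on a plain list. The function name `reverse_cumsum` is hypothetical, and the one-hot example is only an assumed illustration of how such an operator could turn a predicted level into a mask; the caption defers the actual usage to the paper's text.

```python
from itertools import accumulate


def reverse_cumsum(values):
    """Compute flip(cumsum(flip(values))): element i holds sum(values[i:])."""
    flipped = values[::-1]              # flip
    summed = list(accumulate(flipped))  # cumulative sum
    return summed[::-1]                 # flip back


# Assumed illustration: a one-hot vector marking level index 2 of 4
# becomes a mask with ones at indices 0 through 2.
one_hot = [0, 0, 1, 0]
mask = reverse_cumsum(one_hot)
print(mask)  # [1, 1, 1, 0]
```

Applied to a one-hot vector, the operator marks every position up to and including the selected index, which is one way a single selected level could enable a contiguous block of feature dimensions.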
