Equipping Sketch Patches with Context-Aware Positional Encoding for Graphic Sketch Representation

Sicong Zang, Zhijun Fang

TL;DR

The paper addresses how to use sketch drawing orders for graphic sketch representation, and identifies an instability that arises when graph edges are constructed directly from drawing sequences. It proposes DC-gra2seq, which attaches context-aware positional encodings (PEs) to sketch patches (graph nodes): a sinusoidal absolute PE encodes the position of each patch in the drawing order, and a learnable relative PE captures contextual relationships between patches. Both are injected at the node level, while edges remain based on semantic similarity. The model uses CNN-based patch embeddings, a GCN for message passing over nodes carrying PEs, and an RNN decoder to reconstruct sketches, and is trained with a variational-inference-style objective with the KL term removed. Results on QuickDraw show that DC-gra2seq improves controllable sketch synthesis and sketch healing. Its ablations demonstrate robustness to variations in drawing order and underscore the value of combining absolute and relative PEs at the node level. The work offers a practical method for robust, context-aware graphic sketch representation, with potential for improved sketch editing and generation.

Abstract

When using sketch drawing orders to benefit graphic sketch representation, recent studies have linked sketch patches with graph edges according to their drawing order, following a temporal nearest-neighbor strategy. However, edges constructed this way may be unreliable, since the contextual relationships between patches may be inconsistent with their sequential positions in the drawing order, owing to variations in how sketches are drawn. In this paper, we propose a variant-drawing-protected method that equips sketch patches with context-aware positional encoding (PE) to make better use of drawing orders for sketch learning. We introduce a sinusoidal absolute PE to embed the sequential positions in drawing orders, and a learnable relative PE to encode the unseen contextual relationships between patches. Neither type of PE takes part in constructing graph edges; both are instead injected into graph nodes to cooperate with the visual patterns captured from patches. After nodes are linked by semantic proximity, during message aggregation via graph convolutional networks, each node receives from its neighbors both semantic features from patches and contextual information from PEs. This equips local patch patterns with global contextual information and yields drawing-order-enhanced sketch representations. Experimental results indicate that our method significantly improves sketch healing and controllable sketch synthesis. The source code is available at https://github.com/SCZang/DC-gra2seq.
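
As an illustration of the sinusoidal absolute PE mentioned in the abstract, the following is a minimal sketch that assumes the standard Transformer-style formulation (sine on even dimensions, cosine on odd dimensions, base 10000). The abstract does not give the exact formula, so the function name `sinusoidal_pe`, the base, and the dimensions are assumptions rather than the authors' implementation.

```python
import math

def sinusoidal_pe(num_positions, dim):
    """Return a num_positions x dim table of sinusoidal positional encodings.

    Assumed formulation (standard Transformer variant, not confirmed by the source):
      PE[pos][2k]   = sin(pos / 10000^(2k / dim))
      PE[pos][2k+1] = cos(pos / 10000^(2k / dim))
    """
    if dim % 2 != 0:
        raise ValueError("dim must be even")
    table = []
    for pos in range(num_positions):
        row = []
        for i in range(0, dim, 2):  # i = 2k
            angle = pos / (10000 ** (i / dim))
            row.append(math.sin(angle))
            row.append(math.cos(angle))
        table.append(row)
    return table

# Hypothetical setup: 4 patches indexed by drawing order, 8-dimensional encodings.
pe = sinusoidal_pe(4, 8)
```

Row `pe[m]` would be the encoding attached to the patch drawn at position `m`. Because the table depends only on the position index, it needs no training, in contrast to the learnable relative PE.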

Paper Structure (17 sections, 7 equations, 6 figures, 13 tables)

Figures (6)

  • Figure 1: Two approaches to injecting drawing orders into graphic sketch representation when dealing with variants of sketch drawings. (a) Constructing graph edges from drawing orders, as in su2020sketchhealer and qi2022generative, by linking neighboring sketch components along the drawing order. (b) The proposed approach, which injects drawing orders into graph nodes and keeps them out of graph edge construction.
  • Figure 2: Learning the graphic representation $\bm y_t$ of sketch $\bm S_t$ with DC-gra2seq. The cropped sketch patches $\{\bm p_{tm}\}$, along with the resized full sketch $\bm p_{t0}$, are embedded by a CNN encoder as patch embeddings $\{\bm v_{tm}\}$. The absolute PE $\bm P$, which encodes the sketch drawing order, and the relative PE $\bm R$, which encodes contextual relationships between patches, are combined with $\{\bm v_{tm}\}$, weighted by masked coefficients $\bm A_t$ computed from $\{\bm v_{tm}\}$ only. A GCN layer collects the information from $\bm v$, $\bm P$ and $\bm R$ to produce the final sketch code $\bm y_t$.
  • Figure 3: Qualitative comparisons on controllable sketch synthesis.
  • Figure 4: Generating sketches by DC-gra2seq with interpolated latent codes. (a) The latent space learned by DC-gra2seq on DS3. (b) In each row, we morph sketch patterns by interpolating their corresponding latent codes.
  • Figure 5: Qualitative comparisons on sketch healing.
  • ...and 1 more figures
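
The node-level injection described in the Figure 2 caption can be sketched roughly as follows. This is an illustrative simplification under several assumptions: the masked coefficients are taken to be a softmax over the top-`k` cosine similarities between patch embeddings, the PEs are combined with the embeddings by addition, the relative PE is a fixed placeholder rather than a learned tensor, and the GCN's learned weights and nonlinearity are omitted. All function and variable names are hypothetical.

```python
import math

def cosine(u, v):
    """Cosine similarity between two vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm > 0 else 0.0

def masked_coefficients(V, k):
    """Row-normalized weights computed from visual embeddings V only.

    Each node keeps its k most similar nodes (itself included) and masks
    the rest to zero, so drawing order never influences the edges.
    """
    n = len(V)
    A = []
    for i in range(n):
        sims = [cosine(V[i], V[j]) for j in range(n)]
        keep = sorted(range(n), key=lambda j: sims[j], reverse=True)[:k]
        exps = {j: math.exp(sims[j]) for j in keep}
        total = sum(exps.values())
        A.append([exps[j] / total if j in exps else 0.0 for j in range(n)])
    return A

def aggregate(V, P, R, A):
    """One aggregation step (assumed additive form):
    out[i] = sum_j A[i][j] * (V[j] + P[j] + R[i][j])."""
    n, d = len(V), len(V[0])
    out = []
    for i in range(n):
        row = [0.0] * d
        for j in range(n):
            if A[i][j] == 0.0:
                continue  # masked neighbor contributes nothing
            for c in range(d):
                row[c] += A[i][j] * (V[j][c] + P[j][c] + R[i][j][c])
        out.append(row)
    return out

# Toy example: 3 patches with 2-dimensional embeddings.
V = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]]
P = [[1.0, 1.0] for _ in V]                   # placeholder absolute PE
R = [[[0.0, 0.0] for _ in V] for _ in V]      # placeholder relative PE
A = masked_coefficients(V, k=2)
Y_plain = aggregate(V, [[0.0, 0.0] for _ in V], R, A)
Y_with_pe = aggregate(V, P, R, A)
```

Since `masked_coefficients` takes only `V`, reordering the drawing sequence changes the PEs carried by the nodes but leaves the graph connectivity untouched, which is the property the caption attributes to the proposed design.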