Equipping Sketch Patches with Context-Aware Positional Encoding for Graphic Sketch Representation
Sicong Zang, Zhijun Fang
TL;DR
The paper addresses how to leverage sketch drawing orders for graphic sketch representation and identifies instability when graph edges are constructed directly from drawing sequences. It proposes DC-gra2seq, which attaches context-aware positional encodings to sketch patches (nodes): a sinusoidal absolute PE encodes drawing time and a learnable relative PE captures contextual distances. Both are injected at the node level, while edges remain based on semantic similarity. The model uses CNN-based patch embeddings, a GCN for message passing with node-level PEs, and an RNN decoder to reconstruct sketches, trained with a VI-like objective without the KL term. Empirical results on QuickDraw show that DC-gra2seq improves controllable sketch synthesis and sketch healing. Its ablations demonstrate robustness to variations in drawing order and underscore the value of combining absolute and relative PEs at the node level. The work offers a practical method for robust, context-aware graphic sketch representation with potential for improved sketch editing and generation.
Abstract
To exploit sketch drawing orders for graphic sketch representation, recent studies have linked sketch patches with graph edges according to a temporal nearest-neighbor strategy over the drawing order. However, graph edges constructed this way may be unreliable, since the contextual relationships between patches may be inconsistent with their sequential positions in the drawing order, owing to variations in how sketches are drawn. In this paper, we propose a variant-drawing-protected method that equips sketch patches with context-aware positional encoding (PE) to make better use of drawing orders for sketch learning. We introduce a sinusoidal absolute PE to embed the sequential positions in drawing orders, and a learnable relative PE to encode the unseen contextual relationships between patches. Neither type of PE takes part in the construction of graph edges; instead, both are injected into graph nodes to cooperate with the visual patterns captured from patches. After nodes are linked by semantic proximity, during message aggregation via graph convolutional networks each node receives from its neighbors both semantic features from patches and contextual information from PEs. This equips local patch patterns with global contextual information and yields drawing-order-enhanced sketch representations. Experimental results indicate that our method significantly improves sketch healing and controllable sketch synthesis. The source code is available at https://github.com/SCZang/DC-gra2seq.
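
The pipeline described above can be illustrated with a minimal sketch. It is not the released DC-gra2seq implementation and rests on several assumptions: the sinusoidal absolute PE is taken to follow the standard Transformer formula, PEs are injected by simple addition to patch features, edges come from a cosine-similarity k-nearest-neighbor rule, and the GCN layer is reduced to an unweighted mean over each node and its neighbors. The learnable relative PE is omitted because it requires training. All function names (`sinusoidal_pe`, `semantic_knn_edges`, `aggregate`) are hypothetical.

```python
import math


def sinusoidal_pe(num_positions, dim):
    """Transformer-style sinusoidal encoding: one vector per drawing step (assumed form)."""
    if dim % 2 != 0:
        raise ValueError("dim must be even")
    pe = []
    for pos in range(num_positions):
        row = []
        for i in range(0, dim, 2):
            angle = pos / (10000 ** (i / dim))
            row.extend([math.sin(angle), math.cos(angle)])
        pe.append(row)
    return pe


def cosine(u, v):
    """Cosine similarity between two feature vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0


def semantic_knn_edges(features, k):
    """Link each node to its k most similar nodes using patch features only.

    Drawing order plays no role here, mirroring the idea that PEs do not
    take part in edge construction.
    """
    n = len(features)
    neighbors = []
    for i in range(n):
        ranked = sorted(
            (j for j in range(n) if j != i),
            key=lambda j: cosine(features[i], features[j]),
            reverse=True,
        )
        neighbors.append(ranked[:k])
    return neighbors


def aggregate(features, pe, neighbors):
    """One mean-aggregation step over PE-augmented nodes (no learned weights)."""
    # Inject the PE into each node by addition (an assumption of this sketch).
    nodes = [[f + p for f, p in zip(fr, pr)] for fr, pr in zip(features, pe)]
    out = []
    for i, nbrs in enumerate(neighbors):
        group = [nodes[i]] + [nodes[j] for j in nbrs]
        out.append([sum(col) / len(group) for col in zip(*group)])
    return out


# Four toy patch embeddings, listed in drawing order.
patch_features = [
    [1.0, 0.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 0.0],
    [0.9, 0.1, 0.0, 0.0],
    [0.0, 0.0, 0.9, 0.1],
]
edges = semantic_knn_edges(patch_features, k=1)
node_repr = aggregate(patch_features, sinusoidal_pe(4, 4), edges)
```

In this toy input, patches 0 and 2 look alike but are not adjacent in the drawing order. The semantic rule still links them, and the aggregated representation of patch 0 carries the drawing-time encoding of patch 2 along with its visual features.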
