Generating Sketches in a Hierarchical Auto-Regressive Process for Flexible Sketch Drawing Manipulation at Stroke-Level

Sicong Zang, Shuhui Gao, Zhijun Fang

TL;DR

This work tackles controllable sketch generation at the stroke level, enabling edits during the drawing process. It introduces Sketch-HARP, a hierarchical auto-regressive framework that first predicts a stroke embedding, then anchors the stroke on the canvas, and finally translates the embedding into drawing actions, all within an auto-regressive loop. The model employs a stroke encoder, a position encoder, and a relationship encoder to produce a sketch code that guides a three-stage generator, trained with a multi-term loss to balance sequence fidelity, spatial placement, and visual quality. Experiments on QuickDraw DS1 and DS2 demonstrate flexible stroke-level manipulation, including replacement, erasion, and expansion, while maintaining competitive sketch reconstruction, underscoring the method's potential for interactive sketch editing.

Abstract

Generating sketches with specific, expected patterns, i.e., manipulating sketches in a controllable way, is a popular task. Recent studies control sketch features at the stroke level by editing the values of stroke embeddings that serve as conditions. However, to give the generator a global view of the sketch to be drawn, all edited conditions must be collected and fed into the generator simultaneously before generation starts, i.e., no further manipulation is allowed during the sketch generating process. To make sketch drawing manipulation more flexible, we propose a hierarchical auto-regressive sketch generating process. Instead of generating an entire sketch at once, each stroke in a sketch is generated in a three-stage hierarchy: 1) predicting a stroke embedding to represent which stroke is going to be drawn, 2) anchoring the predicted stroke on the canvas, and 3) translating the embedding into a sequence of drawing actions to form the full sketch. Moreover, stroke prediction, anchoring, and translation proceed auto-regressively, i.e., both the recently generated strokes and their positions are considered when predicting the current one, guiding the model to produce an appropriate stroke at a suitable position to benefit the full sketch generation. Stroke-level sketch drawing can be flexibly manipulated at any time during generation by adjusting the exposed editable stroke embeddings.
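
The three-stage loop described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the functions `predict_embedding`, `anchor`, and `translate`, the `edit_hook` parameter, and the arithmetic inside each stage are hypothetical stand-ins for the learned networks. The sketch only shows the control flow, in which each stroke's embedding is exposed (and can be edited) before it is anchored and translated, and in which later strokes depend on earlier embeddings and positions.

```python
from typing import Callable, List, Optional, Tuple

Embedding = List[float]
Position = Tuple[float, float]
Action = Tuple[float, float, int]  # (dx, dy, pen_up); pen_up=1 ends the stroke


def predict_embedding(code: Embedding, embs: List[Embedding],
                      poss: List[Position]) -> Embedding:
    """Stage 1 stand-in (assumed): mix the sketch code with the previous
    stroke embedding and its position."""
    prev = embs[-1] if embs else [0.0] * len(code)
    px, py = poss[-1] if poss else (0.0, 0.0)
    shift = 0.1 * (px + py)
    return [0.5 * c + 0.5 * p + shift for c, p in zip(code, prev)]


def anchor(emb: Embedding, poss: List[Position]) -> Position:
    """Stage 2 stand-in (assumed): place the stroke's starting point
    relative to the previous stroke's starting point."""
    px, py = poss[-1] if poss else (0.0, 0.0)
    return (px + emb[0], py + emb[1])


def translate(emb: Embedding) -> List[Action]:
    """Stage 3 stand-in (assumed): turn an embedding into pen offsets;
    the final action lifts the pen."""
    n = len(emb)
    return [(emb[i], emb[(i + 1) % n], int(i == n - 1)) for i in range(n)]


def generate(code: Embedding, num_strokes: int,
             edit_hook: Optional[Callable[[int, Embedding], Embedding]] = None
             ) -> List[Tuple[Position, List[Action]]]:
    """Auto-regressive loop: each stroke sees all earlier embeddings and positions."""
    embs: List[Embedding] = []
    poss: List[Position] = []
    sketch: List[Tuple[Position, List[Action]]] = []
    for t in range(num_strokes):
        emb = predict_embedding(code, embs, poss)   # 1) which stroke
        if edit_hook is not None:
            emb = edit_hook(t, emb)                 # manipulation during generation
        pos = anchor(emb, poss)                     # 2) where to place it
        actions = translate(emb)                    # 3) how to draw it
        embs.append(emb)
        poss.append(pos)
        sketch.append((pos, actions))
    return sketch


code = [1.0, 2.0, 0.5, -1.0]
plain = generate(code, 3)
# Overwrite the embedding of the second stroke (index 1) while drawing.
edited = generate(code, 3, edit_hook=lambda t, e: [0.0] * len(e) if t == 1 else e)
```

In this toy setting, editing the second stroke leaves the first stroke untouched and changes both the edited stroke and the stroke that follows it, since the next prediction is conditioned on the edited embedding and its position.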

Paper Structure

This paper contains 16 sections, 12 equations, 10 figures, 5 tables, 1 algorithm.

Figures (10)

  • Figure 1: Manipulating sketches at stroke-level by SketchEdit (Li et al., 2024) and the proposed Sketch-HARP. For SketchEdit, the editable stroke embeddings are collected and fed into the generator simultaneously before generation starts, i.e., no manipulation is allowed during the sketch generating process. Sketch-HARP generates sketches in a hierarchical process: given a sketch code, a group of stroke embeddings is first generated, and these are then translated into drawing actions and positioned on the canvas. The exposed stroke embeddings are editable and can be flexibly adjusted during sketch drawing to enable stroke-level sketch drawing manipulation.
  • Figure 2: Generating sketches in a hierarchical auto-regressive process by Sketch-HARP. (a) The network structure. We learn relationships among strokes and combine them with features from strokes and starting positions to obtain a sketch code, which is fed into a hierarchical auto-regressive generator to produce sketches. (b) Generating sketches in a hierarchical auto-regressive process. Each stroke is generated in three stages: 1) predicting which stroke is going to be drawn, 2) determining where to locate the predicted stroke, and 3) translating the stroke embedding into drawing actions to finally form a full sketch.
  • Figure 3: Comparisons on stroke replacement. For each sketch in the "target stroke" column, its stroke in red is replaced by the highlighted one in the "source stroke" column, with the composed sketches listed in the "composed sketch" column.
  • Figure 4: Comparisons on stroke erasion.
  • Figure 5: Exemplary stroke expansion results by Sketch-HARP. Sketch-HARP is required to continue drawing from an incomplete sketch, and the generated sketch should contain features from another source sketch.
  • ...and 5 more figures