Hyperstroke: A Novel High-quality Stroke Representation for Assistive Artistic Drawing
Haoyun Qin, Jian Lin, Hanyuan Liu, Xueting Liu, Chengze Li
TL;DR
Hyperstroke is a stroke-centric representation for assistive drawing that models each stroke as a tokenizable pair $S=\langle I,B\rangle$: a 4-channel (RGB plus alpha) stroke image $I$ and its bounding box $B$. A grid-based tokenization ($B\to\tilde{B}$, $I\to\tilde{I}$), learned via a revised VQGAN, captures fine-grained appearance and opacity; an encoder–decoder transformer then predicts sequences of hyperstrokes conditioned on canvas context and CLIP guidance. The model is trained on a mix of synthetic and real timelapse data to learn implicit stroke dynamics, and it achieves both faithful stroke reconstruction and plausible, category-conditioned sketch generation on the Quick, Draw! dataset. The stroke $S$ and the blending operation $A\circ S$, which composites a stroke onto a canvas $A$, are the core primitives for incremental composition and temporal modeling, enabling iterative co-creative drawing with stroke-aware guidance.
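To make the two primitives concrete, here is a minimal sketch of a stroke stored as an RGBA patch plus a bounding box and blended onto a canvas. It rests on several assumptions that the summary does not confirm: the blend is taken to be standard "over" alpha compositing, the patch is assumed to already match the box size, and the names `Hyperstroke` and `blend` are hypothetical. It illustrates the idea only and is not the authors' implementation.

```python
from dataclasses import dataclass
from typing import List, Tuple

RGB = Tuple[float, float, float]           # canvas pixel, channels in [0, 1]
RGBA = Tuple[float, float, float, float]   # stroke pixel, alpha in [0, 1]


@dataclass
class Hyperstroke:
    """Hypothetical container for S = <I, B>."""
    image: List[List[RGBA]]          # I: h x w RGBA patch
    box: Tuple[int, int, int, int]   # B: (x0, y0, x1, y1), x1/y1 exclusive


def blend(canvas: List[List[RGB]], stroke: Hyperstroke) -> List[List[RGB]]:
    """Return a new canvas with the stroke composited on top.

    Assumption: uses the standard "over" operator,
    out = alpha * stroke + (1 - alpha) * canvas, per pixel inside the box.
    """
    x0, y0, x1, y1 = stroke.box
    h, w = y1 - y0, x1 - x0
    if len(stroke.image) != h or any(len(row) != w for row in stroke.image):
        raise ValueError("stroke patch size must match its bounding box")

    out = [list(row) for row in canvas]  # copy, so the input canvas is untouched
    for dy, row in enumerate(stroke.image):
        for dx, (r, g, b, a) in enumerate(row):
            y, x = y0 + dy, x0 + dx
            # Skip the parts of the stroke that fall outside the canvas.
            if 0 <= y < len(out) and 0 <= x < len(out[y]):
                cr, cg, cb = out[y][x]
                out[y][x] = (
                    a * r + (1 - a) * cr,
                    a * g + (1 - a) * cg,
                    a * b + (1 - a) * cb,
                )
    return out


# A 2x2 white canvas and a single half-transparent red pixel at the top-left.
white = [[(1.0, 1.0, 1.0)] * 2 for _ in range(2)]
red_dot = Hyperstroke(image=[[(1.0, 0.0, 0.0, 0.5)]], box=(0, 0, 1, 1))
result = blend(white, red_dot)
```

Because `blend` returns a new canvas, a sequence of strokes can be applied one after another, which mirrors the incremental composition described above.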
Abstract
Assistive drawing aims to facilitate the creative process by providing intelligent guidance to artists. Existing solutions often fail to model intricate stroke details effectively or to address the temporal aspects of drawing adequately. We introduce hyperstroke, a novel stroke representation designed to capture fine stroke details, including RGB appearance and alpha-channel opacity. Using a vector quantization approach, hyperstroke learns compact tokenized representations of strokes from real-life videos of artistic drawing. With hyperstroke, we propose to model assistive drawing via a transformer-based architecture that enables intuitive and user-friendly drawing applications, which we demonstrate in an exploratory evaluation.
