DemoCaricature: Democratising Caricature Generation with a Rough Sketch
Dar-Yen Chen, Ayan Kumar Bhunia, Subhadeep Koley, Aneeshan Sain, Pinaki Nath Chowdhury, Yi-Zhe Song
TL;DR
The paper tackles generating personalised caricatures from a single reference photo and a rough sketch, using a diffusion-based framework augmented with a sketch-conditioned adapter and single-image personalisation. It introduces Explicit Rank-1 Model Editing to selectively modify identity-related concepts in cross-attention, Random Mask Reconstruction to improve robustness to distorted shapes, and Concept Regularisation to mitigate overfitting. Evaluations on WebCaricature show strong identity, style, and shape fidelity, outperforming deformation-based caricature methods and existing SD-based personalisation baselines, with additional validation from a human user study. The approach lets non-experts create high-quality caricatures with minimal input, pointing to a practical path for AI-assisted, artist-friendly visual expression that does not supplant human creators.
Abstract
In this paper, we democratise caricature generation, empowering individuals to effortlessly craft personalised caricatures with just a photo and a conceptual sketch. Our objective is to strike a delicate balance between abstraction and identity, while preserving the creativity and subjectivity inherent in a sketch. To achieve this, we present Explicit Rank-1 Model Editing alongside single-image personalisation, selectively applying nuanced edits to cross-attention layers for a seamless merge of identity and style. Additionally, we propose Random Mask Reconstruction to enhance robustness, directing the model to focus on distinctive identity and style features. Crucially, our aim is not to replace artists but to eliminate accessibility barriers, allowing enthusiasts to engage in the artistry.
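To make the idea of a rank-1 edit to a cross-attention projection more concrete, the following is a minimal sketch of the generic closed-form rank-1 update known from the model-editing literature. It is not taken from the paper and may differ from the authors' Explicit Rank-1 Model Editing. The function name `rank1_edit`, the toy dimensions, and the optional key covariance `C` are all assumptions made for illustration. Given a projection matrix `W`, a key vector `k` (for example, the text embedding of an identity token) and a desired output `v`, the update changes `W` by a single outer product so that the edited matrix maps `k` exactly to `v`.

```python
import numpy as np

def rank1_edit(W, k, v, C=None):
    """Generic rank-1 edit (illustrative, not the paper's exact method).

    Returns W' = W + (v - W k) u^T / (u^T k), with u = C^{-1} k,
    so that W' @ k == v. With C omitted, u = k (identity covariance).
    """
    # Direction along which the update is applied in input space.
    u = k.copy() if C is None else np.linalg.solve(C, k)
    # What the current projection gets wrong for this key.
    residual = v - W @ k
    # A single outer product: the change to W has rank at most 1.
    return W + np.outer(residual, u) / (u @ k)

# Toy example with hypothetical sizes: 4-dim keys, 8-dim values.
rng = np.random.default_rng(0)
W = rng.normal(size=(8, 4))   # stand-in for a cross-attention projection
k = rng.normal(size=4)        # stand-in for the identity token's key
v = rng.normal(size=8)        # stand-in for the desired value output
W_edit = rank1_edit(W, k, v)
```

When `C` is omitted, any input orthogonal to `k` is projected exactly as before, which is one way to see why such an edit can be selective: only the targeted concept's direction is affected, while unrelated directions keep their original outputs.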
