Panel-by-Panel Souls: A Performative Workflow for Expressive Faces in AI-Assisted Manga Creation

Qing Zhang, Jing Huang, Yifei Huang, Jun Rekimoto

TL;DR

This work addresses the challenge of translating narrative intent into expressive manga faces by introducing a dual-hybrid, interactive workflow that combines landmark-based face preparation with a LivePortrait-driven expression-mapping stage, augmented by an artist-controlled manual framing tool and an interactive timeline. It presents a three-stage Panel-by-Panel Pipeline (face preparation, expression mapping, and composition/refinement) that enables rapid drafting while preserving narrative coherence and artistic control. A formative end-to-end case study suggests improved efficiency and expressiveness, though artifacts and temporal offsets persist, pointing to future work in 3D-aware modeling and formal user studies with professional artists. Overall, the method offers a practical model for human-AI co-creation in manga, empowering artists to infuse character soul into AI-assisted panels without relinquishing final aesthetic control.

Abstract

Current text-to-image models struggle to render the nuanced facial expressions required for compelling manga narratives, largely due to the ambiguity of language itself. To bridge this gap, we introduce an interactive system built on a novel, dual-hybrid pipeline. The first stage combines landmark-based auto-detection with a manual framing tool for robust, artist-centric face preparation. The second stage maps expressions using the LivePortrait engine, blending intuitive performative input from video for fine-grained control. Our case study analysis suggests that this integrated workflow can streamline the creative process and effectively translate narrative intent into visual expression. This work presents a practical model for human-AI co-creation, offering artists a more direct and intuitive means of "infusing souls" into their characters. Our primary contribution is not a new generative model, but a novel, interactive workflow that bridges the gap between artistic intent and AI execution.

Paper Structure

This paper contains 9 sections, 1 figure.

Figures (1)

  • Figure 1: An end-to-end demonstration of our pipeline and an illustration of its current limitations. The process begins with (a) a manga draft produced by generative AI, with neutral character expressions. In (b) Stage 1: Face Preparation, our system successfully identifies and frames the primary faces, though it fails on a small, distant face (highlighted in red), illustrating a limitation of the auto-detector. During (c) Stage 2: Expression Mapping, new expressions are mapped onto the prepared faces from driving inputs (shown inset). In (d) Stage 3: Composition, the modified faces are re-integrated into the draft, demonstrating a successful transfer of expression. Finally, (e) highlights typical limitations, which are largely inherited from the underlying LivePortrait model.
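
The three stages in Figure 1 can be read as a simple loop over face regions. The sketch below is only an illustration of that control flow, not the authors' implementation: the names `Box`, `prepare_faces`, `crop`, `paste`, and `run_pipeline` are hypothetical, images are plain nested lists, and `map_expression` is a stand-in callback for whatever expression engine is used (LivePortrait in the paper, whose API is not reproduced here).

```python
from dataclasses import dataclass
from typing import Callable, List

# Illustrative assumption: a grayscale image as rows of pixel values.
Image = List[List[int]]


@dataclass(frozen=True)
class Box:
    """Axis-aligned face region in panel coordinates (hypothetical type)."""
    x: int
    y: int
    w: int
    h: int


def prepare_faces(auto_boxes: List[Box], manual_boxes: List[Box]) -> List[Box]:
    """Stage 1 sketch: auto-detected boxes plus any boxes the artist framed by hand."""
    merged = list(auto_boxes)
    for box in manual_boxes:
        if box not in merged:  # skip exact duplicates of an auto-detected box
            merged.append(box)
    return merged


def crop(img: Image, box: Box) -> Image:
    """Cut the face region out of the panel."""
    return [row[box.x:box.x + box.w] for row in img[box.y:box.y + box.h]]


def paste(img: Image, patch: Image, box: Box) -> Image:
    """Stage 3 sketch: return a copy of the panel with the patch written back."""
    out = [row[:] for row in img]
    for dy, patch_row in enumerate(patch):
        out[box.y + dy][box.x:box.x + box.w] = patch_row
    return out


def run_pipeline(
    draft: Image,
    auto_boxes: List[Box],
    manual_boxes: List[Box],
    map_expression: Callable[[Image], Image],
) -> Image:
    """Run all three stages; map_expression is the Stage 2 stand-in."""
    result = draft
    for box in prepare_faces(auto_boxes, manual_boxes):
        new_face = map_expression(crop(draft, box))
        result = paste(result, new_face, box)
    return result


# Usage: the detector found one face; the artist framed a second, smaller one.
draft = [[0] * 6 for _ in range(6)]
brighten = lambda face: [[p + 1 for p in row] for row in face]
final = run_pipeline(draft, [Box(0, 0, 4, 4)], [Box(4, 4, 2, 2)], brighten)
```

Keeping manual boxes as a separate input mirrors the case in Figure 1(b), where the auto-detector misses a small, distant face and the artist can frame it by hand; the draft itself is never modified in place, so a stage can be rerun without regenerating the panel.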