Proteus-ID: ID-Consistent and Motion-Coherent Video Customization

Guiyu Zhang; Chen Shi; Zijian Jiang; Xunzhi Xiang; Jingjing Qian; Shaoshuai Shi; Li Jiang

TL;DR

This paper introduces Proteus-ID, a diffusion-based framework for identity-consistent and motion-coherent video customization. Its Adaptive Motion Learning (AML) strategy reweights the training loss using optical-flow-derived motion heatmaps, improving motion realism without requiring additional inputs.

Abstract

Video identity customization seeks to synthesize realistic, temporally coherent videos of a specific subject, given a single reference image and a text prompt. This task presents two core challenges: (1) maintaining identity consistency while aligning with the described appearance and actions, and (2) generating natural, fluid motion without unrealistic stiffness. To address these challenges, we introduce Proteus-ID, a novel diffusion-based framework for identity-consistent and motion-coherent video customization. First, we propose a Multimodal Identity Fusion (MIF) module that unifies visual and textual cues into a joint identity representation using a Q-Former, providing coherent guidance to the diffusion model and eliminating modality imbalance. Second, we present a Time-Aware Identity Injection (TAII) mechanism that dynamically modulates identity conditioning across denoising steps, improving fine-detail reconstruction. Third, we propose Adaptive Motion Learning (AML), a self-supervised strategy that reweights the training loss based on optical-flow-derived motion heatmaps, enhancing motion realism without requiring additional inputs. To support this task, we construct Proteus-Bench, a high-quality dataset comprising 200K curated clips for training and 150 individuals from diverse professions and ethnicities for evaluation. Extensive experiments demonstrate that Proteus-ID outperforms prior methods in identity preservation, text alignment, and motion quality, establishing a new benchmark for video identity customization. Code and data are publicly available at https://grenoble-zhang.github.io/Proteus-ID/.
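
To make the idea behind AML concrete, the following is a minimal sketch of a motion-weighted reconstruction loss. It is not the authors' implementation: the abstract does not specify how the heatmap is normalized or how the weights enter the loss, so the function names, the peak normalization, the `1 + alpha * heatmap` weighting, and the `alpha` parameter are all assumptions made for illustration. The optical flow is taken as a precomputed array rather than estimated here.

```python
import numpy as np

def motion_heatmap(flow):
    """Turn optical flow of shape (T, H, W, 2) into a heatmap in [0, 1].

    Assumption: the heatmap is the per-pixel flow magnitude divided by its
    peak value. The paper may use a different normalization.
    """
    magnitude = np.linalg.norm(flow, axis=-1)  # (T, H, W)
    peak = magnitude.max()
    if peak == 0:
        return np.zeros_like(magnitude)
    return magnitude / peak

def motion_weighted_mse(pred, target, flow, alpha=1.0):
    """Mean squared error with high-motion pixels weighted more heavily.

    pred, target: arrays of shape (T, H, W, C).
    alpha: hypothetical strength of the motion reweighting; alpha = 0
    recovers the ordinary mean squared error.
    """
    weights = 1.0 + alpha * motion_heatmap(flow)          # (T, H, W)
    per_pixel = ((pred - target) ** 2).mean(axis=-1)      # (T, H, W)
    return float((weights * per_pixel).sum() / weights.sum())
```

In this sketch, static regions keep a weight of 1 while the fastest-moving pixels receive a weight of `1 + alpha`, so errors in moving regions contribute more to the loss. Because the weights come from flow computed on the training clip itself, no extra input is needed at inference time, which matches the self-supervised framing in the abstract.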
