I2VControl-Camera: Precise Video Camera Control with Adjustable Motion Strength
Wanquan Feng, Jiawei Liu, Pengqi Tu, Tianhao Qi, Mingzhen Sun, Tianxiang Ma, Songtao Zhao, Siyu Zhou, Qian He
TL;DR
This paper tackles the challenge of precise camera control in image-to-video generation while enabling adjustable subject motion. It introduces I2VControl-Camera, which represents camera-induced motion with 3D point trajectories in the camera coordinate system and decouples motion strength as a higher-order trajectory component, integrated via an adapter into a diffusion-based video generator. A data pipeline derives control signals from RGB videos using depth estimation and pixel tracking, and an adaptive network module allows the method to plug into various base models. Across static and dynamic scenes, the approach demonstrates superior pixel-level controllability, robust motion-strength adjustment, and favorable quantitative metrics, highlighting its potential for professional-quality, user-guided video synthesis.
Abstract
Video generation technologies are developing rapidly and have broad potential applications. Among these technologies, camera control is crucial for generating professional-quality videos that accurately meet user expectations. However, existing camera control methods still suffer from several limitations, including limited control precision and neglect of control over subject motion dynamics. In this work, we propose I2VControl-Camera, a novel camera control method that significantly enhances controllability while providing adjustability over the strength of subject motion. To improve control precision, we employ point trajectories in the camera coordinate system, rather than extrinsic matrix information alone, as our control signal. To accurately control and adjust the strength of subject motion, we explicitly model the higher-order components of the video trajectory expansion, not merely the linear terms, and design an operator that effectively represents the motion strength. We use an adapter architecture that is independent of the base model structure. Experiments on static and dynamic scenes show that our framework outperforms previous methods both quantitatively and qualitatively. The project page is: https://wanquanf.github.io/I2VControlCamera
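To make the notion of a point trajectory in the camera coordinate system concrete, the following is a minimal sketch and not the authors' pipeline. It assumes a pinhole camera with known intrinsics, a per-pixel depth value, a static scene, and per-frame rigid transforms relative to the first frame. All function names, parameters, and numeric values are hypothetical and chosen only for illustration.

```python
from typing import List, Sequence, Tuple

Vec3 = Tuple[float, float, float]
Mat3 = Sequence[Sequence[float]]


def unproject(u: float, v: float, depth: float,
              fx: float, fy: float, cx: float, cy: float) -> Vec3:
    """Lift pixel (u, v) with a depth value to a 3D point in the first camera's frame.

    Assumes a pinhole model with focal lengths (fx, fy) and principal point (cx, cy).
    """
    return ((u - cx) * depth / fx, (v - cy) * depth / fy, depth)


def to_camera(point: Vec3, rotation: Mat3, translation: Vec3) -> Vec3:
    """Apply x' = R x + t, mapping first-frame coordinates into a later camera's frame."""
    return tuple(
        sum(rotation[i][j] * point[j] for j in range(3)) + translation[i]
        for i in range(3)
    )


def camera_trajectory(point: Vec3,
                      poses: Sequence[Tuple[Mat3, Vec3]]) -> List[Vec3]:
    """Trajectory of one static scene point, expressed in each frame's camera coordinates."""
    return [to_camera(point, rotation, translation) for rotation, translation in poses]


# Hypothetical example: the camera slides right by 0.1 units per frame without rotating,
# so a static point appears to move left by 0.1 units per frame in camera coordinates.
IDENTITY = ((1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0))
poses = [(IDENTITY, (-0.1 * k, 0.0, 0.0)) for k in range(3)]

point = unproject(u=420, v=240, depth=2.0, fx=500.0, fy=500.0, cx=320.0, cy=240.0)
trajectory = camera_trajectory(point, poses)

for frame_index, xyz in enumerate(trajectory):
    print(frame_index, xyz)
```

In this sketch the trajectory depends on both the camera motion and the scene depth, which is one way to see why a per-point signal can carry more information than the extrinsic matrices alone. Motion of the subject itself is deliberately left out here, since the static-scene assumption covers only camera-induced motion.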
