I2VControl-Camera: Precise Video Camera Control with Adjustable Motion Strength

Wanquan Feng, Jiawei Liu, Pengqi Tu, Tianhao Qi, Mingzhen Sun, Tianxiang Ma, Songtao Zhao, Siyu Zhou, Qian He

TL;DR

This paper tackles the challenge of precise camera control in image-to-video generation while enabling adjustable subject motion. It introduces I2VControl-Camera, which represents camera-induced motion with 3D point trajectories in the camera coordinate system and decouples motion strength as a higher-order trajectory component, integrated via an adapter into a diffusion-based video generator. A data pipeline derives control signals from RGB videos using depth estimation and pixel tracking, and an adaptive network module allows the method to plug into various base models. Across static and dynamic scenes, the approach demonstrates superior pixel-level controllability, robust motion-strength adjustment, and favorable quantitative metrics, highlighting its potential for professional-quality, user-guided video synthesis.

Abstract

Video generation technologies are developing rapidly and have broad potential applications. Among these technologies, camera control is crucial for generating professional-quality videos that accurately meet user expectations. However, existing camera control methods still suffer from several limitations, including limited control precision and neglect of subject motion dynamics. In this work, we propose I2VControl-Camera, a novel camera control method that significantly enhances controllability while providing adjustability over the strength of subject motion. To improve control precision, we use point trajectories in the camera coordinate system as our control signal, rather than extrinsic matrix information alone. To accurately control and adjust the strength of subject motion, we explicitly model the higher-order components of the video trajectory expansion, not merely the linear terms, and design an operator that effectively represents the motion strength. We use an adapter architecture that is independent of the base model structure. Experiments on static and dynamic scenes show that our framework outperforms previous methods both quantitatively and qualitatively. The project page is: https://wanquanf.github.io/I2VControlCamera .

Paper Structure

This paper contains 19 sections, 12 equations, 7 figures, 2 tables, 1 algorithm.

Figures (7)

  • Figure 1: We propose I2VControl-Camera, a novel camera control method for image-to-video generation, offering high control precision and adjustable motion strength.
  • Figure 2: We lift the input image from 2D to 3D as an RGBD point cloud. When the camera moves, the 3D points can be considered as moving in the camera coordinate system. We then project them onto 2D according to the current camera pose to obtain the 2D point trajectory.
  • Figure 3: Illustration of motion strength (speed value).
  • Figure 4: The adaptive network structure.
  • Figure 5: Visualization of our pixel-level controllability. The figure presents two samples: the top one demonstrates a pan-left camera movement, while the bottom one shows the camera sliding to the right. For each sample, we show a preview (a direct rendering of the RGBD point cloud onto the 2D plane according to the extrinsic matrix) and our generated result. The generated result follows the control signal almost exactly at the pixel level (see the green boxes), even when a movable object is present (the cat in the red box).
  • ...and 2 more figures
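
The lifting and reprojection described in the Figure 2 caption can be illustrated with a short NumPy sketch. This is not the authors' implementation: the function names `unproject` and `point_trajectories`, the pinhole intrinsic matrix `K`, and the convention that each 4x4 extrinsic maps first-frame camera coordinates to frame-t camera coordinates are all assumptions made for illustration. The sketch only covers the camera-induced (rigid) part of the motion and assumes every point stays in front of the camera.

```python
import numpy as np

def unproject(depth, K):
    """Lift every pixel of an H x W depth map to a 3D point (first-frame camera coordinates)."""
    h, w = depth.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))           # pixel coordinates
    pix = np.stack([u, v, np.ones_like(u)], axis=-1)
    pix = pix.reshape(-1, 3).astype(np.float64)              # (N, 3) homogeneous pixels
    rays = pix @ np.linalg.inv(K).T                          # back-projected rays with z = 1
    return rays * depth.reshape(-1, 1)                       # scale each ray by its depth

def point_trajectories(depth, K, extrinsics):
    """Return per-frame 3D points in camera coordinates and their 2D projections.

    extrinsics: (T, 4, 4) array; assumed to map first-frame camera
    coordinates to frame-t camera coordinates (hypothetical convention).
    """
    pts = unproject(depth, K)                                # (N, 3)
    pts_h = np.concatenate([pts, np.ones((len(pts), 1))], axis=1)  # (N, 4)
    # Apply each frame's rigid transform to all points.
    cam = np.einsum("tij,nj->tni", extrinsics, pts_h)[..., :3]     # (T, N, 3)
    proj = cam @ K.T                                         # pinhole projection
    uv = proj[..., :2] / proj[..., 2:3]                      # perspective divide (needs z > 0)
    return cam, uv
```

With an identity extrinsic the projected coordinates reproduce the original pixel grid, which is a convenient sanity check before feeding real depth estimates and camera poses into a sketch like this.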