FreeDrag: Feature Dragging for Reliable Point-based Image Editing
Pengyang Ling, Lin Chen, Pan Zhang, Huaian Chen, Yi Jin, Jinjin Zheng
TL;DR
FreeDrag tackles miss tracking and ambiguous tracking in point-based image editing by replacing exact point tracking with feature dragging guided by adaptive template features. It combines two components: an adaptive template update, $T_i^{k+1} = \lambda_i^k \cdot F_r(h_i^k) + (1 - \lambda_i^k) \cdot T_i^k$, and a line search with backtracking that restricts each move to the line from the original handle to the target and picks the next position $q_i$ to minimize $\big| \big\| F_r(q_i) - T_i^{k+1} \big\|_1 - l \big|$, so that the feature distance covered per move stays close to a controlled value $l$. The approach is evaluated on StyleGAN2 and diffusion-based editors using the FreeDragBench dataset (2251 instructions) and CCSD as a symmetry-dragging metric, and it shows improved editing accuracy and speed over DragGAN and DragDiffusion. The work targets practical, robust, and efficient point-based editing and contributes FreeDragBench as a benchmark suite.
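The template update above is a convex blend of the previous template and the feature currently found at the handle. A minimal sketch of that single step, assuming features are plain NumPy vectors and that $\lambda_i^k$ is supplied by the caller (how it is chosen adaptively is not specified here); the function name `update_template` is hypothetical:

```python
import numpy as np

def update_template(template, handle_feature, lam):
    """Return T^{k+1} = lam * F_r(h^k) + (1 - lam) * T^k.

    template       -- current template feature T^k (assumed 1-D array)
    handle_feature -- feature F_r(h^k) sampled at the current handle position
    lam            -- blending weight in [0, 1]; its adaptive schedule is
                      NOT given in this section, so it is passed in as-is
    """
    if not 0.0 <= lam <= 1.0:
        raise ValueError("lam must lie in [0, 1]")
    return lam * handle_feature + (1.0 - lam) * template

# Toy vectors, chosen only for illustration.
template = np.array([1.0, 2.0, 3.0])
feature = np.array([3.0, 2.0, 1.0])

kept = update_template(template, feature, 0.0)      # template left unchanged
replaced = update_template(template, feature, 1.0)  # template fully replaced
blended = update_template(template, feature, 0.5)   # halfway between the two
```

With `lam = 0` the template ignores the new feature entirely, and with `lam = 1` it is overwritten by it, so the weight directly controls how quickly the template follows content changes.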
Abstract
To serve the intricate and varied demands of image editing, precise and flexible manipulation of image content is indispensable. Recently, drag-based editing methods have achieved impressive performance. However, these methods predominantly center on point dragging, which leads to two noteworthy drawbacks: "miss tracking", where the predetermined handle points are difficult to track accurately, and "ambiguous tracking", where tracked points may land in wrong regions that closely resemble the handle points. To address these issues, we propose FreeDrag, a feature dragging methodology designed to relieve the burden of point tracking. FreeDrag incorporates two key designs: template features via adaptive updating, and line search with backtracking. The former improves stability against drastic content changes by carefully controlling the scale of the feature update after each drag, while the latter alleviates misguidance from similar points by actively restricting the search area to a line. Together, these two techniques yield more stable semantic dragging with higher efficiency. Comprehensive experimental results substantiate that our approach significantly outperforms pre-existing methodologies, offering reliable point-based editing even in various complex scenarios.
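To illustrate how restricting the search to a line can work, the sketch below scans evenly spaced candidates on the segment between a start point and the target and keeps the one whose L1 feature distance to the template is closest to a chosen value $l$, following the objective quoted in the TL;DR. It is an illustration under stated assumptions, not the authors' implementation: the function name `line_search`, the discrete candidate grid, and nearest-neighbour feature sampling are all hypothetical choices, and the backtracking step is omitted because its rule is not described in this section.

```python
import numpy as np

def line_search(feature_map, template, start, target, l, num_candidates=50):
    """Pick q on the segment start -> target minimising | ||F(q) - T||_1 - l |.

    feature_map -- array of shape (H, W, C), a stand-in for F_r
    template    -- template feature of shape (C,)
    start/target-- (row, col) endpoints of the search segment
    l           -- desired L1 feature distance for this move
    """
    start = np.asarray(start, dtype=float)
    target = np.asarray(target, dtype=float)
    best_q, best_cost = None, np.inf
    for t in np.linspace(0.0, 1.0, num_candidates):
        q = start + t * (target - start)        # candidate stays on the line
        row, col = np.rint(q).astype(int)       # nearest-neighbour sampling (assumption)
        dist = np.abs(feature_map[row, col] - template).sum()
        cost = abs(dist - l)
        if cost < best_cost:
            best_q, best_cost = q, cost
    return best_q, best_cost

# Toy 1 x 11 feature map whose single channel equals the column index.
feature_map = np.arange(11, dtype=float).reshape(1, 11, 1)
template = feature_map[0, 0]                    # feature at the start point
q, cost = line_search(feature_map, template, (0, 0), (0, 10), l=3.0,
                      num_candidates=11)
```

In this toy map the feature distance from the start grows by one per column, so the search settles three columns along the line, where the distance matches `l` exactly. Points off the segment are never evaluated, which is what keeps similar-looking regions elsewhere in the image from attracting the search.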
