AnchorFlow: Training-Free 3D Editing via Latent Anchor-Aligned Flows
Zhenglin Zhou, Fan Ma, Chengzhuo Gui, Xiaobo Xia, Hehe Fan, Yi Yang, Tat-Seng Chua
TL;DR
AnchorFlow tackles training-free, mask-free 3D editing by stabilizing latent references through a global latent anchor and an anchor-aligned update rule. The method yields strong semantic edits with preserved geometry, validated on a new Eval3DEdit benchmark across multiple editing types. It avoids mask supervision and enables scalable data curation for instruction-following 3D editing. Quantitative and qualitative results show competitive or superior performance compared to state-of-the-art inversion-free and LFM-based methods.
Abstract
Training-free 3D editing aims to modify 3D shapes based on human instructions without model fine-tuning. It plays a crucial role in 3D content creation. However, existing approaches often struggle to produce strong or geometrically stable edits, largely due to inconsistent latent anchors introduced by timestep-dependent noise during diffusion sampling. To address these limitations, we introduce AnchorFlow, which is built upon the principle of latent anchor consistency. Specifically, AnchorFlow establishes a global latent anchor shared between the source and target trajectories, and enforces coherence using a relaxed anchor-alignment loss together with an anchor-aligned update rule. This design ensures that transformations remain stable and semantically faithful throughout the editing process. By stabilizing the latent reference space, AnchorFlow enables more pronounced semantic modifications. Moreover, AnchorFlow is mask-free: it preserves geometric fidelity without mask supervision. Experiments on the Eval3DEdit benchmark show that AnchorFlow consistently delivers semantically aligned and structurally robust edits across diverse editing types. Code is available at https://github.com/ZhenglinZhou/AnchorFlow.
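
To make the contrast between timestep-dependent noise and a shared global anchor concrete, the following is a minimal toy sketch, not the authors' implementation. It assumes a generic inversion-free flow-editing loop on a latent vector and interprets the "anchor" as a single noise sample reused at every step, which is an assumption. The names `edit_latent`, `v_src`, `v_tar`, and `shared_anchor` are hypothetical, and the velocity fields are analytic stand-ins for a prompt-conditioned flow model. The sketch does not attempt the relaxed anchor-alignment loss or the anchor-aligned update rule, whose details are not given in this abstract.

```python
import numpy as np

def edit_latent(x_src, v_src, v_tar, steps=20, shared_anchor=True, seed=0):
    """Toy inversion-free flow edit of a latent vector (illustrative only).

    v_src / v_tar are hypothetical velocity fields f(z, t) -> array that stand
    in for a flow model conditioned on the source / target instruction.
    """
    rng = np.random.default_rng(seed)
    ts = np.linspace(1.0, 0.0, steps + 1)      # integrate from t=1 (noise) to t=0 (data)
    anchor = rng.standard_normal(x_src.shape)  # one global noise sample (assumed "anchor")
    z_edit = x_src.copy()
    for t, t_next in zip(ts[:-1], ts[1:]):
        # Reuse the global anchor, or draw fresh timestep-dependent noise.
        noise = anchor if shared_anchor else rng.standard_normal(x_src.shape)
        z_src = (1.0 - t) * x_src + t * noise  # noisy source latent on a linear path
        z_tar = z_edit + (z_src - x_src)       # target latent carries the same offset
        # Euler step along the difference of target and source velocities.
        z_edit = z_edit + (t_next - t) * (v_tar(z_tar, t) - v_src(z_src, t))
    return z_edit

# Toy velocity fields: each pulls toward a single point (an assumption for illustration).
mu_src = np.array([0.0, 0.0])
mu_tar = np.array([1.0, -2.0])
v_src = lambda z, t: (z - mu_src) / t
v_tar = lambda z, t: (z - mu_tar) / t

edited = edit_latent(mu_src, v_src, v_tar, shared_anchor=True)
```

Because these toy fields are linear, the noise cancels in the velocity difference and both settings of `shared_anchor` give the same result; the sketch only shows where a shared reference would enter the loop, not the stability benefit reported for AnchorFlow.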
