InteractEdit: Zero-Shot Editing of Human-Object Interactions in Images

Jiun Tian Hoe; Weipeng Hu; Wei Zhou; Chao Xie; Ziwei Wang; Chee Seng Chan; Xudong Jiang; Yap-Peng Tan

TL;DR

This work introduces InteractEdit, a zero-shot framework for editing existing human–object interactions (HOIs) in images while preserving the identities of the subject and object. It achieves this by decomposing each HOI scene into subject, object, and background cues, regularizing inversion with Low-Rank Adaptation (LoRA), and applying selective fine-tuning to retain pretrained interaction priors. The authors also propose IEBench, a comprehensive benchmark for evaluating HOI editing in terms of editability and identity preservation. Across extensive qualitative, quantitative, ablation, and user studies, InteractEdit demonstrates superior performance over state-of-the-art baselines, establishing a new baseline for HOI editing research and enabling practical applications in content creation and visualization.

Abstract

This paper presents InteractEdit, a novel framework for zero-shot Human-Object Interaction (HOI) editing, addressing the challenging task of transforming an existing interaction in an image into a new, desired interaction while preserving the identities of the subject and object. Unlike simpler image editing scenarios such as attribute manipulation, object replacement, or style transfer, HOI editing involves complex spatial, contextual, and relational dependencies inherent in human-object interactions. Existing methods often overfit to the source image structure, limiting their ability to adapt to the substantial structural modifications demanded by new interactions. To address this, InteractEdit decomposes each scene into subject, object, and background components, then employs Low-Rank Adaptation (LoRA) and selective fine-tuning to preserve pretrained interaction priors while learning the visual identity of the source image. This regularization strategy effectively balances interaction edits with identity consistency. We further introduce IEBench, the most comprehensive benchmark for HOI editing, which evaluates both interaction editing and identity preservation. Our extensive experiments show that InteractEdit significantly outperforms existing methods, establishing a strong baseline for future HOI editing research and unlocking new possibilities for creative and practical applications. Code will be released upon publication.
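
For readers unfamiliar with LoRA, the sketch below illustrates the general low-rank update it relies on: a frozen pretrained weight W is adapted as W + (alpha / r) · B · A, where only the small matrices A and B are trained. This is not the authors' implementation (their code is not yet released); the function names, matrix sizes, and the value of alpha are illustrative assumptions, and the example uses plain Python lists rather than a deep learning framework.

```python
import random


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def lora_weight(w, lora_a, lora_b, alpha):
    """Return W + (alpha / r) * B @ A, the effective weight of a LoRA-adapted layer.

    w      : frozen pretrained weight, shape (d_out, d_in)
    lora_a : trainable down-projection, shape (r, d_in)
    lora_b : trainable up-projection, shape (d_out, r)
    """
    rank = len(lora_a)
    scale = alpha / rank
    delta = matmul(lora_b, lora_a)
    return [
        [wv + scale * dv for wv, dv in zip(w_row, d_row)]
        for w_row, d_row in zip(w, delta)
    ]


random.seed(0)
d_out, d_in, rank = 8, 8, 2  # hypothetical sizes chosen for illustration

# Frozen pretrained weight: never updated during adaptation.
w = [[random.gauss(0, 1) for _ in range(d_in)] for _ in range(d_out)]

# Standard LoRA initialization: A is small random, B is zero,
# so the adapted layer starts out identical to the pretrained one.
lora_a = [[random.gauss(0, 0.01) for _ in range(d_in)] for _ in range(rank)]
lora_b = [[0.0] * rank for _ in range(d_out)]

w_eff = lora_weight(w, lora_a, lora_b, alpha=4.0)

# Only A and B would receive gradients; W stays fixed.
trainable = rank * (d_in + d_out)
full = d_in * d_out
print(f"trainable parameters: {trainable} of {full}")
```

Because B starts at zero, the adapted layer initially reproduces the pretrained weights exactly, and the low rank limits how far training can move it. That constraint is the sense in which a low-rank adapter can act as a regularizer that keeps pretrained priors largely intact while still fitting a source image.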
