Incremental Human-Object Interaction Detection with Invariant Relation Representation Learning

Yana Wei; Zeen Chi; Chongyu Wang; Yu Wu; Shipeng Yan; Yongfei Liu; Xuming He

TL;DR

This work tackles open-world HOI detection under incremental learning by formulating Incremental Human-Object Interaction Detection (IHOID) and proposing an exemplar-free Incremental Relation Distillation (IRD) framework. IRD decouples object and relation learning and introduces two distillation strategies, Momentum Feature Distillation (MFD) and Concept Feature Distillation (CFD), backed by a momentum teacher and a dynamic concept-feature dictionary, to preserve invariant relation representations across phases. The approach addresses catastrophic forgetting, interaction drift, and zero-shot generalization, and outperforms strong baselines on HICO-DET and V-COCO, including zero-shot detectors and generalized incremental methods. The results show an improved stability-plasticity balance, greater robustness to drift, and better zero-shot HOI detection, indicating practical value for adaptive HOI systems in dynamic environments.

Abstract

In open-world environments, human-object interactions (HOIs) evolve continuously, challenging conventional closed-world HOI detection models. Inspired by humans' ability to progressively acquire knowledge, we explore incremental HOI detection (IHOID) to develop agents capable of discerning human-object relations in such dynamic environments. This setup confronts not only the common issue of catastrophic forgetting in incremental learning but also distinct challenges posed by interaction drift and by detecting zero-shot HOI combinations with sequentially arriving data. Therefore, we propose a novel exemplar-free incremental relation distillation (IRD) framework. IRD decouples the learning of objects and relations, and introduces two unique distillation losses for learning invariant relation features across different HOI combinations that share the same relation. Extensive experiments on the HICO-DET and V-COCO datasets demonstrate the superiority of our method over state-of-the-art baselines in mitigating forgetting, strengthening robustness against interaction drift, and generalizing to zero-shot HOIs. Code is available at https://github.com/weiyana/ContinualHOI
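
The TL;DR mentions a momentum teacher that supports feature distillation. As a rough illustration of that general idea only, the sketch below shows a generic exponential-moving-average (EMA) teacher update and a mean-squared-error feature distillation loss in plain Python. The function names, the momentum value, and the choice of MSE are assumptions made for illustration; this section does not specify the exact form of the MFD or CFD losses, so this should not be read as the authors' implementation.

```python
from typing import List


def ema_update(teacher: List[float], student: List[float],
               momentum: float = 0.99) -> List[float]:
    """Move teacher parameters toward the student with an EMA.

    `momentum` is a hypothetical value; higher means a slower-moving teacher.
    """
    if len(teacher) != len(student):
        raise ValueError("teacher and student must have the same length")
    return [momentum * t + (1.0 - momentum) * s
            for t, s in zip(teacher, student)]


def feature_distillation_loss(student_feat: List[float],
                              teacher_feat: List[float]) -> float:
    """Mean squared error between student and teacher relation features.

    MSE is an assumed stand-in; the paper's actual loss may differ.
    """
    if not student_feat or len(student_feat) != len(teacher_feat):
        raise ValueError("features must be non-empty and of equal length")
    squared = [(s - t) ** 2 for s, t in zip(student_feat, teacher_feat)]
    return sum(squared) / len(squared)


# Usage: after each student step, refresh the teacher, then penalize the
# student for drifting away from the teacher's relation feature.
teacher_params = ema_update([1.0, 0.0], [0.0, 1.0], momentum=0.5)
loss = feature_distillation_loss([1.0, 2.0], [1.0, 0.0])
```

A slowly updated teacher of this kind gives the student a stable target without storing past exemplars, which is the usual motivation for pairing EMA teachers with distillation in exemplar-free settings.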
