Looking Beyond the Known: Towards a Data Discovery Guided Open-World Object Detection

Anay Majee; Amitesh Gangrade; Rishabh Iyer

TL;DR

This work tackles Open-World Object Detection (OWOD), where detectors must discover unknown objects and incrementally learn new classes without retraining on all data. It introduces CROWD, a data-discovery guided framework that interleaves two stages: CROWD-Discover, which mines unknown instances by maximizing Submodular Conditional Gain (SCG), and CROWD-Learn, which trains with combinatorial (submodular) objectives that separate known and unknown representations while preserving prior knowledge. Using SCG and related submodular information functions (e.g., Graph-Cut, Facility-Location, and Log-Determinant), CROWD improves both unknown recall and known-class accuracy on the M-OWODB and S-OWODB benchmarks and generalizes better to Incremental Object Detection (IOD). The results support a set-based, combinatorial perspective on open-world learning for scalable, continual detection systems, and suggest further refinement of submodular objectives and constraints as future work.

Abstract

Open-World Object Detection (OWOD) enriches traditional object detectors by enabling continual discovery and integration of unknown objects via human guidance. However, existing OWOD approaches frequently suffer from semantic confusion between known and unknown classes, alongside catastrophic forgetting, leading to diminished unknown recall and degraded known-class accuracy. To overcome these challenges, we propose Combinatorial Open-World Detection (CROWD), a unified framework reformulating unknown object discovery and adaptation as an interwoven combinatorial (set-based) data-discovery (CROWD-Discover) and representation learning (CROWD-Learn) task. CROWD-Discover strategically mines unknown instances by maximizing Submodular Conditional Gain (SCG) functions, selecting representative examples distinctly dissimilar from known objects. Subsequently, CROWD-Learn employs novel combinatorial objectives that jointly disentangle known and unknown representations while maintaining discriminative coherence among known classes, thus mitigating confusion and forgetting. Extensive evaluations on OWOD benchmarks illustrate that CROWD achieves improvements of 2.83% and 2.05% in known-class accuracy on M-OWODB and S-OWODB, respectively, and nearly 2.4x unknown recall compared to leading baselines.
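As a rough illustration of the discovery step described above, the sketch below greedily maximizes a conditional gain f(A | P) = f(A ∪ P) - f(P), where P is the set of known objects and A is the set of selected unknown candidates. It is a minimal sketch, not the authors' implementation: the choice of a facility-location function for f, the use of cosine similarity over feature vectors, and the names `cosine_similarity`, `greedy_conditional_gain`, `candidates`, `known`, and `budget` are all assumptions made for illustration.

```python
import numpy as np


def cosine_similarity(a, b):
    """Pairwise cosine similarity between rows of a and rows of b."""
    a = a / np.linalg.norm(a, axis=1, keepdims=True)
    b = b / np.linalg.norm(b, axis=1, keepdims=True)
    return a @ b.T


def greedy_conditional_gain(candidates, known, budget):
    """Greedily pick `budget` candidates that maximize f(A | P) = f(A u P) - f(P).

    Assumption: f is a facility-location function over the candidate pool,
    f(S) = sum_i max_{j in S} sim(i, j), and P is the set of known features.
    """
    sim_cc = cosine_similarity(candidates, candidates)  # candidate vs candidate
    sim_ck = cosine_similarity(candidates, known)       # candidate vs known

    # Coverage each candidate already receives from the known set P.
    covered = sim_ck.max(axis=1)

    n = candidates.shape[0]
    chosen = np.zeros(n, dtype=bool)
    selected = []
    for _ in range(min(budget, n)):
        # Marginal gain of adding column j: extra coverage beyond `covered`.
        gains = np.maximum(sim_cc - covered[:, None], 0.0).sum(axis=0)
        gains[chosen] = -np.inf  # never pick the same candidate twice
        best = int(np.argmax(gains))
        selected.append(best)
        chosen[best] = True
        covered = np.maximum(covered, sim_cc[:, best])
    return selected
```

Under these assumptions, candidates that resemble known objects contribute almost no marginal gain, so the selection favors examples that are both representative of the candidate pool and dissimilar from the known set, which is the behavior the abstract attributes to CROWD-Discover.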
