Wasserstein Distance-based Expansion of Low-Density Latent Regions for Unknown Class Detection

Prakash Mallick, Feras Dayoub, Jamie Sherrah

TL;DR

This work tackles open-set object detection by pushing known-class features into tighter, high-density clusters while expanding low-density regions where unknown objects reside. It introduces two key innovations to the Open-Det framework: Class Wasserstein Anchor (CWA), a Wasserstein-distance–based loss aligning logits with class anchors, and spectral normalisation to enforce distance-aware representations. Coupled with a supervised contrastive loss (CFL) and an unknown-probability loss (UPL), the approach yields substantial reductions in open-set errors and improved novelty detection across VOC/COCO benchmarks, with notable gains under both ResNet-50 and Swin-T backbones. The results demonstrate the practical value of optimal-transport concepts and spectral regularisation for robust open-set detection, while also outlining limitations and directions for theoretical generalisation bounds and reproducibility.
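To give a feel for the distance that CWA is named after, the sketch below computes the Wasserstein-1 distance between two equal-size one-dimensional samples. It is not the paper's CWA loss: the function name `wasserstein_1d`, the toy "logit" values, and the constant anchor sample are all assumptions made purely for illustration. It only shows that a tight cluster sits closer (in optimal-transport terms) to its anchor than a spread-out one.

```python
def wasserstein_1d(xs, ys):
    """Wasserstein-1 distance between two equal-size 1-D empirical samples.

    With uniform weights and equal sample sizes, the optimal transport plan
    in one dimension pairs the i-th smallest value of xs with the i-th
    smallest value of ys, so the distance is the mean absolute difference
    of the sorted samples.
    """
    if len(xs) != len(ys) or len(xs) == 0:
        raise ValueError("samples must be non-empty and of equal size")
    total = sum(abs(a - b) for a, b in zip(sorted(xs), sorted(ys)))
    return total / len(xs)


# Hypothetical 1-D "logit" samples for one class (illustrative values only).
tight_logits = [4.9, 5.0, 5.1]   # compact cluster
loose_logits = [3.0, 5.0, 7.0]   # spread-out cluster
anchor = [5.0, 5.0, 5.0]         # hypothetical class anchor, repeated

print(wasserstein_1d(tight_logits, anchor))  # small distance
print(wasserstein_1d(loose_logits, anchor))  # larger distance
```

Minimising such a distance pulls a class's samples towards its anchor, which is the intuition behind tighter, high-density known-class clusters.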

Abstract

This paper addresses the significant challenge in open-set object detection (OSOD): the tendency of state-of-the-art detectors to erroneously classify unknown objects as known categories with high confidence. We present a novel approach that effectively identifies unknown objects by distinguishing between high and low-density regions in latent space. Our method builds upon the Open-Det (OD) framework, introducing two new elements to the loss function. These elements enhance the known embedding space's clustering and expand the unknown space's low-density regions. The first addition is the Class Wasserstein Anchor (CWA), a new function that refines the classification boundaries. The second is a spectral normalisation step, improving the robustness of the model. Together, these augmentations to the existing Contrastive Feature Learner (CFL) and Unknown Probability Learner (UPL) loss functions significantly improve OSOD performance. Our proposed OpenDet-CWA (OD-CWA) method demonstrates: a) a reduction in open-set errors by approximately 17%-22%, b) an enhancement in novelty detection capability by 1.5%-16%, and c) a decrease in the wilderness index by 2%-20% across various open-set scenarios. These results represent a substantial advancement in the field, showcasing the potential of our approach in managing the complexities of open-set object detection.
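The abstract reports the wilderness index without defining it. As a rough guide, the sketch below assumes the precision-ratio definition commonly used in the open-set detection literature: closed-set precision divided by open-set precision, minus one. The function name, argument names, and counts are hypothetical, and the paper's exact evaluation protocol (for example, the recall level at which it is measured) may differ.

```python
def wilderness_index(tp_known, fp_known, fp_unknown):
    """Assumed definition: WI = P_K / P_{K u U} - 1.

    tp_known   : true positives on known classes
    fp_known   : false positives caused by known-class objects or background
    fp_unknown : false positives caused by unknown objects being labelled
                 as a known class (only present in the open-set setting)
    """
    if tp_known <= 0:
        raise ValueError("tp_known must be positive")
    closed_set_precision = tp_known / (tp_known + fp_known)
    open_set_precision = tp_known / (tp_known + fp_known + fp_unknown)
    return closed_set_precision / open_set_precision - 1.0


# Hypothetical counts: 10 unknown objects were detected as known classes.
print(wilderness_index(tp_known=80, fp_known=20, fp_unknown=10))
```

Under this definition the index is zero when no unknown object is mistaken for a known class, and it grows as such mistakes accumulate, so lower values are better.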

Paper Structure (20 sections, 6 equations, 9 figures, 10 tables)

Figures (9)

  • Figure 1: (a) A model trained with an existing method, i.e., Open-Det (OD) (both ResNet- and Transformer-based), identifies unknown entities to a certain extent but remains largely susceptible to misclassifying a diverse range of unfamiliar objects (black triangles, e.g. zebra, elephant) as known classes (coloured dots, e.g. dog, cow). (b) Our method identifies unknown objects by enhancing the compactness among proposal features, thereby assisting the uncertainty-based optimiser in extending low-density regions (dotted striped regions in between boundaries) beyond the baseline (OD).
  • Figure 2: UMAP [mcinnes2020umap] visualisation of latent features. Three VOC classes (coloured dots; ids $0, 1, 2$) are treated as known classes, and non-VOC classes in COCO (black crosses; id $80$) as unknown classes. OD-CWA exhibits better separation than OD in terms of both open-set and closed-set classes.
  • Figure 3: OD-CWA consists of a Contrastive Feature Learner (CFL), Spectral Normalisation (SN), a Class Wasserstein Anchor (CWA) learner and an Unknown Probability Learner (UPL). The CFL component [han2022opendet] encodes proposal features into low-dimensional embeddings using the Contrastive Head (CH), which is optimised with the Instance Contrastive Loss ($\mathcal{L}_{IC}$). The weights of the linear output layer are passed through a spectral normalisation step that maintains the distance-awareness property. The UPL component then uses the cosine distances between the embeddings and the spectrally normalised weights to learn the probabilities for both the known classes ($C_K$) and the unknown class ($C_U$). The Class Wasserstein Anchor loss ($\mathcal{L}_{CWA}$) helps both CFL and UPL increase cluster compactness by finding the optimal transport plan from the logit space to the anchor space. The figure also illustrates how the components work: coloured dots and triangles represent dimension-reduced proposal features of known and unknown classes, respectively; coloured squares represent proposal embeddings; and the coloured $+$ sketched squares inside the CWA box represent scaled and transformed logits.
  • Figure 4: Comparing ORE, OSOD and OVD in terms of generalizability, adaptability over time and scope of recognition.
  • Figure 5: Qualitative comparison between the baseline OD (top) and the proposed OD-CWA (bottom). Both models are trained on VOC, and the detection results are visualised on images from COCO. White annotations correspond to known classes seen by the model, and purple annotations correspond to unknown classes.
  • ...and 4 more figures
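
The Figure 3 caption names two operations on the linear output layer: spectral normalisation of its weights, and cosine-based scoring of embeddings against those weights. The sketch below illustrates both in plain Python, under stated assumptions: the largest singular value is estimated by power iteration (a standard technique, not necessarily the authors' implementation), the scores are computed as cosine similarities, and all function names and the toy weight matrix are hypothetical.

```python
import math


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(a * b for a, b in zip(row, vector)) for row in matrix]


def l2_norm(vector):
    return math.sqrt(sum(a * a for a in vector))


def spectral_norm(weights, n_iter=50):
    """Estimate the largest singular value of `weights` by power iteration.

    Assumes a non-zero matrix and a start vector that is not orthogonal
    to the top right-singular vector.
    """
    weights_t = [list(col) for col in zip(*weights)]
    v = [1.0] * len(weights[0])
    for _ in range(n_iter):
        v = matvec(weights_t, matvec(weights, v))  # apply W^T W
        scale = l2_norm(v)
        v = [a / scale for a in v]
    return l2_norm(matvec(weights, v))


def spectral_normalise(weights, n_iter=50):
    """Rescale `weights` so that its largest singular value is about 1."""
    sigma = spectral_norm(weights, n_iter)
    return [[a / sigma for a in row] for row in weights]


def cosine_logits(embedding, weights):
    """Cosine similarity between one embedding and each class weight row."""
    e_norm = l2_norm(embedding)
    return [
        sum(a * b for a, b in zip(row, embedding)) / (l2_norm(row) * e_norm)
        for row in weights
    ]


# Toy 2-class, 2-dimensional weight matrix (illustrative values only).
toy_weights = [[3.0, 0.0], [0.0, 1.0]]
normalised = spectral_normalise(toy_weights)
print(spectral_norm(normalised))
print(cosine_logits([1.0, 0.0], normalised))
```

Cosine similarity is invariant to a uniform rescaling of the weights, so in this toy sketch the two steps are shown independently rather than as a demonstration of their combined effect during training.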

Theorems & Definitions (2)

  • Definition 1: Shen et al. [shen2018wasserstein]
  • Remark 4.1