Table of Contents
Fetching ...

Informed Decision-Making through Advancements in Open Set Recognition and Unknown Sample Detection

Atefeh Mahdavi, Marco Carvalho

TL;DR

This work tackles open set recognition by introducing a Superlative Loss that reconfigures the neural network's feature space to better separate known classes and detect unknowns without extra data or complex architectures. It operates on penultimate-layer activations, computing per-class mean vectors (MAVs) and projecting them with PCA to a 3D space, then optimizing a loss $L_s = BD - IS + IC$ that encourages greater inter-class separation and controlled intra-class spread. Inference uses per-class thresholds derived from distances to MAVs to decide between known and unknown classes, enabling robust open-set detection. Empirical results on MNIST, Fashion-MNIST, and CIFAR-10 show that $L_s$, especially when combined with cross-entropy or OpenMax, improves accuracy and F1 scores and enhances unknown class detection, suggesting practical benefits for decision-support systems in dynamic environments.

Abstract

Machine learning-based techniques open up many opportunities and improvements to derive deeper and more practical insights from data that can help businesses make informed decisions. However, the majority of these techniques focus on the conventional closed-set scenario, in which the label spaces for the training and test sets are identical. Open set recognition (OSR) aims to bring classification tasks in a situation that is more like reality, which focuses on classifying the known classes as well as handling unknown classes effectively. In such an open-set problem the gathered samples in the training set cannot encompass all the classes and the system needs to identify unknown samples at test time. On the other hand, building an accurate and comprehensive model in a real dynamic environment presents a number of obstacles, because it is prohibitively expensive to train for every possible example of unknown items, and the model may fail when tested in testbeds. This study provides an algorithm exploring a new representation of feature space to improve classification in OSR tasks. The efficacy and efficiency of business processes and decision-making can be improved by integrating OSR, which offers more precise and insightful predictions of outcomes. We demonstrate the performance of the proposed method on three established datasets. The results indicate that the proposed model outperforms the baseline methods in accuracy and F1-score.

Informed Decision-Making through Advancements in Open Set Recognition and Unknown Sample Detection

TL;DR

This work tackles open set recognition by introducing a Superlative Loss that reconfigures the neural network's feature space to better separate known classes and detect unknowns without extra data or complex architectures. It operates on penultimate-layer activations, computing per-class mean vectors (MAVs) and projecting them with PCA to a 3D space, then optimizing a loss that encourages greater inter-class separation and controlled intra-class spread. Inference uses per-class thresholds derived from distances to MAVs to decide between known and unknown classes, enabling robust open-set detection. Empirical results on MNIST, Fashion-MNIST, and CIFAR-10 show that , especially when combined with cross-entropy or OpenMax, improves accuracy and F1 scores and enhances unknown class detection, suggesting practical benefits for decision-support systems in dynamic environments.

Abstract

Machine learning-based techniques open up many opportunities and improvements to derive deeper and more practical insights from data that can help businesses make informed decisions. However, the majority of these techniques focus on the conventional closed-set scenario, in which the label spaces for the training and test sets are identical. Open set recognition (OSR) aims to bring classification tasks in a situation that is more like reality, which focuses on classifying the known classes as well as handling unknown classes effectively. In such an open-set problem the gathered samples in the training set cannot encompass all the classes and the system needs to identify unknown samples at test time. On the other hand, building an accurate and comprehensive model in a real dynamic environment presents a number of obstacles, because it is prohibitively expensive to train for every possible example of unknown items, and the model may fail when tested in testbeds. This study provides an algorithm exploring a new representation of feature space to improve classification in OSR tasks. The efficacy and efficiency of business processes and decision-making can be improved by integrating OSR, which offers more precise and insightful predictions of outcomes. We demonstrate the performance of the proposed method on three established datasets. The results indicate that the proposed model outperforms the baseline methods in accuracy and F1-score.
Paper Structure (9 sections, 7 equations, 6 figures, 1 table)

This paper contains 9 sections, 7 equations, 6 figures, 1 table.

Figures (6)

  • Figure 1: A closed-set classifier distinguishes known classes like dogs, birds, and elephants, but it wrongly categorizes unknown samples like cats and airplanes as known ones during testing.
  • Figure 2: An overview of proposed method.
  • Figure 3: Visualization of the network architecture.
  • Figure 4: This figure compares precision, recall, and F1 scores for MNIST test samples from three sets ($\textbf{Set{1}}$, $\textbf{Set{2}}$, and $\textbf{Set{3}}$) across 12 runs, considering all methods. It also displays the accumulative F1 score over 36 runs, combining results from all sets.
  • Figure 5: Feature space visualization of MNIST dataset in the experiments of $CE$ vs $\mathcal{L}{s} + CE$. Labels 0,2,3,4,6,9 represent the known classes.
  • ...and 1 more figures