Familiarity-Based Open-Set Recognition Under Adversarial Attacks
Philip Enevoldsen, Christian Gundersen, Nico Lang, Serge Belongie, Christian Igel
TL;DR
This paper investigates the vulnerability of familiarity-based open-set recognition (OSR) scores, particularly MLS and MSP, to gradient-based adversarial perturbations. It distinguishes False Familiarity attacks from False Novelty attacks, analyzes their effectiveness in informed and uninformed settings on TinyImageNet, and introduces the Adversarial Reaction Score (ARS) as a potential OSR scoring rule. The findings show that MLS is easily manipulated: informed attacks can reverse OSR rankings, and iterative methods cause the strongest disruption. ARS offers limited improvement over MLS. The work highlights the need for robust scoring rules and targeted defense strategies to ensure reliable OSR performance in adversarial contexts, with implications for real-world deployments where novelty detection is critical.
Abstract
Open-set recognition (OSR), the identification of novel categories, can be a critical component when deploying classification models in real-world applications. Recent work has shown that familiarity-based scoring rules such as the Maximum Softmax Probability (MSP) or the Maximum Logit Score (MLS) are strong baselines when the closed-set accuracy is high. However, one potential weakness of familiarity-based OSR is its vulnerability to adversarial attacks. Here, we study gradient-based adversarial attacks on familiarity scores for both attack types, False Familiarity and False Novelty, and evaluate their effectiveness in informed and uninformed settings on TinyImageNet. Furthermore, we explore how novel and familiar samples react to adversarial attacks and formulate the adversarial reaction score as an alternative OSR scoring rule, which shows a high correlation with the MLS familiarity score.
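To make the two familiarity scores and the two attack directions concrete, the following is a minimal sketch, not the implementation used in the paper. It assumes a toy linear classifier (logits = W x) so that the gradient of the MLS with respect to the input can be written by hand, and it uses a single signed-gradient step as a stand-in for a gradient-based attack. The names `msp`, `mls`, `linear_logits`, `signed_gradient_step`, and the values of `W`, `x`, and `eps` are illustrative assumptions.

```python
import math


def msp(logits):
    """Maximum Softmax Probability: the largest entry of softmax(logits)."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in logits]
    return max(exps) / sum(exps)


def mls(logits):
    """Maximum Logit Score: the largest raw logit."""
    return max(logits)


def linear_logits(W, x):
    """Logits of a toy linear classifier (assumption: no bias, no hidden layers)."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]


def sign(v):
    return (v > 0) - (v < 0)


def signed_gradient_step(W, x, eps, direction):
    """One signed-gradient step on the MLS of the toy linear model.

    direction=+1 pushes the score up (False Familiarity: a novel input looks known).
    direction=-1 pushes the score down (False Novelty: a known input looks novel).
    """
    logits = linear_logits(W, x)
    k = max(range(len(logits)), key=logits.__getitem__)
    grad = W[k]  # for a linear model, d(max logit)/dx is the winning row of W
    return [xi + direction * eps * sign(g) for xi, g in zip(x, grad)]


# Hypothetical two-class, two-feature example.
W = [[1.0, -2.0], [0.5, 0.5]]
x = [0.2, 0.1]
clean_score = mls(linear_logits(W, x))
false_familiarity = mls(linear_logits(W, signed_gradient_step(W, x, 0.1, +1)))
false_novelty = mls(linear_logits(W, signed_gradient_step(W, x, 0.1, -1)))
print(false_novelty < clean_score < false_familiarity)
```

In a deep network the gradient would come from automatic differentiation rather than a row of `W`, and an iterative attack would repeat the step while keeping the perturbation within a budget; both are outside the scope of this sketch.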
