UNICAD: A Unified Approach for Attack Detection, Noise Reduction and Novel Class Identification
Alvaro Lopez Pellicer, Kittipos Giatgong, Yi Li, Neeraj Suri, Plamen Angelov
TL;DR
UNICAD addresses the combined challenges of adversarial attacks and unseen classes in image classification. It unifies attack detection, noise reduction, and novel class identification in a prototype-based, similarity-driven architecture with five layers: a Feature Extraction Layer, a Prototype Conditional Probability Layer, a Global Decision Making Layer, a Denoising Layer trained with a novel combined loss $\mathcal{L}_{\text{comb}}$, and an Attack Decision Making Layer. Together these layers handle attacks and emerging classes dynamically. The key contributions are the prototype-based similarity framework, an embedded denoising autoencoder that preserves both latent and pixel fidelity, and the ability to detect and incorporate new classes without full retraining. Experiments on CIFAR-10 with backbones such as VGG-16 and DINOv2 show strong robustness under FGSM, PGD, and C&W attacks, competitive unseen-class detection, and improved retention of clean accuracy. The work points toward unified defenses that balance accuracy, interpretability, and open-set adaptability in dynamic environments where distribution shifts and adversarial threats occur.
Abstract
As the use of Deep Neural Networks (DNNs) becomes pervasive, their vulnerability to adversarial attacks and their limitations in handling unseen classes pose significant challenges. The state of the art offers discrete solutions that each tackle an individual issue: specific adversarial attack scenarios, classification, or evolving learning. However, real-world systems need to detect and recover from a wide range of adversarial attacks without sacrificing classification accuracy, and to act flexibly in unseen scenarios. In this paper, UNICAD is proposed as a novel framework that integrates a variety of techniques to provide an adaptive solution. For the targeted image classification task, UNICAD classifies images accurately, detects unseen classes, and recovers from adversarial attacks using prototype- and similarity-based DNNs with denoising autoencoders. Our experiments on the CIFAR-10 dataset highlight UNICAD's effectiveness in adversarial mitigation and unseen-class classification, outperforming traditional models.
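
To make the prototype-and-similarity idea concrete, the following is a minimal sketch of how a similarity-based decision with unseen-class detection could look. It is not the authors' implementation: the cosine similarity measure, the `PrototypeClassifier` class, the `unseen_threshold` value, and the toy three-dimensional feature vectors are all assumptions chosen for illustration. In the actual framework the features would come from a backbone such as VGG-16 or DINOv2.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


class PrototypeClassifier:
    """Hypothetical nearest-prototype classifier with an 'unseen' option."""

    def __init__(self, unseen_threshold=0.8):
        # label -> list of prototype feature vectors (assumed structure)
        self.prototypes = {}
        # Assumed cut-off: below this similarity the input is flagged as unseen.
        self.unseen_threshold = unseen_threshold

    def add_prototype(self, label, features):
        """Register a prototype; a new class needs no retraining in this sketch."""
        self.prototypes.setdefault(label, []).append(list(features))

    def predict(self, features):
        """Return (label, similarity); label is None when no prototype is close."""
        best_label, best_sim = None, -1.0
        for label, protos in self.prototypes.items():
            for proto in protos:
                sim = cosine_similarity(features, proto)
                if sim > best_sim:
                    best_label, best_sim = label, sim
        if best_sim < self.unseen_threshold:
            return None, best_sim
        return best_label, best_sim


clf = PrototypeClassifier(unseen_threshold=0.8)
clf.add_prototype("cat", [1.0, 0.0, 0.0])
clf.add_prototype("dog", [0.0, 1.0, 0.0])

known_label, known_sim = clf.predict([0.9, 0.1, 0.0])
unseen_label, unseen_sim = clf.predict([0.0, 0.0, 1.0])

# Incorporate the new class by adding a prototype, then classify again.
clf.add_prototype("bird", [0.0, 0.0, 1.0])
learned_label, learned_sim = clf.predict([0.0, 0.0, 1.0])
```

In this sketch the first query matches the "cat" prototype, the second falls below the threshold and is flagged as unseen (`None`), and after a "bird" prototype is added the same query is assigned to the new class. The attack detection and denoising stages are deliberately left out.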
