DeBUGCN -- Detecting Backdoors in CNNs Using Graph Convolutional Networks
Akash Vartak, Khondoker Murad Hossain, Tim Oates
TL;DR
This work tackles the problem of detecting backdoors in CNNs by representing trained networks as graphs built from static layer weights and applying a Graph Convolutional Network as a binary trojan detector. The DeBUGCN pipeline constructs graphs from the final FC layer (and optionally early conv layers) and uses node/edge features to train a model-agnostic classifier that distinguishes clean from trojaned networks. Empirical results across MNIST, CIFAR-10, and the TrojAI dataset show high detection accuracy and faster computation compared with baselines, with robust performance under weight permutations and trigger variations. The study also demonstrates that incorporating convolutional filter graphs in a multimodal GCN further improves performance and suggests a scalable, topology-aware approach for DNN security against backdoor attacks.
Abstract
Deep neural networks (DNNs) are becoming commonplace in critical applications, making their susceptibility to backdoor (trojan) attacks a significant problem. In this paper, we introduce DeBUGCN, a novel pipeline that detects backdoored models using graph convolutional networks (GCNs). To the best of our knowledge, ours is the first use of GCNs for trojan detection. We use the static weights of a DNN to create a graph structure of its layers. A GCN is then used as a binary classifier on these graphs, yielding a trojan or clean determination for the DNN. To demonstrate the efficacy of our pipeline, we train hundreds of clean and trojaned CNN models on the MNIST handwritten digits and CIFAR-10 image datasets, and report DeBUGCN's classification results on these models. For a true in-the-wild use case, we evaluate our pipeline on the TrojAI dataset, which consists of various CNN architectures, thus showing the robustness and model-agnostic behaviour of DeBUGCN. Furthermore, when compared with state-of-the-art trojan detection algorithms on several datasets, DeBUGCN is faster and more accurate.
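The weights-to-graph-to-classifier idea can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the abstract does not specify how the graph is built. The bipartite construction, the magnitude-based pruning rule (`keep_fraction`), the two structural node features, and the single-layer GCN with mean pooling are all assumptions made for illustration, and the GCN parameters are random rather than trained. The helper names `fc_weights_to_graph`, `normalize_adjacency`, and `gcn_score` are hypothetical.

```python
import numpy as np


def fc_weights_to_graph(W, keep_fraction=0.5):
    """Turn an FC weight matrix W (out_dim x in_dim) into a bipartite graph.

    Assumed construction (not from the paper): one node per input neuron and
    one per output neuron; an undirected edge weighted by |W[j, i]| is kept
    for the largest-magnitude `keep_fraction` of the weights.
    """
    out_dim, in_dim = W.shape
    n = in_dim + out_dim
    mag = np.abs(W)
    threshold = np.quantile(mag, 1.0 - keep_fraction)
    kept = mag * (mag >= threshold)

    A = np.zeros((n, n))
    A[in_dim:, :in_dim] = kept      # output-node rows, input-node columns
    A[:in_dim, in_dim:] = kept.T    # mirror to make the graph undirected

    # Structural node features: weighted degree and number of kept edges.
    X = np.stack([A.sum(axis=1), (A > 0).sum(axis=1)], axis=1).astype(float)
    return A, X


def normalize_adjacency(A):
    """Symmetric GCN normalization: D^{-1/2} (A + I) D^{-1/2}."""
    A_hat = A + np.eye(A.shape[0])
    d_inv_sqrt = 1.0 / np.sqrt(A_hat.sum(axis=1))  # degrees >= 1 due to self-loops
    return A_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]


def gcn_score(A, X, W1, w_out):
    """One GCN layer, mean-pool readout, and a sigmoid 'trojan' score."""
    H = np.maximum(normalize_adjacency(A) @ X @ W1, 0.0)  # ReLU(A_norm X W1)
    graph_embedding = H.mean(axis=0)                      # order-independent readout
    logit = graph_embedding @ w_out
    return 1.0 / (1.0 + np.exp(-logit))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    fc = rng.normal(size=(10, 32))       # stand-in for a final FC layer
    W1 = rng.normal(size=(2, 8))         # untrained GCN layer weights
    w_out = rng.normal(size=8)           # untrained readout weights

    A, X = fc_weights_to_graph(fc)
    score = gcn_score(A, X, W1, w_out)

    # Shuffling neurons relabels the graph's nodes but leaves its structure intact.
    shuffled = fc[rng.permutation(10)][:, rng.permutation(32)]
    score_shuffled = gcn_score(*fc_weights_to_graph(shuffled), W1, w_out)
    print(score, score_shuffled, np.isclose(score, score_shuffled))
```

Because the node features are structural and the readout is a mean over nodes, reordering the neurons of the layer only relabels the graph, so the score is unchanged up to floating-point rounding. In a real detector, `W1` and `w_out` would be learned from a labeled collection of clean and trojaned models.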
