Patch Synthesis for Property Repair of Deep Neural Networks
Zhiming Chi, Jianan Ma, Pengfei Yang, Cheng-Chao Huang, Renjue Li, Xiaowei Huang, Lijun Zhang
TL;DR
PatchPro tackles local robustness repair for deep neural networks by introducing patch modules trained with a DeepPoly-based loss to provably fix adversarial vulnerabilities within a perturbation neighborhood. An external indicator routes inputs to neighborhood-specific patches, while a patch allocation strategy enables generalization to unseen data and preserves the original network's performance. The approach scales to large networks by performing repairs in a reduced feature space and by adding patches to the network output rather than altering the base model. Empirical results on MNIST, CIFAR-10, Tiny ImageNet, and ACAS Xu demonstrate provable repairs with high repair success, strong generalization, and competitive efficiency compared to state-of-the-art methods.
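The routing idea above can be illustrated with a minimal sketch. It assumes L-infinity neighborhoods of a shared radius and patches that are simply added to the base network's output. The class name `PatchedNetwork`, its parameters, and the toy linear model are hypothetical and chosen for illustration only; they are not taken from the PatchPro implementation.

```python
import numpy as np


class PatchedNetwork:
    """Frozen base network plus neighborhood-specific patches added at the output.

    Illustrative assumption: each neighborhood is an L-infinity ball of the
    same radius around a stored center.
    """

    def __init__(self, base_fn, centers, radius, patch_fns):
        self.base_fn = base_fn                      # original network, never modified
        self.centers = np.asarray(centers, dtype=float)
        self.radius = float(radius)
        self.patch_fns = list(patch_fns)            # one patch per neighborhood

    def indicator(self, x):
        """Return the index of the first neighborhood containing x, or None."""
        x = np.asarray(x, dtype=float)
        dists = np.max(np.abs(self.centers - x), axis=1)  # L-infinity distances
        hits = np.flatnonzero(dists <= self.radius)
        return int(hits[0]) if hits.size else None

    def __call__(self, x):
        x = np.asarray(x, dtype=float)
        logits = self.base_fn(x)
        idx = self.indicator(x)
        if idx is None:
            return logits                            # outside every neighborhood
        return logits + self.patch_fns[idx](x)       # patch corrects the output


# Toy usage: a 2-class linear "network" and one constant patch.
base = lambda x: np.eye(2) @ x
patch = lambda x: np.array([1.0, 0.0])               # boosts the class-0 logit
net = PatchedNetwork(base, centers=[[0.0, 0.0]], radius=0.1, patch_fns=[patch])

inside = net(np.array([0.05, 0.08]))                 # routed to patch 0
outside = net(np.array([0.5, 0.8]))                  # base output unchanged
```

Because the base function is left untouched, inputs that fall outside every neighborhood receive exactly the original output, which is one way to read the claim that the original performance is maintained.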
Abstract
Deep neural networks (DNNs) are prone to various dependability issues, such as adversarial attacks, which hinder their adoption in safety-critical domains. Recently, NN repair techniques have been proposed to address these issues while preserving original performance by locating and modifying guilty neurons and their parameters. However, existing repair approaches are often limited to specific datasets and do not provide theoretical guarantees for the effectiveness of the repairs. To address these limitations, we introduce PatchPro, a novel patch-based approach for property-level repair of DNNs, focusing on local robustness. The key idea behind PatchPro is to construct patch modules that, when integrated with the original network, provide specialized repairs for all samples within the robustness neighborhood while maintaining the network's original performance. Our method incorporates formal verification and a heuristic mechanism for allocating patch modules, enabling it to defend against adversarial attacks and generalize to other inputs. PatchPro demonstrates superior efficiency, scalability, and repair success rates compared to existing DNN repair methods, realizing provable property-level repair in 100% of cases across multiple high-dimensional datasets.
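To make the role of formal verification more concrete, the sketch below shows how sound output bounds over a robustness neighborhood can be turned into a violation score that is zero only when the property is certified. Note the substitution: it uses plain interval bound propagation, which is simpler and looser than DeepPoly, and the exact form of the score is an assumption for illustration rather than the loss defined in the paper. All function names are hypothetical.

```python
import numpy as np


def interval_affine(lo, hi, W, b):
    """Propagate the box [lo, hi] through the affine map x -> W @ x + b."""
    W_pos = np.maximum(W, 0.0)
    W_neg = np.minimum(W, 0.0)
    new_lo = W_pos @ lo + W_neg @ hi + b
    new_hi = W_pos @ hi + W_neg @ lo + b
    return new_lo, new_hi


def interval_relu(lo, hi):
    """ReLU is monotone, so it can be applied to both bounds directly."""
    return np.maximum(lo, 0.0), np.maximum(hi, 0.0)


def output_bounds(x, eps, layers):
    """Sound output bounds for all inputs within L-infinity distance eps of x.

    `layers` is a list of (W, b) pairs; ReLU follows every layer but the last.
    """
    lo, hi = x - eps, x + eps
    for i, (W, b) in enumerate(layers):
        lo, hi = interval_affine(lo, hi, W, b)
        if i < len(layers) - 1:
            lo, hi = interval_relu(lo, hi)
    return lo, hi


def robustness_violation(lo, hi, label):
    """Sum of max(0, hi[j] - lo[label]) over all classes j != label.

    A value of zero means the bounds show that no other logit can exceed
    the labeled one anywhere in the neighborhood.
    """
    gaps = np.delete(hi, label) - lo[label]
    return float(np.sum(np.maximum(gaps, 0.0)))


# Toy two-layer network with two classes.
layers = [
    (np.eye(2), np.zeros(2)),
    (np.array([[1.0, -1.0], [-1.0, 1.0]]), np.zeros(2)),
]
x = np.array([1.0, 0.0])

small = robustness_violation(*output_bounds(x, 0.1, layers), label=0)
large = robustness_violation(*output_bounds(x, 0.6, layers), label=0)
```

In this toy case the score is zero for the small radius and positive for the large one. Minimizing such a score over patch parameters is one plausible way to drive a repair toward a certified state; any tighter sound relaxation could replace the interval step without changing the overall structure.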
