Provable Repair of Deep Neural Network Defects by Preimage Synthesis and Property Refinement

Jianan Ma; Jingyi Wang; Qi Xuan; Zhen Wang

TL;DR

ProRepair is a provable neural network repair framework that handles backdoor, corruption, adversarial, and safety-violation defects within a single pipeline. It synthesizes a small proxy box to characterize the feature-space preimage, derives from that box a bounded distance term that guides the repair, and applies adaptive property refinement. By repairing the early feature extractor rather than the classifier, ProRepair preserves the preimage structure and obtains formal guarantees through linear-relaxation-based bound propagation. The approach combines point-wise proxy-box synthesis with region-wise counterexample-guided refinement. It achieves speedups of 5x to 2000x over prior provable methods, repairs all 36 safety-property violations in ACAS Xu, and scales to high-dimensional spaces. Experiments across four repair tasks and six benchmarks show strong efficacy, robustness to the choice of activation function, and notable generalization improvements, which suggests practical potential for post-deployment safety in real-world systems. The work also provides an open-source toolkit to support adoption and further work on provable neural network repair.

Abstract

It is known that deep neural networks may exhibit dangerous behaviors under various security threats (e.g., backdoor attacks, adversarial attacks, and safety property violations), and there is an ongoing arms race between attackers and defenders. In this work, we take a complementary perspective: we use recent progress on "neural network repair" to mitigate these security threats and to repair the various kinds of neural network defects they cause within a unified framework, offering a potential silver-bullet solution for real-world scenarios. Existing repair techniques suffer from limitations such as lack of guarantees, limited scalability, and considerable overhead. To substantially push their boundary toward more practical contexts, we propose ProRepair, a novel provable neural network repair framework driven by formal preimage synthesis and property refinement. The key intuitions are: (i) synthesizing a precise proxy box to characterize the feature-space preimage, from which we derive a bounded distance term sufficient to guide the subsequent repair step towards the correct outputs, and (ii) performing property refinement to enable surgical corrections and scale to more complex tasks. We evaluate ProRepair on four security-threat repair tasks across six benchmarks, and the results demonstrate that it outperforms existing methods in effectiveness, efficiency, and scalability. For point-wise repair, ProRepair corrects models while preserving performance and achieving significantly improved generalization, with a speedup of 5x to 2000x over existing provable approaches. For region-wise repair, ProRepair successfully repairs all 36 safety property violation instances (compared to 8 by the best existing method) and can handle spaces of 18x higher dimension.
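
To make intuition (i) concrete, the following is a minimal sketch, not ProRepair's actual algorithm. It assumes a purely linear classifier head with weights `W` and bias `b`, and an axis-aligned box `[lo, hi]` in feature space; all function and variable names are hypothetical. It shows two ingredients the abstract alludes to: checking that every feature vector in a box is mapped to the desired label, and a bounded distance from a feature vector to that box that is zero exactly when the vector lies inside it.

```python
import numpy as np

def box_distance(z, lo, hi):
    """L1 distance from feature vector z to the box [lo, hi] (0 iff z is inside)."""
    below = np.maximum(lo - z, 0.0)   # how far z falls under the lower bounds
    above = np.maximum(z - hi, 0.0)   # how far z exceeds the upper bounds
    return float(np.sum(below + above))

def box_certifies_label(W, b, lo, hi, label):
    """Check that a linear head W z + b predicts `label` for every z in [lo, hi].

    For each rival class j, the margin (W[label] - W[j]) . z + (b[label] - b[j])
    is linear in z, so its minimum over a box is attained at a corner that can be
    picked coordinate-wise. This is exact for a linear head (an assumption here).
    """
    for j in range(W.shape[0]):
        if j == label:
            continue
        d = W[label] - W[j]
        worst_margin = np.sum(np.where(d >= 0, d * lo, d * hi)) + b[label] - b[j]
        if worst_margin <= 0:
            return False
    return True

# Toy two-class head and a candidate proxy box (illustrative values only).
W = np.array([[1.0, 0.0], [0.0, 1.0]])
b = np.zeros(2)
lo, hi = np.array([2.0, 0.0]), np.array([3.0, 1.0])

certified = box_certifies_label(W, b, lo, hi, label=0)
dist_outside = box_distance(np.array([0.0, 0.5]), lo, hi)
dist_inside = box_distance(np.array([2.5, 0.5]), lo, hi)
```

Under these assumptions, driving `box_distance` to zero for a faulty input's features would be enough to obtain the certified label, which is the sense in which a box-derived distance can guide a repair step. Nonlinear heads would need a sound relaxation, such as the linear-relaxation-based bound propagation mentioned in the TL;DR, in place of the exact corner argument used here.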
