Exact Certification of (Graph) Neural Networks Against Label Poisoning

Mahalakshmi Sabanayagam, Lukas Gosch, Stephan Günnemann, Debarghya Ghoshdastidar

TL;DR

This work presents the first exact certificate against a poisoning attack derived for neural networks. The certificate may be of independent interest and applies to sufficiently wide NNs in general through their NTK.

Abstract

Machine learning models are highly vulnerable to label flipping, i.e., the adversarial modification (poisoning) of training labels to compromise performance. Thus, deriving robustness certificates is important to guarantee that test predictions remain unaffected and to understand worst-case robustness behavior. However, for Graph Neural Networks (GNNs), the problem of certifying label flipping has so far been unsolved. We change this by introducing an exact certification method, deriving both sample-wise and collective certificates. Our method leverages the Neural Tangent Kernel (NTK) to capture the training dynamics of wide networks, enabling us to reformulate the bilevel optimization problem representing label flipping into a Mixed-Integer Linear Program (MILP). We apply our method to certify a broad range of GNN architectures in node classification tasks. Thereby, concerning the worst-case robustness to label flipping: $(i)$ we establish hierarchies of GNNs on different benchmark graphs; $(ii)$ we quantify the effect of architectural choices such as activations, depth and skip-connections; and, surprisingly, $(iii)$ we uncover a novel phenomenon of the robustness plateauing for intermediate perturbation budgets across all investigated datasets and architectures. While we focus on GNNs, our certificates are applicable to sufficiently wide NNs in general through their NTK. Thus, our work presents the first exact certificate against a poisoning attack ever derived for neural networks, which could be of independent interest. The code is available at https://github.com/saper0/qpcert.
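To make the notion of an exact sample-wise certificate concrete, the following is a minimal sketch under a deliberately simplified assumption: it swaps in kernel ridge regression, whose prediction is linear in the training labels, so the worst-case label flip can be found greedily without any MILP. This is not the method of the paper. The RBF kernel stands in for an NTK, and all names (`rbf_kernel`, `worst_case_margin`, `brute_force_margin`, `lam`, `budget`) are hypothetical.

```python
from itertools import combinations

import numpy as np


def rbf_kernel(A, B, gamma=1.0):
    """RBF kernel matrix; a stand-in for an NTK (assumption for this sketch)."""
    sq_dist = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * sq_dist)


def worst_case_margin(K, k_t, y, budget, lam=0.1):
    """Smallest signed prediction for the test point under <= budget label flips.

    Kernel ridge regression predicts p_t(y) = w @ y with
    w = (K + lam * I)^{-1} k_t, which is linear in the labels y in {-1, +1}.
    Flipping label i lowers the signed margin by 2 * s * w_i * y_i, so taking
    the `budget` largest positive drops is optimal. A positive return value
    means the prediction cannot be changed within the budget.
    """
    w = np.linalg.solve(K + lam * np.eye(len(y)), k_t)
    clean = float(w @ y)
    s = 1.0 if clean >= 0 else -1.0          # sign of the clean prediction
    drops = 2.0 * s * w * y                  # margin drop caused by each flip
    top = np.sort(drops[drops > 0])[::-1][:budget]
    return s * clean - float(top.sum())


def brute_force_margin(K, k_t, y, budget, lam=0.1):
    """Reference check: enumerate every flip set of size <= budget."""
    w = np.linalg.solve(K + lam * np.eye(len(y)), k_t)
    clean = float(w @ y)
    s = 1.0 if clean >= 0 else -1.0
    worst = s * clean
    for r in range(1, budget + 1):
        for idx in combinations(range(len(y)), r):
            y_tilde = y.copy()
            y_tilde[list(idx)] *= -1.0       # flip the chosen labels
            worst = min(worst, s * float(w @ y_tilde))
    return worst


# Toy data (synthetic, for illustration only).
rng = np.random.default_rng(0)
X = rng.normal(size=(10, 2))
x_t = rng.normal(size=(1, 2))
y = np.where(X[:, 0] > 0, 1.0, -1.0)
K, k_t = rbf_kernel(X, X), rbf_kernel(X, x_t)[:, 0]
for budget in range(4):
    margin = worst_case_margin(K, k_t, y, budget)
    print(budget, "robust" if margin > 0 else "non-robust")
```

With a hinge-loss learner the prediction is no longer linear in the labels, which is where the bilevel problem and its MILP reformulation described in the abstract come in; the sketch only illustrates what "robust if and only if the worst-case margin stays positive" means.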

Paper Structure

This paper contains 36 sections, 6 theorems, 25 equations, 16 figures, 13 tables.

Key Result

Theorem 1

Given the adversary $\mathcal{A}$ and positive constants $M_{u_i}$ and $M_{v_i}$ set as in the appendix (app:bigm) for all $i \in [m]$, the prediction for node $t$ is certifiably robust if the optimal solution to the MILP $\mathrm{P}(\mathbf{y})$ is greater than zero, and non-robust otherwise.
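The MILP itself is not reproduced on this page. The constants $M_{u_i}$ and $M_{v_i}$ suggest a big-M linearization, so the block below sketches that generic technique for a box-constrained variable. It is an illustration only, not the theorem's formulation: the symbols $\alpha_i$, $C$, $u_i$, $v_i$, $s_i$ and $t_i$ are assumptions introduced here.

```latex
% Generic big-M linearization of complementary slackness (illustrative sketch;
% all symbols except M_{u_i}, M_{v_i} are assumptions, not taken from the paper).
% Nonlinear conditions for a variable with box constraint 0 <= \alpha_i <= C
% and multipliers u_i, v_i >= 0:
\begin{align*}
  u_i \, \alpha_i &= 0, & v_i \, (C - \alpha_i) &= 0 .
\end{align*}
% Equivalent linear constraints with binaries s_i, t_i \in \{0, 1\},
% valid whenever M_{u_i} \ge u_i and M_{v_i} \ge v_i at every feasible point:
\begin{align*}
  u_i &\le M_{u_i} \, s_i, & \alpha_i &\le C \, (1 - s_i), \\
  v_i &\le M_{v_i} \, t_i, & C - \alpha_i &\le C \, (1 - t_i) .
\end{align*}
```

Setting $s_i = 1$ forces $\alpha_i = 0$ and leaves $u_i$ free up to its bound, while $s_i = 0$ forces $u_i = 0$; the pair $(t_i, v_i)$ behaves the same way at the upper bound $C$. The replacement is exact only when the constants are true upper bounds, which is why their values matter.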

Figures (16)

  • Figure 1: (a) The Karate Club network is visualized with its labeled and unlabeled nodes. The adversarial label flip calculated by our method outlined in (b) provably leads to most node predictions being flipped for two GNNs (GCN & SGC). The certified accuracy refers to the percentage of correctly classified nodes that remain robust to the attack.
  • Figure 2: Certified accuracies as given by our sample-wise certificate; see \ref{app_exp:multi_class} for multi-class Cora-ML and Citeseer and \ref{app_exp:cba} for other datasets. A clear and consistent hierarchy emerges across perturbation budgets concerning the worst-case robustness of different GNNs.
  • Figure 3: Certified ratios of selected architectures as calculated with our sample-wise and collective certificates. We refer to \ref{app_sec:coll_rob_fig} for collective results on all GNNs. Collective certification provides significantly higher certified ratios and uncovers a plateauing phenomenon for intermediate $\epsilon$.
  • Figure 4: Selected architectural findings based on our collective certificates. $(a)$ The effect of graph normalizations ${\mathbf{S}}_{\text{row}}$ and ${\mathbf{S}}_{\text{sym}}$ is data-dependent. $(b)$ For skip-connections, depth does not improve robustness, shown for GCN Skip-$\alpha$; see \ref{app_sec:depth_result} for other GNNs and datasets.
  • Figure 5: Graph structure findings based on our collective certificates. $(a)$-$(b)$ A higher amount of graph information improves certifiable robustness. $(c)$-$(d)$ Graph density and homophily positively affect certifiable robustness, shown for GCN using CSBM; see \ref{app_sec:graph_connection} for more results.
  • ...and 11 more figures

Theorems & Definitions (6)

  • Theorem 1: Sample-wise MILP
  • Theorem 2: Collective MILP
  • Theorem 3: Multiclass MILP
  • Proposition 1
  • Proposition 1
  • Theorem 4: MILP Formulation