IF-CPS: Influence Functions for Cyber-Physical Systems -- A Unified Framework for Diagnosis, Curation, and Safety Attribution

Jiachen Li, Shihao Li, Soovadeep Bakshi, Jiamin Xu, Dongmei Chen

Abstract

Neural network controllers trained via behavior cloning are increasingly deployed in cyber-physical systems (CPS), yet practitioners lack tools to trace controller failures back to training data. Existing data attribution methods assume i.i.d. data and standard loss targets, ignoring CPS-specific properties: closed-loop dynamics, safety constraints, and temporal trajectory structure. We propose IF-CPS, a modular influence function framework with three CPS-adapted variants: safety influence (attributing constraint violations), trajectory influence (temporal discounting over trajectories), and propagated influence (tracing effects through plant dynamics). We evaluate IF-CPS on six benchmarks across diagnosis, curation, and safety attribution tasks. IF-CPS improves over standard influence functions in the majority of settings, achieving AUROC $1.00$ in Pendulum (5-10% poisoning), $0.92$ vs. $0.50$ in HVAC (10%), and the strongest constraint-boundary correlation (Spearman $\rho = 0.55$ in Pendulum).
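For orientation, the standard influence function that the abstract uses as its baseline can be sketched as follows. This is a minimal illustration of the classical formulation, $I(z_i, z_{\text{test}}) = -\nabla L(z_{\text{test}})^\top H^{-1} \nabla L(z_i)$, and not the IF-CPS variants, whose definitions are not given on this page. The linear policy, the ridge term `lam`, the synthetic data, and the function name `influence_scores` are all assumptions made for the example.

```python
import numpy as np

def influence_scores(S_train, a_train, S_test, a_test, lam=1e-2):
    """Classical influence of each training pair on the mean test loss.

    Assumption: a linear policy a = s @ theta fit by ridge-regularized least
    squares, used here as a stand-in for a neural behavior-cloning controller.
    """
    n, d = S_train.shape
    # Hessian of (1/n) * sum_i 0.5*(s_i @ theta - a_i)^2 + 0.5*lam*||theta||^2
    H = S_train.T @ S_train / n + lam * np.eye(d)
    # Exact minimizer of that objective
    theta = np.linalg.solve(H, S_train.T @ a_train / n)
    # Per-sample training-loss gradients, shape (n, d)
    g_train = (S_train @ theta - a_train)[:, None] * S_train
    # Gradient of the mean test loss, shape (d,)
    g_test = ((S_test @ theta - a_test)[:, None] * S_test).mean(axis=0)
    # I(z_i, test) = -g_test^T H^{-1} g_i; positive means that upweighting
    # z_i is predicted to raise the test loss (a harmful demonstration)
    return -g_train @ np.linalg.solve(H, g_test)

# Synthetic demonstrations (hypothetical): 50 state-action pairs, 10 poisoned
rng = np.random.default_rng(0)
theta_true = np.array([1.0, -2.0, 0.5])
S_tr = rng.normal(size=(50, 3))
a_tr = S_tr @ theta_true + 0.1 * rng.normal(size=50)
poisoned = np.zeros(50, dtype=bool)
poisoned[:10] = True
a_tr[poisoned] *= -1.0  # poison by flipping the expert action

# Clean held-out expert data used as the test target
S_te = rng.normal(size=(20, 3))
a_te = S_te @ theta_true

scores = influence_scores(S_tr, a_tr, S_te, a_te)
print("mean score, poisoned:", scores[poisoned].mean())
print("mean score, clean:   ", scores[~poisoned].mean())
```

Ranking demonstrations by such scores and comparing the ranking against the known poison labels is the kind of diagnosis task that an AUROC summarizes. In a neural controller the Hessian is generally not available in closed form, so the exact solve used here would have to be replaced by an approximation.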

Paper Structure (22 sections, 20 equations, 4 figures, 5 tables)

Figures (4)

  • Figure 1: Overview of the IF-CPS framework.
  • Figure 2: IF-CPS data curation on CartPole, Quadrotor, and CSTR: expert (green), 20% poisoned BC (red), IF-CPS-curated BC (blue). Dashed lines show safety bounds.
  • Figure 3: Diagnosis AUROC vs. poisoning rate. Shaded regions: $\pm$1 std (3 seeds).
  • Figure 4: IF-CPS attribution scores on Pendulum, LunarLander, and HVAC. Color encodes normalized IF-CPS score (blue = low, red = high); black $\times$ marks poisoned demonstrations.