Multi-Agent Reinforcement Learning for Assessing False-Data Injection Attacks on Transportation Networks

Taha Eghtesad; Sirui Li; Yevgeniy Vorobeychik; Aron Laszka

TL;DR

The paper studies false-data injection attacks on transportation networks whose drivers rely on navigation apps. It models the attacker's problem as a Markov decision process (MDP) in which the attacker perturbs the edge travel times that drivers observe, subject to a budget. It introduces a Hierarchical Multi-Agent Reinforcement Learning (HMARL) framework, with a high-level budget allocator and low-level cooperative agents, to identify near-optimal attack strategies at scale, and validates it on the Sioux Falls, SD network. Empirical results show that HMARL outperforms baselines and ablations, achieving 10–50% greater disruption to total travel time depending on the budget, which suggests that the hierarchical decomposition scales to network-wide attack spaces. The work also discusses implications for defense and points to graph-based state representations and ML-driven detection as promising directions for mitigating false-data injection attacks in navigation-enabled transportation networks.

Abstract

The increasing reliance of drivers on navigation applications has made transportation networks more susceptible to data-manipulation attacks by malicious actors. Adversaries may exploit vulnerabilities in the data collection or processing of navigation services to inject false information and thus interfere with drivers' route selection. Such attacks can significantly increase traffic congestion, resulting in substantial waste of time and resources, and may even disrupt essential services that rely on road networks. To assess the threat posed by such attacks, we introduce a computational framework to find worst-case data-injection attacks against transportation networks. First, we devise an adversarial model with a threat actor who can manipulate drivers by increasing the travel times that they perceive on certain roads. Then, we employ hierarchical multi-agent reinforcement learning to find an approximately optimal adversarial strategy for data manipulation. We demonstrate the applicability of our approach by simulating attacks on the Sioux Falls, SD network topology.
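The adversarial model described in the abstract can be illustrated with a toy example. The following is a minimal sketch, not the paper's simulator: it assumes a single driver who takes the shortest path under perceived travel times, static true travel times with no congestion effects, and a budget that caps the sum of non-negative perturbations. The graph, the function names (`shortest_path`, `apply_attack`, `route_cost`), and the form of the budget constraint are all hypothetical choices made for illustration.

```python
import heapq


def shortest_path(graph, weights, source, target):
    """Dijkstra's algorithm; `weights` maps (u, v) edges to travel times."""
    dist = {source: 0.0}
    prev = {}
    queue = [(0.0, source)]
    while queue:
        d, u = heapq.heappop(queue)
        if u == target:
            break
        if d > dist.get(u, float("inf")):
            continue  # stale queue entry
        for v in graph.get(u, []):
            nd = d + weights[(u, v)]
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                prev[v] = u
                heapq.heappush(queue, (nd, v))
    # Walk predecessors back from the target (assumes target is reachable).
    path = [target]
    while path[-1] != source:
        path.append(prev[path[-1]])
    return path[::-1]


def route_cost(path, weights):
    """Total travel time of `path` under the given edge weights."""
    return sum(weights[(u, v)] for u, v in zip(path, path[1:]))


def apply_attack(true_times, perturbation, budget):
    """Return perceived travel times after a false-data injection.

    Assumption: perturbations only increase travel times, and their
    sum must not exceed `budget`.
    """
    if any(p < 0 for p in perturbation.values()):
        raise ValueError("perturbations must be non-negative")
    if sum(perturbation.values()) > budget + 1e-9:
        raise ValueError("perturbation exceeds the attacker's budget")
    return {e: t + perturbation.get(e, 0.0) for e, t in true_times.items()}


# Hypothetical four-node network with two routes from A to D.
graph = {"A": ["B", "C"], "B": ["D"], "C": ["D"], "D": []}
true_times = {("A", "B"): 2.0, ("B", "D"): 2.0,
              ("A", "C"): 3.0, ("C", "D"): 3.0}

honest_path = shortest_path(graph, true_times, "A", "D")
perceived = apply_attack(true_times, {("A", "B"): 3.0}, budget=3.0)
attacked_path = shortest_path(graph, perceived, "A", "D")

# The driver chooses a route on perceived times but pays the true times.
honest_cost = route_cost(honest_path, true_times)
attacked_cost = route_cost(attacked_path, true_times)
```

In this toy network, inflating the perceived time of one edge diverts the driver from the faster route to the slower one. In the setting the abstract describes, many drivers are rerouted at once, so the damage also depends on the congestion that the diverted traffic creates, which this sketch leaves out.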

Paper Structure (27 sections, 10 equations, 3 figures, 2 tables, 2 algorithms)

Introduction
Contributions
Organization
Related Work
Attacks on Navigation Applications
Hierarchical RL Approaches
System Model
Environment
State Transition
Attacker Model
Background
Deep Reinforcement Learning
Multi-Agent Deep Reinforcement Learning
Hierarchical Multi-Agent Reinforcement Learning
K-Means Node Clustering
...and 12 more sections

Figures (3)

Figure 1: Hierarchical Multi-Agent Deep Reinforcement Learning Architecture. $\mu_H$ and $Q_H$ are the high-level agent's actor and critic function approximators, respectively. $\mu_k$ and $Q_k$ are the actor and critic function approximators of low-level agent $k$, respectively. $\boldsymbol{a} = \langle \boldsymbol{a_1} \times \hat{b}_1, \boldsymbol{a_2} \times \hat{b}_2, \cdots \boldsymbol{a_k} \times \hat{b}_k\rangle$ denotes the perturbations of all edges of the transit graph $G$, where $\boldsymbol{a_k}$ denotes the perturbations of the edges in component $k$. $\boldsymbol{o_k}$ and $\hat{b}_k$ are the observation of the $k$-th agent from its component and the proportion of the budget allocated to it, respectively. The Normalize layer can be constructed using the Softmax function or the 1-norm normalization of ReLU-activated actor outputs.
Figure 2: Decomposition of the Sioux Falls, SD transportation network into four components, where one low-level agent is responsible for perturbing the edges in each component, and one high-level agent is responsible for allocating the budget $B$ among the low-level agents. Edge width represents the density of vehicles moving over the edge when no attacker perturbation is added.
Figure 3: Ablation study of HMARL on the Sioux Falls network. "No Attack" is the baseline without any attack on the network. "Greedy Heuristic" is a network-wide greedy attack (see the section on heuristics). "DDPG" applies the general-purpose DDPG algorithm network-wide. In the remaining columns, the network is divided into four components. In "Decomposed Heuristic," the low-level agents are greedy heuristics, and the high-level allocation is proportional to the number of vehicles in each component. In "Ablation | Low Level," the high-level agent is the proportional allocation heuristic, while the low-level agents use the MADDPG approach. In "Ablation | High Level," the low-level agents are the greedy heuristic, while the high-level agent is a DDPG allocator. "HMARL" is the full approach, in which the low-level MADDPG and high-level DDPG components are trained simultaneously.
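The Normalize layer and the action composition $\boldsymbol{a_k} \times \hat{b}_k$ from the Figure 1 caption can be sketched in a few lines. This is a minimal illustration in plain Python rather than the paper's network code. It makes several assumptions that the captions do not state: the low-level outputs are also normalized to sum to one within each component, the composed action is scaled by a total budget (the hypothetical `budget` parameter), and an all-zero ReLU output falls back to a uniform split. The function names are hypothetical as well.

```python
import math


def softmax(xs):
    """Softmax normalization: positive outputs that sum to one."""
    m = max(xs)  # subtract the maximum for numerical stability
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def relu_l1_normalize(xs, eps=1e-12):
    """ReLU followed by 1-norm normalization.

    Assumption: if every output is clipped to zero, fall back to a
    uniform split so the result still sums to one.
    """
    pos = [max(x, 0.0) for x in xs]
    total = sum(pos)
    if total < eps:
        return [1.0 / len(xs)] * len(xs)
    return [p / total for p in pos]


def compose_action(component_actions, budget_shares, budget):
    """Scale each component's perturbations by its budget share.

    Mirrors a = <a_1 x b_1, ..., a_k x b_k>; multiplying by `budget`
    is an assumption made so the result is in travel-time units.
    """
    return [
        [a * share * budget for a in actions]
        for actions, share in zip(component_actions, budget_shares)
    ]


# Hypothetical raw actor outputs for four components with two edges each.
high_level_raw = [2.0, -1.0, 1.0, 1.0]
low_level_raw = [[1.0, 3.0], [1.0, 1.0], [0.0, 2.0], [1.0, 1.0]]

shares = relu_l1_normalize(high_level_raw)
actions = [relu_l1_normalize(raw) for raw in low_level_raw]
perturbations = compose_action(actions, shares, budget=8.0)
total_perturbation = sum(sum(component) for component in perturbations)
```

With ReLU plus 1-norm normalization, a component whose high-level output is negative receives no budget at all, whereas Softmax always assigns every component a strictly positive share. Under the assumptions above, both choices keep the total perturbation equal to the budget by construction.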
