Knowledge Transfer for Collaborative Misbehavior Detection in Untrusted Vehicular Environments

Roshan Sedar, Charalampos Kalalas, Paolo Dini, Francisco Vazquez-Gallego, Jesus Alonso-Zarate, Luis Alonso

TL;DR

The paper tackles robust misbehavior detection in untrusted vehicular networks by introducing a DRL-based detector at RSUs augmented with selective transfer learning. It defines a semantic relatedness trust mechanism to identify trustworthy source RSUs and uses instance-level knowledge transfer to accelerate learning while avoiding negative transfer from adversaries. Across VeReMi-based experiments and three scenario types, the approach achieves faster convergence and high detection performance, including for unseen and partially observable attacks, outperforming tabula rasa baselines. The work demonstrates significant improvements in robustness, generalization, and data-efficiency, with practical implications for scalable edge security in V2X systems.

Abstract

Vehicular mobility underscores the need for collaborative misbehavior detection at the vehicular edge. However, locally trained misbehavior detection models are susceptible to adversarial attacks that aim to deliberately influence learning outcomes. In this paper, we introduce a deep reinforcement learning-based approach that employs transfer learning for collaborative misbehavior detection among roadside units (RSUs). In the presence of label-flipping and policy induction attacks, we perform selective knowledge transfer from trustworthy source RSUs to foster relevant expertise in misbehavior detection and avoid negative knowledge sharing from adversary-influenced RSUs. The performance of our proposed scheme is demonstrated with evaluations over a diverse set of misbehavior detection scenarios using an open-source dataset. Experimental results show that our approach significantly reduces the training time at the target RSU and achieves superior detection performance compared to the baseline scheme with tabula rasa learning. Enhanced robustness and generalizability can also be attained by effectively detecting previously unseen and partially observable misbehavior attacks.

Paper Structure (33 sections, 14 equations, 10 figures, 3 tables, 2 algorithms)

Figures (10)

  • Figure 1: Considered network model for collaborative misbehavior detection. Distributed knowledge transfer between source RSUs $\{s_{i}\}^m_{i=1}$ and the target RSU $t$ is depicted with green arrows for positive knowledge and with orange for negative knowledge. The presence of a malicious adversary implies that the training of the misbehavior detection system (DRL-MDS) at certain source RSUs (e.g., RSU $m$) is under adversarial influence.
  • Figure 2: High-level illustration of the DRL-MDS workflow in an RSU. In every agent-environment interaction, the tuple $e_{t} = \langle s_{t}, a_{t}, r_{t}, s_{t+1} \rangle$ is stored in the replay memory buffer as the agent’s instantaneous experience, and a random minibatch of past experiences is sampled to regularize the training process. The online and target networks share the same DNN structure; every $T$ steps, the online network’s weights $\theta$ are copied to the target network’s weights $\theta^-$ to keep the two networks in sync.
  • Figure 3: Visualization of the workflow for selecting trustworthy source RSUs. Source RSUs with policies $\{\pi_{s_{i}}\}^m_{i=1}$ are ranked ($\mathscr{f}_{\text{ranking}}$) by their normalized cumulative return values $\{G^\pi_{\text{scaled}, s_{i}}\}^m_{i=1} \in [0,1]$, which are extracted from the target RSU $t$'s environment. Instances $\{(s^{(t)}_{i},a^{(t)}_{i},r^{(t)}_{i},s^{(t+1)}_{i}) \coloneqq d_{s_{i}}\}^k_{i=1}$ are transferred to the target RSU $t$ from a subset of $k$ trustworthy source RSUs with policies $\{\pi_{s_{i}}\}^{k}_{i=1}$, where $k \leq m$ and $D_{S}$ denotes the samples of the source RSUs. Green arrows denote positive transfer from trusted sources, and the orange arrow represents a potential negative transfer from a malicious source.
  • Figure 4: Learning performance of source RSUs in SC1 for (Left) Random Position and (Right) Random Position Offset misbehavior attack types.
  • Figure 5: Learning performance of source RSUs in SC2 for (Left) a combination of DoS, DoS Random and DoS Random Sybil; (Right) a combination of DoS, DoS Disruptive and DoS Disruptive Sybil misbehavior attack types.
  • ...and 5 more figures
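
The source-selection step described in the Figure 3 caption can be sketched roughly as follows. This is a minimal illustration, not the paper's algorithm: the min-max scaling, the `threshold` cut-off, and all function and RSU names are assumptions introduced here, since this summary only states that scaled returns lie in $[0,1]$, that sources are ranked, and that $k \leq m$ of them are kept.

```python
from typing import Dict, List, Tuple

# One experience tuple (s_t, a_t, r_t, s_{t+1}); the state is kept abstract here.
Experience = Tuple[object, int, float, object]


def scale_returns(returns: Dict[str, float]) -> Dict[str, float]:
    """Min-max scale cumulative returns into [0, 1] (assumed normalization)."""
    lo, hi = min(returns.values()), max(returns.values())
    if hi == lo:
        # All sources performed identically; treat them as equally good (assumption).
        return {name: 1.0 for name in returns}
    return {name: (g - lo) / (hi - lo) for name, g in returns.items()}


def select_trusted_sources(returns: Dict[str, float], threshold: float = 0.5) -> List[str]:
    """Rank source RSUs by scaled return and keep those at or above the threshold.

    `returns` maps a source RSU name to the cumulative return its policy
    obtained in the target RSU's environment. The threshold is hypothetical.
    """
    scaled = scale_returns(returns)
    ranked = sorted(scaled, key=scaled.get, reverse=True)
    return [name for name in ranked if scaled[name] >= threshold]


def collect_transfer_instances(
    buffers: Dict[str, List[Experience]], trusted: List[str]
) -> List[Experience]:
    """Pool experience tuples from the trusted sources only (instance transfer)."""
    return [exp for name in trusted for exp in buffers[name]]


# Hypothetical returns: two benign sources and one adversary-influenced source.
returns = {"rsu_1": 180.0, "rsu_2": 150.0, "rsu_m": -40.0}
trusted = select_trusted_sources(returns, threshold=0.5)

buffers = {
    "rsu_1": [("s0", 1, 1.0, "s1")],
    "rsu_2": [("s0", 0, 1.0, "s1")],
    "rsu_m": [("s0", 1, -1.0, "s1")],
}
transferred = collect_transfer_instances(buffers, trusted)
```

With these made-up returns, the low-scoring source `rsu_m` is excluded, so only the instances from `rsu_1` and `rsu_2` would be pooled for the target RSU.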

Theorems & Definitions (3)

  • Definition 1: Label-flipping attack
  • Definition 2: Policy induction attack
  • Definition 3: Transfer learning between RSUs