Exploring reinforcement learning for incident response in autonomous military vehicles

Henrik Madsen, Gudmund Grov, Federico Mancini, Magnus Baksaas, Åvald Åslaugson Sommervoll

TL;DR

This work explores reinforcement learning to train an agent that autonomously responds to cyber attacks on unmanned vehicles in the context of a military operation. It demonstrates that reinforcement learning is a viable approach for autonomous cyber defence on a real unmanned ground vehicle, even when the agent is trained in a simple simulation environment.

Abstract

Unmanned vehicles able to conduct advanced operations without human intervention are being developed at a fast pace for many purposes. Not surprisingly, they are also expected to significantly change how military operations can be conducted. To leverage the potential of this new technology in a physically and logically contested environment, security risks must be assessed and managed accordingly. Research on this topic points to autonomous cyber defence as one of the capabilities that may be needed to accelerate the adoption of these vehicles for military purposes. Here, we pursue this line of investigation by exploring reinforcement learning to train an agent that can autonomously respond to cyber attacks on unmanned vehicles in the context of a military operation. We first developed a simple simulation environment to quickly prototype and test proof-of-concept agents for an initial evaluation. The trained agents were then applied to a more realistic simulation environment and finally deployed on an actual unmanned ground vehicle for even greater realism. A key contribution of our work is demonstrating that reinforcement learning is a viable approach to training an agent for autonomous cyber defence on a real unmanned ground vehicle, even when the agent is trained in a simple simulation environment.

Paper Structure

This paper contains 9 sections, 5 figures, 3 tables.

Figures (5)

  • Figure 1: The Milrem THeMIS UGV [themis_ugv] used for the experiments in this paper.
  • Figure 2: Simulation visualization. The left side shows our simulation using the Mapviz tool, while the right side shows our simulation with the Veranda tool.
  • Figure 3: Overall approach. The AI agents are trained using RL and deep RL in a simple simulation environment. The trained agents are then applied in a more complex integrated simulation environment before being applied in a real environment. The feedback loops indicate how the more complex environments support the training.
  • Figure 4: Experiment 1: Total reward for Q-learning for different strategies.
  • Figure 5: Velocity of the UGV during evaluation on the Milrem THeMIS UGV. The graph shows how the UGV abruptly switched between accelerating and braking as a result of attacks and the RL agent's mitigating actions.