Reinforcement Learning for Angle-Only Intercept Guidance of Maneuvering Targets

Brian Gaudet, Roberto Furfaro, Richard Linares

TL;DR

This paper presents a novel guidance law whose only observations are seeker line-of-sight angle measurements and their rates of change, and demonstrates that the resulting policy performs as well as augmented zero-effort miss guidance given perfect knowledge of target acceleration.

Abstract

We present a novel guidance law that uses observations consisting solely of seeker line-of-sight angle measurements and their rates of change. The policy is optimized using reinforcement meta-learning and demonstrated in a simulated terminal phase of a mid-course exo-atmospheric interception. Importantly, the guidance law does not require range estimation, making it particularly suitable for passive seekers. The optimized policy maps stabilized seeker line-of-sight angles and their rates of change directly to commanded thrust for the missile's divert thrusters. Reinforcement meta-learning allows the optimized policy to adapt to target acceleration, and we demonstrate that the policy performs as well as augmented zero-effort miss guidance with perfect knowledge of target acceleration. The optimized policy is computationally efficient, requires minimal memory, and should be compatible with today's flight processors.
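
For readers unfamiliar with the baseline named in the abstract, the following is a minimal sketch of a textbook augmented zero-effort miss (ZEM) guidance law. It is not taken from the paper: the function name `augmented_zem_accel`, the navigation gain of 3, the full-vector form of the ZEM, and the time-to-go estimate (range divided by closing speed) are all assumptions made for illustration. Note that this baseline needs relative position, relative velocity, and target acceleration, whereas the policy described above observes only line-of-sight angles and their rates of change.

```python
import numpy as np


def augmented_zem_accel(r_rel, v_rel, a_target, nav_gain=3.0):
    """Textbook augmented ZEM acceleration command (illustrative sketch).

    r_rel    : target position minus missile position (m)
    v_rel    : target velocity minus missile velocity (m/s)
    a_target : target acceleration, assumed perfectly known (m/s^2)
    nav_gain : navigation gain (hypothetical default, not from the paper)
    """
    r_rel = np.asarray(r_rel, dtype=float)
    v_rel = np.asarray(v_rel, dtype=float)
    a_target = np.asarray(a_target, dtype=float)

    rng = np.linalg.norm(r_rel)
    # Closing speed is positive while the range is decreasing.
    closing_speed = -np.dot(r_rel, v_rel) / rng
    if closing_speed <= 0.0:
        # Not closing on the target: time-to-go is undefined, command nothing.
        return np.zeros(3)

    # Assumed time-to-go estimate: range over closing speed.
    t_go = rng / closing_speed

    # Predicted miss if the missile stops accelerating and the target
    # holds its current acceleration until intercept.
    zem = r_rel + v_rel * t_go + 0.5 * a_target * t_go**2

    return nav_gain * zem / t_go**2


# Example: 1 km range, 100 m/s closing, 10 m/s crossing velocity.
cmd_no_maneuver = augmented_zem_accel([1000, 0, 0], [-100, 10, 0], [0, 0, 0])
cmd_maneuver = augmented_zem_accel([1000, 0, 0], [-100, 10, 0], [0, 2, 0])
```

In this sketch, setting `a_target` to zero reduces the law to plain ZEM guidance, so the "augmented" term is exactly the half-acceleration correction inside `zem`.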

Paper Structure

This paper contains 13 sections, 17 equations, 8 figures, 9 tables.

Figures (8)

  • Figure 1: Engagement
  • Figure 2: Planar Heading Error
  • Figure 3: Sample Target Maneuver
  • Figure 4: Engagement
  • Figure 5: Learning Curves: Rewards
  • ...and 3 more figures