Seasonal Station-Keeping of Short Duration High Altitude Balloons using Deep Reinforcement Learning

Tristan K. Schuler, Chinthan Prasad, Georgiy Kiselev, Donald Sofge

TL;DR

The paper tackles station-keeping for short-duration high-altitude balloons (HABs) using Deep Q-Networks (DQN) within a custom 3D wind-field simulator. It advances realism by generating radiosonde-derived synthetic winds to complement ERA5 forecasts and evaluates season-dependent performance with a Forecast Score metric that quantifies wind diversity. Across months, trained DQN HABs achieve meaningful time-within-region performance (TWR50), with success influenced by wind diversity, forecast accuracy, and vertical wind structure. This work provides a practical framework for robust HAB navigation under forecast uncertainty and lays groundwork for real-world SHAB-V deployment and further methodological improvements.

Abstract

Station-keeping short-duration high-altitude balloons (HABs) in a region of interest is a challenging path-planning problem due to partially observable, complex, and dynamic wind flows. Deep reinforcement learning is a popular strategy for solving the station-keeping problem. A custom simulation environment was developed to train and evaluate Deep Q-Learning (DQN) agents for short-duration HABs. To train the agents on realistic winds, synthetic wind forecasts were generated from aggregated historical radiosonde data and used to apply horizontal kinematics to the simulated agents. The synthetic forecasts were closely correlated with ECMWF ERA5 Reanalysis forecasts, providing a realistic simulated wind field with seasonal and altitudinal variances between the wind models. DQN HAB agents were then trained and evaluated across different seasonal months. To highlight differences and trends in months with vastly different wind fields, a Forecast Score algorithm was introduced to independently classify forecasts based on wind diversity, and trends between station-keeping success and the Forecast Score were evaluated across all seasons.
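
As an illustration of how the time-within-region metric (TWR50) named in the TL;DR might be computed, the following is a minimal sketch rather than the authors' implementation. It assumes that TWR50 is the fraction of trajectory samples lying within a 50 km radius of the station centre, that samples are evenly spaced in time, and that great-circle distance is adequate at this scale. The names `haversine_km` and `time_within_region` are hypothetical.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius; an assumption of this sketch


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two (lat, lon) points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def time_within_region(track, center, radius_km=50.0):
    """Fraction of evenly spaced trajectory samples within radius_km of center.

    track  -- list of (lat, lon) tuples in degrees
    center -- (lat, lon) of the station-keeping target in degrees
    """
    if not track:
        return 0.0
    inside = sum(
        1 for lat, lon in track
        if haversine_km(lat, lon, center[0], center[1]) <= radius_km
    )
    return inside / len(track)


if __name__ == "__main__":
    # Hypothetical four-sample track around an arbitrary centre point.
    center = (35.0, -106.0)
    track = [(35.0, -106.0), (35.1, -106.0), (36.0, -106.0), (35.0, -105.0)]
    print(time_within_region(track, center))
```

Under these assumptions, an agent that remains inside the region for the whole flight scores 1.0 and one that never enters it scores 0.0.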

Paper Structure

This paper contains 16 sections, 7 equations, 8 figures, 5 tables.

Figures (8)

  • Figure 1: Altitude Profile and 3D trajectory of a trained DQN HAB agent in simulation with ERA5 forecasts as observation, and synthetic winds for movement.
  • Figure 2: Learning curve over 1.5 million timesteps for a station-keeping DQN HAB agent
  • Figure 3: Synthetic Wind Generation from Aggregated and Interpolated Radiosonde Data in the Southwestern United States on August 23, 2023 at 1200 UTC
  • Figure 4: ERA5 and Synthetic Model Variations on January 17 and July 17, 2023 at 0000 UTC at the 50 hPa Pressure Level
  • Figure 5: Filtered Forecast Score Distributions for SW USA
  • ...and 3 more figures