High-Altitude Balloon Station-Keeping with First Order Model Predictive Control

Myles Pasetsky, Jiawei Lin, Bradley Guo, Sarah Dean

TL;DR

High-altitude balloon station-keeping is challenged by nonlinear, underactuated dynamics and partially observable winds. The authors develop First-Order Model Predictive Control (FOMPC) by implementing differentiable balloon and wind models in JAX to enable gradient-based online planning over a receding horizon, without offline training. In Balloon Learning Environment benchmarks, FOMPC outperforms the state-of-the-art RL controller Perciatelli44 by up to $24\%$ in time-within-radius (TWR) and reduces safety violations, though with higher per-step computation. The work provides a solid, training-free model-based baseline for HAB control and offers insights into which planning and wind-correction factors most influence performance, laying groundwork for future hybrids that integrate MPC and RL.

Abstract

High-altitude balloons (HABs) are common in scientific research due to their wide range of applications and low cost. Because of their nonlinear, underactuated dynamics and the partial observability of wind fields, prior work has largely relied on model-free reinforcement learning (RL) methods to design near-optimal control schemes for station-keeping. These methods often compare only against hand-crafted heuristics, dismissing model-based approaches as impractical given the system complexity and uncertain wind forecasts. We revisit this assumption about the efficacy of model-based control for station-keeping by developing First-Order Model Predictive Control (FOMPC). By implementing the wind and balloon dynamics as differentiable functions in JAX, we enable gradient-based trajectory optimization for online planning. FOMPC outperforms a state-of-the-art RL policy, achieving a 24% improvement in time-within-radius (TWR) without requiring offline training, though at the cost of greater online computation per control step. Through systematic ablations of modeling assumptions and control factors, we show that online planning is effective across many configurations, including under simplified wind and dynamics models.

Paper Structure

This paper contains 35 sections, 15 equations, 3 figures, 2 tables, 1 algorithm.

Figures (3)

  • Figure 1: Station-keeping via altitude control. Predicted winds generate a planned trajectory (red), which diverges from the actual trajectory (black); replanning at intermediate points corrects forecast errors.
  • Figure 2: Block diagram of FOMPC. Candidate plans are initialized using heuristics, then refined by first-order optimization with dynamics model $F_{\Delta t}^\phi$ and wind model $\hat{W}^\phi$. The lowest-cost plan provides the first $n$ actions, after which the process repeats with the updated state.
  • Figure 3: Performance–efficiency trade-offs in FOMPC. Each plot reports mean time-within-radius relative to Perciatelli44 (error bars show 95% confidence intervals) as a function of online execution time per step. Markers denote parameter settings; bolded values indicate defaults (12h horizon, 72m replan interval, 100 initializations, highest dynamics fidelity, GP-Column wind model). Each plotted setting varies one parameter at a time from this default.
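
The loop described in the Figure 2 caption (initialize candidate plans, refine them with first-order optimization through differentiable dynamics and wind models, execute the first $n$ actions of the lowest-cost plan, then replan) can be sketched in JAX roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the wind field `toy_wind`, the point-mass dynamics `step`, the squared-distance cost, and every constant below are hypothetical stand-ins, and random initial plans replace the heuristic initializations mentioned in the caption.

```python
import jax
import jax.numpy as jnp

# All constants are illustrative assumptions, not values from the paper.
DT = 1.0        # time step of the toy model
HORIZON = 20    # planning horizon, in steps
N_INIT = 8      # number of candidate plans
N_EXEC = 5      # actions executed before replanning (the "first n actions")
LR = 0.1        # gradient-descent step size
N_GRAD = 50     # gradient steps per candidate plan
TARGET = jnp.zeros(2)  # station-keeping target in the horizontal plane


def toy_wind(altitude):
    # Hypothetical wind model: wind direction rotates smoothly with altitude.
    return jnp.stack([jnp.cos(altitude), jnp.sin(altitude)])


def step(state, action):
    # state = (x, y, altitude). The balloon only controls its altitude;
    # horizontal motion comes entirely from the wind at that altitude.
    pos, alt = state[:2], state[2]
    alt = alt + DT * 0.3 * jnp.tanh(action)   # bounded climb/descent rate
    pos = pos + DT * 0.5 * toy_wind(alt)
    return jnp.concatenate([pos, alt[None]])


def plan_cost(actions, state):
    # Roll the plan out through the differentiable model and average the
    # squared horizontal distance to the target over the horizon.
    def body(s, a):
        s = step(s, a)
        return s, jnp.sum((s[:2] - TARGET) ** 2)

    _, costs = jax.lax.scan(body, state, actions)
    return jnp.mean(costs)


@jax.jit
def refine(actions, state):
    # Plain gradient descent on the action sequence (first-order refinement).
    grad_fn = jax.grad(plan_cost)

    def body(a, _):
        return a - LR * grad_fn(a, state), None

    actions, _ = jax.lax.scan(body, actions, None, length=N_GRAD)
    return actions


def fompc_plan(state, key):
    # Refine all candidates in parallel and keep the lowest-cost plan.
    inits = jax.random.normal(key, (N_INIT, HORIZON))
    plans = jax.vmap(refine, in_axes=(0, None))(inits, state)
    costs = jax.vmap(plan_cost, in_axes=(0, None))(plans, state)
    best = jnp.argmin(costs)
    return plans[best][:N_EXEC], costs[best]


def run(n_replans=3, seed=0):
    # Receding-horizon loop: plan, execute N_EXEC actions, replan.
    # Here the "true" system is the same toy model used for planning.
    key = jax.random.PRNGKey(seed)
    state = jnp.array([1.0, -0.5, 0.0])
    for _ in range(n_replans):
        key, sub = jax.random.split(key)
        actions, _ = fompc_plan(state, sub)
        for a in actions:
            state = step(state, a)
    return state
```

Calling `run()` returns the final `(x, y, altitude)` state after three replanning cycles. In this sketch the planner and the simulated system share one model, so replanning only corrects for the truncated horizon; with an imperfect wind forecast, the same loop is what would absorb forecast error, as Figure 1 describes.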