Transformers As Generalizable Optimal Controllers

Turki Bin Mohaya, Maitham F. AL-Sunni, John M. Dolan, Peter Seiler

Abstract

We study whether optimal state-feedback laws for a family of heterogeneous Multiple-Input, Multiple-Output (MIMO) Linear Time-Invariant (LTI) systems can be captured by a single learned controller. We train one transformer policy on trajectories generated by the Linear Quadratic Regulator (LQR) for systems with different state and input dimensions, using a shared representation with standardization, padding, dimension encoding, and masked loss. The policy maps recent state history to control actions without requiring plant matrices at inference time. Across a broad set of systems, it achieves empirically small sub-optimality relative to LQR, remains stabilizing under moderate parameter perturbations, and benefits from lightweight fine-tuning on unseen systems. These results support transformer policies as practical approximators of near-optimal feedback laws over structured linear-system families.
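The data pipeline described in the abstract (LQR-generated trajectories, padding to a shared width, and a masked loss) can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the discrete-time formulation, the weights `Q` and `R`, the double-integrator plant, and the helper names `dlqr_gain`, `rollout`, `pad_last`, and `masked_mse` are all assumptions made for illustration. Standardization and dimension encoding are omitted.

```python
import numpy as np

def dlqr_gain(A, B, Q, R, iters=500):
    """Approximate the discrete-time LQR gain by iterating the Riccati recursion.
    (Assumption: discrete time; the abstract does not state the formulation.)"""
    P = Q.copy()
    for _ in range(iters):
        K = np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)
        P = Q + A.T @ P @ (A - B @ K)
    # Gain corresponding to the final cost-to-go matrix P.
    return np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)

def rollout(A, B, K, x0, steps):
    """Simulate x_{t+1} = A x_t + B u_t under u_t = -K x_t; return states and inputs."""
    xs, us = [], []
    x = x0
    for _ in range(steps):
        u = -K @ x
        xs.append(x)
        us.append(u)
        x = A @ x + B @ u
    return np.array(xs), np.array(us)

def pad_last(arr, width):
    """Zero-pad the last axis of a (T, d) array to a shared width."""
    out = np.zeros((arr.shape[0], width))
    out[:, :arr.shape[1]] = arr
    return out

def masked_mse(pred, target, m, m_max):
    """Mean squared error over the first m (real) input channels only."""
    mask = np.arange(m_max) < m
    diff = (pred - target)[:, mask]
    return float(np.mean(diff ** 2))

# Hypothetical example plant: a double integrator sampled at 0.1 s.
A = np.array([[1.0, 0.1], [0.0, 1.0]])
B = np.array([[0.0], [0.1]])
K = dlqr_gain(A, B, np.eye(2), np.eye(1))
states, inputs = rollout(A, B, K, np.array([1.0, 0.0]), steps=20)
inputs_padded = pad_last(inputs, width=3)  # shared input width of 3 is an assumption
```

In this sketch the mask ensures that padded channels contribute nothing to the loss, so systems with different input dimensions can share one output head without being penalized for whatever the policy emits in unused slots.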

Paper Structure

This paper contains 14 sections, 22 equations, 5 figures, 2 tables, and 1 algorithm.

Figures (5)

  • Figure 1: Relative sub-optimality distribution across seen and unseen systems (see the LTI systems table, tab:lti_systems, for the systems' labels).
  • Figure 2: Performance on dual-arm robot (system #15, seen).
  • Figure 3: Performance on vibrating beam (system #26, unseen).
  • Figure 4: Performance on asymmetric oscillator (system #18, unseen).
  • Figure 5: Ablation study on the effect of (left) the window length and (right) the number of sequences per system.