End-to-End Learning Framework for Solving Non-Markovian Optimal Control

Xiaole Zhang; Peiyu Zhang; Xiongye Xiao; Shixuan Li; Vasileios Tzoumas; Vijay Gupta; Paul Bogdan

TL;DR

This work addresses the challenge of controlling systems with memory effects by extending the Linear Quadratic Regulator (LQR) to fractional-order linear time-invariant (FOLTI) dynamics and proposing FOLOC, an end-to-end data-driven framework. FOLOC combines a system-identification module (RNN+MLP) with a neural-operator-based optimal-control module (Fourier Neural Operator) to jointly learn the system parameters $(A, B, \boldsymbol{\beta})$ and the optimal control policy directly from trajectories, grounded by analytical LQR solutions for FOLTI systems. The authors derive a discrete-time fractional-order system solution, establish sample-complexity bounds, and demonstrate robust performance across synthetic and real-world tasks (cart-pole and quadrotor) under non-Gaussian noise and limited data. The approach achieves efficient inference and scales to higher dimensions, suggesting practical applicability to complex non-Markovian control problems in engineering and robotics. The work advances both theory and practice by unifying fractional-order system identification with end-to-end control under realistic noise and data constraints.

Abstract

Integer-order calculus often falls short in capturing the long-range dependencies and memory effects found in many real-world processes. Fractional calculus addresses these gaps via fractional-order integrals and derivatives, but fractional-order dynamical systems pose substantial challenges in system identification and optimal control due to the lack of standard control methodologies. In this paper, we theoretically derive the optimal control via linear quadratic regulator (LQR) for fractional-order linear time-invariant (FOLTI) systems and develop an end-to-end deep learning framework based on this theoretical foundation. Our approach establishes a rigorous mathematical model, derives analytical solutions, and incorporates deep learning to achieve data-driven optimal control of FOLTI systems. Our key contributions include: (i) proposing an innovative system identification method and control strategy for FOLTI systems, (ii) developing the first end-to-end data-driven learning framework, Fractional-Order Learning for Optimal Control (FOLOC), that learns control policies from observed trajectories, and (iii) providing a theoretical analysis of sample complexity to quantify the number of samples required for accurate optimal control in complex real-world problems. Experimental results indicate that our method accurately approximates fractional-order system behaviors without relying on Gaussian noise assumptions, pointing to promising avenues for advanced optimal control.
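
To make the memory effect described above concrete, here is a minimal sketch of how a discrete-time fractional-order linear system can be rolled out. The abstract does not state the system equations, so the form below is an assumption: the commonly used Grünwald-Letnikov discretization $\Delta^{\alpha} x[k+1] = A x[k] + B u[k]$ with one fractional order per state. The function names `gl_coefficients` and `simulate_folti` are hypothetical and are not taken from the paper.

```python
def gl_coefficients(alpha, n):
    """Return psi[j] = (-1)^j * binom(alpha, j) for j = 0..n (Grunwald-Letnikov weights)."""
    psi = [1.0]
    for j in range(1, n + 1):
        # Standard recurrence: psi[j] = psi[j-1] * (j - 1 - alpha) / j
        psi.append(psi[-1] * (j - 1 - alpha) / j)
    return psi


def simulate_folti(A, B, alphas, x0, controls):
    """Roll out the ASSUMED model Delta^alpha x[k+1] = A x[k] + B u[k].

    A, B     : nested lists (n x n and n x m)
    alphas   : one fractional order per state dimension
    x0       : initial state (length n)
    controls : list of control vectors, one per time step
    Returns the list of states x[0], ..., x[len(controls)].
    """
    n = len(x0)
    steps = len(controls)
    psi = [gl_coefficients(a, steps) for a in alphas]
    xs = [list(x0)]
    for k in range(steps):
        x, u = xs[k], controls[k]
        nxt = []
        for i in range(n):
            drive = sum(A[i][c] * x[c] for c in range(n))
            drive += sum(B[i][c] * u[c] for c in range(len(u)))
            # Memory term: weighted sum over the entire past of state i,
            # which is what makes the system non-Markovian.
            memory = sum(psi[i][j] * xs[k + 1 - j][i] for j in range(1, k + 2))
            nxt.append(drive - memory)
        xs.append(nxt)
    return xs


# With alpha = 1 the weights vanish beyond j = 1 and the model reduces to the
# ordinary (Markovian) recursion x[k+1] = (A + I) x[k] + B u[k].
markovian = simulate_folti([[-0.5]], [[0.0]], [1.0], [1.0], [[0.0], [0.0]])

# With 0 < alpha < 1 every past state keeps a nonzero weight.
fractional = simulate_folti([[-0.5]], [[0.0]], [0.5], [1.0], [[0.0], [0.0]])
```

In this sketch the only difference between the integer-order and fractional-order cases is the `memory` sum: for a non-integer order it never truncates, so the next state depends on the whole trajectory rather than on the current state alone.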
