Neural Fractional Differential Equations
C. Coelho, M. Fernanda P. Costa, L. L. Ferrás
TL;DR
This work extends Neural ODEs by introducing Neural Fractional Differential Equations that replace the local time derivative with a Caputo fractional derivative ${}_{0}^{C}D^{\alpha}_{t}$ of order $\alpha \in (0,1)$, where $\alpha$ is learned by a dedicated neural network. The method combines two neural networks, $\mathbf{f}_{\boldsymbol{\theta}}$ for the dynamics and $\alpha_{\boldsymbol{\phi}}$ for the derivative order, with a Predictor-Corrector solver to compute solutions and enable backpropagation. Across synthetic and real-world datasets, Neural FDEs show improved accuracy and notably faster convergence in memory-rich tasks compared to Neural ODEs, albeit with higher computational cost due to nonlocal history dependence. The study highlights the potential of nonlocal dynamics for time-series modelling and outlines avenues to improve efficiency and stability, such as graded meshes and alternative loss formulations for $\alpha$.
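For reference, the standard definition of the Caputo derivative for $\alpha \in (0,1)$, and the kind of initial value problem a Neural FDE would then solve, can be written as below. The state symbol $\mathbf{h}(t)$, the initial condition $\mathbf{h}_0$, and the argument list of $\mathbf{f}_{\boldsymbol{\theta}}$ are notation assumed here for illustration and are not taken from the summary above.

$$
{}_{0}^{C}D^{\alpha}_{t}\,\mathbf{h}(t) = \frac{1}{\Gamma(1-\alpha)} \int_{0}^{t} (t-s)^{-\alpha}\,\mathbf{h}'(s)\,ds, \qquad \alpha \in (0,1),
$$

$$
{}_{0}^{C}D^{\alpha}_{t}\,\mathbf{h}(t) = \mathbf{f}_{\boldsymbol{\theta}}\big(t, \mathbf{h}(t)\big), \qquad \mathbf{h}(0) = \mathbf{h}_0, \qquad \alpha = \alpha_{\boldsymbol{\phi}}.
$$

The integral runs over the whole interval $[0, t]$, so the derivative at time $t$ depends on the full past trajectory. This is the nonlocal history dependence that gives the model its memory and also drives the higher computational cost.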
Abstract
Fractional Differential Equations (FDEs) are essential tools for modelling complex systems in science and engineering. They extend the traditional concepts of differentiation and integration to non-integer orders, enabling a more precise representation of processes characterised by non-local and memory-dependent behaviours. This property is useful in systems where variables do not respond to changes instantaneously, but instead exhibit a strong memory of past interactions. With this in mind, and drawing inspiration from Neural Ordinary Differential Equations (Neural ODEs), we propose the Neural FDE, a novel deep neural network architecture that adjusts an FDE to the dynamics of data. This work provides a comprehensive overview of the numerical method employed in Neural FDEs and the Neural FDE architecture. The numerical outcomes suggest that, despite being more computationally demanding, the Neural FDE may outperform the Neural ODE in modelling systems with memory or dependencies on past states, and it can effectively be applied to learn more intricate dynamical systems.
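To make the Predictor-Corrector idea concrete, the following is a minimal sketch of a fractional Adams-Bashforth-Moulton scheme on a uniform mesh for a scalar Caputo FDE. It assumes this classical variant, which may differ from the exact solver used in the paper. The function name `fde_predictor_corrector` and its parameters are hypothetical, `f` is a plain Python callable standing in for $\mathbf{f}_{\boldsymbol{\theta}}$, and `alpha` is a fixed number rather than the output of $\alpha_{\boldsymbol{\phi}}$.

```python
import math


def fde_predictor_corrector(f, y0, alpha, t_end, n_steps):
    """Sketch of a fractional Adams-Bashforth-Moulton (PECE) solver.

    Approximates the scalar problem  D^alpha y(t) = f(t, y(t)),  y(0) = y0,
    with a Caputo derivative of order alpha on a uniform mesh of n_steps
    steps over [0, t_end]. All names here are illustrative assumptions.
    """
    h = t_end / n_steps
    ys = [y0]            # solution values y_0, ..., y_n
    fs = [f(0.0, y0)]    # stored right-hand sides f(t_j, y_j): the "memory"
    c_pred = h ** alpha / math.gamma(alpha + 1)
    c_corr = h ** alpha / math.gamma(alpha + 2)

    for n in range(n_steps):
        t_next = (n + 1) * h

        # Predictor: product rectangle rule over the whole history.
        pred_sum = 0.0
        for j in range(n + 1):
            b = (n + 1 - j) ** alpha - (n - j) ** alpha
            pred_sum += b * fs[j]
        y_pred = y0 + c_pred * pred_sum

        # Corrector: product trapezoidal rule over the whole history.
        a0 = n ** (alpha + 1) - (n - alpha) * (n + 1) ** alpha
        corr_sum = a0 * fs[0]
        for j in range(1, n + 1):
            k = n - j
            a = (k + 2) ** (alpha + 1) + k ** (alpha + 1) - 2 * (k + 1) ** (alpha + 1)
            corr_sum += a * fs[j]
        y_next = y0 + c_corr * (f(t_next, y_pred) + corr_sum)

        ys.append(y_next)
        fs.append(f(t_next, y_next))

    return ys


if __name__ == "__main__":
    # Fractional relaxation D^0.5 y = -y, y(0) = 1, on [0, 1].
    trajectory = fde_predictor_corrector(lambda t, y: -y, 1.0, 0.5, 1.0, 100)
    print(trajectory[-1])
```

Every step sums over all previously stored values in `fs`, so the work grows quadratically with the number of steps. This illustrates why a nonlocal solver of this kind is more expensive than a standard ODE solver, which only needs the most recent state.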
