Integrating Lagrangian Neural Networks into the Dyna Framework for Reinforcement Learning

Shreya Das, Kundan Kumar, Muhammad Iqbal, Outi Savolainen, Dominik Baumann, Laura Ruotsalainen, Simo Särkkä

TL;DR

LNNs, which enforce an underlying Lagrangian structure, are used to learn the dynamics model within a Dyna-based MBRL framework. When training the network, the state-estimation-based optimizer converges faster than the stochastic gradient-based one.

Abstract

Model-based reinforcement learning (MBRL) is sample-efficient but depends on the accuracy of the learned dynamics, which are often modeled using black-box methods that do not adhere to physical laws. Such models tend to produce inaccurate predictions when presented with data that differ from the original training set. In this work, we employ Lagrangian neural networks (LNNs), which enforce an underlying Lagrangian structure, to train the model within a Dyna-based MBRL framework. Furthermore, we learn the LNN's weights using both stochastic gradient-based and state-estimation-based optimizers. The state-estimation-based method converges faster than the stochastic gradient-based method during neural network training. Simulation results are provided to illustrate the effectiveness of the proposed LNN-based Dyna framework for MBRL.

Paper Structure

This paper contains 9 sections, 19 equations, 3 figures, 3 algorithms.

Figures (3)

  • Figure 1: The basic idea behind the developed method of LNN-based MBRL in the Dyna framework. We learn the dynamics model using the LNN from real environment data and use an RK-2 integrator to generate model-based rollouts. The policy and value networks are trained on both real and model-generated data, improving sample efficiency while preserving physical consistency.
  • Figure 2: Model learning using the LNN, which takes $(q, \,\dot{q})$ as input and outputs the Lagrangian $L$; $L$ is then used to evaluate $\ddot{q}$ through the Euler-Lagrange equation. The second-order Runge-Kutta (RK-2) method is then used to compute $(q_{t+1}, \, \dot{q}_{t+1})$. The reward is learned from the state-action pair.
  • Figure 3: Average return versus timestep for the proposed physics-informed MBRL (PIMBRL) using an LNN with Adam and EKF as optimizers, PIMBRL using a DNN with constraints as in liu2021physics, and model-free RL (MFRL).
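
The pipeline summarized in Figures 1 and 2 (Lagrangian, then acceleration, then an RK-2 step) can be illustrated with a minimal sketch. It is not the authors' implementation, and it rests on several assumptions: a hand-written pendulum Lagrangian stands in for the trained LNN, the system has one degree of freedom with no control input, derivatives are taken by central finite differences rather than automatic differentiation, and the midpoint rule is used as the RK-2 variant (the summary does not say which variant the paper uses). For one degree of freedom, the Euler-Lagrange equation gives $\ddot{q} = \left(\partial^2 L / \partial \dot{q}^2\right)^{-1} \left(\partial L / \partial q - \left(\partial^2 L / \partial q \, \partial \dot{q}\right) \dot{q}\right)$. All function names and parameters below are hypothetical.

```python
import math


def lagrangian(q, q_dot, m=1.0, l=1.0, g=9.81):
    """Stand-in for the LNN output L(q, q_dot): an ideal pendulum (assumption)."""
    return 0.5 * m * l**2 * q_dot**2 + m * g * l * math.cos(q)


def acceleration(L, q, q_dot, h=1e-3):
    """Solve the 1-DOF Euler-Lagrange equation for q_ddot.

    Derivatives of L are approximated with central finite differences;
    an actual LNN would obtain them by automatic differentiation.
    """
    dL_dq = (L(q + h, q_dot) - L(q - h, q_dot)) / (2 * h)
    d2L_dqd2 = (L(q, q_dot + h) - 2 * L(q, q_dot) + L(q, q_dot - h)) / h**2
    d2L_dq_dqd = (
        L(q + h, q_dot + h) - L(q + h, q_dot - h)
        - L(q - h, q_dot + h) + L(q - h, q_dot - h)
    ) / (4 * h**2)
    return (dL_dq - d2L_dq_dqd * q_dot) / d2L_dqd2


def rk2_step(L, q, q_dot, dt):
    """Advance (q, q_dot) by one midpoint (RK-2) step of size dt."""
    a1 = acceleration(L, q, q_dot)
    q_mid = q + 0.5 * dt * q_dot
    q_dot_mid = q_dot + 0.5 * dt * a1
    a2 = acceleration(L, q_mid, q_dot_mid)
    return q + dt * q_dot_mid, q_dot + dt * a2


def rollout(L, q, q_dot, dt, steps):
    """Generate a model-based rollout of states, as used for Dyna-style planning."""
    states = [(q, q_dot)]
    for _ in range(steps):
        q, q_dot = rk2_step(L, q, q_dot, dt)
        states.append((q, q_dot))
    return states
```

Calling `rollout(lagrangian, 0.5, 0.3, 0.01, 100)` returns 101 `(q, q_dot)` pairs. In a Dyna-style loop, such model-generated transitions would be mixed with real environment data when updating the policy and value networks.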