Low Dimensional State Representation Learning with Robotics Priors in Continuous Action Spaces

Nicolò Botteghi, Khaled Alaa, Mannes Poel, Beril Sirmacek, Christoph Brune, Abeje Mersha, Stefano Stramigioli

TL;DR

This paper proposes a framework combining the learning of a low-dimensional state representation, from high-dimensional observations coming from the robot’s raw sensory readings, with the learning of the optimal policy, given the learned state representation.

Abstract

Autonomous robots require high degrees of cognitive and motoric intelligence to come into our everyday life. In non-structured environments and in the presence of uncertainties, such degrees of intelligence are not easy to obtain. Reinforcement learning algorithms have proven to be capable of solving complicated robotics tasks in an end-to-end fashion without any need for hand-crafted features or policies. Especially in the context of robotics, in which the cost of real-world data is usually extremely high, reinforcement learning solutions achieving high sample efficiency are needed. In this paper, we propose a framework combining the learning of a low-dimensional state representation, from high-dimensional observations coming from the robot's raw sensory readings, with the learning of the optimal policy, given the learned state representation. We evaluate our framework in the context of mobile robot navigation in the case of continuous state and action spaces. Moreover, we study the problem of transferring what was learned in the simulated virtual environment to the real robot, without further retraining on real-world data, in the presence of visual and depth distractors such as lighting changes and moving obstacles.

Paper Structure

This paper contains 18 sections, 11 equations, 5 figures.

Figures (5)

  • Figure 1: Proposed framework combining state representation learning and reinforcement learning.
  • Figure 2: Examples of simulation environments.
  • Figure 3: True state values (Figure \ref{fig:0}), and first two principal components (Figures \ref{fig:1}-\ref{fig:5}), obtained with PCA, of learned state representations for environment Env-1 and target location in the bottom left corner (see Figure \ref{fig:env1}).
  • Figure 4: Evolution of the success ratio over training in the different environments. The solid line represents the mean and the shaded area the variance of the success ratio. For the sake of clarity, we omit the variance in Figure \ref{fig:res_env1}.
  • Figure 5: Trajectory tracking in the testing virtual environments.