Lifelong Federated Reinforcement Learning: A Learning Architecture for Navigation in Cloud Robotic Systems

Boyi Liu, Lujia Wang, Ming Liu

TL;DR

This work addresses enabling robots to accumulate and reuse experience across environments in cloud robotics. It introduces Lifelong Federated Reinforcement Learning (LFRL), combining a cloud-based knowledge fusion algorithm with two transfer-learning strategies to fuse private models into a powerful shared model and transfer it back to robots. Through simulations and real-world Turtlebot3 experiments, the approach reduces training time while improving navigation performance and generalization to new environments. A cloud navigation-learning website is released to demonstrate practical deployment and service provision for cloud robotic systems.

Abstract

This paper was motivated by the problem of how to make robots fuse and transfer their experience so that they can effectively use prior knowledge and quickly adapt to new environments. To address the problem, we present a learning architecture for navigation in cloud robotic systems: Lifelong Federated Reinforcement Learning (LFRL). In this work, we propose a knowledge fusion algorithm for upgrading a shared model deployed on the cloud. Then, effective transfer learning methods in LFRL are introduced. LFRL is consistent with human cognitive science and fits well in cloud robotic systems. Experiments show that LFRL greatly improves the efficiency of reinforcement learning for robot navigation. The cloud robotic system deployment also shows that LFRL is capable of fusing prior knowledge. In addition, we release a cloud robotic navigation-learning website based on LFRL.

Paper Structure

This paper contains 19 sections, 7 equations, 9 figures, 1 table, 2 algorithms.

Figures (9)

  • Figure 1: The person on the right is considering where his next move should go. The chess games he has played and the chess games he has seen are the two factors that most influence his decisions. His memory is fused into his policy model. So how can robots remember and make decisions like humans? Motivated by this observation from human cognitive science, we propose LFRL for cloud robotic systems. LFRL lets the cloud remember what robots have learned before, like a human brain.
  • Figure 2: Proposed Architecture. In Robot$\rightarrow$Environment, the robot learns to avoid new types of obstacles in the new environment through reinforcement learning and obtains a private Q-network model. Private models can come not only from one robot training in different environments but also from multiple robots, which makes this a type of federated learning. The private network is then uploaded to the cloud, where the cloud server evolves the shared model by fusing the private models into it. In Cloud$\rightarrow$Robot, inspired by transfer learning, successor features are used to transfer the strategy to an unknown environment. We either feed the output of the shared model as additional features into the Q-network in reinforcement learning, or simply transfer all parameters to the Q-network. As this step is iterated, the models on the cloud become increasingly powerful.
  • Figure 3: LFRL compared with A3C or UNREAL
  • Figure 4: Knowledge Fusion Algorithm in LFRL: We generate a large amount of training data based on sensor data, target data, and human-defined features. Each training sample is fed into the private network and the k-th generation shared network, and each actor scores the available actions. We then store the scores and calculate the confidence value of every actor for this training sample. The "confidence value" is used as a weight: the scores are weighted and summed to obtain the label of the current sample. Labels for all samples are generated in the same way. Finally, a new network is trained to fit the labeled sample data as closely as possible. This network is the (k+1)th generation, which completes one fusion step.
  • Figure 5: A transfer learning method of LFRL
  • ...and 4 more figures
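
The labeling step described in the Figure 4 caption (score each sample with every actor, weight the scores by confidence, sum them into a label) can be sketched as follows. This is a minimal illustration, not the authors' implementation. The caption does not define the confidence value, so the sketch assumes one possible choice: one minus the normalized entropy of the softmax over an actor's scores. The function names `confidence` and `fuse_labels` and the array layout are likewise hypothetical.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    z = x - x.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)


def confidence(scores):
    """ASSUMED confidence measure (not specified in the caption).

    scores: array of shape (n_actors, n_samples, n_actions).
    Returns an array of shape (n_actors, n_samples) in [0, 1]:
    1 means the actor strongly prefers one action, 0 means its
    scores are uniform across actions.
    """
    p = softmax(scores, axis=-1)
    n_actions = scores.shape[-1]
    entropy = -(p * np.log(p + 1e-12)).sum(axis=-1)
    return np.clip(1.0 - entropy / np.log(n_actions), 0.0, 1.0)


def fuse_labels(scores, eps=1e-8):
    """Confidence-weighted sum of actor scores, one label vector per sample.

    scores: array of shape (n_actors, n_samples, n_actions), holding the
    outputs of the private network(s) and the k-th generation shared network.
    Returns labels of shape (n_samples, n_actions).
    """
    conf = confidence(scores) + eps  # eps keeps weights defined if all confidences are 0
    weights = conf / conf.sum(axis=0, keepdims=True)
    return (weights[..., None] * scores).sum(axis=0)


if __name__ == "__main__":
    # Two hypothetical actors scoring one sample over three actions.
    decisive = np.array([[10.0, 0.0, 0.0]])   # clearly prefers action 0
    undecided = np.array([[1.0, 1.0, 1.0]])   # no preference
    labels = fuse_labels(np.stack([decisive, undecided]))
    print(labels.shape, labels.argmax(axis=-1))
```

Under this assumption, the fused labels are dominated by whichever actor is most decisive on a given sample. The resulting `labels` array would then serve as regression targets when fitting the (k+1)th generation network, a step not shown here.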