Federated reinforcement learning for robot motion planning with zero-shot generalization

Zhenyuan Yuan, Siyuan Xu, Minghui Zhu

TL;DR

A federated reinforcement learning framework that lets multiple learners and a central server learn collaboratively without sharing raw data, and that builds on derived zero-shot generalization guarantees on arrival time and safety.

Abstract

This paper considers the problem of learning a control policy for robot motion planning with zero-shot generalization, i.e., no data collection or policy adaptation is needed when the learned policy is deployed in new environments. We develop a federated reinforcement learning framework that enables collaborative learning among multiple learners and a central server, i.e., the Cloud, without sharing their raw data. In each iteration, each learner uploads its local control policy and the corresponding estimated normalized arrival time to the Cloud, which then computes the global optimum among the learners and broadcasts the optimal policy to the learners. Each learner then selects between its local control policy and the one from the Cloud for the next iteration. The proposed framework leverages the derived zero-shot generalization guarantees on arrival time and safety. Theoretical guarantees on almost-sure convergence, almost consensus, Pareto improvement, and optimality gap are also provided. Monte Carlo simulations are conducted to evaluate the proposed framework.
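The upload, select, and broadcast loop described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' algorithm. The names `Upload`, `cloud_select`, `learner_select`, and `federated_round` are hypothetical, a policy is reduced to a plain parameter list, and the sketch assumes that a lower estimated normalized arrival time is better. It also assumes that each learner re-evaluates the broadcast policy with its own estimator before choosing. Local policy training and the safety guarantee are left out.

```python
from dataclasses import dataclass
from typing import Callable, List

Policy = List[float]  # assumption: a policy is just a parameter vector


@dataclass
class Upload:
    learner_id: int
    policy: Policy
    est_arrival_time: float  # estimated normalized arrival time (assumed: lower is better)


def cloud_select(uploads: List[Upload]) -> Upload:
    """Cloud step: pick the uploaded policy with the smallest estimated arrival time."""
    return min(uploads, key=lambda u: u.est_arrival_time)


def learner_select(local: Upload, cloud_policy: Policy,
                   estimate: Callable[[int, Policy], float]) -> Policy:
    """Learner step: keep the local policy unless the Cloud's policy scores
    strictly better under this learner's own estimate (an assumed rule)."""
    cloud_score = estimate(local.learner_id, cloud_policy)
    return cloud_policy if cloud_score < local.est_arrival_time else local.policy


def federated_round(policies: List[Policy],
                    estimate: Callable[[int, Policy], float]) -> List[Policy]:
    """One iteration: upload, Cloud selection, broadcast, local selection.
    Only policies and scalar estimates are exchanged, never raw data."""
    uploads = [Upload(i, p, estimate(i, p)) for i, p in enumerate(policies)]
    best = cloud_select(uploads)
    return [learner_select(u, best.policy, estimate) for u in uploads]


if __name__ == "__main__":
    # Toy estimator: learner i's arrival time is the distance of the parameter from i.
    toy_estimate = lambda i, p: abs(p[0] - i)
    print(federated_round([[0.0], [0.9], [5.0]], toy_estimate))
```

In the toy run, learner 1 keeps its own policy because the broadcast policy scores worse in its environment, while learner 2 adopts the broadcast one. This shows why the selection step is made locally rather than imposed by the Cloud.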

Paper Structure

This paper contains 25 sections, 11 theorems, 66 equations, 13 figures, 3 tables, and 1 algorithm.

Key Result

Theorem 3.1

Suppose the assumptions on the stochastic environment and on the stochastic initialization hold. The following properties are true for all $i\in\mathcal{V}$:

Figures (13)

  • Figure 1: Implementation of FedGen for learner $i$ in iteration $k$
  • Figure 2: Parameter update logic at each iteration
  • Figure 3: A sample environment in PyBullet
  • Figure 4: Generalized performances to unseen environments
  • Figure 5: Comparison between initial policy, locally converged policy and globally converged policy
  • ...and 8 more figures

Theorems & Definitions (11)

  • Theorem 3.1
  • Lemma 3.5
  • Theorem 3.6
  • Theorem 3.7
  • Theorem 4.1
  • Lemma 4.2
  • Lemma 4.3
  • Lemma 4.4
  • Lemma 4.5
  • Lemma 4.6
  • ...and 1 more