Stimulate the Potential of Robots via Competition

Kangyao Huang; Di Guo; Xinyu Zhang; Xiangyang Ji; Huaping Liu

TL;DR

A competitive learning framework that helps an individual robot acquire knowledge from competition and fully stimulates its dynamic potential in a race. Competition information among competitors is introduced as an additional auxiliary signal for learning advantageous actions.

Abstract

It is common to feel pressure in a competitive environment, which arises from the desire to succeed relative to other individuals or opponents. Although this pressure can make us anxious, it can also drive us to realize our full potential in order to keep up with others. Inspired by this, we propose a competitive learning framework that helps an individual robot acquire knowledge from competition, fully stimulating its dynamic potential in a race. Specifically, competition information among competitors is introduced as an additional auxiliary signal for learning advantageous actions. We further build a Multiagent-Race environment and conduct extensive experiments, which demonstrate that robots trained in competitive environments outperform those trained with SoTA algorithms in a single-robot environment.
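
The abstract does not specify how the competition information is encoded. As a minimal sketch, assume it is the signed forward distance from the robot to each rival, appended to the robot's proprioceptive state before the policy sees it. The function name `build_observation` and its parameters are hypothetical and are not taken from the paper.

```python
from typing import List, Sequence


def build_observation(
    proprio: Sequence[float],
    own_x: float,
    rival_xs: Sequence[float],
) -> List[float]:
    """Append a hypothetical competition signal to the proprioceptive state.

    Assumption: the competition signal is the signed forward gap to each
    rival (positive when the rival is ahead). The paper may use a different
    encoding.
    """
    gaps = [x - own_x for x in rival_xs]
    return list(proprio) + gaps


# Two proprioceptive values, one rival ahead by 0.5 and one behind by 0.75.
obs = build_observation([0.1, -0.2], own_x=1.0, rival_xs=[1.5, 0.25])
print(obs)  # [0.1, -0.2, 0.5, -0.75]
```

Under this assumption the policy input grows by one value per rival, so a single-robot policy and a three-robot policy would differ only in the size of their input layer.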

Paper Structure (18 sections, 8 equations, 6 figures, 1 table)

Introduction
Related Work
Competition in Multi-agent Task
Learning from Competitive and Adversarial Data
Problem Formulation
Methodology
Framework
Policy Training & Experience Sharing
Robot Observation Construction
Evaluation
Empirical Results
Environment
Baselines
Experimental Details
Role of Competition
...and 3 more sections

Figures (6)

Figure 1: Episode reward comparison between competitive and non-competitive Walker2d environments. Walker2d robots trained with competition can reach 120% of the baseline performance.
Figure 2: Framework for learning knowledge from comparative information. Here $a$ denotes actions, $s$ the proprioceptive state, and $o$ the competitive observation.
Figure 3: The proposed self-interest competition environment, Multiagent-Race. A race among three Walker2d robots is shown as an example.
Figure 4: Performance of different settings on 3-agent environments. SA: Yellow. 3A-Sh-Decent: Blue. 3A-Sh-Decent-Comp: Red. 3A-Sp-Decent-Comp: Green. 3A-Sh-Cent-Comp: Brown.
Figure 5: Comparison between 3A-Sh-Decent (Non: non-competitive information as inputs), 3A-Sh-Decent-Noi (Noi: noise information as inputs), and 3A-Sh-Decent-Comp (Comp: competitive information as inputs).
...and 1 more figures
