A3C-S: Automated Agent Accelerator Co-Search towards Efficient Deep Reinforcement Learning

Yonggan Fu, Yongan Zhang, Chaojian Li, Zhongzhi Yu, Yingyan Celine Lin

TL;DR

This work proposes an Automated Agent Accelerator Co-Search (A3C-S) framework, which to the authors' best knowledge is the first to automatically co-search optimally matched DRL agents and accelerators that maximize both test scores and hardware efficiency.

Abstract

Driven by the explosive interest in applying deep reinforcement learning (DRL) agents to numerous real-time control and decision-making applications, there has been a growing demand to deploy DRL agents to empower daily-life intelligent devices, while the prohibitive complexity of DRL stands at odds with limited on-device resources. In this work, we propose an Automated Agent Accelerator Co-Search (A3C-S) framework, which to our best knowledge is the first to automatically co-search the optimally matched DRL agents and accelerators that maximize both test scores and hardware efficiency. Extensive experiments consistently validate the superiority of our A3C-S over state-of-the-art techniques.

Paper Structure

This paper contains 13 sections, 12 equations, 3 figures, 3 tables, and 1 algorithm.

Figures (3)

  • Figure 1: Test scores averaged over 30 episodes during the training of five models on four Atari games.
  • Figure 2: Test score evolution during the search processes of three different search schemes on four Atari games (Bellemare et al., 2013), where Direct-NAS denotes directly applying NAS without distillation, and A3C-S:One-level and A3C-S:Bi-level denote search with the distillation loss using one-level and bi-level optimization, respectively.
  • Figure 3: Benchmarking the proposed A3C-S, with (1) ResNet-14 on accelerators searched by our DAS and (2) A3C-S searched agents on A3C-S searched accelerators, against the SOTA accelerator DNNBuilder (Zhang et al., 2018), in terms of the trade-off between test scores and FPS on four Atari games.