Deep Reinforcement Learning for System-on-Chip: Myths and Realities

Tegg Taekyong Sung, Bo Ryu

TL;DR

The novel neural scheduler technique, Eclectic Interaction Matching (EIM), overcomes the challenges that SoC scheduling poses for existing neural schedulers (heterogeneous computing resources and a variable action set) and significantly improves on them. The paper also explains the reasons behind the performance gain of the EIM-based neural scheduler.

Abstract

Neural schedulers based on deep reinforcement learning (DRL) have shown considerable potential for solving real-world resource allocation problems, as they have demonstrated significant performance gains in the domain of cluster computing. In this paper, we investigate the feasibility of neural schedulers for the domain of System-on-Chip (SoC) resource allocation through extensive experiments and comparison with non-neural, heuristic schedulers. The key findings are three-fold. First, neural schedulers designed for the cluster computing domain do not work well for SoC due to i) the heterogeneity of SoC computing resources and ii) the variable action set caused by randomness in incoming jobs. Second, our novel neural scheduler technique, Eclectic Interaction Matching (EIM), overcomes the above challenges, thus significantly improving on the existing neural schedulers. Specifically, we rationalize the underlying reasons behind the performance gain of the EIM-based neural scheduler. Third, we discover that the ratio of the average processing element (PE) switching delay to the average PE computation time significantly impacts the performance of neural SoC schedulers, even with EIM. Consequently, future neural SoC scheduler designs must consider this metric, as well as implementation overhead, for practical utility.
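
The third finding turns on a single ratio, so a small worked example may help. The sketch below is one plausible reading of that metric, not the paper's definition: it divides the mean PE switching delay by the mean PE computation time. The function name and all numbers are hypothetical and chosen only for illustration.

```python
from statistics import mean


def switching_to_compute_ratio(switch_delays, compute_times):
    """Mean PE switching delay divided by mean PE computation time.

    Assumption: both inputs are plain lists of durations in the same unit.
    The paper may define or aggregate these quantities differently.
    """
    return mean(switch_delays) / mean(compute_times)


# Hypothetical durations (arbitrary time units), not taken from the paper.
switch_delays = [2, 3, 1, 4]
compute_times = [4, 6, 5, 3, 2, 9, 2, 7]

ratio = switching_to_compute_ratio(switch_delays, compute_times)
print(f"switching/compute ratio: {ratio:.3f}")
```

Under this reading, a ratio well below 1 means computation dominates, while a ratio near or above 1 means that moving work between PEs costs about as much as doing the work.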

Paper Structure (28 sections, 17 equations, 13 figures, 3 tables, 2 algorithms)

Figures (13)

  • Figure 1: An illustration of a set of synthetic job and resource profiles. The diagram on the left depicts a job DAG, where each node represents a task by its ID and each edge represents a data transmission delay by its weight. The table on the right shows a set of heterogeneous PEs with different computation times for each task.
  • Figure 2: An overview of the DS3 workflow. At initialization, a set of workloads and PEs is generated for the given job and resource profiles. The job generator continuously generates multiple jobs using the set of workloads and distributes them to the task queues. The scheduler takes any tasks in the ready queue and maps each task to one of the PEs. If the PE is idle, it starts task execution. The task dependency graph determines which task moves to the ready queue next, once its predecessors have completed.
  • Figure 3: The edge density and chain ratio of cluster and SoC workloads. The results of TPC-DS and TPC-H are reproduced by referring to tian2019characterizing.
  • Figure 4: An illustration of irregular interactions. Although Tasks 1 and 3 have been completed earlier, the next Tasks 4, 5, and 6 are scheduled after Task 2 has been completed. As a result, the reward gains for scheduling decisions for Tasks 1 and 3 are truncated due to the task dependencies.
  • Figure 5: The architecture of neural schedulers applied to the DS3 simulator. Schedulers receive $N$ tasks in the ready queue and map each task to SoC computing resources. Because the number of tasks varies, scheduling policies are fed one task at a time. SoCRATES applies Eclectic Interaction Matching to post-process the return (bottom-left). DeepSoCS returns sorted tasks and uses the EFT algorithm to map them to resources (bottom-right).
  • ...and 8 more figures
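
The captions for Figures 1, 2, and 5 describe a job DAG with weighted edges, a table of per-PE computation times, and a mapping of ready tasks to PEs by earliest finish time (EFT). The sketch below puts those pieces together on a toy example. It is a minimal illustration under stated assumptions, not the DS3 or DeepSoCS implementation: the DAG, the PE names, all timings, and the rule that the transmission delay is zero when two tasks share a PE are hypothetical.

```python
from collections import defaultdict

# Hypothetical job DAG: edges[(u, v)] = data transmission delay from task u to v.
edges = {(0, 1): 2, (0, 2): 3, (1, 3): 1, (2, 3): 4}

# Hypothetical heterogeneous PEs: comp_time[pe][task] = computation time.
comp_time = {
    "cpu": {0: 4, 1: 6, 2: 5, 3: 3},
    "acc": {0: 2, 1: 9, 2: 2, 3: 7},
}


def eft_schedule(edges, comp_time):
    """Greedy earliest-finish-time mapping of DAG tasks to PEs.

    Assumption: the edge delay applies only when predecessor and successor
    run on different PEs; on the same PE it is treated as zero.
    Returns {task: (pe, start, finish)}.
    """
    tasks = {t for edge in edges for t in edge}
    preds = defaultdict(list)
    for (u, v), delay in edges.items():
        preds[v].append((u, delay))

    pe_free = {pe: 0 for pe in comp_time}  # time at which each PE becomes idle
    placed = {}
    remaining = set(tasks)
    while remaining:
        # Ready queue: tasks whose predecessors have all been scheduled.
        ready = sorted(
            t for t in remaining if all(u in placed for u, _ in preds[t])
        )
        for task in ready:
            best = None
            for pe in comp_time:
                # Earliest time at which all input data is available on this PE.
                data_ready = max(
                    (
                        placed[u][2] + (0 if placed[u][0] == pe else delay)
                        for u, delay in preds[task]
                    ),
                    default=0,
                )
                start = max(pe_free[pe], data_ready)
                finish = start + comp_time[pe][task]
                if best is None or finish < best[2]:
                    best = (pe, start, finish)
            placed[task] = best
            pe_free[best[0]] = best[2]
            remaining.discard(task)
    return placed


schedule = eft_schedule(edges, comp_time)
makespan = max(finish for _, _, finish in schedule.values())
for task in sorted(schedule):
    print(task, schedule[task])
print("makespan:", makespan)
```

In this toy run, Task 3 cannot start before both predecessors finish and their data arrives, which is the same dependency effect that Figure 4 describes for Tasks 4, 5, and 6.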