RL-TIME: Reinforcement Learning-based Task Replication in Multicore Embedded Systems

Roozbeh Siyadatzadeh, Mohsen Ansari, Muhammad Shafique, Alireza Ejlali

TL;DR

This work addresses reliability and real-time constraints in multicore embedded systems, where static, design-time task replication incurs significant power and thermal overhead. It introduces RL-TIME, a reinforcement learning-based framework that dynamically chooses the number of replicas for each real-time task and maps them to cores, enforcing per-core Thermal Safe Power (TSP) constraints and meeting the reliability target under EDF-like scheduling. Key contributions include a Q-learning formulation with discretized states; a reliability and power model that accounts for aging, DVFS, and fault detection; offline pre-training to alleviate the cold-start problem; and an evaluation against state-of-the-art replication approaches that shows substantial power savings and improved schedulability under thermal constraints. The results indicate practical value for energy-efficient, fault-tolerant real-time multicore systems: replication adapts at run-time so that tasks meet their deadlines without overheating the cores.

Abstract

Embedded systems power many modern applications and must often meet strict reliability, real-time, thermal, and power requirements. Task replication can improve reliability by duplicating a task's execution to handle transient and permanent faults, but blindly applying replication often leads to excessive overhead and higher temperatures. Existing design-time methods typically choose the number of replicas based on worst-case conditions, which can waste resources under normal operation. In this paper, we present RL-TIME, a reinforcement learning-based approach that dynamically decides the number of replicas according to actual system conditions. By considering both the reliability target and a core-level Thermal Safe Power (TSP) constraint at run-time, RL-TIME adapts the replication strategy to avoid unnecessary overhead and overheating. Experimental results show that, compared to state-of-the-art methods, RL-TIME reduces power consumption by 63%, increases schedulability by 53%, and respects TSP 72% more often.
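The idea of learning a replica count at run-time can be illustrated with a small tabular Q-learning sketch. This is not the paper's implementation: the three temperature bins, the per-replica reliabilities in `R_SINGLE`, the target in `TARGET`, the reward shape in `toy_reward`, and the assumption that replicas fail independently are all hypothetical stand-ins chosen only to make the example self-contained. The per-replica penalty loosely plays the role of the power cost that RL-TIME bounds through TSP.

```python
import random

N_STATES, N_ACTIONS = 3, 4          # hypothetical: 3 temperature bins, 1..4 replicas
R_SINGLE = [0.999, 0.99, 0.95]      # assumed per-replica reliability in each bin
TARGET = 0.99999                    # assumed reliability target


def replicated_reliability(r_single, k):
    """Probability that at least one of k replicas succeeds (independence assumed)."""
    return 1.0 - (1.0 - r_single) ** k


def toy_reward(state, action):
    """Hypothetical reward: +1/-1 for meeting/missing the target, minus a per-replica cost."""
    k = action + 1  # action index a means a + 1 replicas
    meets = replicated_reliability(R_SINGLE[state], k) >= TARGET
    return (1.0 if meets else -1.0) - 0.1 * k


def q_update(q, state, action, reward, next_state, alpha=0.05, gamma=0.5):
    """Standard tabular Q-learning update, applied in place."""
    best_next = max(q[next_state])
    q[state][action] += alpha * (reward + gamma * best_next - q[state][action])


def choose_action(q, state, epsilon, rng):
    """Epsilon-greedy selection over replica counts."""
    if rng.random() < epsilon:
        return rng.randrange(len(q[state]))
    return max(range(len(q[state])), key=lambda a: q[state][a])


def train(steps=20000, epsilon=0.2, seed=0):
    """Train on a toy environment whose temperature bin changes at random each step."""
    rng = random.Random(seed)
    q = [[0.0] * N_ACTIONS for _ in range(N_STATES)]
    state = 0
    for _ in range(steps):
        action = choose_action(q, state, epsilon, rng)
        reward = toy_reward(state, action)
        next_state = rng.randrange(N_STATES)
        q_update(q, state, action, reward, next_state)
        state = next_state
    return q


def greedy_replicas(q):
    """Replica count the learned table prefers in each state."""
    return [max(range(len(row)), key=lambda a: row[a]) + 1 for row in q]


if __name__ == "__main__":
    print(greedy_replicas(train()))
```

In this toy setting the learned policy uses more replicas in the bins where a single replica is less reliable, which mirrors the abstract's point that a fixed worst-case replica count wastes resources under milder conditions.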

Paper Structure

This paper contains 19 sections, 18 equations, 11 figures, 2 tables, and 1 algorithm.

Figures (11)

  • Figure 1: Motivational example: (a) the required number of replicas to maintain a target reliability of 0.9999999 for "dedup" from the PARSEC benchmark [van1990algorithms], and (b) the impact of transistor aging on threshold voltage over time at various temperatures.
  • Figure 2: A summary of our novel contribution.
  • Figure 3: Design flow of the proposed RL-TIME.
  • Figure 4: The experimental setup and integrated tool flow.
  • Figure 5: Number of replicas for each task at different temperatures over time slots (each slot is one million CPU cycles).
  • ...and 6 more figures