RL-TIME: Reinforcement Learning-based Task Replication in Multicore Embedded Systems

Roozbeh Siyadatzadeh; Mohsen Ansari; Muhammad Shafique; Alireza Ejlali

TL;DR

This work tackles reliability and real-time constraints in multicore embedded systems, where static, design-time task replication incurs significant power and temperature overheads and uses resources inefficiently. It introduces RL-TIME, a reinforcement learning-based framework that dynamically chooses the number of replicas for each real-time task and maps them to cores, while enforcing per-core Thermal Safe Power (TSP) constraints and meeting the target reliability under EDF-like scheduling. Key contributions include a Q-learning formulation with discretized states; a reliability and power model that accounts for aging, DVFS, and fault detection; offline pre-training to alleviate the cold-start problem; and a comprehensive evaluation against state-of-the-art replication approaches that shows substantial power savings and improved schedulability under thermal constraints. The results demonstrate practical impact for energy-efficient, fault-tolerant real-time multicore systems: reliability adapts at run-time, avoiding overheating while still meeting deadlines.
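To make the "Q-learning formulation with discretized states" concrete, here is a minimal tabular Q-learning sketch. It is not the paper's implementation: the state features (core temperature and utilization), the bin edges, the action set of 1 to 4 replicas, and the names `discretize` and `ReplicaAgent` are all assumptions chosen for illustration, and the reward signal is left to the caller.

```python
import random
from collections import defaultdict

# Hypothetical action set: how many replicas to run for a task.
ACTIONS = (1, 2, 3, 4)


def discretize(temperature_c, utilization):
    """Map continuous readings to a small discrete state (bin edges are assumed)."""
    temp_bin = max(0, min(int(temperature_c // 20), 4))  # 20 degC per bin, clamped to 0..4
    util_bin = max(0, min(int(utilization * 4), 3))      # quarters of utilization, 0..3
    return (temp_bin, util_bin)


class ReplicaAgent:
    """Tabular Q-learning agent with epsilon-greedy action selection."""

    def __init__(self, alpha=0.1, gamma=0.9, epsilon=0.1, seed=0):
        self.q = defaultdict(float)  # (state, action) -> estimated value
        self.alpha = alpha           # learning rate
        self.gamma = gamma           # discount factor
        self.epsilon = epsilon       # exploration probability
        self.rng = random.Random(seed)

    def choose(self, state):
        # Explore with probability epsilon, otherwise pick the best-known action.
        if self.rng.random() < self.epsilon:
            return self.rng.choice(ACTIONS)
        return max(ACTIONS, key=lambda a: self.q[(state, a)])

    def update(self, state, action, reward, next_state):
        # Standard Q-learning update toward reward + gamma * max_a' Q(s', a').
        best_next = max(self.q[(next_state, a)] for a in ACTIONS)
        td_target = reward + self.gamma * best_next
        self.q[(state, action)] += self.alpha * (td_target - self.q[(state, action)])
```

In a setup like this, the reward would presumably penalize TSP violations and missed deadlines and reward meeting the reliability target with few replicas; the exact reward used by RL-TIME is not stated in this summary.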

Abstract

Embedded systems power many modern applications and must often meet strict reliability, real-time, thermal, and power requirements. Task replication can improve reliability by duplicating a task's execution to handle transient and permanent faults, but blindly applying replication often leads to excessive overhead and higher temperatures. Existing design-time methods typically choose the number of replicas based on worst-case conditions, which can waste resources under normal operation. In this paper, we present RL-TIME, a reinforcement learning-based approach that dynamically decides the number of replicas according to actual system conditions. By considering both the reliability target and a core-level Thermal Safe Power (TSP) constraint at run-time, RL-TIME adapts the replication strategy to avoid unnecessary overhead and overheating. Experimental results show that, compared to state-of-the-art methods, RL-TIME reduces power consumption by 63%, increases schedulability by 53%, and respects TSP 72% more often.
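As a rough illustration of why the replica count can be tied to a reliability target, the sketch below computes the smallest number of replicas that reaches a target under a deliberately simplified model: replicas fail independently, and the task succeeds if at least one replica does. This is an assumption for illustration only; the paper's own model also accounts for aging, DVFS, and fault detection. The function name `min_replicas` and the `max_replicas` cap are hypothetical.

```python
def min_replicas(single_reliability, target_reliability, max_replicas=8):
    """Smallest k with 1 - (1 - r)^k >= target, or None if no k <= max_replicas works.

    Assumes independent replica failures and success if any one replica succeeds.
    """
    if not 0.0 < single_reliability <= 1.0:
        raise ValueError("single_reliability must be in (0, 1]")
    if not 0.0 <= target_reliability < 1.0:
        raise ValueError("target_reliability must be in [0, 1)")
    for k in range(1, max_replicas + 1):
        # Probability that at least one of k independent replicas succeeds.
        achieved = 1.0 - (1.0 - single_reliability) ** k
        if achieved >= target_reliability:
            return k
    return None
```

A design-time method would evaluate such a bound once with worst-case reliability, whereas a run-time approach can re-evaluate it as conditions change and run fewer replicas when the per-replica reliability is higher.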
