Development of an Energy-Efficient and Real-Time Data Movement Strategy for Next-Generation Heterogeneous Mixed-Criticality Systems
Thomas Benz
TL;DR
The work tackles memory-system bottlenecks in increasingly heterogeneous mixed-criticality systems by introducing iDMA, a modular, protocol-agnostic DMA engine, and AXI-REALM, a lightweight interconnect extension that enforces real-time guarantees and isolation. The approach spans multiple system levels, from microarchitectural DMA back-ends and tensor-aware front-ends to Linux-driven, descriptor-based interfaces and virtual-memory-friendly mid-ends. Across silicon demonstrations (Linux SoCs, Occamy chiplets, and Carfield platforms), the framework delivers near-ideal bus utilization, significant reductions in memory interference, and robust real-time behavior with modest area overhead. Together, iDMA and AXI-REALM enable the scalable, energy-efficient data movement and predictable interconnect behavior essential for next-generation zonal, domain-aware cyber-physical systems in automotive, robotics, and aerospace.
Abstract
Industrial domains such as automotive, robotics, and aerospace are rapidly evolving to satisfy the increasing demand for machine-learning-driven Autonomy, Connectivity, Electrification, and Shared mobility (ACES). This paradigm shift significantly increases the requirements for onboard computing performance and high-performance communication infrastructure. At the same time, Moore's Law and Dennard Scaling are grinding to a halt, which drives computing systems toward larger scales and higher levels of heterogeneity and specialization through application-specific hardware accelerators instead of relying on technology scaling alone. Approaching ACES requires this substantial amount of compute at increasingly high energy efficiency, since most use cases are fundamentally resource-bound. The increase in compute performance and heterogeneity goes hand in hand with a growing demand for high memory bandwidth and capacity: the driving applications grow in complexity, operate on huge and progressively irregular data sets, and require a steady influx of sensor data, which increases pressure on both on-chip and off-chip interconnect systems. Further, ACES combines time-critical real-time tasks with general-purpose compute tasks on the same physical platform, sharing communication, storage, and micro-architectural resources. These heterogeneous mixed-criticality systems (MCSs) place additional pressure on the interconnect, demanding minimal contention between the different criticality levels to sustain a high degree of predictability. Fulfilling the performance and energy-efficiency requirements across a wide range of industrial applications requires careful co-design of the memory system with the use cases as well as the compute units and accelerators.
