Efficient deadlock avoidance for 2D mesh NoCs that use OQ or VOQ routers
Philippos Papaphilippou, Thiem Van Chu
TL;DR
This work addresses deadlocks in 2D mesh NoCs employing VOQ or OQ routers by introducing a bimodal routing scheme that uses a local freedom condition, $F$, to decide when a restricted turn may be taken. The base algorithm can be any routing method, while a fallback turn-model-based route guarantees deadlock freedom when necessary; an adaptation $F'$ simplifies implementation. Two example algorithms, XY/Adaptive and XY/O1-Turn, demonstrate improved performance over traditional turn-restricted methods across synthetic and real traces, with near-fully adaptive behavior and minimal hardware overhead. The approach leverages the inherent turn information in VOQ/OQ topologies, enabling more flexible routing without additional queues or global knowledge, which is especially suitable for FPGA NoCs. Overall, the method provides a practical path to higher NoC throughput and lower latency while preserving deadlock freedom and hardware efficiency.
Abstract
Networks-on-chip (NoCs) are a widely used approach for scaling multi-core designs to many-core designs, as well as for interconnecting other vital system-on-chip (SoC) components. Each entity in a 2D mesh-based NoC has a router responsible for forwarding packets between the dimensions as well as to and from the entity itself; the router is essentially a 5-port switch. With respect to the routing algorithm, there are important trade-offs between routing performance and the efficiency of overcoming potential deadlocks. Common deadlock avoidance techniques, including the turn model, usually restrict certain paths a packet can take, at the cost of a higher probability of network congestion. In contrast, deadlock resolution techniques, as well as some avoidance schemes, provide more path flexibility at the expense of hardware complexity, for example by incorporating (or assuming) dedicated buffers. This paper provides a deadlock avoidance algorithm for NoC routers based on output queues (OQs) or virtual output queues (VOQs), with a focus on their use on field-programmable gate arrays (FPGAs). The proposed approach features fewer path restrictions than common techniques, and can use an existing routing algorithm as its baseline, whether or not that algorithm is deadlock-free. It requires no modification to the queueing topology, and the required logic is minimal. Our algorithm approaches the performance of fully adaptive algorithms while maintaining deadlock freedom.
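
To make the path restriction of turn-model routing concrete, the following minimal Python sketch implements plain XY (dimension-ordered) routing, a standard turn-restricted scheme, and compares the single route it permits with the number of minimal routes available in the mesh. It is an illustration only and is not the algorithm proposed in this paper. The function names (`xy_route`, `turns`, `minimal_path_count`) and the coordinate convention (x grows eastward, y grows northward) are assumptions made for this example.

```python
from math import comb
from typing import List, Tuple

Coord = Tuple[int, int]  # (x, y); assumed convention: +x is east, +y is north


def xy_route(src: Coord, dst: Coord) -> List[str]:
    """Hop directions of the XY route: finish all X hops, then all Y hops."""
    (x, y), (dx, dy) = src, dst
    hops: List[str] = []
    while x != dx:
        step = 1 if dx > x else -1
        hops.append("E" if step == 1 else "W")
        x += step
    while y != dy:
        step = 1 if dy > y else -1
        hops.append("N" if step == 1 else "S")
        y += step
    return hops


def turns(hops: List[str]) -> List[Tuple[str, str]]:
    """Consecutive hop pairs where the direction changes."""
    return [(a, b) for a, b in zip(hops, hops[1:]) if a != b]


def minimal_path_count(src: Coord, dst: Coord) -> int:
    """Number of shortest paths between two nodes of an unrestricted 2D mesh."""
    nx, ny = abs(dst[0] - src[0]), abs(dst[1] - src[1])
    return comb(nx + ny, nx)


if __name__ == "__main__":
    route = xy_route((0, 0), (2, 3))
    print(route)                               # ['E', 'E', 'N', 'N', 'N']
    print(turns(route))                        # [('E', 'N')]
    print(minimal_path_count((0, 0), (2, 3)))  # 10
```

In this example XY routing uses one of the ten minimal paths and only ever turns from the X dimension into the Y dimension. This is the kind of restriction that trades path diversity for deadlock freedom.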
