Exploring the Design Space for Message-Driven Systems for Dynamic Graph Processing using CCA
Bibrak Qamar Chandio, Maciej Brodowicz, Thomas Sterling
TL;DR
This work argues that irregular, dynamic graph workloads outstrip conventional von Neumann architectures and proposes the Continuum Computer Architecture (CCA), a memory-centric, message-driven model with a global address space to unlock fine-grain parallelism. It outlines a hardware design space built from tessellated Compute Cells (CCs) forming a memory–compute–communication continuum and introduces batched dynamic graph processing via actions and Recursively Parallel Vertex Objects (RPVOs). The paper combines theoretical and systems synthesis with hardware-space exploration, detailing shape, memory, and communication trade-offs, and demonstrates batched dynamic breadth-first search (BFS) on a CCASimulator, highlighting how dynamic data movement and asynchronous execution can improve performance on dynamic graphs. It also maps concrete future directions (reducing network-on-chip (NoC) diameter, adaptive routing, and wafer-scale deployments) that could realize scalable non-von Neumann accelerators for graph-centric AI workloads.
Abstract
Computer systems that have been successfully deployed for dense, regular workloads fall short of achieving scalability and efficiency when applied to irregular and dynamic graph applications. Conventional computing systems rely heavily on static, regular, numerically intensive computations, whereas parallel graph applications executed on High Performance Computing systems exhibit little spatial or temporal locality and are fine-grained and memory intensive. With the strong interest in AI, which depends on these very different use cases, combined with the end of Moore's Law at nanoscale, dramatic alternatives in architecture and underlying execution models are required. This paper identifies an innovative non-von Neumann architecture, Continuum Computer Architecture (CCA), that redefines the nature of computing structures and computational methods to deliver a new generation of highly parallel hardware architecture. CCA represents a genus of highly parallel architectures that, while varying in specific quantities (e.g., memory blocks), share multiple attributes not found in typical von Neumann machines. Among these are memory-centric components, message-driven asynchronous flow control, and lightweight out-of-order execution across a global name space. Together, these non-von Neumann architectural properties, guided by a new execution model, offer a path for extending beyond the von Neumann model. This paper documents a series of interrelated experiments that together establish future directions for next-generation non-von Neumann architectures, especially for graph processing.
