Structures and Techniques for Streaming Dynamic Graph Processing on Decentralized Message-Driven Systems

Bibrak Qamar Chandio, Maciej Brodowicz, Thomas Sterling

TL;DR

The paper tackles the challenge of dynamic, fine-grained graph processing on decentralized, memory-driven hardware. It presents a diffusive, asynchronous programming model built on AM-CCA with Rhizomes, leveraging actions, LCOs, and a Recursively Parallel Vertex Object (RPVO) to stream edges and locally update vertex state without recomputing from scratch. Key contributions include the runtime supporting dynamic task creation, a scalable vertex-centric data structure, and language constructs for action-based, data-local computation, demonstrated via streaming BFS on simulated hardware. The work shows a path toward scalable, real-time dynamic graph analytics on decentralized architectures, with potential extensions to other graph algorithms like Triangle Counting and Jaccard Coefficient.

Abstract

The paper presents structures and techniques aimed at co-designing scalable, asynchronous, and decentralized dynamic graph processing for fine-grain memory-driven architectures. It uses asynchronous active messages, in the form of actions that send "work to data", with a programming and execution model that allows spawning tasks from within the data-parallelism. This is combined with a data structure that parallelizes a vertex object across many scratchpad memory-coupled cores and yet provides a single programming abstraction to the data object. The graph is constructed by streaming new edges using novel message delivery mechanisms and language constructs that work together to pass data and control using the abstractions of actions, continuations, and local control objects (LCOs) such as futures. This results in very fine-grain updates to a hierarchical dynamic vertex data structure, which subsequently trigger a user application action to update the results of any previous computation without recomputing from scratch. In our experiments we use BFS to demonstrate our concept design, and document challenges and opportunities.
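
The idea of updating BFS results as edges stream in, rather than recomputing from scratch, can be illustrated with a small sequential sketch. This is not the paper's AM-CCA runtime or its action language: the class `StreamingBFS`, its methods, and the single in-process queue standing in for message delivery between compute cells are all assumptions made for illustration.

```python
from collections import defaultdict, deque
import math


class StreamingBFS:
    """Hypothetical sequential emulation of action-driven streaming BFS."""

    def __init__(self, source):
        self.adj = defaultdict(list)                  # vertex -> out-neighbours
        self.level = defaultdict(lambda: math.inf)    # unreached vertices are at infinity
        self.level[source] = 0
        self.actions = deque()                        # stand-in for in-flight action messages

    def insert_edge(self, u, v):
        """Stream in a directed edge u -> v, then repair BFS levels locally."""
        self.adj[u].append(v)
        # Only send work to v if the new edge can actually shorten its level.
        if self.level[u] + 1 < self.level[v]:
            self.actions.append((v, self.level[u] + 1))
        self._drain()

    def _drain(self):
        """Process pending actions until the diffusion dies out."""
        while self.actions:
            v, lvl = self.actions.popleft()
            if lvl < self.level[v]:                   # predicate: is this an improvement?
                self.level[v] = lvl
                for w in self.adj[v]:                 # diffuse to neighbours
                    self.actions.append((w, lvl + 1))
```

Each inserted edge touches only the vertices whose level can improve, so an edge between two unreached vertices generates no work at all until a later edge connects them to the source.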

Paper Structure (7 sections, 9 figures, 2 tables)

Figures (9)

  • Figure 1: Vertex structures: a) The logical vertex, b) same vertex stored in a hierarchical data structure called Recursively Parallel Vertex Object (RPVO).
  • Figure 2: A $5\times6$ AM-CCA chip shown as an exemplar. Compute Cells containing local memory along with computing logic are tessellated in a mesh network.
  • Figure 3: Asynchronous control transfer. The runtime sends a system action, allocate, configured with a return trigger action, to a remote compute cell. The remote compute cell allocates memory. The memory address is sent back in the form of the trigger action, which is targeted at the originating vertex at the source compute cell (CC). Once the future LCO is set, the runtime resumes the prior action state.
  • Figure 4: ghost : (Future Pointer), a future of pointer type, shown as an exemplar of the internal state of a future object as it is being set. It starts in the null state. The first insert-edge-action (see Listing \ref{lst:cca-insert-edge}) puts it in the pending state while it waits to be set. Actions that depend on this future then arrive, and their related tasks are enqueued as closure tasks. A continuation from a remote compute cell, returned in the form of an action, sets the future with the address of the newly allocated remote memory space. The dependent tasks are then scheduled, and the future queue is emptied.
  • Figure 5: Vertex object allocation policy: (a) Localize ghost vertices in Compute Cells nearby, and (b) No regard to locality of ghost vertices.
  • ...and 4 more figures
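
The future life cycle described in the Figure 4 caption (null, then pending, then set, with dependent tasks queued in between) can be sketched as a small state machine. This is a minimal illustration only: `FutureLCO`, `mark_pending`, `when_set`, and `set` are hypothetical names, not the paper's runtime interface, and the "address" is just an integer.

```python
from enum import Enum


class State(Enum):
    NULL = 0      # nothing has requested the value yet
    PENDING = 1   # a request (e.g. a remote allocation) is in flight
    SET = 2       # the value has arrived


class FutureLCO:
    """Hypothetical future-of-pointer with a queue of dependent closures."""

    def __init__(self):
        self.state = State.NULL
        self.value = None
        self.queue = []           # closures waiting for the value

    def mark_pending(self):
        """Called by the first action that triggers the request."""
        if self.state is State.NULL:
            self.state = State.PENDING

    def when_set(self, task):
        """Run task(value) now if set, otherwise enqueue it as a closure."""
        if self.state is State.SET:
            task(self.value)
        else:
            self.queue.append(task)

    def set(self, value):
        """Continuation arrives: store the value and schedule dependents."""
        if self.state is State.SET:
            raise RuntimeError("future already set")
        self.value = value
        self.state = State.SET
        pending, self.queue = self.queue, []   # empty the future queue
        for task in pending:
            task(value)
```

In this sketch, actions that arrive while the future is pending are parked as closures and run in arrival order once the continuation delivers the address; actions arriving afterwards run immediately.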