Distributed and Autonomic Minimum Spanning Trees
Luiz A. Rodrigues, Elias P. Duarte, Luciana Arantes
TL;DR
The paper presents an autonomic spanning-tree construction on the VCube virtual topology to connect all processes with bounded degree and depth, enabling scalable, fault-tolerant broadcasting. Two tree-based broadcast algorithms (best-effort and reliable) are designed to operate on this structure, with mechanisms to autonomically rebuild the tree after crashes. Simulation results show improved scalability over traditional all-to-all approaches, especially under faults, though trade-offs exist in latency and message overhead depending on the scenario. The work highlights the practicality of hierarchical, crash-aware trees for efficient distributed communication in large-scale systems.
Abstract
The most common strategy for enabling a process in a distributed system to broadcast a message is one-to-all communication. However, this approach is not scalable, as it places a heavy load on the sender. This work presents an autonomic algorithm that enables the $n$ processes in a distributed system to build and maintain a spanning tree connecting themselves. In this context, processes are the vertices of the spanning tree. By definition, a spanning tree connects all processes without forming cycles. The proposed algorithm ensures that every vertex in the spanning tree has an in-degree of at most $\log_2 n$ and that the depth of the tree is also at most $\log_2 n$. When all processes are correct, the degree of each process is exactly $\log_2 n$. A spanning tree is dynamically created from any source process and is transparently reconstructed as processes fail or recover. Even if up to $n-1$ processes fail, the correct processes remain connected through a scalable, functioning spanning tree. To build and maintain the tree, processes use the VCube virtual topology, which also serves as a failure detector. Two broadcast algorithms based on the autonomic spanning tree algorithm are presented: one for best-effort broadcast and one for reliable broadcast. Simulation results are provided, including comparisons with other alternatives.
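To make the tree construction concrete, the following is a minimal sketch in Python, not the authors' implementation. It assumes the cluster function commonly used in the VCube literature, in which cluster $s$ of process $i$ starts at $i \oplus 2^{s-1}$ and then lists that process's clusters $1, \ldots, s-1$. It also assumes that $n$ is a power of two, that the source is correct, and that the set of crashed processes is known in advance (in the real system it would come from the failure detector). The names `cluster`, `first_correct`, and `build_tree` are hypothetical and chosen only for illustration.

```python
def cluster(i, s):
    """Ordered members of cluster s (s >= 1) of process i.

    Assumed VCube definition: start at i XOR 2^(s-1), then append
    that process's own clusters 1..s-1.
    """
    j = i ^ (1 << (s - 1))
    members = [j]
    for k in range(1, s):
        members.extend(cluster(j, k))
    return members


def first_correct(i, s, crashed):
    """First process in cluster s of i that is not crashed, or None."""
    for j in cluster(i, s):
        if j not in crashed:
            return j
    return None


def build_tree(source, n, crashed=frozenset()):
    """Return {process: [children]} for a tree rooted at `source`.

    Assumes n is a power of two and `source` is not crashed.
    """
    dims = n.bit_length() - 1  # log2(n) for a power of two
    children = {}
    # Each entry: (process, highest cluster index it is responsible for).
    pending = [(source, dims)]
    while pending:
        i, top = pending.pop()
        children[i] = []
        for s in range(1, top + 1):
            j = first_correct(i, s, crashed)
            if j is not None:
                children[i].append(j)
                # j sits in cluster s of i, so it covers its clusters 1..s-1.
                pending.append((j, s - 1))
    return children


if __name__ == "__main__":
    print(build_tree(0, 8))               # all processes correct
    print(build_tree(0, 8, crashed={4}))  # process 4 has crashed
```

With $n = 8$ and no crashes, the source 0 gets $\log_2 8 = 3$ children (1, 2, and 4). If process 4 crashes, the sketch picks the next correct process in the same cluster (5) as the child instead, so the remaining correct processes stay connected without changing the depth bound.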
