Snow: Self-organizing Broadcast Protocol for Cloud

Chengkai Tong

TL;DR

Snow introduces a self-organizing, application-level broadcast protocol for cloud data centers that avoids IP multicast and fixed-root trees. It builds a dynamic, multi-way balanced topology from a ring-ordered membership view, enabling $O(\log n)$ dissemination hops. It integrates reliable messages for critical deliveries and membership maintenance, plus an optional node-coloring optimization to boost convergence and mitigate stragglers. Experimental results show higher reliability and lower overhead under churn compared with Gossip and Plumtree, with an open-source implementation provided for further research.

Abstract

In large-scale distributed applications, efficient and reliable broadcast protocols are essential for node communication. Tree-based broadcast lacks flexibility and may suffer performance degradation or even broadcast failure when cluster membership changes. Gossip-based broadcast incurs high bandwidth overhead and provides only probabilistic delivery guarantees. In tree-based broadcast, when an internal node leaves, its child nodes must reconnect to a new parent. This process may introduce instability, leading to message duplication and increased transmission latency. Because node departures and arrivals are common in cloud environments, tree-based broadcast suffers frequent performance degradation there. This paper introduces Snow, a self-organizing broadcast protocol designed for cloud environments. Rather than maintaining a fixed tree, Snow dynamically sends or forwards messages based on each node's membership view, ultimately forming a broadcast structure resembling a multi-way balanced tree (the heights of the leaf nodes differ by at most 1). Our experimental results show that Snow maintains message delivery reliability and latency guarantees under node churn while keeping overhead low by not sending unnecessary redundant messages.
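
To make the idea of a tree that emerges from a ring-ordered membership view concrete, the following is a minimal sketch and not the paper's algorithm. It assumes every node holds the same sorted membership list, that the sender is responsible for the whole ring except itself, and that each node splits its region into at most k contiguous chunks, forwards to the first node of each chunk, and delegates the rest of that chunk to it. The names split_region and broadcast, and the splitting rule itself, are hypothetical and chosen only for illustration.

```python
from typing import Dict, List


def split_region(region: List[str], k: int) -> List[List[str]]:
    """Split a region into at most k contiguous, non-empty, nearly equal chunks."""
    if not region:
        return []
    parts = min(k, len(region))
    base, extra = divmod(len(region), parts)
    chunks, start = [], 0
    for i in range(parts):
        size = base + (1 if i < extra else 0)  # first `extra` chunks get one more node
        chunks.append(region[start:start + size])
        start += size
    return chunks


def broadcast(members: List[str], sender: str, k: int) -> Dict[str, int]:
    """Simulate one broadcast; return the hop at which each node receives it."""
    ring = sorted(members)                 # assumed: identical ring order on all nodes
    i = ring.index(sender)
    region = ring[i + 1:] + ring[:i]       # everyone except the sender, in ring order
    hops = {sender: 0}
    frontier = [(sender, region)]
    while frontier:
        next_frontier = []
        for node, reg in frontier:
            for chunk in split_region(reg, k):
                child, rest = chunk[0], chunk[1:]
                hops[child] = hops[node] + 1       # child receives exactly once
                next_frontier.append((child, rest))  # child now owns the rest of the chunk
        frontier = next_frontier
    return hops


if __name__ == "__main__":
    nodes = [f"n{i}" for i in range(10)]
    result = broadcast(nodes, "n0", k=2)
    print(len(result), max(result.values()))
```

Under these assumptions, no tree is stored anywhere: each node derives its children from the region it was handed, and each region shrinks by roughly a factor of k per hop, which is where the logarithmic hop count comes from. With ten nodes and k=2, every node is reached within three hops and each node receives the message once.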

Paper Structure

This paper contains 23 sections, 10 equations, 7 figures, 1 table.

Figures (7)

  • Figure 1: In (A), the left region is drawn as a blue line and the right region as a red line. (B) shows how each node selects other nodes when $n=10$, $k=2$.
  • Figure 2: The running process of 10 nodes when $k=2$.
  • Figure 3: Reliable message. The black lines are normal message broadcasts, and the red lines are acknowledgments of the message.
  • Figure 4: The solid lines are the Primary Tree, and the dashed lines are the Secondary Tree. The internal nodes of each tree are leaf nodes of the other. The region each node is responsible for in the Primary Tree is drawn above the node, while the region each node is responsible for in the Secondary Tree is drawn below the node.
  • Figure 5: The x-axis denotes the sequence number of messages, where larger values correspond to messages sent at a later time.
  • ...and 2 more figures