Table of Contents
Fetching ...

Argo: An efficient verification framework for distributed in-network computing

Mingyuan Song, Huan Shen, Jinghui Jiang, Qiang Su, Qingyu Song, Lu Tang, Wanjian Feng, Fei Yuan, Qiao Xiang, Jiwu Shu

TL;DR

Argo tackles the challenge of verifying distributed in-network programs by modeling each switch pipeline as a reactive actor and all inter-device interactions as CSP-based message passing. It introduces a unified verification intent language to express topology, local contracts, and global invariants, and a P4-aware semantic pruning technique to curb state-space explosion. The framework is implemented as a prototype and evaluated on five real-world distributed systems and four single-device programs, uncovering 10 bugs and delivering substantial improvements in verification time and memory usage compared with prior single-device verifiers. This work enables scalable, reliable verification for distributed P4 deployments in data centers, improving robustness of in-network computing workloads.

Abstract

Distributed in-network programs are increasingly deployed in data centers for their performance benefits, but shifting application logic to switches also enlarges the failure domain. Ensuring their correctness before deployment is thus critical for reliability. While prior verification frameworks can efficiently detect bugs for programs running on a single switch, they overlook the common interactive behaviors in distributed settings, thereby missing related bugs that can cause state inconsistencies and system failures. This paper presents Procurator, a verification framework that efficiently captures interactive behaviors in distributed in-network programs. Procurator introduces a formal model combining the actor paradigm with Communicating Sequential Processes (CSP), translating pipeline execution into reactive, event-driven actors and unifying their interactions as message passing. To support flexible specification of distributed properties, it provides a unified intent language. Additionally, it incorporates a semantic-aware state pruner to reduce verification complexity, thus ensuring system scalability. Evaluation results show that Procurator efficiently uncovers 10 distinct bugs caused by interactive behaviors across five real-world in-network systems. It also reduces verification time by up to 913.2x and memory consumption by up to 1.9x compared to the state-of-the-art verifier.

Argo: An efficient verification framework for distributed in-network computing

TL;DR

Argo tackles the challenge of verifying distributed in-network programs by modeling each switch pipeline as a reactive actor and all inter-device interactions as CSP-based message passing. It introduces a unified verification intent language to express topology, local contracts, and global invariants, and a P4-aware semantic pruning technique to curb state-space explosion. The framework is implemented as a prototype and evaluated on five real-world distributed systems and four single-device programs, uncovering 10 bugs and delivering substantial improvements in verification time and memory usage compared with prior single-device verifiers. This work enables scalable, reliable verification for distributed P4 deployments in data centers, improving robustness of in-network computing workloads.

Abstract

Distributed in-network programs are increasingly deployed in data centers for their performance benefits, but shifting application logic to switches also enlarges the failure domain. Ensuring their correctness before deployment is thus critical for reliability. While prior verification frameworks can efficiently detect bugs for programs running on a single switch, they overlook the common interactive behaviors in distributed settings, thereby missing related bugs that can cause state inconsistencies and system failures. This paper presents Procurator, a verification framework that efficiently captures interactive behaviors in distributed in-network programs. Procurator introduces a formal model combining the actor paradigm with Communicating Sequential Processes (CSP), translating pipeline execution into reactive, event-driven actors and unifying their interactions as message passing. To support flexible specification of distributed properties, it provides a unified intent language. Additionally, it incorporates a semantic-aware state pruner to reduce verification complexity, thus ensuring system scalability. Evaluation results show that Procurator efficiently uncovers 10 distinct bugs caused by interactive behaviors across five real-world in-network systems. It also reduces verification time by up to 913.2x and memory consumption by up to 1.9x compared to the state-of-the-art verifier.

Paper Structure

This paper contains 33 sections, 15 equations, 9 figures, 3 tables, 2 algorithms.

Figures (9)

  • Figure 1: An example of Replica Total Order violation.
  • Figure 2: Code snippet for the key functionality of NetChain netchain.
  • Figure 3: An example of a distributed P4 system bug.
  • Figure 4: The workflow of Argo. Argo accepts the specification and in-network programs as input. props denotes the verification intent-related properties, and dataplane prog. are dataplane programs. IR and OptIR are the original and optimized intermediate representations, respectively.
  • Figure 5: An example of a P4 backward dependency.
  • ...and 4 more figures