Functional Meaning for Parallel Streaming
Nick Rioux, Steve Zdancewic
TL;DR
This work presents $λ_\vee$, an untyped, call-by-value core language for deterministic parallelism built on streaming, growable values ordered by a semantic streaming order. It blends functional programming with Datalog-style monotone reasoning via a parallel join operator and threshold queries, yielding a deterministic-by-construction model for distributed computation. The authors provide both an approximate operational semantics and a filter-model denotational semantics, and prove adequacy via a novel logical relation that ties the two views to domain-theoretic models. They discuss practical implementation considerations, including pipeline parallelism and memoization, and potential extensions such as frozen and versioned values that broaden expressivity while preserving monotonicity. Overall, the paper integrates domain theory with the foundations of distributed computation to offer a principled, deterministic framework for parallel functional programming with rich streaming data types.
Abstract
Nondeterminism introduced by race conditions and message reorderings makes parallel and distributed programming hard. Nevertheless, promising approaches such as LVars and CRDTs address this problem by introducing a partial order structure on shared state that describes how the state evolves over time. Monotone programs that respect the order are deterministic. Datalog-inspired languages incorporate this idea of monotonicity in a first-class way, but they are not general-purpose. We would like parallel and distributed languages to be as natural to use as any functional language, without sacrificing expressivity, and with a formal basis of study as appealing as the lambda calculus. This paper presents $λ_\vee$, a core language for deterministic parallelism that embodies the ideas above. In $λ_\vee$, values may increase over time according to a streaming order and all computations are monotone with respect to that order. The streaming order coincides with the approximation order found in Scott semantics and so unifies the foundations of functional programming with the foundations of deterministic distributed computation. The resulting lambda calculus has a computationally adequate model rooted in domain theory. It integrates the compositionality and power of abstraction characteristic of functional programming with the declarative nature of Datalog. This version of the paper includes extended exposition and appendices with proofs.
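
The claim that monotone programs over a partially ordered state are deterministic can be illustrated with a small sketch. It is not $λ_\vee$ syntax and not the authors' semantics; it is a Python illustration of the LVars/CRDT-style idea the abstract refers to, assuming the state is a grow-only set ordered by inclusion. The names `join`, `threshold`, and `answers_are_stable` are hypothetical and chosen only for this example.

```python
from functools import reduce
from itertools import permutations

def join(a: frozenset, b: frozenset) -> frozenset:
    """Least upper bound in the subset order (assumed state order): set union."""
    return a | b

def threshold(state: frozenset, needed: frozenset):
    """Monotone query: answers True once `state` is at or above `needed`,
    and gives no answer (None) before that. It never answers False."""
    return True if needed <= state else None

# Three writes that may be delivered in any order.
writes = [frozenset({1}), frozenset({2}), frozenset({3})]

# Joining the writes in every possible delivery order yields one final state.
finals = {reduce(join, order, frozenset()) for order in permutations(writes)}

def answers_are_stable(order, needed):
    """Check that once the threshold query answers, later writes never retract it."""
    state, seen = frozenset(), False
    for w in order:
        state = join(state, w)
        answered = threshold(state, needed) is True
        if seen and not answered:
            return False
        seen = seen or answered
    return True
```

Because `join` is commutative, associative, and idempotent, reordering or duplicating writes cannot change the final state, and because `threshold` only ever moves from "no answer" to `True`, an observer cannot see a result that a later write invalidates.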
