Object Proxy Patterns for Accelerating Distributed Applications
J. Gregory Pauloski, Valerie Hayot-Sasson, Logan Ward, Alexander Brace, André Bauer, Kyle Chard, Ian Foster
TL;DR
The paper addresses data-flow bottlenecks in large-scale distributed applications by introducing three high-level proxy-based patterns: distributed futures (ProxyFutures), object streaming (ProxyStream), and ownership, all built on an extended ProxyStore framework. The approach decouples data movement from control flow, enables deployment across execution engines, and automates lifecycle management for distributed objects. Contributions include reference implementations, an evaluation on synthetic benchmarks, and demonstrations on three scientific applications (1000 Genomes, DeepDriveMD, and MOF Generation) that show substantial gains in makespan, latency, throughput, and memory efficiency. The patterns are intended to make data sharing portable, scalable, and efficient across heterogeneous HPC and cloud environments, which in turn can accelerate data-intensive workflows.
Abstract
Workflow and serverless frameworks have empowered new approaches to distributed application design by abstracting compute resources. However, their typically limited or one-size-fits-all support for advanced data flow patterns leaves optimization to the application programmer, and that optimization becomes more difficult as data grow larger. The transparent object proxy, which provides wide-area references that can resolve to data regardless of location, has been demonstrated as an effective low-level building block in such situations. Here we propose three high-level proxy-based programming patterns (distributed futures, streaming, and ownership) that make the power of the proxy pattern usable for more complex and dynamic distributed program structures. We motivate these patterns via careful review of application requirements and describe implementations of each pattern. We evaluate our implementations through a suite of benchmarks and by applying them in three scientific applications, in which we demonstrate substantial improvements in runtime, throughput, and memory usage.
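
To make the idea of a transparent object proxy concrete, the following is a minimal Python sketch, not the ProxyStore API. It assumes a hypothetical in-memory dictionary `STORE` standing in for a remote data store, and the names `LazyProxy`, `fetch`, `make_proxy`, and `consumer` are illustrative only. The proxy holds a factory rather than the data, resolves the data on first use, and then forwards operations to the resolved object, so the consumer code needs no knowledge of where the data live.

```python
from typing import Any, Callable

_UNRESOLVED = object()  # sentinel marking a proxy whose target is not yet loaded


class LazyProxy:
    """Illustrative lazy proxy (hypothetical, not the ProxyStore class)."""

    def __init__(self, factory: Callable[[], Any]) -> None:
        self._factory = factory
        self._target = _UNRESOLVED

    def _resolve(self) -> Any:
        # Call the factory at most once, then cache the result.
        if self._target is _UNRESOLVED:
            self._target = self._factory()
        return self._target

    def __getattr__(self, name: str) -> Any:
        # Invoked only for names not found on the proxy itself,
        # so ordinary attribute access is forwarded to the target.
        return getattr(self._resolve(), name)

    # Special methods are looked up on the type, so forward them explicitly.
    def __len__(self) -> int:
        return len(self._resolve())

    def __getitem__(self, key: Any) -> Any:
        return self._resolve()[key]


# Assumption: a dict stands in for a remote object store.
STORE = {"key-1": [3, 1, 2]}
FETCHES = []  # records each simulated remote fetch


def fetch(key: str) -> Any:
    FETCHES.append(key)
    return STORE[key]


def make_proxy(key: str) -> LazyProxy:
    # The proxy captures only the key; no data is moved until first use.
    return LazyProxy(lambda: fetch(key))


def consumer(data: Any) -> int:
    # Written against a plain sequence; works unchanged with a proxy.
    return sum(data[i] for i in range(len(data)))
```

Passing `make_proxy("key-1")` to `consumer` triggers a single fetch on first access, and later uses of the same proxy reuse the cached target. A production proxy would also need to forward many more special methods and handle serialization, which this sketch omits.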
