LOCO: Rethinking Objects for Network Memory

George Hodgkins, Mark Madler, Joseph Izraelevitz

TL;DR

LOCO presents a channel-based programming model for networks of RDMA-enabled machines, treating the memory fabric as one large shared memory with weak coherence. It introduces channel objects that store their state across multiple nodes, so that composable primitives (e.g., locks, barriers, maps) can be built from base components such as shared_region, owned_var, atomic_var, and SST. A memory-consistency framework based on fences and ack_key tracking supports ordering across channels, culminating in a provably linearizable key-value store. Empirical results show that LOCO achieves performance comparable to custom RDMA systems with a simpler, more modular programming model, which benefits irregular workloads such as transactional processing and distributed data stores.

Abstract

In this work, we explore an object-based programming model for filling the space between shared memory and distributed systems programming. We argue that the natural representation for resources distributed across a memory network (e.g. RDMA or CXL) is the traditional shared memory object. This concurrent object (which we call a "channel" object) exports traditional methods, but, especially in an incoherent or uncacheable memory network, stores its state in a distributed fashion across all participating nodes. In a sense, the channel object's state is stored "across the network". Based on this philosophy, we introduce the Library of Channel Objects (LOCO), a library for building multi-node objects on RDMA. Channel objects are composable and designed for both the strong locality effects and the weak consistency of RDMA. Unlike prior work, channel objects do not hide memory complexity, instead relying on the programmer to use NUMA-like techniques to explicitly manage each object. As a consequence, our channel objects have performance similar to custom RDMA systems (e.g. distributed maps), but with a far simpler programming model. Our distributed map channel has better read and comparable write performance to a state-of-the-art custom RDMA solution, using well-encapsulated and reusable primitives.

Paper Structure

This paper contains 29 sections, 3 theorems, and 7 figures.

Key Result

Lemma 1

All s, s, and s for a given key form a total modification order which respects the real-time ordering of the operations.

Figures (7)

  • Figure 1: LOCO barrier code
  • Figure 2: An SST with three participants. Arrows represent , pointing from the writer to readers.
  • Figure 3: Read and write operations in the .
  • Figure 4: Throughput of single-lock and transactions in OpenMPI and LOCO.
  • Figure 5: Throughput comparison of RDMA key-value stores.
  • ...and 2 more figures

Theorems & Definitions (3)

  • Lemma 1
  • Lemma 2
  • Theorem 1