A Chunked-Object Pattern for Multi-Region Large Payload Storage in Managed NoSQL Databases
Manideep Reddy Chinthareddy
TL;DR
Managed NoSQL databases impose strict per-item size limits, so large payloads are usually offloaded to object storage with only a pointer kept in the database. In multi-region deployments the database and the object store replicate independently, which creates a dangling-pointer hazard. The paper introduces the chunked-object pattern, which stores a large entity as an ordered set of chunks plus a metadata barrier that ensures atomic visibility within a single replication stream, and provides a DynamoDB-based reference implementation. Empirical evaluation on production telemetry shows that the approach eliminates the dangling-pointer hazard and significantly reduces p99 cross-region time-to-consistency compared with the traditional S3-pointer pattern. The pattern generalizes to other key-value stores and benefits active-active, multi-region deployments by keeping data and metadata within one replication domain.
Abstract
Many managed key-value and NoSQL databases (such as Amazon DynamoDB, Azure Cosmos DB, and Google Cloud Firestore) enforce strict maximum item sizes (e.g., 400 KB in DynamoDB). This constraint poses significant architectural challenges for applications that require low-latency, multi-region access to objects exceeding these limits. The standard industry recommendation is to offload payloads to object storage (e.g., Amazon S3) and retain a pointer in the database. Although cost-efficient, this "pointer pattern" adds network overhead and exposes applications to non-deterministic replication lag between the database and the object store, creating race conditions in active-active architectures. This paper presents a "chunked-object" pattern that persists large logical entities as sets of ordered chunks within the database itself. We precisely define the pattern and provide a reference implementation using Amazon DynamoDB Global Tables. The design generalizes to any key-value store with per-item size limits and multi-region replication. We evaluate the approach using telemetry from a production system processing over 200,000 transactions per hour. Results demonstrate that, by keeping data and metadata within a single consistency domain, the chunked-object pattern eliminates cross-system replication lag hazards and reduces p99 cross-region time-to-consistency for 1 MB payloads.
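To make the chunked-object idea concrete, the following is a minimal sketch rather than the paper's reference implementation. It uses an in-memory dictionary as a stand-in for a table with a partition key and sort key. The names `put_object`, `get_object`, the `CHUNK#`/`META` sort-key layout, the 350 KB chunk size, and the SHA-256 integrity check are all illustrative assumptions. Writing the metadata item last is one plausible reading of the "metadata barrier": a reader treats the object as visible only once the metadata item and every chunk it lists are present.

```python
import hashlib
from typing import Dict, List, Optional, Tuple

# Assumed chunk size: leaves headroom under DynamoDB's 400 KB item limit
# for keys and attribute names. The paper may choose a different value.
CHUNK_SIZE = 350 * 1024

# Hypothetical stand-in for a table: (partition key, sort key) -> item.
Store = Dict[Tuple[str, str], dict]


def split_chunks(payload: bytes, chunk_size: int = CHUNK_SIZE) -> List[bytes]:
    """Split a payload into ordered chunks of at most chunk_size bytes."""
    return [payload[i:i + chunk_size] for i in range(0, len(payload), chunk_size)]


def put_object(store: Store, object_id: str, payload: bytes,
               chunk_size: int = CHUNK_SIZE) -> None:
    """Write all chunks first, then the metadata item (the assumed barrier)."""
    chunks = split_chunks(payload, chunk_size)
    for index, chunk in enumerate(chunks):
        store[(object_id, f"CHUNK#{index:06d}")] = {"data": chunk}
    # Metadata goes last so readers never see it before the local chunk writes.
    store[(object_id, "META")] = {
        "chunk_count": len(chunks),
        "sha256": hashlib.sha256(payload).hexdigest(),
    }


def get_object(store: Store, object_id: str) -> Optional[bytes]:
    """Return the payload, or None if the object is not yet fully visible."""
    meta = store.get((object_id, "META"))
    if meta is None:
        return None
    parts = []
    for index in range(meta["chunk_count"]):
        item = store.get((object_id, f"CHUNK#{index:06d}"))
        if item is None:
            return None  # a chunk has not arrived yet; caller may retry
        parts.append(item["data"])
    payload = b"".join(parts)
    if hashlib.sha256(payload).hexdigest() != meta["sha256"]:
        return None  # chunks do not match the metadata (e.g., mixed versions)
    return payload
```

In this sketch a 1 MiB payload is stored as three chunk items plus one metadata item, all under the same partition key, so they would travel through a single replication stream. Overwrites, versioning, and garbage collection of stale chunks are deliberately left out.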
