Improved Interactive Protocol for Synchronizing From Deletions
Haolun Ni, Lev Tauz, Ryan Gabrys, Lara Dolecek
TL;DR
This work lowers the communication cost of synchronizing two files related by deletions. It embeds multi-deletion correction codes into a baseline interactive protocol that previously relied on single-deletion correction, and it introduces adaptive segmenting. The proposed scheme has three modules (Matching, Deletion Recovery, and Error Correction) and tunes segment lengths to reduce the number of transmitted bits while remaining computable in polynomial time. The authors derive a generalized upper bound on the expected number of transmitted bits and validate the improvements experimentally, showing reduced redundancy compared to the baseline. Possible extensions include broader edit models and more efficient multi-deletion codes.
Abstract
Data synchronization is a fundamental problem with applications in diverse fields such as cloud storage, genomics, and distributed systems. This paper addresses the challenge of synchronizing two files, one of which is a subsequence of the other and related through a constant rate of deletions, using an improved communication protocol. Building upon prior work, we integrate advanced multi-deletion correction codes into an existing baseline protocol, which previously relied on single-deletion correction. Our proposed protocol reduces communication cost by leveraging more general partitioning techniques as well as multi-deletion error correction. We derive a generalized upper bound on the expected number of transmitted bits, applicable to a broad class of deletion correction codes. Experimental results demonstrate that our approach outperforms the baseline in communication cost. These findings establish the efficacy of the improved protocol in achieving low-redundancy synchronization in scenarios where deletion errors occur.
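To make the problem setting concrete, the following is a minimal sketch of the file model described above: one file is obtained from the other by deletions only, so it is a subsequence of the original. The sketch assumes each symbol is deleted independently with a fixed probability, which is one common reading of "a constant rate of deletions" and is not confirmed by the text. The names `delete_iid`, `is_subsequence`, and `beta` are hypothetical and chosen only for illustration; this is not the protocol itself.

```python
import random

def delete_iid(x, beta, rng):
    """Drop each symbol of x independently with probability beta.

    Assumption: deletions are i.i.d.; the abstract only states a
    constant deletion rate.
    """
    return [s for s in x if rng.random() >= beta]

def is_subsequence(y, x):
    """Return True if y can be obtained from x by deletions only."""
    it = iter(x)
    # `s in it` advances the iterator until s is found, so matches
    # are forced to occur in order.
    return all(s in it for s in y)

rng = random.Random(0)                        # fixed seed for reproducibility
x = [rng.randint(0, 1) for _ in range(1000)]  # original binary file
y = delete_iid(x, beta=0.01, rng=rng)         # file after deletions
```

A synchronization protocol would let the party holding `y` recover `x` while exchanging as few bits as possible. The subsequence check here only verifies that the pair `(x, y)` matches the stated model.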
