Half-Marker Codes for Deletion Channels with Applications in DNA Storage
Javad Haghighat, Tolga M. Duman
TL;DR
This paper introduces half-marker codes, a variant of marker codes for synchronization in DNA storage over insertion, deletion, and substitution (IDS) channels, in which one of the two bits carried by a 4-ary symbol is reserved for synchronization. The authors formalize the half-marker construction, adapt forward-backward (FB) decoding to the altered statistics, and quantify the mutual information $I(\boldsymbol{u};\boldsymbol{L})$ between the input symbols $\boldsymbol{u}$ and the decoder's soft outputs $\boldsymbol{L}$, which aids the design of concatenated LDPC codes. Numerical results show that half-marker codes can achieve higher achievable rates and lower end-to-end bit and symbol error rates (BER/SER) than standard marker codes across multiple parameter settings, with HMC1 and HMC2 often outperforming their standard counterparts. The work points to a practical route to more reliable, higher-throughput DNA storage and suggests optimizing LDPC–half-marker schemes as a direction for future study.
Abstract
DNA storage systems face significant challenges, including insertion, deletion, and substitution (IDS) errors. Therefore, designing effective synchronization codes, i.e., codes capable of correcting IDS errors, is essential for DNA storage systems. Marker codes are a favorable choice for this purpose. In this paper, we extend the notion of marker codes by making the following key observation. Since each DNA base is equivalent to a 2-bit storage unit, one bit can be reserved for synchronization, while the other is dedicated to data transmission. Using this observation, we propose a new class of marker codes, which we refer to as half-marker codes. We demonstrate that this extension has the potential to significantly increase the mutual information between the input symbols and the soft outputs of an IDS channel modeling a DNA storage system. Specifically, through examples, we show that when concatenated with an outer error-correcting code, half-marker codes outperform standard marker codes and significantly reduce the end-to-end bit error rate of the system.
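The key observation above, that a DNA base carries two bits and one of them can be given over to synchronization, can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the bit-pair-to-base labeling in `BASES`, the function names, the sync pattern, and the choice to apply the split to every symbol rather than to selected marker positions.

```python
# Illustrative sketch only: the labeling, names, and sync pattern are assumptions.
BASES = "ACGT"  # assumed convention: index = 2 * sync_bit + data_bit


def pack_half_marker(sync_bit, data_bit):
    """Combine one known sync bit and one data bit into a single DNA base."""
    return BASES[2 * sync_bit + data_bit]


def unpack_half_marker(base):
    """Split a base back into its (sync_bit, data_bit) pair."""
    idx = BASES.index(base)
    return idx >> 1, idx & 1


def encode(data_bits, sync_pattern):
    """Pair each data bit with a sync bit; the sync pattern repeats cyclically."""
    return "".join(
        pack_half_marker(sync_pattern[i % len(sync_pattern)], bit)
        for i, bit in enumerate(data_bits)
    )


strand = encode([1, 0, 1, 1], sync_pattern=[0, 1])
print(strand)  # CGCT
```

On an error-free strand, the data bits are recovered with `[unpack_half_marker(b)[1] for b in strand]`. Over an IDS channel the received bases are misaligned, which is why the known sync bits are needed by the decoder to estimate insertions and deletions; that decoding step is not shown in this sketch.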
