Marker+Codeword+Marker: A Coding Structure for Segmented Single-Insdel/-Edit Channels

Zhen Li, Xuan He, Xiaohu Tang

TL;DR

The study tackles segmented channels where each length-$n$ segment may incur at most one error. It introduces a marker+codeword+marker encoding paradigm that leverages Varshamov-Tenengolts codes to correct segmented single-insdel errors, achieving a per-segment redundancy of $\log_2(n-6)+7$ with linear-time encoding/decoding, and extends the framework to segmented single-edit errors using larger markers and $VT_a(2k;k)$ codewords to obtain a per-segment redundancy of $\log_2(n-9)+10$ with linear-time performance. The constructions rely on carefully chosen marker patterns and VT-codewords to enable efficient localization and correction of errors within each segment while preserving segment boundaries. This work provides the first binary segmented single-edit ECCs with linear-time encoders/decoders and the lowest known redundancy for segmented single-insdel ECCs, offering practical pathways for reliable data transmission/storage in segmented channels. The methods have potential implications for DNA storage and other applications where partitioned data streams experience localized synchronization errors.

Abstract

An insdel refers to a deletion or an insertion, and an edit refers to an insdel or a substitution. In this paper, we consider the segmented single-insdel (resp. single-edit) channel, where the channel's input bit stream is partitioned into segments of length $n$ and each segment can suffer at most a single insdel (resp. edit) error. The value of $n$ is known to the receiver, but the segment boundaries are not. We propose to encode each segment following a marker+codeword+marker structure, where the two markers are carefully selected and the codewords are chosen from Varshamov-Tenengolts (VT) codes. In this way, we construct a new class of binary codes that can correct segmented single-insdel errors. Our codes achieve the lowest known redundancy, $\log_2(n-6)+7$ bits, and are the first in the literature to have a linear-time encoder and decoder. Moreover, by enhancing the VT codes and one of the markers, we construct the first class of binary codes that can correct segmented single-edit errors. This class of codes has a redundancy of $\log_2(n-9)+10$ bits and also has a linear-time encoder and decoder.
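As background for the VT codewords mentioned above, the following is a minimal Python sketch of the classical Varshamov-Tenengolts code $VT_a(n)=\{x\in\{0,1\}^n : \sum_{i=1}^{n} i\,x_i \equiv a \pmod{n+1}\}$ together with Levenshtein's textbook single-deletion decoder. It is not the paper's construction: it assumes the codeword boundaries are already known, handles deletions only, and omits the markers entirely. The function names `vt_syndrome`, `vt_codewords`, and `vt_correct_deletion` are hypothetical and chosen for illustration.

```python
import itertools


def vt_syndrome(bits):
    """Return sum(i * x_i) mod (len(bits) + 1), with positions counted from 1."""
    return sum(i * b for i, b in enumerate(bits, start=1)) % (len(bits) + 1)


def vt_codewords(n, a=0):
    """Enumerate VT_a(n) by brute force (only sensible for small n)."""
    return [list(c) for c in itertools.product((0, 1), repeat=n)
            if vt_syndrome(c) == a]


def vt_correct_deletion(received, n, a=0):
    """Recover a codeword of VT_a(n) from a copy with exactly one bit deleted."""
    if len(received) != n - 1:
        raise ValueError("expected exactly one deletion")
    y = list(received)
    weight = sum(y)
    checksum = sum(i * b for i, b in enumerate(y, start=1))
    deficiency = (a - checksum) % (n + 1)

    if deficiency <= weight:
        # A 0 was deleted: reinsert it so that `deficiency` ones lie to its right.
        pos, ones = len(y), 0
        while ones < deficiency:
            pos -= 1
            ones += y[pos]
        y.insert(pos, 0)
    else:
        # A 1 was deleted: reinsert it so that (deficiency - weight - 1)
        # zeros lie to its left.
        zeros_needed = deficiency - weight - 1
        pos, zeros = 0, 0
        while zeros < zeros_needed:
            zeros += 1 - y[pos]
            pos += 1
        y.insert(pos, 1)
    return y


# Usage: delete one bit from a codeword and recover it.
codeword = vt_codewords(6, a=0)[3]
damaged = codeword[:2] + codeword[3:]   # drop the bit at index 2
assert vt_correct_deletion(damaged, n=6, a=0) == codeword
```

The decoder makes a single pass over the received word, which is in keeping with the linear-time behaviour claimed for the full scheme, although the paper's decoder must additionally locate segment boundaries using the markers.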

Paper Structure (7 sections, 8 theorems, 7 equations, 3 tables, 2 algorithms)

Key Result

Theorem 1

With notation as before, define the code $\mathcal{C}$ under the following two rules: [...] Then $\mathcal{C}$ can correct segmented single-insdel errors with redundancy $\log_2(n-6)+7$ bits and admits linear-time encoding and decoding algorithms.
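To make the redundancy figures concrete, the short Python sketch below evaluates the two stated formulas, $\log_2(n-6)+7$ for single-insdel and $\log_2(n-9)+10$ for single-edit, at a given segment length. The helper names are hypothetical, and the per-segment rate $1 - r/n$ uses the standard definition of rate for a redundancy of $r$ bits; the summary does not say which values of $n$ the constructions admit, so the sample values here are chosen only so that the logarithms come out as integers.

```python
import math


def insdel_redundancy(n):
    """Per-segment redundancy (bits) stated for the single-insdel codes."""
    if n <= 6:
        raise ValueError("formula requires n > 6")
    return math.log2(n - 6) + 7


def edit_redundancy(n):
    """Per-segment redundancy (bits) stated for the single-edit codes."""
    if n <= 9:
        raise ValueError("formula requires n > 9")
    return math.log2(n - 9) + 10


def code_rate(n, redundancy):
    """Rate of a length-n segment carrying `redundancy` redundant bits."""
    return 1 - redundancy / n


# Illustrative segment lengths (assumed, not taken from the paper).
print(insdel_redundancy(70))                  # log2(64) + 7  = 13.0
print(edit_redundancy(73))                    # log2(64) + 10 = 16.0
print(code_rate(70, insdel_redundancy(70)))   # fraction of each segment carrying data
```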

Theorems & Definitions (9)

  • Definition 1
  • Theorem 1
  • Lemma 1
  • Lemma 2
  • Lemma 3
  • Theorem 2
  • Lemma 4
  • Lemma 5
  • Lemma 6