NEURODNAAI: Neural pipeline approaches for the advancing dna-based information storage as a sustainable digital medium using deep learning framework

Rakesh Thakur, Lavanya Singh, Yashika, Manomay Bundawala, Aruna Kumar

TL;DR

NeuroDNAAI addresses reliable DNA-based information storage with a Transformer-based encoder–decoder that operates within a biologically informed noise framework, including PCR-aware amplification and sequencing effects. The approach unifies coding theory with realistic molecular constraints, enabling end-to-end simulation and robust reconstruction of binary data encoded in DNA. Key findings include improved bit-level fidelity (low BER) and perceptual image quality (high PSNR/SSIM), alongside strong downstream task performance, indicating practical potential for archival DNA storage. The work also provides an open-source simulator and reproducibility framework, and points toward scalable, constraint-aware DNA storage systems and future integration with traditional ECC and wet-lab validation.

Abstract

DNA is a promising medium for digital information storage because of its exceptional density and durability. While prior studies have advanced coding theory, workflow design, and simulation tools, challenges such as synthesis costs, sequencing errors, and biological constraints (GC-content imbalance, homopolymers) limit practical deployment. To address this, our framework draws on quantum parallelism concepts to increase encoding diversity and resilience, and integrates biologically informed constraints with deep learning to improve error mitigation in DNA storage. NeuroDNAAI encodes binary data streams into symbolic DNA sequences, transmits them through a noisy channel with substitutions, insertions, and deletions, and reconstructs them with high fidelity. Our results show that traditional prompting or rule-based schemes fail to adapt effectively to realistic noise, whereas NeuroDNAAI achieves superior accuracy. Experiments on benchmark datasets demonstrate low bit error rates for both text and images. By unifying theory, workflow, and simulation into one pipeline, NeuroDNAAI enables scalable, biologically valid archival DNA storage.
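
To make the encode, corrupt, and measure loop in the abstract concrete, the following is a minimal Python sketch and not the authors' method. It assumes a fixed two-bits-per-base mapping (00=A, 01=C, 10=G, 11=T), an independent per-base substitution/insertion/deletion channel, and a naive position-wise bit error rate. All function names and parameters are hypothetical; the paper's Transformer encoder–decoder and PCR-aware noise model are not reproduced here.

```python
import random

BASES = "ACGT"  # assumed mapping: 00->A, 01->C, 10->G, 11->T


def bits_to_dna(bits):
    """Map each pair of bits to one nucleotide."""
    if len(bits) % 2:
        raise ValueError("bit string length must be even")
    return "".join(BASES[int(bits[i:i + 2], 2)] for i in range(0, len(bits), 2))


def dna_to_bits(seq):
    """Inverse of bits_to_dna: two bits per nucleotide."""
    return "".join(format(BASES.index(b), "02b") for b in seq)


def ids_channel(seq, p_sub, p_ins, p_del, rng):
    """Apply independent per-base deletion, substitution, and insertion."""
    out = []
    for base in seq:
        r = rng.random()
        if r < p_del:
            continue  # base is deleted
        if r < p_del + p_sub:
            # substitute with one of the three other bases
            out.append(rng.choice([b for b in BASES if b != base]))
        else:
            out.append(base)
        if rng.random() < p_ins:
            out.append(rng.choice(BASES))  # random base inserted after
    return "".join(out)


def gc_content(seq):
    """Fraction of bases that are G or C."""
    return sum(b in "GC" for b in seq) / len(seq) if seq else 0.0


def max_homopolymer(seq):
    """Length of the longest run of identical bases."""
    longest = run = 0
    prev = None
    for b in seq:
        run = run + 1 if b == prev else 1
        prev = b
        longest = max(longest, run)
    return longest


def bit_error_rate(sent, received):
    """Position-wise mismatch rate; any length difference counts as errors."""
    n = max(len(sent), len(received))
    if n == 0:
        return 0.0
    mismatches = sum(a != b for a, b in zip(sent, received))
    return (mismatches + abs(len(sent) - len(received))) / n


if __name__ == "__main__":
    rng = random.Random(0)
    payload = "".join(rng.choice("01") for _ in range(200))
    strand = bits_to_dna(payload)
    noisy = ids_channel(strand, p_sub=0.01, p_ins=0.005, p_del=0.005, rng=rng)
    print("GC content:", round(gc_content(strand), 3))
    print("longest homopolymer:", max_homopolymer(strand))
    print("naive BER:", round(bit_error_rate(payload, dna_to_bits(noisy)), 3))
```

Because a single insertion or deletion shifts every later base, the naive position-wise BER rises sharply under such errors. This is the failure mode that fixed, rule-based decoding struggles with and that a learned decoder is meant to handle.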

Paper Structure

This paper contains 25 sections, 3 figures.

Figures (3)

  • Figure 1: Architecture