On Counting Subsequences and Higher-Order Fibonacci Numbers

Hsin-Po Wang; Chi-Wei Chin

TL;DR

To maximize the amount of information that can be synthesized into DNA within a finite amount of time, this paper studies the number of unordered sets of $n$ DNA strands that have a common supersequence of length at most $t$.

Abstract

In array-based DNA synthesis, multiple strands of DNA are synthesized in parallel to reduce the time cost from the sum of their lengths to the length of their shortest common supersequence. To maximize the amount of information that can be synthesized into DNA within a finite amount of time, we study the number of unordered sets of $n$ strands of DNA that have a common supersequence whose length is at most $t$. Our analysis stems from the following connection: the number of subsequences of A C G T A C G T A C G T ... is the partial sum (prefix sum) of the fourth-order Fibonacci numbers.

Paper Structure (18 sections, 13 theorems, 19 equations, 3 figures)

This paper contains 18 sections, 13 theorems, 19 equations, 3 figures.

Introduction
Problem statements
Answering Question \ref{que:2}
The q = 2 case
The general q case
Growth of Fibonacci
Answering Question \ref{que:3}
Upper bound
Lower bound for q = 3
Lower bound for general q
Answering Question \ref{que:4}
The bounds
The best p
Answering Question \ref{que:5}
The intuition
...and 3 more sections

Key Result

Proposition 8

The number of the subsequences of $\overline{a_1 \dotsm a_q}^t$ that are not subsequences of $\overline{a_1 \dotsm a_q}^{t-1}$ is $F_q(t)$. The number of the subsequences of $\overline{a_1 \dotsm a_q}^t$ is the partial sum $F_q(0) + \dotsb + F_q(t)$.
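
Proposition 8 can be checked by brute force on small cases. The sketch below is an illustration, not the paper's method. It assumes that $\overline{a_1 \dotsm a_q}^t$ denotes the first $t$ letters of the periodic word $a_1 \dotsm a_q a_1 \dotsm a_q \dotsm$ (consistent with the abstract's A C G T A C G T ... example), and that $F_q(0) = 1$, $F_q(t) = 0$ for $t < 0$, and $F_q(t) = F_q(t-1) + \dotsb + F_q(t-q)$ for $t \ge 1$. Definition 7 is not reproduced here, so this indexing is an assumption chosen to agree with the small cases; all function names are hypothetical.

```python
def periodic_prefix(alphabet, t):
    """First t letters of the periodic word built from `alphabet` (assumed notation)."""
    return "".join(alphabet[i % len(alphabet)] for i in range(t))


def distinct_subsequences(word):
    """Set of all distinct subsequences of `word`, including the empty one."""
    subs = {""}
    for ch in word:
        # Each existing subsequence may or may not be extended by ch.
        subs |= {s + ch for s in subs}
    return subs


def higher_order_fibonacci(q, t_max):
    """F_q(0..t_max) under the assumed convention F_q(0) = 1, F_q(t < 0) = 0."""
    f = [1]
    for t in range(1, t_max + 1):
        f.append(sum(f[max(0, t - q):t]))  # sum of the previous q terms
    return f


def check_proposition_8(alphabet, t_max):
    """Compare brute-force counts with F_q(t) and its partial sums for t <= t_max."""
    fib = higher_order_fibonacci(len(alphabet), t_max)
    previous = 0
    for t in range(t_max + 1):
        total = len(distinct_subsequences(periodic_prefix(alphabet, t)))
        # total - previous counts subsequences that first appear at length t.
        if total - previous != fib[t] or total != sum(fib[: t + 1]):
            return False
        previous = total
    return True
```

Calling `check_proposition_8("ACGT", 8)` runs the comparison for the $q = 4$ case up to $t = 8$; the enumeration is exponential in $t$, so it is only meant for small sanity checks.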

Figures (3)

Figure 1: Array-based DNA synthesis: At time $s = 1$, we ask $x^1, x^2, x^3, x^4$ whether they want $M_1 = \mathsf A$; only $x^2$ says yes. At time $s = 2$, we ask whether they want $M_2 = \mathsf C$, and $x^3, x^4$ say yes. The same process repeats until we ask whether they want $M_{12} = \mathsf T$, and $x^1$ says yes.
Figure 2: Details of array-based DNA synthesis. Each (a)--(d) cycle consumes one letter from the master lineup. Cf. HVS23.
Figure 3: Left: CaH69 counts the subsequences of $\overline{\mathsf A\mathsf C}^t$ by length. Right: we count by $\tau$ (i.e., synthesis time).
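
The process described in the Figure 1 caption can be sketched as a greedy scan of the master lineup: at each time $s$, a strand accepts $M_s$ exactly when it is the next letter the strand still needs. The code below is a minimal illustration under that reading; the example strands are hypothetical (the actual $x^1, \dotsc, x^4$ of Figure 1 are not given here), and the function names are not from the paper.

```python
def synthesis_time(strand, master):
    """1-based step at which `strand` is completed, or None if it never is.

    Assumes the greedy rule from the Figure 1 caption: the strand says yes
    to M_s whenever M_s equals its next missing letter.
    """
    if not strand:
        return 0  # nothing to synthesize
    pos = 0  # index of the next letter the strand still needs
    for s, letter in enumerate(master, start=1):
        if strand[pos] == letter:
            pos += 1
            if pos == len(strand):
                return s
    return None


def batch_synthesis_time(strands, master):
    """Time to finish all strands in parallel, or None if any strand fails."""
    times = [synthesis_time(x, master) for x in strands]
    if any(t is None for t in times):
        return None
    return max(times, default=0)


# Master lineup of length 12, matching M_1 = A, M_2 = C, M_12 = T in the caption.
master = "ACGT" * 3
# Hypothetical strands, chosen only for illustration.
strands = ["TTT", "A", "CAT", "CG"]
```

With this master lineup, a strand is synthesizable exactly when it is a subsequence of the lineup, and the batch finishes when its slowest strand does.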

Theorems & Definitions (14)

Definition 7: Higher-order Fibonacci numbers
Proposition 8
Proposition 9
Lemma 10
Lemma 11
Proposition 12
Proposition 13
Proposition 14
Proposition 15
Proposition 16
...and 4 more
