Optimal Source Coding of Markov Chains for Real-Time Remote Estimation
Ismail Cosandal, Sennur Ulukus
TL;DR
This paper studies real-time remote monitoring of Markov sources under the assumption that each bit transmission lasts one time slot and the source evolves on the same time scale. It models the problem as an MDP whose augmented state consists of the current symbol and its transmission duration, and finds an optimal policy via policy iteration over complete codes, that is, codes that satisfy Kraft's inequality with equality. Two Huffman-based benchmarks (Myopic and Steady-State) are analyzed for comparison, and the framework is extended to CTMCs by accounting for a per-bit delay $d$ through the matrix exponential $\mathrm{Exp}(\bm{Q}d)$. Empirical results on randomly generated transition matrices show that the proposed optimal policy consistently reduces the average transmission duration relative to the benchmarks, with the gain depending on process parameters such as $N$ and $\beta$. The work provides a principled method for designing real-time remote estimators in which coding latency and source dynamics interact on the same time scale.
Abstract
We revisit the source coding problem for a Markov chain under the assumption that bit transmissions and state transitions of the chain occur on the same time scale. Specifically, we assume that the transmission of each bit takes a single time slot, and that the Markov chain updates its state in that same time slot. Thus, the length of the codeword assigned to a symbol determines both the number of symbols that go untransmitted and the probability distribution of the next symbol to be transmitted. We aim to minimize the average transmission duration over an infinite horizon by proposing an optimal source coding policy that depends on the last transmitted symbol and its transmission duration. To find the optimal policy, we formulate the problem as a Markov decision process (MDP) whose state augments the symbol with its transmission duration. Finally, we analyze two Huffman-based benchmark policies and compare their performance with that of the proposed optimal policy. We observe that, for randomly generated processes, the proposed optimal policy decreases the average transmission duration relative to the benchmark policies. The performance gain varies with the parameters of the Markov process.
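
To make the coupling between codeword length and source dynamics concrete, the following minimal sketch may help. It assumes a discrete-time chain with transition matrix P, so that after a symbol is sent with a codeword of a given length the next symbol to be transmitted is distributed as the corresponding row of P raised to that length. It then builds an ordinary Huffman code for that distribution, in the spirit of the Huffman-based benchmarks, and checks that the code is complete. The matrix `P`, the function names, and the greedy construction are illustrative assumptions; this is not the paper's optimal MDP policy.

```python
import heapq
from fractions import Fraction


def mat_mul(a, b):
    """Multiply two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def next_symbol_distribution(p, last_symbol, duration):
    """Row `last_symbol` of p**duration: the source takes one step per bit slot."""
    n = len(p)
    power = [[Fraction(int(i == j)) for j in range(n)] for i in range(n)]
    for _ in range(duration):
        power = mat_mul(power, p)
    return power[last_symbol]


def huffman_lengths(probs):
    """Codeword lengths of a binary Huffman code (assumes at least 2 symbols)."""
    lengths = [0] * len(probs)
    # Heap entries: (probability, unique tie-breaker, symbols in this subtree).
    heap = [(prob, i, [i]) for i, prob in enumerate(probs)]
    heapq.heapify(heap)
    tie_breaker = len(probs)
    while len(heap) > 1:
        p1, _, s1 = heapq.heappop(heap)
        p2, _, s2 = heapq.heappop(heap)
        for s in s1 + s2:
            lengths[s] += 1  # every symbol in a merged subtree moves one level down
        heapq.heappush(heap, (p1 + p2, tie_breaker, s1 + s2))
        tie_breaker += 1
    return lengths


def kraft_sum(lengths):
    """Sum of 2**(-length); equals 1 exactly for a complete code."""
    return sum(Fraction(1, 2 ** length) for length in lengths)


# Hypothetical 3-state transition matrix, chosen only for illustration.
P = [
    [Fraction(7, 10), Fraction(2, 10), Fraction(1, 10)],
    [Fraction(1, 10), Fraction(8, 10), Fraction(1, 10)],
    [Fraction(2, 10), Fraction(2, 10), Fraction(6, 10)],
]

if __name__ == "__main__":
    # Last transmitted symbol was 0 and its codeword took 2 slots to send.
    dist = next_symbol_distribution(P, last_symbol=0, duration=2)
    lengths = huffman_lengths(dist)
    expected_length = sum(q * l for q, l in zip(dist, lengths))
    print("next-symbol distribution:", [str(q) for q in dist])
    print("codeword lengths:", lengths)
    print("Kraft sum:", kraft_sum(lengths))
    print("expected duration of next transmission:", expected_length)
```

Exact fractions are used so that the Kraft sum can be compared with 1 without floating-point tolerance. A greedy code of this kind minimizes only the expected duration of the next transmission. The codeword lengths it picks also shape the distribution of the following symbol, and this is the dependence that the MDP formulation is meant to account for.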
