ULTRA-MC: A Unified Approach to Learning Mixtures of Markov Chains via Hitting Times

Fabian Spaeh, Konstantinos Sotiropoulos, Charalampos E. Tsourakakis

TL;DR

ULTRA-MC learns mixtures of discrete-time and continuous-time Markov chains from (potentially noisy) hitting-time observations, unifying the two settings through hitting times, which are well defined for both. It optimizes the Laplacian pseudoinverse $L^+$ to match the hitting-time matrix $H$ and extends to mixtures with an EM-style loop, using efficient gradient computations with complexity $O(n^{\omega})$. The approach scales to about $n\approx 1000$ nodes and outperforms competitive baselines in both DTMC and CTMC scenarios, including an application to NBA passing data. By providing a robust, unified, and scalable framework for learning from trajectory data, the work is relevant to modeling complex, asymmetric state dynamics in domains such as healthcare, web analytics, and sports analytics.

Abstract

This study introduces a novel approach for learning mixtures of Markov chains, a critical process applicable to various fields, including healthcare and the analysis of web users. Existing research has identified a clear divide in methodologies for learning mixtures of discrete and continuous-time Markov chains, while the latter presents additional complexities for recovery accuracy and efficiency. We introduce a unifying strategy for learning mixtures of discrete and continuous-time Markov chains, focusing on hitting times, which are well defined for both types. Specifically, we design a reconstruction algorithm that outputs a mixture which accurately reflects the estimated hitting times and demonstrates resilience to noise. We introduce an efficient gradient-descent approach, specifically tailored to manage the computational complexity and non-symmetric characteristics inherent in the calculation of hitting time derivatives. Our approach is also of significant interest when applied to a single Markov chain, thus extending the methodologies previously established by Hoskins et al. and Wittmann et al. We complement our theoretical work with experiments conducted on synthetic and real-world datasets, providing a comprehensive evaluation of our methodology.
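
To make the role of hitting times concrete, the following is a minimal sketch of how the hitting-time matrix of a single known chain can be computed from first-step equations. It is not the authors' reconstruction algorithm, which works in the opposite direction (from hitting times to a chain). The function names `solve_linear` and `hitting_times` and the `holding` parameter are hypothetical, introduced only for illustration. The sketch assumes an irreducible chain given by a row-stochastic jump matrix `P`; a discrete-time chain spends one time unit per step, while a continuous-time chain spends an expected `1/rate` in each state, which is why one routine covers both cases.

```python
def solve_linear(A, b):
    """Solve A x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        if abs(M[pivot][col]) < 1e-12:
            raise ValueError("singular system: target may be unreachable")
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(n):
            if r != col:
                f = M[r][col] / M[col][col]
                for c in range(col, n + 1):
                    M[r][c] -= f * M[col][c]
    return [M[i][n] / M[i][i] for i in range(n)]


def hitting_times(P, holding=None):
    """Return H with H[i][j] = expected time to first reach j from i.

    P is a row-stochastic jump matrix. holding[i] is the expected time
    spent in state i per visit: 1 for a discrete-time chain (default),
    1/rate_i for a continuous-time chain (an assumption of this sketch).
    """
    n = len(P)
    if holding is None:
        holding = [1.0] * n
    H = [[0.0] * n for _ in range(n)]
    for j in range(n):
        others = [i for i in range(n) if i != j]
        # First-step equations: h(i) = holding[i] + sum_{k != j} P[i][k] h(k)
        A = [[(1.0 if i == k else 0.0) - P[i][k] for k in others]
             for i in others]
        b = [holding[i] for i in others]
        h = solve_linear(A, b)
        for i, value in zip(others, h):
            H[i][j] = value
    return H


if __name__ == "__main__":
    P = [[0.5, 0.5], [0.25, 0.75]]
    print(hitting_times(P))                       # discrete-time chain
    print(hitting_times(P, holding=[2.0, 2.0]))   # continuous-time, rate 1/2
```

Note that the resulting matrix is generally not symmetric ($H_{ij} \neq H_{ji}$), which is the asymmetry the abstract refers to when discussing hitting-time derivatives.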

Paper Structure (34 sections, 9 theorems, 55 equations, 10 figures, 3 tables, 2 algorithms)


Key Result

Lemma 4.1

We have $H = A - B$ for $a=\mathbf 1^{\top}L^{+}\in\mathbb{R}^{1\times n}$ and

Figures (10)

  • Figure 1: Three offensive strategies of the Denver Nuggets during the 2022 season, learned from a mixture of $C=6$ continuous-time Markov chains. We provide the remaining strategies in Figure \ref{fig:nuggets} and a detailed explanation in Section \ref{sec:exp}.
  • Figure 2: We measure the error $\| H - \hat{H} \|_F$ in the estimation of hitting times from trails for $n=16$ nodes on the complete graph ($K_n$), star graph ($S_n$), lollipop graph (LOL), and grid graph (GRID). We show the mean over $5$ runs. The cover times for the graphs are $16$, $46$, $612$, and $\approx 59.4$, respectively, and we vary the trail length as a multiple of the cover time for each graph.
  • Figure 3: Learning Markov chains from noisy hitting times, for different graph types and noise levels. We plot the difference between the recovery error achieved by Wittmann et al. (2009) and our method. The difference is consistently positive, indicating an improvement for ULTRA-MC. On the left, the improvement for the lollipop graph LOL$_n$ exceeds $50$, so we omit it from the plot.
  • Figure 4: Learning a discrete-time (left) and continuous-time (right) random mixture of $C=2$ chains from trails. We report average and standard deviation. Missing points indicate a timeout at 2 hours. For readability, we report the standard deviation separately in Table \ref{tab:std-ht-mix} of Appendix \ref{sec:apx-exp}.
  • Figure 5: Three out of $C=6$ strategies of the Denver Nuggets. The remaining ones are in Figure \ref{fig:nba}. We mark the five positions in a basketball game: Point Guard (PG), Shooting Guard (SG), Power Forward (PF), Center (C), and Small Forward (SF). Each position is annotated with the average ball-holding time. Arrow thickness and opacity reflect the probability of a pass, and we omit passes that occur with probability less than 0.2 for clarity. The player most likely to start a passing game is highlighted in blue. Attempted shots are indicated in red (miss) and green (score).
  • ...and 5 more figures

Theorems & Definitions (18)

  • Lemma 4.1
  • Lemma 4.2
  • Lemma 4.3
  • Lemma 4.4
  • Lemma A.1
  • proof
  • Lemma A.2
  • proof
  • proof
  • proof
  • ...and 8 more