Table of Contents
Fetching ...

Hamiltonian Monte Carlo methods for spectroscopy data analysis

Daniel McBride, Ioannis Sgouralis

Abstract

We present a scalable Bayesian framework for the analysis of confocal fluorescence spectroscopy data, addressing key limitations in traditional fluorescence correlation spectroscopy methods. Our framework captures molecular motion, microscope optics, and photon detection with high fidelity, enabling statistical inference of molecule trajectories from raw photon count data, introducing a superresolution parameter which further enhances trajectory estimation beyond the native time resolution of data acquisition. To handle the high dimensionality of the arising posterior distribution, we develop a family of Hamiltonian Monte Carlo (HMC) algorithms that leverages the unique characteristics inherent to spectroscopy data analysis. Here, due to the highly-coupled correlation structure of the target posterior distribution, HMC requires the numerical solution of a stiff ordinary differential equation containing a two-scale discrete Laplacian. By considering the spectral properties of this operator, we produce a CFL-type integrator stability condition for the standard Störmer-Verlet integrator used in HMC. To circumvent this instability we introduce a semi-implicit (IMEX) method which treats the stiff and non-stiff parts differently, while leveraging the sparse structure of the discrete Laplacian for computational efficiency. Detailed numerical experiments demonstrate that this method improves upon fully explicit approaches, allowing larger HMC step sizes and maintaining second-order accuracy in position and energy. Our framework provides a foundation for extensions to more complex models such as surface constrained molecular motion or motion with multiple diffusion modes.

Hamiltonian Monte Carlo methods for spectroscopy data analysis

Abstract

We present a scalable Bayesian framework for the analysis of confocal fluorescence spectroscopy data, addressing key limitations in traditional fluorescence correlation spectroscopy methods. Our framework captures molecular motion, microscope optics, and photon detection with high fidelity, enabling statistical inference of molecule trajectories from raw photon count data, introducing a superresolution parameter which further enhances trajectory estimation beyond the native time resolution of data acquisition. To handle the high dimensionality of the arising posterior distribution, we develop a family of Hamiltonian Monte Carlo (HMC) algorithms that leverages the unique characteristics inherent to spectroscopy data analysis. Here, due to the highly-coupled correlation structure of the target posterior distribution, HMC requires the numerical solution of a stiff ordinary differential equation containing a two-scale discrete Laplacian. By considering the spectral properties of this operator, we produce a CFL-type integrator stability condition for the standard Störmer-Verlet integrator used in HMC. To circumvent this instability we introduce a semi-implicit (IMEX) method which treats the stiff and non-stiff parts differently, while leveraging the sparse structure of the discrete Laplacian for computational efficiency. Detailed numerical experiments demonstrate that this method improves upon fully explicit approaches, allowing larger HMC step sizes and maintaining second-order accuracy in position and energy. Our framework provides a foundation for extensions to more complex models such as surface constrained molecular motion or motion with multiple diffusion modes.

Paper Structure

This paper contains 22 sections, 50 equations, 7 figures, 4 algorithms.

Figures (7)

  • Figure 1: The measurement scale mesh is indexed by $n=1:N$, with highlighted exposure windows and dead time periods in between. We show one such window blown-up with the superresolution subpanel mesh, indexed by $k=1:K$. The measurement $w_n$ is the photon count for the $n^{\text{th}}$ duty cycle. Overall the experiment starts at $t_0$ and ends at $t_N$.
  • Figure 2: The top panel shows the measured photon counts. The middle and bottom panels show $|q^{(0:J)}|$ for the SVEX scheme and the split IMEX scheme, respectively, along with the ground truth in bold.
  • Figure 3: This figure shows $b_{\phi_h}(q_0)$ against $h$ on a vertical semilog scale for both the full posterior system integrated with $\phi_h = \phi_h^{\text{SVEX}}$ (plotted in red with circular ticks), as well as the prior subsystem integrated with $\phi_h = \chi_h^{\text{SV}}$ (plotted in blue with cross marks). The same value for $q_0$, was used for all trials with $q_0$ sampled from the prior. The measurements $w$ used in the posterior system were generated via ancestral sampling. All other parameters were held fixed including $N=20, K=20$ and $L=20$. These plots are representative of the general behavior, with repeated simulations for randomly generated $q_0$ being indistinguishable at this scale.
  • Figure 4: Panels A and B display the phase space trajectory for a single coordinate of $(q_{\text{prior}}(\eta),p_{\text{prior}}(\eta))$, the numerical solution to the prior subsystem integrated for $L=100$ steps with Störmer-Verlet from the same initial conditions. The omitted coordinates have qualitatively similar behavior. Panels C and D show the total energy $H_{\text{prior}}$ of the solutions sampled in panels A and B, respectively, as a function of HMC-time $\eta$.
  • Figure 5: Here we compute $AR(h)$ for a range of values of $h$ for both the SVEX numerical scheme and the IMEX splitting scheme with the implicit midpoint method integrating the prior subsystem. As expected, we see a faster decay in the MCMC stability for the explicit scheme. For example, with a step size of $h = 0.06$ the explicit scheme has an $AR$ of around $0.38$ while the IMEX scheme has an $AR$ of about $0.63$.
  • ...and 2 more figures