Interacting Particle Langevin Algorithm for Maximum Marginal Likelihood Estimation

Ö. Deniz Akyildiz, Francesca Romana Crucinio, Mark Girolami, Tim Johnston, Sotirios Sabanis

TL;DR

This work introduces the Interacting Particle Langevin Algorithm (IPLA), a diffusion-based approach to maximum marginal likelihood estimation (MMLE) in latent-variable models. The authors formulate a continuous-time interacting particle system (IPS) on $(\theta, X_1,\dots,X_N)$ with invariant measure $\pi_*^N(\theta,x_1,\dots,x_N) \propto e^{-\sum_{i=1}^N U(\theta,x_i)}$, where $U(\theta,x)=-\log p_\theta(x,y)$, and establish that the $\theta$-marginal of this measure concentrates around the maximum marginal likelihood estimate $\bar{\theta}^\star$ as $N\to\infty$, which is what makes nonasymptotic convergence guarantees possible. The paper analyses the continuous-time IPS, its Euler–Maruyama discretisation, and an extension to stochastic gradients, yielding explicit error bounds that decompose into concentration, ergodicity, and discretisation components. IPLA is demonstrated on synthetic and real logistic-regression tasks, where it performs comparably to existing particle-based methods while also coming with explicit nonasymptotic error control. Overall, the framework offers a principled, scalable alternative to EM with transparent convergence guarantees and potential extensions to nonconvex settings.
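The concentration claim can be made concrete by integrating the latent variables out of $\pi_*^N$. The short derivation below is a sketch that follows directly from the definitions above; the symbols $\pi_\Theta^N$ (for the $\theta$-marginal) and $\kappa$ (for the negative log marginal likelihood) are notation introduced here for illustration and may differ from the paper's.

```latex
\begin{align*}
\pi_\Theta^N(\theta)
  &= \int \pi_*^N(\theta, x_1, \dots, x_N)\, \mathrm{d}x_1 \cdots \mathrm{d}x_N \\
  &\propto \prod_{i=1}^N \int e^{-U(\theta, x_i)}\, \mathrm{d}x_i
   = \Big( \int p_\theta(x, y)\, \mathrm{d}x \Big)^{N}
   = p_\theta(y)^N
   = e^{-N \kappa(\theta)},
\qquad \kappa(\theta) := -\log p_\theta(y).
\end{align*}
```

The $\theta$-marginal is therefore a Gibbs measure with potential $\kappa$ and inverse temperature $N$, so its mass gathers around the minimisers of $\kappa$, that is, the maximisers of the marginal likelihood, as $N$ grows.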

Abstract

We develop a class of interacting particle systems for implementing a maximum marginal likelihood estimation (MMLE) procedure to estimate the parameters of a latent variable model. We achieve this by formulating a continuous-time interacting particle system which can be seen as a Langevin diffusion over an extended state space of parameters and latent variables. In particular, we prove that the parameter marginal of the stationary measure of this diffusion has the form of a Gibbs measure in which the number of particles plays the role of the inverse temperature parameter from classical global optimisation settings. Using a particular rescaling, we then prove geometric ergodicity of this system and bound the discretisation error in a manner that is uniform in time and does not increase with the number of particles. The discretisation results in an algorithm, termed the Interacting Particle Langevin Algorithm (IPLA), which can be used for MMLE. We further prove nonasymptotic bounds for the optimisation error of our estimator in terms of key parameters of the problem, and extend this result to the case of stochastic gradients, which covers practical scenarios. We provide numerical experiments to illustrate the empirical behaviour of our algorithm in the context of logistic regression with verifiable assumptions. Our setting provides a straightforward way to implement a diffusion-based optimisation routine compared to more classical approaches such as the Expectation Maximisation (EM) algorithm, and allows for especially explicit nonasymptotic bounds.
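To illustrate what an Euler–Maruyama discretisation of such a system might look like, here is a minimal sketch. The update rule is not spelled out in this summary, so the form used below is an assumption: each particle takes a Langevin step on $U$ with noise scale $\sqrt{2\gamma}$, while $\theta$ follows the particle-averaged gradient with noise scale $\sqrt{2\gamma/N}$ (one reading of the "particular rescaling" the abstract mentions). The function name `ipla`, the step size, and the Gaussian toy model are hypothetical choices made for the example, not taken from the paper.

```python
import numpy as np


def ipla(grad_theta, grad_x, theta0, x0, step, n_steps, rng):
    """Euler-Maruyama sketch of an interacting particle Langevin scheme.

    The update form is an assumption made for illustration, not a
    transcription of the paper's algorithm.
    """
    theta = np.asarray(theta0, dtype=float).copy()
    x = np.asarray(x0, dtype=float).copy()  # shape (N, d_x): one row per particle
    n_particles = x.shape[0]
    for _ in range(n_steps):
        # Parameter drift: average of grad_theta U(theta, x_i) over the particles.
        drift_theta = grad_theta(theta, x).mean(axis=0)
        # Particle drift: grad_x U(theta, x_i), evaluated at the current theta.
        drift_x = grad_x(theta, x)
        noise_theta = rng.standard_normal(theta.shape)
        noise_x = rng.standard_normal(x.shape)
        # Both updates use the current (theta, x); theta noise is scaled by 1/sqrt(N).
        theta_next = (theta - step * drift_theta
                      + np.sqrt(2.0 * step / n_particles) * noise_theta)
        x = x - step * drift_x + np.sqrt(2.0 * step) * noise_x
        theta = theta_next
    return theta, x


# Hypothetical toy model: x ~ N(theta, 1) and y | x ~ N(x, 1).
# Then p_theta(y) = N(y; theta, 2), so the marginal likelihood is maximised at theta = y.
y_obs = 1.5


def grad_theta_u(theta, x):
    # U(theta, x) = (x - theta)^2 / 2 + (y - x)^2 / 2 + const
    return theta - x  # shape (N, 1) by broadcasting


def grad_x_u(theta, x):
    return 2.0 * x - theta - y_obs


rng = np.random.default_rng(0)
theta_hat, particles = ipla(
    grad_theta_u, grad_x_u,
    theta0=np.zeros(1), x0=np.zeros((500, 1)),
    step=0.05, n_steps=2000, rng=rng,
)
print(theta_hat)
```

In this toy model the final iterate should land near `y_obs`, with fluctuations that shrink as the number of particles grows, which mirrors the inverse-temperature role of $N$ described in the abstract.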

Paper Structure (32 sections, 168 equations, 3 figures, 1 algorithm)

Figures (3)

  • Figure 1: Performance of IPLA and PGD on the synthetic logistic regression problem. Each column corresponds to one component of the true parameter $\bar{\theta}^\star = [2,3,4]$. The top row corresponds to $N=10$, the middle row to $N=100$ and the bottom row to $N=500$. As expected, PGD and IPLA perform similarly, especially as $N$ increases.
  • Figure 2: Performance of IPLA, PGD, MFVI and SOUL on the logistic regression experiment on real data. The left plot corresponds to $N=100$, the middle plot to $N=500$ and the right plot to $N=1000$. As expected, PGD and IPLA perform similarly, especially as $N$ increases. SOUL is significantly slower than both PGD and IPLA.
  • Figure 3: Convergence rate of the variance of the parameter estimates produced by PGD and IPLA, computed over $100$ Monte Carlo runs for each $N \in \{10, 100, 1000, 10000\}$. The $\mathcal{O}(1/N)$ rate holds for the second moments, as suggested by our results for IPLA.

Theorems & Definitions (19)

  • Example 1
  • proof (9 entries)
  • ...and 9 more