On the Time Derivative of the KL Divergence for a Generalized Langevin Annealing Scheme
Andreas Habring
TL;DR
This work provides a rigorous derivation of the time derivative of the KL divergence between the law $q_t$ of a Langevin diffusion and a time-dependent target density $p_t$. Under regularity and dissipativity assumptions, the author proves that $t \mapsto \mathrm{KL}(q_t|p_t)$ is absolutely continuous, so that $\frac{d}{dt} \mathrm{KL}(q_t|p_t)$ exists a.e., and that it equals the sum of a Fisher-information-like term and a term involving $\partial_t \log p_t$, namely $\frac{d}{dt} \mathrm{KL}(q_t|p_t) = -\int q_t\,|\nabla \log \frac{q_t}{p_t}|^2\,dx - \int q_t\,\partial_t \log p_t\,dx$. The proof constructs mollified densities, analyzes the convergence of each mollified term, and passes to the limit to justify the formal calculation. The resulting identity applies to generalized Langevin annealing schemes.
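For orientation, the formal calculation that the rigorous argument justifies can be sketched as follows. This is only a heuristic sketch, writing $p_t$ for the target and assuming that $q_t$ and $p_t$ are smooth and positive, that differentiation under the integral is allowed, and that boundary terms vanish in the integration by parts. These are precisely the points a rigorous proof has to address.

```latex
% Fokker-Planck equation of the Langevin diffusion, rewritten in divergence form:
\partial_t q_t = -\nabla\cdot\big(q_t \nabla \log p_t\big) + \Delta q_t
               = \nabla\cdot\Big(q_t \nabla \log \frac{q_t}{p_t}\Big).

% Differentiate KL(q_t|p_t) = \int q_t \log(q_t/p_t)\,dx formally under the integral:
\frac{d}{dt}\mathrm{KL}(q_t|p_t)
  = \int \partial_t q_t \,\log\frac{q_t}{p_t}\,dx
  + \int q_t\,\partial_t \log q_t\,dx
  - \int q_t\,\partial_t \log p_t\,dx.

% First term: insert the Fokker-Planck equation and integrate by parts.
\int \partial_t q_t \,\log\frac{q_t}{p_t}\,dx
  = -\int q_t \,\Big|\nabla \log\frac{q_t}{p_t}\Big|^2 dx.

% Second term: mass is conserved.
\int q_t\,\partial_t \log q_t\,dx = \int \partial_t q_t\,dx
  = \frac{d}{dt}\int q_t\,dx = 0.

% Hence
\frac{d}{dt}\mathrm{KL}(q_t|p_t)
  = -\int q_t \,\Big|\nabla \log\frac{q_t}{p_t}\Big|^2 dx
    - \int q_t\,\partial_t \log p_t\,dx.
```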
Abstract
Consider the Langevin diffusion process $\mathrm{d} X_t = \nabla \log p_t(X_t)\,\mathrm{d} t + \sqrt{2}\,\mathrm{d} W_t$ guided by the time-dependent probability density $p_t(x)$. Let $q_t$ be the density of $X_t$. Recently, in order to analyze convergence in the Kullback-Leibler divergence, the time derivative of $t\mapsto \mathrm{KL}(q_t|p_t)$ has been used in several works without investigating in detail when such a derivative exists. In this short manuscript we provide a rigorous derivation of the quantity $\frac{\mathrm{d}}{\mathrm{d} t}\mathrm{KL}(q_t|p_t)$.
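As a sanity check of the identity $\frac{d}{dt}\mathrm{KL}(q_t|p_t) = -\int q_t\,|\nabla \log \frac{q_t}{p_t}|^2\,dx - \int q_t\,\partial_t \log p_t\,dx$, one can look at a one-dimensional Gaussian toy case where everything is available in closed form. The sketch below is an illustration and not part of the manuscript: the target $p_t = \mathcal{N}(m(t), s^2)$ with $m(t) = \sin t$ and $s^2 = 1$, the initial state, and all function names are assumptions chosen for the example. For a Gaussian initial law, $q_t = \mathcal{N}(\mu_t, v_t)$ stays Gaussian, with $\mu_t' = -(\mu_t - m(t))/s^2$ and $v_t' = 2 - 2 v_t/s^2$, so a finite-difference derivative of the KL divergence can be compared with the right-hand side of the identity.

```python
import math

S2 = 1.0  # variance s^2 of the target p_t = N(m(t), s^2) (assumption for this toy case)

def m(t):
    """Moving mean of the target (hypothetical choice)."""
    return math.sin(t)

def dm(t):
    """Time derivative of the target mean."""
    return math.cos(t)

def moment_ode(t, mu, v):
    """ODEs for the mean and variance of q_t under dX = -(X - m(t))/s^2 dt + sqrt(2) dW."""
    return (-(mu - m(t)) / S2, 2.0 - 2.0 * v / S2)

def rk4_step(t, mu, v, h):
    """One classical Runge-Kutta step (h may be negative to step backwards)."""
    k1 = moment_ode(t, mu, v)
    k2 = moment_ode(t + h / 2, mu + h / 2 * k1[0], v + h / 2 * k1[1])
    k3 = moment_ode(t + h / 2, mu + h / 2 * k2[0], v + h / 2 * k2[1])
    k4 = moment_ode(t + h, mu + h * k3[0], v + h * k3[1])
    mu_new = mu + h / 6 * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0])
    v_new = v + h / 6 * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1])
    return mu_new, v_new

def kl(t, mu, v):
    """Closed-form KL( N(mu, v) | N(m(t), s^2) )."""
    return 0.5 * math.log(S2 / v) + (v + (mu - m(t)) ** 2) / (2 * S2) - 0.5

def identity_rhs(t, mu, v):
    """Right-hand side of the identity, evaluated in closed form for Gaussians."""
    # int q |d/dx log(q/p)|^2 dx for q = N(mu, v), p = N(m(t), s^2)
    fisher = v * (1 / S2 - 1 / v) ** 2 + (mu - m(t)) ** 2 / S2 ** 2
    # int q d/dt log p_t dx = (mu - m(t)) m'(t) / s^2
    target_motion = (mu - m(t)) * dm(t) / S2
    return -fisher - target_motion

def check(t=0.3, mu=1.5, v=0.4, h=1e-4):
    """Compare a central finite difference of t -> KL(q_t|p_t) with the identity."""
    mu_plus, v_plus = rk4_step(t, mu, v, h)
    mu_minus, v_minus = rk4_step(t, mu, v, -h)
    lhs = (kl(t + h, mu_plus, v_plus) - kl(t - h, mu_minus, v_minus)) / (2 * h)
    return lhs, identity_rhs(t, mu, v)

if __name__ == "__main__":
    lhs, rhs = check()
    print(f"finite difference: {lhs:.8f}   identity: {rhs:.8f}")
```

In this smooth Gaussian setting the two numbers should agree up to discretization error. The example only illustrates the formula; it says nothing about the regularity questions that the manuscript addresses.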
