Efficient sampling of fast and slow cosmological parameters

Antony Lewis

TL;DR

The paper tackles the computational challenge of Bayesian cosmological parameter inference when many nuisance parameters are fast to evaluate compared with slow cosmological parameters. It introduces a Cholesky-based decorrelation to reorder parameters so that fast and slow components remain separable, enabling efficient full-space sampling; it also explores oversampling of fast parameters and Neal's dragging scheme to handle non-ideal dependencies. Through Planck-like experiments, the study demonstrates significant speedups and robustness, especially when covariance is unknown and adaptively learned. The work culminates in practical guidance for implementing fast-slow samplers in CosmoMC and highlights broader applicability to problems with pronounced speed hierarchies in likelihood evaluations.

Abstract

Physical parameters are often constrained from the data likelihoods using sampling methods. Changing some parameters can be much more computationally expensive ('slow') than changing other parameters ('fast parameters'). I describe a method for decorrelating fast and slow parameters so that parameter sampling in the full space becomes almost as efficient as sampling in the slow subspace when the covariance is well known and the distributions are simple. This gives a large reduction in computational cost when there are many fast parameters. The method can also be combined with a fast 'dragging' method proposed by Neal (2005) that can be more robust and efficient when parameters cannot be fully decorrelated a priori or have more complicated dependencies. I illustrate these methods for the case of cosmological parameter estimation using data likelihoods from the Planck satellite observations with dozens of fast nuisance parameters, and demonstrate a speed-up by a factor of five or more. In more complicated cases, especially where the fast subspace is very fast but complex or highly correlated, the fast-slow sampling methods can in principle give arbitrarily large performance gains. The new samplers are implemented in the latest version of the publicly available CosmoMC code.
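
As a rough illustration of the decorrelation idea, the sketch below assumes the parameters are ordered slow first and fast last, and that an estimate of the covariance is available. The function name `cholesky_directions` and the 3x3 covariance values are hypothetical, chosen only for illustration; this is not the CosmoMC implementation. Because the Cholesky factor is lower triangular, the proposal directions associated with fast parameters have zero components along the slow parameters, so a move along them leaves the slow parameters unchanged.

```python
import numpy as np

def cholesky_directions(cov):
    """Return lower-triangular L with cov = L @ L.T.

    Column j of L is the proposal direction for the j-th decorrelated
    coordinate z_j, where theta = theta0 + L @ z.
    """
    return np.linalg.cholesky(cov)

# Hypothetical covariance: parameter 0 is slow, parameters 1 and 2 are fast.
n_slow = 1
cov = np.array([[1.0, 0.8, 0.3],
                [0.8, 2.0, 0.5],
                [0.3, 0.5, 1.5]])

L = cholesky_directions(cov)
slow_dirs = L[:, :n_slow]   # these directions change slow AND fast parameters
fast_dirs = L[:, n_slow:]   # these directions change fast parameters only

# A move along the fast directions leaves the slow parameter untouched.
rng = np.random.default_rng(0)
theta = np.zeros(3)
z_fast = rng.standard_normal(fast_dirs.shape[1])
theta_fast_move = theta + fast_dirs @ z_fast
```

In this sketch a proposal built from `fast_dirs` only requires re-evaluating the fast part of the likelihood, while the coordinates `z` remain uncorrelated with unit variance whenever `cov` matches the true covariance.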

Paper Structure

This paper contains 14 sections, 8 equations, 4 figures, 1 table.

Figures (4)

  • Figure 1: Possible proposal directions (arrows) for sampling a correlated 2D distribution. Left: a choice of orthogonal eigenvector directions explores efficiently but requires changing both fast and slow parameters in both proposal directions; Centre: a choice that allows fast moves in the $x$-direction but is non-orthogonal; Right: by performing a linear shearing (parameter redefinition of the slow direction) the proposal distribution can be orthogonal and changes in the fast $x$ direction can remain fast.
  • Figure 2: Illustration of sampling from a correlated 2D distribution (contours, where the correlation is unknown a priori and hence not taken out by a variable decorrelation transformation). Here $x$ is a fast direction, $y$ is slow. Consider proposing a move $y\rightarrow y'$ as shown by the arrow in the top plot. Normally this would be immediately rejected with very high probability. Lower plot: the dragging method takes samples in the fast $x$ direction from a series of interpolating distributions [magenta lines] that interpolate between $P(x|y)$ [blue] and $P(x|y')$ [red]. This allows the fast samples to gradually explore the degeneracy direction, for example ending up at the red point in the upper plot. If there are enough interpolating steps the total move is accepted with probability similar to sampling from the marginalized distribution $P(x)$ [solid black line in the lower plot], which in general is not possible directly.
  • Figure 3: Indicative correlation length in units of slow parameter steps for Planck parameter estimation runs in the baseline six-parameter model with known covariance. Left shows Planck+WP ($N_{\rm fast}=13$), right shows Planck+WP+highL ($N_{\rm fast}=31$). Solid lines show the correlation length for one of the slow parameters, dashed lines show one of the fast parameters. Using the dragging method with $f_{\rm drag} \ge 2$ or Metropolis with $f_{\rm fast}\gtrsim 4$ reduces the fast-parameter correlation length to be similar to or smaller than the slow correlation length, improving convergence compared to the simplest fast-slow scheme (Metropolis with $f_{\rm fast}=1$). Actual total efficiency depends on the relative speed of the fast and slow steps.
  • Figure 4: Possible radial proposal distributions. An $n$-D Gaussian proposal distribution corresponds to choosing a random direction in parameter space and proposing a move by distance $r$ in that direction with probability $P_n(r)$; distributions with larger $n$ are more sharply peaked near one, and in particular very rarely propose much smaller or larger moves. The thick black line $P_{nf}(r)$ shows a mixture with fraction $f=\frac{2}{3}$ of $P_2(r)$ and fraction $1-f$ of an exponential distribution. This has much broader tails and does not go to zero at $r\sim 0$, and is the proposal distribution used by CosmoMC. It is much more robust to covariance matrix misestimation than a high $n$-D Gaussian proposal distribution (though slightly less optimal in the ideal case).
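
The mixture radial proposal described in the Figure 4 caption could be sampled along the following lines. This is a minimal sketch, not the CosmoMC code: the function name `sample_radius` is hypothetical, and the normalisations are assumptions (the $P_2(r)$ component is taken as the length of a 2D standard normal vector divided by $\sqrt{2}$, so that its rms is one, and the exponential component is given unit mean).

```python
import numpy as np

def sample_radius(rng, size, f=2.0 / 3.0):
    """Draw proposal distances r from a mixture of two components.

    With probability f: radius of a 2D Gaussian, scaled to unit rms
    (assumed normalisation). With probability 1 - f: exponential with
    unit mean (assumed normalisation).
    """
    gauss_r = np.sqrt(np.sum(rng.standard_normal((size, 2)) ** 2, axis=1) / 2.0)
    exp_r = rng.exponential(scale=1.0, size=size)
    use_gauss = rng.random(size) < f
    return np.where(use_gauss, gauss_r, exp_r)

rng = np.random.default_rng(0)
radii = sample_radius(rng, 100_000)

# Fraction of very short and very long proposed moves; the exponential
# component is what keeps both of these from being vanishingly small.
frac_short = np.mean(radii < 0.1)
frac_long = np.mean(radii > 3.0)
```

A proposed move would then be `r` times a random unit direction in the decorrelated parameter space. The exponential component supplies both the non-zero density near $r\sim 0$ and the broad tail that the caption credits for robustness against a misestimated covariance.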