On the impact of observation error correlations in data assimilation, with application to along-track altimeter data

Olivier Goux, Anthony Weaver, Selime Gürol, Oliver Guillet, Andrea Piacentini

TL;DR

The paper investigates how observation-error correlations affect data assimilation, using a theoretical Fourier framework and a realistic ocean 3D-Var setup with along-track altimeter data. It demonstrates that non-diagonal, diffusion-operator–based representations of the observation-error covariance $\mathbf{R}$ can yield substantial improvements, particularly for small-scale information and velocity fields, compared with variance inflation of a diagonal $\mathbf{R}$. When observation errors have short correlation lengths, inflation can mitigate large-scale overfitting but at the expense of small-scale detail; for long correlation lengths, explicit non-diagonal $\mathbf{R}$ is essential to avoid over-suppressing meaningful gradients. The study provides practical methods to construct, normalise, and stabilise diffusion-based $\mathbf{R}$ in operational DA systems and highlights the impact on convergence, conditioning, and analysis accuracy, especially for SSH and geostrophic velocities.

Abstract

Data assimilation involves estimating the state of a system by combining observations from various sources with a background estimate of the state. The weights given to the observations and background state depend on their specified error covariance matrices. Observation errors are often assumed to be uncorrelated even though this assumption is inaccurate for many modern datasets such as those from satellite observing systems. As methods allowing for a more realistic representation of observation-error correlations are emerging, our aim in this article is to provide insight on their expected impact in data assimilation. First, we use a simple idealised system to analyse the effect of observation-error correlations on the spectral characteristics of the solution. Next, we assess the relevance of these results in a more realistic setting in which simulated along-track (nadir) altimeter observations with correlated errors are assimilated in a global ocean model using a three-dimensional variational assimilation (3D-Var) method. Correlated observation errors are modelled in the 3D-Var system using a diffusion operator. When the correlation length scale of observation error is small compared to that of background error, inflating the observation-error variances can mitigate most of the negative effects from neglecting the observation-error correlations. Accounting for observation-error correlations in this situation still outperforms variance inflation since it allows small-scale information in the observations to be more effectively extracted and does not affect the convergence of the minimization. Conversely, when the correlation length scale of observation error is large compared to that of background error, the effect of observation-error correlations cannot be properly approximated with variance inflation. However, the correlation model needs to be constructed carefully to ensure the minimization problem is adequately conditioned so that a robust solution can be obtained. Practical ways to achieve this are discussed.
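
The spectral argument in the abstract can be illustrated with a minimal sketch. It assumes a periodic 1D domain with direct observations at every grid point, so that all covariance matrices are circulant and are diagonalised by the Fourier basis. The grid spacing (25 km), domain size (40000 km), length scales (210 km and 150 km) and inflation factor (2.5) are taken from the figure captions. The function names, the Gaussian form $\exp(-r^2/2\rho^2)$ and the SOAR form $(1+r/\rho)\exp(-r/\rho)$ are assumptions made for illustration and may not match the paper's exact length-scale conventions.

```python
import numpy as np

def circulant_spectrum(corr_fn, n, h):
    """Eigenvalues of the circulant correlation matrix on a periodic grid.

    corr_fn maps separation distance (km) to correlation; n is the number
    of grid points and h the grid spacing (km).
    """
    idx = np.arange(n)
    r = h * np.minimum(idx, n - idx)      # shortest periodic separation
    lam = np.fft.fft(corr_fn(r)).real     # first row is symmetric, so the FFT is real
    return np.maximum(lam, 0.0)           # clip negative round-off values

def analysis_variance(b, o_true, o_assumed):
    """Per-scale analysis-error variance when the gain is built from o_assumed."""
    gain = b / (b + o_assumed)
    return (1.0 - gain) ** 2 * b + gain ** 2 * o_true

h, n = 25.0, 1600                         # 25 km spacing, 40000 km domain
rho_b, rho_o, alpha = 210.0, 150.0, 2.5   # values quoted in the figure captions

# Unit error variances are assumed, so the spectra below are correlation spectra.
b = circulant_spectrum(lambda r: np.exp(-0.5 * (r / rho_b) ** 2), n, h)
o = circulant_spectrum(lambda r: (1.0 + r / rho_o) * np.exp(-r / rho_o), n, h)
o_diag = np.ones(n)                       # correlations neglected: identity matrix
o_infl = alpha ** 2 * o_diag              # inflated diagonal: alpha^2 times identity

a_opt = analysis_variance(b, o, o)        # gain built from the true spectrum
a_diag = analysis_variance(b, o, o_diag)  # gain built from the diagonal matrix
a_infl = analysis_variance(b, o, o_infl)  # gain built from the inflated diagonal

# The mean eigenvalue equals the grid-point variance of a circulant covariance.
for name, a in [("true R", a_opt), ("diagonal", a_diag), ("inflated", a_infl)]:
    print(f"{name:>9s}: mean analysis-error variance = {a.mean():.4f}")
```

Dividing the per-scale arrays, for example `a_diag / a_opt`, gives a ratio of sub-optimal to optimal analysis-error variance in the spirit of the quantity described in the caption of Figure 2. The ratio is never below one because the gain built from the true spectrum minimises the analysis-error variance at every scale.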

Paper Structure

This paper contains 23 sections, 66 equations, 11 figures, 1 algorithm.

Figures (11)

  • Figure 1: (a) Scale-dependent variance of the background error ($\sigma_{\rm b}^2\lambda_{\rm b}^{(k)}$), observation error ($\sigma_{\rm o}^2\lambda_{\rm o}^{(k)}$), and analysis error ($\sigma_{\rm a}^2\lambda_{\rm a}^{(k)}$; Equation \ref{eq_analysis_error_spectral}), as a function of spatial scale. (b) Scale-dependent sensitivity ($\lambda_{\rm s}^{(k)}$; Equation \ref{eq:lambda_sk}) as a function of spatial scale. In this idealised 1D problem, the domain size is 40000 km; the background-error correlation function is approximately Gaussian with a length scale of 200 km; the observation errors are uncorrelated; and the background- and observation-error variances are both equal to one (see text for further description). The spatial scale is computed as $ph/k$; i.e., the inverse of the non-dimensional wavenumber $k/p$ from Equation \ref{eq_def_Fm} multiplied by the distance between observations ($h = 25$ km). The scale on the vertical axis has been cut off at $10^{-3}$ (the minimum background-error variance reaches $10^{-13}$ for the smallest resolved spatial scale).
  • Figure 2: Ratio of the practical scale-dependent analysis-error variance to the optimal scale-dependent analysis-error variance ($\eta_{\rm \widetilde{a}, a}^{(k)}$; see Equation \ref{eq:etaaa}) where the former is associated with a misspecification of the observation-error variance at a given spatial scale ($\eta_{\rm \widetilde{o}, o}^{(k)}\neq 1$). The black cross marks the point where $\eta_{\rm o, b}^{(k)} = 10^{-4}$ and $\eta_{\rm \widetilde{o}, o}^{(k)}=10^2$ (referenced later in the main text).
  • Figure 3: AR correlation functions (panels (a) and (c)) and their respective Fourier transforms (panels (b) and (d)). In panels (a) and (b), $\rho$ is fixed to a value of 450 km and curves are displayed for different values of $m$. In panels (c) and (d), $m$ is fixed to a value of 2 and curves are displayed for different values of $\rho$.
  • Figure 4: Panels (a) and (c): scale-dependent variance of the background error ($\sigma_{\rm b}^2\lambda_{\rm b}^{(k)}$), observation error ($\sigma_{\rm o}^2\lambda_{\rm o}^{(k)}$) and its misspecified counterpart ($\widetilde{\sigma}_{\rm o}^2\widetilde{\lambda}_{\rm o}^{(k)}$), and analysis error ($\sigma_{\rm a}^2\lambda_{\rm a}^{(k)}$; Equation \ref{eq_analysis_error_spectral}) and its sub-optimal counterpart ($\widetilde{\sigma}_{\rm a}^2\widetilde{\lambda}_{\rm a}^{(k)}$; Equation \ref{eq_analysis_error_spectral_subopt}) as a function of spatial scale. Panels (b) and (d): scale-dependent sensitivity ($\lambda_{\rm s}^{(k)}$; Equation \ref{eq:lambda_sk}) and its sub-optimal counterpart ($\widetilde{\lambda}_{\rm s}^{(k)}$; Equation \ref{eq_sensitivity_spectral_subopt}) as a function of spatial scale. The experiment is as in Figure \ref{fig_scale_decomp_1}. The background-error correlation function is approximately Gaussian with a length scale $\rho_{\rm b}=210$ km; the 'true' observation-error correlation function is a SOAR function with a length scale $\rho_{\rm o}=150$ km; the misspecified observation-error covariance function has no correlation, i.e., $\widetilde{\mathbf{R}}=\sigma_{\rm o}^2\mathbf{I}$ in panels (a) and (b), while $\widetilde{\mathbf{R}}=\alpha^2\sigma_{\rm o}^2\mathbf{I}$ in panels (c) and (d) where $\alpha=2.5$ is an optimally derived inflation factor that minimizes the analysis-error variance.
  • Figure 5: RMS of the analysis error at each iteration of the inner loop, for each variable (SSH: sea surface height, T: temperature, S: salinity, U: zonal current velocity, V: meridional current velocity). The values have been normalised by the RMS of the background error at each iteration. The background-error correlation function is approximately Gaussian ($m_{\rm b}=10$) where the length scale $\rho_{\rm b}$ varies between 210 km and 420 km. The true observation-error correlation function is a SOAR function ($m_{\rm o}=2$) with a length scale $\rho_{\rm o} =150$ km. Solid lines correspond to the mean of 100 samples of the RMS of the analysis error derived from different realisations of the random innovation vector (Equation \ref{eq_innov_error}). Filled-in areas indicate the spread between the 25% and 75% quantiles of the samples. (a) $\widetilde{\mathbf{R}}$ is specified as a diagonal matrix; (b) $\widetilde{\mathbf{R}}$ is specified as a diagonal matrix with optimal inflation; and (c) $\widetilde{\mathbf{R}} = \mathbf{R}$, the true observation-error covariance matrix. For panel (b), the optimal inflation factor is $\alpha = 2.5$.
  • ...and 6 more figures