Probabilistic machine learning of relaxation time distributions in spectral induced polarization
Charles L. Bérubé, Sébastien Gagnon, Lahiru M. A. Nagasingha, Jean-Luc Gagnon, E. Rachel Kenko, Reza Ghanati, Frédérique Baron
Abstract
Debye decomposition methods are widely used to interpret spectral induced polarization (SIP) data and to recover the relaxation time distribution (RTD) of geomaterials. However, SIP interpretation remains challenging for heterogeneous data sets because conventional decomposition methods treat each spectrum independently and provide limited uncertainty quantification. A probabilistic machine learning method is introduced to infer continuous RTDs directly from complex resistivity spectra, using a combined laboratory and field data set comprising 140 SIP measurements of granular mixtures, rock cores, field surveys, and cementitious materials. The approach relies on a conditional variational autoencoder (CVAE) that performs decomposition at the data set level and learns a shared inverse mapping from complex resistivity spectra to probabilistic RTDs expressed as Gaussian mixtures. The CVAE reproduces measured spectra with global errors below 0.53% and 0.45% over the full frequency range for the real and imaginary components, respectively. Dominant relaxation modes are recovered consistently, and the total chargeability and the mean relaxation time correlate strongly with polarizable grain content and grain size, respectively, with coefficients of determination up to 0.95. Jacobian-based sensitivity analysis shows that the placement, width, and relative weighting of relaxation modes account for approximately 89% of the decomposition process. In contrast, the total chargeability accounts for 10% and the resistivity scaling parameter for less than 1%. Latent variables learned by the CVAE organize SIP data into a structured space where sample populations naturally cluster without supervision. Compared with the chargeability and relaxation time domain, a two-dimensional projection of the latent variables improves the Davies-Bouldin clustering index by nearly a factor of three.
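To make the quantities named above concrete, the following minimal sketch shows the forward relationship that a Debye decomposition inverts: a Gaussian-mixture RTD over log relaxation time is discretized into a superposition of Debye terms and mapped to a complex resistivity spectrum. It is not the CVAE described in the abstract. The function names, the relaxation time grid, and all parameter values are assumptions chosen for illustration, and the model uses one common Debye parameterization that may differ from the one adopted in the paper.

```python
import numpy as np


def gaussian_mixture_rtd(log_tau, weights, means, widths):
    """Gaussian-mixture density over log10(tau), evaluated on a grid."""
    s = np.asarray(log_tau, dtype=float)[:, None]
    w = np.asarray(weights, dtype=float)
    mu = np.asarray(means, dtype=float)
    sd = np.asarray(widths, dtype=float)
    pdf = np.exp(-0.5 * ((s - mu) / sd) ** 2) / (sd * np.sqrt(2.0 * np.pi))
    return pdf @ (w / w.sum())  # mixture weights are normalized to sum to 1


def debye_forward(freqs, rho0, total_m, log_tau, rtd):
    """Complex resistivity from a discretized superposition of Debye terms."""
    omega = 2.0 * np.pi * np.asarray(freqs, dtype=float)[:, None]
    tau = 10.0 ** np.asarray(log_tau, dtype=float)[None, :]
    # Per-bin chargeabilities, rescaled so that they sum to total_m.
    m_k = total_m * rtd / rtd.sum()
    debye = 1.0 - 1.0 / (1.0 + 1j * omega * tau)
    return rho0 * (1.0 - debye @ m_k)


# Hypothetical values, chosen only for illustration (not from the data set).
log_tau = np.linspace(-6.0, 3.0, 200)
rtd = gaussian_mixture_rtd(
    log_tau, weights=[0.7, 0.3], means=[-3.0, 0.0], widths=[0.4, 0.6]
)
freqs = np.logspace(-3, 5, 40)  # Hz
rho = debye_forward(freqs, rho0=100.0, total_m=0.2, log_tau=log_tau, rtd=rtd)
```

In this parameterization the spectrum tends to `rho0` at low frequency and to `rho0 * (1 - total_m)` at high frequency, so `rho0` acts as the resistivity scaling parameter and `total_m` as the total chargeability, while the mixture means, widths, and weights set the placement, width, and relative weighting of the relaxation modes.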
