
Well log data generation and imputation using sequence-based generative adversarial networks

Abdulrahman Al-Fakih, A. Koeshidayatullah, Tapan Mukerji, Sadam Al-Azani, SanLinn I. Kaka

TL;DR

This work tackles the challenge of gaps and uncertainties in well log data by proposing a dual-GAN framework that combines Time Series GAN (TSGAN) for synthetic data generation and Sequence GAN (SeqGAN) for imputation of missing values. The approach is evaluated on North Sea LAS datasets, with comparisons to BRITS and NAOMI showing strong performance in both data synthesis ($R^2 \approx 0.92$) and sequential imputation. The study validates fidelity between real and synthetic data statistically and visually, using KS tests, Pearson correlations, KL divergences, and PCA/t-SNE visualizations. Overall, the dual framework enhances data completeness and reliability in geosciences, offering a practical pathway to improved reservoir characterization in data-sparse subsurface environments.
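
The statistical checks named above can be illustrated with a minimal, standard-library sketch. It is not the authors' code: the function names, the histogram binning used for the KL divergence, and the smoothing constant `eps` are all assumptions made for illustration, and the summary does not say how the paper applies each test to the individual logs.

```python
import math
from bisect import bisect_right
from typing import Sequence


def ks_statistic(a: Sequence[float], b: Sequence[float]) -> float:
    """Two-sample KS statistic: largest gap between the two empirical CDFs."""
    sa, sb = sorted(a), sorted(b)
    # The maximum gap is attained at one of the observed sample values.
    return max(
        abs(bisect_right(sa, x) / len(sa) - bisect_right(sb, x) / len(sb))
        for x in sa + sb
    )


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((u - mx) * (v - my) for u, v in zip(x, y))
    sx = math.sqrt(sum((u - mx) ** 2 for u in x))
    sy = math.sqrt(sum((v - my) ** 2 for v in y))
    return cov / (sx * sy)


def kl_divergence(real: Sequence[float], synth: Sequence[float],
                  bins: int = 10, eps: float = 1e-9) -> float:
    """KL(real || synth) estimated from histograms on a shared range.

    The bin count and eps smoothing are illustrative assumptions.
    """
    lo = min(min(real), min(synth))
    hi = max(max(real), max(synth))
    width = (hi - lo) / bins or 1.0  # guard against a zero-width range

    def hist(values: Sequence[float]) -> list:
        counts = [0] * bins
        for v in values:
            counts[min(int((v - lo) / width), bins - 1)] += 1
        smoothed = [c / len(values) + eps for c in counts]
        total = sum(smoothed)
        return [s / total for s in smoothed]  # renormalise after smoothing

    p, q = hist(real), hist(synth)
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))
```

A KS statistic and KL divergence near zero, together with a Pearson correlation near one, would indicate that a synthetic log closely tracks the real one. A production analysis would more likely rely on an established statistics library, which also reports p-values.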

Abstract

Well log analysis is crucial for hydrocarbon exploration, providing detailed insights into subsurface geological formations. However, gaps and inaccuracies in well log data, often due to equipment limitations, operational challenges, and harsh subsurface conditions, can introduce significant uncertainties in reservoir evaluation. Addressing these challenges requires effective methods for both synthetic data generation and precise imputation of missing data, ensuring data completeness and reliability. This study introduces a novel framework utilizing sequence-based generative adversarial networks (GANs) specifically designed for well log data generation and imputation. The framework integrates two distinct sequence-based GAN models: Time Series GAN (TSGAN) for generating synthetic well log data and Sequence GAN (SeqGAN) for imputing missing data. Both models were tested on a dataset from the North Sea, Netherlands region, focusing on different sections of 5, 10, and 50 data points. Experimental results demonstrate that this approach achieves superior accuracy in filling data gaps compared to other deep learning models for spatial series analysis. The method yielded $R^2$ values of 0.921, 0.899, and 0.594, with corresponding mean absolute percentage error (MAPE) values of 8.320, 0.005, and 151.154, and mean absolute error (MAE) values of 0.012, 0.005, and 0.032, respectively. These results set a new benchmark for data integrity and utility in geosciences, particularly in well log data analysis.
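
For readers unfamiliar with the three reported metrics, the sketch below shows their standard textbook definitions in plain Python. The function names and toy values are illustrative assumptions, not taken from the paper, and the abstract does not state whether MAPE is reported as a percentage or a fraction; the sketch uses percent.

```python
from typing import Sequence


def mae(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Mean absolute error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def mape(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Mean absolute percentage error, in percent (an assumed convention)."""
    if any(t == 0 for t in y_true):
        raise ValueError("MAPE is undefined when a true value is zero")
    return 100.0 * sum(abs((t - p) / t) for t, p in zip(y_true, y_pred)) / len(y_true)


def r2(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Toy values, chosen only to make the arithmetic easy to follow.
true_log = [2.0, 4.0, 5.0, 8.0]
imputed_log = [2.5, 3.5, 5.0, 7.0]

print(mae(true_log, imputed_log))   # 0.5
print(mape(true_log, imputed_log))  # 12.5
print(r2(true_log, imputed_log))    # approximately 0.92
```

Because MAPE divides each error by the true value, it grows quickly when true values are close to zero. This is one general reason a MAPE can be large while the MAE stays small.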

Paper Structure

This paper contains 35 sections, 3 equations, 24 figures, 5 tables.

Figures (24)

  • Figure 1: High-level architecture of the proposed framework.
  • Figure 2: TSGAN workflow for synthetic data generation.
  • Figure 3: TSGAN architecture.
  • Figure 4: Workflow for log imputation.
  • Figure 5: SeqGAN architecture.
  • ...and 19 more figures