Effective Data Selection for Seismic Interpretation through Disagreement

Ryan Benkert; Mohit Prabhushankar; Ghassan AlRegib

TL;DR

The paper tackles the data-annotation bottleneck in seismic interpretation by proposing ATLAS, a plug-in, representation-shift–based, spatially aware active-learning framework. It formalizes interpretation disagreement as the representation shift $\Delta h(\mathbf{x})$ between two model instantiations and uses this shift to mask the input and focus training-data selection via $\phi(\mathbf{x}, h_w) = \mathbf{x} * \Delta h(\mathbf{x})$. ATLAS integrates with common acquisition strategies and improves generalization on two seismic surveys, with gains of up to 12% in mean IoU, across both simple and complex structural regions. The approach improves sampling efficiency by targeting geophysically interesting regions, offering a practical workflow for expensive seismic labeling with potential applicability to broader geoscience interpretation tasks.
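As a rough illustration of the masking step, the sketch below computes a disagreement map between the predictions of two model instantiations and applies it to a seismic section. It is a minimal sketch under stated assumptions, not the authors' implementation: the function names, the use of per-pixel argmax predictions as the representation, the binary form of $\Delta h$, and the reading of `*` as an element-wise product are all assumed for illustration.

```python
import numpy as np

def representation_shift(logits_prev, logits_curr):
    """Binary disagreement map (an assumed form of Delta h).

    Both inputs have shape (classes, height, width) and come from two
    model instantiations evaluated on the same seismic section.
    """
    pred_prev = np.argmax(logits_prev, axis=0)
    pred_curr = np.argmax(logits_curr, axis=0)
    # 1.0 where the two instantiations predict different classes, else 0.0
    return (pred_prev != pred_curr).astype(np.float32)

def masked_input(section, delta_h):
    """phi(x, h_w) = x * delta_h, read here as an element-wise product."""
    return section * delta_h

# Toy 2x3 section with two classes (hypothetical data).
section = np.arange(1, 7, dtype=np.float32).reshape(2, 3)

logits_prev = np.zeros((2, 2, 3))
logits_prev[0] = 1.0               # previous model: class 0 everywhere

logits_curr = logits_prev.copy()
logits_curr[1, 0, 1] = 2.0         # current model flips pixel (0, 1) to class 1

delta_h = representation_shift(logits_prev, logits_curr)
phi = masked_input(section, delta_h)
print(delta_h)
print(phi)
```

Only the pixel where the two instantiations disagree survives the mask, so a downstream acquisition function would score that region alone.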

Abstract

This paper discusses data selection for deep learning in seismic interpretation. To generalize robustly to the target volume, it is crucial to identify which samples are the most informative to the training process. The selection of the training set from a target volume is therefore a critical factor in how effectively a deep learning algorithm interprets seismic volumes. This paper proposes interpretation disagreement as a valuable and intuitive factor in training set selection. The resulting data selection framework is inspired by established practices in seismic interpretation. It uses representation shifts to model interpretation disagreement within neural networks, and it incorporates the disagreement measure to focus attention on geologically interesting regions throughout the data selection workflow. Combining this approach with active learning, a well-known machine learning paradigm for data selection, yields a comprehensive framework for training set selection in seismic interpretation. We also provide a specific implementation of the framework, named ATLAS. Comprehensive experiments indicate that ATLAS consistently outperforms traditional active learning frameworks in seismic interpretation, with improvements of up to 12% in mean intersection-over-union.
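For readers unfamiliar with the reported metric, the following sketch shows one common way to compute mean intersection-over-union for a predicted label map. It is illustrative only: the function name `mean_iou`, the toy label maps, and the convention of skipping classes absent from both maps are assumptions, and the paper's exact evaluation protocol is not given in this summary.

```python
import numpy as np

def mean_iou(pred, target, num_classes):
    """Average the per-class intersection-over-union of two label maps."""
    ious = []
    for c in range(num_classes):
        pred_c = pred == c
        target_c = target == c
        union = np.logical_or(pred_c, target_c).sum()
        if union == 0:
            # Class absent from both maps: skipped (assumed convention).
            continue
        intersection = np.logical_and(pred_c, target_c).sum()
        ious.append(intersection / union)
    return float(np.mean(ious)) if ious else float("nan")

# Toy label maps with three classes (hypothetical data).
target = np.array([[0, 0, 1],
                   [1, 2, 2]])
pred = np.array([[0, 1, 1],
                 [1, 2, 0]])

score = mean_iou(pred, target, num_classes=3)
print(score)
```

In this toy case the per-class IoUs are 1/3, 2/3, and 1/2, which average to 0.5.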

Paper Structure (23 sections, 8 equations, 10 figures, 2 tables)

Introduction
Related Work
Sampling Efficiency in Seismic Interpretation
Active Learning
Deep Feature Extraction for Seismic Interpretation
Background
Notation and Problem Setup
Active Learning
Methodology
Representation Shifts
Spatially Aware Active Learning
Active Transfer Learning for Attention Sensitivity
Experiments
Experimental Setup
Numerical Experiments
...and 8 more sections

Figures (10)

Figure 1: High-level overview of our measure of information content in seismic interpretation. a) information content for manual interpretation. Two different interpreters annotate the same seismic data section. Information content is measured by the disagreement between both interpretations $\Delta h$. b) our measure of information content with deep neural networks. Information content is measured by the disagreement or shift between representations of the same seismic section $\Delta h$.
Figure 2: Toy example of an acquisition batch selection with an accurate representation (left) and a representation suffering from feature collapse (right). The collapsed representation results in significant randomization within the selection algorithm.
Figure 3: Toy example of a representation shift under feature collapse applied to a binary classification problem. Left: the model representation of $h_{w_1}(\mathbf{x})$. Right: the model representation of $h_{w_2}(\mathbf{x})$.
Figure 4: a) high-level workflow of our plug-in active learning framework, ATLAS. For each unlabeled section, we extract the predictions from both the current and the previous round and derive the prediction difference. Subsequently, we filter for regions where the predictions conflict and pass only these disagreeing regions to the active learning algorithm. b) spatially aware active learning workflow. The active learning framework is modified by introducing a disagreement filter that restricts the input of the active learning algorithm to regions of geological interest.
Figure 5: Training and test split of both seismic datasets. a) The F3 dataset located in the Netherlands. We use the split of Alaudah et al. (2019) with one contiguous training volume and two test volumes. b) The Parihaka volume from New Zealand. We split the volume into one contiguous training volume and one contiguous test volume.
...and 5 more figures
