Operator Learning for Consolidation: An Architectural Comparison for DeepONet Variants

Yongjin Choi, Chenying Liu, Jorge Macedo

Abstract

Deep Operator Networks (DeepONets) have emerged as a powerful surrogate modeling framework for learning solution operators in PDE-governed systems. While their use is expanding across engineering disciplines, applications in geotechnical engineering remain limited. This study systematically evaluates several DeepONet architectures for the consolidation problem. We initially consider three architectures: two standard DeepONet configurations with the coefficient of consolidation embedded in the branch net (Models 1 and 2), and a physics-inspired architecture with the coefficient embedded in the trunk net (Model 3). Results show that Model 3 outperforms Models 1 and 2 but still has limitations when the target solution (excess pore pressure) exhibits significant variation. To overcome this limitation, we propose a Trunknet Fourier feature-enhanced DeepONet (Model 4), which captures rapidly varying functions. We further extend Model 4 to 3D scenarios. Although the computational speedup over traditional solvers can be modest in the 1D case (1.5-100x), it becomes more pronounced in 3D, reaching approximately 1,000x. Leveraging this efficiency, we offer a conceptual demonstration of DeepONet's potential to accelerate uncertainty quantification in a 3D consolidation problem. Overall, the study highlights the potential of DeepONets to enable efficient, generalizable surrogate modeling in geotechnical applications and advances the integration of scientific machine learning in geotechnics, which is still at an early stage.

Paper Structure

This paper contains 14 sections, 19 equations, 18 figures, 3 tables.

Figures (18)

  • Figure 1: A standard DeepONet architecture.
  • Figure 2: (a) Training and Validation Data Sampling Method. Input functions (initial excess pore water pressure (PWP)) are sampled at $m=100$ fixed equally spaced locations, denoted by $\{x_\ell \}_{\ell=1}^{100}$ (see left side figure). For each given input function $u^{(i)}$, the corresponding solution is sampled at $P=100$ randomly chosen output evaluation points, denoted by $\{y_j^{(i)}\}_{j=1}^{100}$ (the red markers on the right side figure). This results in 100 training examples of the form $(u^{(i)}, y_j^{(i)}, \mathcal{G}(u^{(i)})(y_j^{(i)}))$ for each input function. For the training data, we generate $N=40,000$ input functions, forming a total of $40,000 \times 100$ training examples. For validation, we generate $N=5,000$ input functions, forming $5,000 \times 100$ validation examples. (b) Test Method. Input functions unseen during training are sampled at $m=100$ fixed equally spaced locations, $\{x_\ell \}_{\ell=1}^{100}$ (see left side figure). For each given input function $u^{(i)}$, we evaluate the learned solution operator $\mathcal{G}_{\Theta}(u^{(i)})(y)$ at a dense $100 \times 100$ uniformly spaced grid of $y$ points across the output domain (see right side figure).
  • Figure 3: DeepONet with direct concatenation of $C_v$ to the branch net.
  • Figure 4: DeepONet with auxiliary networks for embedding $u(z,0)$ and $C_v$ and merging their representations.
  • Figure 5: DeepONet with trunk net modulation by $C_v$.
  • ...and 13 more figures
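
The sampling scheme described in the Figure 2 caption can be sketched roughly as follows. This is a minimal illustration of the data layout only, not the authors' code. The function names (`build_triplets`, `toy_solver`), the unit output domain in $(z, t)$, and the random input functions are all assumptions made for the example; the toy solver is a placeholder for a real consolidation solver. The counts $m=100$ and $P=100$ follow the caption, while $N$ is kept small so the sketch runs quickly.

```python
import numpy as np

def toy_solver(u, y):
    # Hypothetical stand-in for a reference consolidation solver.
    # u: (m,) input function at the fixed sensors; y: (P, 2) points (z, t).
    # Returns a made-up smooth field, NOT a real consolidation solution.
    z, t = y[:, 0], y[:, 1]
    return u.mean() * np.exp(-t) * np.sin(np.pi * z)

def build_triplets(u_samples, solve, y_low, y_high, P, rng):
    # u_samples: (N, m) input functions sampled at m fixed sensor locations.
    # For each input function, draw P random output points and evaluate the
    # solution there, giving N * P examples (u, y, G(u)(y)).
    N, m = u_samples.shape
    d = len(y_low)
    branch_in = np.repeat(u_samples, P, axis=0)              # (N*P, m)
    trunk_in = rng.uniform(y_low, y_high, size=(N * P, d))   # (N*P, d)
    targets = np.empty(N * P)
    for i in range(N):
        rows = slice(i * P, (i + 1) * P)
        targets[rows] = solve(u_samples[i], trunk_in[rows])
    return branch_in, trunk_in, targets

rng = np.random.default_rng(0)
N, m, P = 4, 100, 100                    # the caption uses N = 40,000 for training
sensors = np.linspace(0.0, 1.0, m)       # fixed, equally spaced locations x_l
u_samples = rng.uniform(0.0, 1.0, size=(N, m))   # assumed random input functions

branch_in, trunk_in, targets = build_triplets(
    u_samples, toy_solver, y_low=[0.0, 0.0], y_high=[1.0, 1.0], P=P, rng=rng
)

# Test-time evaluation: a dense 100 x 100 uniform grid over the output domain.
zz, tt = np.meshgrid(np.linspace(0.0, 1.0, 100), np.linspace(0.0, 1.0, 100))
test_grid = np.stack([zz.ravel(), tt.ravel()], axis=1)       # (10000, 2)
```

Each input function is repeated `P` times in `branch_in` so that every row pairs one sensor vector with one output point, which is the triplet form given in the caption.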