Deep Learning for Melt Pool Depth Contour Prediction From Surface Thermal Images via Vision Transformers

Francis Ogoke, Peter Myung-Won Pak, Alexander Myers, Guadalupe Quirarte, Jack Beuth, Jonathan Malen, Amir Barati Farimani

TL;DR

The paper tackles predicting the subsurface melt pool contour in Laser Powder Bed Fusion directly from in-situ surface two-color thermal images. It introduces a hybrid CNN-Transformer pipeline that uses a ResNet backbone to encode spatial features and a temporal Transformer to capture long-range sequence dynamics, outputting a 64×64 truncated signed distance function contour of the melt pool. Evaluations on experimental data show contour IoU around 0.76–0.77 and depth/area correlations up to $R^2 \approx 0.88$, with ratiometric temperature inputs yielding improved IoU over monochrome images. Transfer learning from FLOW-3D and Eagar-Tsai simulations substantially reduces the required labeled data while maintaining predictive accuracy, enabling potential in-situ defect detection and process control in L-PBF.
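The two quantities named above, a truncated signed distance function (TSDF) target and contour IoU, can be illustrated with a small sketch. This is not the authors' implementation: the half-ellipse melt pool shape, the 16×16 grid (smaller than the paper's 64×64 so that the brute-force search stays fast), the truncation value, the sign convention (negative inside), and all function names are assumptions made for illustration.

```python
import math

def half_ellipse_mask(size, half_width, depth):
    """Binary cross-section mask; row 0 is the surface (hypothetical shape)."""
    cx = (size - 1) / 2
    return [[((c - cx) / half_width) ** 2 + (r / depth) ** 2 <= 1.0
             for c in range(size)] for r in range(size)]

def truncated_sdf(mask, trunc):
    """Brute-force TSDF: distance to the nearest cell of the opposite class,
    negative inside the mask (assumed convention), clipped to +/- trunc."""
    h, w = len(mask), len(mask[0])
    inside = [(r, c) for r in range(h) for c in range(w) if mask[r][c]]
    outside = [(r, c) for r in range(h) for c in range(w) if not mask[r][c]]
    tsdf = [[0.0] * w for _ in range(h)]
    for r in range(h):
        for c in range(w):
            targets = outside if mask[r][c] else inside
            d = min(math.hypot(r - tr, c - tc) for tr, tc in targets)
            sign = -1.0 if mask[r][c] else 1.0
            tsdf[r][c] = sign * min(d, trunc)
    return tsdf

def iou(mask_a, mask_b):
    """Intersection over union of two binary masks of equal shape."""
    inter = union = 0
    for row_a, row_b in zip(mask_a, mask_b):
        for a, b in zip(row_a, row_b):
            inter += int(a and b)
            union += int(a or b)
    return inter / union if union else 1.0

truth = half_ellipse_mask(16, half_width=5.0, depth=8.0)
pred = half_ellipse_mask(16, half_width=5.0, depth=6.0)   # shallower guess
tsdf = truncated_sdf(truth, trunc=3.0)
recovered = [[v < 0 for v in row] for row in tsdf]         # threshold at zero
score = iou(truth, pred)
```

Thresholding the TSDF at zero recovers the mask, which is why a network that regresses the TSDF can still be scored with a mask-based metric such as IoU.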

Abstract

Insufficient overlap between the melt pools produced during Laser Powder Bed Fusion (L-PBF) can lead to lack-of-fusion defects and deteriorated mechanical and fatigue performance. In-situ monitoring of the melt pool subsurface morphology requires specialized equipment that may not be readily accessible or scalable. Therefore, we introduce a machine learning framework to correlate in-situ two-color thermal images observed via high-speed color imaging to the two-dimensional profile of the melt pool cross-section. Specifically, we employ a hybrid CNN-Transformer architecture to establish a correlation between single bead off-axis thermal image sequences and melt pool cross-section contours measured via optical microscopy. In this architecture, a ResNet model embeds the spatial information contained within the thermal images to a latent vector, while a Transformer model correlates the sequence of embedded vectors to extract temporal information. Our framework is able to model the curvature of the subsurface melt pool structure, with improved performance in high energy density regimes compared to analytical melt pool models. The performance of this model is evaluated through dimensional and geometric comparisons to the corresponding experimental melt pool observations.
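The temporal step described above, where a Transformer relates the embedded frames to one another, rests on scaled dot-product attention. The sketch below is a minimal single-head version in plain Python, assuming the per-frame latent vectors have already been produced by an image encoder. The toy 3-dimensional embeddings and the identity projection matrices are placeholders; the paper's model uses learned multi-head attention, and nothing here reproduces its actual layer sizes or weights.

```python
import math

def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def matmul(a, b):
    """Matrix product of two lists-of-lists."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]

def self_attention(frames, w_q, w_k, w_v):
    """Single-head scaled dot-product self-attention over a frame sequence.

    frames: n x d list of per-frame embeddings (stand-ins for encoder output).
    Returns the attended sequence (n x d_v) and the n x n attention weights.
    """
    q = matmul(frames, w_q)
    k = matmul(frames, w_k)
    v = matmul(frames, w_v)
    scale = math.sqrt(len(q[0]))
    weights = [
        softmax([sum(qi * ki for qi, ki in zip(q_row, k_row)) / scale
                 for k_row in k])
        for q_row in q
    ]
    return matmul(weights, v), weights

# Four toy frame embeddings: the first two are similar, as are the last two.
frames = [[1.0, 0.0, 0.5], [0.9, 0.1, 0.5], [0.0, 1.0, 0.2], [0.1, 0.9, 0.3]]
identity = [[1.0 if i == j else 0.0 for j in range(3)] for i in range(3)]
attended, weights = self_attention(frames, identity, identity, identity)
```

Each row of `weights` sums to one and indicates how strongly one frame draws on every other frame in the sequence. With these placeholder projections, similar frames attend to each other more strongly than dissimilar ones.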

Paper Structure

This paper contains 5 sections, 5 equations, 12 figures, 5 tables.

Figures (12)

  • Figure 1: a) A schematic of the architecture used for mapping the image sequence to the subsurface melt pool morphology. In this schematic, $n$ observed melt pool thermal signatures are linked to the subsurface melt pool shape. b) The architecture of the transformer encoder module. The core of the transformer encoder block is the multi-head attention step, which computes query ($Q$), key ($K$), and value ($V$) matrices that encode the relationships and dependencies between the frames of the video sequence.
  • Figure 2: a) The melt pool surface temperature images are sampled at intervals of approximately 44 µs, corresponding to a frame rate of 22,500 frames per second. The temperature distribution changes from frame to frame. At high exposure times, more of the melt pool structure is visible, but the high-temperature core of the melt pool near the laser is saturated. At lower exposure times, less of the melt pool is directly observable, but the center of the melt pool is no longer saturated. b) Optical micrographs of the cross-sectioned tracks at varying energy densities. c) The melt pool depth contour profiles for varying energy densities. There are slight variations due to the variability present between successive melt tracks.
  • Figure 3: a) Sample thermal images of the melt pool over time for three process conditions (P = 200 W, V = 0.9 m/s; P = 350 W, V = 1.2 m/s; and P = 250 W, V = 0.3 m/s), used as input for the depth contour prediction model. A moving average window is applied temporally over a period of 220 $\mu s$ ($n$ = 5 frames), and a sequence length of 50 frames is used for the prediction. Images captured with a 20 $\mu s$ exposure time are used as input to the depth contour prediction model. b) A comparison of the depth contours predicted by the temporal transformer model to the ground truth depth contours.
  • Figure 4: A comparison of the U-Net and Temporal Transformer prediction performance as a function of sequence length. a) The mean absolute error between the resampled predicted contour points and the resampled ground truth contour points, as a function of the sequence length. b) The Hausdorff distance between the predicted contour points and the ground truth contour points.
  • Figure 5: A comparison of the extracted melt pool dimensions for contour predictions on the test partition of the experimental dataset, for three different model architectures. a) The area correlation between the ground truth and predicted melt pool contours, for the U-Net, Vision Transformer, and Temporal Transformer models. The dotted lines indicate a deviation of $\pm 6500 \; \mu m^{2}$ from the ideal prediction. b) The depth correlation between the ground truth and predicted melt pool contours, for the U-Net, Vision Transformer, and Temporal Transformer models. The dotted lines indicate a deviation of $\pm 45 \; \mu m$ from the ideal prediction.
  • ...and 7 more figures
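
The contour metrics named in the Figure 4 caption can be sketched as follows. This is one plausible reading, not the paper's evaluation code: the arc-length resampling scheme, the choice to average absolute errors over both coordinates, the number of resampled points, and the example contours (in µm, depth negative) are all assumptions.

```python
import math

def resample(contour, n):
    """Resample a polyline to n points equally spaced in arc length."""
    seg = [math.dist(p, q) for p, q in zip(contour, contour[1:])]
    total = sum(seg)
    points = []
    for i in range(n):
        target = total * i / (n - 1)
        k, walked = 0, 0.0
        # Advance to the segment that contains the target arc length.
        while k < len(seg) - 1 and walked + seg[k] < target:
            walked += seg[k]
            k += 1
        t = (target - walked) / seg[k] if seg[k] > 0 else 0.0
        t = min(max(t, 0.0), 1.0)
        (x0, y0), (x1, y1) = contour[k], contour[k + 1]
        points.append((x0 + t * (x1 - x0), y0 + t * (y1 - y0)))
    return points

def mean_abs_error(a, b):
    """Mean absolute coordinate difference between paired contour points."""
    total = sum(abs(ax - bx) + abs(ay - by)
                for (ax, ay), (bx, by) in zip(a, b))
    return total / (2 * len(a))

def hausdorff(a, b):
    """Symmetric Hausdorff distance between two point sets."""
    def directed(src, dst):
        return max(min(math.dist(p, q) for q in dst) for p in src)
    return max(directed(a, b), directed(b, a))

# Hypothetical contours: the prediction is the ground truth shifted 10 µm deeper.
truth = [(-60.0, 0.0), (-40.0, -50.0), (0.0, -80.0), (40.0, -50.0), (60.0, 0.0)]
pred = [(x, y - 10.0) for x, y in truth]

truth_pts = resample(truth, 50)
pred_pts = resample(pred, 50)
mae = mean_abs_error(truth_pts, pred_pts)
hd = hausdorff(truth_pts, pred_pts)
```

The mean absolute error averages the deviation along the whole contour, while the Hausdorff distance reports the single worst mismatch, so the two can disagree when a prediction is accurate almost everywhere but wrong in one region.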