Reinforcement Learning on Reconfigurable Hardware: Overcoming Material Variability in Laser Material Processing

Giulio Masinelli, Chang Rajani, Patrik Hoffmann, Kilian Wasmer, David Atienza

TL;DR

  • Problem: variability in laser welding caused by material properties and surface conditions degrades weld quality.
  • Approach: a real-time reinforcement learning controller implemented on an FPGA drives laser power using two optical sensor inputs, with server-side Soft Actor-Critic training and a simple reward based on the optical reflection signal, $r(s_{t+1}) = \frac{OR(s_{t+1})}{10}$.
  • Contributions: real-time closed-loop control on FPGA enabling microsecond-scale reactions, autonomous adaptation to surface variation without prior tuning, validation on 316L stainless steel samples across brushed, sandblasted, and mixed surfaces, and post-fabrication imaging showing larger melt pools and reduced porosity compared with constant power strategies.
  • Significance: demonstrates rapid, automated optimization for high-speed manufacturing and suggests routes to extend to other laser processes using richer sensing and domain randomization.
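As a rough illustration of the reward stated in the TL;DR, the sketch below computes $r(s_{t+1}) = OR(s_{t+1})/10$ for a short list of optical reflection readings and sums the rewards into an episode return. The function names, the sample readings, and the use of an undiscounted sum as the episode return are assumptions made for this example. They are not taken from the paper.

```python
from typing import Sequence

# Divisor from the stated reward r(s_{t+1}) = OR(s_{t+1}) / 10.
REWARD_SCALE = 10.0


def reward(or_next: float) -> float:
    """Reward for one step, given the OR reading of the next state."""
    return or_next / REWARD_SCALE


def episode_return(or_readings: Sequence[float]) -> float:
    """Sum of per-step rewards over one episode.

    Assumption: the return is an undiscounted sum, and `or_readings`
    holds the OR value observed after each action.
    """
    return sum(reward(v) for v in or_readings)


# Hypothetical OR readings (arbitrary units) observed after three actions.
example_readings = [5.0, 10.0, 2.5]
example_rewards = [reward(v) for v in example_readings]
example_return = episode_return(example_readings)
```

Because the reward is a fixed rescaling of a single sensor signal, a higher return under this sketch simply means a stronger optical reflection signal over the episode.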

Abstract

Ensuring consistent processing quality is challenging in laser processes due to varying material properties and surface conditions. Although some approaches have shown promise in solving this problem via automation, they often rely on predetermined targets or are limited to simulated environments. To address these shortcomings, we propose a novel real-time reinforcement learning approach for laser process control, implemented on a Field Programmable Gate Array to achieve real-time execution. Our experimental results from laser welding tests on stainless steel samples with a range of surface roughnesses validated the method's ability to adapt autonomously, without relying on reward engineering or prior setup information. Specifically, the algorithm learned the correct power profile for each unique surface characteristic, demonstrating significant improvements over hand-engineered optimal constant power strategies -- up to 23% better performance on rougher surfaces and 7% on mixed surfaces. This approach represents a significant advancement in automating and optimizing laser processes, with potential applications across multiple industries.

Paper Structure

This paper contains 11 sections, 2 equations, 4 figures.

Figures (4)

  • Figure 1: Illustration of the proposed method. Top: The FPGA receives optical signals from the process zone and uses its onboard policy network to determine the laser power in real-time. Between processing runs, the collected data is sent to a server where RL is used to train the policy. Bottom: The policy initially starts with random actions and learns to optimize the process outcome, achieving the best possible results while avoiding defects such as keyhole formation.
  • Figure 2: Comparison of our RL algorithm performance on different sample types during training and testing episodes. Three rows of plots are shown: Brushed, Sandblasted, and Mixed samples. Each row contains two graphs: Train Episode Returns (left) and Test Episode Returns (right). The blue lines represent the episode returns, while the red dashed lines indicate the Optimal Constant Power.
  • Figure 3: Comparison of the optical reflection (OR) signal and laser power actions during a test episode. OR is shown in blue and laser power actions in red. A power spike at the beginning initiates the melting process, while the increase in the middle of the episode corresponds to adjustments made for the sandblasted region. The microscope image at the bottom illustrates the corresponding processed line.
  • Figure 4: Representative melt-pool cross-sections taken from the mixed sample. Left column: Optimal constant power strategy. Right column: Learned policy. Top row: Brushed surface. Bottom row: Sandblasted surface.