Reinforcement Learning Approach to Optimizing Profilometric Sensor Trajectories for Surface Inspection

Sara Roos-Hoefgeest; Mario Roos-Hoefgeest; Ignacio Alvarez; Rafael C. González

TL;DR

The paper tackles the challenge of achieving complete and precise surface scans with profilometric sensors by optimizing robot-sensor trajectories through reinforcement learning. It introduces a PPO-based framework trained in a CAD-driven simulated environment (simu_roos) with a carefully designed state space, action space, and a composite reward that accounts for distance, orientation, and profile spacing. Key contributions include the explicit modeling of state/action/reward for profilometric inspection, online policy optimization for end-effector pose adjustments, and validation in both simulation and a real UR3e-based setup, including a real-world trajectory transfer via RoboDK. Results show that RL-optimized passes improve measurement quality and coverage compared with traditional straight or boustrophedon scans, demonstrating practical sim-to-real viability and generalizability to different parts and robotic configurations.

Abstract

High-precision surface defect detection in manufacturing is essential for quality control. Laser triangulation profilometric sensors are key to this process, providing detailed and accurate surface measurements along a line. Achieving a complete and precise surface scan requires accurate relative motion between the sensor and the workpiece. The sensor pose must be controlled to maintain the optimal distance and relative orientation to the surface, and the profiles must be uniformly distributed throughout the scan. This paper presents a novel Reinforcement Learning (RL) based approach to optimizing robot inspection trajectories for profilometric sensors. Building upon the Boustrophedon scanning method, our technique dynamically adjusts the sensor position and tilt to maintain optimal orientation and distance from the surface, while also keeping the distance between profiles consistent for uniform, high-quality scanning. Using a simulated environment based on the CAD model of the part, we replicate real-world scanning conditions, including sensor noise and surface irregularities. This simulation-based approach enables offline trajectory planning based on CAD models. Key contributions include the modeling of the state space, action space, and reward function, specifically designed for inspection applications using profilometric sensors. We use the Proximal Policy Optimization (PPO) algorithm to train the RL agent efficiently, demonstrating its capability to optimize inspection trajectories with profilometric sensors. To validate our approach, we conducted several simulation experiments in which a model trained on one specific training piece was tested on various other parts. We also conducted a real-world experiment in which the optimized trajectory, generated offline from a CAD model, was executed on a UR3e robotic arm to inspect a part.
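
As a rough illustration of how a composite reward over distance, orientation, and profile spacing might be assembled, the following Python sketch combines three normalized terms. The linear scoring shape, the equal weights, and the names `ScanTargets`, `_score`, and `composite_reward` are assumptions made for illustration; the abstract does not give the actual reward definitions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ScanTargets:
    """Hypothetical scan targets; field names and units are illustrative."""
    working_distance: float  # optimal working distance W_d (mm)
    depth_of_field: float    # depth of field Z_r (mm)
    max_angle: float         # largest tolerated direction angle (rad)
    profile_spacing: float   # desired distance between profiles (mm)


def _score(error: float, tolerance: float) -> float:
    """Return 1.0 at zero error, 0.0 at half the tolerance, and -1.0 at or beyond it."""
    return max(-1.0, 1.0 - 2.0 * abs(error) / tolerance)


def composite_reward(distance, angle, spacing, targets,
                     weights=(1 / 3, 1 / 3, 1 / 3)):
    """Weighted sum of distance, orientation, and spacing terms (assumed form)."""
    # Distance term: deviation from W_d, tolerated up to half the depth of field.
    r_d = _score(distance - targets.working_distance, targets.depth_of_field / 2)
    # Orientation term: direction angle, ideally zero.
    r_a = _score(angle, targets.max_angle)
    # Spacing term: deviation from the desired profile spacing.
    r_s = _score(spacing - targets.profile_spacing, targets.profile_spacing)
    w_d, w_a, w_s = weights
    return w_d * r_d + w_a * r_a + w_s * r_s


targets = ScanTargets(working_distance=100.0, depth_of_field=20.0,
                      max_angle=0.5, profile_spacing=0.2)
ideal = composite_reward(100.0, 0.0, 0.2, targets)
offset = composite_reward(105.0, 0.0, 0.2, targets)
```

In this sketch a pose at the ideal distance, angle, and spacing scores highest, and any deviation in one term lowers the total in proportion to its weight. A learned policy would trade the three terms off through those weights.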

Paper Structure (21 sections, 16 equations, 21 figures, 2 tables)

Introduction
Materials and Methods
Reinforcement Learning (RL): Basic Concepts
Scanning Characteristics for Surface Inspection with Profilometric Sensors
Simulated Environment
State Space
Action Space
Dynamic Action Limitation
Reward Function
Distance Reward $(R_D)$
Orientation Reward $(R_\alpha)$
Movement Reward $(R_{\Delta s})$
RL Algorithm: PPO
Results
Training Process
...and 6 more sections

Figures (21)

Figure 1: Agent-Environment Interaction Cycle
Figure 2: Representation of the profilometric sensor and its main parameters. (a) 3D representation of the sensor with its coordinate system. (b) Frontal view of the sensor. The optimal working distance $W_d$ and depth of field $Z_r$ are depicted. (c) Representation of the direction angle. This is the angle between the direction of the sensor's laser beam, represented by $\overrightarrow{l}$, and the normal vector of the workpiece surface, $\overrightarrow{n}$. (d) Distance between two consecutive profiles $\Delta s$.
Figure 3: Top view of the scanning trajectory, where multiple parallel passes are made, each separated by a distance $d$. An overlapping area is defined between each pass. The gray square represents the sensor. The red lines show the trajectories where the sensor captures data. The black lines indicate the intermediate movements to position the sensor for the next pass.
Figure 4: View of the simulated environment with the profilometric sensor and a part to be inspected.
Figure 5: Profile obtained during the simulation of a scan of the car door handle section of the CAD model shown in Figure \ref{fig:simuCarDoor}.
...and 16 more figures
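
The geometric quantities named in the captions above (the direction angle between $\overrightarrow{l}$ and $\overrightarrow{n}$, the profile spacing $\Delta s$, and parallel passes separated by $d$) can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions: the function names are hypothetical, the sign convention of the laser vector is not given in the captions (so the acute angle is taken), and the passes are laid out over a flat rectangle purely for illustration.

```python
import math


def direction_angle(laser_dir, surface_normal):
    """Acute angle (rad) between the laser direction l and the surface normal n.

    Assumption: the sign convention of l is unknown, so the absolute value of
    the dot product is used and the result lies in [0, pi/2].
    """
    dot = sum(a * b for a, b in zip(laser_dir, surface_normal))
    norm_l = math.sqrt(sum(a * a for a in laser_dir))
    norm_n = math.sqrt(sum(b * b for b in surface_normal))
    cos_angle = min(1.0, abs(dot) / (norm_l * norm_n))
    return math.acos(cos_angle)


def profile_spacing(prev_origin, curr_origin):
    """Euclidean distance between two consecutive profile origins (delta s)."""
    return math.dist(prev_origin, curr_origin)


def boustrophedon_passes(x_min, x_max, y_min, y_max, d):
    """Parallel passes along x, separated by d in y, alternating direction."""
    count = int(math.floor((y_max - y_min) / d)) + 1
    passes = []
    for i in range(count):
        y = y_min + i * d
        start, end = (x_min, y), (x_max, y)
        if i % 2 == 1:  # reverse every other pass
            start, end = end, start
        passes.append((start, end))
    return passes


angle = direction_angle((0.0, 0.0, -1.0), (0.0, 0.0, 1.0))
spacing = profile_spacing((0.0, 0.0, 0.0), (0.0, 0.2, 0.0))
passes = boustrophedon_passes(0, 10, 0, 4, 2)
```

Here a beam pointing straight down onto a horizontal surface gives a direction angle of zero, and the three passes alternate direction in the usual back-and-forth pattern.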
