Accurate and Noise-Tolerant Extraction of Routine Logs in Robotic Process Automation (Extended Version)
Massimiliano de Leoni, Faizan Ahmed Khan, Simone Agostinelli
TL;DR
This work tackles the challenge of deriving accurate routine logs from UI logs to enable automatic discovery of routine-type models for Robotic Process Automation. It introduces a noise-tolerant, clustering-based pipeline that segments UI logs at predefined completion actions, encodes each routine execution as an action-count vector, and clusters these vectors with $K$-Means, DBSCAN, or HDBSCAN to produce distinct routine logs. Evaluated on nine synthetic UI logs with varying noise and on a real-world UI log, the approach achieves higher $JC$ and alignment-based $Fitness$ than state-of-the-art baselines and produces fewer empty logs, demonstrating robustness to human variability. The method advances practical routine-type model discovery in Robotic Process Mining (RPM) by providing a scalable, flexible pipeline that improves the quality of discovered routine logs and, consequently, the accuracy of downstream process models.
Abstract
Robotic Process Mining focuses on identifying the routine types that human resources perform through a User Interface (UI). The ultimate goal is to discover routine-type models that enable robotic process automation. Discovering routine-type models requires a routine log as input. Unfortunately, the vast majority of existing works do not directly focus on enabling model discovery and limit themselves to extracting the set of actions that are part of the routines. Nor were they evaluated in scenarios characterized by inconsistent routine execution, hereafter referred to as noise, which reflects natural variability and occasional errors in human performance. This paper presents a clustering-based technique for extracting routine logs. Experiments were conducted on nine UI logs from the literature with different levels of injected noise. Our technique was compared with existing techniques, most of which were not designed to discover routine logs but were adapted for the purpose. The results were evaluated with standard state-of-the-art metrics and show that our technique extracts more accurate routine logs than the state of the art, especially in the presence of noise.
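To make the pipeline summarized above more concrete, the following is a minimal sketch of its three steps: segmentation at completion actions, action-count encoding, and clustering. It is an illustration under stated assumptions, not the authors' implementation. All function names, the toy UI log, and the action labels are hypothetical. The clustering step uses a small hand-written $K$-Means with Euclidean distance and deterministic initialization as a stand-in; a library implementation of $K$-Means, DBSCAN, or HDBSCAN could take its place. Dropping trailing actions that are not closed by a completion action is also an assumption.

```python
from collections import Counter


def segment(ui_log, completion_actions):
    """Split a flat sequence of UI actions into routine executions,
    closing a segment whenever a completion action is seen."""
    segments, current = [], []
    for action in ui_log:
        current.append(action)
        if action in completion_actions:
            segments.append(current)
            current = []
    # Assumption: trailing actions without a completion action are dropped.
    return segments


def encode(segments):
    """Encode each segment as a vector of per-action occurrence counts."""
    vocabulary = sorted({a for seg in segments for a in seg})
    vectors = []
    for seg in segments:
        counts = Counter(seg)
        vectors.append([counts[a] for a in vocabulary])
    return vocabulary, vectors


def squared_distance(u, v):
    return sum((x - y) ** 2 for x, y in zip(u, v))


def kmeans(vectors, k, iterations=20):
    """Toy K-Means (Lloyd's algorithm). Centroids start at the first k
    distinct vectors, which keeps the sketch deterministic."""
    centroids = []
    for v in vectors:
        if v not in centroids:
            centroids.append(list(v))
        if len(centroids) == k:
            break
    labels = [0] * len(vectors)
    for _ in range(iterations):
        # Assignment step: nearest centroid for every vector.
        labels = [
            min(range(len(centroids)),
                key=lambda c: squared_distance(v, centroids[c]))
            for v in vectors
        ]
        # Update step: each centroid becomes the mean of its members.
        for c in range(len(centroids)):
            members = [v for v, lab in zip(vectors, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


def extract_routine_logs(ui_log, completion_actions, k):
    """Group segments by cluster label; each group is one routine log."""
    segments = segment(ui_log, completion_actions)
    _, vectors = encode(segments)
    labels = kmeans(vectors, k)
    logs = {}
    for seg, lab in zip(segments, labels):
        logs.setdefault(lab, []).append(seg)
    return list(logs.values())


# Hypothetical UI log: two routine types, one execution with a noisy "scroll".
ui_log = [
    "open_form", "fill_name", "submit",
    "open_mail", "paste", "send",
    "open_form", "scroll", "fill_name", "submit",
    "open_mail", "paste", "send",
]
routine_logs = extract_routine_logs(ui_log, {"submit", "send"}, k=2)
for log in routine_logs:
    print(log)
```

In this toy log, the execution containing the extra `scroll` action differs from the clean form-filling execution in only one vector component, so it stays in the same cluster. This is the intuition behind the count-based encoding tolerating small deviations in how a routine is executed.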
