SimProcess: High Fidelity Simulation of Noisy ICS Physical Processes
Denis Donadel, Gabriele Crestanello, Giulio Morandini, Daniele Antonioli, Mauro Conti, Massimo Merro
TL;DR
SimProcess addresses the challenge of realistically simulating ICS physical processes for honeypots by focusing on reproducing the noise characteristics of real systems. It introduces a seven-stage, noise-centric pipeline that relies only on real-system time-series to estimate noise and compare against simulations via machine learning, enabling fidelity ranking without differential equations. Validated on the EPIC power-grid testbed, the approach achieves recall up to 1.0 and identifies Gaussian, Gaussian Mixture, and autoencoder-based noises as the most faithful for matching real dynamics, offering actionable guidance for improving honeypot realism. The framework is general, publicly available, and capable of handling dynamic changes, making it a practical tool for defenders studying attacker fingerprinting and advancing ICS security research.
Abstract
Industrial Control Systems (ICSs) manage critical infrastructures such as power grids and water treatment plants. Cyberattacks on ICSs can disrupt operations, with severe economic, environmental, and safety consequences. For example, undetected pollution in a water plant can put the lives of thousands at stake. ICS researchers have increasingly turned to honeypots -- decoy systems designed to attract attackers, study their behaviors, and ultimately improve defensive mechanisms. However, existing ICS honeypots struggle to replicate the ICS physical process, making them susceptible to detection. Accurately simulating the noise in ICS physical processes is challenging because it arises from several factors, including sensor imperfections and external interference. In this paper, we propose SimProcess, a novel framework to rank the fidelity of ICS simulations by evaluating how closely they resemble real-world, noisy physical processes. It measures the distance of a simulation from a target system by estimating the noise distribution with machine learning models such as Random Forest. Unlike existing solutions that require detailed mathematical models or are limited to simple systems, SimProcess requires only a time series of measurements from the real system, making it applicable to a broader range of complex dynamic systems. We demonstrate the framework's effectiveness through a case study using real-world power grid data from the EPIC testbed. We compare the performance of various simulation methods, including static and generative noise techniques. Our model correctly classifies real samples with a recall of up to 1.0. It also identifies Gaussian and Gaussian Mixture as the best distributions to simulate our power systems, together with a generative solution provided by an autoencoder, thereby helping developers to improve honeypot fidelity. Additionally, we make our code publicly available.
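To make the ranking idea concrete, the following is a minimal sketch and not the SimProcess implementation. It assumes a synthetic sinusoidal "voltage" signal in place of EPIC data, estimates noise as the residual around a moving-average trend, and, for brevity, swaps the machine learning classifier described above for a plain Euclidean distance between two residual statistics (standard deviation and excess kurtosis). All function names, signal parameters, and candidate noise models here are hypothetical.

```python
import math
import random
import statistics


def residual_noise(values, window=11):
    """Estimate noise as the residual around a centered moving average.

    Only positions with a full window are kept, to avoid edge effects.
    """
    half = window // 2
    residual = []
    for i in range(half, len(values) - half):
        trend = sum(values[i - half:i + half + 1]) / window
        residual.append(values[i] - trend)
    return residual


def noise_features(values):
    """Summarize the estimated noise as (std, excess kurtosis)."""
    res = residual_noise(values)
    mean = statistics.fmean(res)
    std = statistics.pstdev(res)
    if std == 0:
        return (0.0, 0.0)
    fourth = statistics.fmean([(r - mean) ** 4 for r in res])
    return (std, fourth / std ** 4 - 3.0)


def feature_distance(a, b):
    """Euclidean distance between two feature tuples."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


# Hypothetical noiseless process: a slow oscillation around 230 (assumed values).
N = 2000
base = [230.0 + 2.0 * math.sin(2.0 * math.pi * t / 200.0) for t in range(N)]

# Stand-in for the real measurements: base signal plus Gaussian noise.
rng_real = random.Random(0)
real = [b + rng_real.gauss(0.0, 0.5) for b in base]

# Candidate simulations with different (assumed) noise models.
rng_g = random.Random(1)
rng_u = random.Random(2)
candidates = {
    "no_noise": list(base),
    "gaussian": [b + rng_g.gauss(0.0, 0.5) for b in base],
    "uniform_wide": [b + rng_u.uniform(-3.0, 3.0) for b in base],
}

# Rank candidates: a smaller distance to the real noise features means higher fidelity.
real_features = noise_features(real)
ranking = sorted(
    ((name, feature_distance(noise_features(sim), real_features))
     for name, sim in candidates.items()),
    key=lambda item: item[1],
)

for name, dist in ranking:
    print(f"{name}: {dist:.3f}")
```

In this toy setting, the candidate whose noise model matches the one used to generate the stand-in "real" series ranks first. A learned classifier, as used in the paper, would replace the hand-picked features and distance with a model trained to separate real from simulated samples.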
