A new machine learning framework for occupational accidents forecasting with safety inspections integration
Aho Yapi, Pierre Latouche, Arnaud Guillin, Yan Bailly
TL;DR
This paper tackles short-term forecasting of occupational accidents by modeling daily accident occurrences as a binary time series and conditioning on safety-inspection covariates and calendar features. It introduces a model-agnostic framework that supports multiple horizon strategies (DirRec, MIMO, Seq2Seq) to produce daily probabilities which are aggregated into weekly risk assessments, enabling proactive planning and targeted interventions. The study demonstrates that learned models, particularly LSTM-based MIMO configurations, outperform operational baselines and simple heuristics across ITW, ExW, and ITW-d1 groups, with calendar information further improving weekly risk detection. The approach offers a practical, auditable, and adaptable tool for safety management, capable of integration into dashboards and decision processes to prioritize inspections and interventions in high-risk periods, with room for calibration and expansion to richer textual signals and cross-site dependencies.
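As an illustration of how daily probabilities might be aggregated into a weekly risk assessment, here is a minimal sketch. The aggregation rule is an assumption, not the paper's stated method: days are treated as independent, so the weekly risk is the probability of at least one accident in the week. The function names, the 7-day week length, and the 0.5 threshold are all hypothetical.

```python
from math import prod


def weekly_risk(daily_probs, week_length=7):
    """Aggregate daily accident probabilities into weekly risk scores.

    Assumption (not from the paper): days are independent, so the weekly
    risk is P(at least one accident) = 1 - prod(1 - p_d) over the week.
    A trailing partial week is scored over the days it contains.
    """
    if any(not 0.0 <= p <= 1.0 for p in daily_probs):
        raise ValueError("probabilities must lie in [0, 1]")
    weeks = []
    for start in range(0, len(daily_probs), week_length):
        chunk = daily_probs[start:start + week_length]
        weeks.append(1.0 - prod(1.0 - p for p in chunk))
    return weeks


def flag_high_risk(weekly_scores, threshold=0.5):
    """Mark weeks whose score reaches the (hypothetical) alert threshold."""
    return [score >= threshold for score in weekly_scores]


# Two illustrative weeks: a quiet one and a riskier one (made-up values).
daily = [0.05] * 7 + [0.20] * 7
scores = weekly_risk(daily)
flags = flag_high_risk(scores)
```

Other aggregation rules, such as the maximum or the mean of the daily probabilities, would fit the same interface. In practice the threshold would be tuned on validation data.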
Abstract
We propose a generic framework for short-term occupational accident forecasting that leverages safety inspections and models accident occurrences as binary time series. The approach generates daily predictions, which are then aggregated into weekly safety assessments to better inform decision making. To ensure the reliability and operational applicability of the forecasts, we apply a sliding-window cross-validation procedure specifically designed for time series data, combined with an evaluation based on aggregated period-level metrics. Several machine learning algorithms, including logistic regression, tree-based models, and neural networks, are trained and systematically compared within this framework. Among them, the long short-term memory (LSTM) network performs best, detecting upcoming high-risk periods with a balanced accuracy of 0.86. This result supports the robustness of our methodology and demonstrates that a binary time series model can anticipate these critical periods based on safety inspections. The proposed methodology converts routine safety inspection data into clear weekly risk scores that identify the periods when accidents are most likely. Decision-makers can integrate these scores into their planning tools to prioritize inspections, schedule targeted interventions, and direct resources to the sites or shifts classified as highest risk, stepping in before incidents occur and getting the greatest return on safety investments.
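The evaluation described above can be sketched as follows. This is a minimal illustration and not the authors' implementation. It assumes fixed-size training and test windows that slide forward in time, and a standard two-class balanced accuracy (the mean of sensitivity and specificity). The function names and window sizes are hypothetical.

```python
def sliding_window_splits(n_samples, train_size, test_size, step=None):
    """Yield (train_indices, test_indices) pairs in chronological order.

    Each test window immediately follows its training window, so no
    future observation is ever used for training. By default the window
    advances by one test window per fold (an assumption).
    """
    step = test_size if step is None else step
    start = 0
    while start + train_size + test_size <= n_samples:
        split = start + train_size
        yield list(range(start, split)), list(range(split, split + test_size))
        start += step


def balanced_accuracy(y_true, y_pred):
    """Mean of sensitivity and specificity for binary labels in {0, 1}."""
    positives = sum(1 for t in y_true if t == 1)
    negatives = len(y_true) - positives
    if positives == 0 or negatives == 0:
        raise ValueError("both classes must be present in y_true")
    true_pos = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    true_neg = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    return 0.5 * (true_pos / positives + true_neg / negatives)


# Ten periods, train on four, test on the next two, slide by two.
splits = list(sliding_window_splits(n_samples=10, train_size=4, test_size=2))

# Made-up weekly labels and predictions, for illustration only.
score = balanced_accuracy([1, 1, 0, 0, 0, 0], [1, 0, 0, 0, 0, 1])
```

Balanced accuracy is a natural fit here because accident weeks are typically much rarer than accident-free weeks, and plain accuracy would reward a model that always predicts "no accident".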
