Statistical analysis of geoinformation data for increasing railway safety

Katarzyna Gawlak, Jarosław Konieczny, Krzysztof Domino, Jarosław Adam Miszczak

TL;DR

The paper tackles reducing wildlife–train collisions by introducing a data-driven Bayesian geospatial framework that does not rely on detailed animal-activity models. It estimates spatial and temporal risk profiles from historical accident data in southern Poland and computes a warning probability $p_{pt}(\tau,t,\Delta t,l,x,\Delta x)$ to mark line segments when $p_{pt}$ exceeds a threshold $p_{\text{threshold}}$, enabling driver alerts. The approach is demonstrated on 2020–2022 data with 2023 validation, revealing hot-spots (notably end-of-line 139) and a dusk peak in accidents, while showing benefits over temporal-only methods. The method is simple, adaptable to regions with limited IT resources, and can inform a mix of countermeasures (train-based vs infrastructure-based), with potential for generalization to other railway networks possessing geolocated incident data.
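The thresholding idea described above can be illustrated with a minimal sketch. This is not the paper's formula for $p_{pt}$: the function names, the add-one smoothing, and the assumption that the spatial and temporal profiles are independent (so the joint risk is their product) are all hypothetical choices made for illustration only.

```python
from collections import Counter


def empirical_profile(values, bins):
    """Relative frequency of each bin, with add-one (Laplace) smoothing.

    The smoothing is an illustrative assumption, not taken from the paper.
    """
    counts = Counter(values)
    total = len(values) + len(bins)
    return {b: (counts.get(b, 0) + 1) / total for b in bins}


def warning_segments(accidents, segments, hours, threshold):
    """Flag (segment, hour) pairs whose estimated risk exceeds `threshold`.

    accidents: list of (segment_id, hour) tuples from historical records.
    Assumes, for illustration only, that the spatial and temporal profiles
    are independent, so the joint risk is their product.
    """
    p_space = empirical_profile([s for s, _ in accidents], segments)
    p_time = empirical_profile([h for _, h in accidents], hours)
    return {
        (s, h)
        for s in segments
        for h in hours
        if p_space[s] * p_time[h] > threshold
    }


# Hypothetical toy data: three line segments, three hours of the day.
accidents = [("A", 18), ("A", 18), ("A", 19), ("B", 18)]
flagged = warning_segments(accidents, ["A", "B", "C"], [6, 18, 19], threshold=0.2)
print(flagged)
```

In a real deployment the profiles would be estimated from one period (for example 2020–2022) and the flagged segments checked against a held-out period (for example 2023), mirroring the validation split mentioned above.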

Abstract

The impact of rail transport on the environment is one of the crucial factors for the sustainable development of this form of mass transport. We present a data-driven analysis of wild animal railway accidents in the region of southern Poland, as a step towards creating a train driver warning system. We built our method by harnessing the Bayesian approach to the statistical analysis of information about the geolocation of the accidents. The implementation of the proposed model does not require advanced knowledge of data mining and can be applied even in less developed railway systems with limited IT support. Furthermore, we have discovered unusual patterns of accidents while considering the number of trains and their speed and time at particular geographical locations of the railway network. We test the developed approach using data from southern Poland, a region comprising wildlife habitats and one of the most urbanised areas in Central Europe. Based on this, we conclude that our model is best suited to railway lines that pass through varying types of landscape.

Paper Structure (14 sections, 11 equations, 11 figures, 3 tables, 1 algorithm)

Figures (11)

  • Figure 1: Location of areas according to the number of all accidents during the period of three years 2020-2022. The raw data about accidents is processed to assign the number of accidents to the locations. Markers are calculated by counting the number of points in the hexagonal grid with 2.5 km vertical and horizontal spacing (created using QGIS Create grid processing tool). The most important lines are marked in colours.
  • Figure 2: Profiles for animal-train collisions for various species. Species of animals that took part in accidents, and seasonal profiles of accidents with roe deer.
  • Figure 3: Yearly and daily profiles of accidents involving wild animals. Data collected in the period from 2020 to 2022.
  • Figure 4: Monthly profiles of accidents involving wild animals over particular years. We observe a similar profile for all years. To demonstrate the stability of the model, one can compare 2020 and 2021, as in the first half of 2020 train traffic was reduced due to the COVID-19 lockdown.
  • Figure 5: Probability of accidents with animals for various lines and spatial locations. Data collected from 2020 to 2022.
  • ...and 6 more figures