Usage-Specific Survival Modeling Based on Operational Data and Neural Networks

Olov Holmer, Mattias Krysander, Erik Frisk

TL;DR

The results show that if the data is homogeneously sampled, the methodology works as intended and produces accurate survival models. They also show that randomly resampling the dataset at each epoch is an effective way to reduce the size of the training data.

Abstract

Accurate predictions of when a component will fail are crucial when planning maintenance, and by modeling the distribution of these failure times, survival models have been shown to be particularly useful in this context. The presented methodology is based on conventional neural network-based survival models that are trained using data that is continuously gathered and stored at specific times, called snapshots. An important property of this type of training data is that it can contain more than one snapshot from a specific individual, so standard maximum likelihood training cannot be applied directly, since the data is not independent. However, the paper shows that if the data is in a specific format where all snapshot times are the same for all individuals, called homogeneously sampled, maximum likelihood training can be applied and produces desirable results. In many cases, the data is not homogeneously sampled, and in this case it is proposed to resample the data to make it homogeneously sampled. How densely the dataset is sampled turns out to be an important parameter: the density should be chosen large enough to produce good results, but a higher density also increases the size of the dataset, which makes training slow. To reduce the number of samples needed during training, the paper also proposes, instead of resampling the dataset once before training starts, randomly resampling the dataset at the start of each epoch during training. The proposed methodology is evaluated on both a simulated dataset and an experimental dataset of starter battery failures. The results show that if the data is homogeneously sampled, the methodology works as intended and produces accurate survival models. The results also show that randomly resampling the dataset at each epoch is an effective way to reduce the size of the training data.
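
The two resampling strategies described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names (`fixed_grid`, `random_grid`, `resample_individual`), the uniform random draw, and the rule of picking the latest snapshot at or before each grid time are all assumptions made for illustration. The only property taken from the abstract is that, after resampling, every individual shares the same sampling times.

```python
import numpy as np


def fixed_grid(t_max, n_samples):
    """Evenly spaced sampling times in (0, t_max], shared by all individuals."""
    return np.linspace(t_max / n_samples, t_max, n_samples)


def random_grid(t_max, n_samples, rng):
    """Sampling times drawn anew on each call (e.g. once per epoch).

    Assumption: times are drawn uniformly on [0, t_max); the grid is still
    shared by all individuals, so the resampled data stays homogeneous.
    """
    return np.sort(rng.uniform(0.0, t_max, size=n_samples))


def resample_individual(snapshot_times, grid):
    """Map grid times to snapshot indices for one individual.

    Assumption: each grid time is served by the latest snapshot taken at or
    before it. Grid times outside the individual's observed range are dropped.
    `snapshot_times` must be sorted in increasing order.
    """
    snapshot_times = np.asarray(snapshot_times, dtype=float)
    keep = (grid >= snapshot_times[0]) & (grid <= snapshot_times[-1])
    idx = np.searchsorted(snapshot_times, grid[keep], side="right") - 1
    return grid[keep], idx


# Resample once before training (fixed sampling).
times, idx = resample_individual([1.0, 2.5, 4.0], fixed_grid(4.0, 4))

# Resample at the start of each epoch (epochwise random resampling).
rng = np.random.default_rng(0)
for epoch in range(3):
    grid = random_grid(4.0, 2, rng)
    epoch_times, epoch_idx = resample_individual([1.0, 2.5, 4.0], grid)
    # ... build the training batch for this epoch from epoch_idx ...
```

In this sketch the random variant uses only two sampling times per epoch, yet over many epochs the model sees sampling times spread across the whole interval, which is the intuition behind using fewer samples per epoch.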

Paper Structure (28 sections, 31 equations, 3 figures, 1 table)

This paper contains 28 sections, 31 equations, 3 figures, 1 table.

Figures (3)

  • Figure 1: An illustrative example showing the accumulated usage $x(t)$ for three individuals. Also shown are their respective failure times, marked by crosses, and the times from which snapshots are available, marked by circles.
  • Figure 2: Results from models trained on datasets of three different sizes, for different numbers of samples in the resampling, and for both fixed sampling and epochwise random resampling. The results are shown both as a function of the sampling time (the distance between two samples in the sampling grid) and as a function of the total size of the training data; since the grid varies under epochwise resampling, the mean is used in both cases. The test loss is the mean loss for the 40 models evaluated on the test set, as described in Section \ref{sec:sim_training}.
  • Figure 3: Results from models trained using homogeneously sampled datasets as well as the mixed dataset containing two different sampling densities.

Theorems & Definitions (1)

  • Definition 1: homogeneously sampled