Harnessing Event Sensory Data for Error Pattern Prediction in Vehicles: A Language Model Approach
Hugo Math, Rainer Lienhart, Robin Schön
TL;DR
The paper tackles predicting when and what error patterns (EPs) will occur in vehicles from irregular, high-cardinality streams of diagnostic trouble code (DTC) events. It proposes CarFormer, an encoder that fuses four embeddings (including time and mileage) with Rotary Position Embeddings, and EPredictor, a decoder that performs autoregressive multi-label EP prediction and time-to-EP regression, trained via a self-supervised multi-task objective. On sequences averaging $L \approx 160$ DTCs, the model reaches around 80% F1 for EP prediction after observing only half of the codes, with an average absolute time-to-EP error of $58.4 \pm 13.2$ hours and robust performance within a Confident Predictive Maintenance Window (CPMW). The framework supports proactive predictive maintenance and improved vehicle safety, and offers a practical path toward deployment in in-vehicle systems and maintenance planning.
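To make the two reported quantities concrete, the following is a minimal sketch of how a multi-label F1 score and an average absolute time error could be computed. It is not the authors' evaluation code: micro-averaging is an assumption (the averaging scheme is not stated here), and the EP labels and the function names `micro_f1` and `mean_abs_error_hours` are hypothetical.

```python
from typing import List, Set


def micro_f1(true_sets: List[Set[str]], pred_sets: List[Set[str]]) -> float:
    """Micro-averaged F1 over multi-label predictions (assumed averaging scheme)."""
    tp = fp = fn = 0
    for truth, pred in zip(true_sets, pred_sets):
        tp += len(truth & pred)   # EPs predicted and actually present
        fp += len(pred - truth)   # EPs predicted but absent
        fn += len(truth - pred)   # EPs present but missed
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def mean_abs_error_hours(true_hours: List[float], pred_hours: List[float]) -> float:
    """Average absolute difference between true and predicted time-to-EP, in hours."""
    errors = [abs(t - p) for t, p in zip(true_hours, pred_hours)]
    return sum(errors) / len(errors)


# Hypothetical labels and values, for illustration only.
true_eps = [{"EP_A"}, {"EP_B", "EP_C"}]
pred_eps = [{"EP_A"}, {"EP_B"}]
f1 = micro_f1(true_eps, pred_eps)                        # 2*2 / (2*2 + 0 + 1)
mae = mean_abs_error_hours([100.0, 40.0], [70.0, 60.0])  # (30 + 20) / 2
```

In this toy case one of three true EPs is missed, which gives an F1 of 0.8 and a mean absolute error of 25 hours.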
Abstract
In this paper, we draw an analogy between processing natural languages and processing multivariate event streams from vehicles in order to predict $\textit{when}$ and $\textit{what}$ error pattern is most likely to occur in the future for a given car. Our approach leverages the temporal dynamics and contextual relationships of event data from a fleet of cars. The event data consists of discrete error codes together with continuous values such as time and mileage. By modelling these streams with two causal Transformers, we can anticipate vehicle failures and malfunctions before they happen. To this end, we introduce $\textit{CarFormer}$, a Transformer model trained via a new self-supervised learning strategy, and $\textit{EPredictor}$, an autoregressive Transformer decoder model capable of predicting $\textit{when}$ and $\textit{what}$ error pattern will most likely occur after a given error code has appeared. Despite the high cardinality of event types, their imbalanced frequency of occurrence and the limited amount of labelled data, our experimental results demonstrate the strong predictive ability of our model. Specifically, on sequences of $160$ error codes on average, our model achieves an $80\%$ F1 score for predicting $\textit{what}$ error pattern will occur after observing only half of the error codes, and an average absolute error of $58.4 \pm 13.2$h when forecasting $\textit{when}$ it will occur, thus enabling confident predictive maintenance and enhancing vehicle safety.
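The event representation and the "half of the error codes" setting described above can be sketched as follows. This is an illustrative sketch under stated assumptions, not the authors' data format: the `Event` class, its field names, the units (hours, kilometres), the code strings and the helper functions are all hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Event:
    code: str          # discrete error code (hypothetical identifier)
    hours: float       # time since the first event, in hours (assumed unit)
    mileage_km: float  # mileage at the event, in km (assumed unit)


def observed_prefix(events: List[Event], fraction: float = 0.5) -> List[Event]:
    """Keep only the first `fraction` of the sequence, at least one event."""
    n = max(1, int(len(events) * fraction))
    return events[:n]


def time_to_ep_hours(prefix: List[Event], ep_hours: float) -> float:
    """Regression target: hours from the last observed event to the error pattern."""
    return ep_hours - prefix[-1].hours


# Hypothetical sequence: four error codes, then an error pattern at t = 120 h.
sequence = [
    Event("C1", 0.0, 15000.0),
    Event("C2", 10.0, 15040.0),
    Event("C3", 30.0, 15200.0),
    Event("C4", 50.0, 15310.0),
]
prefix = observed_prefix(sequence)        # first two events
target = time_to_ep_hours(prefix, 120.0)  # 120 - 10 = 110 hours
```

A model in this setting would see only `prefix` and be asked to predict both the upcoming error pattern and `target`.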
