Hinge-FM2I: An Approach using Image Inpainting for Interpolating Missing Data in Univariate Time Series
Noufel Saad, Maaroufi Nadir, Najib Mehdi, Bakhouya Mohamed
TL;DR
This work tackles missing data in online univariate time series forecasting by introducing Hinge-FM2I, a hinge-based extension of FM2I that interpolates large gaps. The method drops a data point adjacent to the gap, applies FM2I’s image-inpainting-based interpolation, and uses a hinge-inspired selection step to identify the best imputed sequence. Empirical evaluation on 1356 M3 series and two wastewater datasets shows that Hinge-FM2I outperforms a broad set of baselines (including mean, LOCF, KNN, ARIMA, and DTWBI) on sMAPE, RMSE, and MAE, with robust performance across gap sizes and data characteristics. The study highlights practical gains for imputing missing values in univariate time series and outlines avenues for future work, such as bidirectional hinge strategies and parallelized computation.
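The three error metrics named above can be sketched as follows. This is a minimal illustration, not the authors' evaluation code: the function names are our own, and sMAPE is assumed to take the common form mean(2|a - p| / (|a| + |p|)) × 100, which may differ in detail from the variant used in the paper.

```python
import math


def smape(actual, predicted):
    """Symmetric MAPE in percent (assumed form: 2|a - p| / (|a| + |p|))."""
    terms = []
    for a, p in zip(actual, predicted):
        denom = abs(a) + abs(p)
        # Treat a 0/0 term as zero error to avoid division by zero.
        terms.append(0.0 if denom == 0 else 2.0 * abs(a - p) / denom)
    return 100.0 * sum(terms) / len(terms)


def rmse(actual, predicted):
    """Root mean squared error."""
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def mae(actual, predicted):
    """Mean absolute error."""
    absolute = [abs(a - p) for a, p in zip(actual, predicted)]
    return sum(absolute) / len(absolute)
```

In an imputation setting, `actual` would hold the true values that were masked out to create the gap and `predicted` the imputed values for the same positions.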
Abstract
Accurate time series forecasts are crucial for various applications, such as traffic management, electricity consumption, and healthcare. However, limitations in models and data quality can significantly degrade forecast accuracy. One common data quality issue is the absence of data points, referred to as missing data, which is often caused by sensor malfunctions, equipment failures, or human errors. This paper proposes Hinge-FM2I, a novel method for handling missing values in univariate time series data. Hinge-FM2I builds upon the strengths of the Forecasting Method by Image Inpainting (FM2I). FM2I has proven effective, but selecting the most accurate forecasts remains a challenge. To overcome this issue, we propose a selection algorithm. Inspired by door hinges, Hinge-FM2I drops a data point either before or after the gap (left/right-hinge), then uses FM2I for imputation, and finally selects the imputed gap with the lowest error on the dropped data point. Hinge-FM2I was evaluated on a sample of 1356 time series extracted from the M3 competition benchmark dataset, with missing value rates ranging from 3.57% to 28.57%. Experimental results demonstrate that Hinge-FM2I significantly outperforms established methods such as linear/spline interpolation, K-Nearest Neighbors (K-NN), and ARIMA. Notably, Hinge-FM2I achieves an average Symmetric Mean Absolute Percentage Error (sMAPE) score of 5.6% for small gaps, and up to 10% for larger ones. These findings highlight the effectiveness of Hinge-FM2I as a promising new method for addressing missing values in univariate time series data.
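The hinge selection step described in the abstract can be sketched roughly as follows. FM2I itself is not reproduced here: two simple stand-in imputers (linear interpolation and last-observation-carried-forward) play the role of competing candidates, and all function names and signatures are hypothetical. The sketch shows only the selection logic: widen the gap by one known point on the left or right, impute the widened gap, and keep the candidate whose value at the dropped point is closest to the truth.

```python
def linear_fill(series, start, end):
    """Stand-in imputer: linear interpolation over series[start:end]."""
    left, right = series[start - 1], series[end]
    n = end - start + 1
    return [left + (right - left) * (i + 1) / n for i in range(end - start)]


def locf_fill(series, start, end):
    """Stand-in imputer: repeat the last observed value before the gap."""
    return [series[start - 1]] * (end - start)


def hinge_impute(series, start, end, imputers):
    """Fill the gap series[start:end] using a hinge-style selection.

    Returns (hinge_error, side, imputer_name, imputed_gap_values).
    """
    best = None
    for side in ("left", "right"):
        # Widen the gap by one observed point: the "hinge".
        if side == "left":
            s, e, hinge_idx = start - 1, end, start - 1
        else:
            s, e, hinge_idx = start, end + 1, end
        # The widened gap still needs an observed neighbour on each side.
        if s < 1 or e > len(series) - 1:
            continue
        for name, imputer in imputers.items():
            filled = imputer(series, s, e)
            # Score the candidate on the dropped point, whose true value is known.
            err = abs(filled[hinge_idx - s] - series[hinge_idx])
            if best is None or err < best[0]:
                gap_values = filled[start - s:start - s + (end - start)]
                best = (err, side, name, gap_values)
    if best is None:
        raise ValueError("gap is too close to the series boundaries")
    return best


series = [1, 2, 3, None, None, 6, 7, 8]  # gap at indices 3 and 4
imputers = {"linear": linear_fill, "locf": locf_fill}
err, side, name, values = hinge_impute(series, 3, 5, imputers)
```

The stand-in imputers read only the observed points bordering the widened gap, so the dropped point never leaks into the imputation it is later used to score. A real implementation would replace the stand-ins with FM2I-generated candidates.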
