Privacy Risks in Time Series Forecasting: User- and Record-Level Membership Inference

Nicolas Johansson, Tobias Olsson, Daniel Nilsson, Johan Östman, Fazeleh Hoseini

TL;DR

The paper investigates privacy risks in time series forecasting by adapting state-of-the-art membership inference attacks (MIAs) to forecasting tasks and introducing a novel end-to-end Deep Time Series (DTS) attack. It presents two attack paradigms, Multi-Signal LiRA (statistics-based) and DTS (classifier-based), and evaluates them on the TUH-EEG and ELD datasets with LSTM and N-HiTS models under record- and user-level threat models. The results show pronounced vulnerability to MIAs: online and user-level attacks often achieve high detection rates, particularly as the forecast horizon lengthens and the training population shrinks. The work establishes new baselines for privacy risk assessment in time series forecasting and highlights the greater risk at the user level, pointing to the need for defense mechanisms and for broader, more diverse evaluations. Overall, the research demonstrates that forecasting models memorize and leak information about their training data, with practical implications for privacy auditing and model deployment in sensitive domains.

Abstract

Membership inference attacks (MIAs) aim to determine whether specific data were used to train a model. While extensively studied on classification models, their impact on time series forecasting remains largely unexplored. We address this gap by introducing two new attacks: (i) an adaptation of multivariate LiRA, a state-of-the-art MIA originally developed for classification models, to the time-series forecasting setting, and (ii) a novel end-to-end learning approach called Deep Time Series (DTS) attack. We benchmark these methods against adapted versions of other leading attacks from the classification setting. We evaluate all attacks in realistic settings on the TUH-EEG and ELD datasets, targeting two strong forecasting architectures, LSTM and the state-of-the-art N-HiTS, under both record- and user-level threat models. Our results show that forecasting models are vulnerable, with user-level attacks often achieving perfect detection. The proposed methods achieve the strongest performance in several settings, establishing new baselines for privacy risk assessment in time series forecasting. Furthermore, vulnerability increases with longer prediction horizons and smaller training populations, echoing trends observed in large language models.
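To make the LiRA idea mentioned in the abstract concrete, here is a minimal single-signal sketch of an online likelihood-ratio test, not the paper's Multi-Signal variant. It assumes the attack signal is the per-record forecasting MSE and that each shadow-model distribution is approximated by a univariate Gaussian; the function names (`mse_signal`, `lira_online_score`) and the toy numbers are hypothetical and chosen only for illustration.

```python
import numpy as np


def mse_signal(y_true, y_pred):
    """Per-record attack signal: mean squared forecast error (an assumed choice)."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.mean((y_true - y_pred) ** 2))


def gaussian_logpdf(x, mean, std):
    """Log-density of a univariate Gaussian, written out to avoid extra dependencies."""
    return -0.5 * np.log(2.0 * np.pi * std**2) - (x - mean) ** 2 / (2.0 * std**2)


def lira_online_score(target_signal, in_signals, out_signals, eps=1e-8):
    """Log-likelihood ratio of 'member' versus 'non-member' for one record.

    in_signals:  signal values from shadow models trained WITH the record.
    out_signals: signal values from shadow models trained WITHOUT the record.
    A larger score means the target model's behaviour looks more like the
    'trained with' distribution, i.e. stronger evidence of membership.
    """
    in_signals = np.asarray(in_signals, dtype=float)
    out_signals = np.asarray(out_signals, dtype=float)
    mu_in, sd_in = in_signals.mean(), in_signals.std() + eps
    mu_out, sd_out = out_signals.mean(), out_signals.std() + eps
    return float(
        gaussian_logpdf(target_signal, mu_in, sd_in)
        - gaussian_logpdf(target_signal, mu_out, sd_out)
    )


# Toy example: shadow models that saw the record forecast it with lower error.
in_losses = [0.10, 0.12, 0.09, 0.11]
out_losses = [0.50, 0.55, 0.45, 0.52]
member_like = lira_online_score(0.10, in_losses, out_losses)
non_member_like = lira_online_score(0.50, in_losses, out_losses)
```

In this toy setup `member_like` is positive and `non_member_like` is negative. An offline variant would use only the "trained without" distribution, and a user-level decision would need some aggregation of per-record scores; both are left out of this sketch.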

Paper Structure

This paper contains 46 sections, 15 equations, 5 figures, 5 tables, 1 algorithm.

Figures (5)

  • Figure 1: Effect of forecasting horizon on record-level (left and mid) and user-level (right) TPRs for three attack configurations against the N-HiTS model on the ELD dataset.
  • Figure 2: Effect of varying number of individuals on record-level (left and mid) and user-level TPRs (right) for LiRA-Online, Multi-Signal LiRA-Online, and DTS with InceptionTime; plotted with the confidence interval over five runs.
  • Figure 3: Effect of observed dataset size on user-level MIA for LiRA-Online, Multi-Signal LiRA-Online, and DTS-Online.
  • Figure 4: ROC curves with logarithmic scales (TPR on the y-axis and FPR on the x-axis) for five membership inference attacks: Ensemble, RMIA, LiRA, Multi-Signal LiRA, and DTS, arranged in rows. Columns correspond to dataset–architecture pairs: EEG–LSTM, EEG–N-HiTS, ELD–LSTM, and ELD–N-HiTS. All results are for the online setting, with the Ensemble attack evaluated in audit mode. Within each subplot, curves correspond to individual attack signals for LiRA, RMIA, and Ensemble, all combined signals for Multi-Signal LiRA, and different classifier architectures for DTS. Each curve represents the mean over five independent runs.
  • Figure 5: ROC curves with logarithmic scales (TPR on the y-axis and FPR on the x-axis) for four membership inference attacks: RMIA, LiRA, Multi-Signal LiRA, and DTS, arranged in columns. Rows correspond to dataset–architecture pairs: EEG–LSTM, EEG–N-HiTS, ELD–LSTM, and ELD–N-HiTS. All results are for the offline setting. Within each subplot, curves correspond to individual attack signals for LiRA and RMIA, all combined signals for Multi-Signal LiRA, and different classifier architectures for DTS. Each curve represents the mean over five independent runs.
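
The figures above report TPRs and ROC curves on logarithmic axes, which emphasizes attack performance at low false-positive rates. As a rough illustration of that metric, the sketch below computes the best TPR achievable at or below a chosen FPR from raw attack scores. The function name `tpr_at_fpr`, the default FPR of 0.01, and the toy scores are assumptions for illustration, not values taken from the paper.

```python
import numpy as np


def tpr_at_fpr(scores, is_member, target_fpr=0.01):
    """Highest true-positive rate over thresholds whose FPR is <= target_fpr.

    scores:    attack scores, where higher means 'more likely a member'.
    is_member: boolean ground-truth membership labels, same length as scores.
    """
    scores = np.asarray(scores, dtype=float)
    is_member = np.asarray(is_member, dtype=bool)
    member_scores = scores[is_member]
    non_member_scores = scores[~is_member]

    best_tpr = 0.0
    # Sweep thresholds from strictest to loosest; FPR can only grow as the
    # threshold drops, so we can stop once the FPR budget is exceeded.
    for threshold in np.unique(scores)[::-1]:
        fpr = np.mean(non_member_scores >= threshold)
        if fpr > target_fpr:
            break
        best_tpr = max(best_tpr, float(np.mean(member_scores >= threshold)))
    return best_tpr


# Toy example: four members followed by four non-members.
toy_scores = [0.9, 0.8, 0.7, 0.2, 0.6, 0.3, 0.1, 0.05]
toy_labels = [True, True, True, True, False, False, False, False]
tpr_zero_fpr = tpr_at_fpr(toy_scores, toy_labels, target_fpr=0.0)
tpr_any_fpr = tpr_at_fpr(toy_scores, toy_labels, target_fpr=1.0)
```

With only a handful of non-members, the attainable FPR values are coarse (multiples of 0.25 in this toy case), so meaningful low-FPR estimates need a much larger evaluation set.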