Understanding the Limits of Deep Tabular Methods with Temporal Shift

Hao-Run Cai; Han-Jia Ye

TL;DR

The paper investigates why deep tabular methods falter under temporal distribution shifts and identifies training lag and validation bias in temporal splits as key culprits. It analyzes how temporal patterns are lost in deep representations and proposes a plug-and-play Fourier-series-based temporal embedding to recover periodic and trend information. A refined temporal splitting protocol is introduced to minimize lag and bias, yielding performance on par with random splits but with far greater stability. Together, these contributions offer a practical framework to enhance temporal generalization in deep tabular learning, demonstrated on the TabReD benchmark across diverse methods, including retrieval-based ones. The approach emphasizes explicit temporal information incorporation as essential for robust deployment in temporally evolving environments.
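The notion of training lag can be illustrated with a small sketch. It is not the paper's refined protocol, whose details are not given here; it only contrasts a temporal validation split with a random one over the same pre-test data and measures the gap between the newest training row and the start of the test period. The function names, the 20% validation fraction, and the toy day-indexed timestamps are all assumptions made for illustration.

```python
import random

def temporal_split(timestamps, val_fraction=0.2):
    """Baseline: train on the earliest rows, validate on the latest ones."""
    order = sorted(range(len(timestamps)), key=lambda i: timestamps[i])
    n_val = max(1, int(len(order) * val_fraction))
    return order[:-n_val], order[-n_val:]

def random_split(timestamps, val_fraction=0.2, seed=0):
    """Alternative: draw validation rows uniformly from the whole pre-test period."""
    idx = list(range(len(timestamps)))
    random.Random(seed).shuffle(idx)
    n_val = max(1, int(len(idx) * val_fraction))
    return sorted(idx[n_val:]), sorted(idx[:n_val])

def training_lag(timestamps, train_idx, test_start):
    """Gap between the most recent training row and the start of the test period."""
    return test_start - max(timestamps[i] for i in train_idx)

# Hypothetical toy data: one row per day for days 0..99, test period starts on day 100.
timestamps = list(range(100))
test_start = 100

t_train, t_val = temporal_split(timestamps)
r_train, r_val = random_split(timestamps)

lag_temporal = training_lag(timestamps, t_train, test_start)
lag_random = training_lag(timestamps, r_train, test_start)
print("temporal split lag:", lag_temporal)
print("random split lag:", lag_random)
```

In this toy setup the temporal split holds out the most recent 20 days for validation, so the model never trains on them. The random split can include recent rows in training, so its lag is never larger than that of the temporal split.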

Abstract

Deep tabular models have demonstrated remarkable success on i.i.d. data, excelling in a variety of structured data tasks. However, their performance often deteriorates under temporal distribution shift, where the data distribution evolves over time with trends and periodic patterns. In this paper, we explore the underlying reasons for this failure to capture temporal dependencies. We begin by investigating the training protocol, revealing a key issue in how model selection is performed. While existing approaches split off the validation set by temporal order, we show that even a random split can significantly improve model performance. By minimizing the time lag between the training data and test time while reducing validation bias, our proposed training protocol significantly improves generalization across various methods. Furthermore, we analyze how temporal data affects deep tabular representations, finding that these models often fail to capture crucial periodic and trend information. To address this gap, we introduce a plug-and-play temporal embedding method based on Fourier series expansion that learns and incorporates temporal patterns, offering an adaptive way to handle temporal shift. Our experiments demonstrate that this temporal embedding, combined with the improved training protocol, provides a more effective and robust framework for learning from temporal tabular data.
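As a rough illustration of a Fourier-series-style temporal embedding, the sketch below maps a scalar timestamp to sine and cosine pairs plus a raw linear term and appends them to a row's features. It is a simplified stand-in, not the paper's method: the abstract describes an embedding that learns temporal patterns, whereas the periods here are fixed by hand. The function name `fourier_time_features`, the weekly and yearly periods, and the example feature values are assumptions.

```python
import math

def fourier_time_features(t, periods, include_trend=True):
    """Map a scalar timestamp to one (sin, cos) pair per period, plus an optional linear term."""
    feats = []
    for p in periods:
        angle = 2.0 * math.pi * t / p
        feats.append(math.sin(angle))  # periodic component, phase-sensitive
        feats.append(math.cos(angle))  # paired with sin so phase is unambiguous
    if include_trend:
        feats.append(t)  # raw time so a downstream model can pick up a trend
    return feats

# Hypothetical setup: time measured in days, with weekly and yearly periods.
periods = [7.0, 365.25]
row_features = [0.3, 1.2]   # hypothetical original tabular features of one row
t = 14.0                    # hypothetical timestamp: day 14

time_feats = fourier_time_features(t, periods)
augmented = row_features + time_feats  # plug-in: concatenate before the tabular model
print(augmented)
```

Because the time features are simply concatenated to the input, a sketch like this can sit in front of any tabular model without changing its architecture. In practice the linear term would usually be normalized, and a learned variant would treat the frequencies as trainable parameters.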
