Seq-to-Final: A Benchmark for Tuning from Sequential Distributions to a Final Time Point

Christina X Ji; Ahmed M Alaa; David Sontag

TL;DR

This paper introduces Seq-to-Final, a modular benchmark for learning under temporal distribution shift, where the goal is to optimize performance at a final time point using a sequence of historical data. It defines the sequence-to-final problem, provides six building blocks of shift (input, output, and intermediate) for constructing synthetic sequences on CIFAR-10 and CIFAR-100, validates the findings on the real-world Portraits dataset, and quantifies shifts with Wasserstein-2 distances. The study compares 16 methods across three classes: learning from all data without adapting to the final period, pre-training on historical data and then adapting to the final period, and leveraging the sequential structure of the history to target the final distribution. On the benchmark sequences, disregarding the sequential structure can be as effective as leveraging it, and LP-FT excels for label flips. A key contribution is the open, modular benchmark, which supports reproducible, scalable evaluation and visualization of how historical data influences learning at the final time point, and which can inform future algorithm design for temporal distribution shift.
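As an illustration of what a Wasserstein-2 distance measures, the sketch below computes it for one-dimensional samples, where it reduces to comparing sorted values. This is an assumption-laden toy: the summary does not say how the benchmark computes its distances on images, and `empirical_w2_1d` and the shifted Gaussian samples are hypothetical names and data, not part of the benchmark.

```python
import numpy as np

def empirical_w2_1d(x, y):
    """Wasserstein-2 distance between two equal-size 1-D samples.

    In one dimension the optimal coupling matches sorted values,
    so W2 is the root mean squared gap between order statistics.
    """
    x = np.sort(np.asarray(x, dtype=float))
    y = np.sort(np.asarray(y, dtype=float))
    if x.shape != y.shape:
        raise ValueError("this sketch assumes equal sample sizes")
    return float(np.sqrt(np.mean((x - y) ** 2)))

# Hypothetical sequence: each step translates the step-0 sample by 0.5.
rng = np.random.default_rng(0)
base = rng.normal(size=1000)
steps = [base + 0.5 * t for t in range(4)]

# Distance from step 0 to every step in the sequence.
dists = [empirical_w2_1d(steps[0], s) for s in steps]
```

Because each step here is a pure translation of the same sample, the distance from step 0 grows linearly with the step index, which is the kind of cumulative drift such a metric is meant to expose.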

Abstract

Distribution shift over time occurs in many settings. Leveraging historical data is necessary to learn a model for the last time point when limited data is available in the final period, yet few methods have been developed specifically for this purpose. In this work, we construct a benchmark with different sequences of synthetic shifts to evaluate the effectiveness of 3 classes of methods that 1) learn from all data without adapting to the final period, 2) learn from historical data with no regard to the sequential nature and then adapt to the final period, and 3) leverage the sequential nature of historical data when tailoring a model to the final period. We call this benchmark Seq-to-Final to highlight the focus on using a sequence of time periods to learn a model for the final time point. Our synthetic benchmark allows users to construct sequences with different types of shift and compare different methods. We focus on image classification tasks using CIFAR-10 and CIFAR-100 as the base images for the synthetic sequences. We also evaluate the same methods on the Portraits dataset to explore the relevance to real-world shifts over time. Finally, we create a visualization to contrast the initializations and updates from different methods at the final time step. Our results suggest that, for the sequences in our benchmark, methods that disregard the sequential structure and adapt to the final time point tend to perform well. The approaches we evaluate that leverage the sequential nature do not offer any improvement. We hope that this benchmark will inspire the development of new algorithms that are better at leveraging sequential historical data or a deeper understanding of why methods that disregard the sequential nature are able to perform well.
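The three classes of methods can be contrasted on a toy problem. The sketch below is not the benchmark's code: it assumes a linear regression task whose true weights drift across four steps, uses plain gradient descent, and uses sequential fine-tuning as one simple stand-in for the third class. All names (`make_step`, `fit`, `mse`), sample sizes, and hyperparameters are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 20  # number of features (hypothetical)

def make_step(w, n):
    """Draw n samples from a linear model with weights w plus small noise."""
    X = rng.normal(size=(n, d))
    return X, X @ w + 0.1 * rng.normal(size=n)

def fit(X, y, w0, lr=0.05, epochs=200):
    """Full-batch gradient descent on mean squared error, starting from w0."""
    w = w0.copy()
    for _ in range(epochs):
        w -= lr * 2.0 * X.T @ (X @ w - y) / len(y)
    return w

def mse(X, y, w):
    return float(np.mean((X @ w - y) ** 2))

# Toy sequence: the true weights drift a little at every step.
drift = rng.normal(size=d)
weights = [np.ones(d) + 0.2 * t * drift for t in range(4)]
history = [make_step(w, 200) for w in weights[:-1]]  # plentiful historical data
X_fin, y_fin = make_step(weights[-1], 10)            # scarce final-period data
X_test, y_test = make_step(weights[-1], 2000)        # held-out final-period data

X_hist = np.vstack([X for X, _ in history])
y_hist = np.concatenate([y for _, y in history])
zeros = np.zeros(d)

# Class 1: learn from all data, with no adaptation to the final period.
w_all = fit(np.vstack([X_hist, X_fin]), np.concatenate([y_hist, y_fin]), zeros)

# Class 2: pre-train on pooled history (order ignored), then adapt briefly.
w_ft = fit(X_fin, y_fin, fit(X_hist, y_hist, zeros), epochs=20)

# Class 3 (stand-in): fine-tune through the steps in temporal order.
w_seq = zeros
for X, y in history:
    w_seq = fit(X, y, w_seq)
w_seq = fit(X_fin, y_fin, w_seq, epochs=20)

results = {
    "all_data": mse(X_test, y_test, w_all),
    "pretrain_finetune": mse(X_test, y_test, w_ft),
    "sequential_finetune": mse(X_test, y_test, w_seq),
}
```

Printing `results` gives the held-out error of each strategy at the final step. Which one wins on this toy depends on the drift and sample sizes chosen, so it should not be read as reproducing the paper's findings.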

Paper Structure (29 sections, 12 equations, 17 figures, 15 tables)

Introduction
Benchmark Construction
Problem Definition
Building Blocks: Types of Shift
Sequences of Synthetic Shifts
Sequence of Real-World Shifts
Methods to Learn from Data with Sequential Distribution Shift
Oracle and Baseline
Learning from All Distributions
Pre-training with Historical Data and Adapting to the Final Distribution
Leveraging Sequential Historical Data and Targeting the Final Distribution
Results
Visual Exploration of Models
Discussion
Examples of Benchmark Sequences
...and 14 more sections

Figures (17)

Figure 1: Linear interpolation paths from initialization to final model at last step. Top right: Rotation, corruption, and label flip sequence. Other plots: Corruption, label flip, and rotation sequence. Top: 4-block convolutional networks. Bottom left: 4-block dense network. Bottom right: 4-block residual network. Architectures are described in the appendix.
Figure 2: Illustration of corruption, rotation, and recoloring applied repeatedly to two images. Leftmost images are the original images from CIFAR-10. Transformations are applied repeatedly going to the right.
Figure 3: Order of sub-populations in first 10 label classes in CIFAR-100. The first two sub-populations are present at step 0. The other three sub-populations are introduced in the order shown at steps 1, 2, and 3.
Figure 4: Order of sub-populations in second 10 label classes in CIFAR-100. The first two sub-populations are present at step 0. The other three sub-populations are introduced in the order shown at steps 1, 2, and 3.
Figure 5: Images in CIFAR-10 shift sequence: corruption, label flip, and rotation. At each step, a new shift is added on top of the shifts in the previous rows.
...and 12 more figures
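The linear interpolation paths named in the Figure 1 caption can be sketched as follows. This is a minimal illustration under stated assumptions: the parameter vectors and the quadratic loss are hypothetical placeholders, and the summary does not specify how the authors evaluate or plot the paths.

```python
import numpy as np

def interpolate(theta_init, theta_final, alphas):
    """Points on the straight line from theta_init (alpha=0) to theta_final (alpha=1)."""
    return [(1.0 - a) * theta_init + a * theta_final for a in alphas]

def quadratic_loss(theta, target):
    """Placeholder loss: squared distance to a target parameter vector."""
    return float(np.sum((theta - target) ** 2))

# Hypothetical flattened parameters before and after training at the last step.
theta_init = np.array([0.0, 0.0])
theta_final = np.array([2.0, 1.0])

alphas = np.linspace(0.0, 1.0, 5)
path = interpolate(theta_init, theta_final, alphas)

# Loss profile along the path; with a real model this would be the
# final-step loss evaluated at each interpolated set of weights.
losses = [quadratic_loss(p, theta_final) for p in path]
```

Plotting `losses` against `alphas` for several methods on the same axes would show how far each initialization starts from its final model and how the loss changes along the way.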
