Instruction-based Time Series Editing

Jiaxing Qiu, Dongliang Guo, Brynne Sullivan, Teague R. Henry, Thomas Hartvigsen

TL;DR

The paper tackles the rigidity of prior time series editors by introducing instruction-based editing, allowing natural-language guidance to steer edits while preserving unrelated characteristics. It proposes InstructTime, a multimodal editor that encodes time series and instructions into a shared hyperspherical space and uses interpolated decoding to control editing strength across multiple resolutions. The work demonstrates state-of-the-art editing quality, supports smooth interpolated editing, and offers few-shot tuning to adapt to unseen conditions, highlighting strong generalizability. These advances enable nuanced, hypothesis-driven edits in real-world contexts where textual notes accompany time series data, with potential applications in healthcare and beyond.

Abstract

In time series editing, we aim to modify some properties of a given time series without altering others. For example, when analyzing a hospital patient's blood pressure, we may add a sudden early drop and observe how it impacts their future while preserving other conditions. Existing diffusion-based editors rely on rigid, predefined attribute vectors as conditions and produce all-or-nothing edits through sampling. This attribute- and sampling-based approach limits flexibility in condition format and lacks customizable control over editing strength. To overcome these limitations, we introduce Instruction-based Time Series Editing, where users specify intended edits using natural language. This allows users to express a wider range of edits in a more accessible format. We then introduce InstructTime, the first instruction-based time series editor. InstructTime takes in time series and instructions, embeds them into a shared multi-modal representation space, then decodes their embeddings to generate edited time series. By learning a structured multi-modal representation space, we can easily interpolate between embeddings to achieve varying degrees of edit. To handle local and global edits together, we propose multi-resolution encoders. In our experiments, we use synthetic and real datasets and find that InstructTime is a state-of-the-art time series editor: InstructTime achieves high-quality edits with controllable strength, can generalize to unseen instructions, and can be easily adapted to unseen conditions through few-shot learning.
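
The abstract's idea of interpolating between embeddings to vary the degree of edit can be illustrated with a small sketch. The paper summary states that the shared space is hyperspherical but does not give the interpolation formula in this section, so spherical linear interpolation (slerp) is an assumption here, and the names `normalize`, `slerp`, `z_x`, `z_y`, and `w` are illustrative rather than taken from the authors' code. The sketch only moves between two embeddings; the decoder that turns an interpolated embedding into an edited time series is not shown.

```python
import math


def normalize(v):
    """Project a vector onto the unit hypersphere."""
    norm = math.sqrt(sum(x * x for x in v))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in v]


def slerp(z_x, z_y, w):
    """Spherical interpolation (assumed scheme): w=0 returns z_x, w=1 returns z_y."""
    a, b = normalize(z_x), normalize(z_y)
    # Clamp the dot product so rounding error cannot push acos out of its domain.
    dot = max(-1.0, min(1.0, sum(p * q for p, q in zip(a, b))))
    theta = math.acos(dot)
    if theta < 1e-8:
        # Embeddings (almost) coincide: nothing to interpolate.
        return a
    if math.pi - theta < 1e-8:
        raise ValueError("antipodal embeddings: the interpolation path is not unique")
    s = math.sin(theta)
    coef_a = math.sin((1.0 - w) * theta) / s
    coef_b = math.sin(w * theta) / s
    return [coef_a * p + coef_b * q for p, q in zip(a, b)]


# Toy 3-dimensional embeddings (hypothetical values, for illustration only).
z_x = [1.0, 0.0, 0.0]  # stands in for the time series embedding
z_y = [0.0, 1.0, 0.0]  # stands in for the instruction embedding

# Editing strengths from 0 to 1 in steps of 0.1, built from integers
# to avoid accumulating floating-point error.
strengths = [i / 10 for i in range(11)]
path = [slerp(z_x, z_y, w) for w in strengths]
```

Every point on `path` stays on the unit sphere, which is the reason to prefer a spherical scheme over a straight-line average when embeddings are constrained to a hypersphere.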

Paper Structure

This paper contains 25 sections, 11 equations, 7 figures, 4 tables.

Figures (7)

  • Figure 1: Instruction-based time series editing modifies a given time series using natural language instructions (e.g., adding abnormal events to a normal heart rate). Our approach enables controllable editing strength and generalizes to unseen instructions. In contrast, existing attribute- and sampling-based methods edit time series with uncontrolled strength based on predefined attribute vectors.
  • Figure 2: InstructTime architecture, training, and interpolated editing. InstructTime includes a multi-resolution time series encoder, an instruction encoder, and a time series conditional decoder. During training (left), time series–description pairs are encoded into latent representations $z_x$ and $z_y$ in a shared hyperspherical space, and the decoder reconstructs the input time series from both. During editing (right), we encode a time series and an instruction onto the shared embedding space, then decode interpolated embeddings between them to generate edits at varying editing strength.
  • Figure 3: Interpolated editing. As the editing strength $w$ increases, InstructTime gradually edits the flat trend into an upward trend while preserving either the local pattern of upward shifts in the mean (upper) or the global pattern of seasonality (lower). Diffusion-based TEdit generates upward trends while preserving other attributes, but with varying slopes and scattered distances from the input.
  • Figure 4: Progressive change in editability and preservability by InstructTime as editing strength $w$ increases from 0 to 1 (step 0.1). TEdit and Time Weaver edit time series without controllable strength.
  • Figure 5: Few-shot tuning of InstructTime on unseen conditions approaches the editability of a model trained on all levels.
  • ...and 2 more figures
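
The captions above refer to a multi-resolution time series encoder and to preserving local versus global patterns. This section does not describe how the encoder is built, so the following is only a generic sketch of what viewing one series at several resolutions can mean: averaging over non-overlapping windows of different sizes. The function names and the window sizes `(1, 4, 16)` are assumptions for illustration, not the authors' design.

```python
def average_pool(series, window):
    """Mean of consecutive non-overlapping windows.

    A trailing window shorter than `window` is averaged over its own length.
    """
    if window < 1:
        raise ValueError("window must be a positive integer")
    pooled = []
    for start in range(0, len(series), window):
        chunk = series[start:start + window]
        pooled.append(sum(chunk) / len(chunk))
    return pooled


def multi_resolution_views(series, windows=(1, 4, 16)):
    """Return one pooled copy of the series per window size (assumed sizes)."""
    return {w: average_pool(series, w) for w in windows}


# A toy series that repeats the pattern 0, 1, 2, 3 four times.
series = [float(i % 4) for i in range(16)]
views = multi_resolution_views(series)
```

With window 1 the view is the raw series, which keeps local detail such as the repeating pattern; larger windows smooth that detail away and leave only the coarse level, which is the kind of information a global edit such as a trend change would act on.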