Table of Contents
Fetching ...

TADACap: Time-series Adaptive Domain-Aware Captioning

Elizabeth Fons, Rachneet Kaur, Zhen Zeng, Soham Palande, Tucker Balch, Svitlana Vyetrenko, Manuela Veloso

TL;DR

TADACap addresses the problem of domain-aware captioning for time-series images without retraining. It combines a domain-agnostic captioner with a diverse retrieval strategy (TADACap-diverse) that samples from a target-domain database using a Determinantal Point Process and CLIP embeddings, then leverages in-context prompting of an LLM to produce domain-specific captions. The key contributions are a retrieval-based framework that adapts to new domains with minimal annotation, a diverse-sample retrieval method with $k=4$, and four new domain-aware time-series caption datasets; experiments show competitive semantic accuracy compared to retrained baselines and significant annotation-effort reductions. This approach enables practical, scalable domain adaptation for time-series captioning in finance, healthcare, and other domains where access to raw time-series data is limited or where rapid domain-specific captioning is valuable.

Abstract

While image captioning has gained significant attention, the potential of captioning time-series images, prevalent in areas like finance and healthcare, remains largely untapped. Existing time-series captioning methods typically offer generic, domain-agnostic descriptions of time-series shapes and struggle to adapt to new domains without substantial retraining. To address these limitations, we introduce TADACap, a retrieval-based framework to generate domain-aware captions for time-series images, capable of adapting to new domains without retraining. Building on TADACap, we propose a novel retrieval strategy that retrieves diverse image-caption pairs from a target domain database, namely TADACap-diverse. We benchmarked TADACap-diverse against state-of-the-art methods and ablation variants. TADACap-diverse demonstrates comparable semantic accuracy while requiring significantly less annotation effort.

TADACap: Time-series Adaptive Domain-Aware Captioning

TL;DR

TADACap addresses the problem of domain-aware captioning for time-series images without retraining. It combines a domain-agnostic captioner with a diverse retrieval strategy (TADACap-diverse) that samples from a target-domain database using a Determinantal Point Process and CLIP embeddings, then leverages in-context prompting of an LLM to produce domain-specific captions. The key contributions are a retrieval-based framework that adapts to new domains with minimal annotation, a diverse-sample retrieval method with , and four new domain-aware time-series caption datasets; experiments show competitive semantic accuracy compared to retrained baselines and significant annotation-effort reductions. This approach enables practical, scalable domain adaptation for time-series captioning in finance, healthcare, and other domains where access to raw time-series data is limited or where rapid domain-specific captioning is valuable.

Abstract

While image captioning has gained significant attention, the potential of captioning time-series images, prevalent in areas like finance and healthcare, remains largely untapped. Existing time-series captioning methods typically offer generic, domain-agnostic descriptions of time-series shapes and struggle to adapt to new domains without substantial retraining. To address these limitations, we introduce TADACap, a retrieval-based framework to generate domain-aware captions for time-series images, capable of adapting to new domains without retraining. Building on TADACap, we propose a novel retrieval strategy that retrieves diverse image-caption pairs from a target domain database, namely TADACap-diverse. We benchmarked TADACap-diverse against state-of-the-art methods and ablation variants. TADACap-diverse demonstrates comparable semantic accuracy while requiring significantly less annotation effort.

Paper Structure

This paper contains 20 sections, 5 equations, 8 figures, 2 tables.

Figures (8)

  • Figure 1: Motivation of domain-aware time-series captioning. The caption of a given time-series in one domain can be drastically different from the caption of the same (or similarly shaped) time-series in another domain.
  • Figure 2: Overview of proposed TADACap framework for domain-adaptive time-series captioning. TADACap generates an in-domain caption of a query image based on a retrieved set of $k$ domain-agnostic and in-domain caption pairs, which are used as examples in the prompt to the GPT decoder. When adopting the proposed strategy of retrieving diverse samples from the target domain database, we name our full proposed approach as TADACap-diverse.
  • Figure 3: Qualitative results on the RealCovid, RealKnee, SynthPhysics and SynthStock time-series dataset for domain-aware captioning.
  • Figure 4: t-SNE visualization of image embeddings for each dataset. Colors represent the different datasets.
  • Figure 5: Example of selected diverse $k=4$ samples in our test database SynthStock. (a) TSNE visualization of data distribution of CLIP image embeddings of images in the database, orange marks the selected samples; (b) selected diverse images.
  • ...and 3 more figures