Large Language Models are Few-shot Multivariate Time Series Classifiers

Yakun Chen, Zihao Li, Chao Yang, Xianzhi Wang, Guandong Xu

TL;DR

This work tackles few-shot Multivariate Time Series Classification (MTSC) by leveraging pre-trained Large Language Models (LLMs). The authors propose LLMFew, a framework that transforms time-series patches into LLM-compatible embeddings via a Patch-wise Temporal Convolution Encoder (PTCEnc), fine-tunes the LLM with Low-Rank Adaptation (LoRA), and classifies the result with an MLP head. Across ten real-world MTSC datasets, LLMFew achieves substantial gains in the 1-shot setting and strong performance in the broader $K$-shot settings, with notable improvements over baselines on Handwriting and EthanolConcentration; ablations confirm the importance of both PTCEnc and LoRA fine-tuning. The work suggests that pre-trained knowledge in LLMs can be transferred effectively to time-series domains, enabling robust, data-efficient classification in industrial settings with limited labeled data.
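
As a rough illustration of the patching step mentioned above, the following minimal sketch splits a multivariate series into overlapping windows before any encoder sees it. The function name `make_patches` and the `patch_len` and `stride` values are hypothetical choices for this example, not settings taken from the paper, and the convolutional encoder itself is not shown.

```python
from typing import List

Series = List[List[float]]  # shaped [time][channel]


def make_patches(series: Series, patch_len: int, stride: int) -> List[Series]:
    """Split a multivariate series into (possibly overlapping) patches.

    Each returned patch is shaped [patch_len][channel].
    patch_len and stride are illustrative hyperparameters (assumptions).
    """
    if patch_len <= 0 or stride <= 0:
        raise ValueError("patch_len and stride must be positive")
    n_steps = len(series)
    patches: List[Series] = []
    # Slide a fixed-length window over the time axis; trailing steps that
    # do not fill a whole window are dropped in this simple version.
    for start in range(0, n_steps - patch_len + 1, stride):
        patches.append([row[:] for row in series[start:start + patch_len]])
    return patches


# Toy input: 8 time steps, 2 channels.
toy_series = [[float(t), float(10 * t)] for t in range(8)]
toy_patches = make_patches(toy_series, patch_len=4, stride=2)
```

With these toy settings the window starts at time steps 0, 2, and 4, so three patches of four steps each are produced. A real pipeline would typically pad the series so that no trailing steps are lost.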

Abstract

Large Language Models (LLMs) have been extensively applied in time series analysis. Yet their utility in few-shot classification of multivariate time series data, a crucial training scenario given the limited training data available in industrial applications, remains underexplored. We aim to leverage the extensive pre-trained knowledge in LLMs to overcome the data scarcity problem within multivariate time series. Specifically, we propose LLMFew, an LLM-enhanced framework to investigate the feasibility and capacity of LLMs for few-shot multivariate time series classification (MTSC). This model introduces a Patch-wise Temporal Convolution Encoder (PTCEnc) to align time series data with the textual embedding input of LLMs. We further fine-tune the pre-trained LLM decoder with Low-Rank Adaptation (LoRA) to enhance its feature representation learning ability on time series data. Experimental results show that our model outperforms state-of-the-art baselines by a large margin, achieving 125.2% and 50.2% improvements in classification accuracy on the Handwriting and EthanolConcentration datasets, respectively. Moreover, our experimental results demonstrate that LLM-based methods perform well across a variety of datasets in few-shot MTSC, delivering reliable results compared to traditional models. This success paves the way for their deployment in industrial environments where data are limited.
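
To make the LoRA idea in the abstract concrete, here is a minimal sketch of the general low-rank update: the pre-trained weight `W` stays frozen and only two small matrices `A` and `B` are trained, giving an effective weight `W + (alpha / r) * B A`. The matrix sizes, the rank, and `alpha` below are toy assumptions for illustration; the paper's actual LoRA configuration is not stated in this section.

```python
from typing import List

Matrix = List[List[float]]


def matmul(a: Matrix, b: Matrix) -> Matrix:
    """Plain nested-list matrix product (a is n x k, b is k x m)."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


def lora_effective_weight(w: Matrix, a: Matrix, b: Matrix, alpha: float) -> Matrix:
    """Return W + (alpha / r) * B @ A, where r is the rank (rows of A).

    w: frozen pre-trained weight, d_out x d_in
    a: trainable down-projection, r x d_in
    b: trainable up-projection, d_out x r
    """
    rank = len(a)
    scale = alpha / rank
    delta = matmul(b, a)
    return [
        [w[i][j] + scale * delta[i][j] for j in range(len(w[0]))]
        for i in range(len(w))
    ]


def count_params(m: Matrix) -> int:
    return len(m) * len(m[0])


# Toy 4 x 4 frozen weight (identity) with a rank-1 adapter.
w_frozen = [[1.0 if i == j else 0.0 for j in range(4)] for i in range(4)]
a_down = [[0.5, 0.0, 0.0, 0.0]]          # 1 x 4
b_zero = [[0.0], [0.0], [0.0], [0.0]]    # 4 x 1, zero-initialised
b_trained = [[1.0], [0.0], [0.0], [0.0]]  # hypothetical values after training

w_at_init = lora_effective_weight(w_frozen, a_down, b_zero, alpha=2.0)
w_adapted = lora_effective_weight(w_frozen, a_down, b_trained, alpha=2.0)
trainable = count_params(a_down) + count_params(b_trained)
```

Because `B` starts at zero, the adapted layer initially behaves exactly like the pre-trained one. In this toy case the adapter trains 8 values instead of the 16 in the full weight; the saving grows quickly with layer width, which is what makes the approach attractive when only a few labeled samples are available.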

Paper Structure

This paper contains 28 sections, 4 equations, 6 figures, 4 tables.

Figures (6)

  • Figure 1: Illustration of different training paradigms for MTSC. (a) classical end-to-end deep learning, not designed for few-shot scenarios, requires a large amount of labeled training samples; (b) transfer learning uses a pre-training stage to learn knowledge from source time series data from different domains and then fine-tunes the learned model for the target domain with limited training samples; (c) our proposed LLM-enhanced framework for few-shot scenarios without leveraging source time series data.
  • Figure 2: Framework of LLMFew. We first preprocess the MTS into a group of patches. The Patch-wise Temporal Convolution Encoder (PTCEnc) then converts these patches into time series embeddings that can be fed to the LLM. To activate the LLM's ability to learn temporal features, we use LoRA to fine-tune the Q, K, V parameters of the attention layers. Finally, a classification head predicts the class.
  • Figure 3: Average classification accuracy (%) under the $K$-shot setting, with $K=1,3,5$. All results are the average over 5 runs.
  • Figure 4: Average classification accuracy (%) with full training samples. The results for LLMFew, Crossformer and PatchTST are the average over 5 runs. The others are reported from OneFitsAll.
  • Figure 5: Inference time and GPU memory usage comparison for LLM-based methods. The results are for the PEMS-SF dataset. OneFitsAll is one of our baselines. The others are LLMFew variants, the same as those in Table \ref{tab:llms}. The results are the average over 5 runs.
  • ...and 1 more figure
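
The K-shot setting referenced in the Figure 3 caption can be sketched as follows, assuming the usual meaning of "K-shot" as K labeled training examples per class. The helper name `sample_k_shot`, the toy labels, and the seed handling are hypothetical; the paper's exact sampling protocol is not described in this section.

```python
import random
from collections import defaultdict
from typing import Dict, Hashable, List, Sequence


def sample_k_shot(labels: Sequence[Hashable], k: int, seed: int = 0) -> List[int]:
    """Pick k training indices per class, reproducibly for a given seed."""
    by_class: Dict[Hashable, List[int]] = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    rng = random.Random(seed)
    chosen: List[int] = []
    # Iterate classes in a fixed order so the result depends only on the seed.
    for label in sorted(by_class, key=str):
        indices = by_class[label]
        if len(indices) < k:
            raise ValueError(f"class {label!r} has fewer than {k} samples")
        chosen.extend(rng.sample(indices, k))
    return sorted(chosen)


# Toy label vector: 3 classes with 3 samples each.
toy_labels = ["a", "b", "a", "c", "b", "c", "a", "b", "c"]
one_shot = sample_k_shot(toy_labels, k=1)
three_shot = sample_k_shot(toy_labels, k=3)
```

Repeating the draw with different seeds and averaging the resulting accuracies would mirror the "average over 5 runs" reporting in the captions, though whether the runs in the paper vary the sampled subset, the model initialization, or both is an open assumption here.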