SPEAR: Soft Prompt Enhanced Anomaly Recognition for Time Series Data
Hanzhe Wei, Jiajun Wu, Jialin Yang, Henry Leung, Steve Drew
TL;DR
SPEAR addresses time series anomaly detection by combining soft prompt tuning with quantization to adapt small, pre-trained LLMs without fine-tuning the entire model. The pipeline quantizes a continuous series into discrete tokens, embeds them, and prepends learnable soft prompts before feeding the sequence into a frozen LLM; a lightweight classifier head is trained with a cross-entropy loss. The training labels are enriched with contextual anomalies through automated labeling of monotonic trends, spikes, shifts, and volatility changes, and class imbalance is mitigated with T-SMOTE. Empirical results on MIMIC-IV, NAB, and NASA show that SPEAR (especially SPEAR-BERT) outperforms zero-shot baselines on balanced metrics such as AUROC and AUPR while remaining memory-efficient enough for deployment. The work demonstrates that small LLMs with soft prompts can achieve competitive anomaly-detection performance, offering a privacy-friendly and cost-effective alternative to fine-tuning large models.
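The front end of the pipeline (quantize, embed, prepend soft prompts) can be illustrated with a minimal sketch. It assumes uniform binning between the series minimum and maximum, which may differ from the quantizer SPEAR actually uses; the names `quantize`, `embedding_table`, and `soft_prompts`, the bin count, and the vector sizes are all hypothetical choices for illustration.

```python
import random

def quantize(series, n_bins=8):
    """Map each real value to an integer token id in [0, n_bins - 1].

    Assumption: uniform-width bins between min and max of the series.
    """
    lo, hi = min(series), max(series)
    if hi == lo:  # constant series: every value falls in bin 0
        return [0] * len(series)
    width = (hi - lo) / n_bins
    # min(...) clamps the maximum value, which would otherwise land in bin n_bins
    return [min(int((x - lo) / width), n_bins - 1) for x in series]

series = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 40.0]  # last point is a spike
tokens = quantize(series, n_bins=8)
print(tokens)  # → [0, 0, 0, 0, 0, 1, 1, 7]

rng = random.Random(0)
EMBED_DIM = 4    # hypothetical embedding size
N_PROMPTS = 3    # hypothetical number of soft prompt vectors

# Frozen lookup table: one vector per token id (stands in for the LLM's input embeddings).
embedding_table = [[rng.gauss(0, 1) for _ in range(EMBED_DIM)] for _ in range(8)]
# Learnable soft prompts: free vectors that do not correspond to any token id.
soft_prompts = [[rng.gauss(0, 1) for _ in range(EMBED_DIM)] for _ in range(N_PROMPTS)]

token_embeds = [embedding_table[t] for t in tokens]
model_input = soft_prompts + token_embeds  # prompts first, then the embedded series
print(len(model_input), len(model_input[0]))  # → 11 4
```

In this toy series a single spike stretches the bin range, so the ordinary values collapse into two bins. That is a property of the uniform binning assumed here, not necessarily of the paper's quantizer.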
Abstract
Time series anomaly detection plays a crucial role in a wide range of fields, such as healthcare and internet traffic monitoring. The emergence of large language models (LLMs) offers new opportunities for detecting anomalies in ubiquitous time series data. Traditional approaches struggle with variable-length time series and context-based anomalies. We propose Soft Prompt Enhanced Anomaly Recognition (SPEAR), a novel approach that leverages LLMs for anomaly detection using soft prompts and quantization. Our methodology quantizes the time series data, transforms it into input embeddings, and combines these with learnable soft prompt embeddings. The combined embeddings are then fed into a frozen LLM. The soft prompts are updated iteratively based on a cross-entropy loss, allowing the model to adapt to time series anomaly detection. Soft prompts adapt the LLM to time series tasks without changing its weights, while quantization converts continuous values into the discrete sequences that LLMs are designed to handle. Our experimental results demonstrate that soft prompts effectively improve LLM performance on downstream time series anomaly detection tasks.
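The update rule described above, in which only the soft prompts receive gradients from a cross-entropy loss, can be sketched with a toy model. This is not the SPEAR implementation: the frozen LLM is replaced by a hypothetical stand-in (mean pooling followed by a fixed linear layer `W`) so the gradient can be written by hand, and the classifier head is kept frozen for brevity. All names, sizes, and the learning rate are assumptions.

```python
import math
import random

rng = random.Random(0)
D, P, N_CLASSES = 4, 2, 2  # embedding size, number of soft prompts, normal vs. anomaly

# Frozen weights (never updated): toy stand-in for the LLM backbone plus head.
W = [[rng.gauss(0, 1) for _ in range(D)] for _ in range(N_CLASSES)]
# Trainable parameters: the soft prompt vectors only.
prompts = [[0.0] * D for _ in range(P)]

def forward(prompts, token_embeds):
    """Return class probabilities and the total sequence length."""
    seq = prompts + token_embeds
    pooled = [sum(v[d] for v in seq) / len(seq) for d in range(D)]
    logits = [sum(W[c][d] * pooled[d] for d in range(D)) for c in range(N_CLASSES)]
    m = max(logits)  # subtract the max for a numerically stable softmax
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps], len(seq)

def train_step(prompts, token_embeds, label, lr=0.5):
    """One gradient step on the soft prompts; returns the loss before the step."""
    probs, seq_len = forward(prompts, token_embeds)
    loss = -math.log(probs[label])  # cross-entropy for a single example
    # d(loss)/d(logits) = probs - one_hot(label)
    dlogits = [p - (1.0 if c == label else 0.0) for c, p in enumerate(probs)]
    # Each prompt vector contributes 1/seq_len to the pooled vector,
    # so every prompt receives the same gradient in this toy model.
    grad = [sum(dlogits[c] * W[c][d] for c in range(N_CLASSES)) / seq_len
            for d in range(D)]
    for p in prompts:
        for d in range(D):
            p[d] -= lr * grad[d]
    return loss

# Hypothetical embedded, quantized series of length 6, labeled as anomalous (class 1).
token_embeds = [[rng.gauss(0, 1) for _ in range(D)] for _ in range(6)]
losses = [train_step(prompts, token_embeds, label=1) for _ in range(50)]
print(losses[0] > losses[-1])  # the loss falls while W stays fixed
```

In a real setup the stand-in would be replaced by a pre-trained model with its parameters excluded from the optimizer, and the prompts would be optimized over many labeled windows rather than a single example.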
