Advancing Enterprise Spatio-Temporal Forecasting Applications: Data Mining Meets Instruction Tuning of Language Models For Multi-modal Time Series Analysis in Low-Resource Settings

Sagar Srinivas Sakhinana, Geethan Sannidhi, Chidaksh Ravuru, Venkataramana Runkana

TL;DR

This work tackles enterprise-scale multivariate spatio-temporal forecasting (MTSF) under resource constraints and data privacy concerns by uniting time-series representation learning with instruction-tuned open-source language models. The core contribution is MultiTs Net, a dynamic, multi-modal framework that combines a time-first, space-aware representation with a dynamic prompt pool and on-device fine-tuning of small LMs via Mixture of Parameter-Efficient Experts (MoPEs) and LoRA, enabling on-premises deployment. Key innovations include a dynamic prompting mechanism, Grouped-Query Attention for intra- and inter-series dependencies, Graph Chebyshev convolution to leverage prior domain graphs, and uncertainty estimation through Gaussian NLL, all validated on PeMS/METR-LA datasets with strong empirical gains and robust ablation results. The framework demonstrates accuracy, uncertainty quantification, and privacy-preserving deployment on consumer hardware, offering a scalable pathway for real-world enterprise forecasting without relying on external APIs or excessive computation. The combination of explicit domain graphs, data-driven relational learning, and cross-modal text representations provides new opportunities for interpretable, robust forecasting in non-stationary environments.
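To make the uncertainty objective named above concrete, here is a minimal sketch of a Gaussian negative log-likelihood in plain Python. It assumes the model emits a mean and a variance per forecast point; the function name `gaussian_nll` and the `eps` variance floor are illustrative choices, not details taken from the paper.

```python
import math


def gaussian_nll(targets, means, variances, eps=1e-6):
    """Mean Gaussian negative log-likelihood over paired forecast points.

    Assumption: each forecast is a (mean, variance) pair; `eps` is a
    hypothetical floor that keeps the logarithm and division stable.
    """
    if not targets or not (len(targets) == len(means) == len(variances)):
        raise ValueError("inputs must be non-empty and of equal length")
    total = 0.0
    for y, mu, var in zip(targets, means, variances):
        var = max(var, eps)  # guard against zero or negative variance
        # 0.5 * [log(2*pi*var) + (y - mu)^2 / var]
        total += 0.5 * (math.log(2.0 * math.pi * var) + (y - mu) ** 2 / var)
    return total / len(targets)


# A confident but wrong forecast is penalized far more than a cautious one.
overconfident = gaussian_nll([2.0], [0.0], [0.01])
cautious = gaussian_nll([2.0], [0.0], [4.0])
```

The loss rewards small variances only when the mean is close to the target, which is why it can serve as a training signal for calibrated prediction intervals.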

Abstract

Spatio-temporal forecasting is crucial in transportation, logistics, and supply chain management. However, current methods struggle with large, complex datasets. We propose a dynamic, multi-modal approach that integrates the strengths of traditional forecasting methods and instruction tuning of small language models for time series trend analysis. This approach utilizes a mixture-of-experts (MoE) architecture with parameter-efficient fine-tuning (PEFT) methods, tailored for consumer hardware to scale up AI solutions in low-resource settings while balancing performance and latency. Additionally, our approach leverages related past experiences for similar input time series to efficiently handle both intra-series and inter-series dependencies of non-stationary data with a time-then-space modeling approach, using grouped-query attention, while mitigating the limitations of traditional forecasting techniques in handling distributional shifts. Our approach models predictive uncertainty to improve decision-making. Our framework enables on-premises customization with reduced computational and memory demands, while maintaining inference speed and data privacy/security. Extensive experiments on various real-world datasets demonstrate that our framework provides robust and accurate forecasts, significantly outperforming existing methods.
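As a rough illustration of the grouped-query attention mentioned in the abstract, the sketch below shows the general technique in plain Python: several query heads share one key/value head, which shrinks the key/value memory footprint. The head counts, tensor layout, and the function name `grouped_query_attention` are assumptions for illustration; this is not the paper's configuration, and no masking is applied.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def grouped_query_attention(queries, keys, values):
    """Generic grouped-query attention (no mask).

    queries: [n_q_heads][seq_len][d]
    keys, values: [n_kv_heads][seq_len][d], with n_q_heads % n_kv_heads == 0.
    Returns outputs shaped [n_q_heads][seq_len][d_v].
    """
    n_q, n_kv = len(queries), len(keys)
    if n_kv == 0 or len(values) != n_kv or n_q % n_kv != 0:
        raise ValueError("query heads must be a multiple of key/value heads")
    group = n_q // n_kv  # number of query heads sharing one key/value head
    outputs = []
    for h, q_head in enumerate(queries):
        k_head = keys[h // group]
        v_head = values[h // group]
        scale = 1.0 / math.sqrt(len(q_head[0]))
        d_v = len(v_head[0])
        head_out = []
        for q in q_head:
            # Scaled dot-product scores against every key position.
            scores = [scale * sum(a * b for a, b in zip(q, k)) for k in k_head]
            weights = softmax(scores)
            head_out.append(
                [sum(w * v[j] for w, v in zip(weights, v_head)) for j in range(d_v)]
            )
        outputs.append(head_out)
    return outputs


# Four query heads share two key/value heads (group size 2).
zero_q = [[[0.0, 0.0], [0.0, 0.0]] for _ in range(4)]
kv_keys = [[[1.0, 0.0], [0.0, 1.0]], [[1.0, 1.0], [0.5, 0.5]]]
kv_values = [[[1.0, 0.0], [3.0, 0.0]], [[0.0, 10.0], [0.0, 20.0]]]
out = grouped_query_attention(zero_q, kv_keys, kv_values)
```

With all-zero queries the attention weights are uniform, so each output row is the average of the value rows of the shared key/value head, which makes the head-to-group mapping easy to check by hand.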

Paper Structure (19 sections, 13 equations, 3 figures, 6 tables)

Figures (3)

  • Figure 1: The figure above illustrates our proposed comprehensive framework that synergizes three core strategies for advanced time series analysis. First, it employs a dynamic prompting mechanism that leverages historical learned patterns to adapt to emerging trends and capture dependencies both within and across time series to compute context-aware time series embeddings. Second, it fine-tunes a smaller language model using instruction-following data generated by larger models for time series trend analysis, yielding text-level embeddings that encapsulate these patterns. Lastly, it integrates these diverse, complementary cross-modal embeddings, offering accurate forecasts and improved generalization and scalability for practical forecasting applications. The model architecture is explained in detail in the Method section. Refer to the technical appendix for more information.
  • Figure 2: The table displays the pointwise prediction error across multiple forecast horizons on benchmark datasets.
  • Figure 3: The figure shows the uncertainty estimates for forecasts from the w/Unc-MultiTs Net framework on a sample of sensors (nodes) from the benchmark datasets.