TS-HINT: Enhancing Semiconductor Time Series Regression Using Attention Hints From Large Language Model Reasoning

Jonathan Adam Rico, Nagarajan Raghavan, Senthilnath Jayavelu

TL;DR

The paper tackles the prediction of wafer material removal rate (MRR) during chemical mechanical polishing (CMP) without direct in-process measurements by introducing TS-Hint, a Time Series Foundation Model enhanced with LLM-based chain-of-thought reasoning. By generating attention hints from the attention and saliency maps of a pretrained model and integrating them into a multivariate time-series regression framework, the approach preserves temporal dynamics and improves data efficiency, including in few-shot learning. Experimental results on the PHM 2016 CMP dataset show that TS-Hint achieves strong full-data performance (an RMSE of 3.92) and notable gains in limited-data settings, highlighting the value of attention-driven guidance. The work advances the use of LLM reasoning in regression tasks on time-series data and offers a pathway toward more data-efficient semiconductor process modeling, albeit with higher computational costs and room for improvement in in-depth time-series reasoning.

Abstract

Existing data-driven methods rely on the extraction of static features from time series to approximate the material removal rate (MRR) of semiconductor manufacturing processes such as chemical mechanical polishing (CMP). However, this leads to a loss of temporal dynamics. Moreover, these methods require a large amount of data for effective training. In this paper, we propose TS-Hint, a Time Series Foundation Model (TSFM) framework integrated with chain-of-thought reasoning that provides attention hints during training based on attention mechanism data and saliency data. Experimental results demonstrate that our model is effective in limited-data settings via few-shot learning and that it can learn directly from multivariate time series features.
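
The abstract's point that static feature extraction loses temporal dynamics can be illustrated with a minimal sketch. It is not taken from the paper: the `static_features` helper, the choice of summary statistics (mean, standard deviation, minimum, maximum), and the synthetic ramp signals are all assumptions made for illustration only.

```python
import numpy as np


def static_features(series: np.ndarray) -> np.ndarray:
    """Collapse a (timesteps, sensors) array into per-sensor summary statistics.

    Hypothetical helper: the statistics chosen here are an assumption,
    not the feature set used by any specific prior method.
    """
    return np.concatenate([
        series.mean(axis=0),
        series.std(axis=0),
        series.min(axis=0),
        series.max(axis=0),
    ])


# Hypothetical single-sensor signal: a steady ramp up over 50 time steps.
ramp_up = np.linspace(0.0, 1.0, 50).reshape(-1, 1)
# The same values in reverse order: a ramp down.
ramp_down = ramp_up[::-1]

# The summary statistics ignore ordering, so they match for both signals.
same_features = np.allclose(static_features(ramp_up), static_features(ramp_down))
# The raw sequences themselves are different.
same_series = np.array_equal(ramp_up, ramp_down)

print("static features identical:", same_features)
print("time series identical:", same_series)
```

Two signals with opposite trends produce the same static feature vector here, so a regressor trained only on such features could not tell them apart. A model that consumes the full multivariate time series does not face this particular limitation.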

Paper Structure

This paper contains 10 sections, 5 equations, 4 figures, 3 tables.

Figures (4)

  • Figure 1: Overview of the TS-Hint architecture, showing (a) pretraining on 15% of the training data, (b) few-shot fine-tuning one sample at a time, enhanced by LLM chain-of-thought reasoning, and (c) inference on the test set. Dashed arrows indicate the attention data and saliency data, which are retrieved from the pretrained model once.
  • Figure 2: Chemical mechanical polishing setup.
  • Figure 3: Scatter plot of true vs. predicted average MRR for TS-Hint regression using the full training data.
  • Figure 4: Before, after, and absolute difference for 1-shot fine-tuning: (top) attention maps and attention hint; (bottom) saliency maps.