A Text-Based Knowledge-Embedded Soft Sensing Modeling Approach for General Industrial Process Tasks Based on Large Language Model

Shuo Tong, Han Liu, Runyuan Guo, Xueqiong Tian, Wenqing Wang, Ding Liu, Youmin Zhang

TL;DR

This work tackles the limitations of traditional data-driven soft sensors by introducing LLM-TKESS, a general framework that fuses time-series auxiliary variables with a pre-trained LLM via an AVS Encoder. A two-stage adaptation, first autoregressive parameter-efficient fine-tuning (PEFT) to form a soft sensing foundation model (SSFM) and then adapter-based downstream task customization, enables rapid, low-cost deployment across diverse soft-sensing tasks. Two text-based sensors, LLM-PSS and LLM-PDSS, embed natural language knowledge to enrich input representations and improve interpretability, while anomaly detection and missing value imputation are integrated through SSFM-TSA. Experiments on rotor deformation in an air preheater show state-of-the-art performance and strong few-shot capabilities, highlighting practical impact for robust, multimodal soft sensing with reduced data requirements.

Abstract

Data-driven soft sensors (DDSS) have become mainstream methods for predicting key performance indicators in process industries. However, DDSS development requires complex and costly customized designs tailored to each task during the modeling process. Moreover, DDSS are constrained to a single structured data modality, limiting their ability to incorporate additional contextual knowledge. Furthermore, the limited representation learning of DDSS leads to weak predictive performance when data are scarce. To address these challenges, we propose a general framework named LLM-TKESS (large language model for text-based knowledge-embedded soft sensing), harnessing the general problem-solving capabilities, cross-modal knowledge transfer abilities, and few-shot capabilities of LLMs for enhanced soft sensing modeling. Specifically, an auxiliary variable series encoder (AVS Encoder) is proposed to unleash the LLM's potential for capturing temporal relationships within series and spatial semantic relationships among auxiliary variables. Next, we propose a two-stage fine-tuning alignment strategy. In the first stage, parameter-efficient fine-tuning through autoregressive training adapts the LLM to process variable data, resulting in a soft sensing foundation model (SSFM). In the second stage, we train adapters to adapt the SSFM to various downstream tasks without modifying its architecture. We further propose two text-based knowledge-embedded soft sensors, which integrate a natural language modality to overcome the limitations of models built purely on structured data. In addition, benefiting from the LLM's pre-existing world knowledge, our model demonstrates outstanding predictive capabilities under small-sample conditions. Using the thermal deformation of an air preheater rotor as a case study, we validate through extensive experiments that LLM-TKESS exhibits outstanding performance.
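
The first stage above relies on parameter-efficient fine-tuning, and the figure list names LoRA as one of the building blocks. As a rough illustration of the general LoRA idea only (not the authors' implementation), the NumPy sketch below adds a trainable low-rank update to a frozen weight matrix. The layer sizes, the rank, the scaling value `alpha`, and the helper name `lora_forward` are all assumptions chosen for illustration; none of them come from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative sizes and hyperparameters (assumed, not from the paper).
d_in, d_out, rank = 16, 16, 4
alpha = 8.0  # LoRA scaling hyperparameter

# Frozen pre-trained weight, standing in for one LLM projection matrix.
W = rng.normal(size=(d_out, d_in))

# Trainable low-rank factors. B starts at zero, so the adapted layer
# initially reproduces the frozen layer exactly.
A = rng.normal(scale=0.01, size=(rank, d_in))
B = np.zeros((d_out, rank))


def lora_forward(x, W, A, B, alpha, rank):
    """Compute x W^T + (alpha / rank) * x A^T B^T for a batch of row vectors."""
    return x @ W.T + (alpha / rank) * (x @ A.T) @ B.T


x = rng.normal(size=(3, d_in))           # a batch of 3 input vectors
y_base = x @ W.T                         # output of the frozen layer
y_lora = lora_forward(x, W, A, B, alpha, rank)

# Only A and B would receive gradient updates during fine-tuning.
trainable = A.size + B.size
frozen = W.size
```

With these toy sizes the low-rank factors hold half as many parameters as the frozen matrix; for the much larger projection matrices of an actual LLM and a small rank, the trainable fraction becomes far smaller, which is what makes this style of adaptation cheap.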

Paper Structure (27 sections, 21 equations, 16 figures, 7 tables)

Figures (16)

  • Figure 1: Conceptual framework of soft sensing foundation model (SSFM) based on LLM.
  • Figure 2: Different types of anomalies and missing data.
  • Figure 3: Overall framework architecture of the proposed LLM-TKESS. (a) Structure of the SSFM autoregressive fine-tuning alignment phase. (b) Structure of the SSFM-TSA downstream task adaptation phase.
  • Figure 4: Structure of the proposed AVS Encoder.
  • Figure 5: Structure of LoRA and the task-specific adapter (TSA). (a) Structure of LoRA. (b) Structure of TSA.
  • ...and 11 more figures