Field Matters: A lightweight LLM-enhanced Method for CTR Prediction

Yu Cui, Feng Liu, Jiawei Chen, Xingyu Lou, Changwang Zhang, Jun Wang, Yuegang Sun, Xiaohu Yang, Can Wang

TL;DR

This work targets CTR prediction at industrial scale by avoiding the heavy cost of instance-level or user/item-level LLM processing. It introduces LLaCTR, a field-level enhancement framework with three components: SSFT (self-supervised field-feature fine-tuning), which distills field semantics; FRE, which aligns field and feature embeddings; and FIE, which injects field-aware cues into feature interactions. Across four real-world datasets and six backbones, LLaCTR delivers about 2.24% relative AUC gains while reducing training time by 10–100x compared to prior LLM-enhanced methods, and ablations confirm that all three components are needed. The approach offers a practical, scalable way to integrate semantic knowledge from LLMs into production-grade CTR systems.

Abstract

Click-through rate (CTR) prediction is a fundamental task in modern recommender systems. In recent years, the integration of large language models (LLMs) has been shown to effectively enhance the performance of traditional CTR methods. However, existing LLM-enhanced methods often require extensive processing of detailed textual descriptions for large-scale instances or user/item entities, leading to substantial computational overhead. To address this challenge, this work introduces LLaCTR, a novel and lightweight LLM-enhanced CTR method that employs a field-level enhancement paradigm. Specifically, LLaCTR first utilizes LLMs to distill crucial and lightweight semantic knowledge from small-scale feature fields through self-supervised field-feature fine-tuning. Subsequently, it leverages this field-level semantic knowledge to enhance both feature representation and feature interactions. In our experiments, we integrate LLaCTR with six representative CTR models across four datasets, demonstrating its superior performance in terms of both effectiveness and efficiency compared to existing LLM-enhanced methods. Our code is available at https://anonymous.4open.science/r/LLaCTR-EC46.
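To make the field-level idea concrete, the following is a minimal sketch, not the authors' implementation. It assumes a toy schema (`FIELDS`), a stand-in text encoder (`fake_llm_embed`) in place of a fine-tuned LLM, and a simple additive fusion (`enhance`) in place of the paper's representation-enhancement step; all of these names are hypothetical. The point it illustrates is that the number of encoder calls scales with the number of fields rather than with the number of instances, users, or items.

```python
import random

random.seed(0)

# Hypothetical toy schema: field name -> feature values seen in the data.
FIELDS = {
    "gender": ["F", "M"],
    "city": ["Paris", "Tokyo", "Lima"],
    "item_category": ["book", "phone"],
}
DIM = 4  # embedding size for this toy example


def fake_llm_embed(text, dim=DIM):
    """Stand-in for an LLM text encoder (assumption, not a real API).

    Returns a pseudo-random vector that is deterministic for a given text.
    """
    rng = random.Random(text)
    return [rng.uniform(-1.0, 1.0) for _ in range(dim)]


# One encoder call per field, independent of how many instances exist.
field_emb = {name: fake_llm_embed(name) for name in FIELDS}

# Ordinary ID embeddings, one per (field, feature value) pair.
feature_emb = {
    (name, value): [random.gauss(0.0, 0.1) for _ in range(DIM)]
    for name, values in FIELDS.items()
    for value in values
}


def enhance(field, value, weight=0.5):
    """Blend a feature embedding with its field's semantic embedding.

    Additive fusion is an illustrative simplification chosen for this sketch.
    """
    f = field_emb[field]
    e = feature_emb[(field, value)]
    return [ei + weight * fi for ei, fi in zip(e, f)]


def encode_instance(instance):
    """Map one instance (field -> value) to a list of enhanced vectors."""
    return [enhance(field, value) for field, value in instance.items()]


instance = {"gender": "F", "city": "Tokyo", "item_category": "phone"}
vectors = encode_instance(instance)
llm_calls = len(field_emb)  # equals the number of fields
```

In this sketch, encoding a million instances would still require only `len(FIELDS)` encoder calls, since the field embeddings are computed once and reused for every feature value in that field.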

Paper Structure

This paper contains 27 sections, 14 equations, 7 figures, 8 tables.

Figures (7)

  • Figure 1: Current LLM-enhanced CTR paradigm versus our Field-level enhancement paradigm.
  • Figure 2: The overall framework of the proposed LLaCTR. It contains 1) SSFT: leveraging self-supervised field-feature fine-tuning to improve LLMs’ ability to capture field semantics; 2) FRE: leveraging field semantic embeddings to guide the learning of feature representations; 3) FIE: leveraging field semantic embeddings to enhance the modeling of feature interactions.
  • Figure 3: The template of self-supervised tuning.
  • Figure 4: AUC and training time of compared methods. “sam-num” denotes the number of features sampled for each field in LLaCTR, and “w/o FT” denotes the variant with fine-tuning removed.
  • Figure B.1: Empirical efficiency study on representative conventional CTR models and LLM-enhanced CTR models. Time cost is reported as a multiple of that of WuKong (the SOTA traditional CTR model).
  • ...and 2 more figures