Implicit Style Conditioning: A Structured Style-Rewrite Framework for Low-Resource Character Modeling

Chanhui Zhu

TL;DR

This work proposes a Structured Style-Rewrite Framework that explicitly disentangles style into three interpretable dimensions (lexical signatures, syntactic patterns, and pragmatic style) and introduces an implicit style conditioning strategy via Chain-of-Thought (CoT) distillation.

Abstract

Large Language Models (LLMs) have demonstrated impressive capabilities in role-playing (RP); however, aligning Small Language Models (SLMs) with highly stylized personas remains a challenge due to data scarcity and the complexity of style disentanglement. Standard Supervised Fine-Tuning (SFT) often captures surface-level semantics while failing to reproduce the intricate syntactic and pragmatic nuances of a character, leading to "Out-Of-Character" (OOC) generation. To address this, we propose a Structured Style-Rewrite Framework that explicitly disentangles style into three interpretable dimensions: lexical signatures (via PMI), syntactic patterns (grounded in PCFG rules), and pragmatic style. Furthermore, we introduce an implicit style conditioning strategy via Chain-of-Thought (CoT) distillation. By leveraging explicit reasoning traces during training as a strong inductive bias, our approach aligns the model's latent representations with structured style features, enabling high-fidelity stylized generation without requiring explicit reasoning tokens during inference. Extensive experiments on a specific high-stylization domain (anime characters) demonstrate that our method enables a Qwen-1.7B model to outperform significantly larger baselines (e.g., 4B Vanilla SFT) in style consistency and semantic fidelity. Our approach offers a data-efficient paradigm for democratizing inference and deployment on consumer hardware.
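To make the "lexical signatures (via PMI)" idea concrete, the following is a minimal sketch, not the paper's implementation. It assumes whitespace tokenization and scores each token by pointwise mutual information between the token and the character, PMI(w, c) = log(p(w | c) / p(w)), where p(w) is estimated over the character's lines plus a background corpus. The function name `lexical_signatures` and the parameters `top_k` and `min_count` are hypothetical choices made for illustration.

```python
import math
from collections import Counter


def lexical_signatures(character_lines, background_lines, top_k=5, min_count=2):
    """Rank tokens by PMI with a character (illustrative sketch only).

    PMI(w, c) = log( p(w | c) / p(w) ), where p(w | c) comes from the
    character's own lines and p(w) from character + background lines.
    """
    # Token counts inside the character's lines (naive whitespace tokenizer).
    char_counts = Counter(
        tok for line in character_lines for tok in line.lower().split()
    )
    # Token counts over the whole collection (background plus character).
    all_counts = Counter(
        tok for line in background_lines for tok in line.lower().split()
    )
    all_counts.update(char_counts)

    n_char = sum(char_counts.values())
    n_all = sum(all_counts.values())

    scores = {}
    for tok, count in char_counts.items():
        if count < min_count:  # skip rare tokens, whose PMI is unreliable
            continue
        p_w_given_c = count / n_char
        p_w = all_counts[tok] / n_all
        scores[tok] = math.log(p_w_given_c / p_w)

    # Highest-PMI tokens first.
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)[:top_k]


# Toy data, invented for this example.
character = ["nya hello nya", "nya there"]
background = ["hello there friend", "hello world"]
signatures = lexical_signatures(character, background)
```

On this toy data only "nya" passes the `min_count` filter, since it appears three times in the character's lines and nowhere in the background. A real pipeline would need a proper tokenizer and a much larger background corpus for the estimates to be meaningful.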

Paper Structure

This paper contains 120 sections, 9 equations, 4 figures, and 25 tables.

Figures (4)

  • Figure 1: Overview of the proposed structured style modeling and style-conditioned generation framework.
  • Figure 2: Joint distribution of semantic and style scores across models.
  • Figure 3: N-shot stability curves of structured style vectors on Muice. The dashed line marks the automatic convergence point at $N=50$.
  • Figure 4: Pareto envelope on the Semantic-Style plane. The outer envelope is formed by non-dominated model points; the dashed line at Semantic$=0.75$ and the shaded right-hand side indicate the high-fidelity (usable) regime.
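
The non-dominated envelope described in the Figure 4 caption can be sketched as follows. This is an illustrative example under stated assumptions, not the paper's code: each model is a (semantic, style) pair where higher is better on both axes, and the model names and scores are invented placeholders. Only the 0.75 semantic threshold comes from the caption.

```python
def pareto_front(points):
    """Return names of non-dominated points, sorted by semantic score.

    `points` maps a model name to (semantic, style); higher is better on
    both axes. A point is dominated if another point is at least as good
    on both axes and strictly better on at least one.
    """
    front = []
    for name, (sem, sty) in points.items():
        dominated = any(
            (s2 >= sem and t2 >= sty) and (s2 > sem or t2 > sty)
            for other, (s2, t2) in points.items()
            if other != name
        )
        if not dominated:
            front.append(name)
    return sorted(front, key=lambda n: points[n][0])


# Hypothetical scores for illustration only (not results from the paper).
models = {
    "model_a": (0.80, 0.50),
    "model_b": (0.70, 0.70),
    "model_c": (0.60, 0.60),  # dominated by model_b on both axes
    "model_d": (0.90, 0.30),
}

front = pareto_front(models)

# Keep only envelope points in the high-fidelity regime (Semantic >= 0.75).
usable = [name for name in front if models[name][0] >= 0.75]
```

With these placeholder values, `model_c` drops out because `model_b` beats it on both axes, and `model_b` then falls below the semantic threshold. The pairwise check is quadratic in the number of models, which is fine for the handful of points such a plot typically shows.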