Structural Entropy Guided Probabilistic Coding
Xiang Huang, Hao Peng, Li Sun, Hui Lin, Chunyang Liu, Jiang Cao, Philip S. Yu
TL;DR
SEPC addresses uncertainty-aware representation learning by injecting structural information between latent variables into probabilistic coding through a structural entropy regularization loss. It pairs an encoder-only probabilistic model with an encoding tree and extends structural entropy to regression tasks via a probabilistic, soft-label encoding tree. Across 12 natural language understanding tasks, the approach reports state-of-the-art performance together with strong generalization and robustness to label noise. Combining $L_{PC}$, $L_{SE}$, and the soft-label encoding tree makes the latent space more discriminative while keeping the method applicable to regression. An open-source implementation is available at https://github.com/SELGroup/SEPC.
Abstract
Probabilistic embeddings have several advantages over deterministic embeddings: they map each data point to a distribution, which better describes the uncertainty and complexity of the data. Many works focus on adjusting the distribution constraint under the Information Bottleneck (IB) principle to enhance representation learning. However, the regularization terms proposed so far constrain each latent variable individually and omit the structural information between latent variables. In this paper, we propose a novel structural entropy-guided probabilistic coding model, named SEPC. Specifically, we incorporate the relationship between latent variables into the optimization by proposing a structural entropy regularization loss. In addition, because traditional structural information theory is not well suited to regression tasks, we propose a probabilistic encoding tree that converts regression tasks into classification tasks while diminishing the influence of the transformation. Experimental results across 12 natural language understanding tasks, including both classification and regression tasks, demonstrate the superior performance of SEPC compared to other state-of-the-art models in terms of effectiveness, generalization capability, and robustness to label noise. The code and datasets are available at https://github.com/SELGroup/SEPC.
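To make the idea of a probabilistic embedding with a per-variable distribution constraint concrete, the following is a minimal sketch and not the SEPC implementation. It assumes a diagonal Gaussian embedding and a KL term toward a standard normal prior, as commonly used in IB-style models; the function names `gaussian_embed` and `kl_to_standard_normal` and the example values are hypothetical. The structural entropy loss and the encoding tree are not shown.

```python
import math
import random


def gaussian_embed(mu, logvar, rng):
    """Draw z ~ N(mu, diag(exp(logvar))) with the reparameterization trick.

    mu and logvar stand in for encoder outputs (assumed here, not computed).
    """
    return [m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0)
            for m, lv in zip(mu, logvar)]


def kl_to_standard_normal(mu, logvar):
    """KL( N(mu, diag(exp(logvar))) || N(0, I) ) for one data point.

    This constrains each latent variable on its own; it carries no
    information about relationships between latent variables.
    """
    return 0.5 * sum(math.exp(lv) + m * m - 1.0 - lv
                     for m, lv in zip(mu, logvar))


# Hypothetical encoder outputs for a single input.
rng = random.Random(0)
mu = [0.5, -1.0, 0.0]
logvar = [0.0, -2.0, 0.3]

z = gaussian_embed(mu, logvar, rng)      # one stochastic embedding
kl = kl_to_standard_normal(mu, logvar)   # per-example regularization term
```

The KL term is zero only when the embedding matches the prior exactly, and as the variance shrinks the sample collapses onto `mu`, which recovers a deterministic embedding.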
