SEE: Sememe Entanglement Encoding for Transformer-bases Models Compression

Jing Zhang, Shuzhen Sun, Peng Zhang, Guangxing Cao, Hui Gao, Xindian Ma, Nan Xu, Yuexian Hou

TL;DR

Transformer models incur high storage and compute costs, especially in their embedding layers. The authors propose Sememe Entanglement Encoding (SEE), which compresses embeddings by representing morphemes and sememes as low-dimensional vectors and reconstructing high-dimensional embeddings from them via generalized quantum entanglement, guided by HowNet knowledge. They introduce a two-stage fine-tuning procedure: the first stage uses MSE losses on embeddings and hidden states, and the second uses distillation and cross-entropy losses. They report 10x–80x embedding compression with minimal BLEU degradation on translation benchmarks, and show the method is viable on Phi3-1B-scale models. By integrating linguistic knowledge into model compression, this work supports deploying large transformers in resource-constrained environments.

Abstract

Transformer-based large language models exhibit groundbreaking capabilities, but their storage and computational costs are prohibitively high, limiting their application in resource-constrained scenarios. An effective approach is to eliminate redundant model parameters and computation while incorporating efficient expert-derived knowledge structures to balance compression and performance. Therefore, we propose the Sememe Entanglement Encoding (SEE) algorithm. Guided by expert prior knowledge, SEE compresses the model through low-rank approximation. In Entanglement Embedding, basic semantic units such as sememes are represented as low-dimensional vectors and then reconstructed into high-dimensional word embeddings by combining them through generalized quantum entanglement. We adapt the SEE algorithm to transformer-based models of different scales. Experimental results indicate that our approach maintains stable performance while reducing model parameters and computational costs.
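To make the reconstruction step concrete, here is a minimal NumPy sketch. It assumes that the "generalized quantum entanglement" combination can be modeled as a sum of Kronecker (tensor) products of small vectors, in the spirit of Word2Ket and MorphTE; the abstract does not give the exact formula, so this is an illustration, not the authors' implementation. The function name `entangled_embedding` and the values of `rank`, `order`, and `q` are hypothetical.

```python
from functools import reduce

import numpy as np


def entangled_embedding(unit_vectors: np.ndarray) -> np.ndarray:
    """Rebuild one high-dimensional embedding from low-dimensional unit vectors.

    unit_vectors has shape (rank, order, q). Each of the `rank` terms is the
    Kronecker product of `order` vectors of size q, and the terms are summed.
    ASSUMPTION: this sum-of-tensor-products form stands in for the paper's
    "generalized quantum entanglement" combination.
    """
    rank, order, q = unit_vectors.shape
    embedding = np.zeros(q ** order)
    for k in range(rank):
        # Chain np.kron over the `order` small vectors of the k-th term.
        embedding += reduce(np.kron, unit_vectors[k])
    return embedding


# Hypothetical sizes: 3 semantic units of dimension 8, combined at rank 2.
rank, order, q = 2, 3, 8
rng = np.random.default_rng(0)
units = rng.normal(size=(rank, order, q))

vec = entangled_embedding(units)
stored_values = rank * order * q  # numbers needed to rebuild this one vector
print(vec.shape, stored_values)
```

With these sizes, 48 stored values rebuild a 512-dimensional vector. In a full model the low-dimensional vectors would presumably be shared across many words (for example, one vector per sememe), so the overall compression ratio depends on the vocabulary and on the size of the shared unit table rather than on this per-word count alone.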

Paper Structure

This paper contains 17 sections, 9 equations, 5 figures, 5 tables.

Figures (5)

  • Figure 1: The diagram of Word2Ket and MorphTE
  • Figure 2: The Overall Method
  • Figure 3: Example of Morpheme Decomposition
  • Figure 4: Example of Sememe Decomposition
  • Figure 5: Hyperparameter Analysis