Convolutional Lie Operator for Sentence Classification

Daniela N. Rim, Heeyoul Choi

TL;DR

The paper introduces Lie Convolutions for sentence classification, proposing SCLie and DPCLie to capture non-Euclidean transformations in language. By grounding convolutions in Lie group theory and applying a Lie-algebra-inspired kernel, the authors report improved accuracy over traditional ConvNet baselines on several benchmark datasets. They also analyze symmetry properties of the learned representations, finding that DPCLie yields smoother and more symmetry-aligned embeddings. Overall, the work suggests that non-Euclidean, symmetry-aware representations can enhance robustness and expressiveness in NLP tasks, and it motivates further exploration beyond Euclidean frameworks.

Abstract

Traditional Convolutional Neural Networks have been successful in capturing local, position-invariant features in text, but their capacity to model complex transformations within language can be further explored. In this work, we explore a novel approach by integrating Lie Convolutions into convolution-based sentence classifiers, inspired by the ability of Lie group operations to capture complex, non-Euclidean symmetries. Our proposed models, SCLie and DPCLie, empirically outperform traditional convolution-based sentence classifiers, suggesting that Lie-based models improve accuracy by capturing transformations not commonly associated with language. Our findings motivate more exploration of new paradigms in language modeling.

Paper Structure

This paper contains 13 sections, 5 equations, 3 figures, 4 tables.

Figures (3)

  • Figure 1: We show the conventional convolution-based sentence classifiers (a) SCNN and (c) DPCNN next to our proposed models incorporating a Convolutional Lie layer. (b) SCLie, a sentence classifier with a single Lie convolutional layer, convolves the input embeddings with multiple filter widths and feature maps to obtain sentence representations, then makes a classification after a pooling layer and a softmax layer. (d) DPCLie, a deeper version of (b), adds a block of convolutional layers that downsample (DS) the representations, adding depth to the architecture without significantly increasing the computational burden.
  • Figure 2: Proposed Lie convolution layer for sentence classification. We show a simplified example for one sentence in the batch, which is forwarded through the embedding layer to obtain a grid-type tensor. The embedding layer contains a data transformation analogous to a Lie group. The grid is convolved with $k_\theta$ dynamic filters/kernels of different sizes to capture different features. These representations are max-pooled to obtain a vector representation of the whole text, finally obtaining a classification.
  • Figure 3: Visualizations of 2-D t-SNE applied to the sentence representations of the trained DPCNN (a) and DPCLie (b) on the SST dataset. The data points are colored by the binary classification labels corresponding to each sentence.
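
The conventional pipeline that Figures 1(a) and 2 describe (embed the sentence, convolve with filters of several widths, max-pool, then classify with a softmax) can be sketched roughly as below. This is only an illustration of the standard SCNN-style baseline, not the authors' Lie convolution layer: the Lie-group transformation is not specified in this summary, so it is omitted. The function names, the ReLU nonlinearity, the filter widths (3, 4, 5), and all tensor sizes are assumptions chosen for the example.

```python
import numpy as np

def conv1d_over_time(x, w, b):
    """Valid 1-D convolution along the token axis.

    x: (seq_len, emb_dim) embedded sentence
    w: (width, emb_dim, n_maps) filter bank of one width
    b: (n_maps,) bias
    Returns an array of shape (seq_len - width + 1, n_maps).
    """
    width = w.shape[0]
    n_windows = x.shape[0] - width + 1
    # Stack every window of `width` consecutive token embeddings.
    windows = np.stack([x[i:i + width] for i in range(n_windows)])
    # Contract the (width, emb_dim) axes of each window against the filters.
    return np.tensordot(windows, w, axes=([1, 2], [0, 1])) + b

def scnn_forward(x, filters, w_out, b_out):
    """Multi-width convolution, max-over-time pooling, softmax classifier."""
    pooled = []
    for w, b in filters:
        feats = np.maximum(conv1d_over_time(x, w, b), 0.0)  # ReLU (assumed)
        pooled.append(feats.max(axis=0))                    # max over positions
    sentence_vec = np.concatenate(pooled)                   # sentence representation
    logits = sentence_vec @ w_out + b_out
    exp = np.exp(logits - logits.max())                     # numerically stable softmax
    return exp / exp.sum()

# Toy sizes, all hypothetical.
rng = np.random.default_rng(0)
seq_len, emb_dim, n_maps, n_classes = 12, 8, 4, 2
widths = (3, 4, 5)

x = rng.normal(size=(seq_len, emb_dim))  # stands in for the embedding layer output
filters = [(0.1 * rng.normal(size=(k, emb_dim, n_maps)), np.zeros(n_maps))
           for k in widths]
w_out = 0.1 * rng.normal(size=(len(widths) * n_maps, n_classes))
b_out = np.zeros(n_classes)

probs = scnn_forward(x, filters, w_out, b_out)
print(probs)
```

Max-over-time pooling gives each filter width a fixed-size output regardless of sentence length, which is why the pooled vectors can be concatenated into a single sentence representation before the classifier.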