Sentence Representations via Gaussian Embedding

Shohei Yoda, Hayato Tsukagoshi, Ryohei Sasano, Koichi Takeda

TL;DR

This paper proposes GaussCSE, a Gaussian-distribution-based contrastive learning framework for sentence embedding that can handle asymmetric inter-sentential relations, together with a similarity measure for identifying entailment relations.

Abstract

Recent progress in sentence embedding, which represents the meaning of a sentence as a point in a vector space, has achieved high performance on tasks such as semantic textual similarity (STS). However, representing a sentence as a single point in a vector space can express only part of the diverse information that sentences carry; for example, it cannot capture asymmetric relationships between sentences. This paper proposes GaussCSE, a Gaussian-distribution-based contrastive learning framework for sentence embedding that can handle asymmetric relationships between sentences, along with a similarity measure for identifying inclusion relations. Our experiments show that GaussCSE matches the performance of previous methods on natural language inference tasks and can estimate the direction of entailment relations, which is difficult with point representations.
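
To illustrate why a Gaussian representation can express direction where a point cannot, here is a minimal sketch. It is not the paper's implementation: the abstract does not specify the similarity measure, so the sketch assumes KL divergence between diagonal-covariance Gaussians, mapped to a score via 1 / (1 + KL). The function names and the toy 2-dimensional embeddings are hypothetical.

```python
import math


def kl_diag_gaussians(mu1, var1, mu2, var2):
    """KL(N1 || N2) for two Gaussians with diagonal covariance.

    mu*: list of means, var*: list of (positive) variances.
    """
    total = 0.0
    for m1, v1, m2, v2 in zip(mu1, var1, mu2, var2):
        total += math.log(v2 / v1) + (v1 + (m1 - m2) ** 2) / v2 - 1.0
    return 0.5 * total


def asymmetric_similarity(mu1, var1, mu2, var2):
    """Assumed score in (0, 1]; depends on argument order because KL is asymmetric."""
    return 1.0 / (1.0 + kl_diag_gaussians(mu1, var1, mu2, var2))


# Hypothetical embeddings as (mean, variance) pairs:
# one broad distribution and one narrow distribution lying inside it.
broad = ([0.0, 0.0], [4.0, 4.0])
narrow = ([0.5, 0.0], [0.25, 0.25])

sim_narrow_to_broad = asymmetric_similarity(*narrow, *broad)
sim_broad_to_narrow = asymmetric_similarity(*broad, *narrow)

print(f"sim(narrow || broad) = {sim_narrow_to_broad:.3f}")
print(f"sim(broad || narrow) = {sim_broad_to_narrow:.3f}")
```

Under this assumed measure the two directions give different scores, so the ordering of a sentence pair carries information. A symmetric measure between two points, such as cosine similarity, returns the same value in both directions.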

Paper Structure (20 sections, 4 equations, 2 figures, 5 tables)

Figures (2)

  • Figure 1: Sentence representations in embedding spaces of a previous method (left) and GaussCSE (right).
  • Figure 2: Histograms of the logarithm of the length ratio between premise sentences and their corresponding hypothesis sentences in the SNLI, MNLI, and SICK datasets. The horizontal axis shows the logarithm of the length ratio; the vertical axis shows the number of sentence pairs.