Textual Entailment Recognition with Semantic Features from Empirical Text Representation

Md Shajalal; Md Atabuzzaman; Maksuda Bilkis Baby; Md Rezaul Karim; Alexander Boden

TL;DR

This work addresses textual entailment recognition by introducing a threshold-based empirical semantic representation that filters word embedding components to form high-dimensional semantic vectors $v_T$ and $v_H$ for the text and the hypothesis. From these vectors it computes an element-wise distance vector, $EMDV$, for each text-hypothesis pair. It combines $EMDV$ with its scalar average $Avg(EMDV)$ and handcrafted lexical and semantic features (JAC, BoW, STS) to train multiple ML classifiers and a majority-voting ensemble. On the SICK-RTE dataset, the approach gains notably from the full feature set and the thresholded representations, outperforming several baselines while remaining competitive with strong deep-learning baselines. The results demonstrate the value of integrating threshold-based semantic representations with traditional features, and the work points to future integration with deep learning to further leverage the $EMDV$ framework.
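The distance features described above can be illustrated with a minimal sketch. The summary does not state the exact thresholding rule, so the rule used here (per dimension, average only the word-embedding components whose absolute value is at least a threshold `alpha`) is an assumption, as are the toy embeddings and the function names `sentence_vector`, `emdv`, and `avg_emdv`. Only the element-wise absolute difference and its average follow directly from the description.

```python
from typing import Dict, List

# Toy 4-dimensional embeddings (hypothetical values, for illustration only).
TOY_EMBEDDINGS: Dict[str, List[float]] = {
    "a": [0.1, 0.0, 0.2, 0.1],
    "dog": [0.9, 0.1, 0.4, 0.0],
    "cat": [0.8, 0.2, 0.5, 0.1],
    "runs": [0.1, 0.9, 0.0, 0.3],
    "sleeps": [0.0, 0.1, 0.9, 0.2],
}


def sentence_vector(tokens: List[str],
                    embeddings: Dict[str, List[float]],
                    alpha: float) -> List[float]:
    """Build a sentence vector from thresholded word-embedding components.

    ASSUMED rule (not taken from the paper): for each dimension, average
    only the components with |x| >= alpha; use 0.0 if none survive.
    """
    dim = len(next(iter(embeddings.values())))
    sums = [0.0] * dim
    counts = [0] * dim
    for token in tokens:
        vec = embeddings.get(token)
        if vec is None:  # skip out-of-vocabulary tokens
            continue
        for i, x in enumerate(vec):
            if abs(x) >= alpha:
                sums[i] += x
                counts[i] += 1
    return [s / c if c else 0.0 for s, c in zip(sums, counts)]


def emdv(v_t: List[float], v_h: List[float]) -> List[float]:
    """Element-wise Manhattan distance vector: |v_T[i] - v_H[i]| per dimension."""
    return [abs(a - b) for a, b in zip(v_t, v_h)]


def avg_emdv(distance_vector: List[float]) -> float:
    """Scalar feature: the mean of the EMDV components."""
    return sum(distance_vector) / len(distance_vector)


v_T = sentence_vector(["a", "dog", "runs"], TOY_EMBEDDINGS, alpha=0.3)
v_H = sentence_vector(["a", "cat", "sleeps"], TOY_EMBEDDINGS, alpha=0.3)
pair_features = emdv(v_T, v_H) + [avg_emdv(emdv(v_T, v_H))]
```

Here `pair_features` concatenates the distance vector with its average, which is one plausible way to hand both features to a classifier.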

Abstract

Textual entailment recognition is one of the basic natural language understanding (NLU) tasks. Understanding the meaning of sentences is a prerequisite for applying any natural language processing (NLP) technique to recognize textual entailment automatically. A text entails a hypothesis if and only if the truth of the hypothesis follows from the text. Classical approaches generally use the feature value of each word from a word embedding to represent the sentences. In this paper, we propose a novel approach to identifying the textual entailment relationship between text and hypothesis, introducing a new semantic feature based on an empirical threshold-based semantic text representation. We employ an element-wise Manhattan distance vector-based feature that can identify the semantic entailment relationship between the text-hypothesis pair. We carried out several experiments on a benchmark entailment classification dataset (SICK-RTE). We train several machine learning (ML) algorithms on both semantic and lexical features to classify each text-hypothesis pair as entailment, neutral, or contradiction. Our empirical sentence representation technique enriches the semantic information of the texts and hypotheses and is found to be more efficient than the classical ones. In the end, our approach significantly outperforms known methods in understanding the meaning of the sentences for the textual entailment classification task.
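To make the lexical side of the pipeline concrete, the sketch below computes a Jaccard similarity score (the JAC feature named in the TL;DR) and combines classifier outputs by majority voting over the three labels. The whitespace tokenization, the lowercasing, the tie-breaking behavior, and the function names `jaccard` and `majority_vote` are assumptions for illustration, not details confirmed by the source.

```python
from collections import Counter
from typing import List


def jaccard(text: str, hypothesis: str) -> float:
    """Jaccard similarity between the word sets of two sentences.

    ASSUMPTION: tokens are lowercased and split on whitespace.
    """
    t = set(text.lower().split())
    h = set(hypothesis.lower().split())
    if not t and not h:  # two empty sentences: treat as identical
        return 1.0
    return len(t & h) / len(t | h)


def majority_vote(predictions: List[str]) -> str:
    """Return the most frequent label among the classifiers' predictions.

    Ties go to the label that appears first in the list, which is how
    Counter.most_common orders elements with equal counts.
    """
    return Counter(predictions).most_common(1)[0][0]


score = jaccard("A dog runs", "a dog sleeps")
label = majority_vote(["ENTAILMENT", "NEUTRAL", "ENTAILMENT"])
```

In this toy case the two sentences share two of four distinct words, and two of the three hypothetical classifiers agree on the label.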


Paper Structure (16 sections, 2 equations, 2 figures, 9 tables)


Introduction
Related Work
Proposed Approach
Empirical text representation
Feature extraction of text-hypothesis Pair
Element-wise Manhattan distance vector (EMDV)
Average of EMDV
Jaccard similarity score (JAC)
Bag-of-Words based similarity (BoW)
BERT-based semantic similarity score (STS)
Experiments Results
Dataset
Experimental settings
Performance analysis of entailment recognition
Comparative analysis
...and 1 more section

Figures (2)

Figure 1: Overview diagram for recognising textual entailment
Figure 2: Performance comparison of our proposed method in terms of Accuracy on the SICK-RTE dataset.
