Inclusion of Role into Named Entity Recognition and Ranking

Neelesh Kumar Shukla, Sanasam Ranbir Singh

TL;DR

This work tackles Entity Role Detection, where entities are assigned domain-specific roles within a context, by formulating it as both a Named Entity Recognition task and an Entity Ranking task. It introduces corpus-only, domain-agnostic methods to learn representations for entities and roles, leveraging Word2Vec-based embeddings, sentence- and document-level contexts, and relation phrases. The approach combines NER with sequence tagging (HMM, CRF, BLSTM) and ranking-based retrieval using vector- and phrase-based representations (including E-V-C-Nd, E-W-Nd, Doc2Vec, TV, and TV-SW). Experimental results on an in-house news dataset show that BLSTM yields the best NER performance among the tested sequence models, while centroid-based entity representations and relation phrases provide the strongest signals for ranking roles. The work demonstrates the potential for role-aware retrieval and context-sensitive event understanding, with future avenues including learning-to-rank and richer semantic modeling across mentions and roles, possibly leveraging external knowledge sources when available.

Abstract

Most Natural Language Processing systems perform entity-based processing for tasks such as Information Extraction, Question Answering, and Text Summarization. A new challenge arises when entities play roles according to their actions or attributes in a given context. Entity Role Detection is the task of assigning such roles to entities. Real-world entities are usually of types such as person, location, and organization, and roles can be considered domain-dependent subtypes of these types. When a subset of entities must be retrieved based on their roles, the problem becomes one of defining the roles and identifying the entities that have them. This paper presents a study of solving the Entity Role Detection problem by modeling it as a Named Entity Recognition (NER) task and as an Entity Retrieval/Ranking task. In NER, the roles can be treated as mutually exclusive classes, and standard NER methods such as sequence tagging can be used. For Entity Retrieval, roles can be formulated as queries and entities as the collection on which the queries are executed. The Entity Retrieval task differs from document retrieval in that both the entities and the roles against which they are retrieved are described only indirectly. We formulate automated ways of learning representative words and phrases and of building representations of roles and entities from them. We also explore different contexts, such as the sentence and the document. Because roles depend on context, large domain-specific datasets or knowledge bases are not always available for learning, so we try to exploit the information in a small dataset in a domain-agnostic way.
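The retrieval formulation can be illustrated with a minimal sketch: a role is a query built from representative words, each entity is represented by the words in its context, and entities are ranked by similarity to the role. The code below is an assumption-laden illustration, not the paper's implementation. The toy vectors, the word lists, the entity names, and the helper names (`centroid`, `cosine`, `rank_entities`) are all hypothetical, and averaging word vectors with cosine similarity is only one plausible reading of a centroid-based representation.

```python
import numpy as np

# Toy 3-d word vectors standing in for embeddings learned from a corpus
# (hypothetical values; a real system would train them, e.g. with Word2Vec).
TOY_VECTORS = {
    "arrested": np.array([0.9, 0.1, 0.0]),
    "accused":  np.array([0.8, 0.2, 0.1]),
    "injured":  np.array([0.1, 0.9, 0.0]),
    "hospital": np.array([0.0, 0.8, 0.2]),
    "said":     np.array([0.3, 0.3, 0.3]),
}


def centroid(words, vectors):
    """Average the vectors of the known words; return None if none are known."""
    known = [vectors[w] for w in words if w in vectors]
    if not known:
        return None
    return np.mean(known, axis=0)


def cosine(a, b):
    """Cosine similarity, defined as 0.0 when either vector has zero norm."""
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float(a @ b / denom) if denom else 0.0


def rank_entities(role_words, entity_contexts, vectors):
    """Rank entities (the collection) against a role (the query).

    role_words: representative words describing the role.
    entity_contexts: mapping from entity name to its context words.
    Returns (entity, score) pairs sorted by descending similarity.
    """
    role_vec = centroid(role_words, vectors)
    if role_vec is None:
        return []
    scored = []
    for entity, context in entity_contexts.items():
        entity_vec = centroid(context, vectors)
        if entity_vec is not None:
            scored.append((entity, cosine(role_vec, entity_vec)))
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical entities with sentence-level context words.
entity_contexts = {
    "Entity A": ["arrested", "said"],
    "Entity B": ["injured", "hospital"],
}
ranking = rank_entities(["accused", "arrested"], entity_contexts, TOY_VECTORS)
for entity, score in ranking:
    print(f"{entity}: {score:.3f}")
```

With these toy values, the entity whose context is closest to the role words is ranked first. Swapping sentence-level context words for document-level ones changes only the contents of `entity_contexts`, which is why the two context choices can be compared under the same ranking function.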

Paper Structure

This paper contains 24 sections, 1 equation, 7 figures, 4 tables.

Figures (7)

  • Figure 1: Variation of Mean Average Precision w.r.t the Ranking Position K
  • Figure 2: Performance of Word vs Phrases Based Representation using Sentence-level Context
  • Figure 3: Frequency Distribution of Entities having Multiple Mentions
  • Figure 4: Statistics for Majority Assumption: Percentage of Mentions having Majority Role
  • Figure 5: Statistics for Positional Assumption
  • ...and 2 more figures