Table of Contents
Fetching ...

An Automatic Text Classification Method Based on Hierarchical Taxonomies, Neural Networks and Document Embedding: The NETHIC Tool

Luigi Lomasto, Rosario Di Florio, Andrea Ciapetti, Giuseppe Miscione, Giulia Ruggiero, Daniele Toti

TL;DR

NETHIC succeeds in bringing about a mechanism for text classification that proves to be significantly effective as well as efficient, and is further refined and extended by adding a document embedding mechanism, which has shown improvements in terms of performance on the individual networks and on the whole hierarchical model.

Abstract

This work describes an automatic text classification method implemented in a software tool called NETHIC, which takes advantage of the inner capabilities of highly-scalable neural networks combined with the expressiveness of hierarchical taxonomies. As such, NETHIC succeeds in bringing about a mechanism for text classification that proves to be significantly effective as well as efficient. The tool had undergone an experimentation process against both a generic and a domain-specific corpus, outputting promising results. On the basis of this experimentation, NETHIC has been now further refined and extended by adding a document embedding mechanism, which has shown improvements in terms of performance on the individual networks and on the whole hierarchical model.

An Automatic Text Classification Method Based on Hierarchical Taxonomies, Neural Networks and Document Embedding: The NETHIC Tool

TL;DR

NETHIC succeeds in bringing about a mechanism for text classification that proves to be significantly effective as well as efficient, and is further refined and extended by adding a document embedding mechanism, which has shown improvements in terms of performance on the individual networks and on the whole hierarchical model.

Abstract

This work describes an automatic text classification method implemented in a software tool called NETHIC, which takes advantage of the inner capabilities of highly-scalable neural networks combined with the expressiveness of hierarchical taxonomies. As such, NETHIC succeeds in bringing about a mechanism for text classification that proves to be significantly effective as well as efficient. The tool had undergone an experimentation process against both a generic and a domain-specific corpus, outputting promising results. On the basis of this experimentation, NETHIC has been now further refined and extended by adding a document embedding mechanism, which has shown improvements in terms of performance on the individual networks and on the whole hierarchical model.
Paper Structure (23 sections, 1 equation, 9 figures, 1 algorithm)

This paper contains 23 sections, 1 equation, 9 figures, 1 algorithm.

Figures (9)

  • Figure 1: General-purpose hierarchical taxonomy (upper part).
  • Figure 2: General-purpose hierarchical taxonomy (lower part).
  • Figure 3: NETHIC's Architecture
  • Figure 4: Single ANN's scores with BOW Dataset
  • Figure 5: Single ANN's scores with Doc2Vec Dataset
  • ...and 4 more figures