Informed deep hierarchical classification: a non-standard analysis inspired approach

Lorenzo Fiaschi, Marco Cococcioni

TL;DR

The paper reframes hierarchical classification (HC) as a lexicographic multi-objective problem and embeds this structure in a deep network inspired by non-standard analysis (NSA), the lexicographic hybrid deep neural network (LH-DNN). By introducing projection-based non-interference via a non-standard loss and leveraging the transfer principle, LH-DNNs learn lexicographically: coarse-level accuracy takes priority while finer levels are refined. Empirical results on CIFAR10, CIFAR100, and Fashion-MNIST show that LH-DNNs achieve comparable or superior accuracy and substantially higher hierarchy coherency than the B-CNN baseline, with far fewer parameters and shorter training times. For real-world HC tasks, the approach promises faster convergence, better adherence to hierarchy constraints, and more efficient resource usage.

Abstract

This work proposes a novel approach to the deep hierarchical classification task, i.e., the problem of classifying data according to multiple labels organized in a rigid parent-child structure. The approach consists of a multi-output deep neural network equipped with specific projection operators placed before each output layer. The design of this architecture, called lexicographic hybrid deep neural network (LH-DNN), was made possible by combining tools from different and quite distant research fields: lexicographic multi-objective optimization, non-standard analysis, and deep learning. To assess the efficacy of the approach, the resulting network is compared against the B-CNN, a convolutional neural network tailored for hierarchical classification tasks, on the CIFAR10, CIFAR100 (on which the B-CNN was originally proposed before being adopted and tuned for multiple real-world applications) and Fashion-MNIST benchmarks. The evidence shows that an LH-DNN can achieve comparable if not superior performance, especially in learning the hierarchical relations, despite a drastic reduction in learning parameters, training epochs, and computational time, and without the need for ad-hoc weighting values in the loss function.
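
The abstract mentions projection operators placed before each output layer but does not spell out their form. The sketch below shows one plausible reading, not the authors' implementation: the features fed to the fine-level head are first projected onto the null space of the coarse-level output weights, so that whatever the fine head uses cannot change the coarse logits. The layer sizes, the names `W_coarse`, `W_fine`, `h`, and the helper `null_space_projector` are all assumptions made for illustration.

```python
import numpy as np


def null_space_projector(W: np.ndarray) -> np.ndarray:
    """Return P = I - pinv(W) @ W, the orthogonal projector onto null(W)."""
    d = W.shape[1]
    return np.eye(d) - np.linalg.pinv(W) @ W


rng = np.random.default_rng(0)
d, k_coarse, k_fine = 16, 2, 10              # hypothetical layer sizes
W_coarse = rng.normal(size=(k_coarse, d))    # coarse-level output weights (assumed)
W_fine = rng.normal(size=(k_fine, d))        # fine-level output weights (assumed)
h = rng.normal(size=d)                       # shared feature vector (assumed)

P = null_space_projector(W_coarse)
h_fine = P @ h                               # features seen by the fine-level head

coarse_logits = W_coarse @ h                 # coarse head uses the full features
fine_logits = W_fine @ h_fine                # fine head uses the projected features

# The projected features are invisible to the coarse head.
leak = W_coarse @ h_fine

# Any gradient g flowing back from the fine head reaches h as P.T @ g,
# which also lies in null(W_coarse): to first order it leaves the coarse logits alone.
g = rng.normal(size=d)
grad_effect_on_coarse = W_coarse @ (P.T @ g)
```

Under these assumptions both `leak` and `grad_effect_on_coarse` vanish up to floating-point error, which is the non-interference property in miniature: updates driven by the fine level move the shared features only along directions the coarse head ignores.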

Paper Structure (18 sections, 14 theorems, 68 equations, 18 figures, 4 tables)

Key Result

Theorem 1

$\mathbb{E}\supset\mathbb{R}$ is a field, $\alpha\in\mathbb{E}$.

Figures (18)

  • Figure 1: Example of a B-DNN for a three-level HC problem.
  • Figure 2: Representation of an LH-DNN where the per-level non-standard operations are explicitly represented. Notice that each per-level label error prediction is non-standard.
  • Figure 3: Representation of an LH-DNN with only one hidden layer and two objectives, i.e., two lexicographic labels for each data point. The symbol $\rho$ indicates a generic non-linearity. Such a network can be used to solve the problem in \ref{eq:2obj_shallow}.
  • Figure 4: Representation of an LH-DNN for two-objective lexicographic learning with only one shared layer.
  • Figure 5: Representation of an LH-DNN for multi-objective lexicographic learning with only one shared layer.
  • ...and 13 more figures

Theorems & Definitions (26)

  • Theorem
  • Definition: Infinite number
  • Definition: Finite number
  • Definition: Infinitesimal number
  • Theorem 1
  • Theorem
  • Definition: Monosemium
  • Corollary
  • Theorem 2: Transfer principle
  • Definition: Lexicographic optimization problem
  • ...and 16 more
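
The definitions listed above (infinitesimal number, lexicographic optimization problem) can be connected with a small sketch. It assumes, as an illustration rather than as the paper's formulation, that a two-level loss is written as $\ell_1 + \ell_2\,\eta$ with $\eta$ a positive infinitesimal (the symbol $\eta$ and the helper names are chosen here). Comparing such numbers reduces to comparing their coefficient tuples lexicographically, which Python tuples already do.

```python
from typing import Sequence, Tuple


def lex_loss(per_level_losses: Sequence[float]) -> Tuple[float, ...]:
    """Coefficients (l1, l2, ...) of l1 + l2*eta + l3*eta**2 + ..., eta infinitesimal.

    Hypothetical helper: tuples compare lexicographically, which matches the
    ordering of such non-standard numbers when eta is a positive infinitesimal.
    """
    return tuple(float(l) for l in per_level_losses)


def weighted_loss(per_level_losses: Sequence[float], weights: Sequence[float]) -> float:
    """Ordinary real-valued scalarization, shown for contrast."""
    return sum(w * l for w, l in zip(weights, per_level_losses))


# Model A: slightly better coarse loss, much worse fine loss.
# Model B: slightly worse coarse loss, much better fine loss.
a = [0.30, 5.00]
b = [0.31, 0.10]

# Lexicographic view: the coarse level decides, the fine level only breaks ties.
lex_prefers_a = lex_loss(a) < lex_loss(b)

# Real weights, however small, let a large fine-level gap override the coarse level.
weights = (1.0, 0.01)
real_prefers_a = weighted_loss(a, weights) < weighted_loss(b, weights)

# With equal coarse losses the fine level becomes decisive.
tie_broken_by_fine = lex_loss([0.30, 0.20]) < lex_loss([0.30, 0.40])
```

Here `lex_prefers_a` is `True` while `real_prefers_a` is `False`: no finite weight reproduces strict priority for every pair of losses, which is the motivation for an infinitesimal weight instead of hand-tuned loss-weighting values.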