Informed deep hierarchical classification: a non-standard analysis inspired approach

Lorenzo Fiaschi; Marco Cococcioni

TL;DR

The paper reframes hierarchical classification (HC) as a lexicographic multi-objective problem and embeds that structure in a deep network inspired by non-standard analysis (NSA), the LH-DNN. A non-standard loss, combined with projection-based non-interference between levels and the transfer principle, lets the network learn lexicographically: coarse-level accuracy takes priority while finer levels are refined. On CIFAR10, CIFAR100, and Fashion-MNIST, LH-DNNs match or exceed the accuracy of the B-CNN baseline and reach substantially higher hierarchy coherency, with far fewer parameters and shorter training times. For real-world HC tasks, this suggests faster convergence, better adherence to hierarchy constraints, and more efficient resource usage.
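To make the notion of hierarchy coherency concrete, here is a minimal sketch for a two-level hierarchy. It assumes coherency means the fraction of samples whose predicted fine class is a child of the predicted coarse class; the summary does not give the paper's exact definition, and the function name, the `PARENT_OF` table, and the toy predictions are all hypothetical.

```python
import numpy as np

def hierarchy_coherency(coarse_pred, fine_pred, parent_of):
    """Fraction of samples whose predicted fine class descends from
    the predicted coarse class (assumed definition, not the paper's)."""
    coarse_pred = np.asarray(coarse_pred)
    fine_pred = np.asarray(fine_pred)
    parent_of = np.asarray(parent_of)
    # Look up the parent of each predicted fine class and compare it
    # with the coarse class predicted for the same sample.
    return float(np.mean(parent_of[fine_pred] == coarse_pred))

# Hypothetical toy hierarchy: fine classes 0,1 -> coarse 0; 2,3 -> coarse 1.
PARENT_OF = [0, 0, 1, 1]
coarse = [0, 0, 1, 1]   # predicted coarse labels for four samples
fine = [0, 2, 3, 1]     # predicted fine labels for the same samples

score = hierarchy_coherency(coarse, fine, PARENT_OF)
print(score)
```

In this toy case the second and fourth samples pair a fine class with the wrong parent, so half of the predictions are coherent.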

Abstract

This work proposes a novel approach to the deep hierarchical classification task, i.e., the problem of classifying data according to multiple labels organized in a rigid parent-child structure. The approach consists of a multi-output deep neural network equipped with specific projection operators placed before each output layer. The design of this architecture, called the lexicographic hybrid deep neural network (LH-DNN), was made possible by combining tools from different and quite distant research fields: lexicographic multi-objective optimization, non-standard analysis, and deep learning. To assess the efficacy of the approach, the resulting network is compared against the B-CNN, a convolutional neural network tailored for hierarchical classification tasks, on the CIFAR10 and CIFAR100 benchmarks (where the B-CNN was originally and recently proposed before being adopted and tuned for multiple real-world applications) and on Fashion-MNIST. The evidence shows that an LH-DNN can achieve comparable if not superior performance, especially in learning the hierarchical relations, with a drastic reduction in learnable parameters, training epochs, and computational time, and without the need for ad-hoc weighting values in the loss function.
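The abstract mentions projection operators placed before each output layer but does not specify them here. As an illustration only, the sketch below uses a generic linear-algebra construction: an orthogonal projector onto the null space of the coarse-level output weights, so that whatever is fed to the fine-level head contributes nothing to the coarse-level logits. This is an assumed reading of "non-interference", not the paper's operator; the names `nullspace_projector`, `W_coarse`, and `W_fine` and all dimensions are hypothetical.

```python
import numpy as np

def nullspace_projector(W):
    """Orthogonal projector P = I - W^+ W onto the null space of W,
    so that W @ (P @ x) = 0 for every x."""
    W = np.asarray(W, dtype=float)
    return np.eye(W.shape[1]) - np.linalg.pinv(W) @ W

rng = np.random.default_rng(0)
h = rng.normal(size=8)              # shared features (hypothetical size)
W_coarse = rng.normal(size=(2, 8))  # coarse-level output weights
W_fine = rng.normal(size=(5, 8))    # fine-level output weights

P = nullspace_projector(W_coarse)
h_fine = P @ h                      # features passed to the fine head

coarse_logits = W_coarse @ h
fine_logits = W_fine @ h_fine

# The projected features are invisible to the coarse head.
leak = W_coarse @ h_fine
print(np.max(np.abs(leak)))
```

The design point this is meant to convey is that the fine-level branch only sees directions of the shared representation that the coarse-level output does not use, which is one way a lower-priority objective can be kept from disturbing a higher-priority one.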

Paper Structure (18 sections, 14 theorems, 68 equations, 18 figures, 4 tables)

Introduction
Non-standard analysis and the Alpha theory
Hierarchical Classification and Branch Neural Networks
Branching Networks
Lexicographic Hybrid DNNs: hierarchical classification from a non-standard perspective
Hierarchical classification as a lexicographic problem
From lexicographic to non-standard deep learning
Implementation
Interpretation and comparison with B-DNNs
Experiments
CIFAR10 and CIFAR100
Fashion-MNIST
Results on CIFAR10
Results on CIFAR100
Results on Fashion-MNIST
...and 3 more sections

Key Result

Theorem 1

$\mathbb{E}\supset\mathbb{R}$ is a field, $\alpha\in\mathbb{E}$.
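To see why a field extending $\mathbb{R}$ is useful for lexicographic learning, assume, as in the Alpha theory, that $\alpha$ is infinite, so that $\eta = 1/\alpha$ is a positive infinitesimal. A two-level loss $\ell_1 + \ell_2\eta$ with real $\ell_1, \ell_2$ is then ordered exactly like the pair $(\ell_1, \ell_2)$ under lexicographic order. The sketch below encodes that fact with a hypothetical `TwoLevelLoss` class; it is an illustration, not the paper's implementation.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class TwoLevelLoss:
    """Represents coarse + fine * eta, with eta a positive infinitesimal."""
    coarse: float  # real part (coefficient of eta^0), highest priority
    fine: float    # coefficient of eta^1, lower priority

    def __add__(self, other):
        # Addition in the extended field acts coefficient-wise.
        return TwoLevelLoss(self.coarse + other.coarse,
                            self.fine + other.fine)

    def __lt__(self, other):
        # Since eta is smaller than every positive real, the order is
        # lexicographic: tuple comparison in Python does exactly this.
        return (self.coarse, self.fine) < (other.coarse, other.fine)

a = TwoLevelLoss(0.1, 9.0)   # slightly better coarse loss, poor fine loss
b = TwoLevelLoss(0.2, 0.0)   # worse coarse loss, perfect fine loss
c = TwoLevelLoss(0.1, 0.5)   # ties with a on coarse, better on fine

best = min([a, b, c])
print(best)
```

Here `c` is the minimum: any improvement in the coarse term outweighs the fine term, and the fine term only breaks ties.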

Figures (18)

Figure 1: Example of a B-DNN for a three-level HC problem.
Figure 2: Representation of an LH-DNN where the per-level non-standard operations are explicitly represented. Notice that each per-level label error prediction is non-standard.
Figure 3: Representation of an LH-DNN with only one hidden layer and two objectives, i.e., two lexicographic labels for each data point. The symbol $\rho$ indicates a generic non-linearity. Such a network can be used to solve the problem in \ref{eq:2obj_shallow}.
Figure 4: Representation of an LH-DNN for two-objective lexicographic learning with only one shared layer.
Figure 5: Representation of an LH-DNN for multi-objective lexicographic learning with only one shared layer.
...and 13 more figures

Theorems & Definitions (26)

Theorem
Definition : Infinite number
Definition : Finite number
Definition : Infinitesimal number
Theorem 1
Theorem
Definition : Monosemium
Corollary
Theorem 2: Transfer principle
Definition : Lexicographic optimization problem
...and 16 more
