Classifier Chain Networks for Multi-Label Classification

Daniel J. W. Touw, Michel van de Velden

TL;DR

This study introduces a generalization of the classifier chain: the classifier chain network, which enables joint estimation of model parameters and accounts for the influence of earlier label predictions on subsequent classifiers in the chain.

Abstract

The classifier chain is a widely used method for analyzing multi-labeled data sets. In this study, we introduce a generalization of the classifier chain: the classifier chain network. The classifier chain network enables joint estimation of model parameters and accounts for the influence of earlier label predictions on subsequent classifiers in the chain. Through simulations, we evaluate the classifier chain network's performance against multiple benchmark methods, demonstrating competitive results even in scenarios that deviate from its modeling assumptions. Furthermore, we propose a new measure for detecting conditional dependencies between labels and illustrate the classifier chain network's effectiveness using an empirical data set.

Paper Structure

This paper contains 23 sections, 25 equations, 7 figures, 1 table, 1 algorithm.

Figures (7)

  • Figure 1: A visual representation of a classifier chain network with three labels. The blue node provides the input vector $\mathbf{x}_i$ that undergoes an affine transformation on each of its outgoing edges. At the purple and red nodes, the values of the incoming edges are summed and used in an activation function to produce a prediction $p_{i\ell}$ for the corresponding labels. The purple nodes fulfil a dual role, as they compute part of the output of the network and provide (a transformation of) this output to subsequent nodes.
  • Figure 2: The label probabilities versus the value of $x_1$ under the data generating process containing strong label interdependencies (left), defined by the parameters in \ref{eq:dgp_strong}; and weak interdependencies (right), defined by the parameters in \ref{eq:dgp_weak}.
  • Figure 3: Boxplots of the difference in the performance between the classifier chain network and the benchmark methods discussed in Section \ref{subsec:sim_methods}. The differences are computed such that a positive value indicates improved performance of the classifier chain network over the corresponding method. The two simulation designs are the strong (top) and weak (bottom) label interdependencies.
  • Figure 4: The results for the simulation designs with reversed label order, sequential label realization, and increased label count. See Figure \ref{fig:sim_res_a} for explanatory notes.
  • Figure 5: The results for the negative log-likelihood for all simulation designs. See Figure \ref{fig:sim_res_a} for explanatory notes.
  • ...and 2 more figures
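
The structure described in the Figure 1 caption can be illustrated with a small forward pass. This is only a sketch under stated assumptions, not the authors' implementation: the caption says "an activation function" and "(a transformation of) this output", so the logistic sigmoid, the linear weighting of earlier predictions, and all names used here (`chain_network_forward`, `weights`, `biases`, `chain_weights`) are hypothetical choices made for illustration.

```python
import math


def sigmoid(z):
    """Logistic activation (an assumed choice; the caption does not name one)."""
    return 1.0 / (1.0 + math.exp(-z))


def chain_network_forward(x, weights, biases, chain_weights):
    """Forward pass of a hypothetical classifier chain network.

    x             : list of input features (the input node).
    weights[l]    : coefficients of the affine map on the edge from x to label l.
    biases[l]     : intercept of that affine map.
    chain_weights : dict mapping (k, l), with k < l, to the weight on the edge
                    from label k's prediction to label l (assumed to be linear).
    Returns the list of predicted probabilities, one per label.
    """
    probs = []
    for l in range(len(weights)):
        # Affine transformation of the input on the edge into node l.
        z = biases[l] + sum(w * xi for w, xi in zip(weights[l], x))
        # Sum in the values passed on by earlier label nodes in the chain.
        for k in range(l):
            z += chain_weights.get((k, l), 0.0) * probs[k]
        # The activation turns the summed value into a probability.
        probs.append(sigmoid(z))
    return probs


# Illustrative values only; none of these numbers come from the paper.
x = [1.0, -0.5]
weights = [[0.8, 0.2], [-0.4, 0.6], [0.1, -0.3]]
biases = [0.0, 0.1, -0.2]
chain_weights = {(0, 1): 1.5, (0, 2): -0.7, (1, 2): 0.9}

p = chain_network_forward(x, weights, biases, chain_weights)
p_no_chain = chain_network_forward(x, weights, biases, {})
```

Comparing `p` with `p_no_chain` shows the role of the chain edges: the first label's probability is identical in both, while later labels shift once earlier predictions feed into them. With all chain weights set to zero, the sketch reduces to independent per-label logistic models.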