Automated Machine Learning for Multi-Label Classification

Marcel Wever

TL;DR

This work addresses the challenge of automating multi-label classification (MLC) within AutoML, focusing on the enormous, hierarchical search space that emerges when configuring MLC methods. It introduces ML-Plan, an AutoML framework based on hierarchical task network (HTN) planning. ML-Plan is first developed for short pipelines and then extended to pipelines of unlimited length and to MLC, enabling scalable, structured exploration of complex configurations. The thesis further extends the approach with LiBRe for label-wise base-learner selection, ensembles of nested dichotomies, and a runtime predictor to improve efficiency, complemented by empirical evaluations against state-of-the-art AutoML approaches. The discussion covers practical implications for On-The-Fly Computing and on-demand ML services, highlighting both methodological advances and open questions on search-space management, pruning, and runtime-aware optimization in real-world deployments.

Abstract

Automated machine learning (AutoML) aims to select and configure machine learning algorithms and combine them into machine learning pipelines tailored to a dataset at hand. For supervised learning tasks, most notably binary and multinomial classification, also known as single-label classification (SLC), such AutoML approaches have shown promising results. However, the task of multi-label classification (MLC), where data points are associated with a set of class labels instead of a single class label, has received much less attention so far. In the context of multi-label classification, the data-specific selection and configuration of multi-label classifiers is challenging even for experts in the field, as it is a high-dimensional optimization problem with multi-level hierarchical dependencies. While the space of machine learning pipelines is already huge for SLC, the MLC search space exceeds it by several orders of magnitude. In the first part of this thesis, we devise a novel AutoML approach for single-label classification tasks that optimizes pipelines consisting of at most two machine learning algorithms. This approach is then extended, first to optimize pipelines of unlimited length and eventually to configure the complex hierarchical structures of multi-label classification methods. Furthermore, we investigate how well AutoML approaches that form the state of the art for single-label classification tasks scale with the increased problem complexity of AutoML for multi-label classification. In the second part, we explore how methods for SLC and MLC could be configured more flexibly to achieve better generalization performance and how to increase the efficiency of execution-based AutoML systems.

Paper Structure (29 sections, 21 equations, 14 figures)

Introduction
Thesis Structure
Running Example
Preliminaries
Introduction to Automated Machine Learning
Machine Learning Pipelines
General Structure of AutoML Systems
Reduction to Hyper-Parameter Optimization
Grammar-Based Search
Meta-Learning
Neural Architecture Search
Introduction to Multi-Label Classification
Problem Definition
Single-Label Classification
Loss Functions
...and 14 more sections

Figures (14)

Figure 1: Each of the landscape pictures is associated with class labels BEACH, FOREST, MOUNTAIN, and SEA. While the first four pictures can be related to one label exclusively, more than one class label is relevant for the last four pictures. The corresponding sets of labels are detailed in the captions.
Figure 2: Visualization of different machine learning pipeline topologies. On the left-hand side, a sequential pipeline is shown. The center of the figure presents a tree-shaped pipeline topology, and on the right-hand side, the pipeline structure represents a directed acyclic graph.
Figure 3: Generic illustration of the AutoML framework. Receiving a task as an input containing a training data set $\mathcal{D}$ and a target loss function $\mathcal{L}$, the AutoML system aims to identify a machine learning pipeline that generalizes well beyond the provided training data. To this end, AutoML systems usually comprise three major components: a search space representation, an optimization algorithm operating on this search space representation, and a candidate evaluation module to assess the solution quality of candidates. Typically, the candidate evaluation uses the provided dataset and the target loss to estimate a candidate's generalization performance.
Figure 4: Illustration of an AutoML system employing Bayesian optimization. Solution candidates are represented in terms of a hyper-parameter vector. The surrogate model $\widehat{f}$ models the actual evaluation function $f$ to be optimized. Furthermore, $\widehat{f}$ is used by the acquisition function to decide which hyper-parameter configuration $\lambda$ to sample next. Prior to evaluation, the chosen $\lambda$ is translated into a machine learning pipeline. Then, $f(\lambda)$ augments the set of observations of $f$, which in turn updates the surrogate model $\widehat{f}$.
Figure 5: Illustration of an AutoML system, employing a successive halving algorithm for optimization with a budgeted candidate evaluation function $f_b(\cdot)$. After picking an initial set of configurations and initial budgets, the algorithm iteratively evaluates configurations for the current budget $b$, discards the worse half of configurations, and re-evaluates the remaining for an increased budget. This process is run multiple times, varying the initial budget and the size of the initial set of configurations.
...and 9 more figures
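
The successive halving loop described in the caption of Figure 5 can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the implementation used in the thesis: the function name `successive_halving`, the parameters `min_budget` and `budget_factor`, and the toy loss `toy_loss` are all hypothetical. Only a single run is shown; the outer repetition over different initial budgets and initial set sizes mentioned in the caption is omitted.

```python
def successive_halving(configs, evaluate, min_budget=1, budget_factor=2):
    """Single run of successive halving (illustrative sketch).

    configs:       initial candidate configurations (hypothetical stand-ins
                   for pipeline configurations)
    evaluate:      budgeted loss f_b(config), lower is better
    min_budget:    budget used in the first round (assumed parameter)
    budget_factor: multiplier applied to the budget per round (assumed)
    """
    survivors = list(configs)
    budget = min_budget
    history = []  # (budget, number of candidates evaluated) per round

    while len(survivors) > 1:
        # Evaluate every remaining candidate with the current budget.
        ranked = sorted(survivors, key=lambda c: evaluate(c, budget))
        history.append((budget, len(survivors)))
        # Discard the worse half, keep at least one candidate.
        survivors = ranked[: max(1, len(ranked) // 2)]
        # Re-evaluate the remaining candidates with an increased budget.
        budget *= budget_factor

    return survivors[0], history


def toy_loss(config, budget):
    # Hypothetical budgeted loss: distance to an optimum at 0.3 plus a
    # term that shrinks as the budget grows.
    return (config - 0.3) ** 2 + 1.0 / budget


candidates = [i / 10 for i in range(8)]  # 0.0, 0.1, ..., 0.7
best, history = successive_halving(candidates, toy_loss)
```

With eight initial candidates, the sketch runs three rounds, halving the candidate set and doubling the budget each time. In an AutoML setting, `evaluate` would train and validate a pipeline under the given budget (for example, a data subsample size or a time limit), which is where nearly all of the cost lies.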
