Parallel Algorithm for Optimal Threshold Labeling of Ordinal Regression Methods

Ryoya Yamasaki; Toshiyuki Tanaka

TL;DR

This study proposes a parallelizable algorithm for finding the optimal threshold labeling, a labeling introduced in previous research, and derives sufficient conditions under which the algorithm successfully outputs it.

Abstract

Ordinal regression (OR) is the classification of ordinal data, in which the underlying categorical target variable has a natural ordinal relation with respect to the underlying explanatory variable. For $K$-class OR tasks, threshold methods learn a one-dimensional transformation (1DT) of the explanatory variable so that the 1DT values for observations of the explanatory variable preserve, as well as possible, the order of the label values $1,\ldots,K$ of the corresponding observations of the target variable. They then assign a label prediction to each learned 1DT value through threshold labeling, namely, according to the rank of the interval to which the 1DT value belongs among the intervals on the real line separated by $(K-1)$ threshold parameters. In this study, we propose a parallelizable algorithm for finding the optimal threshold labeling, a labeling developed in previous research, and derive sufficient conditions under which the algorithm successfully outputs the optimal threshold labeling. In a numerical experiment, using the proposed algorithm with parallel processing reduced the computation time for the whole learning process of a threshold method with the optimal threshold labeling to approximately 60\,\% of that required with an existing algorithm based on dynamic programming.
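The threshold labeling step described in the abstract can be illustrated with a minimal sketch. The function name `threshold_label`, the example thresholds, and the tie convention (a 1DT value equal to a threshold falls into the upper interval) are assumptions made for illustration, not details taken from the paper.

```python
from bisect import bisect_right


def threshold_label(a_value, thresholds):
    """Return a label in 1..K for one 1DT value.

    `thresholds` holds the K-1 threshold parameters in ascending order.
    Assumed tie rule: a value equal to a threshold goes to the upper interval.
    """
    # The number of thresholds <= a_value, plus one, is the rank of the
    # interval on the real line that contains a_value.
    return bisect_right(thresholds, a_value) + 1


# Hypothetical example: K - 1 = 3 thresholds, so K = 4 classes.
thresholds = [-1.0, 0.5, 2.0]
labels = [threshold_label(a, thresholds) for a in [-3.0, 0.0, 0.5, 1.9, 7.0]]
print(labels)  # [1, 2, 3, 3, 4]
```

Because the thresholds are sorted, a binary search finds the interval rank in $O(\log K)$ time per observation.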

Paper Structure (10 sections, 2 theorems, 9 equations, 1 figure, 1 table, 4 algorithms)

Introduction
Preparation
Formulation of Ordinal Regression Task
Formulation of Threshold Method
Optimal Threshold Labeling
Existing Dynamic-Programming-based (DP) Algorithm
Proposed Independent Optimization (IO) Algorithm
Experiment
Conclusion
Further Parallelization of IO Algorithm

Key Result

Theorem 1

For any data $\{({\bm{x}}_i,y_i)\}_{i\in[n]}$, learned 1DT $\hat{a}$, and task loss $\ell$, the threshold parameter vector ${\bm{t}}=\hat{{\bm{t}}}$ obtained by Algorithms \ref{alg:Preparation} and \ref{alg:IO-algorithm} minimizes the empirical task risk $\frac{1}{n}\sum_{i=1}^n \ell(h_{\rm thr}(\hat{a}({\bm{x}
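The objective named in Theorem 1, the empirical task risk of a threshold labeling, is the average task loss between thresholded predictions and observed labels. The sketch below only evaluates that objective for a given threshold vector; it is not the paper's DP or IO algorithm. The function names, the absolute-error task loss, the toy data, and the tie convention are all assumptions made for illustration.

```python
from bisect import bisect_right


def empirical_task_risk(a_values, y, thresholds, loss):
    """Average task loss of threshold-labeled predictions.

    a_values: learned 1DT values, one per observation (hypothetical data).
    y: observed labels in 1..K.
    thresholds: K-1 threshold parameters in ascending order.
    loss: task loss taking (predicted_label, true_label).
    """
    # Assumed tie rule: a value equal to a threshold goes to the upper interval.
    preds = [bisect_right(thresholds, a) + 1 for a in a_values]
    return sum(loss(p, yi) for p, yi in zip(preds, y)) / len(y)


def absolute_loss(pred, target):
    # One possible task loss; the theorem is stated for a general task loss.
    return abs(pred - target)


# Toy data with K = 3 classes.
a_values = [-2.0, -0.3, 0.1, 0.9, 1.4, 3.0]
y = [1, 1, 2, 2, 3, 3]

risk_good = empirical_task_risk(a_values, y, [0.0, 1.0], absolute_loss)
risk_bad = empirical_task_risk(a_values, y, [-1.0, 2.0], absolute_loss)
print(risk_good, risk_bad)
```

On this toy data the thresholds `[0.0, 1.0]` separate the three classes exactly, so their risk is zero, while `[-1.0, 2.0]` mislabels two of the six observations by one class each.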

Figures (1)

Figure 1: An aid to understanding Equation \ref{eq:IOTEQ}. In the calculation of $R_k(t_k)$, a learned 1DT value $\hat{a}({\bm{x}}_i)$ located in the range marked with $l$ is labeled as $l$, and the corresponding task loss is $\ell(l,y_i)$.

Theorems & Definitions (4)

Theorem 1
Proof of Theorem \ref{thm:Optimal}
Theorem 2
Proof of Theorem \ref{thm:Convex}
