Towards Better IncomLDL: We Are Unaware of Hidden Labels in Advance

Jiecheng Jiang, Jiawei Tang, Jiahao Jiang, Hui Liu, Junhui Hou, Yuheng Jia

TL;DR

This work addresses learning complete label distributions when some labels are hidden during annotation by introducing HidLDL, a realistic extension of IncomLDL. It combines a proportional constraint on observed labels with a graph-based dependency model and a global low-rank prior, optimized via ADMM and equipped with a recovery bound. The method demonstrates strong recovery and predictive performance across 12 real-world datasets, significantly outperforming state-of-the-art LDL and IncomLDL baselines, and shows robustness to varying missing rates. This approach provides a principled framework for handling hidden labels in label distribution learning with practical implications for applications requiring reliable uncertainty-aware labeling.

Abstract

Label distribution learning (LDL) is a novel paradigm that describes each sample by a label distribution. However, acquiring an LDL dataset is costly and time-consuming, which gave rise to incomplete label distribution learning (IncomLDL). All previous IncomLDL methods set the description degrees of the "missing" labels in an instance to 0 but leave those of the other labels unchanged. This setting is unrealistic because, when certain labels are missing, the degrees of the remaining labels increase accordingly. We fix this unrealistic setting and raise a new problem: LDL with hidden labels (HidLDL), which aims to recover a complete label distribution from a real-world incomplete label distribution in which certain labels of an instance are omitted during annotation. To solve this challenging problem, we identify the significance of the proportional information among the observed labels and capture it with a novel constraint that is used during optimization. We simultaneously use local feature similarity and the global low-rank structure to uncover the hidden labels. Moreover, we give a theoretical recovery bound for our method, establishing the feasibility of learning from hidden labels. Extensive recovery and predictive experiments on various datasets demonstrate the superiority of our method over state-of-the-art LDL and IncomLDL methods.
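The difference between the two observation settings described in the abstract can be illustrated with a small sketch. It is not the authors' data-generation code: the function names, the example distribution, and the hidden indices are hypothetical, and it assumes that in HidLDL the observed degrees are rescaled proportionally so that they sum to 1.

```python
def incomldl_observation(dist, hidden):
    """IncomLDL-style observation: zero the missing labels, keep the rest unchanged."""
    return [0.0 if i in hidden else d for i, d in enumerate(dist)]


def hidldl_observation(dist, hidden):
    """HidLDL-style observation (assumed): zero the hidden labels, then
    rescale the remaining degrees so that they sum to 1."""
    kept = incomldl_observation(dist, hidden)
    total = sum(kept)
    if total == 0:
        raise ValueError("every label with a nonzero degree is hidden")
    return [d / total for d in kept]


# Hypothetical complete label distribution over four labels.
full = [0.4, 0.3, 0.2, 0.1]
# Hypothetical indices of the labels omitted during annotation.
hidden = {2, 3}

incom = incomldl_observation(full, hidden)
hid = hidldl_observation(full, hidden)

print("IncomLDL observed:", incom, "sum =", sum(incom))
print("HidLDL observed:  ", hid, "sum =", sum(hid))
# The ratio between observed labels is the same as in the full distribution.
print("ratio (full, HidLDL):", full[0] / full[1], hid[0] / hid[1])
```

Under this assumption, the IncomLDL observation no longer sums to 1, while the HidLDL observation does, and the ratio between any two observed labels matches their ratio in the complete distribution. That preserved ratio is one way to read the "proportional information" the abstract refers to.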

Paper Structure

This paper contains 31 sections, 37 equations, 5 figures, 9 tables, 1 algorithm.

Figures (5)

  • Figure 1: (a) A scene image containing 6 elements. (b) The label distribution (LD) of the image. (c) The observed LD in IncomLDL, with gray indicating missing (unobserved) labels. The description degrees of the observed labels do not sum to 1. (d) The observed LD in HidLDL. The labels that are unobserved in (c) have a description degree of 0, and the description degrees of the observed labels sum to 1. HidLDL is more intuitive and realistic, as the observed labels should account for the entire description degree.
  • Figure 2: Visualization of two typical recovery results on the Movie (left) and spo (right) datasets.
  • Figure 3: Comparison of our method and IncomLDL-admm on Cosine (higher is better). The horizontal axis represents the hyper-parameter $\alpha$, and the missing rate is $\omega = 50\%$.
  • Figure A1: Recovery performance comparison under different missing rates.
  • Figure A2: Predictive performance comparison under different missing rates.