Sparse joint shift in multinomial classification

Dirk Tasche

TL;DR

Sparse joint shift (SJS) models dataset shift in which the labels and a subset of the features may change jointly, while the conditional distribution of the remaining features, given the labels and the shifted features, stays invariant. The paper develops density-based characterizations of SJS, proves identifiability under a rank condition, and clarifies its relationship with covariate shift via conditional distribution invariance (CDI). It then examines the two estimation strategies previously proposed for SJS, one KL-based and one discrete, which recover the shift weights from labeled source data and unlabeled target data; it points out inconsistencies in them and suggests improvements, including classifier-augmented approaches to handle high dimensionality. Together, these results support principled domain adaptation under SJS and give practical guidance for estimating the shift and correcting posteriors when target labels are unavailable.
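
As a rough illustration of the invariance described above, the SJS condition can be written in density form. The notation here is an assumption made for this sketch and is not taken from the paper, which works with sub-$\sigma$-algebras: $p$ and $q$ denote the source and target joint densities of features $x = (x_1, \dots, x_d)$ and label $y$, and $I \subset \{1, \dots, d\}$ indexes the features that shift together with the label.

$$
q(x_{I^c} \mid x_I, y) \;=\; p(x_{I^c} \mid x_I, y)
\quad\Longrightarrow\quad
\frac{q(x, y)}{p(x, y)} \;=\; \frac{q(x_I, y)}{p(x_I, y)} .
$$

Under this reading, the density ratio between target and source depends on $x$ only through the shifted features $x_I$ and the label $y$.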

Abstract

Sparse joint shift (SJS) was recently proposed as a tractable model for general dataset shift which may cause changes to the marginal distributions of features and labels as well as the posterior probabilities and the class-conditional feature distributions. Fitting SJS for a target dataset without label observations may produce valid predictions of labels and estimates of class prior probabilities. We present new results on the transmission of SJS from sets of features to larger sets of features, a conditional correction formula for the class posterior probabilities under the target distribution, identifiability of SJS, and the relationship between SJS and covariate shift. In addition, we point out inconsistencies in the algorithms which were proposed for estimating the characteristics of SJS, as they could hamper the search for optimal solutions, and suggest potential improvements.
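
The correction of class posteriors mentioned above can be illustrated on a toy discrete example. The sketch below assumes the shift is described by weights $w(x_I, y) = q(x_I, y) / p(x_I, y)$, so that the target posterior is proportional to $w(x_I, y)\, p(y \mid x)$; the paper's own formula may be stated differently. The function name `sjs_corrected_posterior`, the toy distributions, and the choice of the first feature as the shifted one are all hypothetical.

```python
import numpy as np


def sjs_corrected_posterior(source_posterior, weights):
    """Reweight source posteriors p(y | x) by w(x_I, y) and renormalise.

    Both arguments are (n_points, n_classes) arrays. This is an
    illustrative helper, not code from the paper.
    """
    unnormalised = source_posterior * weights
    return unnormalised / unnormalised.sum(axis=1, keepdims=True)


rng = np.random.default_rng(0)

# Toy source joint p(x1, x2, y): two binary features, three classes.
p = rng.random((2, 2, 3))
p /= p.sum()

# Assumed target joint of the shifted feature x1 and the label y.
q_x1y = rng.random((2, 3))
q_x1y /= q_x1y.sum()

# Shift weights w(x1, y) = q(x1, y) / p(x1, y).
w = q_x1y / p.sum(axis=1)

# Target joint under the SJS assumption: p(x2 | x1, y) is unchanged,
# hence q(x1, x2, y) = w(x1, y) * p(x1, x2, y).
q = w[:, None, :] * p

# Posteriors for the four feature combinations; row index is 2 * x1 + x2.
p_post = (p / p.sum(axis=2, keepdims=True)).reshape(4, 3)
q_post = (q / q.sum(axis=2, keepdims=True)).reshape(4, 3)

# Each x1 value covers two consecutive rows, so repeat the weights twice.
w_rows = np.repeat(w, 2, axis=0)
corrected = sjs_corrected_posterior(p_post, w_rows)
```

In this toy setting `corrected` agrees with the true target posteriors `q_post`. With real data the weights are unknown and have to be estimated from labeled source data and unlabeled target data, which is the harder part of the problem.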

Paper Structure

This paper contains 13 sections, 12 theorems, 80 equations.

Key Result

Proposition 1.8

Under Assumption \ref{as:cont}, suppose that $\mathcal{F}$ is a sub-$\sigma$-algebra of $\mathcal{H}$. Then the following three statements are equivalent:

Theorems & Definitions (38)

  • Definition 1.2
  • Definition 1.3
  • Definition 1.4: Sparse Joint Shift
  • Remark 1.5
  • Remark 1.6
  • Proposition 1.8
  • Theorem 2.1: Equivalent conditions for SJS
  • Lemma 2.2
  • Proof of Lemma 2.2
  • Proof of Theorem 2.1
  • ...and 28 more