IW-GAE: Importance Weighted Group Accuracy Estimation for Improved Calibration and Model Selection in Unsupervised Domain Adaptation

Taejong Joo, Diego Klabjan

TL;DR

The paper tackles calibration and model selection under distribution shift in unsupervised domain adaptation (UDA) by proposing IW-GAE, an importance-weighted group accuracy estimator. It defines two estimators of group accuracy, a Monte Carlo estimator and an importance-weighted estimator, and optimizes the importance weight (IW) so that the two agree, with theoretical bounds connecting the source estimation error to the target accuracy estimation error. By constructing predictive groups based on confidence and choosing the IW from within its confidence interval (CI), IW-GAE improves calibration and model selection across multiple UDA benchmarks. The approach highlights the value of group-wise accuracy and distribution-aware weighting for reliable deployment in shifted environments.

Abstract

Distribution shifts pose significant challenges for model calibration and model selection tasks in the unsupervised domain adaptation problem -- a scenario where the goal is to perform well in a distribution-shifted domain without labels. In this work, we tackle difficulties arising from distribution shifts by developing a novel importance weighted group accuracy estimator. Specifically, we present a new perspective on the model calibration and model selection tasks: addressing them by estimating the group accuracy. We then formulate an optimization problem for finding an importance weight that leads to accurate group accuracy estimation, supported by theoretical analyses. Our extensive experiments show that our approach improves state-of-the-art performance by 22% in the model calibration task and 14% in the model selection task.
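To make the idea of an importance-weighted group accuracy concrete, here is a minimal sketch. It is not the paper's estimator: it assumes equal-width confidence bins as the groups, assumes the importance weights (target-to-source density ratios) are already given, and uses the standard self-normalized importance-weighting formula. The function name `iw_group_accuracy` and its parameters are hypothetical.

```python
import numpy as np


def iw_group_accuracy(confidence, correct, weights, n_groups=10):
    """Self-normalized importance-weighted accuracy per confidence bin.

    Illustrative only: groups are equal-width confidence bins (an assumption),
    and `weights` are precomputed importance weights for labeled source samples.
    Returns an array of length n_groups; bins with no weight mass are NaN.
    """
    confidence = np.asarray(confidence, dtype=float)
    correct = np.asarray(correct, dtype=float)
    weights = np.asarray(weights, dtype=float)

    edges = np.linspace(0.0, 1.0, n_groups + 1)
    # Using only the interior edges maps every confidence in [0, 1]
    # to a group index in {0, ..., n_groups - 1}.
    group_ids = np.digitize(confidence, edges[1:-1])

    acc = np.full(n_groups, np.nan)
    for g in range(n_groups):
        mask = group_ids == g
        w_sum = weights[mask].sum()
        if w_sum > 0:
            # Weighted fraction of correct predictions inside the group.
            acc[g] = np.dot(weights[mask], correct[mask]) / w_sum
    return acc


# Two groups: confidences below 0.5 and confidences at or above 0.5.
acc = iw_group_accuracy(
    confidence=[0.1, 0.2, 0.8, 0.9],
    correct=[0, 1, 1, 1],
    weights=[1.0, 3.0, 1.0, 1.0],
    n_groups=2,
)
# Group 0: (1*0 + 3*1) / (1 + 3) = 0.75; group 1: 1.0
```

With uniform weights the same call gives the plain per-group accuracy on the source data (0.5 for the first group in this example), so the weights are what shift the estimate toward the target distribution.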

Paper Structure

This paper contains 42 sections, 4 theorems, 23 equations, 9 figures, 10 tables, 1 algorithm.

Key Result

Proposition 2.1

Let $\hat{\beta}^{(id)}$ and $\hat{\beta}^{(gr)}$ be the MLEs of the individual and group accuracies, respectively. Then $\hat{\beta}^{(gr)}$ has a lower expected mean-squared error than $\hat{\beta}^{(id)}$ if [inequality missing in the source], where $\bar{\sigma}^2 = \tfrac{1}{N_n} \sum_{i \in [N_n]} \sigma^2_{x_i}$ with $\sigma^2_{x_i} = \beta(x_i) (1-\beta(x_i))$.
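
The comparison in Proposition 2.1 can be worked out under explicit assumptions. The following is a sketch, not the paper's statement: it assumes one independent correctness indicator $Y_i \sim \mathrm{Bernoulli}(\beta(x_i))$ per sample in a group of size $N_n$, takes the individual MLE to be $Y_i$ and the group MLE to be the group mean $\bar{Y}$, and averages the squared error uniformly over the group. The symbols $Y_i$, $\bar{Y}$, and $\bar{\beta}$ are introduced here for illustration only.

```latex
% Assumed notation: \bar{Y} = \frac{1}{N_n}\sum_{j \in [N_n]} Y_j,
%                   \bar{\beta} = \frac{1}{N_n}\sum_{j \in [N_n]} \beta(x_j).
\begin{align*}
\frac{1}{N_n}\sum_{i \in [N_n]} \mathbb{E}\big[(Y_i - \beta(x_i))^2\big]
  &= \frac{1}{N_n}\sum_{i \in [N_n]} \sigma^2_{x_i} = \bar{\sigma}^2, \\
\frac{1}{N_n}\sum_{i \in [N_n]} \mathbb{E}\big[(\bar{Y} - \beta(x_i))^2\big]
  &= \underbrace{\frac{\bar{\sigma}^2}{N_n}}_{\text{variance of } \bar{Y}}
   + \underbrace{\frac{1}{N_n}\sum_{i \in [N_n]} \big(\beta(x_i) - \bar{\beta}\big)^2}_{\text{average squared bias}}.
\end{align*}
% Under these assumptions the group estimator has the lower error exactly when
\[
\frac{1}{N_n}\sum_{i \in [N_n]} \big(\beta(x_i) - \bar{\beta}\big)^2
  < \frac{N_n - 1}{N_n}\,\bar{\sigma}^2 .
\]
```

By Popoviciu's inequality, the left-hand side is at most $\tfrac{1}{4}\big(\max_i \beta(x_i) - \min_i \beta(x_i)\big)^2$, so a small enough spread between the maximum and minimum expected accuracies in the group is sufficient under this sketch. That is consistent with the Figure A1 caption's reference to maximum and minimum expected accuracies, but whether the paper's inequality takes exactly this form is not confirmed here.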

Figures (9)

  • Figure 1: Figure \ref{fig:figure1b} illustrates ideal and failure cases of IW-GAE with nine data points (red diamonds) from three groups (gray boxes). Group 1 is desirable for model calibration: its group accuracy estimate (a blue rectangle) represents the individual expected accuracies of the samples in the group well. Conversely, a group accuracy estimate can misrepresent the individual accuracies in the group because of a high variance of accuracies within the group (group 2) or a high bias of the estimator (group 3). For model selection, we aim to match the mean group accuracy estimate (the blue dotted line, the average of the blue rectangles) to the mean expected accuracy (the red dotted line, the average of the red diamonds), which follows from accurate group accuracy estimates for each group. Figure \ref{fig:figure1a} explains the idea of encouraging the two estimators to be close to each other. The shaded area for the IW-based estimator shows the possible group accuracy estimates from different IWs. IW-GAE finds the IW that minimizes the optimization error in order to reduce the group accuracy estimation error.
  • Figure 2: Illustration of correlations between the optimization error and the source and target group accuracy estimation errors. Each point corresponds to a different IW estimator, and the values are measured on the OfficeHome dataset (720 IW estimators in total). See Appendix \ref{appx:terms_analy} for more detailed discussions and analyses.
  • Figure 3: True group accuracy and estimated group accuracy of IW-GAE and IW-Mid under MDD. The shaded areas represent the possible group accuracy estimates with binned IWs in the CI. See Figure \ref{fig:rc_curve} for a visualization of all domain pairs.
  • Figure 4: Sensitivity analysis with respect to $M$ (a) and $B$ (b) on four domain pairs (Ar-Pr, Pr-Cl, Rw-Cl, Rw-Pr) in OfficeHome. The shaded areas represent areas between the minimum and the maximum changes in ECE (lower is better).
  • Figure A1: The shaded area includes values of maximum and minimum expected accuracies within the group, which satisfy the sufficient condition for (\ref{eq:stat_favorable}) when $N_n = 100$.
  • ...and 4 more figures

Theorems & Definitions (8)

  • Proposition 2.1
  • Proposition 3.1
  • proof
  • Proposition 1.1
  • proof
  • proof
  • Proposition 1.2: Bias-variance decomposition
  • proof