Globally-Optimal Greedy Experiment Selection for Active Sequential Estimation

Xiaoou Li, Hongru Zhao

TL;DR

This study proposes a class of greedy experiment selection methods, analyzes the maximum likelihood estimator obtained under these selection rules, and proves that the resulting estimators are consistent and asymptotically normal.

Abstract

Motivated by modern applications such as computerized adaptive testing, sequential rank aggregation, and heterogeneous data source selection, we study the problem of active sequential estimation, which involves adaptively selecting experiments for sequentially collected data. The goal is to design experiment selection rules for more accurate model estimation. Greedy information-based experiment selection methods, which optimize the information gain one step ahead, have been employed in practice thanks to their computational convenience, flexibility to context or task changes, and broad applicability. However, statistical analysis is restricted to one-dimensional cases due to the problem's combinatorial nature and the seemingly limited capacity of greedy algorithms, leaving the multidimensional problem open. In this study, we close the gap for multidimensional problems. In particular, we propose adopting a class of greedy experiment selection methods and provide statistical analysis for the maximum likelihood estimator following these selection rules. This class encompasses existing methods and introduces new methods with improved numerical efficiency. We prove that these methods produce consistent and asymptotically normal estimators. Additionally, within a decision theory framework, we establish that the proposed methods achieve asymptotic optimality when the risk measure aligns with the selection rule. We also conduct extensive numerical studies on both simulated and real data to illustrate the efficacy of the proposed methods. From a technical perspective, we devise new analytical tools to address theoretical challenges. These analytical tools are of independent theoretical interest and may be reused in related problems involving stochastic approximation and sequential designs.
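
To make the idea of one-step-ahead greedy selection concrete, the following is a minimal sketch under stated assumptions, not the paper's GI0 or GI1 rules. It assumes a logistic response model in which each candidate experiment is a design vector, and it assumes a log-determinant (D-optimality style) criterion on the accumulated Fisher information. The function names `fisher_info` and `greedy_select` and the `ridge` parameter are hypothetical and chosen only for illustration.

```python
import numpy as np


def fisher_info(theta, x):
    """Fisher information of one Bernoulli response under an assumed
    logistic model with design vector x, evaluated at theta."""
    p = 1.0 / (1.0 + np.exp(-(x @ theta)))
    return p * (1.0 - p) * np.outer(x, x)


def greedy_select(theta_hat, designs, info_so_far, ridge=1e-6):
    """Return the index of the candidate design that maximizes the
    log-determinant of the information after one more observation.

    theta_hat   : current plug-in estimate (for example, the MLE)
    designs     : array of candidate design vectors, one per row
    info_so_far : information accumulated from past experiments
    ridge       : small stabilizer (an assumption of this sketch) that
                  keeps the matrix invertible early in the sequence
    """
    dim = len(theta_hat)
    best_index, best_value = -1, -np.inf
    for a, x in enumerate(designs):
        candidate = info_so_far + fisher_info(theta_hat, x) + ridge * np.eye(dim)
        sign, logdet = np.linalg.slogdet(candidate)
        value = logdet if sign > 0 else -np.inf
        if value > best_value:
            best_index, best_value = a, value
    return best_index
```

In a sequential loop, one would call `greedy_select` with the current estimate, run the chosen experiment, add its information to `info_so_far`, and refit the estimate before the next step. Other criteria (for example, a trace-based one) could be substituted for the log-determinant without changing the structure of the loop.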

Paper Structure

This paper contains 63 sections, 51 theorems, 584 equations, 8 figures, 2 tables, 6 algorithms.

Key Result

Lemma 3.1

Assume the computational complexity of evaluating $\mathbb{G}_{ \bm{\theta} }(\bm{\Sigma})$ and $\nabla \mathbb{G}_{ \bm{\theta} }(\bm{\Sigma})$ is no more than $O(p^3)$. Given the MLE $\widehat{\bm{\theta}}_n^{\text{ML}}$ and $\mathcal{I}(\widehat{\bm{\theta}}_n^{\text{ML}};{\bm{a}}_n)$, we have

Figures (8)

  • Figure 1: Optimal proportion $\bm{\pi}^*$ as a function of $\theta_2$, where the true parameter satisfies $\bm{\theta}^* = (1,\theta_2)^T$.
  • Figure 2: MSE of the MLE as sample size $n$ varies.
  • Figure 3: Empirical proportion $\overline{\pi}_n(a)$ and the optimal proportion $\pi(a)$ for $a=1,2,3$.
  • Figure 4: Comparison of different selection methods through Kendall's $\tau$ coefficient. The averaged Kendall's $\tau$ correlation between $\bm{\theta}^*$ and $\widehat{\bm{\theta}}_{n}^{\text{ML}}$ versus the number of comparisons is plotted, along with the first and third quartiles, following different active experiment selection methods.
  • Figure 5: Histograms for $\{Z^j_1\}_{j=1}^N$ and $\{Z^j_2\}_{j=1}^N$ following GI0 and GI1, and the density curve for the standard normal distribution. The upper left and bottom left panels show the histogram of $\{Z^j_1\}_{j=1}^N$ following GI0 and GI1, respectively. The upper right and bottom right panels show the histogram of $\{Z^j_2\}_{j=1}^N$ following GI0 and GI1, respectively.
  • ...and 3 more figures

Theorems & Definitions (106)

  • Example 1
  • Lemma 3.1
  • Theorem 4.1: Strong consistency
  • Theorem 4.2: Asymptotic normality following general experiment selection rules
  • Theorem 4.3: Limiting experiment selection frequency following GI0 or GI1
  • Theorem 4.4: Asymptotic normality following GI0 or GI1
  • Theorem 4.5: Asymptotic covariance matrix of the MLE
  • Definition 4.6: $\mathbb{G}_{\bm{\theta}^*}$-optimality
  • Theorem 4.7: $\mathbb{G}_{\bm{\theta}^*}$-optimal selection
  • Theorem 4.8: Minimum risk for unbiased estimators
  • ...and 96 more