Estimating odds and log odds with guaranteed accuracy

Luis Mendo

TL;DR

The paper tackles the problem of estimating the odds $\gamma = \frac{p}{1-p}$ and log odds $\phi = \log\gamma$ of a Bernoulli parameter $p$, with guarantees that the target accuracy holds uniformly over $p\in(0,1)$. It introduces two unbiased sequential estimators based on inverse binomial sampling: for $\gamma$, independent streams estimate $p$ and $1-p$ to form $\hat{\gamma}$; for $\phi$, independent streams estimate $\log p$ and $\log(1-p)$ to form $\hat{\phi}$. The author derives explicit variance bounds and average sample sizes, and shows that the efficiency relative to Wolfowitz's bound approaches 1 as the controlling parameter $r$ grows, so the guaranteed accuracy comes at a sampling cost close to the minimum possible. The approach yields guaranteed relative MSE control for odds and absolute MSE control for log odds, with guidance on choosing $r$ to meet a desired RMSE target. The resulting estimators apply to risk-related measures in clinical, epidemiological, economic, and machine-learning contexts involving Bernoulli data.
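To make the two-stream idea concrete, here is a minimal simulation sketch of an unbiased odds estimate built from inverse binomial sampling. It is an illustration under stated assumptions, not necessarily the paper's exact estimator: it combines the classical unbiased estimate $(r-1)/(N_1-1)$ of $p$ (where $N_1$ is the number of trials needed to see $r$ successes) with the unbiased estimate $N_2/r$ of $1/(1-p)$ (where $N_2$ is the number of trials needed to see $r$ failures). The function names `trials_until` and `estimate_odds` are hypothetical.

```python
import random


def trials_until(r, p, want_success, rng):
    """Inverse binomial sampling: draw Bernoulli(p) samples until r outcomes
    of the wanted kind (successes or failures) occur; return the trial count."""
    trials = 0
    hits = 0
    while hits < r:
        trials += 1
        success = rng.random() < p
        if success == want_success:
            hits += 1
    return trials


def estimate_odds(r, p, rng):
    """Assumed construction (not confirmed as the paper's formula):
    product of two independent unbiased estimates."""
    if r < 2:
        raise ValueError("r must be at least 2")
    n1 = trials_until(r, p, True, rng)    # stream 1: stop at the r-th success
    n2 = trials_until(r, p, False, rng)   # stream 2: stop at the r-th failure
    p_hat = (r - 1) / (n1 - 1)            # unbiased for p when r >= 2
    inv_q_hat = n2 / r                    # unbiased for 1 / (1 - p)
    return p_hat * inv_q_hat              # independence gives mean p / (1 - p)


# Monte Carlo check: the average estimate should be close to the true odds.
rng = random.Random(0)
p, r = 0.3, 10
estimates = [estimate_odds(r, p, rng) for _ in range(5000)]
print("mean estimate:", sum(estimates) / len(estimates))
print("true odds:    ", p / (1 - p))
```

Because the two streams are independent, the expectation of the product is the product of the expectations, which is why each factor only needs to be unbiased on its own. In a real application the samples would come from data rather than from a simulator that knows `p`.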

Abstract

Two sequential estimators are proposed for the odds p/(1-p) and log odds log(p/(1-p)) respectively, using independent Bernoulli random variables with parameter p as inputs. The estimators are unbiased, and guarantee that the variance of the estimation error divided by the true value of the odds, or the variance of the estimation error of the log odds, is less than a target value for any p in (0,1). The estimators are close to optimal in the sense of Wolfowitz's bound.
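For the log odds, a similar sketch can be written using differences of harmonic numbers. This relies on a known identity for inverse binomial sampling: if $N$ is the number of trials needed to observe $r$ successes, then $H_{N-1} - H_{r-1}$ has expectation $-\log p$, where $H_n$ is the $n$-th harmonic number. Whether the paper uses exactly this form is an assumption here, and the names `trials_until`, `harmonic`, and `estimate_log_odds` are hypothetical.

```python
import math
import random


def trials_until(r, p, want_success, rng):
    """Draw Bernoulli(p) samples until r outcomes of the wanted kind occur;
    return the number of trials used (inverse binomial sampling)."""
    trials = 0
    hits = 0
    while hits < r:
        trials += 1
        if (rng.random() < p) == want_success:
            hits += 1
    return trials


def harmonic(n):
    """n-th harmonic number H_n, with H_0 = 0."""
    return sum(1.0 / k for k in range(1, n + 1))


def estimate_log_odds(r, p, rng):
    """Assumed construction: difference of two independent unbiased
    estimates of log p and log(1 - p)."""
    if r < 1:
        raise ValueError("r must be a positive integer")
    n1 = trials_until(r, p, True, rng)    # stream 1: stop at the r-th success
    n2 = trials_until(r, p, False, rng)   # stream 2: stop at the r-th failure
    log_p_hat = -(harmonic(n1 - 1) - harmonic(r - 1))   # mean is log p
    log_q_hat = -(harmonic(n2 - 1) - harmonic(r - 1))   # mean is log(1 - p)
    return log_p_hat - log_q_hat


# Monte Carlo check: the average estimate should be close to the true log odds.
rng = random.Random(0)
p, r = 0.3, 10
estimates = [estimate_log_odds(r, p, rng) for _ in range(5000)]
print("mean estimate: ", sum(estimates) / len(estimates))
print("true log odds: ", math.log(p / (1 - p)))
```

The $H_{r-1}$ terms cancel in the difference, so the estimate reduces to $H_{N_2-1} - H_{N_1-1}$; they are kept in the code only to show each stream's estimate separately.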

Paper Structure (4 sections, 5 theorems, 45 equations, 3 figures)

Introduction
Estimation of odds
Estimation of log odds
Discussion and further work

Key Result

Theorem 1

For $r \in \mathbb{N}$, $r \geq 2$, $p \in (0,1)$, the estimator $\hat{\gamma}$ given by \ref{eq: vahodds} has the following properties:

Figures (3)

Figure 1: Efficiency of odds estimator $\hat{\gamma}$
Figure 2: Proof of bounds in \ref{eq: lema diff harm: bound}
Figure 3: Efficiency of log odds estimator $\hat{\phi}$

Theorems & Definitions (10)

Theorem 1
proof
Proposition 1
proof
Theorem 2
Lemma 1
proof
proof: Proof of Theorem \ref{theo: vahlogp}
Theorem 3
proof
