Power and Sample Size Calculations for Bayes Factors in two-arm clinical Phase II Trials with binary Endpoints

Riko Kelter

TL;DR

The approach allows for a Bayes-frequentist compromise by providing a Bayesian analogue to a frequentist power analysis for various Bayes factors in the two-arm binomial setting of a phase II clinical trial.

Abstract

Bayesian sample size calculations in clinical trials usually rely on complex Monte Carlo simulations. Bounds on Bayesian notions of the false-positive rate and power often lack closed-form or approximate numerical solutions. In this paper, we focus on power and sample size calculations for Bayes factors in the two-arm binomial setting of phase II trials. We cover point-null versus composite and directional hypothesis tests, derive the corresponding Bayes factors, and discuss relevant aspects to consider when pursuing Bayesian design of experiments with the introduced approach. Based on these Bayes factors, we propose a numerical approach that determines the sample size needed to meet prespecified bounds on Bayesian power and type-I-error rate in a computationally efficient way. Our method does not rely on Monte Carlo simulations and uses only standard numerical methods. Real-world examples of phase II trials from oncology and autoimmune diseases illustrate the advantage of the proposed calibration method. In summary, our approach allows for a Bayes-frequentist compromise by providing a Bayesian analogue to a frequentist power analysis for various Bayes factors in the two-arm binomial setting of a phase II clinical trial. The methods are implemented in our R package bfbin2arm.
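To make the simulation-free idea in the abstract concrete, the following is a minimal Python sketch and not the interface of bfbin2arm. It assumes independent Beta priors on $p_1$ and $p_2$ under the alternative, a single Beta prior on the common response rate under $H_0:p_1=p_2$, and identical design and analysis priors; all function names are hypothetical. Because the outcome space $(y_1,y_2)$ is finite, Bayesian power and the Bayesian type-I-error rate can be obtained by summing prior-predictive probabilities over all outcomes where the Bayes factor $BF_{01}$ falls below the threshold $k$.

```python
from math import comb, exp, lgamma


def log_beta(a, b):
    """Logarithm of the Beta function B(a, b)."""
    return lgamma(a) + lgamma(b) - lgamma(a + b)


def beta_binom_pmf(y, n, a, b):
    """Prior-predictive pmf of y ~ Bin(n, p) with p ~ Beta(a, b)."""
    return comb(n, y) * exp(log_beta(a + y, b + n - y) - log_beta(a, b))


def pooled_pmf(y1, y2, n1, n2, a, b):
    """Prior-predictive pmf of (y1, y2) under H0: p1 = p2 = p, p ~ Beta(a, b)."""
    s, n = y1 + y2, n1 + n2
    log_ratio = log_beta(a + s, b + n - s) - log_beta(a, b)
    return comb(n1, y1) * comb(n2, y2) * exp(log_ratio)


def bf01(y1, y2, n1, n2, prior0=(1, 1), prior1=(1, 1), prior2=(1, 1)):
    """Bayes factor of H0: p1 = p2 against H1: p1 != p2 (assumed Beta priors)."""
    m0 = pooled_pmf(y1, y2, n1, n2, *prior0)
    m1 = beta_binom_pmf(y1, n1, *prior1) * beta_binom_pmf(y2, n2, *prior2)
    return m0 / m1


def operating_characteristics(n1, n2, k=1 / 3,
                              prior0=(1, 1), prior1=(1, 1), prior2=(1, 1)):
    """Exact Bayesian power and type-I-error rate for the rule BF01 < k.

    Simplification: the same priors serve as design and analysis priors.
    """
    power = 0.0
    type1 = 0.0
    for y1 in range(n1 + 1):
        for y2 in range(n2 + 1):
            if bf01(y1, y2, n1, n2, prior0, prior1, prior2) < k:
                # Probability of this outcome under H1 and under H0.
                power += (beta_binom_pmf(y1, n1, *prior1)
                          * beta_binom_pmf(y2, n2, *prior2))
                type1 += pooled_pmf(y1, y2, n1, n2, *prior0)
    return power, type1


def smallest_balanced_n(target_power=0.8, max_type1=0.05, k=1 / 3, n_max=60):
    """First n1 = n2 = n meeting both bounds, or None if none is found."""
    for n in range(1, n_max + 1):
        power, type1 = operating_characteristics(n, n, k)
        if power >= target_power and type1 <= max_type1:
            return n
    return None
```

With flat priors and $n_1=n_2=5$, `bf01(0, 5, 5, 5)` evaluates to $1/77$, i.e. strong evidence against $H_0$ when all responses fall in one arm. The linear search in `smallest_balanced_n` is a deliberately naive stand-in for the root-finding step the paper describes. Since power on a discrete outcome space need not increase monotonically in $n$, the first $n$ that meets the bounds is not guaranteed to keep meeting them for all larger $n$.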

Paper Structure (48 sections, 121 equations, 10 figures, 3 tables)

Introduction
Outline
Bayesian power and sample size calculations for Bayes factors in the two-arm setting
Two-sided hypothesis test
Choice of the design and analysis prior
Derivation of the prior-predictive distribution
Derivation of the Bayes factor
Numerical root-finding
Computation of critical value(s)
Computation of Bayesian type-I-error rate and power
Sample size calculation for the Bayes factor
One-sided hypothesis test of $H_0:\eta=0$ versus $H_+:\eta>0$
Binomial model and hypotheses
Design and analysis priors under $H_0$ and $H_+$
Finite-sum form for $C$ (integer shapes)
...and 33 more sections

Figures (10)

Figure 1: Overview of the methodology underlying Bayesian power and sample size calculations for Bayes factors in the two-arm binomial setting, modified and adapted from the one-arm setting in Kelter and Pawel (2025).
Figure 2: Illustration of the computation of critical values for the Bayes factor power calculation, based on $n_1=n_2=5$ and flat analysis priors with $a_i^a=b_i^a=1$ for $i=1,2$.
Figure 3: Untruncated Beta$(a_1^d,a_2^d)$ design (or analysis) prior (right) and truncated Beta$(a_1^d,a_2^d)$ prior (left) for $a_1^d=2$, $b_1^d=5$, $a_2^d=3$ and $b_2^d=4$, where the truncation is to the set $A=\{(p_1,p_2): p_2>p_1\}$ with normalizing constant $\iint_{A} \pi_{\text{untr}}(p_1,p_2)\,dp_1\,dp_2 = P(p_2>p_1)$.
Figure 4: Bayesian power and sample size calculations for the riociguat trial, where $H_0:p_1=p_2$ versus $H_1:p_2>p_1$ is tested. Flat design and analysis priors are used with moderate evidence thresholds $k=1/3$, $k_f=3$. The calibration shows the results for 80% power, 5% type-I-error rate and 80% probability of compelling evidence under $H_0$. Frequentist power was obtained under $p_1=0.4$ and $p_2=0.6$.
Figure 5: Bayesian power and sample size calculations for the riociguat trial, where $H_0:p_1=p_2$ versus $H_1:p_2>p_1$ is tested. Informative $B(1,2)$ and $B(2,1)$ design and flat analysis priors are used with strong evidence threshold $k=1/10$ and evidence threshold for compelling evidence under $H_0$ of $k_f=3$. The calibration shows the results for 80% power, 5% type-I-error rate and 80% probability of compelling evidence under $H_0$. Frequentist power was obtained under $p_1=0.4$ and $p_2=0.6$.
...and 5 more figures

Theorems & Definitions (5)

proof : Prior-predictive under $H_1:\eta \neq 0$
proof : Prior-predictive under $H_0:\eta=0$
proof : Finite-sum form for $C$ (integer shapes)
proof : Predictive density under $H_+:\eta >0$
proof : Finite-sum form for $I(y_1,y_2)$ (integer shapes)
