Adaptive Gaussian Process Search for Simulation-Based Sample Size Estimation in Clinical Prediction Models: Validation of the pmsims R Package

Oyebayo Ridwan Olaniran, Diana Shamsutdinova, Sarah Markham, Felix Zimmer, Daniel Stahl, Gordon Forbes, Ewan Carr

Abstract

Background: Determining an adequate sample size is essential for developing reliable and generalisable clinical prediction models, yet practical guidance on selecting appropriate methods remains limited. Existing analytical and simulation-based approaches often rely on restrictive assumptions and focus on mean-based criteria. We present and validate pmsims, an R package that uses Gaussian process surrogate modelling to provide a flexible and computationally efficient simulation-based framework for sample size determination across diverse prediction settings.

Methods: We conducted a comprehensive simulation study with two aims. First, we compared three search engines implemented in pmsims: a Gaussian process-based adaptive method, a deterministic bisection method, and a hybrid approach, across binary, continuous, and survival outcomes. Second, we benchmarked the best-performing pmsims engine against existing analytical (pmsampsize) and simulation-based (samplesizedev) methods, evaluating recommended sample sizes, computational time, and achieved performance on large independent validation datasets.

Results: The Gaussian process-based method consistently produced the most stable sample size estimates, particularly in low-signal, high-dimensional settings. In benchmarking, pmsims achieved performance close to prespecified targets across all outcome types, matching simulation-based approaches and outperforming analytical methods in more challenging scenarios.

Conclusions: pmsims provides an efficient and flexible framework for principled sample size planning in clinical prediction modelling, requiring fewer model evaluations than non-adaptive simulation approaches.
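To make the idea of a simulation-based sample size search concrete, the following is a minimal sketch of a bisection search for a continuous outcome. It is not the pmsims implementation and it is written in Python rather than R. Every name in it (`solve`, `simulated_r2`, `mean_r2`, `bisection_sample_size`) is hypothetical. The sketch assumes independent standard-normal predictors, a linear model fitted by ordinary least squares without an intercept, and a mean out-of-sample $R^2$ target. Under those assumptions the expected squared error on new data has a closed form, so no validation set is simulated. It also assumes that mean performance increases with sample size, which simulation noise can violate when few replicates are used.

```python
import random


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        rest = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - rest) / m[i][i]
    return x


def simulated_r2(n, beta, sigma, rng):
    """Simulate one training set of size n, fit OLS, and return out-of-sample R^2."""
    p = len(beta)
    xtx = [[0.0] * p for _ in range(p)]
    xty = [0.0] * p
    for _ in range(n):
        x = [rng.gauss(0.0, 1.0) for _ in range(p)]
        y = sum(b * xi for b, xi in zip(beta, x)) + rng.gauss(0.0, sigma)
        for i in range(p):
            xty[i] += x[i] * y
            for j in range(p):
                xtx[i][j] += x[i] * x[j]
    beta_hat = solve(xtx, xty)
    # Assumption of this sketch: with independent N(0, 1) predictors, the
    # expected squared error on new data is sigma^2 + ||beta_hat - beta||^2.
    mse = sigma ** 2 + sum((bh - b) ** 2 for bh, b in zip(beta_hat, beta))
    var_y = sigma ** 2 + sum(b * b for b in beta)
    return 1.0 - mse / var_y


def mean_r2(n, beta, sigma, reps, rng):
    """Average out-of-sample R^2 over `reps` simulated training sets of size n."""
    return sum(simulated_r2(n, beta, sigma, rng) for _ in range(reps)) / reps


def bisection_sample_size(target, lo, hi, beta, sigma, reps=20, seed=1):
    """Return a sample size in (lo, hi] whose simulated mean R^2 reaches `target`.

    `lo` is assumed to be too small and is never evaluated; it must be at
    least the number of predictors so that every fitted model is identifiable.
    """
    if lo < len(beta):
        raise ValueError("lo must be at least the number of predictors")
    rng = random.Random(seed)
    if mean_r2(hi, beta, sigma, reps, rng) < target:
        raise ValueError("target not reached at hi; widen the search range")
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if mean_r2(mid, beta, sigma, reps, rng) >= target:
            hi = mid  # target met: try smaller sample sizes
        else:
            lo = mid  # target missed: need a larger sample
    return hi
```

For example, `bisection_sample_size(0.29, lo=5, hi=800, beta=[0.3] * 5, sigma=1.0)` searches for a sample size at which a five-predictor model with a true $R^2$ of about 0.31 reaches a mean out-of-sample $R^2$ of 0.29. Because each evaluation is a noisy average, repeated runs with different seeds can return noticeably different sample sizes. A surrogate model that pools information across all evaluated sample sizes is one way to reduce that instability.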

Paper Structure (27 sections, 10 equations, 18 figures, 9 tables, 3 algorithms)

Figures (18)

  • Figure 1: Aim 1 (Binary outcome): Comparison of CV and computational time across the search engines gp, bisection and gp-bs for the calibration slope metric, varying the number of simulation replicates per evaluation ($\kappa = 10, 20$), prevalence ($0.05, 0.20$), number of predictors ($p = 10, 100$) and total simulation budget ($B = 200$ to $2000$).
  • Figure 2: Aim 1 (Survival outcome): Comparison of CV and computational time across the search engines gp, bisection and gp-bs for the calibration slope metric, varying $\kappa$ ($10, 20$), event rate ($0.4, 0.8$), number of predictors ($p = 10, 100$) and total simulation budget ($B = 200$ to $2000$).
  • Figure S1: Aim 1 (Binary outcome): Comparison of CV and computational time across the search engines gp, bisection and gp-bs for the AUC metric, varying $\kappa$ ($10, 20$), prevalence ($0.05, 0.20$), number of predictors ($p = 10, 100$) and total simulation budget ($B = 200$ to $2000$).
  • Figure S2: Aim 1 (Continuous outcome): Comparison of CV and computational time across the search engines gp, bisection and gp-bs for the calibration slope metric, varying $\kappa$ ($10, 20$), $R^2$ ($0.2, 0.7$), number of predictors ($p = 10, 100$) and total simulation budget ($B = 200$ to $2000$).
  • Figure S3: Aim 1 (Continuous outcome): Comparison of CV and computational time across the search engines gp, bisection and gp-bs for the $R^2$ metric, varying $\kappa$ ($10, 20$), $R^2$ ($0.2, 0.7$), number of predictors ($p = 10, 100$) and total simulation budget ($B = 200$ to $2000$).
  • ...and 13 more figures