New tools for comparing classical and neural ODE models for tumor growth

Anthony D. Blaom, Samuel Okon

TL;DR

This paper presents TumorGrowth.jl, a Julia package for calibrating and comparing ODE-based tumor growth models, including classical formulations and novel neural ODEs. Re-analyzing the Laleh et al. meta-study of 652 lesion time series, the authors assess the statistical significance of performance differences and explore rebound dynamics with two-dimensional extensions. The results indicate that the General Bertalanffy model outperforms the alternatives on average, while two-dimensional neural ODEs offer limited gains and can overfit without sufficient data; with more measurements, more complex models may become advantageous. The work provides a transparent, reproducible framework for model comparison in tumor growth and highlights practical considerations for calibration stability and data requirements in clinical contexts.

Abstract

A new computational tool, TumorGrowth.jl, for modeling tumor growth is introduced. The tool allows the comparison of standard textbook models, such as General Bertalanffy and Gompertz, with some newer models, including, for the first time, neural ODE models. As an application, we revisit a human meta-study of non-small cell lung cancer and bladder cancer lesions, in patients undergoing two different treatment options, to determine if previously reported performance differences are statistically significant, and if newer, more complex models perform any better. In a population of examples with at least four time-volume measurements available for calibration, and an average of about 6.3, our main conclusion is that the General Bertalanffy model has superior performance, on average. However, where more measurements are available, we argue that more complex models, capable of capturing rebound and relapse behavior, may be better choices.

Paper Structure

This paper contains 13 sections, 9 equations, 2 figures, 2 tables.

Figures (2)

  • Figure 1: Selected model comparisons for observations with relapse or rebound. Solid diamonds indicate calibration data; open diamonds indicate subsequent observations not used in calibration. Two classical models, exponential and General Bertalanffy (bertalanffy), are compared with a 2D generalization of General Bertalanffy (bertalanffy2) and a 2D, 14-parameter neural ODE (neural2). While the newer models perform better on the holdout data in (a), the classical models do better in (d). Results are mixed in cases (b) and (c).
  • Figure 2: Box-and-whisker plots of the absolute prediction errors on a holdout set.