Representation Meets Optimization: Training PINNs and PIKANs for Gray-Box Discovery in Systems Pharmacology

Nazanin Ahmadi Daryakenari, Khemraj Shukla, George Em Karniadakis

TL;DR

This work systematically benchmarks PINNs and tanh-cPIKANs for gray-box discovery in pharmacology, introducing a Chebyshev-based tanh-cPIKAN variant that stabilizes training. By evaluating a spectrum of optimizers, learning-rate schedulers, and numerical precisions on pharmacokinetics (PK) and pharmacodynamics (PD) inverse problems, it shows that no single method is universally best; however, hybrid optimization (RAdam warm-up followed by BFGS) and double-precision training consistently yield strong accuracy, especially for tanh-cPIKANs. The results offer practical guidance on architecture choice, optimization strategies, and precision settings for robustly recovering missing dynamics from sparse, ill-posed biomedical data. The study also demonstrates the value of detailed loss-landscape analyses and provides public code to reproduce the findings and apply them to related gray-box modeling tasks.

Abstract

Physics-Informed Kolmogorov-Arnold Networks (PIKANs) are gaining attention as an effective counterpart to the original multilayer perceptron-based Physics-Informed Neural Networks (PINNs). Both representation models can address inverse problems and facilitate gray-box system identification. However, their relative performance in terms of accuracy and speed remains underexplored. Here, we introduce a modified PIKAN architecture, tanh-cPIKAN, which uses Chebyshev polynomials to parametrize the univariate functions and adds an extra nonlinearity for enhanced performance. We then present a systematic investigation of how choices of the optimizer, representation, and training configuration influence the performance of PINNs and PIKANs in the context of systems pharmacology modeling. We benchmark a wide range of first-order, second-order, and hybrid optimizers, including various learning rate schedulers. We use the new Optax library to identify the most effective combinations for learning gray-box models under ill-posed, non-unique, and data-sparse conditions. We examine the influence of model architecture (MLP vs. KAN), numerical precision (single vs. double), the need for warm-up phases for second-order methods, and sensitivity to the initial learning rate. We also assess optimizer scalability for larger models and analyze the trade-offs introduced by JAX in terms of computational efficiency and numerical accuracy. Using two representative systems pharmacology case studies, a pharmacokinetics model and a chemotherapy drug-response model, we offer practical guidance on selecting optimizers and representation models/architectures for robust and efficient gray-box discovery. Our findings provide actionable insights for improving the training of physics-informed networks in biomedical applications and beyond.
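To make the architectural idea concrete, here is a minimal JAX sketch of a Chebyshev-based KAN layer with an additional outer tanh. It is an interpretation of the abstract rather than the authors' implementation: the placement of the extra nonlinearity on the layer output, the initialization scale, and the names `chebyshev_basis`, `init_layer`, `tanh_cheb_layer`, and `forward` are assumptions made for illustration.

```python
import jax
import jax.numpy as jnp

def chebyshev_basis(x, degree):
    """Evaluate T_0..T_degree at x (expected in [-1, 1]) via the recurrence
    T_n(x) = 2 x T_{n-1}(x) - T_{n-2}(x). Output shape: x.shape + (degree + 1,)."""
    polys = [jnp.ones_like(x), x]
    for _ in range(2, degree + 1):
        polys.append(2.0 * x * polys[-1] - polys[-2])
    return jnp.stack(polys[: degree + 1], axis=-1)

def init_layer(key, n_in, n_out, degree):
    # Assumed initialization: small Gaussian coefficients scaled by fan-in.
    scale = 1.0 / (n_in * (degree + 1))
    return scale * jax.random.normal(key, (n_in, n_out, degree + 1))

def tanh_cheb_layer(coeffs, x, outer_tanh=True):
    """One layer: squash inputs into [-1, 1], expand each input in the
    Chebyshev basis, sum over inputs and degrees, then (assumed) apply tanh."""
    degree = coeffs.shape[-1] - 1
    basis = chebyshev_basis(jnp.tanh(x), degree)   # (batch, n_in, degree + 1)
    y = jnp.einsum("bid,iod->bo", basis, coeffs)   # (batch, n_out)
    return jnp.tanh(y) if outer_tanh else y

def forward(params, t):
    # Hidden layers use the outer tanh; the last layer is left linear in the basis.
    h = t
    for coeffs in params[:-1]:
        h = tanh_cheb_layer(coeffs, h, outer_tanh=True)
    return tanh_cheb_layer(params[-1], h, outer_tanh=False)

# Usage: a network mapping time t to two state variables.
key1, key2 = jax.random.split(jax.random.PRNGKey(0))
params = [init_layer(key1, 1, 8, 3), init_layer(key2, 8, 2, 3)]
t = jnp.linspace(0.0, 1.0, 5).reshape(-1, 1)
out = forward(params, t)
print(out.shape)
```

Setting `outer_tanh=False` in every layer gives a plain Chebyshev KAN layer in this sketch, which makes it easy to compare the two variants on the same problem.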

Paper Structure

This paper contains 28 sections, 25 equations, 16 figures, 13 tables.

Figures (16)

  • Figure 1: PINs: Physics-Informed Networks. Here $u$ and $u_0$ are observables at $t>0$ and $t=0$, respectively. $\hat{u}$ and $\hat{u}_0$ are the fields inferred by PINNs or PIKANs at $t>0$ and $t=0$, respectively.
  • Figure 2: PK model: Comparison of loss landscapes between cPIKANs and tanh-cPIKANs in the PCA subspace. Both models were trained for the same number of iterations (70k), and parameter snapshots were collected every 100 epochs. Subfigures (a) and (b) illustrate the 3D loss surface reconstructed from the top two PCA directions of parameter evolution. Introducing outer nonlinearities in tanh-cPIKANs smooths the loss surface and improves convergence and robustness.
  • Figure 3: Types of error in Physics-Informed Networks (PINNs and PIKANs).
  • Figure 4: Pharmacokinetics model.
  • Figure 5: Pharmacodynamics model.
  • ...and 11 more figures
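
The PCA-subspace loss surface described in the Figure 2 caption can be sketched roughly as follows. This is only an illustration under stated assumptions: the quadratic `loss_fn`, the plain gradient-descent loop, the grid extent, and the choice to center the grid on the final parameters are all hypothetical and are not taken from the paper.

```python
import jax
import jax.numpy as jnp
import numpy as np

# Hypothetical toy loss standing in for a physics-informed training loss.
A = jnp.diag(jnp.array([5.0, 2.0, 1.0, 0.5, 0.1]))

def loss_fn(theta):
    return 0.5 * theta @ A @ theta

grad_fn = jax.jit(jax.grad(loss_fn))

# Train with plain gradient descent and store a parameter snapshot every 100 steps.
theta = jnp.ones(5)
snapshots = []
for step in range(2000):
    theta = theta - 0.05 * grad_fn(theta)
    if step % 100 == 0:
        snapshots.append(np.asarray(theta))
snapshots = np.stack(snapshots)                      # (n_snapshots, n_params)

# PCA of the parameter trajectory via SVD of the mean-centered snapshots.
mean = snapshots.mean(axis=0)
_, _, vt = np.linalg.svd(snapshots - mean, full_matrices=False)
d1, d2 = vt[0], vt[1]                                # top two principal directions

# Evaluate the loss on a 2D grid spanned by d1 and d2 around the final parameters.
center = snapshots[-1]
alphas = np.linspace(-1.0, 1.0, 25)
betas = np.linspace(-1.0, 1.0, 25)
grid = center + alphas[:, None, None] * d1 + betas[None, :, None] * d2   # (25, 25, 5)
surface = jax.vmap(jax.vmap(loss_fn))(jnp.asarray(grid))                 # (25, 25)

# Project the trajectory onto the same plane so it can be overlaid on the surface.
trajectory_2d = (snapshots - center) @ np.stack([d1, d2]).T              # (n_snapshots, 2)
print(surface.shape, trajectory_2d.shape)
```

The array `surface` can then be passed to any 3D surface plotting routine, with `trajectory_2d` overlaid to show the optimizer's path.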