Inferring Chronic Treatment Onset from ePrescription Data: A Renewal Process Approach

Pavlin G. Poličar, Dalibor Stanimirović, Blaž Zupan

TL;DR

The paper proposes a probabilistic framework that infers chronic treatment onset by modeling prescription dynamics as a renewal process. Transitions from sporadic to sustained therapy are detected via change-point detection between a baseline Poisson (sporadic prescribing) regime and a regime-specific Weibull (sustained therapy) renewal model.

Abstract

Longitudinal electronic health record (EHR) data are often left-censored, making diagnosis records incomplete and unreliable for determining disease onset. In contrast, outpatient prescriptions form renewal-based trajectories that provide a continuous signal of disease management. We propose a probabilistic framework to infer chronic treatment onset by modeling prescription dynamics as a renewal process and detecting transitions from sporadic to sustained therapy via change-point detection between a baseline Poisson (sporadic prescribing) regime and a regime-specific Weibull (sustained therapy) renewal model. Using a nationwide ePrescription dataset of 2.4 million individuals, we show that the approach yields more temporally plausible onset estimates than naive rule-based triggering, substantially reducing implausible early detections under strong left censoring. Detection performance varies across diseases and is strongly associated with prescription density, highlighting both the strengths and limits of treatment-based onset inference.
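
To make the change-point idea concrete, the following is a minimal sketch, not the authors' implementation. It assumes a single patient-drug sequence of inter-prescription gaps in days, a single change point, and fixed, hypothetical parameter values (`rate`, `k`, `lam`); the paper's own estimation procedure is not described in this summary. Gaps before the change point are scored under an exponential density (the inter-arrival distribution of a Poisson process), and gaps from the change point onward under a Weibull density. The change point that maximizes the total log-likelihood is returned, provided it beats the all-sporadic model.

```python
import math


def exp_logpdf(x, rate):
    # Exponential gap density: inter-arrival times of a homogeneous Poisson process.
    return math.log(rate) - rate * x


def weibull_logpdf(x, k, lam):
    # Weibull density: (k / lam) * (x / lam)**(k - 1) * exp(-(x / lam)**k), for x > 0.
    z = x / lam
    return math.log(k / lam) + (k - 1) * math.log(z) - z ** k


def detect_change_point(gaps, rate, k, lam):
    """Return the index of the first gap assigned to the sustained regime,
    or None if the all-sporadic model has the highest log-likelihood.

    All parameter values are assumed to be given; they are hypothetical here.
    """
    baseline = [exp_logpdf(x, rate) for x in gaps]
    sustained = [weibull_logpdf(x, k, lam) for x in gaps]

    best_tau = None
    best_ll = sum(baseline)  # model with no change point
    for tau in range(len(gaps)):
        # Gaps before tau are sporadic; gaps from tau onward are sustained.
        ll = sum(baseline[:tau]) + sum(sustained[tau:])
        if ll > best_ll:
            best_tau, best_ll = tau, ll
    return best_tau


# Hypothetical gaps (days): irregular prescribing followed by roughly monthly renewals.
gaps = [200, 35, 410, 150, 30, 29, 31, 30, 28, 30]
onset_index = detect_change_point(gaps, rate=1 / 180, k=4.0, lam=32.0)
```

With these illustrative values, the isolated 35-day gap early in the sequence does not trigger detection, because placing the change point there would force the following long gaps under the Weibull regime, where they are extremely unlikely. A rule that fires on the first short gap would behave differently, which is the kind of implausibly early detection the abstract describes for naive triggering.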

Paper Structure (13 sections, 5 equations, 3 figures)

Figures (3)

  • Figure 1: Weibull parameter estimates across drugs; each point represents one drug. (a) Shape parameter estimates from all prescription intervals versus chronically labeled intervals. (b) Shape $k$ and scale $\lambda$ stratified by renewable and non-renewable prescriptions; $k>1$ indicates regular (non-Poisson) prescribing. Point size reflects the number of intervals.
  • Figure 2: Distribution of differences between recorded diagnosis date and inferred treated-phenotype onset date for the naive and change-point methods. Violin plots are ordered from left to right by decreasing mean difference. Negative values indicate inferred onset preceding the recorded diagnosis date.
  • Figure 3: Disease-level recall of naive and renewal-based onset detection. (a) Recall as a function of the symmetric temporal tolerance window around the diagnosis date. Curves show the median recall across ICD codes and shaded regions show the interquartile range (IQR). (b) Relationship between disease-level recall in the $\pm 365$ day window around the true diagnosis date and prescription density, measured as the median number of prescriptions per diagnosed patient. Each point represents one ICD code.
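
The recall measure in Figure 3 can be illustrated with a short sketch. It assumes, as a working definition not confirmed by this summary, that a diagnosed patient counts as detected when the inferred onset falls within a symmetric tolerance window around the recorded diagnosis date, and that patients with no inferred onset count as misses. The function name and the input values are hypothetical. Differences follow the sign convention of Figure 2: negative values mean the inferred onset precedes the recorded diagnosis.

```python
def recall_within_window(day_diffs, tolerance_days):
    """Fraction of diagnosed patients whose inferred onset lies within
    +/- tolerance_days of the recorded diagnosis date.

    day_diffs: inferred onset minus recorded diagnosis, in days;
               None marks a patient for whom no onset was inferred.
    """
    if not day_diffs:
        return 0.0
    hits = sum(
        1 for d in day_diffs
        if d is not None and abs(d) <= tolerance_days
    )
    return hits / len(day_diffs)


# Hypothetical differences (days) for six diagnosed patients of one ICD code.
diffs = [-400, -30, 10, None, 365, 900]
recall_365 = recall_within_window(diffs, 365)
recall_30 = recall_within_window(diffs, 30)
```

Evaluating this function over a range of tolerance values for each ICD code, then taking the median and interquartile range across codes, would give curves of the kind described for panel (a).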