SynRL: Aligning Synthetic Clinical Trial Data with Human-preferred Clinical Endpoints Using Reinforcement Learning

Trisha Das; Zifeng Wang; Afrah Shafquat; Mandis Beigi; Jason Mezey; Jacob Aptekar; Jimeng Sun

TL;DR

SynRL addresses privacy-constrained sharing of clinical trial data by learning to generate synthetic data aligned with user-defined clinical endpoints through reinforcement learning. It introduces a data value critic that scores synthetic records by downstream task utility and fidelity, and uses a PPO-based RL loop to update a base generator (e.g., TVAE or CTGAN). The approach yields higher downstream utility while maintaining fidelity and privacy across four real trial datasets and is applicable as a general framework to other generators. The work also discusses limitations and future directions, including incorporating privacy into the reward and extending to sequential data, with open-source code available for replication.

Abstract

Each year, hundreds of clinical trials are conducted to evaluate new medical interventions, but sharing patient records from these trials with other institutions can be challenging due to privacy concerns and federal regulations. To help mitigate privacy concerns, researchers have proposed methods for generating synthetic patient data. However, existing approaches for generating synthetic clinical trial data disregard how the data will be used, including the need to maintain specific properties of clinical outcomes, and rely only on post hoc assessments that are not coupled with the data generation process. In this paper, we propose SynRL, which leverages reinforcement learning to improve the performance of patient data generators by customizing the generated data to meet user-specified requirements for synthetic data outcomes and endpoints. Our method includes a data value critic function that evaluates the quality of the generated data, and uses reinforcement learning to align the data generator with the users' needs based on the critic's feedback. We performed experiments on four clinical trial datasets and demonstrated the advantages of SynRL in improving the quality of the generated synthetic data while keeping the privacy risks low. We also show that SynRL can be used as a general framework for customizing the output of multiple types of synthetic data generators. Our code is available at https://anonymous.4open.science/r/SynRL-DB0F/.
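
To make the critic-and-feedback idea concrete, here is a minimal toy sketch. It is not the SynRL implementation. The "generator" is assumed to be a one-dimensional Gaussian with a learnable mean, the critic's utility and fidelity terms are assumed quadratic forms, and the update is plain REINFORCE with a batch-mean baseline rather than the PPO update the paper uses. All names (`critic`, `align_generator`, `endpoint_target`, `real_mean`, `lam`) are hypothetical.

```python
import numpy as np

def critic(x, endpoint_target, real_mean, lam=1.0):
    """Toy stand-in for a data value critic: one score per synthetic record.

    utility  : closeness to a user-specified endpoint value (assumed form)
    fidelity : closeness to the real-data mean (assumed form)
    """
    utility = -(x - endpoint_target) ** 2
    fidelity = -(x - real_mean) ** 2
    return utility + lam * fidelity

def align_generator(mu_init, endpoint_target, real_mean, lam=1.0,
                    sigma=1.0, lr=0.05, steps=300, batch=256, seed=0):
    """Fine-tune the mean of a Gaussian 'generator' with REINFORCE (not PPO)."""
    rng = np.random.default_rng(seed)
    mu = float(mu_init)
    for _ in range(steps):
        x = rng.normal(mu, sigma, size=batch)   # sample synthetic records
        r = critic(x, endpoint_target, real_mean, lam)
        advantage = r - r.mean()                # batch-mean baseline
        score = (x - mu) / sigma ** 2           # d/dmu of log N(x; mu, sigma)
        mu += lr * float(np.mean(advantage * score))
    return mu

# The generator starts at the real-data mean and is pulled toward the endpoint.
mu_aligned = align_generator(mu_init=0.0, endpoint_target=2.0, real_mean=0.0)
```

In this toy setting the expected reward is maximized at `(endpoint_target + lam * real_mean) / (1 + lam)`, which is 1.0 for the values above, so the learned mean should settle near that compromise between the endpoint preference and fidelity to the real data.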
