Knowledgeable Language Models as Black-Box Optimizers for Personalized Medicine
Michael S. Yao, Osbert Bastani, Alma Andersson, Tommaso Biancalani, Aïcha Bentaieb, Claudia Iriondo
TL;DR
LEON reframes personalized medicine as constrained conditional black-box optimization under distribution shift and leverages LLMs with domain knowledge to propose patient-specific treatments without fine-tuning. By introducing entropy-based and Wasserstein-distance constraints via a source critic, LEON guides the optimizer toward in-distribution, high-certainty designs and formalizes an efficient four-step algorithm that updates prompting context rather than model weights. Empirically, LEON outperforms both traditional optimizers and other LLM-based methods across five real-world tasks, while ablations reveal the crucial roles of prior knowledge, embedding choice, and iterative feedback. The work demonstrates a practical pathway for integrating knowledge-rich LLMs into clinical design tasks while preserving patient privacy and supporting responsible evaluation, with future directions including active learning and multi-objective optimization.
Abstract
The goal of personalized medicine is to discover a treatment regimen that optimizes a patient's clinical outcome based on their personal genetic and environmental factors. However, candidate treatments cannot be arbitrarily administered to the patient to assess their efficacy; we often instead have access to an in silico surrogate model that approximates the true fitness of a proposed treatment. Unfortunately, such surrogate models have been shown to fail to generalize to previously unseen patient-treatment combinations. We hypothesize that domain-specific prior knowledge, such as medical textbooks and biomedical knowledge graphs, can provide a meaningful alternative signal of the fitness of proposed treatments. To this end, we introduce LLM-based Entropy-guided Optimization with kNowledgeable priors (LEON), a mathematically principled approach to leverage large language models (LLMs) as black-box optimizers without any task-specific fine-tuning, taking advantage of their ability to contextualize unstructured domain knowledge to propose personalized treatment plans in natural language. In practice, we implement LEON via 'optimization by prompting,' which uses LLMs as stochastic engines for proposing treatment designs. Experiments on real-world optimization tasks show that LEON outperforms both traditional and LLM-based methods in proposing individualized treatments for patients.
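To make the idea of 'optimization by prompting' concrete, the following is a minimal sketch of a generic prompt-and-score loop. It is not LEON itself: it omits the entropy and Wasserstein-distance constraints and the source critic. Every name here (`build_prompt`, `mock_llm`, `optimize_by_prompting`), the scalar "dose" design space on [0, 10], and the toy surrogate are assumptions made for illustration only. The `mock_llm` function is a hypothetical stand-in for a real LLM call and simply perturbs the best design it finds in the prompt text.

```python
import random
import re
from typing import Callable, List, Tuple

History = List[Tuple[float, float]]  # (design, surrogate score) pairs


def build_prompt(patient: str, knowledge: str, history: History, top_k: int = 5) -> str:
    """Assemble the context: patient description, prior knowledge, best designs so far."""
    best = sorted(history, key=lambda pair: pair[1], reverse=True)[:top_k]
    lines = [f"dose={x:.3f} score={s:.3f}" for x, s in best]
    return (
        f"Patient: {patient}\nKnowledge: {knowledge}\n"
        "Previously scored designs:\n" + "\n".join(lines) +
        "\nPropose new doses that score higher."
    )


def mock_llm(prompt: str, n: int, rng: random.Random) -> List[float]:
    """Hypothetical stand-in for an LLM: perturb the best design listed in the prompt."""
    scored = [(float(x), float(s)) for x, s in
              re.findall(r"dose=(-?\d+\.\d+) score=(-?\d+\.\d+)", prompt)]
    center = max(scored, key=lambda pair: pair[1])[0] if scored else 0.0
    # Clip proposals to the assumed design space [0, 10].
    return [min(10.0, max(0.0, rng.gauss(center, 1.0))) for _ in range(n)]


def optimize_by_prompting(
    surrogate: Callable[[float], float],
    patient: str,
    knowledge: str,
    rounds: int = 20,
    batch: int = 8,
    seed: int = 0,
) -> Tuple[float, float]:
    """Update the prompt context each round; no model weights are changed."""
    rng = random.Random(seed)
    history: History = [(x, surrogate(x)) for x in (0.0, 10.0)]  # initial designs
    for _ in range(rounds):
        prompt = build_prompt(patient, knowledge, history)
        proposals = mock_llm(prompt, batch, rng)
        history.extend((x, surrogate(x)) for x in proposals)
    return max(history, key=lambda pair: pair[1])


if __name__ == "__main__":
    # Toy surrogate (an assumption): fitness peaks at a dose of 5.
    best_dose, best_score = optimize_by_prompting(
        lambda x: -(x - 5.0) ** 2, "toy patient", "toy prior knowledge"
    )
    print(best_dose, best_score)
```

The only state carried between rounds is the scored history that is re-serialized into the prompt, which mirrors the point that the context is updated rather than the model. In a real setting the surrogate's scores may be unreliable for unseen patient-treatment combinations, which is the failure mode the constraints described in the paper are meant to address.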
