Supporting Medicinal Chemists in Iterative Hypothesis Generation for Drug Target Identification

Youngseung Jeon, Christopher Hwang, Ziwen Li, Taylor Le Lievre, Jesus J. Campagna, Cohn Whitaker, Varghese John, Eunice Jun, Xiang Anthony Chen

TL;DR

The paper tackles inefficiencies in target identification by introducing HAPPIER, an AI-driven interface that brings three criteria (physical and functional interactions, therapeutic impact, and docking potential) together in a single protein-protein interaction (PPI) graph. It combines retrieval-augmented generation and docking models to support divergent exploration and convergent validation, enabling iterative cycles of hypothesis generation. Formative and user studies indicate that experts produced more high-confidence hypotheses with HAPPIER and that engaging in iterative cycles was related to confidence in outputs. The paper also reports design insights on information layout, domain knowledge integration, and human–AI collaboration. The work demonstrates practical potential to accelerate drug-target discovery and offers design principles for AI-enabled scientific tools across health domains.

Abstract

While drug discovery is vital for human health, the process remains inefficient. Medicinal chemists must navigate a vast protein space to identify target proteins that meet three criteria: physical and functional interactions, therapeutic impact, and docking potential. Prior approaches have provided fragmented support for each criterion, limiting the generation of promising hypotheses for wet-lab experiments. We present HAPPIER, an AI-powered tool that supports hypothesis generation with integrated multi-criteria support for target identification. HAPPIER enables medicinal chemists to 1) efficiently explore and verify proteins in a single integrated graph component showing multi-criteria satisfaction and 2) validate AI suggestions with domain knowledge. These capabilities facilitate iterative cycles of divergent and convergent thinking, essential for hypothesis generation. We evaluated HAPPIER with ten medicinal chemists, finding that it increased the number of high-confidence hypotheses and support for the iterative cycle, and further demonstrated the relationship between engaging in such cycles and confidence in outputs.

Paper Structure

This paper contains 45 sections, 7 figures, 4 tables.

Figures (7)

  • Figure 1: Usage scenario of HAPPIER to support scientific discovery in Target ID. Users input an initial protein, a therapeutic impact, and a ligand (a). They then verify protein-protein interactions (PPIs) on graphs across all three criteria (C1, C2, and C3) through AI models (b-d). Users bookmark PPIs that are likely to satisfy all criteria (e). Users generate their own PPI graph by repeating the previous steps (f).
  • Figure 2: Overview of HAPPIER. (a) PPI-Graph Panel enables users to explore and validate PPIs across three criteria: C1) interaction potential, C2) therapeutic impact, and C3) docking potential. (b) Detail Panel provides supporting information to enable users to justify AI suggestions, including references and 3D docking simulations.
  • Figure 3: PPI-Bookmark mode enables users to build a personalized PPI graph that consists of bookmarked PPIs. This allows users to filter bookmarked PPIs by subgraph using the checkboxes on the right.
  • Figure 4: (a) The bar graph shows that support for divergent and convergent thinking was significantly higher with HAPPIER than in the control group. (b) The bar graph shows that the PPI outputs were significantly higher with HAPPIER than in the control group ($^{*}p<0.05$, $^{**}p<0.01$).
  • Figure 5: Process of categorizing the submitted PPIs. Step-1: generating design moves (DMs) by refining the transcripts from the user studies; Step-2: making a linkography with the DMs to identify the DMs of Divergent (orange) and Convergent (purple) thinking; Step-3: identifying PPIs in the DMs of divergent and convergent thinking; Step-4: labeling the submitted PPIs into three groups: (1) Both-DC: present in both divergent (D) and convergent (C) thinking, (2) Either-DC: present in either one but not both, and (3) Neither-DC: present in neither. We considered PPIs that appeared in the DMs but were not among the submitted PPIs (e.g., MARK2 and DCTN1) as those that experts had filtered out through a process of divergent and convergent thinking.
  • ...and 2 more figures
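
The labeling in Step-4 of Figure 5 amounts to set membership tests, and a minimal sketch may make the three groups concrete. This is an illustration only, not the authors' analysis code: the function names, the string-based PPI representation, and the placeholder identifiers `PPI_A`, `PPI_B`, and `PPI_C` are all assumptions. Only the group names and the MARK2 and DCTN1 examples come from the caption.

```python
def label_submitted_ppis(submitted, divergent, convergent):
    """Label each submitted PPI as Both-DC, Either-DC, or Neither-DC.

    `divergent` and `convergent` are the sets of PPIs found in design
    moves of divergent and convergent thinking (Step-3 in Figure 5).
    All names here are hypothetical.
    """
    labels = {}
    for ppi in submitted:
        in_d = ppi in divergent
        in_c = ppi in convergent
        if in_d and in_c:
            labels[ppi] = "Both-DC"      # present in both D and C
        elif in_d or in_c:
            labels[ppi] = "Either-DC"    # present in exactly one of them
        else:
            labels[ppi] = "Neither-DC"   # present in neither
    return labels


def filtered_out(submitted, divergent, convergent):
    """PPIs that appeared in the design moves but were not submitted."""
    return (set(divergent) | set(convergent)) - set(submitted)


# Hypothetical example data; only MARK2 and DCTN1 are named in the caption.
submitted = ["PPI_A", "PPI_B", "PPI_C"]
divergent = {"PPI_A", "PPI_B", "MARK2"}
convergent = {"PPI_A", "DCTN1"}

labels = label_submitted_ppis(submitted, divergent, convergent)
dropped = filtered_out(submitted, divergent, convergent)
```

With this example data, `PPI_A` falls in Both-DC, `PPI_B` in Either-DC, and `PPI_C` in Neither-DC, while MARK2 and DCTN1 are treated as filtered out because they appear in the design moves but not among the submitted PPIs.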