Chatting Up Attachment: Using LLMs to Predict Adult Bonds

Paulo Soares, Sean McCurdy, Andrew J. Gerber, Peter Fonagy

TL;DR

This work uses GPT-4 and Claude 3 Opus to create agents that simulate adults with varying profiles, childhood memories, and attachment styles. It finds that attachment-style prediction models trained only on the resulting synthetic interview data perform comparably to models trained on human data.

Abstract

Obtaining data in the medical field is challenging, making the adoption of AI technology within the space slow and high-risk. We evaluate whether we can overcome this obstacle with synthetic data generated by large language models (LLMs). In particular, we use GPT-4 and Claude 3 Opus to create agents that simulate adults with varying profiles, childhood memories, and attachment styles. These agents participate in simulated Adult Attachment Interviews (AAI), and we use their responses to train models for predicting their underlying attachment styles. We evaluate our models using a transcript dataset from 9 humans who underwent the same interview protocol, analyzed and labeled by mental health professionals. Our findings indicate that training the models using only synthetic data achieves performance comparable to training the models on human data. Additionally, while the raw embeddings from synthetic answers occupy a distinct space compared to those from real human responses, the introduction of unlabeled human data and a simple standardization allows for a closer alignment of these representations. This adjustment is supported by qualitative analyses and is reflected in the enhanced predictive accuracy of the standardized embeddings.
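The abstract does not spell out the "simple standardization" step. The sketch below shows one plausible reading, not necessarily the authors' procedure: each domain (synthetic and human) is z-scored per dimension with its own mean and standard deviation, where the human statistics need only unlabeled transcripts. The function names and the toy 2-D vectors are illustrative assumptions, not values from the paper.

```python
from statistics import fmean, pstdev


def fit_stats(vectors):
    """Per-dimension mean and population std of equal-length vectors."""
    columns = list(zip(*vectors))
    means = [fmean(col) for col in columns]
    # Fall back to 1.0 for constant dimensions so the division is defined.
    stds = [pstdev(col) or 1.0 for col in columns]
    return means, stds


def standardize(vectors, means, stds):
    """Z-score every vector with the given per-dimension statistics."""
    return [[(x - m) / s for x, m, s in zip(vec, means, stds)]
            for vec in vectors]


# Toy stand-ins for embeddings (assumption: real ones are high-dimensional).
synthetic = [[10.0, 0.0], [12.0, 2.0], [14.0, 4.0]]
human_unlabeled = [[-1.0, 5.0], [0.0, 6.0], [1.0, 7.0]]

# No labels are needed: only the mean and std of each domain are estimated.
syn_z = standardize(synthetic, *fit_stats(synthetic))
hum_z = standardize(human_unlabeled, *fit_stats(human_unlabeled))
```

After this step both sets are centered at zero with unit variance in every dimension, which removes a constant offset and scale difference between the two domains. It does not by itself guarantee that class clusters line up.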

Paper Structure (18 sections, 10 figures, 2 tables)

Figures (10)

  • Figure 1: The main components of the system. It begins with the creation and persistence of interviewee agents, each equipped with a unique profile and ten childhood memories. During a simulated Adult Attachment Interview (AAI), an interviewee agent updates its working memory (WM) based on the current context, which includes previous chat messages. The agent retrieves and ranks relevant childhood memories, which are then fed into the language module. This module incorporates the selected memories, the user profile, chat history, and the most recent AAI question into the prompt to generate an appropriate response. This process is repeated for each question in the interview until there are no more questions to ask.
  • Figure 2: Example of a user profile generated by GPT-4, following the user-profile prompt given in the paper's appendix.
  • Figure 3: Example of a childhood memory generated by GPT-4 for the user profile in Figure 2, following the childhood-memories prompt given in the paper's appendix.
  • Figure 4: The distribution of cosine similarities between all pairwise combinations of embeddings in the synthetic datasets generated by two different large language models (LLMs). We computed cosine similarities per attachment style.
  • Figure 5: 2D UMAP projections of synthetic (GPT-4) and human data embeddings. Both plots show clear clusters corresponding to the different attachment styles within the synthetic embeddings. In the left plot, however, the synthetic embeddings occupy a distinct region of the space, separate from the human embeddings. The right plot demonstrates the impact of standardizing synthetic embeddings using unlabeled human data, where the synthetic embeddings are now more closely grouped, and the attachment style clusters better align with the attachment styles of the labeled human dataset interviews.
  • ...and 5 more figures
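
The per-style pairwise comparison described in the Figure 4 caption can be sketched as follows. This is a minimal illustration under stated assumptions: the labels "style_a" and "style_b" and the toy vectors are hypothetical placeholders, and the paper's actual embedding model and plotting code are not shown in this excerpt.

```python
from collections import defaultdict
from itertools import combinations
from math import sqrt


def cosine(u, v):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (sqrt(sum(a * a for a in u)) * sqrt(sum(b * b for b in v)))


def pairwise_by_style(embeddings, styles):
    """Cosine similarity of every unordered pair within each style group."""
    groups = defaultdict(list)
    for vec, style in zip(embeddings, styles):
        groups[style].append(vec)
    return {style: [cosine(u, v) for u, v in combinations(vecs, 2)]
            for style, vecs in groups.items()}


# Hypothetical toy data: five embeddings with placeholder style labels.
embeddings = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [2.0, 0.0], [4.0, 0.0]]
styles = ["style_a", "style_a", "style_a", "style_b", "style_b"]

sims = pairwise_by_style(embeddings, styles)
```

Each list in `sims` holds the values that would feed one per-style histogram. A distribution concentrated near 1 would suggest that answers within that style are very similar to one another.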