Artificial Intelligence Clones

Annie Liang

TL;DR

The paper introduces a formal framework for comparing AI-mediated search, which relies on imperfect representations of individuals, with traditional in-person search. It shows that, for any fixed dimensionality $k$, the AI-equivalent sample size $m^*_k$ is finite: meeting that many people in person yields a better expected match than AI search over an arbitrarily large candidate pool. As $k$ grows large, two in-person meetings suffice to beat even an arbitrarily large AI platform. The analysis reveals a counterintuitive high-dimensional effect: the value of AI-mediated search collapses when many attributes determine match quality, because of noise in AI representations and the concentration of distances in high dimensions. A data-heterogeneity extension shows a systematic bias in AI-driven matching toward data-rich individuals, which implies potential new forms of social stratification. Overall, AI representations are most beneficial when compatibility is governed by a small set of attributes, while direct human judgment remains essential for complex, multidimensional matching tasks.

Abstract

Large language models, trained on personal data, are increasingly able to mimic individual personalities. These "AI clones" or "AI agents" have the potential to transform how people search for matches in contexts ranging from marriage to employment. This paper presents a theoretical framework to study the tradeoff between the substantially expanded search capacity of AI representations and their imperfect representation of humans. An individual's personality is modeled as a point in $k$-dimensional Euclidean space, and an individual's AI representation is modeled as a noisy approximation of that personality. I compare two search regimes: under in-person search, each person randomly meets some number of individuals and matches to the most compatible among them; under AI-mediated search, individuals match to the person with the most compatible AI representation. I show that a finite number of in-person encounters yields a better expected match than search over infinite AI representations. Moreover, when personality is sufficiently high-dimensional, simply meeting two people in person is more effective than search on an AI platform, regardless of the size of its candidate pool.
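The two search regimes described in the abstract can be explored with a small Monte Carlo sketch. This is only an illustration under assumptions that this page does not state: personalities are drawn uniformly from the unit ball in $k$ dimensions, the target sits at the origin, and representation noise is independent Gaussian with standard deviation `sigma` per coordinate, drawn separately for the candidate and for the target in each pairing. The function names and the parameter values in the demo are hypothetical.

```python
import math
import random


def sample_unit_ball(k, rng):
    """Draw one point uniformly from the unit ball in R^k (assumed distribution)."""
    g = [rng.gauss(0.0, 1.0) for _ in range(k)]
    norm = math.sqrt(sum(v * v for v in g))
    radius = rng.random() ** (1.0 / k)
    return [radius * v / norm for v in g]


def in_person_distance(k, m, rng):
    """True distance to the best of m randomly met people (target at the origin)."""
    best = float("inf")
    for _ in range(m):
        x = sample_unit_ball(k, rng)
        best = min(best, math.sqrt(sum(v * v for v in x)))
    return best


def ai_distance(k, n, sigma, rng):
    """True distance to the candidate whose noisy representation looks closest.

    Assumption: the platform sees (x + eps) for the candidate and (0 + eps')
    for the target, with independent Gaussian noise in every coordinate.
    """
    best_observed = float("inf")
    best_true = float("inf")
    for _ in range(n):
        x = sample_unit_ball(k, rng)
        observed = math.sqrt(
            sum((v + rng.gauss(0.0, sigma) - rng.gauss(0.0, sigma)) ** 2 for v in x)
        )
        if observed < best_observed:
            best_observed = observed
            best_true = math.sqrt(sum(v * v for v in x))
    return best_true


def average(fn, draws, *args):
    """Average fn(*args) over independent draws."""
    return sum(fn(*args) for _ in range(draws)) / draws


if __name__ == "__main__":
    rng = random.Random(0)
    k, sigma, n, draws = 10, 0.5, 200, 200  # illustrative values only
    print("in-person, m=2:", average(in_person_distance, draws, k, 2, rng))
    print("AI-mediated, N=%d:" % n, average(ai_distance, draws, k, n, sigma, rng))
```

Varying `k`, `sigma`, and `n` in the demo gives a rough feel for the tradeoff between pool size and representation noise; the numbers it prints depend entirely on the assumed distributions and are not the paper's results.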

Paper Structure

This paper contains 38 sections, 15 theorems, 130 equations, 3 figures, 1 table.

Key Result

Proposition 3.1

For every number of dimensions $k \in \mathbb{Z}_+$, the AI-equivalent sample size $m^*_k$ is finite.
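One way to read this statement uses the notation from the Figure 3 caption, where $d^{\text{IP}}_k(m)$ and $d^{\text{AI}}_k(N)$ denote the expected distance to the chosen match under in-person search over $m$ people and AI-mediated search over $N$ candidates. The paper's formal definition is not reproduced on this page, so the following is an assumed paraphrase of the abstract; in particular, the use of a limit in $N$ and of a weak rather than strict inequality are assumptions.

$$
m^*_k \;=\; \min\Bigl\{\, m \in \mathbb{Z}_+ \;:\; d^{\text{IP}}_k(m) \,\le\, \lim_{N \to \infty} d^{\text{AI}}_k(N) \Bigr\}
$$

Under this reading, the proposition says the set on the right is nonempty for every $k$: some finite number of in-person meetings does at least as well as AI search over an unboundedly large pool.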

Figures (3)

  • Figure 1: Replicant helps job candidates create AI clones to represent them in conversations with recruiters and potential employees. Source: Replicant website (Replicant, 2024).
  • Figure 2: Rather than observing $\|X_2-x_0\|$, the true distance between individual $i=2$ and the target, the platform observes $\|Y_{20}-Y_{02}\|$, the distance between their representations.
  • Figure 3: This figure reports estimates of $d^{\text{IP}}_k(2)$ and $d^{\text{AI}}_k(N)$ for $\sigma=0.05$ and $N=10,000$, using their average values over 1,000 draws. The quantity $d^{\text{AI}}_k(N)$ is initially below $d^{\text{IP}}_k(2)$ and eventually overtakes it. Both $d^{\text{IP}}_k(2)$ and $d^{\text{AI}}_k(N)$ converge to $k/(k+1)$ from below.

Theorems & Definitions (28)

  • Definition 2.1
  • Example 2.1: Hiring
  • Example 2.2: Dating
  • Example 2.3: Purchasing a Home
  • Proposition 3.1
  • Theorem 3.1
  • Lemma 3.1
  • Proposition 4.1
  • Corollary 4.1: Many Dimensions
  • Corollary 4.2: Large Noise Disparity
  • ...and 18 more