Can ChatGPT Learn My Life From a Week of First-Person Video?
Keegan Harris
TL;DR
This work investigates whether foundation models can learn meaningful personal information from passively collected first-person video. By collecting $54$ hours of egocentric footage over a week and creating a hierarchy of time-stamped summaries, the author fine-tunes GPT-4o and GPT-4o-mini within a $100$ budget to produce a personalized system, KeeganGPT. The results show that both models can recover basic attributes (e.g., gender, approximate age) and some contextual details (e.g., residence in Pittsburgh, CMU affiliation, right-handedness, pet), with GPT-4o generally performing more accurately than GPT-4o-mini, but both exhibiting hallucinations such as invented names and occasionally incorrect personality inferences. The study highlights meaningful learnings and clear risks—confident misremembering and bias—from passively collected personal data, underscoring privacy, reliability, and governance considerations for future wearable AI and low-cost personalization. It also points to future work on longer, richer datasets, multimodal signals, and direct video-based reasoning to improve fidelity while addressing ethical and practical concerns.
Abstract
Motivated by recent improvements in generative AI and wearable camera devices (e.g. smart glasses and AI-enabled pins), I investigate the ability of foundation models to learn about the wearer's personal life through first-person camera data. To test this, I wore a camera headset for 54 hours over the course of a week, generated summaries of various lengths (e.g. minute-long, hour-long, and day-long summaries), and fine-tuned both GPT-4o and GPT-4o-mini on the resulting summary hierarchy. By querying the fine-tuned models, we are able to learn what the models learned about me. The results are mixed: Both models learned basic information about me (e.g. approximate age, gender). Moreover, GPT-4o correctly deduced that I live in Pittsburgh, am a PhD student at CMU, am right-handed, and have a pet cat. However, both models also suffered from hallucination and would make up names for the individuals present in the video footage of my life.
