A Benchmark and Knowledge-Grounded Framework for Advanced Multimodal Personalization Study
Xia Hu, Honglei Zhuang, Brian Potetz, Alireza Fathi, Bo Hu, Babak Samari, Howard Zhou
TL;DR
This work addresses the challenge of evaluating and enabling advanced multimodal personalization by introducing Life-Bench, a fully synthetic benchmark of virtual accounts with multimodal histories, and LifeGraph, a retrieval-enhanced personal knowledge graph framework. Life-Bench probes complex relational, temporal, and aggregative reasoning over personalized histories, while LifeGraph provides a structured, graph-based retrieval mechanism that grounds multimodal data for personalized reasoning. Empirical results show substantial gaps for existing retrieval-based methods on complex tasks and demonstrate LifeGraph’s strong performance, particularly in relational-temporal reasoning, highlighting the value of graph-structured context and explicit data provenance in personalization. Together, the benchmark and framework offer a privacy-preserving, scalable pathway toward real-world personalized AI that reasons over evolving, multimodal personal histories.
Abstract
The powerful reasoning of modern Vision Language Models opens a new frontier for advanced personalization study. However, progress in this area is critically hampered by the lack of suitable benchmarks. To address this gap, we introduce Life-Bench, a comprehensive, synthetically generated multimodal benchmark built on simulated user digital footprints. Life-Bench features over questions evaluating a wide spectrum of capabilities, from persona understanding to complex reasoning over historical data. These capabilities extend far beyond those covered by prior benchmarks and reflect the demands of real-world applications. Furthermore, we propose LifeGraph, an end-to-end framework that organizes personal context into a knowledge graph to facilitate structured retrieval and reasoning. Our experiments on Life-Bench reveal that existing methods falter significantly on complex personalized tasks, exposing large performance headroom, especially in relational, temporal, and aggregative reasoning. While LifeGraph closes this gap by leveraging structured knowledge and demonstrates a promising direction, these advanced personalization tasks remain a critical open challenge, motivating new research in this area.
