GEO: Generative Engine Optimization

Pranjal Aggarwal, Vishvak Murahari, Tanmay Rajpurohit, Ashwin Kalyan, Karthik Narasimhan, Ameet Deshpande

TL;DR

This work formalizes Generative Engine Optimization (GEO), a black-box optimization framework that helps content creators improve their visibility in generative engine (GE) responses. It introduces a concrete set of impression metrics, including $Imp_{wc}$, $Imp_{pwc}$, and $SubjectiveImpression$, and defines a consumer-centric objective that balances citation relevance and presence within GE outputs. The authors evaluate GEO on GEO-bench, a large, diverse 10K-query benchmark drawn from nine datasets, and report visibility gains of up to 40% in GE responses. Effectiveness varies by domain, and combining GEO strategies yields further gains. They also test the methods on Perplexity.ai, a deployed GE, and discuss implications for the creator economy, providing a foundation for future domain-specific optimization and broader adoption in real-world GE ecosystems.

Abstract

The advent of large language models (LLMs) has ushered in a new paradigm of search engines that use generative models to gather and summarize information to answer user queries. This emerging technology, which we formalize under the unified framework of generative engines (GEs), can generate accurate and personalized responses, rapidly replacing traditional search engines like Google and Bing. Generative Engines typically satisfy queries by synthesizing information from multiple sources and summarizing them using LLMs. While this shift significantly improves $\textit{user}$ utility and $\textit{generative search engine}$ traffic, it poses a huge challenge for the third stakeholder -- website and content creators. Given the black-box and fast-moving nature of generative engines, content creators have little to no control over $\textit{when}$ and $\textit{how}$ their content is displayed. With generative engines here to stay, we must ensure the creator economy is not disadvantaged. To address this, we introduce Generative Engine Optimization (GEO), the first novel paradigm to aid content creators in improving their content visibility in generative engine responses through a flexible black-box optimization framework for optimizing and defining visibility metrics. We facilitate systematic evaluation by introducing GEO-bench, a large-scale benchmark of diverse user queries across multiple domains, along with relevant web sources to answer these queries. Through rigorous evaluation, we demonstrate that GEO can boost visibility by up to $40\%$ in generative engine responses. Moreover, we show the efficacy of these strategies varies across domains, underscoring the need for domain-specific optimization methods. Our work opens a new frontier in information discovery systems, with profound implications for both developers of generative engines and content creators.

Paper Structure

This paper contains 36 sections, 5 equations, 4 figures, and 7 tables.

Figures (4)

  • Figure 1: Our proposed Generative Engine Optimization (GEO) method optimizes websites to boost their visibility in Generative Engine responses. GEO's black-box optimization framework then enables the website owner of the pizza website, which lacked visibility originally, to optimize their website to increase visibility under Generative Engines. Further, GEO's general framework allows content creators to define and optimize their custom visibility metrics, giving them greater control in this new emerging paradigm.
  • Figure 2: Overview of Generative Engines. Generative Engines primarily consist of a set of generative models and a search engine to retrieve relevant documents. Generative Engines take a user query as input and, through a series of steps, generate a final response that is grounded in the retrieved sources with inline attributions.
  • Figure 3: Ranking and visibility metrics are straightforward in traditional search engines, which list website sources in ranked order with verbatim content. However, Generative Engines generate rich, structured responses, often embedding citations in a single block interleaved with each other. This makes ranking and visibility nuanced and multi-faceted. Further, unlike search engines, where significant research has been conducted on improving visibility, how to optimize visibility in generative engine responses remains unclear. To address these challenges, our black-box optimization framework proposes a series of well-designed impression metrics that creators can use to gauge and optimize their website's performance, and also allows creators to define their own impression metrics.
  • Figure 4: Relative improvement from using combinations of GEO strategies. Using Fluency Optimization and Statistics Addition together yields the highest performance. The rightmost column shows that combining Fluency Optimization with other strategies is most beneficial.
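
The impression metrics mentioned in the Figure 3 caption and the relative improvement plotted in Figure 4 can be illustrated with a minimal Python sketch. This is not the authors' implementation. It assumes that a response is a list of sentences, each tagged with the ids of the sources it cites; that a word-count impression is the share of response words attributed to a source; that the position-adjusted variant down-weights later sentences with an exponential decay; and that credit is split equally among co-cited sources. The function names, the data layout, and the decay form are all hypothetical choices made for illustration.

```python
import math

def word_count_impression(response, source_id, position_adjusted=False):
    """Share of the response's words attributed to `source_id`.

    `response` is a list of (sentence_text, cited_source_ids) pairs.
    This layout is an assumption made for this sketch.
    """
    n_sentences = len(response)
    total_words = sum(len(text.split()) for text, _ in response)
    if total_words == 0:
        return 0.0

    score = 0.0
    for pos, (text, cited) in enumerate(response):
        if source_id not in cited:
            continue
        # Assumption: credit is split equally among co-cited sources.
        words = len(text.split()) / len(cited)
        if position_adjusted:
            # Assumption: exponential decay so earlier sentences count more.
            words *= math.exp(-pos / n_sentences)
        score += words
    return score / total_words

def relative_improvement(before, after):
    """Percentage change in an impression score after optimizing a source."""
    if before == 0:
        raise ValueError("baseline impression must be non-zero")
    return (after - before) / before * 100.0

if __name__ == "__main__":
    # Toy response: three sentences citing two hypothetical sources.
    toy_response = [
        ("a b c d", ["s1"]),
        ("e f", ["s1", "s2"]),
        ("g h i j", ["s2"]),
    ]
    for sid in ("s1", "s2"):
        plain = word_count_impression(toy_response, sid)
        adjusted = word_count_impression(toy_response, sid, position_adjusted=True)
        print(sid, round(plain, 3), round(adjusted, 3))
```

In this toy response both sources receive the same plain word-count share, but the position-adjusted score favors the source cited earlier. This is the kind of nuance the Figure 3 caption describes for interleaved citations.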