Source Coverage and Citation Bias in LLM-based vs. Traditional Search Engines

Peixian Zhang, Qiming Ye, Zifan Peng, Kiran Garimella, Gareth Tyson

TL;DR

This paper delivers the first large-scale, cross-engine comparison of LLM-based search engines (LLM-SEs) and traditional search engines (TSEs), examining how each sources, cites, and presents information. Using 55,936 queries across six LLM-SEs and two TSEs, it reveals that LLM-SEs broaden domain coverage (37% of domains are unique to LLM-SEs) but generally do not beat TSEs in credibility, political neutrality, or safety. The study finds that LLM-SEs cite far fewer sources per response, yet their domain choices are highly variable and often less popular, with notable biases emerging in certain engines (e.g., prominent left-leaning tendencies and heavy reliance on a small set of domains). A feature-based analysis identifies HTML readability, domain popularity, and link structure as key predictors of which domains are uniquely cited by LLM-SEs, offering actionable guidance for users, publishers, and developers to improve transparency and trust. Overall, the work underscores the need for careful curation and transparent citation practices in LLM-SEs to balance coverage, credibility, and safety in information retrieval.

Abstract

LLM-based Search Engines (LLM-SEs) introduce a new paradigm for information seeking. Unlike Traditional Search Engines (TSEs) (e.g., Google), these systems summarize results, often providing limited citation transparency. The implications of this shift remain largely unexplored, yet they raise key questions regarding trust and transparency. In this paper, we present a large-scale empirical study of LLM-SEs, analyzing 55,936 queries and the corresponding search results across six LLM-SEs and two TSEs. We confirm that LLM-SEs cite domain resources with greater diversity than TSEs. Indeed, 37% of domains are unique to LLM-SEs. However, certain risks persist: LLM-SEs do not outperform TSEs in credibility, political neutrality, and safety metrics. Finally, to understand the selection criteria of LLM-SEs, we perform a feature-based analysis to identify key factors influencing source choice. Our findings provide actionable insights for end users, website owners, and developers.

Paper Structure

This paper contains 34 sections, 3 equations, 11 figures, 10 tables.

Figures (11)

  • Figure 1: The Pipeline of Data Collection
  • Figure 2: ECDFs: number of unique (a) websites and (b) domains per response across search engines.
  • Figure 3: Lorenz curve illustrating domain frequency distribution and corresponding Gini index.
  • Figure 4: Percentage of domain overlap: Unique indicates domains appearing in only one search engine type, while Common indicates domains shared between LLM-SEs and TSEs.
  • Figure 5: RTD of the news-relevant domain distribution between LLM-SEs and (a) Google and (b) Bing. Domains on the left are more likely to appear in LLM-SE responses, and domains on the right are more likely to appear in TSE responses.
  • ...and 6 more figures
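
The quantities named in the captions for Figures 3 and 4 (a Gini index over domain frequencies, and the share of domains that are unique to one engine type or common to both) can be illustrated with a minimal sketch. This is not the authors' code. The helper names (`domain_of`, `gini`, `overlap_shares`) are hypothetical, the domain is taken to be the URL host with a leading `www.` removed, and shares are computed over the union of both domain sets; the paper may define each of these differently.

```python
from urllib.parse import urlparse


def domain_of(url: str) -> str:
    """Return the lowercased host of a URL without a leading 'www.'.

    Assumption: the paper may instead use the registered domain.
    """
    host = urlparse(url).hostname or ""
    return host[4:] if host.startswith("www.") else host


def gini(counts: list[int]) -> float:
    """Gini index of non-negative counts: 0 means perfectly even."""
    xs = sorted(counts)
    n, total = len(xs), sum(xs)
    if n == 0 or total == 0:
        return 0.0
    # Rank-weighted form: G = 2 * sum(i * x_i) / (n * total) - (n + 1) / n,
    # with x_i sorted ascending and i running from 1 to n.
    weighted = sum(i * x for i, x in enumerate(xs, start=1))
    return 2 * weighted / (n * total) - (n + 1) / n


def overlap_shares(llm_domains: set[str], tse_domains: set[str]) -> dict[str, float]:
    """Share of domains unique to each engine type and common to both.

    Assumption: shares are taken over the union of the two sets.
    """
    union = llm_domains | tse_domains
    if not union:
        return {"unique_llm": 0.0, "unique_tse": 0.0, "common": 0.0}
    size = len(union)
    return {
        "unique_llm": len(llm_domains - tse_domains) / size,
        "unique_tse": len(tse_domains - llm_domains) / size,
        "common": len(llm_domains & tse_domains) / size,
    }
```

Under these assumptions, a Gini index near 0 means citations are spread evenly across domains, while a value near 1 means a few domains receive most citations.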