Structural Feature Engineering for Generative Engine Optimization: How Content Structure Shapes Citation Behavior

Junwei Yu, Mufeng Yang, Yepeng Ding, Hiroyuki Sato

Abstract

The proliferation of AI-powered search engines has shifted information discovery from traditional link-based retrieval to direct answer generation with selective source citation, creating new challenges for content visibility. While existing Generative Engine Optimization (GEO) approaches focus primarily on semantic content modification, the role of structural features in influencing citation behavior remains underexplored. In this paper, we propose GEO-SFE, a systematic framework for structural feature engineering in generative engine optimization. Our approach decomposes content structure into three hierarchical levels: macro-structure (document architecture), meso-structure (information chunking), and micro-structure (visual emphasis), and models their impact on citation probability across different generative engine architectures. We develop architecture-aware optimization strategies and predictive models that preserve semantic integrity while improving structural effectiveness. Experimental evaluation across six mainstream generative engines demonstrates consistent improvements in citation rate (17.3 percent) and subjective quality (18.5 percent), validating the effectiveness and generalizability of the proposed framework. This work establishes structural optimization as a foundational component of GEO, providing a data-driven methodology for enhancing content visibility in LLM-powered information ecosystems.

Paper Structure

This paper contains 20 sections, 24 equations, 2 figures, 6 tables, 5 algorithms.

Figures (2)

  • Figure 1: The workflow of Structural Feature Engineering Citation Performance Assessment. The workflow demonstrates the quantitative evaluation methodology for measuring optimization effectiveness, showing how the theoretical visibility function decomposes into empirically measurable components (coverage, position, influence) and their integration into final citation performance metrics.
  • Figure 2: GEO-SFE Framework Architecture. The figure shows the three-tier hierarchical structure with Macro, Meso, and Micro levels, connected to feature extraction, optimization, and validation components. The framework follows a four-stage pipeline: (1) Feature Extraction analyzes existing content to quantify structural characteristics across all hierarchical levels, (2) Structural Analysis identifies optimization opportunities through cross-platform citation pattern analysis, (3) Optimization applies algorithmic transformations to enhance structural features while preserving semantic integrity, and (4) Validation measures citation performance improvements across target generative engines.
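
To make the first pipeline stage in Figure 2 concrete, the following is a minimal sketch of what feature extraction over the three structural levels might look like. It is not the paper's implementation: the feature names, the `StructuralFeatures` type, the `extract_features` function, and the assumption that content arrives as Markdown are all hypothetical choices made for illustration.

```python
import re
from dataclasses import dataclass


@dataclass
class StructuralFeatures:
    """Hypothetical feature set; the paper's actual features may differ."""
    heading_count: int       # macro-structure: document architecture
    max_heading_depth: int   # macro-structure: deepest heading level used
    chunk_count: int         # meso-structure: number of body blocks
    mean_chunk_words: float  # meso-structure: average block length in words
    emphasis_count: int      # micro-structure: bold spans (visual emphasis)


def extract_features(markdown_text: str) -> StructuralFeatures:
    """Quantify simple structural characteristics of a Markdown document."""
    # Macro: ATX-style headings such as "# Title" or "## Section".
    depths = [
        len(match.group(1))
        for line in markdown_text.splitlines()
        if (match := re.match(r"^(#{1,6})\s+\S", line))
    ]

    # Meso: blank-line-separated blocks, skipping blocks that begin with "#".
    # This is a simplification: a body block starting with "#" is also skipped.
    blocks = [b.strip() for b in re.split(r"\n\s*\n", markdown_text) if b.strip()]
    chunks = [b for b in blocks if not b.startswith("#")]
    word_counts = [len(chunk.split()) for chunk in chunks]

    # Micro: bold spans written as **text** on a single line.
    emphasis = re.findall(r"\*\*[^*\n]+\*\*", markdown_text)

    return StructuralFeatures(
        heading_count=len(depths),
        max_heading_depth=max(depths, default=0),
        chunk_count=len(chunks),
        mean_chunk_words=sum(word_counts) / len(word_counts) if word_counts else 0.0,
        emphasis_count=len(emphasis),
    )
```

A feature vector of this kind could then feed the later stages (structural analysis, optimization, validation), which the caption describes but this sketch does not attempt to model.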