AI Agent-Driven Framework for Automated Product Knowledge Graph Construction in E-Commerce

Dimitar Peshevski, Riste Stojanov, Dimitar Trajanov

TL;DR

The paper tackles automated construction of product knowledge graphs (KGs) from unstructured product descriptions in e-commerce. It proposes an AI-agent framework in which dedicated agents powered by large language models handle ontology creation and expansion, ontology refinement, and knowledge graph population, without handcrafted schemas or extraction rules. Evaluated on 291 air conditioner descriptions, the framework yields an ontology with 42 classes and 69 properties and a KG with 7,459 RDF triples, achieving 97.1% property coverage and only a 3% failure rate attributed to invalid RDF. The work demonstrates a scalable, end-to-end approach to automated product knowledge extraction with practical impact on search, recommendation, and analytics, and discusses extensions to multimodal data and streaming updates.
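The two headline metrics can be illustrated with a small sketch. The summary does not define them, so the definitions here are assumptions: property coverage is taken as the share of ontology properties that appear as a predicate in at least one KG triple, and the failure rate as the share of generation attempts whose RDF output did not validate. The function names and the toy data are hypothetical and are not taken from the paper.

```python
from typing import Iterable, Set, Tuple

Triple = Tuple[str, str, str]  # (subject, predicate, object)


def property_coverage(ontology_properties: Set[str], triples: Iterable[Triple]) -> float:
    """Assumed definition: fraction of ontology properties used by at least one triple."""
    if not ontology_properties:
        return 0.0
    used = {pred for _, pred, _ in triples if pred in ontology_properties}
    return len(used) / len(ontology_properties)


def failure_rate(outcomes: Iterable[bool]) -> float:
    """Assumed definition: fraction of attempts whose RDF was invalid (False = invalid)."""
    results = list(outcomes)
    if not results:
        return 0.0
    return sum(1 for ok in results if not ok) / len(results)


# Toy data for illustration only; none of these values come from the paper.
props = {"ex:hasBrand", "ex:hasCoolingCapacity", "ex:hasEnergyClass", "ex:hasNoiseLevel"}
kg = [
    ("ex:ac1", "ex:hasBrand", "ex:BrandA"),
    ("ex:ac1", "ex:hasCoolingCapacity", "3.5"),
    ("ex:ac2", "ex:hasBrand", "ex:BrandB"),
    ("ex:ac2", "ex:hasEnergyClass", "A++"),
    ("ex:ac2", "rdf:type", "ex:AirConditioner"),  # not an ontology property, ignored
]
coverage = property_coverage(props, kg)
rate = failure_rate([True, True, True, False])
print(f"{coverage:.1%} {rate:.1%}")  # prints: 75.0% 25.0%
```

Under these assumed definitions, a property that exists in the ontology but is never used lowers coverage, while predicates outside the ontology (such as `rdf:type` here) do not affect it.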

Abstract

The rapid expansion of e-commerce platforms generates vast amounts of unstructured product data, creating significant challenges for information retrieval, recommendation systems, and data analytics. Knowledge Graphs (KGs) offer a structured, interpretable format to organize such data, yet constructing product-specific KGs remains a complex and manual process. This paper introduces a fully automated, AI agent-driven framework for constructing product knowledge graphs directly from unstructured product descriptions. Leveraging Large Language Models (LLMs), our method operates in three stages using dedicated agents: ontology creation and expansion, ontology refinement, and knowledge graph population. This agent-based approach ensures semantic coherence, scalability, and high-quality output without relying on predefined schemas or handcrafted extraction rules. We evaluate the system on a real-world dataset of air conditioner product descriptions, demonstrating strong performance in both ontology generation and KG population. The framework achieves over 97% property coverage and minimal redundancy, validating its effectiveness and practical applicability. Our work highlights the potential of LLMs to automate structured knowledge extraction in retail, providing a scalable path toward intelligent product data integration and utilization.
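The three-stage flow described in the abstract can be sketched as a pipeline that wires three agent callables together. This is a minimal illustration under stated assumptions, not the authors' implementation: the `Ontology` type, the agent signatures, the `build_product_kg` function, and the stage ordering (build the ontology over all descriptions, refine once, then populate) are all hypothetical. In the paper each agent is LLM-powered; the `toy_*` functions below are rule-based stand-ins used only so the sketch runs without a model.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Set, Tuple

Triple = Tuple[str, str, str]  # (subject, predicate, object)


@dataclass
class Ontology:
    classes: Set[str] = field(default_factory=set)
    properties: Set[str] = field(default_factory=set)


# Hypothetical agent signatures; an LLM-backed agent would implement the same shape.
CreationAgent = Callable[[str, Ontology], Ontology]
RefinementAgent = Callable[[Ontology], Ontology]
PopulationAgent = Callable[[str, str, Ontology], List[Triple]]


def build_product_kg(descriptions: List[str], create: CreationAgent,
                     refine: RefinementAgent, populate: PopulationAgent):
    ontology = Ontology()
    for text in descriptions:            # stage 1: ontology creation and expansion
        ontology = create(text, ontology)
    ontology = refine(ontology)          # stage 2: ontology refinement
    triples: List[Triple] = []
    for i, text in enumerate(descriptions):  # stage 3: KG population
        triples.extend(populate(f"ex:product{i}", text, ontology))
    return ontology, triples


def _norm(name: str) -> str:
    return name.strip().lower().replace(" ", "")


def toy_create(text: str, ontology: Ontology) -> Ontology:
    # Stand-in: treat each "key: value" fragment as evidence for a property.
    ontology.classes.add("AirConditioner")
    for part in text.split(";"):
        key, sep, _ = part.partition(":")
        if sep:
            ontology.properties.add(key.strip())
    return ontology


def toy_refine(ontology: Ontology) -> Ontology:
    # Stand-in: merge property names that differ only by case or spacing.
    merged = {}
    for name in sorted(ontology.properties):
        merged.setdefault(_norm(name), name)
    return Ontology(set(ontology.classes), set(merged.values()))


def toy_populate(subject: str, text: str, ontology: Ontology) -> List[Triple]:
    # Stand-in: emit a type triple plus one triple per recognized property.
    canonical = {_norm(p): p for p in ontology.properties}
    triples = [(subject, "rdf:type", "AirConditioner")]
    for part in text.split(";"):
        key, sep, value = part.partition(":")
        prop = canonical.get(_norm(key))
        if sep and prop:
            triples.append((subject, prop, value.strip()))
    return triples


# Toy descriptions for illustration only.
descriptions = ["Brand: Acme; Cooling capacity: 3.5 kW", "brand: Zenith; Noise level: 21 dB"]
ontology, triples = build_product_kg(descriptions, toy_create, toy_refine, toy_populate)
print(sorted(ontology.properties))
print(len(triples))
```

Keeping each stage behind a plain callable mirrors the modular design the abstract describes: any stage can be swapped for a different agent without touching the other two.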

Paper Structure

This paper contains 16 sections and 1 figure.

Figures (1)

  • Figure 1: Agent-based workflow consisting of ontology creation, refinement, and KG population. Each step is handled by a dedicated LLM-powered agent operating within a modular pipeline.