Using generative AI to support standardization work -- the case of 3GPP

Miroslaw Staron, Jonathan Strom, Albin Karlsson, Wilhelm Meding

TL;DR

The paper investigates how large language models can assist 3GPP standardization by automatically summarizing contributor documents and revealing agreements and discussion points. It implements a design science artifact that combines BART XLM for heading summarization, All-MiniLM embeddings for section-level similarity, and cosine-based, heading-weighted similarity to produce usable visualizations and agenda proposals, then validates the approach with Ericsson and a 3GPP delegate. Findings show strong correlations with human judgments in some analyses (up to 0.98) but weaker alignment at the full-document level (≈0.49) and highlight the need for domain-specific pretraining to capture technical nuances and proposals. The study demonstrates potential to reduce effort and accelerate consensus-building in standardization, while outlining concrete future work such as domain-specific 3GPP training and multimedia information extraction to improve accuracy and trust.
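The heading-weighted, cosine-based similarity mentioned above can be illustrated with a minimal sketch. It assumes each section is already represented by two embedding vectors (one for the heading, one for the body) and that the two cosine scores are mixed linearly. The function names, the linear mix, and the `heading_weight` value of 0.3 are illustrative assumptions, not the weighting used in the paper; the toy vectors stand in for real sentence embeddings.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def section_similarity(heading_a, body_a, heading_b, body_b, heading_weight=0.3):
    """Linear mix of heading and body cosine similarity.

    heading_weight is a hypothetical parameter chosen for illustration;
    the paper's actual weighting scheme is not specified here.
    """
    heading_score = cosine(heading_a, heading_b)
    body_score = cosine(body_a, body_b)
    return heading_weight * heading_score + (1.0 - heading_weight) * body_score


# Toy 2-dimensional "embeddings": identical headings, orthogonal bodies.
score = section_similarity([1.0, 0.0], [1.0, 0.0], [1.0, 0.0], [0.0, 1.0])
print(round(score, 3))
```

With identical headings and unrelated bodies, the score equals the heading weight alone, which makes the effect of the weighting easy to inspect.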

Abstract

Standardization processes build upon consensus between partners, which depends on their ability to identify points of disagreement and resolve them. Large standardization organizations, like the 3GPP or ISO, rely on leaders of work packages who can correctly and efficiently identify disagreements, discuss them and reach a consensus. This task, however, is effort- and labor-intensive, and costly. In this paper, we address the problem of identifying similarities, dissimilarities and discussion points using large language models. In a design science research study, we work with one of the organizations that leads several workgroups in the 3GPP standard. Our goal is to understand how well language models can support the standardization process in becoming more cost-efficient, faster and more reliable. Our results show that generic models for text summarization correlate well with the domain expert's and delegate's assessments (Pearson correlation between 0.66 and 0.98), but that there is a need for domain-specific models to provide better discussion materials for the standardization groups.
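For readers unfamiliar with the agreement measure quoted above, the following is a minimal sketch of the Pearson correlation coefficient between model-produced similarity scores and human ratings. The function name and the example numbers are made up for illustration and are not data from the study.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0.0 or var_y == 0.0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical scores: model similarity vs. a human rating for the same pairs.
model_scores = [0.10, 0.40, 0.55, 0.90]
human_ratings = [1, 2, 3, 5]
print(pearson(model_scores, human_ratings))
```

A value close to 1 means the model ranks and spaces the document pairs much as the human assessor does; a value near 0 means there is no linear relationship.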

Paper Structure (19 sections, 8 figures, 5 tables)

Figures (8)

  • Figure 1: High-level overview of the working group meetings during the 3GPP standardization process. The grey background indicates the scope of this paper.
  • Figure 2: Artefact: System for analyzing and summarizing contributor documents
  • Figure 3: Part of a diagram showing similarity between sections in contributor documents. Each bar represents a pair of documents.
  • Figure 4: An example of clusters of content sections labeled with the most common words in the passages. Each dot represents a section of a contributor document. This diagram indicates which topics the contributors agree on the most.
  • Figure 5: Visualization of the headings and content of each paragraph of the contributions. Each dot represents one section from one document. Each color represents one cluster (based on the k-Means clustering algorithm with k=10).
  • ...and 3 more figures