Using Large Language Models to Understand Telecom Standards

Athanasios Karapantelakis, Mukesh Thakur, Alexandros Nikou, Farnaz Moradi, Christian Orlog, Fitsum Gaim, Henrik Holm, Doumitrou Daniil Nimara, Vincent Huang

TL;DR

This work tackles the challenge of navigating increasingly large and complex telecom standards (notably 3GPP) by evaluating large language models as question-answering assistants. It introduces TeleQuAD, a telecom-focused benchmark, and TelcoGenAI, a Retrieval-Augmented Generation platform, to benchmark both foundation models and a domain-adapted extractor, TeleRoBERTa. The study shows that both TeleRoBERTa and large, resource-heavy foundation models can achieve high accuracy on questions about telecom standards, with TeleRoBERTa matching or approaching the performance of the bigger models while using an order of magnitude fewer parameters. Through targeted context engineering and fine-tuning on telecom data, the authors demonstrate substantial gains in accuracy, underscoring the viability of LLM-based tools for telecom troubleshooting, maintenance, and development, and offering methods likely transferable to other standards bodies.

Abstract

The Third Generation Partnership Project (3GPP) has successfully introduced standards for global mobility. However, the volume and complexity of these standards have increased over time, complicating access to relevant information for vendors and service providers. The use of Generative Artificial Intelligence (AI), and in particular Large Language Models (LLMs), may provide faster access to relevant information. In this paper, we evaluate the capability of state-of-the-art LLMs to be used as Question Answering (QA) assistants for 3GPP document reference. Our contribution is threefold. First, we provide a benchmark and measuring methods for evaluating the performance of LLMs. Second, we perform data preprocessing and fine-tuning for one of these LLMs and provide guidelines, applicable to all LLMs, for increasing the accuracy of the responses. Third, we provide a model of our own, TeleRoBERTa, that performs on par with foundation LLMs but with an order of magnitude fewer parameters. Results show that LLMs can be used as a credible reference tool on telecom technical documents, and thus have potential for a number of different applications, from troubleshooting and maintenance to network operations and software product development.

Paper Structure (14 sections, 5 figures, 2 tables)

Figures (5)

  • Figure 1: Increase in the number of tokens (words) across 3GPP releases, between Release 8 (2006-01-23) and Release 17 (2018-06-15). At the time of writing, Releases 18 and 19 still have an "Open" status, meaning that material is still being added to the documents comprising those Releases.
  • Figure 2: Architecture of TelcoGenAI
  • Figure 3: Performance of foundation LLMs, using the TeleQuAD dataset as ground truth and GPT-4 as reference.
  • Figure 4: Performance of foundation LLMs, using BERTScore and the TeleQuAD dataset as ground truth.
  • Figure 5: Table transformation: Exemplary table from a specification (top) and replacement text (bottom).