Telecom Language Models: Must They Be Large?

Nicola Piovesan, Antonio De Domenico, Fadhel Ayed

TL;DR

This paper evaluates Phi-2, a small language model, against GPT-3.5 and GPT-4 on telecom-domain knowledge using the TeleQnA benchmark, and shows that Phi-2 benefits substantially from Retrieval-Augmented Generation (RAG) over an external knowledge base of telecom standards. Although far smaller, Phi-2 paired with RAG approaches GPT-3.5's accuracy on standards-specification questions, and it can produce accurate energy-consumption models and user-association decisions in targeted telecom tasks. The results also reveal limitations in multi-step reasoning for compact models, underscoring the need for advanced prompting and domain specialization to close the gap with very large models. Overall, the work argues that small language models are practically viable for telecom applications, especially when combined with RAG and curated knowledge bases, which makes edge deployment and energy-efficient operation feasible.

Abstract

The increasing interest in Large Language Models (LLMs) within the telecommunications sector underscores their potential to revolutionize operational efficiency. However, the deployment of these sophisticated models is often hampered by their substantial size and computational demands, raising concerns about their viability in resource-constrained environments. Addressing this challenge, recent advancements have seen the emergence of small language models that surprisingly exhibit performance comparable to their larger counterparts in many tasks, such as coding and common-sense reasoning. Phi-2, a compact yet powerful model, exemplifies this new wave of efficient small language models. This paper conducts a comprehensive evaluation of Phi-2's intrinsic understanding of the telecommunications domain. Recognizing the scale-related limitations, we enhance Phi-2's capabilities through a Retrieval-Augmented Generation approach, meticulously integrating an extensive knowledge base specifically curated with telecom standard specifications. The enhanced Phi-2 model demonstrates a profound improvement in accuracy, answering questions about telecom standards with a precision that closely rivals the more resource-intensive GPT-3.5. The paper further explores the refined capabilities of Phi-2 in addressing problem-solving scenarios within the telecom sector, highlighting its potential and limitations.

Paper Structure

This paper contains 14 sections, 2 equations, 4 figures, and 1 table.

Figures (4)

  • Figure 1: Scheme depicting the retrieve, augment and generate phases of the RAG mechanism.
  • Figure 2: Accuracy achieved by different language models over the 'Standards Specifications' category.
  • Figure 3: Ground-truth energy consumption at different BS load levels and estimation performed by Phi-2 standalone and Phi-2 with RAG.
  • Figure 4: Accuracy achieved by GPT-3.5 and Phi-2 in resolving the user association problem.
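
The retrieve, augment and generate phases named in Figure 1 can be illustrated with a minimal Python sketch. It is not the authors' implementation: the bag-of-words cosine retriever stands in for whatever retriever the paper uses, the knowledge-base passages are placeholders rather than text from any standard, and `generate` is a hypothetical callable standing in for a language model such as Phi-2.

```python
import math
import re
from collections import Counter
from typing import Callable, List


def tokenize(text: str) -> List[str]:
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two bag-of-words count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(question: str, passages: List[str], k: int = 2) -> List[str]:
    """Retrieve phase: return the k passages most similar to the question."""
    q = Counter(tokenize(question))
    ranked = sorted(
        passages,
        key=lambda p: cosine(q, Counter(tokenize(p))),
        reverse=True,
    )
    return ranked[:k]


def augment(question: str, context: List[str]) -> str:
    """Augment phase: prepend the retrieved passages to the question."""
    joined = "\n".join(f"- {c}" for c in context)
    return f"Context:\n{joined}\n\nQuestion: {question}\nAnswer:"


def rag_answer(
    question: str,
    passages: List[str],
    generate: Callable[[str], str],
    k: int = 2,
) -> str:
    """Generate phase: pass the augmented prompt to the language model."""
    return generate(augment(question, retrieve(question, passages, k)))


# Placeholder passages (assumed for illustration, not quoted from any standard).
KNOWLEDGE_BASE = [
    "Placeholder passage: handover lets a user move between cells without dropping the session.",
    "Placeholder passage: base station energy consumption grows with traffic load.",
    "Placeholder passage: user association assigns each user to a serving base station.",
]
```

Calling `rag_answer(question, KNOWLEDGE_BASE, generate)` with any function that maps a prompt string to a completion string runs the three phases in order. Keeping the model behind a plain callable means the retriever and the prompt template can be tested without loading a model.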