Tele-LLMs: A Series of Specialized Large Language Models for Telecommunications

Ali Maatouk, Kenny Chirino Ampudia, Rex Ying, Leandros Tassiulas

TL;DR

The paper tackles the lack of domain-specific LLMs in telecommunications by creating Tele-Data and Tele-Eval to train and assess telecom-specialized models. It demonstrates that full fine-tuning outperforms parameter-efficient methods for domain adaptation and shows that training on the entire telecom corpus yields stronger cross-domain transfer than focusing on isolated aspects. The authors release Tele-LLMs (1B–8B) that outperform general-purpose models on Tele-Eval and telecom literature tasks while preserving prior capabilities, mitigating catastrophic forgetting. They also provide insights into data composition, training dynamics, and instruction-tuning effects, laying groundwork for future multi-modal telecom reasoning and retrieval-augmented generation frameworks.

Abstract

The emergence of large language models (LLMs) has significantly impacted various fields, from natural language processing to sectors like medicine and finance. However, despite their rapid proliferation, the applications of LLMs in telecommunications remain limited, often relying on general-purpose models that lack domain-specific specialization. This lack of specialization results in underperformance, particularly when dealing with telecommunications-specific technical terminology and their associated mathematical representations. This paper addresses this gap by first creating and disseminating Tele-Data, a comprehensive dataset of telecommunications material curated from relevant sources, and Tele-Eval, a large-scale question-and-answer dataset tailored to the domain. Through extensive experiments, we explore the most effective training techniques for adapting LLMs to the telecommunications domain, ranging from examining the division of expertise across various telecommunications aspects to employing parameter-efficient techniques. We also investigate how models of different sizes behave during adaptation and analyze the impact of their training data on this behavior. Leveraging these findings, we develop and open-source Tele-LLMs, the first series of language models ranging from 1B to 8B parameters, specifically tailored for telecommunications. Our evaluations demonstrate that these models outperform their general-purpose counterparts on Tele-Eval and telecommunications-related literature tasks while retaining their previously acquired capabilities, thus avoiding the catastrophic forgetting phenomenon.

Paper Structure (32 sections, 8 equations, 8 figures, 7 tables)

Figures (8)

  • Figure 1: Overall pipeline of the LLM adaptation to telecommunications.
  • Figure 2: Raw cross-entropy loss.
  • Figure 3: Cleaned cross-entropy loss.
  • Figure 4: Training metrics for Gemma-2B (top) and Llama-3-8B (bottom) models using LoRA.
  • Figure 5: Training metrics for Gemma-2B (top) and Llama-3-8B (bottom) models across three epochs.
  • ...and 3 more figures

Theorems & Definitions (4)

  • Remark 1
  • Remark 2
  • Remark 3
  • Remark 4