Will LLMs Scaling Hit the Wall? Breaking Barriers via Distributed Resources on Massive Edge Devices

Tao Shen, Didi Zhu, Ziyu Zhao, Zexi Li, Chao Wu, Fei Wu

TL;DR

The paper argues that scaling laws for foundation models are approaching data and compute bottlenecks, because high-quality public data is finite and compute power is centralized. It proposes leveraging massive distributed edge devices to democratize AI, using edge-generated data and aggregated edge compute to train large models. It surveys data and compute trends, the advantages of edge data, and technical advances in small language models, collaborative inference, and on-device/collaborative training, while outlining open problems in heterogeneous device fusion and compute sharing. If these problems are solved, edge-based distributed training could broaden participation in AI development, reduce environmental impact, and reshape the AI landscape toward more diverse, locally adaptable models.

Abstract

The remarkable success of foundation models has been driven by scaling laws, which show that model performance improves predictably with increased training data and model size. However, this scaling trajectory faces two critical challenges: the depletion of high-quality public data, and the prohibitive computational power required for larger models, which has been monopolized by tech giants. These two bottlenecks pose significant obstacles to the further development of AI. In this position paper, we argue that leveraging massive distributed edge devices can break through these barriers. We reveal the vast untapped potential of data and computational resources on massive edge devices, and review recent technical advancements in distributed/federated learning that make this new paradigm viable. Our analysis suggests that, by collaborating across edge devices, everyone can participate in training large language models using small edge devices. This paradigm shift towards distributed training on the edge has the potential to democratize AI development and foster a more inclusive AI community.

Paper Structure

This paper contains 48 sections, 3 equations, 6 figures, 3 tables.

Figures (6)

  • Figure 1: Trend of Computational Demand for Model Training. (Data source: epoch2023trendsinmachinelearninghardware).
  • Figure 2: Global data volume from 2014 to 2025 and IoT device data volume in 2015 and 2025. (Data sources: Global data volume from statista_global_2023; IoT device data volume from statista_iot_2023.)
  • Figure 3: Smartphone data volume (left) and edge computing market size (right) from 2018 to 2028. (Data sources: Edge computing market size from grandview_edge_2023; Smartphone data volume from bankmycell_smartphone_2023.)
  • Figure 4: Edge Computing Power Evolution Trend. (Data source: nanoreview2025).
  • Figure 5: Smartphone Market Share and Computing Power Trends. (Data source: canalys2025).
  • ...and 1 more figure