PhoneLM: An Efficient and Capable Small Language Model Family through Principled Pre-training

Rongjie Yi; Xiang Li; Weikai Xie; Zhenyan Lu; Chenghua Wang; Ao Zhou; Shangguang Wang; Xiwen Zhang; Mengwei Xu

TL;DR

This work presents a simple yet effective principle for SLM design: search the architecture for (near-)optimal runtime efficiency before pre-training. Following this principle, it develops the PhoneLM SLM family, which achieves a state-of-the-art capability-efficiency tradeoff among models of similar parameter size.

Abstract

Interest in developing small language models (SLMs) for on-device deployment is growing fast. However, existing SLM designs rarely consider device hardware characteristics. This work instead presents a simple yet effective principle for SLM design: search the architecture for (near-)optimal runtime efficiency before pre-training. Guided by this principle, we develop the PhoneLM SLM family (currently with 0.5B and 1.5B versions), which achieves a state-of-the-art capability-efficiency tradeoff among models of similar parameter size. We fully open-source the code, weights, and training datasets of PhoneLM for reproducibility and transparency, including both base and instructed versions. We also release a fine-tuned version of PhoneLM capable of accurate Android Intent invocation, and an end-to-end Android demo. All materials are available at https://github.com/UbiquitousLearning/PhoneLM.
