Yi-Lightning Technical Report

Alan Wake; Bei Chen; C. X. Lv; Chao Li; Chengen Huang; Chenglin Cai; Chujie Zheng; Daniel Cooper; Fan Zhou; Feng Hu; Ge Zhang; Guoyin Wang; Heng Ji; Howard Qiu; Jiangcheng Zhu; Jun Tian; Katherine Su; Lihuan Zhang; Liying Li; Ming Song; Mou Li; Peng Liu; Qicheng Hu; Shawn Wang; Shijun Zhou; Shiming Yang; Shiyong Li; Tianhang Zhu; Wen Xie; Wenhao Huang; Xiang He; Xiaobo Chen; Xiaohui Hu; Xiaoyi Ren; Xinyao Niu; Yanpeng Li; Yongke Zhao; Yongzhen Luo; Yuchi Xu; Yuxuan Sha; Zhaodong Yan; Zhiyuan Liu; Zirui Zhang; Zonghong Dai

TL;DR

Yi-Lightning tackles the practical challenge of building scalable, safe, and high-performing LLMs by advancing the Mixture-of-Experts (MoE) architecture with fine-grained expert segmentation, a refined routing scheme, and cross-layer KV cache sharing to boost efficiency. The work couples this model design with three-stage pre-training, targeted data enhancements, and two-stage SFT followed by RLHF, with the RAISE safety framework applied across pre-training, post-training, and serving, all supported by hardware-aware, high-throughput infrastructure. Long-context capability is extended to 64K tokens, and data synthesis plus DPO-based alignment contribute to strong real-world performance, as evidenced by a 6th-place overall ranking on Chatbot Arena and competitive results on academic benchmarks. The authors also highlight a gap between traditional static benchmarks and real user preferences, and argue for evaluation approaches that better reflect how practical, safely deployed systems are actually used.

Abstract

This technical report presents Yi-Lightning, our latest flagship large language model (LLM). It achieves exceptional performance, ranking 6th overall on Chatbot Arena, with particularly strong results (2nd to 4th place) in specialized categories including Chinese, Math, Coding, and Hard Prompts. Yi-Lightning leverages an enhanced Mixture-of-Experts (MoE) architecture, featuring advanced expert segmentation and routing mechanisms coupled with optimized KV-caching techniques. Our development process encompasses comprehensive pre-training, supervised fine-tuning (SFT), and reinforcement learning from human feedback (RLHF), where we devise deliberate strategies for multi-stage training, synthetic data construction, and reward modeling. Furthermore, we implement RAISE (Responsible AI Safety Engine), a four-component framework to address safety issues across pre-training, post-training, and serving phases. Empowered by our scalable super-computing infrastructure, all these innovations substantially reduce training, deployment and inference costs while maintaining high-performance standards. With further evaluations on public academic benchmarks, Yi-Lightning demonstrates competitive performance against top-tier LLMs, while we observe a notable disparity between traditional, static benchmark results and real-world, dynamic human preferences. This observation prompts a critical reassessment of conventional benchmarks' utility in guiding the development of more intelligent and powerful AI systems for practical applications. Yi-Lightning is now available through our developer platform at https://platform.lingyiwanwu.com.
