OpenGrok: Enhancing SNS Data Processing with Distilled Knowledge and Mask-like Mechanisms
Lumen AI, Zaozhuang No. 28 Middle School, Shihao Ji, Zihui Song, Fucheng Zhong, Jisen Jia, Zhaobo Wu, Zheyi Cao, Tianhao Xu
TL;DR
OpenGrok addresses the challenges of processing social networking service (SNS) data by transferring knowledge from a large language model (Grok) to a compact Phi-3-mini model via simple response-based distillation, complemented by a mask-like mechanism that filters noise. The approach acquires training data from prompt-driven Grok responses, then fine-tunes the student model with a mask-guided objective formalized as a cross-entropy loss and optimized with AdamW. The reported results show state-of-the-art performance on several SNS tasks, supported by ablation studies that quantify the impact of distillation and of the mask mechanism. The method delivers high accuracy with reduced computational overhead, enabling efficient processing of noisy, informal SNS text in real-world settings.
Abstract
This report details Lumen Labs' novel approach to processing Social Networking Service (SNS) data. We use knowledge distillation, specifically a simple distillation method inspired by DeepSeek-R1's chain-of-thought (CoT) acquisition, combined with prompt hacking, to extract training data from the Grok model. This data is then used to fine-tune a Phi-3-mini model augmented with a mask-like mechanism designed for the nuances of SNS data. Our method demonstrates state-of-the-art (SOTA) performance on several SNS data processing tasks, outperforming existing models such as Grok, Phi-3, and GPT-4. We provide a comprehensive analysis of our approach, including mathematical formulations, engineering details, ablation studies, and comparative evaluations.
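
To make the mask-guided cross-entropy objective mentioned above more concrete, here is a minimal sketch. It rests on assumptions that this section does not confirm: the mask is treated as a binary per-token flag (1 = keep, 0 = treat as noise and exclude from the loss), and the function name `masked_cross_entropy` is hypothetical. The AdamW update is not shown.

```python
import math

def masked_cross_entropy(logits, targets, mask):
    """Mean token-level cross-entropy over positions where mask is 1.

    Assumption (not from the source): masked-out positions are simply
    excluded from the average, as one plausible reading of "mask-like".
    """
    total, kept = 0.0, 0
    for row, target, keep in zip(logits, targets, mask):
        if not keep:
            continue  # skip tokens flagged as noise
        # Numerically stable log-sum-exp over the vocabulary logits.
        peak = max(row)
        log_z = peak + math.log(sum(math.exp(x - peak) for x in row))
        total += log_z - row[target]  # negative log-likelihood of target
        kept += 1
    return total / kept if kept else 0.0

# Two token positions over a two-word vocabulary; the second position
# carries a confidently wrong prediction and is masked out as noise.
logits = [[0.0, 0.0], [5.0, 0.0]]
targets = [0, 1]
loss_masked = masked_cross_entropy(logits, targets, [1, 0])
loss_full = masked_cross_entropy(logits, targets, [1, 1])
```

In this toy case the masked loss comes only from the first position, so the noisy second token does not contribute to the average. Whether OpenGrok applies its mask at the token, span, or example level is not stated in this section.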
