LOCUS: A System and Method for Low-Cost Customization for Universal Specialization
Dhanasekar Sundararaman, Keying Li, Wayne Xiong, Aashna Garg
TL;DR
LOCUS addresses the cost and rigidity of LLM-based customization with a compact pipeline that combines few-shot input, retrieval-driven data augmentation, synthetic generation, and parameter-efficient fine-tuning. The method builds task-specific NER and TC datasets from minimal supervision by merging retrieved real examples with generated samples, and it supports both full fine-tuning and LoRA adapters. Across benchmarks such as MIT CrossNER, ATIS, AGNews, and MultiNERD, LOCUS and its LoRA-based variant achieve competitive or superior accuracy with drastically reduced memory footprints, often outperforming GPT-4o in few-shot settings. The work demonstrates that high-quality, domain-adapted NLP can be achieved with significantly smaller parameter budgets, enabling on-premises or low-resource deployments.
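The data-building step described above (few-shot seeds, retrieved real examples, generated samples) can be illustrated with a minimal sketch. This is not the authors' implementation: the bag-of-words cosine retriever, the function names (`bow`, `cosine`, `retrieve`, `build_training_set`), and the toy texts are all assumptions chosen for illustration, and the synthetic samples are a hard-coded stand-in for generator output.

```python
import math
from collections import Counter

def bow(text):
    """Lowercased bag-of-words vector as a token -> count mapping."""
    return Counter(text.lower().split())

def cosine(a, b):
    """Cosine similarity between two bag-of-words vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def retrieve(seeds, repository, k):
    """Return the k repository texts most similar to any seed example."""
    seed_vecs = [bow(s) for s in seeds]
    scored = [(max(cosine(bow(doc), sv) for sv in seed_vecs), doc)
              for doc in repository]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [doc for _, doc in scored[:k]]

def build_training_set(seeds, repository, synthetic, k):
    """Merge seeds, retrieved real examples, and generated samples, dropping duplicates."""
    merged = []
    for text in list(seeds) + retrieve(seeds, repository, k) + list(synthetic):
        if text not in merged:
            merged.append(text)
    return merged

# Toy data (hypothetical, not from the paper).
seeds = ["book a flight from boston to denver"]
repository = [
    "show me flights from dallas to boston",
    "stock markets fell sharply on monday",
    "book a hotel in denver",
]
synthetic = ["book a flight from seattle to miami"]  # stand-in for generator output

train = build_training_set(seeds, repository, synthetic, k=2)
```

In this toy run the unrelated finance sentence is left out of `train`, while the two travel-related repository texts are kept alongside the seed and the synthetic sample. A practical system would likely use a dense retriever rather than token overlap.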
Abstract
We present LOCUS (LOw-cost Customization for Universal Specialization), a pipeline that consumes few-shot data to streamline the construction and training of NLP models through targeted retrieval, synthetic data generation, and parameter-efficient tuning. With only a small number of labeled examples, LOCUS discovers pertinent data in a broad repository, synthesizes additional training samples via in-context data generation, and fine-tunes models using either full or low-rank (LoRA) parameter adaptation. Our approach targets named entity recognition (NER) and text classification (TC) benchmarks, consistently outperforming strong baselines (including GPT-4o) while substantially lowering costs and model sizes. The resulting memory-optimized models retain 99% of fully fine-tuned accuracy while using barely 5% of the memory footprint, and they beat GPT-4o on several benchmarks with less than 1% of its parameters.
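To give a feel for why low-rank adaptation shrinks the trainable parameter budget, the following sketch compares parameter counts for a single weight matrix. It relies only on the standard LoRA formulation, in which a frozen weight of shape d_out x d_in is augmented by a product of two matrices of shapes d_out x r and r x d_in. The dimensions and rank below are illustrative assumptions, not the configuration used in the paper, and the resulting ratio is not meant to reproduce the reported memory figures.

```python
def full_params(d_in, d_out):
    """Parameters updated when fine-tuning one dense weight matrix in full."""
    return d_in * d_out

def lora_trainable_params(d_in, d_out, rank):
    """Trainable parameters of a LoRA update B @ A, with B: d_out x rank and A: rank x d_in."""
    return rank * (d_in + d_out)

# Illustrative sizes (assumed, not taken from the paper).
d_in = d_out = 768
rank = 8

full = full_params(d_in, d_out)
lora = lora_trainable_params(d_in, d_out, rank)
ratio = lora / full  # fraction of this matrix's parameters that LoRA trains
```

For these assumed sizes the adapter trains 12,288 parameters against 589,824 for the full matrix, roughly 2% of the total for that layer.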
