OASIS: Online Sample Selection for Continual Visual Instruction Tuning
Minjae Lee, Minhyuk Seo, Tingyu Qu, Tinne Tuytelaars, Jonghyun Choi
TL;DR
Continual instruction tuning (CIT) on streaming data suffers from training delays and from forgetting as the data distribution shifts. OASIS tackles this with two components. ORIS estimates each sample's informativeness from last-layer gradients and normalizes it across batches using an exponential moving average and variance (EMA/EMV), yielding a cross-batch relative score. SIREN then reduces redundancy among selected samples by accounting for gradient similarity and higher-order overlaps within the batch, without additional forward passes. Selection is probabilistic, guided by a thresholded sigmoid over the relative informativeness, which allows flexible data quotas rather than a fixed per-batch count. Empirical results across multiple large models and CIT benchmarks show that OASIS achieves near full-data performance using as little as 25% of the data, with better efficiency and diversity than prior baselines.
Abstract
In continual instruction tuning (CIT) scenarios, where new instruction tuning data arrive continuously in an online stream, training delays from large-scale data significantly hinder real-time adaptation. Data selection can mitigate this overhead, but existing strategies often rely on pretrained reference models, which are impractical in CIT setups since future data are unknown. Recent reference model-free online sample selection methods address this, but they typically select a fixed number of samples per batch (e.g., top-k), which makes them vulnerable to distribution shifts where informativeness varies across batches. To address these limitations, we propose OASIS, an adaptive online sample selection approach for CIT that (1) selects informative samples by estimating each sample's informativeness relative to all previously seen data, beyond batch-level constraints, and (2) minimizes information redundancy among selected samples through iterative selection score updates. Experiments on various large foundation models show that OASIS, using only 25% of the data, achieves performance comparable to full-data training and outperforms state-of-the-art sampling methods.
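To make the selection rule summarized in the TL;DR concrete, the following is a minimal Python sketch of cross-batch relative scoring followed by probabilistic selection. It is an illustration under stated assumptions, not the authors' implementation: each sample's informativeness is assumed to arrive as a precomputed scalar (for example, a last-layer gradient norm), the running statistics use a standard incremental exponential moving average and variance, and the names `RelativeScorer`, `select_batch`, `alpha`, and `threshold` are hypothetical. The redundancy-reduction step (SIREN) is not modeled here.

```python
import math
import random


def sigmoid(x):
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


class RelativeScorer:
    """Running EMA/EMV of informativeness scores (hypothetical helper).

    alpha is the decay applied to past statistics; this is an assumed
    parameterization, not one taken from the paper.
    """

    def __init__(self, alpha=0.9, eps=1e-8):
        self.alpha = alpha
        self.eps = eps
        self.mean = None  # EMA of scores seen so far
        self.var = 0.0    # EMV of scores seen so far

    def relative(self, score):
        """Z-score of `score` against all previously seen data."""
        if self.mean is None:
            return 0.0  # nothing seen yet, so no basis for comparison
        return (score - self.mean) / math.sqrt(self.var + self.eps)

    def update(self, score):
        """Fold one new score into the running statistics."""
        if self.mean is None:
            self.mean = score
            return
        delta = score - self.mean
        self.mean += (1.0 - self.alpha) * delta
        self.var = self.alpha * (self.var + (1.0 - self.alpha) * delta ** 2)


def select_batch(scores, scorer, threshold=0.0, rng=None):
    """Return indices of samples kept from one batch.

    Each sample is kept with probability sigmoid(z - threshold), so the
    number of kept samples varies from batch to batch.
    """
    if rng is None:
        rng = random.Random(0)
    kept = []
    for i, s in enumerate(scores):
        z = scorer.relative(s)   # compare against past data only
        if rng.random() < sigmoid(z - threshold):
            kept.append(i)
        scorer.update(s)         # then add this sample to the statistics
    return kept
```

In this sketch, raising `threshold` lowers the expected fraction of data kept, and a batch whose scores sit well above the running mean contributes more samples than a batch of low-scoring ones, which is the behavior a fixed top-k rule cannot provide.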
