Multi-Integration of Labels across Categories for Component Identification (MILCCI)
Noga Mudrik, Yuxi Chen, Gal Mishne, Adam S. Charles
TL;DR
MILCCI addresses the challenge of relating multi-category trial labels to high-dimensional time series by learning category-specific component dictionaries with label-conditioned variants, together with trial-specific temporal traces. It decomposes each trial's observations as a sum over categories with variant-aware loadings, and fits the model with a three-stage procedure that enforces sparsity and label-distance consistency while smoothing temporal trajectories. Across synthetic data and diverse real-world datasets (voting histories, Wikipedia pageviews, and multi-region neural recordings), MILCCI demonstrates improved recovery of underlying components and interpretable, label-aware patterns, outperforming conventional tensor and matrix decompositions. The framework enables flexible, cross-trial analysis that separates label-driven structure from non-label-driven variability, with potential extensions to non-linear dynamics and multi-modal data.
Abstract
Many fields collect large-scale temporal data through repeated measurements (trials), where each trial is labeled with a set of metadata variables spanning several categories. For example, a trial in a neuroscience study may be linked to one value from category (a), task difficulty, and one value from category (b), animal choice. A critical challenge in time-series analysis is to understand how these labels are encoded within the multi-trial observations, and to disentangle the distinct effect of each label entry across categories. Here, we present MILCCI, a novel data-driven method that i) identifies the interpretable components underlying the data, ii) captures cross-trial variability, and iii) integrates label information to understand each category's representation within the data. MILCCI extends a sparse per-trial decomposition that leverages label similarities within each category to enable subtle, label-driven cross-trial adjustments in component compositions and to distinguish the contribution of each category. MILCCI also learns each component's corresponding temporal trace, which evolves over time within each trial and varies flexibly across trials. We demonstrate MILCCI's performance through both synthetic and real-world examples, including voting patterns, online page view trends, and neuronal recordings.
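To make the structure described above concrete, here is a minimal NumPy sketch of the forward (generative) side of such a decomposition: each category contributes a sparse loading matrix selected by the trial's label in that category, multiplied by that trial's temporal traces, and the contributions are summed. It is an illustration under stated assumptions, not the authors' implementation. The dimensions, the 0.3 sparsity level, the 0.05 variant perturbation, the random-walk traces, and the names `loadings` and `synthesize_trial` are all hypothetical, and the fitting procedure is not shown.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes chosen only for illustration.
n_channels, n_time = 20, 50
n_comp = 3                  # components per category
labels_per_cat = [2, 3]     # e.g. 2 difficulty levels, 3 choice options

# For each category, build one sparse base dictionary and one slightly
# perturbed variant per label value. Variants share the base's support,
# so label changes only adjust the weights of existing components.
loadings = []
for n_labels in labels_per_cat:
    base = rng.normal(size=(n_channels, n_comp))
    base *= rng.random(base.shape) < 0.3          # sparsity mask
    variants = [
        base + 0.05 * rng.normal(size=base.shape) * (base != 0)
        for _ in range(n_labels)
    ]
    loadings.append(variants)


def synthesize_trial(trial_labels, traces):
    """Sum over categories of (label-selected loadings) @ (trial traces).

    trial_labels[k] is the label index of this trial in category k;
    traces[k] has shape (n_comp, n_time) and is specific to this trial.
    """
    return sum(
        loadings[k][trial_labels[k]] @ traces[k]
        for k in range(len(loadings))
    )


# One trial with label 1 in category 0 and label 2 in category 1.
trial_labels = (1, 2)
# Random-walk traces stand in for smooth, trial-specific trajectories.
traces = [
    np.cumsum(rng.normal(size=(n_comp, n_time)), axis=1)
    for _ in labels_per_cat
]
Y = synthesize_trial(trial_labels, traces)   # shape (n_channels, n_time)
```

In this sketch, two trials that differ only in their label for one category use different loading variants for that category and the same variant for the other, which is the sense in which the contribution of each category can be separated.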
