LUMOS: Democratizing SciML Workflows with L0-Regularized Learning for Unified Feature and Parameter Adaptation

Shouwei Gao, Xu Zheng, Dongsheng Luo, Sheng Di, Wenqian Dong

TL;DR

LUMOS is an end-to-end framework based on L0-regularized learning that unifies feature selection and model pruning, democratizing SciML model design and reducing reliance on manual tuning while maintaining predictive accuracy.

Abstract

The rapid growth of scientific machine learning (SciML) has accelerated discovery across diverse domains, yet designing effective SciML models remains a challenging task. In practice, building such models often requires substantial prior knowledge and manual expertise, particularly in determining which input features to use and how large the model should be. We introduce LUMOS, an end-to-end framework based on L0-regularized learning that unifies feature selection and model pruning to democratize SciML model design. By employing semi-stochastic gating and reparameterization techniques, LUMOS dynamically selects informative features and prunes redundant parameters during training, reducing the reliance on manual tuning while maintaining predictive accuracy. We evaluate LUMOS across 13 diverse SciML workloads, including cosmology and molecular sciences, and demonstrate its effectiveness and generalizability. Experiments on 13 SciML models show that LUMOS achieves 71.45% parameter reduction and a 6.4x inference speedup on average. Furthermore, Distributed Data Parallel (DDP) training on up to eight GPUs confirms the scalability of
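The abstract names semi-stochastic gating and reparameterization but does not spell out the gate itself. A minimal sketch of one common choice, the hard-concrete gate used in L0-regularization work, is given below for orientation. The constants BETA, GAMMA, and ZETA and all function names are assumptions for illustration and are not taken from LUMOS.

```python
import math
import random

# Assumed hyperparameters from the common hard-concrete formulation,
# not from the LUMOS paper itself.
BETA = 2.0 / 3.0   # temperature of the relaxed Bernoulli
GAMMA = -0.1       # lower bound of the stretched interval
ZETA = 1.1         # upper bound of the stretched interval


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def sample_gate(log_alpha, rng):
    """Draw one stochastic gate in [0, 1] (training-time behaviour).

    The noise u is sampled independently of log_alpha, so the gate is a
    deterministic function of (log_alpha, u): the reparameterization trick.
    """
    u = rng.uniform(1e-6, 1.0 - 1e-6)
    s = sigmoid((math.log(u) - math.log(1.0 - u) + log_alpha) / BETA)
    stretched = s * (ZETA - GAMMA) + GAMMA
    # Clamping to [0, 1] is what lets a gate be exactly 0 or exactly 1.
    return min(1.0, max(0.0, stretched))


def deterministic_gate(log_alpha):
    """Gate value with the noise removed, as typically used at inference."""
    return min(1.0, max(0.0, sigmoid(log_alpha) * (ZETA - GAMMA) + GAMMA))


def expected_l0(log_alphas):
    """Differentiable surrogate for the number of non-zero gates."""
    shift = BETA * math.log(-GAMMA / ZETA)
    return sum(sigmoid(la - shift) for la in log_alphas)
```

In such a scheme, one gate would multiply each input feature or parameter group, and `expected_l0` would be added to the task loss with a weighting coefficient, so that gates whose `log_alpha` drifts strongly negative switch their feature or parameter off.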

Paper Structure (24 sections, 13 equations, 13 figures, 4 tables)

Figures (13)

  • Figure 1: (S1) Training from scratch: domain scientists must select physics-informative scientific input features and co-design the model architecture and parameters, often relying on costly grid search and ad hoc heuristics. (S2) Fine-tuning a foundation model: the pretrained backbone is fixed, and adaptation is achieved by re-mapping input features and adjusting subsets of parameters. Both workflows face common bottlenecks in feature efficiency and model/parameter efficiency, motivating the development of unified frameworks such as Lumos.
  • Figure 2: Overall workflow of Lumos.
  • Figure 3: Illustration of input features: the words in blue mark the input features in two examples, a climate application and a power grid simulation.
  • Figure 4: Integration with different model layers.
  • Figure 5: Structured transition between the CONV and FC layers. When a CONV layer is followed by an FC layer, the output of the CONV layer serves as the input to the FC layer; therefore, the FC inputs corresponding to deactivated CONV neurons must also be removed to maintain tensor alignment.
  • ...and 8 more figures
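
The CONV-to-FC alignment described in the Figure 5 caption can be sketched as an index computation. The sketch below assumes that the CONV output is flattened in channel-major (C, H, W) order before the FC layer and that the FC weight is stored as one row per output neuron; both conventions, and the function names, are illustrative assumptions rather than details confirmed by the paper.

```python
def kept_fc_columns(keep_channels, height, width):
    """Flattened FC input indices that survive channel pruning.

    Assumes channel-major (C, H, W) flattening, so channel c occupies the
    contiguous index range [c * H * W, (c + 1) * H * W).
    """
    per_channel = height * width
    cols = []
    for c in keep_channels:
        start = c * per_channel
        cols.extend(range(start, start + per_channel))
    return cols


def prune_fc_weight(fc_weight, keep_channels, height, width):
    """Drop FC input columns that belonged to deactivated CONV channels.

    fc_weight is a list of rows, one row per FC output neuron.
    """
    cols = kept_fc_columns(keep_channels, height, width)
    return [[row[j] for j in cols] for row in fc_weight]
```

For example, with three channels of size 2x2 and the middle channel deactivated, `kept_fc_columns([0, 2], 2, 2)` keeps the first four and last four flattened positions, and the FC weight shrinks from 12 input columns to 8.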