DAG Learning from Zero-Inflated Count Data Using Continuous Optimization
Noriaki Sato, Marco Scutari, Shuichi Kawano, Rui Yamaguchi, Seiya Imoto
TL;DR
ZICO is a scalable, continuous-optimization framework for learning DAGs from zero-inflated count data. It models each node with a zero-inflated negative binomial (ZINB) or zero-inflated Poisson (ZIP) distribution conditional on its parents and enforces acyclicity via differentiable log-determinant surrogates. The method jointly learns two weight matrices, one for the zero-inflation component and one for the mean component, applies sparsity and alignment regularization, and is optimized with mini-batch AdamW. On simulated data and SCT-like GRN scenarios, ZICO achieves competitive or superior structure recovery and faster runtimes than existing approaches, particularly as the number of variables grows. The framework also accommodates NB and Poisson variants and highlights the benefit of explicitly modeling zero inflation when inferring gene regulatory networks from single-cell data. The paper also notes limitations and directions for future work on feedback motifs and identifiability.
Abstract
We address network structure learning from zero-inflated count data by casting each node as a zero-inflated generalized linear model and optimizing a smooth, score-based objective under a directed acyclic graph constraint. Our Zero-Inflated Continuous Optimization (ZICO) approach uses node-wise likelihoods with canonical links and enforces acyclicity through a differentiable surrogate constraint combined with sparsity regularization. ZICO achieves superior performance with faster runtimes on simulated data. It also performs comparably to or better than common algorithms for reverse engineering gene regulatory networks. ZICO is fully vectorized and mini-batched, enabling learning on larger variable sets with practical runtimes in a wide range of domains.
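To make the two ingredients named above concrete, the following is a minimal NumPy sketch, not the authors' implementation. It assumes a zero-inflated Poisson node model with a logit link for the zero-inflation probability and a log link for the mean, and a DAGMA-style log-determinant function as the acyclicity surrogate. The function names (`logdet_acyclicity`, `zip_nll`), the parameter names (`W_zero`, `W_mean`, `b_zero`, `b_mean`, `s`), and the convention that column j of a weight matrix holds the coefficients of node j's parents are all illustrative assumptions.

```python
import math
import numpy as np


def logdet_acyclicity(W, s=1.0):
    """DAGMA-style surrogate h(W) = -log det(s*I - W∘W) + d*log(s).

    h(W) is 0 when W is the weighted adjacency matrix of a DAG and
    positive when W contains a cycle (within the surrogate's domain).
    """
    d = W.shape[0]
    M = s * np.eye(d) - W * W  # W * W is the elementwise (Hadamard) square
    sign, logabsdet = np.linalg.slogdet(M)
    if sign <= 0:
        return np.inf  # outside the region where the surrogate is defined
    return -logabsdet + d * np.log(s)


# log(k!) computed elementwise via the log-gamma function
_log_factorial = np.vectorize(lambda k: math.lgamma(k + 1.0))


def zip_nll(X, W_zero, W_mean, b_zero, b_mean):
    """Mean negative log-likelihood of node-wise zero-inflated Poisson models.

    X is an (n, d) array of counts. Node j has zero-inflation logit
    X @ W_zero[:, j] + b_zero[j] and log-mean X @ W_mean[:, j] + b_mean[j].
    """
    logits = X @ W_zero + b_zero
    log_mu = X @ W_mean + b_mean
    mu = np.exp(log_mu)
    log_pi = -np.logaddexp(0.0, -logits)      # log(sigmoid(logits))
    log_1m_pi = -np.logaddexp(0.0, logits)    # log(1 - sigmoid(logits))
    # Zero counts: log(pi + (1 - pi) * exp(-mu)), computed stably.
    ll_zero = np.logaddexp(log_pi, log_1m_pi - mu)
    # Positive counts: log(1 - pi) + Poisson log-pmf.
    ll_pos = log_1m_pi + X * log_mu - mu - _log_factorial(X)
    return -np.where(X == 0, ll_zero, ll_pos).mean()


# Usage: a 3-node chain is acyclic, a 2-node feedback loop is not.
W_chain = np.array([[0.0, 0.8, 0.0],
                    [0.0, 0.0, 0.5],
                    [0.0, 0.0, 0.0]])
W_loop = np.array([[0.0, 0.5],
                   [0.5, 0.0]])
h_chain = logdet_acyclicity(W_chain)
h_loop = logdet_acyclicity(W_loop)

X_demo = np.array([[0.0, 1.0]])
nll_demo = zip_nll(X_demo, np.zeros((2, 2)), np.zeros((2, 2)),
                   np.zeros(2), np.zeros(2))
```

In a full continuous-optimization scheme, a likelihood of this kind would be combined with a sparsity penalty and the acyclicity term, then minimized by a gradient-based optimizer. How ZICO combines its two weight matrices inside the acyclicity constraint is not specified here, so the sketch applies the surrogate to a single matrix.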
