A Tutorial on Dimensionless Learning: Geometric Interpretation and the Effect of Noise
Zhengtao Jake Gan, Xiaoyu Xie
TL;DR
This work addresses the challenge of automatically discovering dimensionless numbers and scaling laws from experimental data by combining Buckingham's π theorem with a geometric, data-driven approach. It introduces a five-module pipeline that computes a null-space basis of the dimension matrix, reduces dimensionality with PCA or SIR, and discovers dimensionless groups via learnable coefficients in a neural-network framework, aided by a quantization regularizer that encourages simple, interpretable coefficients. The approach is validated on synthetic cases, demonstrating robustness to noise and discrete sampling, and is extended to multiple dominant dimensionless numbers, where the learned representations form low-dimensional manifolds and subspaces of equivalent forms. An open-source Streamlit-based interface (PyDimension) is provided to make dimensionless learning accessible to experimentalists, and the work discusses current limitations and directions for improving input selection, scalability, and user accessibility.
Abstract
Dimensionless learning is a data-driven framework for discovering dimensionless numbers and scaling laws from experimental measurements. This tutorial introduces the method, explaining how it transforms experimental data into compact physical laws that reveal the dimensional invariance among variables. The approach combines classical dimensional analysis with modern machine learning techniques. Starting from measurements of physical quantities, the method identifies the fundamental ways to combine variables into dimensionless groups, then uses neural networks to discover which combinations best predict the experimental output. A key innovation is a regularization technique that encourages the learned coefficients to take simple, interpretable values such as integers or half-integers, making the discovered laws both accurate and physically meaningful. We systematically investigate how measurement noise and discrete sampling affect the discovery process, demonstrating that the regularization approach provides robustness to experimental uncertainties. The method successfully handles cases with single or multiple dimensionless numbers, revealing how different but equivalent representations can capture the same underlying physics. Despite recent progress, key challenges remain, including managing the computational cost of identifying multiple dimensionless groups, understanding the influence of data characteristics, automating the selection of relevant input variables, and developing user-friendly tools for experimentalists. This tutorial serves as both an educational resource and a practical guide for researchers seeking to apply dimensionless learning to their experimental data.
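To make the first step concrete, the following minimal sketch shows how the "fundamental ways to combine variables into dimensionless groups" can be obtained as a null-space basis of a dimension matrix. The pipe-flow variables (rho, U, D, mu), the helper names null_space_basis and half_integer_penalty, and the specific form of the penalty are illustrative assumptions, not taken from the paper or from PyDimension; in particular, the abstract does not state the form of the authors' regularizer, so the penalty here is only one plausible way to favor integer or half-integer exponents.

```python
import numpy as np


def null_space_basis(dim_matrix, tol=1e-10):
    """Return a matrix whose columns span the null space of dim_matrix.

    Hypothetical helper: uses the SVD and keeps the right singular
    vectors associated with (numerically) zero singular values.
    """
    dim_matrix = np.asarray(dim_matrix, dtype=float)
    _, s, vt = np.linalg.svd(dim_matrix)
    rank = int(np.sum(s > tol * s.max()))
    return vt[rank:].T


def half_integer_penalty(w):
    """Squared distance of each exponent to the nearest multiple of 0.5.

    Assumed form for illustration only; the actual regularizer used by
    the authors may differ.
    """
    w = np.asarray(w, dtype=float)
    nearest = np.round(2.0 * w) / 2.0
    return float(np.sum((w - nearest) ** 2))


# Illustrative example (an assumption, not from the source): pipe flow.
# Columns: density rho, velocity U, diameter D, viscosity mu.
# Rows: exponents of mass M, length L, time T in each variable's units.
variables = ["rho", "U", "D", "mu"]
dim_matrix = np.array([
    [1, 0, 0, 1],     # M
    [-3, 1, 1, -1],   # L
    [0, -1, 0, -1],   # T
])

basis = null_space_basis(dim_matrix)

# Four variables minus rank three leaves one independent dimensionless group.
# Rescale so the exponent of rho is 1, which makes the group easy to read.
w = basis[:, 0] / basis[0, 0]

print(dict(zip(variables, np.round(w, 6))))
print("dimension check:", dim_matrix @ w)
print("penalty:", half_integer_penalty(w))
```

In this example the single basis vector corresponds to the group rho * U * D / mu, the Reynolds number, and its exponents are already integers, so the assumed penalty is essentially zero. With several basis vectors, any linear combination of them is also dimensionless, which is why learnable coefficients and a preference for simple values are needed to select an interpretable form.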
