Human Guided Learning of Transparent Regression Models
Lukas Pensel, Stefan Kramer
TL;DR
This work introduces HuGuR, a human-in-the-loop framework for permutation regression that builds interpretable models from binary, human-understandable order constraints. The model is a gradient-boosted regressor over a constraint-derived feature space, ŷ(x) = μ + ∑_{i=1}^{l} β_i g_{ρ_i}(x), where μ is a constant offset, g_{ρ_i}(x) is the binary feature indicating whether the ordering x satisfies order constraint ρ_i, and β_i is the coefficient of that feature. Greedy gradient boosting iteratively selects informative constraints. A user study across nine real-world datasets shows that HuGuR often outperforms naive and fixed-encoding baselines on small data and remains competitive with neural sequence encoders on larger data, while using far fewer parameters. The results support the value of interactive, constraint-guided modeling for transparency and performance in permutation-based tasks; future work will extend to trust studies and broader pattern domains.
Abstract
We present a human-in-the-loop (HIL) approach to permutation regression, the novel task of predicting a continuous value for a given ordering of items. The model is a gradient-boosted regression model that incorporates simple, human-understandable constraints of the form x < y (item x must come before item y) as binary features. The approach, HuGuR (Human Guided Regression), lets a human explore the search space of such transparent regression models: users can add, remove, and refine order constraints interactively, while the coefficients are calculated on the fly. We evaluate HuGuR in a user study and compare the performance of user-built models with multiple baselines on nine datasets. The results show that the user-built models outperform the compared methods on small datasets and generally perform on par with them, while remaining understandable to humans in principle. On larger datasets from the same domain, machine-induced models begin to outperform the user-built models. Future work will study how much users trust models they have constructed themselves and how the scheme can be transferred to other pattern domains, such as strings, sequences, trees, or graphs.
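To make the idea of order constraints as binary features concrete, the following is a minimal Python sketch, not the HuGuR implementation. It assumes a squared-error loss and a greedy stagewise fit in which each round adds the single constraint that most reduces the residual error. The function names (`g`, `predict`, `greedy_fit`), the toy data, and the selection rule are all hypothetical choices made for illustration.

```python
from itertools import permutations


def g(perm, constraint):
    """Binary feature: 1 if the first item of `constraint` precedes the second in `perm`."""
    first, second = constraint
    return 1 if perm.index(first) < perm.index(second) else 0


def predict(perm, mu, terms):
    """Additive model: mu plus the sum of beta * g(perm, rho) over the selected constraints."""
    return mu + sum(beta * g(perm, rho) for rho, beta in terms)


def greedy_fit(perms, ys, candidates, rounds):
    """Greedy stagewise fit on squared-error residuals (an assumed stand-in for boosting)."""
    mu = sum(ys) / len(ys)  # start from the mean target value
    terms = []
    for _ in range(rounds):
        residuals = [y - predict(p, mu, terms) for p, y in zip(perms, ys)]
        best = None
        for rho in candidates:
            feats = [g(p, rho) for p in perms]
            n_active = sum(feats)
            if n_active == 0:
                continue  # constraint never satisfied, so it carries no signal
            # Least-squares coefficient for a single binary feature.
            beta = sum(r * f for r, f in zip(residuals, feats)) / n_active
            gain = beta * beta * n_active  # reduction in the sum of squared residuals
            if best is None or gain > best[0]:
                best = (gain, rho, beta)
        if best is None or best[0] < 1e-12:
            break  # no remaining constraint improves the fit
        terms.append((best[1], best[2]))
    return mu, terms


# Toy data (hypothetical): the target depends only on whether "a" comes before "b".
perms = list(permutations("abc"))
ys = [3.0 if g(p, ("a", "b")) else 1.0 for p in perms]
candidates = list(permutations("abc", 2))  # every ordered pair is a candidate constraint

mu, terms = greedy_fit(perms, ys, candidates, rounds=2)
print(mu, terms)
```

In an interactive setting, the automatic selection loop would be replaced or complemented by the user's choice of constraints, with only the coefficients refit after each change. The resulting model stays readable because each term states one ordering rule and its effect on the prediction.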
