A Mathematical Model of the Hidden Feedback Loop Effect in Machine Learning Systems
Andrey Veprikov, Alexander Afanasiev, Anton Khritankov
TL;DR
This work models the long-term effects of repeated machine learning as a dynamical system on probability densities, where each step f_{t+1} = D_t(f_t) captures data sampling, learning, and prediction feedback. The authors prove sufficient conditions for D_t to map PDFs to PDFs and establish a dichotomy in the limit behavior: weak convergence either to a delta function (positive feedback, accurate residuals) or to a zero distribution (error amplification). They derive an autonomy criterion for when the evolution is time-invariant, provide concrete examples of D_t, and run a suite of experiments on synthetic data that validate the theoretical predictions, including moment decay and normality breakdown of prediction errors. The results offer a principled basis for diagnosing hidden feedback loops and for guiding design choices that mitigate bias amplification and trustworthiness violations in societal-scale ML systems.
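The dichotomy described above can be illustrated with a minimal sketch. It assumes one particular, hypothetical choice of the map, (D f)(x) = psi * f(psi * x) with a constant psi > 0, which is not necessarily one of the authors' examples. The change of variables u = psi * x shows that this map preserves total mass, so it sends PDFs to PDFs. The names `scale_step`, `iterate`, and `mass_in_interval` are illustrative only.

```python
import numpy as np

def standard_normal(x):
    """Density of N(0, 1), used here as the initial density f_0."""
    return np.exp(-0.5 * x**2) / np.sqrt(2.0 * np.pi)

def scale_step(f, psi):
    """Hypothetical step operator: (D f)(x) = psi * f(psi * x)."""
    return lambda x: psi * f(psi * x)

def iterate(f0, psi, steps):
    """Apply the same step operator `steps` times, starting from f0."""
    f = f0
    for _ in range(steps):
        f = scale_step(f, psi)
    return f

def mass_in_interval(f, radius, n=20001):
    """Trapezoidal estimate of the integral of f over [-radius, radius]."""
    x = np.linspace(-radius, radius, n)
    y = f(x)
    dx = x[1] - x[0]
    return float(np.sum(0.5 * (y[1:] + y[:-1])) * dx)

# psi > 1 squeezes the density toward zero (delta-like limit).
contracting = iterate(standard_normal, psi=1.5, steps=10)
# psi < 1 spreads the density out, so the mass on any fixed
# bounded interval shrinks toward zero.
spreading = iterate(standard_normal, psi=0.5, steps=10)

print("mass in [-0.1, 0.1], psi=1.5:", mass_in_interval(contracting, 0.1))
print("mass in [-0.1, 0.1], psi=0.5:", mass_in_interval(spreading, 0.1))
```

Under this assumed map, the first value approaches one and the second approaches zero as the number of steps grows. These are the two limit regimes in miniature.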
Abstract
Widespread deployment of societal-scale machine learning systems necessitates a thorough understanding of the long-term effects these systems have on their environment, including loss of trustworthiness, bias amplification, and violation of AI safety requirements. We introduce a repeated learning process to jointly describe several phenomena attributed to unintended hidden feedback loops, such as error amplification, induced concept drift, echo chambers, and others. The process comprises the entire cycle of obtaining the data, training the predictive model, and delivering predictions to end-users within a single mathematical model. A distinctive feature of such a repeated learning setting is that the state of the environment becomes causally dependent on the learner itself over time, thus violating the usual assumptions about the data distribution. We present a novel dynamical systems model of the repeated learning process and derive the limiting set of probability distributions for the positive and negative feedback loop modes of system operation. We conduct a series of computational experiments using an exemplary supervised learning problem on two synthetic data sets. The experimental results agree with the theoretical predictions derived from the dynamical model. Our results demonstrate the feasibility of the proposed approach for studying repeated learning processes in machine learning systems and open a range of opportunities for further research in the area.
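The cycle of obtaining data, training a model, and delivering predictions can be sketched as a toy simulation. This is only an illustration under stated assumptions, not the authors' experimental setup. The parameters `usage` (the fraction of new labels influenced by the model) and `adherence` (how tightly influenced labels follow the prediction), the sliding-window update, and the linear model are all hypothetical choices made for this example.

```python
import numpy as np

def simulate_feedback_loop(usage=1.0, adherence=0.5, rounds=30,
                           window=200, batch=50, noise=1.0, seed=0):
    """Toy repeated-learning loop; returns the residual std per round."""
    rng = np.random.default_rng(seed)
    w_true = np.array([2.0, -1.0])            # assumed ground-truth weights
    X = rng.normal(size=(window, 2))
    y = X @ w_true + noise * rng.normal(size=window)
    history = []
    for _ in range(rounds):
        # 1. Train: ordinary least squares on the current window.
        w_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
        resid_std = float(np.std(y - X @ w_hat))
        history.append(resid_std)
        # 2. Predict on a fresh batch of inputs.
        X_new = rng.normal(size=(batch, 2))
        y_pred = X_new @ w_hat
        # 3. Feedback: labels that follow the model are drawn around the
        #    prediction; the rest come from the unaffected environment.
        y_env = X_new @ w_true + noise * rng.normal(size=batch)
        y_fb = y_pred + adherence * resid_std * rng.normal(size=batch)
        followed = rng.random(batch) < usage
        y_new = np.where(followed, y_fb, y_env)
        # 4. Slide the window: drop the oldest batch, append the new one.
        X = np.vstack([X[batch:], X_new])
        y = np.concatenate([y[batch:], y_new])
    return history

with_loop = simulate_feedback_loop(usage=1.0, adherence=0.5)
no_loop = simulate_feedback_loop(usage=0.0)
print("residual std, first vs last round (loop):   ", with_loop[0], with_loop[-1])
print("residual std, first vs last round (no loop):", no_loop[0], no_loop[-1])
```

In this sketch, the residual spread collapses toward zero once the training data depend on the model's own predictions, while it stays near the noise level when `usage` is zero. This mirrors the causal dependence of the environment on the learner that the abstract describes.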
