Information-Seeking Decision Strategies Mitigate Risk in Dynamic, Uncertain Environments
Nicholas W. Barendregt, Joshua I. Gold, Krešimir Josić, Zachary P. Kilpatrick
TL;DR
The paper asks how agents should balance gathering information against pursuing reward directly in dynamic, uncertain environments. It develops a Bayesian, sequential two-alternative foraging framework and uses dynamic programming to compare reward-maximizing (rewardmax) and information-maximizing (infomax) strategies. Key parameters include the environmental change probability $\epsilon$, the reward reliability $q$, and the degree of future discounting $\gamma$. Rewardmax attains slightly higher average reward, whereas infomax yields more consistent and predictable rewards, and both strategies show similar transitions between exploration and exploitation as environmental uncertainty changes. These findings support the adaptive value of information-seeking behavior in naturalistic settings, where it can mitigate risk with minimal reward loss, and they connect to broader work on POMDPs and future reward discounting.
Abstract
To survive in dynamic and uncertain environments, individuals must develop effective decision strategies that balance information gathering and decision commitment. Models of such strategies often prioritize either optimizing tangible payoffs, like reward rate, or gathering information to support a diversity of (possibly unknown) objectives. However, our understanding of the relative merits of these two approaches remains incomplete, in part because direct comparisons have been limited to idealized, static environments that lack the dynamic complexity of the real world. Here we compared the performance of normative reward- and information-seeking strategies in a dynamic foraging task. Both strategies show similar transitions between exploratory and exploitative behaviors as environmental uncertainty changes. However, we find subtle disparities in the actions they take, resulting in meaningful performance differences: whereas reward-seeking strategies generate slightly more reward on average, information-seeking strategies provide more consistent and predictable outcomes. Our findings support the adaptive value of information-seeking behaviors that can mitigate risk with minimal reward loss.
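To make the roles of the change probability $\epsilon$ and the reward reliability $q$ from the TL;DR concrete, the following is a minimal sketch of a generic two-state Bayesian belief update. It is not the authors' implementation. It assumes that exactly one of two sites is the high-reward site at any time, that this site switches with probability `eps` between steps, and that foraging at site A yields a reward with probability `q` when A is the high-reward site and `1 - q` otherwise. The function name `update_belief` and its arguments are hypothetical.

```python
def update_belief(belief, rewarded, eps, q):
    """One step of a two-state Bayesian filter (illustrative assumptions only).

    belief:   P(site A is the high-reward site) before this step
    rewarded: True if foraging at site A returned a reward
    eps:      probability that the high-reward site switches between steps
    q:        reward probability at the high-reward site (assumed 1 - q elsewhere)
    """
    # Prediction step: the environment may have switched since the last observation.
    prior = (1 - eps) * belief + eps * (1 - belief)

    # Likelihood of the observation under each hypothesis about site A.
    like_if_a_high = q if rewarded else 1 - q
    like_if_a_low = 1 - q if rewarded else q

    # Bayes' rule: normalize over the two hypotheses.
    evidence = like_if_a_high * prior + like_if_a_low * (1 - prior)
    return like_if_a_high * prior / evidence


if __name__ == "__main__":
    # Starting from an uninformed belief, a single reward shifts belief toward site A.
    print(update_belief(0.5, True, eps=0.1, q=0.8))
```

Under these assumptions, a larger `eps` pulls the belief back toward 0.5 before each observation, and a `q` closer to 0.5 makes each observation less informative. A reward-seeking or information-seeking strategy would then choose actions as a function of this belief.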
