Random Forest-of-Thoughts: Uncertainty-aware Reasoning for Computational Social Science

Xiaohua Wu, Xiaohui Tao, Wenjie Wu, Yuefeng Li, Lin Li

TL;DR

This paper introduces Random Forest of Thoughts (RFoT), an uncertainty-aware prompting framework for computational social science. By combining iterative chain-of-thought generation, linguistic cues, and Shapley-value-evaluated thought subsets within a bootstrap ensemble, RFoT probes diverse reasoning paths to improve mental-state prediction from questionnaire data. Empirical results on the CGSS and ESS happiness datasets show RFoT outperforming IO prompting, CoT, SC-CoT, and ToT across two open LLMs, with the added benefit of explainability through thought-level attributions. The approach advances practical social survey analysis by enabling richer exploration of domain theories and of the uncertainty in respondents’ answers, at the cost of higher computation. This trade-off suggests RFoT’s potential for more robust, interpretable decision support in large-scale social surveys and EMA-inspired analyses.

Abstract

Social surveys in computational social science are designed around elaborate domain theories so that they reflect interviewees' deeper thoughts without concealing their true feelings. The candidate options for each question depend heavily on the interviewee's previous answers, which makes social survey analysis complex, time-consuming, and dependent on expertise. Prompting methods such as Chain-of-Thought (CoT) enhance the ability of large language models (LLMs) to perform complex reasoning, but inference remains confined to left-to-right decision-making or to a limited number of paths. As a result, LLMs can fall short on problems that require exploration and search under uncertainty. In response, we propose Random Forest of Thoughts (RFoT), a novel LLM prompting method for uncertainty-aware reasoning in computational social science. RFoT allows LLMs to perform deliberate decision-making by generating a diverse thought space and randomly selecting sub-thoughts to build a forest of thoughts. This broader search space over responses can extend exploration and improve overall prediction performance. We apply the method to two datasets covering a spectrum of social survey analysis problems. Our experiments show that RFoT significantly enhances language models' abilities on two novel social survey analysis problems requiring non-trivial reasoning.
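
The idea of randomly selecting sub-thoughts to build a forest can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the `(+)`/`(-)` tagging of thoughts, the majority vote, and the `toy_predict` stand-in for an LLM call are all assumptions made for illustration. It also samples each subset without replacement, which is a simplification.

```python
import random
from collections import Counter
from typing import Callable, List, Sequence


def random_forest_of_thoughts(
    thoughts: Sequence[str],
    predict: Callable[[List[str]], str],
    n_trees: int = 5,
    subset_size: int = 2,
    seed: int = 0,
) -> str:
    """Majority vote over predictions made from random subsets of thoughts.

    Hypothetical helper: each "tree" sees a random subset of the generated
    thoughts, and the final answer is the most common prediction.
    """
    rng = random.Random(seed)
    pool = list(thoughts)
    k = min(subset_size, len(pool))
    votes = []
    for _ in range(n_trees):
        subset = rng.sample(pool, k)  # random sub-thoughts for this tree
        votes.append(predict(subset))
    return Counter(votes).most_common(1)[0][0]


def toy_predict(subset: List[str]) -> str:
    """Assumed stand-in for an LLM call: counts thoughts tagged '(+)'."""
    positive = sum(t.endswith("(+)") for t in subset)
    return "happy" if positive * 2 >= len(subset) else "unhappy"


if __name__ == "__main__":
    example_thoughts = [
        "Reports good health (+)",
        "Reports financial stress (-)",
        "Trusts neighbours (+)",
        "Feels socially isolated (-)",
    ]
    print(random_forest_of_thoughts(example_thoughts, toy_predict))
```

In a real pipeline, `predict` would wrap a prompt to the language model, and the thoughts would come from the model's own reasoning steps rather than from hand-written strings.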

Paper Structure

This paper contains 28 sections, 9 equations, 4 figures, 2 tables, 1 algorithm.

Figures (4)

  • Figure 1: The questionnaire analysis with uncertainty question-answer pairs.
  • Figure 2: The framework of our proposed RFoT. Text input is decomposed into intermediate steps that are reconstructed as a prompt step $N$ boxed by a blue line. Each prompt step is an Iterative Chain of Thought (ICoT) as shown on the right.
  • Figure 3: The thoughts generation from an aspect by the proposed ICoT.
  • Figure 4: The case study on happiness prediction.