Primary and Secondary Factor Consistency as Domain Knowledge to Guide Happiness Computing in Online Assessment

Xiaohua Wu, Lin Li, Xiaohui Tao, Frank Xing, Jingling Yuan

TL;DR

This article proves that multiple prediction models with additive factor attributions have the desirable property of primary and secondary relations consistency, and shows that quantified factor relations can be represented as an importance distribution for encoding domain knowledge.

Abstract

Happiness computing based on large-scale online web data and machine learning methods is an emerging research topic that underpins a range of issues, from personal growth to social stability. Many advanced Machine Learning (ML) models with explanations are used to compute happiness in online assessment while maintaining high accuracy. However, domain knowledge constraints, such as the primary and secondary relations of happiness factors, are absent from these models, which limits the association between computed results and the right reasons for why they occurred. This article attempts to provide new insights into explanation consistency from an empirical study perspective. We then study how to represent and introduce domain knowledge constraints to make ML models more trustworthy. We achieve this by (1) proving that multiple prediction models with additive factor attributions have the desirable property of primary and secondary relations consistency, and (2) showing that quantified factor relations can be represented as an importance distribution for encoding domain knowledge. Differences in factor explanations among computing models are penalized by a Kullback-Leibler divergence-based loss. Experimental results on two online web datasets show that domain knowledge of stable factor relations exists. Using this knowledge not only improves happiness computing accuracy but also reveals more significant happiness factors to assist decision making.
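
The abstract describes representing factor relations as an importance distribution and penalizing explanation differences with a Kullback-Leibler divergence-based loss. The exact formulation is not given on this page, so the following is only a minimal sketch under stated assumptions: attributions are normalized by absolute value, and the function names (`importance_distribution`, `kl_divergence`) and the example numbers are hypothetical illustrations, not the authors' implementation.

```python
import math


def importance_distribution(attributions):
    """Normalize absolute factor attributions so they sum to 1 (assumed scheme)."""
    magnitudes = [abs(a) for a in attributions]
    total = sum(magnitudes)
    if total == 0:
        raise ValueError("at least one attribution must be non-zero")
    return [m / total for m in magnitudes]


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions over the same factors."""
    if len(p) != len(q):
        raise ValueError("distributions must cover the same factors")
    # eps guards against division by zero when q assigns no mass to a factor.
    return sum(
        pi * math.log((pi + eps) / (qi + eps))
        for pi, qi in zip(p, q)
        if pi > 0
    )


# Hypothetical attributions from two models over the same three factors.
model_a = importance_distribution([0.6, -0.3, 0.1])
model_b = importance_distribution([0.5, 0.3, 0.2])

# A larger value means the two models disagree more about factor importance.
penalty = kl_divergence(model_a, model_b)
```

In a training setup, a term like `penalty` could be added to the prediction loss so that models are discouraged from drifting away from a shared or prior importance distribution; how the paper weights or combines this term is not stated here.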

Paper Structure (36 sections, 10 equations, 8 figures, 4 tables)

This paper contains 36 sections, 10 equations, 8 figures, 4 tables.

Figures (8)

  • Figure 1: Illustration of our research motivation. Various online web data are collected from users and fed to the happiness computing models, and the factor explanation is generated by an explanation method.
  • Figure 2: The primary factors and secondary factors for happiness level.
  • Figure 3: The Macro_F1 and Micro_F1 results of the young and elder groups on the CGSS and ESS datasets.
  • Figure 4: The comparison of explanation consistency (Kendall's tau coefficient) based on the factor distribution on young and elder groups of CGSS and ESS datasets.
  • Figure 5: The comparison of prediction accuracy stability among multiple models based on four groups of CGSS and ESS datasets. More details of these four groups on ESS can be found in the appendix on accuracy stability.
  • ...and 3 more figures

Theorems & Definitions (1)

  • Definition 1