State Variation Mining: On Information Divergence with Message Importance in Big Data

Rui She, Shanyun Liu, Pingyi Fan

TL;DR

This work introduces the Message Importance Transfer Measure (MITM) to quantify information transfer in big data with a focus on rare events. MITM defines a transfer capacity $C = \sum_{\delta_0} p(\delta_0) \tilde{C}(\delta_0)$ with $\tilde{C}(\delta_0)= \max_{p(x)} \{ L(\tilde{Y}) - L(\tilde{Y}|X) \}$ under a Lipschitz constraint $|L(\tilde{Y})-L(\tilde{Y}|X)| \le \lambda \| p(\tilde{y})-p(\tilde{y}|x) \|_1$. The framework is extended to continuous distributions via $L(f(x))=\int f(x) e^{-f(x)} dx$ and $D_I(g||f)= \int [ g(x) e^{-g(x)} - f(x) e^{-f(x)} ] dx$, with a perturbation analysis showing $D_I(g_0||f_0) = O(\epsilon)$ for $g_0(x)= f_0(x) + \epsilon f_0^{\alpha}(x) u(x)$. Finally, the MITM is applied to Mobile Edge Computing using the $M/M/s/k$ queue to guide cache sizing, and simulations suggest MITM converges faster than KL divergence for state-variation assessment.
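The perturbation claim can be checked numerically. The following is a minimal sketch, not the authors' code: it assumes $f_0$ is a standard Gaussian density, $\alpha = 2$, and $u(x) = x^2 - 1/\alpha$ (chosen here so that $\int f_0^{\alpha}(x) u(x)\,dx = 0$ and $g_0$ still integrates to one). None of these choices come from the paper. It evaluates $D_I(g_0||f_0)$ with a trapezoidal rule and prints $D_I/\epsilon$, which should stay roughly constant as $\epsilon$ shrinks if $D_I = O(\epsilon)$.

```python
import math

ALPHA = 2.0  # assumed exponent alpha; not specified in the summary


def f0(x):
    """Assumed base density: standard Gaussian."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def u(x):
    """Assumed perturbation shape, chosen so that the integral of
    f0**ALPHA * u is zero and g0 keeps total mass one."""
    return x * x - 1.0 / ALPHA


def g0(x, eps):
    """Perturbed density g0 = f0 + eps * f0**alpha * u."""
    return f0(x) + eps * f0(x) ** ALPHA * u(x)


def h(t):
    """Integrand kernel t * exp(-t) from L(f) = integral of f * exp(-f)."""
    return t * math.exp(-t)


def divergence(eps, lo=-10.0, hi=10.0, n=4000):
    """Trapezoidal estimate of D_I(g0 || f0) on [lo, hi]."""
    step = (hi - lo) / n
    total = 0.0
    for i in range(n + 1):
        x = lo + i * step
        weight = 0.5 if i in (0, n) else 1.0
        total += weight * (h(g0(x, eps)) - h(f0(x)))
    return total * step


for eps in (0.1, 0.01, 0.001):
    d = divergence(eps)
    print(f"eps={eps:g}  D_I={d:.6e}  D_I/eps={d / eps:.6f}")
```

With an odd $u(x)$ and a symmetric $f_0$ the first-order term would vanish by symmetry, which is why an even $u(x)$ is used in this sketch.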

Abstract

Information transfer, which reveals the state variation of variables, usually plays a vital role in big data analytics and processing. Measures of information transfer, like KL divergence and Renyi divergence, reflect system change through the distributions of the variables. Furthermore, in big data, small probability events usually dominate the importance of the total message to some degree. It is therefore worthwhile to design an information transfer measure based on message importance that emphasizes small probability events. In this paper, we propose a message importance transfer measure (MITM) and investigate its characteristics and applications in three aspects. First, the message importance transfer capacity based on MITM is presented to offer an upper bound for the information transfer process with disturbance. Then, we extend the MITM to the continuous case and discuss its robustness when it is used to measure information distance. Finally, we use the MITM to guide queue length selection in the caching operation of mobile edge computing.

Paper Structure

This paper contains 10 sections, 4 theorems, 38 equations, 3 figures, 1 table.

Key Result

Proposition 1

Assume that there exists an information transfer process the same as that described in Eq. (eq.relation_1) and Eq. (eq.relation), where the disturbance $\delta$ follows a binary uniform distribution (namely $p(\delta) = (1/2, 1/2)$), and the information transfer matrix is which indicates that variables $X$ and $\tilde{Y}$ both obey binary distributions. In this case, the message importance tr

Figures (3)

  • Figure 1: Information transfer system model.
  • Figure 2: The performance of information measures for the state variation between the queue length $k$ and $k+1$ in the case of server number $s=1$.
  • Figure 3: The performance of information measures for the state variation between the queue length $k$ and $\infty$ in the case of server number $s=1$.
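The comparison described in the captions of Figures 2 and 3 can be sketched as follows for $s=1$. This is an illustrative sketch under stated assumptions, not the paper's simulation: it uses the standard stationary distribution of an M/M/1/k queue, $\pi_n \propto \rho^n$ for $n = 0, \dots, k$, and assumes a discrete analogue of $D_I$, namely $\sum_i [\, p_i e^{-p_i} - q_i e^{-q_i} \,]$, which does not appear in this summary. The function names and the traffic intensity `rho = 0.5` are hypothetical.

```python
import math


def mm1k_stationary(rho, k):
    """Stationary distribution of an M/M/1/k queue over states 0..k,
    where rho = arrival rate / service rate."""
    weights = [rho ** n for n in range(k + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def kl_divergence(p, q):
    """KL divergence D(p || q); terms with p_i = 0 contribute zero."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def importance_divergence(p, q):
    """ASSUMED discrete analogue of D_I(p || q):
    sum of p_i * exp(-p_i) minus sum of q_i * exp(-q_i)."""
    return sum(pi * math.exp(-pi) - qi * math.exp(-qi) for pi, qi in zip(p, q))


def compare(rho, k):
    """Measures between queue length k and k + 1.
    The length-k distribution is padded with a zero for state k + 1."""
    p = mm1k_stationary(rho, k) + [0.0]
    q = mm1k_stationary(rho, k + 1)
    return kl_divergence(p, q), importance_divergence(p, q)


rho = 0.5  # hypothetical traffic intensity
for k in range(1, 11):
    kl, d_i = compare(rho, k)
    print(f"k={k:2d}  KL={kl:.6e}  |D_I|={abs(d_i):.6e}")
```

Both quantities shrink as $k$ grows when $\rho < 1$, since adding one more buffer slot changes the stationary distribution less and less; the queue length at which the change falls below a chosen tolerance is one way to read off a cache size.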

Theorems & Definitions (9)

  • Definition 1
  • Definition 2
  • Definition 3
  • Proposition 1
  • proof
  • Corollary 1
  • proof
  • Proposition 2
  • Proposition 3