ChainStream: An LLM-based Framework for Unified Synthetic Sensing

Jiacheng Liu, Yuanchun Li, Liangyan Li, Yi Sun, Hao Wen, Xiangyu Li, Yao Guo, Yunxin Liu

TL;DR

This work tackles the difficulty of building and understanding context-sensing applications by introducing ChainStream, a unified stream-based framework that uses natural language (NL) as the interface to data access and processing. It combines a streamlined API and a runtime built around a Stream Flow Graph with an iterative, sandbox-assisted program generator that employs a feedback-guided query optimizer to translate NL requests into executable sensing programs. A novel NL-Sense benchmark with 133 tasks and 16 data sources enables end-to-end evaluation of generation quality, latency, and cost, showing that the feedback-enabled approach significantly improves performance over strong baselines. The framework promises more transparent data processing, easier app development, and reusable sensing pipelines for privacy-conscious, context-aware systems, with open-source code released for community use.

Abstract

Many applications demand context sensing to offer personalized and timely services. Yet developing sensing programs is challenging for developers, and using them raises privacy concerns for end-users. In this paper, we propose to use natural language as the unified interface to process personal data and sense user context, which can effectively ease app development and make the data pipeline more transparent. Our work is inspired by large language models (LLMs) and other generative models, but applying them directly does not solve the problem: letting the model process the data directly cannot handle complex sensing requests, and letting the model write the data processing program suffers from error-prone code generation. We address the problem with 1) a unified data processing framework that makes context-sensing programs simpler and 2) a feedback-guided query optimizer that makes data queries more informative. To evaluate the performance of natural language-based context sensing, we create a benchmark that contains 133 context-sensing tasks. Extensive evaluation shows that our approach can solve the context-sensing tasks automatically, efficiently, and precisely. The code is open-sourced at https://github.com/MobileLLM/ChainStream.

Paper Structure

This paper contains 27 sections, 2 equations, 8 figures, 6 tables.

Figures (8)

  • Figure 1: Our basic idea: Enabling natural language-defined context sensing by reducing the gap between the query and the program.
  • Figure 2: The architecture of ChainStream programming framework.
  • Figure 3: The workflow of Iterative Program Generator in ChainStream.
  • Figure 4: An illustration of the augmented context-sensing query. 'Initial stream query' is the original natural language-based sensing query formatted as an expected stream description. The augmented query contains a base prompt and multiple historical sandbox feedbacks.
  • Figure 5: The distribution of task types and data types in our benchmark.
  • ...and 3 more figures
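
The captions of Figures 3 and 4 describe an iterative generator whose query is augmented with a base prompt and historical sandbox feedback. The sketch below illustrates that loop in Python under stated assumptions: it does not use the ChainStream API, every name in it (`AugmentedQuery`, `run_in_sandbox`, `generate_with_feedback`, `stub_generator`) is hypothetical, the LLM is replaced by a stub function, and a plain `exec` call stands in for a real sandbox, which would need proper isolation.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple


@dataclass
class AugmentedQuery:
    """Hypothetical container: base prompt, initial stream query, feedback history."""
    base_prompt: str
    stream_query: str
    feedbacks: List[str] = field(default_factory=list)

    def render(self) -> str:
        # Concatenate the base prompt, the original query, and every past feedback.
        parts = [self.base_prompt, "Initial stream query: " + self.stream_query]
        for i, fb in enumerate(self.feedbacks, start=1):
            parts.append(f"Sandbox feedback {i}: {fb}")
        return "\n".join(parts)


def run_in_sandbox(program: str) -> str:
    """Run a candidate program on one sample item.

    Returns an empty string on success, or an error description on failure.
    NOTE: exec() is only a stand-in here; it provides no real isolation.
    """
    namespace: dict = {}
    try:
        exec(program, namespace)
        namespace["process"]({"value": 1})
    except Exception as exc:
        return f"{type(exc).__name__}: {exc}"
    return ""


def generate_with_feedback(
    query: AugmentedQuery,
    generator: Callable[[str], str],
    max_rounds: int = 3,
) -> Tuple[str, int]:
    """Regenerate the program until the sandbox reports no error."""
    for round_no in range(1, max_rounds + 1):
        program = generator(query.render())
        feedback = run_in_sandbox(program)
        if not feedback:
            return program, round_no
        # Feed the failure back into the next prompt.
        query.feedbacks.append(feedback)
    raise RuntimeError("no working program within the round budget")


def stub_generator(prompt: str) -> str:
    """Stand-in for an LLM: emits a buggy program until it sees feedback."""
    if "Sandbox feedback" in prompt:
        return "def process(item):\n    return item['value'] * 2\n"
    return "def process(item):\n    return item['missing'] * 2\n"
```

Calling `generate_with_feedback(AugmentedQuery("You write stream programs.", "double each value"), stub_generator)` fails once in the stand-in sandbox, appends the error to the query, and succeeds on the second round. Keeping the feedback inside the query object, rather than in the generator, means the generator stays stateless and the full history is visible in each rendered prompt.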