Learning Service Selection Decision Making Behaviors During Scientific Workflow Development

Xihao Xie, Jia Zhang, Rahul Ramachandran, Tsengdar J. Lee, Seungwon Lee

TL;DR

This work tackles the challenge of recommending next services during scientific workflow development by framing composition as a goal-oriented, context-aware sequence prediction problem grounded in workflow provenance. It builds a knowledge graph from historical workflows, generates sequential composition paths via intra- and inter-workflow strategies, and learns service representations with a goal-oriented gLSTM plus attention to capture path-level and workflow-level context. The offline model is then deployed online to predict top-$K$ candidate next services, achieving superior Recall@K, MRR, and Diversity@K compared with baselines on a large myExperiment-derived dataset. The approach demonstrates that incorporating provenance-derived context and goal signals yields accurate and diverse guidance for incremental workflow composition, with potential extensions to personalization and hierarchical graph integration for further improvements.

Abstract

As more software services are published on the Internet, recommending suitable services during scientific workflow composition has become a significant challenge. This paper proposes a novel context-aware approach for recommending next services in a workflow development process by learning service representations and service selection decision-making behaviors from workflow provenance. Inspired by natural language sentence generation, the composition process of a scientific workflow is formalized as a step-wise procedure within the context of the workflow's goal, and the problem of next service recommendation is mapped to next word prediction. Historical service dependencies are first extracted from scientific workflow provenance to build a knowledge graph. Service sequences are then generated using diverse composition path generation strategies. The generated corpus of composition paths is then leveraged to learn previous decision-making strategies. The resulting goal-oriented next service prediction model is used to recommend the top K candidate services during workflow composition. Extensive experiments on a real-world repository demonstrate the effectiveness of this approach.
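
To make the pipeline in the abstract concrete, here is a minimal sketch of its three stages: building a service graph from provenance, generating composition paths, and ranking candidate next services. Everything in it is an assumption for illustration: the toy workflows, the service names, and the functions `build_graph`, `composition_paths`, and `recommend_next` are hypothetical. In particular, the ranking step is a plain successor-frequency count standing in for the paper's goal-oriented gLSTM with attention, and it ignores the workflow goal entirely.

```python
from collections import Counter, defaultdict

# Hypothetical toy provenance: workflow id -> (upstream, downstream) service pairs.
WORKFLOWS = {
    "wf1": [("fetch", "parse"), ("parse", "align"), ("align", "plot")],
    "wf2": [("fetch", "parse"), ("parse", "filter"), ("filter", "plot")],
    "wf3": [("fetch", "parse"), ("parse", "align"), ("align", "export")],
}


def build_graph(workflows):
    """Directed service graph; edge weight = number of workflows using that dependency."""
    graph = defaultdict(Counter)
    for edges in workflows.values():
        for src, dst in edges:
            graph[src][dst] += 1
    return graph


def composition_paths(graph, max_len=4):
    """Enumerate service sequences starting at services with no upstream dependency."""
    downstream = {dst for nexts in graph.values() for dst in nexts}
    sources = sorted(s for s in graph if s not in downstream)
    paths = []

    def extend(path):
        nexts = graph.get(path[-1])
        # Stop at a sink service or once the path reaches max_len services.
        if not nexts or len(path) == max_len:
            paths.append(path)
            return
        for nxt in sorted(nexts):
            extend(path + [nxt])

    for source in sources:
        extend([source])
    return paths


def recommend_next(graph, partial_path, k=2):
    """Frequency baseline (not the gLSTM): rank services that followed the last one."""
    nexts = graph.get(partial_path[-1], Counter())
    return [service for service, _ in nexts.most_common(k)]


if __name__ == "__main__":
    graph = build_graph(WORKFLOWS)
    for path in composition_paths(graph):
        print(" -> ".join(path))
    print(recommend_next(graph, ["fetch", "parse"], k=2))
```

A learned sequence model would replace `recommend_next` with a prediction conditioned on the whole partial path and the workflow goal; the sketch only shows how the data flows between the stages.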

Paper Structure (31 sections, 21 equations, 6 figures, 2 tables, 1 algorithm)

This paper contains 31 sections, 21 equations, 6 figures, 2 tables, 1 algorithm.

Figures (6)

  • Figure 1: Motivating example workflow marked on #1794 in myExperiment.org
  • Figure 2: Blueprint of the proposed approach. (a) Workflow repository. (b) Constructed knowledge graph. (c) Composition paths generated in the form of service sequences. (d) Composition path-based service prediction model trained offline. (e) Real-time workflow under construction. (f) Composition paths generated for the ongoing workflow. (g) Predicted probabilities of potential services to be composed at the next step. (h) Recommended list of top K candidate services. Operations (a) to (d) are conducted in the offline phase, and (e) to (h) form the online recommendation phase.
  • Figure 3: Portion of the SKG motivating two service sequence generation methods. Edges are labeled with the corresponding id numbers of workflows. The nodes in green are services in workflow #941, and the dependencies between them are colored in orange with the label "941." Nodes in other colors are services invoked by other workflows, which appear as downstream nodes of the services in workflow #941 in the SKG.
  • Figure 4: Architecture of composition path-based service prediction.
  • Figure 5: Inner structure of the gLSTM memory block.
  • ...and 1 more figures

Theorems & Definitions (5)

  • Definition 1: Service Repository
  • Definition 2: Workflow Repository
  • Definition 3: Incremental Workflow Composition
  • Definition 4: Composition Path
  • Definition 5: Composition Context