Elastic CRFs for Open-ontology Slot Filling

Yinpei Dai, Yichi Zhang, Hong Liu, Zhijian Ou, Yi Huang, Junlan Feng

TL;DR

Elastic CRFs (eCRFs) address open-ontology slot filling by tying each slot to a natural-language description and embedding these descriptions into a shared semantic space with utterance features. The model defines a CRF with node potentials $e(y_i)^T h_i$ and edge potentials $e(y_i)^T W e(y_{i+1})$, enabling detection of unseen slots through description-driven label embeddings. Training uses conditional maximum likelihood with a staged pretraining strategy, and decoding employs Viterbi inference. On a Google simulated dataset, eCRFs outperform BiLSTM baselines and the concept tagging approach, particularly for unseen slots/values, in both in-domain and cross-domain settings, demonstrating improved generalization for open-ontology SLU.
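The potentials and decoding step described above can be illustrated with a minimal NumPy sketch. It assumes the label embeddings $e(y)$ and the utterance features $h_i$ have already been computed (random vectors stand in for them here; the encoders that produce them are not shown), and the function names `crf_potentials` and `viterbi` are hypothetical, chosen for illustration rather than taken from the authors' code.

```python
import numpy as np

def crf_potentials(label_emb, h, W):
    """Build eCRF-style potentials (illustrative sketch).

    label_emb: (K, d) array, one description embedding e(y) per label.
    h:         (T, d) array, one feature vector h_i per token.
    W:         (d, d) array, the bilinear edge parameter.
    """
    node = h @ label_emb.T              # node[i, k] = e(k)^T h_i
    edge = label_emb @ W @ label_emb.T  # edge[j, k] = e(j)^T W e(k)
    return node, edge

def viterbi(node, edge):
    """Return the highest-scoring label sequence and its score."""
    T, K = node.shape
    score = node[0].copy()                  # best score of paths ending in each label
    backptr = np.zeros((T, K), dtype=int)
    for i in range(1, T):
        cand = score[:, None] + edge        # cand[j, k]: arrive at label k from label j
        backptr[i] = cand.argmax(axis=0)
        score = cand.max(axis=0) + node[i]
    path = [int(score.argmax())]
    for i in range(T - 1, 0, -1):           # follow back-pointers from the last token
        path.append(int(backptr[i, path[-1]]))
    return path[::-1], float(score.max())

# Toy usage with random stand-ins (assumed sizes: 4 labels, 6 tokens, d = 5).
rng = np.random.default_rng(0)
E = rng.normal(size=(4, 5))
H = rng.normal(size=(6, 5))
W = rng.normal(size=(5, 5))
node, edge = crf_potentials(E, H, W)
path, best = viterbi(node, edge)

# Adding a label only appends one embedding row; W keeps its shape.
E_new = np.vstack([E, rng.normal(size=(1, 5))])
node_new, edge_new = crf_potentials(E_new, H, W)
```

Because both potentials are functions of the label embeddings alone, extending the label set changes the shapes of `node` and `edge` but adds no new parameters, which is the property that lets a description-driven model score slots it has not seen in training.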

Abstract

Slot filling is a crucial component in task-oriented dialog systems that is used to parse (user) utterances into semantic concepts called slots. An ontology is defined by the collection of slots and the values that each slot can take. The most widely used practice of treating slot filling as a sequence labeling task suffers from two main drawbacks. First, the ontology is usually pre-defined and fixed, so the model cannot detect new labels for unseen slots. Second, the one-hot encoding of slot labels ignores the correlations between slots with similar semantics, which makes it difficult to share knowledge learned across different domains. To address these problems, we propose a new model called elastic conditional random field (eCRF), where each slot is represented by the embedding of its natural language description and modeled by a CRF layer. New slot values can be detected by eCRF whenever a language description is available for the slot. In our experiments, we show that eCRFs outperform existing models in both in-domain and cross-domain tasks, especially in predicting unseen slots and values.

Paper Structure

This paper contains 13 sections, 4 equations, 6 figures, 2 tables.

Figures (6)

  • Figure 1: An example of slot filling in the movie domain.
  • Figure 2: The architecture of the elastic CRF (eCRF) model.
  • Figure 3: The architecture of the Concept Tagging (CT) model (Bapna et al., 2017).
  • Figure 4: Potential scores with only node potentials in eCRFs for the cross-domain task. The darker the color, the higher the potential score.
  • Figure 5: Potential scores with both node and edge potentials in eCRFs for the cross-domain task.
  • ...and 1 more figure