Semgrex and Ssurgeon, Searching and Manipulating Dependency Graphs

John Bauer, Chloe Kiddon, Eric Yeh, Alex Shan, Christopher D. Manning

TL;DR

Semgrex enables regex-like search over dependency graphs, and Ssurgeon provides graph rewriting driven by Semgrex matches, addressing the need for programmable manipulation of Universal Dependencies. The work introduces a compact pattern language for node and relation descriptions, along with support for named nodes/edges and a suite of edge- and node-editing operations, all integrated with CoreNLP and accessible via Java, Python, and a web interface. It demonstrates practical utility for linguistic analysis, relation extraction, and UD dataset processing, including code examples and real-world usage. Overall, Semgrex and Ssurgeon streamline querying and transforming dependency structures, offering a flexible toolchain for researchers and developers working with UD data.

Abstract

Searching dependency graphs and manipulating them can be time-consuming and difficult to get right. We document Semgrex, a system for searching dependency graphs, and introduce Ssurgeon, a system for manipulating the output of Semgrex. The compact language used by these systems allows for easy command-line or API processing of dependencies. Additionally, integration with publicly released toolkits in Java and Python allows for searching text relations and attributes over natural text.
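
To make the search-then-rewrite idea concrete, the following is a toy sketch in plain Python. It is not the Semgrex or Ssurgeon implementation and does not use the CoreNLP or Python toolkit APIs. The dictionary-based graph and the helpers `node_matches`, `find_matches`, and `relabel_edge` are hypothetical names introduced only for illustration. It finds a governor and dependent joined by a given relation, then relabels the matched edge.

```python
from typing import Dict, List, Tuple

# Toy dependency graph for "Kim ate pizza": node id -> attributes.
# (Hypothetical representation, chosen only for this illustration.)
nodes: Dict[int, Dict[str, str]] = {
    1: {"word": "Kim", "pos": "PROPN"},
    2: {"word": "ate", "pos": "VERB"},
    3: {"word": "pizza", "pos": "NOUN"},
}

# Edges stored as (governor id, relation, dependent id).
edges: List[Tuple[int, str, int]] = [(2, "nsubj", 1), (2, "dobj", 3)]


def node_matches(attrs: Dict[str, str], constraints: Dict[str, str]) -> bool:
    """True if every constrained attribute has the required value."""
    return all(attrs.get(key) == value for key, value in constraints.items())


def find_matches(nodes, edges, gov, reln, dep) -> List[Tuple[int, int]]:
    """Search step: (governor, dependent) pairs joined by `reln`
    whose nodes satisfy the `gov` and `dep` attribute constraints."""
    return [
        (g, d)
        for g, r, d in edges
        if r == reln and node_matches(nodes[g], gov) and node_matches(nodes[d], dep)
    ]


def relabel_edge(edges, match, old, new) -> List[Tuple[int, str, int]]:
    """Rewrite step: return a new edge list in which the matched
    edge labelled `old` is relabelled `new`; other edges are unchanged."""
    g, d = match
    return [
        (eg, new if (eg, r, ed) == (g, old, d) else r, ed)
        for eg, r, ed in edges
    ]


# Find a verb governing a noun through `dobj`, then relabel that edge to `obj`.
matches = find_matches(nodes, edges, {"pos": "VERB"}, "dobj", {"pos": "NOUN"})
rewritten = relabel_edge(edges, matches[0], "dobj", "obj")
```

Here the search returns the single pair (2, 3), and the rewrite yields a new edge list without modifying the original. A real pattern language also needs named nodes, regular-expression attribute values, and chained relations, none of which this sketch attempts.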

Paper Structure (15 sections, 2 tables)