Using Felis to Represent the Semantics and Metadata of Astronomical Data Catalogs
Jeremy McCormick, Gregory P. Dubois-Felsmann, Andrei Salnikov, Brian Van Klaveren, Tim Jenness
TL;DR
Felis is a data description language, written in YAML and validated with Pydantic, that captures the semantics and metadata of astronomical data catalogs that conventional SQL DDL cannot express. Its schema data model describes tables, columns, and their metadata (e.g., units and UCDs), supports constraints and indexes, and is checked by Python validators. A Python API loads and inspects schemas, and the tooling can generate DDL for multiple database engines or populate TAP_SCHEMA for IVOA TAP services. Rubin Observatory uses Felis for its SDM schemas and related SIAv2 models, with Git-based versioning and an online schema browser; future work targets data format conversion and schema migration tooling.
Abstract
The Data Management team of the Vera C. Rubin Observatory has developed Felis, a data description language and toolset for defining the semantics and metadata of its public-facing data catalogs. Felis uses a rich Pydantic data model to describe and validate catalog metadata, which is written in a human-readable, editable YAML format. Felis also provides a Python library and command-line interface for working with these data models. The metadata is used to populate the TAP_SCHEMA tables for the IVOA TAP services used by the Rubin Science Platform (RSP). Felis's current capabilities are discussed along with some future plans.
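
To make the idea concrete, the following is a minimal sketch of a schema model whose columns carry semantic metadata, are validated on construction, and are flattened into rows shaped like the standard `TAP_SCHEMA.columns` table. It is not the Felis API. Felis uses Pydantic models loaded from YAML, whereas this sketch uses standard-library dataclasses to stay dependency-free. The class names, the `ALLOWED_TYPES` set, the `tap_schema_column_rows` helper, and the example catalog are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import Optional

# Hypothetical set of allowed column types, chosen only for this sketch.
ALLOWED_TYPES = {"boolean", "int", "long", "float", "double", "string"}


@dataclass
class Column:
    """A catalog column plus the metadata that plain SQL DDL does not carry."""

    name: str
    datatype: str
    description: str = ""
    unit: Optional[str] = None  # physical unit, e.g. "deg"
    ucd: Optional[str] = None   # IVOA Unified Content Descriptor, e.g. "pos.eq.ra"

    def __post_init__(self) -> None:
        # Validation runs when the object is built, so bad metadata fails early.
        if not self.name:
            raise ValueError("column name must not be empty")
        if self.datatype not in ALLOWED_TYPES:
            raise ValueError(
                f"unknown datatype {self.datatype!r} for column {self.name!r}"
            )


@dataclass
class Table:
    name: str
    columns: list[Column] = field(default_factory=list)

    def __post_init__(self) -> None:
        # Column names must be unique within a table.
        names = [c.name for c in self.columns]
        if len(names) != len(set(names)):
            raise ValueError(f"duplicate column names in table {self.name!r}")


@dataclass
class Schema:
    name: str
    tables: list[Table] = field(default_factory=list)


def tap_schema_column_rows(schema: Schema) -> list[dict]:
    """Flatten a schema into dicts keyed like TAP_SCHEMA.columns fields."""
    rows = []
    for table in schema.tables:
        for col in table.columns:
            rows.append(
                {
                    "table_name": f"{schema.name}.{table.name}",
                    "column_name": col.name,
                    "datatype": col.datatype,
                    "unit": col.unit,
                    "ucd": col.ucd,
                    "description": col.description,
                }
            )
    return rows


# Illustrative catalog; the names are invented for this example.
example = Schema(
    name="example_catalog",
    tables=[
        Table(
            name="Object",
            columns=[
                Column("objectId", "long", "Unique object identifier", ucd="meta.id"),
                Column("ra", "double", "Right ascension", unit="deg", ucd="pos.eq.ra"),
                Column("dec", "double", "Declination", unit="deg", ucd="pos.eq.dec"),
            ],
        )
    ],
)
rows = tap_schema_column_rows(example)
```

Keeping units and UCDs on the column object, rather than in free-text comments, is what lets a single description feed both a database definition and a TAP service's self-describing metadata.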
