The RooFit toolkit for data modeling

Wouter Verkerke, David Kirkby

TL;DR

RooFit introduces an object-oriented framework inside ROOT for building and fitting probability density models using modular PDF building blocks. It emphasizes automatic PDF normalization, flexible composition (sum, product, convolution), and multi-dimensional modeling with projections, plotting, and Monte Carlo generation. The paper details advanced fitting options, efficiency optimizations, and data/project management tools that address analysis-scale challenges. Practically, RooFit has matured from BaBar-centric tooling to an open-source platform widely adopted in high-energy physics analyses.

Abstract

RooFit is a library of C++ classes that facilitate data modeling in the ROOT environment. Mathematical concepts such as variables, (probability density) functions and integrals are represented as C++ objects. The package provides a flexible framework for building complex fit models through classes that mimic math operators, and is straightforward to extend. For all constructed models, RooFit provides a concise yet powerful interface for fitting (binned and unbinned likelihood, $\chi^2$), plotting and toy Monte Carlo generation, as well as sophisticated tools to manage large-scale projects. RooFit has matured into an industrial-strength tool capable of running the BABAR experiment's most complicated fits and is now available to all users on SourceForge.
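
The idea of composing a model from building blocks with automatic normalization can be illustrated with a small standalone sketch. It deliberately does not use RooFit or ROOT: the names `Shape`, `integrate`, `normalize`, `add_pdfs` and `make_model` are hypothetical helpers invented for this illustration, and the normalization is done with a simple midpoint rule rather than whatever integration strategy RooFit chooses internally.

```cpp
#include <cassert>
#include <cmath>
#include <functional>

// Hypothetical helper type for this sketch; NOT a RooFit class.
using Shape = std::function<double(double)>;

// Numerically integrate f over [lo, hi] with the composite midpoint rule.
double integrate(const Shape& f, double lo, double hi, int steps = 20000) {
    const double h = (hi - lo) / steps;
    double sum = 0.0;
    for (int i = 0; i < steps; ++i) sum += f(lo + (i + 0.5) * h);
    return sum * h;
}

// Turn an arbitrary non-negative shape into a PDF normalized on [lo, hi].
Shape normalize(Shape shape, double lo, double hi) {
    const double norm = integrate(shape, lo, hi);
    return [shape, norm](double x) { return shape(x) / norm; };
}

// "Sum operator": frac * a(x) + (1 - frac) * b(x).
// The result is normalized whenever a and b are normalized on the same range.
Shape add_pdfs(Shape a, Shape b, double frac) {
    assert(frac >= 0.0 && frac <= 1.0);
    return [a, b, frac](double x) { return frac * a(x) + (1.0 - frac) * b(x); };
}

// Example model: Gaussian signal plus linear background on [lo, hi].
// The caller must pick a slope that keeps 1 + slope * x non-negative on the range.
Shape make_model(double mean, double sigma, double slope, double frac,
                 double lo, double hi) {
    Shape gauss = normalize(
        [=](double x) {
            const double z = (x - mean) / sigma;
            return std::exp(-0.5 * z * z);
        },
        lo, hi);
    Shape linear = normalize([=](double x) { return 1.0 + slope * x; }, lo, hi);
    return add_pdfs(gauss, linear, frac);
}
```

Calling `make_model(0.0, 1.0, 0.02, 0.3, -10.0, 10.0)` yields a function whose integral over the range is one without the caller ever writing a normalization constant, which is the convenience the abstract attributes to the operator-style classes.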

Paper Structure

This paper contains 15 sections, 6 equations, 6 figures, 1 table.

Figures (6)

  • Figure 1: One-dimensional plot with histogram of a dataset, overlaid by a projection of the PDF M. The histogram errors are asymmetric, reflecting the Poisson confidence interval corresponding to a $1 \sigma$ deviation. The PDF projection curve is automatically scaled to the size of the plotted dataset. The points that define the curve are chosen with an adaptive resolution-based technique that ensures a smooth appearance regardless of the dataset binning.
  • Figure 2: Left: Shape of PDF M. Right: Distribution of 10000 events generated from PDF M
  • Figure 3: Left: distribution of fitted value of parameter f of model M to 1000 Monte Carlo data sets of 100 events each. Right: Corresponding pull distribution
  • Figure 4: Example of a likelihood projection plot of model M3. Left: projection of full dataset and PDF on x. Right: Projection of dataset and PDF with a cut on the likelihood $L_{yz}$, calculated in the $(y,z)$ projection of the PDF, at -5.0.
  • Figure 5: Demonstration of prototype-based Monte Carlo event generation. a) Two-dimensional PDF M2. b) Event sample generated from M2. c) One-dimensional event sample in $y$ with linear distribution. d) Event sample generated from M2 using the event sample shown in c) as prototype for the $y$ distribution.
  • ...and 1 more figures
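
The pull distribution mentioned in the Figure 3 caption can be sketched in a few lines. The pull of a fitted parameter is (fitted value − true value) / fitted error, and for an unbiased fit with correct errors it should have mean near 0 and width near 1. The sketch below is a standalone illustration that does not use RooFit: the function names are hypothetical, and a simple counting estimate of the fraction $f$ with its binomial error stands in for the likelihood fit used in the paper.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <random>
#include <vector>

// Hypothetical toy study; a counting estimator replaces a real likelihood fit.
struct PullSummary { double mean; double rms; std::size_t n; };

// For each toy dataset, draw n_events, estimate the fraction f, and record the pull.
std::vector<double> toy_pulls(double f_true, int n_events, int n_toys,
                              std::uint32_t seed) {
    assert(n_events > 0 && n_toys >= 0);
    std::mt19937 rng(seed);
    std::vector<double> pulls;
    for (int t = 0; t < n_toys; ++t) {
        int n_sig = 0;
        for (int i = 0; i < n_events; ++i) {
            // Uniform number in [0, 1) built from the raw 32-bit engine output.
            const double u = rng() / 4294967296.0;
            if (u < f_true) ++n_sig;
        }
        const double f_hat = static_cast<double>(n_sig) / n_events;
        // Binomial error estimate evaluated at the fitted value.
        const double err = std::sqrt(f_hat * (1.0 - f_hat) / n_events);
        // Toys with zero estimated error have an undefined pull and are skipped.
        if (err > 0.0) pulls.push_back((f_hat - f_true) / err);
    }
    return pulls;
}

// Mean and RMS width of the pull sample (zeros if the sample is empty).
PullSummary summarize(const std::vector<double>& pulls) {
    if (pulls.empty()) return {0.0, 0.0, 0};
    double sum = 0.0, sum2 = 0.0;
    for (double p : pulls) { sum += p; sum2 += p * p; }
    const double n = static_cast<double>(pulls.size());
    const double mean = sum / n;
    return {mean, std::sqrt(sum2 / n - mean * mean), pulls.size()};
}
```

With the settings from the caption (1000 toys of 100 events each), `summarize(toy_pulls(0.5, 100, 1000, 12345u))` should give a mean close to 0 and an RMS close to 1; a clear departure from either would point to a biased estimator or misestimated errors.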