Postprocessing Configuration
The postprocessing framework allows you to define complex plotting workflows using a YAML configuration file.
Example Configuration
Below are some examples of Postprocessing configurations demonstrating different available processors.
Ratio Plot
The RatioPlot processor takes numerator and denominator inputs and draws a 1D ratio plot. Both sides are chosen by matching on sample metadata, such as sample_type or dataset_name, in the select entries under subgroups.
Postprocessing:
  processors:
    - name: RatioPlot
      inputs:
        - "*/*/*/composite_14_m"
      scale: log
      ratio_type: "significance"
      normalize: False
      structure:
        select: {type: Histogram, pipeline: "Signal312"}
        group: {"era.name": "*"}
        subgroups:
          denominator:
            select: {sample_type: "MC", dataset_name: "!signal*"}
          numerator:
            select: {dataset_name: "signal_2018_312_1500_600"}
        transforms:
          - name: SelectAxesValues
            select_axes_values: {"variation": "central"}
      output_name: "{prefix}/{era.name}/{pipeline}/{name}_demo_ratioplot.png"
  default_style_set:
    styles:
      - pattern:
          dataset_name: 'signal*'
        style:
          plottype: step
          linewidth: 2
      - pattern:
          sample_type: 'MC'
        style:
          plottype: fill
      - pattern:
          dataset_name: 'data*'
        style:
          plottype: errorbar
          color: black
          marker: 'o'
  drop_sample_pattern:
    or_exprs:
      - sample_name: "*50to100*"
      - sample_name: "*100to200*"
      - sample_name: "*200to300*"
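As a rough illustration of what a ratio panel computes, the sketch below sums the denominator histograms bin by bin and divides the numerator by that total. The function names and bin contents are hypothetical, and reading ratio_type: "significance" as s / sqrt(b) is an assumption made here for illustration; the definition RatioPlot actually uses may differ.

```python
import math


def sum_histograms(histograms):
    """Add several equally binned histograms bin by bin."""
    return [sum(bins) for bins in zip(*histograms)]


def bin_ratio(numerator, denominator):
    """Plain per-bin ratio; bins with an empty denominator give None."""
    return [n / d if d > 0 else None for n, d in zip(numerator, denominator)]


def bin_significance(numerator, denominator):
    """ASSUMPTION: 'significance' is taken to mean s / sqrt(b) per bin."""
    return [
        n / math.sqrt(d) if d > 0 else None
        for n, d in zip(numerator, denominator)
    ]


# Hypothetical bin contents, not taken from any real sample.
signal = [4.0, 3.0, 1.0]
backgrounds = [[12.0, 3.0, 0.0], [4.0, 6.0, 0.0]]

total_background = sum_histograms(backgrounds)
print(bin_ratio(signal, total_background))
print(bin_significance(signal, total_background))
```

Returning None for empty denominator bins keeps the division well defined; a plotting layer would typically skip or mask those bins.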
1D Histogram
The Histogram1D processor creates stacked 1D distributions.
Postprocessing:
  processors:
    - name: Histogram1D
      inputs:
        - "*/*/*/HT"
      scale: log
      normalize: False
      structure:
        select:
          type: "Histogram"
          pipeline: "Signal312"
          dataset_name: "re:(signal_2018_312_1500_100$|signal_2018_312_1500_1000$|signal_2018_312_1000_100$|(?!signal_).*)"
        group: {"era.name": "*"}
        transforms:
          - name: SelectAxesValues
            select_axes_values: {"variation": "central"}
      output_name: "{prefix}/demo_histogram1d_{era.name}.png"
  default_style_set:
    styles:
      - pattern:
          sample_type: 'MC'
        style:
          plottype: fill
      - pattern:
          dataset_name: 'signal*'
        style:
          plottype: step
          linewidth: 2
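The dataset_name regular expression in the configuration above keeps three specific signal points plus every dataset whose name does not start with signal_. The sketch below checks this with Python's re module. It assumes the text after the re: prefix is an ordinary regular expression anchored at the start of the name (re.match); how the framework applies the pattern internally is not confirmed here.

```python
import re

# Pattern copied from the dataset_name field above, without the "re:" prefix.
PATTERN = re.compile(
    r"(signal_2018_312_1500_100$|signal_2018_312_1500_1000$"
    r"|signal_2018_312_1000_100$|(?!signal_).*)"
)


def is_selected(dataset_name):
    """ASSUMPTION: the pattern is matched from the start of the name."""
    return PATTERN.match(dataset_name) is not None


names = [
    "signal_2018_312_1500_100",   # listed explicitly: kept
    "signal_2018_312_1500_1000",  # listed explicitly: kept
    "signal_2018_312_1500_600",   # a signal point that is not listed: dropped
    "qcd_inclusive_2018",         # does not start with "signal_": kept
]
for name in names:
    print(f"{name}: {is_selected(name)}")
```

The negative lookahead (?!signal_) is what lets all non-signal datasets through, while the $ anchors stop signal_2018_312_1500_100 from also matching longer names such as signal_2018_312_1500_1000 by accident.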
Cutflow Table
The CutflowTable processor creates a LaTeX cutflow table.
Postprocessing:
  processors:
    - name: CutflowTable
      inputs:
        - "*/*/*/selection"
      format: latex
      structure:
        select:
          type: "SelectionFlow"
          pipeline: "Signal312"
          dataset_name: "re:(signal_2018_312_1500_100$|signal_2018_312_1500_1000$|signal_2018_312_1000_100$|(?!signal_).*)"
        group: {"era.name": "*"}
      output_name: "{prefix}/demo_cutflow_{era.name}.tex"
Generated LaTeX
\begin{tabular}{lrrrlrllrl}
\toprule
& \multicolumn{3}{r}{qcd_inclusive_2018} & \multicolumn{3}{r}{signal_2018_312_1500_100} & \multicolumn{3}{r}{signal_2018_312_1500_1000} \\
& Events & Eff. Rel. & Eff. Abs. & Events & Eff. Rel. & Eff. Abs. & Events & Eff. Rel. & Eff. Abs. \\
\midrule
initial & 329826188 & 1.000000 & 1.000000 & 9949 & 1.000000 & 1.000000 & 9957 & 1.000000 & 1.000000 \\
2bjet & 5133513 & 0.015564 & 0.015564 & 4871 & 0.489597 & 0.489597 & 6061 & 0.608717 & 0.608717 \\
b_dr & 3741809 & 0.728898 & 0.011345 & 4799 & 0.985219 & 0.482360 & 5631 & 0.929055 & 0.565532 \\
zero_electron & 3728175 & 0.996356 & 0.011303 & 4786 & 0.997291 & 0.481053 & 5621 & 0.998224 & 0.564527 \\
zero_muon & 3722059 & 0.998360 & 0.011285 & 4777 & 0.998120 & 0.480149 & 5606 & 0.997331 & 0.563021 \\
njets & 1943114 & 0.522054 & 0.005891 & 1545 & 0.323425 & 0.155292 & 4444 & 0.792722 & 0.446319 \\
jetpt & 873666 & 0.449622 & 0.002649 & 1421 & 0.919741 & 0.142828 & 4040 & 0.909091 & 0.405745 \\
\bottomrule
\end{tabular}
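The numbers in the table are consistent with Eff. Rel. being the fraction of events surviving relative to the previous cut, and Eff. Abs. the fraction relative to the initial count. The sketch below reproduces the signal_2018_312_1500_100 column under that reading; the function name is hypothetical and this is not the CutflowTable implementation.

```python
def cutflow_efficiencies(counts):
    """Return (cut, events, relative eff., absolute eff.) rows.

    counts is an ordered list of (cut_name, events_passing) pairs,
    starting with the initial event count.
    """
    rows = []
    initial = counts[0][1]
    previous = initial
    for cut, events in counts:
        rows.append((cut, events, events / previous, events / initial))
        previous = events
    return rows


# Event counts for signal_2018_312_1500_100, copied from the table above.
signal_counts = [
    ("initial", 9949),
    ("2bjet", 4871),
    ("b_dr", 4799),
    ("zero_electron", 4786),
    ("zero_muon", 4777),
    ("njets", 1545),
    ("jetpt", 1421),
]

rows = cutflow_efficiencies(signal_counts)
for cut, events, rel, absolute in rows:
    print(f"{cut:<14}{events:>8}{rel:>10.6f}{absolute:>10.6f}")
```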
Applying Histogram Transforms
You can apply transforms to the axes of the underlying histograms before they are passed to the processors. The transforms field in the structure definition takes a list of transforms: for example, SelectAxesValues to pick a systematic variation, SliceAxes to narrow an axis range (dropping the bins outside it), or RebinAxes to merge adjacent bins.
Slicing Axes Example
The SliceAxes transform takes a dictionary that maps an axis name to a [start, stop] pair, much like a Python slice; a null entry leaves that end open. Bins outside the range are removed from the resulting plots.
Postprocessing:
  processors:
    - name: Histogram1D
      inputs:
        - "*/*/*/HT"
      scale: log
      normalize: False
      structure:
        select:
          type: "Histogram"
          pipeline: "Signal312"
          dataset_name: "re:(signal_2018_312_1500_100$|signal_2018_312_1500_1000$|signal_2018_312_1000_100$|(?!signal_).*)"
        group: {"era.name": "*"}
        transforms:
          - name: SelectAxesValues
            select_axes_values: {"variation": "central"}
          - name: SliceAxes
            slices: {"HT": [500, null]} # Slice the axes above 500 HT
      output_name: "{prefix}/demo_sliceaxes_{era.name}.png"
  default_style_set:
    styles:
      - pattern:
          sample_type: 'MC'
        style:
          plottype: fill
      - pattern:
          dataset_name: 'signal*'
        style:
          plottype: step
          linewidth: 2
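To make the effect of a value-based slice concrete, the sketch below applies the same [500, null] range to a small binned axis written as plain Python lists. The function, bin edges, and counts are hypothetical, and the rule of keeping only bins that lie fully inside the range is an assumption; the SliceAxes transform may treat partially covered bins differently.

```python
def slice_axis(edges, counts, start=None, stop=None):
    """Keep bins lying fully inside [start, stop]; None leaves that end open."""
    kept_edges, kept_counts = [], []
    for low, high, count in zip(edges[:-1], edges[1:], counts):
        if start is not None and low < start:
            continue
        if stop is not None and high > stop:
            continue
        if not kept_edges:
            kept_edges.append(low)  # lower edge of the first kept bin
        kept_edges.append(high)
        kept_counts.append(count)
    return kept_edges, kept_counts


# Hypothetical HT histogram: four bins, each 250 units wide.
edges = [0, 250, 500, 750, 1000]
counts = [40, 25, 10, 3]

# Equivalent of slices: {"HT": [500, null]}
print(slice_axis(edges, counts, start=500, stop=None))
```

With these inputs the two bins below 500 are removed and the two bins from 500 to 1000 remain.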
Rebinning Axes Example
The RebinAxes transform merges groups of adjacent bins along an axis into single, wider bins.
Postprocessing:
  processors:
    - name: Histogram1D
      inputs:
        - "*/*/*/HT"
      scale: log
      normalize: False
      structure:
        select:
          type: "Histogram"
          pipeline: "Signal312"
          dataset_name: "re:(signal_2018_312_1500_100$|signal_2018_312_1500_1000$|signal_2018_312_1000_100$|(?!signal_).*)"
        group: {"era.name": "*"}
        transforms:
          - name: SelectAxesValues
            select_axes_values: {"variation": "central"}
          - name: RebinAxes
            rebinning: {"HT": 2} # Rebin the HT axis by combining adjacent bins
      output_name: "{prefix}/demo_rebinaxes_{era.name}.png"
  default_style_set:
    styles:
      - pattern:
          sample_type: 'MC'
        style:
          plottype: fill
      - pattern:
          dataset_name: 'signal*'
        style:
          plottype: step
          linewidth: 2
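The sketch below shows what merging adjacent bins does to a small histogram. It assumes the value 2 in the rebinning field is a merge factor (every two neighbouring bins become one), which is a plausible reading of the configuration above but is not confirmed here. The function and data are hypothetical.

```python
def rebin_axis(edges, counts, factor):
    """Merge every `factor` adjacent bins into one wider bin."""
    if len(counts) % factor != 0:
        raise ValueError("number of bins must be divisible by the rebin factor")
    # Every factor-th edge survives; the last edge is always kept.
    new_edges = edges[::factor]
    new_counts = [
        sum(counts[i:i + factor]) for i in range(0, len(counts), factor)
    ]
    return new_edges, new_counts


# Hypothetical HT histogram: four bins, each 250 units wide.
edges = [0, 250, 500, 750, 1000]
counts = [40, 25, 10, 3]

# ASSUMED equivalent of rebinning: {"HT": 2}
print(rebin_axis(edges, counts, 2))
```

Rebinning preserves the total event count while halving the number of bins, which helps when individual bins have too few entries to read on a plot.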
Dropping Samples
Sometimes, especially when dealing with many similar signal mass points, you may only want to process a subset of them to avoid plotting hundreds of overlapping distributions or running out of memory.
You can do this in the top-level configuration with drop_sample_pattern. It uses the same pattern query language as the rest of the analyzer, so you can filter out specific samples before any postprocessing actions run:
Postprocessing:
  ...
  drop_sample_pattern:
    or_exprs:
      - sample_name: "*50to100*"
      - sample_name: "*100to200*"
      - sample_name: "*200to300*"
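The filtering logic can be pictured as follows. This sketch assumes that * in the pattern language behaves like a shell-style wildcard and that or_exprs drops a sample when any one of its entries matches. The sample names are made up for illustration, and fnmatch stands in for the analyzer's own matcher.

```python
from fnmatch import fnmatchcase

# Patterns copied from the drop_sample_pattern example above.
DROP_PATTERNS = ["*50to100*", "*100to200*", "*200to300*"]


def should_drop(sample_name, patterns=DROP_PATTERNS):
    """True if the name matches any pattern (logical OR over the entries)."""
    return any(fnmatchcase(sample_name, pattern) for pattern in patterns)


# Hypothetical sample names.
samples = [
    "qcd_ht50to100_2018",
    "qcd_ht100to200_2018",
    "qcd_ht300to500_2018",
    "signal_2018_312_1500_600",
]
kept = [name for name in samples if not should_drop(name)]
print(kept)
```

Only the two samples whose names contain none of the listed ranges remain in kept.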
Running Postprocessing
You can execute a Postprocessing configuration block using the CLI via the postprocess subcommand.
./osca postprocess \
    config/analysis.yaml \
    full_output/**/*.result \
    --parallel 4 \
    --prefix output_figures/
This command takes your master configuration file, gathers the specified .result files, and runs the defined postprocessors (like RatioPlot or CutflowTable). The --prefix option allows you to designate an output base directory for all the resulting files.
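To see how the prefix could combine with an output_name template, the sketch below fills in the RatioPlot template from earlier on this page using Python's str.format, which resolves dotted fields such as {era.name} through attribute access. Whether the framework substitutes fields this way is an assumption, and the era value, the histogram name used for {name}, and the helper function are all hypothetical.

```python
from types import SimpleNamespace


def render_output_name(template, prefix, **fields):
    """Fill an output_name template; a trailing slash on prefix is dropped."""
    return template.format(prefix=prefix.rstrip("/"), **fields)


path = render_output_name(
    "{prefix}/{era.name}/{pipeline}/{name}_demo_ratioplot.png",
    prefix="output_figures/",          # value passed to --prefix
    era=SimpleNamespace(name="2018"),  # hypothetical era object
    pipeline="Signal312",
    name="composite_14_m",             # ASSUMED to be the histogram name
)
print(path)
```

Under these assumptions the rendered path is output_figures/2018/Signal312/composite_14_m_demo_ratioplot.png, so every file the processors write ends up below the directory given to --prefix.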