Postprocessing Configuration

The postprocessing framework allows you to define complex plotting workflows using a YAML configuration file.

Example Configuration

The examples below show Postprocessing configurations for several of the available processors.

Ratio Plot

The RatioPlot processor takes numerator and denominator inputs and creates a 1D ratio plot. The numerator and denominator are each chosen by a metadata selection, given by the select entries under subgroups.

Postprocessing:
  processors:
    - name: RatioPlot
      inputs:
        - "*/*/*/composite_14_m"
      scale: log
      ratio_type: "significance"
      normalize: False
      structure:
        select: {type: Histogram, pipeline: "Signal312"}
        group: {"era.name": "*"}
        subgroups:
          denominator:
            select: {sample_type: "MC", dataset_name: "!signal*"}
          numerator:
            select: {dataset_name: "signal_2018_312_1500_600"}
        transforms:
          - name: SelectAxesValues
            select_axes_values: {"variation": "central"}
      output_name: "{prefix}/{era.name}/{pipeline}/{name}_demo_ratioplot.png"

  default_style_set:
    styles:
      - pattern:
          dataset_name: 'signal*'
        style:
          plottype: step
          linewidth: 2
      - pattern:
          sample_type: 'MC'
        style:
          plottype: fill
      - pattern:
          dataset_name: 'data*'
        style:
          plottype: errorbar
          color: black
          marker: 'o'

  drop_sample_pattern:
    or_exprs:
      - sample_name: "*50to100*"
      - sample_name: "*100to200*"
      - sample_name: "*200to300*"
      

1D Histogram

The Histogram1D processor creates stacked 1D distributions.

Postprocessing:
  processors:
    - name: Histogram1D
      inputs:
        - "*/*/*/HT"
      scale: log
      normalize: False
      structure:
        select: 
          type: "Histogram"
          pipeline: "Signal312"
          dataset_name: "re:(signal_2018_312_1500_100$|signal_2018_312_1500_1000$|signal_2018_312_1000_100$|(?!signal_).*)"
        group: {"era.name": "*"}
        transforms:
          - name: SelectAxesValues
            select_axes_values: {"variation": "central"}
      output_name: "{prefix}/demo_histogram1d_{era.name}.png"

  default_style_set:
    styles:
      - pattern:
          sample_type: 'MC'
        style:
          plottype: fill
      - pattern:
          dataset_name: 'signal*'
        style:
          plottype: step
          linewidth: 2

[Figure: example Histogram1D output, demo_histogram1d_2018.png]
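
The dataset_name value in the configuration above starts with re:, which suggests that the rest of the string is treated as a regular expression. The sketch below shows what that expression accepts when applied with Python's re.match. That the framework anchors the match at the start of the name (as re.match does) is an assumption, and is_selected is a hypothetical helper, not part of the framework.

```python
import re

# The expression from the configuration above, without the "re:" prefix.
PATTERN = re.compile(
    r"(signal_2018_312_1500_100$|signal_2018_312_1500_1000$"
    r"|signal_2018_312_1000_100$|(?!signal_).*)"
)


def is_selected(dataset_name):
    """Hypothetical helper: True if the name matches from its first character."""
    return PATTERN.match(dataset_name) is not None


names = [
    "signal_2018_312_1500_100",
    "signal_2018_312_1500_1000",
    "signal_2018_312_1500_600",
    "qcd_inclusive_2018",
]
selected = [name for name in names if is_selected(name)]
print(selected)
# → ['signal_2018_312_1500_100', 'signal_2018_312_1500_1000', 'qcd_inclusive_2018']
```

The first three alternatives keep exactly the listed signal points, and the negative lookahead (?!signal_) keeps every dataset whose name does not start with signal_. Under these assumptions the selection plots three chosen signal points together with all non-signal samples.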

Cutflow Table

The CutflowTable processor creates a LaTeX cutflow table.

Postprocessing:
  processors:
    - name: CutflowTable
      inputs:
        - "*/*/*/selection"
      format: latex
      structure:
        select: 
          type: "SelectionFlow"
          pipeline: "Signal312"
          dataset_name: "re:(signal_2018_312_1500_100$|signal_2018_312_1500_1000$|signal_2018_312_1000_100$|(?!signal_).*)"
        group: {"era.name": "*"}
      output_name: "{prefix}/demo_cutflow_{era.name}.tex"

Generated LaTeX:

\begin{tabular}{lrrrlrllrl}
\toprule
 & \multicolumn{3}{r}{qcd_inclusive_2018} & \multicolumn{3}{r}{signal_2018_312_1500_100} & \multicolumn{3}{r}{signal_2018_312_1500_1000} \\
 & Events & Eff. Rel. & Eff. Abs. & Events & Eff. Rel. & Eff. Abs. & Events & Eff. Rel. & Eff. Abs. \\
\midrule
initial & 329826188 & 1.000000 & 1.000000 & 9949 & 1.000000 & 1.000000 & 9957 & 1.000000 & 1.000000 \\
2bjet & 5133513 & 0.015564 & 0.015564 & 4871 & 0.489597 & 0.489597 & 6061 & 0.608717 & 0.608717 \\
b_dr & 3741809 & 0.728898 & 0.011345 & 4799 & 0.985219 & 0.482360 & 5631 & 0.929055 & 0.565532 \\
zero_electron & 3728175 & 0.996356 & 0.011303 & 4786 & 0.997291 & 0.481053 & 5621 & 0.998224 & 0.564527 \\
zero_muon & 3722059 & 0.998360 & 0.011285 & 4777 & 0.998120 & 0.480149 & 5606 & 0.997331 & 0.563021 \\
njets & 1943114 & 0.522054 & 0.005891 & 1545 & 0.323425 & 0.155292 & 4444 & 0.792722 & 0.446319 \\
jetpt & 873666 & 0.449622 & 0.002649 & 1421 & 0.919741 & 0.142828 & 4040 & 0.909091 & 0.405745 \\
\bottomrule
\end{tabular}
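
The efficiency columns in the table above are consistent with the usual cutflow definitions: Eff. Rel. is the event count after a cut divided by the count after the previous cut, and Eff. Abs. is the count divided by the initial count. The sketch below reproduces the signal_2018_312_1500_100 columns from the event counts. The function name cutflow_efficiencies is hypothetical, and this is only an illustration of the arithmetic, not the CutflowTable implementation.

```python
def cutflow_efficiencies(counts):
    """Return (relative, absolute) efficiencies for an ordered list of event counts."""
    initial = counts[0]
    relative, absolute = [], []
    previous = initial
    for n in counts:
        relative.append(n / previous)   # fraction surviving this cut
        absolute.append(n / initial)    # fraction of the initial events left
        previous = n
    return relative, absolute


# Event counts for signal_2018_312_1500_100, copied from the table above.
steps = ["initial", "2bjet", "b_dr", "zero_electron", "zero_muon", "njets", "jetpt"]
counts = [9949, 4871, 4799, 4786, 4777, 1545, 1421]

rel, abs_ = cutflow_efficiencies(counts)
for step, n, r, a in zip(steps, counts, rel, abs_):
    print(f"{step:<14}{n:>8}{r:>10.6f}{a:>10.6f}")
```

For example, the b_dr row gives 4799 / 4871 for the relative efficiency and 4799 / 9949 for the absolute efficiency, which round to the 0.985219 and 0.482360 shown in the table.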

Applying Histogram Transforms

You can apply transforms to the axes of the underlying histograms before they are passed to a processor. The transforms field in the structure definition takes a list of transforms, for example SelectAxesValues to pick a systematic variation, SliceAxes to narrow the range of an axis (or drop bins), and RebinAxes to merge adjacent bins.

Slicing Axes Example

The SliceAxes transform takes a dictionary that maps each axis name to a [start, stop] pair, much like a Python slice. A null entry leaves that side open, so the example below keeps only the part of the HT axis above 500.

Postprocessing:
  processors:
    - name: Histogram1D
      inputs:
        - "*/*/*/HT"
      scale: log
      normalize: False
      structure:
        select: 
          type: "Histogram"
          pipeline: "Signal312"
          dataset_name: "re:(signal_2018_312_1500_100$|signal_2018_312_1500_1000$|signal_2018_312_1000_100$|(?!signal_).*)"
        group: {"era.name": "*"}
        transforms:
          - name: SelectAxesValues
            select_axes_values: {"variation": "central"}
          - name: SliceAxes
            slices: {"HT": [500, null]} # Keep only the HT range above 500
      output_name: "{prefix}/demo_sliceaxes_{era.name}.png"
  
  default_style_set:
    styles:
      - pattern:
          sample_type: 'MC'
        style:
          plottype: fill
      - pattern:
          dataset_name: 'signal*'
        style:
          plottype: step
          linewidth: 2
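
To make the [start, stop] semantics concrete, here is a minimal sketch of slicing one histogram axis in plain Python. It assumes that a bin is kept only when it lies fully inside the requested range and that None (null in YAML) leaves a side open. The function slice_axis and the bin contents are hypothetical, and the real SliceAxes transform may treat bin boundaries differently.

```python
def slice_axis(edges, counts, start=None, stop=None):
    """Keep the bins lying fully inside [start, stop]; None leaves a side open."""
    kept = [
        i for i in range(len(counts))
        if (start is None or edges[i] >= start)
        and (stop is None or edges[i + 1] <= stop)
    ]
    if not kept:
        return [], []
    first, last = kept[0], kept[-1]
    # n kept bins need n + 1 edges, hence the "+ 2" on the edge slice.
    return edges[first:last + 2], counts[first:last + 1]


# Hypothetical HT histogram: four bins of width 250.
edges = [0, 250, 500, 750, 1000]
counts = [40, 30, 20, 10]

print(slice_axis(edges, counts, start=500, stop=None))
# → ([500, 750, 1000], [20, 10])
```

Here start=500 with an open upper bound plays the role of the [500, null] entry in the configuration above.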

Rebinning Axes Example

The RebinAxes transform merges groups of adjacent bins along an axis into single, wider bins. In the example below, the HT axis is rebinned with a value of 2.

Postprocessing:
  processors:
    - name: Histogram1D
      inputs:
        - "*/*/*/HT"
      scale: log
      normalize: False
      structure:
        select: 
          type: "Histogram"
          pipeline: "Signal312"
          dataset_name: "re:(signal_2018_312_1500_100$|signal_2018_312_1500_1000$|signal_2018_312_1000_100$|(?!signal_).*)"
        group: {"era.name": "*"}
        transforms:
          - name: SelectAxesValues
            select_axes_values: {"variation": "central"}
          - name: RebinAxes
            rebinning: {"HT": 2} # Rebin the HT axis by combining adjacent bins
      output_name: "{prefix}/demo_rebinaxes_{era.name}.png"
  
  default_style_set:
    styles:
      - pattern:
          sample_type: 'MC'
        style:
          plottype: fill
      - pattern:
          dataset_name: 'signal*'
        style:
          plottype: step
          linewidth: 2
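
As an illustration of what merging adjacent bins does to a histogram, the sketch below rebins one axis by an integer factor in plain Python. It assumes the rebinning value is a merge factor (2 combines bins in pairs) and that the number of bins is divisible by it. The function rebin_axis and the bin contents are hypothetical and do not come from the framework.

```python
def rebin_axis(edges, counts, factor):
    """Merge every `factor` adjacent bins into one wider bin."""
    if len(counts) % factor != 0:
        raise ValueError("number of bins must be divisible by the rebin factor")
    # Sum the contents of each group of `factor` neighbouring bins.
    new_counts = [sum(counts[i:i + factor]) for i in range(0, len(counts), factor)]
    # Keep every `factor`-th edge, which includes the first and last edge.
    new_edges = edges[::factor]
    return new_edges, new_counts


# Hypothetical HT histogram: four bins of width 250.
edges = [0, 250, 500, 750, 1000]
counts = [40, 30, 20, 10]

print(rebin_axis(edges, counts, 2))
# → ([0, 500, 1000], [70, 30])
```

The total content is unchanged by rebinning; only the bin width grows, which reduces statistical fluctuations per bin at the cost of resolution.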

Dropping Samples

When working with many similar signal mass points, you may want to process only a subset of them, to avoid plotting hundreds of overlapping distributions or running out of memory. To do this, set drop_sample_pattern at the top level of the Postprocessing configuration. It uses the same pattern query language as the rest of the analyzer and filters out the matching samples before any postprocessing actions run:

Postprocessing:
  ...
  drop_sample_pattern:
    or_exprs:
      - sample_name: "*50to100*"
      - sample_name: "*100to200*"
      - sample_name: "*200to300*"
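
The following sketch shows one way an or_exprs block like the one above could be evaluated: a sample is dropped if any single expression matches its metadata. It uses shell-style globbing from Python's fnmatch module as a stand-in for the analyzer's pattern language, which is an assumption. The helpers matches and should_drop and the sample names are hypothetical.

```python
from fnmatch import fnmatchcase

# Mirror of the drop_sample_pattern block above as a Python dictionary.
DROP_PATTERN = {
    "or_exprs": [
        {"sample_name": "*50to100*"},
        {"sample_name": "*100to200*"},
        {"sample_name": "*200to300*"},
    ]
}


def matches(expr, metadata):
    """True if every field in one expression glob-matches the sample metadata."""
    return all(
        fnmatchcase(str(metadata.get(field, "")), glob)
        for field, glob in expr.items()
    )


def should_drop(metadata, pattern=DROP_PATTERN):
    """True if at least one of the or_exprs expressions matches."""
    return any(matches(expr, metadata) for expr in pattern["or_exprs"])


# Hypothetical sample metadata, for illustration only.
samples = [
    {"sample_name": "qcd_ht50to100"},
    {"sample_name": "qcd_ht100to200"},
    {"sample_name": "qcd_ht300to500"},
]
kept = [s["sample_name"] for s in samples if not should_drop(s)]
print(kept)
# → ['qcd_ht300to500']
```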

Running Postprocessing

To run a Postprocessing configuration block from the command line, use the postprocess subcommand:

./osca postprocess \
  config/analysis.yaml \
  full_output/**/*.result \
  --parallel 4 \
  --prefix output_figures/

This command reads the main configuration file (config/analysis.yaml), collects the .result files matched by the glob, and runs every processor defined in the Postprocessing block, such as RatioPlot or CutflowTable. The --prefix option sets the base output directory for all generated files.