Postprocessing Configuration

The postprocessing framework allows you to define complex plotting workflows using a YAML configuration file.

Example Configuration

The following Postprocessing configurations demonstrate several of the available processors.

Ratio Plot

The RatioPlot processor takes numerator and denominator inputs and creates a 1D ratio plot. The numerator and denominator subgroups are each selected by metadata such as sample_type and dataset_name.

Postprocessing:
  processors:
    - name: RatioPlot
      inputs:
        - "*/*/*/composite_14_m"
      scale: log
      ratio_type: "significance"
      normalize: False
      structure:
        select: {type: Histogram, pipeline: "Signal312"}
        group: {"era.name": "*"}
        subgroups:
          denominator:
            select: {sample_type: "MC", dataset_name: "!signal*"}
          numerator:
            select: {dataset_name: "signal_2018_312_1500_600"}
        transforms:
          - name: SelectAxesValues
            select_axes_values: {"variation": "central"}
      output_name: "{prefix}/{era.name}/{pipeline}/{name}_demo_ratioplot.png"

  default_style_set:
    styles:
      - pattern:
          dataset_name: 'signal*'
        style:
          plottype: step
          linewidth: 2
      - pattern:
          sample_type: 'MC'
        style:
          plottype: fill
      - pattern:
          dataset_name: 'data*'
        style:
          plottype: errorbar
          color: black
          marker: 'o'

  drop_sample_pattern:
    or_exprs:
      - sample_name: "*50to100*"
      - sample_name: "*100to200*"
      - sample_name: "*200to300*"
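
The output_name value in the configuration above is a template with placeholders such as {prefix} and {era.name}. As a minimal sketch of how such a template could expand, assuming it follows Python's str.format rules (an assumption, not confirmed framework behaviour), dotted placeholders resolve as attribute lookups. The function name render_output_name and the metadata values are hypothetical:

```python
from types import SimpleNamespace


def render_output_name(template, **fields):
    """Expand an output_name template using str.format.

    Dotted placeholders such as {era.name} are resolved by str.format
    as attribute lookups on the object passed for `era`.
    """
    return template.format(**fields)


# Hypothetical metadata for one group of histograms.
path = render_output_name(
    "{prefix}/{era.name}/{pipeline}/{name}_demo_ratioplot.png",
    prefix="output_figures",
    era=SimpleNamespace(name="2018"),
    pipeline="Signal312",
    name="composite_14_m",
)
print(path)
```

Because the group key is "era.name", each era would get its own output path under this reading.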
      

1D Histogram

The Histogram1D processor creates stacked 1D distributions.

Postprocessing:
  processors:
    - name: Histogram1D
      inputs:
        - "*/*/*/HT"
      scale: log
      normalize: False
      structure:
        select: 
          type: "Histogram"
          pipeline: "Signal312"
          dataset_name: "re:(signal_2018_312_1500_100$|signal_2018_312_1500_1000$|signal_2018_312_1000_100$|(?!signal_).*)"
        group: {"era.name": "*"}
        transforms:
          - name: SelectAxesValues
            select_axes_values: {"variation": "central"}
      output_name: "{prefix}/demo_histogram1d_{era.name}.png"

  default_style_set:
    styles:
      - pattern:
          sample_type: 'MC'
        style:
          plottype: fill
      - pattern:
          dataset_name: 'signal*'
        style:
          plottype: step
          linewidth: 2
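
The dataset_name selection above uses a re: prefix followed by a regular expression. The sketch below shows which names that expression accepts, assuming the text after re: is a Python regular expression that must match the whole dataset name (both points are assumptions). The non-signal dataset names in the list are hypothetical:

```python
import re

# The expression from the configuration, without the "re:" prefix.
PATTERN = re.compile(
    r"(signal_2018_312_1500_100$|signal_2018_312_1500_1000$"
    r"|signal_2018_312_1000_100$|(?!signal_).*)"
)


def is_selected(dataset_name):
    """Return True if the whole dataset name matches the expression."""
    # Assumption: the framework requires a full match, not a substring match.
    return PATTERN.fullmatch(dataset_name) is not None


names = [
    "signal_2018_312_1500_100",
    "signal_2018_312_1500_1000",
    "signal_2018_312_1500_600",  # a signal point that is not listed
    "qcd_2018",                  # hypothetical background dataset
    "data_2018",                 # hypothetical data dataset
]
selected = [n for n in names if is_selected(n)]
print(selected)
```

The three listed signal points pass through their explicit alternatives, and the negative lookahead (?!signal_) admits any name that does not start with signal_. The $ anchors keep signal_2018_312_1500_100 from also matching as a prefix of signal_2018_312_1500_1000 when a regex is only anchored at the start.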

Cutflow Table

The CutflowTable processor creates a cutflow table, written here as LaTeX (format: latex).

Postprocessing:
  processors:
    - name: CutflowTable
      inputs:
        - "*/*/*/selection"
      format: latex
      structure:
        select: 
          type: "SelectionFlow"
          pipeline: "Signal312"
          dataset_name: "re:(signal_2018_312_1500_100$|signal_2018_312_1500_1000$|signal_2018_312_1000_100$|(?!signal_).*)"
        group: {"era.name": "*"}
      output_name: "{prefix}/demo_cutflow_{era.name}.tex"

Applying Histogram Transforms

You can apply transforms to the axes of the underlying histograms before they are passed to a processor. The transforms field in the structure definition takes a list of transforms, for example SelectAxesValues to pick a single systematic variation, SliceAxes to narrow an axis range (dropping the bins outside it), and RebinAxes to merge adjacent bins.

Slicing Axes Example

The SliceAxes transform takes a dictionary that maps each axis name to a [start, stop] pair, mirroring a Python slice. In the example below, {"HT": [500, null]} leaves the upper bound open and keeps only the part of the HT axis above 500.

Postprocessing:
  processors:
    - name: Histogram1D
      inputs:
        - "*/*/*/HT"
      scale: log
      normalize: False
      structure:
        select: 
          type: "Histogram"
          pipeline: "Signal312"
          dataset_name: "re:(signal_2018_312_1500_100$|signal_2018_312_1500_1000$|signal_2018_312_1000_100$|(?!signal_).*)"
        group: {"era.name": "*"}
        transforms:
          - name: SelectAxesValues
            select_axes_values: {"variation": "central"}
          - name: SliceAxes
            slices: {"HT": [500, null]} # Keep only the HT range above 500
      output_name: "{prefix}/demo_sliceaxes_{era.name}.png"
  
  default_style_set:
    styles:
      - pattern:
          sample_type: 'MC'
        style:
          plottype: fill
      - pattern:
          dataset_name: 'signal*'
        style:
          plottype: step
          linewidth: 2
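
To make the effect of a [start, stop] slice concrete, here is a minimal sketch on a plain binned histogram, stored as a list of bin edges and a list of counts. It illustrates the idea only and is not the framework's implementation. The function name slice_axis, the rule that a bin is kept only if it lies entirely inside the range, and the sample numbers are all assumptions:

```python
def slice_axis(edges, counts, start=None, stop=None):
    """Keep the bins lying entirely inside [start, stop].

    `edges` has one more entry than `counts`. A bound of None leaves
    that side of the range open, like an omitted Python slice bound.
    """
    kept = [
        i
        for i in range(len(counts))
        if (start is None or edges[i] >= start)
        and (stop is None or edges[i + 1] <= stop)
    ]
    if not kept:
        return [], []
    lo, hi = kept[0], kept[-1]
    # Bins lo..hi are bounded by edges lo..hi+1.
    return edges[lo:hi + 2], counts[lo:hi + 1]


# Hypothetical HT histogram with four bins of width 250.
edges = [0, 250, 500, 750, 1000]
counts = [40, 25, 10, 3]

# Equivalent of slices: {"HT": [500, null]}
new_edges, new_counts = slice_axis(edges, counts, start=500)
print(new_edges, new_counts)
```

With these inputs the two bins below 500 are dropped and the remaining two are kept unchanged.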

Rebinning Axes Example

The RebinAxes transform merges groups of adjacent bins along an axis into single, wider bins. In the example below, the HT axis is rebinned by a factor of 2.

Postprocessing:
  processors:
    - name: Histogram1D
      inputs:
        - "*/*/*/HT"
      scale: log
      normalize: False
      structure:
        select: 
          type: "Histogram"
          pipeline: "Signal312"
          dataset_name: "re:(signal_2018_312_1500_100$|signal_2018_312_1500_1000$|signal_2018_312_1000_100$|(?!signal_).*)"
        group: {"era.name": "*"}
        transforms:
          - name: SelectAxesValues
            select_axes_values: {"variation": "central"}
          - name: RebinAxes
            rebinning: {"HT": 2} # Rebin the HT axis by combining adjacent bins
      output_name: "{prefix}/demo_rebinaxes_{era.name}.png"
  
  default_style_set:
    styles:
      - pattern:
          sample_type: 'MC'
        style:
          plottype: fill
      - pattern:
          dataset_name: 'signal*'
        style:
          plottype: step
          linewidth: 2
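
As a rough illustration of rebinning, the sketch below merges every `factor` adjacent bins of a plain histogram, stored as a list of bin edges and a list of counts. It assumes the integer in the rebinning setting is such a merge factor. The function name rebin and the sample numbers are hypothetical, and this is not the framework's own code:

```python
def rebin(edges, counts, factor):
    """Merge every `factor` adjacent bins into one wider bin."""
    if factor < 1 or len(counts) % factor != 0:
        raise ValueError("number of bins must be a multiple of factor")
    # Sum the counts of each group of `factor` neighbouring bins.
    new_counts = [
        sum(counts[i:i + factor]) for i in range(0, len(counts), factor)
    ]
    # Keep every `factor`-th edge; the last edge is always retained
    # because the number of bins is a multiple of `factor`.
    new_edges = edges[::factor]
    return new_edges, new_counts


# Hypothetical HT histogram with four bins of width 250.
edges = [0, 250, 500, 750, 1000]
counts = [40, 25, 10, 3]

# Equivalent of rebinning: {"HT": 2}
new_edges, new_counts = rebin(edges, counts, 2)
print(new_edges, new_counts)
```

The total count is unchanged by the merge; only the bin granularity is reduced.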

Dropping Samples

When an analysis contains many similar samples, such as a large grid of signal mass points, you may want to process only a subset of them to avoid plotting hundreds of overlapping distributions or running out of memory. To do this, set drop_sample_pattern at the top level of the Postprocessing block. It uses the same pattern query language as the rest of the analyzer, and matching samples are filtered out before any postprocessing action runs:

Postprocessing:
  ...
  drop_sample_pattern:
    or_exprs:
      - sample_name: "*50to100*"
      - sample_name: "*100to200*"
      - sample_name: "*200to300*"
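
The filtering described above can be sketched in a few lines. This assumes the * wildcard behaves like a shell-style glob and that or_exprs drops a sample when any one of its patterns matches. Both are assumptions about the pattern language, and the sample names are hypothetical:

```python
from fnmatch import fnmatchcase

# The sample_name patterns listed under or_exprs.
DROP_PATTERNS = ["*50to100*", "*100to200*", "*200to300*"]


def is_dropped(sample_name, patterns=DROP_PATTERNS):
    """Return True if the sample name matches any drop pattern."""
    return any(fnmatchcase(sample_name, p) for p in patterns)


# Hypothetical sample names.
samples = [
    "QCD_HT50to100",
    "QCD_HT100to200",
    "QCD_HT300to500",
    "signal_2018_312_1500_600",
]
kept = [s for s in samples if not is_dropped(s)]
print(kept)
```

Under glob semantics, a pattern such as "*50to100*" matches any name containing that substring, so a name like "QCD_HT250to1000" would be dropped as well. Tighter patterns avoid that kind of accidental match.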

Running Postprocessing

You can run a Postprocessing configuration from the command line with the postprocess subcommand.

./osca postprocess \
  config/analysis.yaml \
  full_output/**/*.result \
  --parallel 4 \
  --prefix output_figures/

This command reads your main configuration file, gathers the given .result files, and runs the processors defined in it (such as RatioPlot or CutflowTable). The --prefix option sets the base output directory for all resulting files; it is the value that the {prefix} placeholder in each output_name refers to.
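
The full_output/**/*.result argument is expanded by your shell, not by the tool. In bash, ** only recurses into subdirectories when the globstar option is enabled (shopt -s globstar); otherwise it behaves like a single *. To preview which files a recursive match would pick up, a small standalone helper along these lines may be useful. It is not part of the CLI, and the function name find_result_files is hypothetical:

```python
from pathlib import Path


def find_result_files(root):
    """Return every *.result file under `root`, at any depth, sorted by path."""
    return sorted(Path(root).rglob("*.result"))


# Print the files a recursive glob over full_output/ would match.
# Nothing is printed if the directory does not exist.
for path in find_result_files("full_output"):
    print(path)
```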