Running the Analyzer

The analysis framework provides a CLI for executing configured analyses on datasets. The primary entry point for processing data is the run command. For convenience, the ./osca script (which wraps uv run python -m analyzer) can be used to run any CLI command.
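As a sketch, such a wrapper is likely just a thin pass-through. A hypothetical ./osca could look like the following (an assumption for illustration; the real script may differ):

```shell
# Hypothetical ./osca wrapper (sketch only; the real script may differ).
# It forwards every CLI argument straight to the analyzer module via uv.
cat > osca <<'EOF'
#!/usr/bin/env bash
exec uv run python -m analyzer "$@"
EOF
chmod +x osca
```

With a wrapper like this, ./osca check ... and uv run python -m analyzer check ... would be equivalent.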

Key Commands

Here are the most common commands you will use:

  • ./osca run: Execute the analyzer on datasets

  • ./osca check: Check the status of processed output files

  • ./osca patch: Resubmit failed or missing jobs

  • ./osca browse: Interactively browse results

Analysis Workflow

A standard, robust workflow for running an analysis involves several stages: test locally -> run on Condor -> check for failures -> patch failures -> check again.

1. Test Locally

Before submitting thousands of jobs to a cluster, it is highly recommended to do a limited-event run locally to catch any bugs in your analyzer code or configuration. The immediate or single-process local execution modes are well suited to this, combined with --max-sample-events.

./osca run \
  -e imm-10000 \
  --max-sample-events 10000 \
  config/analysis.yaml \
  test_output/

2. Run on Condor

Once you’ve verified the analysis works locally on a small subset of events, you can launch the full dataset processing using Condor (via Dask).

Note: Make sure you are in a ``tmux`` or ``screen`` session before launching this command! The Dask scheduler runs on the local node, and without a persistent session, processing will crash if your SSH connection drops.
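For example, a persistent session can be set up with tmux before submitting. The session name ``analysis`` below is arbitrary, and the block is guarded in case tmux is not installed:

```shell
# Run the submission inside a detached tmux session so the Dask scheduler
# survives an SSH disconnect. "analysis" is an arbitrary session name.
SESSION=analysis
if command -v tmux >/dev/null 2>&1; then
  tmux new-session -d -s "$SESSION"   # start a detached session
  tmux ls                             # verify it exists
  # Launch the long-running command inside it, then detach/reattach:
  #   tmux send-keys -t analysis './osca run ...' C-m
  #   tmux attach -t analysis         # reattach after reconnecting
  tmux kill-session -t "$SESSION"     # cleanup for this demo only
fi
```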

./osca run \
  -e dask-condor-lpc-4G-100000 \
  config/analysis.yaml \
  full_output/

3. Check Results

After the Condor jobs finish (or if some failed), use the check command to see the status of the output files. Pass the --only-bad flag to quickly identify any samples that failed or are missing.

./osca check \
  -c config/analysis.yaml \
  full_output/**/*.result \
  --only-bad
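One caveat with the full_output/**/*.result pattern: in bash, ** only recurses into subdirectories when the globstar shell option is enabled (zsh supports recursive ** by default). A quick self-contained check:

```shell
# Without globstar, bash treats ** like a single *; enable it so the
# pattern matches .result files at any depth.
shopt -s globstar
demo=$(mktemp -d)
mkdir -p "$demo/sampleA/chunk0"
touch "$demo/top.result" "$demo/sampleA/chunk0/deep.result"
printf '%s\n' "$demo"/**/*.result   # lists both files, at any depth
```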

4. Patch Failed Jobs

If the previous step revealed missing or corrupted results, use the patch command. It scans the output directory and resubmits processing only for the chunks or samples that did not complete successfully.

./osca patch \
  -c config/analysis.yaml \
  -e dask-condor-lpc-4G-100000 \
  -o full_output \
  full_output/**/*.result

(You can also use a local executor for patching if only a few small pieces failed.)
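For instance, the patch invocation above can be combined with the immediate executor from step 1 (a sketch recombining flags shown earlier in this page; adjust the executor tag to your setup):

```
./osca patch \
  -c config/analysis.yaml \
  -e imm-10000 \
  -o full_output \
  full_output/**/*.result
```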

5. Final Check

Run the check command once more to confirm that the patch completed successfully and your dataset has been fully processed.

./osca check \
  -c config/analysis.yaml \
  full_output/**/*.result