Running the Analyzer
====================

The analysis framework provides a CLI for executing configured analyses on datasets.
The primary entry point for processing data is the ``run`` command.
For convenience, the ``./osca`` script (which wraps ``uv run python -m analyzer``) can be used to run any CLI command.

Key Commands
------------

Here are the most common commands you will use:

- ``./osca run``: Execute the analyzer on datasets
- ``./osca check``: Check the status of processed output files
- ``./osca patch``: Resubmit failed or missing jobs
- ``./osca browse``: Interactively browse results


Analysis Workflow
-----------------

A robust workflow for running an analysis job has five stages:
test locally -> run on Condor -> check for failures -> patch failures -> check again.

1. Test Locally
^^^^^^^^^^^^^^^

Before submitting thousands of jobs to a cluster, run a limited number of events locally to catch bugs in your analyzer code or configuration. The ``immediate`` and ``single-process-local`` execution patterns are well suited to this when combined with ``--max-sample-events``.

.. code-block:: bash

   ./osca run \
     -e imm-10000 \
     --max-sample-events 10000 \
     config/analysis.yaml \
     test_output/

2. Run on Condor
^^^^^^^^^^^^^^^^

Once you've verified the analysis works locally on a small subset of events, you can launch the full dataset processing using Condor (via Dask).

.. note::

   Make sure you are in a ``tmux`` or ``screen`` session before launching this command.
   The Dask scheduler runs on the local node, so without a persistent session, processing will crash if your SSH connection drops.

.. code-block:: bash

   ./osca run \
     -e dask-condor-lpc-4G-100000 \
     config/analysis.yaml \
     full_output/
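
One way to keep the run inside a persistent session is to start it in a detached ``tmux`` session. The sketch below uses only standard ``tmux`` subcommands. The session name ``osca-run``, the log file name, and the ``start_in_tmux`` helper are made up for illustration and are not part of the framework.

.. code-block:: bash

   #!/usr/bin/env bash
   # Sketch: launch the Condor run inside a detached tmux session.
   # Session and log names are hypothetical; the osca arguments are copied from above.
   session="osca-run"
   cmd="./osca run -e dask-condor-lpc-4G-100000 config/analysis.yaml full_output/"

   start_in_tmux() {
     # Refuse to start a second session under the same name.
     if tmux has-session -t "$session" 2>/dev/null; then
       echo "session '$session' already exists; attach with: tmux attach -t $session"
       return 1
     fi
     # -d starts the session detached, so it keeps running if SSH drops.
     # tee keeps a copy of the output after the session has closed.
     tmux new-session -d -s "$session" "$cmd 2>&1 | tee osca-run.log"
     echo "started; attach with: tmux attach -t $session"
   }

Calling ``start_in_tmux`` from the node where the scheduler should run starts the job in the background. ``tmux attach -t osca-run`` reconnects to it after a new login.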

3. Check Results
^^^^^^^^^^^^^^^^

After the Condor jobs finish, whether or not some of them failed, use the ``check`` command to see the status of the output files. Pass the ``--only-bad`` flag to list only the samples that failed or are missing.

.. code-block:: bash

   ./osca check \
     -c config/analysis.yaml \
     full_output/**/*.result \
     --only-bad
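
When left unquoted, the ``**`` in the pattern above is expanded by the shell. Bash only lets it recurse through several directory levels when its ``globstar`` option is enabled (bash 4 or newer); otherwise it behaves like a single ``*``. Whether ``osca`` expands patterns on its own is not covered here. The following sketch uses a throwaway directory with made-up file names to show the difference, so you can see what your shell would pass to ``check``:

.. code-block:: bash

   #!/usr/bin/env bash
   # Throwaway layout with hypothetical names, only to show how ** behaves.
   demo_dir="$(mktemp -d)"
   trap 'rm -rf "$demo_dir"' EXIT
   mkdir -p "$demo_dir/full_output/sampleA/chunk0"
   touch "$demo_dir/full_output/sampleA/a.result" \
         "$demo_dir/full_output/sampleA/chunk0/b.result"

   cd "$demo_dir" || exit 1

   # Without globstar, ** acts like a single *: exactly one directory level.
   shopt -u globstar
   shallow=(full_output/**/*.result)

   # With globstar, ** matches zero or more directory levels.
   shopt -s globstar
   deep=(full_output/**/*.result)

   echo "without globstar: ${#shallow[@]} file(s)"
   echo "with globstar:    ${#deep[@]} file(s)"

With this layout the first count is 1 and the second is 2. If your result files sit more than one level below the output directory, running ``shopt -s globstar`` in the session before ``check`` or ``patch`` is one way to make the pattern cover them.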

4. Patch Failed Jobs
^^^^^^^^^^^^^^^^^^^^

If the previous step reported missing or corrupted results, use the ``patch`` command. It inspects the existing results in the output directory and resubmits only the chunks or samples that did not complete successfully.

.. code-block:: bash

   ./osca patch \
     -c config/analysis.yaml \
     -e dask-condor-lpc-4G-100000 \
     -o full_output \
     full_output/**/*.result

*(You can also use a local executor for patching if only a few small pieces failed.)*
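
As a sketch of the local variant mentioned above, the same ``patch`` call can be given the ``imm-10000`` executor from step 1 instead of the Condor one. That ``patch`` accepts this executor name is an assumption drawn from the note, not something confirmed here. The small ``run`` helper is hypothetical: it prints the command by default and only executes it when ``DRY_RUN=0`` is set.

.. code-block:: bash

   #!/usr/bin/env bash
   # Hypothetical helper: print the command by default, execute it when DRY_RUN=0.
   run() {
     if [ "${DRY_RUN:-1}" = "1" ]; then
       echo "$@"
     else
       "$@"
     fi
   }

   # Same arguments as the Condor patch above, with a local executor (assumed to be valid here).
   run ./osca patch \
     -c config/analysis.yaml \
     -e imm-10000 \
     -o full_output \
     full_output/**/*.result

Printing the command first makes it easy to confirm the executor and paths before anything is resubmitted.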

5. Final Check
^^^^^^^^^^^^^^

Run the ``check`` command once more to confirm that the patch completed successfully and the dataset has been fully processed.

.. code-block:: bash

   ./osca check \
     -c config/analysis.yaml \
     full_output/**/*.result