CUPiD Documentation#

CUPiD: CESM Unified Postprocessing and Diagnostics#

Python Framework for Generating Diagnostics from CESM

Project Vision#

CUPiD is a “one stop shop” that enables and integrates timeseries file generation, data standardization, diagnostics, and metrics from all CESM components.

This collaborative effort aims to simplify the user experience of running diagnostics by calling post-processing tools directly from CUPiD, running all component diagnostics from the same tool as either part of the CIME workflow or independently, and sharing python code and a standard conda environment across components.

Installing#

To install CUPiD, you need to check out the code and then set up a few environments. The initial examples have hard-coded paths that require you to be on casper.

The code relies on submodules to install a few packages that are still being developed, so the git clone process requires --recurse-submodules:

$ git clone --recurse-submodules https://github.com/NCAR/CUPiD.git

Then cd into the CUPiD directory and build the necessary conda environments with

$ cd CUPiD
$ mamba env create -f environments/cupid-infrastructure.yml
$ conda activate cupid-infrastructure
$ which cupid-diagnostics
$ mamba env create -f environments/cupid-analysis.yml

Notes:

  1. As of version 23.10.0, conda defaults to using mamba to solve environments. It still feels slower than running mamba directly, hence the recommendation to install with mamba env create rather than conda env create. If you do not have mamba installed, you can still use conda; it will just be significantly slower. (To see what version of conda you have installed, run conda --version.)

  2. If the subdirectories in externals/ are all empty, run git submodule update --init to clone the submodules.

  3. For existing users who cloned CUPiD prior to the switch from manage externals to git submodule, we recommend removing externals/ before checking out main, running git submodule update --init, and removing manage_externals (if it is still present after git submodule update --init).

  4. If which cupid-diagnostics returned the error which: no cupid-diagnostics in ($PATH), then please run the following:

    $ conda activate cupid-infrastructure
    $ pip install -e .  # installs cupid

  5. In the cupid-infrastructure environment, run pre-commit install to configure git to automatically run pre-commit checks when you try to commit changes from the cupid-infrastructure environment; the commit will only proceed if all checks pass. Note that CUPiD uses pre-commit to ensure code formatting guidelines are followed, and pull requests will not be accepted if they fail the pre-commit-based GitHub Action.

  6. If you plan on contributing code to CUPiD, whether developing CUPiD itself or providing notebooks for CUPiD to run, please see the Contributor’s Guide.
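
The checks described in notes 1 and 4 can also be scripted. The following is a minimal sketch and is not part of CUPiD itself: the helper names parse_conda_version, solver_is_mamba_by_default, and missing_commands are hypothetical, and the sketch assumes that conda --version prints text of the form "conda 23.10.0".

```python
import re
import shutil

# Version from which conda uses mamba to solve environments by default (note 1).
MAMBA_DEFAULT_SINCE = (23, 10, 0)


def parse_conda_version(text):
    """Extract (major, minor, patch) from text such as 'conda 23.10.0'."""
    match = re.search(r"(\d+)\.(\d+)\.(\d+)", text)
    if match is None:
        raise ValueError(f"no version number found in {text!r}")
    return tuple(int(part) for part in match.groups())


def solver_is_mamba_by_default(version_text):
    """Return True if this conda version defaults to the mamba solver."""
    return parse_conda_version(version_text) >= MAMBA_DEFAULT_SINCE


def missing_commands(names):
    """Return the commands from `names` that are not found on $PATH."""
    return [name for name in names if shutil.which(name) is None]


if __name__ == "__main__":
    # Intended to be run inside the cupid-infrastructure environment.
    missing = missing_commands(["cupid-diagnostics"])
    if missing:
        print("Not on $PATH:", ", ".join(missing))
        print("Try: pip install -e .  (from the CUPiD directory)")
    else:
        print("cupid-diagnostics is available")
```

Passing the output of conda --version to solver_is_mamba_by_default tells you whether note 1 applies to your installation; missing_commands mirrors the which cupid-diagnostics check from note 4.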

Running#

CUPiD currently provides an example for generating diagnostics. To test the package out, try to run examples/key_metrics:

$ conda activate cupid-infrastructure
$ cd examples/key_metrics
$ # machine-dependent: request multiple compute cores
$ cupid-diagnostics
$ cupid-webpage  # Will build HTML from Jupyter Book

After the last step is finished, you can use Jupyter to view generated notebooks in ${CUPID_ROOT}/examples/key_metrics/computed_notebooks, or you can view ${CUPID_ROOT}/examples/key_metrics/computed_notebooks/_build/html/index.html in a web browser.
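
As an illustration of where the built page ends up, here is a small sketch that assembles the path to index.html and tries to open it. The function names html_index and open_in_browser are hypothetical, and the sketch assumes that the CUPID_ROOT environment variable points at the top of your CUPiD checkout and that a web browser is available on the machine where you run it (which may not be the case on a compute node).

```python
import os
import webbrowser
from pathlib import Path


def html_index(cupid_root, example="key_metrics"):
    """Return the path of the built index.html for one example.

    `example` is the directory name under examples/; the default matches
    the `cd examples/key_metrics` command used to run the example.
    """
    return (
        Path(cupid_root)
        / "examples"
        / example
        / "computed_notebooks"
        / "_build"
        / "html"
        / "index.html"
    )


def open_in_browser(cupid_root, example="key_metrics"):
    """Open the built page if it exists; return False if it is missing."""
    index = html_index(cupid_root, example)
    if not index.is_file():
        return False
    return webbrowser.open(index.resolve().as_uri())


if __name__ == "__main__":
    # Assumption: CUPID_ROOT is set; otherwise fall back to the current directory.
    root = os.environ.get("CUPID_ROOT", ".")
    if not open_in_browser(root):
        print("No built page found at", html_index(root))
```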

Notes:

  1. Occasionally users report the following error the first time they run CUPiD: Environment cupid-analysis specified for <YOUR-NOTEBOOK>.ipynb could not be found. The fix for this is the following:

    $ conda activate cupid-analysis
    (cupid-analysis) $ python -m ipykernel install --user --name=cupid-analysis

Furthermore, to clean the computed_notebooks folder which was generated by the cupid-diagnostics and cupid-webpage commands, you can run the following command:

$ cupid-clean

This will clean the computed_notebooks folder which is at the location pointed to by the run_dir variable in the config.yml file.

CUPiD Options#

Most of CUPiD’s configuration is done via the config.yml file, but there are a few command line options as well:

(cupid-infrastructure) $ cupid-diagnostics -h
Usage: cupid-diagnostics [OPTIONS] CONFIG_PATH

  Main engine to set up running all the notebooks.

Options:
  -s, --serial          Do not use LocalCluster objects
  -ts, --time-series    Run time series generation scripts prior to diagnostics
  -atm, --atmosphere    Run atmosphere component diagnostics
  -ocn, --ocean         Run ocean component diagnostics
  -lnd, --land          Run land component diagnostics
  -ice, --seaice        Run sea ice component diagnostics
  -glc, --landice       Run land ice component diagnostics
  -rof, --river-runoff  Run river runoff component diagnostics
  --config_path         Path to the YAML configuration file containing
                        specifications for notebooks (default config.yml)
  -h, --help            Show this message and exit.

Running in serial#

By default, several of the example notebooks provided use a dask LocalCluster object to run in parallel. However, the --serial option will pass a logical flag to each notebook that can be used to skip starting the cluster.

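from dask.distributed import Client, LocalCluster

# `serial` is the logical flag that CUPiD passes to the notebook;
# `lc_kwargs` holds the keyword arguments for LocalCluster.

# Spin up cluster (if running in parallel)
client = None
if not serial:
    cluster = LocalCluster(**lc_kwargs)
    client = Client(cluster)
client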
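
One way a notebook might act on the serial flag is sketched below. This is an illustration under stated assumptions, not code taken from the CUPiD notebooks: the functions start_client and summarize are hypothetical, and dask is imported only when a cluster is actually requested so that the serial path runs without it.

```python
def start_client(serial, lc_kwargs=None):
    """Return a dask Client, or None when running in serial."""
    if serial:
        return None
    # Imported lazily: only needed for parallel runs.
    from dask.distributed import Client, LocalCluster

    cluster = LocalCluster(**(lc_kwargs or {}))
    return Client(cluster)


def summarize(values, client=None):
    """Sum `values`, using the cluster when a client is available."""
    if client is None:
        # Serial path: compute directly in this process.
        return sum(values)
    # Parallel path: schedule sum(values) on a worker and wait for the result.
    return client.submit(sum, values).result()


client = start_client(serial=True)
print(summarize([1, 2, 3], client))  # prints 6
```

Keeping the client as an optional argument lets the same analysis code run with or without a cluster, which is the behavior the --serial option is meant to allow.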
Specifying components#

If no component flags are provided, all component diagnostics listed in config.yml will be executed by default. Multiple flags can be used together to select a group of components, for example: cupid-diagnostics -ocn -ice.
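
If you launch CUPiD from a script, the component flags can be assembled programmatically. The sketch below is only an illustration: build_command and run are hypothetical helpers, the flag names are copied from the cupid-diagnostics -h output, and run assumes that cupid-diagnostics is installed and that the current directory contains config.yml.

```python
import subprocess

# Component flags copied from the `cupid-diagnostics -h` output.
COMPONENT_FLAGS = {
    "atmosphere": "-atm",
    "ocean": "-ocn",
    "land": "-lnd",
    "seaice": "-ice",
    "landice": "-glc",
    "river-runoff": "-rof",
}


def build_command(components=(), serial=False):
    """Build the argument list for a cupid-diagnostics run.

    An empty `components` adds no component flags, so every component
    listed in config.yml is run.
    """
    command = ["cupid-diagnostics"]
    if serial:
        command.append("--serial")
    for name in components:
        if name not in COMPONENT_FLAGS:
            raise ValueError(f"unknown component: {name!r}")
        command.append(COMPONENT_FLAGS[name])
    return command


def run(components=(), serial=False):
    """Run cupid-diagnostics in the current directory (needs CUPiD installed)."""
    return subprocess.run(build_command(components, serial), check=True)


print(" ".join(build_command(["ocean", "seaice"])))  # cupid-diagnostics -ocn -ice
```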

Timeseries File Generation#

CUPiD also has the capability to generate single variable timeseries files from history files for all components. To run timeseries, edit the config.yml file’s timeseries section to fit your preferences, and then run cupid-timeseries.