Python Framework for Generating Diagnostics from CESM
CUPiD is a “one stop shop” that enables and integrates timeseries file generation, data standardization, diagnostics, and metrics from all CESM components.
This collaborative effort aims to simplify the user experience of running diagnostics by calling post-processing tools directly from CUPiD, running all component diagnostics from the same tool as either part of the CIME workflow or independently, and sharing Python code and a standard conda environment across components.
To install CUPiD, you need to check out the code and then set up a few environments. The initial examples have hard-coded paths that require you to be on `casper`.
The code relies on submodules to install a few packages that are still being developed, so the `git clone` process requires `--recurse-submodules`:
$ git clone --recurse-submodules https://github.com/NCAR/CUPiD.git
Then `cd` into the `CUPiD` directory and build the necessary conda environments with
$ cd CUPiD
$ mamba env create -f environments/cupid-infrastructure.yml
$ conda activate cupid-infrastructure
$ which cupid-diagnostics
$ mamba env create -f environments/cupid-analysis.yml
Notes:
- As of version 23.10.0, `conda` defaults to using `mamba` to solve environments. It still feels slower than running `mamba` directly, hence the recommendation to install with `mamba env create` rather than `conda env create`. If you do not have `mamba` installed, you can still use `conda`; it will just be significantly slower. (To see what version of conda you have installed, run `conda --version`.)
- If the subdirectories in `externals/` are all empty, run `git submodule update --init` to clone the submodules.
- For existing users who cloned `CUPiD` prior to the switch from manage externals to git submodule, we recommend removing `externals/` before checking out main, running `git submodule update --init`, and removing `manage_externals` (if it is still present after `git submodule update --init`).
- If `which cupid-diagnostics` returned the error `which: no cupid-diagnostics in ($PATH)`, then please run the following:
  $ conda activate cupid-infrastructure
  $ pip install -e . # installs cupid
- In the `cupid-infrastructure` environment, run `pre-commit install` to configure `git` to automatically run `pre-commit` checks when you try to commit changes from the `cupid-infrastructure` environment; the commit will only proceed if all checks pass. Note that CUPiD uses `pre-commit` to ensure code formatting guidelines are followed, and pull requests will not be accepted if they fail the `pre-commit`-based GitHub Action.
- If you plan on contributing code to CUPiD, whether developing CUPiD itself or providing notebooks for CUPiD to run, please see the Contributor's Guide.
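The empty-`externals/` check from the notes above can be scripted. The following is a minimal sketch, not part of CUPiD: the helper name `empty_submodule_dirs` is hypothetical, and it assumes the script is run from the top of the CUPiD checkout.

```python
from pathlib import Path


def empty_submodule_dirs(externals: Path) -> list[Path]:
    """Return subdirectories of `externals` that contain no entries."""
    if not externals.is_dir():
        return []
    return sorted(
        child
        for child in externals.iterdir()
        if child.is_dir() and not any(child.iterdir())
    )


if __name__ == "__main__":
    # Assumption: the current directory is the top of the CUPiD checkout.
    empty = empty_submodule_dirs(Path("externals"))
    if empty:
        names = ", ".join(p.name for p in empty)
        print(f"Empty submodule directories: {names}")
        print("Run: git submodule update --init")
    else:
        print("No empty subdirectories found in externals/")
```

The sketch only reports empty directories; it leaves running `git submodule update --init` to you.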
CUPiD currently provides an example for generating diagnostics. To test the package out, try to run `examples/key_metrics`:
$ conda activate cupid-infrastructure
$ cd examples/key_metrics
$ # machine-dependent: request multiple compute cores
$ cupid-diagnostics
$ cupid-webpage # Will build HTML from Jupyter Book
After the last step is finished, you can use Jupyter to view generated notebooks in `${CUPID_ROOT}/examples/key_metrics/computed_notebooks` or you can view `${CUPID_ROOT}/examples/key_metrics/computed_notebooks/_build/html/index.html` in a web browser.
Notes:
- Occasionally users report the following error the first time they run CUPiD: `Environment cupid-analysis specified for <YOUR-NOTEBOOK>.ipynb could not be found`. The fix for this is the following:
  $ conda activate cupid-analysis
  (cupid-analysis) $ python -m ipykernel install --user --name=cupid-analysis
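To check ahead of time whether the `cupid-analysis` kernel is registered, something like the sketch below may help. It is not part of CUPiD: the helper `kernel_is_registered` is hypothetical, and the default directory is an assumption that holds for a standard per-user install on Linux but differs on other platforms or when `JUPYTER_DATA_DIR` or `XDG_DATA_HOME` is set.

```python
from pathlib import Path

# Assumption: default per-user kernel location on Linux.
DEFAULT_KERNELS_DIR = Path.home() / ".local" / "share" / "jupyter" / "kernels"


def kernel_is_registered(name: str, kernels_dir: Path = DEFAULT_KERNELS_DIR) -> bool:
    """Return True if a kernel spec named `name` exists under `kernels_dir`."""
    return (kernels_dir / name / "kernel.json").is_file()


if __name__ == "__main__":
    name = "cupid-analysis"
    if kernel_is_registered(name):
        print(f"Kernel '{name}' is registered")
    else:
        print(f"Kernel '{name}' not found; run:")
        print(f"  python -m ipykernel install --user --name={name}")
```

Running `jupyter kernelspec list` gives the authoritative list of registered kernels and where they live.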
Furthermore, to clean the `computed_notebooks` folder which was generated by the `cupid-diagnostics` and `cupid-webpage` commands, you can run the following command:
$ cupid-clean
This will clean the `computed_notebooks` folder which is at the location pointed to by the `run_dir` variable in the `config.yml` file.
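As a rough illustration of how the output locations relate to `run_dir`, the sketch below builds the two paths mentioned in this README. It does not read `config.yml` or call any CUPiD code; the helper `output_paths` is hypothetical, and it assumes `computed_notebooks` sits directly under `run_dir`.

```python
from pathlib import Path


def output_paths(run_dir: str) -> dict[str, Path]:
    """Build the output locations described above from a `run_dir` value."""
    notebooks = Path(run_dir) / "computed_notebooks"
    return {
        "notebooks": notebooks,
        "index_html": notebooks / "_build" / "html" / "index.html",
    }


if __name__ == "__main__":
    # Assumption: "." stands in for the run_dir value set in config.yml.
    for label, path in output_paths(".").items():
        status = "exists" if path.exists() else "missing"
        print(f"{label}: {path} ({status})")
```

After `cupid-clean`, both paths would be expected to report `missing`.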
Most of CUPiD's configuration is done via the `config.yml` file, but there are a few command line options as well:
(cupid-infrastructure) $ cupid-diagnostics -h
Usage: cupid-diagnostics [OPTIONS] CONFIG_PATH

  Main engine to set up running all the notebooks.

Options:
  -s, --serial          Do not use LocalCluster objects
  -ts, --time-series    Run time series generation scripts prior to diagnostics
  -atm, --atmosphere    Run atmosphere component diagnostics
  -ocn, --ocean         Run ocean component diagnostics
  -lnd, --land          Run land component diagnostics
  -ice, --seaice        Run sea ice component diagnostics
  -glc, --landice       Run land ice component diagnostics
  -rof, --river-runoff  Run river runoff component diagnostics
  --config_path         Path to the YAML configuration file containing
                        specifications for notebooks (default config.yml)
  -h, --help            Show this message and exit.
By default, several of the example notebooks provided use a dask `LocalCluster` object to run in parallel. However, the `--serial` option will pass a logical flag to each notebook that can be used to skip starting the cluster.
from dask.distributed import Client, LocalCluster

# Spin up cluster (if running in parallel)
# `serial` is the logical flag set by the --serial option
client = None
if not serial:
    cluster = LocalCluster(**lc_kwargs)
    client = Client(cluster)

client
If no component flags are provided, all component diagnostics listed in `config.yml` will be executed by default. Multiple flags can be used together to select a group of components, for example: `cupid-diagnostics -ocn -ice`.
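The selection rule described above can be sketched in a few lines. This is an illustration only, not CUPiD's implementation: the function `select_components` is hypothetical, and the short component keys (`atm`, `ocn`, and so on) are borrowed from the flag names and may not match the keys actually used in `config.yml`.

```python
def select_components(configured, requested=()):
    """Return the components to run, preserving the order in `configured`.

    configured: component keys listed in config.yml (assumed names)
    requested:  component keys selected by flags such as -ocn or -ice
    """
    if not requested:
        # No component flags: run everything that is configured.
        return list(configured)
    wanted = set(requested)
    return [name for name in configured if name in wanted]


if __name__ == "__main__":
    configured = ["atm", "ocn", "lnd", "ice"]
    # No flags given: every configured component is selected.
    print(select_components(configured))
    # Comparable to `cupid-diagnostics -ocn -ice`.
    print(select_components(configured, ["ocn", "ice"]))
```

In this sketch a requested component that is not configured is silently skipped; whether CUPiD warns or errors in that case is not stated here.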
CUPiD also has the capability to generate single variable timeseries files from history files for all components. To run timeseries, edit the `config.yml` file's timeseries section to fit your preferences, and then run `cupid-timeseries`.