A graph-based workflow manager for computational chemistry pipelines
MolecularAI/maize
maize is a graph-based workflow manager for computational chemistry pipelines.
It is based on the principles of flow-based programming and thus allows arbitrary graph topologies, including cycles, to be executed. Each task in the workflow (referred to as a node) is run as a separate process and interacts with other nodes in the graph by communicating through unidirectional channels, which are connected to ports on each node. Every node can have an arbitrary number of input or output ports, and can read from them at any time, any number of times. This allows complex task dependencies and cycles to be modelled effectively.
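The node/channel model described above can be illustrated without maize itself. The following is a minimal sketch using only the Python standard library, under the assumption that threads and `queue.Queue` objects are acceptable stand-ins for maize's separate processes and channels; the names `producer`, `concat`, and `run_graph` are hypothetical and are not part of the maize API.

```python
import queue
import threading


def producer(word: str, out: queue.Queue) -> None:
    """A node with one output port: send a single item and finish."""
    out.put(word)


def concat(inputs: list[queue.Queue], out: queue.Queue) -> None:
    """A node with several input ports: receive one item per port, in port order."""
    out.put(" ".join(channel.get() for channel in inputs))


def run_graph() -> str:
    # One unidirectional channel per edge in the graph
    a, b, result = queue.Queue(), queue.Queue(), queue.Queue()
    # Threads stand in for maize's per-node processes (assumption for portability)
    nodes = [
        threading.Thread(target=producer, args=("Hello", a)),
        threading.Thread(target=producer, args=("maize", b)),
        threading.Thread(target=concat, args=([a, b], result)),
    ]
    for node in nodes:
        node.start()
    for node in nodes:
        node.join()
    return result.get()


print(run_graph())
```

Because every node only blocks on its own input channels, the nodes can be started in any order, which is what makes arbitrary topologies possible in this style of execution.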
This repository contains the core workflow execution functionality. For domain-specific steps and utilities, you should additionally install maize-contrib, which will have additional dependencies.
A taste of defining and running workflows with maize:
"""A simple hello-world-ish example graph."""frommaize.core.interfaceimportParameter,Output,MultiInputfrommaize.core.nodeimportNodefrommaize.core.workflowimportWorkflow# Define the nodesclassExample(Node):data:Parameter[str]=Parameter(default="Hello")out:Output[str]=Output()defrun(self)->None:self.out.send(self.data.value)classConcatAndPrint(Node):inp:MultiInput[str]=MultiInput()defrun(self)->None:result=" ".join(inp.receive()forinpinself.inp)self.logger.info("Received: '%s'",result)# Build the graphflow=Workflow(name="hello")ex1=flow.add(Example,name="ex1")ex2=flow.add(Example,name="ex2",parameters=dict(data="maize"))concat=flow.add(ConcatAndPrint)flow.connect(ex1.out,concat.inp)flow.connect(ex2.out,concat.inp)# Check and run!flow.check()flow.execute()
If you do not plan on modifying maize and will be using maize-contrib, then you should just follow the installation instructions for the latter. Maize will be installed automatically as a dependency.
Note that maize-contrib requires several additional domain-specific packages, and you should use its own environment file instead if you plan on using these extensions.
To get started quickly with running maize, you can install from an environment file:
    conda env create -f env-users.yml
    conda activate maize
    pip install --no-deps ./
If you want to develop the code or run the tests, use the development environment and install the package in editable mode:
    conda env create -f env-dev.yml
    conda activate maize-dev
    pip install --no-deps -e ./
Maize requires Python 3.10 and the following packages:
- dill
- networkx
- pyyaml
- toml
- numpy
- matplotlib
- graphviz
- beartype
- psij-python
We also strongly recommend the installation of mypy. To install everything use the following command:
    conda install -c conda-forge python=3.10 dill networkx yaml toml mypy
If you wish to develop or add additional modules, the following additional packages will be required:
- pytest
- sphinx
You can find guides, examples, and the API in the documentation.
maize is still in an experimental stage, but the core of it is working:
- Arbitrary workflows with conditionals and loops
- Subgraphs, Subsubgraphs, ...
- Type-safe channels: the graph will not build if port types mismatch
- Nodes for broadcasting, merging, round-robin, ...
- Potential deadlock warnings
- Multiple retries
- Fail-okay nodes
- Channels can send most types of data (using dill in the background)
- Command-line exposure
- Custom per-node python environments
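To make the "type-safe channels" item in the list above concrete, here is a minimal sketch of what a build-time port type check could look like. It is an illustration only and does not reflect how maize implements the check; the `Port` class and `connect` function are hypothetical names, and the compatibility rule (a plain `issubclass` test) is an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Port:
    """Hypothetical port description: a name plus the item type it carries."""

    name: str
    dtype: type


def connect(sending: Port, receiving: Port) -> tuple[Port, Port]:
    """Return the edge if the types are compatible, otherwise raise TypeError."""
    if not issubclass(sending.dtype, receiving.dtype):
        raise TypeError(
            f"cannot connect {sending.name} ({sending.dtype.__name__}) "
            f"to {receiving.name} ({receiving.dtype.__name__})"
        )
    return sending, receiving


# Matching types: the edge is accepted
edge = connect(Port("ex1.out", str), Port("concat.inp", str))

# Mismatched types: the error surfaces while building the graph, before any node runs
try:
    connect(Port("counter.out", int), Port("concat.inp", str))
except TypeError as err:
    print(f"rejected at build time: {err}")
```

Checking at connection time means a mismatch is reported before any potentially expensive node has started running.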