Execute a subset of Python on HPC platforms
pypr/compyle
Compyle allows users to execute a restricted subset of Python (almost similar to C) on a variety of HPC platforms. Currently we support multi-core CPU execution using Cython, and for GPU devices we use OpenCL or CUDA.
Users start with code implemented in a very restricted Python syntax; this code is then automatically transpiled, compiled and executed to run on either one CPU core, or multiple CPU cores (via OpenMP), or on a GPU. Compyle offers source-to-source transpilation, making it a very convenient tool for writing HPC libraries.
Some simple yet powerful parallel utilities are provided which can allow you to solve a remarkably large number of interesting HPC problems. Compyle also features JIT transpilation, making it easy to use.
Documentation and learning material is also available in the form of:
- Documentation at: https://compyle.readthedocs.io
- An introduction to Compyle in the context of writing a parallel molecular dynamics simulator is in our SciPy 2020 paper.
- Compyle poster presentation
- You may also try Compyle online for free on a Google Colab notebook.
While Compyle seems simple, it is not a toy and is used heavily by the PySPH project, where Compyle has its origins.
Compyle is itself largely pure Python but depends on numpy and requires either Cython or PyOpenCL or PyCUDA along with the respective backends of a C/C++ compiler, OpenCL and CUDA. If you are only going to execute code on a CPU then all you need is Cython.
You should be able to install Compyle by doing:
$ pip install compyle
Here is a very simple example:

    from compyle.api import Elementwise, annotate, wrap, get_config
    import numpy as np

    @annotate
    def axpb(i, x, y, a, b):
        y[i] = a*sin(x[i]) + b

    x = np.linspace(0, 1, 10000)
    y = np.zeros_like(x)
    a, b = 2.0, 3.0

    backend = 'cython'
    get_config().use_openmp = True
    x, y = wrap(x, y, backend=backend)
    e = Elementwise(axpb, backend=backend)
    e(x, y, a, b)

This will execute the elementwise operation in parallel using OpenMP with Cython. The code is auto-generated, compiled and called for you transparently. The first time this runs, it will take a bit of time to compile everything, but the next time, this is cached and will run much faster.
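To make the semantics of the elementwise operation concrete, here is a serial, pure-Python sketch of what the example computes: the function is called once for every index `i`. This is only an illustration and not the code that Compyle generates; the helper `elementwise_reference` is a hypothetical name introduced here, and `sin` is taken from Python's `math` module so the sketch runs without Compyle installed.

```python
from math import sin


def axpb(i, x, y, a, b):
    # Same body as the kernel in the example above.
    y[i] = a * sin(x[i]) + b


def elementwise_reference(func, x, y, a, b):
    # Hypothetical helper (not part of Compyle): apply func to every
    # index one after another. Compyle instead generates compiled code
    # that distributes these indices over CPU cores or GPU threads.
    for i in range(len(x)):
        func(i, x, y, a, b)


n = 5
x = [k / (n - 1) for k in range(n)]  # 5 evenly spaced points in [0, 1]
y = [0.0] * n
elementwise_reference(axpb, x, y, 2.0, 3.0)
print(y)
```

A loop like this can serve as a slow reference against which to compare the output of the compiled version on a small input.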
If you just set backend = 'opencl', the exact same code will be executed using PyOpenCL, and if you change the backend to 'cuda', it will execute via CUDA without any other changes to your code. This is obviously a very trivial example; there are more complex examples available as well.
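Since only the backend string changes between targets, one possible way to choose it is to check which of the optional packages can be imported. The sketch below assumes the usual import names (`Cython`, `pyopencl`, `pycuda`); `pick_backend`, `is_installed` and `BACKEND_MODULES` are hypothetical helpers written for this illustration and are not part of Compyle. Note that an importable package does not guarantee that a working compiler or GPU device is present.

```python
from importlib.util import find_spec

# Backend string used in the example -> import name of the package it
# relies on (assumed import names, see the note above).
BACKEND_MODULES = {
    "cuda": "pycuda",
    "opencl": "pyopencl",
    "cython": "Cython",
}


def is_installed(module_name):
    # True if the top-level module can be located, without importing it.
    return find_spec(module_name) is not None


def pick_backend(preferred=("cuda", "opencl", "cython"), installed=is_installed):
    # Return the first backend in `preferred` whose package is available.
    for name in preferred:
        if installed(BACKEND_MODULES[name]):
            return name
    raise RuntimeError("None of Cython, PyOpenCL or PyCUDA appears to be installed")


if __name__ == "__main__":
    # List the backends whose packages are importable on this machine.
    print([name for name, mod in BACKEND_MODULES.items() if is_installed(mod)])
```

The returned string could then be passed as the `backend` argument in the example above, for instance `backend = pick_backend()`.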
Some simple examples and benchmarks are available in the examples directory.
You may also run these examples on the Google Colab notebook.