slurm
Here are 737 public repositories matching this topic...
Machine Learning Engineering Open Book
- Updated Jul 18, 2025 - Python
Slurm: A Highly Scalable Workload Manager
- Updated Jul 18, 2025 - C
A DSL for data-driven computational pipelines
- Updated Jul 18, 2025 - Groovy
dstack is an open-source container orchestrator that simplifies workload orchestration and drives GPU utilization for ML teams. It works with any GPU cloud, on-prem cluster, or accelerated hardware.
- Updated Jul 18, 2025 - Python
A scalable, efficient, cross-platform (Linux/macOS) and easy-to-use workflow engine in pure Python.
- Updated Jul 17, 2025 - Python
Best practices & guides on how to write distributed pytorch training code
- Updated Feb 24, 2025 - Python
Lightweight fast function pipeline (DAG) creation in pure Python for scientific workflows 🕸️🧪
- Updated Jul 18, 2025 - Python
A Slurm cluster using docker-compose
- Updated Jul 17, 2025 - Dockerfile
TorchX is a universal job launcher for PyTorch applications. TorchX is designed to have fast iteration time for training/research and support for E2E production ML pipelines when you're ready.
- Updated Jul 17, 2025 - Python
A scheduler for GPU/CPU tasks
- Updated Mar 6, 2024 - C
Simplify HPC and Batch workloads on Azure
- Updated Mar 20, 2023 - Python
An open-source toolkit for deploying and managing high performance clusters for HPC, AI, and data analytics workloads.
- Updated Jul 18, 2025 - YAML
Prometheus exporter for performance metrics from Slurm.
- Updated Jun 20, 2024 - Go
Run Slurm in Kubernetes
- Updated Jul 18, 2025 - Go