athenarc/schema


Scientific Containers on Heterogeneous Machines (SCHeMa)


SCHeMa (Scheduler for scientific Containers on clusters of Heterogeneous Machines) is an open source platform to facilitate the execution and reproducibility of computational experiments on heterogeneous clusters. The platform exploits containerization, experiment packaging, and workflow management technologies to ease reproducibility, while it leverages machine learning technologies to automatically identify the type of node that is more suitable to undertake each submitted computational task.

If you are using SCHeMa for your research please cite: Thanasis Vergoulis, Konstantinos Zagganas, Loukas Kavouras, Martin Reczko, Stelios Sartzetakis, and Theodore Dalamagas. "SCHeMa: Scheduling Scientific Containers on a Cluster of Heterogeneous Machines." arXiv preprint arXiv:2103.13138 (2021).

Deploying with Helm (recommended)

The Helm chart includes the main SCHeMa web interface, along with a private Docker registry, a PostgreSQL database server, and an FTP server (for use by TESK).

Prerequisites

In order to be able to install SCHeMa you need:

Deployment

  1. Create a new namespace (schema) with
kubectl create namespace schema
  2. Edit deployment/values.yaml and fill in the values appropriate for your installation in the following fields:
| Name | Description |
| --- | --- |
| domain | The ingress domain name under which the apps are deployed |
| schema.volume.deploy_volume | Whether to deploy a storage volume for the user data in SCHeMa |
| schema.volume.size | Size of the volume (e.g. 50Gi) |
| schema.volume.storageClass | The name of the ReadWriteMany storageClass |
| postgres.volume.deploy_volume | Whether to deploy a storage volume for the DB data |
| postgres.volume.size | Same as schema.volume.size, for the DB volume |
| postgres.volume.storageClass | Same as schema.volume.storageClass, for the DB volume |
| postgres.deployment.dbUsername | Username of the DB user |
| postgres.deployment.dbPassword | Password of the DB user |
| postgres.deployment.dbName | Name of the DB |
| cluster_endpoint | Endpoint of the Kubernetes API server (e.g. https://xxx.xxx.xxx.xxx:443) |
| registry | URL of the private registry |
| registry.data_volume.deploy_volume | Whether to deploy a storage volume for the registry data |
| registry.data_volume.size | Same as schema.volume.size, for the registry data volume |
| registry.data_volume.storageClass | Same as schema.volume.storageClass, for the registry data volume |
| registry.credentials_volume.deploy_volume | Whether to deploy a storage volume for the registry authentication credentials |
| registry.credentials_volume.storageClass | Same as schema.volume.storageClass, for the registry credentials volume |
| registry.credentials_volume.size | We do not recommend anything greater than 10M for this volume |
| registry.deployment.username | Your registry username |
| registry.deployment.password | Your registry password |
| ftp.deployment.username | Your FTP username |
| ftp.deployment.password | Your FTP password |
| tesk.url | The URL of your TESK installation |
| wes.url | The URL of your cwl-WES installation |
| standalone.isStandalone | Leave as "true" unless you are running the CLIMA project management system |
| standalone.Resources | Maximum resources for job pods when running in standalone mode |
| metrics.url | Link to a metrics server dashboard of your choice (leave blank if not available) |

Note: you can either create the Persistent Volume Claims (PVCs) yourself, using the names given in values.yaml, or let the Helm chart create them automatically.

  3. Deploy the Helm chart with
helm install schema-app deployment -f deployment/values.yaml
  4. Create the database structure and add required data:
kubectl -n schema exec -it <schema-pod-id> -- psql -h postgres -U <your-db-username> -d <your-db-name> -f /app/web/schema/database_schema/schema_db.sql
  5. Run the same command for each migration file /app/web/schema/database-schema/migration-xx.sql, in order. If you are upgrading to the latest version of SCHeMa, run only the migration files that have been published since the version you have installed.
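The deployment steps above can be strung together in a small script. The following is a minimal sketch and not part of SCHeMa: the `run`, `psql_file` and `sort_migrations` helpers, the `DRY_RUN`, `POD`, `DB_USER`, `DB_NAME` and `MIGRATION_DIR` variables, and the three migration file names are all hypothetical. By default it only prints the commands; the point it illustrates is sorting the migration files numerically so that they are applied in order.

```shell
#!/bin/sh
# Sketch of the deployment steps. Prints commands unless DRY_RUN=0.
DRY_RUN="${DRY_RUN:-1}"
POD="${POD:-<schema-pod-id>}"              # placeholders, as in the steps above
DB_USER="${DB_USER:-<your-db-username>}"
DB_NAME="${DB_NAME:-<your-db-name>}"
MIGRATION_DIR="${MIGRATION_DIR:-/app/web/schema/database-schema}"

# run: print the command in dry-run mode, otherwise execute it.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

# psql_file: load one SQL file from inside the SCHeMa pod.
psql_file() {
  run kubectl -n schema exec -it "$POD" -- \
    psql -h postgres -U "$DB_USER" -d "$DB_NAME" -f "$1"
}

# sort_migrations: read file names on stdin, one per line, and print
# them ordered by the number that follows the dash.
sort_migrations() {
  sort -t- -k2,2n
}

run kubectl create namespace schema
run helm install schema-app deployment -f deployment/values.yaml

# When executing for real, wait for the pod and set POD before this point.
psql_file /app/web/schema/database_schema/schema_db.sql

# Hypothetical file names; list the real ones inside the pod first.
for f in $(printf '%s\n' migration-10.sql migration-2.sql migration-1.sql | sort_migrations); do
  psql_file "$MIGRATION_DIR/$f"
done
```

A plain lexical sort would place migration-10.sql before migration-2.sql, which is why the sketch sorts on the numeric part of the name.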

After all steps have been completed, the app should be running as expected. By default a superadministrator account is created, and you can log in using "superadmin" as both the username and the password. Change the password as soon as possible after logging in.
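One way to confirm that the app is running is to inspect the Helm release and the pods in the namespace. The sketch below assumes the release name schema-app and the namespace schema used in the steps above; `check_deployment` and the `KUBECTL`, `HELM`, `NAMESPACE` and `RELEASE` variables are hypothetical names.

```shell
#!/bin/sh
# Post-deployment checks using standard helm and kubectl subcommands.
KUBECTL="${KUBECTL:-kubectl}"
HELM="${HELM:-helm}"
NAMESPACE="${NAMESPACE:-schema}"
RELEASE="${RELEASE:-schema-app}"

check_deployment() {
  # Release status as recorded by Helm.
  $HELM status "$RELEASE" || return 1
  # Block until every pod reports Ready, or give up after five minutes.
  # This can time out if the namespace also holds completed job pods.
  $KUBECTL -n "$NAMESPACE" wait --for=condition=Ready pods --all --timeout=300s || return 1
  # Show the pods and volume claims for a final look.
  $KUBECTL -n "$NAMESPACE" get pods,pvc
}

# Usage: check_deployment
```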

Installing on a dedicated machine (Deprecated)

Prerequisites

In order to install SCHeMa you need:

  • an operational Kubernetes cluster or minikube cluster (tutorial) with metrics-server installed
  • a docker registry configured with TLS and basic authentication (or see below for installation instructions for a private local registry)
  • an Apache server with PHP 7.2 installed on the cluster master or another machine that has access to the "kubectl" command
  • a PostgreSQL database server
  • python 3 and docker installed
  • a local directory exposed via NFS (called local NFS from here on) to the cluster so that Kubernetes pods can read/write data from/on it (tutorial)
  • a system user with sudo permissions that is able to run docker and kubectl without using sudo.
  • a cwl-WES installation (see below) in the k8s namespace wes and a TESK installation in the k8s namespace tes, for workflow and task execution respectively.
  • a ReadWriteMany Kubernetes StorageClass (like NFS) for cwl-WES and TESK.
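A quick way to check the command-line side of these prerequisites is to look for each tool on the PATH. This is a minimal sketch: `missing_tools` is a hypothetical helper, and the command names passed to it are assumptions derived from the list above (for example, psql stands in for the PostgreSQL client), so adjust them to your system.

```shell
#!/bin/sh
# missing_tools: print each named command that is not on PATH.
# Returns non-zero if at least one is missing.
missing_tools() {
  status=0
  for tool in "$@"; do
    if ! command -v "$tool" >/dev/null 2>&1; then
      echo "$tool"
      status=1
    fi
  done
  return "$status"
}

# Assumed command names; Apache is left out because its binary name varies.
if missing_tools kubectl docker python3 php psql; then
  echo "all required commands found"
else
  echo "install the commands listed above before continuing" >&2
fi
```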

Required PHP packages

The node running the installation of SCHeMa should have the following PHP packages installed:

  • php-mbstring
  • php-xml
  • php-gd
  • php-pgsql
  • php-yaml
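To see which of these extensions are still missing, the output of `php -m` can be compared against the list. In this sketch `missing_php_modules` is a hypothetical helper, and the module names are assumed to be the package names without the php- prefix. Note that the PHP command-line binary and the Apache PHP module may load different sets of extensions.

```shell
#!/bin/sh
# missing_php_modules: read `php -m` output on stdin and print every
# required module that does not appear in it as a whole line.
missing_php_modules() {
  loaded="$(cat)"
  for mod in mbstring xml gd pgsql yaml; do
    printf '%s\n' "$loaded" | grep -qix "$mod" || echo "$mod"
  done
}

# Run the check only if the php command is available.
if command -v php >/dev/null 2>&1; then
  php -m | missing_php_modules
fi
```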

Required Python packages

The node running the installation of SCHeMa should have the following Python packages installed:

  • python3-ruamel.yaml
  • python3-psycopg2
  • python3-yaml
  • python3-requests
  • rocrate (install with pip3)
  • python3-sklearn
  • dockertarpusher (install with pip3)
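Whether the Python packages are usable can be checked by trying to import each one. The sketch below is illustrative only: `missing_py_modules` and the `PYTHON` variable are hypothetical, and the import names (for example `sklearn` for python3-sklearn and `yaml` for python3-yaml) are assumptions about how each package is imported.

```shell
#!/bin/sh
PYTHON="${PYTHON:-python3}"

# missing_py_modules: print each module that the interpreter cannot import.
missing_py_modules() {
  for mod in "$@"; do
    "$PYTHON" -c "import $mod" >/dev/null 2>&1 || echo "$mod"
  done
}

# Assumed import names for the packages listed above.
if command -v "$PYTHON" >/dev/null 2>&1; then
  missing_py_modules ruamel.yaml psycopg2 yaml requests rocrate sklearn dockertarpusher
fi
```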

Other packages required:

  • cwltool
  • graphviz

Installing a local docker registry with self-signed certificates and basic authentication

On the machine that will run the SCHeMa installation:

  1. Create a folder for the registry certificates and authentication files (e.g. /data/registry) with two additional directories, "certs" and "reg_auth".
  2. Create self-signed certificates:
openssl req \
  -newkey rsa:4096 -nodes -sha256 -keyout <registry_data_directory>/certs/domain.key \
  -x509 -days 365 -out <registry_data_directory>/certs/domain.crt
  3. Create a username and password for the registry (replace <registry_username> and <registry_password> appropriately):
sudo docker run -it --entrypoint htpasswd -v $PWD/reg_auth:/auth -w /auth registry:2 -Bbc /auth/htpasswd <registry_username> <registry_password>
  4. Start the registry with the created certificates:
docker run -d \
  --restart=always \
  --name registry \
  -v "$(pwd)"/certs:/certs \
  -v "$(pwd)"/reg_auth:/auth \
  -e REGISTRY_HTTP_ADDR=0.0.0.0:5000 \
  -e REGISTRY_HTTP_TLS_CERTIFICATE=/certs/domain.crt \
  -e REGISTRY_HTTP_TLS_KEY=/certs/domain.key \
  -e "REGISTRY_AUTH=htpasswd" \
  -e "REGISTRY_AUTH_HTPASSWD_REALM=Registry Realm" \
  -e REGISTRY_AUTH_HTPASSWD_PATH=/auth/htpasswd \
  -e REGISTRY_STORAGE_DELETE_ENABLED=true \
  -p 5000:5000 \
  registry:2
  5. Create the certificate folders for the Docker registry and copy the certificate into them:
sudo mkdir -p /etc/docker/certs.d/127.0.0.1:5000
sudo mkdir -p /etc/docker/certs.d/localhost:5000
sudo cp <registry_data_directory>/certs/domain.crt /etc/docker/certs.d/127.0.0.1:5000/ca.crt
sudo cp <registry_data_directory>/certs/domain.crt /etc/docker/certs.d/localhost:5000/ca.crt
  6. Log in to the registry:
docker login 127.0.0.1:5000 -u <registry_username> -p <registry_password>
  7. Create a Kubernetes secret named docker-secret with your Docker login, so that Kubernetes can retrieve images from your private registry:
kubectl create secret docker-registry docker-secret --docker-server=<docker-registry-ip> --docker-username=<registry_username> --docker-password=<registry_password>
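Recent Docker releases may reject a certificate that identifies the server only through its Common Name, so it can help to generate the certificate with subject alternative names for every address used to reach the registry. The following is a sketch under that assumption, not the procedure from the steps above: `san_list` and `make_registry_cert` are hypothetical helpers, while `-subj` and `-addext` are standard `openssl req` options (`-addext` needs OpenSSL 1.1.1 or newer).

```shell
#!/bin/sh
# san_list: build a subjectAltName value from host names and addresses.
# Arguments made up only of digits and dots are treated as IPv4 addresses.
san_list() {
  san=""
  for host in "$@"; do
    case "$host" in
      *[!0-9.]*) entry="DNS:$host" ;;
      *)         entry="IP:$host" ;;
    esac
    san="${san:+$san,}$entry"
  done
  echo "$san"
}

# make_registry_cert: same openssl call as in the steps above, plus a
# subject and SANs. $1 is the registry data directory; the remaining
# arguments are the host names and IPs, the first one becoming the CN.
make_registry_cert() {
  dir="$1"; shift
  openssl req -newkey rsa:4096 -nodes -sha256 \
    -keyout "$dir/certs/domain.key" \
    -x509 -days 365 -out "$dir/certs/domain.crt" \
    -subj "/CN=$1" \
    -addext "subjectAltName=$(san_list "$@")"
}

# Usage: make_registry_cert /data/registry localhost 127.0.0.1
```

Passing `-subj` also makes the command non-interactive, which is convenient when the registry setup is scripted.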

Installing SCHeMa

  1. Install the Yii2 framework (tutorial) and install the following plugins:
  2. Download the SCHeMa code from GitHub and replace the files inside the Yii project folder.

  3. Create a PostgreSQL database named "schema" for user "schema".

  4. Restore the .sql file inside the "database_schema" folder as user "postgres" to the database created in the previous step:
sudo -u postgres psql -d schema -f <path_to_database_schema>/database_schema.sql

  5. Copy the Docker registry certificates to project_root/scheduler_files/certificates:
cp <registry_data_directory>/certs/* <path_to_schema_project>/scheduler_files/certificates

  6. Using root permissions, create an empty file inside /etc/sudoers.d/ with visudo and paste the following into it after filling in the relevant information:

www-data ALL=(<user>) NOPASSWD:<path-to-kubectl>,<path-to-docker>,<path_to_schema_project>/scheduler_files/scheduler.py,<path_to_schema_project>/scheduler_files/ontology/initialClassify.py,<path_to_schema_project>/scheduler_files/imageUploader.py,<path_to_schema_project>/scheduler_files/imageRemover.py,<path_to_schema_project>/scheduler_files/inputReplacer.py,<path_to_schema_project>/scheduler_files/probe_stats.py,<path_to_schema_project>/scheduler_files/setupMpiCluster.py,<path_to_schema_project>/scheduler_files/mpiMonitorAndClean.py,<path_to_schema_project>/scheduler_files/existingImageUploader.py,<path_to_schema_project>/scheduler_files/workflowMonitorAndClean.py,<path_to_schema_project>/scheduler_files/workflowUploader.py,<path_to_cwltool>/cwltool

where <user> is a user that has permission to run <path-to-kubectl>. For example:

www-data ALL=(ubuntu) NOPASSWD: /usr/bin/kubectl, /data/www/schema/scheduler_files/scheduler.py, /data/www/schema/scheduler_files/ontology/initialClassify.py, /data/www/schema/scheduler_files/imageUploader.py, /data/www/schema/scheduler_files/imageRemover.py, /data/www/schema/scheduler_files/inputReplacer.py, /data/www/schema/scheduler_files/probe_stats.py, /data/www/schema/scheduler_files/setupMpiCluster.py, /data/www/schema/scheduler_files/mpiMonitorAndClean.py, /data/www/schema/scheduler_files/existingImageUploader.py, /data/www/schema/scheduler_files/workflowMonitorAndClean.py, /data/www/schema/scheduler_files/workflowUploader.py

This will allow www-data to run kubectl and the python scripts inside the folder as the user you have selected.

  7. Inside the project folder, edit the following files to match your database and Docker registry configuration:
  • scheduler_files/configuration.json: create it from the template at scheduler_files/configuration-template.json and fill in the appropriate details.
  • config/db.php: fill in the details for the database (for details see the Yii2 documentation).
  • config/params.php: fill in the following details according to your configuration (you can use params-template.php):
  8. Create a new namespace in Kubernetes for the Open MPI Cluster:
kubectl create namespace mpi-cluster
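Before relying on the installation, it can be worth checking the sudoers entry and the new namespace. The sketch below only prints the commands: `sudoers_checks`, the file name /etc/sudoers.d/schema and the user ubuntu are hypothetical values to replace with your own, while `visudo -c -f`, `sudo -l -U`, `sudo -n` and `kubectl get namespace` are standard options.

```shell
#!/bin/sh
# SUDOERS_FILE is a hypothetical name; use the file you created with visudo.
SUDOERS_FILE="${SUDOERS_FILE:-/etc/sudoers.d/schema}"
RUN_AS="${RUN_AS:-ubuntu}"   # the <user> from the sudoers entry

# sudoers_checks: print the commands that validate the setup.
sudoers_checks() {
  # Syntax-check the sudoers file without installing anything.
  echo "sudo visudo -c -f $SUDOERS_FILE"
  # List what www-data is allowed to run.
  echo "sudo -l -U www-data"
  # Run kubectl the way the web server would, failing instead of prompting.
  echo "sudo -u www-data sudo -n -u $RUN_AS kubectl version --client"
  # Confirm that the Open MPI namespace exists.
  echo "kubectl get namespace mpi-cluster"
}

sudoers_checks
```

Review the printed commands first; to execute them in one go, pipe the output to a shell with `sudoers_checks | sh`.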
