Declarative cluster management using constraint programming, where constraints are described using SQL.
vmware-archive/declarative-cluster-management
- Overview
- Download
- Pre-requisites for use
- Quick Start
- Documentation
- Contributing
- Information for developers
- Learn more
Modern cluster management systems like Kubernetes routinely grapple with hard combinatorial optimization problems: load balancing, placement, scheduling, and configuration. Implementing application-specific algorithms to solve these problems is notoriously hard, which makes it challenging to evolve the system over time and add new features.
DCM is a tool to overcome this challenge. It enables programmers to build schedulers and cluster managers using a high-level declarative language (SQL).
Specifically, developers represent cluster state in an SQL database, and write the constraints and policies that should apply to that state using SQL. From the SQL specification, the DCM compiler synthesizes a program that can be invoked at runtime to compute policy-compliant cluster management decisions given the latest cluster state. Under the covers, the generated program efficiently encodes the cluster state as an optimization problem that can be solved using off-the-shelf solvers, freeing developers from having to design ad-hoc heuristics.
The high-level architecture is shown in the diagram below.
The DCM project's groupId is com.vmware.dcm and its artifactId is dcm. We make DCM's artifacts available through Maven Central.
To use DCM from a Maven-based project, use the following dependency:
    <dependency>
        <groupId>com.vmware.dcm</groupId>
        <artifactId>dcm</artifactId>
        <version>0.15.0</version>
    </dependency>
To use within a Gradle-based project:
    implementation 'com.vmware.dcm:dcm:0.15.0'

We test regularly on JDK 11 and 16.
We test regularly on OSX and Ubuntu 20.04.
We currently support two solver backends.
Google OR-tools CP-SAT (version 9.1.9490). This is available by default when using the Maven dependency.
MiniZinc (version 2.3.2). This backend is currently being deprecated. If you still want to use it in your project, or if you want to run all tests in this repository, you will have to install MiniZinc out-of-band.
To do so, download MiniZinc from https://www.minizinc.org/software.html... and make sure you are able to invoke the minizinc binary from your command line.
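One way to confirm that the minizinc binary is reachable before running the full test suite is a small PATH scan. The sketch below is not part of DCM: the class name `SolverPathCheck` and the helper `isOnPath` are hypothetical, and it assumes a Unix-like system where the executable is named exactly `minizinc`.

```java
import java.io.File;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper, not part of DCM: checks whether a binary is on a PATH-style list.
class SolverPathCheck {

    // Returns true if an executable regular file named `binary` exists in any
    // directory listed in `pathValue` (directories separated by File.pathSeparator).
    static boolean isOnPath(String binary, String pathValue) {
        if (pathValue == null || pathValue.isEmpty()) {
            return false;
        }
        for (String dir : pathValue.split(File.pathSeparator)) {
            if (dir.isEmpty()) {
                continue;
            }
            Path candidate = Paths.get(dir, binary);
            if (Files.isRegularFile(candidate) && Files.isExecutable(candidate)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // Assumption: the MiniZinc executable is called "minizinc" (Unix-like systems).
        boolean found = isOnPath("minizinc", System.getenv("PATH"));
        System.out.println(found ? "minizinc found on PATH" : "minizinc not found on PATH");
    }
}
```

This only checks that a file with that name is present and executable; it does not verify the installed MiniZinc version.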
Here is a complete program that you can run to get a feel for DCM.
    import com.vmware.dcm.Model;
    import org.jooq.DSLContext;
    import org.jooq.impl.DSL;
    import org.junit.jupiter.api.Test;

    import java.util.List;

    import static org.junit.jupiter.api.Assertions.assertEquals;
    import static org.junit.jupiter.api.Assertions.assertTrue;

    public class QuickStartTest {

        @Test
        public void quickStart() {
            // Create an in-memory database and get a JOOQ connection to it
            final DSLContext conn = DSL.using("jdbc:h2:mem:");

            // A table representing some machines
            conn.execute("create table machines(id integer)");

            // A table representing tasks, that need to be assigned to machines by DCM.
            // To do so, create a variable column (prefixed by controllable__).
            conn.execute("create table tasks(task_id integer, controllable__worker_id integer, " +
                    "foreign key (controllable__worker_id) references machines(id))");

            // Add four machines
            conn.execute("insert into machines values(1)");
            conn.execute("insert into machines values(3)");
            conn.execute("insert into machines values(5)");
            conn.execute("insert into machines values(8)");

            // Add two tasks
            conn.execute("insert into tasks values(1, null)");
            conn.execute("insert into tasks values(2, null)");

            // Time to specify a constraint! Just for fun, let's assign tasks to machines such that
            // the machine IDs sum up to 6.
            final String constraint = "create constraint example_constraint as " +
                    "select * from tasks check sum(controllable__worker_id) = 6";

            // Create a DCM model using the database connection and the above constraint
            final Model model = Model.build(conn, List.of(constraint));

            // Solve and return the tasks table. The controllable__worker_id column will either be [1, 5] or [5, 1]
            final List<Integer> column = model.solve("TASKS")
                    .map(e -> e.get("CONTROLLABLE__WORKER_ID", Integer.class));
            assertEquals(2, column.size());
            assertTrue(column.contains(1));
            assertTrue(column.contains(5));
        }
    }
The Model class serves as DCM's public API. It exposes two methods: Model.build() and model.solve().
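To build intuition for what the quick start constraint asks of the solver, the sketch below enumerates every assignment of two tasks to the machine IDs 1, 3, 5 and 8 and keeps those whose IDs sum to 6. It uses only the JDK and is not how DCM works: DCM compiles the SQL into an optimization problem for a solver such as OR-tools CP-SAT rather than enumerating. The class name `BruteForceSketch` and its methods are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration, not DCM code: exhaustive search over task-to-machine assignments.
class BruteForceSketch {

    // Returns every assignment of `numTasks` tasks to IDs drawn from `machines`
    // (repetition allowed) whose IDs sum to `target`.
    static List<List<Integer>> solve(List<Integer> machines, int numTasks, int target) {
        List<List<Integer>> solutions = new ArrayList<>();
        search(machines, numTasks, target, new ArrayList<>(), solutions);
        return solutions;
    }

    private static void search(List<Integer> machines, int remaining, int target,
                               List<Integer> partial, List<List<Integer>> out) {
        if (remaining == 0) {
            int sum = 0;
            for (int id : partial) {
                sum += id;
            }
            if (sum == target) {
                out.add(new ArrayList<>(partial));
            }
            return;
        }
        for (int id : machines) {
            partial.add(id);                       // try this machine for the next task
            search(machines, remaining - 1, target, partial, out);
            partial.remove(partial.size() - 1);    // backtrack
        }
    }

    public static void main(String[] args) {
        // Same data as the quick start: machines 1, 3, 5, 8; two tasks; IDs must sum to 6.
        System.out.println(solve(List.of(1, 3, 5, 8), 2, 6));
        // prints [[1, 5], [3, 3], [5, 1]]
    }
}
```

The enumeration shows that, for the sum constraint taken on its own, [3, 3] is also a satisfying assignment alongside [1, 5] and [5, 1]. The sketch makes no claim about which of these DCM's solver returns.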
- Check out the tutorial to learn how to use DCM by building a simple VM load balancer
- Check out our research papers for the back story behind DCM
- The Model API Javadocs
We welcome all feedback and contributions! ❤️
Please use GitHub issues for user questions and bug reports.
Check out the contributing guide if you'd like to send us a pull request.
The entire build, including unit tests, can be triggered from the root folder with the following command (make sure to set up both solvers first):
$: ./gradlew build
To avoid documentation drift, code snippets in a documentation file (like the README or tutorial) are embedded directly from source files that are continuously tested. To refresh these documentation files:
$: npx embedme <file>
The Kubernetes scheduler also comes with integration tests that run against a real Kubernetes cluster. Do not point these tests at a production cluster: they repeatedly delete all running pods and deployments. To run the integration tests, make sure you have a valid KUBECONFIG environment variable that points to a Kubernetes cluster.
We recommend setting up a local multi-node cluster and a corresponding KUBECONFIG using kind. Once you've installed kind, run the following to create a test cluster:
$: kind create cluster --config k8s-scheduler/src/test/resources/kind-test-cluster-configuration.yaml --name dcm-it
The above step will create a configuration file in your home folder (~/.kube/kind-config-dcm-it). Make sure you initialize a KUBECONFIG environment variable to point to that path.
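Because the integration tests delete pods and deployments, a pre-flight check on KUBECONFIG can help catch a misconfigured environment early. The sketch below is a hypothetical guard, not part of this repository: the class `KubeconfigGuard` and the policy of matching the file name `kind-config-dcm-it` are assumptions made for illustration, and it assumes KUBECONFIG holds a single path rather than a list.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Optional;

// Hypothetical pre-flight guard, not part of DCM.
class KubeconfigGuard {

    // Returns a description of the problem, or Optional.empty() if the value looks safe to use.
    static Optional<String> problemWith(String kubeconfig, String expectedFileName) {
        if (kubeconfig == null || kubeconfig.isBlank()) {
            return Optional.of("KUBECONFIG is not set");
        }
        Path path = Paths.get(kubeconfig);
        if (!Files.isRegularFile(path)) {
            return Optional.of("KUBECONFIG does not point to an existing file: " + path);
        }
        // Assumed policy: only accept the config file created for the local kind cluster.
        if (!path.getFileName().toString().equals(expectedFileName)) {
            return Optional.of("KUBECONFIG does not look like the kind test cluster: " + path);
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        Optional<String> problem = problemWith(System.getenv("KUBECONFIG"), "kind-config-dcm-it");
        System.out.println(problem.orElse("KUBECONFIG looks like the local kind test cluster"));
    }
}
```

A file-name match is only a heuristic. It does not prove which cluster the configuration actually targets, so it complements rather than replaces checking the cluster yourself.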
You can then execute the following command to run integration-tests against the created local cluster:
$: KUBECONFIG=~/.kube/kind-config-dcm-it ./gradlew :k8s-scheduler:integrationTest
To run a specific integration test class (example: SchedulerIT from the k8s-scheduler module):
$: KUBECONFIG=~/.kube/kind-config-dcm-it ./gradlew :k8s-scheduler:integrationTest --tests SchedulerIT
To learn more about DCM, we suggest going through the following references:
Talks:
- Hydra 2021 (~75 minutes)
- OSDI 2020 (20 minutes)
Research papers:
Building Scalable and Flexible Cluster Managers Using Declarative Programming
Lalith Suresh, Joao Loff, Faria Kalim, Sangeetha Abdu Jyothi, Nina Narodytska, Leonid Ryzhyk, Sahan Gamage, Brian Oki, Pranshu Jain, Michael Gasch. To appear, 14th USENIX Symposium on Operating Systems Design and Implementation (OSDI 2020).

Automating Cluster Management with Weave
Lalith Suresh, Joao Loff, Faria Kalim, Nina Narodytska, Leonid Ryzhyk, Sahan Gamage, Brian Oki, Zeeshan Lokhandwala, Mukesh Hira, Mooly Sagiv. arXiv preprint arXiv:1909.03130 (2019).

Synthesizing Cluster Management Code for Distributed Systems
Lalith Suresh, João Loff, Nina Narodytska, Leonid Ryzhyk, Mooly Sagiv, and Brian Oki. In Proceedings of the Workshop on Hot Topics in Operating Systems (HotOS 2019). ACM, New York, NY, USA, 45-50. DOI: https://doi.org/10.1145/3317550.3321444
