forked from twosigma/Cook

Fair job scheduler on Mesos for batch workloads and Spark

Cook Scheduler

Welcome to Two Sigma's Cook Scheduler!

What is Cook?

  • Cook is a powerful batch scheduler, specifically designed to provide a great user experience when there are more jobs to run than your cluster has capacity for.
  • Cook is able to intelligently preempt jobs to ensure that no user ever needs to wait long to get quick answers, while simultaneously helping you to achieve 90%+ utilization for massive workloads.
  • Cook has been battle-hardened to automatically recover after dozens of classes of cluster failures.
  • Cook can act as a Spark scheduler, and it comes with a REST API, Java client, Python client, and CLI.

Core concepts is a good place to start to learn more.

Releases

Check the changelog for release info.

Subproject Summary

In this repository, you'll find several subprojects, each of which has its own documentation.

  • scheduler - This is the actual Mesos framework, Cook. It comes with a JSON REST API.
  • jobclient - This includes the Java and Python APIs for Cook, both of which use the REST API under the hood.
  • spark - This contains the patch to Spark to enable Cook as a backend.

Please visit the scheduler subproject first to get started.
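
Since both clients go through the JSON REST API, it can help to see roughly what a raw submission might look like. The following is a minimal sketch, not the authoritative API: the `/jobs` path and the field names (`uuid`, `command`, `cpus`, `mem`, `max_retries`, `pool`) are assumptions that should be checked against the scheduler subproject's REST API documentation, and `build_job_request` is a hypothetical helper. Only the port (12321) and the example values come from the Quickstart in this README.

```python
import json
import urllib.request
import uuid


def build_job_request(command, cpus, mem, pool, base_url="http://localhost:12321"):
    """Build (but do not send) an HTTP request that submits one job.

    ASSUMPTION: the endpoint path and JSON field names below are
    illustrative; verify them against the scheduler's REST API docs.
    """
    job = {
        "uuid": str(uuid.uuid4()),  # client-generated job id (assumed)
        "command": command,         # shell command to run
        "cpus": cpus,               # requested CPUs
        "mem": mem,                 # requested memory
        "max_retries": 1,           # attempts allowed (assumed field)
    }
    body = json.dumps({"jobs": [job], "pool": pool}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/jobs",     # assumed path
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_job_request("ls", 0.5, 32, "k8s-alpha")
# To actually submit against a running scheduler:
#   with urllib.request.urlopen(request) as response:
#       print(response.status)
```

The request is built separately from sending it so that the payload can be inspected or logged before it reaches a live scheduler.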

Quickstart

Using Google Kubernetes Engine (GKE)

The quickest way to get Cook running locally against GKE is with Vagrant.

  1. Install Vagrant
  2. Install VirtualBox
  3. Clone this repo
  4. Run GCP_PROJECT_NAME=<gcp_project_name> vagrant up --provider=virtualbox to create the dev environment
  5. Run vagrant ssh to ssh into the dev environment

In your Vagrant dev environment

  1. Run gcloud auth login to log in to Google Cloud
  2. Run bin/make-gke-test-clusters to create GKE clusters
  3. Run bin/start-datomic.sh to start Datomic (Cook database)
  4. Run lein exec -p datomic/data/seed_k8s_pools.clj $COOK_DATOMIC_URI to seed some Cook pools in the database
  5. Run bin/run-local-kubernetes.sh to start the Cook scheduler
  6. Cook should now be listening locally on port 12321
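
If you want to confirm that something is actually accepting connections on that port before submitting jobs, a small check like the one below may help. It is a minimal sketch: `cook_is_listening` is a hypothetical helper, and it only verifies that a TCP connection can be opened on port 12321, not that the process behind it is a healthy Cook scheduler.

```python
import socket


def cook_is_listening(host="localhost", port=12321, timeout=2.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        # create_connection raises OSError (refused, timed out, ...) on failure
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


# Example usage:
#   if not cook_is_listening():
#       print("Nothing is listening on port 12321 yet")
```

The same check applies to the Mesos quickstart, which uses the same port.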

To test a simple job submission:

  1. Run cs submit --pool k8s-alpha --cpu 0.5 --mem 32 --docker-image gcr.io/google-containers/alpine-with-bash:1.0 ls to submit a simple job
  2. Run cs show <job_uuid> to show the status of your job (it should eventually show Success)
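
When scripting submissions, it can be convenient to assemble the `cs submit` command programmatically. The sketch below uses only the flags shown in step 1 (`--pool`, `--cpu`, `--mem`, `--docker-image`); `build_cs_submit` is a hypothetical helper, and how `cs` handles multi-word commands is not covered here, so the command is passed as a single argument.

```python
import shlex


def build_cs_submit(pool, cpu, mem, docker_image, command):
    """Return the argument list for a `cs submit` invocation."""
    return [
        "cs", "submit",
        "--pool", pool,
        "--cpu", str(cpu),
        "--mem", str(mem),
        "--docker-image", docker_image,
        command,  # passed as one argument (assumption for multi-word commands)
    ]


args = build_cs_submit(
    pool="k8s-alpha",
    cpu=0.5,
    mem=32,
    docker_image="gcr.io/google-containers/alpine-with-bash:1.0",
    command="ls",
)
print(shlex.join(args))
# To run it from inside the dev environment:
#   import subprocess
#   subprocess.run(args, check=True)
```

With the values from step 1, the printed command line is the same one shown there.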

To run automated tests:

  1. Run lein test :all-but-benchmark to run unit tests
  2. Run cd ../integration && pytest -m 'not cli' to run integration tests
  3. Run cd ../integration && pytest -k test_basic_submit -n 0 -s to run a particular integration test

Using Mesos

The quickest way to get Mesos and Cook running locally is with Docker and minimesos.

  1. Install Docker
  2. Clone this repo
  3. Run cd scheduler to enter the scheduler subproject
  4. Run bin/build-docker-image.sh to build the Cook scheduler image
  5. Run ../travis/minimesos up to start Mesos and ZooKeeper using minimesos
  6. Run bin/run-docker.sh to start the Cook scheduler
  7. Cook should now be listening locally on port 12321

Contributing

Before we can accept your code contributions, please fill out the appropriate Contributor License Agreement in the cla folder and submit it to [email protected].

Disclaimer

Apache Mesos is a trademark of The Apache Software Foundation. The Apache Software Foundation is not affiliated, endorsed, connected, sponsored or otherwise associated in any way to Two Sigma, Cook, or this website in any manner.

© Two Sigma Open Source, LLC
