This project allows one to run Apache Mesos frameworks with Universal Resource Broker (URB) in a Kubernetes cluster.
It utilizes the urb-core project and provides the Kubernetes adapter for URB.
Please see the Universal Resource Broker core project for more architectural details.
The following steps are required to build the project:
curl -LO https://storage.googleapis.com/kubernetes-release/release/$(curl -s https://storage.googleapis.com/kubernetes-release/release/stable.txt)/bin/linux/amd64/kubectl && chmod a+x kubectl && sudo mv kubectl /usr/local/bin
curl -Lo minikube https://storage.googleapis.com/minikube/releases/latest/minikube-linux-amd64 && chmod +x minikube && sudo mv minikube /usr/local/bin
Requires VirtualBox or another supported minikube backend to be installed
minikube start
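Optionally, verify that the cluster is up before proceeding:
minikube status
kubectl get nodes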
Create the kubectl configuration to be used inside the build container:
tools/conf.sh
Requires Docker to be installed.
cd urb-core/vagrant
make
SYNCED_FOLDER=../.. vagrant up --provider=docker
Enter the build container:
vagrant ssh
curl -LO https://storage.googleapis.com/kubernetes-release/release/$(curl -s https://storage.googleapis.com/kubernetes-release/release/stable.txt)/bin/linux/amd64/kubectl && chmod a+x kubectl && sudo mv kubectl /usr/local/bin
cd /scratch/urb
make
export KUBECONFIG=/vagrant/.kube/config
kubectl proxy&
make test
make dist
Open a new shell in the root of the project and run:
eval $(minikube docker-env)
make images
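The freshly built images should now be visible to the minikube Docker daemon and can be listed with, for example:
docker images | grep urb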
The URB service can be started in the minikube Kubernetes cluster by first creating a ConfigMap with the URB configuration:
kubectl create configmap urb-config --from-file=etc/urb.conf
and then creating the URB service deployment with the following command:
kubectl create -f source/urb-master.yaml
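The URB master deployment and service can be verified with, for example (the generated pod name suffix will differ):
kubectl get pods,services | grep urb-master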
There are two options to run Mesos framework schedulers:
- As pods within a Kubernetes cluster
- As processes outside of the Kubernetes cluster
In both cases the `LD_LIBRARY_PATH` and `MESOS_NATIVE_JAVA_LIBRARY` (for Java or Scala based frameworks) environment variables have to be specified in the runtime environment of the framework. `LD_LIBRARY_PATH` has to contain the path to the URB `liburb.so` shared library, and `MESOS_NATIVE_JAVA_LIBRARY` should point to the same library file. Different frameworks may have different ways of specifying the Mesos master URI; in general, the standard Mesos URI has to be changed to the URB one: `urb://urb-master:6379`.
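For example, a framework launched from a shell might be set up roughly as follows (the /opt/urb install path is an assumption; adjust it to wherever the URB distribution is unpacked):
# Assumed location of the unpacked URB distribution
export LD_LIBRARY_PATH=/opt/urb/lib/linux-x86_64:$LD_LIBRARY_PATH
# Java/Scala frameworks load the same library through this variable
export MESOS_NATIVE_JAVA_LIBRARY=/opt/urb/lib/linux-x86_64/liburb.so
# The example frameworks below read the master URI from URB_MASTER
export URB_MASTER=urb://urb-master:6379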
The framework has to be "dockerized" and associated with a corresponding Kubernetes object such as a pod, deployment, or service.
The following runtime dependencies have to be installed in the framework Docker container: `libev`, `libuuid`, and `zlib`, as well as `liburb.so` (found in `urb-core/dist/urb-*-linux-x86_64/lib/linux-x86_64`). `LD_LIBRARY_PATH` and/or `MESOS_NATIVE_JAVA_LIBRARY` have to be set, and the URB URI specified as `urb://urb-master.default:6379` (see, for example, the Marathon service). There are two Docker images which can be used as a base for creating custom framework images: urb-bin-base.dockerfile with the URB binary dependencies listed above, and urb-python-base.dockerfile with Python dependencies added. They are used in the following examples: the C++ example framework and the Python example framework.
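As a sketch, a custom framework image based on the binary base image could look like the following (the base image tag, binary name, and paths are illustrative assumptions, not taken from this project):
# Hypothetical custom framework image based on the URB binary base image
FROM local/urb-bin-base
# Framework executable; /urb/bin mirrors the layout used by the Python base image
COPY my_framework /urb/bin/my_framework
# Adjust to wherever liburb.so is installed in the base image
ENV LD_LIBRARY_PATH=/urb/lib
CMD ["/urb/bin/my_framework"]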
The URB service is accessible from outside of the cluster at port `30379`. In the minikube based development environment the URB URI can be retrieved with: `minikube service urb-master --format "urb://{{.IP}}:{{.Port}}"`. It is crucial to have the framework runtime installed on the same paths inside and outside of the Kubernetes cluster, and to have the URB related variables (`LD_LIBRARY_PATH`, `MESOS_NATIVE_JAVA_LIBRARY`) properly set.
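For example, the URI can be captured on the host and passed to a framework process through its environment (the framework path is a placeholder):
URB_URI=$(minikube service urb-master --format "urb://{{.IP}}:{{.Port}}")
URB_MASTER=$URB_URI /path/to/framework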
In many situations it is convenient to have a common runtime located on a persistent volume accessible from both the executor runner and the framework: for example, to access global data, when a framework uses a custom executor, or when an executor requires a large runtime bundle shared by the framework scheduler and executor. Persistent volume claims to be used by the framework's executor can be configured with the `persistent_volume_claims` option in the URB configuration file etc/urb.conf, either per framework or in the `DefaultFrameworkConfig` section as a default for all frameworks. This allows framework or other data files to be placed on a shared volume within the cluster.
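For instance, a hypothetical fragment of etc/urb.conf mounting the claim shared-pvc at /opt/shared for all frameworks by default (the claim name and mount path are made up for illustration):
[DefaultFrameworkConfig]
persistent_volume_claims = shared-pvc:/opt/shared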
Alternatively, all framework and executor files can be located in corresponding self-contained Docker images. Such frameworks have to be configured in etc/urb.conf with the `executor_runner` framework configuration option pointing to a custom executor runner Docker image, which can be based on the generic URB executor runner image urb-executor-runner.dockerfile. See the custom executor runner Docker image for the Python test framework, python-executor-runner.dockerfile, for an example.
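A hypothetical etc/urb.conf fragment for such a self-contained framework might look like this (the section and image names are placeholders):
[MyFramework*FrameworkConfig]
executor_runner = local/my-executor-runner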
The Mesos and URB projects include several example frameworks, such as the C++ example_framework.cpp and Python test_framework.py frameworks, for demonstration purposes. In order to start these frameworks, the URB service has to be running (`kubectl create -f source/urb-master.yaml`).
The examples below demonstrate different options for running Mesos frameworks:
- A C++ framework that runs inside the Kubernetes cluster, with the framework executable and custom executor located on a shared persistent volume
- A C++ framework whose executable is located and runs outside of the Kubernetes cluster, with the custom executor located on a shared persistent volume
- A Python framework and custom executor located in self-contained Docker images
This C++ example framework relies on both the framework and custom executor executables being located on the persistent volume `example-pv` defined in pv.yaml, with the corresponding persistent volume claim defined in pvc.yaml. This persistent volume is configured in the C++ example framework configuration section `TestFrameworkC*FrameworkConfig` of the URB configuration file etc/urb.conf as `persistent_volume_claims = example-pvc:/opt/example`, so that it is accessible from the generic URB executor runner. The same persistent volume is used in the job definition for the C++ framework: test/example-frameworks/cpp-framework.yaml.
The `run.sh` helper script is designed to allow consecutive runs of the example framework: it first cleans up the Kubernetes cluster from the previous run, creates the persistent volume, starts the C++ framework in the minikube environment, and waits for completion. It can be run with the following command:
test/example-frameworks/run.sh
Assuming that the persistent volume has already been created by the script, the C++ example framework can also be started manually:
kubectl create -f test/example-frameworks/cpp-framework.yaml
The output from the framework can be examined with (replace the pod name with the actual name from your environment):
kubectl logs cpp-framework-ck8zz
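Since the pod name suffix is generated, the pod can also be looked up dynamically, for example:
kubectl logs $(kubectl get pods -o name | grep cpp-framework | head -1)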
This is an example of how to run the C++ example framework from outside of the Kubernetes cluster (build machine).
- Get the URB service URI:
minikube service urb-master --format "urb://{{.IP}}:{{.Port}}"
- Login to build machine:
cd urb-core/vagrant; vagrant ssh
- Create a directory for the C++ framework binary that matches the path `/opt/example/bin/example_framework.test` in the Kubernetes persistent volume created in minikube by test/example-frameworks/run.sh in the previous example:
sudo mkdir -p /opt/example/bin
- Copy the C++ framework executable:
sudo cp /scratch/urb/urb-core/dist/urb-*-linux-x86_64/share/examples/frameworks/linux-x86_64/example_framework.test /opt/example/bin
- Run the C++ framework (substitute `<URB_URI>` with the actual URI determined in the first step on the host machine):
cd /scratch/urb
LD_LIBRARY_PATH=$(pwd)/urb-core/dist/urb-*-linux-x86_64/lib/linux-x86_64:$LD_LIBRARY_PATH URB_MASTER=<URB_URI> /opt/example/bin/example_framework.test
Framework tasks will be submitted to the Kubernetes cluster in the same way as in the previous example.
With the Python example framework, both the framework scheduler and the custom executor run in self-contained Docker containers, without relying on the persistent volume. The Python framework `test_framework.py` file is added to the `/urb/bin` directory of the Docker image python-framework.dockerfile, which is based on urb-python-base.dockerfile and includes all required URB binary and Python dependencies. The Python example framework job is defined in test/example-frameworks/python-framework.yaml. Similarly, the custom executor runner Docker image python-executor-runner.dockerfile contains the custom executor `test_executor.py` file on the same path (`/urb/bin`). This image is based on the generic urb-executor-runner.dockerfile. The framework configuration section `TestFrameworkPy*FrameworkConfig` for the Python example framework in etc/urb.conf has the `executor_runner = local/python-executor-runner` configuration option, which points to the custom executor Docker image.
Run the Python framework example with the following command:
kubectl create -f test/example-frameworks/python-framework.yaml
The output from the framework can be examined with (replace the pod name with the actual name from your environment):
kubectl logs python-framework-ck8zz
Some Mesos framework schedulers, such as Marathon or Chronos, depend on ZooKeeper. Like the C++ and Python example frameworks, they can run inside or outside of a Kubernetes cluster, submitting their tasks to the Kubernetes cluster.
In this example the Marathon scheduler runs inside of the Kubernetes cluster. The external project kubernetes-zookeeper is used to install ZooKeeper in minikube:
kubectl create -f test/marathon/kubernetes-zookeeper-master/zoo-rc.yaml
kubectl create -f test/marathon/kubernetes-zookeeper-master/zoo-service.yaml
Create the Marathon Docker image. This image is based on the mesosphere/marathon image, thus the URB binary dependencies have to be installed directly with the package manager in test/marathon/marathon.dockerfile:
cd test/marathon
docker build --rm -t local/marathon -f marathon.dockerfile .
Create the Marathon Kubernetes service and deployment with test/marathon/marathon.yaml. It relies on the persistent volume claim `urb-pvc`, with the URB installation located in `/opt/urb`, created by the helper script in the first C++ example.
kubectl create -f marathon.yaml
Now the Marathon scheduler can be accessed from a web browser at the URL returned by:
minikube service marathonsvc --url
Marathon jobs will be dispatched to the minikube Kubernetes cluster.
In this section the Python Pi and wordcount examples, as well as the PySpark shell from the Spark data processing framework, will be demonstrated.
From the project root on the host machine, run the following script:
test/spark/run.sh
It creates a Spark installation on a persistent volume (test/spark/pv.yaml) accessible by both the driver and executor sides of the Spark application, creates a Docker container (test/spark/spark.dockerfile) based on the URB binary base image (`urb-bin-base`), and creates a corresponding Kubernetes job (test/spark/spark.yaml) which will be used to run the driver side of the Spark Pi application. This Spark example registers as a Mesos framework with the name `SparkExamplePi`, given as the parameter to the `--name` option of the `spark-submit` command. Correspondingly, the `Spark*FrameworkConfig` configuration section in etc/urb.conf has to be configured with `persistent_volume_claims = spark-pvc:/opt/spark-2.1.0-bin-hadoop2.7` for the generic URB executor runner `urb-executor-runner` to be able to access this persistent volume.
Upon completion of the script, determine the Spark driver pod name with `kubectl get pods | grep "^spark"`.
Run the Spark Pi example on the pod with the name from the previous command:
kubectl exec spark-7g14w -it -- /opt/spark-2.1.0-bin-hadoop2.7/bin/spark-submit --name SparkExamplePi --master mesos://urb://urb-master:6379 /opt/spark-2.1.0-bin-hadoop2.7/examples/src/main/python/pi.py
It should produce output which includes a Pi estimate similar to:
Pi is roughly 3.140806
Alternatively, the Spark wordcount example can be run without relying on a persistent volume for the Spark deployment, instead using the custom framework test/spark/spark-driver.dockerfile and executor runner test/spark/spark-exec.dockerfile Docker images containing the Spark runtime files, together with the corresponding Kubernetes job test/spark/spark-driver.yaml which will be used to run the driver side of the Spark wordcount application. This example requires an input text file (for example, the current `README.md` file) located on a separate persistent volume, test/spark/scratch-pv.yaml (created in the previous example by the test/spark/run.sh script). The `spark-submit` command will have the `--name CustomWordCount` parameter, so the `Custom*FrameworkConfig` framework configuration section of the etc/urb.conf file has to be configured for the persistent volume claim `scratch-pvc` and the custom executor runner Docker image `local/spark-exec` with:
executor_runner = local/spark-exec
persistent_volume_claims = scratch-pvc:/scratch
Create the Docker images by running the following commands on the host machine:
cd test/spark
docker build --rm -t local/spark-driver -f spark-driver.dockerfile .
docker build --rm -t local/spark-exec -f spark-exec.dockerfile .
Create the Spark driver job in the Kubernetes cluster from the root of the project:
kubectl create -f spark-driver.yaml
Determine the Spark driver pod name with `kubectl get pods | grep spark-driver`.
Run the Spark wordcount example using the previously determined Spark driver pod name:
kubectl exec spark-driver-7g14w -it -- /opt/spark-2.1.0-bin-hadoop2.7/bin/spark-submit --name CustomWordCount --master mesos://urb://urb-master:6379 /opt/spark-2.1.0-bin-hadoop2.7/examples/src/main/python/wordcount.py file:///scratch/README.md
It should produce output listing the words and their number of occurrences in the `README.md` file.
The following command starts the Python Spark shell with 4 as the total number of cores that can be used by all executors for this interactive session:
kubectl exec spark-driver-7g14w -it -- /opt/spark-2.1.0-bin-hadoop2.7/bin/pyspark --master mesos://urb://urb-master:6379 --total-executor-cores 4
The Python Spark shell registers under the `PySparkShell` framework name, thus taking its configuration from the `PySparkShellFrameworkConfig` section of the URB configuration file etc/urb.conf and being able to access the persistent volume `scratch-pvc` for data storage. Now some Spark actions can be executed, for example:
>>> rdd = sc.textFile('/scratch/README.md')
>>> rdd.collect()
The URB configuration file etc/urb.conf consists of multiple configuration settings documented inside the file. Most commonly, the framework configurations and the URB service logging levels would be modified. After modification, the URB configuration can be reloaded with:
kubectl create configmap urb-config --from-file=etc/urb.conf --dry-run -o yaml | kubectl replace -f -
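If the running URB service does not pick up the modified ConfigMap automatically, deleting the urb-master pod forces a restart with the new configuration (assuming the pod is managed by the deployment and will be recreated), for example:
kubectl delete $(kubectl get pods -o name | grep urb-master)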