diff --git a/docs/source/about/glossary.rst b/docs/source/about/glossary.rst
index 1bfd27857c..4092d308c1 100644
--- a/docs/source/about/glossary.rst
+++ b/docs/source/about/glossary.rst
@@ -1,37 +1,54 @@
Glossary
========
-**App**
-~~~~~~~~
-
-Marathon app. A unit of configuration in Marathon. During normal
-operation, one service "instance" maps to one Marathon app, but during
-deploys there may be more than one app. Apps contain Tasks.
-
**Docker**
~~~~~~~~~~
Container `technology `_ that
PaaSTA uses.
+**Kubernetes**
+~~~~~~~~~~~~~~
+
+`Kubernetes `_ (a.k.a. k8s) is the open-source system on which Yelp runs many compute workloads.
+In Kubernetes, workloads are distributed by the Kubernetes control plane to servers called nodes (a.k.a. kube nodes or Kubernetes agents), each of which runs a Kubelet agent.
+
+**Kubernetes Node**
+~~~~~~~~~~~~~~~~~~~
+
+A node is a worker machine in a Kubernetes cluster that runs Pods.
+In our case, it's usually a virtual machine provisioned via AWS EC2 Fleets or AutoScalingGroups.
+
+**Kubernetes Horizontal Pod Autoscaler (HPA)**
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+A Kubernetes feature that automatically scales the number of pods in a deployment based on observed CPU utilization (or, with custom metrics support, on other application-provided metrics).
+
**clustername**
~~~~~~~~~~~~~~~
A shortname used to describe a PaaSTA cluster. Use \`paasta
list-clusters\` to see them all.
+**Kubernetes pod**
+~~~~~~~~~~~~~~~~~~~
+
+The atomic deployment unit for PaaSTA workloads at Yelp (and for Kubernetes workloads in general): a collection of one or more related containers that share a network namespace.
+At Yelp, each pod is an individual instance of one of our services, and many pods can run on each server.
+
**instancename**
~~~~~~~~~~~~~~~~
-Logical collection of Mesos tasks that comprise a Marathon app. service
-name + instancename = Marathon app name. Examples: main, canary.
+Logical collection of Kubernetes pods that comprise a Kubernetes Deployment. service
+name + instancename = Kubernetes Deployment name. Examples: main, canary. Each instance represents a running
+version of a service with its own configuration and resources.
**namespace**
~~~~~~~~~~~~~
An haproxy/SmartStack concept grouping backends that listen on a
-particular port. A namespace may route to many healthy Marathon
-instances. By default, the namespace in which a Marathon job appears is
+particular port. A namespace may route to many healthy PaaSTA
+instances. By default, the namespace in which a Kubernetes deployment appears is
its instancename.
**Nerve**
@@ -40,32 +57,6 @@ its instancename.
A service announcement `daemon `_
that registers services in zookeeper to be discovered.
-**Marathon**
-~~~~~~~~~~~~
-
-A `Mesos Framework `_
-designed to deploy stateless services.
-
-**Mesos**
-~~~~~~~~~
-
-A `Cluster/Scheduler `_ that interacts
-with other `Framework `_
-software to run things on nodes.
-
-**Mesos Master**
-~~~~~~~~~~~~~~~~
-
-A machine running a Mesos Master process, responsible for coordination
-but not responsible for actually running Marathon or Tron jobs. There
-are several Masters, coordinating as a quorum via Zookeeper.
-
-**Mesos Slave**
-~~~~~~~~~~~~~~~
-
-A machine running a Mesos Slave process, responsible for running
-Marathon or Tron jobs as assigned by the Mesos Master.
-
**PaaSTA**
~~~~~~~~~~
@@ -87,12 +78,6 @@ The brand name for Airbnb’s Nerve + Synapse service discovery solution.
A local haproxy daemon that runs on yocalhost
-**Task**
-~~~~~~~~
-
-Marathon task. A process (usually inside a Docker container) running on
-a machine (a Mesos Slave). One or more Tasks constitutes an App.
-
**soa-configs**
~~~~~~~~~~~~~~~
@@ -107,5 +92,5 @@ services.
**Zookeeper**
~~~~~~~~~~~~~
-A distributed key/value store used by Mesos for coordination and
+A distributed key/value store used by PaaSTA for coordination and
persistence.
diff --git a/docs/source/about/paasta_principles.rst b/docs/source/about/paasta_principles.rst
index ee7fbe404c..7ad5baac39 100644
--- a/docs/source/about/paasta_principles.rst
+++ b/docs/source/about/paasta_principles.rst
@@ -54,7 +54,7 @@ a particular app in a theoretical PaaS:
+=============================================+=====================================+
| :: | :: |
| | |
-| $ cat >marathon-cluster.yaml <
+| $ cat >kubernetes-cluster.yaml <
-PaaSTA uses `Marathon `_ to deploy
+PaaSTA uses `Kubernetes `_ to deploy
long-running services. At Yelp, PaaSTA clusters are deployed at the
``superregion`` level. This means that a service could potentially be deployed
on any available host in that ``superregion`` that has resources to run it. If
-PaaSTA were unaware of the Smartstack ``discover:`` settings, Marathon would
-naively deploy tasks in a potentially "unbalanced" manner:
+PaaSTA were unaware of the Smartstack ``discover:`` settings, the Kubernetes scheduler would
+naively deploy pods in a potentially "unbalanced" manner:
.. image:: unbalanced_distribution.svg
:width: 700px
-With the naive approach, there is a total of six tasks for the superregion, but
+With the naive approach, there is a total of six pods for the superregion, but
four landed in ``region 1``, and two landed in ``region 2``. If
the ``discover`` setting were set to ``habitat``, there would be habitats
-**without** tasks available to serve anything, likely causing an outage.
+**without** pods available to serve anything, likely causing an outage.
In a world with configurable SmartStack discovery settings, the deployment
-system (Marathon) must be aware of these and deploy accordingly.
+system (Kubernetes) must be aware of these and deploy accordingly.
-What A SmartStack-Aware Deployment Looks Like
+How to Make PaaSTA Aware of SmartStack
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
-By taking advantage of
-`Marathon Constraint Language `_
-, specifically the
-`GROUP_BY `_
-operator, Marathon can deploy tasks in such a way as to ensure a balanced number
-of tasks in each latency zone.
-
-Example: Balanced deployment to every habitat
-*********************************************
-
-For example, if the SmartStack setting
-were ``discover: habitat`` [1]_, we Marathon could enforce the constraint
-``["habitat", "GROUP_BY"]``, which will ask Marathon to distribute tasks
-evenly between the habitats[2]_:
-
-.. image:: balanced_distribution.svg
- :width: 700px
-
-Example: Deployment balanced to each region
-*******************************************
-
-Similarly, if the ``discover`` setting were set to ``region``, the equivalent
-Marathon constraint would ensure an equal number of tasks distributed to each region.
-
-.. image:: balanced_distribution_region.svg
- :width: 700px
-
-Even though there some habitats in this diagram that lack the service, the
-``discover: region`` setting allows clients to utilize *any* process as long
-as it is in the local region. The Marathon constraint of ``["region", "GROUP_BY"]``
-ensures that tasks are distributed equally over the regions, in this case three
-in each.
-
-
-.. [1] Technically PaaSTA should be using the smallest value of the ``advertise``
- setting, tracked in `PAASTA-1253 `_.
-.. [2] Currently the ``instances:`` count represents the total number of
- instances in the cluster. Eventually with `PAASTA-1254 `_
- the instance count will be a per-discovery-location setting, meaning there
- will always be an equal number of instances per location. (With ``instances: 6``
- and a ``discovery: habitat``, and three habitats, the total task count would be
- 18, 6 in each habitat.)
-
+To balance pods across Availability Zones (AZs) in Kubernetes, we use `topology spread constraints `_,
+configured via the ``topology_spread_constraints`` key in soa-configs for each instance of a service.
How SmartStack Settings Influence Monitoring
--------------------------------------------
@@ -116,7 +75,7 @@ Example: Checking Each Habitat When ``discover: habitat``
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
If SmartStack is configured to ``discover: habitat``, PaaSTA configures
-Marathon to balance tasks to each habitat. But what if it is unable to do that?
+Kubernetes to balance pods to each habitat. But what if it is unable to do that?
.. image:: replication_alert_habitat.svg
:width: 700px
@@ -154,7 +113,7 @@ Example: Checking Each Region When ``discover: region``
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
If SmartStack is configured to ``discover: region``, PaaSTA configures
-Marathon to balance tasks to each region. But what if it is unable to launch
+Kubernetes to balance tasks to each region. But what if it is unable to launch
all the tasks, but there were tasks running in that region?
.. image:: replication_noalert_region.svg
@@ -189,9 +148,9 @@ components of the same service on different ports. In PaaSTA we call these
api:
proxy_port: 20002
-The corresponding Marathon configuration in PaaSTA might look like this::
+The corresponding Kubernetes configuration in PaaSTA might look like this::
- #marathon.yaml
+ #kubernetes.yaml
main:
instances: 10
cmd: myserver.py
@@ -214,7 +173,7 @@ the same Nerve namespace. Consider this example::
main:
proxy_port: 20001
- #marathon.yaml
+ #kubernetes.yaml
main:
instances: 10
cmd: myserver.py
@@ -238,7 +197,7 @@ Sharding is another use case for using alternative namespaces::
main:
proxy_port: 20001
- #marathon.yaml
+ #kubernetes.yaml
shard1:
instances: 10
registrations: ['service.main']
diff --git a/docs/source/contributing.rst b/docs/source/contributing.rst
index 3264867a0b..ad6ba7d14f 100644
--- a/docs/source/contributing.rst
+++ b/docs/source/contributing.rst
@@ -29,7 +29,7 @@ System Package Building / itests
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
PaaSTA is distributed as a debian package. This package can be built and tested
-with ``make itest_xenial``. These tests make assertions about the
+with ``make itest_``. These tests make assertions about the
packaging implementation.
@@ -71,12 +71,3 @@ it is a little tricky.
* ``eval "$(.tox/py27/bin/register-python-argcomplete ./tox/py27/bin/paasta)"``
* There is a simple integration test. See the itest/ folder.
-
-Upgrading Components
---------------------
-
-As things progress, there will come a time that you will have to upgrade
-PaaSTA components to new versions.
-
-* See `Upgrading Mesos `_ for how to upgrade Mesos safely.
-* See `Upgrading Marathon `_ for how to upgrade Marathon safely.
diff --git a/docs/source/generated/paasta_tools.rst b/docs/source/generated/paasta_tools.rst
index 7ab576c7c9..de15c7ecf5 100644
--- a/docs/source/generated/paasta_tools.rst
+++ b/docs/source/generated/paasta_tools.rst
@@ -13,7 +13,6 @@ Subpackages
paasta_tools.frameworks
paasta_tools.instance
paasta_tools.kubernetes
- paasta_tools.mesos
paasta_tools.metrics
paasta_tools.monitoring
paasta_tools.paastaapi
@@ -71,9 +70,6 @@ Submodules
paasta_tools.log_task_lifecycle_events
paasta_tools.long_running_service_tools
paasta_tools.mac_address
- paasta_tools.marathon_dashboard
- paasta_tools.mesos_maintenance
- paasta_tools.mesos_tools
paasta_tools.monitoring_tools
paasta_tools.monkrelaycluster_tools
paasta_tools.nrtsearchservice_tools
diff --git a/docs/source/installation/example_cluster.rst b/docs/source/installation/example_cluster.rst
index a612783e22..d0dd079456 100644
--- a/docs/source/installation/example_cluster.rst
+++ b/docs/source/installation/example_cluster.rst
@@ -24,11 +24,6 @@ everything with ``docker-compose down && docker-compose run playground``.
Getting Started
---------------
-Mesos
-~~~~~
-To launch a running Mesos cluster, then run ``docker-compose run playground``
-and you'll be dropped into a shell with the paasta\_tools package installed in development mode.
-
Kubernetes
~~~~~~~~~~
To instead launch a Kubernetes cluster, run
@@ -47,9 +42,7 @@ Try it out
The cluster includes a git remote and docker registry. The git remote
contains an example repo but you can add more if you want.
-The mesos and marathon webuis are exposed on your docker host
-on port 5050, 8080, 8081. So load them up if you want to watch. Then in
-the playground container:
+In the playground container:
::
@@ -63,9 +56,8 @@ the playground container:
Scaling The Cluster
-------------------
-If you want to add more capacity to the cluster, you can increase the number of Mesos agents/Kubernetes Nodes:
+If you want to add more capacity to the cluster, you can increase the number of Kubernetes Nodes:
-``docker-compose scale mesosslave=4`` or
``docker-compose scale kubernetes=4``
@@ -79,9 +71,8 @@ Some but not all of the paasta command line tools should work. Try:
paasta status -s hello-world
Scribe is not included with this example cluster. If you are looking for
-logs, check ``/var/logs/paasta_logs`` and syslog on the mesosmaster for
-the output from cron. Also note that all the slaves share the host's
-docker daemon.
+logs, check syslog on the Kubernetes node that the pod is running on for the output from cron.
+You can find out which host the pod is running on by adding ``-v`` to the command above.
Cleanup
-------
diff --git a/docs/source/installation/getting_started.rst b/docs/source/installation/getting_started.rst
index 13d562de2d..f87db3ed45 100644
--- a/docs/source/installation/getting_started.rst
+++ b/docs/source/installation/getting_started.rst
@@ -33,9 +33,7 @@ are currently not available, so one must build them and install them manually::
make itest_xenial
sudo dpkg -i dist/paasta-tools*.deb
-This package must be installed anywhere the PaaSTA CLI and on the Mesos/Marathon
-masters. If you are using SmartStack for service discovery, then the package must
-be installed on the Mesos Slaves as well so they can query the local API.
+This package must be installed anywhere the PaaSTA CLI is used, as well as on the Kubernetes nodes.
Once installed, ``paasta_tools`` reads global configuration from ``/etc/paasta/``.
This configuration is in key/value form encoded as JSON. All files in ``/etc/paasta``
@@ -76,7 +74,7 @@ Docker and a Docker Registry
PaaSTA uses `Docker `_ to build and distribute code for each service. PaaSTA
assumes that a single registry is available and that the associated components
-(Docker commands, unix users, mesos slaves, etc) have the correct credentials
+(Docker commands, unix users, etc) have the correct credentials
to use it.
The docker registry needs to be defined in a config file in ``/etc/paasta/``.
@@ -91,34 +89,24 @@ filename is irrelevant, but here would be an example
There are many registries available to use, or you can
`host your own `_.
-Mesos
------
-
-PaaSTA uses Mesos to do the heavy lifting of running the actual services on
-pools of machines. See the `official documentation `_
-on how to get started with Mesos.
-
-Marathon
---------
+Kubernetes
+----------
-PaaSTA uses `Marathon `_ for supervising long-running services running in Mesos.
-See the `official documentation `__ for how to get started with Marathon.
-Then, see the `PaaSTA documentation <../yelpsoa_configs.html#marathon-clustername-yaml>`_ for how to define Marathon
-jobs.
+PaaSTA uses `Kubernetes `_ to manage and orchestrate its containerized services.
+See the `PaaSTA documentation <../yelpsoa_configs.html#kubernetes-clustername-yaml>`_ for how to define PaaSTA
+services in Kubernetes.
-Once Marathon jobs are defined in soa-configs, there are a few tools provided by PaaSTA
-that interact with the Marathon API:
+Once PaaSTA services are defined in soa-configs, there are a few tools provided by PaaSTA
+that interact with the Kubernetes API:
-* ``deploy_marathon_services``: Does the initial sync between soa-configs and the Marathon API.
- This is the tool that handles "bouncing" to new version of code, and resizing Marathon applications when autoscaling
+* ``setup_kubernetes_job``: Does the initial sync between soa-configs and the Kubernetes API.
+ This is the tool that handles "bouncing" to a new version of code, and resizing Kubernetes deployments when autoscaling
is enabled.
- This is idempotent, and should be run periodically on a box with a ``marathon.json`` file in the
- `system paasta config <../system_configs.html>`_ directory (Usually ``/etc/paasta``).
- We recommend running this frequently - delays between runs of this command will limit how quickly new versions of
- services or changes to soa-configs are picked up.
-* ``cleanup_marathon_jobs``: Cleans up lost or abandoned services. This tool
- looks for Marathon jobs that are *not* defined in soa-configs and removes them.
-* ``check_marathon_services_replication``: Iterates over all Marathon services
+ This is idempotent, and is run periodically on a box with a ``deployments.json`` file in the
+ ``/nail/etc/services`` directory, updating or creating the Kubernetes Deployment object representing the modified service instance.
+* ``cleanup_kubernetes_jobs``: Cleans up lost or abandoned services. This tool
+ looks for Kubernetes instances that are *not* defined in soa-configs and removes them.
+* ``check_kubernetes_services_replication``: Iterates over all Kubernetes services
and inspects their health. This tool integrates with the monitoring infrastructure
and will alert the team responsible for the service if it becomes unhealthy to
the point where manual intervention is required.
@@ -128,7 +116,7 @@ SmartStack and Hacheck
`SmartStack `_ is
a dynamic service discovery system that allows clients to find and route to
-healthy mesos tasks for a particular service.
+healthy Kubernetes pods for a particular service.
Smartstack consists of two agents: `nerve `_ and `synapse `_.
Nerve is responsible for health-checking services and registering them in ZooKeeper.
Synapse then reads that data from ZooKeeper and configures an HAProxy instance.
@@ -137,7 +125,7 @@ To manage the configuration of nerve (detecting which services are running on a
we have a package called `nerve-tools `_.
This repo builds a .deb package, and should be installed on all slaves.
Each slave should run ``configure_nerve`` periodically.
+We recommend this runs quite frequently (we run it every 5s), since Kubernetes pods created by PaaSTA are not available
+We recommend this runs quite frequently (we run it every 5s), since kubernetes pods created by Paasta are not available
to clients until nerve is reconfigured.
Similarly, to manage the configuration of synapse, we have a package called `synapse-tools `_.
diff --git a/docs/source/isolation.rst b/docs/source/isolation.rst
index f118361f19..4eefc8b737 100644
--- a/docs/source/isolation.rst
+++ b/docs/source/isolation.rst
@@ -1,27 +1,25 @@
==============================================
-Resource Isolation in PaaSTA, Mesos and Docker
+Resource Isolation in PaaSTA, Kubernetes and Docker
==============================================
PaaSTA instance definitions include fields that specify the required resources
-for your service. The reason for this is two-fold: firstly, so that whichever
-Mesos framework can evaluate which Mesos agent making
-offers have enough capacity to run the task (and pick one of the agents
-accordingly); secondly, so that tasks can be protected from especially noisy
-neighbours on a box. That is, if a task under-specifies the resources it
+for your service. The reason for this is two-fold: firstly, so that the Kubernetes scheduler
+can evaluate which Kubernetes nodes in the specified cluster have enough capacity to schedule the pods (each representing a PaaSTA instance);
+secondly, so that the pods can be protected from especially noisy
+neighbours on a box. That is, if a pod under-specifies the resources it
requires to run, or in another case, has a bug that means that it consumes far
-more resources than it *should* require, then the offending tasks can be
+more resources than it *should* require, then the offending pods can be
isolated effectively, preventing them from having a negative impact on its
neighbours.
-This document is designed to give a more detailed review of how Mesos
-Frameworks such as Marathon use these requirements to run tasks on
-different Mesos agents, and how these isolation mechanisms are implemented.
+This document is designed to give a more detailed review of how Kubernetes
+uses these requirements to run pods on different Kubernetes nodes, and how these isolation mechanisms are implemented.
Note: Knowing the details of these systems isn't a requirement of using PaaSTA;
most service authors may never need to know the details of such things. In
fact, one of PaaSTA's primary design goals is to *hide* the minutiae of
schedulers and resource isolation. However, this may benefit administrators
-of PaaSTA (and, more generally, Mesos clusters), and the simply curious.
+of PaaSTA (and, more generally, Kubernetes clusters), and the simply curious.
Final note: The details herein may, nay, will contain (unintended) inaccuracies.
If you notice such a thing, we'd be super grateful if you could open a pull
@@ -31,64 +29,55 @@ How Tasks are Scheduled on Hosts
--------------------------------
To first understand how these resources are used, one must understand how
-a task is run on a Mesos cluster.
-
-Mesos can run in two modes: Master and Agent. When a node is running Mesos in
-Master mode, it is responsible for communicating between agent processes and
-frameworks. A Framework is a program which wants to run tasks on the Mesos
-cluster.
-
-A master is responsible for presenting frameworks with resource offers.
-Resource offers are compute resource free for a framework to run a task. The
-details of that compute resource comes from the agent nodes, which regularly
-tell the Master agent the resources it has available for running tasks. Using
-the correct parlance, Mesos agents make 'offers' to the master.
-
-Once a master node receives offers from an agent, it forwards it to
-a framework. Resource offers are split between frameworks according to
-the master's configuration - there may be particular priority given
-to some frameworks.
-
-At Yelp, we treat the frameworks we run (at the time of writing, Marathon and
-Tron) equally. That means that frameworks *should* have offers distributed
-between them evenly, and all tasks are considered equal.
-
-It is then up to the framework to decide what it wants to do with an offer.
-The framework may decide to:
-
- * Reject the offer, if the framework has no tasks to run.
- * Reject the offer, if the resources included in the offer are not enough to
- match those required by the application.
- * Reject the offer, if attributes on the slave conflict with any constraints
- set by the task.
- * Accept the offer, if there is a task that requires resources less than or
- equal to the resources offered by the Agent.
-
-When rejecting an offer, the framework may apply a 'filter' to the offer. This
-filter is then used by the Mesos master to ensure that it does *not* resend
-offers that are 'filtered' by a framework. The default filter applied includes
-a timeout - a Master will not resend an offer to a framework for a period of 5
-seconds.
-
-If a framework decides it wants to accept a resource offer, it then tells the
-master to run a task on the agent. The details of the 'acceptance' include a
-detail of the task to be run, and the 'executor' used to run the task.
-
-By default, PaaSTA uses the 'Docker' executor everywhere. This means that *all*
-tasks launched by Marathon and Tron are done so with a Docker container.
-
-How Tasks are isolated from each other.
+a pod is run on a Kubernetes cluster.
+
+Kubernetes has two types of nodes: master and worker nodes. The master node
+is responsible for managing the cluster. This includes scheduling applications (i.e. PaaSTA services), maintaining
+applications' desired states, scaling applications, and rolling out new updates.
+
+The master node contains the following components:
+
+ * API Server: Exposes the Kubernetes API. It is the front-end for the Kubernetes control plane.
+ * Scheduler: Responsible for distributing workloads across multiple nodes.
+ * Controller Manager: Responsible for regulating the state of the cluster.
+ * etcd: Consistent and highly-available key value store used as Kubernetes' backing store for all cluster data.
+
+Worker nodes are the machines that run the workload. Each worker node runs the following components
+to manage the execution and networking of containers:
+
+ * Kubelet: An agent that runs on each node in the cluster. It makes sure that containers are running in a pod.
+ * Kube-proxy: Maintains network rules on nodes. These network rules allow network communication to pods from network sessions inside or outside of the cluster.
+ * Container runtime: The software that is responsible for running containers. Kubernetes supports several container runtimes: Docker, containerd, CRI-O, and any implementation of the Kubernetes CRI (Container Runtime Interface).
+
+
+When a new pod (representing a PaaSTA instance) is created, the Kubernetes scheduler (kube-scheduler) will assign it to the best node for it to run on.
+The scheduler will take into account the resources required by the pod, the resources available on the nodes, and any constraints that are specified. It takes the following
+criteria into account when selecting a node to run the pod on:
+
+ * Resource requirements: Checks if nodes have enough CPU, memory, and other resources requested by the pod.
+ * Node affinity: Checks whether the pod's affinity rules prefer (or require) nodes with specific labels.
+ * Inter-pod affinity/anti-affinity: Checks if the pod should be scheduled near or far from another pod.
+ * Taints and tolerations: Checks if the pod should be scheduled on a node that has a specific taint.
+ * Node selectors: Checks if the pod should be scheduled on a node that has a specific label.
+ * Custom Policies: Any custom scheduling policies or priorities, such as "deploy_whitelist", "deploy_blacklist", and "discovery" (used by SmartStack).
+
+The scheduler then scores each node that can host the pod, based on the criteria above and any custom policies, and selects the node
+with the highest score to run the pod on. If multiple nodes share the highest score, one of them is chosen randomly. Once a node is selected, the scheduler assigns
+the pod to the node and the decision is communicated back to the API server, which in turn notifies the kubelet on the chosen node to start the pod.
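+
+For intuition, here is a minimal, hand-written pod spec showing the kinds of fields the scheduler
+inspects. PaaSTA generates the real pod specs from soa-configs; the pod name, label keys, pool label,
+and image below are purely illustrative assumptions::
+
+    apiVersion: v1
+    kind: Pod
+    metadata:
+      name: example--main               # illustrative; PaaSTA generates real pod names
+      labels:
+        paasta.yelp.com/service: example
+    spec:
+      nodeSelector:
+        yelp.com/pool: default          # assumed node label used for pool placement
+      containers:
+        - name: main
+          image: docker-paasta.yelpcorp.com:443/services-example:latest
+          resources:
+            requests:
+              cpu: "1"                  # roughly the cpus setting from soa-configs
+              memory: 4096Mi            # roughly the mem setting from soa-configs
+            limits:
+              memory: 4096Mi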
+
+How PaaSTA services are isolated from each other
---------------------------------------
-Given that a slave may run multiple tasks, we need to ensure that tasks cannot
+Given that a node may run multiple pods for PaaSTA services, we need to ensure that pods cannot
'interfere' with one another. We do this on a file system level using Docker -
processes launched in Docker containers are protected from each other and the
host by using kernel namespaces. Note that the use of kernel namespaces is a
-feature of Docker - PaaSTA doesn't do anything 'extra' to enable this. It's
-also worth noting that there are other 'container' technologies that could
-provide this - the native Mesos 'containerizer' included.
+feature of Docker - PaaSTA doesn't do anything 'extra' to enable this. In addition,
+Kubernetes namespaces provide further isolation of resources within the cluster. Each PaaSTA service
+is assigned to a namespace, and resources within a namespace are isolated from those in other namespaces. This helps in managing
+resources for different PaaSTA services.
-However, these tasks are still running and consuming resources on the same
+However, these pods are still running and consuming resources on the same
host. The next section aims to explain how PaaSTA services are protected from
so-called 'noisy neighbours' that can starve others from resources.
@@ -130,21 +119,21 @@ If the processes in the cgroup reaches the ``memsw.limit_in_bytes`` value ,
then the kernel will invoke the OOM killer, which in turn will kill off one of
the processes in the cgroup (often, but not always, this is the biggest
contributor to the memory usage). If this is the only process running in the
-Docker container, then the container will die. The mesos framework which
-launched the task may or may not decide to try and start the same task
-elsewhere.
+Docker container, then the container will die. Kubernetes will attempt to reschedule the pod
+to maintain the desired number of replicas specified in the Deployment. For each PaaSTA instance, a Deployment is created
+which manages the state of the pods for that instance, ensuring that the number of replicas specified in soa-configs is running at any given time.
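+
+As a loose sketch (not the exact object PaaSTA generates), the relationship between an instance and
+its replicas looks roughly like this in the Deployment; the names and counts here are made up::
+
+    apiVersion: apps/v1
+    kind: Deployment
+    metadata:
+      name: example--main              # hypothetical name for service "example", instance "main"
+    spec:
+      replicas: 3                      # corresponds to the instances count in soa-configs
+      selector:
+        matchLabels:
+          paasta.yelp.com/instance: main
+      template:
+        metadata:
+          labels:
+            paasta.yelp.com/instance: main
+        spec:
+          containers:
+            - name: main
+              image: docker-paasta.yelpcorp.com:443/services-example:latest
+              resources:
+                limits:
+                  memory: 4096Mi       # the cgroup memory limit discussed above
+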
CPUs
""""
CPU enforcement is implemented slightly differently. Many people expect the
value defined in the ``cpus`` field in a service's soa-configs to map to a
-number of cores that are reserved for a task. However, isolating CPU time like
+number of cores that are reserved for a pod. However, isolating CPU time like
this can be particularly wasteful; unless a task spends 100% of its time on
-CPU (and thus has *no* I/O), then there is no need to prevent other tasks from
+CPU (and thus has *no* I/O), then there is no need to prevent other pods from
running on the spare CPU time available.
-Instead, the CPU value is used to give tasks a relative priority. This priority
+Instead, the CPU value is used to give pods a relative priority. This priority
is used by the Linux Scheduler decide the order in which to run waiting
threads.
@@ -170,17 +159,9 @@ Some notes on this:
against the share available for another. The result of this may be that
a higher number of 'skinny' containers may be preferable to 'fat' containers.
-This is different from how Mesos and Marathon use the CPU value when evaluating
-whether a task 'fits' on a host. Yelp configures agents to advertise the number
-of cores on the box, and Marathon will only schedule containers on agents where
-there is enough 'room' on the host, when in reality, there is no such limit.
-
Disk
"""""
-Unfortunately, the isolator provided by Mesos does not support isolating disk
-space used by Docker containers; that is, we have no way of limiting the amount
-of disk space used by a task. Our best effort is to ensure that the disk space
-is part of the offer given by a given Mesos agent to frameworks, and ensure
-that any services we know to use high disk usage (such as search indexes) have
-the ``disk`` field set appropriately in their configuration.
+Kubernetes supports disk resource isolation through the use of Persistent Volumes (PVs), Persistent Volume Claims (PVCs), and
+storage quotas. Pods for PaaSTA instances can claim a portion of storage through PVCs. This doesn't directly limit the disk space that a pod can use, but
+it allows for the allocation of storage resources to pods. Disk resources are also isolated through the use of namespaces - each namespace has a set quota for the total amount of storage that can be requested by the PaaSTA service running in it.
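+
+As a hedged example of the namespace-level quota mechanism (the namespace name and the numbers are
+assumptions, not the values PaaSTA actually applies)::
+
+    apiVersion: v1
+    kind: ResourceQuota
+    metadata:
+      name: storage-quota
+      namespace: paasta-example        # assumed per-service namespace
+    spec:
+      hard:
+        requests.storage: 100Gi        # total storage all PVCs in the namespace may request
+        persistentvolumeclaims: "10"   # maximum number of PVCs in the namespace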
diff --git a/docs/source/soa_configs.rst b/docs/source/soa_configs.rst
index 83054be5a4..40d4da88bf 100644
--- a/docs/source/soa_configs.rst
+++ b/docs/source/soa_configs.rst
@@ -22,8 +22,6 @@ directory. There is one folder per service. Here is an example tree::
├── api
│ ├── adhoc-prod.yaml
│ ├── deploy.yaml
- │ ├── marathon-dev.yaml
- │ ├── marathon-prod.yaml
│ ├── monitoring.yaml
│ ├── service.yaml
│ ├── smartstack.yaml
diff --git a/docs/source/style_guide.rst b/docs/source/style_guide.rst
index 0c0eb5941e..b4b6fc7015 100644
--- a/docs/source/style_guide.rst
+++ b/docs/source/style_guide.rst
@@ -47,9 +47,9 @@ Bad:
* Anything going to scribe should ALSO go to stdout.
Good:
- * setup_marathon_job => general output to stdout, app-specific output to scribe
+ * setup_kubernetes_job => general output to stdout, app-specific output to scribe
Bad:
- * setup_marathon_job | stdint2scribe (no selective filtering, raw stdout dump)
+ * setup_kubernetes_job | stdint2scribe (no selective filtering, raw stdout dump)
Good:
* paasta itest => Sends summary of pass or fail to scribe event log. Sends full output of the run to the scribe debug log
diff --git a/docs/source/system_configs.rst b/docs/source/system_configs.rst
index 9edfdf4e43..cbeeca57fb 100644
--- a/docs/source/system_configs.rst
+++ b/docs/source/system_configs.rst
@@ -2,7 +2,7 @@ System Paasta Configs
=====================
The "System Paasta Configs" inform Paasta about your environment and cluster setup, such as how to connect to
-Marathon/hacheck/etc, what the cluster name is, etc.
+Kubernetes/hacheck/etc, what the cluster name is, etc.
Structure
@@ -26,9 +26,6 @@ Configuration options
These are the keys that may exist in system configs:
- * ``zookeeper``: A zookeeper connection url, used for discovering where the Mesos leader is, and some locks.
- Example: ``"zookeeper": "zk://zookeeper1:2181,zookeeper2:2181,zookeeper3:2181/mesos"``.
-
* ``docker_registry``: The name of the docker registry where paasta images will be stored. This can optionally
be set on a per-service level as well, see `yelpsoa_configs `_
Example: ``"docker_registry": "docker-paasta.yelpcorp.com:443"``
@@ -44,9 +41,8 @@ These are the keys that may exist in system configs:
Example::
"dashboard_links": {
- "uswest1-prod": {
- "Mesos": "http://mesos.paasta-uswest1-prod.yelpcorp.com",
- "Cluster charts": "http://kibana.yelpcorp.com/something",
+ "norcal-devc": {
+ "Tron": "http://y/tron-norcal-devc",
}
}
@@ -97,15 +93,6 @@ These are the keys that may exist in system configs:
Example: ``"sensu_port": 3031``
- * ``dockercfg_location``: A URI of a .dockercfg file, to allow mesos slaves
- to authenticate with the docker registry.
- Defaults to ``file:///root/.dockercfg``.
- While this must be set, this file can contain an empty JSON dictionary (``{}``) if your docker registry does not
- require authentication.
- May use any URL scheme supported by Mesos's `fetcher module. `_
-
- Example: ``"dockercfg_location": "http://somehost/somepath"``
-
* ``synapse_port``: The port that haproxy-synapse exposes its status on.
Defaults to ``3212``.
@@ -113,7 +100,6 @@ These are the keys that may exist in system configs:
* ``synapse_host``: The default host that paasta should interrogate for haproxy-synapse state.
Defaults to ``localhost``.
- Primarily used in `check_marathon_services_replication `_.
Example: ``"synapse_host": 169.254.255.254``
diff --git a/docs/source/workflow.rst b/docs/source/workflow.rst
index 5aae3605d1..001f0ae735 100644
--- a/docs/source/workflow.rst
+++ b/docs/source/workflow.rst
@@ -7,9 +7,9 @@ Ways That PaaSTA Can Run Services
Long Running Services
^^^^^^^^^^^^^^^^^^^^^
-Long running services are are processes that are expected to run continuously
+Long running services are processes that are expected to run continuously
and usually have the same process id throughout. PaaSTA uses
-`Marathon `_ to configure how these
+`Kubernetes `_ to configure how these
services should run.
These services often serve network traffic, usually HTTP. PaaSTA integrates with
@@ -61,68 +61,13 @@ Deployment
A yelpsoa-configs master runs `generate_deployments_for_service `_
frequently. The generated ``deployments.json`` appears in ``/nail/etc/services/service_name`` throughout the cluster.
-Marathon masters run `deploy_marathon_services `_,
-a thin wrapper around ``setup_marathon_job``.
-These scripts parse ``deployments.json`` and the current cluster state,
-then issue commands to Marathon to put the cluster into the right state
--- cluster X should be running version Y of service Z.
-
How PaaSTA Runs Docker Containers
---------------------------------
-Marathon launches the Docker containers that comprise a PaaSTA service.
-
-Docker images are run by Mesos's native Docker executor. PaaSTA composes the
-configuration for the running image:
-
-* ``--attach``: stdout and stderr from running images are sent to logs that end
- up in the Mesos sandbox (currently unavailable).
-
-* ``--cpu-shares``: This is the value set in ``marathon.yaml`` as "cpus".
-
-* ``--memory``: This is the value set in ``marathon.yaml`` as "mem".
-
-* ``--memory-swap``: Total memory limit (memory + swap). We set this to the same value
- as "mem", rounded up to the nearest MB, to prevent containers being able to swap.
-
-* ``--net``: PaaSTA uses bridge mode to enable random port allocation.
-
-* ``--env``: Any environment variables specified in the ``env`` section will be here. Additional
- ``PAASTA_``, ``MARATHON_``, and ``MESOS_`` environment variables will also be injected, see the
- `related docs `_ for more information.
-
-* ``--publish``: Mesos picks a random port on the host that maps to and exposes
- port 8888 inside the container. This random port is announced to Smartstack
- so that it can be used for load balancing.
+Kubernetes launches the Docker containers that comprise a PaaSTA service. Once a pod is scheduled to start, the kubelet on the node running the pod interacts with the container runtime
+through the Container Runtime Interface (CRI) to start the containers defined in the pod specification.
-* ``--privileged``: Containers run by PaaSTA are not privileged.
-
-* ``--restart``: No restart policy is set on PaaSTA containers. Restarting
- tasks is left as a job for the Framework (Marathon).
-
-* ``--rm``: Mesos containers are rm'd after they finish.
-
-* ``--tty``: Mesos containers are *not* given a tty.
-
-* ``--volume``: Volume mapping is controlled via the paasta_tools
- configuration. PaaSTA uses the volumes declared in ``/etc/paasta/volumes.json``
- as well as per-service volumes declared in ``extra_volumes`` declared
- in the `soa-configs `_.
-
-* ``--workdir``: Mesos containers are launched in a temporary "workspace"
- directory on disk. Use the workdir sparingly and try not to output files.
-
-Mesos is the actual system that runs the docker images. In Mesos land these are
-called "TASKS". PaaSTA-configured tasks use exponential backoff to prevent
-unhealthy tasks from continuously filling up disks and logs -- the more times
-that your service has failed to start, the longer Mesos will wait before
-trying to start it again.
-
-Mesos *will* healthcheck the task based on the same healthcheck that SmartStack
-uses, in order to prune unhealthy tasks. This pruning is less aggressive than
-SmartStack's checking, so a dead task will go DOWN in SmartStack before it is
-reaped by Marathon. By default the healthchecks occur every 10 seconds, and a service
-must fail 30 times before that task is pruned and a new one is launched in its place.
-This means a task had 5 minutes by default to properly respond to its healthchecks.
+Note: Kubernetes supports multiple container runtimes, including Docker (via "dockershim", which is deprecated and removed as of Kubernetes v1.24), containerd, and CRI-O.
+At Yelp, we use Docker and are currently in the process of migrating to containerd.
Time Zones In Docker Containers
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
@@ -180,15 +125,10 @@ Monitoring
PaaSTA gives you a few `Sensu `_-powered
monitoring checks for free:
-* `setup_marathon_job `_:
- Alerts when a Marathon service cannot be deployed or bounced for some reason.
- It will resolve when a service has been successfully deployed/bounced.
-
-* `check_marathon_services_replication `_:
+* **check_kubernetes_services_replication**:
runs periodically and sends an alert if fewer than 50% of the requested
- instances are deployed on a cluster. If the service is registered in Smartstack
- it will look in Smartstack to count the available instances. Otherwise it
- counts the number of healthy tasks in Mesos.
+ instances are deployed on a cluster. It will look in Smartstack to count the available instances
+ against the expected number of instances that should have been deployed via Kubernetes.
The PaaSTA command line
diff --git a/docs/source/yelpsoa_configs.rst b/docs/source/yelpsoa_configs.rst
index ebaf400d6b..b3acaef6ed 100644
--- a/docs/source/yelpsoa_configs.rst
+++ b/docs/source/yelpsoa_configs.rst
@@ -41,21 +41,15 @@ Example::
All configuration files that define something to launch on a PaaSTA Cluster can
specify the following options:
- * ``cpus``: Number of CPUs an instance needs. Defaults to 1. CPUs in Mesos
- are "shares" and represent a minimal amount of a CPU to share with a task
- relative to the other tasks on a host. A task can burst to use any
- available free CPU, but is guaranteed to get the CPU shares specified. For
- a more detailed read on how this works in practice, see the docs on `isolation `_.
+ * ``cpus``: Number of CPUs an instance needs. Defaults to 1. The CPU resource in Kubernetes is measured in *CPU* units. One CPU in Kubernetes is equivalent to 1 AWS vCPU, 1 GCP core, 1 Azure vCore, or 1 hyperthread on a bare-metal Intel processor with Hyperthreading.
* ``cpu_burst_add``: Maximum number of additional CPUs an instance may use while bursting; if unspecified, PaaSTA defaults to 1 for long-running services, and 0 for scheduled jobs (Tron).
For example, if a service specifies that it needs 2 CPUs normally and 1 for burst, the service may go up to 3 CPUs, if needed.
- * ``mem``: Memory (in MB) an instance needs. Defaults to 4096 (4GB). In Mesos
- memory is constrained to the specified limit, and tasks will reach
+ * ``mem``: Memory (in MB) an instance needs. Defaults to 4096 (4GB). In Kubernetes
+ memory is constrained to the specified limit, and containers will reach
out-of-memory (OOM) conditions if they attempt to exceed these limits, and
- then be killed. There is currently not way to detect if this condition is
- met, other than a ``TASK_FAILED`` message. For more a more detailed read on
- how this works, see the docs on `isolation `_
+ then be killed.
* ``disk``: Disk (in MB) an instance needs. Defaults to 1024 (1GB). Disk limits
may or may not be enforced, but services should set their ``disk`` setting
@@ -79,6 +73,8 @@ specify the following options:
* ``PAASTA_RESOURCE_MEM``: Amount of ram in MB allocated to a container
* ``PAASTA_RESOURCE_DISK``: Amount of disk space in MB allocated to a container
* ``PAASTA_RESOURCE_GPUS``: Number of GPUS (if requested) allocated to a container
+ * ``PAASTA_IMAGE_VERSION``: The version of the Docker image
+ * ``PAASTA_INSTANCE_TYPE``: The instance type of the service (e.g. tron, kubernetes, eks, etc.)
* ``extra_volumes``: An array of dictionaries specifying extra bind-mounts
@@ -112,7 +108,7 @@ Placement Options
-----------------
Placement options provide control over how PaaSTA schedules a task, whether it
-is scheduled by Marathon (on Mesos), Kubernetes, Tron, or ``paasta remote-run``.
+is scheduled by Kubernetes, Tron, or ``paasta remote-run``.
Most commonly, it is used to restrict tasks to specific locations.
.. _general-placement-options:
@@ -120,7 +116,7 @@ Most commonly, it is used to restrict tasks to specific locations.
General
^^^^^^^
-These options are applicable to tasks scheduled through Mesos or Kubernetes.
+These options are applicable to tasks scheduled through Kubernetes.
* ``deploy_blacklist``: A list of lists indicating a set of locations to *not* deploy to. For example:
@@ -264,31 +260,6 @@ documentation on `node affinities
pod_management_policy: Parallel
-.. _mesos-placement-options:
-
-Mesos
-^^^^^
-
-These options are applicable only to tasks scheduled on Mesos.
-
- * ``constraints``: Overrides the default placement constraints for services.
- Should be defined as an array of arrays (E.g ``[["habitat", "GROUP_BY"]]``
- or ``[["habitat", "GROUP_BY"], ["hostname", "UNIQUE"]]``). Defaults to
- ``[[", "GROUP_BY"], ["pool", "LIKE", ],
- [, "UNLIKE", ], ...]``
- where ```` is defined by the ``discover`` attribute
- in ``smartstack.yaml``, ```` is defined by the ``pool`` attribute in
- ``marathon.yaml``, and ``deploy_blacklist_type`` and
- ``deploy_blacklist_value`` are defined in the ``deploy_blacklist`` attribute
- in marathon.yaml. For more details and other constraint types, see the
- official `Marathon constraint documentation
- `_.
-
- * ``extra_constraints``: Adds to the default placement constraints for
- services. This acts the same as ``constraints``, but adds to the default
- constraints instead of replacing them. See ``constraints`` for details on
- format and the default constraints.
-
``kubernetes-[clustername].yaml``
-------------------------------
@@ -522,225 +493,6 @@ a container is unhealthy, and the action to take is to completely destroy it and
launch it elsewhere. This is a more expensive operation than taking a container
out of the load balancer, so it justifies having less sensitive thresholds.
-``marathon-[clustername].yaml``
--------------------------------
-
-e.g. ``marathon-pnw-prod.yaml``, ``marathon-mesosstage.yaml``. The
-clustername is usually the same as the ``superregion`` in which the cluster
-lives (``pnw-prod``), but not always (``mesosstage``). It MUST be all
-lowercase. (non alphanumeric lowercase characters are ignored)
-
-**Note:** All values in this file except the following will cause PaaSTA to
-`bounce `_ the service:
-
-* ``min_instances``
-* ``instances``
-* ``max_instances``
-* ``backoff_seconds``
-
-Top level keys are instance names, e.g. ``main`` and ``canary``. Each
-instance MAY have:
-
- * Anything in the `Common Settings`_.
-
- * Anything from :ref:`General Placement Options `
- and :ref:`Mesos Placement Options `.
-
- * ``cap_add``: List of capabilities that are passed to Docker. Defaults
- to empty list. Example::
-
- "cap_add": ["IPC_LOCK", "SYS_PTRACE"]
-
- * ``instances``: Marathon will attempt to run this many instances of the Service
-
- * ``min_instances``: When autoscaling, the minimum number of instances that
- marathon will create for a service. Defaults to 1.
-
- * ``max_instances``: When autoscaling, the maximum number of instances that
- marathon will create for a service
-
- * ``registrations``: A list of SmartStack registrations (service.namespace)
- where instances of this PaaSTA service ought register in. In SmartStack,
- each service has difference pools of backend servers that are listening on
- a particular port. In PaaSTA we call these "Registrations". By default, the
- Registration assigned to a particular instance in PaaSTA has the *same name*,
- so a service ``foo`` with a ``main`` instance will correspond to the
- ``foo.main`` Registration. This would correspond to the SmartStack
- namespace defined in the Registration service's ``smartstack.yaml``. This
- ``registrations`` option allows users to make PaaSTA instances appear
- under an *alternative* namespace (or even service). For example
- ``canary`` instances can have ``registrations: ['foo.main']`` to route
- their traffic to the same pool as the other ``main`` instances.
-
- * ``backoff_factor``: PaaSTA will automatically calculate the duration of an
- application's backoff period in case of a failed launch based on the number
- of instances. For each consecutive failure that duration is multiplied by
- ``backoff_factor`` and added to the previous value until it reaches
- ``max_launch_delay_seconds``. See `Marathon's API docs `_
- for more information. Defaults to 2.
-
- * ``max_launch_delay_seconds``: The maximum time marathon will wait between attempts
- to launch an app that previously failed to launch. See `Marathon's API docs
- `_ for more information. Defaults to 300 seconds.
-
- .. _net:
-
- * ``net``: Specify which kind of
- `networking mode `_
- instances of this service should be launched using. Defaults to ``'bridge'``.
-
- * ``container_port``: Specify the port to expose when in ``bridge`` mode.
- Defaults to ``8888``.
-
- * ``bounce_method``: Controls the bounce method; see `bounce_lib `_
-
- * ``bounce_health_params``: A dictionary of parameters for get_happy_tasks.
-
- * ``check_haproxy``: Boolean indicating if PaaSTA should check the local
- haproxy to make sure this task has been registered and discovered
- (Defaults to ``True`` if service is in SmartStack)
-
- * ``min_task_uptime``: Minimum number of seconds that a task must be
- running before we consider it healthy (Disabled by default)
-
- * ``haproxy_min_fraction_up``: if ``check_haproxy`` is True, we check haproxy on up to 20 boxes to see whether a task is available.
- This fraction of boxes must agree that the task is up for the bounce to treat a task as healthy.
- Defaults to 1.0 -- haproxy on all queried boxes must agree that the task is up.
-
- * ``bounce_margin_factor``: proportionally increase the number of old instances
- to be drained when the crossover bounce method is used.
- 0 < bounce_margin_factor <= 1. Defaults to 1 (no influence).
- This allows bounces to proceed in the face of a percentage of failures.
- It doesn’t affect any other bounce method but crossover.
- See `the bounce docs `_ for a more detailed description.
-
- * ``bounce_start_deadline``: a floating point number of seconds to add to the deadline when deployd notices a change
- to soa-configs or the marked-for-deployment version of an instance.
- Defaults to 0. (deadline = now)
- When deployd has a queue of instances to process, it will choose to process instances with a lower deadline first.
- Set this to a large positive number to allow deployd to process other instances before this one, even if their
- soa-configs change or mark-for-deployment happened after this one.
- This setting only affects the first time deployd processes an instance after a change --
- instances that need to be reprocessed will be reenqueued normally.
-
- * ``drain_method``: Controls the drain method; see `drain_lib
- `_. Defaults to ``noop`` for
- instances that are not in Smartstack, or ``hacheck`` if they are.
-
- * ``drain_method_params``: A dictionary of parameters for the specified
- drain_method. Valid parameters are any of the kwargs defined for the
- specified bounce_method in `drain_lib `_.
-
- * ``cmd``: The command that is executed. Can be used as an alternative to
- args for containers without an `entrypoint
- `_. This value is
- wrapped by Mesos via ``/bin/sh -c ${app.cmd}``. Parsing the Marathon config
- file will fail if both args and cmd are specified [#note]_.
-
- * ``args``: An array of docker args if you use the `"entrypoint"
- `_ functionality.
- Parsing the Marathon config file will fail if both args and cmd are
- specified [#note]_.
-
- * ``monitoring``: See the `monitoring.yaml`_ section for details.
-
- * ``autoscaling``: See the `autoscaling docs `_ for valid options and how they work
-
- * ``metrics_provider``: Which method PaaSTA will use to determine a service's utilization.
-
- * ``decision_policy``: Which method PaaSTA will use to determine when to autoscale a service.
-
- * ``deploy_group``: A string identifying what deploy group this instance belongs
- to. The ``step`` parameter in ``deploy.yaml`` references this value
- to determine the order in which to build & deploy deploy groups. Defaults to
- ``clustername.instancename``. See the deploy group doc_ for more information.
-
- * ``replication_threshold``: An integer representing the percentage of instances that
- need to be available for monitoring purposes. If less than ``replication_threshold``
- percent instances of a service's backends are not available, the monitoring
- scripts will send a CRITICAL alert.
-
-In addition, each instancename MAY configure additional Marathon healthcheck
-options (Read the official
-`mesos documentation `_
-for more low-level details:
-
- * ``healthcheck_mode``: One of ``cmd``, ``tcp``, ``http``, or ``https``.
- If set to ``http`` or ``https``, a ``curl`` command will be executed
- inside the container.
-
- If set to ``cmd`` then PaaSTA will execute ``healthcheck_cmd`` and
- examine the return code. It must return 0 to be considered healthy.
-
- If the service is registered in SmartStack, the healthcheck_mode will
- automatically use the same setings specified by ``smartstack.yaml``.
-
- If not in smartstack, the default healthcheck is "None", which means
- the container is considered healthy unless it crashes.
-
- A http healthcheck is considered healthy if it returns a 2xx or 3xx
- response code.
-
- * ``healthcheck_cmd``: If ``healthcheck_mode`` is set to ``cmd``, then this
- command is executed inside the container as a healthcheck. It must exit
- with status code 0 to signify a successful healthcheck. Any other exit code
- is treated as a failure. This is a required field if ``healthcheck_mode``
- is ``cmd``.
-
- * ``healthcheck_grace_period_seconds``: Marathon will wait this long for a
- service to come up before counting failed healthchecks. Defaults to 60
- seconds.
-
- * ``healthcheck_interval_seconds``: Marathon will wait this long between
- healthchecks. Defaults to 10 seconds.
-
- * ``healthcheck_timeout_seconds``: Marathon will wait this long for a
- healthcheck to return before considering it a failure. Defaults to 10
- seconds.
-
- * ``healthcheck_max_consecutive_failures``: Marathon will kill the current
- task if this many healthchecks fail consecutively. Defaults to 30 attempts.
-
- * ``healthcheck_uri``: The url of the service to healthcheck if using http.
- Defaults to the same uri specified in ``smartstack.yaml``, but can be
- set to something different here.
-
-**Note**: Although many of these settings are inherited from ``smartstack.yaml``,
-their thresholds are not the same. The reason for this has to do with control
-loops and infrastructure stability. The load balancer tier can be pickier
-about which copies of a service it can send requests to, compared to Mesos.
-
-A load balancer can take a container out of service and put it back in a few
-seconds later. Minor flaps and transient errors are tolerated.
-
-The healthchecks specified here in this file signal to the infrastructure that
-a container is unhealthy, and the action to take is to completely destroy it and
-launch it elsewhere. This is a more expensive operation than taking a container
-out of the load balancer, so it justifies having less sensitive thresholds.
-
-**Footnotes**:
-
-.. [#note] The Marathon docs and the Docker docs are inconsistent in their
- explanation of args/cmd:
-
- The `Marathon docs
- `_
- state that it is invalid to supply both cmd and args in the same app.
-
- The `Docker docs `_
- do not state that it's incorrect to specify both args and cmd. Furthermore,
- they state that "Command line arguments to docker run will be
- appended after all elements in an exec form ENTRYPOINT, and will override
- all elements specified using CMD" which implies that both cmd and args can
- be provided, but cmd will be silently ignored.
-
- To avoid issues resulting from this discrepancy, we abide by the stricter
- requirements from Marathon and check that no more than one of cmd and args
- is specified. If both are specified, an exception is thrown with an
- explanation of the problem, and the program terminates.
-
-.. _doc: deploy_groups.html
-
``tron-[clustername].yaml``
--------------------------------
@@ -796,8 +548,6 @@ Each Tron **action** of a job MAY specify the following:
* Anything in the `Common Settings`_.
* Anything from :ref:`General Placement Options `
- and :ref:`Mesos Placement Options ` (currently, Tron
- only supports Mesos workloads).
* ``service``: Uses a docker image from different service. When ``service`` is set
for an action, that setting takes precedence over what is set for the job.
@@ -843,15 +593,38 @@ Each instance MAY have:
* Anything in the `Common Settings`_.
- * ``net``
+ * ``net``: Specify which kind of
+ `networking mode `_
+ instances of this service should be launched using. Defaults to ``'bridge'``.
- * ``cmd``
+ * ``cmd``: The command that is executed. Can be used as an alternative to
+ args for containers without an `entrypoint
+ `_. [#note]_.
- * ``args``
+ * ``args``: An array of docker args if you use the `"entrypoint"
+ `_ functionality. [#note]_.
- * ``deploy_group``
+ * ``deploy_group``: A string identifying what deploy group this instance belongs
+ to. The ``step`` parameter in ``deploy.yaml`` references this value
+ to determine the order in which to build & deploy deploy groups. Defaults to
+ ``clustername.instancename``. See the deploy group doc_ for more information.
-See the `marathon-[clustername].yaml`_ section for details for each of these parameters.
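+
+A minimal, hypothetical sketch of an instance using a couple of these options (the instance name,
+command, and deploy group are made up; see `Common Settings`_ for the rest of the schema)::
+
+    example_batch:
+      cmd: python -m example.batch
+      deploy_group: prod.everything
+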
+**Footnotes**:
+
+.. [#note] The Docker docs' explanation of using both args and cmd:
+ The `Docker docs `_
+ do not state that it's incorrect to specify both args and cmd. Furthermore,
+ they state that "Command line arguments to docker run will be
+ appended after all elements in an exec form ENTRYPOINT, and will override
+ all elements specified using CMD" which implies that both cmd and args can
+ be provided, but cmd will be silently ignored.
+
+ To avoid ``cmd`` being silently ignored, PaaSTA checks that no more than one of cmd and args
+ is specified. If both are specified, an exception is thrown with an
+ explanation of the problem, and the program terminates.
+
+.. _doc: deploy_groups.html
``smartstack.yaml``
-------------------
@@ -872,7 +645,7 @@ Here is an example smartstack.yaml::
The ``main`` key is the service namespace. Namespaces were introduced for
PaaSTA services in order to support running multiple daemons from a single
-service codebase. In PaaSTA, each instance in your marathon.yaml maps to a
+service codebase. In PaaSTA, each instance in your kubernetes.yaml maps to a
smartstack namespace of the same name, unless you specify a different
``registrations``.
@@ -1130,12 +903,6 @@ An example of switching from region to superregion discovery:
- advertise: [region]
+ advertise: [region, superregion]
-1b. When moving from a large grouping to a smaller grouping (like
-moving from superregion => region) you must add an additional constraint
-to ensure Marathon balances the tasks evenly::
-
- extra_constraints: [['region', 'GROUP_BY', 2]]
-
2. (Optional) Use zkCli.sh to monitor your new registrations for each
superregion you are changing::
@@ -1145,7 +912,7 @@ superregion you are changing::
[host1-uswest1adevc_0000015910, host2-uswest1cdevc_0000015898, host3-uswest1cdevc_0000015893]
[zk: 10.40.5.6:22181(CONNECTED) 2]
-2b. Run ``paasta status -v`` to verify that Marathon has balanced services
+2b. Run ``paasta status -v`` to verify that PaaSTA has balanced services
across the infrastructure as expected.
3. Once zookeeper shows the proper servers, switch the discovery key::
@@ -1254,7 +1021,7 @@ An example of a service that only pages on a cluster called "prod"::
team: devs
page: false
- # marathon-prod.yaml
+ # kubernetes-prod.yaml
main:
instances: 3
monitoring:
@@ -1273,13 +1040,13 @@ A service that pages everywhere, but only makes a ticket for a tron job::
page: false
ticket: true
-A marathon/kubernetes service that overrides options on different instances (canary)::
+A Kubernetes service that overrides options on different instances (canary)::
# monitoring.yaml
team: frontend
page: false
- # marathon-prod.yaml or kubernetes-prod.yaml
+ # kubernetes-prod.yaml
main:
instances: 20
monitoring: