document ResourceUsage and Metrics CR (#975)
* Add docs

* add ResourceUsage and Metrics document details

* revise based on comments

---------

Co-authored-by: Shiming Zhang <[email protected]>
caozhuozi and wzshiming authored Mar 25, 2024
1 parent 014fec7 commit 5c13cee
Showing 13 changed files with 424 additions and 2 deletions.
28 changes: 28 additions & 0 deletions demo/resource-usage.demo
Original file line number Diff line number Diff line change
@@ -0,0 +1,28 @@
# Let's get started with kwokctl!
kwokctl create cluster --enable-metrics-server -c ./kustomize/metrics/resource/metrics-resource.yaml -c ./kustomize/metrics/usage/usage-from-annotation.yaml

kwokctl scale node --replicas 2
kwokctl scale pod --replicas 8

# Wait for a while to let the metrics server collect the metrics.
sleep 45

# Now we can check the metrics.
kubectl top node
kubectl top pod

# Let's add some usage metrics to the pods.
kubectl patch pod pod-000000 --type=json -p='[{"op":"add","path":"/metadata/annotations","value":{"kwok.x-k8s.io/usage-cpu":"10000m","kwok.x-k8s.io/usage-memory":"10000Mi"}}]'

# Wait for a while to let the metrics server collect the metrics.
sleep 15

# Now we can check the metrics again.
kubectl top node
kubectl top pod

# Delete the cluster.
kwokctl delete cluster

# That's all, enjoy it!
clear
1 change: 1 addition & 0 deletions demo/resource-usage.svg
5 changes: 5 additions & 0 deletions kustomize/metrics/resource/README.md
@@ -0,0 +1,5 @@
# Metrics Resource

This Metrics simulates kubelet's `/metrics/resource` endpoint.
Please refer to [Metrics](https://kwok.sigs.k8s.io/docs/user/metrics-configuration) for more on how it works.

8 changes: 8 additions & 0 deletions kustomize/metrics/usage/README.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,8 @@
# Resource Usage

This ResourceUsage simulates the resource usage of Pod(s) based on information collected from the respective annotations.

Two annotations are provided for Pod(s):

- `kwok.x-k8s.io/usage-cpu`
- `kwok.x-k8s.io/usage-memory`
12 changes: 12 additions & 0 deletions site/config.yaml
@@ -215,6 +215,18 @@ menu:
- identifier: attach
  pageRef: "/docs/user/attach-configuration"
  parent: configuration
- identifier: metrics
  pageRef: "/docs/user/metrics-configuration"
  parent: configuration
- identifier: resource-usage
  pageRef: "/docs/user/resource-usage-configuration"
  parent: configuration
- identifier: go-template
  pageRef: "/docs/user/go-template"
  parent: configuration
- identifier: cel-expressions
  pageRef: "/docs/user/cel-expressions"
  parent: configuration

# Design Children
- identifier: introduction
2 changes: 2 additions & 0 deletions site/content/en/docs/user/_index.md
@@ -37,6 +37,7 @@ If any special concerns, you can configure KWOK with options and stages.
- [Exec]
- [Logs]
- [Attach]
- [ResourceUsage]

I hope this helps you get started with KWOK! Good luck and have fun!

@@ -53,3 +54,4 @@ I hope this helps you get started with KWOK! Good luck and have fun!
[Exec]: {{< relref "/docs/user/exec-configuration" >}}
[Logs]: {{< relref "/docs/user/logs-configuration" >}}
[Attach]: {{< relref "/docs/user/attach-configuration" >}}
[ResourceUsage]: {{< relref "/docs/user/resource-usage-configuration" >}}
77 changes: 77 additions & 0 deletions site/content/en/docs/user/cel-expressions.md
@@ -0,0 +1,77 @@
---
title: "CEL Expressions in `kwok`"
---

# Notes on CEL Expressions in `kwok`

This page provides concise notes on writing CEL expressions in `kwok` CRs.

Below is the list of all CRs in `kwok` that contain CEL-based fields.
* [Metric]
* [ResourceUsage]
* [ClusterResourceUsage]


You must follow [the CEL language specification] when writing the expressions.
For predefined functions of CEL, please refer to [CEL predefined functions].

Besides the built-in functions, `kwok` also provides some customized extension functions.
An exhaustive list of all the extension functions with their usages is given below.

* `Now()`: takes no parameters and returns the current timestamp.
* `Rand()`: takes no parameters and returns a random `float64` value.
* `SinceSecond()` returns the seconds elapsed since a given resource (`pod` or `node`) was created.
  For example: `SinceSecond(pod)`, `node.SinceSecond()`.
* `UnixSecond()` returns the Unix time of a given time of type `time.Time`.
  For example: `UnixSecond(Now())`, `UnixSecond(node.metadata.creationTimestamp)`.
* `Quantity()` returns the `float64` value of a given Quantity value. For example: `Quantity("100m")`, `Quantity("10Mi")`.
* `Usage()` returns the current instantaneous resource usage based on the simulation data in [ResourceUsage (ClusterResourceUsage)].
  For example: `Usage(pod, "memory")`, `Usage(node, "memory")`, and `Usage(pod, "memory", container.name)` return the
  current working set of a resource (pod, node, or container) in bytes.
* `CumulativeUsage()` returns the cumulative resource usage in seconds based on the simulation data given in [ResourceUsage (ClusterResourceUsage)].
  For example: `CumulativeUsage(pod, "cpu")`, `CumulativeUsage(node, "cpu")`, and `CumulativeUsage(pod, "cpu", container.name)`
  return the cumulative CPU time consumed by a resource (pod, node, or container) in core-seconds.
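
As a rough illustration of what `Quantity()` evaluates to, the Python sketch below converts a few Kubernetes quantity strings into floats. It is a simplified model, not `kwok`'s implementation: the helper name `quantity_to_float` is hypothetical, and only the suffixes listed in the table are handled.

```python
# Simplified model of Quantity(): convert a Kubernetes quantity string to a float.
# Assumption: only the suffixes below are supported; real quantities allow more.
SUFFIXES = {
    "Ki": 2**10, "Mi": 2**20, "Gi": 2**30,  # binary suffixes
    "k": 1e3, "M": 1e6, "G": 1e9,           # decimal suffixes
    "m": 1e-3,                               # milli, e.g. "100m" CPU
}

def quantity_to_float(text: str) -> float:
    # Try longer suffixes first so that two-character suffixes win.
    for suffix in sorted(SUFFIXES, key=len, reverse=True):
        if text.endswith(suffix):
            return float(text[: -len(suffix)]) * SUFFIXES[suffix]
    # No known suffix: treat the text as a plain number.
    return float(text)

cpu_cores = quantity_to_float("100m")     # roughly 0.1 cores
memory_bytes = quantity_to_float("10Mi")  # 10 * 2**20 bytes
```

Under this model, a CPU quantity is expressed in cores and a memory quantity in bytes, which matches the units used by `Usage()` and `CumulativeUsage()` above.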

Additionally, `kwok` provides three special CEL variables, `node`, `pod`, and `container`, that can be used
in the expressions.
The three variables are set to the corresponding node, pod, and container resource objects respectively, and you can
reference any nested field of these objects via the CEL field selection expression (`e.f` format).
For example, the expression `node.metadata.name` yields the node name.

{{< hint "info" >}}

Functions with at least one parameter can also be called in receiver call style.
That is, a function call like `f(e1, e2)` can also be written as `e1.f(e2)`. For example, you can use `pod.Usage("memory")`
as an alternative to `Usage(pod, "memory")`.

{{< /hint >}}


It is worth noting that the use of some extension functions is restricted to specific CRs and contexts in the sense
that they are not generic but designed for special evaluating tasks.
The detailed limitations are described below.

## Functions Limitation

The functions `Usage()` and `CumulativeUsage()` can only be used in the Metric resource.
The other functions listed above can also be used in ResourceUsage and ClusterResourceUsage
to build dynamic resource usage patterns.

The reason is that when `kwok` evaluates `Usage()` or `CumulativeUsage()`,
it takes the simulation data given in ResourceUsage and ClusterResourceUsage to obtain the metric values.
Therefore, when using `Usage()` or `CumulativeUsage()`, make sure that an associated ResourceUsage or ClusterResourceUsage
covering the needed resource types (cpu or memory) is also provided.
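
One way to picture the relationship between the two functions: if `Usage()` is an instantaneous rate, `CumulativeUsage()` behaves like its running total over time. The Python sketch below is only a conceptual model under that assumption. The helper `accumulate_core_seconds` and the one-second step are hypothetical and do not reflect how `kwok` samples usage internally.

```python
from typing import Callable

def accumulate_core_seconds(usage_cores: Callable[[int], float], seconds: int) -> float:
    """Sum an instantaneous CPU usage (in cores) over one-second steps."""
    total = 0.0
    for t in range(seconds):
        # Each step contributes (cores in use) * (1 second) core-seconds.
        total += usage_cores(t) * 1.0
    return total

# A pod that constantly uses a quarter of a core for two minutes.
constant = accumulate_core_seconds(lambda t: 0.25, 120)

# A pod that uses half a core for the first minute, then a full core.
stepped = accumulate_core_seconds(lambda t: 0.5 if t < 60 else 1.0, 120)
```

In this model the constant pattern accumulates 30 core-seconds and the stepped pattern 90 core-seconds, which is the kind of monotonically increasing value a `counter` metric expects.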

## Variables Limitation

When using the three special CEL variables `node`, `pod`, and `container` in a Metric resource, follow the rules below.
* When `dimension` is `node`: only the `node` variable can be used.
* When `dimension` is `pod`: only `node` and `pod` can be used.
* When `dimension` is `container`: `node`, `pod`, and `container` can all be used.
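
The rules above amount to a small lookup table. The following Python sketch encodes them so that an expression's variables can be checked against a `dimension` before a Metric resource is written. The table mirrors the list above, while the helper `disallowed_variables` is hypothetical and not part of `kwok`.

```python
from typing import Set

# Variables available for each `dimension`, as listed above.
ALLOWED_VARIABLES = {
    "node": {"node"},
    "pod": {"node", "pod"},
    "container": {"node", "pod", "container"},
}

def disallowed_variables(dimension: str, used: Set[str]) -> Set[str]:
    """Return the variables in `used` that are not available for `dimension`."""
    return used - ALLOWED_VARIABLES[dimension]

# `pod` is not available when the dimension is `node`.
problems = disallowed_variables("node", {"node", "pod"})
```

An empty result means every variable referenced by the expression is available for that dimension.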


[Metric]: {{< relref "/docs/generated/apis" >}}#kwok.x-k8s.io/v1alpha1.Metric
[ResourceUsage]: {{< relref "/docs/generated/apis" >}}#kwok.x-k8s.io/v1alpha1.ResourceUsage
[ClusterResourceUsage]: {{< relref "/docs/generated/apis" >}}#kwok.x-k8s.io/v1alpha1.ClusterResourceUsage
[the CEL language specification]: https://github.com/google/cel-spec/blob/master/doc/langdef.md
[CEL predefined functions]: https://github.com/google/cel-spec/blob/master/doc/langdef.md#list-of-standard-definitions
26 changes: 26 additions & 0 deletions site/content/en/docs/user/go-template.md
@@ -0,0 +1,26 @@
---
title: "Go Template in `kwok`"
---

# Notes on Go Template in `kwok`


This page provides concise notes on writing Go templates in `kwok` CRs.


Currently, only the `Stage` CR has Go-template-based fields (`Spec.Next.StatusTemplate`).


You must follow [the go text template syntax] when writing the templates.
For predefined functions of go text template, please refer to [go text template functions].
Besides the built-in functions, `kwok` also supports [sprig template functions].

It is worth noting that the "context" (which is denoted by the period character `.` ) to a template in `kwok` is set to the
referenced Kubernetes resource.
For example, you can use `.metadata.name` in a template to obtain the corresponding Kubernetes resource name.



[the go text template syntax]: https://pkg.go.dev/text/template
[go text template functions]: https://pkg.go.dev/text/template#hdr-Functions
[sprig template functions]: https://masterminds.github.io/sprig/
17 changes: 17 additions & 0 deletions site/content/en/docs/user/kwok-in-cluster.md
@@ -35,6 +35,21 @@ NOTE: This configures the pod/node emulation behavior, if not it will do nothing
kubectl apply -f "https://github.com/${KWOK_REPO}/releases/download/${KWOK_LATEST_RELEASE}/stage-fast.yaml"
```

## Set up default CRs of resource usage (optional)

This allows you to simulate the resource usage of nodes, pods and containers.

``` bash
kubectl apply -f "https://github.com/${KWOK_REPO}/releases/download/${KWOK_LATEST_RELEASE}/metrics-usage.yaml"
```

The above configuration sets the CPU and memory usage of all the containers managed by `kwok` to `1m` and to `1Mi` respectively.
To override the defaults, you can add annotation `"kwok.x-k8s.io/usage-cpu"` (for cpu usage) and
`"kwok.x-k8s.io/usage-memory"` (for memory usage) with any quantity value you want to the fake pods.
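
The override rule described above can be pictured as a lookup with a fallback. The Python snippet below is an illustrative model only, not `kwok`'s code: `effective_usage` is a hypothetical helper, and the pod is represented as a plain dictionary rather than a real Kubernetes object.

```python
# Defaults and annotation keys as described in the text above.
DEFAULTS = {"cpu": "1m", "memory": "1Mi"}
ANNOTATIONS = {
    "cpu": "kwok.x-k8s.io/usage-cpu",
    "memory": "kwok.x-k8s.io/usage-memory",
}

def effective_usage(pod: dict, resource: str) -> str:
    """Return the annotated usage for `resource`, or the default if unset."""
    annotations = pod.get("metadata", {}).get("annotations") or {}
    return annotations.get(ANNOTATIONS[resource], DEFAULTS[resource])

plain_pod = {"metadata": {"name": "pod-000000"}}
busy_pod = {
    "metadata": {
        "name": "pod-000001",
        "annotations": {"kwok.x-k8s.io/usage-cpu": "500m"},
    }
}

plain_cpu = effective_usage(plain_pod, "cpu")       # falls back to "1m"
busy_cpu = effective_usage(busy_pod, "cpu")         # taken from the annotation
busy_memory = effective_usage(busy_pod, "memory")   # falls back to "1Mi"
```

Each resource falls back independently, so annotating only the CPU usage leaves the memory usage at its default.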

The resource usage simulation used above is annotation-based and the configuration is available [here][resource usage from annotation].
For the explanation of how it works and more complex resource usage simulation methods, please refer to [ResourceUsage configuration].

## Old way to deploy kwok

Old way to deploy kwok is [here][kwok in cluster old].
@@ -45,3 +60,5 @@ Now, you can use `kwok` to [manage nodes and pods] in the Kubernetes cluster.

[manage nodes and pods]: {{< relref "/docs/user/kwok-manage-nodes-and-pods" >}}
[kwok in cluster old]: {{< relref "/docs/user/kwok-in-cluster-old" >}}
[resource usage from annotation]: https://github.com/kubernetes-sigs/kwok/tree/main/kustomize/metrics/usage
[ResourceUsage configuration]: {{< relref "/docs/user/resource-usage-configuration" >}}
108 changes: 108 additions & 0 deletions site/content/en/docs/user/metrics-configuration.md
@@ -0,0 +1,108 @@
---
title: "Metrics"
---

# Metrics Configuration

{{< hint "info" >}}

This document walks you through how to configure the Metrics feature.

{{< /hint >}}

## What is a Metrics?

The [Metrics] is a [`kwok` Configuration][configuration] that allows users to define and simulate metrics endpoints exposed by kubelet.

The YAML below shows all the fields of a Metrics resource:

``` yaml
kind: Metrics
apiVersion: kwok.x-k8s.io/v1alpha1
metadata:
  name: <string>
spec:
  path: <string>
  metrics:
  - name: <string>
    help: <string>
    kind: <string>
    dimension: <string>
    labels:
    - name: <string>
      value: <string>
    value: <string> # for counter and gauge
    buckets: # for histogram
    - le: <float64>
      value: <string>
      hidden: <bool>
```
Kubelet exposes four metric-related endpoints in total: `/metrics`, `/metrics/resource`, `/metrics/probes` and `/metrics/cadvisor`,
all of which are served in the Prometheus text format. The Metrics resource can simulate endpoints in this format.

To simulate a metrics endpoint, first specify the RESTful `path` of the endpoint.
Once the resource is applied, the metrics service of `kwok` installs and exposes that path on port `10247`.
The `path` must start with `/metrics`; otherwise, `kwok` will not install it.


{{< hint "info" >}}
Starting from metrics-server 0.7.0, you can specify the path from which metrics are scraped for a node.
Specifically, metrics-server checks whether a node has the annotation `metrics.k8s.io/resource-metrics-path`
and, if so, uses its value as the target metric scrape path.
Combined with the Metric CR, this feature makes it possible to integrate `kwok` with metrics-server.
For a fake node, adding that annotation and setting its value to the `path`
specified in a Metric resource makes metrics-server collect data from the endpoints exposed by `kwok` instead of
scraping from kubelet.
{{< /hint >}}

Besides, compared to kubelet, which only exposes the metric of the node it is located on, `kwok` needs to expose the
metrics of all the fake nodes it manages. Instead of creating a separate Metric CR for each fake node, it is possible
to bind all the metrics endpoints from different nodes into a single `path`. Metric CR allows for a built-in
`{nodeName}` path parameter to be included in the `path` field. For example: `/metrics/nodes/{nodeName}/metrics/resource`.
With `{nodeName}`, a single `path` is able to differentiate the metric data from different nodes.
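
To make the role of `{nodeName}` concrete, the Python sketch below expands a path template for a given node and recovers the node name from a requested path. It only illustrates the idea of a path parameter. The helpers `path_for_node` and `node_from_path` are hypothetical, and `kwok`'s own routing may work differently.

```python
import re
from typing import Optional

PATH_TEMPLATE = "/metrics/nodes/{nodeName}/metrics/resource"

def path_for_node(template: str, node_name: str) -> str:
    """Fill the {nodeName} placeholder with a concrete node name."""
    return template.replace("{nodeName}", node_name)

def node_from_path(template: str, request_path: str) -> Optional[str]:
    """Extract the node name from a request path, or None if it does not match."""
    # Escape the template, then turn the escaped placeholder into a named group.
    pattern = re.escape(template).replace(
        re.escape("{nodeName}"), r"(?P<nodeName>[^/]+)"
    )
    match = re.fullmatch(pattern, request_path)
    return match.group("nodeName") if match else None

node0_path = path_for_node(PATH_TEMPLATE, "node-0")
resolved = node_from_path(PATH_TEMPLATE, "/metrics/nodes/node-1/metrics/resource")
```

With one template, each fake node gets its own concrete URL, and a request for that URL can be mapped back to the node whose metrics should be returned.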


The `metrics` field is used to customize the response body of the installed metrics endpoint.

The descriptions of each sub-field are available at [Metric API][Metric].
For readers' convenience, we also mirror the documents here with some additional notes.

`metrics` is a list of specific configuration items, with each corresponding to a Prometheus style metric:
* `name` defines the metric name.
* `labels` defines the metric labels, with each item corresponding to a specific metric label.
  - `name` is a constant string that provides the label name.
  - `value` is represented as a CEL expression that dynamically determines the label value.
    For example: you can use `node.metadata.name` to reference the node name as the label value.
* `help` defines the help string of a metric.
* `kind` defines the type of the metric: `counter`, `gauge`, or `histogram`.
* `dimension` defines where the data comes from. It could be `node`, `pod`, or `container`.
* `value` is a CEL expression that defines the metric value if `kind` is `counter` or `gauge`.
  Please refer to [CEL expressions in `kwok`] for more detailed instructions that might be helpful to simulate the metric value.
* `buckets` is exclusively for customizing the data of the metric of kind `histogram`.
  - `le`, which defines the histogram bucket's upper threshold, has the same meaning as the one of Prometheus histogram bucket.
    That is, each bucket contains values less than or equal to `le`.
  - `value` is a CEL expression that provides the value of the bucket.
  - `hidden` indicates whether the bucket is hidden from the metric output.
    The value of a hidden bucket is still calculated and accumulated into the next bucket.
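
To show how `name`, `help`, `kind`, `labels`, and `value` map onto a Prometheus style response body, the Python sketch below renders one counter or gauge in the Prometheus text exposition format. It is a minimal illustration, not `kwok`'s renderer: `render_metric` and the metric name `example_memory_bytes` are hypothetical, label values are assumed to need no escaping, and histograms are left out.

```python
from typing import Dict, List, Tuple

def render_metric(
    name: str,
    help_text: str,
    kind: str,
    samples: List[Tuple[Dict[str, str], float]],
) -> str:
    """Render a counter or gauge as Prometheus text exposition lines."""
    lines = [f"# HELP {name} {help_text}", f"# TYPE {name} {kind}"]
    for labels, value in samples:
        label_text = ",".join(f'{key}="{val}"' for key, val in labels.items())
        if label_text:
            lines.append(f"{name}{{{label_text}}} {value}")
        else:
            lines.append(f"{name} {value}")
    return "\n".join(lines)

body = render_metric(
    "example_memory_bytes",
    "Example gauge for a fake node",
    "gauge",
    [({"node": "node-0"}, 1048576)],
)
```

In a Metric resource, the label values and the sample value would come from evaluating the configured CEL expressions for each node, pod, or container selected by `dimension`.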

Please refer to [Metrics for kubelet's "metrics/resource" endpoint][metrics resource endpoint] for a detailed example.


## Out-of-the-box Metric Config

`kwok` currently provides the [Metrics config][metrics resource endpoint] that is capable of
simulating kubelet's `"metrics/resource"` endpoint.

To integrate the simulated endpoint with metrics-server (required version >= 0.7.0), add the
`"metrics.k8s.io/resource-metrics-path": "/metrics/nodes/<nodeName>/metrics/resource"` annotation to the fake
nodes managed by `kwok`.
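
If many fake nodes need this annotation, the value can be generated per node. The Python sketch below builds a merge-patch body that sets the annotation for one node. The helper `resource_metrics_patch` is hypothetical, and the path layout assumes the Metric resource uses `/metrics/nodes/{nodeName}/metrics/resource` as its `path`.

```python
import json

ANNOTATION_KEY = "metrics.k8s.io/resource-metrics-path"

def resource_metrics_patch(node_name: str) -> str:
    """Build a JSON merge patch that points metrics-server at kwok's endpoint."""
    path = f"/metrics/nodes/{node_name}/metrics/resource"
    patch = {"metadata": {"annotations": {ANNOTATION_KEY: path}}}
    return json.dumps(patch)

patch_body = resource_metrics_patch("node-0")
```

The resulting string could then be applied with a command such as `kubectl patch node node-0 --type=merge -p "$PATCH"`, where `PATCH` holds the generated body.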

<img width="700px" src="/img/demo/resource-usage.svg">


[configuration]: {{< relref "/docs/user/configuration" >}}
[Metrics]: {{< relref "/docs/generated/apis" >}}#kwok.x-k8s.io/v1alpha1.Metrics
[CEL expressions in `kwok`]: {{< relref "/docs/user/cel-expressions" >}}
[metrics resource endpoint]: https://github.com/kubernetes-sigs/kwok/blob/main/kustomize/metrics/resource
[ResourceUsage (ClusterResourceUsage)]: {{< relref "/docs/user/resource-usage-configuration" >}}