feat: Introduce internal docu (#88)
* chore: introduce internal docu

* fix

* fix

* fix

* Apply suggestions from code review

Co-authored-by: Nina Hingerl <[email protected]>

* Apply suggestions from code review

Co-authored-by: Nina Hingerl <[email protected]>

---------

Co-authored-by: Korbinian Stoemmer <[email protected]>
Co-authored-by: Nina Hingerl <[email protected]>
3 people authored Oct 9, 2024
1 parent 1e3d19b commit 5678715
Showing 9 changed files with 870 additions and 298 deletions.
264 changes: 12 additions & 252 deletions README.md
@@ -7,15 +7,9 @@
![GitHub tag checks state](https://img.shields.io/github/checks-status/kyma-project/kyma-metrics-collector/main?label=kyma-metrics-collector&link=https%3A%2F%2Fgithub.com%2Fkyma-project%2Fkyma-metrics-collector%2Fcommits%2Fmain)

## Overview
Kyma Metrics Collector (KMC) is a component that scrapes all Kyma clusters to generate metrics. These metrics are sent to an SAP internal tool called Event Data Platform (EDP) as an event stream and used for billing information.
Kyma Metrics Collector (KMC) is a component that scrapes all Kyma clusters to generate metrics. These metrics are sent to an SAP-internal tool called Event Data Platform (EDP) as an event stream and used for billing information.

## Functionality
The basic flow for KMC is as follows:
* KMC workers get a list of runtimes from [Kyma Environment Broker (KEB)](https://github.com/kyma-project/kyma-environment-broker/tree/main).
* KMC adds the runtimes to a queue to work through them. If an error occurs, KMC re-queues the affected runtime.
* Information on PVCs, SVCs and Nodes is retrieved from SAP BTP, Kyma runtime (SKR).
* This information is sent to EDP as an event stream.
* For every process step, internal metrics are exposed with the [Prometheus client library](https://github.com/prometheus/client_golang). See the [metrics.md](metrics.md) file for exposed metrics.
Learn more about functionality and architecture in the [Contributor README](./docs/contributor/README.md).

## Usage

@@ -53,255 +47,21 @@ Kyma Metrics Collector comes with the following environment variables:

## Development
- Run a deployment in a currently configured k8s cluster:
>**NOTE:** In order to do this, you need a token from a secret `kcp-kyma-metrics-collector`.
```
ko apply -f dev/
```
>**NOTE:** In order to do this, you need a token from a secret `kcp-kyma-metrics-collector`.
```
ko apply -f dev/
```

- Run tests:
```
make test
```
```
make test
```

### Troubleshooting
- Check logs:
```
kubectl logs -f -n kcp-system $(kubectl get po -n kcp-system -l 'app=kmc-dev' -oname) kmc-dev
```

### Data collection

Kyma Metrics Collector collects information about billable hyperscaler usage and sends it to EDP. This data has to adhere to the following schema:

```json
{
"name": "kmc-consumption-metrics",
"jsonSchema": {
"type": "object",
"title": "SKR Metering Schema",
"description": "SKR Metering Schema.",
"required": [
"timestamp",
"compute",
"networking"
],
"properties": {
"timestamp": {
"$id": "#/properties/timestamp",
"type": "string",
"format": "date-time",
"title": "The Timestamp Schema",
"description": "Event Creation Timestamp",
"default": "",
"examples": ["2020-03-25T09:16:41+00:00"]
},
"compute": {
"$id": "#/properties/compute",
"type": "object",
"title": "The Compute Schema",
"description": "Contains Azure Compute metrics",
"default": {},
"examples": [
{
"provisioned_cpus": 24.0,
"provisioned_volumes": {
"size_gb_rounded": 192.0,
"count": 3.0,
"size_gb_total": 150.0
},
"vm_types": [
{
"name": "Standard_D8_v3",
"count": 3.0
},
{
"name": "Standard_D6_v3",
"count": 2.0
}
],
"provisioned_ram_gb": 96.0
}
],
"required": [
"vm_types",
"provisioned_cpus",
"provisioned_ram_gb",
"provisioned_volumes"
],
"properties": {
"vm_types": {
"$id": "#/properties/compute/properties/vm_types",
"type": "array",
"title": "The Vm_types Schema",
"description": "A list of VM types that have been used for this SKR instance.",
"default": [],
"items": {
"$id": "#/properties/compute/properties/vm_types/items",
"type": "object",
"title": "The Items Schema",
"description": "The Azure instance type name and the provisioned quantity at the time of the event.",
"default": {},
"examples": [
{
"name": "Standard_D8_v3",
"count": 3.0
},
{
"name": "Standard_D6_v3",
"count": 2.0
}
],
"required": ["name", "count"],
"properties": {
"name": {
"$id": "#/properties/compute/properties/vm_types/items/properties/name",
"type": "string",
"title": "The Name Schema",
"description": "Name of the instance type",
"default": "",
"examples": ["Standard_D8_v3"]
},
"count": {
"$id": "#/properties/compute/properties/vm_types/items/properties/count",
"type": "integer",
"title": "The Count Schema",
"description": "Quantity of the instances",
"default": 0,
"examples": [3]
}
}
}
},
"provisioned_cpus": {
"$id": "#/properties/compute/properties/provisioned_cpus",
"type": "integer",
"title": "The Provisioned_cpus Schema",
"description": "The total sum of all CPUs provisioned from all instances (number of instances * number of CPUs per instance)",
"default": 0,
"examples": [24]
},
"provisioned_ram_gb": {
"$id": "#/properties/compute/properties/provisioned_ram_gb",
"type": "integer",
"title": "The Provisioned_ram_gb Schema",
"description": "The total sum of Memory (RAM) of all provisioned instances (number of instances * number of GB RAM per instance).",
"default": 0,
"examples": [96]
},
"provisioned_volumes": {
"$id": "#/properties/compute/properties/provisioned_volumes",
"type": "object",
"title": "The Provisioned_volumes Schema",
"description": "Volumes (Disk) provisioned(excluding the Node volumes).",
"default": {},
"examples": [
{
"size_gb_rounded": 192.0,
"count": 3.0,
"size_gb_total": 150.0
}
],
"required": ["size_gb_total", "count", "size_gb_rounded"],
"properties": {
"size_gb_total": {
"$id": "#/properties/compute/properties/provisioned_volumes/properties/size_gb_total",
"type": "integer",
"title": "The Size_gb_total Schema",
"description": "The total GB disk space requested by a kyma instance",
"default": 0,
"examples": [150]
},
"count": {
"$id": "#/properties/compute/properties/provisioned_volumes/properties/count",
"type": "integer",
"title": "The Count Schema",
"description": "The number of disks provisioned.",
"default": 0,
"examples": [3]
},
"size_gb_rounded": {
"$id": "#/properties/compute/properties/provisioned_volumes/properties/size_gb_rounded",
"type": "integer",
"title": "The Size_gb_rounded Schema",
"description": "Azure charges for disks in 32 GB blocks. For example, provisioning 16 GB is still billed as 32 GB. This value rounds each volume up to the next multiple of 32 GB and sums the results.",
"default": 0,
"examples": [192]
}
}
}
}
},
"networking": {
"$id": "#/properties/networking",
"type": "object",
"title": "The Networking Schema",
"description": "Some networking controlling data.",
"default": {},
"examples": [
{
"provisioned_vnets": 2.0,
"provisioned_ips": 3.0
}
],
"required": [
"provisioned_vnets",
"provisioned_ips"
],
"properties": {
"provisioned_vnets": {
"$id": "#/properties/networking/properties/provisioned_vnets",
"type": "integer",
"title": "The Provisioned_vnets Schema",
"description": "Number of virtual networks",
"default": 0,
"examples": [2]
},
"provisioned_ips": {
"$id": "#/properties/networking/properties/provisioned_ips",
"type": "integer",
"title": "The Provisioned_ips Schema",
"description": "Number of IPs",
"default": 0,
"examples": [3]
}
}
}
}
},
"version": "1",
"eventTimeField": "event.timestamp"
}
```

See the example of data sent to EDP:

```json
{
  "compute": {
    "vm_types": [
      {
        "name": "Standard_D8_v3",
        "count": 3
      },
      {
        "name": "Standard_D6_v3",
        "count": 2
      }
    ],
    "provisioned_cpus": 24,
    "provisioned_ram_gb": 96,
    "provisioned_volumes": {
      "size_gb_total": 150,
      "count": 3,
      "size_gb_rounded": 192
    }
  },
  "networking": {
    "provisioned_vnets": 2,
    "provisioned_ips": 3
  }
}
```
```
kubectl logs -f -n kcp-system $(kubectl get po -n kcp-system -l 'app=kmc-dev' -oname) kmc-dev
```

## Contributing

36 changes: 3 additions & 33 deletions docs/README.md
@@ -2,38 +2,8 @@

## Overview

The `docs` folder contains two subfolders - `user` and `contributor`.
This folder contains two subfolders - `user` and `contributor`.

The `user` subfolder contains the end-user documentation, which is displayed on the [Kyma website](https://kyma-project.io/#/). Depending on your module needs, the subfolder must include overview, usage, or technical reference documents. To display the content on the website properly, create a `_sidebar.md` file in the `user` subfolder and list the documents it contains there. For more information on how to publish user documentation, follow [this guide](https://github.com/kyma-project/community/blob/main/docs/guidelines/content-guidelines/01-user-docs.md).
The [`user`](./user/README.md) subfolder contains the end-user documentation.

The `contributor` subfolder includes any developer-related documentation to help them manually install, develop, and operate a module.

To have a common structure across all modules, all documents must be properly numbered according to the following structure:

> **NOTE:** It is suggested to use the following titles if you have the content that matches them; otherwise use your own, more suitable titles, or simply skip the ones you find irrelevant.
- 00-xx-overview
- 01-xx-tutorial/configuration
- 02-xx-usage
- 03-xx-troubleshooting

where `xx` is the number of the given document. For example:

```bash
00-00-overview-telemetry-manager
00-10-overview-logs
00-20-overview-traces
00-30-overview-metrics
01-10-configure-logs
01-20-configure-traces
01-30-configure-metrics
02-10-use-logs
02-20-use-traces
02-30-use-metrics
(...)
```
> **NOTE:** Before introducing [docsify](https://docsify.js.org/#/?id=docsify), we agreed to use the `10`, `20`, `30` numbering. It was to help maintain the proper order of docs if they were rendered automatically on the website. With docsify, you manually add the content to the `_sidebar.md` file, and docs are displayed in the order you add them. However, this numbering is still recommended to have the unified structure of the docs in the module repositories.

If you have other content that does not fit into the above topics, create your own 04-10-module-specific document(s).

You can divide your documentation into subfolders to avoid having too many documents in one `docs/user` or `docs/contributor` folder. For example, if you have many technical reference documents, you can create a `technical reference` subfolder in `docs/user` and keep relevant documentation there. Each subfolder in the `user` folder must have its own `_sidebar.md` file with the links to the main module page and the list of docs it contains.
The [`contributor`](./contributor/README.md) subfolder includes developer-related documentation that helps contributors manually install, develop, and operate the component.
61 changes: 60 additions & 1 deletion docs/contributor/README.md
@@ -1 +1,60 @@
In this folder, you can add any developer-related documentation, for example, advanced installation options, testing strategy, governance, etc.
# Contributing to Kyma Metrics Collector

## Overview

To bill hyperscaler resources used by SKR clusters, the Kyma Control Plane (KCP) uses the Kyma Metrics Collector (KMC), which is integrated with Unified Metering using Event Data Platform (EDP).

## Architecture

Every SKR cluster runs in a hyperscaler account dedicated to the related global account, so it is shared between many clusters of the same customer. The hyperscaler account is paid by Kyma, and individual resource usage is charged to the customer. The bill to the end user contains one entry, listing the consumed capacity units (CU) without any further breakdown. The bill is created by the Unified Metering service.

![arch](./assets/arch.drawio.svg)

The following step happens once for every SKR registration and deregistration:

1. KEB registers or unregisters the tenant in EDP.

The following steps happen periodically:

2. KMC workers fetch the list of billable SKR clusters from [Kyma Environment Broker (KEB)](https://github.com/kyma-project/kyma-environment-broker/tree/main) and add them to a queue to work through them. If an error occurs, KMC re-queues the affected SKR cluster. For every process step, internal metrics are exposed with the [Prometheus client library](https://github.com/prometheus/client_golang). For details about the exposed metrics, see the [metrics.md](./metrics.md) file.
2. KMC fetches the kubeconfig for every SKR cluster from the control plane resources.
2. KMC fetches specific Kubernetes resources from the API server of every SKR cluster using the related kubeconfig. The following resources are collected:
- node type - using the machine type from the node label, KMC determines how much memory and CPU the node provides and maps it to an amount of CPU.
- storage - for every storage volume, KMC determines the provisioned GB value.
- services - this resource type is currently unused and is dropped after fetching.
2. KMC maps the retrieved Kubernetes resources to a memory/CPU/storage value and sends the value to EDP as an event stream.
2. EDP calculates the consumed CUs based on the consumed CPU or storage with a fixed formula and sends the consumed CUs to Unified Metering.
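
The periodic queue-and-re-queue behavior described in the steps above can be illustrated with a minimal sketch. It is not the KMC implementation (KMC is written in Go); the function name `process_clusters`, the `collect` and `send` callbacks, and the `max_attempts` limit are all assumptions made for illustration. The source only states that a failed SKR cluster is re-queued.

```python
from collections import deque


def process_clusters(cluster_ids, collect, send, max_attempts=3):
    """Work through a queue of cluster IDs, re-queueing clusters that fail.

    collect(cluster_id) and send(cluster_id, payload) are hypothetical
    callbacks standing in for "fetch resources from the SKR cluster" and
    "send the event to EDP". max_attempts is an assumed retry limit.
    """
    queue = deque((cluster_id, 1) for cluster_id in cluster_ids)
    sent, failed = [], []
    while queue:
        cluster_id, attempt = queue.popleft()
        try:
            payload = collect(cluster_id)
            send(cluster_id, payload)
            sent.append(cluster_id)
        except Exception:
            if attempt < max_attempts:
                # Put the affected cluster back at the end of the queue.
                queue.append((cluster_id, attempt + 1))
            else:
                failed.append(cluster_id)
    return sent, failed
```

Re-queueing at the end of the queue, rather than retrying immediately, lets the remaining clusters be processed before the failing one is attempted again.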

## EDP interface

The data sent to EDP must adhere to the schema you see in [edp.json](./assets/edp.json).

See the following example payload:

```json
{
"compute": {
"vm_types": [
{
"name": "Standard_D8_v3",
"count": 3
},
{
"name": "Standard_D6_v3",
"count": 2
}
],
"provisioned_cpus": 24,
"provisioned_ram_gb": 96,
"provisioned_volumes": {
"size_gb_total": 150,
"count": 3,
"size_gb_rounded": 192
}
},
"networking": {
"provisioned_vnets": 2,
"provisioned_ips": 3
}
}
```
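
As a rough illustration of how the `compute` part of such a payload could be derived, the following sketch aggregates node machine types and volume sizes. It assumes the rule from the schema description of `size_gb_rounded`: each volume is rounded up to the next multiple of 32 GB and the results are summed. The `MACHINE_SPECS` table and the function names are hypothetical and not taken from the KMC code.

```python
from collections import Counter

BLOCK_GB = 32  # block size named in the schema description of size_gb_rounded

# Hypothetical lookup table: machine type -> (vCPUs, RAM in GB).
# Illustrative only; the real mapping lives in KMC's configuration.
MACHINE_SPECS = {
    "Standard_D8_v3": (8, 32),
}


def summarize_volumes(sizes_gb):
    """Return the provisioned_volumes object for a list of volume sizes in GB."""
    # Integer ceiling division rounds each volume up to a full 32 GB block.
    rounded = sum((size + BLOCK_GB - 1) // BLOCK_GB * BLOCK_GB for size in sizes_gb)
    return {
        "size_gb_total": sum(sizes_gb),
        "count": len(sizes_gb),
        "size_gb_rounded": rounded,
    }


def build_compute(machine_types, volume_sizes_gb):
    """Aggregate per-node machine types and volume sizes into a compute object."""
    counts = Counter(machine_types)
    return {
        "vm_types": [{"name": name, "count": n} for name, n in counts.items()],
        "provisioned_cpus": sum(MACHINE_SPECS[name][0] * n for name, n in counts.items()),
        "provisioned_ram_gb": sum(MACHINE_SPECS[name][1] * n for name, n in counts.items()),
        "provisioned_volumes": summarize_volumes(volume_sizes_gb),
    }
```

For example, `summarize_volumes([50, 50, 50])` yields a total of 150 GB and a rounded value of 192 GB, because each 50 GB volume is rounded up to 64 GB. These are the volume figures shown in the example payload.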