expose pvc/pod level metrics (if not already exposed by kubelet and cadvisor) #78

iyashu · 2021-05-21T15:56:20Z

We need Kubernetes persistent volume and filesystem level metrics, along with Grafana dashboards and a few sane alerts, preferably in kube-prometheus mixin format. I believe many of them are already exposed by kubelet (or the embedded cAdvisor), but we need to check and expose them for the cgroup v2 hierarchy as well.

Utilisation metrics (both inode and byte usage) along with total space available. These are already exposed by kubelet, and Grafana dashboards and Prometheus alerts for them are already provided by the kube-prometheus stack.
Volume read and write throughput metrics, in terms of both IOPS and bytes per second. These seem to be exposed by cAdvisor, but are somehow not visible for the cgroup v2 hierarchy.
Disk read and write IO latency. Need to check whether cAdvisor already exposes these for cgroup v2.
Number of outstanding IO operations (preferably both queued and waiting for the block device).
PV abnormality metrics caused by degradation of the underlying disk attached to the node, filesystem corruption, accidental volume deletion on the node, etc. See if we can leverage volume health monitoring for this.
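For the utilisation item above, here is a minimal sketch of how per-PVC byte and inode utilisation could be derived from the kubelet volume stats metrics. The metric names (`kubelet_volume_stats_used_bytes`, `kubelet_volume_stats_capacity_bytes`, `kubelet_volume_stats_inodes_used`, `kubelet_volume_stats_inodes`) are the ones kubelet exposes. The sample payload, the namespace and PVC names, and the helpers `parse` and `utilisation` are hypothetical and only for illustration. The parser handles simple labelled lines, not the full exposition format.

```python
import re
from collections import defaultdict

# Hypothetical sample of kubelet /metrics output; values are made up.
SAMPLE = """\
# TYPE kubelet_volume_stats_capacity_bytes gauge
kubelet_volume_stats_capacity_bytes{namespace="demo",persistentvolumeclaim="data-0"} 1000
kubelet_volume_stats_used_bytes{namespace="demo",persistentvolumeclaim="data-0"} 250
kubelet_volume_stats_inodes{namespace="demo",persistentvolumeclaim="data-0"} 400
kubelet_volume_stats_inodes_used{namespace="demo",persistentvolumeclaim="data-0"} 100
"""

LINE_RE = re.compile(r'^(\w+)\{([^}]*)\}\s+(\S+)$')
LABEL_RE = re.compile(r'(\w+)="([^"]*)"')


def parse(text):
    """Return {(namespace, pvc): {metric_name: value}} for simple labelled lines."""
    out = defaultdict(dict)
    for line in text.splitlines():
        match = LINE_RE.match(line.strip())
        if not match:
            continue  # skip comments, blank lines and unlabelled samples
        name, labels, value = match.groups()
        label_map = dict(LABEL_RE.findall(labels))
        key = (label_map.get("namespace"), label_map.get("persistentvolumeclaim"))
        out[key][name] = float(value)
    return dict(out)


def utilisation(stats):
    """Compute used/capacity ratios for bytes and inodes per PVC."""
    result = {}
    for key, metrics in stats.items():
        result[key] = {
            "bytes": metrics["kubelet_volume_stats_used_bytes"]
            / metrics["kubelet_volume_stats_capacity_bytes"],
            "inodes": metrics["kubelet_volume_stats_inodes_used"]
            / metrics["kubelet_volume_stats_inodes"],
        }
    return result


if __name__ == "__main__":
    for (namespace, pvc), ratios in utilisation(parse(SAMPLE)).items():
        print(namespace, pvc, ratios)
```

In a real cluster the same ratios would normally be computed in PromQL by dividing the two gauges. The Python version is only meant to make the arithmetic explicit.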

Additionally, we require the following metrics related to PVC failure and provisioning to generate appropriate alerts.

PVCs pending for a long time. Explore whether we can leverage kube-state-metrics to expose this, or check whether the external provisioner already provides these metrics.
Other plugin-level metrics (both controller and node driver), such as client-go metrics and creation/expansion/deletion RPC rates, latency and failures.
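As a rough illustration of the "PVC pending for a long time" check, the sketch below mimics what a Prometheus alert with a `for:` duration would do on the kube-state-metrics gauge `kube_persistentvolumeclaim_status_phase{phase="Pending"}`. The function name `pending_longer_than`, the sample data, and the 15-minute threshold are assumptions for illustration, not something this project ships.

```python
def pending_longer_than(samples, threshold_s):
    """Return True if the Pending gauge has been 1 continuously, up to the
    latest sample, for at least threshold_s seconds.

    samples: list of (unix_timestamp, value) tuples sorted by time, where
    value is the scraped gauge value (1 while the PVC is Pending, else 0).
    """
    if not samples or samples[-1][1] != 1:
        return False
    latest_ts = samples[-1][0]
    streak_start = latest_ts
    # Walk backwards until the gauge stops being 1.
    for ts, value in reversed(samples):
        if value != 1:
            break
        streak_start = ts
    return latest_ts - streak_start >= threshold_s


if __name__ == "__main__":
    # Hypothetical scrape history: Pending from t=60 through t=960.
    history = [(0, 0), (60, 1), (120, 1), (960, 1)]
    print(pending_longer_than(history, 15 * 60))
```

An actual alert rule would express the same idea declaratively in the mixin. This only shows the logic such an alert relies on.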

Environment:

Kubernetes version (use kubectl version): >= 1.19
OS (e.g. from /etc/os-release): Debian 10

kmova · 2021-06-09T07:17:10Z

Most of the metrics are available via:

kube-state-metrics
cAdvisor
Node exporter (standard, including kubelet mount point metrics)

In addition to the above, the LVM node plugin will expose metrics (beyond what is exposed by the sample LVM textfile exporter) with the labels required to correlate them with metrics from the standard exporters enabled in the cluster.

Sample dashboard with workload using LVM Local PV showing the PV utilization and performance metrics

iyashu · 2021-06-09T17:21:41Z

Thanks @kmova. Let me know once the dashboard is ready and pushed somewhere. I would like to try it out in our playground clusters.

dsharma-dc · 2024-06-04T11:09:55Z

Need to verify the metrics. Previous comments mention that the metrics are available. @abhilashshetty04 could you please check this?

dsharma-dc · 2024-07-30T08:41:33Z

Not yet picked up for prioritisation.

avishnu · 2024-09-12T10:07:04Z

@abhilashshetty04 @w3aman please confirm if the dashboard contains the needed metrics.

kmova mentioned this issue Jun 12, 2021: Add localPV workload dashboard openebs/monitoring#29 (Merged)

dsharma-dc assigned abhilashshetty04 Jun 4, 2024

dsharma-dc added the enhancement label Jun 4, 2024

dsharma-dc self-assigned this Jul 5, 2024

avishnu assigned w3aman Sep 12, 2024
