Add metrics for provisioning/deprovisioning step result #1039

vvxxvvxx · 2024-08-14T13:14:17Z

Description

To be notified of, or to mitigate, an issue before a provisioning/deprovisioning operation reports its final timeout failure, we'd like to have metrics for the result of each step of the operation.

With step result metrics, we can set up an alert that fires when provisioning/deprovisioning fails repeatedly at a certain step. The alert lets us mitigate the issue before the operation hits its final timeout failure.

We had operation step result metrics in v1, but they were removed in v2. We could add them back and assign proper labels to them.
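As a rough illustration of what labeled step result metrics could look like, here is a minimal sketch using only the Python standard library. It is not the project's actual metrics API: the class `StepResultMetrics`, the label set (operation, step, result), the step names, and the threshold value are all assumptions made for the example.

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class StepResultMetrics:
    """In-memory stand-in for a labeled counter (hypothetical, for illustration)."""

    counts: Counter = field(default_factory=Counter)

    def record(self, operation: str, step: str, result: str) -> None:
        # One count per (operation, step, result) label combination.
        self.counts[(operation, step, result)] += 1

    def failures(self, operation: str, step: str) -> int:
        # Counter returns 0 for label combinations that were never recorded.
        return self.counts[(operation, step, "failed")]


metrics = StepResultMetrics()

# Hypothetical step names; a step that keeps retrying records one failure per attempt.
for _ in range(3):
    metrics.record("provisioning", "create_instance", "failed")
metrics.record("provisioning", "create_binding", "succeeded")

# An alert could fire once a step's failure count reaches a threshold,
# before the overall operation hits its timeout.
FAILURE_THRESHOLD = 3  # assumed value
should_alert = metrics.failures("provisioning", "create_instance") >= FAILURE_THRESHOLD
```

Keeping the step name as a label, rather than defining one metric per step, would let a single alert rule cover every step of both operations.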

Reasons
We only have metrics that report the final status of provisioning/deprovisioning (succeeded or failed), so we are not notified when a single step keeps failing and retrying until the whole process times out. With an alert on step failures, we could mitigate the issue earlier, possibly before the operation hits its timeout failure.
