nodeos_block_num metric flapping on BP node #956

Closed
ank-everstake opened this issue Oct 21, 2024 · 1 comment · Fixed by #957 or #975

Comments

@ank-everstake

Describe the bug

The nodeos_block_num Prometheus metric reports 0 whenever the block producer node is itself producing a block.
This generates false-positive alerts and makes Grafana charts unreadable. Non-producer nodes are not affected.

Expected behavior

The metric should report the real block number, without flapping, even while the node is producing.
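
To make "flapping" concrete, here is a minimal Python sketch that flags every sample where the value drops to 0 after a non-zero block number has already been seen. The function name and the sample series are hypothetical and purely illustrative; they are not taken from the node or from the screenshots.

```python
from typing import Iterable, List, Tuple


def find_zero_flaps(samples: Iterable[int]) -> List[Tuple[int, int]]:
    """Return (index, last_nonzero_value) for each sample that reads 0
    after a non-zero block number has already been observed."""
    flaps: List[Tuple[int, int]] = []
    last_nonzero = None
    for index, value in enumerate(samples):
        if value != 0:
            last_nonzero = value
        elif last_nonzero is not None:
            # A zero after real data is treated as a flap, not a real value.
            flaps.append((index, last_nonzero))
    return flaps


# Made-up series: zeros appear while the node is assumed to be producing.
series = [100, 101, 0, 0, 103, 104, 0, 106]
print(find_zero_flaps(series))  # [(2, 101), (3, 101), (6, 104)]
```

Leading zeros (before any block number is seen) are deliberately not counted, since they could simply mean the node has not started reporting yet.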

Screenshots

2024-10-21_10-37

2024-10-18_00-19
Left: producer logs. Right: output of a while true loop running curl | grep nodeos_block_num.
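
For anyone who wants to reproduce the right-hand pane without a shell loop, the following Python sketch polls a metrics endpoint and prints the nodeos_block_num sample each time. It is only an approximation of the curl | grep loop above: the endpoint URL is not given in this report, so it is left as a parameter, and the helper names, polling interval, and sample count are assumptions. The parser assumes the standard Prometheus text exposition layout (name, optional labels, value) with no whitespace inside label values.

```python
import time
import urllib.request
from typing import Optional

METRIC = "nodeos_block_num"


def parse_metric(exposition_text: str, name: str = METRIC) -> Optional[float]:
    """Return the first sample value for `name` in Prometheus text output,
    or None if the metric is not present."""
    for line in exposition_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        parts = line.split()
        # Sample lines look like: name{labels} value [timestamp]
        metric_name = parts[0].split("{", 1)[0]
        if metric_name == name and len(parts) >= 2:
            return float(parts[1])
    return None


def poll(url: str, interval_s: float = 0.1, count: int = 50) -> None:
    """Fetch `url` repeatedly and print the timestamped metric value."""
    for _ in range(count):
        with urllib.request.urlopen(url, timeout=2) as response:
            text = response.read().decode("utf-8", errors="replace")
        print(time.strftime("%H:%M:%S"), parse_metric(text))
        time.sleep(interval_s)


# Usage (hypothetical; substitute the metrics URL your node actually exposes):
# poll("http://<host>:<port>/<metrics-path>")
```

On an affected producer, the printed value would be expected to alternate between the real block number and 0.0 during its production rounds.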

Specs / Additional context

OS - Ubuntu 22.04 LTS
RAM - 256 GB
CPU - AMD Ryzen Threadripper 7960X

We have observed this issue since

v1.0.0-a8159feae6f7f0d89dc0f990682a3b09635a3e1f

and it persists on the current version

v1.0.1-574650744460373f635d48cac9aa6dee67dcbfdb
@heifner
Member

heifner commented Oct 21, 2024

Note: #707

@heifner heifner added the bug The product is not working as was intended. label Oct 21, 2024
@heifner heifner added this to the Spring v1.0.3 milestone Oct 21, 2024
@heifner heifner self-assigned this Oct 21, 2024
@heifner heifner added the OCI Work exclusive to OCI team label Oct 21, 2024
heifner added a commit that referenced this issue Oct 24, 2024
…-metrics

[1.0.3] Prometheus: Update speculative block metrics for produced blocks
heifner added a commit that referenced this issue Oct 24, 2024
heifner added a commit that referenced this issue Oct 25, 2024
…-metrics-main

[1.0.3 -> main] Prometheus: Update speculative block metrics for produced blocks
@github-project-automation github-project-automation bot moved this from Todo to Done in Team Backlog Oct 25, 2024
@heifner heifner removed the triage label Oct 25, 2024
Projects
Status: Done
3 participants