Address prometheus scraping jobs that are failing #551

Open
jfly opened this issue Feb 1, 2025 · 0 comments
jfly commented Feb 1, 2025

I see that we have 3 failing scraping jobs (at time of writing). Query:

  • up{instance="r13y.com:443", job="r13y"}
  • up{instance="127.0.0.1:9190", job="rfc39"}
  • up{instance="hydra.nixos.org:9199", job="hydra_notify"}

r13y

Last successful scrape: 2024-09-21

Added in 3c4f476.

Looks like it's a reproducibility checker created by @grahamc: https://github.com/grahamc/r13y.com.

Perhaps this can just be removed?

rfc39

This is trickier. It seems to be up only periodically (see the linked query graph):

[Screenshot: Prometheus graph of up{instance="127.0.0.1:9190", job="rfc39"} over time]

Apparently this is a known "issue", see comment here: https://github.com/nixos/infra/blob/af0ed6d10dbb3a3ec919321314506b180d1f5faf/build/pluto/prometheus/exporters/rfc39.nix#L12.

AFAICT, we don't have any alerting rules configured that react to this (just the linked one on systemd unit state). Perhaps we could just stop scraping it? Is there any useful historical data in here?

hydra_notify

Last successful scrape: 2024-08-02

Added in bf95096, also see 88abf45.

Looks like @mweinelt disabled hydra-notify here: 66da5cf, which lines up with the last successful scrape.

Seems like we should just disable this scrape job as well.
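
If we go that route (for hydra_notify, and probably rfc39 as well), it should just be a matter of deleting the corresponding scrape config and redeploying. A minimal sketch of what such an entry looks like with the stock NixOS Prometheus module; the actual definition in this repo may be structured differently (e.g. via the exporter helpers under build/pluto/prometheus/exporters/):

```nix
{
  # Hypothetical sketch of the hydra_notify scrape job, expressed with the
  # standard NixOS module option. "Disabling" the job means deleting this
  # entry (or the module that generates it) and redeploying pluto.
  services.prometheus.scrapeConfigs = [
    {
      job_name = "hydra_notify";
      static_configs = [
        { targets = [ "hydra.nixos.org:9199" ]; }
      ];
    }
  ];
}
```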

jfly self-assigned this Feb 1, 2025