Address prometheus scraping jobs that are failing #551

Open
jfly opened this issue Feb 1, 2025 · 0 comments
jfly commented Feb 1, 2025

I see that we have 3 failing scraping jobs (at time of writing). Query:

  • up{instance="r13y.com:443", job="r13y"}
  • up{instance="127.0.0.1:9190", job="rfc39"}
  • up{instance="hydra.nixos.org:9199", job="hydra_notify"}

r13y

Last successful scrape: 2024-09-21

Added in 3c4f476.

Looks like it's a reproducibility checker created by @grahamc: https://github.com/grahamc/r13y.com.

Perhaps this can just be removed?

rfc39

This is trickier. It seems to be up only periodically (see the linked query graph):

[Screenshot: Prometheus graph of up{instance="127.0.0.1:9190", job="rfc39"} over time]

Apparently this is a known "issue", see comment here: https://github.com/nixos/infra/blob/af0ed6d10dbb3a3ec919321314506b180d1f5faf/build/pluto/prometheus/exporters/rfc39.nix#L12.

AFAICT, we don't have any alerting rules configured that react to this (just the linked one on systemd unit state). Perhaps we could just stop scraping it? Is there any useful historical data in here?

hydra_notify

Last successful scrape: 2024-08-02

Added in bf95096, also see 88abf45.

Looks like @mweinelt disabled hydra-notify here: 66da5cf, which lines up with the last successful scrape.

Seems like we should just disable this scrape job as well.
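
If we go that route (for hydra_notify, and probably rfc39 as well), it should just be a matter of deleting the corresponding scrape config and redeploying. A minimal sketch of what such an entry looks like with the stock NixOS Prometheus module; the actual definition in this repo may be structured differently (e.g. via the exporter helpers under build/pluto/prometheus/exporters/):

```nix
{
  # Hypothetical sketch of the hydra_notify scrape job, expressed with the
  # standard NixOS module option. "Disabling" the job means deleting this
  # entry (or the module that generates it) and redeploying pluto.
  services.prometheus.scrapeConfigs = [
    {
      job_name = "hydra_notify";
      static_configs = [
        { targets = [ "hydra.nixos.org:9199" ]; }
      ];
    }
  ];
}
```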

jfly self-assigned this Feb 1, 2025