This repository has been archived by the owner on Feb 16, 2021. It is now read-only.
Add Service Monitoring #103
Comments
Requested via MOC in bug https://bugzilla.mozilla.org/show_bug.cgi?id=1390296 |
You may find more value in “max queue age” than raw count, since max age
should be fixed at some small value thanks to autoscale.
…On Mon, Aug 14, 2017 at 13:54 Jonathan Claudius ***@***.***> wrote:
Requested via MOC in bug
https://bugzilla.mozilla.org/show_bug.cgi?id=1390296
|
@floatingatoll good point, I'll need to add a reporting attribute to the stats to make this visible. I like it a lot because it doesn't require the monitoring endpoint to maintain state between checks. The check would simply alert if "max queue age" gets past X. |
The QUEUED_MAX_AGE attribute has been deployed to production and can be seen at https://sshscan.rubidus.com/api/v1/stats. The acceptable tolerance requested of MOC is between 0 and 30 seconds. Anything outside that range is either an infrastructure issue or an abuse scenario, both of which fundamentally affect the user experience. |
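A stateless check along these lines could be sketched as follows. This is only an illustration, not the check MOC actually runs: it assumes the stats endpoint returns a JSON object with a top-level numeric `QUEUED_MAX_AGE` value in seconds, and the names `evaluate_queue_age` and `fetch_stats` are hypothetical.

```python
import json
from urllib.request import urlopen

STATS_URL = "https://sshscan.rubidus.com/api/v1/stats"
MAX_AGE_SECONDS = 30  # upper bound of the 0-30 second tolerance from this thread


def evaluate_queue_age(stats, max_age=MAX_AGE_SECONDS):
    """Return (ok, message) for an already-parsed stats document.

    Assumption: QUEUED_MAX_AGE is a top-level number expressed in seconds.
    """
    age = stats.get("QUEUED_MAX_AGE")
    if isinstance(age, bool) or not isinstance(age, (int, float)):
        return False, "QUEUED_MAX_AGE missing or not numeric"
    if 0 <= age <= max_age:
        return True, f"QUEUED_MAX_AGE={age}s is within 0-{max_age}s"
    return False, f"QUEUED_MAX_AGE={age}s is outside 0-{max_age}s"


def fetch_stats(url=STATS_URL, timeout=10):
    """Download and parse the stats document (performs a network request)."""
    with urlopen(url, timeout=timeout) as resp:
        return json.load(resp)
```

A monitor would call `ok, message = evaluate_queue_age(fetch_stats())` on each run and alert when `ok` is false. No value has to be remembered between runs, which is the property noted in the previous comment.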
Usually, April is the first person to hear about Mozilla SSH Observatory issues because she works on Observatory far more than I do. However, these issues generally boil down to one of two areas, so I should add monitoring for both and be the first person to know.
1.) Alert me when the site is not responding (this is usually nginx restarting and failing, or a failed Let's Encrypt renewal)
2.) Alert me when the queues are non-zero and not changing (this is usually an indication that something is broken or of site abuse)
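The two alerts above could be sketched roughly as follows. This is a minimal illustration under assumptions, not an existing implementation: `site_is_up` and `queue_is_stalled` are hypothetical names, the queue counts are passed in as plain integers because the issue does not name the stats field that holds them, and the `opener` parameter exists only so the check can be exercised without a network.

```python
from urllib.error import URLError
from urllib.request import urlopen


def site_is_up(url, timeout=10, opener=urlopen):
    """Alert 1: True only if the site answers with a 2xx status in time."""
    try:
        with opener(url, timeout=timeout) as resp:
            return 200 <= resp.status < 300
    except (URLError, OSError, ValueError):
        # Covers refused connections, TLS failures (for example an expired
        # certificate), HTTP error statuses, timeouts, and malformed URLs.
        return False


def queue_is_stalled(previous_count, current_count):
    """Alert 2: True if the queue is non-zero and unchanged since the last check.

    The caller must store previous_count between runs, so this check is
    stateful. Equal counts can also occur on a busy but healthy queue.
    """
    return current_count > 0 and current_count == previous_count
```

For example, `queue_is_stalled(4, 4)` reports a stall, while `queue_is_stalled(0, 0)` and `queue_is_stalled(4, 2)` do not.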