Hangfire.ProRedis parse error seems to cause infinite retry #860
Oh dear, this can happen, when
Duplicates #790.
That makes sense. I noticed that the Redis instance did fill up; there were "workers" and hundreds of thousands of jobs queued up, but no jobs were being processed. Thank you Sergey.
I am also encountering this issue, with the same stack trace.
Guys, please tell me what Hangfire packages you are using and their versions. This behavior could also be caused by a bug related to expiring jobs in a batch continuation, fixed in Hangfire.Pro 1.4.4. The problem is that a background job is expired or removed prior to the correct time. I've added a workaround for this to the upcoming Hangfire.Pro.Redis 2.1.0, but we need to know the exact problem.
Hangfire: 1.6.8
I just checked my production server, I have 300k jobs backed up and they will not dequeue - is there a change that I can make to get these to dequeue? |
Interestingly if I restart the server then it will dequeue 10-15 jobs and execute them, then it will no longer dequeue any more. |
I manually connected to the Redis server and removed the jobs that were showing as having a problem, and it appears that my processing has resumed (although it will take several hours to catch back up). I think any fix needs to also handle the exception and allow the workers to keep running, perhaps transitioning the job into some "fault" state that can be resumed later. At the moment it has blocked over 350k jobs and has caused downstream impact on my users.
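For jobs that fail repeatedly for application-level reasons (as opposed to the storage-level parse error in this issue), Hangfire's built-in `AutomaticRetry` filter can already produce the "fault state" behavior described above: cap the retries and move the job to the Failed state instead of rescheduling it indefinitely. A minimal sketch, assuming Hangfire 1.6.x; the `NotificationJobs`/`NotifyUser` names are hypothetical and only for illustration:

```csharp
using Hangfire;

// After 3 failed attempts, transition the job to the Failed state
// instead of retrying forever; failed jobs remain inspectable and
// re-queueable from the dashboard, and workers move on to other jobs.
[AutomaticRetry(Attempts = 3, OnAttemptsExceeded = AttemptsExceededAction.Fail)]
public class NotificationJobs
{
    public void NotifyUser(int userId)
    {
        // ... job body ...
    }
}
```

The same policy can be applied globally rather than per class, e.g. `GlobalJobFilters.Filters.Add(new AutomaticRetryAttribute { Attempts = 3, OnAttemptsExceeded = AttemptsExceededAction.Fail });` at startup. Note this does not help when the job payload itself cannot be deserialized, which is what the 2.1.0 release below addresses.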
I've just released Hangfire.Pro.Redis 2.1.0. Missing type and method information will not lead to an exception anymore, and the job will be moved to the Failed state. I've also increased the initial expiration timeouts to prevent job expiration during clock changes. So the consequences of such a problem will no longer cause the whole processing to stop. I'll also add a protection layer to the Hangfire.Core package. Thank you for reporting such a non-trivial problem!
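For anyone else hitting this, upgrading is a standard NuGet package update; a sketch from the Visual Studio Package Manager Console, assuming your Hangfire.Pro feed and credentials are already configured:

```
Update-Package Hangfire.Pro.Redis -Version 2.1.0
```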
Sorry for the delayed response. Our versions are: Looks like we have a little updating to do. @odinserj I'll see about upgrading all of our Hangfire packages on Wed. Thank you so much for your work on this. -Mark
I've deployed the new Redis library into our test environment, it appears to skip the failed job for me. |
We have been running the new Redis library in production for three weeks now; the only issue seems to be that the "failed" jobs tab on the dashboard now returns a 404, but I will raise a separate issue for that.
Same here. So far so good. |
Silently, on one of our integration servers, we started getting the below error. It was not easy to figure out what was causing this, so we had to wipe the Redis DB.
Unfortunately it caused thousands of errors to recur, and since we use the Raygun error logging service, it bumped our plan up to the next tier. :(
What sort of things could possibly cause this?