
Adding settings for beanstalk fetch retries #197

Open · wants to merge 4 commits into master
Conversation

@99kennetn (Contributor) commented Dec 11, 2024

This PR implements some new options for retrying fetches after server exceptions.

These changes are needed because the server setup I work on at my day job runs a lot of tubes which are sometimes inactive, so a tube may no longer exist. Connections will sometimes fail for just a very brief moment, so adding some simple retries will not be noticeable when using the console but will help a lot.

This is the third PR addressing issues I see on a daily basis in our setup when working with this console.

The new settings default to the existing behaviour.

This PR adds three things:

  1. New config options and code to make retrying data fetches possible, with settings for how often and how many times to retry
  2. Storing statsTube data in the session, mostly for reuse when auto-refreshing, since statsTube can sometimes be missing because of server exceptions
  3. Some minor bug fixes and optimizations:
    3.1 Checking $this->_globalVar['action'] instead of $_GET['action']; these two should be the same, and if I remember correctly $_GET data is removed on first retrieval, so it might not exist here
    3.2 $tubeStats['pause-time-left'] would sometimes be missing when auto-refreshing
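A minimal sketch of what the retry behaviour in point 1 could look like. The helper name `retryCall` and the default values are my illustration, not the PR's actual option names or code:

```php
<?php
// Hypothetical sketch: retry a fetch a configurable number of times,
// sleeping briefly between attempts, before giving up.
function retryCall(callable $fn, int $retries = 3, int $delayMs = 100)
{
    $attempt = 0;
    while (true) {
        try {
            return $fn();            // e.g. a statsTube() call against beanstalkd
        } catch (Exception $e) {
            if (++$attempt > $retries) {
                throw $e;            // exhausted the configured retries
            }
            usleep($delayMs * 1000); // brief pause before the next attempt
        }
    }
}

// Usage: a fetch that fails twice with a transient error, then succeeds.
$calls = 0;
$result = retryCall(function () use (&$calls) {
    if (++$calls < 3) {
        throw new Exception('transient server exception');
    }
    return 'stats';
});
// $result is 'stats' after 3 calls in total.
```

With the defaults matching "no retries", existing installs would behave exactly as before, which is what the description above promises.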

This is my first set of contribution PRs, so please say if I am missing anything or have forgotten to add something :)

Example of the missing pause-time-left keys and missing statsTube data. These errors occur per failing tube, which in our case is almost always every tube (around 40):
[screenshot]
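The session caching in point 2 could be sketched roughly as follows. This is illustrative only: the session key and function names are assumptions, not the PR's code.

```php
<?php
// Hypothetical sketch of point 2: keep the last good statsTube result in the
// session and fall back to it when the live call throws a server exception.
function statsTubeWithFallback(string $tube, callable $fetch): array
{
    $key = 'statsTube_' . $tube;
    try {
        $stats = $fetch($tube);      // live statsTube call
        $_SESSION[$key] = $stats;    // remember the last good result
        return $stats;
    } catch (Exception $e) {
        if (isset($_SESSION[$key])) {
            return $_SESSION[$key];  // reuse cached stats during auto-refresh
        }
        throw $e;                    // nothing cached yet, propagate the error
    }
}

// Usage (assumes a session is active; faked here for illustration):
$_SESSION = [];
$live = statsTubeWithFallback('emails', function ($tube) {
    return ['pause-time-left' => 0];
});
$cached = statsTubeWithFallback('emails', function ($tube) {
    throw new Exception('server exception'); // simulated outage
});
// $cached equals the previously fetched $live stats.
```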

99kennetn and others added 4 commits December 11, 2024 20:38
@pentium10 (Collaborator)

What version of beanstalkd are you running, given that pause-time-left is missing?

@99kennetn (Contributor, Author) commented Dec 12, 2024

As far as I can see we are using version 1.12; it was deployed by our server hosting partner, so I reached out to them for confirmation. (This has since been confirmed.)

Either way, the weird thing is not that pause-time-left is missing from the normal data; it just randomly goes missing sometimes when using auto-refresh.

The only explanation I have, which is not how I would expect the code to behave, is that when tubeStats is missing an error is thrown but the code continues on, so the data is then missing by the time it reaches the pause-time-left handling. I would not expect the code to continue when it fails to get tubeStats; I would assume the whole file would fail to execute further.

If it indeed continues, I guess it is the use of <?php ?> scoping that I am not used to working with. Normally I work in the Laravel framework, not in raw PHP like this.

Furthermore, if this is the case, it relates back to the initial reason for my PRs, where a lot of errors would be thrown in the frontend at random, generally a couple of times per minute.

We only experience these apparent connection problems when using beanstalk_console; we don't experience them in our actual data handling against beanstalkd.

@pentium10 (Collaborator)

You are the first in 7 years to have this kind of issue, and I personally run beanstalk console every day with this feature on and have never experienced what you describe. I wonder whether some other factor contributes. We have had previous issues with customers hitting Linux limits, such as the open files limit, file handle limits, conntrack limits, or TIME_WAIT issues. Can you check with your sysadmins what they say about these?

@99kennetn (Contributor, Author)

I actually did not think about this, thanks for the pointers.
It sounds plausible, although I have been thinking it could be related to the amount of data that is sometimes in jobs, as we pass some fairly large datasets to the jobs.

It might take a couple of days to check, and holiday vacation is coming up, so I may only get an answer in the new year. But I will get back to you :)

@pentium10 (Collaborator)

Can you name the amount of data?
I personally tested this assumption of yours these days on a project with 4 million items in beanstalkd, and not a single error occurred.

@99kennetn (Contributor, Author) commented Dec 17, 2024

It is HTML email bodies constantly going through the system.
A random body I just checked had around 55,400 characters when shown in the tube view.

I think it is a combination of multiple things, however, and I have not been able to pinpoint it fully yet.
Right now my best assumption is that it has something to do with the amount of data combined with the number of available tubes constantly changing.

beanstalk_console is on a separate server from the beanstalkd setups, and everything is called over a local network IP.
Which is also why it is a bit weird that the connection seems to fail, but only with this project.

I do find it weird how consistent our problem is given that it is not reproducible for you. It seems to happen when there are a lot of jobs going through the system.
I will try to create a reproducible scenario separate from our production setup, but it might be hard if it needs a huge amount of data, multiple processes, and tubes shifting between being watched and not watched.

@pentium10 (Collaborator) commented Dec 17, 2024

We've been there with an email system; each day we send 100M emails.
I think you have a network congestion issue elsewhere as well.
We could not ramp up our emails-per-minute rate because we were saturating the bandwidth between the machines.
We moved the email body into Memcache and kept only its hash in the tube itself. Throughput was amazing.

Also, beanstalkd works best when the agents are on the same machines as the server. It can handle incredible throughput.
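The hash-in-tube pattern described above can be sketched like this. Illustrative only: a plain array stands in for the Memcached client, and the payload shape is an assumption.

```php
<?php
// Sketch: store the large email body out of band, keyed by its hash, and
// put only the small hash payload into the beanstalkd tube.
$body = str_repeat('<p>large email body</p>', 1000);
$hash = sha1($body);

$cache = [];                // stand-in for a Memcached client
$cache[$hash] = $body;      // producer: store the payload in the cache

// Producer: the job in the tube carries only the hash (a tiny payload).
$jobPayload = json_encode(['body_hash' => $hash]);

// Worker: decode the job, then look the body up in the cache.
$job = json_decode($jobPayload, true);
$fetched = $cache[$job['body_hash']];
// strlen($jobPayload) is tiny compared to strlen($body).
```

Keeping jobs small this way also reduces the chance that a slow, large read from the tube view is what is tripping the console's connection.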
