Making scaledown of match / execute servers more gradual and slower #729

Open
wants to merge 1 commit into
base: main

Contributor

n8kim1 commented Jan 20, 2024

To alleviate #605

This would slightly increase the cost of running year-round, especially when someone runs a single match at some random point in the middle of the year. But the increase shouldn't be much anyway.

Happy to tweak params, or just not do this at all. I mainly made this PR just to close my tabs xp

https://cloud.google.com/compute/docs/autoscaler#scale-in_controls
https://cloud.google.com/compute/docs/autoscaler/understanding-autoscaler-decisions#delays_in_scaling_in
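For reference, here is a minimal sketch of how a scale-in control could slow down a scale-in, under a simplified model: the group may not drop more than `max_reduction` instances below the peak recommendation seen in a trailing window. The function name, parameter names, and tick-based time unit are hypothetical, and the real autoscaler has additional behavior (such as its stabilization period) that is not modeled here.

```python
from collections import deque


def scale_in_limited_sizes(recommended, window, max_reduction):
    """Apply a simplified scale-in control to a series of recommended sizes.

    recommended: per-tick sizes an uncontrolled autoscaler would pick.
    window: trailing window length in ticks (hypothetical unit).
    max_reduction: most instances that may be removed relative to the
        peak recommendation seen inside the trailing window.
    """
    recent = deque(maxlen=window)  # recommendations inside the trailing window
    sizes = []
    for rec in recommended:
        recent.append(rec)
        floor = max(recent) - max_reduction  # lowest size allowed at this tick
        sizes.append(max(rec, floor))        # never scale in below the floor
    return sizes


if __name__ == "__main__":
    # Load disappears at tick 2; the group is held near its peak for a while.
    print(scale_in_limited_sizes([10, 10, 0, 0, 0], window=3, max_reduction=2))
```

In this model, a longer window or a smaller `max_reduction` keeps more servers alive after a burst, which is the cost trade-off mentioned above.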

Member

j-mao commented Jan 20, 2024

Noting that #720 already alleviates this a lot by reducing the number of machines. If we can come up with a test plan to evaluate this before and after, we can deploy it and see which works better.

Noting also that scaling is already delayed by 1-2 minutes because pub/sub metrics take time to propagate. So by the time scaling in happens, the scrim queue is actually far below the assignment ratio.
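To illustrate the propagation delay, here is a rough sketch that assumes the autoscaler recommends `ceil(backlog / assignment_ratio)` instances and only sees the backlog from `lag_minutes` ago. Both the formula and all names here are assumptions for illustration, not taken from this repository's configuration.

```python
import math


def recommended_size(backlog, assignment_ratio):
    """Instances needed so each handles at most assignment_ratio queued items."""
    return math.ceil(backlog / assignment_ratio)


def lagged_recommendations(backlog_by_minute, assignment_ratio, lag_minutes):
    """Recommendations when the autoscaler sees backlog from lag_minutes ago."""
    recs = []
    for t in range(len(backlog_by_minute)):
        # The metric visible at minute t is the (stale) value from earlier.
        seen = backlog_by_minute[max(0, t - lag_minutes)]
        recs.append(recommended_size(seen, assignment_ratio))
    return recs


if __name__ == "__main__":
    backlog = [40, 30, 20, 10, 0]  # hypothetical queue draining over 5 minutes
    print(lagged_recommendations(backlog, assignment_ratio=10, lag_minutes=0))
    print(lagged_recommendations(backlog, assignment_ratio=10, lag_minutes=2))
```

With a two-minute lag, the recommendation in this example stays at 2 instances even after the queue is empty, so scale-in trails the real queue length.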

Contributor Author

n8kim1 commented Jan 20, 2024

About #720 -- yes, good catch (I had thought about that but forgot to mention it). For a plan: what if I watch the queue during the next tournament as-is, so we can evaluate how often scrimmage servers are interrupted (i.e., how much of the bad behavior is still present)?
I can also deploy this version and watch a tournament too.

About the delay -- good to know, thanks. That should help a ton for scaling in, and I can make scaling out a bit faster. Unfortunately, the baked-in 1-2 minute delay probably isn't long enough on its own, though.
