KWOK services are crashing on a host with a large number of running KWOK clusters. #1034
RomanBudnyk asked this question in Q&A
-
From your logs, it looks like you created the clusters using the binary runtime, and the machine went dormant, causing kube-controller and kube-scheduler to quit. At the moment there is no keep-alive mechanism for the binary runtime, so you need to restart using |
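For readers unfamiliar with the term, a keep-alive mechanism is a supervisor that restarts a process when it exits unexpectedly. The following is a minimal, generic sketch of that idea in Python. It is not part of KWOK, and the `keep_alive` function, its parameters, and the stand-in failing command are all hypothetical names chosen for illustration.

```python
import subprocess
import sys
import time


def keep_alive(cmd, max_restarts=3, delay_seconds=0.1):
    """Run cmd and restart it each time it exits with a non-zero code.

    Hypothetical helper for illustration only. Returns the list of
    exit codes observed, one per run.
    """
    exit_codes = []
    for attempt in range(max_restarts + 1):
        if attempt > 0:
            time.sleep(delay_seconds)  # brief pause before each restart
        result = subprocess.run(cmd)
        exit_codes.append(result.returncode)
        if result.returncode == 0:
            break  # clean exit: nothing to restart
    return exit_codes


if __name__ == "__main__":
    # Stand-in for a component that quits unexpectedly (an assumption,
    # not a real KWOK or Kubernetes command).
    failing = [sys.executable, "-c", "import sys; sys.exit(1)"]
    print(keep_alive(failing, max_restarts=2))
```

A real supervisor would typically restart indefinitely with backoff and log each exit; the restart cap here only keeps the sketch bounded.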
-
Hi colleagues,
First of all, thanks for creating and supporting the KWOK tool; it is very helpful!
We are trying to emulate a large number of running clusters (our goal is 10K, but for the near future it is 200).
For this purpose, we chose KWOK and created a host with 32 CPUs, 64 GB of RAM, and 80 GB of storage.
There is no load on the running clusters; we were operating with only one of them (creating pods, etc.).
But after some time we cannot create any pods:
What we tried in order to fix the issue:
The only thing that works for us now is to recreate the KWOK clusters, but this causes a big issue as it corrupts all underlying processes.
Please help with the following:
I am attaching the cluster logs here; please check. To me it looks like the large number of clusters could be the issue, but I cannot be 100% sure.
I will wait for any answers; they will be very helpful!
Regards,
Roman
logs.zip