[Question] Upgrading AKS Cluster to 1.31.1 (Preview) broke the cluster #4607
Comments
Hello,

Have you tried to reconcile the cluster? If not, you can do that with the following command:

```
az aks update -g MyResourceGroup -n MyManagedCluster
```

If that does not resolve the issue, check the Diagnose and solve problems section under the AKS blade in the portal. If the cluster is still down, I would suggest opening a support ticket for this one.

Thanks,
Richard
I did try all of that, but nothing helped. I'll open a support ticket in that case.
Thanks for the quick reply though.
You should be able to add nodes using 1.30.5, which will be compatible with the 1.31 API until this issue is resolved.
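A minimal sketch of this workaround, assuming a resource group `MyResourceGroup` and cluster `MyManagedCluster` (all names here are placeholders; adjust to your environment):

```shell
# Add a new node pool pinned to Kubernetes 1.30.5 while the
# control plane remains on 1.31.1 (placeholder names throughout)
az aks nodepool add \
  --resource-group MyResourceGroup \
  --cluster-name MyManagedCluster \
  --name np1305 \
  --node-count 2 \
  --kubernetes-version 1.30.5
```

Once the new pool's nodes report Ready, workloads can be drained off the broken pools onto it.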
@lareeth - This worked for me. Thanks for the solution. 👍 So this seems to be a bug on Microsoft's side, as AKS is not able to identify node pools with the newer version.
Same here. Upgraded from 1.30.5 to 1.31.1 and got this error in the logs: `IMDS query failed, exit code: 28...`. I created another system node pool with version 1.30.5 and the nodes came up in a ready state.
The same issue here. So embarrassing, time after time, in Azure.
We faced the same issue when we upgraded our development cluster from 1.30.5 to 1.31.1. As described here, I created new pools and they came up in a ready state, so the cluster is back to normal operation, but now many operations end in a failed state. As I understand it, I now have a 1.31.1 control plane and 1.30.5 nodes. Does Azure have any mechanism to roll back the upgrade?
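To confirm the version skew described above (1.31.1 control plane, 1.30.5 node pools), the two versions can be compared from the CLI. A sketch, again with placeholder resource names:

```shell
# Control-plane Kubernetes version (placeholder names)
az aks show -g MyResourceGroup -n MyManagedCluster \
  --query "kubernetesVersion" -o tsv

# Per-node-pool orchestrator versions
az aks nodepool list -g MyResourceGroup --cluster-name MyManagedCluster \
  --query "[].{name:name, version:orchestratorVersion}" -o table
```

Note that Kubernetes does not support downgrading a control plane, so recovery generally means moving workloads to compatible node pools rather than rolling the upgrade back.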
Describe scenario
I had a cluster running 1.30.5. I initiated a cluster upgrade to 1.31.1 (Preview). After this, the cluster moved to a failed state.
I tried to add more node pools; they were added successfully, but the Ready Nodes count is zero.
I also tried to scale the existing node pools and to stop and start them; that resulted in the node pools being visible in the Azure Portal but with zero active nodes.
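To see why the nodes never reach Ready, their status and recent events can be inspected from inside the cluster. A sketch, assuming `kubectl` access and placeholder resource names:

```shell
# Fetch credentials and inspect node state (placeholder names)
az aks get-credentials -g MyResourceGroup -n MyManagedCluster

# List nodes with their kubelet versions and readiness
kubectl get nodes -o wide

# For a NotReady node, check Conditions and recent Events
kubectl describe node <node-name>
```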
Question
How can I recover the cluster and get my services back online? I'd prefer not to re-create the AKS cluster, as it has a lot of important deployments running.