Add statefulset partition controller #633
I could not update MySQL Pods after the following operations.
This means we can't roll back MySQLClusters by Argo CD, etc.
If the pod's phase is not Running (i.e. Pending, Succeeded, or Failed), how about advancing the partition so that the failing pod gets updated?
This behavior may be the same as the PodDisruptionBudget. Please see the following note.
https://kubernetes.io/docs/tasks/run-application/configure-pdb/#unhealthy-pod-eviction-policy
NOTE: Pods in Pending, Succeeded or Failed phase are always considered for eviction.
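A minimal sketch of the check proposed above, assuming the partition controller decides pod by pod whether to hold the partition. `PodPhase` and `blocksPartitionUpdate` are hypothetical names, redeclared locally so the snippet compiles without Kubernetes dependencies; this is not the code in partition_controller.go.

```go
package main

// PodPhase mirrors the string values of corev1.PodPhase. It is redeclared
// here only so that this sketch has no external dependencies.
type PodPhase string

const (
	PodPending   PodPhase = "Pending"
	PodRunning   PodPhase = "Running"
	PodSucceeded PodPhase = "Succeeded"
	PodFailed    PodPhase = "Failed"
	PodUnknown   PodPhase = "Unknown"
)

// blocksPartitionUpdate is a hypothetical helper. It reports whether a pod
// should keep the controller from lowering the partition.
//
// Pods in Pending, Succeeded, or Failed phase never block, following the
// PodDisruptionBudget note quoted above. Any other pod blocks until it is
// ready.
func blocksPartitionUpdate(phase PodPhase, ready bool) bool {
	switch phase {
	case PodPending, PodSucceeded, PodFailed:
		return false
	default:
		return !ready
	}
}
```

With a rule like this, a pod stuck in Pending because of an image pull error would no longer hold back a rollback, while a Running pod that is not ready would still stop the rollout.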
- Create a MySQLCluster
$ kubectl create ns test
$ kubectl apply -f - << EOF
apiVersion: moco.cybozu.com/v1beta2
kind: MySQLCluster
metadata:
  namespace: test
  name: test
spec:
  replicas: 3
  podTemplate:
    spec:
      containers:
      - name: mysqld
        image: ghcr.io/cybozu-go/moco/mysql:8.0.36.2
  volumeClaimTemplates:
  - metadata:
      name: mysql-data
    spec:
      accessModes: [ "ReadWriteOnce" ]
      resources:
        requests:
          storage: 1Gi
EOF
- Wait for the cluster to become Healthy.
- Update the MySQLCluster with a wrong image.
$ kubectl apply -f - << EOF
apiVersion: moco.cybozu.com/v1beta2
kind: MySQLCluster
metadata:
  namespace: test
  name: test
spec:
  replicas: 3
  podTemplate:
    spec:
      containers:
      - name: mysqld
        # invalid image
        image: foo/bar:baz
  volumeClaimTemplates:
  - metadata:
      name: mysql-data
    spec:
      accessModes: [ "ReadWriteOnce" ]
      resources:
        requests:
          storage: 1Gi
EOF
- Wait for the MySQL pod moco-test-2 to enter an error state.
$ kubectl get pod -n test -w
NAME READY STATUS RESTARTS AGE
...
moco-test-2 3/3 Terminating 0 4m7s
...
moco-test-2 0/3 Pending 0 0s
...
moco-test-2 0/3 Init:ErrImagePull 0 17s
moco-test-2 0/3 Init:ImagePullBackOff 0 33s
@YZ775 @masa213f moco/controllers/partition_controller.go Line 134 in 088bac5
And with that I have also added the case for E2E testing. Lines 163 to 241 in 088bac5
Please check and fix this behavior.
I am still reviewing, but I will comment first because this issue is important. :)
- Create a MySQLCluster
$ kubectl create ns test
$ kubectl apply -f - << EOF
apiVersion: moco.cybozu.com/v1beta2
kind: MySQLCluster
metadata:
  namespace: test
  name: test
spec:
  replicas: 3
  podTemplate:
    spec:
      containers:
      - name: mysqld
        image: ghcr.io/cybozu-go/moco/mysql:8.0.36.2
  volumeClaimTemplates:
  - metadata:
      name: mysql-data
    spec:
      accessModes: [ "ReadWriteOnce" ]
      resources:
        requests:
          storage: 1Gi
EOF
- Wait for the cluster to become Healthy.
- Rollout restart the MySQL StatefulSet. (Updating the MySQLCluster instead is also OK.)
$ kubectl rollout restart sts -n test moco-test
At this point, even though the rolling update has not finished, the partition has already become 0.
$ kubectl get sts -n test moco-test -o json | jq .spec.updateStrategy
{
  "rollingUpdate": {
    "partition": 0
  },
  "type": "RollingUpdate"
}
$ kubectl get pod -n test
NAME READY STATUS RESTARTS AGE
moco-test-0 3/3 Running 0 2m32s
moco-test-1 3/3 Running 0 2m32s
moco-test-2 3/3 Terminating 0 2m32s
$ kubectl events -n test
LAST SEEN TYPE REASON OBJECT MESSAGE
...
28s Normal Killing Pod/moco-test-2 Stopping container slow-log
28s Normal Killing Pod/moco-test-2 Stopping container agent
28s Normal Killing Pod/moco-test-2 Stopping container mysqld
28s Normal PartitionUpdate StatefulSet/moco-test Updated partition from 3 to 2
28s Normal PartitionUpdate StatefulSet/moco-test Updated partition from 2 to 1
28s Normal PartitionUpdate StatefulSet/moco-test Updated partition from 1 to 0
6s (x10 over 28s) Normal SuccessfulDelete StatefulSet/moco-test delete Pod moco-test-2 in StatefulSet moco-test successful
6s Normal Scheduled Pod/moco-test-2 Successfully assigned test/moco-test-2 to moco-worker2
6s (x9 over 7s) Normal RecreatingTerminatedPod StatefulSet/moco-test StatefulSet test/moco-test is recreating terminated Pod
If the partition is 0 and a mysql-0 pod is accidentally deleted during a rolling update, the pod is re-created with the new revision.
If the order in which the MySQL Pods are updated is swapped, the data may be corrupted during a MySQL upgrade.
https://github.com/cybozu-go/moco/blob/v0.23.2/docs/upgrading.md?plain=1#L58-L60
@d-kuro
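To illustrate the behavior I expect: a rough sketch in which the partition is lowered by at most one per reconciliation, and only after every pod at or above the current partition is ready on the update revision. `podState` and `nextPartition` are hypothetical names made up for this sketch; they are not taken from this PR.

```go
package main

// podState is a hypothetical, simplified view of a StatefulSet pod.
type podState struct {
	Ordinal  int32  // pod index, e.g. 2 for moco-test-2
	Ready    bool   // whether the pod reports Ready
	Revision string // value of the controller-revision-hash label
}

// nextPartition returns the partition to set next. It steps down by one
// only when all pods with ordinal >= current exist, are ready, and run the
// update revision. Otherwise it returns current unchanged.
func nextPartition(current, replicas int32, updateRevision string, pods []podState) int32 {
	if current <= 0 {
		return 0
	}
	updated := int32(0)
	for _, p := range pods {
		if p.Ordinal < current {
			continue // not yet covered by the partition
		}
		if !p.Ready || p.Revision != updateRevision {
			return current // an updated pod is still rolling out
		}
		updated++
	}
	// A pod that is being re-created may be missing from the list.
	if updated != replicas-current {
		return current
	}
	return current - 1
}
```

Under this rule, the events above would show only "Updated partition from 3 to 2" while moco-test-2 is still Terminating. The next step would wait until moco-test-2 is ready again.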
When I updated volumeClaimTemplates, the PartitionReconciler did not work.
I guess we need to set the partition when a StatefulSet is created.
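As a rough idea of what I mean, assuming the partition should start at the replica count so that no pod is updated until the controller lowers it: the types below are simplified local stand-ins for the `apps/v1` update strategy fields, and `initialUpdateStrategy` is a hypothetical helper, not something that exists in this PR.

```go
package main

// rollingUpdate and updateStrategy are simplified stand-ins for the
// apps/v1 StatefulSet update strategy types, declared locally so that the
// sketch has no external dependencies.
type rollingUpdate struct {
	Partition *int32
}

type updateStrategy struct {
	Type          string
	RollingUpdate *rollingUpdate
}

// initialUpdateStrategy is a hypothetical helper that builds the strategy
// for a newly created StatefulSet. Setting the partition to the replica
// count means that no ordinal is >= partition, so a later template change
// updates no pod until the partition is lowered.
func initialUpdateStrategy(replicas int32) updateStrategy {
	partition := replicas
	return updateStrategy{
		Type:          "RollingUpdate",
		RollingUpdate: &rollingUpdate{Partition: &partition},
	}
}
```

A StatefulSet that is re-created for a volumeClaimTemplates change would then start in the same state as one that is merely updated.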
- Create a MySQLCluster
$ kubectl create ns test
$ kubectl apply -f - << EOF
apiVersion: moco.cybozu.com/v1beta2
kind: MySQLCluster
metadata:
  namespace: test
  name: test
spec:
  replicas: 3
  podTemplate:
    spec:
      containers:
      - name: mysqld
        image: ghcr.io/cybozu-go/moco/mysql:8.0.36.2
  volumeClaimTemplates:
  - metadata:
      name: mysql-data
    spec:
      accessModes: [ "ReadWriteOnce" ]
      resources:
        requests:
          storage: 1Gi
EOF
# Wait for the cluster to become healthy.
- Watch the following in other terminals.
# other terminals
$ watch kubectl get mysqlcluster,pod -n test --show-labels
$ watch "kubectl get sts -n test moco-test -o json | jq .metadata.generation,.spec.updateStrategy,.status"
- Create a DB.
$ kubectl moco mysql -n test test -u moco-writable -- -e "CREATE DATABASE test;"
- Crash the MySQL pod 0: keep killing mysqld.
# another terminal
$ watch kubectl exec -n test moco-test-0 -c mysqld -- kill 1
- Update the podTemplate and volumeClaimTemplates.
$ kubectl apply -f - << EOF
apiVersion: moco.cybozu.com/v1beta2
kind: MySQLCluster
metadata:
  namespace: test
  name: test
spec:
  replicas: 3
  podTemplate:
    metadata:
      labels:
        hoge: piyo # add
    spec:
      containers:
      - name: mysqld
        image: ghcr.io/cybozu-go/moco/mysql:8.0.36.2
  volumeClaimTemplates:
  - metadata:
      name: mysql-data
      labels:
        foo: bar # add
    spec:
      accessModes: [ "ReadWriteOnce" ]
      resources:
        requests:
          storage: 1Gi
EOF
@@ -0,0 +1,290 @@
package e2e
Could you add tests that verify a rolling update does not start while a MySQL Pod is crash looping?
I think we can implement this test with the following steps.
- Create a MySQLCluster.
- Wait for the MySQLCluster to become Healthy.
- Create a DB.
  - e.g. kubectl moco mysql <MySQLCluster> -u moco-writable -- -e "CREATE DATABASE test;"
- Continue to kill mysql-0 or mysql-1 in a goroutine.
  - e.g. Continue to exec kubectl exec moco-<MySQLCluster>-0 -c mysqld -- kill 1
- Update MySQLCluster.
Then, check that the MySQL pods do not start restarting.
Also, I want 2 test cases: one where the StatefulSet is simply updated and another where the StatefulSet is re-created.
Signed-off-by: d-kuro <[email protected]>
LGTM. Thank you.
refs: #628
See documentation for details:
https://github.com/cybozu-go/moco/blob/d-kuro/partition/docs/rolling-update-strategy.md