-
-
Notifications
You must be signed in to change notification settings - Fork 232
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix: add detached script to join cluster #363
Conversation
@mgagliardo91 looks great, thanks! |
bin/vernemq.sh
Outdated
@@ -339,4 +339,5 @@ trap 'sigterm_handler' SIGTERM | |||
# Start VerneMQ | |||
/vernemq/bin/vernemq console -noshell -noinput $@ & | |||
pid=$! | |||
/vernemq/bin/join_cluster > /var/log/vernemq/log/join_cluster.log & |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just asking to be sure. Let's say we only want to start a single node in Docker (no Kubernetes). We'll run the join_cluster
script so we have to do the right thing for any possible situation.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yea, currently the join_cluster
script will start, see that they are not in an environment that it would require joining for, and shutdown. But I can also make the join_cluster
script execute conditionally? Thoughts?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@ioolkos I went ahead and updated it to conditionally start based on it needing to join a cluster.
Dockerfile
Outdated
@@ -14,6 +14,7 @@ ENV DOCKER_VERNEMQ_KUBERNETES_LABEL_SELECTOR="app=vernemq" \ | |||
PATH="/vernemq/bin:$PATH" \ | |||
VERNEMQ_VERSION="1.13.0" | |||
COPY --chown=10000:10000 bin/vernemq.sh /usr/sbin/start_vernemq | |||
COPY --chown=10000:10000 bin/join_cluster.sh bin/join_cluster |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's copy the script into /usr/sbin
alongside the starte_vernemq
script
bin/join_cluster.sh
Outdated
# Let's set our nodename correctly | ||
# https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.19/#list-pod-v1-core | ||
podList=$(k8sCurlGet "api/v1/namespaces/${DOCKER_VERNEMQ_KUBERNETES_NAMESPACE}/pods?labelSelector=${DOCKER_VERNEMQ_KUBERNETES_LABEL_SELECTOR}") | ||
kube_pod_names=$(echo ${podList} | jq '.items[].spec.hostname' | sed 's/"//g' | tr '\n' ' ') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@mgagliardo91 I think we need to add a similar fix to the kube_pod_names=
lines as added here: https://github.com/vernemq/docker-vernemq/pull/361/files
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good to me.
Since this script is super important for all things Docker/K8s, we'll need to give this a bit of post-surgery attention after the merge :)
Resolves an issue I was running into related to #304 in which the joining of clusters does not resolve in Kubernetes. After some discovery, it appears that misfire of joining the cluster is likely related to a race condition of the other replicas not being ready to accept the join request. Given that the join request only occurs once, if it misses, it ends up creating a standalone cluster. With a replica set of 3, I was seeing all three sit in their own cluster.
To resolve this, I threw together a script that attempts to join the cluster on an interval after the pod starts up, using the Kube API to recognize that their is other pods to cluster with. The script will attempt to join and check to see if it has successfully joined via the
vmq-admin
CLI. Once properly joined, it will exit properly. The log of attempts is logged to the/var/log/vernemq/log
directory.