Harden restate-server::raft_metadata_cluster raft_metadata_cluster_chaos_test #2828
Comments
It seems that in both failing cases, we were restarting a killed node with a completely different configuration: https://github.com/restatedev/restate/actions/runs/13721166634/job/38376765214#step:12:2510
It looks as if the node was started with an empty configuration. Maybe the file hasn't been fully written before the process starts?
Dropping the file alone does not guarantee that its content has been written to disk and that the file is closed immediately. If the config file has not been fully written when the Restate process starts, the process starts with the default configuration, which causes the raft_metadata_cluster_chaos_test to fail every now and then. This fixes restatedev#2828.
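The idea of flushing explicitly rather than relying on drop can be sketched as follows. This is a minimal illustration using only the Rust standard library, not the actual change made in the repository; the function name `write_config_durably` is hypothetical, and the real test harness may use an async file API, where an explicit flush before dropping matters even more.

```rust
use std::fs::File;
use std::io::{self, Write};
use std::path::Path;

/// Hypothetical helper: writes `contents` to `path` and returns only after
/// the data has been flushed and synced, so that a process started
/// afterwards reads the complete file instead of an empty one.
fn write_config_durably(path: &Path, contents: &str) -> io::Result<()> {
    let mut file = File::create(path)?;
    file.write_all(contents.as_bytes())?;
    // Push any buffered bytes to the operating system.
    file.flush()?;
    // Ask the OS to persist data and metadata; unlike an implicit drop,
    // this surfaces any error to the caller.
    file.sync_all()?;
    Ok(())
}
```

With a helper like this, the test would start the node only after the call has returned `Ok(())`, instead of assuming that letting the file handle go out of scope is enough.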
The raft_metadata_cluster_chaos_test seems to be unstable. It looks as if the cluster sometimes does not start up fast enough for the initial health checks to pass. Maybe we should give it a bit more time. https://github.com/restatedev/restate/actions/runs/13657796133/job/38181318353?pr=2825#step:12:2853
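Giving the cluster more time could look roughly like the sketch below: poll the health check until it passes or a deadline expires. The function `wait_until_healthy` and its parameters are assumptions for illustration and are not taken from the Restate test code.

```rust
use std::thread::sleep;
use std::time::{Duration, Instant};

/// Hypothetical helper: repeatedly runs `check` until it returns true or
/// `timeout` has elapsed. Returns whether the check eventually passed.
fn wait_until_healthy<F>(mut check: F, timeout: Duration, interval: Duration) -> bool
where
    F: FnMut() -> bool,
{
    let deadline = Instant::now() + timeout;
    loop {
        if check() {
            return true;
        }
        if Instant::now() >= deadline {
            return false;
        }
        // Wait before the next attempt to avoid busy-polling the node.
        sleep(interval);
    }
}
```

Raising `timeout` in such a loop lengthens only the failing runs; a healthy cluster still returns as soon as the first check succeeds.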