Step down if local_put fails in leader worker #27

Merged
merged 1 commit into from
Jun 24, 2014

andrewjstone
Contributor

If we fail to write locally as a leader, we cannot get a valid quorum, since every valid quorum must include the leader. Therefore we need to step down to avoid committing unsafely.

In riak_ensemble_peer:put_obj/4 when local_put/4 fails, step down by
sending a request_failed message to the leader.
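
To illustrate the idea, here is a minimal toy model in Python. It is not the riak_ensemble code (which is Erlang): the `Peer` and `Leader` classes, the `stepped_down` state name, and the simple majority count are all assumptions made for the sketch. Only the names `put_obj`, `local_put`, and `request_failed` come from the description above.

```python
from dataclasses import dataclass, field
from typing import Dict, List


class LocalPutError(Exception):
    """Raised when a peer cannot persist a value to its own store."""


@dataclass
class Peer:
    # Hypothetical stand-in for a peer's local backend.
    name: str
    fail_writes: bool = False
    store: Dict[str, str] = field(default_factory=dict)

    def local_put(self, key: str, value: str) -> None:
        if self.fail_writes:
            raise LocalPutError(f"{self.name}: local write failed")
        self.store[key] = value


@dataclass
class Leader:
    # Hypothetical leader wrapper; state names are illustrative only.
    me: Peer
    followers: List[Peer]
    state: str = "leading"

    def request_failed(self) -> None:
        # In this sketch, a failed request simply makes the leader step down.
        self.state = "stepped_down"

    def put_obj(self, key: str, value: str) -> bool:
        if self.state != "leading":
            return False
        try:
            self.me.local_put(key, value)
        except LocalPutError:
            # A valid quorum must include the leader, so without the local
            # write there is nothing safe to commit: step down instead of
            # asking the followers.
            self.request_failed()
            return False
        acks = 1  # the leader's own successful write
        for follower in self.followers:
            try:
                follower.local_put(key, value)
                acks += 1
            except LocalPutError:
                pass
        total = 1 + len(self.followers)
        return acks > total // 2
```

With a healthy leader the put commits and the leader keeps leading. With `fail_writes=True` on the leader's own peer, `put_obj` returns `False`, the leader ends up in `stepped_down`, and no follower is written to.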

@andrewjstone
Contributor Author

Fixes #2

@jtuple
Contributor

jtuple commented Jun 16, 2014

/cc basho/riak#536

@jtuple jtuple assigned andrewjstone and jtuple and unassigned andrewjstone and jtuple Jun 16, 2014
@lordnull

Had to rebase to get ee to compile. ensemble develop introduced a type that does not exist in this branch.

riak_test run:

./riak_test -c ee -t ensemble_basic -t ensemble_basic2 -t ensemble_basic3 -t ensemble_basic4 -t ensemble_interleave -t ensemble_remove_node -t ensemble_remove_node2 -t ensemble_start_without_aae -t ensemble_sync -t ensemble_util -t ensemble_vnode_crash

On the first run, sync, remove_node2, and remove_node all failed. Each of them passed when run individually, and a second run of all the ensemble tests passed.

Code is nicely contained.

I'll run the ensemble suite a couple more times to see if the failure can be replicated.

@lordnull

👍 02ee10c

At least one test will time out when running the full suite, but each test passes individually, so I don't think the changes in this PR are responsible for that. Since that was the only concern, this is ready for merging.

@andrewjstone
Contributor Author

@borshop merge

In riak_ensemble_peer:put_obj/4 when local_put/4 fails, step down by
sending a 'request_failed' message to the leader.
@andrewjstone
Contributor Author

👍 8f6efad

borshop added a commit that referenced this pull request Jun 24, 2014
Step down if local_put fails in leader worker

Reviewed-by: andrewjstone
@andrewjstone
Contributor Author

@borshop merge

@borshop borshop merged commit 8f6efad into develop Jun 24, 2014
@seancribbs seancribbs deleted the bugfix/step-down-on-local-failure branch April 1, 2015 23:15
4 participants