When leader fails local get, we should step down #30

Closed
jtuple opened this issue Jun 16, 2014 · 3 comments

@jtuple
Contributor

jtuple commented Jun 16, 2014

As mentioned here and here, when the leader fails a local get, it should step down.

Uncommenting the lines that send request_failed messages should be enough, but we should double-check and test. Is there any reason we commented out those lines in the first place?

/cc basho/riak#536

@jtuple
Contributor Author

jtuple commented Jun 18, 2014

I just remembered why we don't currently step down: the backend ping logic. For puts, stepping down is necessary so the user can resolve partial writes, but for gets it's not strictly required. The open question for gets is whether stepping down helps responsiveness.

Here's the general issue.

A get timing out likely means the backend is slow to respond -- perhaps this is a K/V backend and the vnode is slow due to Bitcask merging or some such. Since we always try local gets first, a slow leader will always cause requests to time out, even if other peers aren't slow. So, stepping down seems to make sense. If we time out, let's step down and perhaps a different leader will be elected (no guarantee, since election is random, but in practice this will hold).

However, we already have the backend ping mechanism to handle this. The backend ping logic pings the backend and causes the leader to step down if the backend does not respond within alive leader ticks (alive is user-configurable and defaults to 2).
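
To make the threshold behavior concrete, here is a minimal sketch in Python. It is not riak_ensemble's actual code: the class name, the method name, and the choice to reset the counter on any successful ping are assumptions made for illustration. Only the alive threshold and its default of 2 come from the description above.

```python
class BackendPingMonitor:
    """Hypothetical model of the backend ping check (illustration only)."""

    def __init__(self, alive=2):
        # Number of consecutive leader ticks the backend may miss
        # before the leader steps down (assumed semantics).
        self.alive = alive
        self.missed = 0

    def tick(self, backend_responded):
        """Process one leader tick; return True if the leader should step down."""
        if backend_responded:
            # A response clears the count, so a brief stall is forgiven.
            self.missed = 0
            return False
        self.missed += 1
        return self.missed >= self.alive
```

Feeding this monitor the tick sequence responded, missed, responded, missed, missed yields a step-down only on the last tick, since that is the first time two misses occur in a row.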

This means that a momentarily unresponsive backend that recovers won't trigger a step down, but a very sad backend will. Isn't this a better option than always stepping down on the first timed-out operation? Stepping down makes the ensemble briefly unavailable and starts a new epoch. A new epoch is expensive because it forces all operations to rewrite the key on first reference. Avoiding the leader change unless the backend is really sad seems preferable.
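
The trade-off can be illustrated with a rough sketch that counts step-downs (and therefore new epochs) under each policy. Everything here is hypothetical: the function name, the health trace, and the simplifying assumptions that each failed tick corresponds to one timed-out operation and that the counter restarts after a step-down.

```python
def count_step_downs(trace, threshold):
    """Count step-downs for a trace of per-tick backend health.

    trace: iterable of booleans, True when the backend responded in that tick.
    threshold: consecutive failures tolerated before stepping down
               (1 models "step down on the first timeout").
    """
    step_downs = 0
    missed = 0
    for responded in trace:
        if responded:
            missed = 0
            continue
        missed += 1
        if missed >= threshold:
            step_downs += 1
            missed = 0  # assumption: the counter restarts after a step-down
    return step_downs


# Made-up trace: two brief hiccups followed by a sustained outage.
trace = [True, False, True, True, False, True, False, False, False, False]
eager = count_step_downs(trace, threshold=1)    # step down on first failure
patient = count_step_downs(trace, threshold=2)  # ping-style threshold of 2
```

On this made-up trace the eager policy steps down six times and the threshold policy twice, which is the kind of difference in epoch churn the argument above is concerned with.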

Thoughts?

@andrewjstone
Contributor

Ahh, I forgot about backend ping. That is the better solution. Let's just close this issue and make a note to remove the commented-out lines.

@engelsanchez
Contributor

oh cool. I was going to get started with this, but that sounds good.
