Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

LWWReg (as currently implemented) is not commutative. #7

Open
jemc opened this issue Jul 4, 2017 · 3 comments
Open

LWWReg (as currently implemented) is not commutative. #7

jemc opened this issue Jul 4, 2017 · 3 comments

Comments

@jemc
Copy link

jemc commented Jul 4, 2017

Greetings.

I've been working on implementing these delta-state convergent replicated data types in another language (Pony). If you're interested in seeing my implementation so far, you can find it here.

While implementing, I noticed that your lwwreg::join operation is not commutative. That is, when multiple join operations are applied with the same timestamp (U), the final value for any particular replica of the register depends on the order in which the operations were applied. Specifically, for a set of operations whose timestamp is the same, and that timestamp is the highest timestamp that the replicas will see, the first of such join operations will "win".

Here's a trivial worked example, where more than one value arrives with a logical timestamp of "6":

replica A receives: (5, "foo"), (6, "bar"), (6, "baz"); final: (6, "bar")
replica B receives: (6, "bar"), (5, "foo"), (6, "baz"); final: (6, "bar")
replica C receives: (6, "baz"), (6, "bar"), (5, "foo"); final: (6, "baz")

In the example, two of the replicas end with a final value of "bar", and one replica ends with a final value of "baz", because of the order in which the delta-states was applied. Even with infinite retransmission of those delta-states, the result in each replica will not change from the shown final value.


The workaround I used to make the join operation fully commutative and salvage this data structure as a CRDT is to:

  • require that the value type (T) is also comparable.
  • introduce a "bias" by value, that favors either the greater or the lesser value in cases where the timestamp is equal.

This strategy ensures that all replicas will eventually converge to the same final value and timestamp.

In essence, the "Last Writer Wins Register" (LWWReg) must become either a "Last Writer Greater Value Wins Register" (LWGVWReg) or a "Last Writer Lesser Value Wins Register".

If both the timestamp and the value are equivalent (neither side is greater or lesser in either case), then there is no conflict to resolve.


You can see my implementation of LWWReg here, including a bias type parameter named B which accepts either BiasGreater or BiasLesser as the type argument, so that the bias strategy is selected at compile time and is part of the type signature.

Just because it might help make sense of the code, I'll note that one additional change I've made from your model is that my delta-mutator methods accept a delta-state-so-far as a parameter, so that the pattern of accumulating several delta-states before sending it to the other replicas is facilitated with fewer object allocations.


In closing, thank you for your work on this highly important paper, and on this reference implementation. I hope this issue ticket can be of some help in the overall goal of increasing adoption of these concepts.

@vitorenesduarte
Copy link

Hi,

From A comprehensive study of Convergent and Commutative
Replicated Data Types
:

Timestamps are assumed unique, totally ordered,
and consistent with causal order

I think this is assumed here.
There's also a reference implementation in Erlang that assumes the same.

It's great to see an implementation in Pony.

@CBaquero
Copy link
Owner

CBaquero commented Jul 5, 2017

Hi Joe, and thanks for pointing this out and suggesting a nice solution.

LWW based CRDTs are even hard to consider CRDTs since the direction of evolution is directed by an external parameter, here a timestamp. So, as Vitor said, we assume that the fine granularity of the externally supplied timestamps ensures them to be unique.

But, it is good to plan for possible ties in the timestamps. And in those cases the trick is to use something else (as you did with the value) to introduce a tie breaking bias. I didn't check the Pony code yet, but the solution description you made looks perfectly correct to me.

I think that in another datatype we also had to introduce a bias to fix LWW ties "RWLWWSet: Last-writer-wins set with remove wins bias (SoundCloud inspired)".

Thanks again for working in the Pony side, we might revise later our LWWReg to ensure convergence in case of ties, but I want to think more carefully on the impact of forcing payload to be comparable.

Best,

Carlos

@jemc
Copy link
Author

jemc commented Jul 5, 2017

Thanks for your quick responses.

I wasn't thinking of the timestamp unicity as being a premise of the model, because I was still thinking in terms of the SoundCloud-inspired LWWSet, which as you mention does not assume timestamp unicity, and uses either an add-wins or remove-wins bias to resolve the conflict.

From my perspective requiring comparability of the value is not too onerous, as even if the value is not inherently comparable, a commutative conflict-resolution function could be supplied which provides a total ordering of the possible values. Even if that total ordering is arbitrary from the perspective of the data, it still provides eventual consistency for the case of timestamp collision, and the semantics of the CRDT as far as the application is concerned would be "the result is arbitrary in the case of timestamp collision", though that arbitrary result is still eventually consistent. However, I could see how it might be an implementation practicality concern to figure out how the caller would supply these conflict-resolution functions in C++, for the cases of types that are not inherently comparable.

Anyway, I appreciate the discussion here. You can feel free to close this issue ticket if you don't see it as useful to keep open as an "action item" for work. I will proceed with my approach for an LWWReg with bias, and you can take whatever approach you find is best for your project. I might just suggest that you may want to add a bit of explicit documentation to the README about timestamp unicity for LWWReg, to avoid potential problems and/or confusion when users are not aware of this requirement.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants