
Fix DeleteAll for CacheKV/MultiVersionKV so that all keys are properly deleted #464

Merged
merged 2 commits into seiv2 from delete-prefix
Mar 22, 2024

Conversation

codchen
Collaborator

@codchen codchen commented Mar 19, 2024

Describe your changes and provide context

If we have a cacheKV with a dirty cache entry 1->a and a parent with an entry 2->b, the previous implementation of DeleteAll would only call cacheKV.Delete on key 1 without affecting key 2 at all (i.e. if someone queried key 2 on the cacheKV they would still get b, even though the expectation is for 2 to appear deleted at the cacheKV level). This PR fixes that by first recursively gathering all the keys to be (marked as) deleted and then deleting those keys one by one; note that this recursion should be fairly efficient since it is not subject to the exponential latency that merge iterators incur.
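The collect-then-delete approach can be illustrated with a minimal sketch (this is illustrative toy code, not the actual cachekv store implementation; all names are hypothetical):

```go
package main

import "fmt"

// Store is a toy cache layer: a dirty map over an optional parent store.
// A nil value in the cache acts as a tombstone (key deleted at this level).
type Store struct {
	cache  map[string]*string
	parent *Store // nil for the root store
}

func NewStore(parent *Store) *Store {
	return &Store{cache: map[string]*string{}, parent: parent}
}

func (s *Store) Set(k, v string) { s.cache[k] = &v }

func (s *Store) Get(k string) (string, bool) {
	if v, ok := s.cache[k]; ok {
		if v == nil {
			return "", false // tombstoned at this level
		}
		return *v, true
	}
	if s.parent != nil {
		return s.parent.Get(k)
	}
	return "", false
}

// allKeys recursively collects every key visible at this level,
// including keys that only exist in ancestors.
func (s *Store) allKeys() map[string]struct{} {
	keys := map[string]struct{}{}
	if s.parent != nil {
		for k := range s.parent.allKeys() {
			keys[k] = struct{}{}
		}
	}
	for k, v := range s.cache {
		if v == nil {
			delete(keys, k) // already tombstoned here
		} else {
			keys[k] = struct{}{}
		}
	}
	return keys
}

// DeleteAll first gathers all visible keys (the fix), then tombstones
// each one in this store's own cache, leaving the parent untouched.
func (s *Store) DeleteAll() {
	for k := range s.allKeys() {
		s.cache[k] = nil
	}
}

func main() {
	parent := NewStore(nil)
	parent.Set("2", "b")
	child := NewStore(parent)
	child.Set("1", "a")

	child.DeleteAll()
	_, ok1 := child.Get("1")
	_, ok2 := child.Get("2")
	fmt.Println(ok1, ok2) // false false: parent key 2 now appears deleted
	_, okParent := parent.Get("2")
	fmt.Println(okParent) // true: the parent itself is never mutated
}
```

The key point is that the tombstones live only in the child's cache, so the parent store is unmodified until (and unless) the cache is written through.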

This PR also fixes DeleteAll for mvkv so that it no longer touches the parent store's values but only changes the writeset (as a single Delete would have done).
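For the mvkv side, the intended behavior can be sketched as follows (a simplified illustration with hypothetical names, not the real mvkv API): DeleteAll should only record tombstones in the transaction's writeset, exactly as individual Delete calls would.

```go
package main

import "fmt"

// VersionIndexedStore is a toy stand-in for mvkv: a per-tx writeset
// layered over a parent snapshot. A nil writeset value is a delete marker.
type VersionIndexedStore struct {
	writeset map[string][]byte
	parent   map[string][]byte
}

// Delete records a tombstone in the writeset; the parent is never modified.
func (v *VersionIndexedStore) Delete(key string) {
	v.writeset[key] = nil
}

// DeleteAll collects every visible key (parent keys plus writeset keys)
// and tombstones each one through the same path a single Delete uses.
func (v *VersionIndexedStore) DeleteAll() {
	keys := map[string]struct{}{}
	for k := range v.parent {
		keys[k] = struct{}{}
	}
	for k := range v.writeset {
		keys[k] = struct{}{}
	}
	for k := range keys {
		v.Delete(k)
	}
}

func main() {
	s := &VersionIndexedStore{
		writeset: map[string][]byte{"1": []byte("a")},
		parent:   map[string][]byte{"2": []byte("b")},
	}
	s.DeleteAll()
	// Both keys are now tombstoned in the writeset...
	fmt.Println(s.writeset["1"] == nil, s.writeset["2"] == nil)
	// ...while the parent's value is untouched.
	fmt.Println(string(s.parent["2"]))
}
```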

Testing performed to validate your change

unit test

@codchen codchen requested review from udpatil and yzang2019 March 19, 2024 09:48
Collaborator

@yzang2019 yzang2019 left a comment


LGTM for cachekv

keyStrs := map[string]struct{}{}
for _, pk := range v.parent.GetAllKeyStrsInRange(start, end) {
keyStrs[pk] = struct{}{}
}
Contributor

We also need to consider the multiVersionStore when collecting all the keys in the range, since earlier txs may have written or deleted keys within the range, both of which should affect the result of this function. As such, we would also need to update readsets here, right? We should add all of the keys read, whether from the parent or from the multiversion store, to the readset so that DeleteAll is properly backed by an updated readset representing the accesses of those keys/values.
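The point about merging both sources and recording reads can be sketched like this (purely illustrative; the function and parameter names are hypothetical and not the actual multiversion store API):

```go
package main

import "fmt"

// collectKeysInRange merges keys visible from the parent store with keys
// written by earlier txs in the multiversion store, records every key it
// consulted in the tx's readset (so validation can later detect conflicts),
// and filters out keys an earlier tx has deleted.
func collectKeysInRange(
	parentKeys []string, // keys from the parent store in the range
	mvKeys []string, // keys earlier txs wrote in the range
	mvDeleted map[string]bool, // keys earlier txs deleted
	readset map[string]struct{}, // this tx's readset, updated in place
) []string {
	seen := map[string]struct{}{}
	for _, k := range parentKeys {
		seen[k] = struct{}{}
	}
	for _, k := range mvKeys {
		seen[k] = struct{}{}
	}
	out := []string{}
	for k := range seen {
		readset[k] = struct{}{} // record the access regardless of outcome
		if mvDeleted[k] {
			continue // deleted by an earlier tx: not visible
		}
		out = append(out, k)
	}
	return out
}

func main() {
	readset := map[string]struct{}{}
	keys := collectKeysInRange(
		[]string{"a", "b"},         // parent store keys
		[]string{"b", "c"},         // keys written by earlier txs
		map[string]bool{"a": true}, // "a" was deleted by an earlier tx
		readset,
	)
	fmt.Println(len(keys), len(readset)) // 2 visible keys, 3 recorded reads
}
```

Recording all three keys in the readset, even the one that turned out to be deleted, is what lets validation catch a conflicting write by an earlier tx.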

Contributor

actually, we should do something like an iterateset in this case, because the keys that are accessed within the range are important here

Contributor

I think the easiest solution in this case (perhaps not the most optimal, but it should be correct) would be to create an Iterator within deleteAll, iterate over it collecting keys, and THEN call delete on each one. That way you get iteration safety out of the box.
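The suggested two-phase pattern, collect keys first, then delete, can be sketched minimally (toy code, not the actual cosmos-sdk store; the type and method names are illustrative):

```go
package main

import (
	"fmt"
	"sort"
)

// kvStore is a toy map-backed store.
type kvStore struct{ data map[string]string }

// iterateKeys stands in for walking an Iterator over the full range:
// it materializes the keys before any mutation happens.
func (s *kvStore) iterateKeys() []string {
	keys := make([]string, 0, len(s.data))
	for k := range s.data {
		keys = append(keys, k)
	}
	sort.Strings(keys) // real iterators yield keys in sorted order
	return keys
}

// DeleteAll never deletes while iterating: phase 1 collects every key
// via the iterator, phase 2 deletes each collected key. The iterator is
// finished before the first delete, giving iteration safety for free.
func (s *kvStore) DeleteAll() {
	keys := s.iterateKeys()
	for _, k := range keys {
		delete(s.data, k)
	}
}

func main() {
	s := &kvStore{data: map[string]string{"1": "a", "2": "b", "3": "c"}}
	s.DeleteAll()
	fmt.Println(len(s.data)) // 0
}
```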

Contributor

@udpatil udpatil left a comment


Updated the comments from the earlier review, clarifying further what needs to change in DeleteAll for OCC correctness.

@codecov-commenter

codecov-commenter commented Mar 20, 2024

Codecov Report

Attention: Patch coverage is 39.13043%, with 42 lines in your changes missing coverage. Please review.

Project coverage is 55.34%. Comparing base (7f668bd) to head (514046c).
Report is 31 commits behind head on seiv2.

Additional details and impacted files


@@            Coverage Diff             @@
##            seiv2     #464      +/-   ##
==========================================
- Coverage   55.36%   55.34%   -0.02%     
==========================================
  Files         629      629              
  Lines       53878    53943      +65     
==========================================
+ Hits        29827    29855      +28     
- Misses      21917    21955      +38     
+ Partials     2134     2133       -1     
Files Coverage Δ
store/multiversion/mvkv.go 93.20% <100.00%> (+1.11%) ⬆️
store/types/store.go 68.75% <ø> (ø)
server/mock/store.go 16.10% <0.00%> (-0.28%) ⬇️
store/gaskv/store.go 88.11% <0.00%> (-1.79%) ⬇️
store/listenkv/store.go 84.61% <0.00%> (-2.69%) ⬇️
store/tracekv/store.go 83.13% <0.00%> (-2.06%) ⬇️
store/cachekv/store.go 80.81% <84.21%> (+0.73%) ⬆️
store/dbadapter/store.go 80.70% <0.00%> (-11.30%) ⬇️
store/iavl/store.go 58.80% <0.00%> (-1.49%) ⬇️
storev2/commitment/store.go 10.37% <0.00%> (-0.74%) ⬇️
... and 1 more

... and 22 files with indirect coverage changes

vis.Set([]byte("key5"), []byte("value7"))
vis.Delete([]byte("key6"))
require.Equal(t, []string{"key1", "key2", "key4", "key5"}, vis.GetAllKeyStrsInRange(nil, nil))
}
Contributor

can we add an additional unit test that does the following:
some keys exist in the parent store, some are written by tx1, and tx3 has some in its writeset

then tx3 does a DeleteAll across the range
then tx2 writes its writeset (including a key within the target range)
then we validate transaction state, which should be false due to the key added by tx2 that wasn't caught by DeleteAll

Collaborator Author

added

@codchen codchen merged commit 57593a9 into seiv2 Mar 22, 2024
13 checks passed
@codchen codchen deleted the delete-prefix branch March 22, 2024 04:45
udpatil pushed a commit that referenced this pull request Mar 26, 2024
…y deleted (#464)
udpatil pushed a commit that referenced this pull request Mar 26, 2024
…y deleted (#464)
udpatil pushed a commit that referenced this pull request Mar 27, 2024
…y deleted (#464)
udpatil pushed a commit that referenced this pull request Apr 16, 2024
…y deleted (#464)
udpatil pushed a commit that referenced this pull request Apr 19, 2024
…y deleted (#464)
udpatil pushed a commit that referenced this pull request Apr 19, 2024
…y deleted (#464)