Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

db: unconditionally interleave ignorable boundaries #3475

Merged
merged 3 commits into from
Apr 5, 2024

Conversation

jbowens
Copy link
Collaborator

@jbowens jbowens commented Mar 29, 2024

db: unconditionally interleave ignorable boundaries

The levelIter has a concept of ignorable boundary keys. When a levelIter
exhausts a file's point keys, it's possible that the same file still contains
range deletions relevant to other levels of the LSM. The levelIter will, under
some conditions, interleave the largest boundary key of the sstable into
iteration as an 'ignorable boundary key,' so that the file's range deletions
remain accessible until all other levels progress beyound the file's boundary.

When block-property filters are in use, a file's point key iterator may become
exhausted early, before the file's range deletions are irrelevant, even if the
file's largest key is not a range deletion. To work around this subtlety, the
sstable iterator previously would surface knowledge of whether any point keys
may have been skipped through a MaybeFilteredKeys method. The levelIter used
this method to determine when to interleave an ignorable largest boundary in
case there may be additional relevant range deletions.

This commit removes the conditioning on the output of MaybeFilteredKeys,
instead unconditionally interleaving a synthetic boundary at a file's largest
point key if the levelIter user is using the file's range deletion iterator.
This simplifies the logic and removes a fragile reliance on the sstable's
iterators accounting of when it may have filtered keys.

Future work (#2863) will remove the need to interleave synthetic boundaries at
all, instead interleaving the range deletion bounds themselves among point
keys.

Informs #2863.
Close #3334.

sstable: remove Iterator.MaybeFilteredKeys

Remove the MaybeFilteredKeys method that surfaced knowledge of whether an
sstable iterator may have omitted keys due to block-property filters. With the
previous commit's refactor to unconditionally interleave sstable largest
boundaries when using a levelIter to drive range deletion iteration, we no
longer need to know whether keys might've been filtered. This reduces the
code complexity considerably.

sstable: strengthen assertion around mutual exclusion of seek optimizations

Previously an assertion that the TrySeekUsingNext and monotonic bounds
optimizations could not simultaneously be active was disabled due to the
TestIterRandomizedMaybeFilteredKeys test that could violate the invariant. The
test has been removed (along with the method it was testing), so we can
now uncondtionally restore the invariant.

@cockroach-teamcity
Copy link
Member

This change is Reviewable

@jbowens jbowens requested review from a team and RaduBerinde March 29, 2024 15:19
@jbowens jbowens force-pushed the leveliter branch 2 times, most recently from 815e47d to c8106d3 Compare March 29, 2024 15:45
Copy link
Member

@RaduBerinde RaduBerinde left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Awesome!! Thanks for making this work and for the cleanup!

:lgtm:

Reviewable status: 0 of 15 files reviewed, 2 unresolved discussions (waiting on @jbowens)


level_iter.go line 1037 at r1 (raw file):

			// the top of the heap and immediately skip the entry, advancing to
			// the next file.
			if l.iterFile.LargestPointKey.Kind() == InternalKeyKindRangeDelete ||

If the largest point key is a range delete, wouln't *rangeDelIterPtr always be non-nil?

Conversely, if the largest point key kind is not correct (the bound can be loose for vtables), is it correct if we still emit the boundary?


merging_iter.go line 955 at r1 (raw file):

			break
		}
		// Skip ignorable boundary keys. These are not real keys and exist to

I think this is what I was missing to make it work.

Isn't it strange that we need to do this in findPrevEntry but not for findNextEntry?

Copy link
Collaborator Author

@jbowens jbowens left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TFTR!

Reviewable status: 0 of 15 files reviewed, 2 unresolved discussions (waiting on @RaduBerinde)


level_iter.go line 1037 at r1 (raw file):

If the largest point key is a range delete, wouln't *rangeDelIterPtr always be non-nil?

I think we should be able to reduce it to this, but there's currently warts in the getIter where it retains the span after it's nil'd the rangeDelIterPtr. I left a TODO.

Conversely, if the largest point key kind is not correct (the bound can be loose for vtables), is it correct if we still emit the boundary?

Yeah, it is; we'll just pay the cost of an unnecessary extra step.


merging_iter.go line 955 at r1 (raw file):

Previously, RaduBerinde wrote…

I think this is what I was missing to make it work.

Isn't it strange that we need to do this in findPrevEntry but not for findNextEntry?

We actually already do it in findNextEntry:

pebble/merging_iter.go

Lines 806 to 816 in 1c7bcd1

// Skip ignorable boundary keys. These are not real keys and exist to
// keep sstables open until we've surpassed their end boundaries so that
// their range deletions are visible.
if m.levels[item.index].isIgnorableBoundaryKey {
m.err = m.nextEntry(item, nil /* succKey */)
if m.err != nil {
return nil, base.LazyValue{}
}
continue
}

Copy link
Member

@RaduBerinde RaduBerinde left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 5 of 6 files at r1, 9 of 10 files at r5, 1 of 1 files at r6, all commit messages.
Reviewable status: all files reviewed, 1 unresolved discussion (waiting on @jbowens)


merging_iter.go line 955 at r1 (raw file):

Previously, jbowens (Jackson Owens) wrote…

We actually already do it in findNextEntry:

pebble/merging_iter.go

Lines 806 to 816 in 1c7bcd1

// Skip ignorable boundary keys. These are not real keys and exist to
// keep sstables open until we've surpassed their end boundaries so that
// their range deletions are visible.
if m.levels[item.index].isIgnorableBoundaryKey {
m.err = m.nextEntry(item, nil /* succKey */)
if m.err != nil {
return nil, base.LazyValue{}
}
continue
}

Oh, I think I looked at the wrong findNextEntry. Great, that explains why I could do it for the lower boundary before. Thanks!!

jbowens added 3 commits April 1, 2024 17:26
The levelIter has a concept of ignorable boundary keys. When a levelIter
exhausts a file's point keys, it's possible that the same file still contains
range deletions relevant to other levels of the LSM. The levelIter will, under
some conditions, interleave the largest boundary key of the sstable into
iteration as an 'ignorable boundary key,' so that the file's range deletions
remain accessible until all other levels progress beyound the file's boundary.

When block-property filters are in use, a file's point key iterator may become
exhausted early, before the file's range deletions are irrelevant, even if the
file's largest key is not a range deletion. To work around this subtlety, the
sstable iterator previously would surface knowledge of whether any point keys
may have been skipped through a MaybeFilteredKeys method. The levelIter used
this method to determine when to interleave an ignorable largest boundary in
case there may be additional relevant range deletions.

This commit removes the conditioning on the output of MaybeFilteredKeys,
instead unconditionally interleaving a synthetic boundary at a file's largest
point key if the levelIter user is using the file's range deletion iterator.
This simplifies the logic and removes a fragile reliance on the sstable's
iterators accounting of when it may have filtered keys.

Future work (cockroachdb#2863) will remove the need to interleave synthetic boundaries at
all, instead interleaving the range deletion bounds themselves among point
keys.

Informs cockroachdb#2863.
Close cockroachdb#3334.
Remove the MaybeFilteredKeys method that surfaced knowledge of whether an
sstable iterator may have omitted keys due to block-property filters. With the
previous commit's refactor to unconditionally interleave sstable largest
boundaries when using a levelIter to drive range deletion iteration, we no
longer need to know whether keys might've been filtered. This reduces the
code complexity considerably.
…ations

Previously an assertion that the TrySeekUsingNext and monotonic bounds
optimizations could not simultaneously be active was disabled due to the
TestIterRandomizedMaybeFilteredKeys test that could violate the invariant. The
test has been removed (along with the method it was testing), so we can
now uncondtionally restore the invariant.
@jbowens
Copy link
Collaborator Author

jbowens commented Apr 5, 2024

TFTR!

@jbowens jbowens merged commit 45b7a80 into cockroachdb:master Apr 5, 2024
11 checks passed
@jbowens jbowens deleted the leveliter branch April 5, 2024 21:44
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

db: simplify upper synthetic boundary in level iterator
3 participants