Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Allow custom SAI components discovery #1338

Open
wants to merge 6 commits into
base: main
Choose a base branch
from

Conversation

pcmanus
Copy link

@pcmanus pcmanus commented Oct 8, 2024

Refactor the code that "discover" which SAI components files an SSTable has to allow making that code customizable. The default behavior remains the existing one, that is the TOC file is consulted, with a fallback to scanning disk if the TOC file is missing or corrupted.

The way to customize this behavior is similar to what SSTableWatcher does: a new system property, -Dcassandra.sai.custom_components_discovery_class allows to load a specific class (that extends the IndexCoomponentDiscovery class) instead of the default.

As part of the refactor to make this new customisability easier, this introduces a new ComponentsBuildId class that is the pair of the version and generation of a group of components, since this is whatultimately defines a given component "group" build.

This PR is used by https://github.com/riptano/cndb/pull/11361 on the CNDB side, in the context of https://github.com/riptano/cndb/issues/10889.

@pcmanus pcmanus force-pushed the cndb-10889-customizable-index-discovery branch 5 times, most recently from e19c101 to dacb203 Compare October 17, 2024 14:00
@pcmanus pcmanus force-pushed the cndb-10889-customizable-index-discovery branch 11 times, most recently from 32f9a1b to 362bf10 Compare October 23, 2024 14:46
@pcmanus pcmanus force-pushed the cndb-10889-customizable-index-discovery branch from 6f1346d to 2961010 Compare October 29, 2024 09:43
@pcmanus pcmanus marked this pull request as ready for review October 29, 2024 09:46
private volatile boolean isComplete;
// Mark groups that are being read/have been fully written, and thus should not have new components added.
// This is just to catch errors where we'd try to add a component after the completion marker was written.
private volatile boolean sealed;

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is "sealed" a reserved keyword in some newer java version ?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes and no. It's a "contextual reserved keyword" in java 16+ but this is not a context where it is reserved (it's only in class/interface names).

Copy link

@eolivelli eolivelli left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@pcmanus pcmanus force-pushed the cndb-10889-customizable-index-discovery branch 2 times, most recently from 9b22677 to 16b0adc Compare October 30, 2024 11:26
Copy link

@jasonstack jasonstack left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

boolean isUsable = candidate.expectedComponentsForVersion().stream().noneMatch(candidate::componentExistsOnDisk);
if (!isUsable)
{
noSpamLogger.warn(logMessage("Wanted to use generation {} for new build of {} SAI components of {}, but found some existing components on disk for that generation (maybe leftover from an incomplete/corrupted build?); trying next generation"),

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: would be nice to include sstable id in the log

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unless I misunderstand what you mean by "sstable id", I believe it's in there already: the last arg of the message is descriptor which is the sstable Descriptor and thus includes the id.

Refactor the code that "discover" which SAI components files an SSTable
has to allow making that code customizable. The default behavior remains
the existing one, that is the TOC file is consulted, with a fallback to
scanning disk if the TOC file is missing or corrupted.

The way to customize this behavior is similar to what `SSTableWatcher`
does: a new system property, `-Dcassandra.sai.custom_components_discovery_class`
allows to load a specific class (that extends the `IndexCoomponentDiscovery` class)
instead of the default.

As part of the refactor to make this new customisability easier, this
introduces a new `ComponentsBuildId` class that is the pair of the
version and generation of a group of components, since this is what
ultimately defines a given component "group" build.
It's easy to do so, and easier for implementation of that callback
if they need it (they could get it indirectly from the sstable
descriptor keyspace and table name, but passing it is easier).
@pcmanus pcmanus force-pushed the cndb-10889-customizable-index-discovery branch from 16b0adc to e3a70a8 Compare November 5, 2024 14:29
Copy link

sonarcloud bot commented Nov 7, 2024

@cassci-bot
Copy link

❌ Build ds-cassandra-pr-gate/PR-1338 rejected by Butler


1 new test failure(s) in 4 builds
See build details here


Found 1 new test failures

Test Explanation Branch history Upstream history
...yIteratorTest.testTotalAndReadBytesManySSTables flaky 🔵🔴🔵 🔵🔵🔵🔵🔵🔵🔵

Found 14 known test failures

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants