Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support dynamic arbitrary filtering on NoSQL databases #1682

Merged
merged 11 commits into from
May 22, 2024

Conversation

jnmt
Copy link
Contributor

@jnmt jnmt commented Apr 18, 2024

Description

This PR adds cross-partition scan filtering (i.e., ScanAll with a where clause`) for non-JDBC databases. It filters results based on the specified arbitrary conditions in the ScalarDB layer. It also limits results in the ScalarDB layer if it's a cross-partition scan with filtering to count the results correctly.

Sorry for the big PR, but the basic concept is the same for all three databases. See inline comments for details.

Related issues and/or PRs

N/A

Changes made

  • Add dynamic filtering in Cassandra, Cosmos DB, and DynamoDB.
    • Add filterable (and limitable) Scanner implementations.
    • Move the function to evaluate conditional expressions from Snapshot to ScalarDbUtils.
    • Add integration tests with conditions and limits for cross-partition scan.
  • Fix warning for incorrectly specified projections when getting keys in results.

Checklist

  • I have commented my code, particularly in hard-to-understand areas.
  • I have updated the documentation to reflect the changes.
  • Any remaining open issues linked to this PR are documented and up-to-date (Jira, GitHub, etc.).
  • Tests (unit, integration, etc.) have been added for the changes.
  • My changes generate no new warnings.
  • Any dependent changes in other PRs have been merged and published.

Additional notes (optional)

Dynamic filtering will also be applied for Get, GetWithIndex, Scan, and ScanWithIndex in this PR, but the builders to actually run those queries (and the integration tests) will be added in other PRs since this PR is already very big.

Release notes

Added dynamic arbitrary filtering for non-JDBC databases.

import org.junit.jupiter.api.Test;

public class CassandraCrossPartitionScanIntegrationTest
extends DistributedStorageCrossPartitionScanIntegrationTestBase {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I only added storage-level tests for each database since transaction-level tests would be tested enough using JDBC databases and they would increase CI time. But please feel free to comment if more tests should be added.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I changed my mind and added integration tests with consensus commit since it's good to create a chance to detect bugs like the one I fixed in 9729ce8.

Scan scan, ResultSet resultSet, ResultInterpreter resultInterpreter) {
super(resultSet, resultInterpreter);
this.conjunctions = scan.getConjunctions();
this.left = scan.getLimit() > 0 ? scan.getLimit() : -1;
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We need to manage left items that can be returned before reaching the limit.

Comment on lines 124 to 130
if (scan.getConjunctions().isEmpty()) {
return new ScannerImpl(handlers.select().handle(scan), resultInterpreter);
} else {
// Ignore limit to control it in FilterableScannerImpl
Scan scanAll = Scan.newBuilder(scan).limit(0).build();
return new FilterableScannerImpl(scan, handlers.select().handle(scanAll), resultInterpreter);
}
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I introduce a new filterable scanner implementation so that the regular ScanAll (w/o filtering) does not have to check the existence of filters every time when iterating. It's also nice to keep the logic simple.

@@ -560,7 +560,6 @@ private Scan prepareScanWithLike(boolean isLike, String pattern) {
.table(CONDITION_TEST_TABLE)
.all()
.where(condition)
.ordering(Ordering.asc(PARTITION_KEY_NAME))
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I removed orderings (including at other places) to commonly use these tests in non-JDBC databases except for order-specifc tests.

@jnmt jnmt self-assigned this Apr 18, 2024
@jnmt jnmt added the enhancement New feature or request label Apr 18, 2024
@jnmt
Copy link
Contributor Author

jnmt commented Apr 18, 2024

Sorry, I faced an issue when using this with consensus commit. Let me investigate and maybe fix it.

@jnmt jnmt marked this pull request as draft April 18, 2024 23:45
@Override
protected Properties getProps(String testName) {
Properties properties = CassandraEnv.getProperties(testName);
properties.setProperty(ConsensusCommitConfig.ISOLATION_LEVEL, "SERIALIZABLE");
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Even though serializable isolation is not recommended in non-JDBC databases, as we warn when using it, I think it's better to use serializable for testing purposes so as not to miss bugs regarding extra-read.

It might be better to use serializable for other consensus commit integration tests if no concerns. Any thoughts?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Even though serializable isolation is not recommended in non-JDBC databases

Does this mean that we don't recommend using SERIALIZABLE with the cross-partition scan in non-JDBC databases?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes. I mean this warning: https://scalardb.scalar-labs.com/docs/latest/configurations/#cross-partition-scan-configurations. Even so, I thought these tests should be done with extra-read just in case.

// For JDBC databases or non-JDBC databases without filtering, we only get tx_id,
// tx_version, and primary key columns because we use only them to compare. For
// non-JDBC databases with filtering, we also need filtering columns to compare.
projectRequiredColumns(storage, scan);
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the fix for a bug with the serializable consensus commit. We need non-primary key columns to perform filtering correctly in the ScalarDB layer if they are used in the conditions, even for the extra-read purpose.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[minor] Does it mean that the values of scanSet doesn't contain filtering columns in the case of JDBC (thanks to the condition pushdown?), while the values of scanSet contain filtering columns in the case of non-JDBC?

If so, I guess it would be a bit more helpful to understand if Scan tells whether filtering columns are contained with a method like scan.containsFilteringColumnsInKey() so that we can revise the condition in projectRequiredColumns like this?

if (scan.containsFilteringColumnsInKey()) {
  scan.getConjunctions()
      .forEach(
          conjunction ->
              conjunction
   ...

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I might not get your point perfectly, but it doesn't mean whether the user's original Scan has filtering columns in projections or not. Do you think this condition if (!(storage instanceof JdbcDatabase)) is hard to understand the background? For JDBC databases, we don't have to care about the existence of filtering columns in projections, so I didn't want to check it excessively.

By the way, thanks to your question, I might find an issue regarding projections...

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the reply!

What I thought was as follows:

  • Why filtering columns were needed for non-JDBC (w/ filtering) ?
  • The results of this scan are put to currentReadMap as a part of the key
  • Then, the map is looked-up with the values of scanSet (TransactionResult latestResult = currentReadMap.get(key))
  • So, including filtering columns in projection is for consistency of the key types related to the equals()'s logic?

But I think I had the wrong idea... Can you elaborate a bit more why filtering columns are needed for projection in the case of non-JDBC with filtering?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ahh, the reason is not related to the key consistency of currentReadMap since the validation is always handled with the primary key basis.

We originally project columns except for tx_id, tx_version, and primary key columns for efficiency since they are enough for the re-scan for the validation phase (extra-read). We now want to filter results in the ScalarDB layer for non-JDBC databases. So, we need to additionally have filtering columns here since they are used even for the re-scan, the same as the first scan.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I misread the code around Key class... Sorry for bothering you.

We originally project columns except for tx_id, tx_version, and primary key columns

s/except for/only for/ ?

But, I think I don't still get why columns for filtering need to be projected. My understanding is as follows:

  • Scan instances stored in scanSet contain the original conditions
  • FilterableScannerImpl#getResultWithFiltering filters records based on the conditions. Whether the columns for filtering are projected isn't related to the filtering. (The columns for filtering included in the conditions are automatically scanned at the underlying storage layer, right?)
  • The columns for filtering of the result in currentReadMap are not used
  • So, the columns for filtering don't need to be projected?

I think I have some wrong ideas of them. Can you point out it?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I misread the code around Key class... Sorry for bothering you.

No no, it's OK. I think this part is so complicated and worth while to share the background.

We originally project columns except for tx_id, tx_version, and primary key columns

s/except for/only for/ ?

Sorry, "only for".

But, I think I don't still get why columns for filtering need to be projected. My understanding is as follows:

  • Scan instances stored in scanSet contain the original conditions

Right. Precisely, the list of projections is basically cleared in CrudHandler for the sake of transaction processing. In that sense, it's a bit different from the original one specified by users.

  • FilterableScannerImpl#getResultWithFiltering filters records based on the conditions. Whether the columns for filtering are projected isn't related to the filtering. (The columns for filtering included in the conditions are automatically scanned at the underlying storage layer, right?)

Right. We always need columns specified in the where clause to filter results. (*)

  • The columns for filtering of the result in currentReadMap are not used

Right. currentReadMap is for the read set verification. So, primary key columns are enough, but, of course, the map should be based on the result of the scan with the filtering conditions in the current database state.

  • So, the columns for filtering don't need to be projected?

No. We used them to get the current results above.

Let's chat a bit if you have further questions : )

Copy link
Contributor

@komamitsu komamitsu Apr 24, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the detailed answers!

I found my misunderstanding...

I thought Cassandra.scan() implicitly projected the columns for filtering when there were any conditions in the scan. What I expected was like this:

--- a/core/src/main/java/com/scalar/db/storage/cassandra/Cassandra.java
+++ b/core/src/main/java/com/scalar/db/storage/cassandra/Cassandra.java
@@ -14,6 +14,7 @@ import com.scalar.db.api.Mutation;
 import com.scalar.db.api.Put;
 import com.scalar.db.api.Result;
 import com.scalar.db.api.Scan;
+import com.scalar.db.api.ScanBuilder.BuildableScanOrScanAllFromExisting;
 import com.scalar.db.api.Scanner;
 import com.scalar.db.common.AbstractDistributedStorage;
 import com.scalar.db.common.TableMetadataManager;
@@ -125,7 +126,9 @@ public class Cassandra extends AbstractDistributedStorage {
       return new ScannerImpl(handlers.select().handle(scan), resultInterpreter);
     } else {
       // Ignore limit to control it in FilterableScannerImpl
-      Scan scanAll = Scan.newBuilder(scan).limit(0).build();
+      BuildableScanOrScanAllFromExisting builder = Scan.newBuilder(scan);
+      withProjectionsForFilteringIfMissing(builder);
+      Scan scanAll = builder.limit(0).build();
       return new FilterableScannerImpl(scan, handlers.select().handle(scanAll), resultInterpreter);
     }
   }

BTW, assuming the above example works, it is a kind of provisioning the columns for filtering at storage layer while the change of this PR is provisioning the columns for filtering at an upper layer. Which do you think better?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right. As you thought, we needed to handle projections properly in the storage layer, and it would be better to hide such a storage matter from the upper layer. So, your question got to the core of the matter from the beginning. Thanks again. Fixed in 6b1e92a.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for the consideration!

@jnmt jnmt marked this pull request as ready for review April 19, 2024 09:31
@jnmt
Copy link
Contributor Author

jnmt commented Apr 19, 2024

Now ready for review! PTAL, when you get a chance since the issue was fixed in 9729ce8.

}
} else {
// Otherwise, advance to the next page
currentPageRecords = recordsPages.next().getResults().iterator();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[trivial] I'm not sure if recordsPages.next() fetches the next page data lazily, but checking if left is zero before calling next() might be always good?

Like this

    while (currentPageRecords.hasNext() || recordsPages.hasNext()) {
      if (left == 0) {
        return Optional.empty();
      }
      // Return the next record of the current page if there is one
      if (currentPageRecords.hasNext()) {
        Optional<Result> result = getResultWithFiltering(currentPageRecords.next());
        if (result.isPresent()) {
          return result;
        }

Copy link
Contributor Author

@jnmt jnmt Apr 22, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point. Thank you. Yeah, we don't have to call next() if there are no results left that can be returned. Moreover, we don't have to check left several times in one() since decreasing left only happens in getResultWithFiltering at most once. So, the following should be possible (the same as for all other scanners):

  public Optional<Result> one() {
    if (left == 0) {
      return Optional.empty();
    }
    while (currentPageRecords.hasNext() || recordsPages.hasNext()) {
      // Return the next record of the current page if there is one
      if (currentPageRecords.hasNext()) {
        Optional<Result> result = getResultWithFiltering(currentPageRecords.next());
        if (result.isPresent()) {
          return result;
        }

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed in new FilterableScanner of 6b1e92a.

import java.util.Set;
import javax.annotation.Nonnull;

public class FilterableQueryScanner extends QueryScanner implements Scanner {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@NotThreadSafe?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed in new FilterableScanner of 6b1e92a.

@komamitsu komamitsu self-requested a review April 23, 2024 01:01
@jnmt
Copy link
Contributor Author

jnmt commented Apr 24, 2024

@komamitsu Based on the feedback, I updated the following in 6b1e92a. PTAL again when you get a chance.

  • Fixed a bug where both projections and conditions are given and the projections do not include the columns specified in conditions.
  • Refactored filtering and limiting in the storage layer using a wrapper scanner.

Copy link
Collaborator

@brfrn169 brfrn169 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Did my first review for this PR and left several comments. Please take a look when you have time! I'll review it again later.

Also, I have a question: it seems that for Cassandra, Cosmos DB, and DynamoDB, we always execute dynamic filtering for the conjunctions. Are there any cases where we can push down the conjunctions to the databases?

Additionally, I think we could support conjunctions for the normal Scan and ScanWithIndex. What are your thoughts on this?


@Test
@Override
@Disabled
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can document the reason why this test is disabled as follows:

Suggested change
@Disabled
@Disabled("<REASON>")

Adding some documentation would be beneficial in this case.

The same applies to other parts in this PR.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you! Fixed in 580fae5 and edd250c.

}

@Override
public void close() {}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should close the scanner here:

Suggested change
public void close() {}
public void close() throws IOException {
scanner.close();
}

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch! Thank you!

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed in 580fae5.

throw new IllegalArgumentException(
CoreError.CASSANDRA_CROSS_PARTITION_SCAN_WITH_FILTERING_OR_ORDERING_NOT_SUPPORTED
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We no longer use CASSANDRA_CROSS_PARTITION_SCAN_WITH_FILTERING_OR_ORDERING_NOT_SUPPORTED, so we can remove it from CoreError.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ahh, we do not maintain old ones for reference, do we? OK, then I will remove them.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we can remove old error messages.

@feeblefakie @josh-wong What do you think?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since this PR is for version 3.13 and later, I think we can remove the error message since it won't appear in version 3.13 or later.

Note

We should keep the error message in the doc for version 3.12 since the error message will still exist in that version.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed in 580fae5. Note that, as discussed with Toshi, we need to maintain deleted messages to avoid reusing already deleted message IDs somehow, especially the most recent ones.

throw new IllegalArgumentException(
CoreError.COSMOS_CROSS_PARTITION_SCAN_WITH_FILTERING_OR_ORDERING_NOT_SUPPORTED
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ditto. We can remove CoreError. COSMOS_CROSS_PARTITION_SCAN_WITH_FILTERING_OR_ORDERING_NOT_SUPPORTED.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed in 580fae5.

throw new IllegalArgumentException(
CoreError.DYNAMO_CROSS_PARTITION_SCAN_WITH_FILTERING_OR_ORDERING_NOT_SUPPORTED
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ditto. We can remove CoreError. DYNAMO_CROSS_PARTITION_SCAN_WITH_FILTERING_OR_ORDERING_NOT_SUPPORTED.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed in 580fae5.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems like this change of this file is not needed?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, my bad. Will fix.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed in 580fae5.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ditto. It seems like this change of this file is not needed?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed in 580fae5.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ditto. It seems like this change of this file is not needed?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed in 580fae5.

@jnmt
Copy link
Contributor Author

jnmt commented Apr 25, 2024

@brfrn169 Thanks for the comments. Those are very good points.

We can push down the conditions with some limitations. For example, Cosmos DB and DynamoDB do not support a LIKE operator. Cassandra has it, but only for index columns. Since all those databases scan all partitions in any case, I guess the performance benefit could not be so significant in spite of the effort for per-storage implementation. But, if it’s highly prioritized, we can improve one by one (in the future, IMHO).

For Scan and ScanWithIndex, my first thought was that supporting them was a bit complicated for the benefits and use cases, especially when it comes to considering ScalarDB SQL, as I mentioned in a Demo session. But after reconsideration, it’s very nice to have for non-JDBC database users.

Since we additionally need the following, and they will get a bit big changes to some extent, I think it’s good to separate PRs. What do you think?

  • API changes so that we can call where after specifying keys
  • JDBC database changes for better query building when mixing partition-scan conditions
  • Non-JDBC database changes with dynamic filtering (maybe not so big deal)
  • (ScalarDB SQL changes for a kind of optimization when mixing partition-scan conditions)

}
for (Value<?> value : key.get()) {
if (!containedColumnNames.contains(value.getName())) {
logger.warn("Full key doesn't seem to be projected into the result");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In this case, it means Cassandra.scan() or equivalents didn't setup proper projections? If so, it's not an expected situation and throwing an AssertionError or IllegalStateException would be good, I think.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Honestly, I just followed the same manner as FilteredResult in the consensus commit, but IllegalStateException with a message like CoreError.COLUMN_NOT_FOUND would be better as you mentioned.

@brfrn169 Do you remember why you chose a warning rather than an exception?

Copy link
Contributor Author

@jnmt jnmt May 17, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@komamitsu Thank you for the suggestion. I fixed it in 580fae5, including other similar classes in this PR based on the discussion with @brfrn169.

Copy link
Member

@josh-wong josh-wong left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've left one minor suggestion. Other than that, LGTM. Thank you!🙇🏻‍♂️

docs/configurations.md Outdated Show resolved Hide resolved
@jnmt
Copy link
Contributor Author

jnmt commented May 1, 2024

I will close this PR because we discussed and decided to first refactor Scan.Conjunction, ScanBuilder.ConditionSetBuilder, and so on so that we can support the where clause and dynamic filtering in Get. I will re-open once the refactoring PR is merged.

@jnmt jnmt closed this May 1, 2024
@jnmt jnmt mentioned this pull request May 8, 2024
6 tasks
@jnmt jnmt reopened this May 17, 2024
@jnmt jnmt changed the title Support cross-partition scan with filtering on NoSQL databases Support dynamic arbitrary filtering on NoSQL databases May 17, 2024
@jnmt
Copy link
Contributor Author

jnmt commented May 17, 2024

@feeblefakie @komamitsu @brfrn169 @Torch3333
I revised and reopened this PR since it's now ready for review. One of the major changes is to make it handle not only cross-partition scans but also partition and index scans and get operations in the same manner. However, the builder part (e.g., get with where) and integration tests for get and partition/index scan will be added in other PRs since this PR is already big.

@jnmt jnmt requested review from komamitsu and brfrn169 May 17, 2024 09:17
private final Scanner scanner;
private final List<String> projections;
private final Set<Conjunction> conjunctions;
private int left;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[trivial] How about using Integer so that we can clearly represent N/A with null ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, that's better than the unclear value like -1. Fixed in 74ab4fb. Note that I needed to add @SuppressFBWarnings to suppress a warning regarding left--, although I guess it is due to the false positive since left is used in the condition for returning the empty.

private final Set<Conjunction> conjunctions;
private int left;

private ScannerIterator scannerIterator;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[trivial] @Nullable or @LazyInit might be good?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed in 74ab4fb. Thank you! I didn't know the @LazyInit annotation :P

Copy link
Collaborator

@brfrn169 brfrn169 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Left several comments. Please take a look when you have time!

@Override
protected Properties getProps(String testName) {
Properties properties = CassandraEnv.getProperties(testName);
properties.setProperty(ConsensusCommitConfig.ISOLATION_LEVEL, "SERIALIZABLE");
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Even though serializable isolation is not recommended in non-JDBC databases

Does this mean that we don't recommend using SERIALIZABLE with the cross-partition scan in non-JDBC databases?

if (get.getConjunctions().isEmpty()) {
scanner = getInternal(get);
} else {
scanner = new FilterableScanner(get, getInternal(copyAndPrepareForDynamicFiltering(get)));
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually since we already copied the get object here (https://github.com/scalar-labs/scalardb/pull/1682/files#diff-660ca76c9b64d94000e0854e2a18fdc5e6a69c028a74b8cacfa4ef733d4604c2L100), I don't think we need to copy the get object here with copyAndPrepareForDynamicFiltering() again.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for the comment. I wanted to pass the original get for FilterableScanner and the different modified get for getInternal() since we need the original limit value and projections in FilterableScanner. It might be possible to avoid copying the whole get operation somehow, but I think the current one is not so bad from the readability perspective. What do you think?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That makes sense. Thanks.

Copy link
Contributor

@Torch3333 Torch3333 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thank you!

Copy link
Contributor

@feeblefakie feeblefakie left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM! Thank you!

@jnmt jnmt requested a review from komamitsu May 21, 2024 08:24
Copy link
Contributor

@komamitsu komamitsu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thank you!

Copy link
Collaborator

@brfrn169 brfrn169 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM! Thank you!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants