-
Notifications
You must be signed in to change notification settings - Fork 3.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add config to reject Iceberg queries without partition filter #17263
Conversation
plugin/trino-iceberg/src/main/java/io/trino/plugin/iceberg/IcebergConfig.java
Show resolved
Hide resolved
plugin/trino-iceberg/src/main/java/io/trino/plugin/iceberg/IcebergMetadata.java
Outdated
Show resolved
Hide resolved
plugin/trino-iceberg/src/main/java/io/trino/plugin/iceberg/IcebergMetadata.java
Outdated
Show resolved
Hide resolved
plugin/trino-iceberg/src/main/java/io/trino/plugin/iceberg/IcebergMetadata.java
Outdated
Show resolved
Hide resolved
plugin/trino-iceberg/src/main/java/io/trino/plugin/iceberg/IcebergTableHandle.java
Outdated
Show resolved
Hide resolved
plugin/trino-iceberg/src/test/java/io/trino/plugin/iceberg/TestIcebergConnectorSmokeTest.java
Outdated
Show resolved
Hide resolved
plugin/trino-iceberg/src/test/java/io/trino/plugin/iceberg/TestIcebergSplitSource.java
Outdated
Show resolved
Hide resolved
plugin/trino-iceberg/src/main/java/io/trino/plugin/iceberg/IcebergMetadata.java
Outdated
Show resolved
Hide resolved
plugin/trino-iceberg/src/main/java/io/trino/plugin/iceberg/IcebergMetadata.java
Outdated
Show resolved
Hide resolved
plugin/trino-iceberg/src/main/java/io/trino/plugin/iceberg/IcebergMetadata.java
Outdated
Show resolved
Hide resolved
plugin/trino-iceberg/src/main/java/io/trino/plugin/iceberg/IcebergSessionProperties.java
Outdated
Show resolved
Hide resolved
6391560
to
8088b27
Compare
Thank you for @Praveen2112 @marton-bod review, I have made some changes to the code, please help review again when you have time. |
plugin/trino-iceberg/src/main/java/io/trino/plugin/iceberg/IcebergMetadata.java
Show resolved
Hide resolved
plugin/trino-iceberg/src/main/java/io/trino/plugin/iceberg/IcebergMetadata.java
Outdated
Show resolved
Hide resolved
plugin/trino-iceberg/src/main/java/io/trino/plugin/iceberg/IcebergMetadata.java
Outdated
Show resolved
Hide resolved
plugin/trino-iceberg/src/main/java/io/trino/plugin/iceberg/IcebergMetadata.java
Outdated
Show resolved
Hide resolved
plugin/trino-iceberg/src/main/java/io/trino/plugin/iceberg/IcebergConfig.java
Show resolved
Hide resolved
plugin/trino-iceberg/src/main/java/io/trino/plugin/iceberg/IcebergMetadata.java
Show resolved
Hide resolved
plugin/trino-iceberg/src/main/java/io/trino/plugin/iceberg/IcebergSessionProperties.java
Outdated
Show resolved
Hide resolved
@marton-bod Thanks! I have updated the code, Could you please take a look again ? |
f2e0ada
to
4de5bb4
Compare
return; | ||
} | ||
if (isQueryPartitionFilterRequired(session) | ||
&& table.getEnforcedPredicate().equals(TupleDomain.all())) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@zhangminglei Can you explain why this
table.getEnforcedPredicate().equals(TupleDomain.all())
is needed? Doesn't this mean that we only enter the block if the table has no enforced predicates?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's right! @marton-bod , If a table has no enforced predicates, like select * from T where udf(partition_field) = someValue
then enter the block.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The reason for the difference, Marton, is that some filters show up as TupleDomain
s and some show up as Predicates
depending on how they can be represented.
WHERE x = 42
for example, is a tuple domain and will show up in the enforced constraint
WHERE x % 2 = 0
will be a Predicate because it cannot be represented as a discrete set or range
but if x is a partition column, either should be fine for this feature
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the explanation! Just curious to hear what would happen in the following example:
CREATE TABLE tbl (a int, b int) partitioning = ARRAY['a'] AS SELECT * FROM ...;
SELECT * FROM tbl WHERE b % 2 = 0;
Would we fail the read query because it has no predicate defined for the partition column a
? Or would we skip the check becase there is an enforced predicate for b
@findepi @alexjo2144 @findinpath Can you please take a look at this PR so it can make some progress? It generally looks good to me, but there are some places where your expertise and iceberg connector knowledge would be needed. Thanks a lot! |
plugin/trino-iceberg/src/main/java/io/trino/plugin/iceberg/IcebergTableHandle.java
Outdated
Show resolved
Hide resolved
@@ -69,6 +72,286 @@ public void tearDown() | |||
deleteRecursively(metastoreDir.toPath(), ALLOW_INSECURE); | |||
} | |||
|
|||
@Test | |||
public void testLackOfPartitionFilterNotAllowed() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd leave one typical test case here in the smoke test and move the rest to BaseIcebergConnectorTest
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh. Yes, smoke tests are designed to be quick and simple
plugin/trino-iceberg/src/main/java/io/trino/plugin/iceberg/IcebergMetadata.java
Outdated
Show resolved
Hide resolved
I saw in the Hive connector version of this we have a set of schemas where the filters are required. Any thoughts on if we should have that configurable here as well? |
Thanks so much for your review @alexjo2144 !
I also suggested to implement that flag as well. Whether we do it in this PR or a follow-up I'm OK with both. cc @zhangminglei |
@zhangminglei just a friendly bump to see if you could take a look at @alexjo2144's comments. Our team would love to move this feature forward to completion |
1907b6f
to
5fd7f03
Compare
Hi @ebyhr , I think all the PR comments have been addressed now - can you please check if anything else is required, and if not, merge this in? Thanks a lot! |
5fd7f03
to
766a60a
Compare
@ebyhr I've rebased this PR to the latest master, so it's again ready for review. Thanks! |
@ebyhr Can you please take a final look? I appreciate that all maintainers are probably busy, but it's been waiting for review/merge for a while now. Thanks a lot! |
@ebyhr Could you please take a look again ? Thank you very much. |
Hi @raunaqmorarka! I think @ebyhr is busy, so could I ask you please to help us with this? We have rebased multiple times and responded to all the review comments so I don't think there is anything outstanding here. Thank you so much! |
plugin/trino-iceberg/src/main/java/io/trino/plugin/iceberg/IcebergMetadata.java
Outdated
Show resolved
Hide resolved
plugin/trino-iceberg/src/main/java/io/trino/plugin/iceberg/IcebergMetadata.java
Show resolved
Hide resolved
plugin/trino-iceberg/src/main/java/io/trino/plugin/iceberg/IcebergMetadata.java
Outdated
Show resolved
Hide resolved
return Optional.empty(); | ||
} | ||
|
||
Set<IcebergColumnHandle> icebergColumnHandleSet = constraint.getPredicateColumns() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this should include the columns from enforced and unenforced tuple domains. See the implementation in DeltaLakeMetadata#applyFilter
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
On the other hand constraintColumns
in HiveTableHandle includes only constraint predicate columns. Either way should be okay as long as we clearly document in IcebergTableHandle what is included in constraintColumns
plugin/trino-iceberg/src/test/java/io/trino/plugin/iceberg/BaseIcebergConnectorTest.java
Outdated
Show resolved
Hide resolved
plugin/trino-iceberg/src/test/java/io/trino/plugin/iceberg/BaseIcebergConnectorTest.java
Outdated
Show resolved
Hide resolved
plugin/trino-iceberg/src/test/java/io/trino/plugin/iceberg/BaseIcebergConnectorTest.java
Outdated
Show resolved
Hide resolved
plugin/trino-iceberg/src/test/java/io/trino/plugin/iceberg/BaseIcebergConnectorTest.java
Outdated
Show resolved
Hide resolved
plugin/trino-iceberg/src/main/java/io/trino/plugin/iceberg/IcebergMetadata.java
Outdated
Show resolved
Hide resolved
Thank you for your pull request and welcome to our community. We could not parse the GitHub identity of the following contributors: Marton Bod.
|
0168d8d
to
1530a96
Compare
@raunaqmorarka I highly appreciate your review. I have addressed your comments. To make it easier to check, I kept the new changes in a separate commit, but once you approve the PR I will squash the commits |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
minor comments
please squash the commits
plugin/trino-iceberg/src/test/java/io/trino/plugin/iceberg/BaseIcebergConnectorTest.java
Outdated
Show resolved
Hide resolved
plugin/trino-iceberg/src/main/java/io/trino/plugin/iceberg/IcebergMetadata.java
Outdated
Show resolved
Hide resolved
plugin/trino-iceberg/src/main/java/io/trino/plugin/iceberg/IcebergMetadata.java
Outdated
Show resolved
Hide resolved
1530a96
to
d83bd42
Compare
@raunaqmorarka Thanks a lot again for you review, I truly appreciate it! |
plugin/trino-iceberg/src/main/java/io/trino/plugin/iceberg/IcebergTableHandle.java
Outdated
Show resolved
Hide resolved
d83bd42
to
95a5405
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Only nits
plugin/trino-iceberg/src/test/java/io/trino/plugin/iceberg/BaseIcebergConnectorTest.java
Outdated
Show resolved
Hide resolved
plugin/trino-iceberg/src/test/java/io/trino/plugin/iceberg/BaseIcebergConnectorTest.java
Outdated
Show resolved
Hide resolved
plugin/trino-iceberg/src/test/java/io/trino/plugin/iceberg/BaseIcebergConnectorTest.java
Outdated
Show resolved
Hide resolved
plugin/trino-iceberg/src/test/java/io/trino/plugin/iceberg/BaseIcebergConnectorTest.java
Outdated
Show resolved
Hide resolved
plugin/trino-iceberg/src/test/java/io/trino/plugin/iceberg/BaseIcebergConnectorTest.java
Outdated
Show resolved
Hide resolved
plugin/trino-iceberg/src/test/java/io/trino/plugin/iceberg/BaseIcebergConnectorTest.java
Show resolved
Hide resolved
95a5405
to
cec480a
Compare
Co-authored-by: zhangminglei <[email protected]>
cec480a
to
94d05bd
Compare
Description
For iceberg partitioned tables, we should reject the table scan produced by the planner when the query does not have partition field.
Additional context and related issues
Fixes #17239
Release notes
( ) This is not user-visible or docs only and no release notes are required.
( ) Release notes are required, please propose a release note for me.
(x) Release notes are required, with the following suggested text: