Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Optimize DISTINCT over singleton collections #34702

Merged
merged 2 commits into from
Oct 7, 2024

Conversation

ranma42
Copy link
Contributor

@ranma42 ranma42 commented Sep 17, 2024

This does some progress towards fixing #34482, but in several cases the optimization does not trigger because of another intermediate operation.

I will investigate whether it makes sense to combine these operations in a separate PR or if they can be easily handled in this one.

The Single/First[OrDefault] case seems to be reasonably tested; for the Take(1) case it might be better to add a new test (it is tested, but AFAICT only in combination with Single/First[OrDefault]).

@@ -255,6 +255,11 @@ public void ApplyDistinct()
throw new InvalidOperationException(RelationalStrings.DistinctOnCollectionNotSupported);
}

if (Limit is SqlConstantExpression { Value: 0 or 1 })
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

does it make sense to handle the case of 0? AFAICT SELECT expressions with a LIMIT 0 could be optimized away heavily (basically only the projection/typing matters; offset, predicate, distinct, ... are all irrelevant)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I agree - if we want to optimize the LIMIT 0 case, we should probably do that separately and go much farther than just removing DISTINCT... Any thoughts on cases where LIMIT 0 could be an actually useful thing to do (and therefore worth working on)?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A corner case where a LIMIT of 0 is sometimes used (that I know of) is in paginated queries that (ab?)use it to only retrieve the count of the relevant entities, but that is usually going to be handled as a 0-valued parameter.

@@ -1901,6 +1906,10 @@ public void ApplyLimit(SqlExpression sqlExpression)
}

Limit = sqlExpression;
if (Offset is null && Limit is SqlConstantExpression { Value: 0 or 1 })
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit:

Suggested change
if (Offset is null && Limit is SqlConstantExpression { Value: 0 or 1 })
if (Offset is null && Limit is SqlConstantExpression { Value: 0 or 1 })

@@ -255,6 +255,11 @@ public void ApplyDistinct()
throw new InvalidOperationException(RelationalStrings.DistinctOnCollectionNotSupported);
}

if (Limit is SqlConstantExpression { Value: 0 or 1 })
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I agree - if we want to optimize the LIMIT 0 case, we should probably do that separately and go much farther than just removing DISTINCT... Any thoughts on cases where LIMIT 0 could be an actually useful thing to do (and therefore worth working on)?

@ranma42 ranma42 changed the title Optimize DISTINCT over collections of 0 or 1 element Optimize DISTINCT over singleton collections Oct 5, 2024
@ranma42
Copy link
Contributor Author

ranma42 commented Oct 5, 2024

I dropped the limit=0 cases; if optimizations are desired for them, they can be way more aggressive and best handled separately (as they basically drop any rows).

@ranma42 ranma42 marked this pull request as ready for review October 5, 2024 15:31
Copy link
Member

@roji roji left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @ranma42!

@roji roji merged commit d57b7c1 into dotnet:main Oct 7, 2024
7 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants