-
Notifications
You must be signed in to change notification settings - Fork 1.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Implement Specialized GroupColumn
for Date
/Time
/Timestamp
types for multi-column GROUP BY
#13263
Comments
GroupColumn
for Date
/Time
/Timestamp
typesGroupColumn
for Date
/Time
/Timestamp
types for multi-column GROUP BY
I plan to wrap up nth_value PR today thanks to @jcsherin's reviews, then start on this one if it is okay. Otherwise, I can drop this |
take |
I might also be working on an implementation for this alongside if that is fine? |
Another thing, should this be done before or after #13262? I'm thinking after, just to reduce overall work. |
I agree after would be good -- I am hoping that #13262 can be quickly banged out / merged. However, I think most of the work for this PR will be testing (the actual code is likely not all that much) so that might be good to get ready as well |
@buraksenn Will you be working on this? if not, I would like to try working on this. |
I was waiting #13262 PR but from that PR I see that you've more knowledge on this topic so I don't want to block this :D. Feel free to work if you've time. I'm dropping this issue I can take a look if you cant find time |
take |
Is your feature request related to a problem or challenge?
In #12269 @jayzhan211 made significant improvements to how group values are stored in multi-column aggregations.
Specifically for queries like
The improvement relies on implementing specialized versions of
GroupColumn
for the types ofcol1
,colN
We have implemented the primitive types and Strings/StringViews now, but we have not implemented all types
This means queries like
Will fall back to the slower (but general)
GroupValuesRows
:datafusion/datafusion/physical-plan/src/aggregates/group_values/row.rs
Lines 40 to 41 in a6586cc
Describe the solution you'd like
Implement
GroupColumn
for all primitive types.You can see how to do this here:
datafusion/datafusion/physical-plan/src/aggregates/group_values/mod.rs
Lines 117 to 121 in e4bd579
and the make sure there are tests for each of those types in queries that group on multiple columns
Describe alternatives you've considered
No response
Additional context
Here is an example for how this was done for Strings: #12809
The text was updated successfully, but these errors were encountered: