-
Notifications
You must be signed in to change notification settings - Fork 8
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
simplify project when we merge two projects #17
base: DrillCalcite1.15.0
Are you sure you want to change the base?
simplify project when we merge two projects #17
Conversation
@chunhui-shi, looks like Calcite uses HepPlanner for this rule, but Drill uses VolcanoPlanner. I think we should also use HepPlanner to avoid stack overflow instead of changing the rule. I have just made a quick change in |
Moving the rule to another stage might help breaking the loop of PushFilterPastProject, PushProjectPastFilter triggered by MergeProject. I think it will be better to put this rule after LOGICAL stage to have better chance to do the optimization. I will be running some more tests to verify this idea. |
I saw many unit tests were broken when I tried your suggestion although the hanged case this fix supposed to solve passed. Have you tried some tests with your change? I think this rule could be important to optimize some intermediate results generated by other rules. |
Can we split rules from the |
What would it take to get this fix into Apache Calcite ? We would need unit tests for the nested CAST functions. @chunhui-shi you mentioned that Calcite has a similar simplification (not sure if it does the exact thing).. can you provide the link to that piece of code ? |
@amansinha100, I also saw similar simplifications in several rules, for example here: https://github.com/apache/calcite/blob/b60b67eb8f62463ccbc230358969ef2450cdbe05/core/src/main/java/org/apache/calcite/rel/rules/FilterProjectTransposeRule.java#L122 |
@vvysotskyi yes Calcite does similar simplifications, but its not clear why the ProjectMergeRule does not do that for CAST.
|
I tried to reproduce it in Calcite, I simplified the query to use only values tuples:
But it was not reproduced. In Drill, I have limited a list of rules, with which bug still reproduced (for the query above):
The last four rules are used for converting. I tried to use all these rules in Calcite unit test with the same Drill plan and replaced Drill converting rules by Enumerable rules. It also didn't help. With the specified set of rules, I noticed the interesting thing: one of LogicalProject rel nodes, which is created alter applying |
@vvysotskyi thanks for trying the repro within Calcite and comparing with Drill. Since you had mentioned earlier that Calcite does the ProjectMergeRule as part of HepPlanner while Drill does it with Volcano, chances are there may be a costing change needed. Can you try modifying |
@amansinha100, Calcite also able to use this rule with VolcanoPlanner. With all my tries to reproduce this issue in Calcite, I used VolcanoPlanner. The idea of modifying the cost depending on the complexity of expressions looks interesting, we should try to implement it and see if it will help. |
Modifying the cost to consider complexity of expression is a good idea. But it might not help solving this case. Since in my debugging process, none of these RelNode/Subsets was generating any best or bestCost yet before hitting this issue.
…________________________________
From: Aman Sinha <[email protected]>
Sent: Wednesday, March 21, 2018 12:15:31 PM
To: mapr/incubator-calcite
Cc: Chunhui Shi; Mention
Subject: Re: [mapr/incubator-calcite] simplify project when we merge two projects (#17)
@vvysotskyi<https://urldefense.proofpoint.com/v2/url?u=https-3A__github.com_vvysotskyi&d=DwMFaQ&c=cskdkSMqhcnjZxdQVpwTXg&r=7bZjGOKpQi7qeyQ_xRpzEQ&m=zS6W-QeXEzjg4DTn67536dED7UdWWDzUSqVYCTeuAzs&s=B40c_mu76ItG5QZei4Fyfze4-8lxzphziOb0JwXZmZE&e=> thanks for trying the repro within Calcite and comparing with Drill. Since you had mentioned earlier that Calcite does the ProjectMergeRule as part of HepPlanner while Drill does it with Volcano, chances are there may be a costing change needed. Can you try modifying DrillProjectRelBase.computeSelfCost() ? Currently, it just checks for the number of expressions. It does not account for the complexity of that expression. We should modify it such that it treat CAST(CAST) as more expensive than just one CAST.
—
You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub<https://urldefense.proofpoint.com/v2/url?u=https-3A__github.com_mapr_incubator-2Dcalcite_pull_17-23issuecomment-2D375063696&d=DwMFaQ&c=cskdkSMqhcnjZxdQVpwTXg&r=7bZjGOKpQi7qeyQ_xRpzEQ&m=zS6W-QeXEzjg4DTn67536dED7UdWWDzUSqVYCTeuAzs&s=zYuxaFj-FuJiPlyZoVQAjMEZfFG5ozyiPbZ9AWW2HvQ&e=>, or mute the thread<https://urldefense.proofpoint.com/v2/url?u=https-3A__github.com_notifications_unsubscribe-2Dauth_AJa9A6TYaFuBa3PGB6GxQuBZdekHOlR8ks5tgqbTgaJpZM4SkyLW&d=DwMFaQ&c=cskdkSMqhcnjZxdQVpwTXg&r=7bZjGOKpQi7qeyQ_xRpzEQ&m=zS6W-QeXEzjg4DTn67536dED7UdWWDzUSqVYCTeuAzs&s=sh-5AX3CKmTBU6peINRc_MXZVv6FlMW8IG8piu2EGr4&e=>.
|
b478a8f
to
83db34c
Compare
83db34c
to
e77338b
Compare
No description provided.