-
Notifications
You must be signed in to change notification settings - Fork 2
Tests for TPCDS queries join ordering #21
Tests for TPCDS queries join ordering #21
Conversation
@@ -0,0 +1 @@ | |||
Move queries to TPCDS generator library. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We won't find this file. Better to just reg an issue somewhere.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also I asked on presto slack to copy that project under prestodb
community group, see https://prestodb.slack.com/archives/C07JH9WMQ/p1519824551000061
customer_total_return ctr1 | ||
, store | ||
, customer | ||
WHERE ("ctr1"."ctr_total_return" > ( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not convinced we need yet another copy of TPCDS queries.
$ find -regex '.*ds.*q01.sql' -exec sha1sum {} +
550d3e6497028853792bc46c8500776e92d9971b ./presto-tpcds/src/main/resources/tpcds/queries/q01.sql
550d3e6497028853792bc46c8500776e92d9971b ./presto-product-tests/src/main/resources/sql-tests/testcases/tpcds/q01.sql
0169942554f00ce16abf51e98b1730fa8be8e005 ./presto-benchto-benchmarks/src/main/resources/sql/presto/tpcds/q01.sql
I would rather see some reuse (even hacky) then a copy.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
queries in benchmarks are modified (they have variables like ${catalog}.${schema}
).
Product tests have their headers. Here we need vanilla queries.
I prefer to copy files which no-one ever should modify, instead of dragging and/or playing with all the product-tests deps (like cassandra or kafka).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
queries in benchmarks are modified (they have variables like ${catalog}.${schema}).
you could simply find-replace ${catalog}
with tpcds
...
I prefer to copy files which no-one ever should modify,
This is wishful thinking. The queries are being modified from time to time. Casts, grouping, .. And they will be modified in the future (casts and chars).
import static java.util.Objects.requireNonNull; | ||
import static org.testng.Assert.assertEquals; | ||
|
||
public class BaseJoinReorderingTest |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just to make sure. Is
Extract BaseJoinReorderingTest
an extract only?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure it is. :)
cross join: | ||
cross join: | ||
cross join: | ||
cross join: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This one looks weird.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yep... this is how we handle this query today, see: prestodb#6944
join (LEFT, REPLICATED): | ||
join (INNER, REPLICATED): | ||
tpcds:customer:sf3000.0 | ||
join (INNER, REPLICATED): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why are these files needed? I thought the idea was that there is no persistent 'expected' value?
Also, include in the cmt msg how the files were generated (technique, what session settings).
Also, name the files .txt
to avoid stupid problems like ".join_ordering
is an unknown extension" or whatever.
|
||
private static File getExpectedJoinOrderingFile(String queryId) | ||
{ | ||
return new File(String.format("src/test/resources/join_ordering/%s.join_ordering", queryId)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
rnn queryId
static import format
use Path
end file name with e.g. .txt
} | ||
} | ||
|
||
@Test(dataProvider = "getTpcdsQueryIds", enabled = false) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Now, calling this requires temporarily removing enabled = false
. What about making this a main
method?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Then it has to be copied to both classes ; /
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Then it has to be copied to both classes ; /
this i don't understand
<dependency> | ||
<groupId>io.airlift</groupId> | ||
<artifactId>bytecode</artifactId> | ||
<scope>test</scope> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
interesting, what are these for?
public void test(String queryId) | ||
throws IOException | ||
{ | ||
assertEquals(joinOrderString(tpcdsQuery(queryId)), read(getExpectedJoinOrderingFile(queryId))); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Very nice and functional.
However, if something goes wrong (or one wants to run this with a debugger), this is no longer so nice.
public void update(String queryId) | ||
throws IOException | ||
{ | ||
Files.write(joinOrderString(tpcdsQuery(queryId)), getExpectedJoinOrderingFile(queryId), UTF_8); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Very nice and functional.
However, if something goes wrong (or one wants to run this with a debugger), this is no longer so nice.
import static java.lang.String.format; | ||
import static java.util.Objects.requireNonNull; | ||
import static org.testng.Assert.assertEquals; | ||
|
||
/** | ||
* This class tests cost-based optimization rules related to joins. It contains unmodified TPCH queries. | ||
* This class is using TPCH connector configured in way to mock Hive connector with unpartitioned TPCH tables. | ||
*/ | ||
public class TestJoinReordering |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
TestTpchJoinReordering
?
Superseded by https://github.com/starburstdata/presto/pull/29 |
Tests for TPCDS queries join ordering