-
Notifications
You must be signed in to change notification settings - Fork 924
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[KYUUBI #5509] Add Apache Impala JDBC engine dialect #6104
[KYUUBI #5509] Add Apache Impala JDBC engine dialect #6104
Conversation
Codecov ReportAll modified and coverable lines are covered by tests ✅
Additional details and impacted files@@ Coverage Diff @@
## master #6104 +/- ##
============================================
+ Coverage 61.06% 61.13% +0.07%
Complexity 23 23
============================================
Files 623 623
Lines 37202 37203 +1
Branches 5039 5039
============================================
+ Hits 22716 22745 +29
+ Misses 12023 12009 -14
+ Partials 2463 2449 -14 ☔ View full report in Codecov by Sentry. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
// getMoreResults() javadoc says, that it should implicitly close | ||
// any current ResultSet object, so here in case of Statement.CLOSE_CURRENT_RESULT | ||
// we can implement the same logic | ||
// see also https://issues.apache.org/jira/browse/HIVE-7680 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it is a bug fix, and is worth making this change in a separate PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
PR: #6115
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
BTW, which errors will be happen without this change?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, I see, ResultSetWrapper
calls statement.getMoreResults(Statement.CLOSE_CURRENT_RESULT)
# under the License. | ||
|
||
# modified compose file from Impala repo, see | ||
# https://github.com/apache/impala/blob/master/docker/quickstart.yml |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can we refers a latest release tag here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
added link to the release 4.3.0
.getCodeSourceLocation(getClass).split("integration-tests").head | ||
|
||
private val hiveJdbcConnectorPath: String = { | ||
val keyword = "kyuubi-hive-jdbc-shaded" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In CI, we explicitly set -Pjdbc-shaded
to use the kyuubi-hive-jdbc-shaded
for testing, let's make the test also works without -Pjdbc-shaded
(kyuubi-hive-jdbc
is used in such a case)
val keyword = "kyuubi-hive-jdbc-shaded" | |
val keyword = "kyuubi-hive-jdbc" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
At first I used ${hive.jdbc.artifact}
in integration tests, but in case of non-shaded driver artifact, test failed with ClassNotFoundException
of driver dependencies (for instance, org.apache.http.client.HttpClient
). So, in my opinion, here we should use self-contained driver fat jar with all needed dependencies, which is what a shaded artifact is.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, impala integration tests will pass without -Pjdbc-shaded
if the updated Kyuubi JDBC driver with fixed getMoreResults()
method will be available in snapshot maven repo, so I'll create separate PR with that fix as you've suggested in #6104 (comment)
- impala-data:/user/hive/warehouse | ||
- ./impala_conf:/opt/impala/conf:ro |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
could you please use the same path style? e.g.
- impala-data:/user/hive/warehouse | |
- ./impala_conf:/opt/impala/conf:ro | |
- impala-data:/user/hive/warehouse | |
- impala_conf:/opt/impala/conf:ro |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The first one is named volume, controlled by docker, and second is directory path located relatively to the compose file. If we replace it with impala_conf:
docker will fail finding named volume called impala_conf
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thanks for explanation
protected def hiveServerJdbcUrl: String = withContainers { container => | ||
val feHost: String = container.getServiceHost(IMPALAD_SERVICE_NAME, IMPALAD_PORT) | ||
val fePort: Int = container.getServicePort(IMPALAD_SERVICE_NAME, IMPALAD_PORT) | ||
s"jdbc:hive2://$feHost:$fePort/;auth=noSasl" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"kyuubi" is encouraged over "hive2"
s"jdbc:hive2://$feHost:$fePort/;auth=noSasl" | |
s"jdbc:kyuubi://$feHost:$fePort/;auth=noSasl" |
integration-tests/kyuubi-jdbc-it/src/test/resources/impala_conf/hive-site.xml
Outdated
Show resolved
Hide resolved
tableName: String, | ||
tableTypes: util.List[String]): String = { | ||
if (isPattern(schema)) { | ||
throw KyuubiSQLException.featureNotSupported("Pattern-like schema names not supported") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
possible to add a nagetive test case for it? someday we may want to implement it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added tests for such cases to org.apache.kyuubi.engine.jdbc.impala.DialectSuite
@@ -133,6 +133,13 @@ | |||
<overWrite>true</overWrite> | |||
<outputDirectory>${project.build.directory}</outputDirectory> | |||
</artifactItem> | |||
<artifactItem> | |||
<groupId>org.apache.kyuubi</groupId> | |||
<artifactId>kyuubi-hive-jdbc-shaded</artifactId> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
let's make it respect -Pjdbc-shaded
<artifactId>kyuubi-hive-jdbc-shaded</artifactId> | |
<artifactId>${hive.jdbc.artifact}</artifactId> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same as in #6104 (comment)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see, the existing approach does not handle transitive deps properly, leave it then
"-redirect_stdout_stderr=false", "-logtostderr", | ||
"-mt_dop_auto_fallback=true", | ||
"-default_query_options=mt_dop=4,default_file_format=parquet,default_transactional_type=insert_only", | ||
"-mem_limit=4gb"] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the single GitHub Actions runner has 7GiB memory limit, please reduce the memory consumption as much as possible
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Decreased to 1500mb
. Test cases fail at lower values
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thanks for experimenting
case Types.FLOAT => | ||
floatToTTypeId | ||
|
||
case Types.BOOLEAN => | ||
booleanToTTypeId |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this looks like affects(or not?) all dialects, if so, change it in a seperate PR first.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As far as I understand, other dialects don't use this types or use another JDBC types for the same types from db (for instance,Types.BIT
for db boolean
), so I suggest to left these changes in current PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sgtm
@tigrulya-exe thanks for your excellent works! |
Also cc @wenzhenghu @adios666 @qian0817 |
856338b
to
43872e1
Compare
43872e1
to
9852125
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
super nits, overall LGTM
version: "3.5" | ||
services: | ||
metastore: | ||
image: apache/impala:4.0.0-impala_quickstart_hms |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
possible to use the latest 4.3? or are there good reasons to stick to 4.0?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There are no official impala quickstart images for version 4.3
in Dockerhub, images from compose file are the latest ones. I also tried to use other user-built images, but they were either too old or too large, so I thought that 4.0
would be suitable for tests.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sounds good.
externals/kyuubi-jdbc-engine/src/test/resources/impala_conf/hive-site.xml
Outdated
Show resolved
Hide resolved
externals/kyuubi-jdbc-engine/src/test/resources/impala_conf/hive-site.xml
Outdated
Show resolved
Hide resolved
integration-tests/kyuubi-jdbc-it/src/test/resources/impala-compose.yml
Outdated
Show resolved
Hide resolved
integration-tests/kyuubi-jdbc-it/src/test/resources/impala_conf/hive-site.xml
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, will merge once CI passed
# 🔍 Description ## Issue References 🔗 This pull request fixes apache#5509 ## Describe Your Solution 🔧 Added [Apache Impala](https://impala.apache.org) support in the form of the JDBC engine dialect. Slightly modified Kyuubi Hive JDBC driver in order to use it as driver for Impala dialect instead of the original Hive driver. ## Types of changes 🔖 - [ ] Bugfix (non-breaking change which fixes an issue) - [x] New feature (non-breaking change which adds functionality) - [ ] Breaking change (fix or feature that would cause existing functionality to change) ## Test Plan 🧪 #### Related Unit Tests - `org.apache.kyuubi.engine.jdbc.impala.OperationWithImpalaEngineSuite` - `org.apache.kyuubi.engine.jdbc.impala.SessionSuite` - `org.apache.kyuubi.engine.jdbc.impala.StatementSuite` #### Related Integration Tests - `org.apache.kyuubi.it.jdbc.impala.OperationWithServerSuite` - `org.apache.kyuubi.it.jdbc.impala.SessionWithServerSuite` - `org.apache.kyuubi.it.jdbc.impala.StatementWithServerSuite` --- # Checklist 📝 - [x] This patch was not authored or co-authored using [Generative Tooling](https://www.apache.org/legal/generative-tooling.html) **Be nice. Be informative.** Closes apache#6104 from tigrulya-exe/feature/5509-support-impala-jdbc-dialect. Closes apache#5509 32ae6d8 [Tigran Manasyan] Codestyle fixes 9852125 [Tigran Manasyan] fix review comments ecb0d7d [Tigran Manasyan] copy impala compose file to integration tests resources 5ea3474 [Tigran Manasyan] fix order in services file 2c63a70 [Tigran Manasyan] Add Apache Impala JDBC engine dialect Authored-by: Tigran Manasyan <[email protected]> Signed-off-by: Cheng Pan <[email protected]>
🔍 Description
Issue References 🔗
This pull request fixes #5509
Describe Your Solution 🔧
Added Apache Impala support in the form of the JDBC engine dialect. Slightly modified Kyuubi Hive JDBC driver in order to use it as driver for Impala dialect instead of the original Hive driver.
Types of changes 🔖
Test Plan 🧪
Related Unit Tests
org.apache.kyuubi.engine.jdbc.impala.OperationWithImpalaEngineSuite
org.apache.kyuubi.engine.jdbc.impala.SessionSuite
org.apache.kyuubi.engine.jdbc.impala.StatementSuite
Related Integration Tests
org.apache.kyuubi.it.jdbc.impala.OperationWithServerSuite
org.apache.kyuubi.it.jdbc.impala.SessionWithServerSuite
org.apache.kyuubi.it.jdbc.impala.StatementWithServerSuite
Checklist 📝
Be nice. Be informative.