-
Notifications
You must be signed in to change notification settings - Fork 3.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add support for HTTP transport in thrift metastore client #17925
Conversation
Thank you for your pull request and welcome to the Trino community. We require contributors to sign our Contributor License Agreement, and we don't seem to have you on file. Continue to work with us on the review and improvements in this PR, and submit the signed CLA to [email protected]. Photos, scans, or digitally-signed PDF files are all suitable. Processing may take a few days. The CLA needs to be on file before we merge your changes. For more information, see https://github.com/trinodb/cla |
1d19c77
to
47ca558
Compare
Thank you for your pull request and welcome to the Trino community. We require contributors to sign our Contributor License Agreement, and we don't seem to have you on file. Continue to work with us on the review and improvements in this PR, and submit the signed CLA to [email protected]. Photos, scans, or digitally-signed PDF files are all suitable. Processing may take a few days. The CLA needs to be on file before we merge your changes. For more information, see https://github.com/trinodb/cla |
@cla-bot check |
The cla-bot has been summoned, and re-checked this pull request! |
...src/main/java/io/trino/plugin/hive/metastore/thrift/DefaultThriftMetastoreClientFactory.java
Outdated
Show resolved
Hide resolved
...src/main/java/io/trino/plugin/hive/metastore/thrift/DefaultThriftMetastoreClientFactory.java
Outdated
Show resolved
Hide resolved
...src/main/java/io/trino/plugin/hive/metastore/thrift/DefaultThriftMetastoreClientFactory.java
Outdated
Show resolved
Hide resolved
...in/trino-hive/src/main/java/io/trino/plugin/hive/metastore/thrift/ThriftMetastoreConfig.java
Outdated
Show resolved
Hide resolved
...in/trino-hive/src/main/java/io/trino/plugin/hive/metastore/thrift/ThriftMetastoreConfig.java
Outdated
Show resolved
Hide resolved
...-hive/src/test/java/io/trino/plugin/hive/metastore/thrift/MockThriftHttpMetastoreServer.java
Outdated
Show resolved
Hide resolved
...-hive/src/test/java/io/trino/plugin/hive/metastore/thrift/MockThriftHttpMetastoreServer.java
Outdated
Show resolved
Hide resolved
...-hive/src/test/java/io/trino/plugin/hive/metastore/thrift/MockThriftHttpMetastoreServer.java
Outdated
Show resolved
Hide resolved
...-hive/src/test/java/io/trino/plugin/hive/metastore/thrift/MockThriftHttpMetastoreServer.java
Outdated
Show resolved
Hide resolved
...-hive/src/test/java/io/trino/plugin/hive/metastore/thrift/TestThriftHttpMetastoreClient.java
Outdated
Show resolved
Hide resolved
Thank you for your pull request and welcome to our community. We could not parse the GitHub identity of the following contributors: Vihang Karajgaonkar.
|
75f7992
to
556adc5
Compare
f44a1bf
to
40eb59b
Compare
Hi @electrum Would you like to take another look? I think I addressed all your review comments. |
Can you squash all of the commits? |
f7d82b7
to
1174392
Compare
@electrum Squashed all commits into one and force pushed. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for updating this. Looking at the config closer, rather than adding hive.metastore.thrift.transport-mode
, we should use URI from hive.metastore.uri
to determine the protocol. That's actually why we used a URI in the first place, rather than just a host/port. This will also allow mixing raw and HTTP endpoints.
I think that makes sense. Let me update the PR to derive the HTTP mode directly from the URI scheme. |
@electrum Ah, I remember now why I had to introduce a http mode config. The interface to create the metastore client takes a Line 23 in 15019f0
Due to this we don't have the URI scheme to decide if we should use a http client or a binary client. I can change the interface to include uri but then the changes trickle down to other files as well and the scope of the PR increases. I am okay to do that but just wanted to make sure that I am understanding this right before getting into the changes. |
I updated the PR to reuse the metastore URI scheme to determine if we should use the http thrift transport. I added a separate commit so that we can see what all changes are needed just to reuse the scheme from the metastore uri directly. |
191c15e
to
9302b9a
Compare
Please fix CI failure https://github.com/trinodb/trino/actions/runs/6664185107/job/18111804372?pr=19085 |
I think the product test in the link above is failing due to some infrastructure issue. It is failing because it cannot find the jdbc driver to connect to the databricks cluster. The pt works for me when I run locally on my environment. |
9d60718
to
69a386e
Compare
The product test is still failing https://github.com/trinodb/trino/actions/runs/6688269641/job/18170433430?pr=19085 |
I saw the error. those tests are failing due to cluster not being found (see below). Can we check if the databricks cluster terminated during that time?
|
9bc89ec
to
64f1e65
Compare
ThriftMetastoreConfig metastoreConfig = buildConfigObject(ThriftMetastoreConfig.class); | ||
ThriftHttpMetastoreConfig httpMetastoreConfig = buildConfigObject(ThriftHttpMetastoreConfig.class); | ||
|
||
if (!staticMetastoreConfig.getMetastoreUris().isEmpty()) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When does getMetastoreUris return empty? It would be nice to deny if it's unexpected:
checkArgument(!staticMetastoreConfig.getMetastoreUris().isEmpty(), "...");
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If I remember correctly, I think there were some tests which were failing which caused me add this line. But I cannot find which ones were those. I have added this precondition check for now to see if the CI reveals those failing tests again.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When I add a checkArgument condition here, I see test failures for TestHiveConnectoryFactory
which expect a different error message asserted in StaticTokenAwareMetastoreClientFactory
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I changed the precondition check to checkNotNull for the metastoreUris. If the metastoreUris is empty the validateThriftHttpConfiguration
becomes no-op and the higher layers like StaticTokenAwareMetastoreClientFactory
will assert the metastoreUris are non-empty.
...-hive/src/test/java/io/trino/plugin/hive/metastore/thrift/TestThriftHttpMetastoreClient.java
Outdated
Show resolved
Hide resolved
...main/java/io/trino/tests/product/launcher/env/environment/EnvMultinodeDatabricksHttpHms.java
Outdated
Show resolved
Hide resolved
...src/main/java/io/trino/plugin/hive/metastore/thrift/DefaultThriftMetastoreClientFactory.java
Outdated
Show resolved
Hide resolved
502e7c8
to
9ffa0b3
Compare
...src/main/java/io/trino/plugin/hive/metastore/thrift/DefaultThriftMetastoreClientFactory.java
Outdated
Show resolved
Hide resolved
28eea16
to
8ce2605
Compare
rebased and resolved conflicts. |
558dbfe
to
7952c33
Compare
7952c33
to
a5ade38
Compare
The CI failure looks unrelated. It failed due to "Terminating due to java.lang.OutOfMemoryError: Java heap space" |
Thanks @vihangk1 for keeping the PR alive even after 5 months of to and fro , in true spirit of open source :) |
Any updates? It will be great if we can get this merged so that Trino can integrate Unity catalog. |
Adding here a snippet from a Slack conversation with @vihangk1
I was suggesting him to separate what is related to thrift http configuration in a separate class which would be bound only if we're dealing with a http metastore. @vihangk1 do you have any free cycles in the near future to continue with this PR? |
Thanks @sajjoseph and @findinpath for your comments. I could not find enough time to investigate the approach which @findinpath suggested above in the last month. I made some progress on that and I just need to address some test failures after my changes. I will send out a PR update by this weekend with the suggested approach. |
Closing this PR in favor of #20371 |
Description
This PR adds support for using HTTP as the transport for the thrift metastore client. In this mode the
hive.metastore.uri
can support a http(s) URL to the metastore service. Since each thrift API is a POST request to the the metastore endpoint, this PR also adds support for adding a HTTP bearer token which must be configured to authenticate the metastore client to the metastore server. The token can be configured using a configurationhive.metastore.http.client.bearer-token
. To support Databricks Unity Catalog's HMS API interface, the configurationhive.metastore.http.client.additional-headers
must be set toX-Databricks-Unity-Catalog-Name=<catalog_name>
where the catalog_name is the name of the catalog object in Databricks Unity Catalog.Additional context and related issues
Thrift has support HTTP based transport for a long time. Ref https://github.com/apache/thrift/blob/master/lib/javame/src/org/apache/thrift/transport/THttpClient.java. We have used the http mode for thrift on various services in other projects like HiveServer2 in Apache Hive and Apache Spark for a while now.
More recently, Apache Hive also added support for Hive Metastore server to use HTTP mode.
https://issues.apache.org/jira/browse/HIVE-21456
This PR modifies the logic in
DefaultThriftMetastoreClientFactory
to instantiateTHttpClient
for the transport when creating the thrift metastore client.Release notes
( ) This is not user-visible or docs only and no release notes are required.
( ) Release notes are required, please propose a release note for me.
(x) Release notes are required, with the following suggested text:
Add HTTP support for the thrift metastore client.
Fixes #17865